Replay: The EASIEST way to create AI Cover Songs!

Bob Doyle Media
26 Mar 2024 · 19:03

Summary

TL;DR: This video explores AI voice cloning and music through Replay, a free tool that swaps the vocals of any song for a voice of the user's choice. The presenter demonstrates Replay's easy-to-use interface with a song generated in Suno to sidestep copyright concerns, then walks through downloading voice models, converting songs with different voices, and merging models to create unique vocal combinations.

Takeaways

  • 🎤 **Voice Cloning and Music Focus**: The channel explores creative uses of AI, with a recent focus on voice cloning and music.
  • 🔄 **Replay Tool Introduction**: The host introduces Replay, a tool for voice conversion in songs, allowing users to change any song's voice to a desired one.
  • 🆓 **Free and No Subscriptions**: Replay is completely free with no subscriptions, making it accessible for voice conversion tasks.
  • 💻 **Local Machine Processing**: Replay operates primarily on the user's machine, with occasional model downloads from the internet.
  • 🎵 **Voice Model Integration**: Users can integrate voice models into Replay to convert vocals in audio tracks, offering a wide range of voices and characters.
  • 🔍 **Searching for Models**: The script guides users on how to search for and download voice models from the Weights website.
  • 🎚️ **Adjusting Vocal and Instrument Pitch**: Users can adjust the relative pitch to match the original and converted vocals, as well as transpose the instrument track to align with the new vocal pitch.
  • 🖥️ **Batch Processing and Multimodel Features**: Replay allows for batch processing and merging of multiple voice models into one song.
  • 🎶 **Creating Music from Text Prompts**: Replay has a feature to create short audio snippets from text prompts, although it's noted to be time-consuming and not as high quality as other tools.
  • 🎉 **Fun and Creative Potential**: The host emphasizes the fun and creative potential of Replay, encouraging users to experiment with different voices and songs.
  • 📈 **Quality and Ease of Use**: The script highlights the good quality of separated vocal tracks and the ease of using Replay for voice conversion.

Q & A

  • What is the main topic of the video?

    -The main topic of the video is exploring the use of AI in voice cloning and music, specifically focusing on a tool called 'Replay' that allows users to change the voice in any song with a voice of their choice.

  • What does Replay offer that sets it apart from other voice conversion tools?

    -Replay is highlighted for its ease of use and the fact that it is a free tool with no subscriptions required. It allows users to download models from the internet to facilitate voice conversion, and most of the processing takes place locally on the user's machine.

  • How does Replay handle the process of voice conversion?

    -Replay allows users to upload or record a music track, select a voice model, and then convert the vocals of the track to the selected voice. It also provides options to adjust the relative pitch and instrument pitch to match the original track better.

  • Where can users find and download voice models for Replay?

    -Users can find and download voice models from the Weights website (weights.gg), where they can search for specific voices or characters and download the models for use in Replay.

  • What file types are associated with the voice models in Replay?

    -The voice models in Replay come as '.pth' and '.index' files. The '.pth' file is the actual voice model; the host describes the '.index' file as one that 'kind of tells the voice model how to behave'.

  • How does Replay handle the process of downloading and using multiple voice models?

    -Replay allows users to download multiple voice models and use them for voice conversion. Users can drag and drop the '.pth' file into Replay, and the model becomes available in the list of models for conversion.

  • What is the 'multimodel' feature in Replay and how does it work?

    -The 'multimodel' feature in Replay allows users to select multiple models to use for one song, creating a batch processing effect. Users can choose different models and Replay will convert the song using each of the selected voices.

  • Can Replay be used to create music from a text prompt?

    -Yes, Replay has a feature that allows users to create music from a text prompt, although the video suggests that this feature is somewhat obsolete compared to other tools like Suno, and it may not produce high-quality results for longer tracks.

  • How does the video creator describe their experience with Replay over the weekend?

    -The video creator describes spending many hours using Replay over the weekend, downloading tracks from YouTube, changing voices, and enjoying the process, indicating a high level of engagement and satisfaction with the tool.

  • What is the video creator's recommendation for those interested in AI and creativity?

    -The video creator invites those interested in AI and creativity to subscribe to their channel for more content related to exploring AI tools and being a 'mad scientist' about it.

Outlines

00:00

🎤 Voice Cloning and Music with Replay

The script introduces a channel focused on creative AI uses, particularly voice cloning and music. The host discusses Replay, a tool for voice conversion that can change any song's voice to a desired one. They mention their recent exploration of various software for voice cloning and music conversion, emphasizing Replay's ease of use despite not being a complete solution. The host shares their excitement about spending a weekend converting songs and invites the audience to try Replay, which is free and available for Windows, Mac, and Linux. The tool may occasionally download models from the internet, but most processing happens locally. The host avoids copyright issues by generating their own song in Suno, a platform for creating music, and uses it as an example to demonstrate Replay's capabilities.

05:01

🎵 Using Replay for Voice Conversion

The host provides a step-by-step guide on how to use Replay, starting with downloading the audio track from a song created in Suno. They explain the process of selecting and previewing the track, choosing voice models, and the importance of adjusting the relative pitch when changing voices significantly. The script details how to find and download voice models from the Weights website, emphasizing the vast selection available, including singers and characters. The host demonstrates how to import the downloaded voice model into Replay, rename it for clarity, and adjust settings such as stem method and pitch. They also discuss advanced settings and the impact of the GPU on conversion speed, sharing their experience with different Nvidia cards. The host concludes by highlighting the ability to remix songs using the original and converted tracks, showcasing the potential for creative experimentation.

10:02

🎼 Advanced Features of Replay: Multimodel and Text-to-Music

The script delves into advanced features of Replay, such as multimodel processing, which converts a song with several voice models in one batch. The host demonstrates how to merge models to create a new voice and adjust the balance between them. They also touch on the ability to convert speech to different voices, not just music. Additionally, the host discusses Replay's text-to-music feature, which generates short audio snippets based on text prompts. While acknowledging the feature's limitations, especially when compared to platforms like Suno, the host provides an example of creating a marching band pop ballad with accordions. They emphasize the time-consuming nature of this feature and the superior quality of Suno for music creation, concluding with an invitation for the audience to subscribe and explore the creative potential of AI and music tools.

15:03

📝 Creative AI and Music Mashups

In the final segment, the host reflects on their weekend spent experimenting with Replay, downloading models, and changing voices in songs, expressing their enjoyment and the high quality of the voice separations. They compare Replay's capabilities with Suno, highlighting Suno's superior quality for music creation. The host invites the audience to subscribe for more content on AI creativity, mad science, and tool mashups, using a playful and engaging tone to encourage subscription. The script ends with a humorous note, promising to find and pursue those who do not subscribe, followed by a musical cue.

Keywords

💡Voice Cloning

Voice cloning refers to the process of replicating a person's voice to make it sound like they are speaking when they are not. In the context of the video, voice cloning is used to change the voice in a song to any desired voice, showcasing the creative potential of AI in music and voice manipulation. The script mentions using various software tools to achieve voice cloning, emphasizing its role in the creative process.

💡Voice Conversion

Voice conversion is the process of altering a voice to sound like another specific voice or character. The video discusses using AI to change the voice in a song, such as converting a male voice to a female voice or to the voice of a specific singer or character. This is demonstrated through the use of different voice models in Replay, which lets users experiment with voice conversion in music.

💡Replay

Replay is a tool mentioned in the video that enables users to replace vocals in an audio track with any voice they choose. It is described as a user-friendly, free software that can download models from the internet to facilitate voice conversion. The video creator uses Replay to demonstrate how to change the vocals of a song, emphasizing its ease of use and the fun of experimenting with different voices.

💡Audio Track Separation

Audio track separation is the process of extracting individual components of a mixed audio track, such as separating vocals from instrumentals. The video script discusses using Replay to download models that help in separating the vocals from the music, which is a crucial step before applying voice conversion. This technique allows for the isolation and replacement of vocals in a song.

💡Models (Voice Models)

In the context of the video, models refer to the specific voice presets or profiles that can be downloaded and used within the Replay software to convert one voice to another. The script mentions searching for and downloading models from the Weights website (weights.gg) to use in voice conversion, such as turning a song's vocals into the voice of Dean Martin or Billie Eilish.
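The extract-and-rename step the host performs by hand can be sketched in Python. This is a minimal sketch, not part of Replay: the function name `install_voice_model`, the folder layout, and the choice to name both files after the voice are assumptions for illustration. Only the `.pth` and `.index` extensions come from the video.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def install_voice_model(zip_path, models_dir, voice_name):
    """Extract a downloaded model archive and rename its files after the voice.

    Hypothetical helper: returns a dict mapping each file suffix that was
    found ('.pth', '.index') to its new path inside models_dir.
    """
    models_dir = Path(models_dir)
    models_dir.mkdir(parents=True, exist_ok=True)
    installed = {}
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        # Archives sometimes nest the files in subfolders, so search recursively.
        for suffix in (".pth", ".index"):
            matches = sorted(Path(tmp).rglob(f"*{suffix}"))
            if not matches:
                continue  # tolerate archives that lack one of the two files
            target = models_dir / f"{voice_name}{suffix}"
            shutil.move(str(matches[0]), str(target))
            installed[suffix] = target
    return installed
```

Called with a downloaded archive and the name "Dean Martin", this would leave `Dean Martin.pth` and `Dean Martin.index` in the chosen folder, which avoids the confusion of every download arriving with the same generic file name.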

💡Relative Pitch

In Replay, relative pitch is a setting that shifts the converted vocal up or down relative to the source vocal. It is adjusted when the source and target voices sit in very different ranges, such as converting a male voice to a female voice. In the demo, the host clicks +12 (one octave up) for a higher female voice, finds the result too high, and then lowers the setting by about six steps. The aim is a converted voice that sounds natural while staying in key with the music.
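Because 12 steps correspond to one octave in the demo, the pitch settings appear to be measured in semitones. Under that assumption (and equal temperament), a shift can be expressed as a frequency ratio. The function names below are illustrative and not part of Replay.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def shift_frequency(freq_hz, semitones):
    """Frequency in Hz after shifting freq_hz by the given number of semitones."""
    return freq_hz * semitones_to_ratio(semitones)


if __name__ == "__main__":
    print(shift_frequency(440.0, 12))   # one octave up: 880.0
    print(shift_frequency(440.0, -12))  # one octave down: 220.0
```

A shift of +12 doubles every frequency and -12 halves it, which is why a whole-octave change keeps the vocal in the same key as the untouched instrumental.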

💡Instrumental Pitch

Instrumental pitch refers to the pitch of the non-vocal elements of a song, such as the melody played by instruments. The video discusses adjusting the instrumental pitch to match the converted vocal track, which can help make the final product sound more natural and harmonious. This is particularly important when the vocal conversion results in a significant pitch change.
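One way to reason about pairing the two pitch settings, assuming both are measured in semitones: a vocal shift that is a whole number of octaves stays in key, while any other shift needs the instrumental moved by the same amount modulo 12. The helper below is a hypothetical sketch of that arithmetic, not a Replay function.

```python
def matching_instrument_shift(vocal_shift):
    """Smallest instrument shift (in semitones) that keeps the key aligned.

    Whole-octave vocal shifts (multiples of 12) need no instrument change;
    any other shift is folded into the range -6..5.
    """
    return (vocal_shift + 6) % 12 - 6


if __name__ == "__main__":
    for vocal in (12, 6, 0, -12):
        print(vocal, "->", matching_instrument_shift(vocal))
```

For a vocal shift of +6 this returns -6, which matches the demo, where lowering the vocal setting by six steps is paired with taking the instrumental down six.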

💡Multimodel

The term 'multimodel' in the video refers to the feature in Replay that allows users to apply multiple voice models to a single song, creating a batch of converted tracks with different voices. This enables quick experimentation, such as producing separate Darth Vader and Garth Brooks versions of the same song in one run, as demonstrated in the script. Because the vocals are separated only once per song, each additional voice takes much less time than the first conversion.
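The batch behaviour can be sketched as a loop that separates the stems once and then converts the same vocals with each selected model. This is a minimal sketch under stated assumptions: `batch_convert`, the `separate` and `convert` callables, and the dictionary cache are all hypothetical stand-ins, since the video does not show how Replay implements the feature.

```python
def batch_convert(song_path, model_names, separate, convert, stem_cache):
    """Convert one song with several voice models, separating stems only once.

    separate(song_path) -> (vocals, instrumental) is the slow step.
    convert(vocals, model_name) -> converted vocals is the fast step.
    stem_cache is a dict reused across calls so a song is never separated twice.
    """
    if song_path not in stem_cache:
        stem_cache[song_path] = separate(song_path)
    vocals, instrumental = stem_cache[song_path]
    # One (converted vocals, instrumental) pair per requested voice.
    return {name: (convert(vocals, name), instrumental) for name in model_names}
```

Keeping the cache outside the function mirrors what the host observes: the first conversion of a song is slow, and later remixes with other voices skip separation entirely.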

💡Merging Models

Merging models is a feature in Replay that creates a new voice model by combining two existing ones. The video describes this as building an entirely new voice from the weights of two different models, producing a vocal sound that is a mix of the two originals, such as a blend of Billie Eilish and Sheldon Plankton. The mix starts at 50/50 and can be shifted toward either voice.
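The video does not say how Replay merges models beyond "creating an entirely new model from those weights" with an adjustable ratio. One common way to blend two models of identical architecture is a weighted average of their parameters, and the sketch below illustrates that idea with plain Python lists standing in for weight tensors. The function name, the dictionary layout, and the averaging method itself are assumptions, not Replay's confirmed implementation.

```python
def merge_weights(model_a, model_b, ratio_a=0.5):
    """Blend two models' parameters: ratio_a of model_a, the rest from model_b.

    Each model is a dict mapping a parameter name to a list of floats.
    Both models must share the same parameter names and sizes.
    """
    if not 0.0 <= ratio_a <= 1.0:
        raise ValueError("ratio_a must be between 0 and 1")
    if model_a.keys() != model_b.keys():
        raise ValueError("models must have the same parameter names")
    merged = {}
    for name in model_a:
        a, b = model_a[name], model_b[name]
        if len(a) != len(b):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [ratio_a * x + (1.0 - ratio_a) * y for x, y in zip(a, b)]
    return merged
```

With `ratio_a=0.5` this corresponds to the 50/50 starting mix shown in the demo, and moving the ratio toward 1.0 or 0.0 would favour one voice over the other.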

💡Text-to-Music

Text-to-music is a concept where a description or prompt in text form is used to generate a piece of music. The video mentions a feature in Replay that allows creating music from a text prompt, although it is noted as being somewhat obsolete compared to more advanced tools like Suno. The script provides an example of generating a 'marching band pop ballad with accordions' from a text description.

💡Suno

Suno is a music creation tool mentioned in the video that can generate songs based on text prompts or styles. The video script contrasts Suno with Replay's text-to-music feature, highlighting Suno's superior quality and ease of use for creating music. Suno is used to demonstrate the potential of AI in music creation, showing how it can quickly produce high-quality songs based on specific styles or descriptions.

Highlights

Introduction to Replay, a tool for voice conversion in songs.

Replay allows changing any song's voice to a desired voice.

Replay is free and has no subscriptions.

Replay can download models from the internet to enhance voice conversion.

Most processing happens locally on the user's machine.

Guide on how to replace vocals in an audio track using Replay.

Using a song generated in Suno to avoid copyright restrictions.

Demo of how to use Replay to convert a song's vocals.

Explanation of how to download and use voice models from weights.gg.

Previewing voice models before downloading them.

Guide on how to install and use a specific voice model in Replay.

Adjusting relative pitch to match the original and converted vocals.

Option to change the pitch of the music track to match the converted vocals.

Ability to remix songs using the original and converted tracks.

Creating a new voice model by merging two existing models.

Adjusting the mix ratio of merged voice models.

Using Replay to convert speech into different voices.

Creating music from text prompts using Replay.

Comparison of Replay's text-to-music feature with Suno's quality and efficiency.

Encouragement for subscribers to explore AI creativity tools like Replay.

Transcripts

00:00

welcome back to the channel where we

00:01

explore the creative uses of AI and we

00:03

have been on a voice cloning and music

00:05

kick lately haven't we boys and girls

00:07

well I am not going to stop now because

00:09

somebody told me about replay now if you

00:12

don't know about replay and you have an

00:14

interest in voice conversion

00:15

specifically taking the voice of any

00:17

song and changing it with whatever voice

00:19

you'd like to put in there I've been

00:20

showing all sorts of tools the past

00:22

several months really about how to use

00:24

different pieces of software to get

00:26

various parts of that task done use this

00:28

software to download and separate the

00:30

tracks use this software to convert the

00:32

audio on your computer and then more

00:33

software to put it together and while

00:35

this is not a 100% total voice cloning

00:38

and conversion solution it goes a long

00:39

way to making the process super super

00:41

easy and I'm a person who likes to use

00:43

tools local on my machine but this is a

00:45

tool that I have become completely

00:47

addicted to and I spent way too many

00:49

hours converting songs all weekend long

00:51

I'm going to show you why and I'll bet

00:52

you you do it too if you haven't

00:54

explored it yet the first thing you need

00:55

to do is go get replay and it is free

00:58

boys and girls every aspect this project

01:00

is free isn't that nice there's no

01:02

subscriptions of any kind to do what

01:04

we're about to do here it's pretty

01:06

mind-blowing actually when you get here

01:07

you've got choices you can download it

01:09

defaults to Windows on my platform but

01:11

you can also download to Mac and Linux

01:13

pretty simple install be aware that as

01:15

you're running this program it may

01:16

occasionally download models from the

01:18

internet and slow the process but all

01:20

those model downloads are a one-time

01:21

operation and it appears that most

01:23

everything is taking place on your

01:24

machine which is fantastic if I'm wrong

01:26

somebody correct me because this is all

01:28

about replacing vocals in an audio track

01:31

and because I don't want to deal with

01:32

copyright restrictions even though I'm

01:34

so tempted to play you all the stuff I

01:35

did this weekend I just am not going to

01:37

take that chance and luckily with Suno

01:39

we don't need to worry about that we can

01:41

go generate our own song and it's going

01:43

to be great so here in Suno I've

01:45

already created this song just to

01:46

eliminate all the trial and error that's

01:48

going to be perfect for our example I'll

01:49

play a little bit of it here basically

01:52

it was a crooner male solo jazzy upbeat

01:54

swing quartet brass Summer Breeze in the

01:57

city I was taking a bunch of Frank

01:59

Sinatra titles and mishmashing them and

02:00

see what they come up with and this is

02:02

what we've

02:03

[Music]

02:11

got okay we won't listen to the whole

02:13

thing now so we're going to go ahead and

02:14

download the audio track from that

02:15

particular song and then we're going to

02:17

go to replay now your screen won't look

02:19

exactly like this because I've been

02:20

filling it up with stuff all weekend

02:22

including models that I've downloaded

02:23

and I'm going to show you how to do

02:24

exactly that but let me just give you a

02:26

quick demo of how this basically works

02:28

you select any song you'd like music

02:30

track with audio on it and drop it right

02:32

here or you can record your own vocal

02:34

track or the most fun part is entering

02:36

in a URL for YouTube and it will

02:38

download the video separate the audio do

02:40

the conversion for you all in one step

02:43

it's really quite amazing as tempted as

02:45

I am to play you demos of all the stuff

02:46

I did this weekend I just don't want to

02:47

risk it so let's just play with what we

02:49

did so we're going to drop the song we

02:51

just downloaded from Suno right in here

02:52

to select or Drop Audio here we can

02:54

preview it to make sure it's the right

02:56

one it's the right one and now we choose

02:59

our models but Bob I don't have any

03:01

models I just downloaded this thing

03:02

where do I get these models from of

03:04

which you speak look at this it says

03:06

right here if you just would relax it

03:08

says right here looking for more models

03:10

yeah 20,000 plus available on weights.gg

03:13

let's click that link right there and

03:15

then boom so when you get to weights.gg

03:18

the first thing you want to do is to

03:19

create an account and it's free so all

03:22

you got to do is Click login up here and

03:24

then you can continue with any of these

03:25

ways to get in I generally use Google

03:27

once you're in now you can search for

03:29

and download these models the easiest

03:31

way to do that because the organization

03:32

around here is questionable at best just

03:35

click on the magnifying glass to search

03:36

and just type in the name of the voice

03:38

you'd like now they have a lot of

03:39

singers here they also have a lot of

03:40

characters for example all the SpongeBob

03:42

gang is here if I just type sponge

03:43

you'll see that and let's say a singer

03:46

like Billie Eilish if we just type Billie

03:48

you'll see that we've got lots of

03:49

choices to choose from so you can spend

03:51

a lot of time here just finding and

03:52

playing with models you can preview what

03:54

the model sounds like by the way by just

03:56

clicking this Arrow here as a speaking

03:58

sample for a weights.gg and it gives you a

04:00

speaking sample as you can see we can

04:01

dive deep into this site let's get a

04:03

voice that I have not used just for the

04:05

purposes of this let's get Dean Martin

04:07

because he's a crooner and this is a crooner

04:09

song let's see if we got Dean Martin up

04:11

here we seem to have one let's listen to

04:12

it as a speaking sample for a weights.gg

04:15

voice model sure sounds enough like him

04:17

let's give it a try so we're going to

04:18

download the model and what we're going

04:20

to get is a ZIP file and when you open

04:22

it most of the time you're going to have

04:23

something like this sometimes you're

04:25

going to have a little bit of a

04:25

directory structure but you're going to

04:27

have a pth file which is the actual

04:29

voice model and you're going to have an

04:30

index file which kind of tells the voice

04:32

model how to behave first we're going to

04:34

extract this into a folder with the rest

04:36

of the voice models that we have I'm

04:38

actually going to create a folder for

04:40

this select folder and extract once

04:42

that's extracted we're going to want to

04:43

rename these files because they all come

04:45

down as model and that's going to get

04:46

quite confusing so let's just rename

04:48

this to Dean Martin pth and Dean Martin

04:53

index and these are all being saved in

04:55

the default model file for replay which

04:58

is defined under the app drop down click

05:01

on show settings and here's where you

05:03

define the current APP directory once we

05:05

have that model renamed it's ready to

05:06

drag right into replay so we just take

05:08

the pth file and literally drag and drop

05:11

it right where it says and it says

05:12

successfully added Dean Martin and now

05:14

Dean Martin shows up in this growing

05:16

list of model files that I have and

05:18

let's pop in on the settings real quick

05:20

because we want to make sure of a few

05:21

things stem only means it's going to

05:24

skip the voice conversion and output the

05:26

vocals only so it's just going to take

05:28

the vocals off of this track and create

05:30

a file that you can download of the

05:32

existing vocals not anything converted

05:34

we're not going to do that pre- stemmed

05:36

means that the vocals have already been

05:38

separated from the audio so if I was to

05:40

record my voice here and the record your

05:42

own option that was up here I would use

05:43

the pre- stemmed option because there's

05:45

no music to separate the relative pitch

05:47

you change when you're drastically

05:49

changing the voices from one to the

05:50

other for example if I'm going to change

05:52

to a female voice and the female voice

05:54

is probably an octave higher than mine I

05:56

need to tell Replay that my voice is an

05:58

octave lower the easiest way to do that

06:00

is just to click on minus 12 in this

06:03

case it's a male voice to a male voice

06:04

so we're just going to leave it right

06:05

where it is the instrument pitch this

06:07

allows you to change the pitch of the

06:09

actual music track so if you change the

06:11

vocal track to sound a little bit more

06:13

real you might need to change the

06:14

instrument track to transpose the audio

06:17

to meet that vocal you can leave the

06:18

stem method here alone and you can leave

06:20

all this alone you can take a peek at

06:22

advanced settings but I literally have

06:24

not changed any of these things except

06:26

to set the microphone input value for

06:27

when I do record my own thing everything

06:29

else I leave just where it is you won't

06:31

have CUDA here unless you have an Nvidia

06:33

card in here but uh you just go with

06:35

whatever you've got you'll probably have

06:37

the option of CPU or some sort of GPU if

06:39

you have one all right now we're just

06:41

going to click on create song now the

06:43

first time it separates the audio it

06:45

takes a couple of minutes but once

06:46

that's done that separated audio is on

06:48

your system so it's really easy just to

06:50

go through a bunch of other voices and

06:52

audition them which we will do so you

06:54

can see what's happening here it's

06:55

separating the track it's about 16% of

06:57

the way through your GPU definitely

07:00

determines how fast this goes right now

07:01

I'm running this on an Nvidia RTX 2070

07:05

super and on another system I run it

07:08

with an RTX 3090 and it's way faster on

07:11

all processes but for the purposes of

07:13

this demo I'm using this and it will

07:15

take a little bit longer for the track

07:16

separation to occur you can by the way

07:18

queue jobs while this is doing this I can

07:20

go ahead and start another job

07:25

completely beautiful dreamer I'm going

07:27

to back to replay which we're still

07:28

separating the track instead of changing

07:30

Bing's voice out with Dean Martin why

07:32

don't we change it out with Squidward

07:35

all this stays the same and we're going

07:37

to go ahead and click on create song and

07:40

now you'll see two songs queued now we're

07:42

waiting for the first track to finish

07:44

separating and do the conversion and

07:46

then the other one we'll start while

07:47

we're having fun with the first one our

07:48

track is finished so if we click on it

07:50

up here we'll see that in addition to

07:52

the finished conversion we also have the

07:54

ability to download The Source tracks we

07:56

have the original song track the

07:58

converted vocals only the original

08:00

vocals only and then the instrumentals

08:02

only it's great if we want to remix this

08:04

which is what we're going to do let's

08:06

just take a listen real quick again to

08:08

what the original song sounded

08:09

[Music]

08:16

like now let's listen to it with Dean

08:18

Martin's voice put

08:27

in awesome so let's just click remix

08:29

again and let's just quickly choose

08:30

another voice this time we'll choose a

08:32

Billie Eilish voice and because her voice

08:34

is higher we're going to go down here to

08:35

relative pitch and we're going to click

08:37

on plus 12 and we will click on create

08:40

song and you'll notice there's no

08:41

separation going on we're just loading

08:42

in the voice conversion model real quick

08:44

changing the voice creating audio files

08:46

and it's done a lot quicker already done

08:49

to New York in the

08:53

summer that's a little high for her

08:55

voice what we could do is try and change

08:57

the relative pitch and try the

08:58

instrumental pitch which I've

08:59

never actually done so this is a good

09:01

opportunity to do this since that seemed

09:03

a little high for her I think the zero

09:04

would still be too low let's go down

09:07

about six steps here all right but that

09:09

means we're going to also have to change

09:10

the instrument steps down to six so

09:13

let's just see what happens when we do

09:15

that definitely

09:22

transposed much better

09:27

right I think that's great it sounds

09:30

pretty good but if we want to sweeten it

09:31

up a little bit this is where the beauty

09:32

of being able to download those

09:34

individual tracks is so now we just go

09:36

down here and we just want the

09:37

instrumental track and the converted

09:39

vocals now that we've got those two

09:41

audio tracks I'm just going to download

09:42

each of them into my favorite multitrack

09:45

editor you can use whichever one you

09:46

want to provided you can do effects and

09:48

other basic editing I have already

09:50

created a very simple effects rack with

09:52

Reverb that I put the vocal track on and

09:55

I've left the music track alone let's

09:57

just play it how it is and see how the

09:58

levels are hello again this is the

10:00

original instrument track now with the

10:02

Billie Eilish sample on

10:08

top I took off to New York in the Summer

10:13

Breeze I did it my

10:17

way that's great let's do the same thing

10:19

but let's download the Dean Martin

10:20

tracks that we did originally now with

10:22

the Dean Martin one we're going to have

10:23

to redownload the instrumental track

10:25

because that one was not transposed like

10:27

we did for the Billie Eilish ones click that

10:28

and now click the converted vocals those

10:30

are both being downloaded and I'll just

10:32

drag those in like I did before and

10:35

click

10:39

play I took off to New York in the

10:43

summer

10:48

RS we don't want to forget about

10:50

Squidward and Beautiful Dreamer so let's

10:51

just click here we'll download the

10:53

instrumental track we'll download the

10:56

converted vocals go back into audition

10:58

we'll take these out we'll drag the

11:00

instrumental in here we'll drag the

11:03

vocal in here sometimes when you get

11:05

these downloads you'll see that there's

11:06

some noise here in the vocal track which

11:08

is basically not really singing what

11:10

I'll do in my case is I'll just go into

11:12

these files make sure that it's noise

11:15

and I'm just going to silence this in my

11:16

case I'm just going to go under edit

11:18

insert silence and click okay and it

11:21

just replaces that whole piece with

11:23

silence I'm going to do the same thing

11:24

here and then let's peek here what this

11:26

is yeah that's noise too we also silence

11:28

that okay and here in audition those

11:30

changes are automatically applied here

11:32

in the multitrack so now you can see

11:34

that's all cleaned up so let's just go

11:35

back to the beginning and hear beautiful

11:37

dreamer with Squidward

11:39

Tentacles all under fair

11:42

use because we are significantly

11:46

changing the original by changing the

11:50

main vocal here he is now Squidward

11:53

Tentacles beautiful

11:57

dreamer wake unto

12:00

me starlight and dewdrops are waiting

12:05

for

12:07

thee I'm your hero right now aren't I so

12:10

hopefully your mind is already

12:11

sufficiently blown because you could

12:13

just sit here and do this all day with

12:15

the ease that it is to download these

12:17

models and play these out and once you

12:18

separate it it's really super fast it's

12:20

really cool so have fun with all of that

12:22

let's take a look at another little

12:24

feature here multimodel what does that

12:26

mean select multiple models to use for

12:28

one song this is just sort of like a

12:30

batch processing let's go back to Summer

12:32

Breeze in the city for example and click

12:33

remix again click on multimodel I want

12:35

Darth Vader and I want Garth Brooks

12:38

versions of beautiful dreamer because

12:40

these are male I need to make sure this

12:42

relative pitch is down to zero again

12:43

would not make sense to do a multimodel

12:45

batch with female and a male voice if

12:47

the ranges are drastically different and

12:49

then I click create song and again all

12:51

it's going to do now is it's going to

12:52

just convert each of those voices now

12:55

what we end up with when we look over

12:56

here now we have the number two here now

12:58

we have converted tracks for Frank

13:00

Sinatra and Garth Brooks now let's try

13:02

merging models first you have to have

13:04

multimodel chosen then we have to choose

13:06

the models that we want let's just say

13:07

Billie Eilish and Sheldon Plankton now I'm

13:10

going to click on merge and I'm going to

13:13

click on create song and see what we get

13:15

it's creating an entirely new model from

13:17

those weights which didn't take long and

13:19

now we have the Summer Breeze in the

13:20

city Billie Eilish Sheldon Plankton making

13:22

sure it's still just one track let's

13:26

listen I took off to New York in the

13:29

Summer

13:31

Breeze I mean how are you going to say

13:33

that that's wrong cuz we don't know what

13:34

Billie Eilish and Squidward would sound

13:36

like together this is a 50/50 mix as it

13:38

says right here let's see if we've got

13:40

the ability to shift who gets how much

13:42

yes if I click this icon here I can say

13:44

I want more Billie than Sheldon and it

13:46

just automatically changes so let's try

13:48

it with a preponderance of Billie and

13:50

then we'll try it with a preponderance

13:51

of Sheldon I took off to New York in the

13:55

Summer

13:56

Breeze and I did it

13:59

no one to so that's odd because I did

14:02

mix a female voice and a male voice and

14:05

Plankton is way down here and she's

14:07

somewhere up here maybe if I had done

14:09

12 plus it's okay let's go ahead and

14:11

change those ratios and see what we get
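
One way to picture the merge and its ratio control is as a weighted average of the two models' parameters. The sketch below is only an illustration under that assumption: the transcript does not say how Replay actually combines weights, and `merge_models`, the parameter names, and the toy values are all hypothetical.

```python
from typing import Dict, List

# A "model" here is just named lists of numbers (a stand-in for real weights).
Weights = Dict[str, List[float]]


def merge_models(a: Weights, b: Weights, ratio_a: float = 0.5) -> Weights:
    """Blend two models parameter by parameter.

    ratio_a is the share taken from model `a`; the rest comes from `b`.
    Assumption: both models have identical parameter names and sizes.
    """
    if not 0.0 <= ratio_a <= 1.0:
        raise ValueError("ratio_a must be between 0 and 1")
    if a.keys() != b.keys():
        raise ValueError("models must have the same parameter names")

    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            ratio_a * x + (1.0 - ratio_a) * y
            for x, y in zip(a[name], b[name])
        ]
    return merged


# Toy stand-ins for two voice models (made-up values, not real weights).
billie = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
plankton = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}

even_mix = merge_models(billie, plankton)             # the default 50/50 mix
mostly_billie = merge_models(billie, plankton, 0.75)  # 75% Billie, 25% Plankton
```

Under this assumption, moving the ratio simply slides every parameter toward one voice or the other, which would explain why changing it is quick once both models are loaded.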

14:13

bring Billie down and Sheldon Plankton up

14:16

took

14:19

York that's mostly Plankton maybe I need

14:22

to back it up just a little bit more get

14:24

a little more Billie in there I did it my

14:27

way no one to PE I freaking love this so

14:31

much fun now you don't have to just

14:33

convert music if you don't want to if

14:34

you just want to convert speech it works

14:36

perfectly well like that too for example

14:38

let's convert my voice

14:40

into Squidward

14:43

Tentacles why is my voice constantly

14:46

being used for Folly why can't I be

14:49

taken seriously as an artist want to

14:52

make sure you click that little save

14:54

disc there it's not

14:56

intuitive why is my voice constantly

14:58

being used for all right that's me

15:01

let's go down here to Squidward and this

15:03

is where we would click pre-stemmed and

15:04

when you record your own it

15:05

automatically assumes that it's pre-

15:07

stemmed it already defaults to zero here
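
The relative pitch setting used throughout appears to be counted in semitones. That is an assumption based on the "12 plus" remark earlier, since 12 semitones is one octave. In equal temperament, a shift of n semitones multiplies a frequency by 2^(n/12), so 0 leaves the voice where it is. The helper names below are hypothetical and are not part of Replay.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def shift_frequency(freq_hz: float, semitones: float) -> float:
    """Return the frequency after shifting freq_hz by the given semitones."""
    return freq_hz * semitones_to_ratio(semitones)


# 0 semitones: no change, the default for a voice in a similar range.
unchanged = shift_frequency(220.0, 0)

# +12 semitones: one octave up, e.g. a low male line moved into a higher range.
octave_up = shift_frequency(220.0, 12)

# -12 semitones: one octave down.
octave_down = shift_frequency(220.0, -12)
```

This is why a setting of 0 suits a male-to-male conversion, while a larger shift may be needed when the source and target voices sit in very different ranges.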

15:09

so let's just click on create song this

15:12

should take seconds

15:14

yep why is my voice constantly being

15:17

used for Folly why can't I be taken

15:20

seriously as I'm okay do Darth Vader

15:23

I've been going over your records it

15:25

seems you are a little late on your TPS

15:28

report

15:30

I'm hoping that perhaps you can get that

15:32

done for me and yeah I'm also going to

15:37

need you to come in on

15:39

Saturday that'd be great I've been going

15:42

over your records it seems you are a

15:45

little late on your TPS report and yeah

15:50

I'm also going to need you to come in on

15:52

Saturday so provided you download or can

15:54

make your own models this is an amazing

15:56

way to quickly change out any voice for

15:59

any voice in any song pretty much hands

16:01

off except for whatever editing you do

16:03

on the other side but this again takes

16:05

what you can do with Suno to new levels

16:07

because now for any song you create you

16:09

can use any singer you want and even

16:10

create Duets there's one other feature

16:12

that this program has that I'm afraid is

16:14

kind of obsolete especially with things

16:16

like Suno but this allows you to create

16:19

music from a text prompt not songs not

16:21

lyrics short little Snippets of audio

16:24

probably not more than 10 seconds unless

16:25

you have all kinds of time on your hands

16:27

and it's not going to be nearly the

16:29

quality that you're going to get with

16:30

something like Suno at least not yet

16:32

these are using models created by meta

16:34

slfb to create music from text and

16:37

several months ago it was pretty

16:38

freaking incredible and there is a lot

16:40

you can do with it here since I don't

16:42

really think anyone's going to use this

16:43

I'll just show you an example of what

16:44

I'm talking about right here so let's

16:46

say a marching

16:49

band pop

16:52

ballad with accordions you're going to

16:55

click on settings and see that the song

16:57

duration default is 10 seconds and if

16:59

you haven't downloaded models or if you

17:00

have a slower computer this 10-second

17:02

clip is going to take you a hot minute

17:03

to get and I just don't know that a lot

17:05

of people are going to have the patience

17:06

for it there are different models that

17:08

are being used and downloaded depending

17:10

on what you're asking it to do how long

17:12

you want it to be whether or not you're

17:13

trying to guide it with a Melody every

17:15

time you activate a new model it

17:17

downloads it takes a while but it's a

17:19

one-time deal and then it all happens on

17:21

your system right now let's just say a

17:23

marching band pop ballad with accordion

17:25

create music running one-time setup which

17:27

means I've asked it to download a model

17:28

which I don't yet have on my system so

17:30

that's going to take a little bit of

17:32

time you'll notice that once it actually

17:33

starts doing the conversion we get a

17:35

countdown here 45 out of 500 steps and

17:39

truly these days the payoff just isn't

17:41

worth sitting here and doing this but I

17:42

want to show you anyway cuz it's in here

17:44

let's see what we

17:46

[Applause]

17:49

got now let's just for fun go into Suno

17:52

and give it the exact same prompt I'm

17:54

going to go into custom mode I'm going

17:56

to say instrumental and the style is

17:59

exactly what I did there a marching band

18:01

pop ballad with accordions and click

18:15

create no freaking contest I am not

18:18

kidding when I say I have spent hours

18:20

with Replay this weekend just

18:22

downloading things from YouTube changing

18:24

out the voices laughing my ass off and

18:26

doing it again and the quality of the

18:28

separations is really really good if you

18:30

isolate the vocal tracks they sound

18:32

amazing so have a ton of fun with this

18:34

I'd love to hear what you do with it if

18:35

you enjoy this type of material anything

18:37

AI and creativity related and going down

18:40

rabbit holes and mishmashing tools

18:41

together and being a real mad scientist

18:43

about it I invite you to subscribe if

18:45

you subscribe now I will not look for

18:48

you I will not pursue you but if you do

18:51

not I will look for you I will find you

18:56

and

18:57

I

18:59

[Music]