I Challenged My AI Clone to Replace Me for 24 Hours | WSJ
TLDR
In a fascinating experiment, Joanna Stern explores the potential of AI to replace her for a day. With the help of Synthesia and ElevenLabs, she creates an AI avatar and voice clone to tackle challenges like phone calls, creating a TikTok, bank biometrics, and video calls. The results show that while AI voices are impressive, video clones still have a way to go. This experiment not only highlights the growing capabilities of AI but also raises concerns about potential misuse and the need for vigilance in distinguishing real from AI.
Takeaways
- 🎥 Joanna Stern explores the possibility of replacing herself with an AI clone for a day to understand the capabilities of AI-generated voice and video.
- 🤖 The AI avatar of Joanna was created by Synthesia, a startup that recorded her head movements and voice for training data.
- 💬 The voice of the AI clone was improved by ElevenLabs, which used two hours of Joanna's previous recordings to enhance the quality.
- 📞 In the first challenge, AI Joanna successfully passed as the real Joanna in a phone call with Snap CEO Evan Spiegel.
- 📹 The second challenge involved creating a TikTok video, where the AI avatar's lack of natural movements and limited facial expressions led to its failure.
- 🔊 AI-generated voices can fool bank voice biometrics, as demonstrated when AI Joanna's voice was enough to get through to a service rep at Chase.
- 👤 Video calls proved to be a challenge for the AI clone; participants could tell it wasn't the real Joanna due to the avatar's unnatural appearance and behavior.
- 🚫 The experiment showed that while AI voices are becoming quite convincing, video clones still have a way to go before they can fully deceive.
- 💡 The video highlights the potential for saving time using AI tools, but also the risks of misuse, such as scammers using AI voices to impersonate individuals.
- 🛑 Synthesia and ElevenLabs have measures in place to prevent misuse of their technology, such as requiring consent from the person being cloned and being able to identify voices generated with their tools.
Q & A
What was Joanna Stern's main objective in the video?
-Joanna Stern aimed to explore whether she could replace herself with an AI clone for a day, in order to see if the AI could perform tasks and interactions on her behalf.
How was Joanna's AI avatar created?
-Joanna's AI avatar was created by Synthesia, a startup that recorded her performing head movements and reading a script. This data was used to train their AI neural networks to generate the avatar.
What was the role of ElevenLabs in Joanna's AI experience?
-ElevenLabs was used to improve the quality of Joanna's AI voice. After uploading two hours of her previous recordings, ElevenLabs produced a better voice clone than the initial version provided by Synthesia.
What were the four challenges Joanna set for her AI clone?
-The four challenges Joanna set were: making phone calls, creating a TikTok, passing bank biometrics, and conducting video calls.
How did Joanna's sister, Julia, react to the AI's call about her dead fish?
-Initially, Julia was fooled and thought it was Joanna. However, she later realized it wasn't her sister because the AI did not pause to let her respond.
What was the outcome of the TikTok challenge?
-The TikTok challenge failed because the AI-generated video had limitations such as the avatar not moving its arms, mismatched mouth movements, and lack of facial expressions.
How successful was the AI voice in deceiving the bank's voice biometrics system?
-The AI voice was successful in passing the bank's voice biometrics system, as it confirmed Joanna's identity and transferred her to a customer service representative without additional questions.
Why did the video call challenge fail?
-The video call challenge failed because the participants noticed the AI's limitations, such as the avatar looking like a hologram version of Joanna, poor posture, and lack of jokes, leading them to realize it wasn't the real Joanna.
What concerns did Joanna express about the potential misuse of AI clones and voices?
-Joanna expressed concerns about scammers potentially using AI voices to impersonate individuals to call banks or families, and the need for people to be on high alert to distinguish between real and AI-generated interactions.
What measures do Synthesia and ElevenLabs take to prevent misuse of their technology?
-Synthesia requires verbal consent from those creating avatars, while ElevenLabs requires users to check a box confirming they have permission to use the voice. Both companies claim to be capable of identifying their AI voices if misused.
Outlines
🎥 Introduction to AI Avatar Creation
The script begins with Joanna Stern's excitement about creating an AI avatar that resembles and behaves like her. She discusses the current state of AI tools that generate text and images, and how AI-generated voice and video will further blur the line between reality and fiction. Joanna outlines four challenges she has devised to test if her AI avatar can replace her for a day, aiming to free up time for her to engage in personal activities. The paragraph also touches on the unsettling feeling of seeing her frozen avatar and the process of getting ready for the challenges.
🤖 AI Avatar Development and Voice Improvement
Joanna delves into the process of creating her AI avatar with Synthesia, a startup that recorded her head movements and voice in a professional studio. The company used this data to train their AI neural networks. However, the initial voice output wasn't satisfactory, leading to the use of ElevenLabs, which improved the voice clone after analyzing two hours of Joanna's previous recordings. Both Synthesia and ElevenLabs operate on a similar principle where users can type in text for AI clones to recite. Joanna explains the commercial applications of these AI tools, with Synthesia targeting companies for internal video creation and ElevenLabs offering voice cloning services at an affordable monthly fee.
📞 Challenges Involving AI: Phone Calls and TikTok
Joanna describes the first challenge involving phone calls, where her AI voice successfully passed as the real Joanna in a call with Evan Spiegel, CEO of Snap, about the impact of AI on communication. The second challenge was creating a TikTok video using a script written by ChatGPT in Joanna's voice, discussing an obscure iOS 16 tip. Despite the initial struggle with ChatGPT fabricating information, a usable script was eventually produced. The resulting TikTok video, however, did not impress the platform's audience due to the avatar's lack of movement and mismatched mouth movements, highlighting areas for improvement in AI avatar technology.
🏦 Bank Biometrics and Video Calls
The third challenge involved using the AI voice for bank biometrics, where the clone of Joanna's voice was accepted by Chase's voice authentication and transferred to a customer service representative. When an intern tried to pass as Joanna with the voice biometric system, further verification was requested. The final challenge was video calls, where Joanna's AI avatar was integrated into Google Meet calls. However, the avatar was quickly identified as fake due to its lack of natural human characteristics such as posture and humor, resulting in a failed challenge.
🚫 Conclusion and Reflection on AI Impersonation
Joanna concludes the video by reflecting on the day's challenges, noting that while AI voices are quite convincing, video clones are not yet capable of fooling people. She expresses her mixed feelings about the potential time-saving benefits of AI clones and the risks of misuse, such as scammers using cloned voices. Joanna emphasizes the importance of vigilance in distinguishing between real and AI-generated content and ends with a humorous reminder to stay human.
Keywords
💡AI Clone
💡Synthesia
💡ElevenLabs
💡Challenges
💡Phone Calls
💡TikTok
💡Bank Biometrics
💡Video Calls
💡AI Voices
💡Misuse
💡Authenticity
Highlights
Joanna Stern explores the possibility of replacing herself with an AI clone for 24 hours.
AI tools are blurring the lines between real and fake, prompting Joanna to test the limits of AI-generated voice and video.
Joanna's AI avatar is created by Synthesia, a startup that records head movements and voice for training data.
ElevenLabs improves upon Synthesia's voice output, offering a more realistic Joanna Stern voice clone.
Synthesia targets companies wanting to make internal videos, charging at least $1,000 for a custom avatar.
Creating a voice clone with ElevenLabs costs $5 a month, making it accessible for a wider range of applications.
Challenge one involves making phone calls, testing the effectiveness of AI voice in a real-world scenario.
Joanna's AI clone initially fools her sister and passes as the real Joanna with Evan Spiegel, the CEO of Snap, during phone calls.
Challenge two requires creating a TikTok, showcasing the AI avatar's ability to perform on social media platforms.
Despite initial struggles with ChatGPT's scriptwriting, a usable script is eventually produced for the AI-generated TikTok video.
The video fails to impress, however: the avatar's lack of movement and mismatched mouth movements indicate room for improvement.
In challenge three, the AI voice clone passes a bank's voice biometrics test, raising concerns about potential misuse.
The AI clone fails to convince during video calls, as participants notice the lack of natural movements and expressions.
The experiment reveals that while AI voices are becoming increasingly convincing, video clones still have a way to go.
Joanna highlights the need for vigilance to distinguish between real and AI-generated content moving forward.
Companies like Synthesia and ElevenLabs are implementing measures to prevent misuse of their technology.
The video concludes with a call to action for viewers to stay human and be aware of the inevitable rise of AI in our lives.