New AI Video Goes Hard At OpenAI!
TLDR
The video discusses a new AI video generator named 'Vidu', which is being compared to the upcoming Sora model. Vidu, developed by Shengshu Technology and Tsinghua University, can produce 16-second clips at 1080p resolution. The video showcases a sizzle reel and longer examples of Vidu's output, highlighting its architecture based on U-ViT (described in the video as the Universal Video Transformer), which combines Vision Transformers with a U-Net model for image generation. While not as detailed as Sora, Vidu demonstrates temporal coherence and a unique aesthetic. The video also touches on the challenges and post-production work required to refine AI-generated footage for professional use, referencing a short film created using Sora. The speaker, Tim, provides a signup link for Vidu and mentions an upcoming interview about Sora's integration into Adobe Premiere and After Effects.
Takeaways
- A new AI video generator called 'Vidu' has emerged, potentially rivaling Sora in quality.
- Vidu can generate video clips up to 16 seconds long at 1080p, showcasing its capabilities through a sizzle reel.
- Vidu's architecture is based on U-ViT (described in the video as the Universal Video Transformer), which combines Vision Transformers with a U-Net for image generation.
- The U-ViT model treats its inputs as tokens and uses long skip connections, which helps it maintain temporal coherence throughout the video.
- Examples of Vidu's output include a panda playing guitar and a beach vacation scene, demonstrating its ability to generate coherent visuals.
- While Vidu's outputs are impressive, they are not as detailed as Sora's, but they maintain a consistent and appealing aesthetic.
- A side-by-side comparison with Sora shows that both models have their strengths, with Sora leading in environment realism.
- The production process for AI-generated videos still requires significant human effort in post-production to achieve consistency.
- There is a sign-up link for Vidu on their website, but as of the recording, the submit button may not be working due to high traffic.
- The potential of AI video generators like Vidu and Sora is being explored by filmmakers, with examples like the short film 'Air Head'.
- An exclusive interview with Adobe discusses Sora's integration into Premiere and future plans for After Effects.
Q & A
What is the name of the new AI video generator discussed in the script?
-The new AI video generator discussed is called 'Vidu'.
What is the maximum duration and resolution that the new AI video generator can produce?
-The AI video generator can produce clips up to 16 seconds at 1080p resolution.
Which two models or technologies does the new AI video generator's architecture seem to be based on?
-The architecture of the new AI video generator is based on U-ViT (described in the video as the Universal Video Transformer), which draws on two separate papers: DPM-Solver and 'All Are Worth Words'.
How does the new AI video generator differ from Sora in terms of video generation?
-While Sora creates videos using temporal spaces, the new AI video generator (Vidu) works with an in point and an out point, using long skip connections to chart a path between the first and last frames of the video.
What is the significance of the long skip connections in the new AI video generator?
-Long skip connections allow the AI to maintain awareness of the first and last frames of the video, which helps in generating more coherent and less hallucinatory transitions between frames.
What is the aesthetic quality of the new AI video generator's outputs compared to Sora?
-The new AI video generator's outputs look really good but are not as detailed as Sora's. They have a Midjourney V4 kind of look, which the presenter appreciates for its surreal aesthetic.
What is the significance of the 'Sizzle reel' mentioned in the script?
-The 'Sizzle reel' is a promotional video showcasing the capabilities of the new AI video generator. It includes clips that are direct references to the initial Sora video release.
How does the new AI video generator handle transitions between video frames?
-The new AI video generator handles transitions by treating everything as tokens and utilizing its understanding of the beginning and end of the video to chart a coherent path between frames.
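The idea of "treating everything as tokens" can be illustrated with a minimal Python sketch. In the 'All Are Worth Words' paper, the timestep, the condition, and the noisy image patches are all fed to the transformer as tokens; the toy below does this for a timestep and a tiny 4x4 grid only. The function names `patchify` and `build_sequence`, the patch size, and the stand-in time embedding are assumptions made for illustration and are not taken from Vidu.

```python
# Toy sketch: turn a timestep and an image into one token sequence.
# All names and sizes here are hypothetical, chosen for illustration only.


def patchify(image, patch):
    """Split a 2D grid (list of rows) into flattened patch x patch tiles."""
    height, width = len(image), len(image[0])
    assert height % patch == 0 and width % patch == 0
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tile = [image[top + dy][left + dx]
                    for dy in range(patch) for dx in range(patch)]
            tokens.append(tile)
    return tokens


def build_sequence(timestep, image, patch):
    """Prepend a time token to the patch tokens so both share one sequence."""
    patch_tokens = patchify(image, patch)
    dim = patch * patch
    # A real model would use a learned or sinusoidal embedding here;
    # repeating the scalar is only a stand-in for that embedding.
    time_token = [float(timestep)] * dim
    return [time_token] + patch_tokens


image = [[row * 4 + col for col in range(4)] for row in range(4)]
sequence = build_sequence(timestep=10, image=image, patch=2)
print(len(sequence))  # 1 time token + 4 patch tokens
print(sequence[1])    # top-left 2x2 patch, flattened
```

A video model would extend the same idea by cutting patches from every frame, so that position in time becomes just another property of a token.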
What is the current status of the sign-up link for the new AI video generator?
-As of the recording, there is a sign-up link on the website, but the submit button appears to be broken, possibly due to high traffic.
What is the role of post-production in refining AI-generated videos like those from Sora?
-Post-production plays a significant role in cleaning up AI-generated footage. This includes curation, script writing, editing, voice-over, music, sound design, color correction, and other typical post-production processes needed to achieve a semi-consistent final product.
How does the new AI video generator compare to Sora in terms of creating realistic environments?
-While both the new AI video generator and Sora are capable of creating compelling imagery, Sora tends to produce more action and clearly defined visuals in its environments. However, the new AI video generator also creates realistic-looking places, albeit with some minor discrepancies in movement or detail.
What is the future potential of AI video generation technology in film and media production?
-AI video generation technology can be used to create compelling imagery and can be integrated into full production processes. It allows for the creation of unique and surreal aesthetics, and with further development and refinement, it could play a significant role in film and media production.
Outlines
Introduction to a Potential Sora Rival AI Video Generator
The video script introduces a new AI video generator called 'Vidu', which is being compared to Sora, a yet-to-be-released model. The presenter acknowledges the irony of calling something a rival to Sora before Sora has launched. The video dives into the features of the new model, its potential to match Sora's quality, and the possibility of using it before Sora's release. The script also mentions a signup link for the audience. Vidu is developed by Shengshu Technology and Tsinghua University, and it targets 16-second clips at 1080p resolution. Its architecture is based on U-ViT (described in the video as the Universal Video Transformer), which the video traces to two research papers: DPM-Solver, for faster sampling from diffusion models, and 'All Are Worth Words', for combining Vision Transformers with a U-Net model. Vidu's strength lies in its ability to treat all elements as tokens and use long skip connections for coherent video generation.
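As a rough illustration of what "long skip connections" means in the 'All Are Worth Words' paper, where the outputs of shallow blocks are concatenated with the inputs of deep blocks and passed through a linear projection, here is a minimal Python sketch of the wiring. The `block`, `merge`, and `forward` helpers, the fixed weights, and the depth are all hypothetical stand-ins; this is not Vidu's implementation and contains no attention or learned parameters.

```python
# Toy sketch of U-Net-style long skip connections in a stack of blocks.
# Everything here (block behaviour, weights, depth) is hypothetical.


def block(tokens, scale):
    """Stand-in for a transformer block: scales every feature."""
    return [[scale * value for value in token] for token in tokens]


def merge(main, skip, w_main=0.5, w_skip=0.5):
    """Stand-in for 'concatenate, then project back to the original width'.

    A linear layer applied to the concatenation [main, skip] is a sum of
    one linear map of main and one of skip; fixed scalar weights replace
    those learned maps here, which is a very special case.
    """
    return [[w_main * m + w_skip * s for m, s in zip(m_tok, s_tok)]
            for m_tok, s_tok in zip(main, skip)]


def forward(tokens, depth=2):
    skips = []
    # Shallow half: remember each block's output for later reuse.
    for _ in range(depth):
        tokens = block(tokens, 2.0)
        skips.append(tokens)
    # Middle block.
    tokens = block(tokens, 1.0)
    # Deep half: merge the matching shallow output (last saved, first
    # used), then run the block on the merged tokens.
    for _ in range(depth):
        tokens = merge(tokens, skips.pop())
        tokens = block(tokens, 1.0)
    return tokens


out = forward([[1.0, 2.0]])
print(out)
```

The point of the wiring is that deep blocks receive low-level features directly from shallow blocks instead of only through the full stack.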
Analysis of Longer Vidu Outputs and Comparison with Sora
The script provides an analysis of full 16-second clips generated by Vidu, highlighting the references to Sora in the initial hype reel. It discusses the temporal coherence of the generated content, comparing it to Sora's outputs. The presenter appreciates the Midjourney V4 aesthetic of the TVs in one of the clips, Midjourney V4 being a favorite model of his. Another clip features a panda bear playing a guitar, which, while not the most realistic, still impresses with its background coherence and reactive shadow. A beach vacation villa clip showcases an interesting dissolve between shots, hinting at the model's ability to handle transitions. An imaginative clip with a ship in a bedroom demonstrates the model's reaction to movement and environmental interaction. The script also includes a brief comparison with Sora, noting that while Sora's environment realism is slightly superior, Vidu's output still looks like a real place. The presenter reminds the audience that both models have their strengths and that the examples shown are cherry-picked, with Sora also producing less consistent videos that require significant post-production work.
Post-Production Processes and Future of AI in Filmmaking
The video script concludes with a discussion of the post-production process necessary to refine AI-generated videos into a final product. It mentions the use of AI tools in creating compelling imagery and the effort that goes into making these videos look semi-consistent. The presenter references a production company's use of Sora to create a short film, 'Air Head', and the extensive cleanup required to achieve a polished result. The script also highlights the creative process used by Paul Trillo in his short film 'Notes to My Future Self', where AI imagery was integrated with traditional VFX techniques. Finally, the presenter provides a signup link for Vidu, noting a potential temporary issue with the website's submit button, and teases an upcoming interview with Adobe about Sora's integration into Premiere and future plans for After Effects.
Keywords
AI Video Generator
Sora
Shengshu Technology and Tsinghua University
Universal Video Transformer (U-ViT)
Temporal Coherence
Sizzle Reel
DPM-Solver
All Are Worth Words
Long Skip Connections
Midjourney V4 Aesthetic
Post-Production
Highlights
A new AI video generator, potentially rivaling Sora, has been revealed.
The AI can generate clips up to 16 seconds at 1080p resolution.
The model was developed by Shengshu Technology and Tsinghua University.
Vidu's architecture is based on U-ViT (described in the video as the Universal Video Transformer).
U-ViT combines Vision Transformers with a U-Net model for image generation.
The model treats all elements, including time, as tokens and utilizes long skip connections.
Vidu's output is compared to Sora, showing differences in temporal coherence and video generation methods.
Vidu's 16-second clips showcase temporal coherence and detailed visuals.
The AI-generated videos are noted for their aesthetic appeal, with a Midjourney V4 look.
Vidu's beach vacation video demonstrates an interesting dissolve effect.
A ship in a bedroom video shows the model's ability to react to water movement.
A side-by-side comparison with Sora reveals strengths in camera movement and environment realism.
The Tokyo walk sequence from Vidu shows the model's capability to handle complex scenes.
Sora's video generation still requires significant post-production work for consistency.
AI video generation technology is being used to create compelling imagery, as demonstrated by Paul Trillo's VFX breakdown.
Vidu has a sign-up link on their website, but the submit button may be temporarily broken due to high traffic.
Adobe's integration of Sora into Premiere and future plans for After Effects are discussed in an exclusive interview.