Chinese Company Unveils SORA Competitor - "Vidu" AI Video Generator
TLDR A Chinese company, Shengshu Technology, has announced a new AI video generator called Vidu, positioned as a competitor to OpenAI's Sora. Vidu is claimed to generate 16-second 1080p video clips with a single click, using a self-developed architecture known as Universal Vision Transformer (U-ViT). This architecture combines the strengths of diffusion and Transformer models, both of which have been pivotal in the advancement of generative AI. The company's research team first proposed the core technology of U-ViT in September 2022, predating Sora's model architecture. Vidu's capabilities are showcased in a reel of realistic videos, although some generated scenes contain noticeable inconsistencies. The company is currently accepting applications for access to the tool at shanguai.com. This development highlights the competitive landscape in AI, with China emerging as a significant player alongside tech giants in the US.
Takeaways
- A Chinese company named Shengshu Technology has announced a new AI video generator called 'Vidu', which is positioned as a competitor to OpenAI's Sora.
- Vidu is claimed to generate a 16-second 1080p video clip with a single click, using a self-developed architecture known as Universal Vision Transformer (U-ViT).
- The U-ViT architecture integrates two kinds of AI model, diffusion and Transformer, which is seen as an advancement in generative AI that overcomes some limitations of previous models.
- The Transformer model, which Vidu uses, is adept at understanding context, which should in theory lead to more coherent and accurate video and image generation.
- Vidu's research team first proposed the core technology of U-ViT in September 2022, before Sora's model architecture.
- In a side-by-side comparison, Vidu generates hands well, with realistic detail, although some elements such as hair and leaves are inconsistent.
- Although the Vidu showcase videos are lower in resolution, the Global Times article states that Vidu can output 1080p video.
- To apply for access to Vidu, interested parties can fill out a form on the Shengshu website, leaving their contact details for a marketing consultant to follow up.
- Other recent advancements from China in the AI space include a new language model and a high-speed robot, indicating a surge in innovation and competition.
- While Vidu showcases impressive capabilities, the presenter suggests it may not yet be on par with OpenAI's still-unreleased Sora.
- The presenter encourages viewers to share their thoughts on Vidu and to say whether they will apply for access.
Q & A
What is the name of the AI video generator announced by the Chinese company Shengshu Technology?
-The name of the AI video generator is 'Vidu'.
What is the core technology behind Vidu's AI video generator?
-The core technology behind Vidu is the Universal Vision Transformer (U-ViT), which integrates two kinds of AI model: the diffusion model and the Transformer model.
What is Vidu claimed to generate with a single click?
-Vidu is claimed to generate a 16-second 1080p video clip with just one click.
What are some of the limitations of diffusion models such as Stable Diffusion?
-Their limitations include generating text poorly and struggling to understand context or follow more complicated prompts.
How does the Transformer model improve upon the diffusion model?
-The Transformer model, which is good at understanding context, can be merged with the diffusion model to create more coherent and accurate images or videos.
Who is Zhu Jun and what is his role in the development of Vidu's technology?
-Zhu Jun is the vice dean of the Institute for AI at Tsinghua University and the chief scientist at Shengshu. He states that Sora, once released, turned out to align closely with the team's technical roadmap, which motivated them to advance their research.
What are some of the features that make Vidu's AI video generator stand out?
-Vidu's AI video generator stands out for its ability to generate realistic hands with five fingers, and its overall realistic and coherent video generation capabilities.
How does Vidu's video quality compare to Sora's?
-While Vidu produces high-quality videos, the details in its videos may not be as crisp or sharp due to a lower resolution compared to Sora's full HD videos. However, the Global Times article mentions that Vidu can output 1080p.
What is the process to apply for access to Vidu?
-To apply for access, visit shanguai.com, scroll down to the video generation section, and fill out a form with your name, phone number, and company name. A marketing consultant then follows up.
How does the emergence of Vidu and other Chinese AI advancements impact the global AI race?
-The emergence of Vidu and other Chinese AI advancements shows that other countries might not be far behind in the AI race, providing more competition and potentially driving innovation in the field.
What are some other recent significant AI developments from China?
-Other recent significant AI developments from China include the launch of SenseNova 5.0 by the Chinese company SenseTime, which reportedly beats GPT-4 Turbo on nearly all benchmarks, and the unveiling of the S1 robot by the company Astribot.
What is the general sentiment towards competition in the AI video generation space?
-The general sentiment is positive towards competition in the AI video generation space, as it is believed to drive innovation and lead to better products and services.
Outlines
Introduction to Shengshu's AI Video Generator
The video script introduces a new AI video generator developed by a Chinese company named Shengshu Technology. The generator, called Vidu, is claimed to be a competitor to OpenAI's Sora. It is said to create a 16-second 1080p video clip with a single click, using a self-developed architecture known as Universal Vision Transformer (U-ViT). U-ViT combines the strengths of diffusion and Transformer models, which are pivotal in the evolution of generative AI. The script discusses the limitations of previous models and how U-ViT aims to overcome them by adding better context understanding and coherence. The video also compares Vidu's output with Sora's, noting that while Vidu's results are impressive, they may not yet match the unreleased Sora.
Comparative Analysis of Video Generation Technologies
This paragraph presents a side-by-side comparison between videos generated by Vidu and by OpenAI's Sora. It highlights the quality and realism of the generated videos, pointing out specific details such as the accurate depiction of hands and the consistency with which elements transform within the video. The narrator also notes inconsistencies in Vidu's videos, such as a green leaf disappearing and a misrepresentation of a wooden toy ship on a carpet. Despite these flaws, Vidu's video quality is acknowledged to be good, although the showcase clips are not in full HD resolution. The script also describes the application process for using Vidu and provides a link to the company's website.
Global AI Competition and Recent Chinese Innovations
The final paragraph shifts the focus to the broader landscape of AI development, emphasizing recent advancements by Chinese companies in the AI space. It mentions the unveiling of a new language model and a high-speed robot by other Chinese companies, suggesting that the global AI race is becoming more competitive. The script expresses enthusiasm for the unveiling of Vidu, since it adds competition and may push the boundaries of what is possible in video generation. The narrator encourages viewers to share their thoughts on Vidu and whether they plan to apply for access.
Keywords
AI Video Generator
Sora
Universal Vision Transformer (U-ViT)
Diffusion Model
Transformer Model
Generative AI
Shengshu Technology
Stable Diffusion
Runway and Pika
Resolution
WeChat Page
Highlights
Chinese company Shengshu Technology announces a new AI video generator, Vidu, as a competitor to Sora.
Vidu is claimed to be on par with OpenAI's Sora.
Vidu is said to generate a 16-second 1080p video clip with a single click.
The technology is built on a self-developed visual transformation model architecture called Universal Vision Transformer (U-ViT).
U-ViT merges the diffusion and Transformer models, which is considered the next step in generative AI.
The Transformer model is known for its ability to understand context, which could improve the coherence of generated content.
The core technology of U-ViT was proposed by Vidu's research team before Sora's model architecture.
Vidu's video generator produces realistic hands with detail.
Comparisons between Vidu and Sora show potential for Vidu to be a close competitor.
Vidu's video generation has some inconsistencies, such as transforming hair into a red ribbon.
Vidu's videos are currently in 720p resolution, whereas Sora's are in full HD.
The Global Times article states that Vidu can output 1080p videos.
To apply for access to Vidu, one can visit shanguai.com and fill out a form.
Other recent advancements from China in the AI space include a new language model and a high-speed robot from other Chinese companies.
The new language model from a Chinese company reportedly beats GPT-4 Turbo on nearly all benchmarks.
The unveiling of Vidu provides more competition in the AI video generation space.
Competition in the AI field is seen as beneficial for innovation and progress.