Sakana Evolutionary Model Merge - and other AI News

Olivio Sarikas
23 Mar 2024 (10:02)

TLDR: The video script discusses the latest advancements in AI, highlighting innovative projects like an avatar generator, a full-resolution photo glitch effect creator, and a Kawaii pet creator. It introduces Google's Vlogger project, which generates complete videos from audio and image inputs, and Sakana AI's evolutionary model merge. The script also touches on the potential of AI in 3D modeling and the integration of AI with real-world applications, such as Elon Musk's Neuralink chip. The presenter reflects on the rapid pace of AI development and its impact on creativity and reality perception.


  • 🎨 The AI avatar generator creates consistent avatars with varying facial expressions using face detailer technology.
  • 🔄 A randomization process is showcased, producing endless unique character prompts with each click of a cube button.
  • 🐾 An AI pet creator is demonstrated, which generates pets in the style of an input image, reflecting an interest in anime style.
  • 🎥 Google's 'Vlogger' project uses audio and image inputs to generate complete videos, including body and facial movements.
  • 🧬 Sakana AI's 'Evolutionary Model Merge' merges different AI models and tests them in an evolutionary manner to find the best performing model.
  • 📹 'Stable Video 3D' technology allows for the creation of high-quality 3D rotational videos and potential 3D printing of objects.
  • 🌟 'AnimateDiff Lightning' is a tool that enables very fast video generation, though the quality may not be as high.
  • 🎭 Meta's project uses AI and language models to understand and navigate spaces, creating virtual environments for AI training.
  • 🤖 AI is rapidly creating and merging information, necessitating its use to manage and interpret the vast data outputs.
  • 👓 The integration of AI with reality is increasing, making it challenging to distinguish between AI-generated and handcrafted content.

Q & A

  • What is the main topic of the video script?

    -The main topic of the video script is the presentation of various AI projects and their implications on the future of technology and society.

  • What are the three workflows created for Patreon supporters?

    -The three workflows created for Patreon supporters include an avatar generator that produces different facial expressions while maintaining character consistency, a randomization tool that generates endless unique character prompts, and a pet creator that uses image input to produce anime-style pets.

  • How does the 'vlogger' project from Google work?

    -The 'vlogger' project from Google uses audio input and an image to create a complete video. It renders not only the lip movements but also the body, head movements, and facial expressions in a way that fits the audio, suggesting potential future uses of AI in video creation.

  • What is the concept behind Sakana AI's 'evolutionary model merge'?

    -The 'evolutionary model merge' concept involves merging different AI models and testing them against each other in an evolutionary manner to determine which model combination performs the best, guided by the AI itself.

  • How does the 'stable video 3D' technology work?

    -The 'stable video 3D' technology creates a rotation video around an object, resulting in a high-quality 3D representation. It does not create a 3D mesh from a single image but can be used to generate a 3D mesh and potentially a physical object through 3D printing in the future.

  • What is the purpose of the 'AnimateDiff Lightning' tool?

    -The 'AnimateDiff Lightning' tool is designed to generate animations for videos very quickly. While the quality may not be as high, it is useful for testing different prompts and concepts.

  • What is the significance of the project by Meta that uses AI and language models to understand space?

    -The project by Meta leverages language models to deduce information about a space, such as identifying walls, windows, or doors, based on logical progression similar to predicting the next word in a sentence. This approach helps in understanding and enriching the environment, and it can be applied in various practical ways, such as guidance through spaces or estimating the size and weight of objects.

  • What is the real-world application mentioned in the script involving a Neuralink chip?

    -The real-world application mentioned is the implantation of a Neuralink chip into a person's brain, which allows them to control a computer mouse and play a chess game using only their thoughts. Neuralink is a company founded by Elon Musk.

  • How does the rapid advancement of AI affect the creation and perception of art?

    -The rapid advancement of AI, particularly in image generation, has led to a flood of high-quality outputs, which can make hand-drawn or hand-crafted art seem less impressive in comparison. This shift challenges artists to elevate their skills and adapt to the changing landscape of art and technology.

  • What are the implications of AI's role as a companion alongside humans?

    -AI's role as a companion alongside humans suggests a collaborative future where AI assists in various tasks, from content creation to inputting our thoughts and ideas. This partnership can enhance productivity, innovation, and quality of life but also raises questions about the nature of human creativity and the potential for over-reliance on technology.

  • How does the script suggest the future of AI development?

    -The script suggests that the future of AI development will be characterized by rapid iteration, increased integration with reality, and the need for AI to help manage and interpret the vast amounts of information being generated. It also hints at the potential for AI to play a significant role in everyday life and in addressing challenges faced by individuals with disabilities or special needs.



🎨 AI in Art and Creativity

The first paragraph discusses various AI-powered creative tools and workflows developed for Patreon supporters. It highlights an avatar generator that produces consistent character avatars with varying facial expressions using face detailer technology. Additionally, it mentions a randomization feature for endless character creation and introduces a full-size, high-resolution photo integration in stable diffusion with a glitch effect. The paragraph also talks about an image-to-image AI concept using Allur to create anime-style pet avatars. The main theme revolves around the innovative use of AI in enhancing and expanding creative possibilities in art and design.


🤖 Advancements in AI and their Real-world Applications

The second paragraph delves into recent AI projects and their implications. It begins with 'Vlogger,' a Google project that uses audio input and images to generate complete videos, including body movements and facial expressions. The discussion then moves to the concept of AI influencers and the potential for AI to replace human imperfections in media. The paragraph also touches on the rapid iteration and improvement in AI, making it harder for individuals to establish a niche. It continues with a project by Sakana AI, which focuses on the evolution of AI models through merging and testing. The paragraph concludes with a mention of 'Stable Video 3D' and its high-quality results, suggesting a future where AI-generated images could be turned into physical objects through 3D printing.



💡AI news

AI news refers to the latest developments and breakthroughs in the field of Artificial Intelligence. In the context of the video, it highlights the presenter's focus on sharing exciting and innovative AI advancements that range from avatar generation to neural link chips.

💡Patreon supporters

Patreon supporters are individuals who financially contribute to a content creator's work on the Patreon platform. In this video, the creator has developed exclusive workflows for these supporters, showcasing unique AI applications and concepts.

💡Avatar generator

An avatar generator is a software tool or AI system that creates digital representations or characters, often used as a user's profile picture in online communities. In the video, the creator uses an avatar generator to produce avatars with consistent details but varying facial expressions.

💡Face detailer

Face detailer is a technology or tool that manipulates or alters facial features and expressions in images or videos. In the video, it is used to change the emotions of the generated avatars, demonstrating the precision with which AI can modify visual content.


💡Randomization

Randomization is the process of generating a variety of outcomes from a set of possibilities without any predictable pattern. In the context of the video, it refers to the creation of endless, unique AI-generated prompts or characters by clicking a button.
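
As a minimal sketch of how such a prompt randomizer could work, the Python snippet below picks one entry from each of several word lists and joins them into a prompt. The lists, the prompt template, and the function name `random_prompt` are hypothetical and are not taken from the workflow shown in the video.

```python
import random

# Hypothetical option lists; a real workflow would use much larger ones.
AGES = ["young", "middle-aged", "elderly"]
ROLES = ["pirate", "scientist", "knight", "musician"]
HAIR = ["red", "silver", "black", "blue"]
STYLES = ["anime", "watercolor", "cyberpunk"]


def random_prompt(rng: random.Random) -> str:
    """Build one character prompt by sampling each list independently."""
    return (
        f"portrait of a {rng.choice(AGES)} {rng.choice(ROLES)} "
        f"with {rng.choice(HAIR)} hair, {rng.choice(STYLES)} style"
    )


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so each run ("click") differs
    for _ in range(3):
        print(random_prompt(rng))
```

With four short lists this yields only 3 × 4 × 4 × 3 = 144 combinations; the number grows multiplicatively as lists and entries are added.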

💡Stable diffusion

Stable Diffusion is an AI image generation model that produces images from textual descriptions. The video discusses a project that uses this technology, but with the added ability to incorporate full-size, high-resolution photos.

💡Anime style

Anime style refers to the distinctive visual art style often associated with Japanese animated TV shows and movies. In the video, the creator discusses using AI to generate anime-style images, demonstrating the versatility of AI in mimicking and producing content in various artistic styles.


💡Vlogger

Vlogger refers to a video blogger who creates and shares content on the internet, often through platforms like YouTube. In the context of the video, 'Vlogger' is a project by Google that uses AI to generate complete videos from audio and image inputs, including body movements and facial expressions.

💡Evolutionary model merge

Evolutionary model merge refers to the concept of merging different AI models and testing them against each other to see which combination performs the best, guided by evolutionary principles. This approach aims to improve and refine AI models through a process similar to natural selection.
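
The merge-and-select loop can be illustrated with a toy Python sketch. It is a simplified stand-in and not Sakana AI's actual method: the "models" are plain lists of numbers, merging is a linear interpolation with a single mixing weight, and the score is a squared error against a made-up target. The names `merge`, `fitness`, and `evolve_merge` are hypothetical.

```python
import random


def merge(model_a, model_b, alpha):
    """Blend two parameter lists: alpha * a + (1 - alpha) * b."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(model_a, model_b)]


def fitness(params, target):
    """Toy score: negative squared error to a target (higher is better)."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def evolve_merge(model_a, model_b, target,
                 generations=30, population=20, seed=0):
    """Search for a good mixing weight by selection and mutation."""
    rng = random.Random(seed)

    def score(alpha):
        return fitness(merge(model_a, model_b, alpha), target)

    # Start from random mixing weights in [0, 1].
    candidates = [rng.random() for _ in range(population)]
    for _ in range(generations):
        # Selection: keep the best quarter of the population.
        candidates.sort(key=score, reverse=True)
        survivors = candidates[: population // 4]
        # Mutation: children are noisy copies of survivors, clamped to [0, 1].
        children = [
            min(1.0, max(0.0, rng.choice(survivors) + rng.gauss(0, 0.05)))
            for _ in range(population - len(survivors))
        ]
        candidates = survivors + children
    return max(candidates, key=score)


if __name__ == "__main__":
    best = evolve_merge([0.0, 0.0], [1.0, 1.0], target=[0.3, 0.3])
    print(f"best mixing weight: {best:.3f}")
```

In this toy setup the merged parameters equal 1 - alpha, so the search should settle near a mixing weight of 0.7. A real system would score each merged model on benchmark tasks rather than against a known target.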

💡Stable video 3D

Stable Video 3D is a model that generates a rotation video around an object, giving a high-quality view of it from multiple angles. In the video, the creator has made a tutorial on using this technology locally, which allows for the creation of rotation videos around an object, potentially leading to 3D mesh creations.

💡Neuralink chip

A Neuralink chip is a type of brain-computer interface implant that allows for direct communication between the brain and external devices. In the video, it is mentioned as a significant development where a person can control a computer mouse and play chess using only their thoughts.

💡AI and language models

AI and language models refer to the use of artificial intelligence in developing systems that can understand, interpret, and generate human language. In the video, this concept is used by Meta to help AI understand the space around them by using language logic, rather than relying solely on visual data.


AI workflows for Patreon supporters showcasing avatar generators and randomization for endless character creation.

Use of face detailer to change the emotion of the face in avatars while maintaining character consistency.

Experiment with full-size, full-resolution photos in stable diffusion, rendering glitch effects over the images.

Kawaii Pet Creator, an interesting concept using image-to-image AI to generate pets in anime style.

Google's project Vlogger, which uses audio input and images to create complete videos, including body and facial movements.

The future of AI use as envisioned by Google, including the potential for AI influencers and stylized representations in media.

Sakana AI's evolutionary model merge, an approach that merges different models and tests them against each other for performance.

Stable video 3D, a tutorial on creating high-quality 3D rotational videos and the potential for 3D printing from these models.

AnimateDiff Lightning, a tool for fast AI video generation within ComfyUI.

Meta's project using AI and language models to understand and navigate spaces, overlaying information for guidance.

The creation of 100,000 virtual environments for training AI in spatial understanding.

The first person with a Neuralink chip, capable of controlling a computer mouse with their thoughts.

AI's role in creating and merging models, explaining complex data, and serving as an input device for human thoughts.

The challenge of differentiating between AI-generated and handcrafted works due to the high quality of AI outputs.

The rapid integration of AI creations with reality and its impact on our perception and valuation of art and skill.