FREE Stable Diffusion 2.1 Is The Biggest Disappointment Yet?

9 Dec 2022 · 21:57

TLDR: The speaker, 'Overlord', discusses the recent release of Stable Diffusion 2.1, expressing mixed feelings about its capabilities compared to previous versions. They highlight the minor improvements and new features of 2.1, such as the ability to generate wider images and better-shaped hands, while also emphasizing that the differences are not substantial. Overlord critiques the community's expectations and the company's communication, advocating for patience and understanding as the technology continues to develop.


  • 🎥 The speaker, Overlord, discusses the release of Stable Diffusion 2.1 and shares mixed feelings about it.
  • 💭 Overlord was initially hesitant about creating content on the 2.1 release due to similarities with version 2.0.
  • 📌 A simplified video format is chosen to discuss the 2.1 release, its community impact, and future prospects.
  • 🔄 Detailed instructions are provided for installing Stable Diffusion 2.1, including model and YAML file downloads.
  • 🚀 The 2.1 version allows for super wide images, a feature not possible in previous versions.
  • 💻 Generating wide images with 2.1 requires a powerful GPU and may not be feasible for all users.
  • 🌟 The speaker believes that version 1.5 produces more realistic images than 2.0 or 2.1.
  • 🖌️ Art styles are back in 2.1, but the speaker finds the results underwhelming compared to 1.5.
  • 🤲 The 2.1 version shows slight improvement in hand generation, but the differences are minimal.
  • 👥 The community's reaction to the 2.0 and 2.1 releases has been mixed, with some feeling betrayed by the changes.
  • 💬 Stability AI has been criticized for poor communication and not being transparent about their decisions and improvements.

Q & A

  • What is the speaker's name and how does he feel about the release of Stable Diffusion 2.1?

    -The speaker's name is Overlord and he is not pleased with the release of Stable Diffusion 2.1.

  • What type of video is the speaker planning to make regarding Stable Diffusion 2.1?

    -The speaker plans to make a simpler, less edited video, which he refers to as a 'rant video', to discuss his opinions on the newest release, the community, and the future of Stable Diffusion.

  • What are the two different models available for Stable Diffusion 2.1?

    -The two different models available for Stable Diffusion 2.1 are the 768 model and the 512 model.

  • What is the speaker's recommendation regarding the two models of Stable Diffusion 2.1?

    -The speaker recommends downloading the 768 model as it provides the best results based on his testing.

  • What changes were made in Stable Diffusion 2.1 compared to version 2.0?

    -Stable Diffusion 2.1 brought back some features from version 1.5, such as art styles and more realistic celebrity images, and allowed for the creation of super wide images.

  • What is the speaker's opinion on the quality of images generated by Stable Diffusion 2.1?

    -The speaker believes that while 2.1 produces better shaped hands and some cool images, especially landscapes, the differences between 2.0 and 2.1 are minimal and not significantly better.

  • How does the speaker feel about the community's reaction to Stable Diffusion 2.0 and 2.1?

    -The speaker feels that the community is acting too entitled and should appreciate the free access to this technology, which did not exist a few months ago.

  • How does the speaker compare Stable Diffusion with other AI models like Midjourney?

    -The speaker feels that Stable Diffusion is currently far behind other AI models like Midjourney in terms of ease of use and quality of generated images.

  • What issue does the speaker have with Stability AI's communication?

    -The speaker criticizes Stability AI for being secretive and not effectively communicating the reasons and details behind their model updates and decisions.

  • What is the speaker's final verdict on Stable Diffusion 2.1?

    -The speaker concludes that Stable Diffusion 2.1 is an improvement but not by much, and he still believes that version 1.5 is better at generating images.

  • How can someone try out Stable Diffusion 2.1 without installing it on their own computer?

    -A person can try out Stable Diffusion 2.1 quickly by using a platform like RunPod to rent a GPU for a few cents, or by using Runboard to install and use InvokeAI directly on the website.



🚀 Introduction to Stable Diffusion 2.1

The speaker begins by addressing the audience and expressing dissatisfaction with the recent release of Stable Diffusion 2.1. They debate whether to make a video discussing it but ultimately decide to create a simplified video to cover various topics related to the new release, including installation instructions, differences between versions, and personal opinions on the update. The speaker emphasizes that their commentary will be subjective but aims to provide a comprehensive overview of Stable Diffusion 2.1.


📋 Installation and Version Comparison

The speaker provides a detailed guide on how to install Stable Diffusion 2.1, recommending the 768 model for better results. They explain the process of downloading the necessary files, including the YAML files, and modifying the web UI's webui-user.bat file to adjust precision settings. The speaker then compares versions 1.5, 2.0, and 2.1, arguing that 1.5 produces more realistic images, especially in terms of texture and detail. They also mention that while 2.1 has some improvements, such as better-shaped hands, the differences are minimal and depend on the type of images generated.
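The file-download step above can be sketched as a quick sanity check. This is a minimal sketch, assuming the common web UI convention that each 2.x checkpoint needs a YAML config with the same base name sitting beside it; the function name, the folder argument, and the list of checkpoint extensions are hypothetical and not taken from the video.

```python
from pathlib import Path

# Assumption: model checkpoints use one of these file extensions.
CHECKPOINT_SUFFIXES = {".ckpt", ".safetensors"}


def checkpoints_missing_yaml(models_dir):
    """Return names of checkpoints in models_dir with no same-named .yaml beside them."""
    folder = Path(models_dir)
    missing = []
    for path in sorted(folder.iterdir()):
        is_checkpoint = path.suffix.lower() in CHECKPOINT_SUFFIXES
        # with_suffix swaps the extension, e.g. model.ckpt -> model.yaml
        if is_checkpoint and not path.with_suffix(".yaml").exists():
            missing.append(path.name)
    return missing
```

Pointing the function at the folder that holds the downloaded models lists any checkpoint that still needs its config file copied and renamed.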


🎨 Art Styles and Image Quality

The speaker discusses the art styles and image quality in Stable Diffusion 2.1, noting that while the new version allows for more artistic styles, the results are not as impressive as they could be. They compare images generated using different art styles in 2.1 with those from 1.5 and find the latter to be more aesthetically pleasing and stylized. The speaker also touches on the community's reaction to the changes in Stable Diffusion, expressing disappointment in the company's communication and the sense of entitlement within the community.


🌐 Community and Future of Stable Diffusion

The speaker reflects on the community's reaction to the 2.0 release, which divided opinions due to its shift towards ultra-realistic images and away from the diverse art styles of 1.5. They express concern that Stable Diffusion is lagging behind other AI models like Midjourney in terms of ease of use and image quality. The speaker also criticizes Stability AI's communication with the community, noting the lack of clear explanations for their decisions. Despite these issues, they remain hopeful for the future of Stable Diffusion, especially with the upcoming release of DreamBooth 2.0.


📢 Final Thoughts and Community Engagement

The speaker concludes by reiterating that while Stable Diffusion 2.1 has some improvements, it is still significantly behind competitors like Midjourney. They encourage the community to try the 2.1 version for themselves and to share their thoughts. The speaker also acknowledges the support from their features and supporters, congratulating the week's AR challenge winner and inviting viewers to join their Discord server for further engagement.



💡Stable Diffusion

Stable Diffusion is an AI model used for text-to-image generation. It is the main subject of the video, with the speaker discussing its different versions and their capabilities. The speaker compares the performance of Stable Diffusion 2.1, 2.0, and 1.5, noting differences in image quality and style.

💡Version 2.1

Version 2.1 is a specific iteration of the Stable Diffusion AI model. It is described as a minor improvement over the previous version, with some new features like the ability to generate super wide images and slightly better hands. However, the speaker expresses that the differences are not significant enough to warrant a switch from version 1.5.


💡Installation

Installation refers to the process of downloading and setting up the Stable Diffusion models on one's computer. The speaker provides a step-by-step guide on how to install version 2.1 of Stable Diffusion, including the download of specific files and the configuration of the web UI.

💡Art Styles

Art Styles refer to the different visual aesthetics that can be applied to the images generated by the Stable Diffusion model. The speaker mentions that while version 2.1 has reintroduced some art styles from version 1.5, the results are not as impressive as one might hope.

💡Celebrity Images

Celebrity Images refer to the AI-generated images of well-known personalities. The speaker discusses the challenges of generating realistic celebrity images with Stable Diffusion 2.1, noting that while the hands may be improved, the overall likeness of the celebrity can be compromised.


💡Community

Community in this context refers to the group of users and enthusiasts of Stable Diffusion. The speaker addresses the community's mixed reactions to the new versions of the AI, highlighting the division between those who appreciate the improvements and those who feel the models have regressed in quality.


💡Midjourney

Midjourney is another AI model for text-to-image generation, mentioned as a comparison to Stable Diffusion. The speaker suggests that Midjourney currently outperforms Stable Diffusion in terms of ease of use and quality of generated images.


💡Quality

Quality refers to the standard or level of excellence of the AI-generated images. The speaker evaluates the quality of images produced by different versions of Stable Diffusion and compares it with other AI models like Midjourney.

💡Negative Prompts

Negative prompts are specific instructions given to the AI model to avoid certain elements in the generated images. The speaker mentions using a combination of negative prompts to achieve desired art styles in the images produced by Stable Diffusion 2.1.


💡GPU

GPU stands for Graphics Processing Unit, which is a hardware component used for rendering images and performing complex calculations. The speaker discusses the need for a powerful GPU to generate certain types of images with Stable Diffusion 2.1, particularly the super wide images.
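To give a rough sense of why wide images are harder on the GPU, here is a minimal sketch that computes the size of the latent the model works on. It assumes the 8x downsampling and 4 latent channels of Stable Diffusion's autoencoder; the function names and the example resolutions are hypothetical, and the result is a relative measure of work, not a VRAM prediction.

```python
def latent_shape(width, height, downsample=8, channels=4):
    """Return (channels, latent_height, latent_width) for a width x height image."""
    if width % downsample or height % downsample:
        raise ValueError("width and height must be multiples of the downsample factor")
    return (channels, height // downsample, width // downsample)


def relative_latent_area(width, height, base=768):
    """Ratio of latent positions for width x height versus a base x base image."""
    _, h, w = latent_shape(width, height)
    _, base_h, base_w = latent_shape(base, base)
    return (h * w) / (base_h * base_w)


if __name__ == "__main__":
    # Hypothetical wide resolution compared with the model's native 768 x 768.
    print(latent_shape(2048, 768))
    print(relative_latent_area(2048, 768))
```

Because the attention layers compare latent positions with each other, their cost tends to grow faster than this ratio, which fits the speaker's point that very wide images may not be feasible on smaller cards.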


💡Communication

Communication in this context refers to the way Stability AI, the company behind Stable Diffusion, communicates with its user community. The speaker criticizes the company's communication style, noting that it is cryptic and not transparent about the changes and improvements in their AI models.


Stable Diffusion 2.1 was recently released, but the speaker, Overlord, expresses dissatisfaction with the update.

Overlord debates whether to make a video about the new release, considering there isn't much new to say compared to version 2.0.

The video aims to be a simpler, less edited version, focusing on the speaker's opinions and the future of Stable Diffusion.

Instructions for installing Stable Diffusion 2.1 on a personal computer are provided at the beginning of the video.

Two different models are available for download: the 768 model and the 512 model. The 768 model is recommended for better results.

The speaker advises against downloading models with pickle imports due to potential virus risks, except for those released by Stability AI.

The differences between versions 2.1, 2.0, and 1.5 are discussed, with the speaker favoring the 1.5 version for its image quality.

Version 2.1 allows for the creation of super wide images, a feature not possible in previous versions.

The speaker criticizes the community for being entitled and unappreciative of the free access to the technology.

Stable Diffusion is seen as lagging behind other text-to-image AIs like Midjourney in terms of ease of use and image quality.

The speaker praises Stable Diffusion's versatility and potential for customization but notes its current shortcomings.

Stability AI's communication strategies are criticized for being unclear and secretive.

The future of Stable Diffusion is bright, with anticipated improvements and the release of DreamBooth 2.0.

The speaker has stopped using Stable Diffusion for thumbnail creation due to its limitations compared to other AIs.

The video encourages viewers to try Stable Diffusion 2.1 for themselves and provides links for easy access.

The speaker concludes by acknowledging the video's unusual format and invites feedback from viewers.
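
The pickle-import warning in the highlights above can be illustrated with a small inspection helper. This is a minimal sketch using only Python's standard library: it lists the module and name pairs a pickle stream would import, without actually loading it. The function name is hypothetical, the handling of `STACK_GLOBAL` is a heuristic, and the sketch assumes the pickle bytes have already been extracted from the checkpoint file (PyTorch checkpoints are typically zip archives containing a pickle). It is not a replacement for a real scanner.

```python
import io
import pickle
import pickletools


def pickle_global_imports(data):
    """List (module, name) pairs referenced by a pickle stream, without unpickling it."""
    found = []
    recent_strings = []  # string arguments seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(io.BytesIO(data)):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name"
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Heuristic: module and name are usually the last two strings pushed.
            if len(recent_strings) >= 2:
                found.append((recent_strings[-2], recent_strings[-1]))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


if __name__ == "__main__":
    import collections

    sample = pickle.dumps(collections.OrderedDict(), protocol=4)
    print(pickle_global_imports(sample))
```

Anything in the returned list beyond the expected tensor and container types is a reason to be cautious before loading the file.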