the best REALISTIC models for Stable Diffusion

James Beltman
26 Jul 2023 08:44

TLDR: The video discusses the best models for creating highly realistic images using Stable Diffusion. The presenter shares their favorite model, Epic Realism, which excels at capturing facial details. They provide tips for using the model, such as keeping prompts simple, using specific parameters like steps and CFG scale, and experimenting with different samplers. The presenter also recommends using an upscaler for better resolution and details. They then explore the Magic Mix model, which is particularly effective for dramatic and dark scenes but has limitations in facial generation. Optimal settings and strategies for this model are also discussed. Lastly, the Analog Madness model is introduced for its versatility and ability to generate images of ordinary individuals, with an emphasis on crafting vivid prompts for captivating results. The video concludes with a reminder to visit the presenter's website for more guides.

Takeaways

  • 🎨 **Epic Realism** is a favorite model for creating lifelike images, especially noted for capturing facial details.
  • ✅ **Prompt Simplicity** is crucial; avoid extra keywords like 'Masterpiece' or '8K', but include negative keywords like 'cartoon' to maintain realism.
  • 🔍 **Fine-Tuning Parameters** such as steps (20 or higher), CFG scale (the author recommends 5), and the choice of sampler (DPM++ SDE Karras or DPM++ 2M Karras) are essential for quality and realism.
  • 📈 **High-Resolution Upscaling** improves detail; the NMKD Superscale or NMKD Faces upscaler with a denoising strength of 0.35 and an upscale factor of 2 is recommended.
  • 🚫 **Effective Use of Negatives** helps refine the image and counteract biases, like the tendency to generate East Asian women in many models.
  • 💡 **Lighting Details** should be handled by the model without extra keywords, and avoid 'cinematic' for a more natural effect.
  • 🖼️ **Magic Mix Model** is praised for dramatic and dark scenes, but has limitations with facial generation, often generating East Asian women with a slim face filter look.
  • 🔧 **Optimizing Magic Mix** involves using a sampler like Euler a or DPM++ 2M Karras, 20 to 40 steps, and a high-res upscale with settings tuned for the best results.
  • 🌟 **Analog Madness** stands out for its versatility and dynamism, capable of generating images of ordinary individuals with high realism.
  • 📝 **Crafting Prompts** is key with Analog Madness; vivid and robust prompts lead to more captivating outputs.
  • ⚙️ **Workflow with Analog Madness** includes using the DPM++ SDE Karras sampler, 25 to 35 steps, and the default CFG scale of 7 for the best results.
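
The per-model settings in the takeaways above can be collected into a small lookup table. The following is only a sketch: the `Preset` class and `check_settings` helper are hypothetical names introduced for illustration, sampler names are written the way the AUTOMATIC1111 web UI usually lists them (an assumption about what the transcript meant), and the Magic Mix CFG range of 6 to 8 is taken from the Highlights section of this page.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Preset:
    """Recommended settings for one model (hypothetical helper type)."""
    samplers: Tuple[str, ...]
    min_steps: int
    max_steps: Optional[int]  # None: the summary gives no upper bound
    cfg_min: float
    cfg_max: float


# Values as stated in the summary; nothing here is an official default.
PRESETS = {
    "Epic Realism": Preset(("DPM++ SDE Karras", "DPM++ 2M Karras"), 20, None, 5, 5),
    "Magic Mix": Preset(("Euler a", "DPM++ 2M Karras"), 20, 40, 6, 8),
    "Analog Madness": Preset(("DPM++ SDE Karras",), 25, 35, 7, 7),
}


def check_settings(model: str, sampler: str, steps: int, cfg_scale: float) -> List[str]:
    """Return notes for every setting that falls outside the recommendation."""
    preset = PRESETS[model]
    notes = []
    if sampler not in preset.samplers:
        notes.append(f"sampler {sampler!r} is not one of {preset.samplers}")
    too_low = steps < preset.min_steps
    too_high = preset.max_steps is not None and steps > preset.max_steps
    if too_low or too_high:
        notes.append(f"steps={steps} is outside the recommended range")
    if not (preset.cfg_min <= cfg_scale <= preset.cfg_max):
        notes.append(f"cfg_scale={cfg_scale} is outside the recommended range")
    return notes


if __name__ == "__main__":
    print(check_settings("Analog Madness", "DPM++ SDE Karras", 30, 7))
    print(check_settings("Epic Realism", "Euler a", 10, 5))
```

An empty list means the chosen settings match the recommendation. The recommendations are starting points, so a non-empty list is a hint rather than an error.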

Q & A

  • What is the title of the video transcript discussing?

    -The title of the video transcript is 'the best REALISTIC models for Stable Diffusion'.

  • Which model is currently one of the speaker's favorites for creating lifelike images?

    -Epic Realism is one of the speaker's current favorites for creating lifelike images.

  • What is the key to maintaining the perfect balance between quality and realism in the Epic Realism model?

    -The key to maintaining the perfect balance between quality and realism lies in fine-tuning several parameters, including steps, CFG scale, and the choice of sampler.

  • What are some of the negative keywords that should be included in prompts to enhance the realistic qualities of the generated images?

    -Negative keywords such as 'cartoon', 'painting', and 'illustration' should be included to enhance the realistic qualities of the generated images.

  • What is the recommended setting for the CFG scale in the Epic Realism model?

    -The author recommends setting the CFG scale to five in the Epic Realism model.

  • Which samplers are suggested for achieving an extra dose of realism with the Epic Realism model?

    -For an extra dose of realism with the Epic Realism model, the suggested samplers are DPM++ SDE Karras or DPM++ 2M Karras.

  • What are the recommended upscaling tools and settings to improve the level of detail on the generated image?

    -The recommended upscalers are NMKD Superscale or NMKD Faces, with a denoising strength of 0.35 and an upscale factor of 2.

  • What bias do most realistic models for Stable Diffusion tend to show?

    -Most realistic models for Stable Diffusion tend to be biased towards generating East Asian women.

  • What is the unique strength of the Magic Mix model?

    -The Magic Mix model has a unique strength in the realm of dramatic and dark lit scenes, bringing out moodiness and mystery in the generated images.

  • What are the recommended parameters for the Analog Madness model?

    -For the Analog Madness model, the recommended parameters include using the DPM++ SDE Karras sampler, maintaining a range of 25 to 35 steps, and leaving the CFG scale at the default of 7.

  • What is the importance of crafting specific and pointed prompts for the Analog Madness model?

    -Crafting specific and pointed prompts is crucial for the Analog Madness model as it significantly enhances the output's realism and detail, particularly when generating figures that do not look like professional models.

  • How can one download additional models for Stable Diffusion as mentioned in the video?

    -To download additional models for Stable Diffusion, follow the link provided in the video description, download the desired model into the models folder of the Stable Diffusion web UI, and then select it in the application.
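
As a rough sketch of the "download into the models folder" step, the helper below moves a downloaded checkpoint into place. It assumes the standard AUTOMATIC1111 layout, in which checkpoints live in `models/Stable-diffusion` under the web UI root. The function name `install_checkpoint` and the example paths are hypothetical, and the video itself only mentions the models folder.

```python
import shutil
from pathlib import Path


def install_checkpoint(downloaded_file: Path, webui_root: Path) -> Path:
    """Move a downloaded model file into the web UI's checkpoint folder."""
    if not downloaded_file.is_file():
        raise FileNotFoundError(downloaded_file)
    # Assumed layout: <webui_root>/models/Stable-diffusion/<model file>
    target_dir = webui_root / "models" / "Stable-diffusion"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / downloaded_file.name
    shutil.move(str(downloaded_file), str(target))
    return target


if __name__ == "__main__":
    # Hypothetical paths; adjust to wherever the file and the web UI live.
    source = Path.home() / "Downloads" / "model.safetensors"
    root = Path.home() / "stable-diffusion-webui"
    if source.is_file():
        print("Installed to", install_checkpoint(source, root))
```

After the file is in place, the model should appear in the checkpoint dropdown once the list is refreshed or the web UI is restarted.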

Outlines

00:00

🎨 Epic Realism for Lifelike Image Generation

The first paragraph introduces the Epic Realism model, which is favored for its ability to transform simple prompts into highly realistic images, particularly excelling in facial detail. The speaker advises on prompt construction, emphasizing simplicity and the avoidance of certain keywords that don't affect the outcome. They discuss the importance of fine-tuning parameters such as steps, CFG scale, and the choice of sampler, recommending DPM++ SDE Karras or DPM++ 2M Karras for an extra dose of realism. The use of high-resolution upscalers like NMKD Superscale or NMKD Faces is also suggested to enhance image detail, with specific settings provided. Additional tips include the effective use of negative prompts to prevent biases and the recommendation of the Epic Realism Helper LoRA to further enhance image realism. The process of downloading and selecting the Epic Realism model in the Stable Diffusion interface is outlined.
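
For readers who drive the web UI through its HTTP API rather than the browser, the Epic Realism settings described in this section could be expressed as a request body along the following lines. This is a sketch under assumptions: the field names follow the AUTOMATIC1111 `txt2img` API as commonly documented, the prompt, image size, step count of 25 and upscaler file name are illustrative choices rather than values from the video, and newer web UI versions may expect the sampler and scheduler as separate fields.

```python
import json
from typing import Sequence


def build_txt2img_payload(
    prompt: str,
    negatives: Sequence[str],
    *,
    steps: int = 25,            # "20 or higher"; 25 is an arbitrary pick
    cfg_scale: float = 5.0,     # value recommended for Epic Realism
    sampler_name: str = "DPM++ SDE Karras",
    hr_upscaler: str = "4x_NMKD-Superscale-SP_178000_G",  # depends on the installed file
    width: int = 512,
    height: int = 768,
) -> dict:
    """Assemble a txt2img request body with high-res upscaling enabled."""
    return {
        "prompt": prompt,
        "negative_prompt": ", ".join(negatives),
        "steps": steps,
        "cfg_scale": cfg_scale,
        "sampler_name": sampler_name,
        "width": width,
        "height": height,
        "enable_hr": True,            # turn on the high-res pass
        "hr_scale": 2,                # upscale factor of 2
        "hr_upscaler": hr_upscaler,
        "denoising_strength": 0.35,   # denoising value from the summary
    }


if __name__ == "__main__":
    payload = build_txt2img_payload(
        "photo of a woman in a cafe",               # hypothetical simple prompt
        ["cartoon", "painting", "illustration"],    # negatives named in the video
    )
    print(json.dumps(payload, indent=2))
```

When the web UI is started with the `--api` flag, a body like this is typically sent as JSON in a POST request to `/sdapi/v1/txt2img`. The sketch only builds and prints the payload so that it runs without a server.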

05:00

🌗 Magic Mix for Dramatic and Moody Scenes

The second paragraph discusses the Magic Mix model, which is recognized for its strengths in creating dramatic and darkly lit scenes, enhancing the moodiness and mystery of the generated images. However, it is noted that the model has limitations, particularly a tendency to generate East Asian women with a slim face filter look. Optimal sampler choices, steps, and upscaler settings are provided for this model, along with the recommendation to experiment with different settings to achieve the best results. The CFG scale is highlighted as important, with a suggested range, and the use of specific positive prompts to enhance image quality is discussed. The paragraph also covers the use of textual inversions to improve image outcomes and the model's proficiency in creating images with striking lighting effects and atmospheric settings, despite its propensity for a specific facial style.

Keywords

💡Epic Realism

Epic Realism is a model for Stable Diffusion that is favored for its ability to transform simple prompts into highly lifelike images. It is particularly noted for its excellence in capturing facial details, which is a critical aspect when striving for realism in generated images. In the video, it is mentioned as the creator's current favorite, highlighting its effectiveness in producing mind-blowing results with ease.

💡Automatic 1111

Automatic 1111 (usually written AUTOMATIC1111) is a popular web UI for Stable Diffusion, which the video's narrator uses to demonstrate the image creation process. It is the interface where the user inputs prompts and adjusts parameters to generate images, and where models like Epic Realism are selected and applied.

💡Prompts

Prompts are the textual descriptions or instructions given to the Stable Diffusion models to guide the generation of images. They are crucial for steering the output towards the desired outcome. The video emphasizes the importance of simplicity in prompts and avoiding certain keywords that do not affect the outcome, while including others that can detract from realism.

💡Parameters

Parameters in the context of Stable Diffusion models are the adjustable settings that users can fine-tune to influence the quality and realism of the generated images. The video discusses several parameters, including steps, CFG scale, and sampler, which are essential for achieving the perfect balance between quality and realism.

💡Denoising Strength

Denoising Strength is a parameter of the upscaling pass applied to images generated by Stable Diffusion models. It controls how much the upscaled image is allowed to change relative to the original. A lower denoising strength keeps the final output closer to the pre-upscaled image, while a higher strength lets the model repaint more of it, which can add detail but also alter the composition. The video recommends a value of 0.35.

💡High Res Upscaler

High Res Upscaler is a tool used to improve the resolution and detail of generated images. The video mentions two specific upscalers, NMKD Superscale and NMKD Faces, which are used to enhance the level of detail on the generated images, making them appear more lifelike and refined.
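
To make the effect of the recommended upscale factor of 2 concrete, the arithmetic can be sketched as follows. The 512x768 base resolution is an assumed example and is not a value given in the video.

```python
from typing import Tuple


def upscaled_size(width: int, height: int, factor: float = 2.0) -> Tuple[int, int]:
    """Return the output resolution after upscaling both sides by `factor`."""
    return round(width * factor), round(height * factor)


if __name__ == "__main__":
    base = (512, 768)  # assumed base resolution
    out = upscaled_size(*base)
    print(out)  # (1024, 1536)
    # Both sides double, so the pixel count grows by factor squared.
    print((out[0] * out[1]) / (base[0] * base[1]))  # 4.0
```

Because the pixel count quadruples at a factor of 2, the upscaling pass costs noticeably more time and memory than the base generation.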

💡Negatives

Negatives refer to the keywords or terms that are included in the prompts to specify what should be avoided in the generated images. Effective use of negatives can help add realism to the image and define what the user does not want to see. An example from the video is adding 'Asian, Chinese' to the negatives if the user is not aiming for an East Asian ethnicity in their image.
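
A negative prompt is ultimately a single comma-separated string, so a small helper can keep a reusable base list and merge extra terms into it. This is an illustrative sketch: `build_negative_prompt` is a hypothetical name, the base terms are the ones named in the video, and the extra term `blurry` is an invented example.

```python
from typing import Iterable

# Realism-oriented negatives named in the video.
REALISM_NEGATIVES = ("cartoon", "painting", "illustration")


def build_negative_prompt(extra: Iterable[str] = ()) -> str:
    """Join base and extra negative keywords, skipping blanks and duplicates."""
    kept = []
    seen = set()
    for term in (*REALISM_NEGATIVES, *extra):
        cleaned = term.strip()
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return ", ".join(kept)


if __name__ == "__main__":
    # "Painting" duplicates a base term and is dropped.
    print(build_negative_prompt(["Painting", "blurry"]))
    # cartoon, painting, illustration, blurry
```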

💡Magic Mix

Magic Mix is another Stable Diffusion model discussed in the video, known for its unique strengths in creating dramatic and dark-lit scenes. It is noted for its tendency to generate images with a specific facial style, which may appeal to users looking for a particular aesthetic. The video provides optimization tips for using Magic Mix to achieve better results.

💡Analog Madness

Analog Madness is a versatile and dynamic Stable Diffusion model highlighted for its ability to generate images of ordinary individuals, offering a refreshing alternative to the typical model-like outputs. The model's effectiveness is highly dependent on the potency of the prompts provided, with more vivid and robust prompts leading to more captivating outputs.

💡Steps

Steps is a parameter in Stable Diffusion models that determines the number of iterations the sampler runs to refine the image. The video suggests that a higher step count can help with image errors or artifacts, but it must be balanced against the computational load and the desired level of detail.

💡CFG Scale

CFG Scale (classifier-free guidance scale) is a parameter that controls how strongly the generated image follows the prompt: higher values adhere more closely to the prompt, while lower values give the model more freedom. The video mentions that for Analog Madness, the default setting of 7 offers the best results in terms of realism.
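
The parameter described here is commonly known as the CFG (classifier-free guidance) scale. As a simplified sketch of what the number does, each sampling step combines a noise prediction made without the prompt and one made with it, then pushes the result towards the prompted prediction by the chosen scale. The plain Python lists below stand in for the model's actual tensors, and the numbers are made up for illustration.

```python
from typing import List


def apply_cfg(uncond: List[float], cond: List[float], scale: float) -> List[float]:
    """Classifier-free guidance: uncond + scale * (cond - uncond), element-wise."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    uncond = [0.0, 1.0]  # prediction without the prompt (toy values)
    cond = [1.0, 3.0]    # prediction with the prompt (toy values)
    print(apply_cfg(uncond, cond, 1.0))  # [1.0, 3.0]: just the prompted prediction
    print(apply_cfg(uncond, cond, 7.0))  # [7.0, 15.0]: pushed further along the prompt direction
```

A scale of 1 simply returns the prompted prediction, while larger values exaggerate the difference the prompt makes. That is why very high settings tend to look over-processed and lower ones look looser.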

Highlights

Epic Realism is a favored model for creating lifelike images, particularly excelling in capturing facial details.

The model can transform simple prompts into stunningly realistic results.

Users can view and click on images on the download page to see prompts used by others.

For optimal results, avoid adding extra keywords like 'Masterpiece' or '8K' to prompts.

Including negative keywords such as 'cartoon' and 'painting' can help maintain realistic qualities.

Fine-tuning parameters like steps and CFG scale is crucial for balancing quality and realism.

The author recommends using DPM++ SDE Karras or DPM++ 2M Karras for an extra dose of realism.

High-res upscalers like NMKD Superscale or NMKD Faces enhance image detail with a denoising strength of 0.35 and an upscale factor of 2.

Effective use of negatives can counteract biases, such as the tendency to generate East Asian women.

Magic Mix model is particularly strong in creating dramatic and dark-lit scenes with moody and mysterious tones.

The Magic Mix model has limitations in facial generation and tends to produce a specific, slim face filter look.

Optimal samplers for Magic Mix include Euler a, Euler, DPM++ 2M Karras, or DPM++ SDE Karras, with steps between 20 and 40.

A CFG scale between 6 and 8 is recommended when using positive prompts with Magic Mix.

Analog Madness model stands out for its versatility and ability to generate images of ordinary individuals.

The potency of Analog Madness lies in the strength of the prompts provided.

The DPM++ SDE Karras sampler is ideal for Analog Madness, with steps between 25 and 35 for optimal balance.

Keywords like '3D Max', 'grotesque', and 'desaturated' make images more realistic with Analog Madness.

Analog Madness can create a wide variety of realistic and unique images by playing with prompts.