
30 Dec 2023 13:09

TLDR: The video introduces a process for generating consistent character images across different scenes with AI tools such as ChatGPT and Midjourney. It relies on a 'prompt formula' that combines padding images (垫图, existing images supplied as references), a scene setting, a character description, an activity, and an artistic style to keep the images unified. The tutorial shows how to refine AI-generated images iteratively, and stresses that good results take patience and repeated adjustment.


  • 📖 Utilize ChatGPT to generate a story outline with specific characters and settings.
  • 🖌️ Break down the story into smaller scenes that correspond to the images you want to create.
  • 🦊 Create an image of the main character and give it a name for consistent reference in Midjourney.
  • 🌲 Generate a scene image that sets the context for the story's environment.
  • 🔗 Learn to use padding images (垫图) in Midjourney to guide the AI in creating consistent character appearances.
  • 🎨 Use the image weight parameter (--iw) in prompts to control how strongly padding images influence the final output.
  • 📝 Write prompts with a clear structure: image address, setting, character name, main features, activity, artist style, and parameters.
  • 🔄 Use double colons (::) to separate different elements in the prompt for clearer instructions to the AI.
  • 👶 Iteratively refine the AI's output by selecting the best results and using them as a basis for further generation.
  • 🎭 Choose an artist style that aligns with the desired tone of the story for a cohesive visual theme.
  • 💪 Patience and practice are key in achieving consistent and high-quality AI-generated images through trial and error.
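
As an illustration of the prompt structure listed in the takeaways, here is a minimal Python sketch that assembles the elements in the stated order: image addresses first, then setting, character name and features, activity, artist style, and finally parameters. The class name ScenePrompt, its fields, the example.com URLs, and the character features are hypothetical placeholders, not taken from the video. The exact spacing around the double colons is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ScenePrompt:
    """One prompt that follows the order of elements given in the takeaways."""
    image_urls: list[str]      # padding (reference) images, placed first
    setting: str
    character_name: str
    features: str
    activity: str
    artist_style: str
    parameters: list[str] = field(default_factory=list)  # e.g. ["--iw 1.5"]

    def render(self) -> str:
        text_parts = [
            self.setting,
            f"{self.character_name}, {self.features}",
            self.activity,
            self.artist_style,
        ]
        # Double colons separate the text elements from one another.
        text = ":: ".join(part.strip() for part in text_parts)
        return " ".join([*self.image_urls, text, *self.parameters])


prompt = ScenePrompt(
    image_urls=["https://example.com/mark.png", "https://example.com/forest.png"],
    setting="a quiet forest clearing",
    character_name="Mark the fox",
    features="orange fur, white-tipped tail",   # hypothetical features
    activity="reading a book under a tree",
    artist_style="children's book illustration",
    parameters=["--iw 1.5", "--ar 16:9"],       # hypothetical values
)
print(prompt.render())
```

Keeping the elements in named fields makes it easy to change only the setting and activity from scene to scene while the character name and reference images stay fixed.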

Q & A

  • What is the main topic of the video script?

    - The video is about using AI tools such as ChatGPT and Midjourney to keep a character consistent across different images for storytelling, specifically for illustrating a fairy tale with a fox character.

  • How does the video script suggest generating a storyline?

    - The video script suggests giving ChatGPT specific prompts, for example asking for a short fairy tale in Chinese with a fox as the main character and with certain restrictions on secondary characters and scenes.

  • What is the purpose of breaking down the story into smaller paragraphs?

    - Each smaller paragraph depicts a single scene and corresponds to one of the images that will be generated next, which makes it easier to visualize the narrative and build it up image by image.

  • How does the script recommend maintaining character consistency when generating images with Midjourney?

    - The script recommends maintaining character consistency by first generating an image of the story's main character, naming it, and then repeatedly using that name in subsequent image generation prompts to help Midjourney recognize and remember the character across different scenes.

  • What is the significance of '垫图' (padding images) in the context of the video script?

    - In the context of the video script, '垫图' refers to pre-generated images (such as the main character's image and scene images) that are placed, or 'padded' (垫), into the prompt to guide Midjourney in creating new images that are stylistically and thematically consistent with them.

  • What is the role of '权重' (weight) in the image generation process?

    - The '权重' (weight), specifically the image weight (--iw), controls the influence of the padded image on the final result. A higher weight means the generated image will more closely resemble the padded image, which makes it crucial for keeping character appearance and style consistent across images.

  • How does the script explain the use of double colons in the prompt formula?

    - The script explains that double colons act as separators between the different elements of the prompt. Separating the elements helps Midjourney interpret each one on its own and reduces the chance that the prompt is misread.

  • What is the importance of iterating the AI's generated images?

    - Iterating the AI's generated images is important because it allows for refinement and improvement of the results. Since AI may not generate the desired outcome in one attempt, iterative adjustments help achieve the desired consistency in character and style.

  • What is the final goal of the process described in the video script?

    - The final goal is a series of images that tell a cohesive and visually consistent fairy tale, with the main character and style remaining uniform across all images.

  • How does the video script address the challenges of using AI for creative tasks?

    - The video script acknowledges that while AI tools can be powerful, they also require patience and iterative refinement. It emphasizes the need for users to adjust and rework the prompts based on the AI's output, highlighting the importance of human creativity and guidance in the process.

  • What additional resources does the script offer for those interested in trying out the process?

    - For those without access to Midjourney and GPT-4, the script suggests using account-sharing platforms (合租平台), and the video creator provides a link in the video description for further assistance.
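
The Q & A above describes breaking the story into smaller paragraphs, one per scene. A minimal Python sketch of that step follows, assuming the story text uses blank lines between paragraphs. The function name split_into_scenes and the three-sentence story are hypothetical placeholders, not the fairy tale from the video.

```python
import re


def split_into_scenes(story: str) -> list[str]:
    """Split a story into scene paragraphs wherever a blank line occurs."""
    blocks = re.split(r"\n\s*\n", story.strip())
    # Collapse internal line breaks so each scene is a single line of text.
    return [" ".join(block.split()) for block in blocks if block.strip()]


# Placeholder story text, used only to show the shape of the output.
story = """Mark the fox woke up in his den at sunrise.

He walked to the river
to look for breakfast.

At dusk he returned home and fell asleep."""

scenes = split_into_scenes(story)
for number, scene in enumerate(scenes, start=1):
    print(f"Scene {number}: {scene}")
```

Each numbered scene can then be turned into one image prompt.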



🎨 Introducing AI Art and Storytelling

This paragraph introduces the trend of using AI tools like ChatGPT to generate plots for novels and animations, and AI drawing software such as Midjourney to create the images. It acknowledges the difficulty of maintaining character consistency across different images and sets the stage for a tutorial on how to achieve this using Midjourney's advanced features, such as 'padding' images and adjusting weights.


📖 Crafting a Fairy Tale with AI

The speaker demonstrates the process of generating a fairy tale plot using ChatGPT, focusing on a fox as the main character. The story is then broken down into smaller scenes, each corresponding to a potential image. The paragraph emphasizes the importance of preparing the AI with a clear image of the protagonist and setting the scene for Midjourney to generate consistent character images across different scenes.


🖌️ Advanced Prompting Techniques for Midjourney

This section delves into the technicalities of creating prompts for Midjourney, explaining the use of padding images, setting scenes, describing the main character, and detailing the character's activities. It introduces the image weight parameter (--iw), which controls the influence of the padding image (垫图) on the final image, and discusses the use of double colons to separate different elements within the prompt for better AI understanding.

🌟 Iterative AI Art Generation Process

The speaker shares practical insights into the iterative process of generating AI art, using the example of creating a consistent character, Mark the fox, in various scenes. The paragraph highlights the importance of patience and continuous refinement to achieve the desired results, as AI may not always generate perfect images on the first attempt. It concludes with an invitation for viewers to try the method themselves and provides resources for accessing AI tools.
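
To make the idea of one consistent character across several scenes concrete, the sketch below generates one prompt per scene while keeping the character name and character image fixed. It is an illustration under assumptions: the example.com URLs, the scene descriptions, the style string, and the default image weight of 1.5 are all hypothetical, and only the name "Mark the fox" comes from the video.

```python
CHARACTER_NAME = "Mark the fox"                          # reused in every prompt
CHARACTER_IMAGE = "https://example.com/mark-front.png"   # placeholder URL


def prompts_for_scenes(scenes, style, image_weight=1.5):
    """Return one prompt per scene, all anchored to the same character."""
    prompts = []
    for scene_image, setting, activity in scenes:
        prompts.append(
            f"{CHARACTER_IMAGE} {scene_image} "
            f"{setting}:: {CHARACTER_NAME} {activity}:: {style} "
            f"--iw {image_weight:g}"
        )
    return prompts


# Each scene: (scene reference image, setting, what the character is doing).
scenes = [
    ("https://example.com/forest.png", "a misty forest at dawn", "waking up in his den"),
    ("https://example.com/river.png", "a riverbank at noon", "looking for breakfast"),
]
scene_prompts = prompts_for_scenes(scenes, style="watercolor storybook illustration")
for line in scene_prompts:
    print(line)
```

Only the scene image, setting, and activity change between prompts, which mirrors the advice to reuse the character name and keep the same padding image.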




💡ChatGPT

ChatGPT is an AI language model developed by OpenAI that generates human-like text based on the prompts given to it. In the context of the video, it is used to create a short fairy tale plot with a fox as the protagonist, demonstrating its capability to produce creative content.


💡Midjourney

Midjourney is an AI-based image generation software that can create images from textual descriptions or 'prompts'. In the video, it is used to generate images that correspond to different scenes of the fairy tale created by ChatGPT, with a focus on maintaining the consistency of the main character across different images.


💡DALL-E

DALL-E is an AI system designed to generate images from textual descriptions. While not explicitly used in the video, it is mentioned as one of the AI drawing software options that have gained popularity recently, alongside Midjourney.

💡Seed Value

A seed value is a number used as the starting point for the random process that generates a series of outputs in AI systems. In the context of the video, it is the value that determines how the images generated from a single prompt vary: different seeds give different outputs, while reusing a seed makes the output more repeatable.
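
The effect of a seed can be shown with Python's standard random module, which stands in here for the image generator's random starting point. This is only an analogy: the function fake_noise is hypothetical and does not reproduce how Midjourney uses its seed internally.

```python
import random


def fake_noise(seed: int, size: int = 4) -> list[float]:
    """Return a short list of pseudo-random numbers derived from the seed."""
    rng = random.Random(seed)   # independent generator initialized by the seed
    return [round(rng.random(), 3) for _ in range(size)]


same_a = fake_noise(seed=1234)
same_b = fake_noise(seed=1234)
different = fake_noise(seed=5678)

print("same seed, same values:", same_a == same_b)
print("different seed, different values:", same_a != different)
```

The same seed always reproduces the same starting values, which is why fixing the seed helps when comparing the effect of a prompt change.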

💡垫图 (Padding Image)

In the context of AI image generation, '垫图' or 'padding image' refers to the technique of supplying an existing image as a reference, or 'pad' (垫), so that the AI generates new images that are stylistically or thematically similar. This method helps in maintaining the consistency of characters or elements across different images.

💡权重 (Weight)

In AI image generation, '权重' or 'weight' refers to the influence that a particular element, such as a padding image (垫图) or a descriptive phrase, has on the final output. Higher weight values mean the element has a stronger impact on the generated image, ensuring that the AI's output aligns more closely with the reference or description.
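
One practical way to see what the image weight does is to render the same prompt at several weights and compare the results side by side. The sketch below only builds the prompt strings. The helper with_image_weight, the example.com URL, and the three weight values are hypothetical, and the range of values Midjourney accepts depends on the model version, so it is not checked here beyond rejecting negative numbers.

```python
def with_image_weight(base_prompt: str, image_weight: float) -> str:
    """Append an --iw parameter to a prompt that starts with an image URL."""
    if image_weight < 0:
        raise ValueError("image weight must not be negative")
    return f"{base_prompt} --iw {image_weight:g}"


# The reference image comes first, followed by the text description.
base = "https://example.com/mark-front.png Mark the fox walking through snow"

variants = [with_image_weight(base, weight) for weight in (0.5, 1.0, 2.0)]
for variant in variants:
    print(variant)
```

A low weight lets the text dominate, while a high weight pulls the result toward the reference image.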


💡Multi-Prompts

Multi-Prompts is a feature in AI image generation that allows users to input multiple prompts or descriptive phrases separated by double colons (::) to guide the AI in creating an image that incorporates all the elements mentioned. This helps the AI to understand and prioritize different aspects of the prompt more effectively.
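
A small Python sketch can show how double-colon separators and optional weights fit together. It assumes the common form in which a number written directly after the double colon is the weight of the preceding part, and that weights act relative to one another. The function names and the example parts are hypothetical.

```python
def build_multi_prompt(parts):
    """Join (text, weight) pairs with '::', writing only non-default weights."""
    rendered = []
    last_index = len(parts) - 1
    for index, (text, weight) in enumerate(parts):
        if weight != 1:
            rendered.append(f"{text}::{weight:g}")
        elif index < last_index:
            rendered.append(f"{text}::")
        else:
            rendered.append(text)   # final part with default weight: no separator
    return " ".join(rendered)


def relative_shares(parts):
    """Return each part's share of the total weight (assumes relative weights)."""
    total = sum(weight for _, weight in parts)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {text: weight / total for text, weight in parts}


parts = [("Mark the fox", 2), ("birthday cake", 1)]
multi_prompt = build_multi_prompt(parts)
shares = relative_shares(parts)
print(multi_prompt)
print(shares)
```

Under this assumption, a part with weight 2 counts twice as much as a part with weight 1, regardless of the absolute numbers used.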

💡艺术家 (Artist)

In the context of the video, '艺术家' or 'artist' refers to the specific artistic style or a particular artist that the user wants the AI to emulate when generating images. By specifying an artist's name, the AI can generate images that are stylistically similar to the works of that artist.

💡纵横比 (Aspect Ratio)

The '纵横比' or 'aspect ratio' is the proportional relationship between the width and the height of an image. It is an important parameter in image generation, as it determines the shape and orientation of the output, such as whether it is landscape (wider) or portrait (taller).
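
The ratio itself is simple arithmetic: divide width and height by their greatest common divisor. The sketch below does that and formats the result as an --ar parameter. The function names are hypothetical, and the pixel sizes are example values.

```python
from math import gcd


def aspect_ratio_flag(width: int, height: int) -> str:
    """Reduce width:height to lowest terms and format it as an --ar parameter."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    divisor = gcd(width, height)
    return f"--ar {width // divisor}:{height // divisor}"


def orientation(width: int, height: int) -> str:
    """Name the orientation implied by the two dimensions."""
    if width > height:
        return "landscape"
    if width < height:
        return "portrait"
    return "square"


print(aspect_ratio_flag(1920, 1080), orientation(1920, 1080))
print(aspect_ratio_flag(1024, 1536), orientation(1024, 1536))
```

Using one ratio for every scene keeps the finished set of story images visually uniform.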

💡迭代 (Iteration)

In the context of AI image generation, '迭代' or 'iteration' refers to the process of refining and improving the AI's output by repeatedly adjusting the prompts and parameters based on the results obtained. This is necessary because AI may not produce the desired outcome on the first attempt.
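
The iteration loop can be sketched in Python as: generate a batch, keep the best candidate, feed it back as the next reference, and stop when the result is good enough. This is a sketch under strong assumptions. In the video the selection is made by eye, so the score function here is a stand-in for human judgment, and fake_generate, fake_score, and the numeric "images" are hypothetical placeholders for real generation calls.

```python
def refine(generate, score, first_reference, rounds, batch_size=4, good_enough=0.9):
    """Repeatedly generate a batch and reuse the best result as the reference."""
    reference = first_reference
    best_image, best_score = None, float("-inf")
    rounds_used = 0
    for _ in range(rounds):
        rounds_used += 1
        batch = generate(reference, batch_size)
        candidate = max(batch, key=score)
        if score(candidate) > best_score:
            best_image, best_score = candidate, score(candidate)
            reference = candidate          # the best image guides the next round
        if best_score >= good_enough:
            break
    return best_image, best_score, rounds_used


def fake_generate(reference, count):
    """Stand-in generator: each variation shifts the 'quality' slightly."""
    return [reference + step for step in (-0.1, 0.05, 0.1, 0.2)[:count]]


def fake_score(image):
    """Stand-in for a human choosing the best image: higher is better."""
    return image


best_image, best_score, rounds_used = refine(
    fake_generate, fake_score, first_reference=0.3, rounds=10
)
print(f"stopped after {rounds_used} rounds with score {best_score:.2f}")
```

The rounds limit matters in practice, because some prompts never reach the target and need to be rewritten rather than repeated.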

💡合租平台 (Sharing Platform)

A '合租平台' or 'sharing platform' refers to online services that allow multiple users to share access to a subscription or account for a service, such as AI tools like Midjourney or GPT-4. This can be a cost-effective way for individuals to access these tools without having to pay for a separate subscription.


Using ChatGPT to generate plots for novels or animations and Midjourney or DALL-E to create the images has become a popular method.

The process may seem simple but executing it effectively is challenging, especially in maintaining character consistency across images.

Demonstrates a method, officially introduced by Midjourney, to maintain character uniformity in different images, creating a continuous narrative.

Emphasizes the use of foundational images (垫图) and the importance of weight parameters in AI image generation.

Shows how to enhance AI drawing skills with advanced techniques.

Starts with generating a story plot using ChatGPT, focusing on a brief Chinese fairy tale with a fox protagonist.

Breaks the complete story into smaller paragraphs, each depicting a scene to be visualized.

Describes preparing for Midjourney image creation by generating initial images of the main character and desired scenes.

Introduces a structured prompt formula for Midjourney to create images with character and scene continuity.

Explains how to use foundational images and adjust image weight (iw) to influence the final image outcome.

Illustrates the meaning and use of image weight (iw) with a flower and birthday cake example.

Clarifies the concept of multi-prompts and the importance of using double colons for distinguishing between different elements in prompts.

Demonstrates generating a character image and emphasizes the importance of a frontal photo with a clear background.

Provides a detailed example of creating a scene with the main character using a specifically structured prompt.

Addresses challenges when the AI does not generate the expected elements, showing how to iteratively refine the image.

Highlights that the key to keeping a character consistent across different scenes is to reuse the character's name and keep using the same foundational images.

Mentions the generation of nearly 100 images to select 8 that best fit the fairy tale, emphasizing patience and iterative refinement.

Encourages trying the method and offers support for users without access to Midjourney or GPT-4.

Invites viewers to appreciate the complete fairy tale version, emphasizing the joy of creativity and support for the content creator.