Midjourney Vs DallE-3 Prompt Shootout!

Theoretically Media
21 Sept 2023 | 12:22

TLDR: In a recent video, Tim discusses the integration of Dall-E 3 into ChatGPT and the resulting debate over whether Midjourney is now obsolete. He argues that Midjourney is not dead and runs a prompt shootout comparing the outputs of both AI image generators. Tim showcases various images generated by Dall-E 3, highlighting its ability to handle text and create surreal, anime, and photorealistic images. He then runs the same prompts through Midjourney to compare the results. While Midjourney misses some elements of the prompts, it still produces captivating images. Tim notes that Midjourney plans to release a 3D feature and an improved language model. He concludes that the advancements in AI image generation are a win for everyone: creators get more options, and the tools do not necessarily compete against each other.

Takeaways

  • 📢 OpenAI has announced the integration of Dall-E 3 into ChatGPT, which will be available to Plus plan subscribers.
  • 🚀 The integration aims to make prompting more conversational, addressing the complexity of Midjourney's command-based system.
  • 🖼️ Dall-E 3 has shown the ability to parse longer, descriptive text and generate images with a solid sense of imagination.
  • 🎨 Dall-E 3 has been designed to decline requests that mimic particular artists, in line with a trend of reduced reliance on artist tokens in AI image generators.
  • 📈 Midjourney is not considered 'dead' despite Dall-E 3's advancements; it is expected to improve with its upcoming version 6 and 3D feature.
  • 🌐 Midjourney attracted 21 million users to its website in July, showcasing its popularity despite competition.
  • 📈 ChatGPT had 1.4 billion visitors in the same period, indicating a broader user base for the language model.
  • 🔍 Dall-E 3's output was compared with Midjourney's using prompts from the Dall-E website, revealing differences in how each AI interprets and visualizes the prompts.
  • 🎉 The advancements in AI image generation are seen as a win for everyone, promoting a collaborative rather than competitive environment.
  • 🔍 Midjourney's outputs sometimes missed elements of the prompts, suggesting room for improvement in its language model.
  • 📈 The convergence of large language models and image generators is expected to lead to a fascinating era for visual creation.

Q & A

  • What significant announcement has OpenAI made regarding Dall-E 3?

    -OpenAI has announced that Dall-E 3 will be integrated into ChatGPT, which is expected to make prompting more conversational.

  • What is the current availability of Dall-E 3 in ChatGPT?

    -As of the time of the transcript, Dall-E 3 is not yet available in ChatGPT. It is expected soon and will be accessible only to subscribers on the ChatGPT Plus plan.

  • What is the main advantage of connecting Dall-E 3 to ChatGPT?

    -The main advantage is that it will allow Dall-E 3 to parse longer stretches of descriptive narrative text more effectively and pick out key tokens from it.

  • How does the integration of Dall-E 3 with ChatGPT address a common criticism of Midjourney?

    -The integration aims to simplify the prompting process, which is often criticized for being overwhelming and complex in Midjourney due to its many commands and unique system.

  • What was the selling point of the recently released Ideogram?

    -The selling point of Ideogram was its ability to handle text, a feature that the next generation of AI image models, including Dall-E 3, also seems to be adopting.

  • What is the general direction AI image generators are taking regarding artist tokens?

    -AI image generators, including Midjourney, are moving towards giving less weight to artist tokens, a shift the speaker has observed in recent changes.

  • What feature is Midjourney planning to introduce in the next six months?

    -Midjourney has announced that a 3D feature will be introduced within the next six months.

  • What improvements are expected in Midjourney's version 6?

    -Midjourney's version 6 is expected to have a much improved language model, which should allow it to handle Dall-E-style prompts more effectively.

  • How many visitors did the Dall-E website have in July, as mentioned in the transcript?

    -The Dall-E website had 13 million visitors in July.

  • What is the difference in the number of visitors between Midjourney and Chat GPT within the same time frame?

    -While Midjourney attracted 21 million users to its website, ChatGPT had 1.4 billion visitors, indicating a significant difference in user base.

  • What is the speaker's perspective on the competition between Dall-E 3 and Midjourney?

    -The speaker does not believe that Dall-E 3's integration into ChatGPT signifies the death of Midjourney. Instead, they see it as a win for everyone and are excited about the convergence of large language models and image generators.

  • What feature of Dall-E has the speaker noticed is not present in the ChatGPT version?

    -The speaker has noticed that there is no indication of inpainting or outpainting in the ChatGPT version of Dall-E, which has been a feature in Dall-E's standalone version.

Outlines

00:00

📢 Dall-E 3 Integration with ChatGPT and Prompt Shootout

The video discusses the recent announcement of Dall-E 3's integration with ChatGPT, which is expected to enhance the conversational aspect of AI prompting. The creator expresses skepticism about the claim that Dall-E 3's introduction signals the end for Midjourney, a competing AI image generator. The video features a prompt shootout, comparing Dall-E 3's image generation capabilities with those of Midjourney, using prompts provided by OpenAI. The comparison showcases Dall-E 3's ability to parse descriptive text and generate images with solid imagination and photorealism, as demonstrated by the examples provided by Will Depew. The video also touches on the evolving direction of AI image generators, including the reduced emphasis on artist tokens and the potential for future improvements in Midjourney's language model.

05:02

🎨 Midjourney's Image Generation Compared to Dall-E 3

The second section of the video focuses on a detailed comparison of image generation between Midjourney and Dall-E 3. It presents various prompts and the corresponding images generated by both AI systems. The comparison highlights the strengths and weaknesses of each system in capturing the elements of the provided prompts. Midjourney's outputs are noted for their vibrant colors and detail, although they sometimes miss certain prompt elements like the 'pepperoni sun' or 'turbulent waves in a coffee mug.' Dall-E 3's images are praised for their adherence to the prompts and the inclusion of all requested elements. The video also discusses the potential for Midjourney's upcoming 3D feature and improved language model in version 6, suggesting that the current shortcomings may be addressed in future updates.

10:03

🌐 Market Reception and the Future of AI Image Generation

The final section addresses the market reception of Dall-E 3 and Midjourney, comparing their visitor numbers and the potential impact on users interested in image generation. It mentions that while the Dall-E website drew 13 million visitors in July, Midjourney attracted 21 million users over the same period, although it's unclear how many of these users are engaged with image generation features. The section also notes the vast difference in user engagement between these platforms and ChatGPT, which had 1.4 billion visitors. The speaker expresses optimism about the convergence of large language models and image generators, predicting a fascinating future for visual content creation. They conclude by stating that the advancements are a win for everyone and that there's room for multiple platforms to coexist and thrive in the AI image generation space.
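To put the quoted visitor figures in proportion, here is a minimal arithmetic sketch. The numbers are the July figures as stated in the video summary and are not independently verified; the dictionary keys and the `ratio` helper are illustrative names, not anything from the video.

```python
# July visitor counts as quoted in the summary (unverified, for illustration only).
visitors = {
    "Dall-E website": 13_000_000,
    "Midjourney": 21_000_000,
    "ChatGPT": 1_400_000_000,
}


def ratio(a: str, b: str) -> float:
    """Return how many times larger a's visitor count is than b's."""
    return visitors[a] / visitors[b]


# Compare ChatGPT against each image-generation site, then the two sites to each other.
for name in ("Midjourney", "Dall-E website"):
    print(f"ChatGPT vs {name}: {ratio('ChatGPT', name):.1f}x")
print(f"Midjourney vs Dall-E website: {ratio('Midjourney', 'Dall-E website'):.2f}x")
```

On these figures, ChatGPT's traffic is roughly 67 times Midjourney's, while Midjourney's is about 1.6 times that of the Dall-E website.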

Keywords

💡Midjourney

Midjourney refers to an AI image generation model that has been a subject of comparison in the video. It is mentioned as being potentially overshadowed by the newer Dall-E 3 model, but the video argues that Midjourney still has its merits and is not necessarily 'dead' despite the advancements in Dall-E 3.

💡Dall-E 3

Dall-E 3 is an advanced text-to-image AI model developed by OpenAI. It is highlighted in the video for its integration with ChatGPT and its ability to parse longer, more descriptive text to generate images. The video discusses its features and compares its output to that of Midjourney.

💡ChatGPT

ChatGPT is a language model developed by OpenAI that is capable of generating human-like text based on given prompts. In the context of the video, it is noted that Dall-E 3 will be integrated with ChatGPT, which is expected to enhance the conversational aspect of image generation.

💡Prompt Shootout

A prompt shootout is a comparison of the outputs generated by different AI models when given the same textual prompts. The video conducts a prompt shootout between Midjourney and Dall-E 3 to evaluate their performance and capabilities.

💡Text-to-Image Generation

Text-to-image generation is the process where AI models create images based on textual descriptions provided as prompts. It is the core functionality being compared in the video, with both Midjourney and Dall-E 3 showcasing their ability to generate images from text.

💡Photorealism

Photorealism in the context of the video refers to the ability of AI models to generate images that closely resemble real-life photographs. The video discusses how Dall-E 3 is capable of producing photorealistic images, which is a notable feature of its image generation capabilities.

💡Imagination

Imagination, as discussed in the video, is the capacity of Dall-E 3 to create surreal and imaginative images that may not necessarily exist in reality. It is an important aspect when evaluating the creative potential of AI image generation models.

💡Papercraft Art

Papercraft art is a form of art where three-dimensional objects are created from paper. In the video, it is used as a prompt to test the AI models' ability to generate images that reflect the style and characteristics of papercraft.

💡Diorama

A diorama is a model representation of a scene, typically smaller than life-size. The video uses the concept of a diorama to prompt the AI models to generate images of a miniature scene, such as a cafe, to test their ability to represent spatial relationships and details.

💡Aesthetics

Aesthetics in the video refers to the visual appeal and artistic style of the images generated by the AI models. It is a criterion used to judge the quality and appeal of the outputs from Midjourney and Dall-E 3.

💡Language Model

A language model is an AI system that understands and generates human language. The video mentions that Midjourney's upcoming version 6 will feature an improved language model, which is expected to enhance its ability to interpret and generate images from complex prompts.

Highlights

OpenAI has announced the integration of Dall-E 3 into ChatGPT.

Dall-E 3 is expected to be available soon but only to ChatGPT Plus plan subscribers.

The integration aims to make prompting more conversational, addressing a common criticism of Midjourney's complex command system.

Dall-E 3's ability to parse longer, descriptive text is a significant advancement in AI image generation.

Dall-E 3 has been showcased to produce high-quality images, including surrealist and anime styles.

Dall-E 3 is designed to decline requests that mimic particular artists, which is a new direction for AI image generators.

Photorealistic images produced by Dall-E 3 demonstrate its dual capabilities of realism and imagination.

The original Dall-E, announced in January 2021, was among the first text-to-image generators, setting a precedent in AI technology.

A prompt shootout compares Dall-E 3 and Midjourney using the same prompts to evaluate their outputs.

Midjourney's output sometimes missed elements of the prompt, such as turbulent waves in a coffee mug or a pepperoni sun.

Midjourney's 3D feature and version 6 with an improved language model are upcoming, promising better performance.

Midjourney attracted 21 million users to its website in July, showing its popularity despite competition.

The convergence of large language models and image generators is an exciting development for creators.

The competition between Dall-E 3 and Midjourney is seen as beneficial, with no need for one to outperform the other.

The advancements in AI image generation are moving at a rapid pace, with significant improvements in a short period.

The speaker, Tim, expresses his fascination with the evolving capabilities of AI in creating visuals.