Midjourney vs Flux vs Stable Diffusion vs Dall E 3: Which AI Image Generator Should You Choose?

Introduction

In the world of AI-generated images, four major players have emerged: Midjourney, Flux, Stable Diffusion, and Dall E 3. Each model has its strengths and weaknesses, and which one comes out ahead depends on the prompt and the creative request. This article compares how each model performs on five selected prompts, assessing how well it generates imaginative and contextually relevant images.

First Prompt: No Face at a TED Talk

The first prompt requested a hyper-realistic, high-resolution photo of No Face from Studio Ghibli's Spirited Away giving a speech at a TED Talk event.

Midjourney: Understood the character reference but didn’t accurately reflect the TED Talk background.
Flux: Struggled with the character reference entirely but managed to create a decent TED Talk setting.
Stable Diffusion: Captured the character reference but lacked the TED Talk context, resulting in a less realistic image.
Dall E 3: Also struggled with character recognition, although its second attempt produced some semblance of a TED Talk setting.

Second Prompt: A Crying Maid and a Yelling Woman

The second prompt described a fat Korean maid crying on the floor of a mansion while a Hispanic woman with dark red hair yelled at her.

Midjourney: Produced remarkably dramatic and visually appealing results.
Flux: Matched the quality of Midjourney’s output; both generated strong imagery.
Stable Diffusion & Dall E 3: Both flagged the prompt as inappropriate and failed to deliver on the request.

Third Prompt: An Evil Spirit Influencing Humanity

The third prompt asked for an image of an evil spirit influencing humanity, which tested the creativity of each AI model.

Midjourney: Generated compelling and fitting images depicting an evil spirit.
Flux: Did not effectively capture the ‘evil’ aspect in its portrayal.
Stable Diffusion: Managed to create reasonable images, although they were not as impressive as Midjourney’s.
Dall E 3: Initially struggled but produced better outputs with the HD setting.

Fourth Prompt: Teen Model with Text

The fourth prompt requested an image of a teenage female model wearing an oversized white T-shirt featuring the words "Coco" and "Comfy."

Midjourney: Had difficulties accurately rendering the text.
Flux, Stable Diffusion, and Dall E 3: All generated the text accurately, though Flux Pro’s output was considered more realistic and cinematic.

Fifth Prompt: A Dramatic Landscape

The final challenge involved a more complex scene: a dusty, dark African landscape with a lion and a doctor superimposed, creating a dramatic image in which both figures roar in unison.

Midjourney: Delivered multiple outputs, the second of which was highly acclaimed.
Flux: Misinterpreted the prompt and produced a simple image of a lion without context.
Stable Diffusion: Understood the prompt across multiple trials, but its realism was less impressive than Midjourney’s.
Dall E 3: Captured the essence of the prompt on its third attempt.

Conclusion

Across all five prompts, Midjourney emerged as the clear preference for many. However, both Flux and Stable Diffusion offered competitive results in certain areas, while Dall E 3 showed room for improvement, especially in context handling. Ultimately, the best choice will depend on your specific needs and the type of imagery you want.

Keywords

AI image generation
Midjourney
Flux
Stable Diffusion
Dall E 3
TED Talk
Evil spirit
Korean maid
Text rendering
African landscape

FAQ

Q: Which AI model performed the best overall?
A: Midjourney generally produced the most compelling results across the prompts and stood out for its creative outputs.

Q: Did any models refuse to generate images?
A: Yes. Both Stable Diffusion and Dall E 3 flagged the second prompt, about the maid, as inappropriate.

Q: How did the models perform with text in images?
A: Midjourney struggled to render text accurately, while Flux, Stable Diffusion, and Dall E 3 all generated the text accurately.

Q: Can I test these AI models myself?
A: You can access Flux, Stable Diffusion, and Dall E 3 through platforms like Anakin AI, which offers free daily credits to users.