Which AI image generator is the best?

Tom Wittlin, Creative Director

Between Gemini, Midjourney, Firefly, GPT and every other nameless AI image generator under the sun, knowing which one to pick can feel as confusing as it is tiresome.

So which is actually the best?

At the risk of drifting into agency buzzword territory, I genuinely do get asked all the time which one is best - so I thought this post would help to break that down. Unfortunately, I’m going to lean into the Gary Vaynerchuk playbook and answer with the dreaded two-word phrase: “it depends”.

The unfashionable truth is that there isn’t really a single “best” AI image generator anymore, and the more you use them, the more obvious that becomes. Asking which one is best is a bit like asking which person is best at Skywire - it completely depends on what you’re trying to get done.

The reality is that these tools all have strengths and weaknesses, and the real skill is learning to match the task to the right one.

Take something like Midjourney. It’s incredibly strong when it comes to stylised, cinematic, highly creative outputs. If you want something that looks like a film still, a fantasy scene, or a dramatic concept piece, it’s often brilliant. But it does tend to lean into stylisation quite heavily. That can be a good thing, but it also means you sometimes get odd artefacts - slightly strange hands, facial details that don’t quite hold up on close inspection, or subtle distortions that you only really notice when you zoom in or when realism is the priority.

Then you have a tool like Google’s Nano Banana / Gemini, which tends to be much stronger when realism matters. If you generate something like a group of people in a natural environment, or a corporate setting, it often produces results that feel far more grounded. The lighting, proportions, and facial structures are more consistent, and in some cases the output can genuinely be mistaken for a real photograph. It doesn’t always have the same artistic flair as Midjourney, but what it provides in abundance is reliability and photorealism.

So if you take the same prompt and run it through both, you’ll often see a clear difference. I’ve illustrated this below: Midjourney gave me inconsistent perspective, mutated fingers, and faces that would be easy to carve into a pumpkin.

Meanwhile, from the very same prompt, Gemini gave me something that most would struggle to distinguish from a real photograph. The contrast between the two platforms is remarkable.

So Midjourney might have the edge when you want something visually striking and artistic, even if it is slightly “off” in terms of realism, while Gemini might give you something that looks far more like a real-world photo, especially in scenes involving multiple people or structured environments.

Neither is better in an absolute sense; they’re just optimised differently.

The easiest way to think about all of this is to stop treating these tools as competitors and start treating them like specialists.

You wouldn’t go to the same person for every job in real life. You naturally think, “Who’s good at this specific thing?”, and you delegate accordingly. 

It’s the same here. If you need something bold and imaginative, you might lean one way. If you need something believable and grounded, you might lean another way - and over time, you learn to build an instinct for it. 

That instinct only really comes from using them, though.

Reading about differences is useful, but the real understanding comes when you actually run the same prompt across different platforms and compare results directly.

After a while you start to build a mental map of what each tool tends to do well: which ones handle faces reliably, which ones are better with composition, which ones drift into stylisation, and which ones stay close to reality.

There’s also nothing stopping you from using one platform to generate a theme or a concept, and then taking that output to another platform to refine, rework or polish it into a final draft.

I often do this myself, and whenever I explain the process to people I liken it to a cinematographer shooting a film and an editor then cutting it together.

Once you reach that point, the whole process becomes less about searching for “the best AI image tool” and more about choosing the right one for the job in front of you.


So if you want my recommendation, that’s it: it depends.

To anyone who still believes that one must be superior to the rest, my advice is simple: stop looking for a single answer, and take advantage of the plethora of image generation tools out there to build yourself a valuable toolkit instead.

Your creative director will thank you ;)



About the Author

Tom Wittlin, Creative Director

Tom is an award-winning Creative Director at Skywire. With over 25 years of experience, Tom works with a plethora of prestige clients including Explora Journeys, Canary Wharf, Iles Formula, and many more.