
Stability AI releases Stable Diffusion 3.5, their latest flagship AI image generator

Stability AI describes Stable Diffusion 3.5 as the best AI image generator on the market, but is that true?

No story of the current AI moment can be told without mentioning Stability AI. As the creators of the AI image generator Stable Diffusion, they have consistently been at the vanguard of the space.

However, it’s been a tough few years for the company. They’ve dealt with a string of executive departures (including founder and CEO Emad Mostaque), a fairly serious cash crunch, and a Stable Diffusion 3 release that many found disappointing, falling short of the lofty standards the company had set for itself.

It turns out that Stability themselves weren’t entirely pleased with SD3. So, they are now releasing Stable Diffusion 3.5, the latest upgrade to their flagship product that should improve its realism, prompt adherence, and text rendering. 

What is Stable Diffusion?

As mentioned in the intro, Stable Diffusion is an AI image generator. It turns text prompts into images via a process known as diffusion: the model starts with an image made of random noise and removes that noise step by step, guided by the prompt, until a matching image appears.

So, let’s say you want a picture of a cat. In this case, the model starts from a field of static and, over a series of steps, refines it into something that looks more and more like a cat.
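To make the "subtract the noise" idea concrete, here is a toy sketch in Python. It is not how Stable Diffusion is implemented: the function names are made up for illustration, the "image" is just a short list of numbers, and the noise estimate cheats by comparing against a known target. In a real diffusion model, a neural network predicts the noise from the noisy image and the text prompt.

```python
import random


def toy_denoise(target, steps=20, seed=0):
    """Start from random noise and remove a share of it at every step."""
    rng = random.Random(seed)
    # Begin with pure Gaussian noise, one value per "pixel".
    image = [rng.gauss(0.0, 1.0) for _ in target]
    history = [list(image)]
    for step in range(steps):
        # ASSUMPTION: a real model predicts this noise with a neural network
        # conditioned on the prompt. This toy cheats by using the target.
        predicted_noise = [x - t for x, t in zip(image, target)]
        # Remove a growing share of the noise; the last step removes the rest.
        fraction = 1.0 / (steps - step)
        image = [x - fraction * n for x, n in zip(image, predicted_noise)]
        history.append(list(image))
    return image, history


def max_error(image, target):
    """Largest per-pixel distance between an image and the target."""
    return max(abs(x - t) for x, t in zip(image, target))


if __name__ == "__main__":
    cat = [0.1, 0.9, 0.4, 0.7]  # stand-in for "a picture of a cat"
    final, history = toy_denoise(cat)
    for i in (0, 10, 20):
        print(f"step {i:2d}: error = {max_error(history[i], cat):.4f}")
```

Running it prints an error that shrinks at each step, which is the same overall shape as the real process: noisy at the start, recognizable at the end.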

If you want the full technical details of how Stable Diffusion works, this is a great article that’s definitely worth a read.

Stable Diffusion 3.5

Stable Diffusion 3.5 offers three new models:

  • Stable Diffusion 3.5 Large: The most powerful model being released. It contains 8 billion parameters and is ideal for professional use cases at 1 megapixel resolution.

  • Stable Diffusion 3.5 Large Turbo: A distilled version of 3.5 Large that can generate images in just four steps, making it considerably faster than 3.5 Large.

  • Stable Diffusion 3.5 Medium: A 2.5 billion parameter model that is designed to be run “out of the box” on consumer hardware. It is capable of generating images between 0.25 and 2 megapixels in resolution.
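For a rough sense of what those megapixel figures mean in pixels, here is a small Python helper. The function name is hypothetical, and rounding the side length to a multiple of 64 is an assumption borrowed from common practice with image models, not something Stability specifies here.

```python
import math


def square_side_for_megapixels(megapixels, multiple=64):
    """Approximate side length of a square image with the given pixel budget."""
    pixels = megapixels * 1_000_000
    side = math.sqrt(pixels)
    # ASSUMPTION: snap to a multiple of 64, a common convention for image models.
    return max(multiple, round(side / multiple) * multiple)


if __name__ == "__main__":
    for mp in (0.25, 1, 2):
        side = square_side_for_megapixels(mp)
        print(f"{mp} MP is roughly {side} x {side}")
    # 0.25 MP is roughly 512 x 512
    # 1 MP is roughly 1024 x 1024
    # 2 MP is roughly 1408 x 1408
```

So 3.5 Large targets images around 1024 x 1024, while 3.5 Medium covers a range from about 512 x 512 up to roughly 1400 pixels per side.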

The first two models are immediately available, while 3.5 Medium won’t be available until October 29th. 

Stability’s goal was to make the models the most customizable and accessible image models on the market. Accordingly, the models are:

  • Easily fine-tuned, with users also being able to build applications based on custom workflows.

  • Optimized to run on consumer hardware.

  • Able to create diverse images representative of the world. In other words, they can create people with different skin tones and features.

  • Capable of generating a wide range of styles and aesthetics like 3D, photography, painting, and line art.

  • The best models for prompt adherence, according to Stability.

Add it all together, and it sounds like a pretty compelling upgrade.

What do people think?

The consensus seems to be that it’s a definite improvement over Stable Diffusion 3, but still not as good as Flux. That’s what Levels thinks, and since he is the creator of an AI image generator himself, I take his opinion pretty seriously.

But, if you want to judge for yourself, here are some examples of the two side-by-side (SD 3.5 is on the left, Flux is on the right).

So, it appears that Stable Diffusion 3.5 is a step in the right direction for the embattled AI pioneer, but there is still work to be done if Stability wants to reclaim their AI image generation crown.

Stephen Flanders

Stephen Flanders is an Indie Hackers journalist and a professional writer who covers all things tech and startups. His work is read by millions of readers daily and covers industries from crypto and AI to startups and entrepreneurship. In his free time, he is building his own WordPress plugin, Raffle Leader.