Stable Diffusion 3 released by surprise! Same architecture as Sora, and everything looks more realistic

White See Poor Seedfei, reporting from Temple | QbitAI (public account: QbitAI)

Stable Diffusion 3 is finally here!

It has been more than a year in the making. Compared with the previous generation, three capabilities have improved.

Let's go straight to the results!

First, text rendering is off the charts.

Look at the chalk writing on this blackboard:

"Go Big or Go Home": that line has a real edge to it.

The neon-style street sign and the illuminated sign on the bus:

And a "good night" stitched in embroidery, where you can almost make out the needlework:

As soon as the samples were posted, netizens exclaimed: too accurate.

Some even said: hurry up and add Chinese support.

Second, multi-subject prompt handling is maxed out.

What does that mean? No matter how many "elements" you pack into the prompt, Stable Diffusion 3 will not miss a single one.

Look closely at the image below: there is an "astronaut", a "little pig wearing a tutu", a "pink umbrella", and a "robin wearing a top hat". In the corner, the words "Stable Diffusion" appear in large letters (not a watermark).

With this ability, a single image can be as rich as you want it to be.

Finally, image quality has improved yet again.

Looking at the images above, weren't you struck by them?

And the ultra-sharp close-ups are something else.

The official waitlist is now open, and anyone can apply on the official website.

Ahem. It has to be said that the AI world has been really lively lately.

Some netizens cried out: my computer can't hold up anymore...

Stable Diffusion 3 is here!

Just how good are the new Stable Diffusion's results? Here are some more samples.

Of course, all the images come from official sources, such as Stability AI's head of media:

It has to be said that the text rendering really is the most eye-catching part: it comes out well in every form and fits the scene.

The picture above inevitably brings to mind the earlier story of Midjourney's embarrassing appearance in academia, where it was used for the figures of a biology paper. With SD3, could we produce truly professional academic figures?

Beyond these, SD3's "alcohol ink painting" style is also quite distinctive:

Anime style:

Here too, clear text can be added on top.

Since access currently requires joining the waitlist, hands-on testing is not yet practical.

But resourceful netizens have already fed the same prompts to Midjourney (V6.0).

Take the "red apple and blackboard" image from the beginning (with "Go Big or Go Home" written in chalk):

Midjourney's final results are as follows:

Judging from this comparison, SD3 comes out ahead in text spelling as well as in image quality and color coordination.

On the technical side, the models currently come in a range of sizes, from 800M to 8B parameters.

The detailed technical report has not yet been published. So far, Stability AI has only revealed that the model mainly combines a diffusion transformer architecture with Flow Matching.

The former is in fact the same as Sora's. The linked paper is DiT, written in 2022 by William Peebles and Saining Xie.

DiT combined the Transformer with diffusion models for the first time, and the paper was accepted as an oral at ICCV 2023.

In this work, the researchers trained latent diffusion models in which the commonly used U-Net backbone is replaced by a Transformer that operates on latent patches. They analyzed the scalability of the Diffusion Transformer (DiT) through forward-pass complexity, measured in Gflops.
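To make "operates on latent patches" concrete, here is a minimal pure-Python sketch of the patchify step. The function name `patchify`, the 32 x 32 x 4 latent shape, and the patch size of 2 are assumptions chosen for illustration (they follow the 256 x 256 setting described in the DiT paper); nothing here reflects SD3's actual, unpublished configuration.

```python
# Illustrative sketch only: a pure-Python "patchify" step in the spirit of DiT.
# The function name, latent shape, and patch size are assumptions for this example.

def patchify(latent, patch_size):
    """Split an H x W x C latent (nested lists) into a sequence of flat patch tokens."""
    height = len(latent)
    width = len(latent[0])
    if height % patch_size or width % patch_size:
        raise ValueError("latent height and width must be divisible by patch_size")

    tokens = []
    # Walk the latent patch by patch, in row-major order.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            token = []
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    token.extend(latent[row][col])  # append all C channel values
            tokens.append(token)
    return tokens


# A dummy 32 x 32 x 4 latent, split with patch size 2.
latent = [[[float(r * 32 + c)] * 4 for c in range(32)] for r in range(32)]
tokens = patchify(latent, patch_size=2)
print(len(tokens), len(tokens[0]))  # prints: 256 16
```

The Transformer then treats these 256 tokens like a sentence of 256 words. Halving the patch size quadruples the token count, which is one of the knobs that drives the forward-pass compute the authors measure.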

The latter, Flow Matching, also dates from 2022 and comes from researchers at Meta AI and the Weizmann Institute of Science.

They proposed a new paradigm for generative modeling built on continuous normalizing flows (CNFs), together with the concept of Flow Matching: a simulation-free approach for training CNFs by regressing vector fields of fixed conditional probability paths. They found that using Flow Matching with diffusion paths makes training more robust and stable.
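The "regress a vector field along a fixed conditional path" idea can be sketched on 1-D toy data. The example below uses the straight-line (optimal-transport) path that the Flow Matching paper describes alongside diffusion paths, chosen here only because its arithmetic is simplest. All names (`sample_on_path`, `target_velocity`, `flow_matching_loss`, `oracle`) and the setting `SIGMA_MIN = 0.0` are hypothetical, and which path SD3 actually uses has not been disclosed.

```python
# Illustrative sketch only: one loss evaluation for conditional Flow Matching on
# 1-D toy data, using a straight-line path from noise to data.
# All names are hypothetical; this is not SD3's training code.
import random

SIGMA_MIN = 0.0  # assumption: no residual noise at t = 1, to keep the arithmetic simple


def sample_on_path(x0, x1, t):
    """Point at time t on the path from noise x0 (t = 0) to data x1 (t = 1)."""
    return (1.0 - (1.0 - SIGMA_MIN) * t) * x0 + t * x1


def target_velocity(x0, x1):
    """Time derivative of the path above; this is the regression target."""
    return x1 - (1.0 - SIGMA_MIN) * x0


def flow_matching_loss(model, data, rng):
    """Monte Carlo estimate of E[(model(x_t, t) - target)^2] over t, x0, x1."""
    total = 0.0
    for x1 in data:
        t = rng.random()          # t ~ Uniform[0, 1)
        x0 = rng.gauss(0.0, 1.0)  # noise sample
        x_t = sample_on_path(x0, x1, t)
        total += (model(x_t, t) - target_velocity(x0, x1)) ** 2
    return total / len(data)


# Toy check: if every data point equals 3.0, the exact velocity at (x_t, t) is
# (3.0 - x_t) / (1 - t), so a model returning it should give a loss near zero.
def oracle(x_t, t):
    return (3.0 - x_t) / (1.0 - t)


def zero_model(x_t, t):
    return 0.0


data = [3.0] * 1000
oracle_loss = flow_matching_loss(oracle, data, random.Random(0))
zero_loss = flow_matching_loss(zero_model, data, random.Random(0))
print(oracle_loss < 1e-6 < zero_loss)
```

Note that no ODE is ever integrated during training: each step only samples a time, a noise value, and a data point, then compares the model's output with a closed-form target. That is what "simulation-free" refers to.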

But after seeing so much progress in video generation lately, some netizens said:


One More Thing

In addition, just the day before, their video product Stable Video officially entered open beta.

It is based on SVD 1.1 (Stable Video Diffusion 1.1) and is available to everyone.

It mainly supports two functions: text-to-video and image-to-video.

Reference links:
[1] https://stability.ai/news/stable-diffusion-3
[2] https://arxiv.org/abs/2212.09748
[3] https://arxiv.org/abs/2210.02747
[4] https://twitter.com/pabloaumemene/Status/1760678508173660543