Stability AI Open-Sources Stable Diffusion: An Artificial Intelligence Model That Converts Text To Image

On Sep 7, 2022

In collaboration with Runway, the Machine Vision and Learning research group at LMU Munich, EleutherAI, and LAION, Stability AI created Stable Diffusion, a text-to-image model that quickly generates artwork from a prompt. Stable Diffusion can produce photorealistic 512×512 pixel images from a text description of a scene.

The model marks a major improvement in speed and quality and is efficient enough to run on consumer GPUs. As a result, image generation is now open to anyone, not only researchers, in a variety of settings.

Following the publication of the code and a restricted release of the model weights to the research community, the weights are now being made available to the general public. The most recent version of Stable Diffusion can be downloaded and run on standard desktop PCs. The model can do more than convert text to an image; it can also upscale images and transfer styles between them. With this launch, Stability AI has also made available a beta version of DreamStudio, its web-based user interface and API for the model.

Stable Diffusion supports several workflows. As with DALL-E, it can be prompted to produce an image that closely matches a given text description. It can also produce a photorealistic image from just a sketch and a short written description.
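For readers who want to try the text-to-image workflow locally, here is a minimal sketch using the Hugging Face diffusers library, the same library the Colab notebook linked under "Code" is built on. The model ID, the default seed, and the helper names (latent_shape, generate_image) are assumptions made for illustration, not details from the announcement. Downloading the weights may also require accepting the model license on the Hugging Face Hub first.

```python
# Illustrative sketch only: helper names and the model ID are assumptions.
LATENT_CHANNELS = 4  # Stable Diffusion v1 denoises a 4-channel latent
VAE_SCALE = 8        # the autoencoder downsamples each side by a factor of 8


def latent_shape(height=512, width=512):
    """Return the (channels, height, width) of the latent for an image size."""
    if height % VAE_SCALE or width % VAE_SCALE:
        raise ValueError("height and width must be multiples of 8")
    return (LATENT_CHANNELS, height // VAE_SCALE, width // VAE_SCALE)


def generate_image(prompt, model_id="CompVis/stable-diffusion-v1-4",
                   height=512, width=512, seed=0):
    """Generate one image from a text prompt and return it as a PIL image."""
    # Heavy imports stay inside the function so the helper above can be
    # used without torch or diffusers installed.
    import torch
    from diffusers import StableDiffusionPipeline

    latent_shape(height, width)  # fail early on unsupported sizes

    device = "cuda" if torch.cuda.is_available() else "cpu"
    pipe = StableDiffusionPipeline.from_pretrained(model_id).to(device)

    # A seeded generator makes the result repeatable for a given prompt.
    generator = torch.Generator(device=device).manual_seed(seed)
    result = pipe(prompt, height=height, width=width, generator=generator)
    return result.images[0]


# Example usage (downloads several GB of weights on first run):
# image = generate_image("a watercolor painting of a lighthouse at dawn")
# image.save("lighthouse.png")
```

The latent_shape helper is there only to show why 512×512 is cheap enough for consumer hardware: the model works on a much smaller 64×64 latent and decodes it to full resolution at the end.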

Further, the Stability AI team collaborated with Hugging Face and CoreWeave on the following:

The model is being made available under a CreativeML Open RAIL-M license. Both commercial and non-commercial use is permitted under this license. Users must ensure that the model is used in a way that does not break the law, and the license must be included in any distribution of the model. Any service built on the model must also make this information available to its end users.
An AI-based Safety Classifier is included as a standard component of the software package. It evaluates concepts and other factors in generated images to filter out results that the model’s user might find undesirable.
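The announcement does not say how the Safety Classifier's verdicts reach the caller. As one hedged illustration, the diffusers pipeline output carries an images list and a parallel nsfw_content_detected list of booleans, which is None when the checker is disabled. The sketch below assumes that interface; keep_unflagged is a hypothetical helper, not part of any library.

```python
def keep_unflagged(images, flags):
    """Drop every image the safety checker flagged.

    `images` and `flags` are parallel sequences. `flags` may be None,
    which is taken here to mean no checker ran, so all images are kept.
    """
    if flags is None:
        return list(images)
    if len(images) != len(flags):
        raise ValueError("images and flags must have the same length")
    return [image for image, flagged in zip(images, flags) if not flagged]


# Example usage with a diffusers pipeline result (assumed interface):
# result = pipe(["a castle", "a forest"])
# safe_images = keep_unflagged(result.images, result.nsfw_content_detected)
```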

These models were trained on image-text pairs from a broad internet scrape, which means the model may reproduce some social biases and produce harmful content. The team believes that open mitigation measures and open discussion of those biases can help improve the model. They therefore encourage everyone to use this resource responsibly and to take part in the community and related discussions to help improve the technology.

Code: https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_diffusion.ipynb

References:

https://www.infoq.com/news/2022/09/stable-diffusion-image-gen/
https://stability.ai/blog/stable-diffusion-public-release

Asif Razzaq is an AI Journalist and Cofounder of Marktechpost, LLC. He is a visionary, entrepreneur and engineer who aspires to use the power of Artificial Intelligence for good.
