Stability AI Introduces SDXL Turbo: A Real-Time Text-to-Image Generation Model

On Dec 5, 2023

Stability AI has introduced SDXL Turbo, a text-to-image model built on a new distillation method called Adversarial Diffusion Distillation (ADD). The method lets the model produce high-fidelity images fast enough for real-time text-to-image generation.

SDXL Turbo builds on its predecessor, SDXL 1.0. ADD combines adversarial training with score distillation, which lets the model retain image quality while cutting the number of sampling steps from 50 to just one. The accompanying research paper covers the technical details of the technique.

ADD also gives SDXL Turbo a key advantage associated with Generative Adversarial Networks (GANs), single-step image synthesis, while avoiding the artifacts and blurriness commonly observed with other distillation methods.
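For readers who want to try single-step sampling locally, the following is a minimal sketch. It assumes the Hugging Face diffusers library, a CUDA GPU, and the publicly released stabilityai/sdxl-turbo checkpoint, none of which are named in this article. The helper names turbo_call_kwargs and generate are hypothetical and exist only for illustration.

```python
def turbo_call_kwargs(prompt, steps=1):
    """Build sampling arguments for a few-step, ADD-distilled pipeline.

    Hypothetical helper for illustration; not part of any library.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return {
        "prompt": prompt,
        # One denoising step is the headline setting; the article also
        # reports results with four steps.
        "num_inference_steps": steps,
        # The model card says the checkpoint does not use classifier-free
        # guidance, so guidance is disabled here.
        "guidance_scale": 0.0,
    }


def generate(prompt, steps=1, device="cuda"):
    """Load the pipeline and return one PIL image (downloads the weights)."""
    # Imported lazily so the helper above works without a GPU stack installed.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo",   # assumed checkpoint name on the Hugging Face Hub
        torch_dtype=torch.float16,  # fp16, matching the precision quoted in the article
        variant="fp16",
    )
    pipe.to(device)
    return pipe(**turbo_call_kwargs(prompt, steps)).images[0]
```

With the weights downloaded, generate("a photo of a red fox in the snow").save("fox.png") would write a single-step sample to disk. Passing steps=4 corresponds to the four-step setting mentioned in the evaluations.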

In performance evaluations, SDXL Turbo was compared against several model variants (StyleGAN-T++, OpenMUSE, IF-XL, SDXL, and LCM-XL). In blind tests assessing prompt adherence and image quality, SDXL Turbo with a single step outperformed a 4-step LCM-XL configuration, and with only four steps it outperformed a 50-step SDXL configuration. These results indicate that SDXL Turbo can beat state-of-the-art multi-step models at significantly lower computational cost without sacrificing image quality.

SDXL Turbo's inference speed is also notable. On an A100, the model generates a 512×512 image in 207ms (prompt encoding + a single denoising step + decoding, fp16), of which a single UNet forward evaluation accounts for 67ms.
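As a back-of-the-envelope illustration of what these two figures imply, the sketch below separates the fixed cost (prompt encoding and decoding) from the per-step UNet cost. It assumes the fixed cost is simply 207ms minus 67ms and that every additional denoising step costs the same 67ms. Both are simplifying assumptions rather than measurements from the source, and the function names are illustrative.

```python
# Figures quoted in the article (A100, 512x512, fp16).
TOTAL_MS = 207.0      # prompt encoding + one denoising step + decoding
UNET_STEP_MS = 67.0   # one UNet forward evaluation


def fixed_overhead_ms(total_ms=TOTAL_MS, unet_ms=UNET_STEP_MS):
    """Time spent outside the UNet: prompt encoding plus decoding."""
    return total_ms - unet_ms


def estimated_latency_ms(steps, total_ms=TOTAL_MS, unet_ms=UNET_STEP_MS):
    """Estimate end-to-end latency, assuming a constant cost per step."""
    return fixed_overhead_ms(total_ms, unet_ms) + steps * unet_ms


def images_per_second(latency_ms):
    """Convert a per-image latency in milliseconds to sequential throughput."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    for steps in (1, 4):
        latency = estimated_latency_ms(steps)
        print(f"{steps} step(s): ~{latency:.0f} ms, "
              f"~{images_per_second(latency):.2f} images/s")
```

Under these assumptions, roughly 140ms of the 207ms is spent outside the UNet, so a four-step run would take about 408ms. The fixed encoding and decoding cost, rather than the denoising step, dominates single-step latency.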

To try SDXL Turbo firsthand, readers can explore real-time image generation through Clipdrop, the image editing platform, where a beta demonstration turns text prompts into images as they are typed. Clipdrop works in most browsers and offers a free trial of SDXL Turbo.

Niharika is a technical consulting intern at Marktechpost. She is a third-year undergraduate pursuing her B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She has a keen interest in machine learning, data science, and AI, and is an avid reader of the latest developments in these fields.
