Salesforce AI Research Introduces BannerGen: An Open-Source Library for Multi-Modality Banner Generation

Effective graphic design is the backbone of a successful marketing campaign. It bridges designers and their audience by capturing attention, highlighting essential details, and enhancing the campaign’s visual appeal. However, current design workflows are time-consuming and rely on layer-by-layer assembly, which requires expertise and does not scale easily.

To address this issue, researchers at Salesforce have introduced BannerGen, an open-source library that streamlines the design process using generative AI. The library consists of three parallel multimodal banner generation methods: LayoutDETR, LayoutInstructPix2Pix, and Framed Template RetrieveAdapter. Each has been trained on a large corpus of designed graphical data, which allows it to expedite the design process. All three are open-sourced in BannerGen’s GitHub repository and can be imported as Python modules, making it easy for developers to experiment with each method. BannerGen also ships with licensed fonts and carefully crafted templates, allowing developers to build high-quality designs.

Users upload the image they want to turn into a banner. The image is then cropped around its main elements to create multiple sub-images. Users can also specify the type of banner they want and the text to include. The sub-images are then integrated into the selected template, and the final design is produced as an HTML file and a PNG file.
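
The cropping step can be sketched in a few lines. This is a minimal illustration, not BannerGen’s actual code: the `Box` type, the `crop_for_aspect` function, and the idea of a single focal box around the main element are all assumptions made for the example. It computes, for each target aspect ratio, the largest crop that fits inside the image while staying as close as possible to the focal region.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Pixel rectangle (hypothetical helper type, not part of BannerGen)."""
    left: int
    top: int
    right: int
    bottom: int


def crop_for_aspect(img_w: int, img_h: int, focus: Box, aspect: float) -> Box:
    """Return the largest crop with width/height == aspect, centered on
    `focus` where possible and clamped to the image bounds."""
    # Largest crop of the requested aspect ratio that fits in the image.
    crop_w = min(img_w, int(img_h * aspect))
    crop_h = min(img_h, int(crop_w / aspect))
    # Center the crop on the focal box, then clamp it inside the image.
    cx = (focus.left + focus.right) // 2
    cy = (focus.top + focus.bottom) // 2
    left = max(0, min(cx - crop_w // 2, img_w - crop_w))
    top = max(0, min(cy - crop_h // 2, img_h - crop_h))
    return Box(left, top, left + crop_w, top + crop_h)


# Example: a 1000x500 image whose main element sits on the right-hand side.
focus = Box(700, 100, 900, 300)
# One sub-image per assumed banner shape: square, wide, and tall.
sub_images = [crop_for_aspect(1000, 500, focus, a) for a in (1.0, 2.0, 0.5)]
```

In a real pipeline the focal box would come from a saliency or object detector; here it is supplied by hand to keep the sketch self-contained.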

The first method, LayoutDETR, builds on the DETR architecture. The researchers have modified the DETR decoder to handle multimodal foreground inputs and have integrated the VAEGAN framework into their approach to align the generated designs with real-world patterns. This architecture helps BannerGen better understand the background and foreground elements, leading to better results.

The second method, LayoutInstructPix2Pix, builds on InstructPix2Pix, an image-to-image editing technique powered by diffusion models. It has been fine-tuned to convert background images into images with superimposed text.

The third method, Framed Template RetrieveAdapter, is used to enhance the diversity of generated designs. It consists of three components: the retriever, which finds the best-suited frame based on a set of metrics; the adapter, which customizes input images and texts to fit the frame; and the renderer, which produces the design in HTML/CSS by integrating the background layer with the user’s inputs.
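
The three-component structure described above can be illustrated with a toy pipeline. This is a sketch under stated assumptions, not the library’s implementation: the `Frame` class, the `retrieve`, `adapt`, and `render` functions, and the choice of aspect-ratio distance and text-slot count as retrieval metrics are all hypothetical, since the article does not say which metrics BannerGen uses.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Frame:
    """A banner template (hypothetical stand-in for a BannerGen frame)."""
    name: str
    width: int
    height: int
    text_slots: int  # number of text boxes the frame provides


def retrieve(frames, img_w, img_h, n_texts):
    """Retriever: among frames with enough text slots, pick the one whose
    aspect ratio is closest to the image's (an assumed metric)."""
    candidates = [f for f in frames if f.text_slots >= n_texts]
    if not candidates:
        raise ValueError("no frame has enough text slots")
    target = img_w / img_h
    return min(candidates, key=lambda f: abs(f.width / f.height - target))


def adapt(frame, img_w, img_h):
    """Adapter: scale the image so it fully covers the frame."""
    scale = max(frame.width / img_w, frame.height / img_h)
    return round(img_w * scale), round(img_h * scale)


def render(frame, image_url, texts, bg_size):
    """Renderer: emit HTML/CSS with the image as the background layer."""
    bg_w, bg_h = bg_size
    style = (
        f"width:{frame.width}px;height:{frame.height}px;"
        f"background-image:url('{escape(image_url)}');"
        f"background-size:{bg_w}px {bg_h}px;background-position:center;"
    )
    lines = "".join(f"<p>{escape(t)}</p>" for t in texts)
    return f'<div class="banner" style="{style}">{lines}</div>'


# Example run with made-up frames, image size, and texts.
frames = [
    Frame("leaderboard", 728, 90, 1),
    Frame("rectangle", 300, 250, 2),
    Frame("skyscraper", 160, 600, 2),
]
texts = ["Sale & more", "Shop now"]
frame = retrieve(frames, 1200, 800, len(texts))
bg_size = adapt(frame, 1200, 800)
banner_html = render(frame, "product.png", texts, bg_size)
```

Keeping the three stages as separate functions mirrors the description: the retriever can be swapped for a different scoring scheme without touching how the frame is filled or rendered.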

In conclusion, BannerGen is a versatile framework that lets users create customized banners using generative AI. Its architecture has been designed to learn from real layouts and to understand both background and foreground elements. The final design is generated as an HTML file and a PNG file, which allows for easy manual adjustments and can be embedded into any media for immediate use. BannerGen aims to make graphic design less time-consuming and to help users generate high-quality, professional-grade designs.


Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views, illustrating its popularity among audiences.

