This Paper Introduces DiLightNet: A Novel Artificial Intelligence Method for Exerting Fine-Grained Lighting Control during Text-Driven Diffusion-Based Image Generation
Researchers from Microsoft Research Asia, Zhejiang University, College of William & Mary, and Tsinghua University recently introduced DiLightNet, a novel method that addresses the challenge of fine-grained lighting control in text-driven diffusion-based image generation. Existing models can generate images from text prompts but offer little precise control over lighting conditions, so image content and lighting end up correlated.
Current text-driven generative models can produce detailed images from simple text prompts but cannot control lighting independently of image content. The proposed method addresses this with a three-stage process. First, a provisional image is generated under uncontrolled lighting. Second, a refined diffusion model, DiLightNet, resynthesizes the foreground object with precise lighting control using radiance hints. Finally, the background is inpainted to match the target lighting, yielding images consistent with both the text prompt and the specified lighting conditions.
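The three stages can be summarized as a short script. Below is a minimal sketch, assuming a Stable Diffusion checkpoint loaded through the `diffusers` library for the provisional stage; the stage-two and stage-three interfaces (`render_radiance_hints`, `dilightnet`, `inpaint_background`) are hypothetical placeholders, not the authors' released API.

```python
# Minimal sketch of the three-stage pipeline (stage 1 is runnable;
# stages 2-3 are shown as hypothetical calls, since the paper's API is not assumed).
import torch
from diffusers import StableDiffusionPipeline

# Stage 1: generate a provisional image under uncontrolled lighting.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
prompt = "a ceramic toy owl on a wooden table"
provisional = pipe(prompt).images[0]

# Stage 2 (hypothetical interface): estimate coarse foreground shape,
# render radiance hints under the target lighting, and resynthesize the
# foreground with the DiLightNet-refined diffusion model.
# hints = render_radiance_hints(provisional, target_envmap)
# foreground = dilightnet(prompt, provisional, hints)

# Stage 3 (hypothetical interface): inpaint a background consistent
# with the target lighting around the relit foreground.
# result = inpaint_background(foreground, target_envmap, prompt)
```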
DiLightNet leverages radiance hints and visualizations of the scene geometry under the target lighting to guide the diffusion process. These hints are derived from a coarse estimate of the foreground object’s shape obtained from the provisional image. The model is trained on a diverse synthetic dataset containing objects with various shapes, materials, and lighting conditions. Extensive experiments demonstrate the efficacy of the proposed method in achieving consistent lighting control across different text prompts and lighting conditions.
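To make the radiance-hint idea concrete, the snippet below is a minimal NumPy sketch of one diffuse hint computed from a per-pixel normal map and a single directional light. The actual method renders the coarse foreground geometry under the full target lighting with several predefined materials; the function name `diffuse_radiance_hint` and the toy inputs here are illustrative assumptions.

```python
# Lambertian radiance hint: cosine shading of estimated normals by one light.
import numpy as np

def diffuse_radiance_hint(normals: np.ndarray, light_dir: np.ndarray) -> np.ndarray:
    """Shade an (H, W, 3) unit-normal map with a single directional light."""
    l = light_dir / np.linalg.norm(light_dir)
    shading = np.clip(normals @ l, 0.0, None)  # clamp back-facing pixels to 0
    return shading  # (H, W) hint image in [0, 1]

# Toy example: a flat, camera-facing normal map lit from the upper right.
normals = np.zeros((64, 64, 3))
normals[..., 2] = 1.0
hint = diffuse_radiance_hint(normals, np.array([0.5, 0.5, 1.0]))
print(hint.shape, float(hint.max()))  # (64, 64) ~0.816
```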
The paper presents an effective approach to fine-grained lighting control in text-driven image generation. The experiments show that the method generates realistic images consistent with both text prompts and specified lighting conditions, a valuable advance for controllable image generation.
Check out the Paper. All credit for this research goes to the researchers of this project.
Pragati Jhunjhunwala is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Kharagpur. She is a tech enthusiast with a keen interest in software and data science applications, and she is always reading about developments in different fields of AI and ML.