Researchers from the University of Wisconsin and ByteDance Introduce PanoHead: The First 3D GAN Framework that Synthesizes View-Consistent Full Head Images with only Single-View Images

On Jul 6, 2023

Photo-realistic portrait synthesis is a long-standing focus of computer vision and graphics, with downstream applications in virtual avatars, telepresence, immersive gaming, and many other areas. Recent Generative Adversarial Networks (GANs) synthesize images of remarkably high quality, often indistinguishable from real photographs. Contemporary generative methods, however, operate on 2D convolutional networks and do not model the underlying 3D scene. As a result, they cannot guarantee 3D consistency when synthesizing head images in different poses. Traditional methods for producing 3D heads with varied shapes and appearances instead rely on a parametric textured mesh model learned from large collections of 3D scans.

The images rendered from such models, however, lack fine detail and have limited expressiveness and perceptual quality. With the advent of differentiable rendering and implicit neural representations, conditional generative models have been developed to produce more realistic 3D-aware face images. These methods, however, frequently depend on multi-view images or 3D scan supervision, which is hard to acquire and covers a limited appearance distribution because it is normally captured in controlled environments. Recent advances in implicit neural representations for 3D scene modeling and in GANs for image synthesis have accelerated the development of 3D-aware generative models.

Figure 1: PanoHead enables high-fidelity geometry and 360° view-consistent, photo-realistic full-head image synthesis, making it possible to create realistic 3D portraits from a single-view image.

One of these, the pioneering 3D GAN EG3D, achieves impressive quality in view-consistent image synthesis and was trained on in-the-wild single-view image collections. Such 3D GAN methods, however, can only synthesize near-frontal views. Researchers from ByteDance and the University of Wisconsin-Madison propose PanoHead, a novel 3D-aware GAN trained solely on unstructured in-the-wild images that enables high-quality full 3D head synthesis in 360°. The model’s ability to synthesize consistent 3D heads viewable from all angles benefits many immersive interaction scenarios, including telepresence and digital avatars. The researchers believe theirs is the first 3D GAN approach to fully achieve 3D head synthesis in 360 degrees.

Several major technical obstacles stand in the way of full 3D head synthesis with 3D GAN frameworks like EG3D. First, many 3D GANs cannot separate foreground from background, which leads to 2.5D head geometry. The background, typically formed as a wall-like structure, becomes entangled with the generated head in 3D, so large poses cannot be rendered. To address this, the researchers develop a foreground-aware tri-discriminator that uses prior knowledge from 2D image segmentation to jointly learn the decomposition of the foreground head in 3D space. Second, hybrid 3D scene representations such as the tri-plane, despite their efficiency and compactness, introduce significant projection ambiguity for 360-degree camera poses, resulting in a “mirrored face” on the back of the head.
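The projection ambiguity behind the “mirrored face” can be illustrated with a toy tri-plane. The sketch below is not the EG3D or PanoHead code: the plane resolution, channel count, nearest-neighbour lookup, and the choice of z as the front-to-back axis are all assumptions made for illustration. It shows that a point on the face and its mirror on the back of the head read exactly the same feature from the XY plane.

```python
import numpy as np

rng = np.random.default_rng(0)
RES, CH = 8, 4  # hypothetical plane resolution and feature channels

# Three axis-aligned feature planes, as in a tri-plane representation.
planes = {
    "xy": rng.normal(size=(RES, RES, CH)),
    "xz": rng.normal(size=(RES, RES, CH)),
    "yz": rng.normal(size=(RES, RES, CH)),
}

def to_index(coord):
    """Map a coordinate in [-1, 1] to the nearest texel index."""
    return int(round((coord + 1.0) / 2.0 * (RES - 1)))

def sample_triplane(point):
    """Return the per-plane features of a 3D point (nearest-neighbour lookup)."""
    x, y, z = point
    ix, iy, iz = to_index(x), to_index(y), to_index(z)
    return {
        "xy": planes["xy"][ix, iy],  # depends on x and y only, never on z
        "xz": planes["xz"][ix, iz],
        "yz": planes["yz"][iy, iz],
    }

# Assumption: z is the front-to-back axis of the head.
front = sample_triplane((0.2, 0.1, 0.8))   # a point on the face
back = sample_triplane((0.2, 0.1, -0.8))   # its mirror on the back of the head

xy_shared = np.array_equal(front["xy"], back["xy"])
xz_shared = np.array_equal(front["xz"], back["xz"])
```

Here `xy_shared` is true while `xz_shared` is false: the frontal plane contributes identical features to both points, so strong facial features stored there can leak onto the back of the head.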

To address the mirrored-face problem, they propose a novel 3D tri-grid volume representation that disentangles the frontal features from the back of the head while preserving the efficiency of tri-plane representations. Finally, obtaining accurate camera extrinsics for in-the-wild back-head images is very difficult when training 3D GANs. In addition, image alignment differs between these images and frontal photos with detectable facial landmarks. This alignment gap leads to unappealing head geometry and a noisy appearance. The researchers therefore propose a novel two-stage alignment scheme that consistently aligns images from all views, which considerably eases the learning task for 3D GANs.
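One way to picture the tri-grid idea is to give each feature plane a small extra depth axis, so that the lookup also depends on the coordinate the plane would otherwise ignore. The sketch below is a simplified illustration under that assumption, not the authors’ implementation: the sizes, the nearest-neighbour lookup, and the names `xy_grid` and `sample_xy_grid` are hypothetical, and only one of the three grids is shown.

```python
import numpy as np

rng = np.random.default_rng(0)
RES, DEPTH, CH = 8, 3, 4  # hypothetical sizes; DEPTH is the extra axis

# One grid of the tri-grid: an XY feature plane with a few slices along z.
xy_grid = rng.normal(size=(DEPTH, RES, RES, CH))

def to_index(coord, size):
    """Map a coordinate in [-1, 1] to the nearest index along an axis."""
    return int(round((coord + 1.0) / 2.0 * (size - 1)))

def sample_xy_grid(point):
    """Look up the XY-grid feature; the slice is chosen by the z coordinate."""
    x, y, z = point
    return xy_grid[to_index(z, DEPTH), to_index(x, RES), to_index(y, RES)]

# Assumption: z is the front-to-back axis of the head.
front_feat = sample_xy_grid((0.2, 0.1, 0.8))
back_feat = sample_xy_grid((0.2, 0.1, -0.8))
features_differ = not np.array_equal(front_feat, back_feat)
```

With `DEPTH = 1` every point would fall into the same slice and the grid would behave like a plain feature plane; a small depth greater than one lets the front and back of the head read different features while adding far less memory than a full 3D voxel grid.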

Specifically, they propose a camera self-adaptation module that dynamically adjusts the rendering camera positions to compensate for alignment drift in the back-head images. As Figure 1 shows, their approach substantially improves the ability of 3D GANs to adapt to in-the-wild full-head images from arbitrary viewpoints. The resulting 3D GAN generates high-fidelity 360° RGB images and geometry and outperforms state-of-the-art methods on quantitative metrics. With this model, they demonstrate easy 3D portrait creation by reconstructing a full head in 3D from a single monocular-view image.
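The article does not say how the camera offsets are parameterized or predicted, so the following is only a rough sketch of the general idea: a noisy nominal camera pose is corrected by a small, bounded residual before rendering. The (yaw, pitch, radius) parameterization, the clipping bound, and the hard-coded offset that stands in for a learned prediction are all assumptions.

```python
import numpy as np

def camera_position(yaw, pitch, radius):
    """Camera location on a sphere around the head, looking at the origin."""
    return radius * np.array([
        np.cos(pitch) * np.sin(yaw),
        np.sin(pitch),
        np.cos(pitch) * np.cos(yaw),
    ])

def adapt_camera(nominal, offset, max_offset=0.1):
    """Add a residual correction, clipped so it stays a small adjustment."""
    offset = np.clip(offset, -max_offset, max_offset)
    return np.asarray(nominal, dtype=float) + offset

# Hypothetical values: a rear view whose estimated pose is slightly off.
nominal = np.array([np.pi, 0.0, 2.7])     # yaw, pitch, radius
predicted = np.array([0.04, -0.5, 0.0])   # stand-in for a learned offset
adapted = adapt_camera(nominal, predicted)
pos = camera_position(*adapted)
```

Clipping keeps the correction local: the large pitch offset of -0.5 is reduced to -0.1, so the module can absorb alignment drift without overriding the estimated pose entirely.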

The following is a summary of their principal contributions:

• The first 3D GAN framework capable of view-consistent, high-fidelity 360-degree full-head image synthesis. The researchers illustrate the approach with high-quality monocular 3D head reconstruction from in-the-wild images.

• A novel tri-grid formulation for representing 3D 360-degree head scenes that balances efficiency and expressiveness.

• A tri-discriminator that disentangles 2D background synthesis from 3D foreground head modeling.

• A novel two-stage image alignment scheme that adaptively accommodates imperfect camera poses and misaligned image cropping, enabling the training of 3D GANs from in-the-wild images with a wide distribution of camera poses.
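The separation of a 2D background from a 3D foreground head, as listed above, can be sketched as mask-based compositing followed by feeding the mask to the discriminator alongside the image. This is a minimal illustration, not the PanoHead architecture: the random arrays stand in for rendered images, and the four-channel input layout is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 4  # tiny hypothetical image size

fg_rgb = rng.uniform(size=(3, H, W))  # stand-in for the volume-rendered head
bg_rgb = rng.uniform(size=(3, H, W))  # stand-in for a generated 2D background
mask = np.zeros((1, H, W))            # foreground mask (1 = head pixel)
mask[:, 1:3, 1:3] = 1.0

# Blend foreground over background with the mask.
composite = mask * fg_rgb + (1.0 - mask) * bg_rgb

# Assumed layout: the discriminator sees the image together with its mask,
# so a 2D segmentation prior can supervise the foreground decomposition.
disc_input = np.concatenate([composite, mask], axis=0)
```

Because the discriminator also receives the mask, a generator that paints the background into the 3D head volume produces masks that disagree with real segmentation masks, which gives the network a signal to keep the two apart.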

Check out the Paper, GitHub Repo, and Project.

Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.
