Take This and Make It a Digital Puppet: GenMM Is an AI Model That Can Synthesize Motion Using a Single Example
Computer-generated animations are becoming more and more realistic every day. This advancement can be best seen in video games. Think about the first Lara Croft in the Tomb Raider series and the most recent Lara Croft. We went from a puppet with 230 polygons doing funky movements to a life-like character moving smoothly on our screens.
Generating natural and diverse motions in computer animation has long been a challenging problem. Traditional methods, such as motion capture systems and manual animation authoring, are expensive and time-consuming, which results in limited motion datasets that lack diversity in style, skeletal structures, and model types. Because so much of this work is done by hand, the industry needs an automated solution.
Earlier data-driven motion synthesis methods have had limited success. In recent years, however, deep learning has emerged as a powerful technique in computer animation, capable of synthesizing diverse and realistic motions when trained on large and comprehensive datasets.
Deep learning methods have demonstrated impressive results in motion synthesis, but they suffer from drawbacks that limit their practical applicability. Firstly, they require long training times, which can be a significant bottleneck in the animation production pipeline. Secondly, they are prone to visual artifacts such as jittering or over-smoothing, which affect the quality of the synthesized motions. Lastly, they struggle to scale well to large and complex skeleton structures, limiting their use in scenarios where intricate motions are required.
There is clear demand for a reliable motion synthesis method that works in practical scenarios, but these issues are not easy to overcome. So, what can the solution be? Time to meet GenMM.
GenMM takes a different route, building on the classical ideas of nearest-neighbor motion search and motion matching. Motion matching is widely used in the industry for character animation; it produces high-quality animations that look natural and adapt to varying local contexts.
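To make the nearest-neighbor idea concrete, here is a minimal sketch of patch lookup, not GenMM's actual implementation. It assumes a motion is a list of frames, each frame a flat list of numbers (for example joint rotations), and that patches are compared with a plain squared L2 distance. The function names and the frame layout are illustrative assumptions.

```python
from typing import List, Tuple

Frame = List[float]   # one pose; the flat layout is an assumption for this sketch
Motion = List[Frame]  # a sequence of poses


def extract_patches(motion: Motion, size: int) -> List[Motion]:
    """Return every window of `size` consecutive frames."""
    return [motion[i:i + size] for i in range(len(motion) - size + 1)]


def patch_distance(a: Motion, b: Motion) -> float:
    """Squared L2 distance between two patches of equal length."""
    return sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb))


def nearest_patch(query: Motion, examples: List[Motion]) -> Tuple[int, float]:
    """Index and distance of the example patch closest to `query`."""
    best = min(range(len(examples)),
               key=lambda i: patch_distance(query, examples[i]))
    return best, patch_distance(query, examples[best])


# Toy one-dimensional "motion" with five frames.
example: Motion = [[0.0], [1.0], [2.0], [3.0], [4.0]]
patches = extract_patches(example, size=3)

# A slightly noisy query patch; the lookup snaps it to the closest real patch.
index, dist = nearest_patch([[1.1], [2.1], [2.9]], patches)
print(index, patches[index])
```

Replacing each rough patch with its closest match from real data is what keeps the output looking like captured motion rather than an averaged, over-smoothed blend.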
GenMM is a generative model that can extract diverse motions from a single example sequence or a handful of them. Instead of learning from a large training set, it treats the provided examples as its entire pool of motion patches, much as motion matching treats a motion capture database as an approximation of the natural motion space.
GenMM introduces bidirectional similarity as a new generative cost function. This measure checks two things: that the synthesized sequence contains only motion patches from the provided examples, and that every motion patch in the examples shows up in the synthesized sequence. This preserves the quality of motion matching while adding generative capability. To increase diversity further, GenMM uses a multi-stage framework that progressively synthesizes motion sequences whose patch distribution differs minimally from that of the examples. Inspired by the success of GAN-based methods in image synthesis, it also feeds an unconditional noise input into the pipeline to achieve highly diverse results.
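The two-way check can be illustrated with the classic bidirectional similarity measure from image summarization, which sums a "completeness" term and a "coherence" term. The sketch below is an assumption-laden illustration rather than GenMM's exact cost: the toy frame layout, the squared L2 patch distance, and all function names are hypothetical.

```python
from typing import List

Frame = List[float]   # one pose; the flat layout is an assumption for this sketch
Motion = List[Frame]


def patches(motion: Motion, size: int) -> List[Motion]:
    """All windows of `size` consecutive frames."""
    return [motion[i:i + size] for i in range(len(motion) - size + 1)]


def dist(a: Motion, b: Motion) -> float:
    """Squared L2 distance between two equal-length patches."""
    return sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb))


def one_way(src: List[Motion], dst: List[Motion]) -> float:
    """Mean distance from each patch in `src` to its nearest patch in `dst`."""
    return sum(min(dist(p, q) for q in dst) for p in src) / len(src)


def bidirectional_dissimilarity(example: Motion, synth: Motion, size: int) -> float:
    ex, sy = patches(example, size), patches(synth, size)
    completeness = one_way(ex, sy)  # is every example patch present in the synthesis?
    coherence = one_way(sy, ex)     # does every synthesized patch come from the example?
    return completeness + coherence


example: Motion = [[0.0], [1.0], [2.0], [3.0]]

same = bidirectional_dissimilarity(example, example, size=2)
# Drops part of the example: penalized by the completeness term.
truncated = bidirectional_dissimilarity(example, [[0.0], [1.0]], size=2)
# Contains a patch the example never had: penalized by the coherence term.
foreign = bidirectional_dissimilarity(example, example + [[9.0]], size=2)
print(same, truncated, foreign)
```

A score of zero means the two sequences share exactly the same set of patches. Either omitting example patches or inventing new ones raises the score, which is the "and vice versa" behavior the cost function is after.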
Beyond diverse motion generation, GenMM is a versatile framework that extends to scenarios motion matching alone cannot handle. These include motion completion, keyframe-guided generation, infinite looping, and motion reassembly, which shows the broad range of applications the generative motion matching approach enables.
Ekrem Çetinkaya received his B.Sc. in 2018, and M.Sc. in 2019 from Ozyegin University, Istanbul, Türkiye. He wrote his M.Sc. thesis about image denoising using deep convolutional networks. He received his Ph.D. degree in 2023 from the University of Klagenfurt, Austria, with his dissertation titled “Video Coding Enhancements for HTTP Adaptive Streaming Using Machine Learning.” His research interests include deep learning, computer vision, video encoding, and multimedia networking.