A New AI Approach Based On Operator Splitting Methods For Accelerating Guided Sampling in Diffusion Models
Diffusion models have recently achieved state-of-the-art results in content generation, including images, videos, and music. In this paper, researchers from VISTEC in Thailand focus on accelerating the sampling of guided diffusion models, in which the sampling procedure is conditioned to generate examples that belong to a specific class (such as “dog” or “cat”) or that match an arbitrary prompt. The authors investigate numerical methods for solving differential equations as a way to speed up the sampling process of guided diffusion models. Such methods have already been applied to unconditional diffusion models, but the authors show that integrating them into guided diffusion models is challenging. They therefore propose more specialized integration schemes built upon the idea of “operator splitting.”
In the landscape of generative models, diffusion models belong to the family of likelihood-based methods, alongside normalizing flows and variational autoencoders: they are trained by maximizing a lower bound on the data likelihood and offer a stable training framework compared to generative adversarial networks (GANs), while achieving comparable sample quality. They can be described through a Markov chain that we would like to reverse: a point drawn from the high-dimensional data distribution is degraded by iteratively adding Gaussian perturbations (a kind of encoding procedure). The generative process consists of learning a denoising decoder that reverses these perturbations. The overall process is computationally costly, as it involves many denoising iterations. In this paper, the authors focus on the generative procedure, whose forward pass can be interpreted as the solution of a differential equation. The equation associated with the guided diffusion studied in the paper has the following form:
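In notation paraphrased from the paper (the symbols here are illustrative rather than verbatim):

$$\frac{d\bar{x}}{dt} \;=\; \epsilon_\theta\!\left(\bar{x}, t\right) \;+\; s\,\nabla_{\bar{x}} \log f\!\left(c \mid \bar{x}\right),$$

where $\bar{x}$ is the (scaled) sample, $\epsilon_\theta$ is the learned denoising network, $f(c \mid \bar{x})$ is the conditional density of the condition $c$ (e.g., a class label) given the sample, and $s$ is the guidance scale.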
The first term on the right-hand side is the diffusion term, while the second can be understood as a penalization term that enforces gradient ascent on the conditional distribution: it pulls the trajectory toward a high-density region of the conditional density f. The authors stress that directly applying a high-order numerical integration scheme (e.g., fourth-order Runge-Kutta or fourth-order Pseudo Linear Multi-Step) fails to accelerate the sampling procedure. Instead, they propose using a splitting method. Splitting methods are commonly used for solving differential equations that involve several operators. For example, the spread of a chemical pollutant in the ocean can be described by advection-diffusion equations: with a splitting method, we can first treat the transport of the pollutant (advection) and then apply the diffusion operator. The authors adopt this strategy by “splitting” the guided ODE above into two sub-problems, one per term, which are solved alternately to advance the solution from time t to time t+1, as in the sketch below.
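As a toy illustration of the idea, here is a minimal sketch in Python of a Lie-Trotter splitting step for a generic ODE dx/dt = A(x) + B(x). The sub-problems `step_A` and `step_B` below are hypothetical placeholders, not the paper’s actual diffusion and guidance terms:

```python
import numpy as np

def lie_trotter_step(x, t, h, step_A, step_B):
    """One Lie-Trotter splitting step for dx/dt = A(x) + B(x):
    advance the A sub-problem over [t, t + h], then the B sub-problem."""
    x = step_A(x, t, h)
    x = step_B(x, t, h)
    return x

# Toy sub-problems standing in for the diffusion and guidance terms
# (purely illustrative; the paper splits a learned denoising ODE).
def step_A(x, t, h):
    # Explicit Euler on dx/dt = -0.5 * x (a decaying, "diffusion-like" term)
    return x + h * (-0.5 * x)

def step_B(x, t, h):
    # Explicit Euler on dx/dt = (target - x) (a "guidance-like" pull toward a mode)
    target = 1.0
    return x + h * (target - x)

x, t, h = np.array([3.0]), 0.0, 0.1
for _ in range(50):
    x = lie_trotter_step(x, t, h, step_A, step_B)
    t += h
print(x)  # settles near the equilibrium where the two sub-dynamics balance
```

The point of the split is that each sub-problem can be handled with whatever integrator suits it best, which is exactly the freedom the authors exploit when choosing numerical schemes for the diffusion and guidance parts.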
Among existing splitting methods, the authors compare two: the Lie-Trotter splitting method and the Strang splitting method. For each splitting method, they investigate several numerical schemes for the resulting sub-problems. Their experiments cover text- and class-conditional generation, super-resolution, and inpainting. The results support their claims: the authors show that they can reproduce samples of the same quality as the baseline (which uses a 250-step integration scheme) while reducing sampling time by 32-58%.
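To situate the two schemes (in standard numerical-analysis notation, not notation from the paper): writing $\Phi_h^{A}$ and $\Phi_h^{B}$ for the flow maps of the two sub-problems over a step of size $h$, the schemes compose them differently:

$$\Phi_h^{\text{Lie-Trotter}} = \Phi_h^{B} \circ \Phi_h^{A}, \qquad \Phi_h^{\text{Strang}} = \Phi_{h/2}^{A} \circ \Phi_h^{B} \circ \Phi_{h/2}^{A}.$$

Lie-Trotter splitting is first-order accurate in $h$, while Strang’s symmetric composition is second-order accurate at the cost of one extra half-step per iteration, which is what makes larger steps, and thus fewer sampling iterations, viable.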
Proposing efficient diffusion models that require less computation is an important challenge in itself, but the contribution of this paper ultimately goes beyond that scope. It is part of the broader literature on neural ODEs and their associated integration schemes. Here, the authors focus on improving a specific class of generative models, but this type of approach could apply to any architecture whose inference can be interpreted as solving a differential equation.
Check out the Paper. All credit for this research goes to the researchers on this project.
Simon Benaïchouche received his M.Sc. in Mathematics in 2018. He is currently a Ph.D. candidate at the IMT Atlantique (France), where his research focuses on using deep learning techniques for data assimilation problems. His expertise includes inverse problems in geosciences, uncertainty quantification, and learning physical systems from data.