Google AI Introduces OptFormer: One of the First Transformer-Based Frameworks For Hyperparameter Tuning

OpenML and other public machine learning data platforms, along with hyperparameter optimization (HPO) services like Google Vizier, Amazon SageMaker, and Microsoft Azure, have made large datasets of hyperparameter evaluations widely available. Hyperparameter optimization is crucial in machine learning because the chosen hyperparameters can make or break a model’s performance on a given task.

There is growing interest in using this kind of data to meta-learn hyperparameter optimization (HPO) algorithms. Still, working with large datasets of experimental trials collected in the wild is difficult because of the wide variety of HPO problems and the text metadata that describes them. Consequently, most meta- and transfer-learning HPO approaches assume a constrained setting in which all tasks share the same hyperparameters, so that the input data can be represented as fixed-length vectors. As a result, such methods can learn priors from only a limited slice of the available data, which is a serious drawback when huge datasets full of valuable information already exist.

Google AI has developed the OptFormer, one of the first Transformer-based frameworks for hyperparameter tuning, which can learn from massive amounts of optimization data by employing versatile textual representations. 

Earlier works have shown the Transformer’s versatility. However, little research has focused on its potential for optimization, particularly in the text domain. The paper “Towards Learning Universal Hyperparameter Optimizers with Transformers” presents the first meta-learning HPO framework to learn policy and function priors from data across multiple search spaces simultaneously.

Unlike traditional approaches, which typically use only numerical data, the proposed method borrows concepts from natural language processing and represents all of the study data as a sequence of tokens, including textual information from the original metadata.
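To make the idea concrete, here is a minimal Python sketch of how a study’s metadata and trials might be flattened into a single text sequence. The field names and formatting are illustrative assumptions, not the paper’s actual tokenizer.

```python
# Illustrative sketch: serialize study metadata and past trials into text
# so a Transformer can consume them. Field names/format are assumptions.

def serialize_study(metadata: dict, trials: list[dict]) -> str:
    """Flatten study metadata and past trials into one text sequence."""
    # Textual metadata: task description, algorithm name, parameter names.
    header = (
        f"title: {metadata['title']}; "
        f"algorithm: {metadata['algorithm']}; "
        f"params: {', '.join(metadata['param_names'])}"
    )
    # Each trial becomes "hyperparameter values -> objective value".
    lines = []
    for t in trials:
        values = " ".join(f"{name}={t['params'][name]}" for name in metadata["param_names"])
        lines.append(f"{values} -> {t['objective']:.4f}")
    return header + " | " + " | ".join(lines)


example = serialize_study(
    metadata={
        "title": "CIFAR-10 ResNet tuning",
        "algorithm": "Regularized Evolution",
        "param_names": ["learning_rate", "weight_decay"],
    },
    trials=[
        {"params": {"learning_rate": 0.1, "weight_decay": 1e-4}, "objective": 0.91},
        {"params": {"learning_rate": 0.03, "weight_decay": 3e-4}, "objective": 0.93},
    ],
)
print(example)
```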

The OptFormer is trained on the T5X codebase in a conventional encoder-decoder fashion with standard generative pretraining over several hyperparameter optimization datasets, including Google Vizier’s real-world data as well as the public HPO-B and black-box optimization (BBOB) benchmarks. The OptFormer can imitate the behaviors of seven distinct black-box optimization algorithms (non-adaptive, evolutionary, and Bayesian).
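As a rough illustration of the imitation-learning setup, and not the actual T5X pipeline, the sketch below builds sequence-to-sequence training pairs in which the input is the metadata plus the trial history and the target is the next trial produced by the teacher algorithm. It reuses the illustrative `serialize_study` helper from the sketch above.

```python
# Illustrative sketch: turn one teacher algorithm's trajectory into
# (input, target) text pairs for encoder-decoder training.

def make_training_pairs(metadata: dict, trials: list[dict]) -> list[tuple[str, str]]:
    """One (input, target) pair per step of the teacher's trajectory."""
    pairs = []
    for i, next_trial in enumerate(trials):
        # Encoder input: metadata plus everything observed so far.
        history = serialize_study(metadata, trials[:i])
        # Decoder target: the next suggested trial and its measured objective.
        target = " ".join(
            f"{name}={next_trial['params'][name]}" for name in metadata["param_names"]
        ) + f" -> {next_trial['objective']:.4f}"
        pairs.append((history, target))
    return pairs
```

Because the algorithm name sits in the textual metadata, trajectories from many different optimizers can be mixed into one training set while remaining distinguishable to the model.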

Source: https://arxiv.org/pdf/2205.13320.pdf

According to the researchers, OptFormer can mimic many algorithms at once because it learns from the optimization trajectories of numerous algorithms. Given an algorithm’s name as a textual prompt in the metadata (such as “Regularized Evolution”), OptFormer imitates that algorithm’s behavior.

Finally, augmenting the OptFormer policy with model-based optimization, such as the Expected Improvement acquisition function, makes it a formidable competitor among HPO methods. According to the team, this is the first time acquisition functions for online adaptation have been added to Transformers.
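For reference, the standard Expected Improvement acquisition function can be computed from a predicted mean and standard deviation, as in the sketch below. The OptFormer-specific decoding details are omitted, and the candidate values are made up for illustration.

```python
# Sketch of the standard Expected Improvement (EI) acquisition function,
# applied to predicted means and standard deviations of candidate suggestions.
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_y, xi=0.0):
    """EI for maximization: E[max(f(x) - best_y - xi, 0)] under a Gaussian predictive."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    improvement = mu - best_y - xi
    z = np.where(sigma > 0, improvement / sigma, 0.0)
    ei = improvement * norm.cdf(z) + sigma * norm.pdf(z)
    return np.where(sigma > 0, ei, np.maximum(improvement, 0.0))

# Rank candidate hyperparameter suggestions by EI and pick the best one.
mu = [0.92, 0.90, 0.94]       # predicted objective values (e.g., accuracy)
sigma = [0.01, 0.05, 0.03]    # predicted uncertainties
best_so_far = 0.93
scores = expected_improvement(mu, sigma, best_so_far)
print(scores, "-> pick candidate", int(np.argmax(scores)))
```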

The OptFormer can also predict the optimized objective value (such as accuracy) and estimate the uncertainty of that prediction. The researchers compared OptFormer’s predictions with those of a standard Gaussian Process and found the OptFormer’s to be considerably more accurate.

This article is written as a research summary by Marktechpost staff based on the research paper 'Towards Learning Universal Hyperparameter Optimizers with Transformers'. All credit for this research goes to the researchers on this project. Check out the paper and reference article.



Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Bhubaneswar. She is a Data Science enthusiast and has a keen interest in the scope of application of artificial intelligence in various fields. She is passionate about exploring new advancements in technologies and their real-life applications.

