MIT researchers have developed a new technique that enables a machine learning model to quantify how confident it is in its predictions.

Robust machine learning models are helping humans solve complex problems, such as detecting cancer in medical images or identifying obstacles on the road for autonomous vehicles. However, since machine learning models are imperfect, people must know when to trust a model’s predictions in high-stakes situations.

It is well understood that neural networks tend to be overconfident when uncertainty measures are derived directly from the output label distribution. Existing techniques primarily address this issue by retraining the entire model to impose uncertainty quantification capabilities, so that the learned model attains the necessary accuracy and uncertainty-prediction performance simultaneously. However, training the model from scratch is computationally expensive and may not always be possible.

One method for enhancing a model’s dependability is uncertainty quantification. In general, uncertainty quantification techniques can be categorized as intrinsic or extrinsic, depending on how the uncertainties are derived from the machine learning model. An uncertainty quantification model generates a score along with each prediction that indicates how confident the model is that the prediction is accurate. Quantifying uncertainty is helpful, but current methods often involve retraining the entire model, which means feeding it countless examples so it can relearn the task. Retraining also requires a massive amount of new data, which can be expensive and difficult to obtain.
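As a simple illustration of the general idea (not the MIT method), a classifier can return an "intrinsic" uncertainty score computed directly from its output distribution, for example the entropy of its softmax probabilities. The function names and inputs below are purely illustrative:

```python
# Minimal sketch: an "intrinsic" uncertainty estimate taken directly from a
# classifier's output distribution. Not the paper's method; names are illustrative.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predict_with_uncertainty(logits):
    probs = softmax(logits)
    prediction = int(np.argmax(probs))
    # Entropy of the predictive distribution: higher means less confident.
    entropy = float(-(probs * np.log(probs + 1e-12)).sum())
    return prediction, entropy

print(predict_with_uncertainty(np.array([2.0, 0.5, 0.1])))   # peaked -> low entropy
print(predict_with_uncertainty(np.array([0.9, 0.8, 0.85])))  # flat -> high entropy
```

As the article notes, scores taken straight from the output distribution like this tend to be overconfident, which is the gap the MIT approach aims to close.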


Researchers at MIT and the MIT-IBM Watson AI Lab created a strategy that allows a model to perform uncertainty quantification more effectively while using far less computing power than earlier approaches and requiring no additional data. Their method is adaptable to various applications because it does not require the user to retrain or modify the model. The process entails building a simpler companion model that helps the machine-learning model estimate uncertainty. With this more compact model, researchers can also pinpoint the different sources of uncertainty that contribute to false predictions.

The research team developed a smaller, simpler model, known as a metamodel, to address the quantification problem. It is attached to the larger, pre-trained model and leverages the features that the larger model has already learned to make its uncertainty quantification judgments. When designing the metamodel, the researchers used a technique that incorporates both data uncertainty and model uncertainty into the output. Data uncertainty is caused mainly by corrupted or improperly labeled data and can only be reduced by fixing or replacing the data. Model uncertainty arises when the model does not know how to interpret newly observed data and may make inaccurate predictions, most often because it has seen too few training examples similar to the new data. This challenge is common once models are deployed, since real-world inputs often differ from the training data, and it is particularly difficult to handle.
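To make the companion-model idea concrete, here is a minimal sketch in PyTorch. The layer sizes, class names, and the sigmoid confidence output are assumptions for illustration only, not the authors’ implementation; the key point is that the large model stays frozen and the small metamodel reads its intermediate features:

```python
import torch
import torch.nn as nn

class BaseClassifier(nn.Module):
    """Stand-in for a large, already-trained model (kept frozen)."""
    def __init__(self, in_dim=128, feat_dim=64, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        feats = self.encoder(x)            # features reused by the metamodel
        return self.head(feats), feats

class MetaModel(nn.Module):
    """Lightweight companion model mapping base features to a confidence score."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, feats):
        # Score in (0, 1): higher = more confident the base prediction is correct.
        return torch.sigmoid(self.net(feats))

base = BaseClassifier().eval()
for p in base.parameters():                # the big model is not retrained
    p.requires_grad = False

meta = MetaModel()                         # only this small head would be trained
x = torch.randn(4, 128)
logits, feats = base(x)
confidence = meta(feats)                   # per-example uncertainty estimate
print(logits.argmax(dim=1), confidence.squeeze(1))
```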

The user still needs confirmation that the uncertainty quantification score the model generates is accurate. Researchers typically verify accuracy by testing the model on a smaller dataset held out from the original training data. However, a model can achieve good prediction accuracy on such held-out data while still being overconfident, so this technique does not work well for evaluating uncertainty quantification.

Instead, the researchers built a novel validation technique by adding noise to the data in the validation set; the noisy data is closer to out-of-distribution data, which induces model uncertainty. They use this noisy dataset to evaluate uncertainty quantification. Their approach not only surpassed every baseline on every downstream task but also did so with less training time.
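The sketch below illustrates the spirit of this noisy-validation check, under stated assumptions: the model is a placeholder random linear classifier and the confidence score is max-softmax, neither of which is the paper’s setup. It simply perturbs held-out inputs with Gaussian noise and compares the reported confidence on clean versus noisy copies:

```python
# Illustrative sketch: perturb validation inputs with Gaussian noise (a rough
# proxy for out-of-distribution data) and compare reported confidence.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def confidence_score(logits):
    """Max softmax probability as a stand-in confidence score."""
    return softmax(logits).max(axis=1)

# Placeholder "model": a fixed random linear classifier.
W = rng.normal(size=(20, 5))
def model(x):
    return x @ W

x_val = rng.normal(size=(256, 20))                            # clean validation inputs
x_noisy = x_val + rng.normal(scale=1.5, size=x_val.shape)     # noise-perturbed copies

clean_conf = confidence_score(model(x_val)).mean()
noisy_conf = confidence_score(model(x_noisy)).mean()

# A well-behaved uncertainty estimate should report lower confidence on the
# noisier, more out-of-distribution inputs.
print(f"mean confidence  clean: {clean_conf:.3f}  noisy: {noisy_conf:.3f}")
```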

In addition to being adaptable to other model architectures, such as transformers and language models, the researchers believe the metamodel approach is versatile enough to handle other applications of uncertainty quantification, such as quantifying transferability in transfer learning and domain adaptation. Investigating these potential uses and developing a theoretical understanding of the metamodel would be interesting directions for future study.

Check out the Paper and Reference Article. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 14k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.


Niharika is a Technical Consulting Intern at Marktechpost. She is a third-year undergraduate, currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Kharagpur. She is a highly enthusiastic individual with a keen interest in machine learning, data science, and AI, and an avid reader of the latest developments in these fields.


