MIT improves the assessment of uncertainty in machine learning

A new approach to improve uncertainty assessment in machine learning models: a scalable method for applications in healthcare and other critical areas

MIT researchers have developed an efficient way to improve uncertainty estimates in machine learning models, enabling more accurate and faster results in applications such as healthcare. The method helps users make informed decisions based on how reliable a model is.

Photo by: Domagoj Skledar / own archive

Today's research in the field of machine learning often focuses on estimating uncertainty so that users can better understand how reliable model decisions are. This assessment is especially important in situations where the stakes are high, such as recognizing diseases in medical images or filtering job applications.

However, uncertainty estimates are only useful if they are accurate. If a model says it is 49 percent confident that a medical image shows a pleural effusion, then the model should be right 49 percent of the time.
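The idea of an accurate confidence can be checked directly: gather the predictions a model made at a given confidence level and compare that confidence with how often the predictions were right. The snippet below is a minimal sketch of such a check, not the researchers' evaluation code. The function name and the data are hypothetical and chosen only to mirror the 49 percent example.

```python
def bin_confidence_and_accuracy(confidences, correct, lo, hi):
    """Compare stated confidence with observed accuracy for one confidence bin.

    confidences: stated confidence of each prediction (0.0 to 1.0)
    correct:     whether each prediction turned out to be right
    lo, hi:      the bin covers confidences with lo <= c < hi
    Returns (mean confidence, accuracy), or None if the bin is empty.
    """
    picked = [(c, ok) for c, ok in zip(confidences, correct) if lo <= c < hi]
    if not picked:
        return None
    mean_conf = sum(c for c, _ in picked) / len(picked)
    accuracy = sum(1 for _, ok in picked if ok) / len(picked)
    return mean_conf, accuracy


# Hypothetical data: 100 predictions made at 49 percent confidence, 49 of them right.
confidences = [0.49] * 100
correct = [True] * 49 + [False] * 51

mean_conf, accuracy = bin_confidence_and_accuracy(confidences, correct, 0.4, 0.5)
gap = abs(mean_conf - accuracy)  # close to zero for a well-calibrated model
print(f"confidence={mean_conf:.2f} accuracy={accuracy:.2f} gap={gap:.2f}")
```

A large gap in any bin would mean the stated confidence cannot be taken at face value.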

Researchers at MIT have developed a new approach to improving uncertainty estimates in machine learning models. Their method produces more accurate uncertainty estimates than other techniques, and does so more efficiently.

Additionally, this technique is scalable and can be applied to large deep learning models that are increasingly used in healthcare and other situations where safety is of crucial importance.

This technique can provide end users, many of whom do not have expertise in machine learning, with better information to assess the reliability of the model and decide on its application in specific tasks.

Quantifying Uncertainty
Methods for quantifying uncertainty often require complex statistical calculations that are difficult to scale to machine learning models with millions of parameters. Also, these methods often require assumptions about the model and the data used for its training.

MIT researchers approached this problem differently. They used the principle of Minimum Description Length (MDL), which does not require assumptions that can limit the accuracy of other methods. MDL is used to better quantify and calibrate uncertainty for test points that the model needs to label.

The technique developed by the researchers, known as IF-COMP, makes MDL fast enough for use with large deep learning models applied in many real-world environments.

MDL involves considering all possible labels that the model can give for a particular test point. If there are many alternative labels for that point that fit well, the model's confidence in the selected label should be proportionally reduced.

"One way to understand how confident a model is, is to give it some counterfactual information and see how willing it is to change its belief," says Nathan Ng, lead author of the study and a PhD student at the University of Toronto who is also a visiting student at MIT.

For example, consider a model that claims a medical image shows a pleural effusion. If researchers tell the model that the image shows edema, and the model is willing to change its belief, then the model should be less confident in its original decision.

With MDL, if a model is confident when labeling a data point, it should use a very short code to describe that point. If it is not confident because the point can have many other labels, it uses a longer code to cover those possibilities.
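The link between confidence and code length can be made concrete with the standard information-theoretic rule that an outcome with probability p costs about -log2(p) bits to encode. The sketch below only illustrates that rule. The function name is hypothetical, and the article does not say how IF-COMP computes code lengths internally.

```python
import math


def code_length_bits(probability):
    """Ideal code length, in bits, for an outcome with the given probability."""
    return -math.log2(probability)


# A label the model is almost sure about needs a very short code ...
confident_bits = code_length_bits(0.99)   # about 0.01 bits
# ... while a label it is unsure about needs a longer one.
uncertain_bits = code_length_bits(0.49)   # about 1.03 bits

print(f"confident: {confident_bits:.2f} bits, uncertain: {uncertain_bits:.2f} bits")
```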

The amount of code used to label a data point is known as the stochastic complexity of the data. If researchers ask the model how willing it is to change its belief about a data point given contrary evidence, the stochastic complexity of the data should decrease if the model is confident.
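One way to picture this is a toy calculation in the style of normalized maximum likelihood, a standard construction in the MDL literature. This is an assumption made for illustration, not a description of IF-COMP itself. For each candidate label, suppose we know the probability the model would assign to that label after being nudged toward it. Summing these scores and taking the logarithm gives a complexity in bits, and dividing each score by the sum gives an adjusted confidence. All names and numbers below are hypothetical.

```python
import math


def adjusted_confidence(fit_scores):
    """Toy normalized-maximum-likelihood-style calculation (illustrative only).

    fit_scores maps each candidate label to the probability the model would
    give that label after being nudged toward it (hypothetical inputs).
    Returns (normalized confidences, complexity in bits).
    """
    total = sum(fit_scores.values())
    complexity_bits = math.log2(total)
    normalized = {label: score / total for label, score in fit_scores.items()}
    return normalized, complexity_bits


# A model that barely moves when told the image shows edema.
firm = {"pleural effusion": 0.95, "edema": 0.05}
# A model that readily accepts edema when told so.
persuadable = {"pleural effusion": 0.95, "edema": 0.90}

firm_conf, firm_bits = adjusted_confidence(firm)
pers_conf, pers_bits = adjusted_confidence(persuadable)
print(firm_conf["pleural effusion"], firm_bits)
print(pers_conf["pleural effusion"], pers_bits)
```

In this toy setup the persuadable model ends up with a higher complexity and a lower adjusted confidence in "pleural effusion", which matches the behavior described above.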

But testing each data point using MDL would require an enormous amount of computational power.

Accelerating the Process
With IF-COMP, the researchers developed an approximation technique that can accurately estimate the stochastic complexity of data using a special function, known as the influence function. They also used a statistical technique called temperature scaling, which improves the calibration of the model's outputs. This combination of influence functions and temperature scaling enables high-quality approximations of the stochastic complexity of data.
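Temperature scaling on its own is simple to illustrate. In its usual form, the model's raw scores (logits) are divided by a single number, the temperature, before being converted to probabilities. The sketch below shows that general technique in plain Python. The logits and the temperature value are hypothetical, and how IF-COMP combines this step with influence functions is not covered here.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical raw scores for three labels

original = softmax_with_temperature(logits, temperature=1.0)
softened = softmax_with_temperature(logits, temperature=2.0)

# A temperature above 1 lowers the top probability without changing which
# label is ranked first.
print(max(original), max(softened))
```

In practice the temperature is usually tuned on held-out data so that the resulting probabilities match observed accuracy.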

In the end, IF-COMP can efficiently produce well-calibrated uncertainty estimates that reflect the model's true confidence. The technique can also determine if the model has mislabeled certain data points or detect which data points are outliers.

The researchers tested their system on these three tasks (uncertainty calibration, mislabel detection, and outlier detection) and found that it was faster and more accurate than other methods.

"It is really important to have some assurance that the model is well-calibrated, and there is an increasing need to detect when a certain prediction is not quite correct. Auditing tools are becoming increasingly necessary in machine learning problems as we use large amounts of unverified data to build models that will be applied to problems people face," says Marzyeh Ghassemi, senior author of the study.

IF-COMP is model-agnostic, meaning it can provide accurate uncertainty estimates for many types of machine learning models. This could enable broader application in real-world environments, ultimately helping more practitioners make better decisions.

"People need to understand that these systems are very error-prone and can make conclusions based on insufficient data. The model may appear to be very confident, but there are many different things it is willing to believe given contrary evidence," says Ng.

In the future, the researchers plan to apply their approach to large language models and explore other potential applications of the Minimum Description Length principle.

Source: Massachusetts Institute of Technology

Creation time: 17 July, 2024

Science & tech desk
