
New MIT method enables artificial intelligence to solve problems more intelligently at half the computational cost

Researchers at MIT have developed a new technique that allows large language models to dynamically adjust the effort they spend on solving a task. The method, called instance-adaptive scaling, cuts computational cost by up to 50 percent while maintaining accuracy, making smaller models as capable as the largest ones on the market.

Photo by: Domagoj Skledar - illustration / own archive

In the world of generative artificial intelligence, where the race for bigger, faster, and smarter models is constantly accelerating, researchers from the prestigious MIT (Massachusetts Institute of Technology) have just presented a solution that could fundamentally change the rules of the game. Their new method, presented to the scientific community this week, focuses not on merely increasing model size, but on drastically smarter use of resources these models already possess.


The problem the industry has faced until now was bizarre, yet real: most large language models (LLMs) approach every question with the same "amount" of thinking. Whether a user asks "What is 2 plus 2?" or requests a complex analysis of the geopolitical situation of the 19th century, standard models often allocate a fixed computational budget. The result is massive energy waste on trivial queries, while complex problems do not receive the "cognitive" attention needed to resolve them accurately.


This is exactly where the MIT team steps onto the scene with their revolutionary approach called "instance-adaptive scaling". Their method allows artificial intelligence something humans do instinctively – the ability to assess the difficulty of a problem before and during the solution process itself and to dynamically adjust the effort required to reach the correct answer.


Why is "thinking" expensive?


To understand the significance of this discovery, we must first look at how modern language models work. To answer harder questions, researchers have recently begun applying a technique known as "inference-time scaling". It allows the model to spend more time generating potential solutions, exploring different reasoning paths, or chains of thought, before delivering a final answer.


However, previous approaches were rigid. They set a fixed computational budget for every problem, regardless of its complexity. This meant that a model might waste precious graphics processing unit (GPU) resources on simple questions requiring an immediate answer, or, even worse, would not have enough resources to tackle problems requiring deep logic and multiple verification steps.
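To make the contrast concrete, the rigid approach can be sketched as simple best-of-N sampling, where the same number of candidate solutions is generated for every question regardless of its difficulty. Everything here is illustrative: `generate_reasoning_path` and its scores are hypothetical stand-ins for a real LLM sampler and verifier, which the article does not specify.

```python
import random

def generate_reasoning_path(question: str) -> tuple[str, float]:
    """Hypothetical LLM call: returns (candidate answer, quality score in [0, 1])."""
    return f"candidate answer to {question!r}", random.random()

def fixed_budget_solve(question: str, budget: int = 16) -> str:
    """Rigid best-of-N: always samples `budget` reasoning paths, keeps the best."""
    paths = [generate_reasoning_path(question) for _ in range(budget)]
    best_answer, _best_score = max(paths, key=lambda p: p[1])
    return best_answer

# The same 16 sampled paths are paid for whether the question is trivial or hard.
easy = fixed_budget_solve("What is 2 plus 2?")
hard = fixed_budget_solve("Analyze the geopolitics of the 19th century.")
```

The waste is visible in the loop itself: the budget is a constant, chosen before the question is even seen.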


The new algorithm developed by the MIT researchers lets the model adjust its budget dynamically. In practice, this means the model can "pause", assess the difficulty of the question and the probability that its current line of reasoning will lead to the correct solution, and on that basis decide whether to invest more effort or whether the answer is already good enough.
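That adaptive loop can be sketched minimally as follows, under the assumption that some confidence signal is available after each sampled path. The real system derives this signal from a calibrated process reward model; here `generate_reasoning_path` and its score are invented placeholders.

```python
import random

def generate_reasoning_path(question: str) -> tuple[str, float]:
    """Hypothetical LLM call: returns (candidate answer, confidence in [0, 1])."""
    return f"candidate answer to {question!r}", random.random()

def adaptive_solve(question: str, max_budget: int = 16,
                   stop_confidence: float = 0.9) -> tuple[str, int]:
    """Keep sampling only while confidence stays low; stop early otherwise."""
    best_answer, best_score, used = "", -1.0, 0
    for used in range(1, max_budget + 1):
        answer, score = generate_reasoning_path(question)
        if score > best_score:
            best_answer, best_score = answer, score
        if best_score >= stop_confidence:
            break  # confident enough: no need to spend the rest of the budget
    return best_answer, used
```

In this sketch an easy question typically terminates after a handful of samples, while a hard one consumes the full budget, which is the behavior the article describes.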


Revolutionary results: Less is sometimes more


The testing results of this method are impressive. The research team found that their approach lets large language models use up to 50 percent fewer computing resources than existing methods, while maintaining the same level of accuracy across a wide spectrum of questions of varying difficulty.


Perhaps an even more significant discovery is the fact that this method democratizes the power of artificial intelligence. Namely, the research showed that smaller, less resource-demanding models, when equipped with this adaptive algorithm, can match or even surpass the performance of significantly larger and more expensive models on complex problems. This opens the door to applying advanced AI technology on devices with limited resources, such as smartphones or laptops, without the need for a constant connection to massive data centers.


How does "digital metacognition" work?


The core of this system lies in the model's ability to "know what it doesn't know". Navid Azizan, a professor in the Department of Mechanical Engineering and the Institute for Data, Systems, and Society (IDSS) at MIT and senior author of the study, highlights the importance of this concept.


"The computational cost of inference has rapidly become a major bottleneck for providers of the most advanced models, who are actively trying to find ways to improve computational efficiency per user query," explains Azizan. "For example, the recent release of the GPT-5.1 model emphasizes the efficiency of the 'adaptive inference' approach that our work proposes. By enabling models to recognize their knowledge limits, we can allow them to spend more computing power on the hardest problems and most promising solution paths, and significantly fewer tokens on simple ones. This makes the inference process more reliable and far more efficient."


Technically speaking, the framework uses a component known as a Process Reward Model (PRM). This "supervisory" model evaluates every potential step in solving a problem. Imagine it as a strict teacher watching a student while they solve a math task. The PRM assesses the difficulty of the question and helps the main model (LLM) decide how many resources need to be allocated.
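The teacher analogy can be sketched as a pruned step-by-step search: a scoring function grades each partial solution, and weak branches are cut before more compute is spent on them. Both `expand` and `prm_score` below are toy stand-ins for an LLM step generator and a real Process Reward Model, not the actual MIT components.

```python
def expand(partial: list[str]) -> list[str]:
    """Hypothetical LLM call: propose one more reasoning step."""
    return partial + [f"step {len(partial) + 1}"]

def prm_score(partial: list[str]) -> float:
    """Toy stand-in for a Process Reward Model grading a partial solution."""
    return min(1.0, 0.2 * len(partial))

def prm_guided_search(n_steps: int = 5, branch: int = 3, keep: int = 2) -> list[str]:
    """Beam-style search: keep only the partial solutions the PRM rates highest."""
    beams: list[list[str]] = [[]]
    for _ in range(n_steps):
        candidates = [expand(p) for p in beams for _ in range(branch)]
        candidates.sort(key=prm_score, reverse=True)
        beams = candidates[:keep]  # prune weak branches before expanding further
    return max(beams, key=prm_score)
```

The key design point is that the PRM acts at every intermediate step, not only on finished answers, so resources are withdrawn from dead-end reasoning paths early.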


Solving the problem of overconfidence


One of the key challenges the researchers faced was the tendency of existing process reward models (PRMs) to be overly optimistic. They would often overestimate the probability that a given solution step is correct, leading the system to conclude the "thinking" process prematurely and deliver a wrong answer.


"If we had simply trusted current PRMs, which often overestimate the chance of success, our system would have reduced the computational budget too aggressively," explains Young-Jin Park, a PhD student at MIT and lead author of the study. "That is why we first had to find a way to better calibrate these models to make inference-time scaling more efficient and reliable."


The solution was found in a new calibration method. Instead of the PRM giving a simple binary assessment (good/bad) or a single numerical value, researchers taught it to generate a range of probabilities. In this way, the system gets a more realistic picture of uncertainty. If the model is "sure" it is on the right track, it reduces the number of alternative scenarios it explores, saving resources. If it is unsure, it expands the search.
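The calibration idea can be sketched like this: a calibrated PRM reports a probability range rather than a single point estimate, and the stopping rule trusts only the pessimistic end of that range. The function names and numbers below are invented purely for illustration.

```python
def calibrated_prm(partial: list[str]) -> tuple[float, float]:
    """Hypothetical calibrated PRM: a (low, high) probability-of-success range."""
    point_estimate, uncertainty = 0.8, 0.3  # illustrative values only
    return (max(0.0, point_estimate - uncertainty),
            min(1.0, point_estimate + uncertainty))

def should_keep_searching(partial: list[str], threshold: float = 0.9) -> bool:
    """Stop exploring only when even the pessimistic bound clears the threshold."""
    low, _high = calibrated_prm(partial)
    # An overconfident point estimate would stop too early; using the lower
    # bound keeps the system exploring while genuine uncertainty remains.
    return low < threshold
```

This mirrors the behavior described above: high certainty narrows the search and saves resources, while wide uncertainty keeps alternative scenarios alive.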


Hao Wang, a researcher at the MIT-IBM Watson AI Lab and a team member, draws an interesting parallel with human thinking: "This is actually the way humans solve problems. We come up with some partial solutions, and then decide: should I continue with one of them, or stop and revise, or even go back to a previous step and continue solving the problem from there?"


The future of AI agents and autonomous systems


This research, which is being presented this week, in early December 2025, at the prestigious Neural Information Processing Systems (NeurIPS) conference, has implications reaching far beyond the academic community. Reducing the energy consumption of generative AI systems is crucial for industry sustainability, especially in light of growing concerns about the carbon footprint of large data centers.


Besides the ecological aspect, this technique opens doors for using LLMs in high-risk and time-sensitive situations. Kristjan Greenewald, a researcher at the MIT-IBM Watson AI Lab, highlights the dynamic nature of their solution: "The beauty of our approach is that this adjustment happens on the fly, while the problem is being solved, rather than happening all at once at the beginning of the process."


Looking into the future, researchers plan to apply this technique to other areas, such as automatic code generation and the development of autonomous AI agents. Calibration of reward models (PRM) could also find application in reinforcement learning and model fine-tuning.


Akash Srivastava, Director and Chief Architect for Core AI at IBM Software, who was not directly involved in the work but follows its development, emphasizes the transformative potential of this technology for the workforce of the future:



"Human employees learn on the job — some CEOs even started as interns — but today's AI agents remain mostly static pieces of probabilistic software. Work like this paper is an important step toward changing that: helping agents realize what they don't know and building mechanisms for continuous self-improvement. These capabilities are key if we want agents that can work safely, adapt to new situations, and deliver consistent results at scale."



Collaboration of giants for a smarter future


It is important to note that this research is the result of collaboration between some of the strongest names in the tech world and academia. The project was funded, among others, by the MIT-IBM Watson AI Lab, MIT-Amazon Science Hub, MIT-Google Program for Computing Innovation, and the company MathWorks.


At a moment when the world faces the question of limits to artificial intelligence growth, the MIT team proves that the solution is not always in a "bigger hammer", but in a more precise strike. By introducing an element of metacognition – thinking about one's own thinking – artificial intelligence becomes not only more efficient but also more similar to biological systems it attempts to mimic.


For end users, this could soon mean faster answers to simple questions, deeper and more accurate analyses for complex queries, and AI assistants on our mobile phones that don't drain the battery in a few minutes. In a world where computing power is the new currency, the ability to save that currency might be the most valuable innovation of this year.



Science & tech desk

