
The MIT Revolution: Scientists finally discover how AI 'thinks', the key to discovering new drugs and vaccines

Scientists at MIT have developed a technique that reveals for the first time how protein language models make decisions. This look inside the 'black box' of artificial intelligence could speed up the development of medicines and vaccines and open the door to new biological insights.


The artificial intelligence revolution in biology and medicine has entered a crucial new chapter. Over the past few years, we have witnessed a boom in powerful tools known as protein language models, which have fundamentally changed the way scientists approach drug research, vaccine development, and the understanding of the very foundations of life. These systems, based on the same architecture as the large language models (LLMs) that power popular chatbots, have shown a striking ability to predict the structure and function of proteins accurately. Despite their success, one fundamental problem remained unsolved and posed a significant obstacle: their opacity. Scientists were getting highly accurate answers but had no insight into how the model reached those conclusions. They were working with a kind of "black box," which limited trust and the possibility of further refinement.


A recent study, originating from a laboratory at the prestigious Massachusetts Institute of Technology (MIT), marks a turning point in solving this problem. The research team has successfully applied an innovative technique that, for the first time, allows scientists to peek inside this "black box" and precisely determine which protein features the artificial intelligence considers when making its predictions. This breakthrough not only increases the transparency and explainability of AI models but also opens the door for accelerated development of new therapies and a better understanding of complex biological processes.


Decoding the "black box": How AI makes decisions


Understanding the decision-making process within these models is crucial for their further application. The MIT team, led by Onkar Gujral as the lead author and mentored by Bonnie Berger, a distinguished professor of mathematics and head of the Computation and Biology group, has developed a method that demystifies the inner workings of protein language models. Their work, published in the prestigious scientific journal Proceedings of the National Academy of Sciences, has the potential to transform how these powerful tools are used in biomedical research.


Protein language models, whose foundations were laid back in 2018 by Professor Berger and her then-student Tristan Bepler, function by analyzing vast databases of amino acid sequences, similar to how language models analyze text. By learning the patterns and relationships between amino acids, they can predict the three-dimensional structure of a protein and its biological function. It was precisely such models that were key to the rapid development of revolutionary tools like AlphaFold, ESM2, and OmegaFold. However, the problem was that the information within the model was encoded in a very dense and incomprehensible way. Scientists could see the final result, but not the path that led to it. It was like having a genius student who always solves the most complex math problem correctly but can never show you their work.


An innovative technique that brings light into the darkness


To solve this problem, the MIT researchers turned to an algorithm known as a "sparse autoencoder." This is the first time such an approach has been successfully applied to protein language models. The principle of operation is elegant and powerful. In standard models, information about a specific protein is encoded through the activation of a relatively small number of "nodes" within the neural network, for example, 480. In such a dense representation, each individual node must encode multiple different protein features simultaneously, making interpretation practically impossible.


The sparse autoencoder works in the opposite way: it drastically expands the representation space. Instead of 480 nodes, the model now uses, for example, 20,000 nodes. At the same time, the algorithm introduces a "sparsity constraint" which ensures that only a small number of these nodes are activated to describe the protein. This allows the information, which was previously compressed, to be "spread out." The consequence is that a single specific feature of a protein, which was previously encoded across multiple different nodes, can now occupy its own, unique node. "In a sparse representation, the neurons that fire do so in a more meaningful way," explains Gujral. Before this method, the networks packed information so tightly that it was impossible to decipher the role of individual neurons.
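
The expand-then-sparsify idea described above can be illustrated with a minimal Python sketch. It is not the MIT team's implementation: the article does not say how the sparsity constraint is enforced, so the sketch assumes a simple "keep the k largest activations" rule (a penalty on the activations is another common choice). The weights are random stand-ins for trained ones, the helper names are hypothetical, and the dimensions are kept far smaller than the article's 480 and 20,000 so that plain Python runs quickly.

```python
import random

def make_matrix(rows, cols, rng, scale=0.1):
    """Random weight matrix stored as a list of rows (stand-in for trained weights)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def encode(dense, w_enc, k):
    """Expand a dense embedding into a wide code, then keep only the k largest activations."""
    activations = [max(0.0, a) for a in matvec(w_enc, dense)]  # ReLU
    ranked = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    keep = set(ranked[:k])  # assumed sparsity rule: top-k
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]

def decode(sparse, w_dec):
    """Map the wide sparse code back to the dense embedding space."""
    return matvec(w_dec, sparse)

def mse(a, b):
    """Mean squared reconstruction error, the quantity training would minimize."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
DENSE_DIM, WIDE_DIM, K = 16, 256, 8  # the article's example sizes are 480 and 20,000

w_enc = make_matrix(WIDE_DIM, DENSE_DIM, rng)   # encoder: dense -> wide
w_dec = make_matrix(DENSE_DIM, WIDE_DIM, rng)   # decoder: wide -> dense

embedding = [rng.gauss(0.0, 1.0) for _ in range(DENSE_DIM)]  # stand-in for one protein
sparse_code = encode(embedding, w_enc, K)
reconstruction = decode(sparse_code, w_dec)

active = [i for i, a in enumerate(sparse_code) if a > 0.0]
print("wide code length:", len(sparse_code))
print("active nodes:", active)
print("reconstruction error (untrained):", mse(embedding, reconstruction))
```

In a real setting, the encoder and decoder weights would be trained so that the reconstruction error stays low even though only a few of the wide nodes are allowed to fire. That pressure is what pushes individual features onto their own nodes.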


The role of artificial intelligence in interpreting itself


After obtaining these "purified," sparse representations for thousands of different proteins, the scientists faced a new challenge: how to work out what each of the activated nodes means. For this, they enlisted another artificial intelligence system, the AI assistant Claude. Claude's task was to compare the sparse representations with the already known characteristics of each protein, such as its molecular function, the family it belongs to, or its location within the cell.


By analyzing a vast number of examples, Claude was able to link the activation of specific nodes with concrete biological properties and then describe them in simple, human-understandable language. For example, the algorithm could generate a description like: "This neuron appears to detect proteins involved in the transmembrane transport of ions or amino acids, particularly those located in the plasma membrane." Through this process, the nodes became "interpretable," and for the first time, scientists gained a clear insight into what the model "thinks." It turned out that the features most commonly encoded by the models are the protein family and specific functions, including various metabolic and biosynthetic processes.
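
As a rough illustration of the comparison step described above, the Python sketch below counts which known annotations are most common among the proteins that activate a given node and turns the result into a short text summary that could be handed to a language model. The data layout, the protein identifiers, the labels, and the function names are all hypothetical. The article does not describe how the sparse representations and annotations were actually presented to Claude.

```python
from collections import Counter

def summarize_node(node, sparse_codes, annotations, top_n=3):
    """Count annotation labels among the proteins on which `node` is active."""
    counts = Counter()
    n_active = 0
    for protein_id, code in sparse_codes.items():
        if code.get(node, 0.0) > 0.0:
            n_active += 1
            counts.update(annotations.get(protein_id, ()))
    return n_active, counts.most_common(top_n)

def describe(node, n_active, top_labels):
    """Format the counts as plain text, e.g. as input for a language model."""
    parts = [f"{label} ({count}/{n_active})" for label, count in top_labels]
    return f"Node {node} is active on {n_active} proteins; most common labels: " + ", ".join(parts)

# Toy data: each protein maps to its non-zero nodes and activation values.
sparse_codes = {
    "protein_A": {7: 0.9, 42: 0.3},
    "protein_B": {7: 0.6},
    "protein_C": {42: 0.8},
    "protein_D": {7: 0.4, 13: 0.2},
}

# Toy annotations standing in for known function and cellular location.
annotations = {
    "protein_A": {"ion transport", "plasma membrane"},
    "protein_B": {"ion transport", "plasma membrane"},
    "protein_C": {"kinase activity", "cytoplasm"},
    "protein_D": {"amino acid transport", "plasma membrane"},
}

n_active, top_labels = summarize_node(7, sparse_codes, annotations)
print(describe(7, n_active, top_labels))
```

With real data, a summary like this would be only the starting point. Turning it into a readable description such as the one quoted above is the part the article attributes to Claude.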


Practical implications: From faster drug discovery to new biological insights


This advance has far-reaching consequences. Knowing which features a particular protein model encodes allows researchers to choose the most appropriate model for a specific task. Whether the goal is identifying new drug targets or designing more effective vaccines, researchers can now pick the tool best "tuned" to the problem at hand. This could speed up research and development and reduce its cost.


For example, in a 2021 study, Professor Berger's team used a protein language model to predict which parts of viral surface proteins were least likely to mutate. By doing so, they identified promising targets for the development of universal vaccines against influenza, HIV, and SARS-CoV-2. With the new interpretation method, it is now possible not only to get such a prediction but also to understand which biochemical and structural properties the model based it on. This provides an additional level of confirmation and helps direct further laboratory research.


Furthermore, analyzing the features that the model independently recognizes as important could one day lead to completely new biological discoveries. It is possible that artificial intelligence, by analyzing patterns in data that the human eye cannot perceive, will identify previously unknown protein functions or discover new connections between different biological pathways. "One day, when the models become even more powerful, we might learn more about biology than we currently know, precisely by opening up the models themselves," Gujral concludes optimistically. This technology promises not only to help us find answers to known questions but also to pose entirely new ones that will shape the future of science.


Creation time: 21 August, 2025

Science & tech desk

