Hidden flaw in large language models revealed: MIT researchers explain why AI ignores key data

MIT researchers have discovered why large language models like GPT-4 show positional bias, neglecting key information in the middle of documents. This phenomenon, known as “lost in the middle”, is a direct consequence of the model architecture and can compromise the reliability of AI systems in medicine and law.


Large language models (LLMs) such as GPT-4, Claude, and Llama are becoming an indispensable tool in a growing number of professions, from law and medicine to programming and scientific research. Their ability to process and generate human-like text has opened the door to new levels of productivity. However, beneath the surface of this technological revolution lies a subtle but significant flaw that can lead to unreliable and inaccurate results: positional bias. Recent research has revealed that these complex systems tend to give disproportionate importance to information located at the very beginning or end of a document, while simultaneously ignoring key data placed in the middle.


This problem means that, for example, a lawyer using an AI-powered virtual assistant to find a specific clause in a thirty-page contract has a significantly higher chance of success if that clause is on the first or last page. Information in the central part of the document, regardless of its relevance, often remains "invisible" to the model.


Uncovering "Lost in the Middle": A Problem Affecting Even the Most Advanced Systems


The phenomenon known as "lost in the middle" manifests as a distinctive "U-shaped" accuracy pattern. When a model's ability to find a correct answer within a long text is tested, performance is best if the information is at the beginning. As the target information moves toward the middle, accuracy drops sharply, reaching its lowest point at the very center of the document, only to improve slightly toward the end. This flaw is not just a technical curiosity but a serious risk in applications where every piece of information is critically important.
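

The U-shaped curve is straightforward to observe empirically. Below is a minimal sketch of such a probe in Python; `ask_model` is a hypothetical stand-in for whatever LLM API is being tested, and the filler text, bucket count, and substring check are illustrative choices, not the methodology of any particular study.

```python
# Minimal "needle in a haystack" probe: plant one fact at varying
# relative positions in a long document and check whether the model
# still retrieves it. `ask_model` is a hypothetical callable that
# takes a prompt string and returns the model's reply as a string.

def build_document(filler_sentences: list[str], fact: str, position: float) -> str:
    """Insert `fact` at a relative `position` (0.0 = start, 1.0 = end)."""
    idx = round(position * len(filler_sentences))
    return " ".join(filler_sentences[:idx] + [fact] + filler_sentences[idx:])

def positional_accuracy(ask_model, filler, fact, question, answer, buckets=11):
    """Return retrieval accuracy for each relative position of the fact."""
    results = []
    for i in range(buckets):
        doc = build_document(filler, fact, i / (buckets - 1))
        reply = ask_model(f"{doc}\n\nQuestion: {question}")
        results.append(float(answer.lower() in reply.lower()))
    return results  # a biased model yields a U-shape: high ends, low middle
```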


Imagine a medical AI system analyzing a patient's extensive medical history. If a key symptom or lab test result is mentioned in the middle of the documentation, the model might overlook it, potentially leading to a misdiagnosis. Similarly, a programmer relying on an AI assistant to analyze complex code might get an incomplete picture if the model ignores critical functions located in the central part of the software package. Understanding and addressing this problem is crucial for building trust in AI systems and ensuring their safe application.


Researchers from MIT Have Traced the Root of the Problem


A team of scientists from the Massachusetts Institute of Technology (MIT) in Cambridge has identified the fundamental mechanism that causes this phenomenon. In a new study, to be presented at the International Conference on Machine Learning, the researchers developed a theoretical framework that allowed them to peek inside the "black box" of large language models.


Led by Xinyi Wu, a student at MIT’s Institute for Data, Systems, and Society (IDSS), and working with postdoctoral fellow Yifei Wang and professors Stefanie Jegelka and Ali Jadbabaie, the team determined that positional bias is not an accidental bug but a direct consequence of certain design choices in the model architecture itself. "These models are black boxes, so as a user, you probably don't know that positional bias can make your model inconsistent," Wu points out. "By better understanding the underlying mechanism of these models, we can improve them by addressing these limitations."


The Anatomy of a Transformer: How Architecture Creates Bias


At the heart of modern language models is a neural network architecture known as a transformer. Transformers process text by first breaking it down into smaller pieces, so-called "tokens," and then learning the relationships between these tokens to understand context and predict the next words. The key innovation that enables this is the attention mechanism, which allows each token to selectively "pay attention" to other relevant tokens in the text.
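

As a rough illustration of what "paying attention" means numerically, here is the core scaled dot-product attention computation in a few lines of NumPy. Real transformers add learned projections, multiple heads, and many stacked layers, so this is a toy sketch of the mechanism the article describes, not production code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each row (token) of Q mixes the rows of V, weighted by how
    strongly its query matches every token's key."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V                              # attention-weighted values

# Tiny usage example: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
output = scaled_dot_product_attention(tokens, tokens, tokens)
```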


However, allowing every token in a 30-page document to attend to every other token would be computationally expensive and infeasible. That is why engineers use "attention masking" techniques that limit which tokens a given token can look at. The MIT research showed that one of these techniques, known as a causal mask, is one of the main culprits behind the bias. A causal mask allows tokens to attend only to tokens that appeared before them. While useful for tasks like text generation, this method inherently creates a bias toward the beginning of the input sequence. The deeper the model, that is, the more attention layers it has, the more this initial bias is amplified, because information from the beginning is drawn on more and more often in the model's reasoning process.
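

A schematic way to see this, assuming perfectly uniform attention scores for simplicity: under a causal mask, the first token is visible to every row of the attention matrix, so it accumulates the largest total share of attention, and stacking layers compounds that head start. This is an illustration of the intuition, not the MIT team's formal analysis.

```python
import numpy as np

def causal_attention_weights(scores: np.ndarray) -> np.ndarray:
    """Softmax attention where token i may only attend to tokens j <= i."""
    n = scores.shape[0]
    blocked = np.triu(np.ones((n, n), dtype=bool), k=1)  # True above diagonal
    masked = np.where(blocked, -np.inf, scores)
    w = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return w / w.sum(axis=-1, keepdims=True)

# Even with perfectly uniform scores, total attention received per
# position is skewed toward the start: position 0 is visible to all rows.
w = causal_attention_weights(np.zeros((5, 5)))
print(w.sum(axis=0))  # [2.28, 1.28, 0.78, 0.45, 0.20] -> front-loaded
```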


The Role of Data and Opportunities for Correction


The model's architecture is not the only source of the problem. The researchers confirmed that training data also plays a significant role. If the data on which the model was trained is itself biased in a certain way, the model will inevitably learn and reproduce that bias. Fortunately, the theoretical framework developed by the MIT team not only diagnoses the problem but also offers potential solutions.


One of the proposed strategies is the use of positional encodings, a technique that provides the model with explicit information about the location of each word within the sequence. By more strongly linking words to their immediate neighbors, this technique can help redirect the model's "attention" to more relevant parts of the text and thus mitigate the bias. However, the researchers warn, the effect of this method can weaken in models with a large number of layers.
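

The article does not specify which encoding scheme the researchers examined; as one concrete, widely used example, here is the classic sinusoidal positional encoding from the original transformer paper, which gives each position a unique vector while making nearby positions look similar.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Positional encodings from "Attention Is All You Need".
    Nearby positions receive similar vectors, which helps the model
    tie each word to its immediate neighbors. Assumes even d_model."""
    positions = np.arange(seq_len)[:, None]       # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]      # (1, d_model // 2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                  # even dimensions
    pe[:, 1::2] = np.cos(angles)                  # odd dimensions
    return pe  # added to the token embeddings before the first layer
```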


Other possibilities include using different masking techniques that do not favor the beginning of the sequence, strategically removing excess layers from the attention mechanism, or targeted fine-tuning of the model on data known to be more balanced. "If you know that your data is biased, you should fine-tune your model while adjusting the design choices," advises Wu.
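

As one example of a mask that does not structurally favor the start of the sequence, a symmetric sliding-window mask lets each token attend the same distance in both directions. This is an illustrative alternative of the kind the article alludes to, not the specific remedy the paper evaluates.

```python
import numpy as np

def sliding_window_blocked(n: int, window: int) -> np.ndarray:
    """Boolean mask (True = blocked) where each token attends only to
    tokens within `window` positions on either side. Unlike a causal
    mask, no position enjoys a structural advantage over the others."""
    idx = np.arange(n)
    distance = np.abs(idx[:, None] - idx[None, :])
    return distance > window

# Token 5 of 10 can see positions 3..7 with window=2.
print(np.where(~sliding_window_blocked(10, 2)[5])[0])  # [3 4 5 6 7]
```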


Practical Consequences and the Future of More Reliable Artificial Intelligence


The results of this research have far-reaching consequences. Solving the problem of positional bias could lead to significantly more reliable AI systems. Chatbots could have longer and more meaningful conversations without losing context. Medical systems could analyze patient data more fairly, while coding assistants could review entire programs in more detail, paying equal attention to all parts of the code.


Amin Saberi, a professor and director of the Center for Computational Market Design at Stanford University, who was not involved in the work, praised the research: "These researchers offer a rare theoretical insight into the attention mechanism at the heart of the transformer model. They provide a compelling analysis that clarifies long-standing oddities in the behavior of transformers." His words confirm the importance of this step towards demystifying AI technologies.


In the future, the research team plans to further investigate the effects of positional encoding and to study how positional bias might even be strategically exploited in certain applications. As Professor Jadbabaie points out, "If you want to use a model in high-stakes applications, you need to know when it will work, when it won't, and why." This research represents a crucial step toward that goal, paving the way for the creation of more accurate, reliable, and ultimately more useful artificial intelligence systems.

Source: Massachusetts Institute of Technology


Creation time: 19 June 2025

Science & tech desk
