
MIT and generative artificial intelligence: how robots use wireless signals to see hidden objects and rooms

Find out how MIT researchers combined generative artificial intelligence with wireless mmWave signals so that robots can detect hidden objects more precisely and reconstruct rooms without cameras. This article gives an overview of the technology, its possible applications, and why it could change robotics.

Photo by: Domagoj Skledar - illustration / archive (own)

Generative artificial intelligence helps robots "see" through obstacles: MIT unveiled a system that reconstructs hidden objects and entire rooms from wireless reflections

Researchers from the Massachusetts Institute of Technology have unveiled a new generation of wireless "vision" that could significantly change the way robots find objects, navigate enclosed spaces, and work alongside humans. At the center of their work is the combination of millimeter waves, a type of wireless signal also used in modern communication systems, with generative artificial intelligence that supplements what the sensor cannot directly register. The result is two techniques that can more accurately reconstruct the shape of a hidden object from reflected signals, as well as the layout of an entire room with furniture, without traditional cameras and without the need for the sensor to be mounted on a moving robot. MIT announced that both papers will be presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2026, which is being held from June 3 to June 7, 2026, in Denver.

Although the idea of "seeing through obstacles" sounds like science fiction, it is a field on which Professor Fadel Adib's laboratory has been working for more than a decade. His research group Signal Kinetics at the MIT Media Lab and the Department of Electrical Engineering and Computer Science develops systems that use wireless signals to perceive the world in situations where human vision and traditional cameras are limited. According to MIT, this new phase of the work is not merely a technical improvement of earlier methods, but a kind of qualitative leap: from partial reconstructions toward understanding complex reflections and creating a more complete picture of objects and spaces that are blocked from direct view.

How the system works when the object is out of sight

Previous MIT systems relied on millimeter waves, or mmWave signals, which can pass through common obstacles such as drywall, plastic, cardboard, or fabric and bounce off a hidden object. Based on those reflections, it is possible to estimate where the object is located and partially determine its shape. The problem arises because such waves often reflect specularly, in one dominant direction. Because of this, the sensor typically "sees" only part of the surface, for example the top side of the object, while the side and bottom surfaces remain beyond the reach of measurement. It was precisely this incomplete geometry that for years represented one of the main limitations of wireless 3D perception.
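The specular-visibility problem described above can be illustrated with a toy geometric model: only surface patches whose normal points roughly back toward the sensor return energy to it. Everything below (the cone half-angle, the cube, the sensor position) is invented for illustration and is not the researchers' actual signal model.

```python
import numpy as np

def visible_mask(points, normals, sensor, max_angle_deg=20.0):
    """Keep points whose surface normal lies within a cone of the
    direction back to the sensor (a crude specularity model)."""
    to_sensor = sensor - points
    to_sensor /= np.linalg.norm(to_sensor, axis=1, keepdims=True)
    cos_sim = np.sum(normals * to_sensor, axis=1)
    return cos_sim >= np.cos(np.radians(max_angle_deg))

# A unit cube sampled on its top and one side face, sensor overhead:
top = np.array([[x, y, 1.0] for x in (0.2, 0.8) for y in (0.2, 0.8)])
side = np.array([[1.0, y, z] for y in (0.2, 0.8) for z in (0.2, 0.8)])
points = np.vstack([top, side])
normals = np.vstack([np.tile([0.0, 0.0, 1.0], (4, 1)),   # top faces up
                     np.tile([1.0, 0.0, 0.0], (4, 1))])  # side faces out
sensor = np.array([0.5, 0.5, 5.0])                        # directly above

mask = visible_mask(points, normals, sensor)
# Only the top face reflects back to the overhead sensor; the side
# face is geometrically present but invisible to the measurement.
```

Running this marks all four top-face points visible and all four side-face points invisible, which is exactly the "sees only the top side" failure mode the article describes.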

MIT's new system, Wave-Former, tries to solve exactly that problem. Instead of stopping at a rough and incomplete reconstruction, the system first proposes possible object surfaces from the available reflections, lets a generative model complete the shape, and finally refines the result. In other words, the sensor provides partial information, and the model learns to infer the most probable full 3D shape from those fragments. The researchers emphasize that the model does not work arbitrarily and does not "invent" geometry without a basis, but is trained to take into account the physical properties of mmWave reflections and the noise patterns characteristic of such measurements.
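As a rough mental model of that three-stage flow, a heavily simplified skeleton might look like the following. Every function body here is a stand-in (the real proposal, completion, and refinement stages are learned models), and all names are hypothetical.

```python
import numpy as np

def propose_surfaces(reflections):
    """Stage 1 stand-in: turn raw reflection points into candidate
    surface patches (here we simply pass the measurements through)."""
    return reflections

def complete_shape(partial):
    """Stage 2 stand-in: a generative model would predict the full
    shape; we fake it by mirroring the visible top face downward."""
    mirrored = partial * np.array([1.0, 1.0, -1.0])
    return np.vstack([partial, mirrored])

def refine(shape):
    """Stage 3 stand-in: snap points to a coarse grid to mimic a
    denoising/cleanup pass."""
    return np.round(shape, 2)

# Two measured points on the object's top face; the pipeline returns
# a completed (here: mirrored) and cleaned-up point set.
reflections = np.array([[0.1, 0.2, 0.5], [0.3, 0.4, 0.5]])
full_shape = refine(complete_shape(propose_surfaces(reflections)))
```

The point of the skeleton is only the data flow: partial measurement in, hypothesized complete geometry out, with a refinement pass at the end.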

This is important because in systems like these it is very easy to cross the line between a useful estimate and speculation. That is precisely why the MIT team did not treat generative artificial intelligence as a universal magic tool, but tied it to the physical model of signal propagation. Since there are no huge datasets with mmWave recordings of hidden objects, the researchers adapted existing computer vision datasets to mimic the specularity and noise characteristic of wireless reflections. Thus, instead of spending years collecting a new database, they created a synthetic dataset on which the model could learn what the "missing" part of the shape looks like when the input information is incomplete and degraded.
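Since the article describes adapting existing computer-vision datasets into mmWave-like training data, here is a minimal sketch of what such a degradation step could look like. The function name, dropout fraction, and noise scale are all invented, and random dropout stands in for the physically modeled specular masking the researchers describe.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_training_pair(full_points, keep_fraction=0.3, noise_std=0.01):
    """Return (degraded, target): a sparse, noisy subset of a complete
    point cloud, paired with the full shape the model must recover."""
    n = len(full_points)
    keep = rng.choice(n, size=max(1, int(n * keep_fraction)),
                      replace=False)
    degraded = full_points[keep] + rng.normal(0.0, noise_std,
                                              (len(keep), 3))
    return degraded, full_points

# Stand-in for a shape sampled from an ordinary 3D vision dataset:
full = rng.uniform(-1.0, 1.0, (500, 3))
degraded, target = make_training_pair(full)
# The completion model trains on (degraded -> target) pairs, so it
# never needs real mmWave captures of every object category.
```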

According to the paper abstract available on arXiv, Wave-Former increased recall from 54 to 72 percent in direct comparison with the best existing approaches, while maintaining a high precision of 85 percent. MIT News also describes that shift on a practical level: the system faithfully reconstructed about 70 everyday objects, including cans, boxes, cutlery, and fruit, while they were hidden behind cardboard, wood, drywall, plastic, and fabric. In the context of robotics, this means that a machine would no longer have to guess what exactly is behind an obstacle or in a box, but would get a more convincing spatial estimate of the object before attempting grasping, sorting, or checking the contents.
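Recall and precision for 3D reconstruction are commonly scored with a distance threshold between predicted and ground-truth points; the sketch below shows one common convention, not necessarily the exact protocol used in the paper, and the threshold value is invented.

```python
import numpy as np

def precision_recall(pred, gt, threshold=0.05):
    """Precision: fraction of predicted points within `threshold`
    of some ground-truth point. Recall: fraction of ground-truth
    points covered by some prediction."""
    d_pred_to_gt = np.min(np.linalg.norm(
        pred[:, None, :] - gt[None, :, :], axis=-1), axis=1)
    d_gt_to_pred = np.min(np.linalg.norm(
        gt[:, None, :] - pred[None, :, :], axis=-1), axis=1)
    precision = np.mean(d_pred_to_gt <= threshold)
    recall = np.mean(d_gt_to_pred <= threshold)
    return precision, recall

gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
pred = np.array([[0.01, 0.0, 0.0], [1.02, 0.0, 0.0], [5.0, 0.0, 0.0]])
p, r = precision_recall(pred, gt)
# Two of three predictions lie near the ground truth, and two of
# three ground-truth points are covered: precision = recall = 2/3.
```

Under this convention, the reported jump in recall means the system recovers substantially more of the true surface, while the stable precision means it does so without hallucinating stray geometry.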

From a hidden object to a map of the entire room

The second system, called RISE, goes a step further and does not deal with just one object, but tries to reconstruct an entire indoor space. In this case too, the basis is mmWave signals, but this time the researchers use the fact that people move through the room. When a person moves, part of the signal reflects off them, then again off the walls or furniture, and only then returns to the sensor. Such secondary reflections have traditionally been considered interference or "ghosts" in the signal, because they create false or shifted copies of the original reflection. MIT's approach starts from the opposite assumption: those "ghosts" actually carry information about the spatial layout.

In other words, what was previously discarded as noise becomes a source of data. RISE observes how secondary reflections change as a person moves through the room and builds a rough spatial image from those changes. Then a generative model fills in the gaps and improves the resolution of the initial reconstruction. According to the arXiv abstract, this is the first system and benchmark for understanding indoor spaces using a single static radar, simultaneously targeting reconstruction of the spatial layout and object detection. The researchers state that their dataset contains 50,000 frames collected across more than 100 real indoor motion trajectories.
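The "ghosts carry information" idea can be made concrete with a toy 2D model: a signal that bounces person → wall → sensor appears to come from the person's mirror image behind the wall, so matched person/ghost pairs pin down the wall plane. All geometry below is invented for illustration and ignores the real-world association, multipath, and noise problems RISE actually has to solve.

```python
WALL_X = 3.0  # hidden wall plane x = 3 m (ground truth in this toy)

def ghost_of(person):
    """Mirror image of the person across the wall plane x = WALL_X;
    this is where the second-order reflection appears to originate."""
    px, py = person
    return (2 * WALL_X - px, py)

def estimate_wall(people, ghosts):
    """Each (person, ghost) pair places the wall halfway between them
    in x; averaging over a trajectory suppresses per-frame noise."""
    xs = [(p[0] + g[0]) / 2 for p, g in zip(people, ghosts)]
    return sum(xs) / len(xs)

# A person walking through the room, and the matching ghost track:
trajectory = [(0.5, 1.0), (1.0, 1.2), (1.5, 0.8), (2.0, 1.1)]
ghosts = [ghost_of(p) for p in trajectory]
wall_estimate = estimate_wall(trajectory, ghosts)
# The recovered plane coincides with WALL_X, even though no signal
# ever measured the wall directly.
```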

In the results they published, RISE reduced Chamfer distance, a measure of error in geometry reconstruction, by 60 percent, to 16 centimeters, compared with previous methods. In addition, the paper also reports 58 percent IoU for object detection, which the authors describe as the first result of its kind in mmWave room understanding based on a single static radar. MIT News summarizes that progress more simply: reconstructed scenes were approximately twice as accurate as existing techniques. That is not the level of detail provided by cameras or LiDAR, but it is a very important step forward for situations in which optical sensors have limitations due to occlusion, poor visibility, or privacy concerns.
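Chamfer distance, the metric quoted above, is the symmetric average of nearest-neighbour distances between two point sets. Conventions differ (mean versus sum, squared versus plain distances); this sketch uses one common plain-distance form, and the example geometry is invented.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric mean nearest-neighbour distance between point sets
    a and b (plain, non-squared distances)."""
    d_ab = np.min(np.linalg.norm(a[:, None, :] - b[None, :, :],
                                 axis=-1), axis=1)
    d_ba = np.min(np.linalg.norm(b[:, None, :] - a[None, :, :],
                                 axis=-1), axis=1)
    return 0.5 * (d_ab.mean() + d_ba.mean())

# Two samplings of the same wall segment, offset by 16 cm, matching
# the error level the article quotes:
wall_true = np.array([[x, 0.0] for x in np.linspace(0.0, 4.0, 9)])
wall_pred = wall_true + np.array([0.0, 0.16])

cd = chamfer_distance(wall_pred, wall_true)
# A uniform 16 cm offset yields a Chamfer distance of 0.16 m, i.e. a
# reconstructed wall that is everywhere 16 cm from the true one.
```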

Why MIT talks about privacy, not just robotics

Both papers also strongly emphasize a socially sensitive dimension: privacy. Traditional systems for understanding indoor spaces often rely on cameras, depth sensors, or LiDAR, which can provide a very detailed picture of a person, their appearance, face, and behavior. The wireless approach developed by MIT does not work with a person's visual identity, but with reflected signals from which the geometry of the space and the position of the body relative to the surroundings are inferred. This does not mean that every privacy concern is automatically resolved, but it does mean that the system's basic design is less intrusive than constant video recording of interiors.

In practice, such a difference could be important in homes for older people, smart homes, hospitals, warehouses, and industrial facilities. A robot that needs to know whether a person is behind a corner, whether a passage is clear, or where an object has been placed does not necessarily need to have a camera that constantly records everything that is happening. That is precisely why MIT's authors place scenarios of human-machine collaboration, safer robot movement in enclosed spaces, and better room understanding without traditional visual surveillance in the foreground.

Possible applications: from logistics to the smart home

The most direct business applications can be seen in logistics and warehouses. If a robot can more reliably estimate the contents of a package or the shape of an object hidden inside cardboard packaging, it is easier to verify whether an order has been packed correctly before shipping. As one example, MIT's announcement cites reducing the waste associated with product returns, a particularly sensitive topic in e-commerce, where incorrectly delivered products create cost, additional transport, and unnecessary accumulation of packaging. In a warehouse, this also opens the possibility for a robot to obtain a more realistic estimate of the shape of an object hidden behind other boxes or under packing material before the actual handling.

Another group of applications relates to household and service robots. A system that can estimate where a person is in a room, where they are moving, and what the furniture layout looks like without a camera could be useful for navigating assistive robots, especially in dynamic home conditions. In such an environment, obstacles are not static: doors open, chairs change position, objects remain on the floor, and people are constantly moving. For a robot that needs to collaborate with a human, it is not enough merely to "see" what is directly in front of it; it must also understand what is partially occluded, as well as the broader layout of the scene.

It should nevertheless be emphasized that MIT does not claim that this is a finished commercial product ready for the mass market. These are research systems presented at a scientific conference, with results that show the direction of development, but still leave open questions about equipment cost, robustness in different real conditions, operating speed, and possible integration with other types of sensors. The research group itself states that it wants to increase the granularity and detail of reconstructions and in the future build larger foundation models for wireless signals, analogous to what GPT, Claude, or Gemini have become for language and vision.

Who is behind the work and why CVPR matters

The senior author of both papers is Fadel Adib, an associate professor at the MIT Media Lab and EECS and the leader of the Signal Kinetics group. According to MIT, the Wave-Former paper involved Laura Dodds as the lead author along with Maisy Lam, Waleed Akbar, and Yibo Cheng, while the RISE paper was authored by Kaichen Zhou, Laura Dodds, Sayed Saad Afzal, and Fadel Adib. On Adib's official page and publication list, both papers are listed as upcoming papers for CVPR 2026. The CVPR conference itself is considered one of the world's most important gatherings in the field of computer vision and pattern recognition, and the official website states that this year's edition will be held at the Colorado Convention Center in Denver from June 3 to June 7, 2026.

That is also relevant because MIT's papers do not come from an isolated laboratory environment, but enter an international scientific arena in which they are compared with the latest trends in computer vision, multimodal models, robotics, and scene understanding systems. Over the past few years, generative artificial intelligence has strongly influenced image processing, 3D reconstruction, and spatial modeling, but MIT's contribution lies in applying that wave to data that are not classic photographs, but wireless reflections burdened with specific physical limitations. In this way, the research is positioned not as just another AI demonstrator, but as an attempt to connect learning models with the real laws of signal propagation.

What really changes for future robots

The biggest change is not that robots will suddenly "see through walls" in the way popular culture sometimes imagines. Much more important is that they could make fewer wrong decisions in situations where they currently work with incomplete information. In a warehouse, that can mean fewer failed grasps and less damage to goods. In the home, that can mean safer movement around people, children, or pets. In an industrial environment, that can mean a better understanding of the zone behind an obstacle without placing additional cameras at every point in the space.

MIT's announcement suggests that in this case generative artificial intelligence is not used merely to beautify the image, but to correct the fundamental limitation of wireless perception: the sensor sees only fragments, and the model helps infer what is missing. If that approach can be further scaled and validated in different environments, it could open a new class of systems that combine less privacy-invasive perception with practical use in robotics, logistics, and smart spaces. For now, this is research that still has to travel the road from the laboratory to broad application, but the published results show that the boundary between what is hidden and what a machine is capable of understanding is slowly, but visibly, shifting.

Sources:
- MIT News – announcement about the new Wave-Former and RISE systems, the authors, applications, and the presentation date at CVPR (link)
- CVPR 2026 – official conference website with dates and location (link)
- MIT / Fadel Adib – official website of the researcher and the Signal Kinetics group with an overview of work on wireless perception and a list of upcoming papers (link)
- arXiv – abstract of the paper "Wave-Former: Through-Occlusion 3D Reconstruction via Wireless Shape Completion" with the method and results (link)
- arXiv – abstract of the paper "RISE: Single Static Radar-based Indoor Scene Understanding" with a description of the benchmark and performance metrics (link)



Science & tech desk

