The MIT Revolution: Artificial Intelligence is changing drug development by accurately predicting molecule solubility

MIT scientists have developed a machine learning model called FastSolv. It predicts the solubility of molecules far more accurately than earlier tools, which could speed up the design and synthesis of new drugs and encourage the use of more environmentally friendly solvents in industry.

Photo by: Domagoj Skledar - illustration / archive (own)

A revolutionary breakthrough in chemical engineering and the pharmaceutical industry has occurred thanks to a team of scientists from the prestigious Massachusetts Institute of Technology (MIT). They have developed an advanced computational model based on machine learning that can predict the solubility of almost any molecule in various organic solvents with unprecedented accuracy. This achievement promises radical changes in the design and synthesis processes of new drugs, while also opening the door to the application of more environmentally friendly and less hazardous chemicals in the industry.


The ability to predict how and to what extent a substance will dissolve in a particular solvent is a crucial, and often limiting, step in almost every chemical synthesis. The choice of the right solvent can mean the difference between a successful and an unsuccessful experiment, efficient and inefficient production, and ultimately, between the rapid development of a new drug and a long process full of dead ends. The new model from MIT directly addresses this challenge, providing chemists with a powerful tool for making informed decisions.


The Problem of Solubility as a Key Obstacle


Solubility, defined as the maximum amount of a substance (solute) that can be dissolved in a given amount of solvent at a specific temperature, has been one of the central problems in chemistry for decades. Traditionally, determining solubility was a painstaking process that relied on trial and error, requiring numerous laboratory experiments. Such an approach not only slows down research and development but also consumes significant resources and generates chemical waste.


Older models for predicting solubility, such as the well-known Abraham solvation model, estimated a molecule's overall solubility by summing the contributions of the individual chemical structures within it. Although such tools were useful, their accuracy was limited and often insufficient for the complex molecules used in modern pharmaceutical development. Predicting solubility therefore remained a bottleneck in planning the synthesis and production of chemicals, especially drugs.
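
The additive idea behind such older models can be illustrated with a minimal Python sketch. The fragment names, the baseline, and every contribution value below are hypothetical placeholders chosen for illustration; they are not parameters of the Abraham model or of any published scheme.

```python
# Toy additive estimate: each structural fragment contributes a fixed amount.
# All fragment names and numbers are hypothetical, for illustration only.
HYPOTHETICAL_CONTRIBUTIONS = {
    "CH3": -0.5,
    "CH2": -0.4,
    "OH": 1.1,
    "C=O": 0.8,
}
BASELINE = 0.5  # hypothetical intercept


def additive_log_solubility(fragment_counts):
    """Sum per-fragment contributions, weighted by how often each fragment occurs."""
    total = BASELINE
    for fragment, count in fragment_counts.items():
        total += HYPOTHETICAL_CONTRIBUTIONS[fragment] * count
    return total


# Ethanol viewed as one CH3, one CH2 and one OH fragment.
ethanol = {"CH3": 1, "CH2": 1, "OH": 1}
estimate = additive_log_solubility(ethanol)
```

In a scheme like this every fragment contributes the same amount regardless of its surroundings, which is one plausible reason such estimates struggle with large, drug-like molecules.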


Lucas Attia, one of the lead authors of the study and a graduate student at MIT, emphasizes the importance of this problem: "Predicting solubility is truly the rate-limiting step in synthetic planning and chemical manufacturing. Because of this, there has been a huge interest for a long time in developing better models for predicting it."


The Impact of Machine Learning and Advanced Algorithms


The new model, named FastSolv, grew out of a project that Attia and his colleague Jackson Burns worked on as part of a course on applying machine learning to chemical engineering problems. Unlike previous methods, FastSolv uses the power of artificial intelligence to analyze vast amounts of data and learn the subtle patterns that govern the interactions between solute and solvent molecules.


To train their models, the team used a recently published database called BigSolDB, a comprehensive compilation of data from nearly 800 scientific papers. This database contains solubility information for approximately 800 different molecules in more than 100 of the most commonly used organic solvents in synthetic chemistry, with over 40,000 individual data points.


The scientists tested two different approaches. The first, called FastProp, uses so-called "static embeddings," where the model is given a precomputed numerical representation of each molecule before training begins. The second, ChemProp, learns these numerical representations during the training process itself, simultaneously linking the molecule's features to solubility. Both models represent molecular structures as numerical vectors, a kind of "digital fingerprint" that encodes information about the number and type of atoms and the bonds between them. These vectors let the algorithm pick up relationships between structure and solubility that are hard to spot by intuition alone.
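
To make the "digital fingerprint" idea concrete, here is a deliberately crude Python sketch of a static representation: a fixed-length vector of atom and bond counts read from a SMILES string. The function names, the feature layout, and the way solute, solvent and temperature are joined into one input are all assumptions made for illustration; this is not the descriptor set used by FastProp or ChemProp.

```python
import re
from collections import Counter

# Fixed feature order: the layout is decided before any training,
# which is what makes this a "static" representation.
ELEMENTS = ["C", "N", "O", "S", "F", "Cl", "Br"]

# Two-letter symbols come first so "Cl" is not read as "C".
# Lowercase letters are aromatic atoms in SMILES notation.
ATOM_PATTERN = re.compile(r"Cl|Br|[CNOSF]|[cnos]")


def toy_fingerprint(smiles):
    """Return atom counts followed by double- and triple-bond counts.

    Ignores hydrogens, charges, rings and stereochemistry (toy example).
    """
    symbols = (match.capitalize() for match in ATOM_PATTERN.findall(smiles))
    counts = Counter(symbols)
    atom_part = [counts[element] for element in ELEMENTS]
    bond_part = [smiles.count("="), smiles.count("#")]
    return atom_part + bond_part


def build_model_input(solute, solvent, temperature_k):
    """Hypothetical input layout: solute vector, solvent vector, temperature."""
    return toy_fingerprint(solute) + toy_fingerprint(solvent) + [temperature_k]


# Acetic acid dissolved in ethanol at 298.15 K.
features = build_model_input("CC(=O)O", "CCO", 298.15)
```

A learned representation, by contrast, would adjust the numbers in such a vector during training instead of fixing them in advance.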


Surprising Results and Unprecedented Accuracy


After being trained on the extensive database, the models were tested on a set of about 1,000 molecules that had been held out of training. The results were impressive. The new models proved to be two to three times more accurate than the previous state-of-the-art model, SolProp, which was developed in 2022 in the lab of Professor William Green, a co-author of the new study.
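
Testing on molecules the model has never seen can be sketched as follows. This is a minimal illustration, assuming records are split by solute and scored with root-mean-square error (RMSE); the article does not state which error metric or splitting procedure the authors used, and the records below are made-up placeholders.

```python
import math
import random


def split_by_solute(records, test_fraction=0.2, seed=0):
    """Split so that no solute in the test set also appears in the training set."""
    solutes = sorted({record["solute"] for record in records})
    rng = random.Random(seed)
    rng.shuffle(solutes)
    n_test = max(1, round(len(solutes) * test_fraction))
    test_solutes = set(solutes[:n_test])
    train = [r for r in records if r["solute"] not in test_solutes]
    test = [r for r in records if r["solute"] in test_solutes]
    return train, test


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical records: one measurement per (solute, solvent) pair.
records = [
    {"solute": "A", "solvent": "ethanol", "log_s": -1.2},
    {"solute": "A", "solvent": "acetone", "log_s": -0.8},
    {"solute": "B", "solvent": "ethanol", "log_s": -2.5},
    {"solute": "C", "solvent": "acetone", "log_s": -0.3},
    {"solute": "D", "solvent": "ethanol", "log_s": -1.9},
]
train, test = split_by_solute(records)
```

Splitting by solute instead of by individual measurement keeps every data point for a held-out molecule out of training, so the score reflects performance on genuinely new compounds.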


Particularly significant is the new models' ability to accurately predict how changes in temperature affect solubility, which is a key parameter in real-world industrial conditions. "The ability to accurately reproduce the small variations in solubility due to temperature, even when the overall experimental noise is very large, was an extremely positive sign that the network had correctly learned the underlying solubility prediction function," Burns explains.


One of the biggest surprises was the discovery that both models, FastProp and ChemProp, achieved nearly identical performance. The researchers had expected ChemProp, which learns molecular representations "on the fly," to be superior. Their equal success strongly suggests that the main limitation to further improving accuracy is not the model architecture, but the quality and consistency of the available training data. Differences in experimental methods and conditions across different laboratories introduce variability that poses the greatest challenge.


A Revolution in Pharmacy and the Quest for Greener Solvents


The practical applications of this model are far-reaching. The pharmaceutical industry, which constantly faces the challenge of formulating new drugs, is one of the most obvious beneficiaries. Many potentially therapeutic molecules never reach the market because they are extremely difficult to dissolve in a manner suitable for administration to the human body. FastSolv allows scientists to predict solubility problems at an early stage of development and select the most promising candidates.


Equally important is the environmental aspect. Many of the most effective and commonly used organic solvents, such as dimethylformamide (DMF) or dichloromethane (DCM), pose a significant risk to human health and the environment. They are known to be toxic, carcinogenic, or harmful to the reproductive system. Consequently, regulatory agencies and companies themselves are increasingly restricting their use.


"There are solvents that are known to dissolve almost everything. They are extremely useful, but they are harmful to the environment and to people, so many companies require their use to be minimized," points out Jackson Burns. "Our model is extremely useful in identifying the next best solvent, one that is hopefully much less harmful."
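
Finding "the next best solvent" amounts to ranking candidate solvents by predicted solubility after excluding the ones to be avoided. The sketch below illustrates that step only; the predicted values are invented placeholders standing in for model output, and the function name is hypothetical.

```python
# Hypothetical predicted log-solubilities of one solute in several solvents.
# In practice these would come from a trained model; here they are made up.
predicted_log_s = {
    "DMF": -0.2,
    "DCM": -0.5,
    "ethanol": -0.9,
    "ethyl acetate": -0.7,
    "water": -3.1,
}

# Solvents to avoid; DMF and DCM are the examples named in the article.
RESTRICTED = {"DMF", "DCM"}


def rank_alternatives(predictions, restricted):
    """Return the non-restricted solvents, highest predicted solubility first."""
    allowed = {name: value for name, value in predictions.items()
               if name not in restricted}
    return sorted(allowed, key=allowed.get, reverse=True)


ranking = rank_alternatives(predicted_log_s, RESTRICTED)
best_alternative = ranking[0]
```

A real selection would weigh further criteria such as cost, boiling point and compatibility with the reaction, which this sketch leaves out.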


The research team, which alongside Attia and Burns includes Professor Patrick Doyle and William Green, director of the MIT Energy Initiative, has decided to make the model publicly available. Because it is faster and its code is easier to adapt, the version based on the FastProp algorithm, named FastSolv, has already been released to the scientific community and industry. Several leading pharmaceutical companies have begun to implement it in their research and development processes, confirming its immediate relevance and its potential to change how chemistry is applied in practice.


AI Lara Teč

