Scientists from the Skoltech Center for Computational and Data-Intensive Science and Engineering (CDISE) and the Helmholtz Munich Center for Environmental Health (HMGU, Germany) have created a neural network for visualizing the chemical space of compounds of potential value to the pharmaceutical industry. The new method will help chemists design new compounds and navigate the space of existing ones. The results of the study were published in RSC Advances.
Chemists often have to toil through huge databases containing tens or even hundreds of thousands of chemical structures to select the best candidates. To do so, a chemist needs to know which classes of compounds the database contains. However, going through thousands of molecules one by one is a laborious and thankless task. It would be much easier if the molecules were pictured as dots on a plane or in space, with similar molecules huddled together. A chemist could then explore the chemical space with a simple tool, much as a geographer uses digital maps of different scales to view the bigger picture or zoom in on a particular area. But here’s the rub: how would the algorithm know where to place the molecules if the tool has no knowledge of chemistry?
A joint group of researchers from CDISE (Dmitry Karlov, Sergey Sosnin and Maxim Fedorov) and HMGU (Igor Tetko) applied AI methods that extract information directly from data. They coupled a deep neural network with the popular t-SNE dimension reduction method, producing a network that takes a compound’s multidimensional representation as input and outputs its position on a 2D plane. The method places molecules with similar properties close to one another, so that compounds can be grouped into classes according to their properties. The authors trained their neural network on millions of compounds with known biological activity.
“We adapted the t-SNE method to enable visualizing the chemical space of compounds with pharmaceutical potential by training the deep neural network and selecting simple descriptors and a metric for calculating distances in a multidimensional space. We also showed that this approach allows saving more information as compared to other dimension reduction methods, while being on a par with PCA in terms of speed,” says Dmitry Karlov, Skoltech researcher and first author of the study.
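The idea behind t-SNE can be illustrated with a small sketch. The release does not specify the descriptors, the distance metric or the network architecture used in the study, so everything below is an assumption made for illustration: molecules are represented as hypothetical sets of feature IDs, the distance is the Jaccard (Tanimoto) distance, and a single fixed Gaussian width replaces the per-point perplexity tuning of real t-SNE. The sketch only computes the t-SNE objective (a Kullback-Leibler divergence) for two hand-made 2D layouts. In a neural-network variant of t-SNE, the network would be trained to output coordinates that minimize this kind of objective.

```python
import math

def jaccard_distance(a, b):
    """Distance between two feature sets: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def high_dim_affinities(items, sigma=0.5):
    """Gaussian affinities p_ij over all ordered pairs, normalized to sum to 1.
    Assumption: one fixed sigma; real t-SNE tunes sigma per point."""
    n = len(items)
    p = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                d = jaccard_distance(items[i], items[j])
                p[i][j] = math.exp(-d * d / (2.0 * sigma * sigma))
                total += p[i][j]
    return [[v / total for v in row] for row in p]

def low_dim_affinities(points):
    """Student-t affinities q_ij between 2D points, normalized to sum to 1."""
    n = len(points)
    q = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                dx = points[i][0] - points[j][0]
                dy = points[i][1] - points[j][1]
                q[i][j] = 1.0 / (1.0 + dx * dx + dy * dy)
                total += q[i][j]
    return [[v / total for v in row] for row in q]

def kl_divergence(p, q):
    """t-SNE objective: sum of p_ij * log(p_ij / q_ij) over pairs i != j."""
    loss = 0.0
    for i in range(len(p)):
        for j in range(len(p)):
            if i != j and p[i][j] > 0.0:
                loss += p[i][j] * math.log(p[i][j] / q[i][j])
    return loss

# Four hypothetical molecules, each a set of made-up feature IDs.
# Molecules 0 and 1 share most features, as do molecules 2 and 3.
molecules = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9, 10}, {7, 8, 9, 11}]
p = high_dim_affinities(molecules)

# Layout that keeps similar molecules together vs. one that splits them.
good_layout = [(0.0, 0.0), (0.5, 0.0), (5.0, 0.0), (5.5, 0.0)]
bad_layout = [(0.0, 0.0), (5.0, 0.0), (0.5, 0.0), (5.5, 0.0)]

kl_good = kl_divergence(p, low_dim_affinities(good_layout))
kl_bad = kl_divergence(p, low_dim_affinities(bad_layout))
print("KL (similar molecules together):", kl_good)
print("KL (similar molecules split):   ", kl_bad)
```

The layout that keeps similar molecules next to each other yields the lower divergence, which is the sense in which a t-SNE map "knows" where to place molecules without any built-in chemistry.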
In the future, the scientists plan to develop a series of tools that let chemists and pharmacists see where new, unexplored compounds sit in relation to those already studied and described in the literature. This will help expedite the R&D phase of the search for new drugs.
*****
The Skolkovo Institute of Science and Technology (Skoltech) is a private graduate research university. Established in 2011 in collaboration with the Massachusetts Institute of Technology (MIT), Skoltech cultivates a new generation of researchers and entrepreneurs, promotes advanced scientific knowledge and fosters innovative technology to address critical issues facing Russia and the world in the third millennium. Skoltech applies the best Russian and international research and educational practices, with particular emphasis on entrepreneurship and innovation. Web: https://www.skoltech.ru/
Contact information:
Skoltech Communications
+7 (495) 280 14 81