The test of excellence: comparative study of Receptor.AI’s virtual screening technologies

General assessment pipeline and description of key technologies

Results of the initial screening comparison

Results of the secondary screening comparison

Comparative study of the Receptor.AI virtual screening platform against 24 competing techniques.

LONDON, UNITED KINGDOM, Aug. 28, 2022 -- All AI techniques rely on extensive validation, and AI-based drug discovery is no exception. Receptor.AI pays special attention to the validation and experimental testing of every piece of technology used in our SaaS platform and internal services.

When it comes to virtual screening, a core technology of our platform, there are two different measures of its performance. The first is the ability to distinguish “binders” from “non-binders”. The fewer non-binders that appear among the top-ranked molecules, and the fewer good binders that are missed, the better the virtual screening method.
The second measure is the correct ranking of molecules according to their affinity and/or activity. The higher the correlation between the actual and predicted binding affinities and/or biological activities, the better the method.
These two performance measures are typically relevant to two different stages of virtual screening. The first matters most for an initial screen, which is designed to scan a huge chemical space and select potential binders quickly and with reasonable accuracy. The second generally applies to secondary screening within the selected pool of potential binders, which should prioritize compounds with the best characteristics for further development.

Receptor.AI virtual screening technologies
Our technology stack follows the idea of a virtual screening funnel based on a holistic approach. The funnel starts with the chemical space, which can be intelligently pre-processed and clustered to achieve unprecedented screening performance (databases with billions of compounds can be evaluated in just a few hours). Next, the initial AI-based virtual screening module is applied. Its results are filtered by an advanced AI-based ADME-Tox module comprising 38 predictive endpoints and then passed to the selectivity prediction module, which checks them against ~10,000 human proteins. Finally, secondary screening is performed using fully automated docking with AI rescoring, and the final set of ranked candidates is formed.
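The funnel can be pictured as a chain of stages, each of which narrows the candidate set. The sketch below is purely illustrative: the `Candidate` fields, the stage functions, and every threshold are hypothetical and do not reflect the actual Receptor.AI modules.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    smiles: str
    dti_score: float = 0.0      # hypothetical initial-screening score (higher is better)
    admet_pass: bool = True     # hypothetical ADME-Tox verdict
    off_targets: int = 0        # hypothetical number of predicted off-target proteins
    docking_score: float = 0.0  # hypothetical rescored docking value (higher is better)


Stage = Callable[[List[Candidate]], List[Candidate]]


def initial_screen(threshold: float) -> Stage:
    # Keep compounds whose initial-screening score reaches the threshold.
    return lambda cands: [c for c in cands if c.dti_score >= threshold]


def admet_filter() -> Stage:
    # Drop compounds flagged by the ADME-Tox step.
    return lambda cands: [c for c in cands if c.admet_pass]


def selectivity_filter(max_off_targets: int) -> Stage:
    # Drop compounds predicted to hit too many other proteins.
    return lambda cands: [c for c in cands if c.off_targets <= max_off_targets]


def secondary_rank(top_n: int) -> Stage:
    # Rank the survivors by the secondary score and keep the best top_n.
    return lambda cands: sorted(
        cands, key=lambda c: c.docking_score, reverse=True
    )[:top_n]


def run_funnel(cands: List[Candidate], stages: List[Stage]) -> List[Candidate]:
    # Apply the stages in order; each one sees only the previous stage's output.
    for stage in stages:
        cands = stage(cands)
    return cands
```

Placing the cheap filters first means the expensive secondary step only runs on a small pool, which is the usual motivation for a funnel layout.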

The initial screening stage is represented by two drug-target interaction (DTI) models, 3DProtDTA and FB-DTI, which are applied in parallel in consensus mode.

Initial Screening Performance Test
To test the performance of the model architectures for initial screening, we ran two experiments using different test datasets.

The first experiment was performed with two widely used benchmark datasets for AI-based drug-target affinity prediction, named “Davis” and “KIBA”.

We compared our 3DProtDTA model with 8 state-of-the-art open-source AI algorithms for drug-target affinity prediction using the same training set, test set, and performance metrics.

Our approach outperforms all competitors by a significant margin, which indicates that our model architecture and training protocol are highly competitive.

In the second experiment, we tested the ability of 3DProtDTA to discriminate binders from non-binders on a large internal test dataset containing 6,618 unique proteins and 80,079 unique hit compounds with known affinities. This translates to 157,809 experimentally validated protein-ligand pairs (the binders), which were augmented with 1,408,400 non-binding pairs used as negative controls. The negative controls consisted of experimentally validated pairs with non-active compounds and randomly generated pairs.

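We computed the precision-recall curve, which is commonly used to assess the performance of predictive AI models, especially on imbalanced data. The area under this curve (AUC) summarizes the overall ability of the model to separate binders from non-binders.
Our model has an AUC of 0.917. This value summarizes precision over all recall levels and is not a per-pair accuracy. For comparison, a random classifier would score about 0.10 on this dataset, where binders make up roughly 10% of all pairs.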
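As a minimal sketch of how an area under the precision-recall curve can be computed, the function below implements average precision, one common step-wise estimate of that area. The source does not state which estimator was used, so this choice is an assumption. The function name and toy data are hypothetical. Only the two pair counts come from the text above.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision).

    labels: 1 for a binder, 0 for a non-binder.
    scores: model scores, higher meaning "more likely a binder".
    Tied scores are not treated specially in this sketch.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank all pairs from the highest to the lowest score.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at the rank of each true binder
    return total / n_pos


# Fraction of binders in the test set described above. A random classifier's
# precision-recall AUC is expected to be close to this value.
N_BINDERS = 157_809
N_NON_BINDERS = 1_408_400
prevalence = N_BINDERS / (N_BINDERS + N_NON_BINDERS)

if __name__ == "__main__":
    toy_labels = [1, 0, 1, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.1]
    print(f"toy average precision: {average_precision(toy_labels, toy_scores):.3f}")
    print(f"binder prevalence: {prevalence:.3f}")
```

Because negatives outnumber positives by roughly nine to one here, the prevalence baseline is a more useful reference point than the 0.5 baseline of a ROC curve.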

Secondary Screening Performance Test
To test the performance of secondary screening, we took four well-studied proteins, each with a significant number of known ligands that have reliably measured binding affinities.

We selected the 16 most widely used docking techniques dedicated to predicting the poses and affinities of ligands. Some of them are based on AI scoring functions, which makes them especially interesting for us.

For our part, we tested not only Receptor.AI docking with AI rescoring (our dedicated method for secondary screening), but also our DTI and FB-DTI models, as well as the consensus model of DTI and docking with AI rescoring.
Our technology stack uses an elaborate framework of consensus functions. For example, the DTI and FB-DTI models are balanced by giving them different weights depending on the number of known ligands for a particular protein, the reliability of its binding pocket annotation, the size of the binding pocket, and user preference. Such intelligent weighting automatically prioritizes the most relevant and reliable DTI model for a given protein target. Another proprietary consensus function is used to combine the results of the DTI models with docking scores.
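The consensus functions themselves are proprietary and not described in detail, so the following is only a toy illustration of context-dependent weighting between two models. The function names, the logistic form, and all coefficients are hypothetical. Pocket size is left out for brevity, and the direction in which each factor shifts the weight is an arbitrary assumption.

```python
import math


def model_a_weight(
    n_known_ligands: int,
    pocket_reliability: float,
    user_bias: float = 0.0,
) -> float:
    """Hypothetical weight in (0, 1) for model A; model B receives 1 - weight.

    Assumptions made only for this sketch:
    - more known ligands shift trust toward model A;
    - a more reliable pocket annotation (0..1) shifts trust toward model B;
    - user_bias lets the user nudge the balance in either direction.
    """
    raw = (
        0.5 * math.log10(1 + n_known_ligands)
        - 2.0 * (pocket_reliability - 0.5)
        + user_bias
    )
    # Squash the raw value into (0, 1) with a logistic function.
    return 1.0 / (1.0 + math.exp(-raw))


def consensus(score_a: float, score_b: float, weight_a: float) -> float:
    # Convex combination of the two model scores.
    return weight_a * score_a + (1.0 - weight_a) * score_b


if __name__ == "__main__":
    w = model_a_weight(n_known_ligands=250, pocket_reliability=0.8)
    print(f"weight for model A: {w:.2f}")
    print(f"consensus score: {consensus(7.1, 6.4, w):.2f}")
```

A convex combination keeps the consensus score within the range spanned by the two inputs, which makes the weighting easy to inspect.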

It should be emphasized that DTI models are designed for initial screening, so they are not required to perform well at correctly ranking molecules with significant binding affinities. For such techniques, it is crucial to discriminate binders from non-binders, but they may not rank binders as accurately as dedicated docking techniques.

First, we augmented the sets of known ligands for the selected proteins with a large number of decoys (guaranteed non-binders) and checked whether our DTI model recovers the real ligands from among the decoys. As expected, the results are excellent: the top 20 compounds contain all 10 known ligands for each of three proteins and 13 of the 16 for the fourth.
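A decoy recovery check of this kind amounts to counting how many known ligands appear among the top-ranked compounds. Here is a minimal sketch, assuming scores where higher means a more likely binder. The function name and toy identifiers are hypothetical.

```python
from typing import Dict, Set


def recovered_in_top_k(scores: Dict[str, float], known_ligands: Set[str], k: int) -> int:
    """Count the known ligands among the k highest-scoring compounds."""
    # Sort compound identifiers by score, best first.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return len(known_ligands.intersection(ranked[:k]))


if __name__ == "__main__":
    # Toy data: two real ligands mixed with three decoys.
    toy_scores = {
        "lig1": 0.95,
        "decoy1": 0.40,
        "lig2": 0.90,
        "decoy2": 0.92,
        "decoy3": 0.10,
    }
    print(recovered_in_top_k(toy_scores, {"lig1", "lig2"}, k=2))
```

Dividing the count by the number of known ligands gives the recall at rank k, which is the quantity reported for each protein above (for example, 13 of 16).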

We then computed binding scores for the known ligands using our techniques and the 16 competing docking techniques, and compared the correlations between predicted and experimental values for all of them.
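The source does not say which correlation coefficient was used. As one plausible option for ranking quality, the sketch below computes the Spearman rank correlation with the standard library only. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    # Spearman correlation is the Pearson correlation of the rank vectors.
    return pearson(average_ranks(predicted), average_ranks(experimental))
```

Before comparing methods, the sign conventions need to be aligned: many docking programs report more negative scores for stronger predicted binding, whereas affinities such as pKd grow with binding strength.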

Surprisingly, our DTI and FB-DTI techniques, which are not designed for detailed, correct ranking of compounds with high binding affinities, perform on par with the best dedicated docking techniques.
Our in-house docking with AI rescoring is slightly better still, while the combination of DTI with docking and AI rescoring gives the best result of all.

This is a remarkable result. It shows that Receptor.AI’s virtual screening techniques can compete with dedicated docking algorithms in correctly ranking ligands with high binding affinity, while their combination with Receptor.AI docking and the AI rescoring function outperforms the competition.

Alan Nafyyev
+49 1517 2837276