3:00 PM | Mar 22, 2018 | BC Innovations | Strategy

Janssen’s virtual discovery

How Janssen is using machine learning to accelerate compound screens

In the latest example of how AI can accelerate drug discovery, Janssen has shown it can use machine learning to create virtual drug screens that outperform conventional assays. The most promising application may be in drugging difficult targets.

Despite improvements in computational methods that use structural data to predict binding and activity, most programs work well only when the molecule is closely related to others with known activity. They fail to predict the activity of compounds in untested chemical space, which limits their value against difficult targets.

Last month in Cell Chemical Biology, a group from the Janssen Pharmaceutica NV unit of Johnson & Johnson described the development of artificial intelligence (AI) screens based on high-content imaging that produced higher hit rates and greater chemical diversity than standard cellular and biochemical assays.

In general, companies choose single morphological features as readouts of imaging screens and ignore the rest.

High-content imaging measures changes in cellular morphology that are influenced by a compound’s activity on a range of molecular pathways. As the targets behind those effects are often unknown, the screens are not biased towards compounds of any particular structure.

The Janssen group used images from previous cell-based assays on over 500,000 compounds to capture as many features as possible and create a “fingerprint” of each compound’s activity (see “When Worlds Collide”).


Figure: When worlds collide

Johnson & Johnson’s Janssen Pharmaceutica NV has published an approach to virtual drug screening that uses machine learning to analyze historical data and create computational models that can predict compound activity in dozens of assays. Results were reported in Cell Chemical Biology.

The method involves creating two data matrices from different types of archival data.

(1a) The first comes from analyzing high-content images of cells to create an image fingerprint of each compound’s effects on morphology.

(1b) The second matrix is populated with any available historical activity data from the same compounds across various biochemical and cellular assays.

(2) The Janssen team showed two machine learning approaches could generate successful models: the Macau method, which uses Bayesian matrix factorization, and deep neural network analysis.

(3) In either case, the output was a third matrix containing the...
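
For readers who want a concrete picture of how two such matrices can be combined, the following is a minimal sketch under stated assumptions. It is not the Macau software or Janssen's code. It replaces Bayesian matrix factorization with a plain, non-Bayesian factorization trained by stochastic gradient descent, in which each compound's latent vector is assumed to be a linear function of its image fingerprint. All numbers, the assay weights `A`, and the function names (`fit`, `predict`, `latent`) are hypothetical and invented for illustration.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def latent(W, f):
    """Map one image fingerprint f to k latent dimensions (W is k x n_features)."""
    return [dot(row, f) for row in W]

def predict(F, W, V):
    """Full compound x assay table of predicted activities."""
    return [[dot(latent(W, f), v) for v in V] for f in F]

def mse(F, Y, W, V):
    """Mean squared error over the observed (compound, assay) entries only."""
    return sum((dot(latent(W, F[i]), V[j]) - y) ** 2
               for (i, j), y in Y.items()) / len(Y)

def fit(F, Y, n_assays, k=2, lr=0.05, epochs=400, seed=0):
    """Learn W (fingerprint -> latent) and V (one latent vector per assay) by SGD."""
    rng = random.Random(seed)
    n_feat = len(F[0])
    W = [[rng.gauss(0, 0.1) for _ in range(n_feat)] for _ in range(k)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_assays)]
    observed = list(Y.items())
    for _ in range(epochs):
        rng.shuffle(observed)
        for (i, j), y in observed:
            u = latent(W, F[i])
            err = dot(u, V[j]) - y
            for d in range(k):
                v_old = V[j][d]               # keep the pre-update value for W's gradient
                V[j][d] -= lr * err * u[d]
                for m in range(n_feat):
                    W[d][m] -= lr * err * v_old * F[i][m]
    return W, V

# Matrix 1 (toy, hypothetical): image fingerprints, one row per compound.
F = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [1.0, 1.0, 0.0],
    [0.5, 0.0, 1.0],
    [0.0, 0.5, 1.0],
    [1.0, 0.5, 0.5],
]

# Matrix 2 (toy, hypothetical): sparse historical activities, keyed by (compound, assay).
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]  # invented assay weights
full = {(i, j): dot(F[i], A[j]) for i in range(len(F)) for j in range(len(A))}
held_out = {(3, 0), (5, 2)}                   # pretend these were never measured
Y = {key: val for key, val in full.items() if key not in held_out}

W0, V0 = fit(F, Y, n_assays=3, epochs=0)      # untrained parameters, same seed
initial_mse = mse(F, Y, W0, V0)

W, V = fit(F, Y, n_assays=3)
final_mse = mse(F, Y, W, V)
table = predict(F, W, V)                      # dense table, including unmeasured pairs

print("training MSE before/after:", initial_mse, final_mse)
for key in sorted(held_out):
    print("held-out", key, "true:", full[key], "predicted:", table[key[0]][key[1]])
```

The design point the sketch tries to convey is that the fingerprints act as side information: because a compound's latent vector is computed from its image features, the model can score compound-assay pairs that have no historical measurement. A Bayesian treatment such as the one the article attributes to Macau would additionally place priors on these parameters and report uncertainty, which this sketch omits.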
