Tom Pesnot is the Head of Medicinal Chemistry at Concept Life Sciences. I invited him to talk about AI and virtual screening in the drug discovery process.
By way of review, Tom laid out the overall process of discovery. One needs to identify a target whose activity can be modulated in a way that is, of course, relevant to the disease of interest. Most often we are trying to stop a protein from carrying out its normal function.
Then we are looking for hits: interactions of candidate compounds with the target molecule. The quality of those hits is important.
Typically, this has been done by high-throughput screening using in vitro assays. That requires lots of compounds and lots of assays, making the process inaccessible to many. As you might imagine, it is very expensive, with fancy robots and so on.
All of this provides the rationale for virtual screening: computers are becoming powerful enough to predict interactions between small-molecule compounds and target proteins.
Instead of starting with a compound collection (that few have access to), you start with a database. It’s possible to enumerate tens of billions of compounds in silico for screening. What blew my mind was the fact that they only screen molecules that can be made in two or three steps from existing building blocks. Tens of billions! That means the time from identification to testing is essentially the time needed for shipping the constituent compounds.
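To make that "two or three steps from existing building blocks" idea concrete, here is a minimal sketch of how such a virtual library can be enumerated. It is my own illustration, not Tom's workflow: it assumes RDKit, and the building blocks and reaction are purely for show. Every product exists only as a structure string until someone decides to order the reagents and make it.

```python
from itertools import product
from rdkit import Chem
from rdkit.Chem import AllChem

# A handful of illustrative, purchasable-style building blocks (SMILES).
primary_amines = ["CCN", "Nc1ccccc1", "NCc1ccccc1"]
carboxylic_acids = ["CC(=O)O", "OC(=O)c1ccccc1", "OC(=O)CC(C)C"]

# A single, well-precedented step: amide coupling, written as reaction SMARTS.
amide_coupling = AllChem.ReactionFromSmarts(
    "[N;H2:1].[C:2](=[O:3])[OH]>>[N:1][C:2]=[O:3]"
)

library = set()
for amine_smi, acid_smi in product(primary_amines, carboxylic_acids):
    reactants = (Chem.MolFromSmiles(amine_smi), Chem.MolFromSmiles(acid_smi))
    for products in amide_coupling.RunReactants(reactants):
        prod = products[0]
        Chem.SanitizeMol(prod)            # products come back unsanitized
        library.add(Chem.MolToSmiles(prod))

print(f"{len(library)} virtual products from "
      f"{len(primary_amines)} x {len(carboxylic_acids)} building blocks")
```

Chain two or three such steps over catalogues of commercially available reagents and the counts climb into the tens of billions, which is where those numbers come from.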
And of course, at the other end, you still need a model to recapitulate the proposed activity in vitro.
AI is used along with known protein structures to see which molecules fit into the target’s binding site, and how well. I asked about binding at other sites that would affect activity. Ligand-based interactions are legitimate, Tom told me. For example, GPCRs (G-protein coupled receptors) elicit different pharmacology depending on where binding occurs, but AI has more impact in structure-based screening focused on active-site binding.
The big innovation is narrowing down the possibilities to test. The traditional brute-force approach, even with AI scoring, is to screen every compound one at a time, which requires huge amounts of computing power. An AI-derived algorithm that only tests the most likely candidates can accelerate the process 1,000-fold.
“And that means that because you're accelerating the process by a hundredfold or a thousandfold, then you don't need 10,000 CPUs. But you need 12 CPUs. And then you can screen billions of compounds using, you know, average Joe’s (gaming) computer and get that done in a week. So that's really one of the aspects where AI is having a huge impact on virtual screening. It means that even for huge collections, this process is accessible to small biotechs, to everybody.”
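Here is a toy sketch of the idea behind that speed-up, my own illustration rather than Tom's actual pipeline: dock only a small random sample, train a cheap surrogate model on those scores, then spend the remaining docking budget on the compounds the surrogate ranks best. It assumes RDKit and scikit-learn, and `dock()` is a hypothetical stand-in for whatever physics-based docking tool is actually used.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestRegressor

def featurize(smiles: str) -> np.ndarray:
    """Morgan fingerprint as a fixed-length bit vector."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=1024)
    return np.array(fp)

def screen(library, dock, n_seed=1000, n_follow_up=10_000):
    """Return the compounds worth spending real docking time on."""
    rng = np.random.default_rng(0)
    seed_idx = rng.choice(len(library), size=n_seed, replace=False)

    # 1. Pay the full docking cost only for a small random seed set.
    X_seed = np.array([featurize(library[i]) for i in seed_idx])
    y_seed = np.array([dock(library[i]) for i in seed_idx])

    # 2. Train a fast surrogate that predicts docking score from structure.
    model = RandomForestRegressor(n_estimators=200, n_jobs=-1).fit(X_seed, y_seed)

    # 3. Score the rest of the library with the surrogate (cheap), then dock
    #    only the predicted best binders (lower score = better in most tools).
    seed_set = set(seed_idx.tolist())
    remaining = [i for i in range(len(library)) if i not in seed_set]
    preds = model.predict(np.array([featurize(library[i]) for i in remaining]))
    top = np.argsort(preds)[:n_follow_up]
    return [library[remaining[i]] for i in top]
```

The surrogate is wrong about plenty of individual molecules, but it only has to be good enough to decide which small fraction of the library deserves the expensive calculation, which is exactly the hundredfold-to-thousandfold saving Tom describes.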
While machine learning is working on making hits more relevant, false positives are still a challenge. Many things need to work well for a drug to be approved: safety, efficacy, solubility, etc. are all important.
We’re not making virtual medicines
So how many compounds from a screening will be tested in an actual in vitro assay? Tom says they might start with 500-1000 molecules. Then those are whittled down to 50-100.
Then they make or buy those compounds and run them in an in vitro assay.
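For a sense of what that whittling-down might look like, here is a rough sketch, again my own and not Concept Life Sciences' workflow, assuming RDKit: walk the hit list in ranked order, drop anything with poor drug-like properties, and keep only structurally distinct compounds so the assay budget isn't spent on near-duplicates.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem, Descriptors

def passes_property_filters(mol) -> bool:
    """Crude Lipinski-style cutoffs; real projects tune these per target."""
    return (Descriptors.MolWt(mol) <= 500
            and Descriptors.MolLogP(mol) <= 5
            and Descriptors.NumHDonors(mol) <= 5
            and Descriptors.NumHAcceptors(mol) <= 10)

def triage(ranked_hit_smiles, n_keep=75, similarity_cutoff=0.7):
    """Keep up to n_keep drug-like, structurally diverse hits, best-ranked first."""
    kept, kept_fps = [], []
    for smi in ranked_hit_smiles:
        mol = Chem.MolFromSmiles(smi)
        if mol is None or not passes_property_filters(mol):
            continue
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=1024)
        # Greedy diversity pick: skip anything too similar to a compound already kept.
        if any(DataStructs.TanimotoSimilarity(fp, prev) > similarity_cutoff
               for prev in kept_fps):
            continue
        kept.append(smi)
        kept_fps.append(fp)
        if len(kept) == n_keep:
            break
    return kept
```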
I’ve been curious about where we are in terms of AI-developed drugs in the pipeline. It’s still early days with respect to approvals: according to this article, as of August 2023, none had yet reached the approval stage.
One big problem, yet to be overcome, is that typically negative data are not published.
“The problem is, we have a lot of positive data points; negative data points are not necessarily as available, because we don't tend to publish negative data, even though there are some channels to do that. And the problem is, to build and test and validate a machine learning model, or any model, you need to have positive and negative data.”
There are many reasons why a tested compound doesn’t work, including the specific protocol used or human error. Still, I can’t help but wonder how much money and effort is wasted testing compounds that have already been shown to be ineffective, simply because the data were never shared.
Worse yet, there are published papers with fake data written by AI, but that is a whole other topic.
Maybe drug discovery is getting harder because we are getting to the proteins that are involved in more complex processes. But Tom points out that many targets that were thought to be undruggable have seen success. Ideally, AI will help us get there faster.
My question for all of you: Where else might AI be applied to make drug discovery more successful, improving on the 90% failure rate? And is anyone working on that?
Your deepest insights are your best branding. I’d love to help you share them. Chat with me about custom content for your life science brand. Or visit my website.