Editor Rebekah Jordan spoke to Kevin Kreutter, Ph.D., senior vice president, Drug Discovery, Empress Therapeutics, about how the company is using AI to analyse genetic data to accelerate and refine small molecule drug discovery.
Q. Could you please explain more about the process by which particular DNA sequences encode enzymes to advance drug discovery?
Those who are familiar with the central dogma of biology know that DNA encodes RNA encodes proteins. Empress adds the next logical step, that certain proteins known as enzymes create or modify chemistry. By understanding how enzymes catalyse the chemical reactions necessary to create a specific small molecule, it becomes possible to use genetic engineering and synthetic biology techniques to go from DNA sequences to compounds.
This matters because essentially every chemical compound in our body, aside from those absorbed directly from our diet, is made or modified by enzymes. And every possible disease target is in constant contact with these compounds. So, being able to identify and generate compounds based on their DNA sequences is a powerful new tool for rapidly discovering both drug leads and disease targets based on genetics.
If this sounds like how the advent of recombinant DNA technology accelerated the discovery and development of biologic drugs, that is not coincidental. Developers of biologic drugs use recombinant DNA technologies to isolate a DNA sequence that encodes a therapeutic protein of interest and engineer it into a cell to express the protein, which is then purified to become the drug substance. Empress harnesses the insight that multiple proteins/enzymes work together within cells to form specific chemical compounds, allowing us to use similar genetic engineering approaches to combine and express the enzymes that make a therapeutic small molecule. We can then optimise the pharmacological properties of that molecule, develop analogs, and scale-up its production via more traditional synthetic chemistry.
Q. How does Empress’s AI approach differ from traditional methods of analysing genetic material?
Most genetics-based drug discovery approaches focus on the ~20,000 protein-coding genes found in human cells, using comparisons between individuals or populations with disease and their healthy counterparts to identify causally important biological targets. Empress adds a new, orthogonal source of data from the trillions of microbial cells that reside within and are an evolutionarily conserved part of the human body.
To give you a sense of scale, our metagenome, comprising the combined genomes of all microorganisms found inside and on the human body, encodes over 170 million proteins. Multiplying this across the tens of thousands of individuals and the dozens of indications in Empress’ database, together with the technical challenge of discovering novel biosynthetic pathways and the compounds they encode, drives the need for computational methods that use artificial intelligence (AI) and machine learning (ML).
We draw an analogy with ChatGPT, which uses natural language processing to parse text from published literature to identify and decipher the meaning of words, understand the rules of grammar and syntax, and then reassemble that information into new text. But where ChatGPT goes from text to text, the Empress platform takes us from the code of DNA to chemistry, by way of the enzymes involved in chemical synthesis and modification. We estimate that evolution could have created and tested as many as 10²⁴ small molecules within the human body. AI helps us canvass this evolutionary-scale database to find the drug leads and targets most relevant to human health, and most likely to translate to novel medicines.
Q. How does Empress ensure the accuracy and reliability of the AI-generated interpretations?
Our AI is focused on predicting which DNA sequences encode important medicinal chemistry. The easiest way to confirm the accuracy of the predictions is to use synthetic biology and confirm that, when expressed in cells, the biosynthetic enzymes do in fact produce novel chemical compounds.
But at the end of the day, the utility of our platform lies in the pharmacology of the drugs produced; any lead compound identified by the Empress platform goes through the same rigorous testing that happens in traditional drug discovery. The compound must be synthesised and fully characterised in terms of pharmacological properties as well as for evidence of efficacy or toxicity in in vitro and in vivo models. Successful candidates then move to advanced preclinical testing and, ultimately, to clinical trials. By working with privileged starting points (compounds and targets discovered within the human body from patient data), we enter the drug discovery process feeling more confident that our lead compounds will succeed. To date, significantly higher success rates in terms of in vitro and in vivo activity, activity against compelling targets, and robust safety profiles bear out this belief.
Q. What are the key advantages of using AI to discover small molecule drugs compared to traditional methods?
Our approach to drug discovery is fundamentally different from traditional approaches, which start with a target that ideally has a strong genetic association to a particular disease and then screen that target against ever-larger libraries of compounds that may or may not be effective and that may or may not be toxic. To do this faster and more efficiently, many have turned to AI to make highly informed guesses as to what might work.
Empress, by contrast, starts with targets and compounds with strong genetic associations to disease, and those compounds already offer evidence of human compatibility. We started with the hypothesis that co-evolution already made and tested a larger compound library than humans could reasonably create. By taking advantage of the fact that programs for synthesising these compounds are written into genetic code that can now be deciphered with tools like AI and synthetic biology, we can quickly discover novel targets and drug leads.
Q. Are there any challenges associated with using AI to analyse genetic data and how would Empress overcome these?
From our perspective, the biggest challenge is that AI and computational models are often limited by the focus and quality of the datasets upon which they are built. We anticipate that our training set data will continue to grow in tandem with our ongoing platform scaling efforts, and it’s highly gratifying that we’ve already discovered multiple advanced leads with line-of-sight to the clinic to treat multiple diseases. We rely heavily on AI tools that have proven effective in certain areas, such as those able to ingest and translate language or code, since the metagenome is an ideal use case for this form of AI. Likewise, we exercise caution and skepticism with AI approaches that may be less well proven. Our approach is always rooted in data and proof: the biology data generated by our co-evolved small molecule starting points – and their analogs – are constantly being fed into our models to further improve our platform’s utility.
Q. What are the future plans for Empress’s AI-driven drug discovery platform?
Empress will continue to advance its lead programs in immune and inflammatory diseases toward the clinic and, ultimately, to patients who need novel, first-in-class, safe oral medicines. Central to that effort, we will leverage partnerships to accelerate these programs and expand the breadth of novel medicines we can deliver.
Q. How do you envision AI transforming the pharmaceutical industry in the coming years?
I envision AI transforming the pharmaceutical industry in several key ways, including:
Uncovering novel biology and drug mechanisms: At Empress, we are leveraging AI to uncover previously unknown biological pathways and therapeutic modalities, by identifying drug-like molecules directly from genetic data. This could open up entirely new avenues for drug development.
Accelerating drug discovery timelines and reducing costs: The ability of AI to rapidly process and analyse large datasets is enabling drug candidates to move from target identification to preclinical development much faster than traditional methods. Empress’ twist on this is that our AI engine allows us to “start closer to the finish line” with advanced, co-evolved molecules that need only minor tweaks to potentially be truly impactful medicines for patients.
Improving probability of clinical success: By starting with human-derived data and models that better reflect real-world biology, AI-driven drug discovery holds the promise of identifying targets and candidates with a higher inherent probability of success in clinical trials. This could help address the industry’s persistently high clinical failure rates.