AI Molecular Design in Pharmaceuticals: 10 Advances (2025)

1. Target Identification

AI accelerates the discovery of new drug targets by analyzing massive biological datasets that would overwhelm human researchers. By combing through genomics, proteomics, and other “omics” data, AI can uncover hidden patterns and identify proteins or genes that play key roles in disease processes. This data-driven approach helps pinpoint novel therapeutic targets that traditional methods might overlook, thereby broadening the opportunities for drug development. Ultimately, AI streamlines the target identification phase, giving scientists a head start in focusing on the most promising biological pathways for intervention.

AI has become a crucial tool for target discovery in practice. For example, pharmaceutical giant AstraZeneca partnered with the AI company BenevolentAI to identify new biological targets for chronic kidney disease and idiopathic pulmonary fibrosis – an effort that successfully revealed novel targets to pursue. In general, AI systems can analyze vast and complex biomedical data (e.g. multi-omics and network analyses) to propose previously unrecognized targets. Researchers report that AI-driven approaches not only speed up the target identification process but also provide deeper insights into disease networks, giving confidence that the chosen targets are biologically relevant. Such early successes demonstrate how AI is enhancing the efficiency and scope of target identification in drug discovery.

Ocana, A., Pandiella, A., Privat, C., et al. (2025). Integrating artificial intelligence in drug discovery and early drug development: a transformative approach. Biomarker Research, 13, Article 45. / Thomas, U. (2024, June 1). AI emboldens the exploration of target space. Genetic Engineering & Biotechnology News.

2. Hit Discovery

In the hit discovery stage, AI dramatically improves the efficiency of finding initial active compounds (“hits”) that interact with a target. Traditional high-throughput screening tests thousands to millions of compounds in the lab to find a few hits – a costly and time-consuming process. AI models can replace or augment this by virtually screening vast chemical libraries and predicting which molecules are likely to bind the target with desired activity. By using techniques like deep learning-based docking simulations or chemical generative models, AI filters out unlikely candidates and prioritizes a smaller set of promising hits for experimental testing. This means researchers can find viable starting compounds faster and with fewer experiments, jump-starting the drug discovery pipeline.
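As a rough illustration of the "filter and prioritize" idea, the sketch below ranks a small library by structural similarity to known actives. It uses classical Tanimoto similarity as a plainly named stand-in for a trained deep learning model, and the compound names and integer "fingerprint" features are hypothetical placeholders rather than real chemistry.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def prioritize(library: dict, known_actives: list, top_k: int) -> list:
    """Rank library compounds by their best similarity to any known active."""
    scored = [
        (max(tanimoto(fp, active) for active in known_actives), name)
        for name, fp in library.items()
    ]
    # Highest score first; ties are broken alphabetically for reproducibility.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


# Hypothetical fingerprints: each integer stands for a substructure feature.
known_actives = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5})]
library = {
    "cmpd_A": frozenset({1, 2, 3}),     # close to the first active
    "cmpd_B": frozenset({7, 8, 9}),     # shares nothing with the actives
    "cmpd_C": frozenset({2, 3, 5, 6}),  # close to the second active
    "cmpd_D": frozenset({1, 9}),
}

shortlist = prioritize(library, known_actives, top_k=2)
print(shortlist)
```

Only the shortlist would go on to experimental testing. A production pipeline would replace the similarity score with a model's predicted activity, but the ranking and cut-off step would look much the same.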

AI-driven screening has already demonstrated the ability to uncover new drug candidates much faster than conventional methods. In 2023, scientists used a machine-learning model to analyze 6,680 chemical compounds in silico and identified a new antibiotic effective against Acinetobacter baumannii, a deadly drug-resistant bacterium. The AI system screened the entire library in about 1.5 hours and proposed a shortlist of candidates, 9 of which were confirmed as potent antibiotics, including a novel compound named abaucin. Nine hits out of 6,680 compounds is only about 0.13% of the library, but the lab had to test just the AI's shortlist rather than every compound, so the hit rate among the molecules actually tested was far higher than blind screening of the whole library would have delivered. Such results highlight how AI can rapidly explore chemical space and pinpoint active molecules (in this case, an antibiotic for a “superbug”) far more efficiently than traditional brute-force screening.

Yang, M. (2023, May 25). Scientists use AI to discover new antibiotic to treat deadly superbug. The Guardian.

3. Lead Optimization

Once a promising hit is identified, AI assists in refining it into a “lead” compound with optimal properties. In lead optimization, chemists make systematic modifications to a molecule to improve its efficacy, selectivity, and drug-like characteristics (while minimizing side effects). AI greatly accelerates this trial-and-error process. Machine learning models can predict how a change in chemical structure might affect biological activity or pharmacokinetics, allowing researchers to virtually screen modifications before synthesizing them. Moreover, generative AI algorithms can suggest entirely new analogues that satisfy multiple objectives (potency, safety, stability, etc.). By guiding chemists toward the most promising modifications, AI reduces the number of compounds that need to be made and tested, cutting down the time and cost to achieve a high-quality lead candidate.
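One common way to handle several objectives at once is to keep only the analogues that are not beaten on every objective by some other analogue (the Pareto front). The sketch below assumes each analogue already has predicted scores where higher is better. The names, score scales, and values are hypothetical, and this is not a description of any specific vendor's platform.

```python
from typing import NamedTuple


class Analogue(NamedTuple):
    name: str
    potency: float    # higher is better (e.g. a predicted pIC50)
    safety: float     # higher is better (hypothetical 0-1 score)
    stability: float  # higher is better (hypothetical 0-1 score)


def dominates(a: Analogue, b: Analogue) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    a_vals, b_vals = a[1:], b[1:]
    return all(x >= y for x, y in zip(a_vals, b_vals)) and any(
        x > y for x, y in zip(a_vals, b_vals)
    )


def pareto_front(candidates: list) -> list:
    """Keep only the analogues that no other analogue dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


candidates = [
    Analogue("analogue_1", 7.2, 0.60, 0.70),
    Analogue("analogue_2", 8.1, 0.55, 0.65),
    Analogue("analogue_3", 7.0, 0.50, 0.60),  # worse than analogue_1 on everything
    Analogue("analogue_4", 6.5, 0.90, 0.80),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

Here analogue_3 is dropped because analogue_1 matches or beats it on all three scores, while the remaining three represent genuine trade-offs that a chemist would still need to weigh.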

AI-driven lead optimization has already shrunk drug design timelines from years to months in real projects. A notable example is the AI-designed drug candidate DSP-1181 (for obsessive-compulsive disorder) created by Exscientia and Sumitomo Pharma. Using AI algorithms to design and optimize this compound, the team advanced it from project start to a Phase I clinical trial in just 12 months, whereas traditional drug design typically takes around 4–5 years. Exscientia’s platform iteratively generated and evaluated hundreds of analogues in silico, rapidly homing in on a lead with the desired activity and pharmacokinetic profile. The result was a clinical candidate achieved in roughly 20–25% of the usual time. Such successes illustrate how AI can dramatically speed up lead optimization by prioritizing the most effective structural modifications and focusing experimental work on a smaller, more potent set of compounds.

Albert, H. (2022, August 16). These six biotechs are winning the race to get AI-designed drugs to the clinic. Inside Precision Medicine.

4. Prediction of Drug-like Properties

Not every active molecule is a viable drug – it must also have “drug-like” properties (adequate absorption, distribution, metabolism, excretion, and low toxicity, collectively known as ADMET). AI helps predict these pharmacokinetic and physicochemical properties early in the design process. Using models trained on large datasets of compounds, AI can estimate metrics like oral absorption potential, blood-brain barrier permeability, metabolic stability, and more from a molecule’s structure. By flagging compounds with poor solubility or likely toxicity, for example, AI allows researchers to modify or discard them before investing in costly lab tests. This proactive filtering ensures that only candidates with a favorable drug-like profile move forward. Overall, AI’s ability to forecast ADMET properties reduces late-stage failures and guides chemists to design molecules that are not only potent but also developable as medicines.
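The simplest form of this early filtering predates machine learning: Lipinski's rule of five, which flags compounds unlikely to be orally absorbed. The sketch below applies that rule to precomputed descriptors as a classical stand-in for a learned ADMET model. The compound names and descriptor values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    name: str
    mol_weight: float       # daltons
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits the compound exceeds."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_filter(d: Descriptors, max_violations: int = 1) -> bool:
    """The rule is usually read as tolerating at most one violation."""
    return lipinski_violations(d) <= max_violations


candidates = [
    Descriptors("cmpd_1", 342.4, 2.1, 2, 5),
    Descriptors("cmpd_2", 612.8, 6.3, 3, 9),  # too heavy and too lipophilic
    Descriptors("cmpd_3", 489.0, 5.4, 1, 7),  # one violation, still allowed
]

kept = [d.name for d in candidates if passes_filter(d)]
print(kept)
```

An AI-based tool would typically replace the fixed thresholds with model predictions for each ADMET endpoint, but the workflow of scoring every candidate and discarding the flagged ones before synthesis is the same.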

Predictive modeling of drug-like properties is crucial because suboptimal ADMET traits remain a significant cause of drug failure. Recent analyses of clinical development data from 2010–2017 showed that about 10–15% of drug candidates failed in trials due to poor pharmacokinetics or other “drug-like” property issues. To mitigate this, AI-driven ADMET prediction tools are now widely used in pharma. These models, often based on deep learning, can achieve impressive accuracy – for instance, a computational study reported around 75% accuracy in predicting various toxicity outcomes for drug candidates using deep neural networks. By integrating such tools, companies routinely screen out compounds with red flags (like likely liver toxicity or low bioavailability) at an early stage. This early elimination of ADMET-weak compounds focuses resources on more promising candidates and has contributed to a steep drop in pharmacokinetic-related failure rates over the past decades. AI’s growing accuracy in property prediction thus directly translates into a more efficient and successful drug design process.

Sun, D., Gao, W., & Hu, H. (2022). Why 90% of clinical drug development fails and how to improve it? Acta Pharmaceutica Sinica B, 12(7), 3049–3062. / Mostafa, F., & Chen, M. (2024). Computational models for predicting liver toxicity in the deep learning era. Frontiers in Toxicology, 5, 1340860.

5. Toxicity Prediction

Safety is paramount in drug development, and AI plays a critical role in early toxicity prediction. By learning from vast toxicological datasets (including animal studies and clinical trial outcomes), AI models can predict whether a new compound is likely to have harmful effects – for example, causing liver damage or heart arrhythmias. This allows researchers to identify and fix toxicity issues long before human testing. AI can screen virtual compounds against known toxicity targets (like the hERG channel associated with cardiac toxicity) and evaluate structural alerts for carcinogenicity or other risks. Modern AI systems even tackle complex endpoints like drug-induced liver injury by analyzing patterns across multiple biological scales. By catching these red flags early, AI-driven toxicity prediction greatly reduces the chance of late-stage failures and helps ensure that only the safest drug candidates advance.

Toxicity remains a leading reason for drug attrition, but AI is helping address this challenge. An analysis of recent clinical trial failures found that unmanageable toxicity accounted for roughly 30% of drug failures in the 2010s. To combat this, researchers employ AI models that can accurately predict many toxic effects before a drug ever reaches animal or human testing. For example, multi-task deep neural network models have achieved 70–80% accuracy in predicting various toxic outcomes (such as liver toxicity) based on a compound’s structure and known bioactivity profiles. These models are increasingly used to evaluate large libraries of compounds and eliminate those with high predicted toxicity risk. The payoff is significant: by removing likely-toxic molecules early, companies avoid costly late-stage failures and, more importantly, improve patient safety. The case of baricitinib (an AI-identified COVID-19 therapy) underscored this value – AI predicted it would have an acceptable safety profile in the new indication, which was later confirmed in trials. Overall, AI-driven toxicity prediction has become a vital tool to ensure that drug candidates hitting the clinic have been vetted for safety as thoroughly as possible in advance.

6. Synthesis Prediction

AI aids chemists in figuring out how to actually make a drug molecule by suggesting efficient synthetic routes. This area, known as retrosynthesis planning, involves working backwards from the target molecule to simpler starting materials through a series of chemical reactions. AI-powered retrosynthesis tools can instantly analyze millions of known reactions and propose step-by-step pathways to synthesize a given compound. These suggestions often include alternative routes, allowing chemists to choose the one with the cheapest or safest reagents and the fewest steps. By utilizing machine learning and expert chemical rules, AI can even find novel routes that human chemists might not consider, potentially avoiding bottlenecks or hazardous steps. The result is a faster, more cost-effective path to actually produce new drug candidates in the lab, accelerating the development process.
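The "working backwards" idea can be sketched as a recursive search: break the target into precursors, then break those down in turn until everything left is purchasable. The example below is a toy exhaustive search over a hand-written dictionary. The molecule names and disconnections are hypothetical placeholders, and real tools such as those described here rely on far larger reaction knowledge bases and smarter search strategies.

```python
# Hypothetical one-step disconnections: product -> alternative precursor sets.
TEMPLATES = {
    "target": [["intermediate_A", "reagent_X"], ["intermediate_B"]],
    "intermediate_A": [["starting_1", "starting_2"]],
    "intermediate_B": [["intermediate_C"]],
    "intermediate_C": [["starting_3"]],
}
PURCHASABLE = {"reagent_X", "starting_1", "starting_2", "starting_3"}


def plan_route(molecule, depth=0, max_depth=5):
    """Return the route with the fewest reaction steps, or None if none is found.

    A route is a list of (product, precursors) tuples ending in purchasable
    starting materials.
    """
    if molecule in PURCHASABLE:
        return []  # nothing to make
    if depth >= max_depth or molecule not in TEMPLATES:
        return None
    best = None
    for precursors in TEMPLATES[molecule]:
        steps = [(molecule, tuple(precursors))]
        for precursor in precursors:
            sub_route = plan_route(precursor, depth + 1, max_depth)
            if sub_route is None:
                steps = None  # this disconnection leads to a dead end
                break
            steps.extend(sub_route)
        if steps is not None and (best is None or len(steps) < len(best)):
            best = steps
    return best


route = plan_route("target")
for product, precursors in route:
    print(f"{product} <= {' + '.join(precursors)}")
```

In this toy case the search prefers the two-step route through intermediate_A over the three-step route through intermediate_B. Swapping step count for a cost or safety score would mirror the route comparisons described above.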

AI-driven retrosynthetic planning has proven remarkably successful in practice. In a landmark experiment, researchers used an AI system (known as Chematica, now sold as Synthia) to design synthetic routes for eight different medicinally relevant molecules. The AI’s proposed routes were then carried out in the laboratory – and 100% of them succeeded, yielding the target compounds. In each case, the computer-designed syntheses were more efficient than prior methods, often reducing the number of steps or avoiding expensive reagents. For instance, the AI found a new route to a complex drug molecule that cut the synthesis from 9 steps down to 6, a substantial improvement in efficiency. These results, published in 2018, marked the first time that AI-planned syntheses were systematically validated in the lab, demonstrating that such systems can match expert chemists and even introduce fresh ideas. Today, pharmaceutical companies use AI retrosynthesis tools to plan manufacturing routes, which can save months of experimentation and significantly lower production costs for drug candidates.

Klucznik, T., Mikulak-Klucznik, B., McCormack, M. P., et al. (2018). Efficient syntheses of diverse, medicinally relevant targets planned by computer. Chem, 4(3), 522–532.

7. Biased Library Design

Rather than testing millions of random compounds, AI enables the design of “biased” libraries enriched with molecules more likely to succeed. By learning from past drug discovery data, AI models can generate or select compounds that possess features associated with known active drugs. This focused approach means that the screening library is biased toward chemical space with higher hit probabilities. For example, AI can analyze which molecular scaffolds or properties led to hits in previous projects and then suggest new compounds following those patterns. These tailored libraries can be much smaller than brute-force collections but yield more hits and leads. In effect, AI optimizes the compound library itself, allowing researchers to put quality over quantity and dramatically improving the efficiency of discovery campaigns.

AI-tailored libraries have demonstrated significantly higher hit rates compared to traditional collections. Historically, high-throughput screening hit rates are often 1–2% or lower (only a hit or two per hundred compounds tested). Recent AI-driven efforts have blown past this benchmark. For instance, Insilico Medicine reported a virtual screening campaign with an AI-filtered library that achieved a 23% hit rate, and Schrödinger Inc. similarly claimed about a 26% hit rate using AI methods, an order of magnitude better than conventional screens. Moreover, in 2024 the startup Model Medicines published results for their AI-designed library “ChemPrint,” where 19 out of 41 tested compounds showed novel biological activity (a 46% hit rate) against various targets. These remarkable numbers, independently validated, illustrate how biased libraries curated by AI can vastly increase the likelihood of finding active compounds. By focusing on promising chemical motifs and avoiding redundant or “dead” space, AI-designed libraries save time and resources and quickly deliver viable starting points for drug programs.
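The comparison with conventional screening is simple arithmetic, sketched below. The figures are the ones quoted in this section. Taking 2% as the baseline is an assumption (the upper end of the historical range given above), so the fold improvements are conservative.

```python
def hit_rate(hits: int, tested: int) -> float:
    """Percentage of tested compounds that turned out to be hits."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * hits / tested


def fold_improvement(rate: float, baseline: float) -> float:
    """How many times higher a hit rate is than the baseline rate."""
    return rate / baseline


BASELINE = 2.0  # percent; assumed upper end of the 1–2% historical range

reported = {
    "Insilico Medicine": 23.0,
    "Schrödinger": 26.0,
    "Model Medicines": hit_rate(19, 41),  # 19 active out of 41 tested
}

for name, rate in reported.items():
    fold = fold_improvement(rate, BASELINE)
    print(f"{name}: {rate:.1f}% hit rate, about {fold:.1f}x the baseline")
```

Even against the generous 2% baseline, each reported campaign comes out at more than ten times the conventional hit rate.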

Umansky, T. (2024, September 10). AI hit rates and novelty (The AIDD Code Series – Part 1). Model Medicines.

8. Enhanced Drug Repurposing

AI is revolutionizing drug repurposing – finding new therapeutic uses for existing drugs – by intelligently mining biomedical data for hidden connections. A medication already approved for one condition might treat another if the underlying disease biology overlaps. AI can ingest vast sources (like scientific literature, clinical trial databases, genomic data) to identify drugs that affect pathways involved in a different disease. This includes analyzing gene expression signatures and patterns of side effects: an effect that is unwanted in one context, such as lowered blood pressure, may be exactly the therapeutic effect needed in another. By doing this at scale, AI uncovers repurposing opportunities far faster than human intuition. Successful repurposing is hugely valuable because the drug’s safety is already known, so development for the new indication can proceed much faster and at lower cost, as seen during urgent needs like the COVID-19 pandemic.
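At its core, this kind of matching asks which existing drugs act on proteins implicated in a disease they are not yet used for. The sketch below does that with two small lookup tables. Every drug, protein, and disease name is a hypothetical placeholder, and real systems draw on knowledge graphs with many more relationship types than simple target overlap.

```python
# All entries are hypothetical placeholders, not real pharmacology.
DRUG_TARGETS = {
    "drug_A": {"protein_1", "protein_2"},
    "drug_B": {"protein_3"},
    "drug_C": {"protein_2", "protein_4", "protein_5"},
}
DISEASE_PROTEINS = {
    "disease_X": {"protein_2", "protein_4", "protein_6"},
}


def repurposing_candidates(disease: str, approved_for: dict) -> list:
    """Rank drugs by how many disease-linked proteins they act on.

    Drugs already approved for the disease are skipped, since they would not
    be repurposing candidates.
    """
    disease_proteins = DISEASE_PROTEINS[disease]
    ranked = []
    for drug, targets in DRUG_TARGETS.items():
        if disease in approved_for.get(drug, set()):
            continue
        shared = targets & disease_proteins
        if shared:
            ranked.append((drug, sorted(shared)))
    # Most shared proteins first; ties broken by drug name.
    ranked.sort(key=lambda item: (-len(item[1]), item[0]))
    return ranked


approved_for = {"drug_A": {"disease_Y"}, "drug_C": {"disease_Z"}}
candidates = repurposing_candidates("disease_X", approved_for)
for drug, shared in candidates:
    print(f"{drug}: shares {', '.join(shared)}")
```

The output is only a list of hypotheses to investigate. Each candidate would still need mechanistic review and clinical testing in the new indication.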

One of the earliest triumphs of AI-driven drug repurposing was during the COVID-19 outbreak. In February 2020, researchers using an AI-based knowledge graph identified the rheumatoid arthritis drug baricitinib as a potential treatment for COVID-19, based on its predicted ability to dampen inflammation and to interfere with the virus's entry into cells. This AI-predicted use was published in The Lancet that month and subsequently tested in clinical trials. Remarkably, baricitinib proved effective: in a large randomized trial, adding baricitinib reduced mortality in hospitalized COVID-19 patients by 38%. By November 2020, the FDA had granted baricitinib Emergency Use Authorization for COVID-19, making it one of the first AI-predicted therapies to be clinically validated and deployed. Beyond this case, AI systems are now repurposing drugs for many diseases (for example, identifying oncology drugs that can be used in rare genetic disorders), all by connecting the dots in vast biomedical datasets. The baricitinib story stands as a proof-of-concept that AI can significantly accelerate drug repurposing, delivering life-saving treatments in record time.

Richardson, P., Griffin, I., Tucker, C., et al. (2020). Baricitinib as potential treatment for 2019-nCoV acute respiratory disease. The Lancet, 395(10223), e30–e31.

9. Personalized Medicine

AI is paving the way for truly personalized medicine – treatments tailored to the individual characteristics of each patient. By integrating data such as a patient’s genomic sequence, medical history, lifestyle, and even microbiome, AI can help determine which therapy is most likely to be effective for that specific person. In drug design, this means AI might suggest different lead compounds for different patient subgroups (for example, targeting a mutation present only in some patients’ tumors). AI can also optimize dosing regimens by predicting how an individual will metabolize a drug (pharmacogenomics). In cutting-edge applications, AI algorithms design entirely personalized drugs like neoantigen vaccines for cancer, crafted to match the unique set of mutations in a patient’s tumor. By accounting for the genetic and molecular differences between people, AI-enabled personalized medicine promises higher treatment success rates and fewer adverse reactions, moving away from the one-size-fits-all model.

The feasibility of AI-driven personalized drug design is already being demonstrated in cancer immunotherapy. In 2024, researchers at Ludwig Cancer Research unveiled an AI-powered pipeline called NeoDisc that designs personalized cancer vaccines for individual patients. The system analyzes a patient’s tumor DNA to identify “neoantigens” (unique mutated peptides) and uses AI algorithms to predict which neoantigens the patient’s T-cells will recognize. It then designs a custom vaccine composed of those specific neoantigens. This approach has moved into clinical trials in Lausanne, Switzerland, where patients are receiving vaccines uniquely tailored to the mutations in their tumors. Early results are promising, showing strong immune responses targeted to each patient’s cancer. Similarly, AI is used to personalize treatment plans – for example, in predictive models that match patients to the best cancer therapy based on their tumor profile, significantly improving response rates in trials. These advances underscore how AI enables an unprecedented level of personalization, from bespoke biologics like vaccines to individualized drug selection, heralding a new era of precision medicine.

Huber, F., Bassani-Sternberg, M., & Zhang, W. (2024). NeoDisc: A fully integrated computational pipeline for designing personalized cancer vaccines. Nature Biotechnology, 42(10), 1348–1357. / Ludwig Cancer Research. (2024, October 11). An AI-powered pipeline for personalized cancer vaccines. [Press release].

10. Automated Literature Review

The pace of scientific publishing is tremendous, and AI helps researchers keep up by automating literature reviews. Instead of manually reading through thousands of papers, scientists can leverage AI-based text mining and summarization tools. These tools use natural language processing to scan articles, abstracts, and patents for relevant information – such as experimental results, molecular structures, or clinical outcomes – and condense the findings. AI can also identify trends or contradictions across the literature, helping to form a consensus or highlight gaps in knowledge. In drug design, an automated AI literature review might quickly compile all known data on a protein target or compound class, saving researchers weeks of work. By efficiently extracting and organizing information from the ever-growing body of scientific knowledge, AI ensures that drug designers are informed by the latest research and can make data-driven decisions without being overwhelmed by information overload.
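A minimal version of such text mining is keyword retrieval: score each abstract by how prominently it features the query terms, then read the top results first. The sketch below uses a simple TF-IDF score built from the standard library. The abstracts are invented for illustration, and modern tools typically use neural language models rather than term counting.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_abstracts(query: str, abstracts: dict) -> list:
    """Return titles ordered by the summed TF-IDF weight of the query terms."""
    docs = {title: Counter(tokenize(text)) for title, text in abstracts.items()}
    n_docs = len(docs)
    query_terms = set(tokenize(query))
    scores = {}
    for title, counts in docs.items():
        total = sum(counts.values())
        score = 0.0
        for term in query_terms:
            doc_freq = sum(1 for c in docs.values() if term in c)
            if doc_freq == 0 or total == 0:
                continue
            tf = counts[term] / total
            idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0  # smoothed IDF
            score += tf * idf
        scores[title] = score
    # Best score first; ties broken by title.
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical abstracts, written for illustration only.
abstracts = {
    "paper_1": "Kinase inhibitor reduces viral entry in cell culture.",
    "paper_2": "Survey of hospital staffing levels during the pandemic.",
    "paper_3": "Viral entry depends on endocytosis; an inhibitor of endocytosis blocks entry.",
}

ranking = rank_abstracts("viral entry inhibitor", abstracts)
print(ranking)
```

The staffing survey lands at the bottom because it contains none of the query terms, which is the kind of triage that lets a researcher skip irrelevant papers without opening them.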

The scale of biomedical literature today makes manual review impractical – over 1.5 million new biomedical research articles are published each year. AI tools are stepping in to tackle this deluge. For example, during the COVID-19 pandemic, the White House and AI researchers released the CORD-19 dataset, containing over 500,000 coronavirus-related publications, specifically to spur AI-driven literature analysis. Dozens of AI systems quickly went to work on this corpus: some algorithms clustered papers by topic to map out research areas, while others used question-answering AI to instantly retrieve facts (e.g., “Which drugs have evidence of inhibiting coronavirus entry?”) from the entire dataset. This automated reviewing proved invaluable – important insights (like mechanisms of viral infection and potential drug targets) were distilled in a fraction of the time it would take humans to read the same documents. More generally, pharma companies now deploy AI literature bots that continuously monitor journals and alert teams to relevant new findings. Such AI-driven literature reviews ensure that no critical piece of knowledge is missed in drug discovery and significantly speed up the process of gathering and synthesizing scientific evidence.

Boylan-Toomey, J., Giguère, S., & Andenmatten, N. (2024). The landscape of biomedical research. Patterns, 5(8), 100982. / Wang, L. L., Lo, K., Chandrasekhar, Y., et al. (2020). CORD-19: The COVID-19 Open Research Dataset. ArXiv preprint arXiv:2004.10706.