July/August 2025
IP Litigator

Artificial Intelligence (AI) is rapidly changing drug discovery. Early reports indicate that “in Phase I trials, AI-derived molecules can have a success rate of 80–90%, which is substantially higher success rates than historic averages.”[1] Some sources predicted that AI would discover 30% of new drugs by 2025.[2]
Generally, drug discovery requires two steps. The first step is the identification of therapeutic targets related to a disease, and the second step involves designing a drug that can effectively act on those targets. AI may offer an advantage at both steps, but target identification and validation will likely be especially impacted by AI.[3]
Before we delve into patentability issues for various AI models, it would be helpful to gain a foundational understanding of AI. Deep learning is a subset of machine learning (ML). ML allows computers to learn from data by using algorithms and to perform tasks without explicit programming. Deep learning employs more complex algorithms and relies heavily on artificial neural networks to process complex data. The diagram below gives a general overview of this basic hierarchy of AI.

Deep learning AI models are currently used in drug discovery. This includes: (1) Generative Artificial Intelligence (GAI), (2) Graph Neural Networks (GNNs), (3) Transformer-Based Models, and (4) Predictive Models. This article will explore patent-related risks associated with inventions derived from these deep learning AI models and policy considerations aimed at mitigating these risks.
GAI systems are machine learning models designed to produce new data similar to the training data. Generative Adversarial Networks (GANs)[4] are a type of GAI that consists of two neural networks: a generator and a discriminator. The generator produces output learned from training data, while the discriminator evaluates the output as real or fake. This adversarial training enhances both the generator’s output quality and the discriminator’s ability to detect ineffective results over time. GAN models like MolGAN and ChemGAN generate new chemical structures while considering constraints like solubility and toxicity.
Graph Neural Networks (GNNs) are designed to analyze data structured as graphs, such as molecular structures. Graphs in GNNs feature nodes that represent atoms and edges that represent the bonds between the atoms. GNNs relay information among nodes through a system called message passing, which reflects real-life atomic dependencies. This is particularly relevant because even though each functional group predicts certain properties, it can be exponentially more complex to predict the interactions of multiple functional groups attached to the same molecule. GNN architectures such as GraphConv, often used alongside cheminformatics toolkits such as RDKit, analyze chemical libraries, predicting molecular behavior, activity, and toxicity, which is crucial for drug discovery.
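The message-passing idea described above can be illustrated with a minimal sketch. It assumes a toy three-atom graph (the heavy atoms of ethanol), uses atomic numbers as stand-in node features, and replaces the learned weights of a real GNN with a fixed average; the function names are hypothetical and are not taken from any GNN library.

```python
# Toy molecular graph: the heavy atoms of ethanol, C-C-O (hydrogens omitted).
# Node features are illustrative only: here, just the atomic number.
atoms = {0: 6.0, 1: 6.0, 2: 8.0}
bonds = [(0, 1), (1, 2)]


def neighbors(node, edges):
    """Return the nodes bonded to `node` in an undirected edge list."""
    found = []
    for a, b in edges:
        if a == node:
            found.append(b)
        elif b == node:
            found.append(a)
    return found


def message_passing_step(features, edges):
    """One round: each atom mixes its own feature with the mean of its neighbors.

    A real GNN would apply learned weights here; the fixed 50/50 mix is an
    assumption made to keep the sketch small.
    """
    updated = {}
    for node, value in features.items():
        nbrs = neighbors(node, edges)
        message = sum(features[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
        updated[node] = 0.5 * value + 0.5 * message
    return updated


after_one = message_passing_step(atoms, bonds)
after_two = message_passing_step(after_one, bonds)
```

After one round, the first carbon has only "heard" from its direct neighbor. After two rounds, its feature also reflects the oxygen two bonds away, which is how stacked rounds let information about distant functional groups reach each atom.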
Transformer Models consist of an encoder and a decoder, transforming one sequence into another. The encoder processes the input while the decoder generates the output. An attention mechanism connects them, focusing on the most relevant input segments. For example, it allows the model to summarize patents or scientific literature into a few sentences. Transformer Models like ChemBERTa and IBM RXN for Chemistry can effectively uncover patterns and predict reactions to optimize the drug discovery process.
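As a rough illustration of how an attention mechanism "focuses" on relevant input segments, the sketch below implements scaled dot-product attention, the common form used in transformer models, in plain Python. The two-dimensional vectors are made-up values, and the function name is hypothetical; production models use learned projections and many attention heads.

```python
import math


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention weights (one per input segment) and the
    weighted combination of the value vectors.
    """
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Softmax turns the scores into weights that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Output is the weight-averaged value vector.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Three input segments; the query is most similar to the first and third.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0], [20.0], [30.0]]
weights, output = attention([1.0, 0.0], keys, values)
```

The segments whose keys align with the query receive the larger weights, so they contribute more to the output than the unrelated middle segment.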
Generative Pre-trained Transformer (GPT) models combine transformer architecture with generative features to create new data. They extract information from scientific literature, generate hypotheses, and identify trends in drug discovery, highlighting their significance as a powerful tool for drug discovery. Moreover, GPTs can generate molecules that bind with a specified target.
A Predictive Model uses past data to forecast future outcomes through machine learning. Predictive Ensemble Models, a subset of these, combine outputs from multiple algorithms for more accurate predictions. Each algorithm is tailored to the type of training data. These models are often trained on chemical reactions, clinical trials, and patient records to assess drug efficacy and toxicity. Predictive models like Schrödinger Suite, Vina, and AutoDock enhance molecular docking simulations, accelerating drug development.
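The ensemble idea, combining the outputs of several algorithms into one prediction, can be sketched as a simple average. The three predictors below are hypothetical stand-ins invented for illustration (they map a molecular weight to a made-up toxicity score between 0 and 1) and do not represent any real toxicity model.

```python
from statistics import mean


# Hypothetical stand-in predictors, for illustration only.
def rule_based(mol_weight):
    """Flags anything above an arbitrary weight cutoff."""
    return 1.0 if mol_weight > 500 else 0.0


def linear(mol_weight):
    """Score grows linearly with weight, capped at 1."""
    return min(1.0, mol_weight / 1000)


def constant_prior(mol_weight):
    """Ignores the input and returns a fixed base rate."""
    return 0.25


def ensemble_predict(mol_weight, models):
    """Average the individual model scores into one ensemble score."""
    return mean(model(mol_weight) for model in models)


models = [rule_based, linear, constant_prior]
score = ensemble_predict(400, models)
```

Averaging is only one way to combine outputs; weighted voting or a second model trained on the individual predictions are common alternatives.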
In July 2024, the United States Patent and Trademark Office (USPTO) issued “a guidance update on patent subject matter eligibility to address innovation in critical and emerging technologies (ET), especially artificial intelligence (AI).”[8] According to the guidance, the USPTO will classify AI inventions as “computer-implemented inventions,” which can be patentable subject matter.[9] The USPTO also indicated that an AI-derived biomarker in a method of treatment patent may be rejected for subject matter eligibility.[10]
As of June 29, 2025, the USPTO has not provided guidance on other patentability criteria, such as novelty, non-obviousness, written description, and enablement. The subsequent sections will highlight potential patent-related risks that could arise with each model.
The USPTO issued its “Inventorship Guidance for AI-Assisted Inventions” in February 2024, announcing that AI models cannot be inventors.[11] The invention must have at least one human inventor who significantly contributed to the claimed invention’s inventive concept.[12] Moreover, merely reducing an AI’s invention to practice will not make the invention patentable.[13] A human must contribute significantly by developing “the essential building blocks” that lead to the claimed invention.[14] This guidance is consistent with recent court decisions.[15]
The guidance underscores the important role that record keeping may play in showing significant human contribution. “Prompts” are instructions, constraints, and objectives that a user gives to an AI model. Inventors will likely need to document these prompts, including a brief explanation of why the prompt was chosen, and other inputs or design considerations for the AI model being used.
A patent disclosure must adequately support the claimed invention (“written description”) and enable (“enablement”) a person of ordinary skill in the art (POSA) to make and use it. Written description and enablement concerns may arise because it is difficult to replicate the processes of an AI model. It may also prove challenging to draft a deterministic explanation of processes occurring in the “black box”[16] components of the model. Thus, to fulfill the written description requirement, an inventor may need to explain and focus the scope of claims on the processes that take place outside of this “black box,” such as the work performed on the input and output.
Yet one of the defining characteristics of AI models is precisely their stochastic nature: the introduction of some randomness or uncertainty in the black box processing, approximating human cognition, which is absent from programs that predictably yield the same output in response to the same input. Just as you would expect a hundred students to answer the same essay prompt in a hundred different ways, modern AI possesses the unprecedented ability to generate various satisfactory outputs in response to the same input. Such a defining feature may be incompatible with a restrictive interpretation of enablement. If enablement requires recreating the AI’s output, the inventor might attempt to control specific parameters to increase the reproducibility of an output. Most AI models are programmed for variation, which means that two people could theoretically insert the same input into the same model and obtain different results.[17] To minimize output variations, the inventor must document the exact prompts and current version of the model.[18] However, entering the exact prompt may not be enough to guarantee that a POSA will generate the same output as the inventor. To improve output reproducibility, inventors may also need to control the model’s sampling parameters or settings to control the variability of the output.
Generative models have different methods for choosing the next word in a sentence based on the previous words. To ensure coherence, the model selects a word from a subset of words that are likely to be used in association with the preceding word. The variable that represents the number of words in the pool from which a selection is made is referred to as the “k-value.” For example, if the model is choosing 1 word from a pool of the 100 most frequently associated words, then the k-value is 100 (meaning that the model could produce 100 different pairings at this juncture); but if the model is set to choose 1 word from a pool of the 10 most frequently associated words, then the k-value is 10 (and there are 10 possible pairings). Thus, the lower the k-value, the lower the variability in outputs, and the higher the k-value, the higher the variability. Alternatively, some models rely on a “p-value” to select word pairings. Rather than fixing the size of the pool, p-value processing sets a cumulative probability threshold. For example, if the p-value is set to 0.9, the model will consider the smallest set of most likely words whose combined probability reaches 90% and select from this pool.
In some generative models, a low top-k, top-p, and temperature setting may enhance output reproducibility.[19] Top-k changes how the model chooses the data it will generate as output by selecting a fixed number of the most probable choices or “the top k” and only redistributing probability between those choices, eliminating the rest.[20] Top-k algorithms have been developed for telemedicine diagnostic systems to produce diagnostic accuracy levels as high as 99.81%, while increasing the exclusion rate of severe pathologies.[21]
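A minimal sketch of top-k filtering follows, assuming a made-up next-word probability table. It keeps the k most probable candidates, discards the rest, and redistributes probability among the survivors; the function name and the example words are hypothetical.

```python
def top_k_filter(probs, k):
    """Keep the k most probable candidates and renormalize them to sum to 1."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = ranked[:k]
    total = sum(p for _, p in kept)
    return {word: p / total for word, p in kept}


# Illustrative probabilities for the word that follows "the compound ...".
next_word_probs = {
    "binds": 0.5,
    "inhibits": 0.25,
    "activates": 0.125,
    "degrades": 0.125,
}

pool = top_k_filter(next_word_probs, k=2)
```

With k set to 2, only "binds" and "inhibits" remain, with their probabilities rescaled to two-thirds and one-third. A smaller k leaves fewer possible continuations and therefore less variation between runs.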
In natural language processing (NLP) models[22], top-p concentrates on a small subset of high-probability words, referred to as the “nucleus,” to produce coherent and contextually appropriate text.[23] Instead of using a fixed number of top candidates (top-k), the top-p setting dynamically selects from the top portion of the probability mass (referred to as top-p).[24] This allows for a more flexible method of narrowing down the choices when generating output.
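Top-p (nucleus) filtering can be sketched in the same style, again with a made-up probability table and a hypothetical function name. Candidates are added in order of probability until their cumulative probability reaches the threshold p, so the size of the pool varies with the shape of the distribution.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top candidates whose cumulative probability
    reaches p, then renormalize that set to sum to 1."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    nucleus = []
    cumulative = 0.0
    for word, prob in ranked:
        nucleus.append((word, prob))
        cumulative += prob
        if cumulative >= p:
            break
    return {word: prob / cumulative for word, prob in nucleus}


# Illustrative probabilities for the next word.
next_word_probs = {
    "binds": 0.5,
    "inhibits": 0.25,
    "activates": 0.125,
    "degrades": 0.125,
}

narrow = top_p_filter(next_word_probs, p=0.75)
wide = top_p_filter(next_word_probs, p=0.9)
```

Here a threshold of 0.75 is met by the two most likely words, while a threshold of 0.9 requires all four, which shows how a lower top-p narrows the pool.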
The temperature setting adjusts the probability distribution of generated outputs and alters how the model chooses its output based on each potential output’s “logits” (a type of score that indicates the strength of each option).[25] The temperature parameter also affects the “softmax” function, which converts logits into probabilities.[26] When the temperature is set between 0 and 1, it makes the sampling lean towards options with higher probabilities.[27] Thus, when the temperature setting is low, the output is more focused and predictable.[28]
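The effect of temperature on the softmax function can be shown with a short sketch. The logits are made-up scores, the function name is hypothetical, and the temperature is assumed to be greater than zero, since the function divides by it.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature.

    Assumes temperature > 0. Lower values sharpen the distribution toward
    the highest-scoring option; higher values flatten it.
    """
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # illustrative scores for three candidate outputs
sharp = softmax_with_temperature(logits, 0.5)
neutral = softmax_with_temperature(logits, 1.0)
flat = softmax_with_temperature(logits, 2.0)
```

At a temperature of 0.5 the top option takes a larger share of the probability than it does at 1.0, and at 2.0 the three options move closer together, so low temperatures make the sampled output more predictable.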
By fine-tuning these settings, a user can achieve more consistent and reliable results across different applications. Consistent and reliable outputs may prove useful when attempting to satisfy written description and enablement requirements for AI-assisted inventions.
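One way to picture the documentation practice discussed above is a small run record that stores the prompt, the model version, and the sampling settings together. Every field name and value below is hypothetical, and the random seed is an added assumption that the article does not mention. The toy sampler only shows that a fixed seed makes a simple random draw repeatable; hosted AI models may not guarantee identical outputs even when a seed is supplied.

```python
import json
import random


def sample_candidates(candidates, weights, n, seed):
    """Draw n candidates at random; the same seed reproduces the same draw."""
    rng = random.Random(seed)
    return rng.choices(candidates, weights=weights, k=n)


# Hypothetical record of one AI-assisted run, kept alongside lab notes.
run_record = {
    "prompt": "Propose analogs of compound X with improved solubility",
    "prompt_rationale": "Lead compound showed poor aqueous solubility",
    "model_version": "example-model-1.0",
    "temperature": 0.2,
    "top_k": 10,
    "top_p": 0.9,
    "seed": 42,
}

candidates = ["analog_A", "analog_B", "analog_C"]
weights = [0.5, 0.3, 0.2]

first_run = sample_candidates(candidates, weights, n=5, seed=run_record["seed"])
second_run = sample_candidates(candidates, weights, n=5, seed=run_record["seed"])

# Serialize the record so it can be archived with the run's output.
record_text = json.dumps(run_record, sort_keys=True)
```

Because both runs use the recorded seed, `first_run` and `second_run` are identical, and the serialized record preserves the settings a later reader would need to attempt the same run.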
AI models train on large datasets, which often include a wide variety of publicly available information that would qualify as prior art under 35 U.S.C. § 102(a). Any reference that shows that a patent application’s claimed invention was already “patented, described in a printed publication, or in public use, on sale, or otherwise available to the public,” more than a year before the effective filing date of a patent application, qualifies as prior art.[29] Public accessibility means a hypothetical POSA must be able to access the reference with reasonable diligence.[30] Typically, when a patent examiner cites a reference from web databases, the reference is considered accessible to the POSA.[31] By contrast, licensed proprietary data may be more difficult to locate than training data derived from web scraping.
Unlike traditional prior art, which may consist of a patented invention or publications, training data comprises vast amounts of information. Training data can also change over time as models are updated or as new data becomes available. The public or private nature of training data creates serious questions as to whether all training data can be accessed by a POSA with due diligence and thus qualify as prior art.
Unlike other models where the invention may be focused on new uses or targets for known compounds, GAIs can create structures and chemical formulas that are not found in nature.[32] This ability to create new molecules may serve as strong evidence for the novelty of GAI-generated molecules. However, obviousness concerns may arise if the molecules they generate are obvious variations of known molecular structures, or of the structures in the data set. This is notable because close structural similarity can lead to a presumption of obviousness.[33]
Obviousness may require a motivation to modify a compound and a reasonable expectation of success without undue experimentation. Even with a motivation to modify a compound for a specific purpose, slim expectations of success and a necessity for excessive experimentation may lead to a conclusion of non-obviousness.[34] As GAIs become more efficient, less experimentation may be required, which may place more attention on the inventor’s motivations to modify a compound. Because GAIs can quickly identify unexpected targets for drug compounds that might not be explored otherwise, emphasis should be placed on proving that a POSA would lack the motivation to select a compound for an unlikely target or condition.
Transformer models can help detect new uses for known drug compounds. However, new uses of a known drug compound cannot lead to the patentability of claims reciting the compound (rather than a method for using it).[35]
The invention may be restricted to new uses of the drug compound, including innovative treatment methods for previously unsuspected indications or conditions.
Although Graph Neural Networks (GNNs) may not be predictive in the traditional sense, they often serve predictive functions, such as predicting potential drug interactions. Consequently, GNNs and predictive models might face similar patentability challenges.
Some courts have ruled that predictive models are unpatentable abstract ideas,[36] unless the abstract idea is integrated into a practical application.[37] If a claim is merely describing a mathematical model rather than a specific practical drug discovery application, it could be rejected as an abstract idea. For example, if a GNN finds an adverse drug-drug interaction (DDI), the claim should recite a method of treating a disease wherein use of the adverse drug is discontinued or altered. In this scenario, claims that are merely directed to the identification of the DDI with a GNN model will likely be rejected as abstract ideas.
Predictive models may raise novelty concerns if the claimed invention is a known compound. Like transformer models, these inventions may also be restricted to novel applications of known compounds. Challengers may argue that an AI’s prediction made it obvious to select a drug molecule for clinical trial.
Further, GNNs can also effectively predict new synergistic drug combinations.[38] If an individual drug or the combination has been disclosed in the prior art, the applicant may receive a novelty or obviousness rejection even if the synergistic effect was unknown. To overcome such a rejection, it may be helpful to highlight that the discovery occurred through an AI-driven approach that was validated experimentally, to show that a POSA could not have simply combined a few references to produce the claimed synergistic effects.
The evidentiary role of prompts in future patent disputes over AI-assisted inventions is unclear. Case law will likely provide clarity. In the meantime, it is advisable to proactively establish policies that address potential issues. AI models should be viewed as computerized lab assistants that help to execute the inventor’s ideas. The prompts, or the instructions that the inventor gives to the model, should be clear, precise, and contextual, reflecting the inventor’s innovative thoughts.[39] Prompts guide the AI model to achieve the inventor’s desired results. They should also contain domain-specific knowledge from the scientist to help the AI model achieve more focused results, and to establish human contribution that is distinguished from the AI’s automated processes. Iterative cycles of prompts should guide the AI model to a refined output by defining words the AI model may have misunderstood, providing real-time feedback on the quality of the output, and adding context where needed.
The model’s type may also influence prompt design: transformer-based models use structured data like molecular descriptors, while generative models may require clear numerical or domain-specific parameters. Out of an abundance of caution, when working with models that process natural language, like GPTs, it is advisable to avoid giving the AI inventive roles, such as “acting as an inventor” or “acting as a scientist.” Scientists may benefit from prompt engineering training focusing on patentability.
A data scientist may help prepare the data and shape prompts to optimize the AI’s performance and output results. In litigation, the data scientist may also help with strategies to retrieve or provide useful evidence from AI models during discovery. The USPTO has stated that a person “who designs, builds, or trains an AI system in view of a specific problem to elicit a particular solution could be an inventor.”[40] Therefore, when data scientists contribute significantly to fine-tuning the AI model or prompts, those contributions may be inventive.
Although our focus is on patent-related risks, these risks extend to other forms of IP. For example, companies may lose trade secrets protection if proprietary information regarding the models, training methods, or data used to generate molecules is shared with a third party. Employers should clearly define the data scientists’ contribution and secure a contractual obligation to assign all rights to any IP produced during employment as the organization’s IP. If a developer offers the services of a data scientist, the service agreement must include safeguards to mitigate these IP risks. Additionally, confidentiality and non-disclosure agreements will be crucial to avoid any trade secret disputes.
An effective AI policy must reliably and systematically document prompts, curate training data, and define the role of the data scientist. Generally, the policy should also include the key components below:
As AI models become faster and more effective, they will transform drug discovery. Companies will need a robust AI policy to capture and preserve the value of their innovation. Policies must be established to guide AI use, ensuring it serves as a supportive tool and extends an inventor’s capabilities to achieve practical results.
United States Patent and Trademark Office (USPTO), inventorship, Person of Ordinary Skill in the Art (PHOSITA), Written description (35 USC § 112), AI + Patent
Originally printed in the July/August 2025 edition of the IP Litigator. This article is for informational purposes, is not intended to constitute legal advice, and may be considered advertising under applicable state laws. This article is only the opinion of the authors and is not attributable to Finnegan, Henderson, Farabow, Garrett & Dunner, LLP, or the firm’s clients.