Binding Affinity Prediction: Methods, AI, and the AQAffinity Approach

Binding affinity is the measure of how strongly a drug candidate sticks to its protein target. It is one of the earliest and most consequential signals in drug discovery: a compound that does not bind with sufficient potency will not produce a therapeutic effect, regardless of how well it performs on other dimensions. Predicting that affinity computationally — before synthesis, before assays, before any wet-lab work — is one of the field’s central challenges, and one where significant progress has been made in recent years. AQAffinity, launched by SandboxAQ in January 2026, addresses a specific and persistent bottleneck in that prediction problem: the requirement for experimentally determined protein structures.

What binding affinity is and why it drives compound selection

When a drug molecule and a protein target interact, they form a complex held together by non-covalent forces: hydrogen bonds, hydrophobic interactions, electrostatics, and van der Waals contacts. The strength of that complex is binding affinity: how tightly the drug holds on, and how readily it lets go. Higher affinity means the drug achieves its effect at a lower concentration, which generally widens the therapeutic window and reduces the off-target risks that come with higher doses.

Experimentally, affinity is measured as a dissociation constant (Kd) or an inhibition constant (Ki), or approximated by the half-maximal inhibitory concentration, IC50. All three are expressed as concentrations, and lower values indicate stronger binding or higher potency. A compound with an IC50 in the nanomolar range is generally considered a strong binder; micromolar-range compounds are weaker and may require substantial optimization before they are useful.
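To make the nanomolar versus micromolar distinction concrete, the sketch below converts a dissociation constant into a standard binding free energy using the textbook relation dG = RT ln(Kd / 1 M), and into the negative-log scale (pKd, pKi, pIC50) that affinity datasets commonly use. The function names and the default temperature of 298.15 K are illustrative choices, not taken from the article.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_to_delta_g(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol from Kd in mol/L.

    Uses dG = RT ln(Kd / 1 M); more negative means tighter binding.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration")
    return R_KCAL * temperature_k * math.log(kd_molar)


def to_p_scale(concentration_molar: float) -> float:
    """Negative log10 of a molar concentration (pKd, pKi, or pIC50)."""
    if concentration_molar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(concentration_molar)


# Compare a nanomolar binder with a micromolar binder.
for label, kd in [("1 nM", 1e-9), ("1 uM", 1e-6)]:
    print(f"{label}: dG = {kd_to_delta_g(kd):.2f} kcal/mol, pKd = {to_p_scale(kd):.1f}")
```

At room temperature each tenfold improvement in Kd is worth roughly 1.4 kcal/mol, so the gap between a micromolar and a nanomolar binder is about 4 kcal/mol.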

High binding affinity is necessary but not sufficient. A compound that binds tightly to its intended target but equally to several dozen others creates selectivity problems. A compound with excellent potency but poor solubility or rapid metabolic clearance will fail on ADMET grounds. And a compound that requires a twelve-step synthesis from exotic starting materials may never reach clinical scale. Binding affinity is the first gate — the question that must be answered before the others are worth asking.

Predicting it computationally rather than measuring it experimentally matters because experimental screening is slow and expensive. A high-throughput assay run against a target can test tens of thousands of compounds over days. A virtual screen informed by affinity prediction can triage tens of millions of candidates in the same window, directing experimental resources toward the subset most likely to matter.
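The triage step described above can be sketched as a simple top-k selection over predicted scores. The compound identifiers and predicted pIC50 values below are made up for illustration, and `triage` is a hypothetical helper, not part of any particular screening tool.

```python
import heapq
from typing import Iterable, List, Tuple


def triage(scored: Iterable[Tuple[str, float]], keep: int) -> List[Tuple[str, float]]:
    """Return the `keep` compounds with the highest predicted pIC50.

    heapq.nlargest consumes the input as a stream and holds only `keep`
    items at a time, so it also works on libraries too large for memory.
    """
    return heapq.nlargest(keep, scored, key=lambda pair: pair[1])


# Hypothetical (compound id, predicted pIC50) pairs.
library = [
    ("cmpd-A", 5.2),
    ("cmpd-B", 8.1),
    ("cmpd-C", 6.7),
    ("cmpd-D", 7.4),
    ("cmpd-E", 4.9),
]

shortlist = triage(library, keep=2)
print(shortlist)  # [('cmpd-B', 8.1), ('cmpd-D', 7.4)]
```

In practice the shortlist size is set by assay capacity rather than by a score threshold, which is why a fixed `keep` is used here.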

The methods spectrum — speed vs accuracy

Computational affinity prediction is not a single method. It is a spectrum of approaches that trade accuracy for speed in different ways, and the right choice depends on where in the pipeline it is being applied.

Traditional scoring functions — the kind built into most molecular docking tools — estimate binding affinity using simplified physical models: force field terms, empirical relationships fit to training data, or knowledge-based potentials extracted from structural databases. They are fast enough to run on millions of compounds but approximate enough that their rankings require experimental confirmation. They have been the workhorse of virtual screening for decades, and they remain useful for their speed, but their accuracy ceilings are well established.
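As a rough illustration of the empirical flavor of scoring function, the sketch below sums a few weighted interaction terms for a docked pose. The feature set and the weights are invented for this example; real docking tools use more terms and fit their weights to experimental data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoseFeatures:
    """Simple descriptors of one docked pose (illustrative only)."""
    hydrogen_bonds: int
    hydrophobic_contact_area: float  # in square angstroms
    rotatable_bonds: int


# Hypothetical weights in arbitrary score units; negative terms favor binding.
WEIGHTS = {"hbond": -0.5, "hydrophobic": -0.01, "rotor": 0.3}


def empirical_score(features: PoseFeatures) -> float:
    """Weighted linear sum of interaction terms; lower means better."""
    return (
        WEIGHTS["hbond"] * features.hydrogen_bonds
        + WEIGHTS["hydrophobic"] * features.hydrophobic_contact_area
        + WEIGHTS["rotor"] * features.rotatable_bonds  # flexibility penalty
    )


pose = PoseFeatures(hydrogen_bonds=3, hydrophobic_contact_area=250.0, rotatable_bonds=4)
print(f"score = {empirical_score(pose):.2f}")
```

The whole evaluation is a handful of multiplications per pose, which is where the speed comes from. The accuracy ceiling comes from the same place: a linear sum cannot capture effects such as solvation or protein flexibility.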

Machine learning scoring functions improved on these baselines by learning from experimental binding data rather than hand-coded physical rules. Graph neural networks, transformer architectures, and other deep learning approaches have raised the performance ceiling on standard benchmarks. But a pattern has emerged in prospective validation — the kind that matters for real drug programs. A 2025 study published in PNAS found that ML models “can fail unpredictably when applied to novel targets unseen during training,” and that this failure stems from models developing biases toward structural patterns prevalent in the training data rather than learning the underlying physicochemical principles. Real-world applications of ML scoring functions remain more limited than benchmark results suggest.

Free energy perturbation — FEP — sits at the other end of the spectrum. It calculates binding affinity from thermodynamic first principles, simulating the molecular system with sufficient rigor to produce results that closely match experimental measurements. It is the gold standard for accuracy in lead optimization, where the differences between candidate compounds are small and the decisions are high-stakes. The cost of that accuracy is computational: a high-quality FEP calculation can take up to a day per compound even on a modern GPU. At screening scale, across millions of candidates, FEP is not deployable. For a broader overview of where these methods fit across the discovery pipeline, see the article on computational drug discovery.
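One textbook identity behind FEP is exponential averaging (the Zwanzig relation), dF = -kT ln <exp(-dU / kT)>, where dU is the energy difference between two states sampled from simulations of the first. The sketch below applies it to a list of energy differences; the sample values are invented, and production FEP workflows typically use many intermediate states and more robust estimators than this single-step form.

```python
import math
from typing import Sequence

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def zwanzig_delta_f(delta_u: Sequence[float], temperature_k: float = 298.15) -> float:
    """Free energy difference (kcal/mol) from sampled energy differences.

    Implements dF = -kT ln(mean(exp(-dU / kT))), subtracting the largest
    exponent before exponentiating to keep the sum numerically stable.
    """
    if not delta_u:
        raise ValueError("need at least one sample")
    kt = R_KCAL * temperature_k
    exponents = [-du / kt for du in delta_u]
    shift = max(exponents)
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(exponents)
    return -kt * (shift + math.log(mean_exp))


# Invented energy differences (kcal/mol) standing in for simulation output.
samples = [0.8, 1.1, 0.5, 1.4, 0.9]
print(f"dF = {zwanzig_delta_f(samples):.3f} kcal/mol")
```

Because low-energy samples dominate the exponential average, the estimate is always at or below the plain mean of the samples. It also converges slowly when the two states overlap poorly, which is one reason FEP needs long simulations.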

The structure requirement bottleneck

Most computational affinity methods — including FEP, most ML scoring functions, and structure-based virtual screening — require a three-dimensional structure of the protein target as input. These structures are determined experimentally using X-ray crystallography or cryo-electron microscopy, and they are expensive and slow to obtain. More importantly, they exist for only a fraction of biologically relevant proteins. Targets that are poorly characterized structurally, that are disordered in regions relevant to drug binding, or that simply have not yet been studied with sufficient resources are inaccessible to structure-dependent methods regardless of how good those methods are.

This creates a hard constraint on which discovery programs can benefit from computational affinity prediction. For well-characterized targets with solved structures, the full toolkit is available. For novel targets, emerging disease mechanisms, or historically undruggable proteins, which are often the highest-value scientific problems, structure-dependent affinity prediction has not been applicable at the stages where it would help most.

AQAffinity — structure-free binding affinity prediction

AQAffinity is SandboxAQ’s open-source binding affinity prediction model, built on top of the OpenFold Consortium’s OpenFold3 co-folding framework and launched in January 2026. Its defining characteristic is that it requires no experimentally determined protein structure: it predicts binding affinity directly from protein sequence and SMILES — the molecular string representation of a drug candidate — as inputs.
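The two inputs named above, a protein sequence and a ligand SMILES string, are both plain text. The sketch below shows one way such a query might be packaged and sanity-checked before being sent to a model. `AffinityQuery` is a hypothetical container written for this article; it is not part of any published AQAffinity interface, the sequence is an arbitrary illustrative fragment, and the SMILES check is a light sanity test rather than a real parser.

```python
from dataclasses import dataclass

STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


@dataclass(frozen=True)
class AffinityQuery:
    """Hypothetical sequence-plus-SMILES input pair (illustration only)."""
    protein_sequence: str  # one-letter amino acid codes
    ligand_smiles: str     # SMILES string for the small molecule

    def __post_init__(self) -> None:
        unknown = set(self.protein_sequence.upper()) - STANDARD_AMINO_ACIDS
        if not self.protein_sequence or unknown:
            raise ValueError(f"invalid protein sequence, unexpected symbols: {sorted(unknown)}")
        smiles = self.ligand_smiles
        if not smiles or smiles.count("(") != smiles.count(")"):
            raise ValueError("SMILES is empty or has unbalanced parentheses")


# Aspirin as the ligand, paired with an arbitrary ten-residue fragment.
query = AffinityQuery(
    protein_sequence="MKTAYIAKQR",
    ligand_smiles="CC(=O)Oc1ccccc1C(=O)O",
)
print(len(query.protein_sequence), query.ligand_smiles)
```

Nothing in this input requires coordinates, a binding-site definition, or a docked pose, which is the practical meaning of structure-free.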

The practical effect is that AQAffinity can be applied to any target for which a protein sequence is known, regardless of whether a solved structure exists. That opens affinity prediction to the class of targets that structure-dependent methods cannot reach. As the AQAffinity team describes it, this allows researchers to “fail fast” on low-probability candidates before committing to expensive wet-lab cycles — particularly useful for de-risking less structurally characterized proteins before synthesis begins.

In terms of performance, AQAffinity is approximately 1,000 times faster than FEP-based methods, an estimate drawn from published benchmarks for Boltz2, the model on which AQAffinity’s architecture is based. That speed differential makes the model practical at virtual screening scale, where libraries are too large to screen experimentally.
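A back-of-the-envelope calculation shows what a 1,000-fold speedup means at screening scale. The sketch below takes the upper bound quoted earlier for FEP (up to one day per compound on a GPU) and divides it by 1,000. The library size of one million compounds and the pool of 100 GPUs are assumptions chosen for illustration.

```python
SECONDS_PER_DAY = 86_400


def wall_clock_days(n_compounds: int, seconds_per_compound: float, n_gpus: int) -> float:
    """Days needed to score a library, assuming perfect parallelism across GPUs."""
    return n_compounds * seconds_per_compound / n_gpus / SECONDS_PER_DAY


fep_seconds = float(SECONDS_PER_DAY)  # upper bound: one day per compound
fast_seconds = fep_seconds / 1_000    # approximately 1,000x faster

library_size = 1_000_000  # assumed screening library
gpus = 100                # assumed GPU pool

fep_days = wall_clock_days(library_size, fep_seconds, gpus)
fast_days = wall_clock_days(library_size, fast_seconds, gpus)
print(f"FEP: {fep_days:,.0f} days; 1,000x faster model: {fast_days:,.0f} days")
```

Under these assumptions the FEP run would take about 10,000 days (more than 27 years), while the faster model finishes in about 10 days.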

The model is trained in part on SAIR, the Structurally Augmented IC50 Repository: SandboxAQ’s own dataset of 1,048,857 unique protein-ligand pairs and 5.2 million cofolded 3D structures, curated from ChEMBL and BindingDB. SAIR is the largest publicly available binding affinity dataset with cofolded structures, and it is released openly under CC BY 4.0 for training and benchmarking. Releasing the training data is part of AQAffinity’s design: full transparency in training data, model architecture, and methods, so researchers can benchmark the model against their own data, fine-tune it for specific targets, and integrate it without vendor lock-in. The model itself is available on Hugging Face under the Apache 2.0 license.

AQAffinity sits within SandboxAQ’s broader AQBioSim platform for AI-driven drug discovery, which spans the full pipeline from hit identification through clinical candidate selection.

The generalization challenge — what no method fully solves yet

AQAffinity addresses the structure requirement bottleneck directly. But it is worth being precise about what it solves and what remains hard.

The generalizability problem in binding affinity prediction is not primarily about structures — it is about training data. When a model encounters a protein target or chemical series that is distant from its training distribution, performance degrades. This is true for structure-based ML models, and it applies to sequence-based models as well when the target is genuinely novel. Removing the structure requirement expands the set of targets that can be addressed, but it does not eliminate the underlying challenge of predicting in unexplored regions of biological and chemical space.
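One common way to measure this kind of degradation is to hold out entire protein targets at evaluation time, so that no test target was seen during training. The sketch below shows such a target-grouped split using only the standard library. The record layout, the target identifiers, and the potency values are invented for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Dict[str, object]


def split_by_target(
    records: List[Record], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Record], List[Record]]:
    """Split records so that no protein target appears in both sets."""
    by_target: Dict[str, List[Record]] = defaultdict(list)
    for rec in records:
        by_target[str(rec["target"])].append(rec)

    targets = sorted(by_target)            # sort first for reproducibility
    random.Random(seed).shuffle(targets)
    n_test = max(1, round(len(targets) * test_fraction))

    test = [r for t in targets[:n_test] for r in by_target[t]]
    train = [r for t in targets[n_test:] for r in by_target[t]]
    return train, test


# Twenty made-up measurements spread over five hypothetical targets.
records = [
    {"target": f"T{i % 5}", "compound": f"cmpd-{i}", "pic50": 5.0 + 0.1 * i}
    for i in range(20)
]

train, test = split_by_target(records)
print(len(train), "training records;", len(test), "held-out records")
```

A random split of the same records would scatter every target across both sets, which tends to flatter a model. The gap between the two evaluation schemes is a rough indicator of how much performance depends on having seen the target before.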

Where affinity prediction from sequences reaches its limits, physics-based methods can extend further. SandboxAQ’s AQFEP — the absolute free energy perturbation approach within AQBioSim — computes binding affinities from thermodynamic first principles, without a reference molecule requirement, and has been validated on targets in neurodegeneration and oncology that were historically considered undruggable. The two approaches serve different roles: AQAffinity for speed and scale at the screening stage, AQFEP for accuracy at lead optimization on the most challenging targets. SandboxAQ’s Large Quantitative Models provide the physics-grounded foundation for that high-accuracy work.

FAQ

What is binding affinity in drug discovery?

Binding affinity is the strength of the interaction between a drug candidate and its protein target. It measures how tightly the drug molecule binds — and therefore how likely it is to produce a therapeutic effect at a given concentration. It is one of the primary criteria used to select and prioritize compounds during hit identification and lead optimization.

What is IC50 and how does it relate to binding affinity?

IC50 is the half-maximal inhibitory concentration: the concentration of a compound required to inhibit a biological target by 50% under defined assay conditions. It is a practical measure of potency that correlates with binding affinity. Lower IC50 values indicate stronger binding and higher potency. It differs from the dissociation constant (Kd) and the inhibition constant (Ki) in that it depends on assay conditions such as substrate concentration, but for most drug discovery purposes it serves as the standard experimental measure of binding.
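The assay dependence can be made concrete with the Cheng-Prusoff relation, Ki = IC50 / (1 + [S] / Km), which applies to competitive inhibitors of an enzyme that follows Michaelis-Menten kinetics. The sketch below is a minimal implementation under that assumption; the numbers in the example are illustrative, not measured values.

```python
def cheng_prusoff_ki(ic50: float, substrate_conc: float, km: float) -> float:
    """Convert IC50 to Ki for a competitive inhibitor.

    ic50 and the returned Ki share one unit; substrate_conc and km must
    share a unit with each other.
    """
    if ic50 <= 0 or km <= 0 or substrate_conc < 0:
        raise ValueError("ic50 and km must be positive; substrate_conc must be non-negative")
    return ic50 / (1.0 + substrate_conc / km)


# The same measured IC50 of 100 nM implies a different Ki in each assay setup.
for ratio in (0.0, 1.0, 9.0):
    ki = cheng_prusoff_ki(ic50=100.0, substrate_conc=ratio, km=1.0)
    print(f"[S]/Km = {ratio:>3}: Ki = {ki:.0f} nM")
```

With no substrate present Ki equals IC50, while at a substrate concentration nine times Km the same IC50 corresponds to a Ki ten times lower. This is why IC50 values from different assays are not directly comparable.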

How is binding affinity predicted computationally?

The main approaches span a spectrum from fast-but-approximate to slow-but-accurate. Traditional scoring functions built into docking tools are the fastest. Machine learning scoring functions improve on these using experimental training data but can fail to generalize to novel targets. Free energy perturbation (FEP) is the most accurate, using thermodynamic simulation, but is too computationally expensive for screening-scale use. Structure-free models like AQAffinity operate from sequence inputs without requiring experimental protein structures, enabling faster triage across larger compound libraries.

What is free energy perturbation (FEP)?

Free energy perturbation is a physics-based computational method that calculates the difference in binding free energy between two molecular states — for example, how much more or less tightly a modified compound binds compared to a reference. It is the most accurate computational method for binding affinity prediction, producing results that closely match experimental measurements. Its limitation is computational cost: a high-quality FEP calculation can take up to a day per compound on a GPU, making it impractical for screening large compound libraries.
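For the relative case described above, where compound B is a modification of reference compound A, the comparison is usually written as a thermodynamic cycle. The notation below is a standard textbook form rather than anything specific to a particular FEP package: the transformation from A to B is simulated once in the protein complex and once in solvent, and the difference between the two gives the change in binding free energy.

```latex
\Delta\Delta G_{\text{bind}}(A \to B)
  = \Delta G_{\text{bind}}(B) - \Delta G_{\text{bind}}(A)
  = \Delta G_{\text{complex}}(A \to B) - \Delta G_{\text{solvent}}(A \to B)
```

A negative value means the modified compound B binds more tightly than the reference A.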

What is AQAffinity?

AQAffinity is SandboxAQ’s open-source binding affinity prediction model, built on OpenFold3 and launched in January 2026. It predicts protein-ligand binding affinity directly from protein sequence and SMILES inputs, without requiring an experimentally determined protein structure. It runs approximately 1,000 times faster than FEP-based methods (based on published Boltz2 benchmarks), making it practical for virtual screening at scale. It is available on Hugging Face under the Apache 2.0 license.

What is the SAIR dataset?

SAIR, the Structurally Augmented IC50 Repository, is SandboxAQ’s publicly released binding affinity dataset. It contains 1,048,857 unique protein-ligand pairs and 5.2 million cofolded 3D structures paired with experimental potency measurements, making it the largest publicly available dataset of its kind. It is available under the CC BY 4.0 license on Hugging Face and Google Cloud, free for commercial and non-commercial use. The SAIR dataset was created to fill the training data gap that limits most AI binding affinity models.
