SandboxAQ Publishes Spin-Aware Catalysis AI in npj Computational Materials; CEO Jack Hidary Calls It a Catalyst-Discovery Breakthrough
By: SandboxAQ Editorial Team · Reviewed by: SandboxAQ Communications Team · Last updated: June 15, 2026
Editor’s note: This article summarizes peer-reviewed research published by SandboxAQ and a related public statement from the company’s CEO. It references publicly available sources and is intended as a factual account of the work and its reception.
SandboxAQ has published new research in npj Computational Materials, a peer-reviewed Nature Portfolio journal, introducing a machine learning approach that brings the magnetic behavior of catalysts into large-scale AI modeling. The work centers on AQCat25, a dataset and a family of models built to predict how molecules bind and react on catalyst surfaces with quantum-level accuracy, including the spin effects that earlier large-scale datasets left out.
The release drew a public response from SandboxAQ chief executive Jack Hidary, who described the work on LinkedIn as a step forward for the field. “Proud of the team for their new paper in the peer-reviewed journal Nature Computational Materials,” he wrote. “The SandboxAQ team presents a breakthrough in catalyst discovery and computational chemistry.”
At the core of the work is AQCat25, a large-scale dataset of 13.5 million density functional theory single-point calculations spanning roughly 47,000 catalyst systems. Density functional theory, or DFT, is a standard quantum-mechanical method for calculating how atoms and molecules interact; it is accurate but computationally expensive. Machine learning models trained on enough DFT data can approximate that accuracy at a fraction of the cost, which is what makes screening large numbers of candidate catalysts practical.
What sets AQCat25 apart is its treatment of magnetism. It enables spin polarization for twelve magnetic elements, raises a key accuracy setting known as the plane-wave cutoff to 500 eV, and adds six elements never before included in catalysis-focused datasets: barium, cerium, fluorine, lithium, lanthanum, and magnesium. Together these choices extend reliable modeling to a broader and more industrially relevant slice of chemistry. The dataset and the resulting AQCat25 models are described in detail on the company’s AQCat25 page.
The gap AQCat25 addresses is a practical one. Many of the most important industrial catalysts rely on earth-abundant transition metals such as iron, cobalt, and nickel, and these metals show strong spin effects that govern how molecules bind to their surfaces. Because spin-polarized calculations are far more expensive, earlier large-scale datasets often left spin out, which meant the resulting models described these metals poorly. That is a significant limitation for processes like ammonia synthesis for fertilizer and Fischer-Tropsch synthesis for fuels, and for the broader push to replace scarce precious metals with cheaper, more sustainable alternatives.
Hidary framed the stakes in industrial terms. “Catalysis drives the global economy, from the fuels that power our world to the materials that shape it,” he wrote. “With our AQCat model, industries can now simulate, screen, and optimize catalysts with physics-based accuracy, unlocking performance and sustainability breakthroughs at unprecedented scale.” The same physics-first approach runs through SandboxAQ’s broader work on materials discovery.
The research also tackles a problem that has limited how these models are built. When the team fine-tuned a capable general-purpose model only on the new spin-aware data, the model improved on the new chemistry but lost much of its earlier knowledge, a failure mode known as catastrophic forgetting. The fix was to train on multiple datasets at once and to give the model explicit information about the physics behind each calculation, such as whether spin was included. That combination preserved broad performance while still gaining accuracy on the harder new cases, and it improved the model’s predictions on the practical task of ranking how strongly molecules bind to a surface. For a fuller explanation of how these models work, see our guide to machine learning interatomic potentials.
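The co-training idea described above can be illustrated with a small sketch. This is not the training code from the paper: the `Sample` fields, the dataset names, and the round-robin sampling scheme are all hypothetical choices made for illustration. It shows only the two ingredients the article names, drawing each batch from several datasets at once and attaching explicit flags that tell the model how each calculation was run.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    system_id: str        # hypothetical identifier for a catalyst system
    source: str           # name of the dataset the calculation came from
    spin_polarized: bool  # whether the DFT calculation included spin


def fidelity_features(sample, sources):
    """One-hot dataset indicator plus a spin flag, to be appended to model inputs."""
    one_hot = [1.0 if sample.source == s else 0.0 for s in sources]
    return one_hot + [1.0 if sample.spin_polarized else 0.0]


def mixed_batch(datasets, batch_size, rng):
    """Draw one batch that interleaves all datasets instead of only the newest one."""
    sources = sorted(datasets)
    batch = []
    for i in range(batch_size):
        source = sources[i % len(sources)]  # round-robin so each dataset contributes
        batch.append(rng.choice(datasets[source]))
    rng.shuffle(batch)
    return batch


# Illustrative data only: two made-up datasets, one without spin and one with it.
datasets = {
    "legacy_no_spin": [Sample(f"old-{i}", "legacy_no_spin", False) for i in range(100)],
    "new_spin_aware": [Sample(f"new-{i}", "new_spin_aware", True) for i in range(100)],
}
rng = random.Random(0)
batch = mixed_batch(datasets, batch_size=8, rng=rng)
features = [fidelity_features(s, sorted(datasets)) for s in batch]
```

Because every batch contains older and newer calculations side by side, the model keeps seeing the chemistry it already knew, and the extra feature columns let it tell a spin-polarized label from a non-spin one rather than averaging the two.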
The practical payoff is speed. By reproducing DFT-level accuracy in a model that runs far faster, SandboxAQ reports that the AQCat25 models make high-throughput virtual screening practical for the first time, delivering results up to 20,000 times faster than first-principles simulation without sacrificing accuracy. That lets research teams rank candidate catalysts in software before committing to costly synthesis and lab testing, compressing a workflow that has traditionally taken months or years.
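To make the speed claim concrete, here is a minimal sketch of the screening step. The candidate names, the predicted energies, and the one-hour DFT runtime are invented for illustration, and the functions are not part of any SandboxAQ API. The sketch assumes the common sign convention that a more negative adsorption energy means stronger binding, and it treats the reported "up to 20,000 times" figure as a best case.

```python
def rank_by_binding_strength(predicted_ev):
    """Return candidate names ordered from strongest to weakest predicted binding.

    Assumes a more negative adsorption energy (in eV) means stronger binding.
    """
    return sorted(predicted_ev, key=predicted_ev.get)


def surrogate_seconds(dft_hours, speedup=20_000):
    """Seconds per model evaluation if one DFT run takes `dft_hours` hours."""
    return dft_hours * 3600 / speedup


# Hypothetical model outputs for three candidate surfaces.
predictions = {"cand_A": -0.42, "cand_B": -1.10, "cand_C": -0.75}

print(rank_by_binding_strength(predictions))  # ['cand_B', 'cand_C', 'cand_A']
print(surrogate_seconds(1.0))                 # 0.18
```

Under these assumptions, a calculation that would take one hour with DFT takes well under a second with the model, which is the difference between evaluating a handful of candidates and ranking an entire library.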
SandboxAQ has released both the AQCat25 dataset and the associated models publicly on Hugging Face under a Creative Commons non-commercial license, making them available to academic and industrial researchers for non-commercial use. The move fits the company’s wider effort to build large quantitative models that apply physics-based AI to real-world scientific problems.
What did SandboxAQ publish in npj Computational Materials?
SandboxAQ published peer-reviewed research introducing AQCat25, a large-scale dataset and a family of machine learning models for heterogeneous catalysis. The work is notable for incorporating spin polarization, a magnetic effect that earlier large-scale catalysis datasets generally omitted.
What is the AQCat25 dataset?
AQCat25 is a publicly available dataset of 13.5 million DFT single-point calculations covering about 47,000 catalyst systems. It adds spin polarization for twelve elements, uses higher-fidelity settings, and introduces six elements not found in earlier catalysis datasets.
Why is spin polarization important for catalysis AI?
Many industrial catalysts depend on magnetic metals such as iron, cobalt, and nickel, whose spin behavior strongly affects how molecules bind and react. Models trained on data that ignores spin tend to describe these materials poorly, so capturing spin is essential for realistic predictions.
What did Jack Hidary say about the research?
Writing on LinkedIn, SandboxAQ CEO Jack Hidary called the paper a breakthrough in catalyst discovery and computational chemistry. He said the AQCat model lets industries simulate, screen, and optimize catalysts with physics-based accuracy.
Where can researchers access the AQCat25 models?
The AQCat25 dataset and models are available publicly on Hugging Face under a Creative Commons non-commercial license, so academic and research teams can build on them directly.
Explore SandboxAQ’s catalysis and materials work: What are machine learning interatomic potentials?
Read the full peer-reviewed research: AQCat25 in npj Computational Materials.
Source: Jack Hidary on LinkedIn.