May 20, 2024

Proteins are biological workhorses.

They build our bodies and orchestrate the molecular processes in cells that keep them healthy. They also present a wealth of targets for new medications. From everyday pain relievers to sophisticated cancer immunotherapies, most current drugs interact with a protein. Deciphering protein architectures could lead to new treatments.

That was the promise of AlphaFold 2, an AI model from Google DeepMind that predicted how proteins gain their distinctive shapes based on the sequences of their constituent molecules alone. Released in 2020, the tool was a breakthrough half a decade in the making.

But proteins don’t work alone. They inhabit an entire cellular universe and often collaborate with other molecular inhabitants such as DNA, the body’s genetic blueprint.

This week, DeepMind and Isomorphic Labs released a big new update that allows the algorithm to predict how proteins work inside cells. Instead of only modeling their structures, the new version—dubbed AlphaFold 3—can also map a protein’s interactions with other molecules.

For example, could a protein bind to a disease-causing gene and shut it down? Can adding new genes to crops make them resilient to viruses? Can the algorithm help us rapidly engineer new vaccines to tackle existing diseases—or whatever new ones nature throws at us?

“Biology is a dynamic system…you have to understand how properties of biology emerge due to the interactions between different molecules in the cell,” said Demis Hassabis, the CEO of DeepMind, in a press conference.

AlphaFold 3 helps explain “not only how proteins talk to themselves, but also how they talk to other parts of the body,” said lead author Dr. John Jumper.

The team is releasing the new AI online for academic researchers by way of an interface called the AlphaFold Server. With a few clicks, a biologist can run a simulation of an idea in minutes, compared to the weeks or months usually needed for experiments in a lab.

Dr. Julien Bergeron at King’s College London, who builds nano-protein machines but was not involved in the work, said the AI is “transformative science” for speeding up research, which could ultimately lead to nanotech devices powered by the body’s mechanisms alone.

For Dr. Frank Uhlmann at the Francis Crick Institute, who gained early access to AlphaFold 3 and used it to study how DNA separates when cells divide, the AI is “democratizing discovery research.”

Molecular Universe

Proteins are finicky creatures. They’re made of strings of molecules called amino acids that fold into intricate three-dimensional shapes that determine what the protein can do.

Sometimes the folding process goes wrong. In Alzheimer’s disease, misfolded proteins clump into dysfunctional blobs that build up around and inside brain cells.

Scientists have long tried to engineer drugs to break up disease-causing proteins. One strategy is to map protein structure—know thy enemy (and friends). Before AlphaFold, this was done with electron microscopy, which captures a protein’s structure at the atomic level. But it’s expensive, labor intensive, and not all proteins can tolerate the scan.

That is why AlphaFold 2 was revolutionary. Using only a protein’s sequence of amino acids, the constituent molecules that make up proteins, the algorithm could predict the protein’s final structure with startling accuracy. DeepMind used AlphaFold to map the structure of nearly all proteins known to science and how they interact. According to the AI lab, in just three years, researchers have mapped roughly six million protein structures using AlphaFold 2.

But to Jumper, modeling proteins isn’t enough. To design new drugs, you have to think holistically about the cell’s whole ecosystem.

It’s an idea championed by Dr. David Baker at the University of Washington, another pioneer in the protein-prediction space. In 2023, Baker’s team released AI-based software called RoseTTAFold All-Atom to tackle interactions between proteins and other biomolecules.

Picturing these interactions can help solve tough medical challenges, allowing scientists to design better cancer treatments or more precise gene therapies, for example.

“Properties of biology emerge through the interactions between different molecules in the cell,” said Hassabis in the press conference. “You can think about AlphaFold 3 as our first big sort of step towards that.”

A Revamp

AlphaFold 3 builds on its predecessor, but with significant renovations.

One way to gauge how a protein interacts with other molecules is to examine evolution. Another is to map a protein’s 3D structure and—with a dose of physics—predict how it can grab onto other molecules. While AlphaFold 2 mostly used an evolutionary approach—training the AI on what we already know about protein evolution in nature—the new version heavily embraces physical and chemical modeling.

Part of this modeling covers chemical modifications. Proteins are often tagged with different chemicals. These tags sometimes change a protein’s structure and are essential to its behavior: they can literally determine a cell’s fate, such as life, senescence, or death.

The algorithm’s overall setup makes some use of its predecessor’s machinery to map proteins, DNA, and other molecules and their interactions. But the team also looked to diffusion models—the algorithms behind OpenAI’s DALL-E 2 image generator—to capture structures at the atomic level. Diffusion models are trained to reverse noisy images in steps until they arrive at a prediction for what the image (or in this case a 3D model of a biomolecule) should look like without the noise. This addition made a “substantial change” to performance, said Jumper.
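The step-by-step denoising idea can be pictured with a toy Python sketch. This is not AlphaFold 3’s method: the function names (`add_noise`, `toy_denoiser`, `reverse_diffusion`) and the four-atom “structure” are invented for illustration, and the denoiser here cheats by already knowing the clean coordinates. In a real diffusion model, a trained neural network would have to estimate them from the noisy input alone.

```python
import math
import random


def rmsd(a, b):
    """Root-mean-square deviation between two lists of (x, y, z) points."""
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))


def add_noise(coords, sigma, rng):
    """Forward direction: jitter every coordinate with Gaussian noise."""
    return [tuple(c + rng.gauss(0.0, sigma) for c in atom) for atom in coords]


def toy_denoiser(noisy, clean):
    """Hypothetical stand-in for a trained network.

    A real model would predict the clean structure from `noisy` alone;
    this toy version simply returns the known answer.
    """
    return clean


def reverse_diffusion(noisy, clean, steps):
    """Reverse direction: move part of the way toward the estimate each step."""
    current = noisy
    trajectory = [current]
    for step in range(steps):
        estimate = toy_denoiser(current, clean)
        # Close 1/(remaining steps) of the gap, so the final step closes it fully.
        frac = 1.0 / (steps - step)
        current = [
            tuple(c + frac * (e - c) for c, e in zip(atom, est))
            for atom, est in zip(current, estimate)
        ]
        trajectory.append(current)
    return trajectory


rng = random.Random(0)
# Four made-up "atoms" standing in for a biomolecule.
clean = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 1.5)]
noisy = add_noise(clean, sigma=5.0, rng=rng)
trajectory = reverse_diffusion(noisy, clean, steps=10)
errors = [rmsd(frame, clean) for frame in trajectory]
for i, err in enumerate(errors):
    print(f"step {i:2d}: RMSD = {err:.3f}")
```

Running it prints an error that shrinks at every step until the scrambled points settle back onto the original positions, which is the same noise-to-structure progression described above, minus the learned model that makes it useful.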

Like AlphaFold 2, the new version has a built-in “sanity check” that indicates how confident it is in a generated model so scientists can proofread its outputs. This has been a core component of all their work, said the DeepMind team. They trained the AI using the Protein Data Bank, an open-source compilation of 3D protein structures that’s constantly updated, including new experimentally validated structures of proteins binding to DNA and other biomolecules.

Pitted against existing software, AlphaFold 3 broke records. In one test of molecular interactions between proteins and small molecules, the kind that could become medications, it succeeded 76 percent of the time. Previous attempts were successful in roughly 42 percent of cases.

When it comes to deciphering protein functions, AlphaFold 3 “seeks to solve the exact same problem [as RoseTTAFold All-Atom]…but is clearly more accurate,” Baker told Singularity Hub.

But the tool’s accuracy depends on which interaction is being modeled. The algorithm isn’t yet great at protein-RNA interactions, for example, Columbia University’s Mohammed AlQuraishi told MIT Technology Review. Overall, accuracy ranged from 40 to more than 80 percent.

AI to Real Life

Unlike with previous iterations, DeepMind isn’t open-sourcing AlphaFold 3’s code. Instead, they’re releasing the tool as a free online platform, called AlphaFold Server, that allows scientists to test their ideas for protein interactions with just a few clicks.

AlphaFold 2 required technical expertise to install and run the software. The server, in contrast, can help people unfamiliar with code to use the tool. It’s for non-commercial use only and can’t be reused to train other machine learning models for protein prediction. But it is freely available for scientists to try. The team envisions the software helping develop new antibodies and other treatments at a faster rate. Isomorphic Labs, a spin-off of DeepMind, is already using AlphaFold 3 to develop medications for a variety of diseases.

For Bergeron, the upgrade is “transformative.” Instead of spending years in the lab, it’s now possible to mimic protein interactions in silico—a computer simulation—before beginning the labor- and time-intensive work of investigating promising solutions using cells.

“I’m pretty certain that every structural biology and protein biochemistry research group in the world will immediately adopt this system,” he said.

Image Credit: Google DeepMind