Last year DeepMind put forward a compelling solution to a 50-year-old science problem, demonstrating how its AlphaFold AI could predict the 3D structures of unique proteins, laying the groundwork for a new era of biological discovery. The company has continued building on these foundations by now sharing the predicted structures for nearly every protein in the human body, which will accelerate research efforts into everything from antibiotic resistance to cancer treatments, and so much more.
Proteins are key components of all living things on Earth, and understanding their intricate individual shapes can help us understand what they do and how they might be beneficially manipulated, through drugs that tackle human disease, for example. But they all start as one-dimensional chains of amino acids that fold into an almost endless range of highly complex 3D structures. Using those amino acid chains to predict what the final structure will look like is known as the "protein folding problem," and it is one scientists have contended with since the early 1970s.
DeepMind's AlphaFold AI was developed to tackle this problem with the power of modern computing. The system is trained on publicly available protein structures that have already been determined through scientific experiments, and last year demonstrated how it could be used to solve protein structures that scientists had been working on for many years.
Described as "astonishingly accurate", "a gamechanger", and "a stunning advance", AlphaFold is seen as a solution to the 50-year protein folding problem, and one that heralds a new chapter in biological research. It could help scientists identify malfunctioning proteins far more quickly and work out why they cause certain diseases, or greatly accelerate the development of drugs to treat them. Enzymes could be more rapidly developed to degrade plastic waste, and novel viruses could be tackled more effectively by mapping the structures of spike proteins, for example.
Less than a year on, we are already seeing this technology start to reshape the world of scientific research. Last week, a separate team of researchers at the University of Washington demonstrated AlphaFold-like software called RoseTTAFold. It could predict protein structures in as little as 10 minutes with a single gaming computer and was made freely available online.
The team at DeepMind has also been working to make its tool more accessible. Last week it published a paper detailing how the system was developed and shared the source code on GitHub. Today it has published its catalog of predictions for almost all the proteins in the human body, known as the human proteome.
The catalog covers 98.5 percent of human proteins, numbering around 20,000 in all. Furthermore, DeepMind has provided open access to the proteomes of 20 other organisms of interest, including the fruit fly, mouse, yeast and E. coli, for a total of more than 350,000 protein structures. The team plans to build on this in the coming months by expanding the collection to include the more than 100 million proteins known to science, culminating in what DeepMind calls a "veritable protein almanac of the world."
"This will be one of the most important datasets since the mapping of the Human Genome," says Professor Ewan Birney, Deputy Director General of the European Molecular Biology Laboratory.
A paper describing the human proteome predictions was published in Nature, while the video below offers a short demo of the protein structure database.
Source: DeepMind