How the Biggest Distributed Supercomputer Broke the Exaflop Barrier… and What That Means for Us

Molecular structure of the coronavirus’s spike protein, University of Toronto

On March 25, 2020, the Folding@Home network announced that it had passed an exaflop of computing power, after hitting 470 petaflops just days earlier. Other estimates put the figure as high as 1.5 exaflops at that time. Either number is a remarkable achievement.

For comparison, the world’s most powerful supercomputer (according to the 54th TOP500 list, from November 2019), Summit, scored 148.6 petaFLOPS on the LINPACK benchmark thanks to its 202,752 CPU cores. As of April 8, 2020, the Folding@Home network is up to 1182.7 petaFLOPS (with over 9 million CPU cores!), meaning it is roughly 8x more powerful than Summit.

Folding@Home Statistics, as of April 8, 2020.

In fact, if the Folding@Home network is “only” one exaflop, it is still as fast as the 25 fastest supercomputers on Earth combined. If it has reached the 1.5-exaflop mark, as some members of the PCMR Folding@Home team have speculated, it is currently as fast as the top 100 or so systems on the TOP500 combined.

What on earth is a FLOP?

A FLOP is a floating-point operation. Performance is usually quoted in FLOPS, meaning floating-point operations per second. FLOPS is a measure of a computer’s performance, particularly in scientific computing, which relies heavily on floating-point calculations.

To scale this up, we can then talk about a TFLOP (or Teraflop), which is a trillion flops (or 10¹² FLOPS). Scaling up again, we have the Petaflop (10¹⁵ FLOPS). Finally, the big one, the Exaflop, which is a thousand petaflops, or a quintillion (10¹⁸) FLOPS. Insanity.
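To make the prefixes concrete, here is a minimal Python sketch of the conversions. The names `PREFIXES`, `to_flops`, and `convert` are made up for this illustration; the only figure taken from the article is the 1182.7 petaFLOPS reported for April 8, 2020.

```python
# SI prefixes used for the FLOPS figures in this article.
PREFIXES = {
    "tera": 10**12,
    "peta": 10**15,
    "exa": 10**18,
}


def to_flops(value, prefix):
    """Convert a prefixed figure, such as 1182.7 'peta', into plain FLOPS."""
    return value * PREFIXES[prefix]


def convert(value, from_prefix, to_prefix):
    """Convert between prefixed units, e.g. petaFLOPS to exaFLOPS."""
    return value * PREFIXES[from_prefix] / PREFIXES[to_prefix]


if __name__ == "__main__":
    fah_pflops = 1182.7  # Folding@Home figure quoted for April 8, 2020
    in_exa = convert(fah_pflops, "peta", "exa")
    print(f"{fah_pflops} petaFLOPS is about {in_exa:.4f} exaFLOPS")
    print(f"1 exaFLOPS is {convert(1, 'exa', 'peta'):.0f} petaFLOPS")
```

In other words, moving from one prefix to the next is just a factor of a thousand, which is why 1182.7 petaFLOPS and roughly 1.18 exaFLOPS describe the same machine.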

So What?

To understand the impacts of the Folding@Home project, we must further understand proteins, and more specifically, protein folding.

The Folding@Home project (FAH) is dedicated to understanding protein folding, the diseases that result from protein misfolding and aggregation, and novel computational ways to develop new drugs in general.

What is protein folding?

Proteins are long-chain molecules, like necklaces of amino acids. They are the basis of how biology gets things done. As enzymes, they are the driving force behind all of the biochemical reactions that make biology work. As structural elements, they are the main constituent of our bones, muscles, hair, skin, and blood vessels. As antibodies, they recognize invading elements and enable the immune system to get rid of them. For all of these reasons, scientists have sequenced the human genome (the blueprint for all of the proteins in human biology). But how can we understand what the proteins do and how they work?

The issue is that knowing the sequence alone doesn’t tell us much about what a protein does or how it works. In order to carry out their function, proteins have to take on a particular shape, also known as a “fold”. So before proteins can do their work, they assemble themselves. This self-assembly process is called “folding”.

How does this relate to disease?

Diseases such as Alzheimer’s disease, Huntington’s disease, cystic fibrosis, BSE (mad cow disease), an inherited form of emphysema, and even many cancers are believed to result from protein misfolding. When proteins misfold, they can clump together (“aggregate”). These clumps can gather in the brain, where they are believed to cause the symptoms of mad cow disease or Alzheimer’s disease.

Coronavirus & Folding@Home

Viruses also have proteins, which they use to suppress our immune systems and reproduce themselves. FAH’s mission is to understand how viral proteins work and how to design therapeutics to stop them. There are many experimental methods for determining protein structures. They are extremely powerful, but they only reveal a single snapshot of a protein’s shape. Since proteins have lots of moving parts, we want to see the protein in action. The structures that we can’t see experimentally could be the key to discovering a new therapeutic.

To use football as an analogy, it’s as if you could only see the players lined up for the snap (the single arrangement the players spend the most time in) but couldn’t see anything of the rest of the game.

A single protein structure is important information, but it leaves a lot out. FAH’s specialty is using computer simulations to understand these proteins’ moving parts. Watching how the atoms in a protein move relative to one another matters because it captures valuable information that is inaccessible by other means. Taking the experimental “snapshots” as starting points, it is possible to simulate how all the atoms in the protein move, effectively filling in the gap that experiments miss.
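To give a feel for what “simulating from a snapshot” means, here is a deliberately tiny Python sketch. It is not Folding@Home’s code and not a realistic protein model. It assumes three beads on a line joined by simple springs, and it steps them forward in time with velocity Verlet integration, a standard scheme for this kind of simulation. The function names, spring constant, time step, and starting positions are all made up for illustration.

```python
def spring_forces(positions, k=1.0, rest_length=1.0):
    """Force on each bead from harmonic springs between neighbouring beads."""
    forces = [0.0] * len(positions)
    for i in range(len(positions) - 1):
        stretch = (positions[i + 1] - positions[i]) - rest_length
        forces[i] += k * stretch      # stretched spring pulls bead i forward
        forces[i + 1] -= k * stretch  # and pulls bead i + 1 backward
    return forces


def simulate(snapshot, steps=1000, dt=0.01, mass=1.0):
    """Velocity Verlet integration starting from a static snapshot.

    Returns a trajectory: a list of position lists, one per time step,
    with the original snapshot as the first frame.
    """
    positions = list(snapshot)
    velocities = [0.0] * len(positions)  # the snapshot carries no motion
    forces = spring_forces(positions)
    trajectory = [list(positions)]
    for _ in range(steps):
        # Half-step the velocities, then move the beads a full step.
        velocities = [v + 0.5 * dt * f / mass for v, f in zip(velocities, forces)]
        positions = [x + dt * v for x, v in zip(positions, velocities)]
        # Recompute forces at the new positions and finish the velocity step.
        forces = spring_forces(positions)
        velocities = [v + 0.5 * dt * f / mass for v, f in zip(velocities, forces)]
        trajectory.append(list(positions))
    return trajectory


if __name__ == "__main__":
    # Hypothetical "snapshot": the middle bead sits off its resting position.
    frames = simulate([0.0, 1.5, 2.0], steps=500)
    print("first frame:", frames[0])
    print("last frame: ", frames[-1])
```

The snapshot is only the first frame; every later frame is something the static picture could not show. Real simulations follow the same loop of computing forces and then moving atoms, but for many thousands of atoms in three dimensions with far more detailed forces, which is where all those petaflops go.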

Folding@Home opens a hidden drug binding site, YouTube

This is truly fascinating, because the approach can be applied not only to COVID-19 but to other diseases as well. For example, in a recent paper, the FAH team simulated a protein from the Ebola virus that is typically considered “undruggable” because the snapshots from experiments don’t show obvious druggable sites. FAH simulations discovered an alternative structure that does have a druggable site. This led to experiments that confirmed the prediction, and the search is now on for drugs that bind the newly discovered site.