In structural biology, the study of protein structures has evolved considerably since the arrival in 2021 of AlphaFold, a tool developed by DeepMind. British scientist Demis Hassabis and American scientist John Jumper were awarded the Nobel Prize in Chemistry on October 9, 2024, for the creation of this revolutionary tool. AlphaFold was initially developed to predict the structures of individual proteins; DeepMind subsequently released a version capable of predicting protein assemblies with previously unmatched, though still improvable, quality. It was later demonstrated that these assembly predictions can be improved through massive sampling, which requires intensive use of AlphaFold. However, the adoption of massive sampling remained limited by the costs of intensive computing and data storage. MassiveFold, an optimized and flexible version of AlphaFold, overcomes these limitations and provides access to improved sampling.
A publication on this topic, to which the French Institute of Bioinformatics (IFB) contributed, has just appeared in the journal Nature Computational Science. This work was carried out as part of Work Package 4 “Intensive Digital Biology” of the Mutualized Digital Spaces for FAIR data in Life and Health Science (MUDIS4LS) project led by the IFB. The IFB is funded by the Investments for the Future Program (PIA), National Research Agency grant number ANR-11-INBS-0013. Work Package 4 aims to make it easier for the life sciences community to use bioinformatics tools on the intensive computing resources available in national and regional computing centers, including IDRIS and CBPsmn, which are partners in the project. The “MassiveFold” development project aims to give the community access to the full potential of AlphaFold.
This work is the result of a collaboration between the IFB, the UGSF (Unit of Structural and Functional Glycobiology), the IDRIS (Institute for Development and Resources in Scientific Computing) and Linköping University in Sweden. It was initiated as part of the Open Hackathons program.
AlphaFold is an artificial intelligence model developed by DeepMind, a Google subsidiary, that in most cases provides very good predictions of the 3D structure of proteins from their amino acid sequence. This advance has a major impact on research in biology, medicine, and biotechnology.
A scalable solution to facilitate massive sampling
Faced with the challenges of implementing massive sampling for protein assemblies, MassiveFold optimizes resource usage and reduces computational response time from months to hours, thanks to parallel execution on multiple GPUs (Graphics Processing Units).
The tool includes all versions of neural network (NN) models published for AlphaFold2 by DeepMind to date and contains several parameters to increase structural diversity. The program can run many instances in parallel, up to one prediction per GPU, thus optimizing the use of available computing infrastructure and allowing a substantial reduction in the time required to obtain prediction results.
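The parallelization described above can be illustrated with a minimal Python sketch. It is not MassiveFold's actual code: the function names, the batch size, and the timing figures are hypothetical and chosen only to show how splitting a large sampling run into independent batches, each assigned to its own GPU, shortens the wall-clock time.

```python
from math import ceil


def split_into_batches(total_predictions: int, batch_size: int) -> list[range]:
    """Split prediction indices 0..total_predictions-1 into consecutive batches.

    Hypothetical helper for illustration; a batch_size of 1 corresponds to
    one prediction per GPU.
    """
    if total_predictions < 0 or batch_size <= 0:
        raise ValueError("total_predictions must be >= 0 and batch_size > 0")
    return [
        range(start, min(start + batch_size, total_predictions))
        for start in range(0, total_predictions, batch_size)
    ]


def estimate_wall_time(num_batches: int, num_gpus: int, hours_per_batch: float) -> float:
    """Idealized wall-clock time when each GPU processes one batch at a time.

    Ignores queueing, I/O, and alignment costs; the per-batch duration is an
    assumed input, not a measured value.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be > 0")
    rounds = ceil(num_batches / num_gpus)  # batches run in successive waves
    return rounds * hours_per_batch


# Assumed example: 1000 predictions, 25 per batch, 2 hours per batch.
batches = split_into_batches(total_predictions=1000, batch_size=25)
print(len(batches))                                                       # 40
print(estimate_wall_time(len(batches), num_gpus=1, hours_per_batch=2.0))   # 80.0
print(estimate_wall_time(len(batches), num_gpus=40, hours_per_batch=2.0))  # 2.0
```

Under these assumed figures, the same workload takes 80 hours on a single GPU and 2 hours when 40 GPUs are available, because the batches are independent of one another and can run at the same time.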
This makes MassiveFold a powerful tool that remains accessible to researchers while making the most of available computing infrastructure. It pushes the boundaries of protein structure modeling and opens up new perspectives for scientific research.
The full article can be found in Nature Computational Science: “MassiveFold: unveiling AlphaFold’s hidden potential with optimized and parallelized massive sampling”.
