deepmind / alphafold2

Model Overview

Description:

AlphaFold2 is a deep learning model for protein structure prediction developed by the research group at DeepMind, an artificial intelligence (AI) research lab owned by Google {cite:p}`jumper2021alphafold`. AlphaFold2 builds on the success of its predecessor, AlphaFold, and represents a significant breakthrough in the field of protein structure prediction. This model is available for commercial use.

Third-Party Community Consideration

This model is not owned or developed by NVIDIA. This model has been developed and built to a third party's requirements for this application and use case.

License / Terms of Use

The AlphaFold2 code is released under the Apache 2.0 License. The model parameters are licensed under the CC BY 4.0 License.

References:

@ARTICLE{jumper2021alphafold,
    title    = "Highly accurate protein structure prediction with {AlphaFold}",
    author   = "Jumper, John and Evans, Richard and Pritzel, Alexander and Green,
                Tim and Figurnov, Michael and Ronneberger, Olaf and
                Tunyasuvunakool, Kathryn and Bates, Russ and {\v Z}{\'\i}dek,
                Augustin and Potapenko, Anna and Bridgland, Alex and Meyer,
                Clemens and Kohl, Simon A A and Ballard, Andrew J and Cowie,
                Andrew and Romera-Paredes, Bernardino and Nikolov, Stanislav and
                Jain, Rishub and Adler, Jonas and Back, Trevor and Petersen, Stig
                and Reiman, David and Clancy, Ellen and Zielinski, Michal and
                Steinegger, Martin and Pacholska, Michalina and Berghammer, Tamas
                and Bodenstein, Sebastian and Silver, David and Vinyals, Oriol
                and Senior, Andrew W and Kavukcuoglu, Koray and Kohli, Pushmeet
                and Hassabis, Demis",
    journal  = "Nature",
    volume   =  596,
    number   =  7873,
    pages    = "583--589",
    month    =  aug,
    year     =  2021,
    language = "en",
    doi = {10.1038/s41586-021-03819-2},
}



Model Architecture:

Architecture Type: Protein Structure Prediction

Network Architecture: AlphaFold2

Input Type(s): Protein Sequence, Relax Prediction (Default True)

Input Format(s): String (less than or equal to 4096 characters), boolean

Input Parameters: 1D

Other Properties Related to Input: NA
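
The input constraints above can be checked before a request is submitted. The following is a minimal sketch, not part of AlphaFold2 or any NVIDIA API: the function name `build_request`, the payload field names, and the restriction to the 20 standard one-letter amino acid codes are all assumptions made for illustration. Only the 4096-character limit and the boolean relax flag (default True) come from this card.

```python
# Hypothetical pre-flight check for the inputs described in this card.
MAX_SEQUENCE_LENGTH = 4096  # limit stated in "Input Format(s)"
# Assumption: only the 20 standard one-letter residue codes are accepted.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def build_request(sequence: str, relax_prediction: bool = True) -> dict:
    """Validate the inputs and return a payload (field names are illustrative)."""
    if not isinstance(relax_prediction, bool):
        raise TypeError("relax_prediction must be a boolean")
    cleaned = sequence.strip().upper()
    if not cleaned:
        raise ValueError("sequence must not be empty")
    if len(cleaned) > MAX_SEQUENCE_LENGTH:
        raise ValueError(
            f"sequence has {len(cleaned)} characters; "
            f"the limit is {MAX_SEQUENCE_LENGTH}"
        )
    unknown = sorted(set(cleaned) - STANDARD_AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unexpected residue codes: {''.join(unknown)}")
    return {"sequence": cleaned, "relax_prediction": relax_prediction}
```

Rejecting an over-long or malformed sequence on the client side avoids spending a long-running prediction job on a request that would fail anyway.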

Output:

Output Type(s): Protein Structure(s) in PDB Format

Output Format: PDB (text file)

Output Parameters: 1D

Other Properties Related to Output: Pose (numatm x 3)
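
To illustrate how the PDB text output relates to the pose of shape (numatm x 3), the sketch below reads atom coordinates from the fixed-width columns defined by the PDB format (x in columns 31-38, y in 39-46, z in 47-54). The function name `pdb_to_pose` is hypothetical, and the choice to include HETATM records alongside ATOM records is an assumption; the card does not specify how the pose is derived.

```python
from typing import List, Tuple

Coordinate = Tuple[float, float, float]


def pdb_to_pose(pdb_text: str) -> List[Coordinate]:
    """Extract one (x, y, z) tuple per atom record from PDB-formatted text."""
    pose: List[Coordinate] = []
    for line in pdb_text.splitlines():
        # Coordinate records start with ATOM or HETATM in columns 1-6.
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # PDB columns are 1-indexed; Python slices are 0-indexed.
        x = float(line[30:38])
        y = float(line[38:46])
        z = float(line[46:54])
        pose.append((x, y, z))
    return pose
```

The length of the returned list corresponds to numatm, and each entry holds the three Cartesian coordinates in ångströms.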

Software Integration:

Runtime Engine(s):

  • Python

Supported Hardware Microarchitecture Compatibility:

  • NVIDIA Ampere

Supported Operating System(s):

  • Linux

Model Version(s):

AlphaFold2 2.3.2

Training & Evaluation:

Training Dataset:

Link: A description of the training dataset and relevant download links are available at https://www.nature.com/articles/s41586-021-03819-2#data-availability. This data was not collected by NVIDIA.

** Data Collection Method by dataset

** Labeling Method by dataset

Properties (Quantity, Dataset Descriptions, Sensor(s)):
Structures were first predicted for a Uniclust dataset of 355,993 sequences using the full MSAs. These predictions were then used to train a final model with identical hyperparameters, except that examples were sampled 75% of the time from the Uniclust prediction set, with sub-sampled MSAs, and 25% of the time from the clustered PDB set.
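
The 75%/25% mixture described above can be sketched as a per-example random choice between the two data sources. This is an illustration only, not the AlphaFold2 training code: the function name and the source labels `uniclust_predictions` and `clustered_pdb` are hypothetical, and the sketch ignores MSA sub-sampling and clustering.

```python
import random


def sample_training_source(rng: random.Random, uniclust_fraction: float = 0.75) -> str:
    """Pick the data source for one training example.

    With the default fraction, roughly 75% of draws come from the Uniclust
    prediction set and 25% from the clustered PDB set.
    """
    if rng.random() < uniclust_fraction:
        return "uniclust_predictions"  # hypothetical label
    return "clustered_pdb"  # hypothetical label


# Usage: draw sources for a batch with a seeded generator for repeatability.
rng = random.Random(0)
batch_sources = [sample_training_source(rng) for _ in range(8)]
```

Sampling per example, rather than concatenating the two sets, keeps the ratio fixed regardless of how large each set is.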

Evaluation Dataset:

Link: See the description at https://www.nature.com/articles/s41586-021-03819-2#Sec10.

** Data Collection Method by dataset

  • Not Applicable

** Labeling Method by dataset

  • Not Applicable

Properties (Quantity, Dataset Descriptions, Sensor(s)):
Structures were first predicted for a Uniclust dataset of 355,993 sequences using the full MSAs. These predictions were then used to train a final model with identical hyperparameters, except that examples were sampled 75% of the time from the Uniclust prediction set, with sub-sampled MSAs, and 25% of the time from the clustered PDB set.

Inference:

Engine: Python

Test Hardware:

  • NVIDIA A6000
  • NVIDIA A100

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns here.

You are responsible for ensuring that your use of NVIDIA AI Foundation Models complies with all applicable laws.