AlphaFold2

Description

This package provides an implementation of the inference pipeline of AlphaFold v2.0. This is a completely new model that was entered in CASP14 and published in Nature. For simplicity, we refer to this model as AlphaFold throughout the rest of this document.

Any publication that discloses findings arising from using this source code or the model parameters should cite the AlphaFold paper. Please also refer to the Supplementary Information for a detailed description of the method.

The home page is at https://github.com/deepmind/alphafold

License and Disclaimer

This is not an officially supported Google product.

Copyright 2021 DeepMind Technologies Limited.

AlphaFold Code License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at https://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Model Parameters License

The AlphaFold parameters are made available for non-commercial use only, under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. You can find details at: https://creativecommons.org/licenses/by-nc/4.0/legalcode

AlphaFold Databases

The eight reference databases (including the model parameters) have all been downloaded, verified and stored on the NeSI filesystem at /opt/nesi/db/alphafold_db.

$ /opt/nesi/db/alphafold_db/
├── bfd
├── mgnify
├── params
├── pdb70
├── pdb_mmcif
├── small_bfd
├── uniclust30
└── uniref90
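
Before submitting a job, it can be useful to confirm that all eight directories listed above are visible from the node you are working on. The following is a minimal sketch, not part of AlphaFold or any NeSI tooling: the function name check_alphafold_db is hypothetical, and the directory names are taken from the listing above.

```shell
#!/bin/bash
# check_alphafold_db is a hypothetical helper (an assumption for illustration).
# It prints every expected database directory that is missing under the given
# root and returns non-zero if at least one is missing.
check_alphafold_db() {
    local root="$1"
    local missing=0
    local name
    for name in bfd mgnify params pdb70 pdb_mmcif small_bfd uniclust30 uniref90; do
        if [ ! -d "$root/$name" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage (uncomment on a system where the path exists):
# check_alphafold_db /opt/nesi/db/alphafold_db && echo "all databases present"
```

The function only tests that the directories exist. It does not verify their contents.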

Singularity container

We prepared a Singularity container image based on the official Dockerfile, with some modifications. The image (.simg) and the corresponding definition file (.def) are stored in /opt/nesi/containers/AlphaFold/2021-08-14

Example Slurm script

The input FASTA file used in the following example and in the subsequent benchmarking is 3RGK (https://www.rcsb.org/structure/3rgk).

#!/bin/bash -e

#SBATCH --account           nesi12345
#SBATCH --job-name          alphafold2
#SBATCH --mem               20G
#SBATCH --cpus-per-task     8
#SBATCH --gpus-per-node     P100:1
#SBATCH --time              01:20:00
#SBATCH --output            slurmout.%j.out

module purge
module load cuDNN/8.1.1.33-CUDA-11.2.0 Singularity/3.8.0

image=/opt/nesi/containers/AlphaFold/2021-08-14
database=/opt/nesi/db/alphafold_db

export SINGULARITY_BIND="$PWD:/etc,$image,/path/to/input/data:/var/inputdata,/path/to/outputs:/var/outputdata,$database:/db"

singularity run --pwd /app/alphafold --nv $image/alphafold2_v2.simg python /app/alphafold/run_alphafold.py \
--fasta_paths=/var/inputdata/3RGK.fasta \
--output_dir=/var/outputdata \
--model_names=model_1 \
--preset=casp14 \
--max_template_date=2020-05-14 \
--data_dir=/db \
--uniref90_database_path=/db/uniref90/uniref90.fasta \
--mgnify_database_path=/db/mgnify/mgy_clusters_2018_12.fa \
--uniclust30_database_path=/db/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
--bfd_database_path=/db/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--pdb70_database_path=/db/pdb70/pdb70 \
--template_mmcif_dir=/db/pdb_mmcif/mmcif_files \
--obsolete_pdbs_path=/db/pdb_mmcif/obsolete.dat

Explanation of Slurm variables and Singularity flags

  1. The values for the --mem, --cpus-per-task and --time Slurm variables are for 3RGK.fasta. Adjust them to suit your own input.
  2. We have tested this on both P100 and A100 GPUs, and the runtimes were identical. The example above therefore requests the former via P100:1.
  3. The --nv flag enables GPU support.
  4. --pwd /app/alphafold sets the working directory inside the container, as a workaround for an existing issue.
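
The SINGULARITY_BIND value in the script above is a comma-separated list of bind entries, most of them written as host_path:container_path. With several entries the single long string becomes hard to read, so one option is to assemble it from separate arguments. The sketch below is an illustration only: join_binds is a hypothetical helper name, the /path/to/... values are the same placeholders used in the script, and only three of the entries are shown. The remaining entries would be added in the same way.

```shell
#!/bin/bash
# join_binds is a hypothetical helper (an assumption for illustration).
# It joins its arguments with commas, the separator used in SINGULARITY_BIND.
join_binds() {
    local IFS=,
    echo "$*"
}

database=/opt/nesi/db/alphafold_db
input_dir=/path/to/input/data    # placeholder: replace with your input directory
output_dir=/path/to/outputs      # placeholder: replace with your output directory

# One host:container pair per line.
SINGULARITY_BIND=$(join_binds \
    "$input_dir:/var/inputdata" \
    "$output_dir:/var/outputdata" \
    "$database:/db")
export SINGULARITY_BIND

echo "$SINGULARITY_BIND"
```

Printing the variable before calling singularity run makes it easy to spot a mistyped path in the job's output file.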