Caution: Please only report issues related to the installation on your local PC or macOS. If you can get the help message via colabfold_batch --help or run a test prediction successfully, your installation is working. Requests or questions regarding ColabFold features should be directed to the ColabFold repo's issues.
What is your installation issue?
I tried to run colabfold_search on a SLURM cluster, but it takes more than 2 days, even though the input FASTA contains only a single sequence.
Computational environment
I used this job allocation:
#SBATCH -c 10 # Requested cores
#SBATCH --time=2-00:00 # Runtime in D-HH:MM format
#SBATCH --partition=medium # Partition to run in
#SBATCH --mem=100GB # Requested Memory
#SBATCH -o %j.out # File to which STDOUT will be written, including job ID (%j)
#SBATCH -e %j.err # File to which STDERR will be written, including job ID (%j)
To Reproduce
And this is my colabfold_search command:
colabfold_search \
  --use-env 1 \
  --use-templates 0 \
  --db-load-mode 2 \
  --mmseqs mmseqs \
  --threads 4 \
  ${input_path} \
  ${database_path} \
  ${output_path}
This issue is not about the installation itself. The speed of colabfold_search depends largely on the machine environment: the file system, the storage (a >2 TB SSD is highly recommended for best performance), the RAM (>768 GB for best performance), and whether or not vmtouch is used. If your job runs on a shared supercomputer whose file system is RAID- or network-mounted, the search will be very slow.
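As a quick diagnostic for the points above, you can check what kind of file system the databases sit on. A minimal sketch, assuming the databases live in a directory pointed to by DB_DIR (a placeholder path, not a ColabFold variable):

```shell
# DB_DIR is a placeholder; point it at your actual database directory.
DB_DIR="${DB_DIR:-/path/to/colabfold_db}"

# The "Type" column shows the file system: nfs/lustre/gpfs indicate a
# network mount (slow for colabfold_search); ext4/xfs on a local SSD is fast.
if [ -d "$DB_DIR" ]; then
    df -T "$DB_DIR"
else
    echo "database directory not found: $DB_DIR" >&2
fi
```

On a shared cluster, the scratch or node-local storage (often something like /tmp or a per-job scratch directory) is usually the only locally attached disk; if you can stage the databases there, the search should run much faster.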
Do you have any suggestions to increase the speed of the run in my current environment?
What is the most important parameter for determining the performance?
Probably RAM (>768 GB)?
In my experience, the most important factors are the file system and the use of an SSD. If the sequence databases are placed on an SSD connected via SATA, colabfold_search returns results in 30-60 minutes even on a machine with only 64 GB of RAM (my Ubuntu 22.04 desktop). By contrast, an HDD or a network-mounted drive slows the search by more than 10x. If the sequence databases can be fully cached in RAM (>768 GB) on the first run of colabfold_search, subsequent runs will be extremely fast, on par with the MMseqs2 web server.
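One way to get the caching benefit described above is to warm the OS page cache before the run. A sketch, assuming vmtouch is installed and DB_DIR is a placeholder for your database directory:

```shell
# DB_DIR is a placeholder; point it at your actual database directory.
DB_DIR="${DB_DIR:-/path/to/colabfold_db}"

# Pre-load the database index files into the OS page cache so the first
# mmseqs pass reads from RAM instead of disk. This only helps if the
# machine has enough free memory to hold the files.
if command -v vmtouch >/dev/null 2>&1; then
    vmtouch -t "$DB_DIR"/*.idx 2>/dev/null || true
    # Report how much of each file is currently resident in memory.
    vmtouch "$DB_DIR"/*.idx 2>/dev/null || true
else
    echo "vmtouch not installed; skipping cache warm-up" >&2
fi
```

With only 100 GB of requested memory, the full databases cannot be cached, so on a network file system each search pass re-reads them over the network; that alone can account for a multi-day runtime.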
And my FASTA input is simply this:
Expected behavior
I expected a short run time, like a few hours, but it took more than 2 days and the job was cancelled.
I am attaching the log file, too.
42792350.txt
Thank you for your help in advance!