Lesson 5: Building a Bioinformatics Environment
Create a bioinformatics environment
This command creates a new environment called bioinfo with Python 3.9.
conda create -n bioinfo python=3.9
Activate the environment
This command activates the environment so that subsequent installs go into bioinfo.
conda activate bioinfo
Install core bioinformatics tools
This command installs a standard set of tools used in many workflows. Here is a quick summary of what each tool is used for:
samtools: Manipulates SAM/BAM/CRAM alignment files (view, sort, index)
seqkit: Fast toolkit for inspecting and transforming FASTA/FASTQ files
fastqc: Quality control reports for raw sequencing reads
multiqc: Aggregates QC reports from many samples into one summary
minimap2: Aligns long reads or assemblies to a reference genome
blast: Finds sequence similarity against reference databases
bowtie2: Aligns short reads to a reference genome
conda install samtools seqkit fastqc multiqc minimap2 blast bowtie2
Note: If you have not configured channels yet, complete Lesson 4 first to ensure reliable installs.
Alternatively, you can specify the channels directly:
conda install -c conda-forge -c bioconda samtools seqkit fastqc multiqc minimap2 blast bowtie2
Recall that many packages in the bioconda channel depend on packages in conda-forge. This is why the order of channels matters: conda-forge is listed first so that it takes priority over bioconda.
Verify each tool is installed
These commands check that each tool runs and reports a version. The blast package is checked through its blastn program, which takes a single-dash -version flag, and seqkit uses a version subcommand.
samtools --version
seqkit version
fastqc --version
multiqc --version
minimap2 --version
blastn -version
bowtie2 --version
If each command prints a version number, the installation succeeded.
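The individual checks above can also be run as a loop. The following is a minimal sketch rather than part of the lesson's own commands: the check_tools function name is hypothetical, and it only tests that each executable can be found on the PATH; it does not run the tools or compare version numbers.

```shell
# check_tools is a hypothetical helper: it prints one line per command name
# and succeeds only if every name resolves to an executable on the PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok       $tool"
        else
            echo "MISSING  $tool"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Executable names for the packages installed in this lesson
# (the blast package is checked through its blastn program).
check_tools samtools seqkit fastqc multiqc minimap2 blastn bowtie2 ||
    echo "Some tools were not found; check that the bioinfo environment is active."
```

A MISSING line most often means the bioinfo environment is not active in the current shell, so running conda activate bioinfo and repeating the check is a reasonable first step.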
Export the environment to a YAML file
This command writes the environment specification to a file that can be shared or reused.
conda env export -n bioinfo > bioinfo.yml
Note: Keep this YAML file with your project to ensure reproducibility.
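An exported file typically begins with a name: line holding the environment name and ends with a prefix: line holding the install path on the machine that produced it. As a hedged sketch, the two helpers below read the recorded name and write a copy without the local path. The names env_name_from_yaml and strip_prefix and the output file bioinfo.shared.yml are hypothetical, and the helpers assume the simple top-level layout that conda env export usually produces.

```shell
# Print the environment name recorded on the top-level "name:" line.
env_name_from_yaml() {
    sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# Copy stdin to stdout without the machine-specific "prefix:" line.
strip_prefix() {
    grep -v '^prefix:'
}

# Only run when the export from this lesson is present in the current directory.
if [ -f bioinfo.yml ]; then
    echo "Environment name: $(env_name_from_yaml bioinfo.yml)"
    strip_prefix < bioinfo.yml > bioinfo.shared.yml
fi
```

The name printed here is the one conda env create -f will use for the new environment unless a different name is passed with -n.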
Recreate the environment from YAML
This command creates a new environment from the exported file.
conda env create -f bioinfo.yml
After recreation, you can activate it the same way:
conda activate bioinfo
Deactivate when finished
This command returns you to the previously active environment, which is usually base.
conda deactivate