Module 7 — Solutions
Exercise 1 — Your First Script
    #!/bin/bash
    echo "Alex"
    date
    echo "Ready to do bioinformatics."

Notes:
- `date` is a command, not a string, so it needs no quotes.
- The shebang must be the very first line, with no blank lines or spaces before it.
- Your name goes in the `echo` line: replace `Alex` with your own.
Exercise 2 — Variables and basename
    #!/bin/bash
    INPUT="Training/short_reads/paired/SRR1553607_2.fastq"
    SAMPLE=$(basename $INPUT .fastq)
    echo "Full path: $INPUT"
    echo "Sample name: $SAMPLE"

Notes:
- `$(basename $INPUT .fastq)` runs `basename` with two arguments: the value of `$INPUT` and the extension `.fastq` to strip.
- The result is captured by `$()` and stored directly in `SAMPLE`.
- The `echo` lines mix literal text and variable values. Bash replaces `$INPUT` and `$SAMPLE` before printing.
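A minimal sketch of what the second argument to `basename` changes, using the path from this exercise. The variable `WITH_EXT` is illustrative and not part of the exercise, and the double quotes around `"$INPUT"` are an extra precaution for paths that contain spaces.

```shell
#!/bin/bash
INPUT="Training/short_reads/paired/SRR1553607_2.fastq"

# One argument: only the directory part is removed.
WITH_EXT=$(basename "$INPUT")

# Two arguments: the directory part and the trailing .fastq are removed.
SAMPLE=$(basename "$INPUT" .fastq)

echo "With extension: $WITH_EXT"
echo "Sample name:    $SAMPLE"
```

`basename` only manipulates the text of the path, so this runs even if the file itself is not present.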
Exercise 3 — A Simple Loop
    #!/bin/bash
    for FILE in Training/short_reads/paired/*.fastq
    do
        SAMPLE=$(basename $FILE .fastq)
        echo "Found sample: $SAMPLE"
    done

Notes:
- `*.fastq` matches all files ending in `.fastq` at that path. Bash expands it automatically.
- `$FILE` holds the full path on each pass; `basename` extracts the clean name.
- The output format `Found sample: $SAMPLE` matches what was asked for. Check that your `echo` line uses exactly that text.
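One detail of glob expansion worth knowing: with Bash's default settings, a pattern that matches nothing is left as literal text, so the loop body would run once with the unexpanded pattern. The sketch below shows one possible guard against that. It works in a scratch directory with placeholder file names (`sample_A.fastq`, `sample_B.fastq`) that are assumptions for the demo, not files from the training data.

```shell
#!/bin/bash
# Scratch directory with two empty placeholder files (hypothetical names).
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/sample_A.fastq" "$DEMO_DIR/sample_B.fastq"

COUNT=0
for FILE in "$DEMO_DIR"/*.fastq
do
    # Skip the unexpanded pattern if nothing matched.
    [ -e "$FILE" ] || continue
    SAMPLE=$(basename "$FILE" .fastq)
    echo "Found sample: $SAMPLE"
    COUNT=$((COUNT + 1))
done
echo "Samples found: $COUNT"

# A pattern with no matches: the guard keeps the loop body from running.
EMPTY_COUNT=0
for FILE in "$DEMO_DIR"/*.txt
do
    [ -e "$FILE" ] || continue
    EMPTY_COUNT=$((EMPTY_COUNT + 1))
done
echo "Text files found: $EMPTY_COUNT"

rm -r "$DEMO_DIR"
```

The exercise does not require this guard; it only matters when the directory might be empty or the path is mistyped.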
Exercise 4 — Log Files
    #!/bin/bash
    LOG="sample_list.txt"

    echo "Sample list" > $LOG

    for FILE in Training/short_reads/paired/*.fastq
    do
        SAMPLE=$(basename $FILE .fastq)
        echo "$SAMPLE" >> $LOG
    done

    echo "Log saved to: sample_list.txt"

Notes:
- `> $LOG` creates a fresh file and writes the header line. If you accidentally used `>>` here, the header would be appended to whatever was already in the file from a previous run.
- `>> $LOG` inside the loop adds each sample name without overwriting the header or previous entries.
- The final `echo` goes to the terminal, not the log. With no `>` or `>>`, the output prints to the screen as normal.
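The difference between `>` and `>>` is easiest to see by running the same logging steps twice. This is a small sketch under stated assumptions: it writes to a scratch file from `mktemp` rather than `sample_list.txt`, and the sample names are placeholders instead of real files.

```shell
#!/bin/bash
# Scratch log file so the demo does not touch sample_list.txt.
LOG=$(mktemp)

# Simulate two runs of the script; the sample names are placeholders.
for RUN in 1 2
do
    echo "Sample list" > "$LOG"       # > truncates: each run starts fresh
    for SAMPLE in sample_A sample_B
    do
        echo "$SAMPLE" >> "$LOG"      # >> appends below the header
    done
done

# Count the lines and read the header back.
LINE_COUNT=$(wc -l < "$LOG" | tr -d ' ')
FIRST_LINE=$(head -n 1 "$LOG")
echo "Lines after two runs: $LINE_COUNT"
echo "First line: $FIRST_LINE"

rm "$LOG"
```

After two runs the file still holds one header and two sample names. Changing the header line to use `>>` would leave six lines instead, because nothing would ever clear the file.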
Exercise 5 — Capturing All Output with exec
    #!/bin/bash
    LOG_FILE="full_run.log"

    exec > >(tee $LOG_FILE) 2>&1

    echo "Run started"

    for FILE in Training/short_reads/paired/*.fastq
    do
        SAMPLE=$(basename $FILE .fastq)
        echo "Processing: $SAMPLE"
    done

    echo "Run finished"

Notes:
- `exec > >(tee $LOG_FILE) 2>&1` must come before any commands whose output you want captured. Placing it right after the variable declarations is the standard pattern.
- There are no `>>` or `| tee` on any individual line. The `exec` line handles everything.
- When you swap to `exec 2> errors.log`, the `echo` messages still print to the terminal normally. Only error messages (from failed commands) are written to `errors.log`. The redirect creates the file as soon as the `exec` line runs, so with no errors in this simple loop, `errors.log` will exist but be empty.
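The stderr-only variant can be checked with a command that is guaranteed to fail. The following is a rough sketch, not part of the exercise: the missing path `/no/such/path_for_this_demo` is made up to force an error, the log is a scratch file rather than `errors.log`, and the subshell `( ... )` is there only so the redirect ends when the demo does.

```shell
#!/bin/bash
# Scratch error log so the demo does not touch errors.log.
ERR_LOG=$(mktemp)

(
    exec 2> "$ERR_LOG"                          # from here on, stderr goes to the log
    echo "Run started"                          # stdout: still printed to the terminal
    ls /no/such/path_for_this_demo || true      # fails on purpose; message goes to stderr
    echo "Run finished"
)

echo "Contents of the error log:"
cat "$ERR_LOG"
```

The two `echo` messages appear on screen, while the complaint from `ls` ends up only in the log file.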
Exercise 6 — Build Your Own Pipeline Script
    #!/bin/bash
    DATA_DIR="Training/short_reads/paired"
    OUTPUT_DIR="my_fastqc_output"
    LOG_FILE="my_run_log.txt"

    exec > >(tee $LOG_FILE) 2>&1

    mkdir -p $OUTPUT_DIR

    for FILE in $DATA_DIR/*.fastq
    do
        SAMPLE=$(basename $FILE .fastq)
        echo "Quality checking: $SAMPLE"
        fastqc $FILE --outdir $OUTPUT_DIR
    done

    echo "Pipeline complete."

Notes:
- Using `exec` here means the script no longer needs any `>>` or `| tee` lines. All output, including any FastQC messages and errors, goes to both the terminal and `$LOG_FILE` automatically.
- `echo "Pipeline complete."` sits after `done` so it runs once, not once per file.
- Compare this to the Exercise 5 solution from the previous version that used manual `>>`: `exec` gives the same result with less effort and also captures errors that `>>` would have missed.
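As an optional extension, the loop can also keep count of samples whose quality check fails, so the final message reports more than "complete". This is a sketch under loud assumptions: `run_qc` is a hypothetical stand-in for `fastqc` so the example runs without FastQC installed, it fails on purpose for one placeholder file, and the file names are invented for the demo.

```shell
#!/bin/bash
# Hypothetical stand-in for fastqc so the sketch runs anywhere.
# It fails on purpose for the placeholder file sample_bad.fastq.
run_qc() {
    case "$1" in
        */sample_bad.fastq) return 1 ;;
        *)                  return 0 ;;
    esac
}

# Scratch data directory with empty placeholder files.
DATA_DIR=$(mktemp -d)
touch "$DATA_DIR/sample_A.fastq" "$DATA_DIR/sample_bad.fastq"

FAILED=0
for FILE in "$DATA_DIR"/*.fastq
do
    SAMPLE=$(basename "$FILE" .fastq)
    echo "Quality checking: $SAMPLE"
    # In the real script this call would be the fastqc line from the solution.
    if ! run_qc "$FILE"
    then
        echo "FAILED: $SAMPLE"
        FAILED=$((FAILED + 1))
    fi
done

echo "Pipeline complete. Failed samples: $FAILED"
rm -r "$DATA_DIR"
```

The `if ! command` form tests the command's exit status, so the same pattern should carry over once `run_qc` is replaced by the real tool.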