15 Minutes
We just learned how to use RSEM and STAR on a single sample; now we need to align the rest of our samples to the reference genome. This breakout exercise builds on concepts we’ve learned previously.
Based on our earlier breakout exercise, a for-loop that uses a bash variable (here, SAMPLE) would look something like this:
for SAMPLE in sample_B sample_C sample_D sample_E sample_F
do
    rsem-calculate-expression --star --num-threads 1 --star-gzipped-read-file \
        --star-output-genome-bam --keep-intermediate-files \
        "out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz" \
        ../refs/GRCm38.102.chr19reduced \
        "out_rsem/${SAMPLE}"
done
Using the nano editor, place the appropriate code into a file to create the script, then execute the script.
# Use the nano editor to create a script
nano aligning_B-F.sh # Insert commands into editor, then close the file
# Run the script
bash aligning_B-F.sh
Optional: Add execute permissions to the script before executing it.
If going this route, you can call the script directly, without calling bash.
Note that since the script is in the current directory, and the shell otherwise only searches the directories listed in PATH, you’ll have to say where the script is when calling it, by prefixing its name with ./ (which represents the current directory).
# Add execute permissions
chmod +x aligning_B-F.sh
# Run the script
./aligning_B-F.sh
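When a script is called directly, its first line can name the interpreter that should run it. As a minimal sketch of both ways of running a script, the demo below writes a throwaway script with a #!/bin/bash first line into a temporary directory, then runs it with bash and with ./ after adding execute permissions. DEMO_DIR and hello.sh are hypothetical names used only for this illustration, and the sketch assumes the temporary directory allows files to be executed.

```shell
# Create a throwaway directory and a tiny demo script (hello.sh is a made-up name)
DEMO_DIR=$(mktemp -d)
cat > "${DEMO_DIR}/hello.sh" <<'EOF'
#!/bin/bash
echo "hello from the script"
EOF

# Without execute permissions, the script can still be run by calling bash
bash "${DEMO_DIR}/hello.sh"

# With execute permissions, the script can be called directly using ./
chmod +x "${DEMO_DIR}/hello.sh"
cd "${DEMO_DIR}"
./hello.sh

# Return to the previous directory (cd - prints its name, so we hide that)
cd - > /dev/null
```

The same first line could be added to the top of aligning_B-F.sh if you plan to call it with ./ rather than with bash.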
Helper Hints: When using a for-loop approach, it can be helpful to build up to the end result in small steps, sometimes using a “dry-run” command as a test case, so that learners can see what their code will do before it runs.
- Echoing filenames is an easy place to start.
- Iterating over a single sample might also be helpful when testing.
Example echoing filenames:
for SAMPLE in sample_B sample_C sample_D sample_E sample_F
do
    echo "in_file: out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz"
    echo "out_prefix: out_rsem/${SAMPLE}"
done
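One step beyond echoing filenames is to echo the whole command that the loop would run. The sketch below is one possible way to do that, not part of the original exercise: run and DRY_RUN are hypothetical names introduced here. While DRY_RUN is yes (the default in this sketch), each command is printed rather than executed.

```shell
# DRY_RUN and run are hypothetical names introduced for this sketch.
# Leave DRY_RUN=yes to print commands; set DRY_RUN=no to execute them.
DRY_RUN=${DRY_RUN:-yes}

run() {
    if [ "${DRY_RUN}" = "yes" ]; then
        # Print the command instead of running it
        echo "[dry-run] $*"
    else
        # Run the command exactly as given
        "$@"
    fi
}

for SAMPLE in sample_B sample_C sample_D sample_E sample_F
do
    run rsem-calculate-expression --star --num-threads 1 --star-gzipped-read-file \
        --star-output-genome-bam --keep-intermediate-files \
        "out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz" \
        ../refs/GRCm38.102.chr19reduced \
        "out_rsem/${SAMPLE}"
done
```

Once the printed commands look right, changing DRY_RUN to no would run them for real without touching the loop itself.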
Example iterating over a single sample (sample_A, which we’ve already aligned prior to the breakout exercise):
for SAMPLE in sample_A
do
    rsem-calculate-expression --star --num-threads 1 --star-gzipped-read-file \
        --star-output-genome-bam --keep-intermediate-files \
        "out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz" \
        ../refs/GRCm38.102.chr19reduced \
        "out_rsem/${SAMPLE}"
done
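After a test run on a single sample, one quick sanity check is to confirm that a result file was actually written. The sketch below assumes RSEM's usual gene-level output name, which is the output prefix followed by .genes.results. check_outputs is a hypothetical helper written for this illustration, not an RSEM command.

```shell
# check_outputs is a hypothetical helper name. The first argument is the
# output directory; the remaining arguments are sample names.
check_outputs() {
    out_dir=$1
    shift
    missing=0
    for SAMPLE in "$@"
    do
        # Assumes gene-level results are written to <prefix>.genes.results;
        # -s is true only if the file exists and is not empty
        if [ -s "${out_dir}/${SAMPLE}.genes.results" ]; then
            echo "found:   ${out_dir}/${SAMPLE}.genes.results"
        else
            echo "missing: ${out_dir}/${SAMPLE}.genes.results"
            missing=1
        fi
    done
    # Return non-zero if any expected file was missing or empty
    return "${missing}"
}

check_outputs out_rsem sample_A || echo "At least one expected result file is missing or empty."
```

The same function could be given all of the sample names once the full loop has finished.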