Cutadapt All Samples Exercise (Breakout)


15 Minutes


Now that we’ve learned the basics of running Cutadapt, we need to trim the rest of our samples. Recall that the Computational Foundations course introduced bash variables. Let’s try an exercise where we use a bash variable to trim each of our remaining FASTQ files.


Instructions:


  • One group member should share their screen in the breakout room. If nobody volunteers, a helper may randomly select someone.
  • The group members should discuss the exercise and work together to find a solution.
  • After a solution is found, allow time for all members to complete the exercise.


  • Review Cutadapt’s help page and choose the proper arguments for our Cutadapt command(s).
  • Use a bash variable along with Cutadapt to trim all remaining FASTQ files.
  • Confirm that we have all of our expected output files.


Hint: Using a bash variable allows us to quickly change some arguments in a repeated command, e.g.:

noun="World"
echo "Hello, $noun!"
noun="Class"
echo "Hello, $noun!"


Solution - Cutadapt All Samples Exercise


One solution is to define a bash variable for the sample, use that variable in a Cutadapt command, and then redefine the variable and repeat the Cutadapt command for each remaining sample.

# Define a variable $SAMPLE
SAMPLE=sample_B
# Create a command using the variable $SAMPLE
cutadapt -q 30 -m 20 -o out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz ../reads/${SAMPLE}_R1.fastq.gz

# Redefine the variable and run the command for each additional sample
SAMPLE=sample_C
cutadapt -q 30 -m 20 -o out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz ../reads/${SAMPLE}_R1.fastq.gz

SAMPLE=sample_D
cutadapt -q 30 -m 20 -o out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz ../reads/${SAMPLE}_R1.fastq.gz

SAMPLE=sample_E
cutadapt -q 30 -m 20 -o out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz ../reads/${SAMPLE}_R1.fastq.gz

SAMPLE=sample_F
cutadapt -q 30 -m 20 -o out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz ../reads/${SAMPLE}_R1.fastq.gz


Another solution is to create a for-loop with our bash variable and Cutadapt command, e.g.:

# Run the same Cutadapt command once for each remaining sample
for SAMPLE in sample_B sample_C sample_D sample_E sample_F
do
    cutadapt -q 30 -m 20 -o out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz ../reads/${SAMPLE}_R1.fastq.gz
done
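
For the last step of the exercise, confirming the expected output files, one possible check is sketched below. It assumes the same sample names and out_trimmed/ paths used in the solutions above; the variable names OUT and MISSING are only illustrative. The `-s` test is true when a file exists and is not empty.

```shell
# Count expected trimmed files that are missing or empty
MISSING=0
for SAMPLE in sample_B sample_C sample_D sample_E sample_F
do
    # Assumed output path, matching the -o argument used in the solutions
    OUT="out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz"
    if [ -s "$OUT" ]; then
        echo "OK       $OUT"
    else
        echo "MISSING  $OUT"
        MISSING=$((MISSING + 1))
    fi
done
echo "Missing or empty files: $MISSING"
```

If every file was written, the final line should report 0. A plain `ls -l out_trimmed/` is a perfectly good alternative for a handful of samples.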


Helper Hint: If suggesting a for-loop approach, it can help to first build a “dry-run” command as a test case, so that learners see what their code will do before they run it. Echoing the filenames first is a good starting point.
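
A dry run along these lines could be as simple as putting `echo` in front of the Cutadapt command, so that each command is printed rather than executed. This is only a sketch, using the same sample names and paths as the solution above:

```shell
# Dry run: print each Cutadapt command instead of running it
for SAMPLE in sample_B sample_C sample_D sample_E sample_F
do
    echo cutadapt -q 30 -m 20 -o "out_trimmed/${SAMPLE}_R1.trimmed.fastq.gz" "../reads/${SAMPLE}_R1.fastq.gz"
done
```

Once the printed commands look right, removing `echo` runs the real trimming.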

