Estimated time: 15 Minutes

Motivation

Plotting the expression values across all samples for the top variable genes in an experiment can help to visualize how samples cluster together by their expression profiles. When combined with phenotypic data, it can help show how samples with different treatments behave relative to one another.

Exercise

Create a heatmap of the top 50 most variable genes using the pheatmap() function using the rlog normalized data in the rld object.

Instructions

  • One group member should share their screen in the breakout room. If nobody volunteers, a helper may randomly select someone.
  • The group members should discuss the exercise and work together to find a solution.
  • If there is time after a solution is found, allow all members to complete the exercise.

Example

To get an idea of what we expect to see at the end, let’s look at a toy example from the pheatmap() help examples. There’s no need to run this code, we just want to illustrate the form of the result.

# Copied from the pheatmap documentation

# Create matrix with random normally distributed values
test = matrix(rnorm(200), 20, 10)

# Impose some structure so the heatmap appears interesting
test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4

# Name the rows and columns
colnames(test) = paste("Test", 1:10, sep = "")
rownames(test) = paste("Gene", 1:20, sep = "")

# Draw the heatmap
pheatmap(test)

Steps

  1. Look at the documentation for the pheatmap() function and determine what the most important parameter is. Hint: It’s usually the first or first few parameters. This is telling us what we’ll have to get for the next step.
Answer
?pheatmap
For this exercise, we’ll only need to use the mat parameter, giving the numeric matrix to be plotted.


  1. Extract the rlog normalized expression values for the experiment. Hint: We created an rld object earlier. The assay() function pulls out the values.
Answer
exp_mat = assay(rld)
head(exp_mat)
                    sample_A  sample_B  sample_C  sample_D  sample_E  sample_F
ENSMUSG00000000001 10.514813 10.366709 10.419463 10.840373 10.410449 10.578771
ENSMUSG00000000028 10.604461 10.734506 10.735026 10.682714 10.820938 10.990999
ENSMUSG00000000031 11.160276 10.498747 10.742763 11.861617 10.578156 10.298022
ENSMUSG00000000037  4.642433  4.555500  4.578934  4.656950  4.719012  4.620204
ENSMUSG00000000049  3.017478  2.820198  2.843894  2.936455  2.814833  2.811619
ENSMUSG00000000056 14.321672 14.284652 14.337197 13.904238 14.393912 14.235971


  1. Calculate the variance for each gene in the expression matrix we just extracted. Hint: Look at the help for matrixStats::rowVars() and decide if that’s a reasonable function to use.
Answer
gene_vars = rowVars(exp_mat)
head(gene_vars)
[1] 0.030341701 0.017700141 0.326567308 0.003409940 0.007135949 0.030863320


  1. Get the numerical indices for the top 50 most variable genes. Hint: Run the order() function on a toy example, like order(c(-1.25, 1.3, 5.6, 2.1)), and think about what is being returned. Note, it’s not the values in the original vector. Then look at the help for order() and figure out how how to reverse what is returned.
Answer
order(c(-1.25, 1.3, 5.6, 2.1))
[1] 1 2 4 3
order(c(-1.25, 1.3, 5.6, 2.1), decreasing = TRUE)
[1] 3 4 2 1
ordered_idx = order(gene_vars, decreasing = TRUE)
top_50_idx = ordered_idx[1:50]


  1. Subset the expression matrix from step 2 using this index vector. Hint: Remember square-bracket notation, and that we want to subset the rows, while returning all the columns. Make sure the result has the number of rows you expect, that is, 50.
Answer
top_var_exp_mat = exp_mat[top_50_idx, ]
dim(top_var_exp_mat)
[1] 50  6


  1. Create a heatmap using this subsetted expression matrix using the pheatmap() function.
Answer
pheatmap(top_var_exp_mat)


Saving the result

If time permits, discuss with your group how you might save this heatmap. Hint: Look at the parameters for the function in ?pheatmap. Alternatively, consider how we saved the PCA in the previous module.

LS0tCnRpdGxlOiAiQnJlYWtvdXQgRXhlcmNpc2UgMSAtIEV4cHJlc3Npb24gSGVhdG1hcCIKYXV0aG9yOiAiVU0gQmlvaW5mb3JtYXRpY3MgQ29yZSIKZGF0ZTogImByIFN5cy5EYXRlKClgIgpvdXRwdXQ6CiAgICAgICAgaHRtbF9kb2N1bWVudDoKICAgICAgICAgICAgaW5jbHVkZXM6CiAgICAgICAgICAgICAgICBpbl9oZWFkZXI6IGhlYWRlci5odG1sCiAgICAgICAgICAgIHRoZW1lOiBwYXBlcgogICAgICAgICAgICB0b2M6IHRydWUKICAgICAgICAgICAgdG9jX2RlcHRoOiA0CiAgICAgICAgICAgIHRvY19mbG9hdDogdHJ1ZQogICAgICAgICAgICBudW1iZXJfc2VjdGlvbnM6IGZhbHNlCiAgICAgICAgICAgIGZpZ19jYXB0aW9uOiB0cnVlCiAgICAgICAgICAgIG1hcmtkb3duOiBHRk0KICAgICAgICAgICAgY29kZV9kb3dubG9hZDogdHJ1ZQotLS0KCjxzdHlsZSB0eXBlPSJ0ZXh0L2NzcyI+CmJvZHksIHRkIHsKICAgZm9udC1zaXplOiAxOHB4Owp9CmNvZGUucnsKICBmb250LXNpemU6IDEycHg7Cn0KcHJlIHsKICBmb250LXNpemU6IDEycHgKfQo8L3N0eWxlPgoKYGBge3IsIGluY2x1ZGUgPSBGQUxTRX0Kc291cmNlKCIuLi9iaW4vY2h1bmstb3B0aW9ucy5SIikKa25pdHJfZmlnX3BhdGgoIjA5YS0iKQpgYGAKCmBgYHtyIE1vZHVsZXMsIGV2YWw9VFJVRSwgZWNobz1GQUxTRSwgbWVzc2FnZT1GQUxTRSwgd2FybmluZz1GQUxTRX0KbGlicmFyeShERVNlcTIpCmxpYnJhcnkoZ2dwbG90MikKbGlicmFyeSh0aWR5cikKbGlicmFyeShkcGx5cikKbGlicmFyeShtYXRyaXhTdGF0cykKbGlicmFyeShnZ3JlcGVsKQpsaWJyYXJ5KHBoZWF0bWFwKQpsaWJyYXJ5KFJDb2xvckJyZXdlcikKIyBsb2FkKCJyZGF0YS9SdW5uaW5nRGF0YS5SRGF0YSIpCmBgYAoKRXN0aW1hdGVkIHRpbWU6ICoqMTUgTWludXRlcyoqCgojIE1vdGl2YXRpb24KClBsb3R0aW5nIHRoZSBleHByZXNzaW9uIHZhbHVlcyBhY3Jvc3MgYWxsIHNhbXBsZXMgZm9yIHRoZSB0b3AgdmFyaWFibGUgZ2VuZXMgaW4gYW4gZXhwZXJpbWVudCBjYW4gaGVscCB0byB2aXN1YWxpemUgaG93IHNhbXBsZXMgY2x1c3RlciB0b2dldGhlciBieSB0aGVpciBleHByZXNzaW9uIHByb2ZpbGVzLiBXaGVuIGNvbWJpbmVkIHdpdGggcGhlbm90eXBpYyBkYXRhLCBpdCBjYW4gaGVscCBzaG93IGhvdyBzYW1wbGVzIHdpdGggZGlmZmVyZW50IHRyZWF0bWVudHMgYmVoYXZlIHJlbGF0aXZlIHRvIG9uZSBhbm90aGVyLgoKIyBFeGVyY2lzZQoKQ3JlYXRlIGEgaGVhdG1hcCBvZiB0aGUgdG9wIDUwIG1vc3QgdmFyaWFibGUgZ2VuZXMgdXNpbmcgdGhlIGBwaGVhdG1hcCgpYCBmdW5jdGlvbiB1c2luZyB0aGUgcmxvZyBub3JtYWxpemVkIGRhdGEgaW4gdGhlIGBybGRgIG9iamVjdC4KCiMgSW5zdHJ1Y3Rpb25zCgotIE9uZSBncm91cCBtZW1iZXIgc2hvdWxkIHNoYXJlIHRoZWlyIHNjcmVlbiBpbiB0aGUgYnJlYWtvdXQgcm9vbS4gSWYgbm9ib2R5IHZvbHVudGVlcnMsIGEgaGVscGVyIG1heSByYW5kb21seSBzZWxlY3Qgc29tZW9uZS4KLSBUaGUgZ3JvdXAgbWVtYmVycyBzaG91bGQgZGlzY3VzcyB0aGUgZXhlcmNpc2UgYW5kIHdvcmsgdG9nZXRoZXIgdG8gZmluZCBhIHNvbHV0aW9uLgotIElmIHRoZXJlIGlzIHRpbWUgYWZ0ZXIgYSBzb2x1dGlvbiBpcyBmb3VuZCwgYWxsb3cgYWxsIG1lbWJlcnMgdG8gY29tcGxldGUgdGhlIGV4ZXJjaXNlLgoKIyBFeGFtcGxlCgpUbyBnZXQgYW4gaWRlYSBvZiB3aGF0IHdlIGV4cGVjdCB0byBzZWUgYXQgdGhlIGVuZCwgbGV0J3MgbG9vayBhdCBhIHRveSBleGFtcGxlIGZyb20gdGhlIGBwaGVhdG1hcCgpYCBoZWxwIGV4YW1wbGVzLiBUaGVyZSdzIG5vIG5lZWQgdG8gcnVuIHRoaXMgY29kZSwgd2UganVzdCB3YW50IHRvIGlsbHVzdHJhdGUgdGhlIGZvcm0gb2YgdGhlIHJlc3VsdC4KCmBgYHtyIHRlc3RfaGVhdG1hcH0KIyBDb3BpZWQgZnJvbSB0aGUgcGhlYXRtYXAgZG9jdW1lbnRhdGlvbgoKIyBDcmVhdGUgbWF0cml4IHdpdGggcmFuZG9tIG5vcm1hbGx5IGRpc3RyaWJ1dGVkIHZhbHVlcwp0ZXN0ID0gbWF0cml4KHJub3JtKDIwMCksIDIwLCAxMCkKCiMgSW1wb3NlIHNvbWUgc3RydWN0dXJlIHNvIHRoZSBoZWF0bWFwIGFwcGVhcnMgaW50ZXJlc3RpbmcKdGVzdFsxOjEwLCBzZXEoMSwgMTAsIDIpXSA9IHRlc3RbMToxMCwgc2VxKDEsIDEwLCAyKV0gKyAzCnRlc3RbMTE6MjAsIHNlcSgyLCAxMCwgMildID0gdGVzdFsxMToyMCwgc2VxKDIsIDEwLCAyKV0gKyAyCnRlc3RbMTU6MjAsIHNlcSgyLCAxMCwgMildID0gdGVzdFsxNToyMCwgc2VxKDIsIDEwLCAyKV0gKyA0CgojIE5hbWUgdGhlIHJvd3MgYW5kIGNvbHVtbnMKY29sbmFtZXModGVzdCkgPSBwYXN0ZSgiVGVzdCIsIDE6MTAsIHNlcCA9ICIiKQpyb3duYW1lcyh0ZXN0KSA9IHBhc3RlKCJHZW5lIiwgMToyMCwgc2VwID0gIiIpCgojIERyYXcgdGhlIGhlYXRtYXAKcGhlYXRtYXAodGVzdCkKYGBgCgojIFN0ZXBzCgoxLiBMb29rIGF0IHRoZSBkb2N1bWVudGF0aW9uIGZvciB0aGUgYHBoZWF0bWFwKClgIGZ1bmN0aW9uIGFuZCBkZXRlcm1pbmUgd2hhdCB0aGUgbW9zdCBpbXBvcnRhbnQgcGFyYW1ldGVyIGlzLiBIaW50OiBJdCdzIHVzdWFsbHkgdGhlIGZpcnN0IG9yIGZpcnN0IGZldyBwYXJhbWV0ZXJzLiBUaGlzIGlzIHRlbGxpbmcgdXMgd2hhdCB3ZSdsbCBoYXZlIHRvIGdldCBmb3IgdGhlIG5leHQgc3RlcC4KCjxkZXRhaWxzPgo8c3VtbWFyeT5BbnN3ZXI8L3N1bW1hcnk+CmBgYHtyIHBoZWF0bWFwX2hlbHAsIGV2YWwgPSBGQUxTRX0KP3BoZWF0bWFwCmBgYAoKRm9yIHRoaXMgZXhlcmNpc2UsIHdlJ2xsIG9ubHkgbmVlZCB0byB1c2UgdGhlIGBtYXRgIHBhcmFtZXRlciwgZ2l2aW5nIHRoZSBudW1lcmljIG1hdHJpeCB0byBiZSBwbG90dGVkLgo8L2RldGFpbHM+Cjxicj4KCjIuIEV4dHJhY3QgdGhlIHJsb2cgbm9ybWFsaXplZCBleHByZXNzaW9uIHZhbHVlcyBmb3IgdGhlIGV4cGVyaW1lbnQuIEhpbnQ6IFdlIGNyZWF0ZWQgYW4gYHJsZGAgb2JqZWN0IGVhcmxpZXIuIFRoZSBgYXNzYXkoKWAgZnVuY3Rpb24gcHVsbHMgb3V0IHRoZSB2YWx1ZXMuCgo8ZGV0YWlscz4KPHN1bW1hcnk+QW5zd2VyPC9zdW1tYXJ5PgpgYGB7ciBleHRyYWN0X3JsZF9leHByZXNzaW9ufQpleHBfbWF0ID0gYXNzYXkocmxkKQpoZWFkKGV4cF9tYXQpCmBgYAo8L2RldGFpbHM+Cjxicj4KCjMuIENhbGN1bGF0ZSB0aGUgdmFyaWFuY2UgZm9yIGVhY2ggZ2VuZSBpbiB0aGUgZXhwcmVzc2lvbiBtYXRyaXggd2UganVzdCBleHRyYWN0ZWQuIEhpbnQ6IExvb2sgYXQgdGhlIGhlbHAgZm9yIGBtYXRyaXhTdGF0czo6cm93VmFycygpYCBhbmQgZGVjaWRlIGlmIHRoYXQncyBhIHJlYXNvbmFibGUgZnVuY3Rpb24gdG8gdXNlLgoKPGRldGFpbHM+CjxzdW1tYXJ5PkFuc3dlcjwvc3VtbWFyeT4KYGBge3IgY2FsY19yb3dfdmFyc30KZ2VuZV92YXJzID0gcm93VmFycyhleHBfbWF0KQpoZWFkKGdlbmVfdmFycykKYGBgCjwvZGV0YWlscz4KPGJyPgoKNC4gR2V0IHRoZSBudW1lcmljYWwgaW5kaWNlcyBmb3IgdGhlIHRvcCA1MCBtb3N0IHZhcmlhYmxlIGdlbmVzLiBIaW50OiBSdW4gdGhlIGBvcmRlcigpYCBmdW5jdGlvbiBvbiBhIHRveSBleGFtcGxlLCBsaWtlIGBvcmRlcihjKC0xLjI1LCAxLjMsIDUuNiwgMi4xKSlgLCBhbmQgdGhpbmsgYWJvdXQgd2hhdCBpcyBiZWluZyByZXR1cm5lZC4gTm90ZSwgaXQncyBub3QgdGhlIHZhbHVlcyBpbiB0aGUgb3JpZ2luYWwgdmVjdG9yLiBUaGVuIGxvb2sgYXQgdGhlIGhlbHAgZm9yIGBvcmRlcigpYCBhbmQgZmlndXJlIG91dCBob3cgaG93IHRvIHJldmVyc2Ugd2hhdCBpcyByZXR1cm5lZC4KCjxkZXRhaWxzPgo8c3VtbWFyeT5BbnN3ZXI8L3N1bW1hcnk+CmBgYHtyIG9yZGVyX3ZhcnN9Cm9yZGVyKGMoLTEuMjUsIDEuMywgNS42LCAyLjEpKQpvcmRlcihjKC0xLjI1LCAxLjMsIDUuNiwgMi4xKSwgZGVjcmVhc2luZyA9IFRSVUUpCgpvcmRlcmVkX2lkeCA9IG9yZGVyKGdlbmVfdmFycywgZGVjcmVhc2luZyA9IFRSVUUpCnRvcF81MF9pZHggPSBvcmRlcmVkX2lkeFsxOjUwXQpgYGAKPC9kZXRhaWxzPgo8YnI+Cgo1LiBTdWJzZXQgdGhlIGV4cHJlc3Npb24gbWF0cml4IGZyb20gc3RlcCAyIHVzaW5nIHRoaXMgaW5kZXggdmVjdG9yLiBIaW50OiBSZW1lbWJlciBzcXVhcmUtYnJhY2tldCBub3RhdGlvbiwgYW5kIHRoYXQgd2Ugd2FudCB0byBzdWJzZXQgdGhlIHJvd3MsIHdoaWxlIHJldHVybmluZyBhbGwgdGhlIGNvbHVtbnMuIE1ha2Ugc3VyZSB0aGUgcmVzdWx0IGhhcyB0aGUgbnVtYmVyIG9mIHJvd3MgeW91IGV4cGVjdCwgdGhhdCBpcywgNTAuCgo8ZGV0YWlscz4KPHN1bW1hcnk+QW5zd2VyPC9zdW1tYXJ5PgpgYGB7ciBleHRyYWN0X3RvcF9leHB9CnRvcF92YXJfZXhwX21hdCA9IGV4cF9tYXRbdG9wXzUwX2lkeCwgXQpkaW0odG9wX3Zhcl9leHBfbWF0KQpgYGAKPC9kZXRhaWxzPgo8YnI+Cgo2LiBDcmVhdGUgYSBoZWF0bWFwIHVzaW5nIHRoaXMgc3Vic2V0dGVkIGV4cHJlc3Npb24gbWF0cml4IHVzaW5nIHRoZSBgcGhlYXRtYXAoKWAgZnVuY3Rpb24uCgo8ZGV0YWlscz4KPHN1bW1hcnk+QW5zd2VyPC9zdW1tYXJ5PgpgYGB7ciBjcmVhdGVfaGVhdG1hcH0KcGhlYXRtYXAodG9wX3Zhcl9leHBfbWF0KQpgYGAKPC9kZXRhaWxzPgo8YnI+CgojIFNhdmluZyB0aGUgcmVzdWx0CgpJZiB0aW1lIHBlcm1pdHMsIGRpc2N1c3Mgd2l0aCB5b3VyIGdyb3VwIGhvdyB5b3UgbWlnaHQgc2F2ZSB0aGlzIGhlYXRtYXAuIEhpbnQ6IExvb2sgYXQgdGhlIHBhcmFtZXRlcnMgZm9yIHRoZSBmdW5jdGlvbiBpbiBgP3BoZWF0bWFwYC4gQWx0ZXJuYXRpdmVseSwgY29uc2lkZXIgaG93IHdlIHNhdmVkIHRoZSBQQ0EgaW4gdGhlIHByZXZpb3VzIG1vZHVsZS4KCmBgYHtyIFdyaXRlT3V0LlJEYXRhLCBldmFsPUZBTFNFLCBlY2hvPUZBTFNFLCBtZXNzYWdlPUZBTFNFLCB3YXJuaW5nPUZBTFNFfQojSGlkZGVuIGNvZGUgYmxvY2sgdG8gd3JpdGUgb3V0IGRhdGEgZm9yIGtuaXR0aW5nCiMgc2F2ZS5pbWFnZShmaWxlID0gInJkYXRhL1J1bm5pbmdEYXRhLlJEYXRhIikKYGBgCg==