Estimated time: 15 Minutes

Motivation

Plotting the expression values across all samples for the top variable genes in an experiment can help to visualize how samples cluster together by their expression profiles. When combined with phenotypic data, it can help show how samples with different treatments behave relative to one another.

Exercise

Create a heatmap of the top 50 most variable genes using the pheatmap() function.

Instructions

  • One group member should share their screen in the breakout room. If nobody volunteers, a helper may randomly select someone.
  • The group members should discuss the exercise and work together to find a solution.
  • If there is time after a solution is found, allow all members to complete the exercise.

Example

To get an idea of what we expect to see at the end, let’s look at a toy example from the pheatmap() help examples. There’s no need to run this code, we just want to illustrate the form of the result.

# Copied from the pheatmap documentation

# Create matrix with random normally distributed values
test = matrix(rnorm(200), 20, 10)

# Impose some structure so the heatmap appears interesting
test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4

# Name the rows and columns
colnames(test) = paste("Test", 1:10, sep = "")
rownames(test) = paste("Gene", 1:20, sep = "")

# Draw the heatmap
pheatmap(test)

Steps

  1. Look at the documentation for the pheatmap() function and determine what the most important parameter is. Hint: It’s usually the first or first few parameters. This is telling us what we’ll have to get for the next step.
Answer
?pheatmap
For this exercise, we’ll only need to use the mat parameter, giving the numeric matrix to be plotted.


  1. Extract the rlog normalized expression values for the experiment. Hint: We created an rld object earlier. The assay() function pulls out the values.
Answer
exp_mat = assay(rld)
head(exp_mat)
                   SRR7777895 SRR7777896 SRR7777897 SRR7777898 SRR7777899 SRR7777900
ENSMUSG00000000001  10.514813  10.366709  10.419463  10.840373  10.410449  10.578771
ENSMUSG00000000028  10.604461  10.734506  10.735026  10.682714  10.820938  10.990999
ENSMUSG00000000031  11.160276  10.498747  10.742763  11.861617  10.578156  10.298022
ENSMUSG00000000037   4.642433   4.555500   4.578934   4.656950   4.719012   4.620204
ENSMUSG00000000049   3.017478   2.820198   2.843894   2.936455   2.814833   2.811619
ENSMUSG00000000056  14.321672  14.284652  14.337197  13.904238  14.393912  14.235971


  1. Calculate the variance for each gene in the expression matrix we just extracted. Hint: Look at the help for matrixStats::rowVars() and decide if that’s a reasonable function to use.
Answer
gene_vars = rowVars(exp_mat)
head(gene_vars)
[1] 0.030341701 0.017700141 0.326567308 0.003409940 0.007135949 0.030863320


  1. Get the numerical indices for the top 50 most variable genes. Hint: Run the order() function on a toy example, like order(c(-1.25, 1.3, 5.6, 2.1)), and think about what is being returned. Note, it’s not the values in the original vector. Then look at the help for order() and figure out how how to reverse what is returned.
Answer
order(c(-1.25, 1.3, 5.6, 2.1))
[1] 1 2 4 3
order(c(-1.25, 1.3, 5.6, 2.1), decreasing = TRUE)
[1] 3 4 2 1
ordered_idx = order(gene_vars, decreasing = TRUE)
top_50_idx = ordered_idx[1:50]


  1. Subset the expression matrix from step 2 using this index vector. Hint: Remember square-bracket notation, and that we want to subset the rows, while returning all the columns. Make sure the result has the number of rows you expect, that is, 50.
Answer
top_var_exp_mat = exp_mat[top_50_idx, ]
dim(top_var_exp_mat)
[1] 50  6


  1. Create a heatmap using this subsetted expression matrix using the pheatmap() function.
Answer
pheatmap(top_var_exp_mat)


Saving the result

If time permits, discuss with your group how you might save this heatmap. Hint: Look at the parameters for the function in ?pheatmap. Alternatively, consider how we saved the PCA in the previous module.

LS0tCnRpdGxlOiAiQnJlYWtvdXQgRXhlcmNpc2UgMSAtIEV4cHJlc3Npb24gSGVhdG1hcCIKYXV0aG9yOiAiVU0gQmlvaW5mb3JtYXRpY3MgQ29yZSIKZGF0ZTogImByIFN5cy5EYXRlKClgIgpvdXRwdXQ6CiAgICAgICAgaHRtbF9kb2N1bWVudDoKICAgICAgICAgICAgaW5jbHVkZXM6CiAgICAgICAgICAgICAgICBpbl9oZWFkZXI6IGhlYWRlci5odG1sCiAgICAgICAgICAgIHRoZW1lOiBwYXBlcgogICAgICAgICAgICB0b2M6IHRydWUKICAgICAgICAgICAgdG9jX2RlcHRoOiA0CiAgICAgICAgICAgIHRvY19mbG9hdDogdHJ1ZQogICAgICAgICAgICBudW1iZXJfc2VjdGlvbnM6IGZhbHNlCiAgICAgICAgICAgIGZpZ19jYXB0aW9uOiB0cnVlCiAgICAgICAgICAgIG1hcmtkb3duOiBHRk0KICAgICAgICAgICAgY29kZV9kb3dubG9hZDogdHJ1ZQotLS0KCjxzdHlsZSB0eXBlPSJ0ZXh0L2NzcyI+CmJvZHksIHRkIHsKICAgZm9udC1zaXplOiAxOHB4Owp9CmNvZGUucnsKICBmb250LXNpemU6IDEycHg7Cn0KcHJlIHsKICBmb250LXNpemU6IDEycHgKfQo8L3N0eWxlPgoKYGBge3IsIGluY2x1ZGUgPSBGQUxTRX0Kc291cmNlKCIuLi9iaW4vY2h1bmstb3B0aW9ucy5SIikKa25pdHJfZmlnX3BhdGgoIjA5YS0iKQpgYGAKCmBgYHtyIE1vZHVsZXMsIGV2YWw9VFJVRSwgZWNobz1GQUxTRSwgbWVzc2FnZT1GQUxTRSwgd2FybmluZz1GQUxTRX0KbGlicmFyeShERVNlcTIpCmxpYnJhcnkoZ2dwbG90MikKbGlicmFyeSh0aWR5cikKbGlicmFyeShkcGx5cikKbGlicmFyeShtYXRyaXhTdGF0cykKbGlicmFyeShnZ3JlcGVsKQpsaWJyYXJ5KHBoZWF0bWFwKQpsaWJyYXJ5KFJDb2xvckJyZXdlcikKIyBsb2FkKCJyZGF0YS9SdW5uaW5nRGF0YS5SRGF0YSIpCmBgYAoKRXN0aW1hdGVkIHRpbWU6ICoqMTUgTWludXRlcyoqCgojIE1vdGl2YXRpb24KClBsb3R0aW5nIHRoZSBleHByZXNzaW9uIHZhbHVlcyBhY3Jvc3MgYWxsIHNhbXBsZXMgZm9yIHRoZSB0b3AgdmFyaWFibGUgZ2VuZXMgaW4gYW4gZXhwZXJpbWVudCBjYW4gaGVscCB0byB2aXN1YWxpemUgaG93IHNhbXBsZXMgY2x1c3RlciB0b2dldGhlciBieSB0aGVpciBleHByZXNzaW9uIHByb2ZpbGVzLiBXaGVuIGNvbWJpbmVkIHdpdGggcGhlbm90eXBpYyBkYXRhLCBpdCBjYW4gaGVscCBzaG93IGhvdyBzYW1wbGVzIHdpdGggZGlmZmVyZW50IHRyZWF0bWVudHMgYmVoYXZlIHJlbGF0aXZlIHRvIG9uZSBhbm90aGVyLgoKIyBFeGVyY2lzZQoKQ3JlYXRlIGEgaGVhdG1hcCBvZiB0aGUgdG9wIDUwIG1vc3QgdmFyaWFibGUgZ2VuZXMgdXNpbmcgdGhlIGBwaGVhdG1hcCgpYCBmdW5jdGlvbi4KCiMgSW5zdHJ1Y3Rpb25zCgotIE9uZSBncm91cCBtZW1iZXIgc2hvdWxkIHNoYXJlIHRoZWlyIHNjcmVlbiBpbiB0aGUgYnJlYWtvdXQgcm9vbS4gSWYgbm9ib2R5IHZvbHVudGVlcnMsIGEgaGVscGVyIG1heSByYW5kb21seSBzZWxlY3Qgc29tZW9uZS4KLSBUaGUgZ3JvdXAgbWVtYmVycyBzaG91bGQgZGlzY3VzcyB0aGUgZXhlcmNpc2UgYW5kIHdvcmsgdG9nZXRoZXIgdG8gZmluZCBhIHNvbHV0aW9uLgotIElmIHRoZXJlIGlzIHRpbWUgYWZ0ZXIgYSBzb2x1dGlvbiBpcyBmb3VuZCwgYWxsb3cgYWxsIG1lbWJlcnMgdG8gY29tcGxldGUgdGhlIGV4ZXJjaXNlLgoKIyBFeGFtcGxlCgpUbyBnZXQgYW4gaWRlYSBvZiB3aGF0IHdlIGV4cGVjdCB0byBzZWUgYXQgdGhlIGVuZCwgbGV0J3MgbG9vayBhdCBhIHRveSBleGFtcGxlIGZyb20gdGhlIGBwaGVhdG1hcCgpYCBoZWxwIGV4YW1wbGVzLiBUaGVyZSdzIG5vIG5lZWQgdG8gcnVuIHRoaXMgY29kZSwgd2UganVzdCB3YW50IHRvIGlsbHVzdHJhdGUgdGhlIGZvcm0gb2YgdGhlIHJlc3VsdC4KCmBgYHtyIHRlc3RfaGVhdG1hcH0KIyBDb3BpZWQgZnJvbSB0aGUgcGhlYXRtYXAgZG9jdW1lbnRhdGlvbgoKIyBDcmVhdGUgbWF0cml4IHdpdGggcmFuZG9tIG5vcm1hbGx5IGRpc3RyaWJ1dGVkIHZhbHVlcwp0ZXN0ID0gbWF0cml4KHJub3JtKDIwMCksIDIwLCAxMCkKCiMgSW1wb3NlIHNvbWUgc3RydWN0dXJlIHNvIHRoZSBoZWF0bWFwIGFwcGVhcnMgaW50ZXJlc3RpbmcKdGVzdFsxOjEwLCBzZXEoMSwgMTAsIDIpXSA9IHRlc3RbMToxMCwgc2VxKDEsIDEwLCAyKV0gKyAzCnRlc3RbMTE6MjAsIHNlcSgyLCAxMCwgMildID0gdGVzdFsxMToyMCwgc2VxKDIsIDEwLCAyKV0gKyAyCnRlc3RbMTU6MjAsIHNlcSgyLCAxMCwgMildID0gdGVzdFsxNToyMCwgc2VxKDIsIDEwLCAyKV0gKyA0CgojIE5hbWUgdGhlIHJvd3MgYW5kIGNvbHVtbnMKY29sbmFtZXModGVzdCkgPSBwYXN0ZSgiVGVzdCIsIDE6MTAsIHNlcCA9ICIiKQpyb3duYW1lcyh0ZXN0KSA9IHBhc3RlKCJHZW5lIiwgMToyMCwgc2VwID0gIiIpCgojIERyYXcgdGhlIGhlYXRtYXAKcGhlYXRtYXAodGVzdCkKYGBgCgojIFN0ZXBzCgoxLiBMb29rIGF0IHRoZSBkb2N1bWVudGF0aW9uIGZvciB0aGUgYHBoZWF0bWFwKClgIGZ1bmN0aW9uIGFuZCBkZXRlcm1pbmUgd2hhdCB0aGUgbW9zdCBpbXBvcnRhbnQgcGFyYW1ldGVyIGlzLiBIaW50OiBJdCdzIHVzdWFsbHkgdGhlIGZpcnN0IG9yIGZpcnN0IGZldyBwYXJhbWV0ZXJzLiBUaGlzIGlzIHRlbGxpbmcgdXMgd2hhdCB3ZSdsbCBoYXZlIHRvIGdldCBmb3IgdGhlIG5leHQgc3RlcC4KCjxkZXRhaWxzPgo8c3VtbWFyeT5BbnN3ZXI8L3N1bW1hcnk+CmBgYHtyIHBoZWF0bWFwX2hlbHAsIGV2YWwgPSBGQUxTRX0KP3BoZWF0bWFwCmBgYAoKRm9yIHRoaXMgZXhlcmNpc2UsIHdlJ2xsIG9ubHkgbmVlZCB0byB1c2UgdGhlIGBtYXRgIHBhcmFtZXRlciwgZ2l2aW5nIHRoZSBudW1lcmljIG1hdHJpeCB0byBiZSBwbG90dGVkLgo8L2RldGFpbHM+Cjxicj4KCjIuIEV4dHJhY3QgdGhlIHJsb2cgbm9ybWFsaXplZCBleHByZXNzaW9uIHZhbHVlcyBmb3IgdGhlIGV4cGVyaW1lbnQuIEhpbnQ6IFdlIGNyZWF0ZWQgYW4gYHJsZGAgb2JqZWN0IGVhcmxpZXIuIFRoZSBgYXNzYXkoKWAgZnVuY3Rpb24gcHVsbHMgb3V0IHRoZSB2YWx1ZXMuCgo8ZGV0YWlscz4KPHN1bW1hcnk+QW5zd2VyPC9zdW1tYXJ5PgpgYGB7ciBleHRyYWN0X3JsZF9leHByZXNzaW9ufQpleHBfbWF0ID0gYXNzYXkocmxkKQpoZWFkKGV4cF9tYXQpCmBgYAo8L2RldGFpbHM+Cjxicj4KCjMuIENhbGN1bGF0ZSB0aGUgdmFyaWFuY2UgZm9yIGVhY2ggZ2VuZSBpbiB0aGUgZXhwcmVzc2lvbiBtYXRyaXggd2UganVzdCBleHRyYWN0ZWQuIEhpbnQ6IExvb2sgYXQgdGhlIGhlbHAgZm9yIGBtYXRyaXhTdGF0czo6cm93VmFycygpYCBhbmQgZGVjaWRlIGlmIHRoYXQncyBhIHJlYXNvbmFibGUgZnVuY3Rpb24gdG8gdXNlLgoKPGRldGFpbHM+CjxzdW1tYXJ5PkFuc3dlcjwvc3VtbWFyeT4KYGBge3IgY2FsY19yb3dfdmFyc30KZ2VuZV92YXJzID0gcm93VmFycyhleHBfbWF0KQpoZWFkKGdlbmVfdmFycykKYGBgCjwvZGV0YWlscz4KPGJyPgoKNC4gR2V0IHRoZSBudW1lcmljYWwgaW5kaWNlcyBmb3IgdGhlIHRvcCA1MCBtb3N0IHZhcmlhYmxlIGdlbmVzLiBIaW50OiBSdW4gdGhlIGBvcmRlcigpYCBmdW5jdGlvbiBvbiBhIHRveSBleGFtcGxlLCBsaWtlIGBvcmRlcihjKC0xLjI1LCAxLjMsIDUuNiwgMi4xKSlgLCBhbmQgdGhpbmsgYWJvdXQgd2hhdCBpcyBiZWluZyByZXR1cm5lZC4gTm90ZSwgaXQncyBub3QgdGhlIHZhbHVlcyBpbiB0aGUgb3JpZ2luYWwgdmVjdG9yLiBUaGVuIGxvb2sgYXQgdGhlIGhlbHAgZm9yIGBvcmRlcigpYCBhbmQgZmlndXJlIG91dCBob3cgaG93IHRvIHJldmVyc2Ugd2hhdCBpcyByZXR1cm5lZC4KCjxkZXRhaWxzPgo8c3VtbWFyeT5BbnN3ZXI8L3N1bW1hcnk+CmBgYHtyIG9yZGVyX3ZhcnN9Cm9yZGVyKGMoLTEuMjUsIDEuMywgNS42LCAyLjEpKQpvcmRlcihjKC0xLjI1LCAxLjMsIDUuNiwgMi4xKSwgZGVjcmVhc2luZyA9IFRSVUUpCgpvcmRlcmVkX2lkeCA9IG9yZGVyKGdlbmVfdmFycywgZGVjcmVhc2luZyA9IFRSVUUpCnRvcF81MF9pZHggPSBvcmRlcmVkX2lkeFsxOjUwXQpgYGAKPC9kZXRhaWxzPgo8YnI+Cgo1LiBTdWJzZXQgdGhlIGV4cHJlc3Npb24gbWF0cml4IGZyb20gc3RlcCAyIHVzaW5nIHRoaXMgaW5kZXggdmVjdG9yLiBIaW50OiBSZW1lbWJlciBzcXVhcmUtYnJhY2tldCBub3RhdGlvbiwgYW5kIHRoYXQgd2Ugd2FudCB0byBzdWJzZXQgdGhlIHJvd3MsIHdoaWxlIHJldHVybmluZyBhbGwgdGhlIGNvbHVtbnMuIE1ha2Ugc3VyZSB0aGUgcmVzdWx0IGhhcyB0aGUgbnVtYmVyIG9mIHJvd3MgeW91IGV4cGVjdCwgdGhhdCBpcywgNTAuCgo8ZGV0YWlscz4KPHN1bW1hcnk+QW5zd2VyPC9zdW1tYXJ5PgpgYGB7ciBleHRyYWN0X3RvcF9leHB9CnRvcF92YXJfZXhwX21hdCA9IGV4cF9tYXRbdG9wXzUwX2lkeCwgXQpkaW0odG9wX3Zhcl9leHBfbWF0KQpgYGAKPC9kZXRhaWxzPgo8YnI+Cgo2LiBDcmVhdGUgYSBoZWF0bWFwIHVzaW5nIHRoaXMgc3Vic2V0dGVkIGV4cHJlc3Npb24gbWF0cml4IHVzaW5nIHRoZSBgcGhlYXRtYXAoKWAgZnVuY3Rpb24uCgo8ZGV0YWlscz4KPHN1bW1hcnk+QW5zd2VyPC9zdW1tYXJ5PgpgYGB7ciBjcmVhdGVfaGVhdG1hcH0KcGhlYXRtYXAodG9wX3Zhcl9leHBfbWF0KQpgYGAKPC9kZXRhaWxzPgo8YnI+CgojIFNhdmluZyB0aGUgcmVzdWx0CgpJZiB0aW1lIHBlcm1pdHMsIGRpc2N1c3Mgd2l0aCB5b3VyIGdyb3VwIGhvdyB5b3UgbWlnaHQgc2F2ZSB0aGlzIGhlYXRtYXAuIEhpbnQ6IExvb2sgYXQgdGhlIHBhcmFtZXRlcnMgZm9yIHRoZSBmdW5jdGlvbiBpbiBgP3BoZWF0bWFwYC4gQWx0ZXJuYXRpdmVseSwgY29uc2lkZXIgaG93IHdlIHNhdmVkIHRoZSBQQ0EgaW4gdGhlIHByZXZpb3VzIG1vZHVsZS4KCmBgYHtyIFdyaXRlT3V0LlJEYXRhLCBldmFsPUZBTFNFLCBlY2hvPUZBTFNFLCBtZXNzYWdlPUZBTFNFLCB3YXJuaW5nPUZBTFNFfQojSGlkZGVuIGNvZGUgYmxvY2sgdG8gd3JpdGUgb3V0IGRhdGEgZm9yIGtuaXR0aW5nCiMgc2F2ZS5pbWFnZShmaWxlID0gInJkYXRhL1J1bm5pbmdEYXRhLlJEYXRhIikKYGBgCg==