Estimated time: 15 Minutes
Motivation
Plotting the expression values across all samples for the top
variable genes in an experiment can help to visualize how samples
cluster together by their expression profiles. When combined with
phenotypic data, it can help show how samples with different treatments
behave relative to one another.
Exercise
Create a heatmap of the top 50 most variable genes using the
pheatmap()
function using the rlog normalized data in the
rld
object.
Instructions
- One group member should share their screen in the breakout room. If
nobody volunteers, a helper may randomly select someone.
- The group members should discuss the exercise and work together to
find a solution.
- If there is time after a solution is found, allow all members to
complete the exercise.
Example
To get an idea of what we expect to see at the end, let’s look at a
toy example from the pheatmap()
help examples. There’s no
need to run this code, we just want to illustrate the form of the
result.
# Copied from the pheatmap documentation
# Create matrix with random normally distributed values
test = matrix(rnorm(200), 20, 10)
# Impose some structure so the heatmap appears interesting
test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4
# Name the rows and columns
colnames(test) = paste("Test", 1:10, sep = "")
rownames(test) = paste("Gene", 1:20, sep = "")
# Draw the heatmap
pheatmap(test)
Steps
- Look at the documentation for the
pheatmap()
function
and determine what the most important parameter is. Hint: It’s usually
the first or first few parameters. This is telling us what we’ll have to
get for the next step.
Answer
?pheatmap
For this exercise, we’ll only need to use the mat
parameter, giving the numeric matrix to be plotted.
- Extract the rlog normalized expression values for the experiment.
Hint: We created an
rld
object earlier. The
assay()
function pulls out the values.
Answer
exp_mat = assay(rld)
head(exp_mat)
sample_A sample_B sample_C sample_D sample_E sample_F
ENSMUSG00000000001 10.514813 10.366709 10.419463 10.840373 10.410449 10.578771
ENSMUSG00000000028 10.604461 10.734506 10.735026 10.682714 10.820938 10.990999
ENSMUSG00000000031 11.160276 10.498747 10.742763 11.861617 10.578156 10.298022
ENSMUSG00000000037 4.642433 4.555500 4.578934 4.656950 4.719012 4.620204
ENSMUSG00000000049 3.017478 2.820198 2.843894 2.936455 2.814833 2.811619
ENSMUSG00000000056 14.321672 14.284652 14.337197 13.904238 14.393912 14.235971
- Calculate the variance for each gene in the expression matrix we
just extracted. Hint: Look at the help for
matrixStats::rowVars()
and decide if that’s a reasonable
function to use.
Answer
gene_vars = rowVars(exp_mat)
head(gene_vars)
[1] 0.030341701 0.017700141 0.326567308 0.003409940 0.007135949 0.030863320
- Get the numerical indices for the top 50 most variable genes. Hint:
Run the
order()
function on a toy example, like
order(c(-1.25, 1.3, 5.6, 2.1))
, and think about what is
being returned. Note, it’s not the values in the original vector. Then
look at the help for order()
and figure out how how to
reverse what is returned.
Answer
order(c(-1.25, 1.3, 5.6, 2.1))
[1] 1 2 4 3
order(c(-1.25, 1.3, 5.6, 2.1), decreasing = TRUE)
[1] 3 4 2 1
ordered_idx = order(gene_vars, decreasing = TRUE)
top_50_idx = ordered_idx[1:50]
- Subset the expression matrix from step 2 using this index vector.
Hint: Remember square-bracket notation, and that we want to subset the
rows, while returning all the columns. Make sure the result has the
number of rows you expect, that is, 50.
Answer
top_var_exp_mat = exp_mat[top_50_idx, ]
dim(top_var_exp_mat)
[1] 50 6
- Create a heatmap using this subsetted expression matrix using the
pheatmap()
function.
Answer
pheatmap(top_var_exp_mat)
Saving the result
If time permits, discuss with your group how you might save this
heatmap. Hint: Look at the parameters for the function in
?pheatmap
. Alternatively, consider how we saved the PCA in
the previous module.
LS0tCnRpdGxlOiAiQnJlYWtvdXQgRXhlcmNpc2UgMSAtIEV4cHJlc3Npb24gSGVhdG1hcCIKYXV0aG9yOiAiVU0gQmlvaW5mb3JtYXRpY3MgQ29yZSIKZGF0ZTogImByIFN5cy5EYXRlKClgIgpvdXRwdXQ6CiAgICAgICAgaHRtbF9kb2N1bWVudDoKICAgICAgICAgICAgaW5jbHVkZXM6CiAgICAgICAgICAgICAgICBpbl9oZWFkZXI6IGhlYWRlci5odG1sCiAgICAgICAgICAgIHRoZW1lOiBwYXBlcgogICAgICAgICAgICB0b2M6IHRydWUKICAgICAgICAgICAgdG9jX2RlcHRoOiA0CiAgICAgICAgICAgIHRvY19mbG9hdDogdHJ1ZQogICAgICAgICAgICBudW1iZXJfc2VjdGlvbnM6IGZhbHNlCiAgICAgICAgICAgIGZpZ19jYXB0aW9uOiB0cnVlCiAgICAgICAgICAgIG1hcmtkb3duOiBHRk0KICAgICAgICAgICAgY29kZV9kb3dubG9hZDogdHJ1ZQotLS0KCjxzdHlsZSB0eXBlPSJ0ZXh0L2NzcyI+CmJvZHksIHRkIHsKICAgZm9udC1zaXplOiAxOHB4Owp9CmNvZGUucnsKICBmb250LXNpemU6IDEycHg7Cn0KcHJlIHsKICBmb250LXNpemU6IDEycHgKfQo8L3N0eWxlPgoKYGBge3IsIGluY2x1ZGUgPSBGQUxTRX0Kc291cmNlKCIuLi9iaW4vY2h1bmstb3B0aW9ucy5SIikKa25pdHJfZmlnX3BhdGgoIjA5YS0iKQpgYGAKCmBgYHtyIE1vZHVsZXMsIGV2YWw9VFJVRSwgZWNobz1GQUxTRSwgbWVzc2FnZT1GQUxTRSwgd2FybmluZz1GQUxTRX0KbGlicmFyeShERVNlcTIpCmxpYnJhcnkoZ2dwbG90MikKbGlicmFyeSh0aWR5cikKbGlicmFyeShkcGx5cikKbGlicmFyeShtYXRyaXhTdGF0cykKbGlicmFyeShnZ3JlcGVsKQpsaWJyYXJ5KHBoZWF0bWFwKQpsaWJyYXJ5KFJDb2xvckJyZXdlcikKIyBsb2FkKCJyZGF0YS9SdW5uaW5nRGF0YS5SRGF0YSIpCmBgYAoKRXN0aW1hdGVkIHRpbWU6ICoqMTUgTWludXRlcyoqCgojIE1vdGl2YXRpb24KClBsb3R0aW5nIHRoZSBleHByZXNzaW9uIHZhbHVlcyBhY3Jvc3MgYWxsIHNhbXBsZXMgZm9yIHRoZSB0b3AgdmFyaWFibGUgZ2VuZXMgaW4gYW4gZXhwZXJpbWVudCBjYW4gaGVscCB0byB2aXN1YWxpemUgaG93IHNhbXBsZXMgY2x1c3RlciB0b2dldGhlciBieSB0aGVpciBleHByZXNzaW9uIHByb2ZpbGVzLiBXaGVuIGNvbWJpbmVkIHdpdGggcGhlbm90eXBpYyBkYXRhLCBpdCBjYW4gaGVscCBzaG93IGhvdyBzYW1wbGVzIHdpdGggZGlmZmVyZW50IHRyZWF0bWVudHMgYmVoYXZlIHJlbGF0aXZlIHRvIG9uZSBhbm90aGVyLgoKIyBFeGVyY2lzZQoKQ3JlYXRlIGEgaGVhdG1hcCBvZiB0aGUgdG9wIDUwIG1vc3QgdmFyaWFibGUgZ2VuZXMgdXNpbmcgdGhlIGBwaGVhdG1hcCgpYCBmdW5jdGlvbiB1c2luZyB0aGUgcmxvZyBub3JtYWxpemVkIGRhdGEgaW4gdGhlIGBybGRgIG9iamVjdC4KCiMgSW5zdHJ1Y3Rpb25zCgotIE9uZSBncm91cCBtZW1iZXIgc2hvdWxkIHNoYXJlIHRoZWlyIHNjcmVlbiBpbiB0aGUgYnJlYWtvdXQgcm9vbS4gSWYgbm9ib2R5IHZvbHVudGVlcnMsIGEgaGVscGVyIG1heSByYW5kb21seSBzZWxlY3Qgc29tZW9uZS4KLSBUaGUgZ3JvdXAgbWVtYmVycyBzaG91bGQgZGlzY3VzcyB0aGUgZXhlcmNpc2UgYW5kIHdvcmsgdG9nZXRoZXIgdG8gZmluZCBhIHNvbHV0aW9uLgotIElmIHRoZXJlIGlzIHRpbWUgYWZ0ZXIgYSBzb2x1dGlvbiBpcyBmb3VuZCwgYWxsb3cgYWxsIG1lbWJlcnMgdG8gY29tcGxldGUgdGhlIGV4ZXJjaXNlLgoKIyBFeGFtcGxlCgpUbyBnZXQgYW4gaWRlYSBvZiB3aGF0IHdlIGV4cGVjdCB0byBzZWUgYXQgdGhlIGVuZCwgbGV0J3MgbG9vayBhdCBhIHRveSBleGFtcGxlIGZyb20gdGhlIGBwaGVhdG1hcCgpYCBoZWxwIGV4YW1wbGVzLiBUaGVyZSdzIG5vIG5lZWQgdG8gcnVuIHRoaXMgY29kZSwgd2UganVzdCB3YW50IHRvIGlsbHVzdHJhdGUgdGhlIGZvcm0gb2YgdGhlIHJlc3VsdC4KCmBgYHtyIHRlc3RfaGVhdG1hcH0KIyBDb3BpZWQgZnJvbSB0aGUgcGhlYXRtYXAgZG9jdW1lbnRhdGlvbgoKIyBDcmVhdGUgbWF0cml4IHdpdGggcmFuZG9tIG5vcm1hbGx5IGRpc3RyaWJ1dGVkIHZhbHVlcwp0ZXN0ID0gbWF0cml4KHJub3JtKDIwMCksIDIwLCAxMCkKCiMgSW1wb3NlIHNvbWUgc3RydWN0dXJlIHNvIHRoZSBoZWF0bWFwIGFwcGVhcnMgaW50ZXJlc3RpbmcKdGVzdFsxOjEwLCBzZXEoMSwgMTAsIDIpXSA9IHRlc3RbMToxMCwgc2VxKDEsIDEwLCAyKV0gKyAzCnRlc3RbMTE6MjAsIHNlcSgyLCAxMCwgMildID0gdGVzdFsxMToyMCwgc2VxKDIsIDEwLCAyKV0gKyAyCnRlc3RbMTU6MjAsIHNlcSgyLCAxMCwgMildID0gdGVzdFsxNToyMCwgc2VxKDIsIDEwLCAyKV0gKyA0CgojIE5hbWUgdGhlIHJvd3MgYW5kIGNvbHVtbnMKY29sbmFtZXModGVzdCkgPSBwYXN0ZSgiVGVzdCIsIDE6MTAsIHNlcCA9ICIiKQpyb3duYW1lcyh0ZXN0KSA9IHBhc3RlKCJHZW5lIiwgMToyMCwgc2VwID0gIiIpCgojIERyYXcgdGhlIGhlYXRtYXAKcGhlYXRtYXAodGVzdCkKYGBgCgojIFN0ZXBzCgoxLiBMb29rIGF0IHRoZSBkb2N1bWVudGF0aW9uIGZvciB0aGUgYHBoZWF0bWFwKClgIGZ1bmN0aW9uIGFuZCBkZXRlcm1pbmUgd2hhdCB0aGUgbW9zdCBpbXBvcnRhbnQgcGFyYW1ldGVyIGlzLiBIaW50OiBJdCdzIHVzdWFsbHkgdGhlIGZpcnN0IG9yIGZpcnN0IGZldyBwYXJhbWV0ZXJzLiBUaGlzIGlzIHRlbGxpbmcgdXMgd2hhdCB3ZSdsbCBoYXZlIHRvIGdldCBmb3IgdGhlIG5leHQgc3RlcC4KCjxkZXRhaWxzPgo8c3VtbWFyeT5BbnN3ZXI8L3N1bW1hcnk+CmBgYHtyIHBoZWF0bWFwX2hlbHAsIGV2YWwgPSBGQUxTRX0KP3BoZWF0bWFwCmBgYAoKRm9yIHRoaXMgZXhlcmNpc2UsIHdlJ2xsIG9ubHkgbmVlZCB0byB1c2UgdGhlIGBtYXRgIHBhcmFtZXRlciwgZ2l2aW5nIHRoZSBudW1lcmljIG1hdHJpeCB0byBiZSBwbG90dGVkLgo8L2RldGFpbHM+Cjxicj4KCjIuIEV4dHJhY3QgdGhlIHJsb2cgbm9ybWFsaXplZCBleHByZXNzaW9uIHZhbHVlcyBmb3IgdGhlIGV4cGVyaW1lbnQuIEhpbnQ6IFdlIGNyZWF0ZWQgYW4gYHJsZGAgb2JqZWN0IGVhcmxpZXIuIFRoZSBgYXNzYXkoKWAgZnVuY3Rpb24gcHVsbHMgb3V0IHRoZSB2YWx1ZXMuCgo8ZGV0YWlscz4KPHN1bW1hcnk+QW5zd2VyPC9zdW1tYXJ5PgpgYGB7ciBleHRyYWN0X3JsZF9leHByZXNzaW9ufQpleHBfbWF0ID0gYXNzYXkocmxkKQpoZWFkKGV4cF9tYXQpCmBgYAo8L2RldGFpbHM+Cjxicj4KCjMuIENhbGN1bGF0ZSB0aGUgdmFyaWFuY2UgZm9yIGVhY2ggZ2VuZSBpbiB0aGUgZXhwcmVzc2lvbiBtYXRyaXggd2UganVzdCBleHRyYWN0ZWQuIEhpbnQ6IExvb2sgYXQgdGhlIGhlbHAgZm9yIGBtYXRyaXhTdGF0czo6cm93VmFycygpYCBhbmQgZGVjaWRlIGlmIHRoYXQncyBhIHJlYXNvbmFibGUgZnVuY3Rpb24gdG8gdXNlLgoKPGRldGFpbHM+CjxzdW1tYXJ5PkFuc3dlcjwvc3VtbWFyeT4KYGBge3IgY2FsY19yb3dfdmFyc30KZ2VuZV92YXJzID0gcm93VmFycyhleHBfbWF0KQpoZWFkKGdlbmVfdmFycykKYGBgCjwvZGV0YWlscz4KPGJyPgoKNC4gR2V0IHRoZSBudW1lcmljYWwgaW5kaWNlcyBmb3IgdGhlIHRvcCA1MCBtb3N0IHZhcmlhYmxlIGdlbmVzLiBIaW50OiBSdW4gdGhlIGBvcmRlcigpYCBmdW5jdGlvbiBvbiBhIHRveSBleGFtcGxlLCBsaWtlIGBvcmRlcihjKC0xLjI1LCAxLjMsIDUuNiwgMi4xKSlgLCBhbmQgdGhpbmsgYWJvdXQgd2hhdCBpcyBiZWluZyByZXR1cm5lZC4gTm90ZSwgaXQncyBub3QgdGhlIHZhbHVlcyBpbiB0aGUgb3JpZ2luYWwgdmVjdG9yLiBUaGVuIGxvb2sgYXQgdGhlIGhlbHAgZm9yIGBvcmRlcigpYCBhbmQgZmlndXJlIG91dCBob3cgaG93IHRvIHJldmVyc2Ugd2hhdCBpcyByZXR1cm5lZC4KCjxkZXRhaWxzPgo8c3VtbWFyeT5BbnN3ZXI8L3N1bW1hcnk+CmBgYHtyIG9yZGVyX3ZhcnN9Cm9yZGVyKGMoLTEuMjUsIDEuMywgNS42LCAyLjEpKQpvcmRlcihjKC0xLjI1LCAxLjMsIDUuNiwgMi4xKSwgZGVjcmVhc2luZyA9IFRSVUUpCgpvcmRlcmVkX2lkeCA9IG9yZGVyKGdlbmVfdmFycywgZGVjcmVhc2luZyA9IFRSVUUpCnRvcF81MF9pZHggPSBvcmRlcmVkX2lkeFsxOjUwXQpgYGAKPC9kZXRhaWxzPgo8YnI+Cgo1LiBTdWJzZXQgdGhlIGV4cHJlc3Npb24gbWF0cml4IGZyb20gc3RlcCAyIHVzaW5nIHRoaXMgaW5kZXggdmVjdG9yLiBIaW50OiBSZW1lbWJlciBzcXVhcmUtYnJhY2tldCBub3RhdGlvbiwgYW5kIHRoYXQgd2Ugd2FudCB0byBzdWJzZXQgdGhlIHJvd3MsIHdoaWxlIHJldHVybmluZyBhbGwgdGhlIGNvbHVtbnMuIE1ha2Ugc3VyZSB0aGUgcmVzdWx0IGhhcyB0aGUgbnVtYmVyIG9mIHJvd3MgeW91IGV4cGVjdCwgdGhhdCBpcywgNTAuCgo8ZGV0YWlscz4KPHN1bW1hcnk+QW5zd2VyPC9zdW1tYXJ5PgpgYGB7ciBleHRyYWN0X3RvcF9leHB9CnRvcF92YXJfZXhwX21hdCA9IGV4cF9tYXRbdG9wXzUwX2lkeCwgXQpkaW0odG9wX3Zhcl9leHBfbWF0KQpgYGAKPC9kZXRhaWxzPgo8YnI+Cgo2LiBDcmVhdGUgYSBoZWF0bWFwIHVzaW5nIHRoaXMgc3Vic2V0dGVkIGV4cHJlc3Npb24gbWF0cml4IHVzaW5nIHRoZSBgcGhlYXRtYXAoKWAgZnVuY3Rpb24uCgo8ZGV0YWlscz4KPHN1bW1hcnk+QW5zd2VyPC9zdW1tYXJ5PgpgYGB7ciBjcmVhdGVfaGVhdG1hcH0KcGhlYXRtYXAodG9wX3Zhcl9leHBfbWF0KQpgYGAKPC9kZXRhaWxzPgo8YnI+CgojIFNhdmluZyB0aGUgcmVzdWx0CgpJZiB0aW1lIHBlcm1pdHMsIGRpc2N1c3Mgd2l0aCB5b3VyIGdyb3VwIGhvdyB5b3UgbWlnaHQgc2F2ZSB0aGlzIGhlYXRtYXAuIEhpbnQ6IExvb2sgYXQgdGhlIHBhcmFtZXRlcnMgZm9yIHRoZSBmdW5jdGlvbiBpbiBgP3BoZWF0bWFwYC4gQWx0ZXJuYXRpdmVseSwgY29uc2lkZXIgaG93IHdlIHNhdmVkIHRoZSBQQ0EgaW4gdGhlIHByZXZpb3VzIG1vZHVsZS4KCmBgYHtyIFdyaXRlT3V0LlJEYXRhLCBldmFsPUZBTFNFLCBlY2hvPUZBTFNFLCBtZXNzYWdlPUZBTFNFLCB3YXJuaW5nPUZBTFNFfQojSGlkZGVuIGNvZGUgYmxvY2sgdG8gd3JpdGUgb3V0IGRhdGEgZm9yIGtuaXR0aW5nCiMgc2F2ZS5pbWFnZShmaWxlID0gInJkYXRhL1J1bm5pbmdEYXRhLlJEYXRhIikKYGBgCg==