We hope you now have more familiarity with key concepts, data types, tools, and how they all connect to enable single-cell gene expression analysis from RNA-Seq data.
The example dataset used in the workshop was inspired by this Sorkin et al. and the experimental methods below were excerpted directly from that paper:
Tissues harvested from the extremity injury site were digested for 45 min in 0.3% Type 1 Collagenase and 0.4% Dispase II (Gibco) in Roswell Park Memorial Institute (RPMI) medium at 37 °C under constant agitation at 120 rpm. Digestions were subsequently quenched with 10% FBS RPMI and filtered through 40μm sterile strainers. Cells were then washed in PBS with 0.04% BSA, counted and resuspended at a concentration of ~1000 cells/μl. Cell viability was assessed with Trypan blue exclusion on a Countess II (Thermo Fisher Scientific) automated counter and only samples with >85% viability were processed for further sequencing.
University of Michigan Biomedical Research Core Facilities Advanced Genomics Core generated single-cell 3’ libraries on the 10× Genomics Chromium Controller following the manufacturers protocol for the v2 reagent kit (10× Genomics, Pleasanton, CA, USA). Cell suspensions were loaded onto a Chromium Single-Cell A chip along with reverse transcription (RT) master mix and single cell 3’ gel beads, aiming for 2000–6000 cells per channel. In this experiment, 8700 cells were encapsulated into emulsion droplets at a concentration of 700–1200 cells/ul which targets 5000 single cells with an expected multiplet rate of 3.9%. Following generation of single-cell gel bead-in-emulsions (GEMs), reverse transcription was performed and the resulting Post GEM-RT product was cleaned up using DynaBeads MyOne Silane beads (Thermo Fisher Scientific, Waltham, MA, USA). The cDNA was amplified, SPRIselect (Beckman Coulter, Brea, CA, USA) cleaned and quantified then enzymatically fragmented and size selected using SPRIselect beads to optimize the cDNA amplicon size prior to library construction. An additional round of double-sided SPRI bead cleanup is performed after end repair and A-tailing. Another single-sided cleanup is done after adapter ligation. Indexes were added during PCR amplification and a final double-sided SPRI cleanup was performed. Libraries were quantified by Kapa qPCR for Illumina Adapters (Roche) and size was determined by Agilent tapestation D1000 tapes. Read 1 primer sequence are added to the molecules during GEM incubation. P5, P7 and sample index and read 2 primer sequence are added during library construction via end repair, A-tailing, adaptor ligation and PCR. Libraries were generated with unique sample indices (SI) for each sample. Libraries were sequenced on a HiSeq 4000, (Illumina, San Diego, CA, USA) using a HiSeq 4000 PE Cluster Kit (PN PE-410-1001) with HiSeq 4000 SBS Kit (100 cycles, PN FC-410-1002) reagents, loaded at 200 pM following Illumina’s denaturing and dilution recommendations. The run configuration was 26 × 8 × 98 cycles for Read 1, Index and Read 2, respectively.University of Michigan Biomedical Research Core Facilities Advanced Genomics Core executed 10x Genomics Cell Ranger (v7.2.0) to perform sample de-multiplexing, barcode processing, and single cell gene counting (Alignment, Barcoding and UMI Count). The Cell Ranger filtered barcode feature matrix was used as input to downstream analysis.
All analysis and graphics were generated in R (v4.4.1) [1]. Analysis was performed primarily using the Seurat package (v5.0.1) [2]. Cells with extreme values (which indicate low complexity, doublets, or apoptotic cells) were excluded by filtering to include only cells where Genes/cell >300 and % mitochondrial < 15% resulting in approximately 600-5,600 cells per sample after filtering. Counts were then normalized using the SCTransform method with default parameters [3].
Normalized data were integrated using the RPCA method (IntegrateLayers function with “SCT” as normalization method. Principal Component Analysis (PCA) was then performed and the first significant components were used for finding nearest neighbors followed by graph-based, semi-unsupervised Louvain clustering into distinct populations (resolution = 0.4). All uniform manifold approximation and projection (UMAP) plots were generated using default settings [4]. To identify marker genes, the clusters in the integrated data were compared pairwise for differential gene expression using Wilcoxon rank-sum test for single-cell gene expression (FindAllMarkers function; default parameters) [5]. Additional marker genes and cell-type predictions were generated with scCATCH (3.2.2) [6].
To identify differentially expressed (DE) genes within the total population, Case and Control samples were compared pairwise for differential expression expression (FindAllMarkers function; log2FC = 1.5; test.use = “Wilcoxon”). For each cluster, the results were further limited to significantly different genes (Benjamini-Hochberg adjusted p-value <= 0.05). Intra-cluster case-vs-control pseudo bulk comparisons were analyzed using DESeq2 (v1.44.0) [7].
References
Session info
lists the relevant versions of R and
all the libraries that were loaded in the analysis. This info is
generally not included in main body of a paper, but serious
reproducibility street-cred if it shows up in supplemental.
Hear me
now, believe me later.
devtools::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.4.1 (2024-06-14)
os Ubuntu 22.04.5 LTS
system x86_64, linux-gnu
ui RStudio
language (EN)
collate C.UTF-8
ctype C.UTF-8
tz America/Detroit
date 2024-10-21
rstudio 2023.12.0+369 Ocean Storm (server)
pandoc 3.1.1 @ /usr/lib/rstudio-server/bin/quarto/bin/tools/ (via rmarkdown)
─ Packages ───────────────────────────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
abind 1.4-8 2024-09-12 [2] CRAN (R 4.4.1)
assertthat 0.2.1 2019-03-21 [2] CRAN (R 4.4.1)
BiocGenerics 0.50.0 2024-04-30 [2] Bioconductor 3.19 (R 4.4.1)
BPCells * 0.2.0 2024-07-18 [2] Github (bnprks/BPCells@5677cf1)
cachem 1.1.0 2024-05-16 [2] CRAN (R 4.4.1)
cli 3.6.3 2024-06-21 [2] CRAN (R 4.4.1)
cluster 2.1.6 2023-12-01 [2] CRAN (R 4.4.1)
codetools 0.2-20 2024-03-31 [2] CRAN (R 4.4.1)
colorspace 2.1-1 2024-07-26 [2] CRAN (R 4.4.1)
cowplot 1.1.3 2024-01-22 [2] CRAN (R 4.4.1)
crayon 1.5.3 2024-06-20 [2] CRAN (R 4.4.1)
data.table 1.16.0 2024-08-27 [2] CRAN (R 4.4.1)
deldir 2.0-4 2024-02-28 [2] CRAN (R 4.4.1)
devtools 2.4.5 2022-10-11 [2] CRAN (R 4.4.1)
digest 0.6.37 2024-08-19 [2] CRAN (R 4.4.1)
dotCall64 1.1-1 2023-11-28 [2] CRAN (R 4.4.1)
dplyr * 1.1.4 2023-11-17 [2] CRAN (R 4.4.1)
ellipsis 0.3.2 2021-04-29 [2] CRAN (R 4.4.1)
evaluate 0.24.0 2024-06-10 [2] CRAN (R 4.4.1)
fansi 1.0.6 2023-12-08 [2] CRAN (R 4.4.1)
farver 2.1.2 2024-05-13 [2] CRAN (R 4.4.1)
fastDummies 1.7.4 2024-08-16 [2] CRAN (R 4.4.1)
fastmap 1.2.0 2024-05-15 [2] CRAN (R 4.4.1)
fitdistrplus 1.2-1 2024-07-12 [2] CRAN (R 4.4.1)
forcats * 1.0.0 2023-01-29 [2] CRAN (R 4.4.1)
fs 1.6.4 2024-04-25 [2] CRAN (R 4.4.1)
future 1.34.0 2024-07-29 [2] CRAN (R 4.4.1)
future.apply 1.11.2 2024-03-28 [2] CRAN (R 4.4.1)
generics 0.1.3 2022-07-05 [2] CRAN (R 4.4.1)
GenomeInfoDb 1.40.1 2024-05-24 [2] Bioconductor 3.19 (R 4.4.1)
GenomeInfoDbData 1.2.12 2024-07-17 [2] Bioconductor
GenomicRanges 1.56.1 2024-06-12 [2] Bioconductor 3.19 (R 4.4.1)
ggplot2 * 3.5.1 2024-04-23 [2] CRAN (R 4.4.1)
ggrepel 0.9.6 2024-09-07 [2] CRAN (R 4.4.1)
ggridges 0.5.6 2024-01-23 [2] CRAN (R 4.4.1)
globals 0.16.3 2024-03-08 [2] CRAN (R 4.4.1)
glue 1.7.0 2024-01-09 [2] CRAN (R 4.4.1)
goftest 1.2-3 2021-10-07 [2] CRAN (R 4.4.1)
gridExtra 2.3 2017-09-09 [2] CRAN (R 4.4.1)
gtable 0.3.5 2024-04-22 [2] CRAN (R 4.4.1)
hms 1.1.3 2023-03-21 [2] CRAN (R 4.4.1)
htmltools 0.5.8.1 2024-04-04 [2] CRAN (R 4.4.1)
htmlwidgets 1.6.4 2023-12-06 [2] CRAN (R 4.4.1)
httpuv 1.6.15 2024-03-26 [2] CRAN (R 4.4.1)
httr 1.4.7 2023-08-15 [2] CRAN (R 4.4.1)
ica 1.0-3 2022-07-08 [2] CRAN (R 4.4.1)
igraph 2.0.3 2024-03-13 [2] CRAN (R 4.4.1)
IRanges 2.38.1 2024-07-03 [2] Bioconductor 3.19 (R 4.4.1)
irlba 2.3.5.1 2022-10-03 [2] CRAN (R 4.4.1)
jsonlite 1.8.8 2023-12-04 [2] CRAN (R 4.4.1)
KernSmooth 2.23-24 2024-05-17 [2] CRAN (R 4.4.1)
klippy 0.0.0.9500 2024-10-02 [1] Github (umich-brcf-bioinf/workshop-klippy@a1be090)
knitr 1.48 2024-07-07 [2] CRAN (R 4.4.1)
later 1.3.2 2023-12-06 [2] CRAN (R 4.4.1)
lattice 0.22-6 2024-03-20 [2] CRAN (R 4.4.1)
lazyeval 0.2.2 2019-03-15 [2] CRAN (R 4.4.1)
leiden 0.4.3.1 2023-11-17 [2] CRAN (R 4.4.1)
lifecycle 1.0.4 2023-11-07 [2] CRAN (R 4.4.1)
listenv 0.9.1 2024-01-29 [2] CRAN (R 4.4.1)
lmtest 0.9-40 2022-03-21 [2] CRAN (R 4.4.1)
lubridate * 1.9.3 2023-09-27 [2] CRAN (R 4.4.1)
magrittr 2.0.3 2022-03-30 [2] CRAN (R 4.4.1)
MASS 7.3-61 2024-06-13 [2] CRAN (R 4.4.1)
Matrix 1.7-0 2024-04-26 [2] CRAN (R 4.4.1)
MatrixGenerics 1.16.0 2024-04-30 [2] Bioconductor 3.19 (R 4.4.1)
matrixStats 1.4.1 2024-09-08 [2] CRAN (R 4.4.1)
memoise 2.0.1 2021-11-26 [2] CRAN (R 4.4.1)
mime 0.12 2021-09-28 [2] CRAN (R 4.4.1)
miniUI 0.1.1.1 2018-05-18 [2] CRAN (R 4.4.1)
munsell 0.5.1 2024-04-01 [2] CRAN (R 4.4.1)
nlme 3.1-166 2024-08-14 [2] CRAN (R 4.4.1)
parallelly 1.38.0 2024-07-27 [2] CRAN (R 4.4.1)
patchwork 1.3.0 2024-09-16 [2] CRAN (R 4.4.1)
pbapply 1.7-2 2023-06-27 [2] CRAN (R 4.4.1)
pillar 1.9.0 2023-03-22 [2] CRAN (R 4.4.1)
pkgbuild 1.4.4 2024-03-17 [2] CRAN (R 4.4.1)
pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.4.1)
pkgload 1.4.0 2024-06-28 [2] CRAN (R 4.4.1)
plotly 4.10.4 2024-01-13 [2] CRAN (R 4.4.1)
plyr 1.8.9 2023-10-02 [2] CRAN (R 4.4.1)
png 0.1-8 2022-11-29 [2] CRAN (R 4.4.1)
polyclip 1.10-7 2024-07-23 [2] CRAN (R 4.4.1)
prettyunits 1.2.0 2023-09-24 [2] CRAN (R 4.4.1)
profvis 0.3.8 2023-05-02 [2] CRAN (R 4.4.1)
progress 1.2.3 2023-12-06 [2] CRAN (R 4.4.1)
progressr 0.14.0 2023-08-10 [2] CRAN (R 4.4.1)
promises 1.3.0 2024-04-05 [2] CRAN (R 4.4.1)
purrr * 1.0.2 2023-08-10 [2] CRAN (R 4.4.1)
R6 2.5.1 2021-08-19 [2] CRAN (R 4.4.1)
RANN 2.6.2 2024-08-25 [2] CRAN (R 4.4.1)
RColorBrewer 1.1-3 2022-04-03 [2] CRAN (R 4.4.1)
Rcpp 1.0.13 2024-07-17 [2] CRAN (R 4.4.1)
RcppAnnoy 0.0.22 2024-01-23 [2] CRAN (R 4.4.1)
RcppHNSW 0.6.0 2024-02-04 [2] CRAN (R 4.4.1)
readr * 2.1.5 2024-01-10 [2] CRAN (R 4.4.1)
remotes 2.5.0 2024-03-17 [2] CRAN (R 4.4.1)
reshape2 1.4.4 2020-04-09 [2] CRAN (R 4.4.1)
reticulate 1.39.0 2024-09-05 [2] CRAN (R 4.4.1)
rlang 1.1.4 2024-06-04 [2] CRAN (R 4.4.1)
rmarkdown 2.28 2024-08-17 [2] CRAN (R 4.4.1)
ROCR 1.0-11 2020-05-02 [2] CRAN (R 4.4.1)
RSpectra 0.16-2 2024-07-18 [2] CRAN (R 4.4.1)
rstudioapi 0.16.0 2024-03-24 [2] CRAN (R 4.4.1)
Rtsne 0.17 2023-12-07 [2] CRAN (R 4.4.1)
S4Vectors 0.42.1 2024-07-03 [2] Bioconductor 3.19 (R 4.4.1)
scales 1.3.0 2023-11-28 [2] CRAN (R 4.4.1)
scattermore 1.2 2023-06-12 [2] CRAN (R 4.4.1)
scCATCH * 3.2.2 2023-04-23 [2] CRAN (R 4.4.1)
sctransform 0.4.1 2023-10-19 [2] CRAN (R 4.4.1)
sessioninfo 1.2.2 2021-12-06 [2] CRAN (R 4.4.1)
Seurat * 5.1.0 2024-05-10 [2] CRAN (R 4.4.1)
SeuratObject * 5.0.2 2024-05-08 [2] CRAN (R 4.4.1)
shiny 1.9.1 2024-08-01 [2] CRAN (R 4.4.1)
sp * 2.1-4 2024-04-30 [2] CRAN (R 4.4.1)
spam 2.10-0 2023-10-23 [2] CRAN (R 4.4.1)
spatstat.data 3.1-2 2024-06-21 [2] CRAN (R 4.4.1)
spatstat.explore 3.3-2 2024-08-21 [2] CRAN (R 4.4.1)
spatstat.geom 3.3-2 2024-07-15 [2] CRAN (R 4.4.1)
spatstat.random 3.3-1 2024-07-15 [2] CRAN (R 4.4.1)
spatstat.sparse 3.1-0 2024-06-21 [2] CRAN (R 4.4.1)
spatstat.univar 3.0-1 2024-09-05 [2] CRAN (R 4.4.1)
spatstat.utils 3.1-0 2024-08-17 [2] CRAN (R 4.4.1)
stringi 1.8.4 2024-05-06 [2] CRAN (R 4.4.1)
stringr * 1.5.1 2023-11-14 [2] CRAN (R 4.4.1)
survival 3.7-0 2024-06-05 [2] CRAN (R 4.4.1)
tensor 1.5 2012-05-05 [2] CRAN (R 4.4.1)
tibble * 3.2.1 2023-03-20 [2] CRAN (R 4.4.1)
tidyr * 1.3.1 2024-01-24 [2] CRAN (R 4.4.1)
tidyselect 1.2.1 2024-03-11 [2] CRAN (R 4.4.1)
tidyverse * 2.0.0 2023-02-22 [2] CRAN (R 4.4.1)
timechange 0.3.0 2024-01-18 [2] CRAN (R 4.4.1)
tzdb 0.4.0 2023-05-12 [2] CRAN (R 4.4.1)
UCSC.utils 1.0.0 2024-04-30 [2] Bioconductor 3.19 (R 4.4.1)
urlchecker 1.0.1 2021-11-30 [2] CRAN (R 4.4.1)
usethis 3.0.0 2024-07-29 [2] CRAN (R 4.4.1)
utf8 1.2.4 2023-10-22 [2] CRAN (R 4.4.1)
uwot 0.2.2 2024-04-21 [2] CRAN (R 4.4.1)
vctrs 0.6.5 2023-12-01 [2] CRAN (R 4.4.1)
viridisLite 0.4.2 2023-05-02 [2] CRAN (R 4.4.1)
withr 3.0.1 2024-07-31 [2] CRAN (R 4.4.1)
xfun 0.47 2024-08-17 [2] CRAN (R 4.4.1)
xtable 1.8-4 2019-04-21 [2] CRAN (R 4.4.1)
XVector 0.44.0 2024-04-30 [2] Bioconductor 3.19 (R 4.4.1)
yaml 2.3.10 2024-07-26 [2] CRAN (R 4.4.1)
zlibbioc 1.50.0 2024-04-30 [2] Bioconductor 3.19 (R 4.4.1)
zoo 1.8-12 2023-04-13 [2] CRAN (R 4.4.1)
Please take our optional post-workshop survey (5-10 minutes).
We will email you a link to the final session recordings next week.
The website/notes for this workshop will be available.
The UM Bioinformatics Core Workshop Slack channel content will be available for 90 days.
RStudio workshop compute environment will be available until 10/22/2024.
You can download files from the workshop environment from your terminal/command line window as below. (You will need to substitute your actual workshop username and type workshop password when prompted.)
# download workshop files -------------------------------------------------
mkdir intro_scrnaseq-workshop
cd intro_scrnaseq-workshop
scp -r YOUR_USERNAME@bfx-workshop01.med.umich.edu:"ISC_R*" .
Chris | Marci | Raymond | Dana |
Travis | Olivia | Ram | |
Matt | Joe | Nick |
Thank you for participating in our workshop. We welcome your questions and feedback now and in the future.
Bioinformatics Workshop Team
bioinformatics-workshops@umich.edu
UM BRCF Bioinformatics Core