The TENxXeniumData ExperimentHub package provides a collection of Xenium spatial transcriptomics datasets from 10X Genomics. The datasets are formatted as Bioconductor SpatialExperiment or SpatialFeatureExperiment (SFE) objects so that they can be used directly in examples, demonstrations, and tutorials. Each constructed data object includes gene expression profiles, per-transcript location data, cell centroids, segmentation boundaries (e.g., cell or nucleus boundaries), and an image.
TENxXeniumData 1.0.0
Image-based spatial data, like Xenium, is typically focused on profiling a pre-selected set of genes. Such data can achieve resolution at the level of individual molecules, preserving both single-cell and subcellular details. Additionally, these methods often capture cellular boundaries through segmentations.
The TENxXeniumData package provides a curated collection of Xenium spatial transcriptomics datasets released by 10X Genomics. These datasets are formatted into Bioconductor classes, specifically SpatialExperiment or SpatialFeatureExperiment (SFE). Like SFEData, TENxXeniumData is an ExperimentHub package for spatial data, but it focuses specifically on Xenium.
The main difference from SFEData lies in the constructed data object, which is built specifically around Xenium data. In addition to the gene expression profile, centroid, and boundary of each cell, the objects capture the detected molecules/transcripts, which are crucial for gaining insight into subcellular details related to specific markers, as well as the imaging data. We have also chosen to offer SpatialExperiment as an alternative scheme for data representation. In this scheme, cellular segmentations are stored in the per-cell metadata of the constructed object.
To install the TENxXeniumData package from Bioconductor:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("TENxXeniumData")
The TENxXeniumData package provides an R/Bioconductor resource for Xenium spatially-resolved data by 10X Genomics.
The package currently includes the following datasets:
spe_mouse_brain (SpatialExperiment Bioconductor class)
sfe_mouse_brain (SpatialFeatureExperiment Bioconductor class)
spe_human_pancreas (SpatialExperiment Bioconductor class)
sfe_human_pancreas (SpatialFeatureExperiment Bioconductor class)
A list of currently available datasets can be obtained using the ExperimentHub interface:
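library(SpatialExperiment)
library(SpatialFeatureExperiment)
library(ExperimentHub)  # provides ExperimentHub() and query()
library(TENxXeniumData)
library(BumpyMatrix)
library(SummarizedExperiment)
# connect to ExperimentHub and list the records for this package
eh <- ExperimentHub()
(q <- query(eh, "TENxXenium"))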
## ExperimentHub with 4 records
## # snapshotDate(): 2024-04-29
## # $dataprovider: NA
## # $species: Mus musculus, Homo sapiens
## # $rdataclass: SpatialFeatureExperiment, SpatialExperiment
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH8547"]]'
##
## title
## EH8547 | spe_mouse_brain
## EH8548 | sfe_mouse_brain
## EH8549 | spe_human_pancreas
## EH8550 | sfe_human_pancreas
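The listing above notes that records can be retrieved by their ExperimentHub ID. As a minimal sketch, assuming a working network connection (or a populated local cache) and the IDs shown in the listing, a query can be narrowed with several search terms and a single record fetched directly. The variable names `q_mouse` and `spe_by_id` are illustrative only.

```r
library(ExperimentHub)

eh <- ExperimentHub()

# Narrow the query: every pattern in the vector must match a record
q_mouse <- query(eh, c("TENxXenium", "Mus musculus"))
q_mouse$title

# Retrieve one record by its ExperimentHub ID (EH8547 is listed as
# spe_mouse_brain in the query output shown in this document)
spe_by_id <- eh[["EH8547"]]
class(spe_by_id)
```

Fetching by ID can be handy in scripts that should pin one specific record, while the accessor functions used in the following examples are more readable.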
The following examples illustrate how to load the provided datasets into your R session as objects of the SpatialExperiment or SpatialFeatureExperiment classes.
Loading a SpatialExperiment object:
# load object
spe <- spe_mouse_brain()
# check object
spe
## class: SpatialExperiment
## dim: 541 36554
## metadata(0):
## assays(2): counts molecules
## rownames(541): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
## rowData names(3): means vars cv2
## colnames(36554): 1 2 ... 36601 36602
## colData names(10): transcript_counts control_probe_counts ... nCounts
## nGenes
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : x_centroid y_centroid
## imgData names(4): sample_id image_id data scaleFactor
# here, cellular segmentations are stored in per-cell metadata
colData(spe)
## DataFrame with 36554 rows and 10 columns
## transcript_counts control_probe_counts control_codeword_counts cell_area
## <numeric> <numeric> <numeric> <numeric>
## 1 384 0 0 305.211
## 2 146 0 0 176.606
## 3 81 0 0 263.938
## 4 314 0 0 427.810
## 5 639 0 0 424.604
## ... ... ... ... ...
## 36598 352 0 0 466.961
## 36599 412 1 0 576.194
## 36600 161 0 0 398.323
## 36601 387 0 0 510.762
## 36602 449 0 0 565.898
## nucleus_area cellSeg nucSeg sample_id
## <numeric> <sfc_POLYGON> <sfc_POLYGON> <character>
## 1 70.71469 list(c(1901.875, 190.. list(c(1903.78747558.. sample01
## 2 6.41219 list(c(1895.5, 1890... list(c(1897.625, 189.. sample01
## 3 32.78344 list(c(2362.36254882.. list(c(2361.9375, 23.. sample01
## 4 68.18594 list(c(1902.51245117.. list(c(1903.15002441.. sample01
## 5 102.95625 list(c(1914.19995117.. list(c(1913.13745117.. sample01
## ... ... ... ... ...
## 36598 67.37313 list(c(3336.88745117.. list(c(3337.94995117.. sample01
## 36599 6.86375 list(c(3371.10009765.. list(c(3369.61254882.. sample01
## 36600 13.23078 list(c(3325.41259765.. list(c(3330.9375, 33.. sample01
## 36601 21.31375 list(c(3321.58740234.. list(c(3322.64990234.. sample01
## 36602 43.44031 list(c(3336.88745117.. list(c(3323.07495117.. sample01
## nCounts nGenes
## <numeric> <integer>
## 1 385 97
## 2 146 64
## 3 81 48
## 4 315 95
## 5 640 98
## ... ... ...
## 36598 352 91
## 36599 413 85
## 36600 161 57
## 36601 387 95
## 36602 449 93
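The individual components of the SpatialExperiment object can be pulled out with the standard accessors. The sketch below assumes the assay and column names printed above (`counts`, `molecules`, `cellSeg`) and that the sf package is installed. The layout of the `molecules` assay is not described in this document, so the code only inspects its class rather than assuming a structure.

```r
library(SpatialExperiment)
library(TENxXeniumData)
library(sf)  # needed to work with the sfc_POLYGON columns

spe <- spe_mouse_brain()

# Gene-by-cell count matrix
counts_mat <- assay(spe, "counts")
dim(counts_mat)

# Per-transcript data: check the class before working with it
class(assay(spe, "molecules"))

# Cell centroids (columns x_centroid and y_centroid)
xy <- spatialCoords(spe)
head(xy)

# Cell boundaries are stored as a geometry column in the per-cell metadata
cell_polys <- colData(spe)$cellSeg
class(cell_polys)

# Polygon area of the first few cells, in the units of the stored coordinates
st_area(cell_polys[1:5])
```

The areas returned by `st_area()` are expressed in whatever coordinate units the polygons use, so they are not necessarily comparable to the `cell_area` column without checking the scale.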
Loading a SpatialFeatureExperiment object:
# load object
sfe <- sfe_mouse_brain()
# check object
sfe
## class: SpatialFeatureExperiment
## dim: 541 36554
## metadata(0):
## assays(2): counts molecules
## rownames(541): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
## rowData names(3): means vars cv2
## colnames(36554): 1 2 ... 36601 36602
## colData names(8): transcript_counts control_probe_counts ... nCounts
## nGenes
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : x_centroid y_centroid
## imgData names(4): sample_id image_id data scaleFactor
##
## unit:
## Geometries:
## colGeometries: centroids (POINT)
## annotGeometries: cellSeg (POINT), nucSeg (POINT)
##
## Graphs:
## sample01:
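In the SpatialFeatureExperiment representation, the geometries are held in dedicated slots instead of in the per-cell metadata. A minimal sketch of reading them back, assuming the geometry names printed above (`centroids`, `cellSeg`):

```r
library(SpatialFeatureExperiment)
library(TENxXeniumData)

sfe <- sfe_mouse_brain()

# Names of the geometries attached to the object
colGeometryNames(sfe)
annotGeometryNames(sfe)

# Per-cell centroid geometry, returned as an sf data frame
centroids <- colGeometry(sfe, "centroids")
head(centroids)

# Cell segmentation stored as an annotation geometry
cell_seg <- annotGeometry(sfe, "cellSeg")
head(cell_seg)
```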
sessionInfo()
## R version 4.4.0 (2024-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.19-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] BumpyMatrix_1.12.0 TENxXeniumData_1.0.0
## [3] ExperimentHub_2.12.0 AnnotationHub_3.12.0
## [5] BiocFileCache_2.12.0 dbplyr_2.5.0
## [7] SpatialFeatureExperiment_1.6.1 SpatialExperiment_1.14.0
## [9] SingleCellExperiment_1.26.0 SummarizedExperiment_1.34.0
## [11] Biobase_2.64.0 GenomicRanges_1.56.0
## [13] GenomeInfoDb_1.40.0 IRanges_2.38.0
## [15] S4Vectors_0.42.0 BiocGenerics_0.50.0
## [17] MatrixGenerics_1.16.0 matrixStats_1.3.0
## [19] BiocStyle_2.32.0
##
## loaded via a namespace (and not attached):
## [1] DBI_1.2.2 bitops_1.0-7
## [3] deldir_2.0-4 s2_1.1.6
## [5] rlang_1.1.3 magrittr_2.0.3
## [7] RSQLite_2.3.6 e1071_1.7-14
## [9] compiler_4.4.0 DelayedMatrixStats_1.26.0
## [11] png_0.1-8 sfheaders_0.4.4
## [13] fftwtools_0.9-11 vctrs_0.6.5
## [15] pkgconfig_2.0.3 wk_0.9.1
## [17] crayon_1.5.2 fastmap_1.2.0
## [19] magick_2.8.3 XVector_0.44.0
## [21] scuttle_1.14.0 utf8_1.2.4
## [23] rmarkdown_2.26 UCSC.utils_1.0.0
## [25] purrr_1.0.2 bit_4.0.5
## [27] xfun_0.44 zlibbioc_1.50.0
## [29] cachem_1.0.8 beachmat_2.20.0
## [31] jsonlite_1.8.8 blob_1.2.4
## [33] rhdf5filters_1.16.0 DelayedArray_0.30.1
## [35] Rhdf5lib_1.26.0 BiocParallel_1.38.0
## [37] jpeg_0.1-10 tiff_0.1-12
## [39] terra_1.7-71 parallel_4.4.0
## [41] R6_2.5.1 bslib_0.7.0
## [43] limma_3.60.0 boot_1.3-30
## [45] jquerylib_0.1.4 Rcpp_1.0.12
## [47] bookdown_0.39 knitr_1.46
## [49] R.utils_2.12.3 tidyselect_1.2.1
## [51] Matrix_1.7-0 abind_1.4-5
## [53] yaml_2.3.8 EBImage_4.46.0
## [55] codetools_0.2-20 curl_5.2.1
## [57] tibble_3.2.1 lattice_0.22-6
## [59] withr_3.0.0 KEGGREST_1.44.0
## [61] evaluate_0.23 sf_1.0-16
## [63] units_0.8-5 spData_2.3.0
## [65] proxy_0.4-27 Biostrings_2.72.0
## [67] filelock_1.0.3 pillar_1.9.0
## [69] BiocManager_1.30.23 KernSmooth_2.23-22
## [71] generics_0.1.3 sp_2.1-4
## [73] RCurl_1.98-1.14 BiocVersion_3.19.1
## [75] sparseMatrixStats_1.16.0 class_7.3-22
## [77] glue_1.7.0 tools_4.4.0
## [79] BiocNeighbors_1.22.0 data.table_1.15.4
## [81] locfit_1.5-9.9 rhdf5_2.48.0
## [83] grid_4.4.0 spdep_1.3-3
## [85] AnnotationDbi_1.66.0 DropletUtils_1.24.0
## [87] edgeR_4.2.0 GenomeInfoDbData_1.2.12
## [89] HDF5Array_1.32.0 cli_3.6.2
## [91] rappdirs_0.3.3 fansi_1.0.6
## [93] S4Arrays_1.4.0 dplyr_1.1.4
## [95] R.methodsS3_1.8.2 zeallot_0.1.0
## [97] sass_0.4.9 digest_0.6.35
## [99] classInt_0.4-10 SparseArray_1.4.4
## [101] dqrng_0.4.0 rjson_0.2.21
## [103] htmlwidgets_1.6.4 memoise_2.0.1
## [105] htmltools_0.5.8.1 R.oo_1.26.0
## [107] lifecycle_1.0.4 httr_1.4.7
## [109] mime_0.12 statmod_1.5.0
## [111] bit64_4.0.5