zellkonverter 1.0.3
This package provides a lightweight interface between the Bioconductor SingleCellExperiment data structure and the Python AnnData-based single-cell analysis environment. The idea is to enable users and developers to easily move data between these frameworks to construct a multi-language analysis pipeline across R/Bioconductor and Python.
The readH5AD() function can be used to read a SingleCellExperiment from an H5AD file. The resulting object can be manipulated in the usual way, as described in the SingleCellExperiment documentation.
library(zellkonverter)
# Obtaining an example H5AD file.
example_h5ad <- system.file("extdata", "krumsiek11.h5ad",
    package = "zellkonverter")
readH5AD(example_h5ad)
## class: SingleCellExperiment
## dim: 11 640
## metadata(2): highlights iroot
## assays(1): X
## rownames(11): Gata2 Gata1 ... EgrNab Gfi1
## rowData names(0):
## colnames(640): 0 1 ... 158-3 159-3
## colData names(1): cell_type
## reducedDimNames(0):
## altExpNames(0):
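As a brief illustration of working with the result, the sketch below pulls out the pieces listed in the printed summary above. It assumes the example file loads as shown (one assay named X and a cell_type column in the column data); the variable names sce, x and cell_types are ours and not part of the package.

```r
library(zellkonverter)
library(SingleCellExperiment)

example_h5ad <- system.file("extdata", "krumsiek11.h5ad",
    package = "zellkonverter")
sce <- readH5AD(example_h5ad)

# The AnnData 'X' matrix is stored as the assay named "X" (features x cells).
x <- assay(sce, "X")
dim(x)

# Per-cell annotations (AnnData 'obs') are held in the column data.
cell_types <- colData(sce)$cell_type
table(cell_types)

# Unstructured annotations (AnnData 'uns') appear in the metadata list.
names(metadata(sce))
```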
We can also write a SingleCellExperiment to an H5AD file with the writeH5AD() function. This is demonstrated below on the classic Zeisel mouse brain dataset from the scRNAseq package. The resulting file can then be used directly in compatible Python-based analysis frameworks.
library(scRNAseq)
sce_zeisel <- ZeiselBrainData()
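# Use 'fileext' so that the temporary file name ends in ".h5ad"
# ('pattern' would only set the start of the name).
out_path <- tempfile(fileext = ".h5ad")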
writeH5AD(sce_zeisel, file = out_path)
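One way to check that a written file is usable is to read it straight back into R. The sketch below does this on a small simulated object rather than the Zeisel data so that it runs quickly. The object small_sce and its contents are made up for illustration, and only the dimensions are compared because assay names may differ after the round trip.

```r
library(zellkonverter)
library(SingleCellExperiment)

# A small made-up count matrix: 20 genes x 5 cells.
set.seed(1)
counts <- matrix(rpois(100, lambda = 5), nrow = 20, ncol = 5,
    dimnames = list(paste0("gene", 1:20), paste0("cell", 1:5)))
small_sce <- SingleCellExperiment(assays = list(counts = counts))

# Write to a temporary H5AD file and read it back.
small_path <- tempfile(fileext = ".h5ad")
writeH5AD(small_sce, file = small_path)
small_back <- readH5AD(small_path)

# The number of genes and cells should survive the round trip.
dim(small_back)
```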
SingleCellExperiment and AnnData objects
Developers and power users who control their Python environments can directly convert between SingleCellExperiment and AnnData objects using the SCE2AnnData() and AnnData2SCE() utilities. These functions expect that reticulate has already been loaded along with an appropriate version of the anndata package. We suggest using the basilisk package to set up the Python environment before using these functions.
library(basilisk)
library(scRNAseq)
seger <- SegerstolpePancreasData()
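roundtrip <- basiliskRun(fun = function(sce) {
    # Convert SCE to AnnData. The functions are namespace-qualified so that
    # they are found even if the function runs in a separate R process.
    adata <- zellkonverter::SCE2AnnData(sce)

    # Any Python-side work on 'adata' would go here.

    # Convert back to an SCE:
    zellkonverter::AnnData2SCE(adata)
}, env = zellkonverter:::anndata_env, sce = seger)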
Package developers can guarantee that they are using the same versions of Python packages as zellkonverter by using the .AnnDataDependencies variable to set up their Python environments.
.AnnDataDependencies
## [1] "anndata==0.7.4" "h5py==2.10.0" "hdf5==1.10.5" "natsort==7.0.1"
## [5] "numpy==1.19.1" "packaging==20.4" "pandas==1.1.2" "scipy==1.5.2"
## [9] "sqlite==3.33.0"
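As a sketch of how this might look in practice, the pinned versions can be passed to the BasiliskEnvironment() constructor from basilisk. The environment name "my_anndata_env" and the package name "myPackage" are placeholders, not names used by zellkonverter. In a real package this definition would normally live in that package's own source code.

```r
library(basilisk)
library(zellkonverter)

# Hypothetical environment owned by a hypothetical package 'myPackage',
# pinned to the same Python package versions that zellkonverter uses.
my_anndata_env <- BasiliskEnvironment(
    envname = "my_anndata_env",
    pkgname = "myPackage",
    packages = .AnnDataDependencies
)
```

Defining the environment only records the requested packages. The environment itself is created when it is first used, for example through basiliskRun().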
sessionInfo()
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.12-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.12-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] basilisk_1.2.1 scRNAseq_2.4.0
## [3] SingleCellExperiment_1.12.0 SummarizedExperiment_1.20.0
## [5] Biobase_2.50.0 GenomicRanges_1.42.0
## [7] GenomeInfoDb_1.26.3 IRanges_2.24.1
## [9] S4Vectors_0.28.1 BiocGenerics_0.36.0
## [11] MatrixGenerics_1.2.1 matrixStats_0.58.0
## [13] zellkonverter_1.0.3 knitr_1.31
## [15] BiocStyle_2.18.1
##
## loaded via a namespace (and not attached):
## [1] ProtGenerics_1.22.0 bitops_1.0-6
## [3] bit64_4.0.5 filelock_1.0.2
## [5] progress_1.2.2 httr_1.4.2
## [7] tools_4.0.4 bslib_0.2.4
## [9] utf8_1.1.4 R6_2.5.0
## [11] lazyeval_0.2.2 DBI_1.1.1
## [13] withr_2.4.1 tidyselect_1.1.0
## [15] prettyunits_1.1.1 bit_4.0.4
## [17] curl_4.3 compiler_4.0.4
## [19] basilisk.utils_1.2.2 xml2_1.3.2
## [21] DelayedArray_0.16.2 rtracklayer_1.50.0
## [23] bookdown_0.21 sass_0.3.1
## [25] askpass_1.1 rappdirs_0.3.3
## [27] Rsamtools_2.6.0 stringr_1.4.0
## [29] digest_0.6.27 rmarkdown_2.7
## [31] XVector_0.30.0 pkgconfig_2.0.3
## [33] htmltools_0.5.1.1 ensembldb_2.14.0
## [35] dbplyr_2.1.0 fastmap_1.1.0
## [37] rlang_0.4.10 RSQLite_2.2.3
## [39] shiny_1.6.0 jquerylib_0.1.3
## [41] generics_0.1.0 jsonlite_1.7.2
## [43] BiocParallel_1.24.1 dplyr_1.0.5
## [45] RCurl_1.98-1.2 magrittr_2.0.1
## [47] GenomeInfoDbData_1.2.4 Matrix_1.3-2
## [49] Rcpp_1.0.6 fansi_0.4.2
## [51] reticulate_1.18 lifecycle_1.0.0
## [53] stringi_1.5.3 yaml_2.2.1
## [55] debugme_1.1.0 zlibbioc_1.36.0
## [57] BiocFileCache_1.14.0 AnnotationHub_2.22.0
## [59] grid_4.0.4 blob_1.2.1
## [61] promises_1.2.0.1 ExperimentHub_1.16.0
## [63] crayon_1.4.1 lattice_0.20-41
## [65] Biostrings_2.58.0 GenomicFeatures_1.42.1
## [67] hms_1.0.0 pillar_1.5.1
## [69] biomaRt_2.46.3 XML_3.99-0.5
## [71] glue_1.4.2 BiocVersion_3.12.0
## [73] evaluate_0.14 BiocManager_1.30.10
## [75] vctrs_0.3.6 httpuv_1.5.5
## [77] openssl_1.4.3 purrr_0.3.4
## [79] assertthat_0.2.1 cachem_1.0.4
## [81] xfun_0.21 mime_0.10
## [83] xtable_1.8-4 AnnotationFilter_1.14.0
## [85] later_1.1.0.1 tibble_3.1.0
## [87] GenomicAlignments_1.26.0 AnnotationDbi_1.52.0
## [89] memoise_2.0.0 ellipsis_0.3.1
## [91] interactiveDisplayBase_1.28.0