Demo with scanpy • anndata

We’ve found that by using anndata for R, interacting with other anndata-based Python packages becomes super easy!

Set up

To use another Python package (e.g. scanpy), you need to make sure that it is installed in the same ephemeral Python environment that anndata uses. You can let reticulate handle this for you by using the py_require() function:

library(anndata)
library(reticulate)
py_require("scanpy")

TIP: Check out the vignette on setting up Python package environments with reticulate: https://rstudio.github.io/reticulate/articles/python_packages.html.

Download and load dataset

Let’s use a 10x dataset from the 10x genomics website. You can download it to an anndata object with scanpy as follows:

sc <- import("scanpy")

url <- "https://cf.10xgenomics.com/samples/cell-exp/6.0.0/SC3_v3_NextGem_DI_CellPlex_CSP_DTC_Sorted_30K_Squamous_Cell_Carcinoma/SC3_v3_NextGem_DI_CellPlex_CSP_DTC_Sorted_30K_Squamous_Cell_Carcinoma_count_sample_feature_bc_matrix.h5"

ad <- sc$read_10x_h5("dataset.h5", backup_url = url)

ad
#> AnnData object with n_obs × n_vars = 5377 × 36601
#>     var: 'gene_ids', 'feature_types', 'genome', 'pattern', 'read', 'sequence'

Preprocessing dataset

The resuling dataset is a wrapper for the Python class but behaves very much like an R object:

ad[1:5, 3:5]
#> View of AnnData object with n_obs × n_vars = 5 × 3
#>     var: 'gene_ids', 'feature_types', 'genome', 'pattern', 'read', 'sequence'
dim(ad)
#> [1]  5377 36601

But you can still call scanpy functions on it, for example to perform preprocessing.

sc$pp$filter_cells(ad, min_genes = 200)
sc$pp$filter_genes(ad, min_cells = 3)
sc$pp$normalize_per_cell(ad)
sc$pp$log1p(ad)

Analysing your dataset in R

You can seamlessly switch back to using your dataset with other R functions, for example by calculating the rowMeans of the expression matrix.

library(Matrix)
rowMeans(ad$X[1:10, ])
#> AAACCCAAGCGCGTTC-1 AAACCCAAGGCAATGC-1 AAACCCAGTATCTTCT-1 AAACCCAGTGACAACG-1 
#>         0.05451418         0.13627126         0.12637224         0.13958617 
#> AAACCCAGTTGAATCC-1 AAACCCATCGGCTTGG-1 AAACGAAAGAGAGCCT-1 AAACGAAAGCTTAAGA-1 
#>         0.05979424         0.11365747         0.05011727         0.14347849 
#> AAACGAAAGGCACGAT-1 AAACGAAAGGTAGCCA-1 
#>         0.12979302         0.12366312