Chapter 1 Preface

This is a minimal benchmark for analyzing snRNA-seq data, mainly with tools from the scverse and Bioconductor communities. It will be updated continuously throughout my PhD.

From my current experience:

- Bioconductor has a more solid foundation of biological (annotation) data.
- Questions, and the methods to tackle them (except for deep learning), are usually developed first in R; the Python community then proposes competing methods, since Seurat came before scanpy.
- The scanpy community evolves fast, but its developers do not seem to prioritize integrating each other's new features.
- R packages are in general easier to use, though I am personally more used to Python and can adapt things more easily there.
- R tools crash more easily than their Python counterparts given the same hardware, the same data size and the same kind of job.

1.1 Comments on snRNA-seq vs scRNA-seq

Advantage of snRNA-seq: it does not require preserving cellular integrity during sample preparation, especially dissociation.

Disadvantage of snRNA-seq: biological signal is lost for genes whose transcripts have little nuclear localization.

Interpretation of differential expression (DE) results:

Assumption: nuclear abundances are a good proxy for the overall expression profile.
Considerations:
- Transcripts of strongly expressed genes might localize to the cytoplasm for efficient translation and then be lost when the cytoplasm is stripped away.
- Genes with the same overall expression but different rates of nuclear export may appear to be differentially expressed between clusters.
- In the most pathological case, higher snRNA-seq abundances may indicate nuclear sequestration of transcripts for protein-coding genes and reduced activity of the relevant biological process, contrary to the usual interpretation of upregulation.
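
The second consideration can be made concrete with a toy calculation. This is only a sketch with made-up numbers: the helper names (`nuclear_abundance`, `log2_fold_change`), the export fractions and the pseudocount are all assumptions for illustration, not part of any scverse or Bioconductor API. It shows a gene with identical total expression in two clusters that still looks upregulated in the nuclear fraction because less of it is exported.

```python
import math


def nuclear_abundance(total, export_fraction):
    """Transcripts left in the nucleus when a given fraction is exported (hypothetical helper)."""
    return total * (1.0 - export_fraction)


def log2_fold_change(numerator, denominator, pseudocount=1.0):
    """Log2 ratio of two abundances, with a pseudocount to avoid division by zero."""
    return math.log2((numerator + pseudocount) / (denominator + pseudocount))


# Same overall expression in both clusters (made-up number).
total = 100.0

# Assumed export rates: cluster A exports 80% of transcripts, cluster B only 50%.
nuc_a = nuclear_abundance(total, export_fraction=0.8)
nuc_b = nuclear_abundance(total, export_fraction=0.5)

lfc_total = log2_fold_change(total, total)    # what whole-cell data would report
lfc_nuclear = log2_fold_change(nuc_b, nuc_a)  # what nuclear data would report

print(f"log2FC on total abundance (B vs A):   {lfc_total:.2f}")
print(f"log2FC on nuclear abundance (B vs A): {lfc_nuclear:.2f}")
```

In this toy setting the total-abundance fold change is zero, while the nuclear fold change exceeds one log2 unit, so a DE call from snRNA-seq alone would reflect export rate rather than expression level.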

snRNA-seq data analysis