Single Cell Genomics Day

"an entire human nature, flexibly pre-ordained in our chromosomes and idiosyncratic to each of us"

Posted by Ruby on April 7, 2023

This is a recap from SCGD 2023, which includes:

  1. a very concise list of concepts that caught my eye
  2. my comments in quotes.

Rahul Satija

Single-cell genomics: Recent advances and future directions

  1. in-silico “experiments”
  2. sample size > cell numbers

    Think about what matters to you: detecting rare cell types, or getting information from major cell types (e.g. the effect of a perturbation on a major cell type)? Most of my cases would be the latter.

    A prior investigation of “how” to lower the number of cells per sample remains to be done; it is project-specific (sample quality and properties play a part).
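
One way to see why samples can matter more than cells is a two-level variance model. The sketch below is an illustration under assumptions that are not from the talk: each sample has its own mean (between-sample variance `var_between`), cells scatter around that mean (within-sample variance `var_within`), and all numbers are made up.

```python
def variance_of_group_mean(n_samples, cells_per_sample, var_between, var_within):
    """Variance of the average of per-sample means under a two-level model.

    The between-sample term shrinks only with more samples; the within-sample
    term shrinks with the total number of cells.
    """
    between_term = var_between / n_samples
    within_term = var_within / (n_samples * cells_per_sample)
    return between_term + within_term


# Same budget of 20,000 cells, split two ways (illustrative numbers only).
few_deep = variance_of_group_mean(4, 5000, var_between=1.0, var_within=4.0)
many_shallow = variance_of_group_mean(20, 1000, var_between=1.0, var_within=4.0)
print(f"4 samples x 5000 cells:  {few_deep:.4f}")
print(f"20 samples x 1000 cells: {many_shallow:.4f}")
```

Under these assumed variances the between-sample term dominates, so spending the budget on more samples reduces the variance far more than adding cells per sample.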

Cell states

Samantha Morris

New genomic technologies to deconstruct and control cell identity

  1. “Dead end” and “reprogramming permissive state”
  2. computation fills the gap from sampling and experimental measurement to the “truth”.

    our field exists only because of the limitations of experimental technology.

Sydney Shaffer

Tracing lineages and cell states from metaplasia to cancer

  • lentivirus doesn’t work in human tissues
  • barcode in vivo: mitochondrial mutations
    • survival advantages? Assumption: not the driver for cancer cell growth
  • metaplasia, dysplasia, their source clone and its molecular driver (confirmed by scRNA-seq & WES)

Atlas

Sten Linnarsson

Cellular diversity of the developing and adult human nervous system

  1. dissection-dissection correlation

    Very good insights one can get from an “atlas” study

  2. Gaussian mixture model clusters and Topics
  3. Representative sampling, based on factors contributing to phenotypes most
    • e.g. sex: to really dissect it, you need large sample sizes. Without enough samples, there is no point in a sex-specific analysis

      Use prior knowledge of the factors contributing to phenotypes to guide study design before starting a project

Jay Shendure

Global views of mammalian development, from zygote to pup

  • Mice: bounded, reproducible, accessible
  • Whole embryos flash frozen
  • dynamic integration
  • nearest neighbors are mostly from the same cell type (at different time points)

    Is this a question about the fluidity/cutoff of the ontology over time?

  • drastic changes at birth; C-section behaves quite differently

    Birth itself is a dramatic process

Spatial

Sanja Vickovic

Spatial host-microbiome sequencing

compromise between resolution and throughput

integrate/cross-compare the results from techs with different resolution-throughput

  1. DataInno: in-silico tissue generators
  2. SHM-seq: host-microbiome

    how to dissect mutual influences?

  3. communicating or just proximity: computationally more on live tissues.
  4. subcellular spatial information: to higher resolution

Jackson Weir

Slide-tags: A new methodology for generating spatially localized single cell genomics data

  1. single cell, quality ≈ snRNA-seq
  2. tonsil: immune, germinal center reaction, spatially confined cell-cell interactions
    • receptor-ligand in two (not so) proximal clusters
  3. dedifferentiation in melanoma tumors and the TF motifs associated with it

    They only did cluster-level analysis, but this concept is one insight we could want from “trajectory” analysis.

DNA perturbation

Brian Cleary

Compressed Perturb-seq: highly efficient screens for regulatory circuits using random composite perturbations

  1. assumption: perturbations affect a small number of co-regulated modules
  2. from prior knowledge of the sparsity of (nonzero) perturbation effects, compress samples to O(k log n), then do sparse inference with methods such as LASSO
  3. composite sample: $cs_1 = \sum_{i=1}^{s} w_{1,i} x_i$ (weighted sum of the outcomes of a subset of conventional samples)
    • Non-linearity, complexity: compression theoretically helps with this. But in their 600-gene study there are still too many pairs, $\binom{600}{2} = 179{,}700$, to test any pair for significance
  4. Methods:
    1. cell pooling (multiple cells per droplet), resulting in droplet level observations in matrices
    2. Guide pooling (multiple gRNAs per cell), resulting in (target) gene level …
    3. Decompress: sparse factorization (sparse PCA) on the expression count matrix (droplets by genes), sparse recovery (LASSO) on the perturbation design matrix (droplets by sgRNAs)
  5. Evaluation:
    • 4-20x reduced costs
    • Guide pooling more effective in degree of overloading
  6. biological analysis: cis-, trans-eQTL
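
The compress-then-recover idea in items 2 and 3 can be sketched on toy data. This is a minimal illustration and not the authors' pipeline. The assumptions are mine: Gaussian composite weights, noise-free measurements, a hand-written coordinate-descent LASSO, and hypothetical function names and sizes (40 perturbations, 2 nonzero effects, 30 composite samples).

```python
import random


def soft_threshold(value, lam):
    """Shrink value toward zero by lam (the LASSO proximal step)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(A, y, lam, n_sweeps=200):
    """Minimise 0.5*||y - A x||^2 + lam*||x||_1 by cyclic coordinate descent."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    residual = list(y)  # residual = y - A x, and x starts at zero
    col_sq = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]
    for _ in range(n_sweeps):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes x_j.
            rho = sum(A[i][j] * residual[i] for i in range(m)) + col_sq[j] * x[j]
            new_xj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_xj - x[j]
            if delta != 0.0:
                for i in range(m):
                    residual[i] -= A[i][j] * delta
                x[j] = new_xj
    return x


def simulate(n_perturbations=40, n_nonzero=2, n_composites=30, seed=0):
    """Build sparse effects, measure composites, recover the support."""
    rng = random.Random(seed)
    true_effects = [0.0] * n_perturbations
    support = sorted(rng.sample(range(n_perturbations), n_nonzero))
    for j in support:
        true_effects[j] = rng.choice([-1.0, 1.0]) * rng.uniform(1.0, 2.0)
    # Row r holds the weights w_{r,i} of composite sample cs_r.
    W = [[rng.gauss(0.0, 1.0) for _ in range(n_perturbations)]
         for _ in range(n_composites)]
    y = [sum(W[r][i] * true_effects[i] for i in range(n_perturbations))
         for r in range(n_composites)]
    estimate = lasso_coordinate_descent(W, y, lam=0.1)
    ranked = sorted(range(n_perturbations), key=lambda j: -abs(estimate[j]))
    top = sorted(ranked[:n_nonzero])
    return support, top, true_effects, estimate


support, top, true_effects, estimate = simulate()
print("true support:     ", support)
print("recovered support:", top)
```

The point of the toy is only that fewer composite measurements than perturbations can still pin down a sparse effect vector; real data add noise, non-negative pooling weights and many genes at once.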

Helen Kang

Variant-to-Gene-to-Program: an approach to trace the path between GWAS variants and common disease

  1. Variant-Gene & Gene-Program: intersection

    Why a simple intersection? Is there a softer association analysis?
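
To make the question concrete, here is a toy contrast between a hard intersection and one possible soft score. Nothing here comes from the talk: the gene names and scores are placeholders, the 0.5 threshold is arbitrary, and multiplying the two scores is just one assumed way to combine them.

```python
# Hypothetical toy inputs: placeholder genes with made-up confidence scores.
variant_to_gene = {"GENE_A": 0.9, "GENE_B": 0.4, "GENE_C": 0.1}
gene_to_program = {"GENE_A": 0.2, "GENE_B": 0.8, "GENE_D": 0.7}


def hard_intersection(v2g, g2p, threshold=0.5):
    """Keep genes that pass the threshold in BOTH maps."""
    return sorted(
        gene for gene in v2g
        if v2g[gene] >= threshold and g2p.get(gene, 0.0) >= threshold
    )


def soft_scores(v2g, g2p):
    """Score every shared gene by the product of its two link scores."""
    return {gene: v2g[gene] * g2p[gene] for gene in v2g.keys() & g2p.keys()}


print(hard_intersection(variant_to_gene, gene_to_program))
print(sorted(soft_scores(variant_to_gene, gene_to_program).items()))
```

In this toy case the hard intersection returns nothing, while the soft score still ranks the two shared genes, which is the kind of information a strict intersect throws away.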

Post-transcriptional process

Madeline Kowalski

PerturbSci-Kinetics: Characterizing heterogeneity in post-transcriptional regulation with single-cell sequencing

  1. with info about RNA/degradation/production

    Experimental data which I should use to evaluate my metrics built from RNA-seq data

  2. 3’UTR elongation or shortening
  3. APARENT-PERTURB to predict sequence (motif and regions) ~ CPA regulators
  4. VASA-seq, EasySci, Parse Biosciences, Smart-seq3, long-read scRNA-seq: alternative splicing

John Blair

Phospho-seq: Reconstructing regulatory networks from multimodal single cell data

Consider this as experimental data for evaluating signaling regulatory network inference, but it is better to build directly on data that sample the matched subjects (proteins) per se; i.e. these experimental advances make the computational methods designed before obsolete.

  1. TF-focused association network: Pando, in silico perturb by CellOracle
  2. TF binding: in silico ChIP-seq by Argelaguet, et al., 2022
  3. protein level: TF-protein by NEAT-seq
  4. Bridge integration: high quality RNA from unfixed scRNA-seq, fixed RNA+ATAC as bridge, final trimodal dataset with ATAC+Protein

    Money!

Scalable strategies

Gesmira Molla

Strategies for massively scalable single-cell analysis

  • Aggregate:
    • MetaCell-2: Parallel and dynamic
    • SEACells: similarity from an adaptive Gaussian kernel, then archetypal analysis decomposes the kernel. Metacells give good correlations between ATAC-seq peaks and genes, since single-cell ATAC-seq is quite sparse

      This kernel and archetypal analysis might have other uses

  • Sketch (sampling)
    • geometrically even
  • Stream

    Then why bother going massive?

    1. computational sampling is not the same as experimental sampling
    2. the speed at which the technologies advance is just not matched in this era

    Confirms that samples > cells

    How true is the homogeneity assumption? Would this method change the intra-“cell type” heterogeneity?

    Why we must put together cell types
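
The “geometrically even” sketching idea from the list above can be illustrated with a simplified grid version: cover the space with equal-sized boxes and keep one cell per occupied box. This is my own minimal sketch, not the tool from the talk (which these notes do not name); `grid_sketch`, the 2-D toy points and the box size are all assumptions.

```python
import random
from collections import defaultdict


def grid_sketch(points, cell_size, seed=0):
    """Return one randomly chosen point index per occupied grid box."""
    rng = random.Random(seed)
    boxes = defaultdict(list)
    for index, point in enumerate(points):
        # Points falling in the same box share the same integer key.
        key = tuple(int(coord // cell_size) for coord in point)
        boxes[key].append(index)
    return sorted(rng.choice(members) for members in boxes.values())


# Toy embedding: 1000 cells of a common type and 10 cells of a rare type.
rng = random.Random(1)
common = [(rng.random(), rng.random()) for _ in range(1000)]
rare = [(5.0 + 0.5 * rng.random(), 5.0 + 0.5 * rng.random()) for _ in range(10)]
sketch = grid_sketch(common + rare, cell_size=1.0)
print(sketch)  # indices below 1000 are common cells, the rest are rare cells
```

Because every occupied box contributes equally, the rare population is kept even though it is 1% of the cells, whereas uniform subsampling would usually miss it. The flip side is the question raised above: density, and with it part of the within-type heterogeneity, is deliberately flattened.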