Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt

Project description

Transposable elements (TEs) are mobile DNA sequences that make up a large fraction of many eukaryotic genomes. They are increasingly recognized as regulators of changes in cellular identity in contexts ranging from embryonic development to the onset of disease, including neurodegenerative diseases and cancer (Bourque et al. 2018). However, we still know little about how TEs themselves are regulated. The large number of single-cell omic datasets now available offers an opportunity to map the transcriptional activity and chromatin state of these elements genome-wide at single-cell resolution, but fundamental computational challenges remain. The main one stems from the repetitive nature of TEs, which hinders the mapping of reads originating from them. Currently available computational tools designed to address this challenge either profile TEs at the sub-family level, losing all information on the genomic location of individual TEs, or exploit mutations to pinpoint the location of TEs, a strategy that fails for “young”, intact TEs that have accumulated few substitutions.

This project has two main aims: (i) to devise new computational methods to characterize the transcriptional and chromatin state of TEs from single-cell sequencing datasets; and (ii) to use these methods to characterize TE regulation during human embryonic development and in stem-cell-based models of human embryos.

For the first aim, we will develop a novel algorithm based on Bayesian machine learning. Starting from existing approaches for analyzing single-cell RNA-seq datasets (Eling et al. 2018), we will extend them to explicitly model ambiguous reads in scRNA-seq data as well as in other types of sequencing data, such as single-cell ATAC-seq and bisulfite sequencing (Kapourani et al. 2021). This will also enable us to analyze single-cell multi-omic datasets and investigate the regulation of TEs in various cell types and states.
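To make the idea of modeling ambiguous reads concrete, the following is a minimal sketch of one generic way to handle reads that align equally well to several TE copies: an expectation-maximization (EM) scheme that splits each ambiguous read fractionally among its candidate loci, guided by the uniquely mapped reads. It is not the Bayesian model proposed in this project, only a simple stand-in for the underlying principle. The function name `em_assign`, the locus names, and the assumption that every candidate alignment of a read is equally likely a priori are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def em_assign(reads, n_iter=50):
    """Fractionally assign multi-mapping reads to candidate loci with EM.

    reads: list of lists; each inner list holds the names of the loci
           to which one read aligns (one entry = uniquely mapped).
    Returns (theta, counts): estimated relative abundance per locus and
    the expected number of reads assigned to each locus.
    Assumption: all candidate alignments of a read are equally good.
    """
    loci = sorted({locus for cands in reads for locus in cands})
    # Start from a uniform abundance estimate over all observed loci.
    theta = {locus: 1.0 / len(loci) for locus in loci}
    counts = {locus: 0.0 for locus in loci}

    for _ in range(n_iter):
        # E-step: split each read among its candidates in proportion
        # to the current abundance estimates.
        expected = defaultdict(float)
        for cands in reads:
            total = sum(theta[locus] for locus in cands)
            for locus in cands:
                expected[locus] += theta[locus] / total
        # M-step: abundances are the normalized expected counts.
        counts = {locus: expected[locus] for locus in loci}
        theta = {locus: counts[locus] / len(reads) for locus in loci}

    return theta, counts


# Toy example with two hypothetical copies of the same TE sub-family:
# 3 reads map uniquely to copy "a", 1 uniquely to copy "b",
# and 4 reads are ambiguous between the two copies.
toy_reads = [["L1_copy_a"]] * 3 + [["L1_copy_b"]] * 1 + [["L1_copy_a", "L1_copy_b"]] * 4
theta, counts = em_assign(toy_reads)
print(theta)
print(counts)
```

In this toy example the ambiguous reads end up shared between the two copies in roughly the 3:1 ratio suggested by the unique reads, rather than being discarded or pooled at the sub-family level. When a locus has no uniquely mapped reads at all, such a scheme has little information to work with, which is one reason a model that quantifies uncertainty is attractive for young TEs.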

For the second aim, we will apply our new algorithm to map the expression patterns and chromatin states of TEs during human embryonic development using published single-cell omic data (e.g., Tyser et al. 2021) as well as new, unpublished datasets. We will focus on key stages of development such as gastrulation and cardiogenesis. We will then compare the transcriptional and epigenomic patterns of TEs in embryos with those found in stem-cell-based models of human development (Scialdone and Rivron 2022).

Overall, these analyses will shed light on the role of TEs during early organogenesis and will help assess the quality of currently available in vitro models of human development.

Relevant literature

Bourque, Guillaume, Kathleen H. Burns, Mary Gehring, Vera Gorbunova, Andrei Seluanov, Molly Hammell, Michaël Imbeault, et al. 2018. “Ten Things You Should Know about Transposable Elements.” Genome Biology 19 (1): 199.

Eling, Nils, Arianne C. Richard, Sylvia Richardson, John C. Marioni, and Catalina A. Vallejos. 2018. “Correcting the Mean-Variance Dependency for Differential Variability Testing Using Single-Cell RNA Sequencing Data.” Cell Systems 7 (3): 284–94.e12.

Kapourani, Chantriolnt-Andreas, Ricard Argelaguet, Guido Sanguinetti, and Catalina A. Vallejos. 2021. “scMET: Bayesian Modeling of DNA Methylation Heterogeneity at Single-Cell Resolution.” Genome Biology 22 (1): 114.

Scialdone, Antonio, and Nicolas Rivron. 2022. “In Preprints: Improving and Interrogating Embryo Models.” Development 149 (23). doi.org/10.1242/dev.201404.

Tyser, Richard C. V., Elmir Mahammadov, Shota Nakanoh, Ludovic Vallier, Antonio Scialdone, and Shankar Srinivas. 2021. “Single-Cell Transcriptomic Characterization of a Gastrulating Human Embryo.” Nature 600 (November): 285–89.