Tracking lineage relationships between cells is a powerful way to obtain mechanistic insights into development and cell differentiation (Kretzschmar and Watt, 2012). Over a century ago, cells and their progeny were tracked using simple dye labelling experiments. Following the same principle, powerful contemporary approaches label cells with unique heritable molecular barcodes, using single-cell RNA-sequencing to read out cell lineage and gene expression in parallel, in many individual cells. Here, we highlight several preprints in this fast-moving area of single-cell genomics-based lineage tracing.
Early single-cell genomics methods to measure lineage relied on CRISPR-Cas9-based genome editing or virus-based methods to introduce unique genetic barcodes into cells. However, these early methods are limited by technical barriers, such as the deletion of CRISPR-Cas9 lineage records or the use of non-mutable virus-based labelling, which restrict the resolution of lineage tree construction (Kester and van Oudenaarden, 2018; VanHorn and Morris, 2021). Recent lineage-tracing methods have therefore adopted an ensemble approach to overcome some of these limitations, e.g. to record lineages in human cerebral organoids (He et al., 2021). Two recent preprints take a distinct approach, using DNA recording devices to store information relating to the lineage and past states of cells, which is then read out at a later time. Both methods use prime editing – a gene-editing method employing DNA mismatch repair to precisely introduce targeted small insertions, deletions and base substitutions. Using this method, cells are uniquely labelled over time to enable lineage relationships to be resolved.
In the first of these two preprints, Choi et al. introduce a new method termed ‘DNA Ticker Tape’, a tandem array of partial CRISPR-Cas9 target sites designed to be targeted by Cas9 in an ordered manner by leveraging insertion events created via prime editing (Choi et al., 2021 preprint). Each successive edit records the identity of the prime editing guide RNA (pegRNA) that mediated the edit while also shifting the position of the active target site by one unit along the array, resulting in sequential genome editing. By expressing pegRNAs constitutively, the authors use this information encoder to record lineage relationships between cells. Furthermore, in proof-of-principle experiments, they demonstrate how signals of interest can be coupled to the expression or activity of a pegRNA, allowing the precise order of molecular events within individual cells to be recorded.
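As a rough illustration of the ordered recording described above, the following Python sketch models the array as a fixed number of positions, only one of which is active at a time. Each edit stores the label of the pegRNA that mediated it and moves the active position along by one unit. The class name `TickerTape`, the number of sites and the pegRNA labels are hypothetical and chosen purely for illustration; this is a toy model under those assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TickerTape:
    """Toy model of an ordered recording array (all names are hypothetical)."""
    n_sites: int = 5
    records: List[str] = field(default_factory=list)

    @property
    def active_site(self) -> Optional[int]:
        # Only the first unwritten position is editable; None once the array is full.
        return len(self.records) if len(self.records) < self.n_sites else None

    def edit(self, pegrna_label: str) -> bool:
        """Record one edit at the active site; return False if the array is full."""
        if self.active_site is None:
            return False
        # Appending the label also moves the active site one unit along.
        self.records.append(pegrna_label)
        return True

    def copy(self) -> "TickerTape":
        # A dividing cell passes its current records to both daughters.
        return TickerTape(self.n_sites, list(self.records))


def shared_history(a: TickerTape, b: TickerTape) -> int:
    """Count the leading edits that two arrays have in common."""
    n = 0
    for x, y in zip(a.records, b.records):
        if x != y:
            break
        n += 1
    return n


# A founder cell records two edits and then divides (labels are illustrative).
founder = TickerTape()
founder.edit("A")
founder.edit("C")
daughter_1, daughter_2 = founder.copy(), founder.copy()
daughter_1.edit("G")
daughter_2.edit("T")
```

In this toy model, the two daughters share their first two records and differ at the third, so the length of the shared prefix reflects how recently the cells diverged.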
Appearing in tandem with DNA Ticker Tape, peCHYRON (prime editing Cell HistorY Recording by Ordered iNsertion) is a DNA recorder technology that leverages prime editing to record sequential signals into one genomic locus over time (Loveless et al., 2021 preprint). With a prime editor enzyme and pegRNAs barcoded with distinct 3 bp sequences, unique cellular barcodes are generated over time as combinations of these 3 bp sequences are assembled. By using an alternating insertion sequence, prime editing can be propagated for many rounds to record both lineage and molecular events. Both DNA Ticker Tape and peCHYRON represent exciting additions to the field, allowing the precise order of molecular events to be tracked. For example, they allow information about the order in which a cell receives a series of signals to be integrated with lineage information, enabling deeper mechanistic insight across a range of biological questions.
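To illustrate how ordered 3 bp insertions could be turned into lineage groupings, the sketch below nests cells by the barcodes they share, in the order in which they were inserted. It assumes that each cell's record has already been reduced to a plain concatenation of 3 bp barcodes, which is a simplification for illustration. The function names and example sequences are hypothetical and are not taken from the preprint.

```python
from typing import Dict, List


def split_record(record: str, unit: int = 3) -> List[str]:
    """Split a concatenated record into ordered barcodes of `unit` bases."""
    if len(record) % unit != 0:
        raise ValueError("record length must be a multiple of the barcode length")
    return [record[i:i + unit] for i in range(0, len(record), unit)]


def build_prefix_tree(cells: Dict[str, str], unit: int = 3) -> dict:
    """Nest cells by shared ordered barcodes.

    Each node holds the cells whose record ends there ('cells') and
    one child node per barcode inserted next ('children').
    """
    root: dict = {"cells": [], "children": {}}
    for name, record in sorted(cells.items()):
        node = root
        for barcode in split_record(record, unit):
            node = node["children"].setdefault(
                barcode, {"cells": [], "children": {}}
            )
        node["cells"].append(name)
    return root


# Hypothetical records: two rounds of insertion per cell.
example_cells = {
    "cell_1": "ACGTTA",
    "cell_2": "ACGGGC",
    "cell_3": "TTTCAG",
}
tree = build_prefix_tree(example_cells)

# Number of distinct sequences that a single 3 bp barcode can take.
n_possible_barcodes = 4 ** 3
```

Here `cell_1` and `cell_2` share their first insertion and therefore sit under the same branch, whereas `cell_3` forms a separate branch. A 3 bp barcode can take at most 64 sequences, so the diversity of the records comes from combining barcodes over successive rounds.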
Although genomic methods continue to be at the forefront of lineage-tracing innovations, their combination with spatial and imaging data is still limited by the inaccessibility of high-resolution spatial genomics methods. Unsurprisingly, researchers are continuing to develop fluorescent protein-based lineage-tracing approaches, with a focus on improving clonal resolution and usability. For example, Caviglia, Unterweger et al. developed FRaeppli, a Cre-inducible PhiC31-based multicolour labelling approach that can be used in zebrafish with almost perfect colour distribution (∼25% for each of the four cassettes). In addition, the FRaeppli cassette does not contain any fluorescent proteins in the green spectrum, which allows its combination with classic EGFP transgenic lines (Caviglia et al., 2022 preprint). An alternative to genetic lineage tracing, albeit for shorter chase times, is photoconversion of fluorescent proteins, which allows researchers to selectively label cells of interest during imaging experiments. A recent study used photoconversion to label motile versus non-motile epithelial cells during airway regeneration, allowing these cells to be profiled separately by single-cell sequencing and thereby uncovering potential regulatory mechanisms (Kwok et al., 2022 preprint). Another study exploited the power of correlative live and fixed microscopy to delineate fate decisions in cerebral organoids (Coquand et al., 2022 preprint). In this case, the authors used sparse lentiviral transduction and GFP live imaging to define clonal dynamic behaviours of neural progenitors; they then performed correlative fixation and immunostaining to evaluate the fates of each cell within unique clones, characterizing their variation in morphology, self-renewal, differentiation and fates. Thus, far from being outdated, imaging-based tracing technologies continue to see a steady upgrade in capabilities and analytic power.
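The value of an even colour distribution in multicolour labelling can be made concrete with a short calculation. Assuming that each clone receives its colour independently, the chance that two clones end up with the same colour is the sum of the squared colour frequencies, which is lowest when all colours are equally likely. The skewed frequencies below are hypothetical and serve only as a point of comparison.

```python
from typing import Sequence


def same_colour_probability(freqs: Sequence[float]) -> float:
    """Chance that two independently labelled clones receive the same colour."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("colour frequencies must sum to 1")
    return sum(p * p for p in freqs)


# Four colours at roughly 25% each, as described above.
balanced = same_colour_probability([0.25, 0.25, 0.25, 0.25])

# Hypothetical skewed system in which one colour dominates.
skewed = same_colour_probability([0.70, 0.10, 0.10, 0.10])
```

Under these assumptions, the balanced case gives a one-in-four chance that two clones are indistinguishable by colour, compared with roughly one in two for the skewed example.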
In parallel with these methodological developments, the array of biological questions that are being examined through the combination of single-cell profiling and lineage tracing is expanding. One of the most exciting questions that lineage tracing seems uniquely suited to address is how cell dedifferentiation contributes to tissue regeneration. A recent study interrogated the progeny of Tbx5+ cardiomyocytes during heart regeneration, uncovering a variety of Tbx5-derived cell states that could potentially be used to improve induced pluripotent stem cell (iPSC)-derived heart cell therapies (Siatra et al., 2022 preprint). Another study profiled the progeny of stromal cells during endometrial regeneration, confirming their contribution to the epithelial layer after each menstrual cycle (Kirkwood et al., 2022 preprint).
Clonal tracking and single-cell profiling also continue to improve our understanding of cellular memories during differentiation. In a new study, Mold et al. analysed a wide array of datasets from cell-tracing experiments and revealed vast transcriptional variation among different clones of neurons and T lymphocytes (Mold et al., 2022 preprint). This extends previous observations about transcriptional heritability in stem cells and cancer cells to additional cell types. The authors also showed that chromatin accessibility differences correlate with clonal transcriptional heritability, although the mechanistic underpinning of clonal variation remains unclear.
Finally, the ever-increasing complexity of single-cell lineage tracing studies must be accompanied by computational methods to construct single-cell phylogenies and interpret the resulting data. Initially, approaches based on maximum parsimony were developed, but these methods had to dispense with some lineage information, reducing the accuracy of tree reconstruction. Several recent studies report computational methods designed to overcome these limitations. For example, TiDeTree (Time-scaled Developmental Trees) is a Bayesian phylogenetic framework that uses genetic lineage tracing data to infer time-scaled single-cell phylogenies and population dynamics such as cell division, death and differentiation rates (Seidel and Stadler, 2022 preprint). Such methods are also being accompanied by modelling approaches to test the feasibility of phylogeny inference from CRISPR-Cas9 cell labelling approaches (Wang et al., 2021 preprint), which will undoubtedly aid experimental method development. With lineage information to hand, methods to aid biological interpretation are also emerging. For example, ClonoCluster (Richman et al., 2022 preprint) is a computational method that combines clone and transcriptomic information to reveal biologically relevant expression differences between cells that relate to functional outcomes. As additional modalities, such as chromatin accessibility and epigenetic information, are captured in parallel with lineage information, computational methods such as ClonoCluster represent valuable platforms on which to build to extract useful biological information from the complex datasets generated.
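The general idea of combining clone and transcriptomic information can be sketched as a weighted blend of two cell-by-cell similarity measures, one based on expression and one based on clone membership. The example below is a generic illustration of that idea and should not be read as the ClonoCluster algorithm; the function names, the weighting parameter `alpha` and the toy data are all assumptions made for this sketch.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two expression vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def blended_similarity(
    expression: Sequence[Sequence[float]],
    clones: Sequence[str],
    alpha: float,
) -> List[List[float]]:
    """Blend expression similarity with clone co-membership.

    alpha = 0 uses expression only; alpha = 1 uses clone identity only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    if len(expression) != len(clones):
        raise ValueError("one clone label is required per cell")
    n = len(clones)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            s_expr = cosine_similarity(expression[i], expression[j])
            s_clone = 1.0 if clones[i] == clones[j] else 0.0
            row.append((1.0 - alpha) * s_expr + alpha * s_clone)
        matrix.append(row)
    return matrix


# Toy data: cells 0 and 1 look alike transcriptionally but come from different
# clones, whereas cells 1 and 2 share a clone but differ in expression.
expression = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
clones = ["x", "y", "y"]

expression_only = blended_similarity(expression, clones, alpha=0.0)
blended = blended_similarity(expression, clones, alpha=0.5)
```

With `alpha` set to 0.5 in this toy example, the clonally related pair and the transcriptionally similar pair receive the same blended score, which shows how the weighting shifts the emphasis between the two sources of information before any downstream clustering.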
Single-cell lineage tracing is evolving at a fast pace. The continued development of experimental and computational methods to construct lineages, together with the application of these approaches, promises unprecedented mechanistic insight into an array of longstanding biological questions.
A.R.F. acknowledges helpful comments from preprint authors on Twitter in addition to discussions with members of the Quantitative Stem Cell Dynamics lab. S.A.M. acknowledges helpful discussions with her lab members.
A.R.F. was supported by funding from the CRIS contra el Cancer foundation (PR_EX_2020-24), the MINECO (PID2020-114638RA-I00), a Ramon y Cajal fellowship (RYC2020-029004-I), and a fellowship from la Caixa Foundation (ID 100010434) and from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 847648 (JL_21_35). IRB Barcelona is the recipient of a Severo Ochoa Award of Excellence from the MINECO. S.A.M. is supported by an Allen Distinguished Investigator Award (through the Paul G. Allen Frontiers Group), a Vallee Scholar Award, a Sloan Research Fellowship and a New York Stem Cell Foundation Robertson Investigator Award.
S.A.M. is an Associate Editor at Development.