ABSTRACT
The evolution of a unique craniofacial complex in vertebrates made possible new ways of breathing, eating, communicating and sensing the environment. The head and face develop through interactions of all three germ layers, the endoderm, ectoderm and mesoderm, as well as the so-called fourth germ layer, the cranial neural crest. Over a century of experimental embryology and genetics have revealed an incredible diversity of cell types derived from each germ layer, signaling pathways and genes that coordinate craniofacial development, and how changes to these underlie human disease and vertebrate evolution. Yet for many diseases and congenital anomalies, we have an incomplete picture of the causative genomic changes, in particular how alterations to the non-coding genome might affect craniofacial gene expression. Emerging genomics and single-cell technologies provide an opportunity to obtain a more holistic view of the genes and gene regulatory elements orchestrating craniofacial development across vertebrates. These single-cell studies generate novel hypotheses that can be experimentally validated in vivo. In this Review, we highlight recent advances in single-cell studies of diverse craniofacial structures, as well as potential pitfalls and the need for extensive in vivo validation. We discuss how these studies inform the developmental sources and regulation of head structures, bringing new insights into the etiology of structural birth anomalies that affect the vertebrate head.
Introduction
Craniofacial development involves an intricate series of morphogenetic processes in the embryonic head. Shortly after gastrulation, cranial neural crest cells (CNCCs) emerge from the neural plate border through an epithelial-mesenchymal transition and migrate throughout the head. At around the same time, the pharyngeal endodermal epithelium forms a series of evaginations (pouches), which contact ectodermal epithelial invaginations (clefts). Within each pharyngeal arch segment, the pouch and cleft epithelia envelop CNCC mesenchyme and a core of mesoderm-derived mesenchyme (Fig. 1A) (Chai and Maxson, 2006).
Fig. 1. Overview of human craniofacial development and organ systems. (A) Pharyngeal arch stage (human 4-5 weeks) with cross-sectional diagrams illustrating the major germ layer contributions. Facial ectoderm, purple; cranial neural crest, green; pharyngeal mesoderm, red; pharyngeal endoderm, orange. (B) Contribution of germ layers to the bones and teeth of the face and skull (left), muscles and glands (right). (C) Developmental stages of a typical tooth, facial gland and palate. Germ layer contributions are shown for the tooth and gland.
The three primary germ layers (ectoderm, endoderm and mesoderm), along with the cranial neural crest (often referred to as the fourth germ layer), contribute to distinct cell types and communicate with each other to pattern craniofacial structures (Fig. 1B). CNCCs contribute to musculoskeletal structures (bone, cartilage, teeth, tendons and ligaments), as well as smooth muscle and fibroblasts (Fabian et al., 2022). Meanwhile, the mesoderm generates the facial muscles and some parts of the head skeleton, particularly the posterior skull; the endoderm contributes to endocrine organs such as the thymus and parathyroid; and the ectoderm contributes to the salivary and lacrimal glands (Lescroart et al., 2015; Rothova et al., 2012). Over a century of dye labeling, tissue grafting and genetic recombination experiments have revealed extensive catalogs of cell types derived from each germ layer (Fabian and Crump, 2022).
Identification of the genes and signaling pathways required for craniofacial development has come from multiple approaches. Forward genetic screens in model organisms such as zebrafish and mice have identified many genes required for craniofacial development (Amsterdam et al., 2004; Piotrowski et al., 1996; Schilling et al., 1996), while genome-wide association studies (GWAS) have identified candidate genes in humans with congenital anomalies (e.g. cleft palate and craniosynostosis) (Justice et al., 2020; Leslie et al., 2017). Reverse genetic approaches test the requirements of candidate genes and pathogenicity of mutations through genetic engineering of mouse embryonic stem cells and diverse model organisms (Van Otterloo et al., 2016). Conditional genetics, in which genes are deleted or modified in specific tissue types (e.g. through Cre-mediated recombination), reveal in which tissues genes function for craniofacial development (Murray, 2011).
Despite these advances, we have an incomplete picture of the genetic and mechanistic underpinnings of craniofacial development. Genetic studies often miss key genes involved in craniofacial development due to genetic redundancy, pleiotropy and species-specific requirements. Changes in gene regulation play major roles in congenital anomalies and evolution, yet our ability to understand the non-coding regulatory genome is only just emerging. Recent advances in high-throughput sequencing, including at the single-cell level, have enabled a more holistic view of the repertoire of expressed genes and active gene regulatory elements in diverse cell types of the developing head. In this Review, we discuss the opportunities and challenges in leveraging these massive datasets to gain new insights into craniofacial development, disease and evolution.
Single-cell studies of craniofacial germ layers
Cells can be classified into types based on functions, morphologies, locations, protein compositions and other properties. High-throughput DNA sequencing, combined with the ability to uniquely barcode single cells, has led to new opportunities to define all cell types within a tissue, organ or entire organism. Current approaches include single-cell or single-nuclei sequencing of reverse transcribed RNA molecules (scRNAseq) or regions of the genome accessible to transposase integration (snATACseq) (Buenrostro et al., 2015) (Fig. 2B). Accessible regions include the chromatin-depleted regions of enhancers, promoters, insulators and other gene regulatory domains. scRNAseq and snATACseq datasets can be integrated using software [e.g. SnapATAC (Fang et al., 2021)] that uses accessibility of gene loci in snATACseq data to generate pseudotranscriptomic data, or by performing multiomic experiments on the same barcoded cells (Cao et al., 2018). Lineage information can then be deduced by connecting cells with overlapping gene expression (e.g. Monocle 3; Cao et al., 2019) or by comparing nascent (unspliced) RNA versus spliced RNA to infer the trajectory of gene expression (e.g. RNA Velocity; La Manno et al., 2018). Gene regulatory networks can be predicted by constructing regulons of correlated gene expression (e.g. SCENIC; Aibar et al., 2017) or by linking expression of transcription factors with their binding site enrichment in accessible chromatin domains (e.g. CellOracle; Kamimoto et al., 2023). Receptor and ligand expression can also be used to identify potential cell-cell signaling pathways [e.g. CellChat (Jin et al., 2021) and CellPhoneDB (Efremova et al., 2020)] (Fig. 2C). See Table 1 for a summary of single-cell bioinformatics software packages.
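The comparison of unspliced and spliced RNA can be illustrated with a minimal sketch of a steady-state velocity calculation for a single gene. It assumes the simple model in which unspliced abundance at steady state is proportional to spliced abundance (u ≈ γs), so that the residual u − γs is positive when a gene is being induced and negative when it is being repressed. The function names and toy counts are hypothetical, and the slope is fitted here on all cells by least squares through the origin, which is a simplification of how published tools estimate γ.

```python
def fit_gamma(spliced, unspliced):
    """Least-squares slope through the origin for unspliced ~ gamma * spliced."""
    numerator = sum(s * u for s, u in zip(spliced, unspliced))
    denominator = sum(s * s for s in spliced)
    if denominator == 0:
        raise ValueError("spliced counts are all zero; gamma is undefined")
    return numerator / denominator


def velocities(spliced, unspliced, gamma):
    """Residual from the steady-state line: > 0 suggests induction, < 0 repression."""
    return [u - gamma * s for s, u in zip(spliced, unspliced)]


# Hypothetical counts for one gene across four cells (illustration only).
spliced = [1.0, 2.0, 4.0, 4.0]
unspliced = [0.5, 1.0, 3.0, 1.0]

gamma = fit_gamma(spliced, unspliced)
gene_velocity = velocities(spliced, unspliced, gamma)
```

In this toy example, the first two cells sit on the steady-state line, the third has excess unspliced RNA (consistent with upregulation) and the fourth has a deficit (consistent with downregulation). As with any such inference, the result is a prediction that depends on the fitted parameter.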
Fig. 2. Flow chart of single-cell studies from sample preparation to validation. (A) Craniofacial tissues can be dissected, and specific cell types can be enriched by fluorescence-activated cell sorting (FACS) based on transgenes or cell surface markers. Cell barcoding can be performed by sequential pipetting into multi-well plates with unique barcodes or by emulsion with barcode-containing lipid droplets. (B) Diagram of a transcriptional locus showing sources of RNA for scRNAseq and accessible chromatin for snATACseq. (C) Bioinformatics packages can be used to generate cell clusters and marker genes, to infer potential cell lineage trajectories, and to identify potential master regulatory genes and potential enhancers. (D) In vivo validation includes mRNA in situ hybridization, lineage tracing, loss- and gain-of-function studies, and transgenic testing of enhancers.
While these approaches are powerful, they are also prone to misinterpretation due to batch effects (Büttner et al., 2019; De Rop et al., 2023), over- or under-clustering of data (Xia and Yanai, 2019), limitations of lower dimensional space visualization tools such as tSNE and UMAP (Chari and Pachter, 2023), and arbitrary parameter selection in trajectory analyses (Gorin et al., 2022; Weinreb et al., 2018). In all cases, these approaches simply make predictions about the underlying biology that must be carefully validated in vivo (Fig. 2D).
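The sensitivity of cluster number to parameter choice can be shown with a deliberately simple sketch. This is not the graph-based clustering used by single-cell packages; it is a one-dimensional toy in which cells are grouped whenever neighboring values differ by no more than a chosen gap. The function name, the values and the thresholds are all hypothetical.

```python
def threshold_clusters(values, max_gap):
    """Group sorted values, starting a new cluster whenever the gap exceeds max_gap."""
    if not values:
        raise ValueError("no values to cluster")
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] > max_gap:
            clusters.append([value])      # gap too large: open a new cluster
        else:
            clusters[-1].append(value)    # close enough: extend the current cluster
    return clusters


# Hypothetical expression scores for eight cells (illustration only).
scores = [1, 2, 9, 11, 18, 20, 50, 53]

# The same data yield different numbers of "cell types" at different thresholds.
cluster_counts = {gap: len(threshold_clusters(scores, gap)) for gap in (1.5, 5, 10, 40)}
```

Here the same eight cells resolve into seven, four, two or one cluster depending only on the threshold, which is why cluster identities are best treated as hypotheses to be tested against marker expression in vivo.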
Two major approaches have been taken to isolate craniofacial tissues for single-cell profiling (Fig. 2). The first involves dissection of particular structures, such as the maxillary and mandibular prominences, or the teeth, sutures, glands and palate during later development (Fig. 2A). This dissection approach has been applied largely to mouse and, to a limited extent, human embryos (Yankee et al., 2023). The second approach involves cell sorting to enrich specific cell populations or lineages. Cell types can be purified based on active expression of reporter transgenes or surface proteins. Alternatively, cells of a particular lineage can be labeled by permanent genetic recombination earlier in development. One of the most common approaches in mouse is the Cre-Lox system (Murray, 2011), where tissue-specific Cre recombinase excises a LoxP-flanked stop cassette to allow expression of a reporter gene from a ubiquitous promoter in all derivatives of the Cre-expressing cells. Combined approaches can also be taken, such as isolating CNCC-lineage cells labeled by the Cre-Lox system specifically from the dissected jaw. Marker expression can also be used to extract craniofacial cell types from larger single-cell datasets, such as those from the Human Cell Atlas Project (Caetano et al., 2022). Despite the high prevalence of craniofacial anomalies, craniofacial cell types are not a major focus of current cell atlas projects and can be rare in whole-animal datasets, necessitating more-focused single-cell studies (Table 2).
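Marker-based extraction of a cell population from a larger dataset can be sketched as a simple filter. The example below assumes a hypothetical table of raw counts per cell and uses Twist1 and Prrx1 as mesenchymal markers and Epcam as an epithelial marker; the counts, the threshold and the function name are illustrative assumptions. In practice such selection is usually applied to normalized data and at the level of clusters rather than individual cells.

```python
def select_cells(cells, required, excluded=(), min_count=1):
    """Return IDs of cells expressing every required marker and no excluded marker."""
    selected = []
    for cell_id, counts in cells.items():
        has_required = all(counts.get(gene, 0) >= min_count for gene in required)
        has_excluded = any(counts.get(gene, 0) >= min_count for gene in excluded)
        if has_required and not has_excluded:
            selected.append(cell_id)
    return selected


# Hypothetical count table (illustration only).
cells = {
    "cell_1": {"Twist1": 5, "Prrx1": 3},
    "cell_2": {"Twist1": 4, "Prrx1": 2, "Epcam": 6},  # mixed profile, e.g. a possible doublet
    "cell_3": {"Epcam": 8},
    "cell_4": {"Twist1": 1},
}

mesenchyme = select_cells(cells, required=("Twist1", "Prrx1"), excluded=("Epcam",))
```

Only the first cell passes both criteria. Requiring several positive markers and excluding markers of other lineages reduces contamination, at the cost of discarding cells with sparse counts.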
Cranial neural crest cells
Since the discovery that CNCCs generate cartilage and teeth in the skull (Platt, 1893), researchers have used various methods to catalog the diversity of CNCC-derived cell types (see reviews by Fabian and Crump, 2022; Tang and Bronner, 2020). Methods include dye labeling in amphibia (Collazo et al., 1993), avian chimeras (Le Lievre and Le Douarin, 1975) and genetic recombination-based tracing, primarily Wnt1-Cre in mouse (Jiang et al., 2002) and Sox10:Cre in zebrafish (Kague et al., 2012). A unique feature of CNCCs, as opposed to trunk neural crest, is their ability to form not only neuroglial and pigment cell types, but also the ectomesenchyme that contributes to bone, cartilage, teeth and other connective tissues (Le Lievre and Le Douarin, 1975). How CNCCs generate such diverse cell types with spatiotemporal precision remains a primary focus of the field.
In chick, scRNAseq, snATACseq and bulk epigenomic profiling data suggest that CNCCs are biased towards ectomesenchyme versus neuroglial lineages at pre-migratory stages (Williams et al., 2019). However, a separate study concluded that heterogeneity in migratory chick CNCCs corresponds more to leader versus follower migratory behavior, rather than to lineage potential (Morrison et al., 2017). In mouse, scRNAseq of pre-migratory and migratory CNCCs at embryonic day (E) 8.5 and E9.5 identified migratory CNCCs with shared mesenchymal and neuroglial programs, which resolved into distinct ectomesenchymal and neuroglial programs through cross-repression (Soldatov et al., 2019). In zebrafish, photoconversion of a sox10:nlsEOS line was used to label and sort mandibular CNCCs for scRNAseq analysis (Tatarakis et al., 2021). As in mouse, this study revealed that the earliest lineage decisions of zebrafish CNCCs into ectomesenchyme, pigment and neuroglial fates occur at late migration stages. These latter studies agree with genetic evidence that ectomesenchyme identity is established during CNCC migration, coincident with downregulation of early CNCC specifiers sox10 and foxd3, and upregulation of ectomesenchyme markers twist1a, dlx2a and fli1a (Bildsoe et al., 2009; Blentic et al., 2008; Das and Crump, 2012).
Upon migration into the pharyngeal arches, CNCC ectomesenchyme acquires positional information. Anterior-posterior identity is retained from CNCC origins near the nascent midbrain and hindbrain (Hunt et al., 1991), with the hyoid and more-posterior branchial arches, but not the mandibular arch and frontonasal prominence, expressing anterior Hox genes (Miller et al., 2004). Signaling from the arch epithelia regionalizes CNCCs along the dorsoventral (i.e. proximodistal), mediolateral and oral-aboral axes (reviewed by Minoux and Rijli, 2010), as exemplified by epithelial-derived endothelin 1 induction of nested Dlx gene expression in CNCCs along the dorsoventral axis (Beverdam et al., 2002; Depew et al., 2002; Talbot et al., 2010). In zebrafish, bulk transcriptomic analyses comparing fli1a:GFP+; sox10:dsRed+ CNCC ectomesenchyme with CNCCs from dlx5a:GFP+ or hand2:GFP+ dorsoventral subdomains identified cohorts of region-specific CNCC genes (Askary et al., 2017). Subsequently, scRNAseq analysis of fli1a:GFP+; sox10:dsRed+ arch CNCCs showed that, before differentiation, CNCCs are largely distinguished by regional identity along the anteroposterior axis (Mitchell et al., 2021). Additionally, integration of scRNAseq and snATACseq data of zebrafish post-migratory CNCCs, labeled by Sox10:Cre, revealed clustering to be driven by anteroposterior, dorsoventral and oral-aboral identity (Fabian et al., 2022). The strong effect of regional identity on clustering allowed researchers to create a virtual three-dimensional map of pharyngeal gene expression, which predicted with ∼90% accuracy the precise arch expression domains of CNCC genes. For example, this analysis identified a previously unreported role for the nuclear receptor Nr5a2 in promoting tendon, ligament and glandular fates in the mandibular arches of zebrafish and mouse, with multiome analysis of fish nr5a2 mutants identifying a number of directly regulated jaw-specific enhancers (Chen et al., 2023).
In mouse, scRNAseq of dissected facial domains allowed profiling of the maxillary and mandibular prominences, and hyoid arch (Pushel et al., 2021 preprint; Sun et al., 2023; Xu et al., 2019; Yuan et al., 2020). Although dissected tissue contains cell types of each germ layer, CNCCs often make the largest contribution and can be subclustered based on ectomesenchyme gene expression. As in zebrafish, scRNAseq analysis of the mouse mandibular process at E10.5 revealed CNCC clustering driven by proximodistal and oral-aboral patterning (Pushel et al., 2021 preprint; Xu et al., 2019; Yuan et al., 2020). Analysis of co-regulated genes suggested that Hedgehog and Bmp signaling promote oral and aboral development, respectively, which was validated through genetic manipulation of both pathways in mouse embryos (Xu et al., 2019). scRNAseq of the mandibular prominences from E10.5-E14.5 revealed progressive fate restriction of CNCCs into fibroblastic and progenitor fates, with progenitors further bifurcating along bone and tooth lineages (Yuan et al., 2020). During the same period in the maxillary prominences, CNCCs bifurcate into palatal and tooth lineages (Sun et al., 2023). In chick, scRNAseq of each of the first four branchial arches revealed both a core migratory signature and arch-specific gene expression (Morrison et al., 2021). In human embryos, scRNAseq of the facial prominences at Carnegie stage 20 (Yankee et al., 2023), along with chromatin accessibility and histone mark data from multiple stages (Wilderman et al., 2018), showed strong correlation of cell types between human and mouse, as well as potential human-specific gene expression. Genes expressed in the developing human face were also enriched for known disease-causing mutations. In addition to revealing common signatures of arch regionalization across vertebrates, these studies reveal candidate genes and regulatory sequences that may be sites of damaging mutations in craniofacial anomalies.
Although comprehensive single-cell analyses of CNCC derivatives in adult mice have not been reported, scRNAseq and snATACseq have been used to profile zebrafish Sox10:Cre-labeled CNCC derivatives from pharyngeal arch to adult stages (Fabian et al., 2022). In addition to musculoskeletal tissue types, this study revealed CNCC-derived gill cell types unique to fish, including a specialized endothelial-like pillar cell for gas exchange (Mongera et al., 2013). Gills consist of several primary and secondary filaments that support respiration, with each filament arising clonally from a single CNCC (Stolper et al., 2019). Pseudotime analysis using Monocle 3 revealed a putative fgf10b+ CNCC gill progenitor, which was confirmed as a progenitor for multiple gill cell types by photoconversion-based lineage tracing (Fabian et al., 2022). In contrast, pseudotime analyses of skeletal and connective tissue lineages generated poorly resolved trajectories, consistent with other studies (Bandyopadhyay et al., 2008; Colnot et al., 2004; Matsushita et al., 2020). One possibility is that skeletal and connective tissue progenitors exist in a continuum of states, with plasticity in response to homeostatic and regenerative needs. Furthermore, in vivo validation will be needed to resolve mesenchymal lineage relationships, likely aided by emerging barcoding and in situ readout technologies such as MEMOIR (Frieda et al., 2017).
Facial mesoderm
In addition to generating muscles and vasculature, the facial mesoderm generates many of the same connective tissue subtypes as CNCCs, including osteoblasts, chondrocytes, tendons, ligaments and fibroblasts (Grimaldi and Tajbakhsh, 2021). CNCCs and mesoderm both contribute in a seamless fashion to the skull bones (Koyabu et al., 2012; Teng et al., 2019), stapes cartilage of mouse (Thompson et al., 2012), otic cartilage of fish (Kague et al., 2012) and gill arch cartilage of rays (Sleight and Gillis, 2020). Despite these similar contributions, the differentiation of mesoderm into head connective tissues remains less understood than for CNCCs. To address this, scRNAseq of mouse Mesp1-Cre-labeled facial mesoderm and Myf5-Cre-labeled mesoderm was recently performed at E14.5 (Grimaldi et al., 2022). By analyzing cohorts of co-expressed genes using SCENIC (Aibar et al., 2017), potential drivers of the connective tissue lineage were identified, including genes shared with CNCC ectomesenchyme, such as Twist1 (Bildsoe et al., 2009; Das and Crump, 2012), Prrx1 (Barske et al., 2016; ten Berge et al., 1998), Pdgfra (Weston et al., 2004), Foxp2 (Chen et al., 2023), Fli1 (Das and Crump, 2012) and Six2 (Liu et al., 2019). Inhibition of the myogenic program by Myf5 loss resulted in increased contributions of facial mesoderm to connective tissue fates, and pseudotime analysis implicated Foxp2 as a potential regulator of connective tissue commitment (Grimaldi et al., 2022). In addition, scRNAseq of Mesp1-Cre- and Tbx1-Cre-labeled mesoderm, combined with bulk ATACseq and Tbx1 ChIP-seq, revealed a central role for Tbx1 in opening and activating enhancers for cardiac and branchial muscle fates in a common progenitor (Nomaru et al., 2021). In zebrafish, scRNAseq analysis of gata5:GFP+ mesoderm indicated bifurcations between cardiac and facial connective tissue fates, with loss of Gata5/6 function skewing lineages toward facial connective tissue fates (Song et al., 2022).
These findings highlight similarities between commitment of mesoderm into myogenic versus connective tissue lineages and of CNCCs into neuroglial versus ectomesenchyme connective tissue lineages, potentially reflecting convergence of mesoderm and CNCC lineage cells onto a mesenchyme subtype with similar connective tissue potential.
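The idea of grouping genes by co-expression with a transcription factor can be sketched with a plain Pearson correlation. This is a simplified stand-in and not the SCENIC method, which relies on tree-based regression and motif enrichment; the gene names other than Foxp2, the expression values, the correlation cutoff and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant gene carries no co-expression information
    return cov / sqrt(var_x * var_y)


def coexpressed_genes(tf, expression, min_r=0.8):
    """Genes whose expression correlates with the transcription factor at or above min_r."""
    tf_values = expression[tf]
    return sorted(
        gene for gene, values in expression.items()
        if gene != tf and pearson(tf_values, values) >= min_r
    )


# Hypothetical expression across four cells (illustration only).
expression = {
    "Foxp2": [1, 2, 3, 4],
    "gene_A": [2, 4, 6, 8],   # rises with Foxp2
    "gene_B": [4, 3, 2, 1],   # falls as Foxp2 rises
    "gene_C": [1, 3, 1, 3],   # weakly related
}

candidate_targets = coexpressed_genes("Foxp2", expression)
```

Correlation alone cannot distinguish direct targets from genes that simply share a lineage, which is one reason why predicted regulators such as these require functional testing.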
Pharyngeal endoderm
The pharyngeal endoderm forms a series of pouches that help pattern the craniofacial skeleton and contribute to several endocrine organs (Choe et al., 2013; Graham et al., 2005). In mouse, sorting of Pax9-Venus/EpCam+ cells from dissected pharynxes allowed the authors to catalog gene expression (by scRNAseq) and chromatin accessibility (by snATACseq) in pharyngeal endoderm cells from E9.5-E12.5 (Magaletta et al., 2022). Cell clustering on the integrated datasets revealed molecular differences between the first and second endodermal pouch and captured the emergence of the thyroid, parathyroid, ultimobranchial body and thymus. In addition to identifying previously unreported gene expression in these developing glands, single-cell analysis revealed the earliest markers of distinct medullary and cortical regions of the thymic epithelium. In silico perturbations of gene function by CellOracle were used to predict a key role for Foxn1 in the medullary thymic lineage, which has since been functionally validated in mice through knockout studies (Kadouri et al., 2022; Li et al., 2023).
In zebrafish, scRNAseq analysis of the adult pituitary, in which endodermal lineage cells were marked by sox17:CreER-mediated recombination, revealed an unexpected contribution of endodermal cells to the pituitary (Fabian et al., 2020). The endocrine component of the pituitary has classically been thought to derive from an ectodermal placode, the adenohypophysis, yet scRNAseq analysis revealed that endodermal cells contributed to all endocrine cell types of the pituitary, as validated by time-lapse imaging, analysis of multiple endoderm-labeling transgenes and mRNA in situ hybridization. As nonvertebrate chordates develop an endoderm-derived pituitary-like structure (Schlosser, 2017), the presence of endodermal lineage cells in the zebrafish pituitary may help clarify the origin and evolution of the vertebrate pituitary gland.
Facial ectoderm
As with pharyngeal endoderm, the facial ectoderm has an important signaling role in craniofacial patterning and contributes to diverse organs, including teeth, glands and taste buds. Epithelial cells, mostly of ectoderm origin, have been recovered in single-cell sequencing studies of the dissected murine maxillary prominence (Sun et al., 2023), mandibular prominence (Xu et al., 2019; Yuan et al., 2020) and hyoid arch (Pushel et al., 2021 preprint). Even after sorting based on CNCC-specific transgenes in zebrafish, epithelial cells are recovered, likely owing to the imperfect nature of the sorting process (Fabian et al., 2022; Tatarakis et al., 2021). In most of these studies, ectoderm lineage cells represented a small fraction of sequenced cells and were not analyzed in depth.
Ectoderm lineage cells of the frontonasal and maxillary prominences meet and fuse at the lambdoid junction and nasolacrimal groove to ensure proper palate and nasal development (Millicovsky et al., 1982). After dissection of these transient embryonic structures from E11.5 mice, scRNAseq of ectoderm lineage cells (marked by an ectoderm-specific Crect-Cre transgene) revealed that ectodermal derivatives include distinct basal epithelial cells within and outside the fusion sites, as well as dental epithelium, periderm and other cell types (Li et al., 2019). In a separate study, epithelial cells from the E12 mouse mandible were dissociated from the mesenchyme, sorted based on epithelial labeling by K14-Cre-mediated recombination and subjected to scRNAseq (Ye et al., 2022). As with CNCC mesenchyme, clustering of ectodermal epithelial cells was dictated largely by anatomical positions, including dental epithelia, adjacent posterior/oral and anterior/aboral epithelia, tongue epithelia with taste buds and the superficial periderm layer. Similar results were found by performing multiomics on mandibular and maxillary epithelia from E10.5 and E12.5 mouse embryos (Shao et al., 2022 preprint). In concordance with scRNAseq of the E12 mandible (Ye et al., 2022), the posterior/oral epithelium was distinguished by expression of Sox2, and Shh and Bmp signaling components, the anterior/aboral epithelium was distinguished by expression of Tfap2a, Tfap2b and Wnt signaling components and, at their intersection, the dental epithelium was distinguished by Pitx2 expression. Distinct transcription factor and signaling pathway expression in these domains was independently validated by bulk RNA sequencing after laser microdissection of domains from the E11.5 mandible (Shao et al., 2022 preprint), with both studies using in situ validation at multiple embryonic stages to show progressive establishment of distinct dental and non-dental epithelial domains. 
Interestingly, expression of some signaling molecules, such as Wnt10b, was detected in bulk RNAseq and scRNAseq, but not in multiome sequencing, highlighting recently reported tradeoffs in recovering lower abundance transcripts when combining sequencing modalities in single experiments (Booeshaghi et al., 2023 preprint). As discussed below, these studies provide insights into the earliest patterning events of the facial ectoderm that prefigure development of the palate, teeth, glands and other epithelial structures.
Single-cell studies of craniofacial organs
Palate
The primary palate forms from the anterior-most nasal processes, with the secondary palate zippering closed to separate the nasal and oral cavities. Development of the secondary palate involves medial growth of the maxillary processes, which elevate above the tongue to fuse with the lateral and medial nasal processes (Fig. 1C) (Burdi and Faist, 1967). Formation of the secondary palate requires involvement of multiple germ layers and signaling pathways (Compagnucci et al., 2021; Hammond and Dixon, 2022). scRNAseq of the dissected secondary palate in E13.5 mice revealed regionalization of the mesenchyme into Msx1+ proximal and Shox2+ oral domains (Ozekin et al., 2023), consistent with earlier expression studies (Han et al., 2009; Yu et al., 2005).
Although the hard palate consists of CNCC-derived bone, the soft palate consists of muscularized tissue that opens and closes the nasopharyngeal and oral cavities. scRNAseq of soft palate development in E13.5, E14.5 and E15.5 mice revealed a population of CNCC-derived perimysial cells that, through in situ validation, were confirmed to surround the developing palatal muscles (Han et al., 2021). A role in perimysial cell specification was identified for Runx2 (Han et al., 2021), which is well characterized in osteoblast formation (Ducy et al., 1997). CellChat analysis also indicated a potential role for transforming growth factor β (TGFβ) signaling in perimysial cell specification, which was confirmed by Osr2-Cre-mediated deletion of Tgfbr1 or its target gene Fgf18 (Feng et al., 2022). These studies show the power of single-cell studies in identifying new and unexpected roles of genes and signaling pathways in CNCC development. Integrating single-cell and GWAS datasets can also point to cell populations and genetic pathways disrupted in cleft lip and palate, such as the Irf6-Tfap2a-Grhl3 gene circuit in the periderm (Siewert et al., 2023), the outermost layer of the embryonic epithelium that serves to prevent inappropriate fusions of the palatal shelves during development (Richardson et al., 2014).
Teeth
Tooth development involves crosstalk between oral epithelia and CNCC mesenchyme, which induces differentiation of epithelial cells into enamel-producing ameloblasts and CNCCs into dentin-producing odontoblasts, pulp, periodontal ligament, cementum and alveolar bone (Fig. 1C) (Balic and Thesleff, 2015). Our understanding of the mechanisms behind the shaping and complex differentiation of teeth remains incomplete. scRNAseq of the E12 mouse mandibular epithelium, coupled with mRNA in situ hybridization, revealed previously unreported markers of the initiation knot, which is the signaling center for tooth initiation, including Gad1, Sp5 and Proser2 (Ye et al., 2022). One surprising finding was that Ntrk2, which helps transduce neurotrophic signaling in the nervous system, also plays a role in the proliferative growth and morphogenesis of incisors. Single-cell analysis also revealed the striking gene expression similarities of tooth germs and taste buds (Ye et al., 2022), consistent with experimental evidence in cichlid fish and mouse that taste buds can be partially converted to a tooth-like program by modulating Bmp signaling (Bloomquist et al., 2019).
Several groups have studied the mouse molar to understand how the diverse cell types derived from the CNCC mesenchyme arise. Jing et al. performed scRNAseq of the mouse molar from E13.5 to postnatal day (P) 7.5 (Jing et al., 2022). By E14.5, mesenchyme can be separated into the dental papilla, which generates the odontoblasts and pulp, and the dental follicle, which generates the periodontal ligament, cementum and alveolar bone. Furthermore, CellChat predicted roles for insulin-like growth factor (IGF) signaling within the dental follicle domain, which was validated by periodontal ligament defects upon Cre-mediated deletion of the Igf1 ligand or its receptor, Igf1r (Jing et al., 2022). These findings highlight the usefulness of the many publicly available Cre mouse lines to validate single-cell predictions, as well as to dissect cell-cell receptor-ligand interactions driving cell fate decisions.
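The logic of predicting signaling from ligand and receptor expression can be sketched as follows. This is a simplified product-of-means heuristic, not the statistical model implemented in CellChat; the expression values, the cluster labels as used here and the function names are hypothetical, with Igf1 and Igf1r chosen only to mirror the example in the text.

```python
def mean(values):
    """Arithmetic mean, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def interaction_scores(expression, ligand, receptor):
    """Score each sender -> receiver pair as mean ligand (sender) x mean receptor (receiver)."""
    scores = {}
    for sender, sender_genes in expression.items():
        for receiver, receiver_genes in expression.items():
            ligand_level = mean(sender_genes.get(ligand, []))
            receptor_level = mean(receiver_genes.get(receptor, []))
            scores[(sender, receiver)] = ligand_level * receptor_level
    return scores


# Hypothetical per-cell expression values grouped by cluster (illustration only).
expression = {
    "dental follicle": {"Igf1": [2.0, 4.0], "Igf1r": [1.0, 3.0]},
    "dental papilla": {"Igf1": [0.0, 1.0], "Igf1r": [0.0, 0.0]},
}

igf_scores = interaction_scores(expression, ligand="Igf1", receptor="Igf1r")
top_pair = max(igf_scores, key=igf_scores.get)
```

In this toy dataset the highest score is for signaling within the dental follicle cluster. A score of this kind only indicates that ligand and receptor are co-available; as in the study described above, genetic deletion of the ligand or receptor is what tests whether the pathway is required.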
In contrast to the molar, the rodent incisor is continuously growing, thus providing an opportunity to study stem cell regulation in both epithelial and mesenchymal compartments. One of the first global gene expression studies of teeth used microarray analysis in 94 dissected young adult mouse incisors to identify cohorts of co-varying gene modules (Seidel et al., 2017). In situ validation revealed multiple types of stem and progenitor populations in epithelial and mesenchymal compartments. In a subsequent study, scRNAseq of Epcam+ epithelial cells of the incisor revealed broad populations of proliferative progenitors that bifurcate to ameloblast and non-ameloblast lineages. Comparative scRNAseq of incisors during homeostasis and chemical injury-induced regeneration further revealed increased recruitment of Notch1-CreER-labeled stratum intermedium cells, which sit under the inner enamel epithelium, to regenerate ameloblasts (Sharir et al., 2019). This provides a good example of how lineages can be modified in response to injury to promote efficient repair.
Another approach to study the cellular composition of mouse incisors employed a combination of SMART-seq2, which provides high sequencing depth to relatively few cells, and 10x Chromium scRNAseq, which allows lower depth sequencing of many cells (Krivanek et al., 2020). In addition to confirming many of the independently reported progenitors, ameloblast intermediates and non-ameloblasts in the epithelium (Seidel et al., 2017; Sharir et al., 2019), a rare subtype of epithelial cell expressing mechanotransduction and calcium-dependent genes was identified. Whether this cell type functions to regulate incisor growth in response to mechanical forces remains to be corroborated. This same study also revealed different types of immune cells in the adult incisor, with implications for how the immune system may function to prevent caries (Krivanek et al., 2020). Comparative scRNAseq on the non-growing adult mouse molar revealed common signatures of shared tooth cell types, as well as unique cell types and gene expression that may underlie the ability of the incisor to continuously grow (Krivanek et al., 2020). Moreover, recent single-cell studies of developing human teeth, including non-growing incisors, also provide a foundation for comparisons with the continuously growing incisors of mice (Alghadeer et al., 2023).
Sutures
Sutures separate the calvarial bones and coordinate growth of the skull with protection of the underlying brain (Fig. 1B). In craniosynostosis, sutures fuse prematurely, leading to defective skull growth and neurocognitive defects (Twigg and Wilkie, 2015). Several studies have performed single-cell profiling of the sutures to understand the nature of the stem/progenitor and other cells that grow the bones and keep them separate. Holmes et al. profiled the embryonic mouse frontal suture, identifying osteoblasts, several clusters of suture-associated mesenchyme, overlying hypodermis, underlying dura, endothelial cells, pericytes and diverse immune cells (Holmes et al., 2020). In situ RNA hybridization revealed two populations of mid-suture mesenchyme: bone-associated osteoprogenitors and proliferative cells. Profiling of sutures in a mouse Twist1+/− model of Saethre-Chotzen syndrome, a genetic condition in which the coronal sutures are lost, and a Fgfr2S252W/+ model of Apert syndrome, in which multiple sutures are fused, revealed distinct changes in gene expression but no alterations to suture cell type composition (Holmes et al., 2020).
Two scRNAseq studies of the coronal suture identified largely similar cell types to those of the frontal suture (Farmer et al., 2021; Holmes et al., 2021). Both the ectocranial layers overlying the calvarial bones (Merrill et al., 2006; Ting et al., 2009) and the meninges situated between the calvaria and brain (Yu et al., 2021) are known to regulate suture formation and homeostasis. Farmer et al. revealed a diversity of ectocranial layers, including a ligament-like cell population bridging the suture that may have roles in mechanotransduction and/or stabilization of the suture (Farmer et al., 2021). Meanwhile, the dura contains a cartilage-like population that may have roles in suture closure through endochondral ossification. scRNAseq of suture chondrocytes in mice lacking Gnas, the mutation of which in humans is linked to fusion of multiple sutures, revealed their transformation into osteoblasts that closed the suture (Xu et al., 2022).
Comparative single-cell studies have the potential to reveal molecular differences between sutures that make them differentially sensitive to genetic and environmental perturbations. For example, one study identified unique expression of the Hedgehog signaling regulator Hhip in the coronal versus the frontal mid-suture region (Holmes et al., 2021). Lineage tracing with Hhip-CreER showed that Hhip+ mid-suture mesenchyme is not a major source of new osteoblasts in the short term, although there is modest contribution to calvarial osteoblasts after several months (Holmes et al., 2021). This result is similar to lineage tracing observed in mid-suture mesenchyme by Axin2-CreER (Maruyama et al., 2016), and differs from the extensive contribution of cells traced from both the bone fronts and mid-suture mesenchyme by Gli1-CreER (Zhao et al., 2015) or Ctsk-Cre (Debnath et al., 2018). These different findings suggest that new bone addition occurs mainly at the bone fronts, with mid-suture mesenchyme representing a reserve pool of cells that may contribute primarily during repair. As a note of caution, a scRNAseq study of cells from the frontal, coronal and sagittal sutures of mouse recovered two unique populations in the sagittal suture (Menon et al., 2021). However, closer examination suggests that these represent chondrocytes that were included preferentially in sagittal suture dissections. Thus, differences in dissection technique between sutures and between research groups need to be considered when analyzing cell compositional differences between suture stages and types.
A limitation of single-cell profiling is the inability to directly correlate cell clusters to spatial domains within the embryo. To address this, new ‘spatial transcriptomic’ methods have been developed to preserve spatial information by applying barcoding beads across tissue sections on slides. After sequencing, indexing the spatial position of beads allows mapping of global gene expression back to precise positions on the sectioned tissue (Ståhl et al., 2016). Such technology was applied to the sagittal sutures of newborn mice and gradients of gene expression from the mid-suture region to the flanking calvarial bones were analyzed (Tower et al., 2021). When cranial nerves innervating the suture were ablated, spatial transcriptomics revealed redistribution of Bmp and TGFβ pathway genes, which correlated with increased Bmp and TGFβ signaling in the mid-suture region and partial fusion of the suture. This analysis highlights the power of spatial genomics in revealing how changes in gene expression and cell distribution correlate with developmental defects of craniofacial structures.
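The barcode-to-position mapping described above can be sketched in a few lines. This is a minimal illustration, assuming each sequenced read has already been reduced to a (spatial barcode, gene) pair and that a lookup table from barcode to slide coordinates is available. The barcodes, coordinates and gene names are hypothetical placeholders, and real pipelines additionally handle alignment, barcode error correction and molecule deduplication.

```python
from collections import Counter, defaultdict


def map_reads_to_positions(barcode_positions, reads):
    """Aggregate (barcode, gene) reads into per-position gene counts.

    barcode_positions: dict mapping a spatial barcode to an (x, y) coordinate.
    reads: iterable of (barcode, gene) pairs.
    Reads whose barcode is not in the lookup table are skipped.
    """
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        position = barcode_positions.get(barcode)
        if position is None:
            continue  # unknown barcode, cannot be placed on the section
        counts[position][gene] += 1
    return dict(counts)


def expression_profile(counts, gene, axis=0):
    """Sum counts of `gene` along one axis, e.g. across the width of a suture."""
    profile = Counter()
    for position, gene_counts in counts.items():
        profile[position[axis]] += gene_counts[gene]
    return dict(sorted(profile.items()))


# Hypothetical toy input: three beads in a row and five reads.
toy_positions = {"AAAC": (0, 0), "AAAG": (1, 0), "AAAT": (2, 0)}
toy_reads = [
    ("AAAC", "GeneA"),
    ("AAAC", "GeneA"),
    ("AAAG", "GeneA"),
    ("AAAT", "GeneB"),
    ("TTTT", "GeneA"),  # barcode absent from the lookup table
]

spatial_counts = map_reads_to_positions(toy_positions, toy_reads)
gene_a_profile = expression_profile(spatial_counts, "GeneA", axis=0)
```

A one-dimensional profile of this kind is one simple way to express a gradient of expression from a mid-suture position towards the flanking bone, which can then be compared between control and perturbed tissue.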
Glands
The facial glands lubricate and protect the oral cavity and ocular surface. The submandibular and sublingual salivary glands produce mucus-rich secretions, whereas the parotid salivary gland produces watery secretions rich in digestive enzymes and the lacrimal gland secretes tears (de la Cuadra-Blanco et al., 2003; Martinez-Madrigal and Micheau, 1989). Major epithelial cell types include secretory acinar cells; intercalated, striated and excretory ductal cells that modify and conduct the secretions to the oral cavity or eye; and myoepithelial cells that promote secretion through contraction (Lombaert et al., 2013). Development of the glands shares properties with that of teeth, in particular interactions of CNCC mesenchyme with the overlying ectodermal epithelium to promote budding, branching and differentiation (Fig. 1C) (Grobstein, 1953; Jaskoll et al., 2002; Johnston et al., 1979; Rothova et al., 2012).
Single-cell RNA profiling has been performed on several glandular structures to characterize their development (Table 2). For example, Sekiguchi et al. compared the murine submandibular and parotid glands at E12 and found greater inter-gland differences in the mesenchymal than epithelial populations (Sekiguchi et al., 2020). The submandibular gland mesenchyme expressed a set of genes distinct from that of parotid gland mesenchyme, including Nr5a2, which is essential for initiation and formation of the submandibular but not the parotid gland (Chen et al., 2023). Reciprocally, parotid gland mesenchyme displayed selective expression of the transcription factor Pou3f3 and associated long non-coding RNAs Pantr1 and Pantr2, potentially reflecting the role of Pou3f3 in development of the proximal arches where the parotid gland forms (Barske et al., 2020; Jeong et al., 2008). These mesenchymal expression differences may help establish differences in mucous versus aqueous secretions of these glands, with increased numbers of myoepithelial cells recovered in submandibular gland epithelia, reflecting the need for greater contractile forces in secretion of more viscous mucus (Tucker, 2007).
scRNAseq datasets for the mouse submandibular gland have been generated for bud (E12 and E14), branching (E16), postnatal (P1 and P30) and adult (10 month) stages, with a focus on epithelial cells (Hauser et al., 2020). Trajectory analysis using the graph abstraction algorithm PAGA (Wolf et al., 2019) suggests a late embryonic Krt19+ ductal population as the common progenitor for postnatal acinar, intercalated duct and striated duct cells. By performing scRNAseq of the P8 mouse submandibular gland, Song et al. identified a distinct SMA+ progenitor for basal epithelial and myoepithelial cells, which was confirmed with Acta2(SMA)-CreER-based lineage tracing (Song et al., 2018). Two distinct types of putative acinar progenitors, labeled by Smgc and Bpifa2, have also been identified that may give rise to distinct mucous and serous acinar cells, respectively (Hauser et al., 2020). Sexually dimorphic expression was also noted in adult glands, with Smgc, for example, expressed in female but not male mucous acinar cells. Sexual dimorphism has also been noted in scRNAseq studies of other epithelial organs, such as the kidney (Ransick et al., 2019), highlighting the importance of separately profiling male and female organs, especially at mature stages.
Although model organisms such as the mouse are valuable for uncovering core developmental processes of organs, there may also be human-specific gene expression and other features that are important for understanding disease etiology. As in mouse, scRNAseq of submandibular glands from 12- to 19-week-old human fetuses revealed KRT19+ basal epithelial cells as potential progenitors to all differentiated epithelial cell types, except for excretory ducts and myoepithelial cells (Hauser et al., 2020). Comparative studies also revealed similar cell compositions between human and mouse parotids (Chen et al., 2022), and minor salivary glands (Huang et al., 2021). Although serous acinar cells were similar between human and mouse (Chen et al., 2022), human mucous acinar cells were highly divergent from those in mouse, and no human equivalents of mouse granular convoluted tubular cells were evident (Horeth et al., 2023). These cellular differences may reflect divergent physiological properties of human versus mouse glands that may impact modeling glandular diseases in rodents.
Single-cell analysis has also been used to benchmark the fidelity of salivary gland (Moskwa et al., 2022) and lacrimal gland (Bannier-Helaouet et al., 2021) organoids; for example, showing that current lacrimal gland organoids are largely ductal in nature. Single-cell sequencing of mouse models (Horeth et al., 2021) and human tissues (Nayar et al., 2022 preprint) from individuals with Sjögren's syndrome, an autoimmune disease affecting facial gland function, revealed dynamic changes in immune and stromal cell components of the glands, as well as upregulation of immune regulators in glandular epithelial cells that may contribute to the disease. In addition, single-cell profiling of human minor salivary glands and oral mucosa revealed cell type-specific expression of the viral entry factor ACE2, which may explain increased susceptibility of these glands to COVID-19 infection (Huang et al., 2021).
Conclusions and future perspectives
Single-cell studies of craniofacial development generate massive amounts of data, yet most strategies to validate findings in vivo are low throughput. For RNA expression, new spatial transcriptomic approaches, such as MERFISH (Xia et al., 2019) and seqFISH (Coskun and Cai, 2016), allow the simultaneous detection of thousands of different transcripts in the same cell. When combined with spatial barcoding techniques such as MEMOIR (Askary et al., 2020; Frieda et al., 2017), these techniques could provide holistic lineage and spatial information for cell types, although the need for sectioning limits the ability to generate three-dimensional maps. To complement snATACseq predictions of enhancers, single-cell profiling of chromatin marks, DNA methylation, transcription factor occupancy and chromosome conformation could predict active, poised or repressed enhancers, as well as other types of boundary and regulatory elements. In vivo adaptation of high-throughput reporter assays, such as MPRA-seq (Inoue and Ahituv, 2015) and STARR-seq (Arnold et al., 2013), could also aid in dissecting the enhancer logic of craniofacial development.
Single-cell datasets can also be integrated with GWAS to identify potential causative mutations for craniofacial anomalies (Gawel et al., 2019; Jagadeesh et al., 2022; Jia et al., 2022; Zhang et al., 2022), including cleft palate (Leslie et al., 2016) and craniosynostosis (Justice et al., 2020). Given developmental gene conservation across vertebrates, coding mutations can often be modeled in organisms such as mouse and zebrafish to validate causation and understand developmental etiologies. Enhancers, however, are much less conserved across vertebrates and thus candidate non-coding mutations are more difficult to model. A few deeply conserved enhancers have been identified, such as a Pou3f3 enhancer associated with gill cover development in fishes and closure of embryonic neck cavities in mammals (Barske et al., 2020). For non-conserved enhancers, normal and patient-associated versions can be tested for differential activity in model organisms (Liu et al., 2020). Recently, the chromatin landscape of human embryonic craniofacial tissues has been analyzed, with putative enhancers correlated to craniofacial GWAS hits (Wilderman et al., 2018). A better understanding of craniofacial enhancer logic will be needed to predict which non-coding polymorphisms may be causative.
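One basic step in relating GWAS hits to putative enhancers is a positional overlap between variant coordinates and accessible chromatin intervals annotated by cell type. The sketch below illustrates that step only, assuming non-overlapping, half-open [start, end) intervals per chromosome. The coordinates, variant names and cell type labels are hypothetical, and an overlap does not by itself establish that a variant is causative.

```python
from bisect import bisect_right


def build_index(peaks):
    """Group (chrom, start, end, label) peaks by chromosome, sorted by start.

    Assumption: peaks on the same chromosome do not overlap one another.
    """
    grouped = {}
    for chrom, start, end, label in peaks:
        grouped.setdefault(chrom, []).append((start, end, label))
    index = {}
    for chrom, intervals in grouped.items():
        intervals.sort()
        starts = [start for start, _, _ in intervals]
        index[chrom] = (starts, intervals)
    return index


def annotate_variants(index, variants):
    """Return {variant_name: peak label or None} using binary search."""
    hits = {}
    for name, chrom, pos in variants:
        starts, intervals = index.get(chrom, ([], []))
        i = bisect_right(starts, pos) - 1  # last peak starting at or before pos
        if i >= 0 and intervals[i][0] <= pos < intervals[i][1]:
            hits[name] = intervals[i][2]
        else:
            hits[name] = None
    return hits


# Hypothetical peaks labeled by the cell type in which they are accessible.
toy_peaks = [
    ("chr1", 100, 200, "mesenchyme"),
    ("chr1", 500, 650, "epithelium"),
    ("chr2", 50, 80, "mesenchyme"),
]
# Hypothetical variants given as (name, chromosome, position).
toy_variants = [
    ("var1", "chr1", 150),
    ("var2", "chr1", 200),  # end coordinate is exclusive, so no overlap
    ("var3", "chr1", 649),
    ("var4", "chr2", 10),
    ("var5", "chr3", 5),  # chromosome without any peaks
]

variant_hits = annotate_variants(build_index(toy_peaks), toy_variants)
```

Variants that fall within a cell type-specific accessible region become candidates for the functional tests described above, such as comparing the activity of normal and patient-associated enhancer versions in a model organism.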
Variation in human facial shape and evolution of facial structure are likely driven primarily by changes in gene regulation. Integration of human facial variation GWAS (Shaffer et al., 2016; White et al., 2021) with single-cell craniofacial studies will help predict causative protein-coding and non-coding polymorphisms. To date, single-cell studies of craniofacial development have been largely limited to zebrafish, mouse and human. Given that cell populations can be deconvolved by marker expression in single-cell datasets without the need for transgenic labels, it is now possible to perform single-cell analysis of craniofacial tissues from nearly any vertebrate. Comparative single-cell analyses should help inform how changes in gene expression and enhancer use correlate with evolution of facial form, as well as the extent to which cell types emerge or disappear during evolution. Single-cell approaches will complement exciting ongoing genomics analyses of avian beak shape (Smith et al., 2022), jaw length in Beloniformes fishes (Daane et al., 2021), and facial differences between apes, and archaic and modern humans (Gokhman et al., 2020; Prescott et al., 2015).
Finally, the application of single-cell technology to craniofacial regeneration would help reveal how resident progenitors and lineage plasticity contribute to repair. Single-cell analysis of the injured murine temporomandibular joint disc revealed potential progenitors for regeneration of the fibroblast but not chondrocyte components (Bi et al., 2023). In zebrafish, single-cell sequencing revealed a potential perichondral progenitor involved in regeneration of the jaw joint (Smeeton et al., 2021) and dedifferentiation of ligamentocytes during jaw ligament regeneration (Anderson et al., 2023). It will also be interesting to analyze whether regeneration-specific enhancers (Kang et al., 2016; Wang et al., 2020) function in craniofacial tissues, and whether, as for the heart (Yan et al., 2023), regeneration-specific enhancers can be leveraged to promote repair of mammalian craniofacial tissues. Comparative single-cell approaches of regenerative versus non-regenerative species will ultimately inform how differences in cell types, gene expression and enhancer usage underlie the ability of some vertebrates to robustly regenerate craniofacial tissues.
Footnotes
Funding
The authors’ research is funded by the National Institute of Dental and Craniofacial Research (NIH R35 DE027550) to J.G.C. and a California Institute for Regenerative Medicine EDUC4 Training Grant to K.-C.T. Deposited in PMC for release after 12 months.
References
Competing interests
The authors declare no competing or financial interests.