ABSTRACT
In the nascent mesoderm, TBXT expression must be precisely regulated to ensure that cells exit the primitive streak and pattern the anterior-posterior axis, but how varying dosage informs morphogenesis is not well understood. In this study, we define the transcriptional consequences of TBXT dosage reduction during early human gastrulation using human induced pluripotent stem cell models of gastrulation and mesoderm differentiation. Multi-omic single-nucleus RNA and single-nucleus ATAC sequencing of 2D gastruloids comprising wild-type, TBXT heterozygous or TBXT null human induced pluripotent stem cells reveal that varying TBXT dosage does not compromise the ability of a cell to differentiate into nascent mesoderm, but instead directly influences the temporal progression of the epithelial-to-mesenchymal transition with wild type transitioning first, followed by TBXT heterozygous and then TBXT null. By differentiating cells into nascent mesoderm in a monolayer format, we further illustrate that TBXT dosage directly impacts the persistence of junctional proteins and cell–cell adhesions. These results demonstrate that epithelial-to-mesenchymal transition progression can be decoupled from the acquisition of mesodermal identity in the early gastrula and shed light on the mechanisms underlying human embryogenesis.
INTRODUCTION
In the early embryo, cells must precisely regulate gene expression to ensure the organism progresses through standard hallmarks of development. One key developmental time point in vertebrate embryogenesis is the establishment and morphogenesis of the primitive streak (PS), a transient structure of the posterior embryo that initiates germ layer formation and establishes bilateral symmetry (Mikawa et al., 2004). Cells follow precisely orchestrated migration patterns as they ingress into the streak, undergo an epithelial-to-mesenchymal transition (EMT) to delaminate from the streak, and expand outward as individual mesenchymal cells to form endodermal and mesodermal germ lineages (Schoenwolf and Smith, 2000; Ciruna and Rossant, 2001; Dale et al., 2006). Timing of epithelial cell ingression into the PS informs cell fate, as cells ingressing early during gastrulation give rise to cranial mesoderm (Lawson et al., 1991), whereas subsequently ingressing cells form axial, paraxial and lateral mesoderm of the anterior trunk (Wilson and Beddington, 1996). Posterior trunk mesoderm, including caudal somites, emerges later from a separate progenitor population in the tailbud known as the neuromesodermal progenitors (NMPs) (Tzouanacou et al., 2009; Henrique et al., 2015).
The transcription factor brachyury [TBXT (human); Tbxt, T or Bra (mouse)] has a conserved role in mesoderm differentiation across vertebrates (Technau, 2001) and is widely utilized as one of the first markers of nascent mesoderm. T is initially expressed in the posterior embryo just before the emergence of the PS, and as gastrulation progresses its expression domain expands to include the PS, notochord, and later the tailbud NMP population (Wilkinson et al., 1990; Rivera-Pérez and Magnuson, 2005). Still, how precisely T orchestrates the interplay between germ layer specification and PS morphogenesis is unclear. Human stem cell-based models show that TBXT is required for mesoderm induction (Bernardo et al., 2011; Faial et al., 2015) and studies in Xenopus show that the TBXT homolog (also known as XBra), can induce different mesodermal cell types in a dose-dependent manner (O'Reilly et al., 1995; Faial et al., 2015). In contrast, mice with loss-of-function T mutations appear to generate an early mesoderm population (Yanagisawa et al., 1981; Hashimoto et al., 1987; Rashbass et al., 1991; Wilson et al., 1995). Morphogenesis, however, is dramatically altered in these mutants, as T−/− mesoderm-like cells lose their ability to properly migrate away from the embryonic midline and accumulate in the node, PS, and regions immediately ventral to the streak. This aberrant cellular distribution, coupled with later-onset defects in tailbud mesoderm specification, ultimately causes mutant embryos to not develop a notochord, have significant body axis truncations rostral to somite seven, and die from incomplete allantois development. How T loss drives this phenotype, including if and how its mode of misregulation is conserved in humans, has not been fully defined.
Interestingly, this mutant axial truncation phenotype appears to be correlated to T dose. Mice heterozygous for T show a small but notable accumulation of cells in the same domains that see cell accumulation in T homozygous knockout mice (Wilson et al., 1993; Wilson et al., 1995; Wilson and Beddington, 1997). Heterozygous mice are viable, but later generate short tails and can display notochord and sacral malformations (Dobrovolskaia-Zavadskaia, 1927; Chesley, 1935; Stott et al., 1993). In humans, hypomorphic TBXT expression manifests as partial absences or abnormal fusions of the tailbone, pelvis, or lower vertebrae, and, like in mouse, TBXT loss-of-function mutations are embryonic lethal (Papapetrou et al., 1999; Ghebranious et al., 2008; Postma et al., 2014; Chen et al., 2023). The intermediate axial truncation phenotype seen in both human and mouse when TBXT expression is reduced suggests that the mechanism governing the spatial patterning of mesoderm once it exits the vertebrate PS is dependent on precisely calibrated expression levels of TBXT.
For TBXT+ cells to become motile and exit the PS, they must undergo EMT, including restructuring cell–cell adhesions, the cytoskeleton, and the composition of the extracellular matrix. In early amniote gastrulation, EMT is tightly associated with the acquisition of mesoderm fate, and it has been suggested that EMT initiates mesoderm commitment in human embryonic stem cells (hESCs) (Evseenko et al., 2010). TBXT also promotes EMT in the context of cancer (Fernando et al., 2010; Roselli et al., 2012); however, it has been proposed to be dispensable for EMT during notochord development (Zhu et al., 2016). How TBXT dosage modulates EMT in the context of mesoderm commitment to ensure cells acquire increased motility is therefore not fully defined.
Acquisition of cell fate preceding and throughout the establishment of the primitive streak is primarily controlled by a network consisting of BMP, WNT and Nodal signaling (Arnold and Robertson, 2009). This network can be manipulated in vitro to yield hESC colonies, termed 2D gastruloids, that reproducibly generate concentric rings of epiblast, mesoderm, endoderm, and extra-embryonic cells from the center outwards (Warmflash et al., 2014; Minn et al., 2020; 2021). 2D gastruloids have been used to further understanding of the minimal inputs driving multicellular patterning and migration kinetics throughout the course of gastrulation, including PS morphogenesis (Libby et al., 2019; Martyn et al., 2019 preprint; Joy et al., 2021).
Here, we have adapted this 2D micropatterned gastruloid culture to investigate how TBXT dose-dependently controls mesodermal cell identity, EMT, and subsequent migratory behavior during early human gastrulation. We demonstrate that varying levels of TBXT expression modulate human PS morphogenesis by controlling the timing of EMT, including the persistence of junctional proteins and cell–cell adhesions, without compromising the ability of a cell to differentiate into nascent mesoderm. We conclude that initial mesoderm specification can be decoupled from the temporal progression of EMT during early gastrulation, thus significantly improving our understanding of cell fate acquisition and morphogenesis during early human development.
RESULTS
Generation of hiPSC TBXT allelic series and 2D gastruloids
To investigate the effect of TBXT dose on PS morphogenesis, we engineered a WTC11-LMNB1-GFP-derived human induced pluripotent stem cell (hiPSC) allelic series: TBXT+/+ (wild type, WT), TBXT+/− (TBXT-Het) and TBXT−/− (TBXT-KO), by targeting the first exon of TBXT with CRISPR/Cas9 (Fig. 1A, Fig. S1). The resulting indel created a premature stop codon in one or two alleles, respectively. Conducting a western blot for TBXT in a monolayer of cells exposed to a proprietary mesoderm induction media for 48 h revealed the expected decrease in TBXT expression across the allelic series (Fig. 1B,B′; Materials and Methods). Of note, the TBXT protein level in the TBXT-Het was approximately 75-80% of the WT protein level, suggesting that intrinsic dose compensation mechanisms are active in the TBXT gene regulatory network (Fig. 1B′, Fig. S1C). This result was replicated in an independent heterozygous clone, which expressed TBXT at 91% of the WT level (Fig. S1, Table S1). Knockout efficiency was further confirmed by immunofluorescence (IF), which demonstrated the complete absence of TBXT protein in the TBXT-KO (Fig. 1D).
TBXT dose does not dramatically impact lineage emergence in 2D gastruloids. (A) Schematic of the location within the TBXT locus targeted by sgRNA to generate TBXT-Het and TBXT-KO. (B,B′) Western blot (B) and quantification of fluorescent signal (B′) showing the knockout efficiency of TBXT sgRNA across the allelic series. *P<0.5, ***P<0.01 (unpaired t-test), n=5. β-Actin was used as loading control and data are expressed relative to the WT group. The horizontal line indicates the median value, the box limits represent the 25th and 75th percentiles, and the whiskers extend to the largest value (no further than 1.5×interquartile range). (C) 2D gastruloid differentiation strategy. hiPSCs were seeded onto matrigel-stamped plates and grown to confluency before being exposed to BMP4 for 48 h, yielding 2D gastruloids with concentric rings of epiblast-like cells (blue), mesendoderm-like cells (green) and extraembryonic-like cells (pink). R.I., ROCK inhibitor. (D,E) Immunofluorescence staining of canonical gastrulation markers in gastruloids. (E) Immunofluorescence staining of canonical markers of the BMP→WNT→Nodal signaling network underlying 2D gastruloid differentiation. Green staining indicates protein of interest, and blue staining indicates LMNB1. (F) Quantification of fluorescence intensity of canonical gastrulation markers (TBXT: n=5 WT, 15 TBXT-Het, 10 TBXT-KO; SOX2: n=4 WT, 3 TBXT-Het, 3 TBXT-KO; CDX2: n=5 WT, 15 TBXT-Het, 10 TBXT-KO; LEFTY: n=6 WT, 7 TBXT-Het, 8 TBXT-KO; NODAL: n=8 WT, 3 TBXT-Het, 6 TBXT-KO; pSMAD1/5: n=8 WT, 7 TBXT-Het, 4 TBXT-KO; WNT3a: n=8 WT, 7 TBXT-Het, 4 TBXT-KO). Shaded areas around lines indicate s.e.m. Gray vertical highlights indicate regions of interest based on variability in expression. A.U., arbitrary units.
TBXT dose does not dramatically impact lineage emergence in 2D gastruloids. (A) Schematic of the location within the TBXT locus targeted by sgRNA to generate TBXT-Het and TBXT-KO. (B,B′) Western blot (B) and quantification of fluorescent signal (B′) showing the knockout efficiency of TBXT sgRNA across the allelic series. *P<0.5, ***P<0.01 (unpaired t-test), n=5. β-Actin was used as loading control and data are expressed relative to the WT group. The horizontal line indicates the median value, the box limits represent the 25th and 75th percentiles, and the whiskers extend to the largest value (no further than 1.5×interquartile range). (C) 2D gastruloid differentiation strategy. hiPSCs were seeded onto matrigel-stamped plates and grown to confluency before being exposed to BMP4 for 48 h, yielding 2D gastruloids with concentric rings of epiblast-like cells (blue), mesendoderm-like cells (green) and extraembryonic-like cells (pink). R.I., ROCK inhibitor. (D,E) Immunofluorescence staining of canonical gastrulation markers in gastruloids. (E) Immunofluorescence staining of canonical markers of the BMP→WNT→Nodal signaling network underlying 2D gastruloid differentiation. Green staining indicates protein of interest, and blue staining indicates LMNB1. (F) Quantification of fluorescence intensity of canonical gastrulation markers (TBXT: n=5 WT, 15 TBXT-Het, 10 TBXT-KO; SOX2: n=4 WT, 3 TBXT-Het, 3 TBXT-KO; CDX2: n=5 WT, 15 TBXT-Het, 10 TBXT-KO; LEFTY: n=6 WT, 7 TBXT-Het, 8 TBXT-KO; NODAL: n=8 WT, 3 TBXT-Het, 6 TBXT-KO; pSMAD1/5: n=8 WT, 7 TBXT-Het, 4 TBXT-KO; WNT3a: n=8 WT, 7 TBXT-Het, 4 TBXT-KO). Shaded areas around lines indicate s.e.m. Gray vertical highlights indicate regions of interest based on variability in expression. A.U., arbitrary units.
TBXT dose does not dramatically impact lineage emergence in 2D gastruloids
To begin to dissect how TBXT dose shapes the earliest stages of nascent mesoderm morphogenesis, we subjected the allelic series to 2D gastruloid differentiation, which reproducibly generates concentric rings of radially patterned primary germ layers, primordial germ cell-like cells (PGCLCs), and extra-embryonic-like cells after 48 h of BMP4 exposure (Warmflash et al., 2014; Minn et al., 2020) (Fig. 1C; Materials and Methods). We then conducted IF in colonies of each genotype to identify if and how TBXT expression broadly influences germ layers' patterning, proportion and identity during early human gastrulation.
Using antibodies targeting epiblast/ectoderm (SOX2+), mesoderm (EOMES+/SOX17−), endoderm/PGCLCs (SOX17+), and extra-embryonic cells (CDX2+), we observed a conserved spatial distribution of the germ layers between WT, TBXT-Het and TBXT-KO colonies (Fig. 1D,F, Fig. S2). The consistent expression pattern of these broad germ layer markers suggests that the intrinsic BMP→WNT→NODAL feedback loop is sustained in the absence of TBXT, as predicted in silico (Kaul et al., 2023), and reflects the ability of nascent mesoderm to develop in murine T−/− models during the initial stages of gastrulation (Beddington et al., 1992; Rashbass et al., 1994; Wilson et al., 1995).
To validate the conservation of this signaling network, we conducted IF for pSMAD1/5, WNT3a, NODAL, and the NODAL inhibitor LEFTY. The consistent localization of pSMAD1/5 along the colony periphery revealed that all genotypes maintain BMP4 signaling in this region (Fig. 1E,F). Additionally, NODAL was detected broadly and upregulated just interior toward the nascent mesoderm domain in gastruloids of all genotypes, and LEFTY was seen adjacent to this domain near the outer mesendoderm boundary. WNT3a protein expression was also maintained in the mesendoderm domain across all genotypes, but was slightly decreased in this region of TBXT-KO gastruloids, suggesting the positive feedback loop between TBXT and canonical WNT signaling is disrupted in the TBXT-KO (Fig. 1E,F).
The uniformity in germ layer identity and distribution across genotypes suggests that cell fate specification is not dependent on TBXT dose during the early stages of human PS morphogenesis, and that cell fate specification and induction of morphogenetic movements are likely governed independently during in vitro gastrulation.
Defining a genomic regulatory network underlying TBXT dosage
To gain a more comprehensive understanding of similarities to the malformations observed in T mutants observed in vivo, we sought to precisely define disparities in cell identity and the expression of migration regulators, including drivers of EMT, within gastruloids of each genotype. TBXT directly binds to the regulatory regions of key genes in mesoderm development and can influence chromatin accessibility (Faial et al., 2015; Koch et al., 2017), so we further hypothesized that TBXT expression level may influence chromatin accessibility at genes that are required for mesoderm maturation, and that altered accessibility may precede changes in protein level at the initial stages of mesoderm development (48 h BMP4 exposure). To explore these possibilities, we conducted multi-omic single-nucleus RNA sequencing (snRNA-seq) and single-nucleus assay for transposase-accessible chromatin (snATAC-seq) on gastruloids of each genotype after 48 h of BMP4 treatment.
Our analysis of all three pooled genotypes yielded 11 clusters consisting of three extra-embryonic cell populations: clusters 1 to 3, extra-embryonic late to extra-embryonic early (‘ExE-Late’, ‘ExE-Middle’, ‘ExE-Early’); cluster 4, extra-embryonic progenitor cells (‘EEM Progenitors’); cluster 5, epiblast-like cells (‘Epiblast’); clusters 6-8, three primitive PS-like cell populations (‘PS-Early’, ‘PS-Middle’, ‘PS-Late’); cluster 9, nascent mesoderm-like cells (‘Mesoderm’); cluster 10, nascent endoderm-like cells (‘Endoderm’); cluster 11, primordial germ cell-like cells (‘PGCLC’) (Fig. 2A-D, relationships between clusters at various resolutions illustrated in Figs S3, S7).
snRNA-seq reveals conserved lineage emergence in the absence of TBXT. (A) UMAP derived from snRNA-seq and snATAC-seq data depicting 11 clusters identified through Seurat analysis of snRNA-seq data after 48 h BMP exposure (n=23,838 cells, n=2 replicates). (B) Bar plot depicting the proportion of cells from each sample assigned to each cluster. (C) UMAP feature plots depicting GeneExpressionMatrix plots (derived from snRNA-seq) or GeneScoreMatrix plots for chromatin accessibility loci (derived from snATAC-seq) for key markers of gastrulation lineages. (D) Heatmap of snRNA-seq expression patterns for key markers of gastrulation lineages across clusters. (E) snATAC-seq accessibility tracks across clusters centered around TBXT, SOX2 and POU5F1 and aligned to detected peaks, H3K27ac tracks from published datasets, and Peak2Gene linkage predictions of regulatory connections between distal accessible regions (peaks) and nearby genes. Gray highlights indicate peaks of interest.
snRNA-seq reveals conserved lineage emergence in the absence of TBXT. (A) UMAP derived from snRNA-seq and snATAC-seq data depicting 11 clusters identified through Seurat analysis of snRNA-seq data after 48 h BMP exposure (n=23,838 cells, n=2 replicates). (B) Bar plot depicting the proportion of cells from each sample assigned to each cluster. (C) UMAP feature plots depicting GeneExpressionMatrix plots (derived from snRNA-seq) or GeneScoreMatrix plots for chromatin accessibility loci (derived from snATAC-seq) for key markers of gastrulation lineages. (D) Heatmap of snRNA-seq expression patterns for key markers of gastrulation lineages across clusters. (E) snATAC-seq accessibility tracks across clusters centered around TBXT, SOX2 and POU5F1 and aligned to detected peaks, H3K27ac tracks from published datasets, and Peak2Gene linkage predictions of regulatory connections between distal accessible regions (peaks) and nearby genes. Gray highlights indicate peaks of interest.
Clusters 1-3 revealed a bias toward late-amnion (GABRP, HEY1, HAND1, VTCN1, TPM1, IGFBP3, ANKS1A) compared with trophectoderm (Fig. S4), in agreement with published findings that primed hiPSCs are biased toward a late-amnion fate in the presence of BMP4 (Rostovskaya et al., 2022). These clusters have very similar chromatin accessibility and inferred peaks to both one another relative to all other clusters and to published H3K27ac accessibility data from primary amnion tissue (Bernstein et al., 2010), reinforcing their developmental similarity and amnion-like identity (Figs. 2E, Fig. S5D). These peaks are also enriched for GATA motifs, which are key regulators of extra-embryonic cell fate (Fig. S5E). Analysis of differentially accessible regions (DARs) between clusters revealed a prospective SOX2 enhancer and the POU5F1 promoter that was uniquely accessible in the epiblast, extra-embryonic progenitor, and early PS-like cell types (clusters 4-7) (Fig. 2E). These DARs correspond to H3K27ac chromatin immunoprecipitation with sequencing (ChIP-seq) peaks in pluripotent HUES cells (Tsankov et al., 2015), reflective of these clusters' pluripotency-related expression pattern and subsequent downregulation as differentiation progresses away from pluripotency. Clusters 9 and 10 were enriched for several motifs regulating mesendoderm specification, including GATA factors and FOXA2 (Fig. S5E). Chromatin accessibility near the TBXT promoter was increased in the PS and mesendoderm-like subclusters (clusters 6-10) and, interestingly, there appeared to be a potential TBXT regulatory domain uniquely accessible in extra-embryonic cell types (Fig. 2E). This peak was not linked to TBXT gene expression by Peak2Gene analysis, but it did correlate with the H3K27ac ChIP-seq profile of amnion tissue (Bernstein et al., 2010), suggesting that TBXT expression may be uniquely regulated in extra-embryonic tissue relative to embryonic, a possibility that warrants further study. In agreement with IF data, we found that there was not a significant difference in the proportion of cells from each genotype assigned to each cluster, supporting the notion that cell identity during early gastrulation was not significantly affected by the loss of TBXT.
TBXT influences downstream gene expression in a dose-dependent manner
To understand why the loss of T in vivo most dramatically affects the morphogenesis of the mesoderm population, we assessed the nuanced differences that exist between the mesoderm populations of each genotype and how these may contribute to abnormal migration patterns observed later in development. We first compared differentially expressed genes (DEGs) in TBXT-Het versus WT or TBXT-KO versus WT within the Mesoderm cluster (Fig. 3A,A′). This analysis identified 14 genes downregulated in the TBXT-Het compared with WT and 44 genes downregulated in the TBXT-KO compared with WT (Fig. 3B-D, Fig. S6A,B) (adj. P<0.05, log2FC<−0.25). Two general classes of DE genes emerged: genes that required a relatively binary threshold level of TBXT expression to fully activate or repress downstream expression, and genes expression levels of which scaled with TBXT expression levels. Nine genes were statistically significantly downregulated in both the TBXT-Het and TBXT-KO compared with WT, putting them in the former category, including the non-canonical WNT pathway component WNT5A and BMP inhibitor BMPER (Fig. 3C,D, asterisks). In contrast, several genes, including WDPCP and TCF4, were significantly downregulated in the TBXT-KO compared with WT, but had intermediate expression in the TBXT-Het (Fig. 3D, no asterisk). Gene Ontology (GO) biological pathway enrichment for the 44 genes significantly downregulated in the TBXT-KO versus WT, the majority of which also had reduced expression in the TBXT-Het, identified ‘regulation of cell motility’, ‘locomotion’ and ‘PCP pathway involved in axis elongation’, reflecting the attenuated migratory phenotype and increased adhesiveness seen when TBXT expression is reduced in vivo (Wilson and Beddington, 1997; Wilson et al., 1995) (Fig. 3E, Table S2). As anticipated, in situ hybridization confirmed decreased expression of the mesoderm marker MESP1 and the WNT regulators WNT5A and RSPO3 in TBXT-KO gastruloids relative to WT (Fig. 3F).
TBXT dose influences downstream gene expression in a dose-dependent manner. (A,A′) UMAP highlighting the Mesoderm cluster (Cluster 9) (A) and violin plot showing the relative expression levels of TBXT in each cluster (A′). (B,B′) Volcano plot of DEGs between WT and TBXT-Het (B) or WT and TBXT-KO (B′) after subsetting the Mesoderm cluster (n=3212 cells, adj. P<0.05, log2FC>0.25). (C) Venn diagrams of the number of DEGs identified in the TBXT-Het (green), TBXT-KO (blue), or both TBXT-Het and TBXT-KO (purple) compared with WT. The top diagram includes genes downregulated and the bottom diagram includes genes upregulated in TBXT-Het and/or TBXT-KO compared with WT. (D) Dotplot of 89 DEGs between WT and TBXT-KO in order of log2FC. Asterisks indicate genes that are also differentially expressed between WT to TBXT-Het, and the bolded black text indicates genes identified as likely direct targets of TBXT by comparison with ChIP-seq datasets. DDX58, RIGI. (E) Key results from ShinyGO GO Biological Process Analysis (FDR<0.05, Pathway Size 2-2000 genes) (F) Multiplexed fluorescence in situ hybridization in gastruloids at 48 h. In situ hybridizations were repeated for ≥3 gastruloids of each genotype. (G-I) Dotplots showing gene expression trends for key genes related to mesoderm identity (G), WNT signaling (H) and EMT (I). Bold genes are significantly differentially expressed (adj. P<0.05, log2FC>0.25). (J) Venn diagrams summarizing DEGs detected in corresponding ChIP-seq datasets. lnc, long non-coding RNA, mito, mitochondrial gene.
TBXT dose influences downstream gene expression in a dose-dependent manner. (A,A′) UMAP highlighting the Mesoderm cluster (Cluster 9) (A) and violin plot showing the relative expression levels of TBXT in each cluster (A′). (B,B′) Volcano plot of DEGs between WT and TBXT-Het (B) or WT and TBXT-KO (B′) after subsetting the Mesoderm cluster (n=3212 cells, adj. P<0.05, log2FC>0.25). (C) Venn diagrams of the number of DEGs identified in the TBXT-Het (green), TBXT-KO (blue), or both TBXT-Het and TBXT-KO (purple) compared with WT. The top diagram includes genes downregulated and the bottom diagram includes genes upregulated in TBXT-Het and/or TBXT-KO compared with WT. (D) Dotplot of 89 DEGs between WT and TBXT-KO in order of log2FC. Asterisks indicate genes that are also differentially expressed between WT to TBXT-Het, and the bolded black text indicates genes identified as likely direct targets of TBXT by comparison with ChIP-seq datasets. DDX58, RIGI. (E) Key results from ShinyGO GO Biological Process Analysis (FDR<0.05, Pathway Size 2-2000 genes) (F) Multiplexed fluorescence in situ hybridization in gastruloids at 48 h. In situ hybridizations were repeated for ≥3 gastruloids of each genotype. (G-I) Dotplots showing gene expression trends for key genes related to mesoderm identity (G), WNT signaling (H) and EMT (I). Bold genes are significantly differentially expressed (adj. P<0.05, log2FC>0.25). (J) Venn diagrams summarizing DEGs detected in corresponding ChIP-seq datasets. lnc, long non-coding RNA, mito, mitochondrial gene.
We also identified 54 genes upregulated in the TBXT-Het compared with WT and 45 genes upregulated in the TBXT-KO compared with WT (Fig. 3B-D, Fig. S6A) (adj. P<0.05, log2FC>0.25). Of these, 15 genes overlapped, including the posterior morphogens CYP26A1 and FGF17 and the broad mesendoderm marker PRDM1 (Fig. 3C,D, asterisks). We conducted in situ hybridization for FGF17 and CYP26A1 to validate these results and observed increased expression of both transcripts in TBXT-KO gastruloids relative to the WT (Fig. 3F). Genes that demonstrated expression patterns that scaled with TBXT dose included the migration and adhesion ligand EFNA5, the cell adhesion molecules CADM1, CADM2 and CDH12, and the endoderm marker LHX1 (Fig. 3D). GO enrichment of the 45 genes significantly upregulated in the TBXT-KO identified ‘cell–cell adhesion’ and ‘adherens junction organization’, reflective of a persistent epithelial character and impaired migration in the absence or reduction of TBXT (Fig. 3E, Table S2). In addition, several terms reflective of neurogenesis emerged. These terms are almost exclusively driven by genes involved in axon guidance or migration such as NTN4, SEMA6A and EFNA5, rather than markers of neural identity such as SOX2, OLIG3 or PAX6, so although it is possible that these changes in gene expression reflect a neural bias we believe it more accurately represents the misregulation of genes involved in the migratory mesenchymal phenotype. Furthermore, it is also possible that this neural bias in GO terms reflects a bias of an NMP-like population toward neural identities at the expense of mesodermal, a decision point that TBXT has been previously shown to modulate directly (Koch et al., 2017). However, the NMP marker NKX1-2 was minimally detected across all clusters, and no cluster showed robust co-expression of SOX2 and TBXT even at higher clustering resolutions (Fig. 2D, Fig. S7). This prevented us from conclusively defining the presence of an NMP subpopulation within the mesodermal population or in the dataset more broadly.
Overall, the level of expression of DEGs in the TBXT-Het was always similar to that of the TBXT-KO, similar to that of the WT, or an intermediate between the two. This pattern suggests that although some genes respond to TBXT dose in a binary manner whereby WT expression levels are needed for downstream expression, more often the expression level of TBXT is closely linked to the expression level of its downstream targets.
TBXT dose subtly influences the expression profile within the mesodermal population
Although the presence and spatial distribution of broad mesoderm markers were not affected by varying TBXT expression, closer examination revealed that the gene regulatory network controlling early mesoderm identity itself is only subtly influenced by TBXT dose. The broad mesendoderm and PGCLC marker PRDM1 was significantly upregulated in both the TBXT-Het and the TBXT-KO relative to WT, suggesting its expression is sensitive to TBXT dose (Fig. 3D,G, Fig. S6A, Table S3). LHX1 and PDGFRA had a similar expression pattern as PRDM1, with intermediate expression levels in the TBXT-Het and significant upregulation in the TBXT-KO compared with WT. To understand whether this gene expression pattern reflected a larger bias toward endodermal cell identity, we manually isolated several canonical markers of mesodermal and endodermal identity from our dataset and looked at their expression patterns. SOX17 and EOMES followed a similar pattern to PRDM1, with slightly elevated but not significantly differential expression in the TBXT-KO relative to the WT. In contrast, the mesoderm markers MESP1, TBX6 and BMP4 were slightly higher in the WT compared with the KO, but again these changes in expression were not statistically significant (Fig. 3G, Fig. S6C). The canonical WNT pathway components RSPO3, WDPCP and TCF4 did display significantly graded expression, with the highest expression in WT and the lowest in the TBXT-KO (Fig. 3D,H, Fig. S6D). WNT3A itself was very sparsely detected in the nascent mesoderm population by snRNA-seq; however, the significant upregulation of several canonical WNT pathway components in WT agrees with the WNT3A expression pattern observed by IF (Figs 1D-F, 3H). We then re-analyzed our dataset at a higher resolution (0.6) to understand whether these subtle trends reflected larger biases toward endoderm fate that were being masked by our clustering resolution (Fig. S7). Although this new resolution did segregate our mesoderm cluster into a mesendoderm-like population and a relatively more mature mesoderm-like population, there was not a clear genotype-specific bias in the distribution of cells assigned to each of these new clusters, reinforcing the consistency in lineage allocation regardless of TBXT dosage.
Notably, several direct targets of TBXT that are implicated in osteogenesis were significantly upregulated in WT, including BMPER, GNAI2, ENPP1, THSD7A and ADAMTS3 (Fig. S6C). This is relevant because T+/− mutant mice frequently display skeletal malformations later in development, including rib fusions, osteochondrodysplasia and brachydactyly (Grüneberg, 1958; Herrmann et al., 1990; Wilson et al., 1993). In addition, MAML3, which amplifies transcription of HES1 to drive oscillations in somitogenesis, was increased in the TBXT-KO gastruloids. This expression change is potentially related to somitic fusions and subsequent rib fusions seen in T+/− and T−/− animal models later in development (Wu et al., 2002; William et al., 2007). Finally, the BMP inhibitor BMPER was significantly downregulated in both TBXT-Het and TBXT-KO compared with WT. Because BMP4 signaling activates TBXT expression and TBXT directly activates BMPER expression, decreased BMPER expression in the TBXT-KO could reflect a compensatory pathway to rescue TBXT expression levels in TBXT-Het or TBXT-KO.
These gene expression trends suggest that, although reduced TBXT expression does not preclude cells from forming mesoderm, it does influence the gene regulatory networks that may have a later-onset role in downstream axial elongation and patterning.
TBXT dose influences the expression of genes that modulate EMT
In addition to genes related to endoderm identity, we observed several gene expression patterns reflective of impaired EMT in our mutant gastruloids. For example, the cell adhesion molecules CADM1, CADM2, CDH12 and EFNA5 were significantly upregulated in the TBXT-KO mesoderm, intermediate in the TBXT-Het mesoderm, and downregulated in the WT mesoderm (Fig. 3I). CDH1 shared this expression pattern albeit with reduced statistical significance, suggesting that the mutant cell lines retain an epithelial character, unlike their WT counterparts. Indeed, SNAI1 and SNAI2, canonical regulators of EMT, both had the highest expression in the WT mesoderm, although they were detected in very few cells (Fig. 3I). The upregulation of adhesion molecules and downregulation of SNAI family proteins observed in TBXT-Het and TBXT-KO suggests that, although the WT population is actively undergoing EMT, this process is impaired in the mutant gastruloids.
The majority of DEGs are direct targets of TBXT
To better isolate the key components of the gene regulatory network underlying TBXT dose-responsive gene expression, we sought to identify which DEGs are also likely direct targets of TBXT based on promoter binding proximity. To do this, we leveraged four existing TBXT ChIP-seq datasets from hESCs and mouse embryonic stem cells (mESCs) grown in vitro. The first two datasets were derived from hESCs differentiated into monolayer TBXT+ cell populations using either activin (endoderm-biased) or BMP4 (mesoderm-biased) protocols (Faial et al., 2015). The third dataset utilized a mesendoderm-biased hESC population driven by TGFβ and WNT signaling (Tsankov et al., 2015). The final study used an activin A-mediated protocol to drive mESC embryoid bodies to a T+ PS fate (Lolas et al., 2014).
This comparison revealed that the vast majority of the genes that were differentially expressed between the different genotypes within the Mesoderm cluster have TBXT-binding sites adjacent to their promoters and are therefore likely direct targets of TBXT. Specifically, 33 of the 44 DEGs downregulated in TBXT-KO compared with WT had proximal TBXT-binding sites in at least one ChIP-seq dataset (Fig. 3D, bold, 3J, Fig. S8). Six of these 33 genes were detected in both human and mouse ChIP-seq datasets (BMPER, DTWD2, EXT1, MDFIC, ENPP1 and RSPO3), suggesting they play an evolutionarily conserved function in early development. The remaining 26 genes were specifically identified in the human ChIP-seq datasets. One gene, PCED1B, was detected as a potential direct target of TBXT in the mouse datasets but not in any of the human datasets. Seven of the 33 DEGs identified in the ChIP-seq datasets were significantly downregulated in both the TBXT-Het and TBXT-KO colonies compared with WT, including BMPER, WNT5A and UNC5C, suggesting these genes are particularly sensitive to reduced TBXT expression. The 11 DEGs that were not identified as likely direct targets of TBXT included four protein-coding genes (LDHA, AFDN, NLGN4Y and SLC9C1) and seven long non-coding RNAs.
Thirty-five out of the 45 genes upregulated in the TBXT-KO compared with WT were found to have proximal TBXT-binding sites in at least one ChIP-seq dataset (log2FC>0.25, adj. P<0.05) (Fig. 3D, bold, 3J, Fig. S8). Of these, nine were identified as potential direct targets of TBXT in both human and mouse datasets (FGF17, LHX1, SAMD3, UBL3, CADM1, EFNA5, MAML3, SEMA6A and TCF7L1). The remaining 26 genes were identified specifically in the human ChIP-seq datasets. Nine of the 35 potential direct targets, including FGF17, CYP26A1 and PRDM1, were significantly upregulated in both TBXT-Het and TBXT-KO colonies compared with WT, reflecting a more pronounced dosage sensitivity (Fig. S8, bold). Ten DEGs were not identified in any of the ChIP-seq datasets, of which three were detected in coding regions (CST1, PKIB and RPS21), one was mitochondrial, and the remaining six were long non-coding RNAs.
Taken together, these results indicate that the majority of genes that display TBXT dosage sensitivity have proximal binding sites for TBXT, and therefore their transcription is likely directly modulated by TBXT. In addition, based on their evolutionary preservation, the genes that we identified as shared between this study and ChIP datasets from both species are likely uniquely important for gastrulation and subsequent survival of the embryo.
TBXT does not significantly impact chromatin accessibility in nascent mesoderm
It has previously been shown that TBXT plays a role in the deposition of H3K27ac at target genes during hematopoietic and endothelial development to alter transcription (Beisaw et al., 2018; Chen et al., 2023) and that TBXT is essential for remodeling chromatin during NMP development (Koch et al., 2017). We were interested in determining whether TBXT similarly modulates chromatin accessibility during early PS morphogenesis. However, in our gastruloid mesoderm population, very few genomic regions were differentially accessible between either the TBXT-Het and WT or TBXT-KO and WT conditions (26 and six genes, respectively; FDR<0.1, log2FC>0.5) (Fig. S9A-C, Table S3). Although it is true that TBXT-Het does reflect a higher number of DARs than TBXT-KO when compared with WT, the log2FC values of these DARs were very close to the significance cutoff and several were microRNAs. The six DARs identified in the TBXT-KO included the adhesion protein MDGA2, the vitamin D metabolizing enzyme CYP2R1, the pluripotency-related gene EYS, the histone components HIST1H4L (H4C13), HIST1H1B (H1-5) and HIST1H3I (H3C11), and several long non-coding RNAs. Likewise, very few peaks were differentially accessible between the two conditions (18 peaks between WT and TBXT-Het and no peaks between WT and TBXT-KO, respectively; FDR<0.1, log2FC>0.5), and no motifs were significantly enriched within these peaks, as was true for all clusters (Table S3). We observed slight variation in peak height at genes demonstrating differential expression, such as WNT5A, RSPO3 and LHX1, although these variations did not reach statistical significance (Fig. S9D). These results suggest that, at this early stage of mesoderm specification in gastruloids, TBXT expression does not contribute to mesodermal patterning by influencing chromatin accessibility.
CellChat reveals TBXT-driven regulation of cell–cell adhesions and non-canonical WNT signaling across and within clusters
Morphogenesis frequently involves paracrine and juxtracrine signaling between different cell populations. For example, mesodermal cells in contact with the epiblast and visceral endoderm have distinct protrusions, whereas mesoderm cells in contact with other mesoderm cells appear smoother, reflective of distinct cell responses to specific migratory guidance cues (Saykali et al., 2019). Therefore, we questioned whether TBXT dose-dependent changes in cell behavior are restricted to the mesoderm population or if they may be influenced by adjacent cell types. To begin to address this question, we turned to the software CellChat, which analyzes the expression of ligand–receptor pairs within and across clusters to predict patterns in cell–cell communication.
First, we investigated how TBXT expression influences broad patterns in pathway activation by assessing which pathways have the largest changes in signals between or within different cell types, termed ‘information flow’, when comparing WT, TBXT-Het and TBXT-KO snRNA-seq data across all 11 clusters. This analysis revealed several pathways that have varying levels of information flow, and we focused on the three in which the WT and TBXT-KO have distinctly different patterns from one another (Fig. 4A). The first two pathways, cell adhesions (CADM) and junctional adhesions (JAM), were predicted to have higher information flow in the TBXT-Het and TBXT-KO gastruloids relative to WT, whereas the third pathway, non-canonical WNT (ncWNT) signaling, was predicted to have higher information flow in WT and TBXT-Het gastruloids relative to TBXT-KO.
CellChat reveals TBXT-driven regulation of cell–cell adhesions and ncWNT signaling across and within clusters. (A) Bar plot comparing information flow (arbitrary units) of key pathways between WT, TBXT-Het and TBXT-KO. (B) Circle plots visualizing communication between or within clusters for the CADM, JAM and ncWNT signaling pathways. Numbers correlate with cluster identity, and line thickness corresponds to the strength of predicted communication. Cluster 9 is the Mesoderm cluster. Small circle plots under the ‘ncWNT Signaling Pathway Network’ header depict interactions between WNT5A and designated FZD receptors. (C) Heatmap showing the communication dynamics for each genotype for the CADM, JAM and ncWNT signaling pathways. (D) Bubble plot comparing predicted ligand–receptor interactions for CADM, JAM, WNT and ncWNT pathways. Only ligand–receptor interactions with variability in communication probability are shown.
CellChat reveals TBXT-driven regulation of cell–cell adhesions and ncWNT signaling across and within clusters. (A) Bar plot comparing information flow (arbitrary units) of key pathways between WT, TBXT-Het and TBXT-KO. (B) Circle plots visualizing communication between or within clusters for the CADM, JAM and ncWNT signaling pathways. Numbers correlate with cluster identity, and line thickness corresponds to the strength of predicted communication. Cluster 9 is the Mesoderm cluster. Small circle plots under the ‘ncWNT Signaling Pathway Network’ header depict interactions between WNT5A and designated FZD receptors. (C) Heatmap showing the communication dynamics for each genotype for the CADM, JAM and ncWNT signaling pathways. (D) Bubble plot comparing predicted ligand–receptor interactions for CADM, JAM, WNT and ncWNT pathways. Only ligand–receptor interactions with variability in communication probability are shown.
To understand how this expression pattern affects interactions across clusters, we looked at predicted ligand–receptor interactions of each of these three pathways within and between each cluster for all three genotypes. Analysis of CADM pathway and JAM pathway communication across and with clusters revealed these hits were primarily driven by varying expression levels of CADM1 and F11R, respectively, although the JAM pathway was also influenced by JAM3 expression (Fig. S10A). We then visualized ligand–receptor interactions between and within clusters using circle plots, where the presence and thickness of a line correlate with the degree of predicted communication within or between clusters. An overall increase in CADM signaling was evident in TBXT-Het and TBXT-KO gastruloids relative to WT, and this seemed to be largely due to increased communication within the PS-Late and Mesoderm clusters (Fig. 4B). Similarly, in TBXT-Het and TBXT-KO gastruloids, JAM signaling appeared exaggerated in PS and Mesoderm clusters, whereas these signaling dynamics were largely lost in WT. These trends suggest that, unlike the WT, the TBXT-Het and TBXT-KO colonies are maintaining their cell–cell adhesions and junctional adhesions, and these differences are uniquely apparent in cell types expected to undergo EMT, such as PS-Late and Mesoderm.
Within the ncWNT pathway, the TBXT-Het and TBXT-KO gastruloids had reduced ligand–receptor interactions compared with WT both between clusters, but also within clusters, with the degree of self-regulation changing most notably within the three PS clusters and the Mesoderm cluster (Fig. 4B). This variation in ncWNT signaling across genotypes was driven by the expression patterns of WNT5A, WNT5B and several genes encoding FZD receptors (Fig. 4B, Fig. S10). WNT5A is a direct target of TBXT and a crucial component of both the non-canonical WNT/planar cell polarity (PCP) pathway, which controls many aspects of directed cell migration and convergent extension, and the canonical WNT pathway, which is crucial for sustained mesoderm development (Yamaguchi et al., 1999; Dunty et al., 2008; Kikuchi et al., 2012).
To better understand how CADM, JAM and ncWNT pathways are influenced by TBXT dose, we next explored the extent to which different clusters operate as senders, receivers, mediators, and influencers of ligands and receptors in these pathways. Mediators serve as gatekeepers to control cell communication between any two groups, whereas influencers are predicted to control information flow more generally (Jin et al., 2021). For both the CADM and JAM pathways, the TBXT-Het and TBXT-KO PS-Late and Mesoderm clusters increased their ability to serve as senders, receivers, mediators or influencers relative to WT (Fig. 4C). These trends likely reflected genotype-dependent patterns in CADM1 and F11R expression within these specific clusters. In contrast, the PS-Late, Mesoderm and PGCLC clusters largely lost their ability to operate as senders, mediators and influencers of the ncWNT pathway, but they maintained their ability to serve as receivers (Fig. 4C). These patterns suggest that variability in ncWNT pathway information flow is likely more highly dependent on varying levels of WNT5A or WNT5B expression (senders) rather than FZD gene expression (receivers).
To clarify how specific ligands or receivers modulated predicted information flow in the ncWNT pathway, we looked at how these gastruloid communication patterns were affected by the expression of specific ligand–receptor pairs sent or received by the Mesoderm cluster (Fig. 4D). We found that WNT5A signals sent from the mesoderm to FZD receptors across all other clusters, including the mesoderm itself, were largely lost in the TBXT-KO but maintained in WT. WNT5B, however, was not detected as a major ligand sent from the mesoderm. Reciprocally, ncWNT signals, such as WNT5A and WNT5B, sent from ExE-Early, PS-Late and PGCLC clusters to the Mesoderm cluster showed increased signal in the WT, likely reflecting an increase in WNT5A expression in all TBXT-expressing clusters. This analysis also revealed that WNT5B is most impactful in extra-embryonic and PGCLC clusters, but negligible in PS and mesendodermal clusters.
Taken together, our analyses directly reflect the persistence of cell adhesions and junctional adhesions in the TBXT-Het and TBXT-KO PS-Late and Mesoderm clusters, suggesting that these cells do not readily acquire a mesenchymal identity, a key requirement for motility and subsequent PS morphogenesis. The results additionally reflect the downregulation of WNT5A in the TBXT-Het and TBXT-KO compared with WT and identify the non-canonical WNT signaling pathway as a key TBXT-dependent regulator of nascent mesoderm development.
TBXT dose influences the persistence of cell–cell adhesions
To understand how cell–cell adhesions and EMT are affected by TBXT dose, we conducted IF for several markers of EMT progression, including SNAI1, the tight junction regulator ZO1 (TJP1), and the basement membrane protein FN1 in 48 h 2D gastruloid colonies (Fig. S11). However, gastruloid colonies develop a very dense mesoderm layer that grows into the z-axis, and we found that the colony density and multilayered growth, in conjunction with the presence of multiple cell types, prohibited the clear visualization and quantification of cell morphology and junctions. In addition, the time point recapitulated by the gastruloids is likely too early to visualize the full process of EMT, and the 48 h limitation of colony growth restricts our ability to monitor EMT progression over time. Therefore, we turned to an alternative protocol that induces early mesoderm in a monolayer using STEMCELL Technology's STEMdiff Mesoderm Induction Medium (MIM). In WT cells, this medium induces TBXT expression in >90% of cells within 48 h, thereby recapitulating the initial stages of nascent mesoderm specification (Fig. 5A,B, Fig. S12A). This early mesoderm population maintains consistent EOMES expression regardless of TBXT genotype with expression peaking at 48 h, reinforcing the shared nascent mesoderm identity across genotypes (Fig. S12B). At 48 h we observed a TBX6+ mesoderm population in WT and TBXT-Het and a reduced but present TBX6+ population in TBXT-KO. This is in agreement with our snRNA-seq data and the established role of TBX6 as a direct target of TBXT in the PS and later in the paraxial mesoderm. (Fig. 3G, Fig. S12C). We differentiated cells of each genotype using MIM for 0 h, 24 h, 48 h or 72 h, and assessed cell morphology and protein expression by IF.
TBXT dose directly impacts the timing of EMT to permit migration. (A) Schematic of the experimental setup. R.I., ROCK inhibitor. hiPSCs were grown in a monolayer and exposed to MIM for 0, 24, 48 or 72 h before subsequent analysis. (B) Immunofluorescence for TBXT or EOMES expression in each genotype after 48 h MIM exposure. (C,E) Immunofluorescence for CDH1 (C) or SNAI (E) at 0 h, 24 h, 48 h or 72 h MIM exposure. Boxed areas are shown at higher magnification on the right. (D) Quantification of the ratio between CDH1 and LMNB1 intensity. n≥6 wells/genotype. (F) Schematic showing segmentation of nuclei and cytoplasm and the quantification of the ratio between SNAI and LMNB1 specifically within cell nuclei. WT=1228 cells; TBXT-Het=1423 cells TBXT-KO=1373 cells. n=3 images per genotype per time point. (G) Brightfield (BF) images of the scratch assay at 24 h or 48 h MIM exposure (0 h or 24 h relative to scratch), and immunofluorescence for F-actin at the edge of a scratch wound at 48 h (24 h relative to scratch). (H) Quantification of the difference in tissue area between 48 h and 24 h MIM exposure (mTeSR+: n=WT=23, TBXT-Het=19, TBXT-KO=21; MIM: n=WT=8, TBXT-Het=15, TBXT-KO=16). (I) Overview of temporal onset of EMT in each genotype. A.U., arbitrary units.
TBXT dose directly impacts the timing of EMT to permit migration. (A) Schematic of the experimental setup. R.I., ROCK inhibitor. hiPSCs were grown in a monolayer and exposed to MIM for 0, 24, 48 or 72 h before subsequent analysis. (B) Immunofluorescence for TBXT or EOMES expression in each genotype after 48 h MIM exposure. (C,E) Immunofluorescence for CDH1 (C) or SNAI (E) at 0 h, 24 h, 48 h or 72 h MIM exposure. Boxed areas are shown at higher magnification on the right. (D) Quantification of the ratio between CDH1 and LMNB1 intensity. n≥6 wells/genotype. (F) Schematic showing segmentation of nuclei and cytoplasm and the quantification of the ratio between SNAI and LMNB1 specifically within cell nuclei. WT=1228 cells; TBXT-Het=1423 cells TBXT-KO=1373 cells. n=3 images per genotype per time point. (G) Brightfield (BF) images of the scratch assay at 24 h or 48 h MIM exposure (0 h or 24 h relative to scratch), and immunofluorescence for F-actin at the edge of a scratch wound at 48 h (24 h relative to scratch). (H) Quantification of the difference in tissue area between 48 h and 24 h MIM exposure (mTeSR+: n=WT=23, TBXT-Het=19, TBXT-KO=21; MIM: n=WT=8, TBXT-Het=15, TBXT-KO=16). (I) Overview of temporal onset of EMT in each genotype. A.U., arbitrary units.
Varying the timing of MIM exposure revealed a robust and stepwise delay in the timing of the downregulation of CDH1, a canonical marker of the epithelial state, across genotypes (Fig. 5C,D). At 24 h of MIM treatment, all genotypes displayed robust junctional CDH1 expression. However, by 48 h, CDH1 was largely eliminated in WT cells, TBXT-Het cells had variable expression, and TBXT-KO cells maintained robust junctional expression, in agreement with our snRNA-seq data. By 72 h, TBXT-Het cells had lost CDH1 expression to mirror WT cells, whereas CDH1 was reduced but still detectable in TBXT-KO cells. This expression pattern demonstrates that TBXT dose is directly correlated to the temporal downregulation of CDH1 in the nascent mesoderm, which in turn correlates with stunted EMT progression. We also looked at CDH2 expression, as CDH1 downregulation is generally associated with CDH2 upregulation as EMT progresses. However, at 48 h CDH2 appeared to be consistently expressed regardless of CDH1 expression level or genotype (Fig. S13B).
To better understand how TBXT influences EMT progression more broadly, we looked at the distribution of SNAI1, an established marker of EMT and inhibitor of CDH1 (Muqbil et al., 2014; Xu et al., 2019), across the same 72 h time span. As previously indicated, we detected higher SNAI1 RNA expression in WT compared with TBXT-Het and TBXT-KO by snRNA-seq (Fig. 3I). IF revealed that WT and TBXT-Het had a significantly higher level of nuclear SNAI1 relative to TBXT-KO at the 48 h time point (Fig. 5E,F). At 72 h, this expression was beginning to drop off in the WT but was maintained in the TBXT-Het and TBXT-KO, revealing an inverse correlation to CDH1 expression. Furthermore, we also observed increased protein expression of other epithelial markers, including junctional β-catenin and ZO1, in the TBXT-KO cells relative to the WT after 48 h MIM, in accordance with TBXT dose dependently augmenting the progression of EMT (Fig. S13A).
EMT progression and mesenchymal cell motility are linked to alterations in extracellular matrix composition. In particular, the deposition of the basement membrane protein and SNAI1-target fibronectin (FN1) reflects the acquisition of a mesenchymal phenotype. We observed decreased deposition of the basement membrane component FN1 in our TBXT-KO relative to TBXT-Het and WT, reflecting maintained cell adhesions and decreased motility in the absence of TBXT (Fig. S13A).
Next, we utilized the scratch wound assay to visualize the migration patterns of cells of each genotype and quantify how TBXT dose-dependent changes in EMT-related protein expression correlate with cell migration kinetics. Comparing the difference in the percentage of occupied tissue area between the starting time point (24 h MIM) and ending time point (48 h MIM) across genotypes revealed that the WT population migrated significantly farther into the wound space than did the TBXT-Het or TBXT-KO populations (Fig. 5G,H, Movies 1-6). WT cells were noticeably larger and much more motile both as a group and independently. Staining of F-actin revealed long protrusions in WT cells (Fig. 5G), reflective of their motile mesenchymal character. TBXT-KO cells, in contrast, remained tightly packed and epithelial, and cell movement appeared to be driven by overconfluence more so than directed cell movement. A fraction of TBXT-Het cells acquired a similar mesenchymal phenotype to WT, but others appeared epithelial and non-migratory. Therefore, increased SNAI1 and decreased CDH1 expression seen in WT cells at the 48 h MIM time point reflect the acquisition of a TBXT-dependent migratory mesenchymal phenotype.
Through these studies, we observed that TBXT-Het is able to increase SNAI1, FN1 and β-catenin expression in tandem with WT. However, at the 48 h time point, there were clusters of TBXT-Het cells that maintain their junctional protein expression, including CDH1 and ZO1, in a way that more closely mirrors the TBXT-KO phenotype. We hypothesize that both the upregulation of SNAI1 and the downregulation of CDH1 are crucial for migration, and that specifically the downregulation of CDH1 is required for delamination and increased cell motility. Therefore, perduring junctional CDH1 likely explains why we observe a less migratory phenotype in TBXT-Het.
The expression patterns of CDH1, β-catenin, ZO1, SNAI1 and FN1 in addition to the distinct migratory dynamics seen across genotypes indicate that TBXT plays a crucial role in regulating the temporal component of EMT, where TBXT dose dependently promotes the reduction of CDH1, nuclear localization of SNAI, and subsequent acquisition of a mesenchymal phenotype.
DISCUSSION
Our study demonstrates that the acquisition of mesodermal identity can be decoupled from the acquisition of a mesenchymal character, and TBXT is required for the latter in a dose-dependent manner. After 48 h of BMP4 exposure, the mesoderm cluster of our gastruloids exhibit TBXT dose-dependent changes in gene expression related to mesendoderm identity, both non-canonical and canonical WNT signaling, and EMT, and the majority of these DEGs are likely directly regulated by TBXT as evidenced by comparisons to existing ChIP-seq datasets. CellChat analysis highlighted the JAM, CADM and ncWNT pathways as uniquely regulated between WT and TBXT-KO colonies and showed that the Mesoderm and PS-Late clusters play key roles in the regulation of these pathways. We then utilized a 2D monolayer culture system to demonstrate that TBXT dose directly correlates with the temporal downregulation of genes related to cell adhesions, the upregulation of drivers of EMT, and the subsequent acquisition of a mesenchymal phenotype.
These findings have interesting implications for our understanding of how TBXT influences embryonic patterning at the earliest stages of gastrulation, even before the commencement of posterior trunk development. As opposed to NMP populations, which require TBXT for mesoderm differentiation (Gouti et al., 2017; Koch et al., 2017), we demonstrated that early PS populations do not require TBXT for the initial establishment of a relatively anterior mesoderm identity. This result is in agreement with studies showing that another T-box factor, EOMES, primarily drives mesoderm differentiation at this stage of development (Schüle et al., 2023 preprint). However, we also discovered that TBXT is not dispensable at this early stage, as its expression directly influences the timing of EMT and therefore the ability of the nascent mesoderm cells to properly acquire a migratory mesenchymal character. We believe that this impaired EMT observed in vitro is reflected in the ‘pile-up’ phenotype observed in vivo, as cells accumulate in the PS when T expression is lost.
The time point reflected in our 2D gastruloid culture reflects the earliest stages of mesoderm commitment. However, the question remains how TBXT dosage influences the differentiation of mesodermal subtypes after EMT has been initiated. We observe that the early mesoderm-like cells emerge before EMT regardless of TBXT dosage; however, this does not negate the very likely possibility that both TBXT expression and changes in gene regulation driven by the mechanical forces of EMT itself are required for more mature mesodermal subtypes to differentiate correctly as development progresses. Additionally, TBXT is known to augment cell fate in populations not clearly defined in our 2D gastruloid or MIM systems, including notochord and NMP differentiation (Zhu et al., 2016; Koch et al., 2017). Additional experiments, possibly using in vivo models, will likely be required to further disentangle how these later mesodermal cell identities are influenced by the reduction of TBXT expression and the corresponding delayed onset of EMT.
Somewhat surprisingly, we did not observe large-scale changes in chromatin accessibility across TBXT genotypes, despite detecting variation in gene expression. Although this may be specific to our in vitro model system, it is likely that it reflects the relatively early time point modeled by the gastruloids. At 48 h, mesoderm is just beginning to emerge, and it is possible that with continued differentiation over an additional 24-48 h, the mesoderm identity would advance, and chromatin remodeling would be more readily apparent. It is also possible that TBXT does not directly remodel chromatin during early gastrulation and might instead be interacting with histone-modifying complexes.
Overall, this study clarifies the role of TBXT during early PS development and sheds light on its ability to promote the temporal progression of EMT in a dose-dependent manner. Although TBXT-KO cells do ultimately downregulate CDH1 and acquire a motile phenotype, this transition occurs considerably later than in WT cells, demonstrating that the correct timing of EMT is crucial for proper morphogenesis. In addition, this study decouples EMT progression from initial mesoderm specification in the PS and complements in vivo studies to both improve our core understanding of vertebrate mesoderm development and identify nuances of human development. This understanding will help us to effectively design directed differentiation strategies that incorporate both EMT and cell fate acquisition, in addition to improving our foundational understanding of human embryogenesis.
MATERIALS AND METHODS
Cell lines
All work with hiPSCs was approved by the University of California, San Francisco Human Gamete, Embryo, and Stem Cell Research (GESCR) Committee. hiPSCs harboring genome-edited indel mutations for TBXT were generated for this study. TBXT-Het-35 and TBXT-KO were derived from the Allen Institute WTC11-LMNB1-GFP parental cell line (AICS-0013 cl.210) and TBXT-Het-6 was derived from the Allen Institute WTC11-LMNB1-mTagRFP-T parental cell line (AICS-0034 cl.62). Throughout this article, ‘TBXT-Het’ refers to ‘TBXT-Het-35’ unless otherwise noted. The WT line was derived from a WTC11-LMNB1-GFP subclone that was exposed to the TBXT sgRNA but remained unedited. All cell lines were karyotyped by Cell Line Genetics and reported to be karyotypically normal. Additionally, all cell lines tested negative for Mycoplasma using a MycoAlert Mycoplasma Detection Kit (Lonza).
Maintenance of hiPSCs
hiPSCs were cultured on growth factor-reduced (GFR) Matrigel (Corning Life Sciences) and fed at least every other day with mTeSR-Plus medium (STEMCELL Technologies) (Ludwig et al., 2006). Cells were passaged by dissociation with Accutase (STEMCELL Technologies) and re-seeded in mTeSR-Plus medium supplemented with the small molecule Rho-associated coiled-coil kinase (ROCK) inhibitor Y276932 (10 μM; Selleckchem) (Park et al., 2015) at a seeding density of 12,000 cells per cm2. After 24 h, cells were maintained in mTeSR-Plus media until 80% confluent.
TBXT allelic series generation
To generate the TBXT allelic series, we first lipofected parental cells with 125 ng sgRNA and 500 ng Cas9 protein according to the Lipofectamine Stem Transfection Reagent Protocol (Invitrogen). The human TBXT sgRNA (CAGAGCGCGAACUGCGCGUG) was a gift from Jacob Hanna (Addgene plasmid #59726; RRID: Addgene_59726) and targeted the first exon of the TBXT gene. After recovery for 48 h in mTeSR-Plus supplemented with ROCK inhibitor, lipofected cells were dissociated using Accutase and passaged into a GFR Matrigel-coated 10 cm dish, where they were expanded for 24 h in mTeSR-Plus with ROCK inhibitor. Media was replaced with mTeSR-Plus without ROCK inhibitor and cells continued to grow for another 2-4 days before the manual selection of 20-30 single colonies into individual wells of a 96-well plate. After the expansion of the clonal populations for 5-10 days, cells were passaged into a new 96-well plate at a 1:5 dilution ratio in mTeSR-Plus supplemented with ROCK inhibitor, and the remaining cells were used for genotyping. For screening of TBXT exon 1 non-homologous end-joining mutations, DNA was isolated using QuickExtract DNA lysis solution (Epicentre, QE0905 T), and genomic DNA flanking the targeted sequence was amplified by PCR (For1, GAAGGTGGATCTCAGGTAGCGAGTCTGG; Rev1, CAGCAGGAAGGAGTACATGGCGTTGG) and sequenced using Rev1. Synthego ICE analysis was employed to quantify editing efficiency and identify clones with heterozygous (45-55% KO score) or homozygous null (>90% KO score) mutations (Table S1). To eliminate the possibility of the heterozygous lines being a mixed population of WT and homozygous null alleles, a minimum of 12 subclones of each prospective heterozygous clone were isolated and Sanger sequenced as before, and all subclones were confirmed to contain identical genotypes (Table S1). After sequencing confirmation of respective genotypes, karyotypically normal cells from each hiPSC line were expanded for subsequent studies.
PDMS stamp fabrication
Standard photolithography methods were used to fabricate a master template (Alom Ruiz and Chen, 2007; Théry and Piel, 2009; Minn et al., 2020), which was provided as a gift from PengFei Zhang and the Abate lab at the University of California, San Francisco, USA, and previously described by Vasic et al. (2023). The photoresist master was then coated with a layer of chlorotrimethylsilane in the vacuum for 30 min. Polydimethylsiloxane (PDMS) and its curing agent, Sylgard 184, (Dow Corning) were mixed in a 10:1 ratio, degassed, poured over the top of the master, and cured at 60°C overnight, after which the PDMS layer was peeled off to be used as a stamp in microcontact printing.
Microcontact printing
PDMS stamps (each containing 12×1000 μm circles) were sterilized by washing in a 70% ethanol solution and drying in a laminar flow hood. GFR Matrigel (Corning) was diluted in DMEM/F-12 (Gibco) at 1:100 and incubated on the stamps to cover the entire surface of the feature side at 37°C for 1 h. The Matrigel solution was then aspirated off the stamps, which were air-dried. Using tweezers, the Matrigel-coated surface of stamps was brought into contact with a glass or plastic substrate, usually a glass 24-well plate or removable three-chamber slide (Ibidi), and incubated on the substrate for 1 h at 37°C. The stamps were then removed and rinsed in ethanol for future use. Matrigel-printed substrates were incubated with 1% bovine serum albumin (Sigma-Aldrich) in Dulbecco's PBS without calcium and magnesium (DPBS−/−) at room temperature for 1 h before being stored in DPBS−/− solution at 4°C for up to 2 weeks.
Confined 2D gastruloid differentiation
hiPSCs were dissociated with Accutase and resuspended in mTeSR-Plus supplemented with ROCK inhibitor. Cells were then seeded onto a stamped well at a concentration of approximately 750 cells/mm2. Cells were incubated at 37°C for 3 h before the well was rinsed 1× with DPBS and given fresh mTeSR-Plus supplemented with ROCK inhibitor. Approximately 24 h post-seeding, media was exchanged for mTeSR-Plus. After another 24 h or upon confluency of the stamped colony, media was exchanged for mTeSR-Plus supplemented with BMP4 (50 ng/ml). Colonies were allowed to differentiate in the presence of BMP4 for 48 h before being processed for downstream analyses.
Mesoderm induction media differentiation
hiPSCs were dissociated with Accutase and resuspended in mTeSR-Plus supplemented with ROCK inhibitor, as previously described. Cells were then seeded as a monolayer at a concentration of approximately 500 cells/mm2. Cells were incubated at 37°C for 24 h before being rinsed with 1× DPBS and given fresh mTeSR-Plus without ROCK inhibitor. Approximately 24 h later, media was exchanged for MIM (STEMCELL Technologies). Colonies were allowed to differentiate for 24-72 h, with MIM being exchanged daily, before use in downstream analyses.
Western blot
Cells of each genotype were induced to form TBXT+ mesoderm with either MIM or 4 μM CHIR99021 (Tocris, 4423) in mTeSR+ for 48 h prior to protein isolation. Cells were washed twice with ice-cold PBS and lysed in RIPA lysis buffer (ThermoFisher Scientific, A32965). Three replicate wells were pooled for each genotype for each differentiation condition. The protein concentration was determined using the Pierce BCA Protein Assay Kit (Life Technologies, 23227) and quantified on a SpectraMax i3 Multi-Mode Platform (Molecular Devices) following the manufacturer's instructions. Protein (∼20-40 μg) was transferred to the membrane using the Trans-Blot Turbo Transfer System (Bio-Rad, 1704157). The membrane was then blocked overnight at 4°C using Intercept TBS Blocking Buffer (Li-COR, 927-70001). Primary antibodies against TBXT (R&D Systems, AF2085; 0.2 µg/ml; goat) and either GAPDH (Abcam, ab9485; 1 µg/ml; rabbit), α-tubulin (Sigma-Aldrich, T5168; 20.5 µg/ml; mouse) or β-actin (Abcam, ab8226; 1 µg/ml; mouse) were diluted in Intercept T20 (TBS) antibody dilution buffer (LI-COR) at 1:1000 and incubated with the membrane overnight at 4°C. The next morning, membranes were washed in 1× TBS-T (0.1% Triton X-100) and incubated for 1 h at room temperature in the dark with species-specific secondary antibodies (Rb-680; 926-68071; Gt-680 925-68074, Ms-800 926-32212 Gt-800; 926-32214) (VWR) at 1:10,000. Membranes were subsequently washed and developed using the Bio-Rad ChemiDoc MP. Protein levels were quantified using ImageJ by first subtracting the intensity of a blank region of interest from the experimental region of interest, and then calculating a normalization factor by dividing the observed housekeeping intensity by the highest observed housekeeping intensity. The observed experimental signal was then divided by the lane normalization factor to generate a normalized experimental signal. Each lane from the same blot was then converted to a percentage of the highest WT normalized experimental signal on that blot.
Immunofluorescence
hiPSCs were rinsed with 1×PBS, fixed in 4% paraformaldehyde (VWR) for 15-20 min, and subsequently washed three times with PBS. The fixed cells were permeabilized and blocked in a buffer comprising 0.3% Triton X-100 (Sigma-Aldrich) and 5% normal donkey serum in PBS for 1 h, and then incubated with primary antibodies diluted in antibody dilution buffer (0.3% Triton X-100, 1% bovine serum albumin in PBS) overnight (Table S4). The following day, samples were washed three times with PBS and incubated with secondary antibodies in antibody dilution buffer at room temperature for 2 h. Secondary antibodies were conjugated with Alexa 405, Alexa 555 or Alexa 647 (Life Technologies) and used at 1:400. Cells were imaged at 10×, 20× or 40× magnification on an inverted AxioObserver Z1 (Zeiss) with an ORCA-Flash4.0 digital CMOS camera (Hamamatsu).
Scratch assay
Approximately 50,000 hiPSCs of each genotype were seeded into each well of a 96-well plate and induced to form mesoderm with the MIM induction protocol previously described. Twenty-four hours after the addition of MIM, a scratch was made in confluent wells manually by using a p200 pipette tip. Using ZenPro software, cells were then imaged every 12 min across 24 h in brightfield at 10× magnification on an inverted AxioObserver Z1 (Zeiss) with an ORCA-Flash4.0 digital CMOS camera (Hamamatsu) at 37°C with 5% CO2. Images of cells were computationally segmented, and the area occupied by cells was calculated using CellProfiler at the 24 h and 48 h time points relative to initial MIM addition (0 h and 24 h of live imaging). The area occupied at the final time point was subtracted from the area at the initial time point to yield the change in area. Wells that were not confluent at the time of the scratch or wells in which the basement membrane had been removed by the scratch were omitted from the final dataset.
Fluorescence in situ hybridization
PDMS stamps coated in GFR Matrigel were applied to three-well chamber slides with removable silicone chamber walls (Ibidi, 80381) and gastruloids were generated. Colonies were then fixed in 4% paraformaldehyde for 15-30 min, rinsed in PBS, and dehydrated according to the RNAScope cultured Adherent Cell Sample Preparation for RNA Multiplex Fluorescent v2 Assay (Advanced Cell Diagnostics). Slides were stored in 100% ethanol at −20°C on a short-term basis until initiation of the in situ hybridization protocol. The RNAScope protocol was then performed as outlined in User Manual 323100-USM. Catalog numbers for ACDBio RNAScope probes used in this study include FGF17 (1148351-C1), RSPO3 (413701-C2), MESP1(849231), CYP26A1 (487741) and WNT5A (604921). Colonies were imaged in collaboration with the UCSF Nikon Imaging Core and Gladstone Microscopy Core using the Olympus Fluoview FV3000 confocal microscope or the Nikon C2 laser-scanning confocal microscope equipped with a Prime 95B 25 mm sCMOS camera. Images are displayed as maximum z-projections.
Cell harvesting for single-nuclei multi-ome ATAC and RNA sequencing
Each of the TBXT genotypes was differentiated, harvested and prepared at the same time for each of the two biological replicates. Therefore, each set of biological replicates represents an experimental batch. For each sample within the batch, 12 micropatterns were differentiated within each well of a 24-well plate and cells from all wells on a plate were pooled, yielding a cell suspension comprising approximately 288 colonies per sample. Nuclei were isolated and ∼9000-12,000 nuclei/sample (TBXT-KO-2=19,000 nuclei) were transposed and loaded onto a 10× Chromium Chip J to generate gel bead-in emulsions (GEMs) following the 10× Chromium Next GEM Single Cell Multiome ATAC and Gene Expression Kit (10× Genomics, CG000338). GEMs were processed to produce ATAC and gene expression libraries in collaboration with the Gladstone Genomics Core. Deep sequencing was performed on a NovaSeq 6000 S4 200 cycle flow cell for a read depth of >25,000 reads per cell (TBXT-KO-2=16,000 reads per cell).
Data processing using Cell Ranger ARC
All ATAC and GEX datasets were processed using Cell Ranger ARC 2.0.0. FASTQ files were generated using the mkfastq function, and reads were aligned to the hg38 reference genome (version 2.0.0).
Seurat analysis
Outputs from the Cell Ranger ARC count pipeline were analyzed using the Seurat package (version 4.3.0) (Satija et al., 2015; Butler et al., 2018; Stuart et al., 2019) in R (version 4.2.0). Quality control filtering included the removal of outliers based on the number of features/genes (nFeature_RNA>2500 & nFeature_RNA<4500, nCount_RNA>5000 & nCount_RNA<12,500, mitochondrial percentage>5% and mitochondrial percentage<20%, and ribosomal percentage>3% and ribosomal percentage<15%). Cell cycle scores were added using the function CellCycleScoring. ScTransform version 2 normalization was then performed to integrate samples based on batch with regression based on cell cycle scores and ribosomal content [vars.to.regress=c(‘S. Score’, ‘G2 M. Score’, ‘percent_ribo’)]. Principal component analysis was performed using the most highly variable genes, and cells were clustered based on the top 15 principal components using the functions RunUMAP, FindNeighbors and FindClusters, and the output UMAP graphs were generated by DimPlot. The resolution parameter of 0.4 was set so that cluster boundaries largely separated the likely major cell types. Cluster annotation was performed based on the expression of known marker genes, leading to 11 broadly assigned cell types. Cells filtered out of the ArchR dataset based on doublet identification (see the ‘ArchR analysis’ section below) were removed from the Seurat dataset (final n=23,838 cells). Differential gene expression was then performed with the function FindAllMarkers (logfc.threshold=0.25 and min.pct=0.1) to generate a list of top marker genes for each cluster. In pairwise comparisons of differential gene expression, positive values reflect upregulation in mutant lines, whereas negative values reflect upregulation in WT.
ArchR analysis
Indexed Fragment files generated by the CellRanger-Arc counts function served as input for the generation of sample-specific ArrowFiles (minTSS=4 & minFrags=1000) using the R package ArchR version 1.0.2 (Granja et al., 2021). ArrowFile creation also generates a genome wide TileMatrix using 500 bp bins and a GeneScoreMatrix, an estimated value of gene expression based on a weighted calculation of accessibility within a gene body and surrounding locus. Each Arrow file (n=6 total) was then aggregated into a single ArchRProject for downstream analysis. Corresponding gene expression matrices were imported to the project based on the filtered feature barcode matrix h5 file generated by CellRanger-arc counts and descriptive cluster labels were imported from the corresponding Seurat object based on cell barcodes. Cells filtered out of the Seurat dataset based on QC metrics previously described were also removed from the ArchR dataset. Cell doublet removal was performed in ArchR using the functions addDoubletScores and filterDoublets, leaving 23,838 cells with a median TSS of 13.278 and a median value of 12,774.5 fragments per cell. (Cells filtered=WT-1 767/8759, TBXT-Het-1 613/7831, TBXT-KO-1 826/9092, WT-2 916/9572, TBXT-Het-2 617/7857, TBXT-KO-2 1623/12740).
After generation of the aggregated ArchR project, dimensionality reduction was performed using ArchR's implementation of Iterative Latent Semantic Indexing (LSI) with the function addIterativeLSI based on the 500 bp TileMatrix with four iterations, increasing resolution values (0.1, 0.2 and 0.4) each iteration. This was repeated using the Gene Expression Matrix based on 2500 variable features, yielding ‘LSI-ATAC’ and ‘LSI-RNA’ reduced dimensions, respectively. The two reduced dimension values were then combined using addCombinedDims to yield ‘LSI_Combined’, which was used as input for batch correction using Harmony with the function addHarmony (groupby=‘Sample, Batch’). Clustering was then performed using Harmony-corrected values with addClusters with a resolution of 0.4 from the R package Seurat. Finally, clusters were visualized with function plot embedding, using batch-corrected, single-cell embedding values from uniform manifold approximation and projection (UMAP) using the function addUMAP. Clusters and their corresponding UMAP projection were very similar to those generated based on RNA data in Seurat. Cluster identities in figures are based on barcodes transferred from Seurat rather than ArchR's LSI implementation.
After cluster annotation, pseudobulk replicates of cells within similar groups were created to facilitate peak calling. Replicates were created using addGroupCoverages and peak calling was performed using addReproduciblePeakSet using standard settings by implementing MACS2. We then used ArchR's iterative overlap peak merging method to create a union peakset of 322,520 unique peaks.
Cluster-enriched marker peaks were identified with getMarkerFeatures, using a Wilcoxon test and normalizing for biases from TSS enrichment scores and sequencing depth, and visualized with plotMarkerHeatmap, filtering for FDR<0.01 and abs(Log2FC)>1.25. Motif enrichment of cluster-enriched peaks was done using addMotifAnnotations with the ‘CODEX’ motif set. Enriched motifs per cluster were visualized by first running peakAnnoEnrichment, with FDR<0.1 and Log2FC>0.5. The top seven motifs per cluster were visualized as a heatmap using plotEnrichHeatmap.
Peak-to-gene linkage analysis was performed in ArchR using the addPeak2GeneLinks command, using the batch-corrected Harmony embedding values. A total of 3,010,318 linkages were found using FDR 1e−04, corCutOff=0.95 and a resolution of 1.
Differential accessibility within the Mesoderm cluster was performed by using the command subsetArchrProject to subset the ArchR project based on the Mesoderm-annotated cluster as determined from Seurat. This subsetting yielded 3212 cells, with a median TSS of 13.3085 and a median number of fragments of 12,423. DEGs predicted pairwise across genotypes (WT versus TBXT-KO or WT versus TBXT-Het) were identified with getMarkerFeatures based on the GeneScoreMatrix, using a Wilcoxon Test and normalizing for biases from TSS enrichment scores and sequencing depth. GetMarkers was then run and visualized as a volcano plot using plotMarkers [FDR<0.1 and abs(Log2FC)>0.5]. This process was repeated for the PeakMatrix to determine uniquely accessible peaks. Eighteen peaks were detected between WT and TBXT-Het and no peaks were detected between WT and TBXT-KO. No CODEX motif enrichments were detected between genotypes [FDR<0.1 and abs(Log2FC)>0.5].
CellChat
Cell signaling analysis was performed using the R package CellChat (Jin et al., 2021). The Seurat object containing all samples was subset by genotype, yielding a separate Seurat object for WT, TBXT-Het, or TBXT-KO. These three objects were then imported into CellChat using the function createCellChat. All ligand–receptor and signaling pathways within the CellChatDB.human were kept for analysis. Initial preprocessing to identify over-expressed ligands and receptors was performed using the functions identifyOverExpressedGenes and identifyOverExpressedInteractions with standard settings. Inference of cell communication was calculated with computeCommunProb(cellchat) and filtered by filterCommunication(cellchat, min.cells=10). Pathway-level cell communication was calculated with computeCommunProbPathway, and aggregated networks were identified with aggregateNet, using standard settings. Network centrality scores were assigned with the function netAnalysis_computeCentrality. This workflow was run for WT, TBXT-Het and TBXT-KO datasets independently and differential signaling analysis was then run by merging the WT, TBXT-Het and TBXT-KO objects with mergeCellChat(). Information flow, which is defined by the sum of communication probability among all pairs of cell groups in the inferred network (i.e. the total weights in the network), was compared across genotypes using rankNet(cellchat). The distance of signaling networks between WT and TBXT-KO datasets was calculated by performing joint manifold learning and classification of communication networks based on functional similarity using computeNetSimilarityPairwise(cellchat), netEmbedding(cellchat), and netClustering (cellchat). Circle diagrams, heatmaps and bubble plots of pathways of interest were then generated for each genotype separately using the standard settings for netVisual_aggregate(cellchat), netVisual_heatmap(cellchat) or netVisual_bubble(cellchat), respectively. Violin plots of differential gene expression were generated using plotGeneExpression(cellchat) with the standard settings.
GO analysis
GO analysis for downregulated or upregulated TBXT-dependent genes was performed with ShinyGO version 0.77 (Ge et al., 2020) using GO Biological Process terms. Downregulated or upregulated TBXT-dependent gene lists from the Mesoderm subcluster were assembled from differential tests between TBXT-Het versus WT or TBXT-KO versus WT in Seurat. Gene sets were filtered with a significance threshold set at adj. P<0.05 and the abs(log2FC)>0.25. Biologically relevant pathways within the first 20 hits for the TBXT-KO versus WT comparison were visualized with bar plots, and all results are available in Table S2.
Quantification and statistical analysis
Each experiment was performed with at least three biological replicates except multi-omic snATAC and snRNA-seq, which was performed with two biological replicates. Multiple comparisons were used to compare multiple groups followed by unpaired t-tests (two-tailed) between two groups subject to a posthoc Bonferroni correction. In gene expression analysis, two replicates were used for each condition, and all gene expression was normalized to control WT populations followed by unpaired t-tests (two-tailed). Significance was specified as adj. P<0.05 unless otherwise specified in figure legends. All error bars represent the s.e.m. unless otherwise noted in the figure legend.
Acknowledgements
We thank Mylinh Bernardi and Horng-Ru Lin of the Gladstone Genomics core for their support in library preparation, the Gladstone Stem Cell Core for iPSC reagent procurement and facilities, and Kathryn Claiborn, Giovanni Maki and Tami Tolpa for editorial and graphic design assistance. Imaging data for this study was acquired at the Center for Advanced Light Microscopy at UCSF using a CREST/C2 confocal microscope obtained using grants from the UCSF Program for Breakthrough Biomedical Research funded in part by the Sandler Foundation and the UCSF Research Resource Fund Award, and using an AxioObserver Z1 inverted microcope at the Gladstone Histology and Light Microscopy core. In addition, we thank Pengfei Zhang and the Abate Lab at UCSF for assistance with PDMS stamp fabrication, as well as Nicholas Elder, David Joy, Martin Dominguez, and the rest of the McDevitt and Bruneau labs for their experimental and computational expertise.
Footnotes
Author contributions
Conceptualization: E.A.B., I.M.-V., A.R.G.L., T.C.M., B.G.B.; Methodology: E.A.B., I.M.-V., A.R.G.L., T.C.M., B.G.B.; Software: E.A.B.; Validation: E.A.B.; Formal analysis: E.A.B.; Investigation: E.A.B.; Resources: E.A.B., T.C.M., B.G.B.; Data curation: E.A.B.; Writing - original draft: E.A.B.; Writing - review & editing: E.A.B., I.M.-V., A.R.G.L., T.C.M., B.G.B.; Visualization: E.A.B.; Supervision: E.A.B., T.C.M., B.G.B.; Project administration: E.A.B., T.C.M., B.G.B.; Funding acquisition: E.A.B., T.C.M., B.G.B.
Funding
This work was supported by a grant from the National Heart, Lung, and Blood Institute (R01 HL114948, R01 HL155906 to B.G.B.), the Roddenberry Foundation, Additional Ventures and the Younger Family Fund. E.A.B. was supported by a fellowship from the National Science Foundation Graduate Research Fellowship Program (NSF-GRFP; 2020291111). Deposited in PMC for release after 12 months.
Data availability
snATAC-seq and snRNA-seq data have been deposited in Gene Expression Omnibus under accession number GSE245998. Sequencing analysis scripts are available on Github and can be accessed at https://github.com/ebulger3/TBXT-CDX2-Analysis.
The people behind the papers
This article has an associated ‘The people behind the papers’ interview with some of the authors.
Peer review history
The peer review history is available online at https://journals.biologists.com/dev/lookup/doi/10.1242/dev.202516.reviewer-comments.pdf
References
Competing interests
B.G.B. is a founder, shareholder and advisor of Tenaya Therapeutics and is an advisor for Silver Creek Pharmaceuticals. I.M.-V. is the founder and Chief Executive Officer of Vitra Labs, Inc. T.C.M. is an employee at Genentech and is an advisor for Vitra Labs. The work presented here is not related to the interests of these commercial entities.