Lineage-restricted transcription factors, such as the intestine-specifying factor CDX2, often have dual requirements across developmental time. Embryonic loss of CDX2 triggers homeotic transformation of intestinal fate, whereas adult-onset loss compromises crucial physiological functions but preserves intestinal identity. It is unclear how such diverse requirements are executed across the developmental continuum. Using primary and engineered human tissues, mouse genetics, and a multi-omics approach, we demonstrate that divergent CDX2 loss-of-function phenotypes in embryonic versus adult intestines correspond to divergent CDX2 chromatin-binding profiles in embryonic versus adult stages. CDX2 binds and activates distinct target genes in developing versus adult mouse and human intestinal cells. We find that temporal shifts in chromatin accessibility correspond to these context-specific CDX2 activities. Thus, CDX2 is not sufficient to activate a mature intestinal program; rather, CDX2 responds to its environment, targeting stage-specific genes to contribute to either intestinal patterning or mature intestinal function. This study provides insights into the mechanisms through which lineage-specific regulatory factors achieve divergent functions over developmental time.
Lineage-specifying transcription factors are required for a cellular lineage to develop in embryos, and their loss can lead to a selective failure of the respective tissue to develop in mutant embryos. Examples include MYOD1, which establishes muscle development and induces muscle trans-differentiation in fibroblasts (Davis et al., 1987; Tapscott et al., 1988), and PDX1, which is essential for pancreas development (Jonsson et al., 1994; Offield et al., 1996). Study of such transcription factors is informative during early development, when tissues are specified; however, the same factors often have additional roles in subsequent tissue maturation, organogenesis and adult homeostasis. For example, Foxa2 plays a crucial role in liver specification in the embryo, but also regulates bile acid transport in the adult liver (Bochkis et al., 2008; Lee et al., 2005). ChIP-seq of FOXA2 in embryonic and adult hepatocytes revealed distinct sets of target genes in these distinct contexts. Adult enhancer chromatin structure in the liver, as assayed by H3K4me1 ChIP-seq, indicated that FOXA2-bound regions specific to the adult took on an active chromatin structure, whereas sites bound by FOXA2 only in the embryo were occupied by a nucleosome in the adult hepatocytes (Alder et al., 2014). Chromatin-binding activity of the intestine-specifying factor CDX2 has not been explored in distinct developmental contexts.
The homeodomain transcription factor CDX2 is expressed from the onset of intestinal development through adult life. In mice, early deletion of Cdx2 in the visceral endoderm precludes formation of the intestinal epithelium, which undergoes an esophagus-like homeotic conversion (Gao et al., 2009). Squamous epithelial conversion also occurs upon spontaneous loss of CDX2 heterozygosity in mouse development, leading to hamartomas in the colon (Chawengsaksophak et al., 1997; Tamai et al., 1999). Moreover, expression of certain gastric markers is observed in the intestinal epithelium upon early embryonic inactivation of Cdx2 by a Villin-Cre driver (Grainger et al., 2010). In contrast, ablation of Cdx2 in the intestinal epithelium late in fetal or adult life compromises vital digestive function without inducing histological features of the rostral gut (Verzi et al., 2010, 2011). Conversely, rostral gut genes are ectopically expressed when adult intestinal stem cells lacking CDX2 are cultured in the presence of factors conducive to gastric differentiation (Simmini et al., 2014) or upon prolonged loss of CDX2 in a subset of adult intestinal tissues (Hryniuk et al., 2012; Stringer et al., 2012).
The distinct consequences of CDX2 loss at different developmental stages prompt questions about the mechanistic basis of lineage-specifying transcription factor activities: do these factors bind their chromatin targets from the outset or do their transcriptional targets differ along the developmental continuum? How do they interact with chromatin across developmental time? Is expression of a lineage-specific transcription factor sufficient to activate appropriate target genes, even in the absence of extracellular signals? Additionally, although many insights have been garnered from studies in mice, CDX2 function in human tissue specification remains untested.
Here, we employ mouse models and human pluripotent stem cell-derived models, coupled with investigation of chromatin accessibility, to show that CDX2 binds distinct chromatin sites in embryonic and adult intestines; this distinction is conserved in mice and humans. We find that CDX2 is incapable of instructing major shifts in the chromatin to direct its own binding, and is thus not functioning as a pioneer factor in the developing gut. Rather, dynamic chromatin-binding properties of CDX2 correlate with the dynamic transitions in chromatin accessibility that occur in the course of intestine development. Importantly, CDX2 is required to sustain newly accessible chromatin regions in mature tissues. Our identification of distinct CDX2-binding sites and regulatory functions during tissue specification versus in adults may explain why tissue plasticity is only observed upon CDX2 inactivation within a distinct window of embryonic development. These findings support a model in which lineage-specifying transcription factors operate in context-dependent roles, shaped by a dynamic developmental chromatin landscape.
CDX2 is required for human intestinal development
In mouse embryos with conditional deletion of Cdx2 in the developing endoderm, the intestinal epithelium exhibits morphological and molecular characteristics of foregut identity. Depending on the timing of CDX2 loss using different Cre drivers, esophageal, gastric corpus or gastric antral identities are observed, with earlier onset of Cdx2 inactivation producing more anterior (rostral) identity shifts (Gao et al., 2009; Grainger et al., 2010). It is unknown whether CDX2 has a conserved role in human intestine development, and new methods to direct differentiation of human pluripotent stem cells (hPSCs) allow us to ask this question (D'Amour et al., 2005; Finkbeiner et al., 2015; McCracken et al., 2011; Spence et al., 2011).
CDX2 expression is first observed upon specification of the midgut by WNT and FGF signaling (Wells and Melton, 2000). Immunoblotting confirmed absence of CDX2 in human embryonic stem cells (hESCs) or definitive endoderm, but combined activation of WNT (using the GSK3β inhibitor CHIR99021 to stabilize β-catenin) and FGF (using recombinant FGF4) signaling triggered robust CDX2 expression, which was sustained in three-dimensional human intestinal organoid (HIO) cultures months later (Fig. 1A-C). To determine if CDX2 is required to induce the intestinal lineage, we used CRISPR/Cas9 to engineer CDX2-null hESCs in the H9 line and differentiated the cells into HIOs. Sequencing confirmed mutation of CDX2, and CDX2 immunoreactivity was lost in mutant cells directed toward intestine differentiation (Fig. 1D, Fig. S1). Whereas control hESCs gave rise to tissues that expressed CDX2 (Fig. 1D) and the intestine-restricted markers AMY2B, MUC2 and DPP4 (Fig. 1E), CDX2-null hESCs failed to develop tissues with these characteristic markers. Instead, they gave rise to organoids expressing transcripts and proteins of CDX2-negative foregut lineages, including stomach, esophagus, pancreas and gall bladder (Fig. 1D,E). These findings are consistent with mouse models in which CDX2 loss in the developing epithelium leads to the development of a tissue with foregut properties (Gao et al., 2009). In contrast to the mouse genetic models, the HIO system contains both epithelial and mesenchymal components, and CDX2 expression is lost in both tissues in this model. Still, the foregut features that develop in CDX2-negative human organoids are consistent with a conserved role for CDX2 in initiating human intestine development, and underscore the importance of pursuing the mechanisms through which CDX2 imparts intestinal fate.
CDX2 regulates distinct transcriptional targets across human intestinal development
To explore these mechanisms, we performed ChIP-seq on biological replicates (Fig. S2A) of intestine-specified endoderm (herein referred to as ‘midgut’), using the differentiation protocol outlined above (Fig. 1A), which has been demonstrated to generate intestine with duodenal properties (Tsai et al., 2017). We compared CDX2 ChIP-seq in these midgut tissues with ChIP-seq on primary adult human duodenal epithelium (herein referred to as ‘adult’). We used Model-based Analysis of ChIP-Seq (MACS; Zhang et al., 2008) to identify ChIP-seq binding sites and k-means clustering to define genomic regions with differential CDX2 occupancy (Fig. 2A,B). Ontology analysis of genes linked to CDX2-binding sites revealed that midgut-enriched CDX2 binding occurred near genes that were enriched for functions in development, patterning and morphogenesis, whereas genes near adult-enriched CDX2-binding sites were associated with adult intestinal physiological functions (Fig. 2C). DNA sequence motifs enriched at midgut- and adult-enriched regions were also distinct, suggesting that CDX2 works in different regulatory complexes in embryos and adults. The CDX2 motif was strongly enriched in both groups of binding regions, but whereas midgut-enriched sites were enriched in NANOG and SOX15 motifs, consistent with a role for SOX factors supporting endoderm development (Kanai-Azuma et al., 2002), adult-enriched regions were most enriched in HNF4A motifs (Fig. 2D, Table S1), consistent with the function of HNF4 in the mature intestine (Babeu et al., 2009; Cattin et al., 2009; San Roman et al., 2015; Stegmann et al., 2006). Examples of midgut-enriched target genes included the HOXB cluster genes (Fig. 2E, Fig. S2B) known for their roles in tissue patterning and axial elongation (Young et al., 2009), whereas adult-enriched target genes included the enterocyte differentiation marker ALPI and the differentiation-promoting transcription factor gene HNF4A (Fig. 2E, Fig. S2C). 
These results reveal that human CDX2 binds distinct genomic regions in a stage-specific manner and regulates stage-specific genes.
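The clustering step described above can be illustrated with a minimal sketch. The example runs a plain Lloyd's k-means on toy per-region signal pairs (midgut signal, adult signal). The values, the `kmeans` helper and the farthest-first initialization are illustrative assumptions, not the pipeline or parameters used in this study.

```python
import math

def kmeans(points, k, n_iter=100):
    """Lloyd's k-means on equal-length numeric tuples (deterministic init)."""
    # Farthest-first initialization: start from the first point, then
    # repeatedly add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(n_iter):
        # Assignment step: each region joins its nearest centroid.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                updated.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                updated.append(centroids[i])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Hypothetical normalized CDX2 ChIP-seq signal per region: (midgut, adult).
signal = [(9.0, 1.0), (10.0, 1.5), (8.5, 0.5),
          (1.0, 9.0), (0.5, 10.0), (1.5, 8.0)]
labels, centroids = kmeans(signal, k=2)

# Name each cluster by the stage with the higher mean signal.
names = ["midgut-enriched" if c[0] > c[1] else "adult-enriched"
         for c in centroids]
region_classes = [names[lab] for lab in labels]
```

In this toy case the first three regions fall into the midgut-enriched cluster and the last three into the adult-enriched cluster; real signal matrices are far larger and noisier, so the choice of k and the normalization would need to be justified separately.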
Both WNT and FGF signals are required to drive robust differentiation of human endoderm into midgut (Spence et al., 2011) and activation of these signals induces CDX2 expression (Fig. 1A). It is unclear whether CDX2 expression is sufficient for its binding at midgut regulatory regions or whether CDX2 also requires other sequelae of dual signaling pathway activation to bind to its midgut-enriched binding regions. To address this question, we engineered hESCs for doxycycline-inducible CDX2 expression. ChIP-seq after forced CDX2 expression in our ESC-derived endoderm cells yielded more than twice as many binding sites as were present in CHIR- and FGF4-treated endoderm (135,194 sites compared with 58,981 sites, Fig. 2F), suggesting that many of these sites are off-target from the endogenous intestinal program driven by WNT and FGF. Both groups of regions were enriched in the CDX2 consensus motif (Fig. 2F, Table S2), consistent with bona fide CDX2 binding to these regions. Surprisingly, despite robust expression (Fig. S2E) and occupancy of significantly more genomic sites, ectopic CDX2 failed to bind more than half of the midgut-enriched targets occupied in the presence of CHIR and FGF4 (Fig. 2F), even though these regions exhibited a strong presence of consensus CDX2-binding sequences (HOMER, P=1e−12760; Fig. 2F, Table S2). Thus, although CDX2 expression is necessary for specification of human intestine (Fig. 1C,D), CDX2 cannot reach its full complement of midgut chromatin targets in the absence of WNT- and FGF-induced alterations to regulatory mechanisms governing CDX2 binding.
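The comparison of binding-site sets described above amounts to asking what fraction of one peak set is overlapped by another. The sketch below shows that calculation on toy coordinates. The function names, the half-open interval convention and all coordinates are assumptions made for illustration; they are not the tools or data used in this study, and dedicated interval software would normally be used for genome-scale peak sets.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end

def fraction_bound(reference_peaks, query_peaks):
    """Fraction of reference peaks overlapped by at least one query peak."""
    if not reference_peaks:
        return 0.0
    # Group query peaks by chromosome so only same-chromosome pairs are compared.
    by_chrom = {}
    for chrom, start, end in query_peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = sum(
        any(overlaps(start, end, q_start, q_end)
            for q_start, q_end in by_chrom.get(chrom, []))
        for chrom, start, end in reference_peaks
    )
    return hits / len(reference_peaks)

# Toy peak sets (chromosome, start, end); coordinates are invented.
midgut_sites = [("chr1", 100, 300), ("chr1", 1000, 1200),
                ("chr2", 50, 250), ("chr2", 900, 1100)]
forced_cdx2_sites = [("chr1", 250, 400), ("chr2", 2000, 2200),
                     ("chr3", 10, 200)]

shared = fraction_bound(midgut_sites, forced_cdx2_sites)
```

Here only one of the four toy midgut sites is overlapped by a forced-expression site, so `shared` is 0.25. The quadratic scan is kept for readability; sorted or indexed intervals would be preferable for tens of thousands of peaks.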
Dynamic CDX2 activity across intestinal development is conserved in mice
To refine and corroborate this model of CDX2 binding and activity in vivo, we turned to mice. First, we identified CDX2-binding sites in intestinal cells isolated at embryonic days (E) 13.5, 16.5 and 17.5, and from adult intestines. k-means clustering identified four major groups: 4123 genomic regions were bound more robustly in embryos, 12,551 sites were more specific to the adult (Fig. 3A,B), and the remaining sites were bound similarly at all stages (1791 sites with robust CDX2 binding and 15,368 sites with modest signals; clusters 1 and 4 in Fig. S3A and Table S3). CDX2 binding showed a notable transition late in gestation: binding at embryo-enriched sites declined by E16.5, as occupancy of adult-enriched regions became evident (Fig. 3A). To test stage-specific binding functions in gene regulation, we purified control and CDX2-depleted epithelium after embryonic knockout (Shh-Cre; Cdx2f/f, collected at E12.5) or adult knockout (Villin-CreERT2; Cdx2f/f) and used RNA-seq to identify CDX2-dependent transcripts. Consistent with CDX2's activity as a direct transcriptional activator, genes near CDX2-binding sites showed reduced expression upon CDX2 loss at the corresponding developmental stages (Fig. S3B). Developmental stage-specific CDX2-binding regions also correlated with stage-specific gene expression, as genes linked to adult-enriched binding sites were expressed more robustly in adult than in the developing intestine, and the converse was true for embryo-enriched binding targets (Fig. 3C). Thus, stage-specific CDX2 binding predicts nearby gene expression in each context.
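The reasoning that a direct activator's targets should fall in expression when the activator is removed can be made concrete with a small worked example. The sketch below computes log2 fold changes (knockout over control) for a set of hypothetical target genes and summarizes them by their median. The gene names, expression values and pseudocount are invented for illustration and do not come from the RNA-seq data in this study.

```python
import math
from statistics import median

def log2_fold_changes(control, knockout, pseudocount=1.0):
    """log2(knockout / control) per gene, with a pseudocount to avoid log(0)."""
    return {
        gene: math.log2((knockout[gene] + pseudocount) /
                        (control[gene] + pseudocount))
        for gene in control
        if gene in knockout
    }

# Hypothetical normalized expression for genes near CDX2-binding sites.
control_expr = {"geneA": 100.0, "geneB": 50.0, "geneC": 20.0}
knockout_expr = {"geneA": 24.0, "geneB": 12.0, "geneC": 20.0}

lfc = log2_fold_changes(control_expr, knockout_expr)
median_lfc = median(lfc.values())

# A negative median indicates that the target set is, on balance,
# reduced in the knockout relative to control.
targets_reduced = median_lfc < 0
```

A negative median log2 fold change across the target set is the pattern expected for an activator; in practice this would be compared against genes without nearby binding sites and assessed with an appropriate statistical test.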
Transcription factor motifs enriched at stage-specific enhancers also suggest dynamic regulatory activity. Although CDX2-binding motifs were enriched in both adult- and embryo-enriched binding sites, other motifs were selectively enriched. For example, embryo-enriched CDX2 binding was associated with HOX and FOXA motifs, whereas adult-enriched binding sites were associated with HNF4, FRA1 and KLF5 motifs (Fig. S3C, Table S3). This divergence points to different transcriptional regulatory networks in the embryonic gut, where developmental competence and tissue patterning predominate (Wang et al., 2015), versus in the adult intestine, where cell identity is fixed and CDX2 and partner factors control physiological functions (Verzi et al., 2010, 2011). Indeed, ontology analysis indicated that genes near embryo-enriched binding sites associate with developmental processes, such as liver development and regulation of stem cell differentiation, whereas genes near adult-enriched regions associate with mature intestinal functions, including lipid metabolism and brush border localization (Fig. 3D). Condition-specific binding at developmental genes (HoxB, Fzd10) or those expressed exclusively in the mature tissue (Alpi, Sis) illustrates these findings (Fig. 3E,F, Fig. S3D,E). Together, these results reveal that CDX2 has distinct genomic targets at different developmental stages, regulates target genes in a stage-dependent manner, and likely partners with different factors in each context to accomplish stage-dependent gene regulation. These results generally parallel observations in the human model (Fig. 2), suggesting that the dynamic functions of CDX2 across intestinal development are conserved between these mammalian species.
Temporal dissection of Cdx2 requirements for intestinal identity
We next sought to correlate our observed changes in CDX2's genomic occupancy with observations of homeotic conversion phenotypes upon CDX2 loss. CDX2 loss triggers profound homeotic transformation of the intestine to esophagus when it occurs at E8.5 (FoxA3-Cre; Gao et al., 2009). When Cdx2 is deleted at E13.5, however, certain stomach-specific genes are ectopically expressed without overt squamous cell differentiation (Grainger et al., 2010).
We examined the timing of CDX2 dependency more closely, with additional time points of CDX2 inactivation. Mouse endoderm engineered to delete Cdx2 at ∼E9.5 using the Shh-Cre driver (Harfe et al., 2004) exhibited stomach characteristics anteriorly in the jejunum [ATP4B and foveolar Periodic acid-Schiff (PAS) staining] and stratified squamous esophageal features posteriorly (Fig. 4A, Fig. S4A). Thus, loss of CDX2 ∼E9.5 results in a homeotic phenotype of intermediate severity compared with when CDX2 is lost earlier (E8.5; Gao et al., 2009) or later (E13.5; Grainger et al., 2010). In contrast, we observed no evidence of transformation when we inactivated Cdx2 at later embryonic stages by tamoxifen treatment of dams carrying Villin-CreERT2; Cdx2f/f embryos at E13.5 or E15.5 (Fig. 4A; see also figure 4A of Banerjee et al., 2018). Thus, these late-stage embryos, like the adult epithelium upon CDX2 loss (Verzi et al., 2011), retain features of intestinal identity. RNA-seq analysis of these mutants provides a more global perspective of lineage-specific gene expression upon CDX2 loss at distinct developmental time points. Consistent with our histological analysis, early loss of CDX2 via Shh-Cre triggered a substantial increase in transcripts normally enriched in the hind stomach, though not to levels typically found in the hind stomach (see figure 4C in Banerjee et al., 2018). Elevated expression of Sox2 transcripts in the Shh-Cre; Cdx2f/f mutant exemplifies these findings, as Sox2 transcripts in E12 epithelium increase upon loss of CDX2, but are less robustly detected, or undetectable, upon CDX2 loss at later developmental or adult stages, respectively (Fig. S4C). 
Whereas rostral transcripts were elevated upon CDX2 loss at the earliest time points examined, transcripts normally enriched in the intestinal epithelium were reduced to levels similar to those found in the hind stomach, though not to levels as low as those of intestinal genes in esophagus samples (see supplementary figure 4B in Banerjee et al., 2018). It is interesting to note that whereas the ileal-esophageal transition observed upon Shh-Cre-driven CDX2 loss is accompanied by squamous histology, the gastric characteristics present in the jejunal epithelium of this model exist within an undulating, villus-like epithelial morphology.
Taken together, our analysis of CDX2 inactivation in the intestine at different developmental time points, along with previously published Cdx2 mouse knockout studies, suggests that a critical window for intestinal plasticity exists prior to E13.5. After this E13.5 time point, intestinal identity is stabilized and less susceptible to transformation to anterior structures (Fig. 4B). These temporal phenotypes roughly correspond to the dynamic CDX2-binding patterns we observed in ChIP-seq assays of early versus late embryonic intestinal epithelia (Fig. 3A) and are consistent with observations that CDX2 target genes partition into functions associated more with developmental processes (embryo-enriched sites) versus mature intestinal physiological functions (adult-enriched sites, Fig. 3D).
CDX2 follows, then stabilizes, a developmentally dynamic chromatin landscape
CDX2 expression is nearly constant from the onset of intestine development through adult life in both humans (Fig. 1B,C) and mice (Fig. 5A,B). To resolve why CDX2 could function in different capacities despite relatively consistent levels of expression (Figs 2, 3), we examined the possibility that a dynamic chromatin environment could underlie stage-specific CDX2 genomic binding. Chromatin accessibility was measured at all embryonic and adult binding sites using ATAC-seq (assay for transposase-accessible chromatin using sequencing) at E11.5, E14.5, E16.5, E18.5, postnatal day (P) 1 and adult. We observed a clear pattern at CDX2-bound regions, with embryo-enriched sites losing accessible chromatin after E14.5, and adult-enriched sites gaining chromatin access thereafter (Fig. 5C,D). Four features are notable in this regard. First, 74% of genomic regions with strong adult-enriched CDX2 binding showed poor accessibility in early embryos, lacking ATAC signals at E11.5 (P<10⁻⁵; Fig. 5C). Second, open chromatin emerged at these sites in adults (Fig. 5C), where CDX2 binding was clearly absent in the embryo (Fig. 3A); the converse was apparent for the embryo-enriched CDX2-binding sites. Third, chromatin dynamics were similar for the active histone mark H3K27ac (Kazakevych et al., 2017), with stage-specific enhancer marks corresponding to stage-specific CDX2-binding regions (Fig. S5A,B). Fourth, stage-specific binding in mouse endoderm resembled the profound shift in binding between human ESC-derived gut endoderm and adult human intestine (Fig. 2A). Thus, despite constant expression across time, CDX2 binding closely follows a temporal wave of accessible chromatin from embryo-enriched to adult-enriched sites. Examples of these findings are illustrated at genes selectively expressed in developing or adult intestine (Fig. 5E,F).
The contemporaneous transitions we observed after E14.5 in chromatin accessibility (Fig. 5C,D), CDX2 binding (Fig. 3A) and histone marks (Fig. S5A,B) coincide roughly with the restricted temporal window in which Cdx2 loss is permissive to ectopic expression of more rostral gut features in the intestine (Fig. 4A,B). Importantly, although CDX2 cannot access adult-enriched target sites in the developmental context (Figs 2A and 3A), ATAC-seq in Shh-Cre; Cdx2f/f embryos at E16.5 revealed diminished chromatin accessibility at the vast majority of regions occupied by CDX2 at that developmental time point (Fig. S5C,D). Reduced chromatin access was specific to CDX2-bound regions, as chromatin accessibility at all annotated promoters was unchanged (Fig. S5D,E). Together, these data reveal that although chromatin accessibility directs CDX2 binding during endoderm development, CDX2 (likely in partnership with other activators) is required to sustain access at regions associated with adult intestinal genes. This finding is consistent with CDX2 control of enhancer chromatin structures in mature mice (Verzi et al., 2013). Thus, CDX2 acts in distinct developmental contexts, constrained in part by temporal chromatin landscapes, to promote tissue specification in embryos and essential digestive functions in adults.
Pioneer factors help shape accessible chromatin (Iwafuchi-Doi and Zaret, 2016), and developmental signaling pathways and partner transcription factors prime chromatin structure to create a state of ‘developmental competence’ (Wang et al., 2015). Our findings fit within this model, in that CDX2 is unable to bind to selected genomic regions outside of a proper developmental context. Our findings demonstrate that CDX2 is re-purposed across the transitions in developmental competence: first CDX2 binds genes associated with developmental and specification functions, and subsequently CDX2 binds genes that promote the mature differentiated state in adults. The correlation between chromatin accessibility (Fig. 5), active histone modifications (Fig. S5), and CDX2 binding suggests that the chromatin environment is likely a crucial factor in dictating condition-specific binding of CDX2. Our experiments with forced CDX2 expression in hESC-derived endoderm are similar to our findings in the mouse, with the observation that CDX2 is not sufficient to access many of its primary midgut-specific targets independently of the proper signaling and/or chromatin environment. These findings are consistent with a previous study in which CDX2 was ectopically expressed in ESCs, endoderm, or motor neuron progenitors. In these distinct contexts, CDX2 could access some binding sites that are common to each tissue, but depends on accessible chromatin at sites that are specifically occupied in only one of these three lineages (Mahony et al., 2014). In addition to regulation of chromatin accessibility, post-translational modification of CDX2, expression of transcriptional co-regulators, or other regulatory mechanisms likely function downstream of WNT and FGF signaling to create an environment in which CDX2 can access its midgut target sites and impart intestinal identity. 
A similar set of regulatory mechanisms could govern the transition of CDX2 from embryo-enriched binding sites to adult-enriched binding sites (Figs 2, 3) across crucial developmental transitions (Fig. 4B). DNA-binding motif analyses (Figs 2, 3) suggest that CDX2 has different binding partners in embryonic versus mature intestine, and that these partners are likely to be at least partially conserved, with HNF4 and AP-1 factor motifs enriched at CDX2-binding sites in mature intestines of both mice and humans. These findings should be corroborated by testing the roles of these candidate partners in genetic models and by assaying for direct protein-protein interactions. CDX2 binding during intestinal specification is likely to rely upon pioneer factors, and motif analysis points to FoxA, a known pioneer factor (McPherson et al., 1993), as a likely candidate (Fig. S3C). As-yet-unidentified chromatin regulators likely generate the corresponding competence for CDX2 occupancy and intestinal maturation as the intestinal chromatin landscape is altered in late gestation (Fig. 5C). Expression of these chromatin regulators could even be under the control of, or partner with, CDX2.
Our observations that CDX2 chromatin binding and chromatin accessibility shift during development provide new insights into the molecular basis of intestinal maturation, and indicate that mere expression of lineage regulators is insufficient to achieve desired tissue fates, at least for CDX2 in the intestine. Interestingly, expression of CDX2 in ESCs is both required (Strumpf et al., 2005) and sufficient (Niwa et al., 2005) to induce trophectoderm fate. CDX2 expression activates trophectoderm target genes and indirectly represses activity of pluripotency transcription factors, as CDX2 binding is not observed near genes regulated by pluripotency transcription factors, in spite of the observation that the pluripotency factors no longer bind chromatin robustly after expression of CDX2 (Nishiyama et al., 2009). One potential mechanism of trophectoderm lineage induction by CDX2 is through a direct interaction with Oct3/4 (POU5F1), as measured by co-immunoprecipitation and fluorescence resonance energy transfer (Niwa et al., 2005). The trophectoderm and intestinal systems offer an interesting contrast in the ability of one transcription factor to induce lineage differentiation, and it will be intriguing to define potential protein interactors of CDX2 in the embryonic intestine to identify potential negative regulatory interactions akin to what is observed in trophectoderm development.
In models of Barrett's esophagus, a predisposing condition of esophageal adenocarcinoma, CDX2 is unable to induce intestinal metaplasia simply by its ectopic expression in the adult squamous foregut via a keratin 14 promoter (Kong et al., 2011). Intestinal metaplasias are only triggered upon CDX2 expression within the proper cellular context – either in the fetal stomach, or in a subset of junctional cells at the border of the squamous and columnar epithelium in the stomach (Jiang et al., 2017; Silberg et al., 2002). The inability of CDX2 to induce Barrett's metaplasia in most adult contexts is consistent with the observations we report. CDX2 only seems to influence tissue patterning at early developmental stages, corresponding to stages in which chromatin accessibility is more broadly similar across all regions of the developing gastrointestinal tract (Banerjee et al., 2018). After the epigenome is re-structured late in gestation, CDX2 accesses adult-specific binding regions, but its loss no longer leads to ectopic expression of foregut markers. Consistent with the observation that the intestinal chromatin landscape shifts late in development, loss of CDX2 in the adult mouse epithelium leads to a fetus-like transcriptome (Mustata et al., 2013), but not the early embryonic state permissive to homeotic transformation. CDX2 loss in early embryos permits such transformation in both mice and humans (Gao et al., 2009; Grainger et al., 2010; Fig. 4A,B), likely because a permissive chromatin environment is present for a limited time. It will be interesting to define how chromatin accessibility is altered in instances of homeotic transformation upon prolonged loss of CDX2 in adult tissues (Simmini et al., 2014; Stringer et al., 2008). 
Transformation of adult tissues will likely require reversal of repressive chromatin features, such as H3K9me3 domains that require re-structuring, as observed during induced pluripotent stem cell reprogramming (Soufi et al., 2012), or the loss of PRC2 function that is required for expression of alternate lineage genes in the intestine (Saxena et al., 2017). Future studies investigating these epigenomic processes in the context of CDX2 loss in the early embryonic intestine or upon ectopic CDX2 expression leading to Barrett's metaplasia will be important next steps in understanding the mechanisms of lineage plasticity. In addition, defining the factors that craft embryonic chromatin landscapes toward their mature forms will bolster efforts of regenerative medicine to develop these tissues accurately in vitro.
MATERIALS AND METHODS
Shh-Cre mice (Harfe et al., 2004) were purchased from The Jackson Laboratory, and male Shh-Cre; Cdx2f/+ mice were bred to Cdx2f/f female mice to induce Cdx2 knockout (KO) in the developing intestine beginning at ∼E9.5. Cre-negative embryos were used as littermate controls. Villin-CreERT2 (el Marjou et al., 2004) and Cdx2f/f mice (Verzi et al., 2010) were previously described. Wild-type CD1 mice were obtained from Charles River Laboratories. All mouse protocols and experiments were approved by the Rutgers Institutional Animal Care and Use Committee.
Human pluripotent stem cell culture and differentiation, and human adult tissue collection
Human embryonic stem cells (line H9, WiCell Research Institute, NIH registration number 0062) (tested for mycoplasma contamination and validated with STR profiling) were maintained and differentiated into endoderm, midgut and intestinal organoids as published previously (Cruz-Acuña et al., 2017; Tsai et al., 2016, 2017). hESCs were maintained in mTESR1 (Stem Cell Technologies) on hESC qualified Matrigel (Corning) in 6-well dishes. Prior to differentiation, hESCs were passaged as small clumps into 24-well Nunc Cell-Culture Treated multi-well plates with the Nunclon Delta surface treatment. When cultures were approximately 60% confluent, hESC media was replaced with endoderm differentiation media (RPMI1640+100 ng/ml Activin A; R&D Systems) for 3 days supplemented with 0%, 0.2% and 2% HyClone FBS on subsequent days. Following endoderm differentiation, cells were exposed to midgut differentiation media for 4-6 days, which consisted of RPMI1640+2% HyClone FBS+2 µM CHIR99021+500 ng/ml FGF4 (made in-house, as previously described; Leslie et al., 2015). Free-floating midgut spheroids formed on the 4th-6th days, and were subsequently placed in a droplet of Matrigel and were grown/expanded into HIOs for 28-35 days, in HIO media as previously described (Tsai et al., 2016), which consisted of Advanced DMEM/F12 media supplemented with 1× Pen/Strep, 1× B27 (Gibco, Life Technologies), 100 ng/ml EGF, 5% NOGGIN conditioned media (Heijmans et al., 2013) and 5% R-SPONDIN2 conditioned media (Bell et al., 2008).
De-identified duodenal biopsies from human patients were collected with Rutgers IRB approval and cut longitudinally, and epithelial cells were obtained by scraping the luminal surface. Cells were washed twice with PBS, fixed with 1% formaldehyde (15 min at 4°C followed by 30 min at room temperature) and washed twice, and cell pellets were flash frozen for subsequent chromatin pulldown.
Histology, immunofluorescence and immunohistochemistry
Human organoids derived from stem cells (Spence et al., 2011) for 30 days were processed for immunofluorescence using antibodies (CDX2, Bio-Genex, MU392-UC, 1:500; MUC5AC, Abcam, ab79082, 1:500; NKX2.1, ThermoFisher, 8G7G3, 1:50; PDX1, Epitomics, 3470-1, 1:300; SOX2, Santa Cruz Biotechnologies, sc-27603, 1:100). Mouse immunostaining was performed with the following antibodies: CDX2, Cell Signaling, 12306, 1:200; ATP4B, MBL International Corp., D032-3, 1:200; TRP63, Santa Cruz Biotechnologies, sc-8343, 1:500. Human organoids were collected and fixed in 4% paraformaldehyde (PFA) for 1 h, followed by cryoprotection with 30% sucrose overnight. On the next day, tissues were embedded in OCT (Fisher, Tissue-Plus, 4585) and stored at −80°C. Tissues were sectioned at 3 µm and immunofluorescence staining was carried out using standard methods (Spence et al., 2011), with primary antibody incubation overnight and secondary antibody incubation for 1 h on the next day. For midgut immunofluorescence staining, the monolayer of cells was fixed in 4% PFA for 20 min, followed by the same methods as for organoids. Images were taken using an Olympus IX71 microscope.
For mouse immunostaining, intestinal tissue was dissected and fixed overnight in 4% PFA at 4°C, washed with PBS, dehydrated through an ethanol series of increasing concentration and embedded in paraffin. Intestinal sections (5 µm) were cut from paraffin blocks, processed for immunostaining with the indicated primary antibodies, developed using the Vectastain ABC Kit (Vector Laboratories, PK6101) and counterstained with Hematoxylin. An antigen retrieval step of 1 h in 10 mM sodium citrate solution under 15 psi pressure was used for all stains. Slides were incubated with primary antibody overnight at 4°C (CDX2, Cell Signaling, 12306, 1:200; ATP4B, MBL International Corp., D032-3, 1:200; TRP63, Santa Cruz Biotech, sc-8343, 1:500). PAS staining was conducted by treating slides with 0.5% periodic acid and staining with Schiff's Reagent (Alfa Aesar, J612171). Images were taken using a Retiga 1300CCD (Q-Imaging) camera and a Nikon Eclipse E800 microscope with QC-Capture imaging software. Adjustments in contrast and sharpness, when made, were applied to complete figure panels in Adobe Photoshop.
Briefly, RNA isolation was performed using the MagMAX-96 total RNA isolation kit (Ambion, AM1830). The SuperScript VILO cDNA synthesis kit (ThermoFisher, 11754250) was used to make cDNA from 200 ng RNA. cDNA levels were detected using QuantiTect SYBR Green (Qiagen, 608056). Relative gene expression was plotted as arbitrary units, using the following formula: [2^(housekeeping gene Ct − gene Ct)]×10,000.
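The arbitrary-unit formula can be illustrated with a minimal Python sketch. The function name and the Ct values are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
def relative_expression(housekeeping_ct: float, gene_ct: float,
                        scale: float = 10_000) -> float:
    """Arbitrary-unit expression: 2^(housekeeping Ct - gene Ct) x scale."""
    return 2 ** (housekeeping_ct - gene_ct) * scale


# Hypothetical Ct values, for illustration only.
print(relative_expression(housekeeping_ct=18.0, gene_ct=21.0))  # 1250.0
```

A gene whose Ct is three cycles later than the housekeeping gene is therefore reported at 2^−3 of 10,000 units.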
Generating CDX2 hESC knockout cell line using CRISPR/Cas9
The CDX2-KO line was generated using the pCas-Guide-EF1a-GFP plasmid (Blue Heron Biotech, GE100018) carrying a CDX2 sgRNA (targeting sequence CCTCTCAGAGAGCCCCAGCGTGG) directed at the DNA-binding domain in exon 2 of human CDX2. The plasmid was introduced into hESCs using electroporation (NEPA21 Electro-Kinetic Transfection System). Total plasmid DNA (10 µg) was prepared and mixed with 1×10^6 hESCs in transfection medium (Opti-MEM, Gibco, 31985-062) to a total volume of 100 µl and added to an electroporation cuvette (Bulldog Bio, 12358-346). Cells plus DNA were electroporated using the following settings: poring pulse, 175 V; transfer pulse, 20 V. The transfected cells were immediately transferred to one well of a Matrigel-coated 6-well plate with 1.5 ml mTeSR1 medium plus ROCK inhibitor (Reagents Direct, Y27632, 53-B85). After 24 h, GFP-positive cells were sorted by fluorescence-activated cell sorting (FACS) and plated at low density in Matrigel-coated 6-well plates with mTeSR1 medium plus ROCK inhibitor. After 5-7 days of culture, single colonies were manually picked, transferred to individual wells of a 24-well plate and expanded. Clones were screened by DNA sequencing and were differentiated to interrogate CDX2 protein expression by immunofluorescence. DNA-sequencing primer sequences were GCATCCTCCTGCTTCAGTCT and GCAGTTCTCAGCCCTCACTT. Control cells were treated the same way, but were electroporated with a GFP plasmid lacking an sgRNA.
Generating doxycycline-inducible CDX2 expression in hESCs
The pInducer20 vector is an inducible lentiviral vector system, which carries both rtTA3 and neomycin-resistance genes under the UBC promoter as well as a cDNA of interest under the control of a tetracycline-responsive promoter (Meerbrey et al., 2011). pInducer plasmids are available from Addgene (44012). Generation of the pInducer-GFP inducible control cell line has been described elsewhere (Chen et al., 2014). The human CDX2 clone HsCD00045643 was purchased from the Arizona State DNA Plasmid Repository (Dnasu.org), and was introduced into the pInducer20 plasmid using Gateway recombination cloning technology (ThermoFisher Scientific) following standard manufacturer protocols. Lentiviral particles were generated at the University of Michigan Viral Vector Core, and inducible hESC lines were generated as previously described (Tsai et al., 2016). Following antibiotic selection (G418), GFP or CDX2 gene expression was induced by adding 2 µg/ml doxycycline to the culture medium.
Cells at the indicated differentiation stages were collected, homogenized and lysed in lysis buffer. Lysis buffer consisted of 50 mM Tris pH 7.4, 150 mM NaCl, 1% Triton X-100, 1.5 mM MgCl2, 5 mM EGTA, 1% glycerol and protease and phosphatase inhibitors [30 mM sodium pyrophosphate (Na4P2O7), 50 mM sodium fluoride (NaF), 0.1 mM sodium orthovanadate (Na3VO4) and Complete, EDTA-free (Roche, 11873580001)]. Cells were lysed for 30 min at 4°C with rotation and lysates were cleared by centrifugation at 16,000 g. Protein concentrations were determined by Bio-Rad protein assay (Bio-Rad, 500-0006). Samples containing 20 μg of protein lysate in Laemmli sample buffer were separated on a 15% SDS-PAGE gel and transferred (semi-dry) to a nitrocellulose membrane. The membrane was probed with rabbit mAb CDX2 (D11D10) (Cell Signaling Technology, 12306, 1:500), then re-probed with mouse mAb β-actin (Sigma, A1978). HRP-conjugated rabbit or mouse secondary antibodies (Cell Signaling Technology, 7074 or 7076) were used and the membrane was developed using chemiluminescence (Western Blotting Luminol Reagent, sc-2048).
Purification of intestinal epithelial cells from mice
Pregnant dams were sacrificed and dissected embryos were kept in ice-cold PBS. Embryo tail tissue was used for genotyping using KAPA Mouse Genotyping Kits (Kapa Biosystems, KK7352). For E12.5 RNA-seq entire gut tubes (caudal of stomach), and for E16.5 ATAC-seq midguts (caudal stomach to rostral caecum), were dissected out of the body cavity. Intestinal tissues were treated with pre-warmed 0.25% Trypsin for 8-10 min at 37°C on a vortex station, neutralized with 10% FBS, and passed through a 70 µm filter to obtain single cells. Single cells were incubated with phycoerythrin (PE)-conjugated anti-CD326 (EpCam clone G8.8, eBiosciences, 12-5791-81) for ∼30 min on ice. PE-stained cells were then incubated for ∼30 min with magnetic conjugated anti-PE antibody (Miltenyi Biotec Anti-PE MicroBeads, 130-048-801). Subsequently, upon the availability of anti-CD326 (Epcam) magnetic microbeads antibody (Miltenyi Biotec 130-105-958), cells were stained with the anti-Epcam magnetic microbeads antibody for ∼40 min on ice. Stained cells were passed through a column (Miltenyi Biotec, MS Columns, 130-042-201) in a magnetic field to obtain magnetic antibody-conjugated, EpCam-positive epithelial cells. The purity of magnetic cell isolation was compared with FACS-sorted EpCam-positive cells and found to be comparable. Cells were dissolved in Trizol for RNA processing or used immediately for ATAC-seq.
For embryonic CDX2 ChIP-seq (E13.5, 12 embryos pooled per replicate; E16.5, 3 embryos pooled per replicate; and E17.5, 2 embryos pooled per replicate), midguts (caudal stomach to rostral caecum) were cut into small pieces, pooled and fixed. Briefly, tissues were washed twice with PBS and flash frozen for subsequent chromatin pulldown assays. Cell pellets were thawed on ice and dissolved in 3× volume of lysis buffer: 1% SDS, 10 mM EDTA, 50 mM Tris-HCl pH 8.1, Protease Inhibitor 1× (Complete inhibitors, Roche) or mammalian Protearrest (G-Biosciences). Cells were sonicated (∼25-30 min) in a Diagenode Bioruptor to shear chromatin into 200-600 bp fragments. Chromatin was incubated overnight with CDX2 antibody (6 µl, Bethyl A300-691A, lot 1) conjugated to protein A/G beads (15 µl each). Chromatin-bound beads were washed five times with RIPA wash buffer (50 mM HEPES pH 7.6, 1 mM EDTA, 0.7% sodium deoxycholate, 1% NP-40, 0.5 M LiCl) and rinsed once with TE buffer (10 mM Tris, 0.1 mM EDTA). Chromatin-bound beads were re-suspended in reverse crosslinking buffer (1% SDS, 0.1 M NaHCO3) and incubated at 65°C for 6 h to release ChIP DNA. DNA was purified using a PCR purification column (Qiagen) and quantified using Picogreen (Life Technologies). ChIP DNA was used to prepare ChIP-seq libraries using the Rubicon Genomics ThruPLEX DNA-seq Kit (R400427/R400428/R40048); libraries were fragment size selected using Pippin Prep and sequenced on Illumina HiSeq (50 or 75-bp reads; single end; ∼25 M reads).
For ATAC-seq, 25,000-50,000 isolated epithelial cells from midguts (caudal stomach to rostral caecum, up to 2 embryos per sample) were used, as described previously (Buenrostro et al., 2015) with minor modifications. Briefly, cells were centrifuged at 500 g for 5 min, and lysed in ice-cold ATAC-lysis buffer [10 mM Tris-HCl, pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% (v/v) NP-40], followed by centrifugation at 500 g for 10 min at 4°C to isolate nuclear pellets, which were treated in a 50 μl reaction with Nextera Tn5 Transposase (Illumina, FC-121-1030) for 25-30 min at 37°C. The transposed chromatin was purified with the QIAquick PCR Purification Kit (Qiagen) and PCR amplified with high-fidelity 2× PCR Master Mix (New England Biolabs) using a universal forward primer and unique reverse primers in a 50 µl reaction for five cycles. After five cycles, 5 µl of the reaction mix was amplified by qPCR for 20 cycles to determine the optimum number of additional cycles, based on one-quarter of the maximum fluorescence intensity. The PCR-amplified libraries were column purified, fragment size selected using Pippin Prep and sequenced on Illumina HiSeq (50/75-bp reads; single end; ∼25 M reads).
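The cycle-number determination can be sketched as follows. This is a minimal illustration, assuming that the number of additional cycles is taken as the first qPCR cycle at which raw fluorescence reaches one-quarter of the maximum reading; baseline correction and rounding conventions are not specified here, and the function name and readings are hypothetical.

```python
def additional_cycles(fluorescence, fraction=0.25):
    """Return the first qPCR cycle (1-based) reaching `fraction` of the maximum signal.

    `fluorescence` holds one reading per cycle, in cycle order.
    Assumption: raw readings are used without baseline subtraction.
    """
    if not fluorescence:
        raise ValueError("at least one fluorescence reading is required")
    threshold = fraction * max(fluorescence)
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle
    raise ValueError("no reading reached the threshold")


# Hypothetical readings for illustration: maximum 10.0, threshold 2.5.
readings = [0.1, 0.2, 0.5, 1.0, 2.0, 4.0, 8.0, 10.0]
print(additional_cycles(readings))  # 6
```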
For E12.5 RNA-seq, total RNA was extracted from Epcam-enriched epithelial cells (as described above) from gut tissue caudal to the stomach (2-5 embryos pooled per sample) in Trizol using the RNeasy micro kit (Qiagen, 74004). RNA-seq libraries were prepared using the SMARTer-Seq v4 Low Input mRNA library kit (Clontech SMARTer, 634888). Adult RNA data were curated from GSE70766 and GSE98724 (San Roman et al., 2015; Saxena et al., 2017).
Raw sequencing reads (fastq) were quality checked using FastQC (v0.11.3) and aligned to mouse (mm9) or human (hg19) genomes using TopHat2 (v2.1.0, RNA-seq) or bowtie2 (v2.2.6, ChIP- and ATAC-seq) to generate bam files. For ChIP- and ATAC-seq, bam files were merged using samtools merge (0.1.19) for downstream analysis. Bigwigs were generated from bam files using deeptools bamCoverage (v2.4.2; Ramírez et al., 2014), with duplicate reads ignored, RPKM normalization and read extension, for ChIP- and ATAC-seq visualization using Integrative Genomics Viewer (IGV; Robinson et al., 2011).
RNA-seq: Cxb files were generated from bam files using Cuffquant (v2.2.1, frag-bias-correct, multi-read-correct). Normalized expression count tables were generated from cxb files using Cuffnorm (v2.2.1, library-norm-method quartile). Differential gene expression tables were computed from cxb files using Cuffdiff (v2.2.1, --multi-read-correct --frag-bias-correct, --dispersion-method per-condition --library-norm-method quartile), and log2 fold change of (FPKM+1) values was used for analysis.
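For clarity, the pseudocount-adjusted fold change can be written out as a short Python sketch. The function name and FPKM values are hypothetical; the sketch only shows how adding 1 to each FPKM keeps the ratio defined when a gene is not detected in one condition.

```python
import math


def log2_fold_change(fpkm_control: float, fpkm_test: float,
                     pseudocount: float = 1.0) -> float:
    """log2 of (test FPKM + 1) over (control FPKM + 1)."""
    return math.log2((fpkm_test + pseudocount) / (fpkm_control + pseudocount))


# Hypothetical values: a gene undetected in control and at FPKM 3 in the test condition.
print(log2_fold_change(fpkm_control=0.0, fpkm_test=3.0))  # 2.0
```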
For ChIP-seq and ATAC-seq, peaks (bed file) were identified from aligned reads (bam) using MACS (1.4.1; Zhang et al., 2008). Two biological replicate ChIPs were performed for each condition, and replicates were inspected by measuring Pearson correlation coefficients. Genes associated with ChIP peaks were identified using BETA minus (genes within 5 kb of binding sites, using CTCF boundaries to filter peaks around a gene). Enriched motifs at ChIP peaks were identified using HOMER findMotifsGenome.pl (v4.8.3, knownResults; Heinz et al., 2010). Enriched ontologies were identified from genomic regions (bed file) using GREAT analysis (v3.0.0) with the proximal (5 kb upstream and 1 kb downstream) and distal (200 kb) settings (McLean et al., 2010) or using DAVID (Huang da et al., 2009).
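As an illustration of the replicate comparison, the following Python sketch computes a Pearson correlation coefficient between two signal vectors. It assumes that the replicates have been summarized over the same set of genomic bins or peaks; which signal was compared is not specified here, and the values shown are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signal vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-region signal for two replicates.
rep1 = [1.0, 2.0, 3.0, 4.0]
rep2 = [2.0, 4.0, 6.0, 8.0]
print(round(pearson(rep1, rep2), 6))
```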
For comparing open chromatin regions (ATAC-seq) or differential CDX2 binding (ChIP-seq), bigwigs were generated (no normalization) and signal intensities were quantile normalized (v0.4.0, bin_size 50, log transformation) using Haystack (Pinello et al., 2018). k-means-clustered heatmaps were generated from Haystack-normalized bigwigs using computeMatrix and plotHeatmap from the deeptools package (v2.4.2). Genomic regions associated with the desired k-means clusters were extracted from the bed files generated by plotHeatmap to obtain the selected genomic regions (bed files).
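The principle of quantile normalization can be shown with a minimal Python sketch. This is a generic textbook version, not Haystack's implementation: it assumes each sample is a list of binned signal values of equal length, breaks ties by position, and omits the log transformation. All names and values are hypothetical.

```python
def quantile_normalize(samples):
    """Give every sample the same signal distribution (the mean of sorted values per rank)."""
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must cover the same number of bins")
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean across samples at each rank.
    rank_means = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # bin indices from lowest to highest signal
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical binned signal for two samples.
print(quantile_normalize([[5, 2, 3], [4, 1, 9]]))
# [[7.0, 1.5, 3.5], [3.5, 1.5, 7.0]]
```

After normalization, both samples contain the same set of values, so differences between them at a given bin reflect rank rather than overall signal depth.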
To identify condition-specific binding sites in mouse, MACS-called peaks at P-value 10^−3 for E13.5 and at P-values 10^−3 and 10^−5 for adult were concatenated for each biological replicate, yielding a total of 33,834 genomic regions. E13.5 and adult replicate bam files were merged respectively, and quantile-normalized bigwigs were used for k-means clustering (k=4) at these sites. Cluster 2 was identified as CDX2's E13.5-enriched genomic targets (4132 regions) and cluster 3 as CDX2's adult-enriched genomic targets (12,551 regions, genomic coordinates provided in Table S3). For human data, replicate bam files were merged for subsequent analysis. MACS peaks at P-value 10^−10 were called on merged bam files of midgut-specified endoderm (58,981 peaks) and adult (19,752 peaks). A k-means-clustered heatmap was generated using quantile-normalized bigwigs at midgut-specified endoderm (top cluster) and adult sites (bottom cluster). Genes within a 5 kb region of CDX2 summits were identified using BETA minus at midgut-enriched and adult-enriched binding sites, and nearby genes were further used for evaluating gene ontologies (Table S1). For CDX2 ChIP-seq in naive endoderm with doxycycline-induced CDX2, two biological replicates were performed and replicate bam files were merged for subsequent analysis. MACS peaks were called on merged bam files. Sites bound by CDX2 in both the doxycycline-induced condition and midgut-specified endoderm were identified by intersecting peaks called at P-value 10^−10 using intersectBed (23,076 peaks). Condition-enriched sites were identified by subtracting lower-stringency peaks of one condition (P-value 10^−3) from higher-stringency peaks of the second condition (P-value 10^−10). This approach yielded midgut-only peaks (30,501 sites), which were not bound by CDX2 in the Tet-On condition, and Tet-On-enriched peaks (87,373 sites), which were bound upon CDX2 induction without differentiating the cells towards intestine.
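The subtraction used to define condition-enriched sites can be sketched in Python. This is a minimal illustration that assumes a higher-stringency peak is discarded whenever it overlaps any lower-stringency peak of the other condition (whole-peak removal, as opposed to trimming the overlapping bases); the exact subtraction mode is an assumption, and the function name and coordinates are hypothetical.

```python
from collections import defaultdict


def enriched_peaks(high_stringency, low_stringency_other):
    """Keep high-stringency peaks with no overlap to low-stringency peaks of the other condition.

    Peaks are (chrom, start, end) tuples with half-open coordinates, as in bed files.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in low_stringency_other:
        by_chrom[chrom].append((start, end))
    kept = []
    for chrom, start, end in high_stringency:
        overlaps = any(start < o_end and o_start < end
                       for o_start, o_end in by_chrom[chrom])
        if not overlaps:
            kept.append((chrom, start, end))
    return kept


# Hypothetical peaks for illustration only.
midgut_high = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
teton_low = [("chr1", 150, 250), ("chr2", 150, 300)]
print(enriched_peaks(midgut_high, teton_low))
# [('chr1', 500, 600), ('chr2', 50, 150)]
```

Using the permissive threshold for the condition being subtracted makes the resulting set conservative: a site is called condition-enriched only if the other condition shows no peak even at low stringency.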
For ATAC-seq, two biological replicates from each developmental stage were performed, bam files were merged and bigwigs were quantile normalized to compare intensities of open chromatin signal. Embryonic and adult ChIP-seq data (Kazakevych et al., 2017) for the active chromatin marker H3K27ac were analyzed at CDX2's E13.5 and adult genomic targets. Quantile-normalized bigwig files of different replicates were merged using bigWigMerge. To evaluate whether CDX2 is required to maintain open chromatin at regions where it typically binds, we compared ATAC-seq of E16.5 intestinal epithelium with that of CDX2-KO (Shh-cre) epithelium at E16.5. Two biological replicates were performed, bam files were merged and bigwigs were quantile normalized to compare intensities of open chromatin signal at the regions indicated above.
For statistical analysis associated with Fig. 1, data are expressed as the median of each sample set. Each data point in the plots represents an independent biological sample. For organoid experiments, each independent biological sample is comprised of three to five organoids pooled together. All organoid experiments were conducted on at least three independent biological replicates. Unpaired t-tests were carried out with GraphPad Prism 5.0 software. Each data point is presented, with the middle line representing the mean, with error bars representing s.e.m. In all figures, *P<0.05. Gene set enrichment analysis (GSEA) was run using the pre-ranked setting on all genes using FPKM+1 values as described (Subramanian et al., 2005).
We acknowledge members of the Verzi lab and the Rutgers Epigenomics Group for helpful discussions.
Conceptualization: N.K., R.A.S., J.R.S., M.P.V.; Methodology: N.K.; Validation: L.C., J.R.S.; Formal analysis: N.K., Y.-H.T., L.C., A.Z., K.K.B., M.S., S.H., N.H.T., J.X., R.A.S., J.R.S., M.P.V.; Investigation: N.K., Y.-H.T., L.C., A.Z., K.K.B., M.S., S.H., N.H.T., J.X., R.A.S., J.R.S., M.P.V.; Resources: R.A.S., J.R.S.; Data curation: N.K., L.C., A.Z., K.K.B., M.S.; Writing - original draft: N.K., J.R.S., M.P.V.; Writing - review & editing: N.K., L.C., R.A.S., J.R.S., M.P.V.; Visualization: N.K., Y.-H.T., L.C., A.Z., K.K.B., M.S., S.H., N.H.T., J.X., R.A.S., J.R.S., M.P.V.; Supervision: J.X., R.A.S., J.R.S., M.P.V.; Project administration: R.A.S., J.R.S., M.P.V.; Funding acquisition: R.A.S., J.R.S., M.P.V.
This work was supported by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) and the National Institute of Allergy and Infectious Diseases (NIAID) (U01 DK103141 to M.P.V. and J.R.S.); the National Institutes of Health (R01 CA190558 to M.P.V., R01 DK082889 to R.A.S.); the Human Genetics Institute of New Jersey; the Biospecimen Repository Service Shared Resource and Sequencing Facility of the Rutgers Cancer Institute of New Jersey (P30CA072720); and the University of Michigan Center for Gastrointestinal Research (UMCGR) (NIDDK 5P30DK034933). Deposited in PMC for release after 12 months.
The authors declare no competing or financial interests.