ELT-2 is the major regulator of genes involved in differentiation, maintenance and function of C. elegans intestine from the early embryo to mature adult. elt-2 responds to overexpression of the GATA transcription factors END-1 and END-3, which specify the intestine, as well as to overexpression of the two GATA factors that are normally involved in intestinal differentiation, ELT-7 and ELT-2 itself. Little is known about the molecular mechanisms underlying these interactions, how ELT-2 levels are maintained throughout development or how such systems respond to developmental perturbations. Here, we analyse elt-2 gene regulation through transgenic reporter assays, ELT-2 ChIP and characterisation of in vitro DNA-protein interactions. Our results indicate that elt-2 is controlled by three discrete regulatory regions conserved between C. elegans and C. briggsae that span >4 kb of 5′ flanking sequence. These regions are superficially interchangeable but have quantitatively different enhancer properties, and their combined activities indicate inter-region synergies. Their regulatory activity is mediated by a small number of conserved TGATAA sites that are largely interchangeable and interact with different endodermal GATA factors with only modest differences in affinity. The redundant molecular mechanism that forms the elt-2 regulatory network is robust and flexible, as loss of end-3 halves ELT-2 levels in the early embryo but levels fully recover by the time of hatching. When ELT-2 is expressed under the control of end-1 regulatory elements, in addition to its own endogenous promoter, it can replace the complete set of endoderm-specific GATA factors: END-1, END-3, ELT-7 and (the probably non-functional) ELT-4. Thus, in addition to controlling gene expression during differentiation, ELT-2 is capable of specifying the entire C. elegans endoderm.
INTRODUCTION
The C. elegans endoderm provides an experimentally accessible and relatively simple example of a transcriptional network that drives the development of an entire tissue, namely the intestine (reviewed by McGhee, 2013). The core transcription factors have been identified and their functional roles are understood at the level of genetics and cell biology. Current investigations focus on understanding the network at a biochemical level: (1) to define direct interactions between transcription factors and their target genes and (2) to understand how this regulatory network functions quantitatively (Maduro et al., 2015; Nair et al., 2013; Raj et al., 2010).
The entire C. elegans intestine (endoderm or E lineage) is produced from cells that descend from the single E blastomere of the eight-cell embryo (Fig. 1) (Sulston et al., 1983). Endoderm specification occurs when the genes encoding two small GATA-type transcription factors, END-1 and END-3, are transcriptionally activated only in the E blastomere (Maduro et al., 2005; Owraghi et al., 2009). In the current model of the endoderm regulatory network (Fig. 1), END-1 and END-3 activate transcription of the gene encoding the next factor in the endoderm regulatory cascade, the GATA factor ELT-7, at the 2E cell stage (Nair et al., 2013; Sommermann et al., 2010). The gene encoding the final factor in the cascade, the GATA factor ELT-2, is activated slightly later (at the 4E cell stage for most embryos) (Fukushige et al., 1998; Nair et al., 2013; Raj et al., 2010) and remains active into adulthood. It has been proposed that ELT-2 participates in the transcription of every gene expressed in the differentiating and mature intestine (with the likely exception of ribosomal protein genes), binding directly to TGATAA sites in intestinal gene promoters (McGhee et al., 2009, 2007). Loss of elt-2 is completely lethal, whereas loss of elt-7 has no obvious phenotype (Fukushige et al., 1998; McGhee et al., 2007; Sommermann et al., 2010), implying that END-1 and END-3 might interact directly with the elt-2 promoter. In the absence of elt-2, the intestine is malformed but clearly specified and quite well differentiated (Fukushige et al., 1998). Loss of elt-7 exacerbates the elt-2 loss-of-function phenotype (Sommermann et al., 2010), but even the elt-7; elt-2 double-mutant intestine is reasonably well formed, suggesting that END-1 and/or END-3 might also be able to activate early genes of intestinal differentiation. However, most of the direct interactions implied by this network are yet to be demonstrated.
Regulatory network consisting of the four zygotically expressed endoderm-specific GATA-type transcription factors that specify and differentiate the C. elegans early endoderm (E lineage). Time scale (minutes after first cell division at 20°C) is shown on the left. In the centre are images of three early stages of embryogenesis: the 1E, 2E and 4E cells are indicated by white dots. The current model for the roles and regulatory relations between the various transcription factors is shown on the right.
Here, we address several questions. How is transcription of the elt-2 gene controlled? Which of the other endodermal GATA factors participate directly? Are there a small number of crucial cis-acting sites in the elt-2 promoter or are there large numbers of redundant sites, thereby providing possible insights into network behaviour? Do perturbations of the regulatory network persist or do they self-correct? Finally, what is the nature of the extensive redundancy within the endoderm network? Do individual factors have unique properties as proteins, or is it their expression timing that is important? As a partial answer to these last two questions, we show that ELT-2, if expressed under the control of the end-1 promoter in addition to its own promoter, is able to replace the entire set of endoderm-specific GATA factors: END-1, END-3, ELT-7 and (the probably non-functional) ELT-4. Thus, ELT-2 alone can both specify the endoderm and regulate intestine differentiation and maintenance.
RESULTS
Defining the elt-2 upstream regulatory region
Fig. 2A (top) shows anti-ELT-2 antibody staining in staged wild-type embryos: ELT-2 protein is never detected in 1E cell stages, is rarely (<1%) detected in 2E cell stages but is invariably detected by the 4E cell stage. Strong intestine-specific expression continues in later embryonic stages as well as in larvae and adults (8E and 1.5-fold stages are shown in Fig. 2A; adult staining is shown in Fig. S1A) (see also Fukushige et al., 1998). Fig. S1B provides evidence for the specificity of the antibody. As also shown in Fig. 2A (bottom), the native ELT-2 expression pattern is adequately reproduced by a transgenic reporter, in which 5048 bp of elt-2 5′ flanking region is used to drive expression of a nuclear-localised GFP reporter. GFP fluorescence is first detected weakly at the 4E cell stage and much more strongly at the 8E cell stage. This short time lag in reporter expression is consistent with the expected 30 min delay introduced by GFP folding and/or maturation (Iizuka et al., 2011) and, as expected, stronger 4E cell expression of the reporter can be detected by an anti-GFP antibody (data not shown).
Transcriptional regulation of the elt-2 gene. (A) The expression of transgenic reporters accurately reflects the in vivo expression of ELT-2. The top row shows the normal endogenous expression patterns of ELT-2 in early C. elegans embryos, as detected by immunofluorescence using the anti-ELT-2 monoclonal antibody 455-2A4. The middle row shows expression patterns in early to mid-stage embryos of a transgenic nuclear-localised GFP reporter construct driven by the 5048 bp 5′ flanking region of the elt-2 gene. Egg shells are outlined (dashed line). Beneath is a differential interference contrast (DIC) image of an L1 larva, with the fluorescence from the transgenic elt-2::GFP reporter superimposed. Scale bar: 20 μm. In each image, fluorescence signal is adjusted to high contrast to emphasize expression patterns and the lack of non-intestinal expression. (B) Identifying conserved regions in the elt-2 5′ control region by sequence alignments. The dot matrix plot (EMBOSS/dotmatcher, www.ebi.ac.uk/tools/emboss) compares 6 kb upstream of the ATG initiation codon for C. elegans (horizontal axis) and C. briggsae (vertical axis), revealing three blocks of conserved sequences (CR I, CR II and CR III). These conserved blocks are aligned with the genomic locus of the C. elegans elt-2 gene, showing (to scale and from left to right) the upstream elt-4 gene, the apparent ORF C39B10.7 and the elt-2 coding region with the transcriptional start site (TSS) indicated. Also shown are two genomic deletions (ca16 and gk153), TGATAA sites (filled triangles) and WGATAR sites that are not TGATAA (open circles).
To define cis-acting influences on elt-2 transcription, genomic sequences were compared for the 6 kb upstream of the elt-2 genes from C. elegans and the related nematode C. briggsae. A dot matrix comparison (Fig. 2B) detects three conserved regions (CRs): CR I (∼ −0.6 to 0 kb), which contains the transcriptional start site at −499 bp (Kruesi et al., 2013); CR II (∼ −2.2 to −1.5 kb); and CR III (∼ −4.4 to −3.3 kb). Pairwise sequence alignments are shown in Fig. S2. Roughly similar regions of conservation can be detected when the C. elegans elt-2 promoter is aligned with those from C. brenneri and C. remanei, but only CR III can be detected in the more distantly related nematode C. japonica (data not shown).
Previous experiments and existing chromosomal deletions define limits to functional regions within the elt-2 promoter. A 4.3 kb promoter fragment driving a C-terminal ELT-2::GFP fusion (with an unc-54 3′UTR) rescues the otherwise 100% lethal elt-2 (ca15) null mutation [construct pJM86 in Fukushige et al. (1999)]. Deletion ca16, which removes elt-4, and deletion gk153, which removes the distal half of CR III and much of the open reading frame (ORF) C39B10.7 (Fig. 2B), have no measurable effect on brood size, defecation rate or growth rate [Fukushige et al. (2003) and Fig. S3, respectively]. We conclude that: (1) all necessary regulatory information lies within 5 kb upstream of the elt-2 gene and (2) sequences removed by deletion ca16 or gk153 are not required for adequate elt-2 expression.
The three conserved promoter regions contribute synergistically to the initiation and maintenance of elt-2 expression
The results of two opposing deletion series (5′ and 3′) of an elt-2prom::GFP-lacZ reporter construct, assayed in transgenic embryos, are collected in Fig. S4. Reporter expression decreases abruptly as the proximal region of CR III is removed from either direction. These results are consistent with two interpretations: (1) the proximal region of CR III contains a site crucial for elt-2 expression in the embryo or (2) the elt-2 promoter contains multiple distributed sites contributing to activity, the proximal region of CR III being the point where a critical number of these sites has been removed in either deletion series such that overall promoter activity now falls below a threshold.
To distinguish between these two models, CR I, CR II and CR III were fused individually and in combinations to a GFP reporter. Constructs were assessed for ‘initiation’ activity (expression at the 4E to 8E cell stage of transgenic embryos) and ‘maintenance’ activity (expression from the comma stage embryo through the larval and adult stages). When CR III, CR II and CR I are fused directly to each other and to the GFP reporter (Fig. 3A, Construct #1), expression is strong and robust, starting at the 4E cell stage (again, allowing for maturation time lag of GFP) and continuing to adulthood. Within experimental uncertainty, expression levels approximate to those produced by the full unmodified 5 kb promoter assayed with the same reporter, suggesting that all necessary or even influential cis-acting regulatory motifs are contained within the three conserved regions.
Enhancer activities of the three conserved regions identified in the 5′ flanking region of elt-2. (A) The enhancer activity of CR I, CR II and CR III was tested individually and in combinations. Reporter expression patterns from multiple independent transgenic lines are summarised as: +++, similar pattern and intensity as the intact 5 kb promoter; ++ and +, decreasing (∼by half) steps in intensity; −, no detectable reporter expression. (B) Importance of conserved TGATAA sites in CR III and CR I for elt-2 enhancer activity. Mutated TGATAA sites are marked by ‘X’.
CR I in isolation (Fig. 3A, Construct #2) is neither able to initiate reporter expression at the 4E cell stage nor able to maintain expression past hatching but does significantly contribute to expression during the mid-to-late embryonic stages. CR II in isolation (Construct #3) is unable to drive detectable reporter expression at any stage. By contrast, CR III in isolation (Construct #4) is able to drive expression from the earliest initiation phase (4E) and at all subsequent stages into adulthood. CR III (which is perhaps augmented in its activity by basal promoter activity associated with the unannotated ORF C39B10.7; Fig. 2B) thus appears to provide the strongest contribution of the three individual conserved regions to elt-2 transcriptional activity, both in initiation and in maintenance phases. However, when the conserved regions are combined pairwise (Fig. 3A, Constructs #5-7), they show clear synergy. That is, together the conserved regions are able to drive higher levels of reporter expression at more stages than the estimated sum of the activities of the two tested individual conserved regions. Part of the apparent synergy between CR III and CR I could be because CR I provides a basal promoter activity for CR III. When CR III is fused to a non-endodermal basal promoter (containing no TGATAA sites) from the C. elegans heat shock gene (Construct #8), reporter activity is indistinguishable from that of the CR III-CR I combination in Construct #7.
The synergies observed between the three conserved regions appear incompatible with the model in which transcription of elt-2 is controlled by a crucial site situated in the proximal region of CR III but do appear compatible with the alternative interpretation of there being multiple cis-acting motifs distributed throughout the conserved regions that all contribute to promoter activity. Under this model, if the summed contributions of some subclass or combination of these cis-acting sites lie above a threshold, elt-2 is transcribed. The results also provide evidence against a model in which each conserved region contributes solely and uniquely to a spatial subpattern of activity (for example, to elt-2 expression in the anterior or posterior intestine) or is solely responsible for elt-2 expression during a restricted developmental time window (for example, only in embryos or only in L1 larvae).
Conserved TGATAA sites are individually dispensable but collectively crucial for elt-2 control
Of the 30 potential GATA factor binding sites (defined as WGATAR) in the genomic region depicted in Fig. 2B, 22 are TGATAA, the site highly enriched in promoters of intestinal genes (McGhee et al., 2009, 2007; Pauli et al., 2006); this proportion (∼73%) is more than twice that expected from base composition and applies to the overall region as well as to each of the three conserved regions. It would be an overwhelming task to mutate these individual sites combinatorially and comprehensively, especially without a precise quantitative assay. The present analysis is thus limited to mutating four conserved TGATAA sites in CR III and three conserved TGATAA sites plus a conserved AGATAG site in CR I.
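The site census described above can be sketched as a scan of both DNA strands for WGATAR (W = A or T; R = A or G), followed by a tally of the fraction that are TGATAA. This is a minimal sketch, not the procedure used in the study; the function names and the toy sequence are hypothetical, and the real input would be the elt-2 upstream region.

```python
import re

# IUPAC codes: W = A or T, R = A or G. The lookahead allows overlapping matches.
WGATAR = re.compile(r"(?=([AT]GATA[AG]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_gata_sites(seq):
    """Return sorted (leftmost forward coordinate, strand, site) for WGATAR sites."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in WGATAR.finditer(seq)]
    for m in WGATAR.finditer(reverse_complement(seq)):
        # Convert the reverse-strand start into the forward-strand leftmost base.
        left = len(seq) - m.start() - 6
        hits.append((left, "-", m.group(1)))
    return sorted(hits)


def tgataa_fraction(hits):
    """Fraction of WGATAR hits whose sequence is exactly TGATAA."""
    if not hits:
        return 0.0
    return sum(1 for _, _, site in hits if site == "TGATAA") / len(hits)


toy = "CCTGATAAGGTTATCAGGAGATAGCC"  # hypothetical sequence, for illustration only
hits = find_gata_sites(toy)
print(hits)
print(round(tgataa_fraction(hits), 2))

# The proportion quoted in the text: 22 TGATAA sites out of 30 WGATAR sites.
print(round(22 / 30, 2))
```

Because no WGATAR sequence is its own reverse complement, scanning the two strands separately does not count any site twice.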
As noted above, CR III fused to the basal heat shock promoter (Fig. 3A, Construct #8) produces strong expression from the 4E stage to adulthood. However, this activity is completely abolished by mutation of the four conserved TGATAA sites in CR III (Fig. 3B, Construct #9). Likewise, no activity is observed when a quadruply mutated CR III is combined with a quadruply mutated CR I (Fig. 3B, Construct #10). When an unmutated CR I or CR III is fused to a quadruply mutated CR III or CR I, respectively (Fig. 3B, Constructs #11,12), the resulting activity is close to that provided by the unmutated regions assayed in isolation. These results suggest that for CR III (and for CR I with the caveat that one of the mutated sites was AGATAG), TGATAA sites are necessary to provide enhancer activity. Further, the results suggest that there are no non-TGATAA sites that are sufficient for driving elt-2 expression.
Each of the four TGATAA sites in CR III was mutated one at a time and then assayed as a fusion to an unmutated CR I (Fig. 3B, Constructs #13). Expression patterns produced by each of the four constructs are essentially indistinguishable and are similar to that produced by the unmutated CR III-CR I fusion (Construct #7). We conclude that there is no single TGATAA site within CR III that is necessary or obviously distinguished from the others. When two or even three of the four CR III TGATAA sites are mutated (Fig. 3B, Constructs #14, 15), expression remains strong.
END-1, ELT-7 and ELT-2 can bind in vitro to conserved TGATAA sites in the elt-2 promoter
The above results show that CR III appears to be involved in both the initiation and maintenance of elt-2 transcription. Electrophoretic mobility shift experiments show that END-1 (involved in elt-2 initiation), ELT-2 (involved in elt-2 maintenance) and ELT-7 (probably involved in both elt-2 initiation and maintenance) can all bind in vitro to each of the four conserved TGATAA sites in CR III (Fig. 4A). Binding is specific in that it is competed efficiently by unlabelled double-stranded oligonucleotide but is competed much less efficiently by unlabelled oligonucleotides in which the TGATAA sites have been mutated to GTCGAC. Allowing for different specific activities of labelling of the four probes, for each of the three proteins we estimate that binding affinities to the individual TGATAA sites in CR III differ by at most 5- to 10-fold. These biochemical measurements support the general view derived from the previous transgenic experiments, namely that each TGATAA site contributes to overall promoter activity, even though individual sites may differ several fold in their influence.
DNA-protein interactions in the elt-2 control region. Electrophoretic mobility shift assays to show that END-1, ELT-2 and ELT-7 proteins can all bind directly to each of the four conserved TGATAA sites in CR III of the elt-2 promoter. The same set of labelled probes was used for all three proteins, with the coordinates of the individual conserved TGATAA sites shown at the top.
ELT-2 binds directly in vivo to its own promoter and to promoters of intestinal differentiation genes
To test for direct ELT-2 occupancy at its own promoter in vivo, we performed chromatin immunoprecipitation and sequencing (ChIP-Seq) using an antibody specific to C. elegans ELT-2 (as used in Fig. 2A and Fig. S3) and extracts derived from wild-type L3 larvae. Within the elt-2 promoter, three regions of ELT-2 occupancy can be identified that passed our threshold of significance (MACS2 peak scores <10⁻³⁰) and that aligned satisfactorily with CR I, CR II and CR III (Fig. 5, Fig. S5). Thus, the ChIP-Seq results support the view that ELT-2 interacts with all three of the conserved cis-regulatory regions in the elt-2 promoter. A peak of ELT-2 occupancy (below our significance threshold) can be detected upstream of the elt-4 gene, suggesting that ELT-2 might also regulate elt-4. Fig. S6 shows that the expression of an integrated elt-4 promoter::GFP transgenic reporter is largely abolished by elt-2 RNAi.
ELT-2 ChIP-Seq on the elt-2, ges-1 and cpr-6 loci. (A) ELT-2 ChIP-Seq tracks (dark blue) from L3 larval worms are shown on and around the elt-2 gene, with significant MACS2 peaks highlighted above (dark blue bars). ChIP-Seq reads were normalised with respect to read depth and IgG-only controls. The average of three replicates is shown. Individual replicates are shown in Fig. S5. The corresponding RNA-Seq (light blue) results obtained from the same chromatin preparation (whole L3 worms) are shown below. Regions of poor mappability owing to genomic repeats are depicted in the Repeatmasker (www.repeatmasker.org) trace (grey). The location of CR I, CR II and CR III and the occurrences of TGATAA motifs are also shown. (B,C) ELT-2 ChIP-Seq and RNA-Seq tracks at the (B) ges-1 and (C) cpr-6 loci.
An extensive ChIP-Seq analysis of ELT-2 binding to intestinal differentiation genes at different developmental stages will be provided elsewhere (E.O.N., J.D.L. and J.D.M., unpublished). However, the present ChIP-Seq data allow us to illustrate the direct binding of ELT-2 to two previously characterised intestinal differentiation genes, ges-1 (gut esterase) and cpr-6 (cysteine protease). As shown in Fig. 5B, a significant peak of ELT-2 occupancy can be detected 1.1 kb upstream of the ges-1 initiation codon, aligning with the tandem pair of GATA sites (TGATAA and TGATAG) that have previously been shown to be functional in transgenic assays (Egan et al., 1995) and that were originally used to clone elt-2 (Hawkins and McGhee, 1995). In L1 larvae, levels of cpr-6 transcripts decrease ∼100-fold in the absence of ELT-2 (McGhee et al., 2009) and a prominent peak of ELT-2 aligns with a TGATAA site immediately upstream of the cpr-6 coding sequence (Fig. 5C).
ELT-2 levels recover from early perturbation caused by lack of END-3
Both quantitative in situ hybridisation and transgenic reporter assays show that loss of end-3 significantly decreases the level of elt-2 transcripts at the 4E to 8E cell stage (Boeck et al., 2011; Raj et al., 2010). Yet loss of end-3 results in only 5-10% of embryos that lack an intestine (Maduro et al., 2007, 2005). Indeed, we determined that brood sizes in homozygous end-3(-) adults are essentially wild type [244±29 (n=1220) for end-3(-) adults, 253±33 (n=1265) for N2 (±s.d.); P=0.7, t-test]. Thus, either animals can survive and thrive with only a fraction of their normal ELT-2 levels, or the ELT-2 levels recover later in development.
ELT-2 protein levels in 8E cell stage end-3(-) embryos were measured by quantitative immunofluorescence and found to be 50±20% of those in 8E cell stage wild-type embryos (weighted mean±s.d. from three independent immunofluorescence comparisons, n=513 embryos), consistent with previous estimates (Raj et al., 2010; Boeck et al., 2011). Because immunofluorescence of individual embryos during later stages of embryogenesis was too variable, we used quantitative western blotting to measure ELT-2 levels in newly hatched end-3(-) and wild-type L1 larvae (Fig. S7). We first established conditions in which the measured band intensity of a paramyosin UNC-15 control increased by 2.0±0.3-fold when twice the number of animals were loaded on the gel. From band intensities measured with three different loadings of both N2 and end-3(ok1448) L1 larvae (extracts of 500, 1000 and 2000 animals per lane) on each of two independent gels, we estimate that the ELT-2 levels in end-3(-) L1 larvae are 108±23% (±s.d.) of the UNC-15-normalised ELT-2 levels measured in N2 L1 larvae (P=0.4, t-test; thus not significantly different from 100%). Therefore, ELT-2 levels in end-3(-) animals have reattained wild-type levels by the time of hatching.
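The loading-control normalisation described above can be sketched as follows. The band intensities are hypothetical numbers in arbitrary units, and the way lanes are averaged here (mean of per-lane ELT-2/UNC-15 ratios) is an assumption; the text does not specify how the replicate lanes were combined.

```python
from statistics import mean


def normalised_percent(mutant_lanes, wildtype_lanes):
    """Mutant ELT-2 level as a percentage of wild type after UNC-15 normalisation.

    Each lane is an (elt2_intensity, unc15_intensity) pair.
    """
    mutant = mean(elt2 / unc15 for elt2, unc15 in mutant_lanes)
    wildtype = mean(elt2 / unc15 for elt2, unc15 in wildtype_lanes)
    return 100.0 * mutant / wildtype


# Hypothetical intensities for lanes loaded with 500, 1000 and 2000 animals.
n2_lanes = [(10.0, 20.0), (20.0, 40.0), (40.0, 80.0)]
end3_lanes = [(11.0, 20.0), (21.0, 40.0), (44.0, 80.0)]

print(round(normalised_percent(end3_lanes, n2_lanes), 1))
```

Dividing each ELT-2 band by the UNC-15 band from the same lane removes differences in the number of animals loaded, which is why the calibration of the UNC-15 signal against loading matters.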
ELT-2 expressed earlier in development can replace all other endodermal GATA factors
In the course of normal C. elegans development, either END-1 or END-3 is necessary to specify the C. elegans endoderm (Maduro et al., 2015; Owraghi et al., 2009), whereas initial ELT-2 function is restricted to differentiation. We examined whether ELT-2 could replace END-1 and END-3 in specifying the endoderm, simply by being expressed earlier. If so, this would be a striking result, because these endodermal GATA factors have overall amino acid identities and similarities of only 12-24%.
We attempted to rescue an end-1 end-3 double mutant by injecting a construct in which elt-2 cDNA is under the control of either the end-1 or end-3 promoter. Strain MS1248 [end-1(ok558) end-3(ok1448); irEx568 [end-1(+); end-3(+); sur-5::RFP]; Owraghi et al., 2009] was injected so as to install a second extrachromosomal transgenic array containing, for example, end-1prom::elt-2 cDNA, together with elt-2prom::GFP and rol-6(su1006) marker constructs. Successful rescue was indicated by progeny worms that had lost the original rescuing array but had retained the replacement array, thereby producing green non-red rollers. Rescue was judged to be unsuccessful if such segregants could not be detected after several generations. Initial observations showed that both end-3prom::elt-2 cDNA and end-1prom::elt-2 cDNA constructs were able to rescue the end-1 end-3 double mutant. Subsequent experiments were performed with the end-1prom::elt-2 cDNA construct because it gave more efficient rescue. To establish a stable rescuing construct, one particular rescuing array was integrated into a wild-type background, followed by outcrossing to remove extraneous mutations. This strain (JM230 caIs85[end-1prom::elt-2 cDNA; elt-2prom::GFP; pRF4] I) serves as a control for phenotypes that are not connected to the rescuing ability of the transgenic array. The integrated array caIs85 was then introduced into a quadruple homozygous null mutant in end-1, end-3, elt-7 and elt-4.
The final rescued strain (JM229 caIs85; elt-7 end-1 end-3; elt-4) is surprisingly viable, healthy and fertile. JM229 shows only 3±4% embryonic lethality and 12±9% larval lethality (nine broods, 1467 total progeny), corresponding to an overall rescuing rate of ∼85%. Control strain JM230 caIs85 shows 2% and 0% embryonic and larval lethality, respectively, suggesting that the 15% overall lethality observed with JM229 is due to incomplete rescue and not to any unrelated property associated with the rescuing array. The rescued strain JM229 shows roughly the same level of lethality/arrest as seen with an end-3 single mutant: we measured embryonic and larval lethality/arrest in strain RB1331 end-3(ok1448) as 3±3% and 12±7%, respectively (four broods, 912 total progeny); Maduro et al. (2007) previously reported that 5% of end-3(ok1448) embryos lack an intestine. Early embryonic phenotypes of JM229 (20°C) are mild and resemble the incomplete rescue of the end-3 null phenotype described previously (Boeck et al., 2011). Specifically, the 2E cell stage is 2-4 min shorter than in control embryos, the 2E-to-4E cell division occurs closer to the ventral surface of the embryo [10±2 μm for JM229 (n=19); 15±1 μm for the N2 control (n=8); 15±1 μm for control strain JM230 (n=17)], and the division axis tends to be oriented in a more dorsoventral direction than normal. The most severe overall embryonic phenotype of the rescued strain is that the duration of embryogenesis (the time from the 2- to 4-cell stage to hatching) is extended by ∼1 h compared with N2 and with the control strain JM230 (Fig. 6A).
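The overall rescuing rate quoted above follows from simple arithmetic, sketched here under the assumption that both lethality values are percentages of total progeny and can therefore be added. The function name is hypothetical.

```python
def overall_rescue_percent(embryonic_lethality, larval_lethality):
    """Percentage of progeny that survive both embryonic and larval stages."""
    return 100.0 - (embryonic_lethality + larval_lethality)


# Mean lethality values reported in the text (percent of total progeny).
print(overall_rescue_percent(3, 12))  # JM229 rescued quadruple mutant
print(overall_rescue_percent(2, 0))   # JM230 control strain
```

The same calculation applied to the end-3(ok1448) values (3% and 12%) gives the same figure as for JM229, which is the basis for describing the two strains as having roughly the same level of lethality.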
ELT-2 can replace END-1, END-3, ELT-7 and ELT-4. (A) Embryos from the rescued quadruple mutant strain JM229 (black circles) hatch later than embryos from the control strain JM230 (white circles). Other symbols represent hatching curves measured for four different local versions of N2 wild-type worms (including a recent thaw). Time on the x-axis is minutes at 20°C from the 1- to 4-cell stage of embryogenesis. (B) DIC image of an L1 larva from the rescued quadruple mutant strain JM229. Average length of JM229 L1 larvae is 242±20 μm. (C) Fluorescent image of the same larva as in B, showing expression of an elt-2prom::GFP reporter incorporated into the integrated rescuing array caIs85. (D) DIC image of an L1 larva from N2 wild-type control. Average length of N2 L1 larvae is 278±18 μm.
An image of a newly hatched L1 larva of the rescued strain JM229 is shown in Fig. 6B. The number of intestinal nuclei is normal (20.1±0.8, n=31) as counted using the elt-2prom::GFP reporter that is part of the rescuing array (Fig. 6C). JM229 L1 larvae are 13-15% shorter than control L1 larvae [242±20 μm for JM229 (n=31); 284±25 μm for JM230 (n=32); 278±18 μm for N2 (n=30)]. The heads of rescued larvae sometimes appear more rounded than normal (compare the JM229 L1 shown in Fig. 6B with the wild-type L1 shown in Fig. 6D). The overall life cycle (time from 2- to 4-cell stage to first egg lay) is extended (∼84 h for JM229 compared with ∼60 h for the N2 control) but a similar delay is measured with the control strain JM230 and is thus more likely to reflect a property of the rescuing array than incomplete rescue. The most severe post-hatching phenotype is that the brood size is reduced [163±65 (n=9) for JM229; 262±52 (n=5) for N2; 221±28 (n=5) for JM230]. However, with respect to morphology and overall viability, the JM229 phenotypes are remarkably minor.
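The 13-15% length reduction can be checked directly from the mean L1 lengths reported above. The following is a minimal sketch using only those means (the ± spreads are ignored); the helper name is hypothetical.

```python
def percent_shorter(mutant_um, control_um):
    """Relative length reduction of the mutant, as a percentage of the control."""
    return 100 * (1 - mutant_um / control_um)


JM229_L1_UM = 242  # mean L1 length of the rescued strain, in micrometres

# Compare against each control strain's mean L1 length
vs_jm230 = percent_shorter(JM229_L1_UM, 284)
vs_n2 = percent_shorter(JM229_L1_UM, 278)

print(f"vs JM230: {vs_jm230:.1f}%  vs N2: {vs_n2:.1f}%")
```

Both values fall within the quoted range, with the larger reduction measured against JM230 and the smaller against N2.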
Using the same assay, we were unable to rescue the quadruple elt-7 end-1 end-3; elt-4 mutant with a single copy of the end-1prom::elt-2 cDNA construct inserted into the ttTi5605 MosSCI site on chromosome II (Frøkjaer-Jensen et al., 2012). Thus, elt-2 overexpression in the early embryo might be a key feature of its ability to rescue the quadruple mutant. ELT-2 protein can always be detected immunologically in the 2E cell stage of JM229 embryos, which is one cell cycle earlier than it appears in wild-type embryos. However, the majority of JM229 1E cell embryos are ELT-2 negative, suggesting that there could be functional levels of ELT-2 below our detection limit or that endoderm specification can occur at the 2E cell stage. As assayed by immunofluorescence, a 2E cell stage JM229 embryo contains roughly the same amount of ELT-2 protein as found in an 8E to 16E cell stage wild-type embryo (data not shown). In other words, ELT-2 is indeed overexpressed in the earliest embryos of the rescued strain but not exceptionally so.
elt-7 cDNA expressed under the control of the end-1 promoter in a multicopy transgenic array was also able to rescue the quadruple elt-7 end-1 end-3; elt-4 mutant. As expected, this rescue required elt-2 (McGhee et al., 2007; Sommermann et al., 2010). However, not all GATA factors can specify the C. elegans endoderm. We used the end-1 promoter to drive expression of cDNAs of either the C. elegans hypodermis-specific GATA factor ELT-3 (Gilleard et al., 1999) or the mouse endodermal GATA transcription factor GATA4 (Aronson et al., 2014), but neither construct was able to rescue the end-1 end-3 double mutant.
DISCUSSION
The C. elegans elt-2 gene is controlled by three conserved ‘enhancers’ distributed over ∼5 kb of 5′ flanking region. Each of these three regions contributes to the transcriptional activation of elt-2 but the exact contribution depends on which of the other conserved regions are included in the construct. In other words, the three enhancers appear to interact with each other and to contribute synergistically to overall elt-2 activity. At the present level of our analysis, there is no evidence of any particular subpattern of expression (e.g. adult stage only or anterior intestine only) being conveyed by any particular enhancer; rather, they all seem to contribute to overall elt-2 transcriptional activity.
Focusing on two of the elt-2 enhancers (CR I and CR III), we showed that mutation of all conserved TGATAA sites (four in CR III, three in CR I plus an AGATAG site) abolished both enhancer and basal promoter activity when assayed in transgenic reporters. We conclude that these conserved sites are necessary for elt-2 expression and that no other site within the enhancers is sufficient for reporter expression (barring sites that overlap with the mutated TGATAA sites). Thus, we have no evidence that the core developmental control of elt-2 expression is mediated by anything other than conserved TGATAA sites and, by implication, by the known set of endodermal GATA factors: END-1/3 and ELT-2/7. However, we fully expect that there will be other types of intestinal transcription factors and other cis-acting sites that, at least in post-embryonic stages, participate in elt-2 control, maintaining physiological homeostasis and responding to nutritional or environmental signals.
The transcriptional activity of the most active enhancer, CR III, persisted even with only one remaining wild-type TGATAA site, at least qualitatively. Furthermore, we could find no evidence that any individual TGATAA site was functionally distinct from any other; for example, being solely responsible for the initiation of elt-2 transcription or responsive to only one of the several GATA factors present in the early endoderm. This model is supported by the in vitro demonstration that END-1, ELT-7 and ELT-2 could all bind directly to each of the four conserved TGATAA sites within CR III. Even though different sites showed modestly different affinities for the different factors, we were not able to identify a ‘GATA’ site to which one factor could bind but another factor could not (note that END-3 protein was not available).
The above results lead to a model for elt-2 control that is redundant, robust and flexible. Indeed, when ELT-2 levels are halved by loss of end-3 in the early embryo, they are able to recover to wild-type levels by the time that the animals hatch, several hours later. Thus, the endoderm network is capable of dynamic re-equilibration or self-correction during embryogenesis. A different perturbation of the early endoderm regulatory network has recently been shown to lead to increased numbers of intestinal nuclei in the adult (Maduro et al., 2015). It will be important to determine if such adult phenotypes are due to a persistent perturbation of the transcriptional network and its downstream biochemical pathways or are rather due to some irreversible early cellular defect, such as aberrant cell division in the early embryo.
An unexpected result of the present study is the demonstration that ELT-2, when expressed under a transgenic end-1 promoter as well as under its own endogenous promoter, is able to replace, essentially completely, the other four GATA factors involved in development of the C. elegans endoderm, namely END-1, END-3, ELT-7 and (for completeness) ELT-4. Rescue is highly efficient using a multiple copy transgene of end-1prom::elt-2 cDNA, but is not perfect. The strain has low levels of embryonic/larval lethality, slightly perturbed gastrulation, marginally slower embryonic development and a lower brood size. However, overall, we regard these phenotypes as remarkably modest considering the extensive rearrangement of the core transcriptional network. We were unable to achieve rescue of the quadruple elt-7 end-1 end-3; elt-4 mutant using a single integrated copy of the end-1prom::elt-2 cDNA transgene. This failure could possibly reflect a position effect, although single-copy insertions of end-1 into this same genomic locus appear to function well (Maduro et al., 2015). Alternatively, perhaps the larger ELT-2 molecule takes longer to be produced than the smaller END-1 and/or END-3 molecules (see below), or perhaps ELT-2 binds less optimally to the early endoderm-specifying genes that are the normal targets of END-1/END-3. Both of these inefficiencies might require that ELT-2 is expressed at higher levels in order to compensate.
Regulatory pathways are thought to evolve in a retrograde manner, with genes expressed late brought under control of factors expressed earlier, which in turn are brought under the control of factors expressed even earlier (Wilkins, 1995). Thus, genes encoding intestinal digestive enzymes or intestinal structural proteins might originally have been under the sole control of ELT-2. ELT-2 might subsequently have come under control of the earlier expressed end-1 and end-3, with the redundant activation of elt-2 by elt-7 introduced as an intermediate step or as a later intercalation.
In light of the demonstration that ELT-2 can perform all necessary functions of END-1, END-3 and ELT-7, why did the C. elegans endoderm pathway evolve as it did and not remain with only ELT-2 as the transcriptional activator of the genes performing specification, differentiation, growth and intestinal maintenance? Three reasons come to mind. The first reason is possible selection for greater fidelity of endoderm development controlled by a redundant pathway (Cooke et al., 1997). A second possible reason is to separate elt-2 control from the influence of skn-1 and pop-1, the two maternal-effect genes that activate end-1 and end-3 in the E blastomere. Both skn-1 and pop-1 also function zygotically within the differentiating and mature intestine and the transient expression of end-1 and end-3 would free ELT-2 from being controlled by skn-1 and/or pop-1 throughout the animal's lifespan. A third reason could be that the use of the smaller end-1 and end-3 genes allows more rapid transcription and translation and hence more rapid specification of the E cell than if specification depended on elt-2. The differences are greatest when comparing end-3 and elt-2. The transcript lengths are 1276 and 2344 nucleotides, respectively; protein sizes are 242 and 433 amino acids, respectively. Cell cycle times in the early C. elegans embryo are only 15-20 min, perhaps providing sufficient time for end-3 but not elt-2 to be transcribed and translated. In the early Drosophila embryo, cell cycle times are even shorter and it has been shown that the transcription of long genes is aborted by the intervention of mitosis (Shermoen and O'Farrell, 1991). However, elt-2 (∼2.3 kb) is much shorter than Ubx (77 kb), the Drosophila gene whose transcription is aborted by mitosis, and to our knowledge it has not yet been shown that similar mitosis-aborted transcription occurs in early C. elegans development.
Moreover, our rescuing transgene is based on elt-2 cDNA, which is roughly the same size as the end-3 transcript, suggesting that perhaps it is translation that is limiting, not transcription. In any event and independent of any particular explanation, the quadruple mutant elt-7(-) end-1(-) end-3(-); elt-4(-) embryos rescued by the end-1prom::elt-2 cDNA transgene hatch ∼1 h later than wild-type embryos, a potentially huge fitness disadvantage.
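The timing argument above can be made concrete with a back-of-the-envelope estimate. The sketch below is illustrative only: it assumes a single constant RNA polymerase II elongation rate (set here to 1.5 kb/min, a placeholder that was not measured in this study and is not established for early C. elegans embryos) and ignores initiation, processing, export and translation. The transcript lengths and the 15 min lower bound on cell cycle time are taken from the text.

```python
# ASSUMED illustrative elongation rate in kb per minute; not a measured value
ASSUMED_ELONGATION_KB_PER_MIN = 1.5

# Lower bound of the early embryonic cell cycle time quoted in the text (minutes)
CELL_CYCLE_MIN = 15


def minutes_to_transcribe(length_nt, rate_kb_per_min=ASSUMED_ELONGATION_KB_PER_MIN):
    """Elongation time only, for a transcript of the given length in nucleotides."""
    return (length_nt / 1000) / rate_kb_per_min


# Transcript lengths in nucleotides, as given in the text
transcripts = {"end-3": 1276, "elt-2": 2344}

elongation_times = {gene: minutes_to_transcribe(nt) for gene, nt in transcripts.items()}
for gene, minutes in elongation_times.items():
    fraction = minutes / CELL_CYCLE_MIN
    print(f"{gene}: {minutes:.2f} min ({fraction:.0%} of a {CELL_CYCLE_MIN} min cycle)")
```

Under this assumed rate, elongation of either transcript would occupy only a small part of one cell cycle, which is in line with the suggestion that a step other than transcript elongation, such as translation, might be limiting. A different assumed rate would change the numbers but not the roughly twofold ratio between the two genes.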
The C. elegans endoderm is one of only a few developmental cell lineages in which a plausible direct molecular chain of command can be proposed to connect factors in the maternal cytoplasm with factors controlling tissue-specific gene transcription in the mature adult. In the present paper, we have defined core features of both the cis-acting sequences and the trans-acting factors controlling transcription of the gene encoding ELT-2, the predominant transcription factor associated with endoderm differentiation and function. We have provided a clear example of how the regulatory network is able to overcome a severe early perturbation, namely the low concentration of ELT-2 in the early embryo caused by loss of the end-3 gene. Finally, we have explored the regulatory potential of the ELT-2 protein and have shown that it is capable of replacing all other endodermal GATA factors, in particular replacing the END-1 and END-3 factors that normally specify the endoderm. Our results contribute to a long-term goal of describing development of the C. elegans endoderm quantitatively, in terms of binding affinities for particular regulatory sites and in terms of transcription factor activities, redundancies and stabilities.
MATERIALS AND METHODS
Nematode strains
C. elegans strains were grown on OP50-seeded NGM plates (Brenner, 1974). Transgenic animals were produced by standard gonadal injection (Mello et al., 1991), with the DNA construct to be tested present at 25-50 µg/ml, usually together with pRF4 at 50 µg/ml as a phenotypic marker. Plasmids were constructed by standard methods and are described in more detail in the supplementary Materials and Methods. Selected transforming arrays were integrated into the genome using γ-irradiation as previously described (Egan et al., 1995), followed by outcrossing at least four times. The full genotype of the rescued quadruple mutant strain JM229 is: caIs85[end-1prom::elt-2cDNA (pJM513), rol-6(su1006) (pRF4), elt-2prom::GFP (pJM370) I]; elt-7(tm840) end-1(ok558) end-3(ok1448) V; elt-4(ca16) X. The genotype of the control strain JM230 is: caIs85 I. All mutant alleles of the GATA factor genes are deletions (presumed nulls) and were followed in genetic crosses by PCR; primer sequences and expected product sizes are detailed in Table S1. The homozygosity of the end-1 deletion in the final strain JM229 was also verified by Southern blotting. The end-1prom::elt-2 cDNA sequence was cloned into plasmid pCFJ350 (Addgene) and a single copy inserted into chromosome II using strain EG6699 and the MosSCI technique developed by Frøkjaer-Jensen et al. (2012).
Antibodies and immunodetection
The anti-ELT-2 monoclonal antibody 455-2A4 (isotype IgG1) used in immunofluorescence, ChIP and western blotting was produced by the Southern Alberta Cancer Research Institute Antibody Services, using as antigen a purified polyhistidine-tagged full-length ELT-2 protein produced in E. coli. Immunostaining of embryos dissected from adult hermaphrodites was performed as described (Van Furden et al., 2004). Western blots were performed by standard methods, probed with anti-ELT-2 monoclonal antibody 455-2A4 together with monoclonal antibody MH16 (Developmental Studies Hybridoma Bank, University of Iowa) to detect paramyosin as a loading control and developed with HRP-conjugated secondary antibodies and the Amersham ECL Prime reagent. Band intensities were measured with a LAS4000 Imaging Station (GE Healthcare) and quantitated using Fiji. Further details of the antibodies used and immunodetection methods are provided in the supplementary Materials and Methods.
DNA-protein interactions
Electrophoretic mobility shift assays (band shifts) were performed as previously described (Kalb et al., 1998). The four TGATAA sites (with 10 bp flanking sequence on either side) were all synthesized as self-complementary hairpins (four C residues in the loop) to guarantee equal stoichiometry and a double-stranded conformation. Only the wild-type hairpin was used as a labelled probe, in which case a missing 3′ terminal C residue was filled in using Klenow polymerase and α32P-dCTP. Binding specificities were tested using competition with non-labelled hairpins at a 50-fold excess. Full-length His-tagged END-1, ELT-7 and ELT-2 proteins were synthesized in baculovirus-infected insect cells, either by ourselves or by ARVYS Proteins Inc., and purified by metal-affinity chromatography (estimated purity >90%).
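The hairpin design described above can be sketched in a few lines. This is an illustration under stated assumptions, not the oligonucleotide sequences actually used: the 10-nt flanks below are hypothetical placeholders rather than elt-2 CR III sequence, and the helper names are invented for this example. The sketch builds the top strand (flank, TGATAA site, flank), adds a four-C loop and appends the reverse complement, then drops the 3′ terminal C to give the oligonucleotide that would be filled in with labelled dCTP.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hairpin_oligo(left_flank, site, right_flank, loop="CCCC"):
    """Self-complementary hairpin: top strand, loop, then bottom strand (5' to 3')."""
    top = left_flank + site + right_flank
    return top + loop + reverse_complement(top)


# HYPOTHETICAL 10-nt flanks, chosen only for illustration. The left flank
# starts with G so that the folded hairpin ends in a 3' terminal C.
LEFT_FLANK = "GACTGACTGA"
RIGHT_FLANK = "CTAGCTAGCT"

wild_type = hairpin_oligo(LEFT_FLANK, "TGATAA", RIGHT_FLANK)

# Oligonucleotide lacking the 3' terminal C, to be filled in during labelling
probe_before_fill_in = wild_type[:-1]

print(wild_type)
print(len(wild_type))
```

Because both strands are on one molecule, the two halves of the stem are necessarily present in equal amounts, which is the stoichiometry advantage noted in the text.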
Chromatin immunoprecipitation followed by deep sequencing (ChIP-Seq)
ChIP-Seq was performed and the data were analysed as previously described (Berkseth et al., 2013). ChIP-Seq and RNA-Seq methods, analysis parameters and instructions for data access (GEO accession numbers) are described in detail in the supplementary Materials and Methods, with ELT-2 ChIP-Seq peaks and summits listed in Tables S2 and S3.
Acknowledgements
We thank the following people who have variously contributed to this project over the years: Jamie Feng, Anne Formaz-Preston, Tetsunari Fukushige, Mark Hawkins, Sai Ravikumar, Fran Snider and Lana Wong.
Footnotes
Author contributions
T.W. and J.Y.B. performed and analysed the immunohistochemical and transgenic reporter experiments. B.G. performed the EMSAs for in vitro DNA-protein interactions. E.O.N., A.G.R. and J.D.L. performed and analysed the ELT-2 ChIP-Seq and RNA-Seq experiments. T.W. performed the mutant rescue experiment. T.W., E.O.N., J.D.L. and J.D.M. wrote and edited the manuscript.
Funding
Work in Calgary was supported by an operating grant from the Canadian Institutes of Health Research to J.D.M.; the ChIP-Seq/RNA-Seq experiments were funded by the National Institutes of Health [grant 5R01GM104050 to J.D.L.]. E.O.N. was supported by a Damon Runyon Cancer Research Foundation Postdoctoral Fellowship Award [2083-11]. J.D.M. gratefully acknowledges salary support from the Alberta Heritage Foundation for Medical Research (now AIHS) and from the Canada Research Chairs Program. Some strains were provided by the CGC, which is funded by the NIH Office of Research Infrastructure Programs [P40 OD010440]. Deposited in PMC for release after 12 months.
Competing interests
The authors declare no competing or financial interests.