Precise spatiotemporal gene expression during animal development is achieved through gene regulatory networks, in which sequence-specific transcription factors (TFs) bind to cis-regulatory elements of target genes. Although numerous cis-regulatory elements have been identified in a variety of systems, their global architecture in the gene networks that regulate animal development is not well understood. Here, we determined the structure of the core networks at the cis-regulatory level in early embryos of the chordate Ciona intestinalis by chromatin immunoprecipitation (ChIP) of 11 TFs. The regulatory systems of the 11 TF genes examined were tightly interconnected with one another. By combining analysis of the ChIP data with the results of previous comprehensive analyses of expression profiles and knockdown of regulatory genes, we found that most of the previously determined interactions are direct. We focused on cis-regulatory networks responsible for the Ciona mesodermal tissues by examining how the networks specify these tissues at the level of their cis-regulatory architecture. We also found many interactions that had not been predicted by simple gene knockdown experiments, and we showed that a significant fraction of TF-DNA interactions make major contributions to the regulatory control of target gene expression.
INTRODUCTION
During animal development, gene expression is highly regulated, both spatially and temporally. This regulation is primarily achieved at the transcriptional level by the sequence-specific binding of transcription factors (TFs). Such interactions are governed by gene regulatory networks, in which genes exert their effects in a sequential and combinatorial manner (Levine and Davidson, 2005). Remarkable progress has been made in reconstructing the gene regulatory networks responsible for developmental processes. For example, the expression profiles and regulatory networks involved in dorsoventral patterning and mesoderm specification of the fly embryo (Stathopoulos and Levine, 2005; Jakobsen et al., 2007; Li et al., 2008; Sandmann et al., 2006; Sandmann et al., 2007; Liu et al., 2009; Zinzen et al., 2009), endomesoderm specification in Xenopus and sea urchin embryos (Koide et al., 2005; Loose and Patient, 2004; Oliveri et al., 2008) and vulva development in Caenorhabditis elegans (Inoue et al., 2005; Ririe et al., 2008) have been reported. Significant progress has also been made towards understanding how tissues are specified in the early embryo of the simple chordate Ciona intestinalis (Imai et al., 2006). Since the fates of most cells are determined by the early gastrula stage in Ciona and most cells in the early embryo can be identified, gene regulatory networks can be analyzed at single-cell resolution.
However, our knowledge of most of these networks is incomplete or provisional, particularly in chordates. Among the regulatory interactions in the Ciona embryo that have been revealed by gene knockdown and overexpression studies, only a small fraction have been examined at the cis-regulatory level (Bertrand et al., 2003; Brown et al., 2007; Corbo et al., 1997b; Di Gregorio et al., 2001; Oda-Ishii et al., 2005; Yagi et al., 2004). This is partly because an examination of cis-regulatory elements is very labor intensive. In addition, although most studies of cis-regulatory elements have identified elements that are qualitatively necessary and sufficient for recapitulating the endogenous gene expression patterns, they did not reveal the comprehensive binding profiles of TFs on cis-regulatory regions. Recent studies have revealed that weak or secondary cis-regulatory modules that have often been ignored in such analyses may or may not contribute to driving downstream genes in a quantitative respect (Segal et al., 2008; Zinzen et al., 2009). It therefore remains important to determine the complete binding profiles of TFs on the genome for an integrative understanding of gene regulatory networks in the Ciona embryo.
Ciona tadpole larvae have a basic chordate body plan, with the notochord located in the center of the tail and flanked laterally by muscle, ventrally by endoderm and dorsally by the nerve cord (Satoh et al., 2003). The mesenchyme fills the space between the epidermal and endodermal tissues in the trunk. The fate of most cells is already specified before gastrulation is complete. The Ciona genome contains ~670 TF genes and their expression profiles during embryonic development have almost all been described at single-cell resolution (Imai et al., 2004; Miwata et al., 2006). Comprehensive functional assays examining regulatory genes expressed in the early embryo have revealed, at single-cell resolution, the molecular pathways that regulate early embryonic specification events (Imai et al., 2006). These studies identified 11 TF genes that play a central role in early regulatory networks. Here, we have examined their in vivo occupancies on genomic DNA and determined the architecture of their networks.
MATERIALS AND METHODS
Ascidian embryos
Ciona adults were obtained from the National BioResource Project for Ciona (Japan) and were maintained in aquaria. Eggs and sperm were surgically obtained from gonoducts. Following insemination, eggs were reared at 18°C in filtered sea water.
Chromatin immunoprecipitation
We made the following DNA constructs encoding GFP-tagged TFs using Multisite Gateway Technology (Invitrogen): Brachyury-promoter(2.2k)-attB1-Brachyury coding region-attB2-GFP; FoxA-a-promoter(3.3k)-attB1-GFP-attB2-FoxA-a coding region; FoxD-promoter(3.1k)-attB1-FoxD coding region-attB2-GFP; MyoD-promoter(3.5k)-attB1-MyoD coding region-attB2-GFP; Neurogenin-promoter(3.1k)-attB1-Neurogenin coding region-attB2-GFP; Otx-promoter(2.9k)-attB1-Otx coding region-attB2-GFP; Snail-promoter(3.1k)-attB1-Snail coding region-attB2-GFP; SoxC-promoter(2.0k)-attB1-SoxC coding region-attB2-GFP; Tbx6b-promoter(3.1k)-attB1-Tbx6b coding region-attB2-GFP; Twist-like1-promoter(4.9k)-attB1-Twist-like1 coding region-attB2-GFP; and ZicL-promoter(1.8k)-attB1-ZicL coding region-attB2-GFP. Each of these constructs was introduced into fertilized eggs by electroporation (Corbo et al., 1997a). The embryos were fixed with 1% formaldehyde for 15 minutes. The developmental stages at which these embryos were fixed are shown in Fig. 1. Chromatin extracted from 0.5-1.0 g of embryos was sonicated to obtain DNA fragments with an average size of 1 kb. The DNA fragments were enriched by immunoprecipitation with a polyclonal antibody specific for GFP (Invitrogen). After reversal of the fixation, the immunoprecipitated and input DNAs were analyzed by quantitative PCR (qPCR) or amplified ligation-mediated PCR for the microarray analysis.
qPCR was performed using the SYBR Green method (Applied Biosystems). The ChIP enrichment ratio for a given region was determined as the ratio of the region in the precipitated DNA to that in the whole genome relative to the average for five genomic regions that were not expected to be bound by ZicL and Brachyury.
For the microarray analysis, immunoprecipitated DNA and genomic DNA were randomly labeled with Cy3 and Cy5 and hybridized to whole-genome Ciona arrays using the CGH protocol provided by Agilent Technologies. Two or three independent experiments were performed to reduce experimental error. Microarray data were submitted to the GEO public database (for accession numbers, see Table 1).
Design of the whole-genome Ciona arrays
We used a set of two 244k arrays (version 1) for 16 experiments and one 1M array (version 2) for seven experiments. These arrays were custom made by Agilent Technologies. The arrays used in each experiment are listed in Table 1. Each of these two sets of arrays covers the entire euchromatic region of the Ciona genome (Dehal et al., 2002) (GEO accession numbers for the array design: GPL6555/GPL6556 for the version 1 array; GPL8993 for the version 2 array). The version 1 array contains 393,266 unique probes separated by an average of 292 bp. In addition to the uniformly dispersed probes, 56,113 probes separated by an average of 79 bp were designed to densely cover the region spanning from 5 kb upstream of the transcription start site to the first intron of each regulatory gene. The version 2 array contains 971,042 unique probes separated by an average of 60 bp to densely cover the same genomic region as the version 1 arrays.
Data analysis
The raw data from the tiling arrays were normalized using Feature Extraction software (version 9.0, Agilent Technologies) with Linear and Lowess normalization. To identify significant probes and regions, we used two different programs called ChIPmix (Martin-Magniette et al., 2008) and Chipotle (Buck et al., 2005), which are based on totally different algorithms: the former is based on a linear regression mixture model and identifies significant probes, whereas the latter uses a sliding window approach and identifies significant regions. The Chipotle program (window size, 1 kb; step size, 250 bp) was first used to determine candidate significant regions. Among the DNA segments identified by Chipotle, we chose those containing one or more probes that the ChIPmix program identified as significant. To reduce noise, we performed two or three independent experiments for each of the 11 TFs (biological replicates). Only the segments identified as positive in both experiments were considered significant in the present study.
For mapping of the identified segments to the genes, we used the newest gene model set (Satou et al., 2008). If a portion of or the entire DNA segment was mapped within a genomic region containing a gene, the DNA segments were considered to be associated with this gene. If a given DNA segment was located in a region between two genes that were in head-to-tail orientation, and the distance between the segment and the second gene was less than 5 kb, the segment was considered to be associated with the second gene. If the two genes were aligned in head-to-head orientation, the segment was considered to be associated with the closer gene provided that the distance was less than 5 kb. If the two genes were aligned in tail-to-tail orientation, the segment was not considered to be associated with either gene.
The enrichment of TF binding sequences in the identified regions was examined as previously described (Li et al., 2008). The background regions were randomly selected non-coding regions of the Ciona genome that did not overlap with the binding regions of the TF in question. The position weight matrices (PWMs) of ZicL and Tbx6b were constructed from the results of Selex (systematic evolution of ligands by exponential enrichment) experiments (Yagi et al., 2004; Yagi et al., 2005). For the remaining nine TFs of unknown PWM, we searched for their best homologs, the PWMs of which were found in the TRANSFAC database (Matys et al., 2003) using the BLASTP program. Patser (Hertz and Stormo, 1999) was used to search for matches to the PWMs. The enrichment was calculated by dividing the density of matches in the bound regions by the density of matches in the background regions inside 125 bp non-overlapping sliding windows across the regions 4 kb upstream to 4 kb downstream of the peak positions of identified binding regions.
Gene knockdown and overexpression experiments
In the present study, we used morpholino antisense oligonucleotides (MOs) (Gene Tools) for MyoD, Twist-like1 and Brachyury, the specificities of which were confirmed in a previous study (Imai et al., 2006). Synthetic capped mRNAs were synthesized from cDNAs cloned into pBluescript RN3 vector (Lemaire et al., 1995) using the mMESSAGE mMACHINE Kit (Ambion). Fifteen fmoles of MOs or 1.5 pg (Brachyury) or 9 pg (MyoD and Twist-like1a) of synthetic capped mRNAs in 30 pl of solution were injected into dechorionated eggs after fertilization, as described previously (Imai et al., 2006). Cleavage of some embryos was arrested at the 110-cell stage with 2.5 μg/ml cytochalasin B, and the arrested embryos were cultured further until the control embryos reached the early tailbud stage (10 hours after fertilization). Relative quantification of mRNA levels (by qRT-PCR) was performed as described (Imai et al., 2004).
RESULTS
Identification of in vivo binding of eleven TFs in the early Ciona embryo
The developmental fates of blastomeres in the Ciona embryo have been determined by the gastrula stage (Nishida, 1987). A comprehensive study has revealed that 53 TF genes are zygotically expressed and regulate one another in complex networks before gastrulation begins (Imai et al., 2006). To dissect the architecture of these networks at the level of protein-DNA interactions, we focused on 11 TF genes that play core roles in gene regulatory networks for endomesoderm specification: Brachyury, FoxA-a, FoxD, MyoD, Neurogenin, Otx, Snail, SoxC, Tbx6b, Twist-like1 and ZicL. Because the Ciona genome contains multiple copies of FoxD, Tbx6b and ZicL as gene clusters and their precise copy numbers have not yet been determined, these genes are collectively referred to FoxD, Tbx6b and ZicL in this paper. Likewise, there are two copies of Twist-like1, which are highly similar to each other, and we collectively refer to these as Twist-like1.
We prepared 11 gene-fusion constructs that encode GFP-tagged TFs expressed under the control of their own promoters (e.g. a fusion gene that encodes GFP-tagged Brachyury driven by the Brachyury promoter). When these constructs were introduced into eggs, the resultant embryos expressed the fusion genes at the same time and in the same blastomeres as the endogenous genes (Fig. 1). Exceptions were the Twist-like1 and the Snail constructs. Twist-like1 is normally expressed in three cell lineages (A7.6, B7.7 and B8.5), but our construct drove Twist-like1-GFP expression only in the B7.7 and B8.5 lines. Snail expression in the notochord lineage is normally very weak. Our Snail construct did not recapitulate this expression in the notochord lineage but did drive Snail-GFP expression in the remaining lineages.
Expression of these genes did not affect embryonic morphology at the stage when the embryos were fixed (Fig. 1). The fixed embryos were subjected to ChIP using anti-GFP antibodies, and subsequently to microarray analysis. To define significant regions, we used two programs employing totally different algorithms. DNA segments regarded as positive by both programs were defined as significant. To confirm that our approach successfully identified TF binding sites, we analyzed the sequences of ZicL and Tbx6b binding regions defined with three different false discovery rates (FDRs), as the consensus binding motifs of these two TFs are known (Yagi et al., 2004; Yagi et al., 2005). The frequencies of matches to the consensus binding sequences for ZicL and Tbx6b around peaks in 0.1% FDR were generally better than in 0.01% and 1% FDRs (Fig. 2A). As expected, the frequencies of the consensus binding sequences for ZicL and Tbx6b were markedly higher around peaks in the identified regions (Fig. 2A), suggesting that our method was able to successfully identify the TF binding regions.
Brachyury and Ci-tropomyosin-like are the only known direct targets of ZicL and Brachyury, respectively (Yagi et al., 2004; Di Gregorio and Levine, 1999). As an independent confirmation, we inspected the TF binding sites of these genes. Our ZicL ChIP profile showed a sharp peak around two known strong ZicL binding sites (Fig. 2B). The Brachyury ChIP profile also showed a peak around the known Brachyury binding site in the Ci-tropomyosin-like promoter (Fig. 2C). These peaks were included in significant regions identified with all the FDRs described above. We also performed ChIP-qPCRs for these two known interactions. The ChIP-qPCR results showed excellent agreement with the ChIP-chip results (Fig. 2B,C, orange dots).
Next, we examined the promoters of genes that were identified in previous studies as likely direct targets of one of the 11 TFs on the basis of expression assays (Imai et al., 2004) and gene knockdown assays (Imai et al., 2006). Among 29 interactions that had been found in the gene knockdown assays and for which both the source and target genes are expressed in the same cells, 28, 23 and 19 interactions were indicated to be direct under the FDRs of 1%, 0.1% and 0.01%, respectively. The remainder of the interactions were not regarded as direct. Otx expression in the A-line lineage requires a cis-regulatory module that includes Fox binding sites (Oda-Ishii et al., 2005) and is suppressed in FoxA-a morphants (Imai et al., 2006). The FoxA-a binding to this cis-regulatory element was counted with FDRs of 1% and 0.1%, but not with the most stringent FDR (0.01%) (Fig. 3A and see Fig. S7 in the supplementary material). Similarly, several lines of evidence have suggested that MyoD is directly regulated by ZicL. First, MyoD expression is suppressed in ZicL morphants (Imai et al., 2006). Second, MyoD and ZicL are both expressed in presumptive muscle cells and the time windows of their expression overlap (Imai et al., 2004). Lastly, there is a putative ZicL binding site near to the peaks found in the MyoD upstream region (P=1.2×10−5; Fig. 2D). This putative binding was observed under the FDRs of 1% and 0.1%, but not under the most stringent FDR of 0.01%. On the basis of the above observations, in the following sections we generally describe the results obtained at an FDR of 0.1%. The number of significant regions and the putative target genes associated with the 11 TFs are summarized in Table 2 and listed in Table S1 in the supplementary material.
As shown in Fig. 2A, the frequencies of the consensus sequences for ZicL and Tbx6b binding were markedly higher around peaks in the identified regions. Since the consensus binding motifs of the other nine TFs had not been determined previously, we performed similar analyses with motifs of homologs in other animals (see Fig. S1A-I in the supplementary material). The frequencies of the consensus binding motifs for six of the TFs, but not FoxD, SoxC or Twist-like1, were markedly higher around peaks. Because the position weight matrices (PWMs) for FoxD, SoxC and Twist-like1 gave higher background, no significant changes were seen. However, the number of matches to the motifs was markedly higher around peaks than in flanking regions and the background (see Fig. S1J-L in the supplementary material). These observations suggested that our method was able to successfully identify the TF binding regions.
As previously reported in other animals (Li et al., 2008), we found that the regions bound by Brachyury, MyoD, Neurogenin, Snail, Tbx6b, Twist-like1 and ZicL, especially around the peaks, showed a marked GC bias (see Fig. S2 in the supplementary material). This bias is likely to be related to the consensus sequences, because the consensus sequences for these TFs are generally more GC-rich than those of the remaining TFs (see Fig. S4 in the supplementary material). The observed enrichment of recognition sequences was unlikely to be an artifact of GC bias because even if we picked up background sequences with a base composition comparable to the averaged GC content of the bound regions (the difference between the average GC content of the bound and background regions was less than 0.8%), matches to the PWMs were enriched around peaks versus each of the GC-adjusted backgrounds (see Fig. S3 in the supplementary material).
Next, we attempted to discover overrepresented motifs in the regions (360 bp) around the peaks identified by each ChIP experiment using the Trawler program (Ettwiller et al., 2007). We found that overrepresented motifs were similar to the PWMs that were determined experimentally (Tbx6b and ZicL) or to those of homologs in other animals (the remaining nine TFs) (see Fig. S4 in the supplementary material). This further supported the conclusion that the results of our ChIP experiments were of high quality.
It is generally believed that TFs tend to bind near promoters, although many examples are known in which TFs bind to enhancers far from promoters. The distributions of peaks in all experiments, except Snail ChIP, were higher around transcription start sites (see Fig. S5 in the supplementary material). The reason why Snail binding sites were not enriched around transcription start sites is unclear, but this does not necessarily indicate that the results of the Snail ChIP were of low quality. Altogether, these observations support the conclusion that all of our ChIP experiments revealed in vivo occupancies of the TFs.
Tightly interconnected regulatory networks in early embryogenesis
TF genes were significantly enriched among the target genes of the 11 TFs. Among 670 potential TF genes in the Ciona genome, at least 607 encode proteins with known TF motifs or proteins with two or more zinc-finger motifs that potentially bind to DNA (Imai et al., 2004; Miwata et al., 2006). As shown in Table 3, we found a significantly greater number of TF genes among the targets than would be expected from random sampling (>99% confidence in χ2 tests). This enrichment indicates that the TFs examined bind targets selectively and not randomly.
We compared the ChIP data with the results of the comprehensive gene knockdown experiments of a previously study (Imai et al., 2006). Among 76 interactions previously found in the early embryo, the ChIP assays indicated that 58 are direct (see Fig. S6 in the supplementary material). In addition, we found 251 novel interconnections. Fig. 3A shows interconnections among the 11 TFs that were subjected to the present ChIP experiments (for results with the mild and stringent FDRs, see Fig. S7 in the supplementary material). Among 121 (11×11) possible interconnections, 84 were observed in the present study (Fig. 3B). Our data indicate that these genes are tightly interconnected with one another.
Regulatory networks for mesodermal tissue specification
Because the gene regulatory network model previously constructed from comprehensive expression profiles and comprehensive knockdowns of regulatory genes is of single-cell resolution (Imai et al., 2006; Imai et al., 2009), we simply integrated the ChIP data into this network by assuming that the examined TFs bind to the targets wherever their mRNAs are expressed. The reconstructed networks had a complex architecture.
The reconstructed regulatory networks allow us to trace development at the single-cell level. Figs S8 and S9 in the supplementary material show the interconnections among the core 11 TFs in A-line and B-line blastomeres, which give rise to endomesodermal tissues, from the 8-cell to the early gastrula stage. Two of the three mesenchymal lineages (B-line mesenchymal cells) and 28 out of 36 muscle cells (B-line muscle cells) in the tadpole larvae are derived from B4.1 blastomeres at the 8-cell stage (Fig. 4A). Thirty-two and eight notochord cells are derived from A4.1 and B4.1 blastomeres, respectively. Previous studies demonstrated that Twist-like1, MyoD and Brachyury are essential for specification of the mesenchyme, muscle and notochord, respectively (Corbo et al., 1997b; Imai et al., 2003; Imai et al., 2006; Meedel et al., 2007; Tokuoka et al., 2004; Yasuo and Satoh, 1998).
Mesenchyme
Twist-like1 is expressed exclusively in the mesenchymal lineage and is regulated by FoxA-a, Otx and ZicL, as indicated by the fact that knockdown of any of these three genes results in loss or reduction of Twist-like1 expression (Imai et al., 2006). As summarized in Fig. 4B (see also Fig. S11A in the supplementary material), we could not detect direct binding of FoxA-a to the Twist-like1 promoter (Fig. 5A) even with 1% FDR, but we did find that FoxA-a binds to the upstream regions of Otx and ZicL (Fig. 5B,C), and that ZicL and Otx bind to the promoter of Twist-like1 (Fig. 5A). Therefore, it is highly likely that FoxA-a mainly activates Twist-like1 indirectly through activating Otx and ZicL.
Twist-like1 expression begins in B7.7 (the posterior B-line mesenchyme) at the 64-cell stage and in B8.5 (the anterior B-line mesenchyme) at the early gastrula stage. These two mesenchymal lines contribute to distinct adult tissues after metamorphosis (Tokuoka et al., 2005). ZicL might be associated with the differences between these two lineages because the contribution of ZicL to Twist-like1 activation is weaker than that of Otx (Imai et al., 2006). To confirm this idea, we tested a mutant Twist-like1 promoter, from which a 150 bp segment containing the identified ZicL binding region was deleted. Because the Otx ChIP result indicated that the Otx binding region is distinct from the ZicL binding region, Otx was expected to bind to this mutant promoter. When introduced into fertilized eggs by electroporation, the wild-type promoter (1550 bp) drove reporter expression in 65% of the embryos (Fig. 5D; n=368, three independent experiments), whereas the mutant promoter drove reporter expression in 36% of the embryos (Fig. 5D′; n=281, three independent experiments). In addition to the significant decrease in the number of embryos expressing the reporter (Student's t-test, P<0.001), the overall fluorescence was weaker and the posterior B-line mesenchyme did not appear to express the reporter in the mutant construct (Fig. 5E,E′). To confirm this observation, the experimental embryos were cleavage-arrested at the 110-cell stage. Cells in the arrested embryos cannot divide further, but the developmental programs proceed as in normal embryos. The mutant construct failed to drive reporter expression in the posterior B-line mesenchyme (Fig. 5F,F′). These results suggest that ZicL contributes to the difference between these two lineages.
A previous study showed that nine mesenchyme-specific non-regulatory genes are under the control of Twist-like1. None of these genes was identified as a direct target in the present study (Fig. S10A in the supplementary material). Even when applied with an FDR of 1%, only one gene was identified as a direct target. Therefore, it is likely that Twist-like1 regulates the expression of mesenchyme-specific genes through its downstream regulatory gene circuit, although there is a possibility that Twist-like1 binds to the regulatory elements of these genes at later stages.
Muscle
The B6.2 and B6.4 cell pairs in the 32-cell embryo have the potential to give rise to mesenchyme and muscle (Fig. S8 in the supplementary material). At the 64-cell stage, these cells divide, and one of the daughter cells becomes specified to give rise to the muscle cells. Previous functional assays showed that ZicL, Tbx6b and MyoD are essential for specification of muscle cells (Imai et al., 2002c; Imai et al., 2006; Meedel et al., 2007; Yagi et al., 2005). Tbx6b begins to be expressed at the 16-cell stage, and cells expressing Tbx6b give rise not only to muscle cells but also to mesenchyme cells. Tbx6b expression declines to undetectable levels before the tailbud stage. ZicL starts to be expressed at the 32-cell stage in a variety of cells, including those with developmental fates of muscle, mesenchyme, notochord and neurons. ZicL expression in the muscle lineage disappears before the late gastrula stage. MyoD expression begins at the 44-cell stage exclusively in the muscle lineage under the control of Tbx6b and ZicL. The present study showed that ZicL, Tbx6b and MyoD constituted a tightly interconnected gene circuit that is responsible for this specification (see Fig. S11B in the supplementary material): (1) ZicL bound to the promoters of MyoD and Tbx6b; (2) Tbx6b bound to the promoters of MyoD and ZicL; and (3) MyoD bound to the promoter of Tbx6b and to its own promoter. All of these interactions, except MyoD binding to the Tbx6b promoter, have been confirmed by functional assays (Yagi et al., 2005; Imai et al., 2006).
To understand how this gene circuit regulates downstream muscle-specific genes, we examined the promoters of 13 muscle structural genes that are well annotated and known to be expressed in the larval tail muscle (see Fig. S10 in the supplementary material). Of these, ten were directly bound by MyoD and Tbx6, one by MyoD and ZicL, one by Tbx6b and ZicL, and one by MyoD alone.
As expected from a previous study (Brown et al., 2007), both MyoD and Tbx6 bound to the promoters of more than three-quarters of the muscle genes examined. To test the action of this feed-forward loop comprising MyoD and Tbx6b in the regulation of muscle-specific gene expression, we examined the expression patterns of genes under the control of this circuit. Of the 155 genes under the direct control of MyoD and Tbx6b, 50 (including the above ten) were already known to be expressed in muscle cells (Fujiwara et al., 2002; Imai et al., 2004; Kusakabe et al., 2002; Miwata et al., 2006; Satou et al., 2001). From the remaining 105 genes, we randomly chose 20, and found that 15 are expressed in muscle cells (Fig. 6), suggesting that this circuit is widely used for the regulation of genes expressed in muscle cells, and also that this circuit might not necessarily be sufficient for driving expression of the target.
Notochord
Brachyury is activated at the 64-cell stage exclusively in the notochord lineage, and this expression specifies the notochord fate (Corbo et al., 1998; Yasuo and Satoh, 1998). As described above, ZicL directly binds to the Brachyury promoter and activates its expression (Yagi et al., 2004). It has also been shown that FoxD and FoxA-a are required for Brachyury expression, probably through regulating ZicL expression, and that FGF signaling is also required for Brachyury expression (Imai et al., 2002a; Imai et al., 2002b; Imai et al., 2006). The present assays showed that not only ZicL, but also FoxD binds to the Brachyury promoter (Fig. 4C; see Fig. S11C in the supplementary material). Although FoxD mRNA is not present in the notochord lineage at the 32-cell and 64-cell stages, when ZicL and Brachyury are activated, respectively (FoxD is expressed in the ancestors of cells in which ZicL and Brachyury are expressed; see Fig. S9 in the supplementary material), the ChIP assay indicated that FoxD binds to the promoters of ZicL and Brachyury. Because knockdown of FoxD eliminates ZicL and Brachyury expression (Imai et al., 2002b; Imai et al., 2006) and because the FoxD-GFP fusion protein exists in the notochord lineage at the 32-cell stage (Fig. 1), it is likely that FoxD protein exists in these cells and binds to the promoters of ZicL and Brachyury when these two genes begin to be expressed.
FoxA-a binding to the Brachyury promoter was not identified under 0.1% FDR. There was, however, a small peak that was counted as significant under 1% FDR. We could not rule out the possibility that FoxA-a binds weakly to the Brachyury promoter. It is also possible that FoxA-a could bind weakly to a FoxD binding site because the FoxA-a binding peak coincided with that of FoxD (see Fig. S11 in the supplementary material). Even if this weak binding occurs in vivo, the regulation of Brachyury by FoxA-a would largely be achieved indirectly through FoxD and ZicL, as we found strong binding of FoxA-a to the promoters of FoxD and ZicL.
Next, we examined 14 non-regulatory genes that are known to be expressed in the notochord under the control of Brachyury (Hotta et al., 2000; Takahashi et al., 1999). Among them, 11 were identified here as direct targets of Brachyury (see Fig. S10 in the supplementary material). The present results suggest that the remaining three genes are regulated indirectly through a gene circuit under the control of Brachyury, although it cannot be ruled out that Brachyury binds to the regulatory elements of these three genes at later stages.
The contribution of TF-DNA binding to regulatory controls of target gene expression
In the present study, we found many interactions between TFs and genomic DNA that were unexpected from preceding gene knockdown assays. Similar observations were also reported in preceding ChIP studies (Jakobsen et al., 2007; Li et al., 2008; Sandmann et al., 2006). To estimate what proportion of the binding makes a major contribution to gene regulation in Ciona embryos, we injected MyoD mRNA or an MO against MyoD into eggs and examined their effects on the expression of the same targets that we analyzed above at the gastrula stage or at the tailbud stage, respectively. The mRNA levels of 14 targets, ten of which were expressed in muscle, were significantly increased (>2-fold) in embryos injected with MyoD mRNA (Fig. 6), and MyoD MO injection significantly reduced the mRNA levels of three of these targets. The mRNA level of one target (KH.C12.38), which was weakly expressed in muscle at the tailbud stage (Fig. 6), was significantly decreased in embryos injected with MyoD mRNA, whereas the mRNA level of one target (KH.C9.27), which was expressed in muscle at the gastrula stage (Fig. 6), was significantly increased in embryos injected with the MyoD MO. In total, the mRNA levels of 16 targets were significantly altered by MyoD mRNA overexpression or gene suppression (Fig. 6). The remaining four were not significantly affected, although three of these were expressed in muscle, implying that MyoD binding makes a relatively small contribution to activating these target genes. We also found that eight of 15 Brachyury targets and seven of 12 Twist-like1 targets were significantly affected in the embryos by overexpression or knockdown of Brachyury or Twist-like1, respectively (Fig. 7 and see Fig. S12 in the supplementary material). Therefore, we estimated that more than half of TF binding makes a major contribution to the regulatory control of gene expression.
DISCUSSION
Generally, high-quality antibodies are required for ChIP experiments. In our experiments, TF-GFP fusion genes were introduced exogenously into eggs and a polyclonal antibody specific for GFP was used for all experiments. This approach successfully identified TF-DNA interactions in the Ciona embryos. In this regard, our approach offers a great advantage. A possible drawback is that the introduced TFs might be somewhat overexpressed, although they were expressed at the same time and in the same blastomeres as the endogenous genes, being under the control of their own promoters (Fig. 1). It is possible that some of the interactions we observed in the present study were stronger, and thus more prevalent, than would normally occur in wild-type embryos. However, as the introduced TFs did not disrupt the morphology of the embryos at the time when we collected the samples, it seems likely that any effect of overexpression was minimal. Even though we might have overestimated the binding strength, our experiments indicate that in early embryos these TFs have access to the DNA segments we identified. However, because the introduced gene is not always expressed in all cells and the expressed GFP-tagged TF would compete for binding with the endogenous TFs, sensitivity could be reduced, and therefore some weak interactions could have been missed. Nonetheless, the number of identified regions was comparable to that reported in previous ChIP studies in other animals, in which ChIP assays were performed with TF-specific antibodies and normal embryos (Jakobsen et al., 2007; Li et al., 2008; Sandmann et al., 2006; Sandmann et al., 2007; Zeitlinger et al., 2007), suggesting that the majority of binding sites were identified in our experiments.
We found more interactions than would be predicted from previous functional assays. There were some target genes that are not expressed zygotically in early embryos. However, this observation is not necessarily contradictory because the binding of TFs to cis elements may repress target gene expression. In addition, the binding of these TFs might be insufficient for regulating the targets and they might instead need to work cooperatively with other TFs that are expressed at other time points.
A large proportion of the previously identified interactions (Imai et al., 2006) were revealed to be direct in the present study, whereas a small number of interactions were suggested to be indirect. Most of the latter interactions can be explained, as shown, for example, by the regulation of Twist-like1 (Fig. 4B) and Brachyury (Fig. 4C) by FoxA-a. Indirect interactions that cannot be explained based on the current knowledge are the auto-repressive interactions of FoxD, Tbx6b and ZicL. Since these auto-interactions were identified as significant under a milder FDR (1%), weak interactions might exist.
The present study also showed that tissue-specific genes are regulated in a different way in each of three ascidian mesodermal tissues, after key regulatory genes are expressed. In muscle, most muscle structural genes were found to be under the direct control of MyoD and/or a feed-forward loop of MyoD and Tbx6b. In the mesenchyme lineage, Twist-like1 did not seem to regulate the mesenchyme-specific non-regulatory genes directly. In the notochord, 11 of 14 notochord-specific non-regulatory genes were under the direct control of Brachyury.
Information inherited from parental cells and signals from nearby cells are integrated at the cis-regulatory elements of target genes to control their transcription. Because changes in the cis-regulatory regions of TFs are thought to have made major contributions to animal evolution (Stern and Orgogozo, 2008), these elements should carry traces of evolutionary processes. Deciphering the logic of how these elements can be changed will lead to a better understanding of how developmental systems have evolved.
Acknowledgements
We thank all the staff members of the Field Science Center (Onagawa) of Tohoku University and the Maizuru Fisheries Research Station of Kyoto University for collecting Ciona intestinalis. This research was supported mainly by Grants-in-Aid from the MEXT, Japan, to Y.S. (17687022, 20687014, 21671004) and also by gCOE program A06 of Kyoto University. A.K. was supported by a JSPS Research Fellowship (194010).
References
Competing interests statement
The authors declare no competing financial interests.