The Drosophila sex-determination switch gene Sex-lethal (Sxl) and the X-chromosome signal element genes (XSEs) that induce the female-specific expression of Sxl are transcribed extremely early in development when most of the genome of this organism is still silent. The DNA sequence CAGGTAG had been implicated in this pre-cellular blastoderm activation of sex-determination genes. A genome-wide computational search, reported here, suggested that CAGGTAG is not specific to early sex-determination genes, since it is over-represented upstream of most genes that are transcribed pre-cellular blastoderm, not just those involved in sex determination. The same search identified similarly over-represented, one-base-pair degenerate sequences as possible functional synonyms of CAGGTAG. We call these heptamers, collectively, the TAGteam. Relevance of the TAGteam sequences to pre-cellular blastoderm transcription was established through analysis of TAGteam changes in Sxl, scute (an XSE), and the `ventral repression element' of the pattern-formation gene zerknüllt. Decreasing the number of TAGteam sites retarded the onset of pre-blastoderm transcription, whereas increasing their number correlated with an advanced onset. Titration of repressors was thought to be the rate-limiting step determining the onset of such early transcription, but this TAGteam dose effect shows that activators must also play an important role in the timing of pre-blastoderm gene expression.
INTRODUCTION
Most animal embryos begin zygotic transcription only after a considerable delay following fertilization (Andeol, 1994; Davidson, 1986). For example, widespread activation of zygotic transcription in Drosophila melanogaster occurs only midway through nuclear cycle 14, some 2.5 hours after fertilization, during the cellular blastoderm stage, when division slows and the egg cell membrane invaginates to engulf peripheral somatic nuclei and form a cellularized embryo (Anderson and Lengyel, 1979; Foe and Alberts, 1983; Lamb and Laird, 1976; McKnight and Miller, Jr, 1976). The fact that the nucleo-cytoplasmic ratio influences the time of onset of general zygotic transcription in diverse species led to the idea that this delay in the onset of zygotic gene expression is a consequence of the action of maternally loaded repressor proteins that are titrated by zygotic DNA (Newport and Kirschner, 1982a; Newport and Kirschner, 1982b; Pritchard and Schubiger, 1996).
Even for the frog, however, where the concept of a `midblastula transition' marking the onset of widespread zygotic transcription has perhaps been most thoroughly developed, a few genes are known to be transcribed earlier (Kimelman et al., 1987; Nakakura et al., 1987). The same is true for Drosophila, although even the earliest fly genes are not transcribed until 1 hour after fertilization (see Fig. 1). Remarkably little is known about the features of these unusually early genes that allow them to be expressed at a time when most of the genome is silent. Previous studies suggested that the time of onset of transcription for even these exceptional genes is determined by the disappearance of repressors that are titrated by DNA, ones presumably different from those proposed to suppress the majority of zygotic genes (Brown et al., 1991; Pritchard and Schubiger, 1996). Work we report here adds a new perspective on this process by allowing us to infer that the binding of positively acting factors to specific sites near the promoters of such exceptionally early genes must also be an important factor in determining precisely when these genes become active.
Our attention was drawn to pre-cellular blastoderm (pre-CB) transcription by the fact that Drosophila sex-determination genes are among the handful of genes expressed so early. Drosophila sex and X-chromosome dosage compensation are determined by the number of X chromosomes, which are counted by the feminizing switch gene Sex-lethal (Sxl) prior to the cellular blastoderm stage (reviewed by Cline and Meyer, 1996). Sxl counts by measuring the level of zygotic gene products generated from at least four X-chromosome signal elements (XSEs): sisterlessA (sisA), scute (sc, a.k.a. sisB), outstretched (os, a.k.a. upd or sisC) (Sefton et al., 2000) and runt (run). Diplo-X (female) embryos, but not haplo-X (male) embryos, generate a level of XSE products high enough to activate transcription at the Sxl `establishment' promoter, Pe, in somatic cells, thereby producing a pulse of SXL protein. This female-specific protein pulse subsequently engages a positive feedback loop on Sxl pre-mRNA splicing that locks Sxl into its feminizing expression mode throughout the rest of development.
By counting X-chromosomes in nuclear cycle 12, embryos engage their dosage compensation machinery in time to avoid an imbalance in X-linked gene products that would otherwise arise between the sexes during cycle 14 when genome-wide transcription begins (Gergen, 1987; Tracey et al., 2000). But such early chromosome counting demands that expression of the genes that communicate X-chromosome dose to Sxl must begin even before cycle 12. As Fig. 1 illustrates, the two strongest XSEs, sisA and sc, are among the earliest expressed genes (Erickson and Cline, 1993).
The question of whether these sex-determination genes might have something in common that allows them to be expressed so early led us to the heptanucleotide motif, CAGGTAG. Three of the four XSEs in D. melanogaster (all but run) possess multiple copies of this sequence or its reverse complement within 500 bp of their transcription start sites (Erickson and Cline, 1998; Sefton et al., 2000). The cluster of three CAGGTAG sites upstream of sc was shown to be functionally significant by the demonstration that their elimination reduced sc XSE activity, and abolished it when combined with a deletion of a downstream regulatory element (Wrischnik et al., 2003).
We have used genome-wide computational analysis and species comparisons to show that CAGGTAG clustering is not unique to sex-determination signal genes, but is also found upstream of other genes whose transcription begins prior to cycle 14. We identify 1-bp degenerate heptamers that cluster with, and can probably substitute for, CAGGTAG. We call these sequences the TAGteam. Functional analysis of TAGteam clusters in transgenic reporter constructs for sc, SxlPe and the patterning gene zerknüllt (zen) revealed that TAGteam site number influences the time of onset of pre-CB transcription.
MATERIALS AND METHODS
Fly stocks and culture
Drosophila were cultured in uncrowded conditions on a standard cornmeal, yeast, sucrose and molasses medium. Transgenes were tested for sisB+ activity only after five generations of backcrossing to a standard scM6 stock to generate homogeneous genetic backgrounds. Wild-type SxlPe transgene lines were provided by J. Erickson, wild-type VRE600 transgene lines by M. Levine, and hermaphrodite alleles by B. Baker. Markers and balancers are described at http://flybase.bio.indiana.edu.
Bioinformatics
We created a simple algorithm to search for CAGGTAG clusters in the D. melanogaster genome (Adams et al., 2000). When the program FLY ENHANCER (Markstein et al., 2002) became available, we determined that it generated the same results. Pre-CB genes were chosen as genes transcribed prior to nuclear cycle 14, according to previously published reports in the literature (Fig. 1), and post-CB genes (Table S1 in supplementary material) were randomly chosen among those known not to be transcribed prior to gastrulation according to the BDGP expression database (Tomancak et al., 2002). For pre- and post-CB genes, the total number of occurrences of CAGGTAG and related heptamer sequences per interval was subjected to a χ² test incorporating Yates's correction. The same analysis was applied to the orthologous genes in D. pseudoobscura (Richards et al., 2005).
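The Yates-corrected χ² statistic mentioned above can be sketched as follows. This is a minimal illustration, not the authors' program: the layout of the 2×2 table (heptamer hits versus other scanned positions, in pre-CB versus post-CB intervals) is an assumption, as are the function name `yates_chi2_2x2` and the example counts.

```python
import math

def yates_chi2_2x2(a, b, c, d):
    """Yates-corrected chi-squared test for a 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, not stated in the text):
      a = motif hits in pre-CB intervals,  b = other positions in pre-CB intervals
      c = motif hits in post-CB intervals, d = other positions in post-CB intervals
    Returns (chi2, p) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    # Yates's continuity correction subtracts n/2 from |ad - bc|, floored at 0.
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-squared with 1 d.f. is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Made-up counts, for illustration only.
chi2, p = yates_chi2_2x2(20, 10, 10, 20)
print(round(chi2, 3), round(p, 4))
```

Using only the standard library keeps the sketch self-contained; a statistics package would normally be preferred for real analyses.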
Identification of orthologous sequences and alignments
Sequences in D. yakuba, D. ananassae, D. pseudoobscura, D. mojavensis and D. virilis were downloaded from the Lawrence-Berkeley National Lab Vista Browser (Frazer et al., 2004). Promoter sequences for bottleneck (bnk) from D. willistoni, D. hydei and Zaprionus tuberculatus were obtained by PCR amplification with conserved D. pseudoobscura primers: (1) GGTGCGCGGAAAACACGTAAAATTCGCGTG; (2) GTGTTGGTCAGCTTGTTGAAGAAGTTGATTTTGTC. Sequences were aligned using ClustalW (Thompson et al., 1994), then manually adjusted to align motifs identified with MEME (Bailey and Gribskov, 1998).
Site-directed mutagenesis and germline transformation
Standard techniques were used for germline transformation (Spradling, 1986). All mutagenesis reactions were carried out with the Quick-Change site-directed mutagenesis kit (Stratagene). PCR reactions were performed under standard conditions and all products were sequence verified prior to subcloning into P-element vectors. Mutagenic primers used are listed below, with their altered TAGteam sites underlined and the base pairs in bold indicating changes from the wild-type site (shown in parentheses).
sc:GATTCTCTCGCCCGTTTCTCGGGAACAGTGTTGTGCAGAGAGTTG(CTGCCTG);
SxlPe: (1)GATCAATTCGCGGGATCTTCGTTCTCCCACGCGACTGGCAC(TAGGTAG),
(2) CGAGAATGCGGAACATCTGAAGTACGGCGAAGATCCGCGATCC(CTGCCTGCCTG), (3)GCCTCCTTCGATCTTAGAACGGGCACCCAGCCACCGC(CTACCTG);
zen: (1)CATTTGCACCAGCGGACGGTGTTTATTCACCGAACGGAAACCCATAC(CTGCCTG-CTACCTG),
(2)CGGTCGCACTATTTCGTTCGACACTGTACCGTCCGCACTAGCGGG(TAGGTAG-CAGGCAG).
For scute, a 1.65-kb XhoI-BamHI genomic fragment in pBluescript (Stratagene) was used as a template for mutagenesis. Mutants were subcloned into the scute genomic DNA fragment described by Wrischnik et al. (Wrischnik et al., 2003). For Sxl, wild-type 1.4-kb SxlPe-lacZ vector (Yang et al., 2001) was digested with EcoRI-NotI and subcloned into pBluescript for use as a template to make mutants, which were subsequently cloned back into the original vector using the same digest. For zen, mutagenesis of VRE TAGteam sites was performed in a modified pGEM72f(-) vector, pGEM-BGS, in which the sequence from BspEI to KpnI was replaced with BglII-600VRE-SpeI (a gift from M. Markstein). Following mutagenesis reactions, the mut600VRE pGEM-BGS was digested with BglII-SpeI and cloned into newE2G, a gypsy-insulated pCaSpeR vector containing an eve minimal promoter fused to a lacZ reporter (Markstein et al., 2004).
To generate the 250VRE transgene, the 600VRE in pGEM-BGS was digested with KasI-HindIII, incubated with Klenow polymerase (New England Biolabs) to create blunt ends, and self-ligated. The following primers were used to amplify the wild-type or mutant 111-bp fragment from the appropriate 600-bp VRE plasmid (lowercase sequences denote exogenous DNA added to introduce BamHI or BglII sites):
atggatccTTTGCACCAGCTGCCTGTGTTTA,
taagatctCCCGCTAGTGCTGCCTGTACAG,
atggatccTTTGCACCAGCGGACGGTGTTTA,
taagatctCCCGCTAGTGCGGACGGTACAG. The resulting products were digested with BamHI-BglII and cloned into a modified pBluescript SK(+) that contains BamHI/BglII sites at the original BamHI/SmaI sites (a gift from M. Markstein). The 111VRE fragment was duplicated by digesting the 111VRE vector with BamHI-ScaI or BglII-ScaI and ligating the VRE-containing vector fragments from the two different digests to each other. All VRE fragments were ligated into newE2G BglII-EcoRI sites following digestion with BamHI-EcoRI except the 250VRE, which was cut with BglII-EcoRI prior to ligation.
In situ hybridization and nuclear dots
0- to 2-hour D. melanogaster embryos were collected at 25°C on standard molasses/agar plates smeared with live yeast paste. Embryo fixation and hybridization were performed with digoxigenin-labeled antisense lacZ RNA probes as described previously (Jiang et al., 1991; Tautz and Pfeifle, 1989), except for embryos used for nuclear dot analysis, which were stained with 0.5 μg/ml DAPI, washed with PBT, and mounted in 70% glycerol/1× PBS following alkaline phosphatase staining. Embryonic nuclear cycles were determined by the density of nuclei and their position relative to the periphery of the embryo.
We only counted dots in late interphase or prophase nuclei, since these stages appeared to have the greatest propensity for dot expression. For nuclear cycles 9-13, at least 80 nuclei per embryo were scored. For earlier cycles, all visible nuclei were scored. We defined the time of onset of transcription conservatively to be the first cycle with at least a 5-fold increase in the percentage of expressing nuclei over a previous cycle with at least 1% expressing nuclei.
Immunocytochemistry
Embryos grown at 25°C were stained with β-galactosidase antibody (5 Prime-3 Prime) and processed as described by Kuo et al. (Kuo et al., 1996), except that alkaline phosphatase-conjugated goat anti-rabbit secondary antibody was used and color was developed with 5-bromo,4-chloro,3-indolyl phosphate (BCIP)/nitroblue tetrazolium (NBT).
RESULTS
TAGteam sequences are over-represented upstream of genes transcribed pre-cellular blastoderm (pre-CB)
To determine whether CAGGTAG clusters are unique to genes that regulate Sxl in the early embryo, we wrote a computer algorithm to search the D. melanogaster genome for XSE-like CAGGTAG clusters containing three or more matches to CAGGTAG, or its reverse complement, within 500 bp upstream of a transcription start site. Only five genes had such clusters. Whereas three of them were XSEs, the other two were bnk and Neurotactin (Nrt), both autosomal genes with functions unrelated to sex determination, but that appear to share the feature of pre-CB transcription with XSEs (Lecuit and Wieschaus, 2000; Schejter and Wieschaus, 1993). Pre-CB transcription of bnk has been shown directly, but that for Nrt was inferred from immunostaining of its protein products in early cycle 14.
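The cluster criterion described above (three or more matches to CAGGTAG or its reverse complement within 500 bp upstream of a transcription start site) can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the authors' algorithm: the function names are hypothetical, and the upstream sequence is assumed to be given 5'→3' on the coding strand, ending at the transcription start site.

```python
MOTIF = "CAGGTAG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def count_sites(seq, motif=MOTIF):
    """Count positions matching the motif on either strand."""
    seq = seq.upper()
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return sum(seq[i:i + k] in targets for i in range(len(seq) - k + 1))

def has_cluster(upstream, min_sites=3, window=500):
    """True if the last `window` bp before the start site hold >= min_sites matches.

    `upstream` is assumed to end at the transcription start site.
    """
    return count_sites(upstream[-window:]) >= min_sites

# Toy sequence with two forward sites and one reverse-complement site.
toy = "AAACAGGTAGTTTCTACCTGAAACAGGTAGTT"
print(count_sites(toy), has_cluster(toy))
```

Scanning both strands with a set lookup keeps the search linear in sequence length, which is adequate for promoter-sized windows across a genome.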
If CAGGTAG functions more generally to regulate pre-CB gene expression, perhaps in combination with other sequences, one might expect it to be over-represented in the promoters of genes expressed prior to the cellular blastoderm stage, even if it did not always appear in clusters. Moreover, CAGGTAG-related sequences might also be over-represented. To explore these possibilities, we compared the occurrence of CAGGTAG and all its 1-bp degenerate sequences in the region flanking the transcription start sites of genes expressed pre-CB to that for genes expressed post-CB (see Materials and methods). CAGGTAG is indeed over-represented in the 500-bp region upstream of pre-CB genes relative to post-CB genes (P=6.3×10⁻⁷, Fig. 2A), even if SxlPe and the genes from the genomic cluster search are removed from the pre-CB set to avoid a possible ascertainment bias (P=2.1×10⁻⁴, data not shown). The degenerate sequence tAGGTAG, which had been suggested as a possible alternative to CAGGTAG in an earlier comparison of Sxl promoters (Erickson and Cline, 1998), is nearly as over-represented as CAGGTAG (P=1.6×10⁻⁶). The sequence CAGGcAG (P=9.2×10⁻⁴) also emerged in this analysis as the only other 1-bp degenerate with a P-value less than 0.01. Hereafter, we refer collectively to these over-represented heptamers as TAGteam sequences.
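As a rough illustration of what "all its 1-bp degenerate sequences" covers, the sketch below enumerates the 21 single-substitution variants of CAGGTAG (7 positions × 3 alternative bases), writing the changed base in lowercase as in the text, and counts a variant on both strands. The function names are hypothetical, and counting on both strands is an assumption carried over from the cluster search rather than something stated for this comparison.

```python
BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def one_bp_degenerates(motif="CAGGTAG"):
    """All single-substitution variants; the substituted base is lowercase."""
    variants = []
    for i, base in enumerate(motif):
        for alt in BASES:
            if alt != base:
                variants.append(motif[:i] + alt.lower() + motif[i + 1:])
    return variants

def count_both_strands(seq, heptamer):
    """Count occurrences of a heptamer or its reverse complement in seq."""
    fwd = heptamer.upper()
    rev = fwd.translate(_COMPLEMENT)[::-1]
    targets = {fwd, rev}
    seq = seq.upper()
    k = len(fwd)
    return sum(seq[i:i + k] in targets for i in range(len(seq) - k + 1))

variants = one_bp_degenerates()
print(len(variants))  # 7 positions x 3 alternative bases
```

Per-variant counts from the pre-CB and post-CB gene sets would then feed whatever significance test is applied to each heptamer.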
To determine whether the over-representation of TAGteam heptamers upstream of pre-CB genes is unique to D. melanogaster, we performed a similar analysis on D. pseudoobscura orthologs. Again, CAGGTAG (P=1.5×10⁻⁶) and tAGGTAG (P=2.0×10⁻⁵) were over-represented in the interval immediately upstream of pre-CB genes relative to post-CB genes, but CAGGcAG was not (Fig. 2B). CAGGTAG and/or tAGGTAG were also over-represented in regions beyond 500 bp in the two species (Fig. 2A,B), though less so than in regions closer to the promoter, suggesting that TAGteam members may be able to exert influence over distances greater than 500 bp.
This relative over-representation of TAGteam sequences in the interval directly upstream of the transcription start site was a consequence of pre-CB genes containing more sites than expected by chance, although the values for CAGGTAG in pseudoobscura and for tAGGTAG in both species were also increased by a depression in the number of sites upstream of post-CB genes (Fig. 2C,D). In contrast, although CAGGcAG in D. pseudoobscura was six times more prevalent than expected upstream of pre-CB genes, a strong positive bias among post-CB genes as well dropped this heptamer below the over-representation cutoff. An extension of the analysis of pre-CB genes shown in Fig. 2C to four more Drosophila species, two of which are more distant than pseudoobscura, showed that CAGGTAG was reliably the most over-represented of the three TAGteam members, and that the other two were consistently among the top three runners up among all 1-bp degenerates (Table S2 in supplementary material).
While this manuscript was under review, a genome-wide microarray-based developmental profile of relative poly(A)+ mRNA levels in D. melanogaster prior to gastrulation was published (Pilot et al., 2006), which gave us an opportunity to expand the data set for pre-CB genes in the analysis of TAGteam sequence occurrence. Our analysis of this much larger and differently biased new data set (Fig. S1 in supplementary material) showed remarkable agreement with our conclusions from Fig. 2A. Again the three TAGteam heptamers described above stood out, with CAGGTAG leading the pack, but they were now joined by two additional 1-bp degenerate sequences, CAGGTAa and CAGGTAt, that rose far above the P=0.01 significance threshold. CAGGTAa had been just below that threshold in our original analysis, and was consistently among the top five most over-represented 1-bp degenerates in the six-species comparison of pre-CB genes (Table S2 in supplementary material).
Conservation of individual TAGteam sites is variable among early gene promoters
To explore the pattern of conservation of TAGteam sites in a pre-CB gene unrelated to sex determination, we performed an alignment of bnk sequence from nine species within the family Drosophilidae for the region in melanogaster that contains the triple CAGGTAG cluster and, just upstream of that, a tAGGTAG sequence (Fig. 3A,B). Support for the functionality of the latter sequence was strengthened by the observation that each of the melanogaster CAGGTAG sites was replaced by tAGGTAG in at least one species. Moreover, the upstream tAGGTAG sequence was conserved among the three closest relatives of melanogaster.
Although the overall number of TAGteam sites in pre-CB genes was relatively conserved (Fig. 2C,D; Table S2 in supplementary material), the conservation of specific sites and overall number within bnk was somewhat variable (Fig. 3A,B). The lack of conservation of the distal melanogaster TAGteam sites in species beyond pseudoobscura appears to be compensated for at least in part by the appearance of TAGteam sites elsewhere: D. willistoni and Z. tuberculatus evolved a new CAGGTAG site between the two most highly conserved CAGGTAG sites, whereas D. mojavensis acquired a CAGGTAG sequence upstream of all others.
To determine if the findings for bnk were representative, we used the genome sequence available for six Drosophila species to align the 500-bp regions upstream of the promoters of five additional pre-CB genes (Fig. 3C). Three of these genes are related to sex determination (sisA, sc, SxlPe), but the other two [snail (sna) and zen] are not. For sna and zen, known enhancer elements further upstream of these genes were also examined, since they contained TAGteam clusters in melanogaster. As with bnk, these regions exhibited a variable degree of conservation. There was no obvious difference in conservation between genes involved in sex determination and those that are not. The fact that TAGteam sites in the upstream enhancer elements of zen and sna seemed as conserved as those closer to the promoter argues for the functional significance of TAGteam site over-representation (Fig. 2A,B) regardless of its distance from the promoter.
There were several instances in the five-gene/six-species comparison in which de novo TAGteam sites appeared to replace lost sites, suggesting that stabilizing selection may influence TAGteam evolution (see Ludwig et al., 2000; Ludwig et al., 1998). In view of our recent analysis of the Pilot et al. (Pilot et al., 2006) data pointing to CAGGTAa as a member of the TAGteam, it seems significant that in the species comparisons shown in Fig. 3, this sequence replaced one of the conserved original three TAGteam heptamers even more frequently than any one of them replaced the two others (data not shown).
The least over-represented TAGteam member does contribute to pre-CB gene function
The functional significance of CAGGTAG clusters had previously been established using an assay for the XSE function of scute (sc) (Wrischnik et al., 2003). Since sc has a CAGGcAG site 53 bp upstream of its transcription start site (Fig. 4A), we could utilize the same assay to determine whether this least-over-represented member of the original TAGteam also contributes to pre-CB gene functioning.
This assay for sc XSE activity is based on the fact that females homozygous for the null allele scM6 fail to activate SxlPe and therefore die from dosage compensation upsets. A genomic transgene containing the scute protein coding region and only a few kb of DNA on either side (Fig. 4A) fully complements scM6 with respect to female-specific lethality (Fig. 4B). Consequently, the extent to which mutations interfere with the pre-CB functioning of this transgene is reflected in the extent to which they reduce the ability of the transgene to rescue scM6 mutant females.
Mutating the single CAGGcAG site by itself had no effect on the ability of this transgene to rescue mutant females (101% viability, relative to that of the heterozygotes); however, rescue was reduced to only 3% in combination with the triple CAGGTAG knockout, a value far below the 35% average rescue for the triple knockout alone (Fig. 4B). We could be confident of the biological significance of this synergism, even in the face of considerable variation in rescue among insertion lines, since values for individual transgene lines were reproducible, and since the range of rescue values for the various quadruple knockout lines (1-5%) did not overlap that for the triple knockouts (7-76%).
TAGteam sites affect the early expression of Sxl
The target of XSE action, the promoter SxlPe, becomes active in female embryos during cycle 12, two cycles before cellularization of the blastoderm. Consistent with such pre-CB expression, SxlPe has a TAGteam cluster within 250 bp of its transcription start site. The melanogaster cluster includes one CAGGTAG, one tAGGTAG and a CAGGcAG doublet (Fig. 5A). From melanogaster to virilis, these sites are either identical or they are replaced by the reverse complement or another TAGteam member (see Fig. 3C). A 1.4-kb SxlPe-lacZ reporter transgene had been reported to faithfully mimic endogenous Sxl sex-specific expression (Estes et al., 1995; Yang et al., 2001). To determine if SxlPe expression depends on TAGteam sequences, we examined the effect of TAGteam mutations on the β-galactosidase levels generated by this reporter in stage 8-11 embryos.
As expected, the wild-type transgene gave a 1:1 ratio of stained to unstained embryos, indicating females and males, respectively, with nearly all females staining darkly (Fig. 5B,F). By contrast, almost no embryos carrying the triple TAGteam mutant showed dark or intermediate staining, and only an estimated 8% of females reached even the lightest staining category (Fig. 5C,G). Two additional transgenes established that the contribution of TAGteam sites to SxlPe regulation is cumulative. Loss of just the two promoter-proximal sites, CAGGTAG and tAGGTAG (mutYAGGTAG), impaired expression (Fig. 5D,H), but not as severely as loss of all three. Again only a few embryos had dark or intermediate staining, but an estimated 50% of females were lightly stained. Even mutation of the CAGGcAG doublet by itself reduced staining (Fig. 5E,I), though not by as much as the loss of the two other TAGteam sites together. Inferences regarding the contribution of the CAGGcAG site are complicated by the fact that this double site partially overlaps an E-box sequence previously implicated in SxlPe regulation (Yang et al., 2001). Although the E-box hexamer itself was purposely left intact in the CAGGcAG mutants, the possibility remains that transcription factor binding to this E-box site could be marginally affected by sequences outside the canonical hexamer motif.
We used in situ hybridization to RNA from the SxlPe-lacZ transgene in 0- to 2-hour embryos both to establish that the effects of TAGteam mutations on β-galactosidase levels reflected effects on transcript levels by the cellular blastoderm stage, and to determine whether TAGteam mutations delayed the onset of Sxl transcription. Fig. 6 shows that, as expected, the level of cytoplasmic mRNA from the TAGteam mutant Sxl reporter was far below that from the wild-type transgene at cycle 14 (compare Fig. 6A and C).
In pre-CB embryos, the time of onset of transcription for a given gene can be deduced through the analysis of `nuclear dots', which reflect the hybridization of probe to nascent mRNA transcripts at the transcribed gene (Erickson and Cline, 1993; O'Farrell et al., 1989; Pritchard and Schubiger, 1996; Shermoen and O'Farrell, 1991). We defined the onset of transcription conservatively as the first nuclear cycle in which the average percentage of dot-containing nuclei exceeded 5% (see Materials and methods).
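The 5% onset rule stated in this paragraph can be expressed as a short sketch. It implements only the simple threshold as worded here (not the fold-increase criterion given in Materials and methods); the function name and the percentages in the example are hypothetical and do not reproduce any measured data.

```python
def onset_cycle(percent_by_cycle, threshold=5.0):
    """Return the first nuclear cycle whose average percentage of
    dot-containing nuclei exceeds `threshold`, or None if none does.

    `percent_by_cycle` maps nuclear cycle number -> average percentage.
    """
    for cycle in sorted(percent_by_cycle):
        if percent_by_cycle[cycle] > threshold:
            return cycle
    return None

# Made-up percentages, for illustration only.
example = {7: 0.0, 8: 0.5, 9: 23.0, 10: 40.0}
print(onset_cycle(example))
```

Returning None when the threshold is never crossed keeps "no detectable onset" distinct from any real cycle number.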
Dots from endogenous Sxl transcripts were reported to first appear at cycle 12, with the proportion of dot-containing nuclei quickly reaching 100% by the end of that cycle (Barbash and Cline, 1995; Erickson and Cline, 1993). We were surprised to find that the unmutated 1.4-kb SxlPe-lacZ reporter transgene was not as faithful a mimic of the endogenous gene as we had expected; nevertheless, it did reveal an unambiguous effect of the triple TAGteam knockout on both the timing and extent of transcription driven by SxlPe. The unmutated reporter initiated transcription during nuclear cycle 9, rather than cycle 12, with the proportion of active nuclei only gradually increasing during subsequent cycles to a maximum of just over 80% (Fig. 6B,E). Pritchard and Schubiger (Pritchard and Schubiger, 1996) had observed a similar gradual increase in nuclear dots for many pre-CB genes, a behavior that had set these genes apart from endogenous Sxl. Moreover, we observed a small but reproducible amount of transcription in the nuclei of males, as deduced from the fact that after cycle 9, all embryos had at least a few nuclei with dots (data not shown). Nuclear dots of hybridization from the endogenous SxlPe had been observed previously in pre-CB male embryos, albeit not in all males examined (D. A. Barbash, PhD Thesis, University of California, 1995). In our study, the average percentage of dot-containing nuclei in males always remained low, consistent with the fact that expression of the reporter appeared female-specific when assayed by β-galactosidase levels (Fig. 5F) or even by cytoplasmic mRNA accumulation (data not shown). In order to determine the onset of transcription of the 1.4-kb SxlPe-lacZ reporter specifically in females, we included only the top 50% of expressing embryos per cycle in our analysis.
The mutant SxlPe reporter initiated transcription three cycles later than the non-mutant, and fewer than 15% of the nuclei appeared to be active at any given time, even by cycle 13 (Fig. 6D,E). Thus, TAGteam sites not only advance the time of onset of Sxl transcription, they also increase the proportion of nuclei that transcribe Sxl during a given cell cycle. By contrast, the TAGteam mutations had no effect on the time at which transcription of the SxlPe reporter shuts off, a key event in the process of sex determination. Nuclear dot frequency plummeted for both the wild-type and mutant reporters shortly after nuclei began elongating early in cycle 14 (data not shown).
TAGteam sites in the zen ventral repression element influence the time of onset of pre-CB transcription
The search for large TAGteam clusters led to zen, a gene expressed as early as the earliest XSEs (Pritchard and Schubiger, 1996), but one not involved in sex determination. The gene has six upstream TAGteam sites, four of which form an exceptionally compact cluster (91 bp) relatively far (1.4 kb) from the promoter (Fig. 7A) in a 600-bp region called the zen ventral repression element (VRE). The zen VRE had been shown to harbor pre-CB transcription activation sequences of unknown identity based on its ability to activate the even-skipped (eve) basal promoter in lacZ reporter constructs (Jiang et al., 1993; Kirov et al., 1993), a basal promoter that responds faithfully to pre-CB regulatory information (Markstein et al., 2002; Small et al., 1992). We assessed the potential role of the TAGteam in VRE functioning using the same reporter-gene strategy.
The intact 600-bp VRE begins to drive cytoplasmic mRNA accumulation in the soma prior to cellularization, with the level peaking early in cycle 14 (Fig. 7B). Accumulation is limited to the dorsal half of the embryo as a result of the action of Dorsal protein that represses zen in the ventral half of the embryo by binding to the VRE (reviewed by Stathopoulos and Levine, 2002). A 250-bp fragment of the VRE containing the four TAGteam sites had been shown to contain sequences necessary for transcriptional activation (Jiang et al., 1993). We found that this 250-bp fragment is not only necessary for activation, it is sufficient: it drove expression as effectively as the full-length 600-bp fragment (Fig. 7D). Expression was nearly uniform, since the 250-bp fragment lacked all but one Dorsal binding site.
We truncated the VRE even further to just a 111-bp fragment containing only the four TAGteam sites and 10 bp on each side. An alignment of this region from the species examined in Fig. 3C showed that the four TAGteam sites account for two-thirds of the nucleotides conserved in all six species. In contrast, the remaining 83 nucleotides of this fragment contained only eleven invariant base pairs, of which the longest contiguous run was only three base pairs that were adjacent to a TAGteam site. Even this minimal fragment drove ubiquitous expression in the early embryo, albeit at a reduced level, particularly at the posterior pole (Fig. 7E). Duplicating this minimal TAGteam cluster fragment increased transcript accumulation to levels comparable to those of the 250-bp VRE fragment (Fig. 7G). In contrast, mutating the four VRE TAGteam sites, whether in the original 600-bp fragment or in the minimal 111-bp fragment, reduced the accumulation of transcript at cycle 14 to low levels (Fig. 7C,F). The effects of the TAGteam mutations appear to be specific to pre-CB expression, since we saw no differences between the TAGteam mutant and wild-type 600-bp VRE fragment through stage (not cycle) 14 with respect to the highly patterned and dynamic post-CB embryonic expression pattern (data not shown).
Analysis of nuclear dots of lacZ RNA hybridization revealed that the minimal 111-bp VRE fragment begins to drive the eve basal promoter during nuclear cycle 9 (Fig. 7I), contemporaneous with the onset of endogenous zen transcription (Erickson and Cline, 1993; Pritchard and Schubiger, 1996). The minimal fragment was active in 23% of the nuclei during the onset cycle. That fraction increased gradually in subsequent cycles, peaking at 90% in nuclear cycle 12. In contrast, transcription driven by the TAGteam mutant 111-bp VRE transgene began only in cycle 10, one cycle later than for the non-mutant transgene, just barely reaching the 5% cutoff for onset even in that cycle. The fraction of nuclei transcribing the mutant transgene peaked in cycle 12, just as it had for the non-mutant, but that peak was only 38%.
Whereas decreasing the number of TAGteam sites retarded initiation of pre-CB transcription, increasing their number appeared to advance it. The onset of transcription from the (2X)111VRE construct was in nuclear cycle 7 (Fig. 7H,I), two cycles earlier than that of the 111VRE construct and one cycle ahead of the earliest point that transcription of endogenous genes has been shown to begin (Erickson and Cline, 1993; Pritchard and Schubiger, 1996). This advance is unlikely to be an artifact of increased staining levels, since cytoplasmic mRNA staining for the 250-bp fragment was comparable to that for the (2X)111VRE construct in early cycle 14 (compare Fig. 7D with G), yet the 250-bp construct initiated expression one cycle later. The lack of concordance between onset of transcription and cycle 14 staining level was even greater between the (2X)111VRE construct and the wild-type 1.4-kb SxlPe construct, which stained much more darkly by cycle 14, despite having begun transcription two cycles later (compare Fig. 6A with Fig. 7G).
hermaphrodite does not encode the TAGteam binding factor
The hermaphrodite (her) gene seemed an attractive candidate as a source of the TAGteam binding factor, since its maternally encoded product is a zinc-finger transcription factor that had been reported to positively regulate not only SxlPe but also non-sex-specific targets of unknown identity in the young embryo (Li and Baker, 1998; Pultz and Baker, 1995; Pultz et al., 1994). Consequently, we determined whether the maternal effect of mutations in her would interfere with expression of the (2X)111VRE zen transgene. It did not, as the level of lacZ mRNA immunostaining from this transgene was the same for the progeny of her l(2)mat homozygous mutant mothers as for the progeny of their heterozygous sisters (data not shown).
Interpretation of this negative result is complicated by the fact that we did not observe the female-biased lethal maternal effect reported for her l(2)mat that had made the gene such an attractive candidate. For both her l(2)mat and the her l(2)mat/her1 heteroallelic combination, we did observe maternal-effect embryonic lethality, but both sexes of progeny were affected equally. Moreover, even when we tried to sensitize daughters to the maternal effect of her l(2)mat/her1 by reducing the dose of Sxl+ to one copy, we saw no sex bias in lethality. Since we verified that the her alleles carried the appropriate molecular lesions, we can only surmise that the effect of her on Sxl is far weaker than originally indicated, and that some undefined and unsuspected aspect of genetic background or culture conditions sensitized Sxl to her in the previous studies. Hence the identity of the protein that recognizes TAGteam sites to drive pre-CB gene expression remains to be determined.
DISCUSSION
The combined bioinformatic and functional analysis reported here establishes that the TAGteam of three heptanucleotide sequences (CAGGTAG, tAGGTAG and CAGGcAG) plays an important, positive role in determining the time of onset of transcription for Drosophila genes whose expression begins prior to the point in the cellular blastoderm stage when widespread transcriptional activation of the genome of the embryo occurs. Previous studies had emphasized the role of maternally loaded repressor proteins in determining when zygotic transcription could begin (Newport and Kirschner, 1982a; Newport and Kirschner, 1982b; Pritchard and Schubiger, 1996). It was proposed that repression would ultimately be relieved by titration of the fixed amount of repressor protein present at fertilization by the increasing amount of zygotic DNA generated as development proceeds. Tramtrack is an example of one such repressor that was shown to control the timing of pre-cellular blastoderm (pre-CB) transcription for the Drosophila gene fushi tarazu (ftz) (Brown et al., 1991; Pritchard and Schubiger, 1996). The onset of transcription driven by a ftz regulatory DNA could be advanced either by eliminating Tramtrack binding sites or by reducing the amount of maternally encoded Tramtrack protein present in the young embryo. Increasing Tramtrack protein had the opposite effect.
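For readers who wish to locate the three TAGteam heptamers in a sequence of interest, a minimal sketch follows. It is not the genome-wide search procedure used in this study; it simply reports exact matches to the three heptamers, and it assumes that a site counts on either strand. The function names are hypothetical, and the example sequence is an invented toy string, not a real Drosophila sequence.

```python
from typing import List, Tuple

# The three TAGteam heptamers named in the text, written in upper case.
TAGTEAM = ("CAGGTAG", "TAGGTAG", "CAGGCAG")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_tagteam_sites(seq: str) -> List[Tuple[int, str, str]]:
    """List (0-based start, strand, heptamer) for every exact TAGteam match.

    Assumption: matches on the minus strand are counted as sites too.
    """
    seq = seq.upper()
    hits = []
    for motif in TAGTEAM:
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = seq.find(pattern)
            while start != -1:
                hits.append((start, strand, motif))
                start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)


# Invented toy sequence for illustration only.
toy = "AACAGGTAGTTCTACCTATT"
for start, strand, motif in find_tagteam_sites(toy):
    print(start, strand, motif)
```

Restricting the input to a window upstream of a transcription start site (for example, 500 bp) would give a per-gene site count of the kind discussed for the XSEs, under the same assumptions.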
In contrast, we show here that eliminating TAGteam sites has the opposite effect on the onset of pre-CB gene expression, retarding transcription driven either by a 111-bp minimal regulatory fragment of the pattern-formation gene zerknüllt (zen) or by the promoter of Sex-lethal (Sxl), the gene that counts fly X chromosomes to determine sex. Hence TAGteam sites must recruit activators, rather than repressors. Our observation that duplication of the minimal TAGteam VRE fragment advances the onset of transcription suggests that precisely when pre-CB genes are first expressed is determined by a balance between specific activator and repressor proteins. Reciprocal changes in the number of binding sites for an activator protein involved in C. elegans pharynx development had been shown to have reciprocal effects on the onset of expression of the mutated target gene (Gaudet and Mango, 2002); however, the gene studied in that case was expressed only after widespread zygotic transcription had already begun.
Analysis of the mechanism by which the TAGteam sites act to influence pre-CB transcription may provide unique insights into transcriptional regulation, since the nuclear environment during this early period appears to differ significantly from that later when general transcription begins. For example, during this early period in Drosophila, the high mobility group protein D (HMG-D) stands in for histone H1 (Ner et al., 2001; Ner and Travers, 1994). Moreover, it has been shown that Xenopus embryos have a uniquely large excess of core histones prior to the midblastula transition (MBT) that may play an important part in pre-MBT transcriptional quiescence (Prioleau et al., 1994). It is not known whether Drosophila embryos have a comparable shift in the level of core histones relative to DNA that correlates with the onset of widespread transcription, but measurements of histone mRNA levels are consistent with the possibility of a large shift (Anderson and Lengyel, 1980). The availability of TATA-binding factor also seems to be a factor limiting transcription in Xenopus only prior to the MBT (Veenstra et al., 1999), but it is not known whether a similar change occurs in Drosophila.
Although the TAGteam is highly over-represented upstream of the genes that serve as the sex-determination signal in D. melanogaster, our discovery that TAGteam sites can also be unusually abundant upstream of early expressed genes like zen and bnk that have nothing to do with sex determination shows that their function is not restricted to sex-determination genes. On the other hand, the TAGteam cannot be the only sequences that mark genes for pre-CB expression, since the mutant 111-bp zen VRE transgene with no TAGteam sequences was still able to initiate pre-CB transcription from the even-skipped promoter, albeit with a delay relative to the nonmutant transgene, and with a reduction in the fraction of nuclei that ultimately express. Moreover, pre-CB genes exist (e.g. nullo) that have no TAGteam heptamer within 2 kb of their transcription start site. As more information becomes available on various Drosophila species with respect not only to DNA sequence, but also to start times of transcription for specific genes, a clearer picture should emerge on the interchangeability of motifs that direct pre-CB gene expression. Already, our analysis of very new data that increased the list of recognized melanogaster pre-CB genes (Pilot et al., 2006) led us to two 1-bp degenerates of CAGGTAG that may earn the certified TAGteam label once their functional contribution to pre-CB gene expression is tested by mutation.
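The notion of a 1-bp degenerate of CAGGTAG can be illustrated by enumerating every heptamer that differs from it at exactly one position. This is only a sketch of the candidate space; it does not reproduce the over-representation scoring used to nominate candidates, and the function name is hypothetical.

```python
from typing import List


def one_bp_variants(motif: str, alphabet: str = "ACGT") -> List[str]:
    """Return all sequences differing from `motif` at exactly one position."""
    motif = motif.upper()
    variants = set()
    for i, base in enumerate(motif):
        for substitute in alphabet:
            if substitute != base:
                variants.add(motif[:i] + substitute + motif[i + 1:])
    return sorted(variants)


candidates = one_bp_variants("CAGGTAG")
print(len(candidates))  # 7 positions x 3 alternative bases

# The two established TAGteam synonyms are members of this candidate set.
print("TAGGTAG" in candidates, "CAGGCAG" in candidates)
```

Because a heptamer has only 21 single-substitution neighbors, each candidate can in principle be assessed individually for over-representation upstream of pre-CB genes before being tested by mutation.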
Thus the TAGteam seems to provide an opportunity to analyze how enhancer sequences change over evolutionary time under circumstances where the organism may have a variety of alternatives available to achieve the same functional goal. But if those alternatives are truly equivalent, it is puzzling why the X-chromosome signal elements (XSEs) in D. melanogaster seem to rely so heavily on the TAGteam. Even run, the only exception to the rule that XSEs in D. melanogaster have three CAGGTAG sites within 500 bp of their transcription start sites, has six TAGteam sites within 3.1 kb upstream. Previously published analysis of the regions we now know contain TAGteam sites showed that they direct the very early, broad expression of run that affects SxlPe (Klingler et al., 1996).
Because the duplicated 111-bp TAGteam fragment from zen drove expression earlier than any endogenous gene is known to be transcribed, it seems likely that any TAGteam binding protein that activates pre-CB transcription will be derived from maternal rather than zygotic gene expression. With our elimination of the genetically characterized hermaphrodite (her) gene as a potential source of this activity (and indeed perhaps even as a regulator of SxlPe), a straightforward biochemical approach may be the most practical route for identifying the relevant protein. Information on the identity of this protein should further understanding of the molecular mechanisms that govern pre-CB transcription and reveal whether those mechanisms differ qualitatively from those governing the subsequent widespread activation of zygotic gene expression.
Supplementary material
Acknowledgements
We thank M. Markstein, J. Erickson and B. Baker for providing stocks and reagents, and thank B. Meyer, M. Stern, and members of the Cline lab for helpful discussions and comments on the manuscript. This work was supported by US National Institutes of Health grant GM-23468 to T.W.C.