ABSTRACT
The Polycomblike gene of Drosophila melanogaster, a member of the Polycomb Group of genes, is required for the correct spatial expression of the homeotic genes of the Antennapedia and Bithorax Complexes. Mutations in Polycomb Group genes result in ectopic homeotic gene expression, indicating that Polycomb Group proteins maintain the transcriptional repression of specific homeotic genes in specific tissues during development. We report here the isolation and molecular characterisation of the Polycomblike gene. The Polycomblike transcript encodes an 857 amino acid protein with no significant homology to other proteins. Antibodies raised against the product of this open reading frame were used to show that the Polycomblike protein is found in all nuclei during embryonic development. Antibody staining also revealed that the Polycomblike protein is found on larval salivary gland polytene chromosomes at about 100 specific loci, the same loci at which the Polycomb and polyhomeotic proteins, two other Polycomb Group proteins, are found. These data add further support for a model in which Polycomb Group proteins form multimeric protein complexes at specific chromosomal loci to repress transcription at those loci.
INTRODUCTION
In the insect Drosophila melanogaster, morphological diversity of the body segments arises during embryogenesis by selective expression of homeotic genes of the Antennapedia (ANT-C) and Bithorax (BX-C) Complexes (see Ingham, 1988 for a review). Spatial misexpression of the homeotic selector genes can result in inappropriate development of body segments, leading to the formation of structures characteristic of other segments. Clonal analysis has demonstrated that correct homeotic gene expression patterns must be maintained throughout most of Drosophila development to yield the normal adult segmental structures (Morata and Garcia-Bellido, 1976).
A number of trans-regulatory factors are involved in establishment and maintenance of the pattern of homeotic gene expression. One group of unlinked genes, known as the Polycomb group (PcG), appears to regulate homeotic expression by acting as transcriptional repressors of the BX-C and ANT-C. Mutations of PcG genes result in ectopic expression of homeotic genes within the BX-C and ANT-C, generating homeotic transformations of body segments (Jürgens, 1985; Wedeen et al., 1986; McKeon and Brock, 1991).
The PcG is composed of at least twelve genes. While all known members exhibit mutant phenotypes consistent with misexpression of the ANT-C and BX-C, double and triple mutants exhibit, in general, much stronger transformations than single mutants (Jürgens, 1985). It is noteworthy that lesions in one member of the PcG, Polycomb (Pc), cause transformations that are as strong as those normally observed in double mutants. Differences in PcG mutant phenotypes may reflect differing maternal contributions within the PcG (Breen and Duncan, 1986). Various loci in the group exhibit pleiotropic effects such as segmentation defects and interactions with the zeste locus, indicating that genes of the PcG have more general roles than regulation of the ANT-C and BX-C (Ingham, 1984; Breen and Duncan, 1986; Dura and Ingham, 1988; Smouse et al., 1988; Adler et al., 1991; Moazed and O’Farrell, 1992).
Several lines of evidence have led to suggestions that the products of the PcG genes form multimeric complexes at specific chromosomal locations. Two of the PcG proteins, Polycomb (Pc) and polyhomeotic (ph), have been shown to be localized to the same set of polytene chromosome loci, including those of the ANT-C and BX-C (Zink and Paro, 1989; DeCamillis et al., 1992). The protein encoded by another member of the group, Posterior sex combs (Psc), appears to localize to a set of sites on polytene chromosomes that overlaps the set of Pc/ph sites (Martin and Adler, 1993; Rastelli et al., 1993), although co-immunolocalisation has yet to be reported. Anti-Pc antibodies immunoprecipitate formaldehyde cross-linked chromatin from within the BX-C (Orlando and Paro, 1993). Localisation of binding to Antennapedia sequences was confirmed by showing that the Pc protein binds to insertion sites of transgenes containing PcG-responsive Antennapedia regulatory regions (Zink et al., 1991). Significantly, this binding is sensitive to the position of insertion of the transgene and binding is not observed in cases where ectopic expression of the transgene is observed. Finally, the Pc and ph proteins were shown to coprecipitate from embryonic nuclear extracts (Franke et al., 1992).
Circumstantial evidence suggests that the transcriptional repression is mediated through regional changes in chromatin structure. The Pc protein has significant homology to the HP-1 heterochromatin-specific protein in a region termed the ‘chromodomain’. HP-1 is encoded by the Su(var)205 locus (Paro and Hogness, 1991). Mutations at this locus suppress an independent mutant phenotype termed Position Effect Variegation (PEV). PEV occurs as a result of the translocation of a euchromatic gene next to heterochromatic sequences. PEV is the heritable inactivation of the expression of such a translocated gene in particular somatic cells during development. It is thus likely to be caused by a property of the heterochromatin. According to one model, for example, the heterochromatin in a subset of cells spreads to include the translocated gene, inactivating it (see Henikoff, 1990 for a review). Various loci of the PcG are able to enhance or repress this effect in a gene-dosage-dependent manner (Sinclair, Clegg, Grigliatti and Brock, personal communication), further implicating the PcG in chromatin organisation.
Despite the similarities between PcG mutant phenotypes, the molecular and functional properties of PcG proteins are diverse. Of the four PcG genes whose molecular characterisation has been reported, no two share significant homology, although several have identifiable domains. As described above, the 44×10³ Mr Pc protein contains a conserved 37 amino acid region, termed the ‘chromodomain’. The Psc gene encodes a putative 175×10³ Mr protein containing a 200 amino acid domain found also in the murine oncogene bmi-1 (Brunk et al., 1991). This domain is postulated to bind DNA, although the DNA-binding properties of the Psc protein are yet to be reported. The ph gene encodes a 169×10³ Mr protein containing a single putative zinc finger and glutamine-rich repeats (DeCamillis et al., 1992). Finally, the Enhancer of zeste (E(z), also known as polycombeotic, pco) gene encodes an 87×10³ Mr protein with sequence similarity to the Drosophila trithorax and human ALL-1/Hrx proteins (Jones and Gelbart, 1993).
Many members of the PcG remain to be characterized. One such member, the Polycomblike (Pcl) gene, was assigned and named on the basis of the similarity of its mutant phenotype to that of the Polycomb gene (Duncan, 1982). We describe here the isolation and characterisation of the Pcl gene and show that it encodes a putative 857 amino acid protein that is localized to polytene chromosomes in a pattern identical to that found for both the Pc and ph proteins, further supporting the proposal that the PcG proteins form a multimeric complex that represses the transcription of homeotic genes during development.
MATERIALS AND METHODS
General techniques
Routine methods including screening of libraries, cloning, northern and Southern analysis are described in Sambrook et al. (1989). DNA sequencing was performed using the Pharmacia Nested Deletion and Sequenase kits.
Preparation of antisera and immunolocalisation of proteins on polytene chromosomes
Fusion proteins were prepared using the pGEX glutathione-S-transferase bacterial expression system (Amrad). A 1.3 kb XmnI fragment of the Pcl cDNA (see Fig. 3A) was cloned into pGEX1 and transformed into DH5α bacteria. Expression of the glutathione-S-transferase-Pcl fusion protein was induced by the addition of IPTG. The bacteria were harvested and sonicated until lysis was complete, and the insoluble fraction was collected by centrifugation. (The majority of the fusion protein was found to be in the insoluble fraction.) This fraction was resuspended in 1% Triton X-100, 1% Tween-20 and subjected to SDS-PAGE. The fusion protein was excised from the gel, mixed with an adjuvant and used to immunise rabbits. After three boost injections, blood was collected and serum was tested for Pcl-binding activity on western blots of Pcl fusion proteins. The antiserum was further purified by passage successively over two affinity columns: the first contained whole bacterial lysate (covalently bound to CNBr-Sepharose), from which the eluate was collected; the second contained Pcl fusion protein, from which the bound, Pcl-specific antibodies were eluted and used for immunohistochemistry. Immunostaining of chromosomes followed the procedure of Franke et al. (1992).
In situ hybridisation
Whole-mount in situ hybridisations were performed with staged Canton-S embryos according to the protocol of Tautz and Pfeifle (1989), using digoxigenin probes (Boehringer Mannheim) prepared by random priming with 10× the normal concentration of oligonucleotides (C. Oh and B. Edgar, unpublished modifications of Feinberg and Vogelstein, 1983).
RESULTS
Isolation of the Polycomblike gene
The Pcl gene had previously been localised to cytological region 55A on the right arm of chromosome 2. Characterisation of deficiencies within this region revealed the Pcl gene to be genetically close to the maternal effect gene, staufen (stau) (R. Tearle, personal communication). A cosmid (cos8.1), isolated as part of the cloning and characterisation of stau, was found to rescue the Pcl mutant phenotype (D. St. Johnston, personal communication).
Characterisation of the region contained within the cos8.1 genomic clone revealed the presence of three transcription units in addition to the stau transcription unit (summarized in Fig. 1). Two of these correspond to previously characterized genes. The one closest to stau was found by Southern blot hybridisation to correspond to the extensively characterized HSF gene (Clos et al., 1990) while the one furthest from stau was found by sequence analysis to correspond to the Poly(A)-binding Protein (PABP) gene (Lefrere et al., 1990). Both of these transcripts were considered unlikely to correspond to Pcl. A third previously undescribed transcription unit, represented by cDNA clone AL15, lay between the HSF and PABP genes.
To determine which transcription unit corresponds to the Pcl gene, P-element-induced insertion alleles of Pcl were generated by mobilising a P-element inserted in stau (St. Johnston et al., 1991). Two such insertion alleles, termed PclP1 and PclP2, were generated. Southern blot analysis showed both insertions to lie adjacent to the 5′ end of the AL15 cDNA (results not shown). DNA covering the site of insertion of the P-element in the PclP1 allele was isolated and its sequence determined. The P-element was found to have inserted 39 nucleotides upstream of the 5′ end of the 3.9 kb AL15 cDNA (Fig. 1 and see below) and approximately 500 base pairs from the HSF gene. A possible TATA box exists 12 bp downstream of the P-element insertion site. P-element-induced mutations frequently involve insertion very close to the transcription start site of the mutated gene (Tsubota et al., 1985; Chia et al., 1986; Searles and Volker, 1986; Kelley et al., 1987; Roiha et al., 1988). We conclude that the transcription unit defined by cDNA clone AL15 corresponds to the Pcl locus.
Characterisation of the Pcl transcript
The longest Pcl cDNA clone isolated was 3.9 kb in length (Fig. 1). Shorter clones appeared to correspond to incomplete cDNAs, as they formed a nested set with a common origin at the poly(A) tail. The entire cDNA was used as a probe to determine the size of the Pcl transcript on a northern blot (Fig. 2). Only one species of transcript was detected, corresponding in size to the largest cDNA that was isolated, which suggests that AL15 is a full-length cDNA. Sequence analysis revealed the presence of one major open reading frame (ORF), capable of encoding an 857 amino acid protein (Fig. 3). Two overlapping consensus start sequences (Cavener, 1987) were associated with two consecutive, in frame methionine codons at the 5′ end of this ORF. We have assumed that the first ATG is that used in vivo, although the second ATG also forms a reasonable consensus start site. The sequence of the Pcl protein was compared to sequences in all available databases, but no significant homologies were identified. Thus, like all other known members of the PcG, the Pcl protein shares no homology with any other PcG protein. A consensus polyadenylation site is present 21 nucleotides 5′ to the poly(A) tail in the cDNA (Fig. 3).
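The kind of sequence scan described above (locating the major ORF, noting consecutive in-frame ATG codons, and finding a polyadenylation signal) can be illustrated with a minimal sketch. The function names and the toy sequence are hypothetical and chosen only for illustration; the sequence is not the Pcl cDNA, the AATAAA hexamer is assumed as the consensus signal, and the scan considers the forward strand only. It is not the analysis procedure used in this study.

```python
# Illustrative sketch only: toy_cdna is a made-up sequence, not the Pcl cDNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, codon_count) for the longest ATG-initiated ORF
    on the forward strand, or None if there is none.

    start is the 0-based index of the first ATG, end is the 0-based index
    of the stop codon, and codon_count excludes the stop codon.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None  # first ATG of the currently open reading frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if best is None or n_codons > best[2]:
                    best = (start, i, n_codons)
                start = None  # close this ORF and look for the next ATG
    return best


def polya_signals(seq, signal="AATAAA"):
    """Return the 0-based positions of every occurrence of the signal."""
    seq = seq.upper()
    width = len(signal)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == signal]


# Toy cDNA with two consecutive in-frame ATG codons, a stop codon,
# one AATAAA hexamer and a short poly(A) tail.
toy_cdna = "GGCATGATGGCTCCGAAATAGCCAATAAACGTTTAAAAAAAA"
orf = longest_orf(toy_cdna)
signals = polya_signals(toy_cdna)
print(orf, signals)
```

Because the scan starts each ORF at the first ATG it meets, it mirrors the assumption made in the text that the first of two consecutive in-frame methionine codons is the one used.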
Digoxigenin-labelled DNA probes were prepared from Pcl cDNA clone AL15 and hybridised in situ to Drosophila whole-mount embryos. Pcl mRNA was found to be evenly distributed throughout all tissues at all stages of embryogenesis (data not shown).
Characterisation of the Pcl protein
To characterise the Pcl protein, rabbit antibodies were raised against the 857 amino acid product of the ORF encoded by the Pcl transcript. A 1.3 kb XmnI fragment of Pcl cDNA clone AL15 (see Fig. 3) was ligated to the 3′ end of the glutathione-S-transferase (GST) gene in the vector pGEX2. The GST-Pcl fusion protein was produced in a bacterial expression system, isolated on SDS-polyacrylamide gels and used to immunise rabbits. Sera collected from the animals were affinity purified on CNBr-Sepharose columns to which fusion protein had been bound. Probing of western blots of the GST-Pcl fusion protein demonstrated that the antibodies were specific for the Pcl portion of the fusion protein (data not shown). Subsequently, we used the purified antisera to probe embryos that were homozygous for the Df(2R)PC4 deletion. The homozygous deficient embryos were also missing the closely linked three rows gene, a gene required for chromosome segregation and cytokinesis, but not for any other aspect of cell cycle progression (D’Andrea et al., 1993), so that 12-15 hour homozygous Pcl mutant embryos were identified by virtue of their three rows phenotype. In contrast to wild-type and Df(2R)PC4 heterozygous embryos, the homozygous mutant embryos showed no staining with the affinity purified anti-Pcl antibodies, indicating that the antibodies are specific for Pcl in vivo (Fig. 4). The staining of all cleavage, blastoderm and gastrulating embryos derived from heterozygous Df(2R)PC4 parents indicated the presence of a maternal Pcl component in the homozygous deficient embryos, consistent with the genetic analysis of Pcl mutants (Breen and Duncan, 1986).
Pcl is a nuclear protein that binds to specific loci on polytene chromosomes
The distribution of Pcl in embryos was examined by incubating fixed whole-mount embryos with the anti-Pcl primary and horseradish peroxidase linked secondary antibodies (Fig. 5). By comparing the Pcl staining pattern with that of the DNA-specific fluorochrome, DAPI, Pcl was found to be localised to the nucleus (data not shown). The protein was found in all nuclei throughout development (Fig. 5 and data not shown), consistent with the distribution of Pcl mRNA.
Two other PcG-encoded proteins, Pc and ph, are known to bind to an identical set of approximately 100 sites on salivary gland polytene chromosomes (Franke et al., 1992). We used double-labelled antibody stainings to demonstrate that the Pcl protein binds to the same set of sites as the Pc protein (Fig. 6 and results not shown). As found for the Pc protein, two of the most intensely stained bands corresponded to the ANT-C and BX-C at cytological locations 84A-B and 89E, respectively (results not shown).
Segments of DNA derived from the Antp regulatory region and transposed to new chromosomal locations are capable of binding Pc (Zink et al., 1991), as demonstrated by antibody staining polytene chromosomes carrying such transposed fragments. We used this system to show that Pcl also binds to the same transposed DNA segments derived from the Antp regulatory regions. For example, a new Pcl-binding site in polytene chromosomes was found to be present on the third chromosome at the site of insertion of the pAPT2.3 construct (Fig. 7). This construct contains approximately 12.6 kb of DNA from the region of the Antp P2 promoter.
DISCUSSION
The Polycomb Group (PcG) of genes is defined on the basis of a common homeotic mutant phenotype that results from derepression of the genes of the ANT-C and BX-C. As such, they may represent a unique mechanism for the negative regulation of eukaryotic genes. The extraordinary conservation of the homeotic gene clusters (reviewed in McGinnis and Krumlauf, 1992) and, possibly, of their regulation (Awgulewitsch and Jacobs, 1992; Malicki et al., 1992) makes it likely that the PcG-mediated repression mechanism will be found to operate throughout the metazoa. The finding that proteins encoded by the Pc, Psc and E(z) genes share sequence similarity with mammalian proteins supports this speculation (Singh et al., 1991; Brunk et al., 1991; Jones and Gelbart, 1993).
We report here the molecular isolation and characterisation of the Polycomblike (Pcl) gene, a member of the PcG. The gene was identified by the molecular characterisation of transcripts from a cosmid that rescued the Pcl mutant phenotype and from the molecular analysis of a P insertion Pcl allele.
Sequence analysis of Pcl cDNA clones revealed the presence of a single large open reading frame that encodes an 857 amino acid protein. Exhaustive searching of databases yielded no clues to the function of this protein. It seems likely that at least one member of the PcG will encode a protein that exhibits sequence-specific DNA-binding activity, but Pcl does not possess any known DNA-binding motifs. Furthermore, the relatively weak phenotype of Pcl mutants in comparison to, for instance, Pc mutants, argues that Pcl is unlikely to provide sequence specificity for the PcG complex.
Although the PcG genes isolated to date exhibit no molecular homology with each other, the similarities in mutant phenotypes, i.e. derepression of the homeotic genes, argue that they are involved in the same process. The studies of the Pcl gene and its protein product reported here show that it shares certain characteristics with the other PcG genes reported to date. Pcl, like Pc, Psc and ph, is a nuclear protein that is expressed throughout embryonic development. This result is not surprising, as genetic analysis shows that Pcl function is required throughout development (Duncan, 1982). Pcl mutants exhibit both embryonic and adult phenotypes: Pcl null mutant homozygotes die during embryogenesis, while heterozygotes exhibit an adult phenotype in which the second and third legs of males are transformed to resemble the first leg. This generalised expression of PcG members suggests that the transcriptional repression of particular homeotic genes in a subset of tissues cannot be due to specific expression of PcG members in those tissues.
Another functional similarity between the PcG proteins analysed to date is that the Pcl, Pc and ph proteins bind to the same large number of loci on polytene chromosomes. The observation that Pcl represents a third member of the PcG family of proteins that co-localises to specific sites on polytene chromosomes further strengthens the proposal that a heteromeric complex of PcG proteins binds to specific sites in chromosomes to repress particular genes (Paro, 1990).
While Pcl exhibits certain functional similarities with other PcG genes, it is also similar in having no molecular homology to any other member of the group. Thus while the different PcG genes are functionally related, the known PcG proteins share no molecular similarities. Each gene, therefore, appears to play a unique role in the repressing mechanism. Loss of any part of the PcG complex would be expected to lead to loss of function, so mutations in any one PcG gene would give a similar phenotype, despite its unique function within the complex. Phenotypes of PcG mutants are not, however, identical. This suggests that there are differences in the function of individual members, despite the overall functional similarity. Some of the differences in the zygotic null phenotypes very likely reflect differences in the provision and/or stability of the different maternal components. It is also possible, however, that particular PcG proteins exhibit subtle differences in function at different loci and in different cell types.
Our studies have not addressed the basis of specificity of binding of PcG proteins to specific sites on chromosomes. In contrast to the tissue specificity of repression of particular homeotic genes, Pcl, like the previously described Pc and ph proteins (Franke et al., 1992), is present in all nuclei examined. This suggests that the PcG-mediated repression of particular genes in some, but not all, tissues is not a function of the confinement of PcG expression to those tissues. Thus, the fact that ANT-C and BX-C genes are repressed only in segments that do not actively express those genes during the establishment phase of homeotic gene expression suggests that the postulated heteromeric PcG complex is able specifically to recognise a transcriptionally inactive gene. This recognition may occur at the level of chromatin structure, or may be the result of a PcG complex interacting with transcription factors that regulate ANT-C and BX-C expression. Understanding the basis of this recognition is likely to require further characterisation of the nature of the PcG genes and the molecular properties of the proteins they encode.
ACKNOWLEDGEMENTS
The authors would like to thank Carl Wu for unpublished information and for providing the HSF cDNA clone and Rick Tearle for providing unpublished information. A.L. was supported by an Australian Postgraduate Research Award and R.D’A by an Australian Research Council QEII Fellowship. This work was supported by the Australian Research Council.