Recent advances have shed new light on how the Q50 homeoproteins act in Drosophila. These transcription factors have remarkably similar and promiscuous DNA-binding specificities in vitro; yet they each specify distinct developmental fates in vivo. One current model suggests that, because the Q50 homeoproteins have distinct biological functions, they must each regulate different target genes. According to this ‘co-selective binding’ model, significant binding of Q50 homeoproteins to functional DNA elements in vivo would be dependent upon cooperative interactions with other transcription factors (cofactors). If the Q50 homeoproteins each interact differently with cofactors, they could be selectively targeted to unique, limited subsets of their in vitro recognition sites and thus control different genes. However, a variety of experiments question this model. Molecular and genetic experiments suggest that the Q50 homeoproteins do not regulate very distinct sets of genes. Instead, they mostly control the expression of a large number of shared targets. The distinct morphogenic properties of the various Q50 homeoproteins may principally result from the different manners in which they either activate or repress these common targets. Further, in vivo binding studies indicate that at least two Q50 homeoproteins have very broad and similar DNA-binding specificities in embryos, a result that is inconsistent with the ‘co-selective binding’ model. Based on these and other data, we suggest that Q50 homeoproteins bind many of their recognition sites without the aid of cofactors. In this ‘widespread binding’ model, cofactors act mainly by helping to distinguish the way in which homeoproteins regulate targets to which they are already bound.
Homeoproteins are a family of transcription factors that share an homologous DNA-binding domain, the homeodomain (Burglin, 1994). These proteins are found in animals, plants and fungi, and typically regulate aspects of development. A particularly well-characterized, highly conserved group of homeoproteins specify positional information along the anterior/posterior axis of many animals, including nematodes, fruit flies and vertebrates (McGinnis and Krumlauf, 1992; Burglin, 1994; Manak and Scott, 1994). These homeoproteins all contain a glutamine residue at position 50 of the homeodomain, and we refer to them as the ‘Q50 homeoproteins’.
In vitro, the Q50 homeoproteins all bind with similar specificity to a short, degenerate DNA consensus sequence that occurs frequently in a majority of genes. In this review, we consider the relationship between these in vitro properties and the biological functions of these proteins. We focus on the Drosophila Q50 homeoproteins encoded by the segmentation genes even-skipped (eve), fushi-tarazu (ftz), and engrailed (en), and those encoded by the Hox (or homeotic) genes, which include abdominal A (abd A), Deformed (Dfd) and Ultrabithorax (Ubx).
HOW DIFFERENT ARE THE MORPHOGENIC FUNCTIONS OF Q50 HOMEOPROTEINS?
engrailed and the Hox genes are so-called ‘selector genes’ (Garcia-Bellido, 1975; Lawrence, 1992). They are each expressed in unique patterns in the early embryo and their function is required persistently after this time (Castelli-Gair et al., 1994 and references therein). The selector genes divide the animal into a series of unique domains, called ‘compartments’, along the anterior/posterior axis. Generally, only one or two selector genes are highly expressed and active in each compartment. The selector genes prevent mixing of cells between adjacent compartments and provide cells with different developmental fates. For example, during the development of wild-type Drosophila, cells in the primordia of the third thoracic dorsal appendage divide repeatedly and then differentiate to form the haltere. In Ubx mutant animals, the level of Ubx protein in these primordia is reduced. This alters the way these cells divide and differentiate, and results in the formation of a much larger wing appendage in place of a haltere (Lewis, 1978).
Although selector genes control the formation of body regions that differ dramatically in appearance from each other, clonal analysis and morphological studies suggest that these genes do this by regulating the same processes. These common processes include: rates of cell division, orientation of cell divisions, cell affinities, differentiation into specific cell types, cell size and shape, and cell movement events such as axonal outgrowth (Garcia-Bellido, 1975; Postlethwait, 1978; Lawrence, 1992; Castelli-Gair et al., 1994; Bienz and Hart, 1996). For example, groups of cells that give rise to separate segments undergo different numbers of cell divisions, indicating that all the selector genes control rates of cell proliferation and/or final cell numbers. As a second example, one of the chief differences between segments is that they contain altered arrangements of the same cell types. (All segments have bristles, for instance, but different segments have distinct patterns of bristles.) Thus, selector genes must determine which cells within each segment express cell-type-specific genes like scute and shaven, which affect bristle formation in all segments (Garcia-Bellido, 1975; Lawrence, 1992).
Recent molecular genetic experiments also suggest similarities in the action of selector genes. Normally, the Hox proteins specify distinct segmental identities. However, when ectopically expressed in certain tissues, these proteins can direct similar developmental fates (Greig and Akam, 1995; Casares et al., 1996). For example, artificially expressing either Ubx, abd A or Abdominal B (Abd B) in cells that normally give rise to the wing will cause these cells to form haltere tissue instead. This is a surprising result as abd A and Abd B normally control formation of the abdomen, a region of the body that lacks both wings and halteres. In this and similar instances, since the same morphogenic processes are being directed, the same target genes are probably being activated and/or repressed in the same manner.
Studies of ‘phenotypic suppression’ provide further evidence that the Hox proteins may control many of the same targets. In contrast to the experiments described above, studies of ‘phenotypic suppression’ have assayed Hox proteins in tissues and at times of development where they act differently from one another. In these experiments, when Hox proteins are coexpressed in the same cells, some are able to dominantly suppress the action of others. Those Hox proteins controlling cell fates in more posterior regions of the animal usually dominate this competition, and it has been suggested that this ‘phenotypic suppression’ results from competition for common target elements (e.g. Gibson and Gehring 1988; Gonzalez-Reyes et al., 1990; Sanchez-Herrero et al., 1994).
It is also possible that eve and ftz act similarly to the selector genes. In early development, eve and ftz establish the expression of the selector genes. Consequently, part of their effect on morphogenesis is indirect (Lawrence, 1992). However, at later stages, eve and ftz are expressed in a subset of neurons where they affect cell fate and axonal outgrowth (Doe et al., 1988). Also, eve becomes expressed in the posterior of the embryo where this protein has been suggested to act as a selector protein (Ahringer, 1996). Thus, eve and ftz may directly regulate the same process and genes as the selectors.
THE RANGE OF GENES UNDER THE CONTROL OF Q50 HOMEOPROTEINS
What then is known about the target genes that are bound and directly regulated by the Q50 homeoproteins? Unfortunately, at present, only a few direct regulatory interactions have been demonstrated rigorously. However, based on their patterns of expression or on genetic data, many other genes have been identified that could be targets of the Q50 homeoproteins. We refer to both the potential and the characterized direct targets as ‘downstream genes.’
Given the complex nature of morphogenesis, perhaps it is not surprising that downstream genes encode a wide variety of proteins. These proteins include: growth-factor-like molecules, membrane receptors, cell adhesion molecules, structural proteins such as tubulin, enzymes, many transcription factors, cell cycle control genes, and genes regulating the orientation of cell divisions (reviewed by Botas, 1993; Follette and O’Farrell, 1997). Indeed, work by the Berkeley Drosophila Genome Project suggests that there are thousands of downstream genes. Analysis of randomly selected genes indicates that, although some genes are uniformly expressed in all cells, the majority are expressed in patterns that suggest that they are controlled by the Q50 homeoproteins (http://fruitfly.berkeley.edu; http://flyview.uni-muenster.de). In addition to the previously mentioned fact that eve and ftz control the expression of the selector genes, en and the Hox genes also regulate each other’s expression (Lawrence, 1992; Botas, 1993). Positive and negative feedback loops also occur, in which downstream genes regulate transcription of the Hox and segmentation genes (e.g. Harding et al., 1986; Riese et al., 1997). Thus, the regulatory network is complex and the control of downstream genes by Q50 homeoproteins is partially indirect, being mediated via other transcription factors.
Consistent with the evidence that many Q50 homeoproteins control the same morphological processes, the expression of many downstream genes is genetically under the control of multiple segmentation and Hox genes (e.g. Vachon et al., 1992; Gould and White, 1992; O’Hara et al., 1993; Manak et al., 1995; Mastick et al., 1995; Gould et al., 1997). Typically, these homeoproteins positively or negatively regulate the same genes to varying degrees, often modulating transcript levels by anywhere between twofold and 50-fold. As a result, most downstream genes are expressed in different segmentally repeating patterns, in which the level of expression varies from segment to segment, as does the precise timing of expression, and the array of cells transcribing these genes. Even when loss-of-function mutations in one Q50 homeoprotein gene do not affect the expression of a gene controlled by many other Q50 homeoproteins, it cannot be assumed that the first homeoprotein does not or cannot regulate the potential target. For example, some of the genes bound most strongly by Q50 homeoproteins in embryos appear to be controlled by these proteins in a redundant manner (i.e., in parallel with other transcription factors that share the same function) (Walter et al., 1994; Laney and Biggin, 1996).
DNA BINDING OF Q50 HOMEOPROTEINS IN VITRO
The homeodomain is a highly conserved structure recognizing a six nucleotide consensus DNA sequence, NNATTA (Gehring et al., 1994). The glutamine residue at position 50 of the homeodomain is flexible and can make specific contacts with any of several different bases located at either of the two nucleotide positions just 5′ of the ATTA sequence (Gehring et al., 1994; Hirsch and Aggarwal, 1995; Billeter et al., 1996). This may explain why many Q50 homeodomains exhibit only fivefold differences in affinity or less for most variants of the consensus sequence, with dissociation constants typically between 10⁻⁹ and 10⁻¹⁰ M (e.g. Florence et al., 1991; Ekker et al., 1992; Gehring et al., 1994). Different Q50 homeodomains show very similar preferences among these DNA sequences (e.g. Desplan et al., 1988; Ekker et al., 1992; Walter and Biggin, 1996), whereas homeodomains that have other amino acids at position 50 show significantly altered preferences for variants of the consensus sequence (e.g. Treisman et al., 1989).
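To illustrate why a fivefold spread in affinity need not translate into a large difference in site occupancy, the following minimal sketch applies the standard equilibrium expression for simple one-to-one binding, occupancy = [P] / ([P] + Kd). It is an illustration only: it assumes non-cooperative binding, free protein in excess over sites, and no competition from non-specific DNA, and the free protein concentration used (10⁻⁸ M) is a hypothetical value, not one taken from the studies cited above. The function name is likewise our own.

```python
def fractional_occupancy(protein_conc_m: float, kd_m: float) -> float:
    """Equilibrium fraction of sites bound for simple 1:1 binding.

    protein_conc_m: free protein concentration in mol/L (assumed known).
    kd_m: dissociation constant of the site in mol/L.
    """
    if protein_conc_m < 0 or kd_m <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_conc_m / (protein_conc_m + kd_m)


# Hypothetical free protein concentration; NOT a measured value.
free_protein_m = 1e-8

# Two sites whose affinities differ fivefold, within the range quoted above.
strong_site = fractional_occupancy(free_protein_m, 1e-10)
weak_site = fractional_occupancy(free_protein_m, 5e-10)

print(f"strong site occupancy: {strong_site:.3f}")
print(f"weak site occupancy:   {weak_site:.3f}")
```

Under these assumptions both sites are close to saturation, so the fivefold affinity difference changes occupancy only slightly. At protein concentrations near or below the dissociation constant, the same affinity difference would have a much larger effect.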
The ability of Q50 homeoproteins to bind strongly to a range of DNA sequences has been shown, in some cases, to be enhanced by homomeric cooperative interactions. For En, Eve and Ubx, these interactions cause increased occupancy of lower affinity sites that do not exactly match the consensus sequence and are mediated by amino acid sequences outside the homeodomain (Desplan et al., 1988; Beachy et al., 1993; Austin and Biggin, 1995). Cooperative interactions frequently occur between molecules bound to clusters of sites separated by up to 200 base pairs of DNA, resulting in looping of the intervening DNA. Further, this cooperative binding at a distance appears to be important as it is essential for repression of transcription in vitro (Austin and Biggin, 1995; TenHarmsel and Biggin, 1995).
High and moderate affinity Q50 homeoprotein recognition sites are present at similar densities throughout the length of both known target genes and Drosophila genes chosen at random (Walter and Biggin, 1996). Typically, these binding sites occur in clusters containing between two and ten sites and are present at an overall density of five to ten monomer binding sites per kilobase pair (Appel and Sakonju, 1993; Walter and Biggin, 1996). It is not known whether the majority of these sites are functionally significant or whether most sites occur fortuitously. However, intriguingly, the frequency of sites in Drosophila genes appears to be somewhat greater than that occurring by chance in bacterial phage and plasmid DNA (Desplan et al., 1985; Walter and Biggin, 1996).
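As a rough illustration of how the density of consensus matches in a sequence can be estimated, the sketch below counts occurrences of the NNATTA pattern on both strands and expresses the result per kilobase pair. This is simple pattern matching, not the binding-based measurement used in the studies cited above: it cannot distinguish high, moderate and low affinity sites, and the function names are our own.

```python
import re

# Translation table for complementing an uppercase DNA string.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Lookahead so that overlapping matches are all counted.
_NNATTA = re.compile(r"(?=[ACGT]{2}ATTA)")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_consensus_sites(seq: str) -> int:
    """Count NNATTA matches on both strands.

    A palindromic core such as TAATTA matches on each strand and is
    therefore counted twice.
    """
    seq = seq.upper()
    forward = len(_NNATTA.findall(seq))
    reverse = len(_NNATTA.findall(reverse_complement(seq)))
    return forward + reverse


def sites_per_kb(seq: str) -> float:
    """Consensus matches per 1000 base pairs of sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return count_consensus_sites(seq) * 1000 / len(seq)


# Toy 1000 bp sequence built from a repeated 8 bp unit (not real genomic DNA).
toy_sequence = "GGATTACC" * 125
print(sites_per_kb(toy_sequence))
```

Applied to real sequences, a count of this kind gives only an upper-level view of potential sites; whether a given match is bound at high or moderate affinity has to be determined experimentally.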
The ability to bind strongly to a majority of Drosophila genes distinguishes the Q50 homeoproteins from at least some other eukaryotic regulators. Under comparable conditions, the Drosophila regulator zeste, which is not a homeoprotein, shows at least a 100-fold difference in affinity between its known targets and other genes (Walter and Biggin, 1996). However, the Q50 homeoproteins are not unique in possessing very broad and similar DNA-binding specificities in vitro. Other groups of eukaryotic transcription factors, including at least some other classes of homeoprotein, recognize short, degenerate DNA sequences (e.g. Baumruker et al., 1988; Johnson and McKnight, 1989).
COFACTORS AND HOMEOPROTEIN RESPONSE ELEMENTS
Given the complexity of the regulatory network, how can we learn which genes are directly regulated by the Q50 homeoproteins, and how do we determine the mechanisms that they employ? One approach has combined the systematic mutagenesis of transgenic promoter constructs with in vitro DNA-binding studies (e.g. Jiang et al., 1991; Han et al., 1993; Chan et al., 1994; reviewed by Gross and McGinnis, 1996a). To overcome the problem that Q50 homeoprotein recognition sites can potentially be bound by many proteins, second site suppression experiments have been used to determine which homeoproteins act directly through particular cis regulatory elements (e.g. Schier and Gehring, 1992; Sun et al., 1995).
Some general lessons can be abstracted from these studies. First, homeoprotein response elements are localized to certain promoter regions, sometimes far upstream of the transcription start site or in introns, and these autonomous units usually span some hundreds of base pairs of sequence. Second, in the cases where it has been tested, response elements are either activated or repressed by multiple homeoproteins (e.g. Manak et al., 1995; Gould et al., 1997). It is not certain if there are natural elements that respond to only one Q50 homeoprotein. Third, within these regulatory units, multiple homeoprotein recognition sites are important for the strength of the response: frequently, ten or more sites being involved. Fourth, homeoprotein recognition sites are generally not sufficient for a precise response in embryos. The specificity and activities of regulatory regions are critically dependent on binding sites for other proteins, cofactors, that typically exhibit no obvious spacing pattern to nearby homeoprotein binding sites. Fifth, in the case of enhancer sequences, the patterns of activation provided by these enhancers are generally limited to only one or two of the tissues in which the regulating homeoproteins are expressed, suggesting that cofactors act in a tissue-specific manner.
A number of cofactor proteins have recently been identified (e.g. Han et al., 1993; de Zulueta et al., 1994; McCormick et al., 1995; Copeland et al., 1996; Gross and McGinnis, 1996b). Here, we limit our discussion to the two well-characterized cofactors, Exd and Ftz-F1.
Mutations in the exd gene affect the activity of many Hox proteins during development (reviewed by Mann and Chan, 1996). The Exd protein contains a divergent homeodomain with a different binding specificity to the Q50 homeoproteins, recognizing a TGAT consensus sequence. Although Exd is ubiquitously expressed in early embryos, the nuclear accumulation of this protein is regulated in a spatial pattern (Mann and Abu-Shaar, 1996; Aspland and White, 1997). This implies that Exd controls Hox protein function in response to other regulatory inputs and is consistent with genetic evidence that exd is only required in certain tissues (Gonzalez-Crespo and Morata, 1995; Rauskolb et al., 1995). Exd recognition sites are found in many Hox responsive enhancers and, importantly, Exd protein can heterodimerize and bind DNA cooperatively with a variety of Hox proteins in vitro (reviewed by Mann and Chan, 1996).
The Ftz-F1 protein is a nuclear hormone receptor that binds to several Ftz-responsive enhancers. Genetic experiments indicate that Ftz-F1 is required for the activation of downstream target genes by Ftz (Han et al., 1993; Yu et al., 1997; Guichet et al., 1997; Florence et al., 1997), and in vitro experiments show that bound Ftz-F1 can cooperatively increase binding of Ftz to a single adjacent DNA site. The phenotype of Ftz-F1 mutant embryos indicates that Ftz-F1 is not required for the activity of other Q50 homeoproteins. Consequently, Ftz-F1 must help distinguish the activity of Ftz from that of the other Q50 homeoproteins. The important question then becomes: how is this accomplished?
We find it useful to distinguish two models by which Exd, Ftz-F1 and other cofactors might act.
The first model suggests that cofactors selectively target different Q50 homeoproteins to bind to different DNA sites, allowing each of these proteins to regulate unique target genes (Fig. 1). In this ‘co-selective binding’ model, Q50 homeoproteins cannot significantly occupy any functional promoter elements without the aid of cooperative interactions with cofactors. Support for the ‘co-selective binding’ model comes from the discovery that separate heterodimers, containing Exd complexed with different Hox proteins, have distinct DNA-binding specificities in vitro (Popperl et al., 1995; Chan and Mann, 1996; Mann and Chan, 1996). In one case, the differential binding of two separate Exd/Hox heterodimers to distinct ten base pair recognition sequences correlates with differential activation of two artificial promoters containing these sites in vivo (Chan et al., 1997).
The second model suggests that cofactors alter the ability of Q50 homeoproteins to regulate target genes to which they are already bound. There are a number of ways this might be accomplished. However, for simplicity, we chiefly discuss a version of this ‘widespread binding’ model based on a proposal for how Exd may act (Fig. 1; Pinsonneault et al., 1997). In this model, cofactors regulate Q50 homeoproteins so they are switched into transcriptional activators from a repressive or neutral state.
A number of arguments appear to favor the ‘widespread binding’ model. First, it more easily explains the data suggesting that the Q50 homeoproteins regulate a large number of shared target genes. Second, because natural elements contain many Q50 homeoprotein recognition sites, homomeric cooperative interactions between these proteins may stabilize DNA binding at functionally significant levels without a need to interact with cofactors. The published experiments showing cooperative DNA binding between cofactors and Q50 homeoproteins have not addressed this possibility since DNA binding was only assayed on single heterodimer sites. Third, although many of the identified enhancer elements for the Hox proteins contain Exd-binding sites, in most cases, these do not resemble the specialized heterodimer sites required for Exd to discriminate DNA binding by different Hox proteins (Chan et al., 1994; Pinsonneault et al., 1997). Fourth, the ‘widespread binding’ model is consistent with the observation that, in all examples examined to date, activation by Hox proteins requires exd genetic function whereas repression by Hox proteins does not (Peifer and Wieschaus, 1990; Pinsonneault et al., 1997).
DNA BINDING OF Q50 HOMEOPROTEINS IN VIVO
Given that the above two models predict very different distributions for endogenous Q50 homeoproteins on chromatin, what DNA sequences do these proteins bind in vivo? To date this has only been determined for Eve and Ftz proteins. These experiments employed an in vivo UV crosslinking method, which provides a quantitative comparison of binding to different DNA fragments in intact embryos (Walter and Biggin, 1996, 1997). Endogenous Eve and Ftz were found to bind with similar specificity to a very broad range of DNA fragments (Walter et al., 1994; Fig. 2). These two proteins bind at similar levels to three genes that they both regulate, eve, ftz and Ubx. Further, they are present at uniform levels on DNA fragments throughout the length of these targets. Eve and Ftz also bind at only twofold to 10-fold lower levels to genes initially chosen as unlikely targets, Adh, hsp70, rosy and actin 5C, suggesting that they bind at appreciable levels to a majority of genes (Fig. 2). In contrast, the transcription factor Zeste, which is not a homeoprotein, only binds to short regions within known target promoters (Walter et al., 1994; Laney and Biggin, 1996).
All DNA fragments crosslinked by Eve and Ftz in vivo contain multiple high and moderate affinity homeoprotein recognition sites (Walter and Biggin, 1996). Estimates suggest that there are at least 50,000 molecules of Eve and Ftz per cell (Krause et al., 1988; Walter et al., 1994). The thermodynamic prediction is that most of these molecules will be bound to DNA at specific sites (Lin and Riggs, 1975; von Hippel et al., 1974; Yang and Nash, 1995; Walter and Biggin, 1996). This implies a minimum density on the most weakly bound genes of one homeoprotein for every four kilobase pairs of DNA and, on the most strongly bound genes, a density of at least five monomers per kilobase pair of DNA. However, at present, it is not possible to identify which individual binding sites are occupied.
These in vivo data meet important predictions of the ‘widespread binding’ model, most notably the prediction that Q50 homeoproteins would bind with similar specificity to a large number of genes. The in vivo data also rule out extreme versions of the ‘co-selective binding’ model, since selective occupancy of homeoproteins on different genes is not observed. Because the other Q50 homeoproteins are expressed at similar levels to Eve and Ftz, many other Q50 homeoproteins may also display widespread DNA binding in vivo. Below, we further discuss the implications of these results.
WHAT DETERMINES THE DISTRIBUTION OF DNA SITES BOUND BY EVE AND FTZ IN VIVO?
Although the in vitro properties of Eve and Ftz are probably important determinants of their distribution on chromatin, these properties cannot fully explain the in vivo data. A quantitative comparison of in vivo and in vitro data suggests that DNA binding is significantly affected by conditions in the embryo (Walter and Biggin, 1996; Fig. 3). For example, Eve and Ftz bind most weakly to the Adh and rosy genes in vivo but, in vitro, Adh and rosy are bound more strongly than the actin 5C, eve or hsp70 genes. This difference may be partly explained by the fact that Adh and rosy are transcriptionally inactive in most cells. The chromatin structure of such genes is thought to inhibit DNA binding (Workman and Buchman, 1993). Consequently, access to many homeodomain recognition sites is probably being occluded at these two loci. The chromatin structure of the other genes is likely to be more accessible and this may help explain the relatively stronger binding to these genes in vivo.
Cooperative interactions with cofactors such as Ftz-F1 may also affect DNA binding in vivo, but we suggest that this only occurs at a minority of sites within each gene. In early embryos, Ftz-F1 is required for the function of ftz, but not for the function of eve (Guichet et al., 1997; Yu et al., 1997). If Ftz-F1 significantly affected binding of Ftz at a majority of DNA sites, then, to obtain similar patterns of DNA binding for Eve and Ftz in vivo, it would be necessary to postulate that there are other cofactors that affect DNA binding by Eve in the same way that Ftz-F1 affects Ftz. This is certainly possible, but it is simpler to assume the following: Eve and Ftz bind similarly in vivo mainly because they have similar inherent DNA-binding specificities and this causes them to occupy largely the same set of accessible DNA sites.
In common with earlier suggestions, this model further predicts that, when coexpressed in the same cells, Q50 homeoproteins will compete with each other for DNA binding (Ohkuma et al., 1990). This type of competition has probably not significantly affected binding of Eve and Ftz because, during the early stages of development at which binding was assayed, these two proteins are not coexpressed in the same cells and the other known Q50 homeoproteins are present in only a few cells. However, at later stages of development, some cells do coexpress two or more Q50 homeoproteins. The ‘phenotypic suppression’ experiments, described earlier, suggest that some Q50 homeoproteins dominantly out-compete other Q50 homeoproteins. It will be fascinating to learn the mechanisms underlying the competition phenomenon and, more generally, to determine the biochemical principles governing target site selection by Q50 homeoproteins.
COFACTORS AND MECHANISMS OF PROMOTER REGULATION
According to the ‘widespread binding’ model, cofactors distinguish how the different Q50 homeoproteins regulate shared targets (Fig. 1). If cofactors are not required for the binding of Q50 homeoproteins at a majority of DNA sites, how might they act? We propose that they function partly by using mechanisms that do not affect DNA binding at all, and partly by altering DNA binding at a subset of sites within promoters.
There are many examples of transcription factors that influence the activity of other regulators without affecting their binding to DNA (e.g. Johnston et al., 1987; Sorger, 1990; Joung et al., 1993; Molkentin et al., 1995; Laney and Biggin, 1997). In some cases, proteins synergistically activate transcription in vitro, not by increasing each other’s ability to bind DNA, but by interacting with different components of the general transcriptional machinery (e.g. Hunchback and Bicoid each interact with different subunits of TFIID to cooperatively increase its binding to DNA in vitro (Sauer et al., 1995; reviewed by Ptashne and Gann, 1997)). In other cases, activation domains are masked by interactions with other proteins (e.g. Johnston et al., 1987). Covalent modifications can also affect transcription factor function without affecting DNA binding (e.g. Sorger, 1990). Thus, there are many precedents for this aspect of the ‘widespread binding’ model.
Eve and Engrailed repress transcription in vitro by inhibiting DNA binding of specific activators and general transcription factors, both by bending their DNA sites and by directly competing for overlapping binding sites (Ohkuma et al., 1990; Austin and Biggin, 1995; TenHarmsel and Biggin, 1995). If this is how Q50 homeoproteins repress transcription in vivo, then, because target promoters are bound throughout their length by a variety of activators, repression will involve binding to many DNA sites that overlap these activator binding sites. By the same token, Q50 homeoproteins that activate a target must occupy a slightly different set of DNA sites to avoid inhibiting binding of other activators. We suggest that Q50 homeoproteins will typically occupy the same clusters of high affinity binding sites within target genes, whether they activate or repress them. Differential interactions with cofactors will increase binding of separate Q50 homeoproteins to unique sets of lower affinity sites. It is also possible that cofactors may inhibit binding of Q50 homeoproteins at other weak sites (see Fig. 1). In this way, cofactor/homeoprotein interactions may alter the overall chromatin structure of the promoter. Thus, by a combination of the mechanisms discussed in this section, cofactors could determine which Q50 homeoproteins activate or repress a given target and could also affect the degree of regulation.
WIDESPREAD DNA BINDING AND THE RANGE OF DIRECT TARGET GENES
If many Q50 homeoproteins display widespread DNA binding in vivo, it will be important to know what proportion of binding sites are functionally significant. Although the Q50 homeoproteins ultimately control the expression of a majority of genes, it seems unlikely that all genes bound will be direct regulatory targets. For example, one of the genes bound most weakly by Eve and Ftz in vivo is not significantly transcribed during the stage of development at which binding was assayed (Walter et al., 1994). However, recent experiments indicate that at least one of the genes bound at more moderate levels in vivo, the hsp70 gene, is regulated by Eve in early embryos (Liang and Biggin, unpublished data), suggesting that even moderate binding to transcriptionally active genes may be functionally significant. This, together with the accumulating evidence identifying so many different types of gene as direct targets, raises the possibility that Q50 homeoproteins directly regulate at least several thousand common target genes. The ability of the Q50 homeoproteins to bind a large proportion of genes may be an important determinant of their biological function.
Even if many Q50 homeoproteins display widespread DNA binding in vivo, there may be important exceptions to this generalization. Although the Hox proteins are expressed in many cells at similar levels to Eve and Ftz, in some cells they are present at much lower concentrations. Importantly, these alternate levels of Hox protein specify distinct developmental fates (e.g. Castelli-Gair and Akam, 1995; Bienz and Hart, 1996; Duncan, 1996). At the lower concentrations, Hox proteins may be unable to significantly occupy their recognition sites without the aid of cooperative interactions with cofactors. Therefore, by a ‘co-selective binding’ mechanism, Hox proteins may bind very different targets from each other in some cells; even though they regulate a larger number of common targets in other cells.
EVIDENCE AGAINST WIDESPREAD DNA BINDING?
One experiment might argue against the idea that Hox proteins bind like Eve and Ftz in vivo. This experiment assayed binding of Ubx protein to polytene chromosomes by immunolocalization (Botas and Auwers, 1996). Because Ubx is not normally present in salivary glands, this protein was artificially expressed in this tissue using a heat-shock promoter construct. Under some conditions, only 100 discrete chromosomal locations appeared to be occupied by Ubx. But at a higher protein concentration, binding was detected at many more sites.
There are several reasons why Ubx may appear to bind more specifically in this assay than do Eve or Ftz in the UV crosslinking assay. First, because polytene chromosomes provide only low resolution, non-quantitative data, binding of Ubx may simply have been overlooked at many sites. Second, the ratio of ectopic Ubx protein to DNA in the salivary glands may have been much lower than the ratio of endogenous Eve or Ftz to DNA in early embryos. Third, many transcription factors may bind more specifically in differentiated tissues, such as salivary glands, than in early embryonic cells. This might result if transcription factor binding is made inaccessible at many gene loci as cells differentiate. To resolve this issue, it will be crucial to quantitate DNA binding by endogenous Hox proteins at various stages of development and in different regions of the body.
COMPARISON WITH OTHER REGULATORY TRANSCRIPTION FACTORS
Models derived from studies of the Drosophila Q50 homeoproteins will probably be most applicable to their direct homologues in other animal phyla because the patterning functions of many of these proteins have been highly conserved (McGinnis and Krumlauf, 1992; Burglin, 1994; Manak and Scott, 1994). It is difficult to predict if the ‘widespread binding’ model will apply to any of the other classes of homeoprotein. Indeed, it is quite possible that they will use a variety of regulatory strategies. Certainly, the widespread DNA binding of Eve and Ftz in vivo differs from the selective binding predicted for the yeast homeoprotein Mat α2 (Johnson, 1995). Mat α2 regulates a relatively small number of genes via only one or a few elements within simple promoters. In contrast, the Q50 homeoproteins act in a wide array of cell types to control many different processes. This may have required these proteins to adopt significantly different regulatory strategies.
There is a wealth of information about the mechanisms by which other eukaryotic regulatory factors control transcription (McKnight and Yamamoto, 1992). However, there is little evidence about the range of DNA sequences and genes bound by these proteins in vivo. Most of the data derive from studies in Drosophila. As mentioned previously, in vivo UV crosslinking experiments indicate that Zeste protein binds very selectively in vivo (Walter et al., 1994). However, this same assay shows that the GAGA protein discriminates poorly between genes (O’Brien et al., 1995). The polytene chromosome assay has been used to study binding of endogenous transcription factors that are normally expressed in salivary gland cells, removing one of the difficulties associated with the Ubx-binding studies. Some endogenous transcription factors appear to bind selectively to relatively few genes in this assay (e.g. Urness and Thummel, 1990; Yao et al., 1993). In contrast, endogenous GAGA factor appears to bind a wide array of genes on polytene chromosomes (Tsukiyama et al., 1994), in apparent agreement with the UV crosslinking data. Given the small number of examples, and the non-quantitative nature of the polytene chromosome data, it is difficult to assess how common highly selective or widespread DNA-binding modes are. Further studies will be required to fill this significant gap in our knowledge.
Changes in the regulatory network controlled by the Q50 homeoproteins have played a significant role in metazoan evolution (reviewed by Carroll, 1995; Raff, 1996). One model suggests that Q50 homeoproteins directly regulate only 100 genes, and that new morphologies have arisen because homeoproteins have acquired new target genes and lost others. However, the ‘widespread binding’ model suggests that, in addition to this switching of target genes, altered morphologies may frequently have resulted from modifications in the way Q50 homeoproteins regulate the many existing, shared targets. Changes in the number or position of homeodomain recognition sites within existing target genes, or the evolution of binding sites for novel cofactors, could have altered the way these promoters respond to homeoproteins. Certainly, to assess how Q50 homeoproteins have affected evolution, it will be necessary to understand the range of target genes that they regulate directly.
We are very grateful to Juan Botas, Dennise Dalma-Weiszhausz, Brian Florence, Michael Koelle, Sandy Johnson, Gines Morata, Bill Segraves, Joe Toth, Johannes Walter and Trevor Williams for comments, criticisms and valuable discussions. We also wish to thank the participants of the ‘Crete meetings’ for many insights that have greatly influenced our thinking.