The myogenic factor Myf5 plays a key role in muscle cell determination, in response to signalling cascades that lead to the specification of muscle progenitor cells. We have adopted a YAC transgenic approach to identify regulatory sequences that direct the complex spatiotemporal expression of this gene during myogenesis in the mouse embryo. Important regulatory regions with distinct properties are distributed over 96 kb upstream of the Myf5 gene. The proximal 23 kb region directs early expression in the branchial arches, epaxial dermomyotome and in a central part of the myotome, the epaxial intercalated domain. Robust expression at most sites in the embryo where skeletal muscle forms depends on an enhancer-like sequence located between −58 and −48 kb from the Myf5 gene. This element is active in the epaxial and hypaxial myotome, in limb muscles, in the hypoglossal chord and also at the sites of Myf5 transcription in prosomeres p1 and p4 of the brain. However later expression of Myf5 depends on a more distal region between −96 and −63 kb, which does not behave as an enhancer. This element is necessary for expression in head muscles but strikingly only plays a role in a subset of trunk muscles, notably the hypaxially derived ventral body muscles and also those of the diaphragm and tongue. Transgene expression in limb muscle masses is not affected by removal of the −96/−63 region. Epaxially derived muscles and some hypaxial muscles, such as the intercostals and those of the limb girdles, are also unaffected. This region therefore reveals unexpected heterogeneity between muscle masses, which may be related to different facets of myogenesis at these sites. Such regulatory heterogeneity may underlie the observed restriction of myopathies to particular muscle subgroups.
The myogenic regulatory factors (MRFs), Myf5, MyoD (Myod1; Mouse Genome Informatics), myogenin and Mrf4 (Myf6; Mouse Genome Informatics), are transcriptional activators of genes expressed in skeletal muscles. These four proteins, which form a subgroup of the basic helix-loop-helix class of transcription factors, can also induce myogenic conversion when expressed in a variety of cultured cells (for review, see Weintraub et al., 1991). Targeted mutation of the corresponding genes in the mouse demonstrates their essential role in the formation of skeletal muscle and also distinguishes their relative functions. Myf5 and MyoD play a role prior to muscle cell differentiation. Although Myf5 null (Braun et al., 1992) and MyoD null (Rudnicki et al., 1992) mice make muscle, MyoD/Myf5 double mutants are devoid of myoblasts as well as muscle fibres (Rudnicki et al., 1993). Furthermore, it has been shown that cells in which these genes have been activated, and marked by β-galactosidase activity, can assume other cell fates in the absence of Myf5 or MyoD, thus demonstrating their role as myogenic determination factors (Tajbakhsh et al., 1996b; Kablar et al., 1999).
The four MRFs have distinct patterns of expression, reflecting their different roles and the complex spatiotemporal organization of skeletal muscle formation, which has recently begun to be appreciated (see Tajbakhsh and Buckingham, 2000). In vertebrates, skeletal muscles of the body proper are derived from somites, which form as segmented blocks of paraxial mesoderm on either side of the neural tube and follow a rostrocaudal developmental gradient from about embryonic day (E) 8. Shortly after their formation, somites give rise to the mesenchymal sclerotome, which will contribute to the cartilage of the vertebrae and ribs, and an epithelial dermomyotome (for review, see Christ and Ordahl, 1995). Myotome formation is initiated as muscle progenitor cells (MPCs) from the epaxial (dorsal) dermomyotome lip, adjacent to the neural tube, become positioned underneath this epithelium from E8.5. From about E9.75, MPCs from the hypaxial (ventral) extremity of the dermomyotome in the interlimb region also begin to migrate underneath the epithelium to form the hypaxial component of the myotome; the hypaxial extremity of the dermomyotome and myotome defines the somitic bud. As the epaxial and hypaxial myotomes develop, they merge to form a continuous myotome layer and can no longer be distinguished as separate domains with conventional markers. The epaxial myotome later contributes to the deep back muscles, whereas the hypaxial myotome including muscle precursors located in the somitic bud will contribute to hypaxial muscles, such as the intercostals and those of the ventral body wall (Ordahl and Le Douarin, 1992; Cinnamon et al., 1999; Denetclaw and Ordahl, 2000). Muscle progenitor cells will also migrate from the hypaxial dermomyotome to more distal sites to found muscle masses such as those of the limb, diaphragm and tongue. 
More recently, the notion of a third intercalated (dermo)myotome region, located between the epaxial-most and hypaxial (dermo)myotomes has emerged (Spörle and Schughart, 1998; see Tajbakhsh and Spörle, 1998). In the head, where the genetic hierarchy regulating myogenesis appears to be distinct from that in the trunk (Tajbakhsh et al., 1997), the majority of muscles arise from paraxial head mesoderm, located anterior to the most mature somite, and from prechordal mesoderm (for review, see Noden et al., 1999). The hypoglossal chord, which gives rise to tongue and pharyngeal muscles, arises from occipital/cervical somites (Noden, 1983). Myf5 is the first myogenic regulatory factor to be expressed, initially in the epaxial dermomyotome and then in the epaxial myotome (Ott et al., 1991). It is also the first MRF to be activated in the hypaxial myotome (Tajbakhsh et al., 1997). Here, and in distal muscle masses, such as those of the limb, MyoD expression closely follows that of Myf5 (Sassoon et al., 1989). This is also the case in head muscles. Unexpectedly Myf5 is also transcribed in specific regions of the ventral neural tube (Tajbakhsh et al., 1994) and in prosomeres p1 and p4 of the mesencephalon and diencephalon (Tajbakhsh and Buckingham, 1995), although the Myf5 protein is absent (Daubas et al., 2000).
Activation of the two myogenic determination genes depends on signals from tissues surrounding the somite, which in the mouse have been shown to differentially affect Myf5 and MyoD (Cossu et al., 1996). Wnt1, present in the dorsal neural tube, will preferentially activate Myf5 in explant experiments, whereas Wnt7a, present in surface ectoderm, preferentially activates MyoD (Tajbakhsh et al., 1998). Sonic hedgehog, produced by the notochord, is also required for myogenesis, but only in the epaxial domain: in its absence, Myf5 is not expressed in the epaxial myotome, but hypaxial activation proceeds normally (Borycki et al., 1999). This illustrates the complexity of signal integration in the activation of an ‘upstream’ gene such as Myf5. Little is known in molecular terms as to how signals at different sites of myogenesis activate Myf5 or MyoD, although the regulatory sequences of the MRF genes are beginning to be elucidated. In the case of MyoD, a 6 kb 5′ flanking sequence is sufficient for muscle-specific expression in vitro but the transgene is expressed in differentiated muscles predominantly during foetal and postnatal stages in vivo (Asakura et al., 1995). A second regulatory sequence, situated at 23 kb 5′ of the MyoD gene, is capable of reiterating the correct spatiotemporal pattern of MyoD expression in the mouse embryo (Faerman et al., 1995; Goldhamer et al., 1995), although this is not maintained at later stages (Asakura et al., 1995). Further dissection of the 258 bp core of this sequence, which has the properties of an enhancer, shows that it is regulated by multiple cis-acting elements, among which those responsible for Myf5-dependent activation of MyoD in the myotome can be distinguished (Kucharczuk et al., 1999).
Myf5, together with Pax3, acts upstream of MyoD in the genetic hierarchy that controls myogenesis (Tajbakhsh et al., 1997). Therefore, regulation of the Myf5 gene, which plays a key role in integrating the signalling cascades leading to the acquisition of muscle cell identity, is likely to be complex. The locus is complicated by the presence of the Mrf4 gene, situated upstream of Myf5, such that 8.8 kb separate their translational start codons in the mouse locus (Miner and Wold, 1990). Comparison of different Mrf4 knockout alleles indicates that the regulatory elements of Myf5 involved in early myotomal expression extend into the Mrf4 gene (Olson et al., 1996; Yoon et al., 1997). Transgenic analysis also demonstrates that key regulatory elements lie outside a 5.5 kb intergenic region (Tajbakhsh et al., 1996a), which directs expression in the branchial arches and their derivatives, some head muscles, and later in some trunk muscles as well as muscles of the proximal forelimb (Patapoutian et al., 1993).
We chose a YAC transgenic approach to localise all possible Myf5 regulatory elements required for the correct spatiotemporal expression of an nlacZ (where n=nuclear localisation signal) reporter targeted into the Myf5 gene. Heterozygous embryos in which this reporter had been targeted into the endogenous Myf5 gene (Tajbakhsh et al., 1996b) provided a control. After the initiation of this work, Zweigerdt et al. (1997) showed that an element lying between 95 kb and 45 kb upstream of the Myf5 gene is necessary to direct expression in the limbs, whereas expression in other muscle domains is directed by sequences within −45 kb. In this paper, we report that the spatiotemporal expression of Myf5 is regulated by multiple modules, which underlie muscle heterogeneity in the mouse embryo. An element located between −96 and −63 kb is required for later Myf5 expression in head and many trunk muscles, but not those of the limbs. It reveals unexpected heterogeneity in hypaxial muscle derivatives. This element does not act like a classical enhancer in the embryo. A second, enhancer-type element located between −58 and −48 kb directs expression of the proximal Myf5 promoter in the myotome, the limbs and the central nervous system (CNS). Lastly, within the 23 kb region upstream of Myf5 are sequences that direct expression in the epaxial dermomyotome, intercalated myotome and branchial arches.
MATERIALS AND METHODS
Library screening, yeast transformation and selection
The YAC library used was constructed from C57BL/6J mouse genomic DNA by Larin et al. (1991) in Saccharomyces cerevisiae strain AB1380 (MATa ade2-1 can1-100 lys2-1 trp1 ura3 his5-u) using pYAC4. This library was PCR-screened with two oligos, amplifying in exon 1 of the Myf5 gene, leading to the isolation of Y13 and Y2.
Lithium acetate transformations of yeast cells were carried out, based on the protocol of Ito et al. (1983), using 10 μg of linearized plasmid DNA. Following transformation, YAC-containing strains were selected and maintained in synthetic dextrose (SD) medium lacking either uracil, lysine or histidine depending on the YAC. Selection for G418-resistant transformants was performed as described by Fairhead et al. (1996). Following integration of the URAblaster cassette (see below), Ura− derivatives of Ura+ transformants were obtained by plating onto SD containing 1 g/l of 5-fluoro-orotic acid (5-FOA) and 50 mg/l of uracil (Boeke et al., 1984).
Analysis of yeast clones
Yeast clones were tested by colony PCR for the presence of Myf5 (targeted or not), using the oligos m5KO1, m5KO2 and m5KO3, described by Tajbakhsh et al. (1997). Agarose plugs of high molecular mass DNA from positive clones were prepared as described by Huxley et al. (1991). Low molecular mass DNA was prepared as described by Philippsen et al. (1991). High and low molecular mass DNA were subjected to Southern blot analysis following enzymatic digestion and pulsed-field gel electrophoresis (PFGE) or classical electrophoresis, respectively. DNA transfer and hybridisation were carried out as described by Kelly et al. (1995). Probes used to analyse the lacZ targeting in Myf5 are described by Tajbakhsh et al. (1996b). Probes isolated either from genomic or YAC DNA are indicated in Fig. 1A.
YAC and vector construction
The centromeric deletion vectors pB1FCT-His and pB1RCT-His were constructed by replacing the Alu element in pCAT-H5 by the B1 element of pB1F and pB1R (McKee-Johnson and Reeves, 1996). pB1FCTUrab and pB1RCTUrab were created by replacing the HIS5 gene with the URAblaster cassette, a yeast selection marker that can be excised subsequently in the presence of 5-FOA (Alani et al., 1987). The 6 kb PstI-SalI fragment, subcloned from the YAC y200-Myf5-nlacZ and located upstream of the SalI site at −90 kb from Myf5, replaced the B1 element in this vector, creating pDel96-Ura. A fragment extending 1.1 kb downstream of the ClaI site and located −63 kb from Myf5, subcloned from y200-Myf5-nlacZ, was inserted into pB1FCT-His replacing the B1 element to create pDel63-His. To delete sequences downstream of the Myf5 gene in y200-Myf5-nlacZ, the B1 element in pB1F was replaced by a 5.3 kb SmaI-BamHI fragment containing exon 3 of Myf5 and 3.8 kb downstream of its polyadenylation site (see Fig. 1A), to create pM5FL.
To construct y200-Myf5-nlacZ, an nlacZ reporter gene flanked by two I-SceI sites (ω; Choulika et al., 1994; Plessis et al., 1992) and a neomycin cassette (Tajbakhsh et al., 1996b) were introduced into a 13 kb BamHI fragment of the mouse Myf5 gene at the ATG of Myf5, with 5.5 kb of 5′ flanking sequence. The bHLH domain of Myf5 was deleted in this construct (amino acids 14-122). The construct, for facility of cloning, differs slightly from that of Tajbakhsh et al. (1996b) where the nlacZ reporter was introduced 13 amino acids downstream of the Myf5 ATG to create a fusion protein. The URAblaster cassette was inserted between nlacZ and neomycin. To construct y240-Myf5-lacZ, the nlacZ reporter gene was replaced by lacZ, and URA3 was inserted instead of the URAblaster cassette. The I-SceI sites facilitated localisation of the Mrf4-Myf5 locus on the YAC.
The split-marker vectors pKA and pAN described by Fairhead et al. (1996) were used to delete from the ClaI site located at −63 kb from the Myf5 gene to various proximal regulatory elements. A 800 bp ClaI-EcoRI fragment located downstream of the ClaI site was inserted into pKA to create pKA-Δ63. A 3.4 kb XbaI fragment containing the Myf5 minimal promoter and the nlacZ gene was inserted into pAN to obtain pAN-Myf5. These two vectors were used to create the YAC y96Δ63-Myf5-nlacZ. A 700 bp XbaI-EcoRI fragment located between −6.8 and −6.1 kb from Myf5 was inserted into pAN to create pAN-Epax. Together with pKA-Δ63, this vector was used to generate the YAC y96Δ63-EpaxMyf5-nlacZ. pAN-23 was constructed by inserting a 1.4 kb EcoRV fragment located between −22.9 and −21.5 kb from Myf5 into pAN. pKA-Δ63 and pAN-23 were used to generate the YAC y96Δ63-23Myf5-nlacZ.
The 5′ genomic region and nlacZ of the knock-in construct (see Tajbakhsh et al., 1996b) were recuperated and deleted to −2.6 kb to create pbaMyf5-nlacZ. This contains the branchial arch element located between −1706 bp (NheI site) and −592 bp (BsaBI site) (Summerbell et al., 2000) and the minimal promoter (−175 bp) from the Cap site, assumed to be at −116 bp from the Myf5 ATG (E. Bober, personal communication). A 10 kb XhoI-HindIII fragment located between −58 and −48 kb of Myf5 was inserted into the XhoI and HindIII sites of pbaMyf5-nlacZ to create p58/48baMyf5-nlacZ.
Generation of transgenic mice
YAC DNA purification was carried out as described by Manson et al. (1997). YAC integrity was verified after each step of the purification process by PFGE and the concentration of the YAC DNA was estimated on a mini-gel using as a standard a DNA mass ladder (Gibco Life Technologies). Transgenic mice were generated by microinjection of purified YAC DNA into fertilized (C57BL/6J × SJL) F2 eggs at a concentration of approx. 1 ng/μl using standard techniques (Hogan et al., 1994). Injected eggs were reimplanted the day after the injection into pseudopregnant (C57BL/6J × CBA) F1 foster mothers.
Identification of transgenic animals, determination of transgene copy number and analysis of YAC integrity in transgenic animals
DNA was prepared from mouse tails or, for transient transgenics, a portion of the embryo and analysed by Southern blot or PCR analysis (Tajbakhsh et al., 1997). For Southern blot analysis, 15 μg of DNA was digested with BamHI or BglII and subjected to electrophoresis. Hybridisation probes were either from the nlacZ gene (1.1 kb ClaI-SacI fragment) or, to determine copy number, Myf5 gene fragments (either 1 kb SmaI-ScaI or 776 bp KpnI-BglII fragments; see Fig. 1A). Transgene copy number for each transgenic line (see Table 1) was determined by analysis of hybridisation signals with a Phosphorimager (Molecular Dynamics).
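The copy-number estimate from hybridisation signals can be illustrated with a short sketch. It is not the procedure used with the Phosphorimager software; it assumes, purely for illustration, that the Myf5 probe detects an endogenous band (two copies per diploid genome) and a separate transgene-specific band in the same lane, and that a single background value can be subtracted from both. The function name and the example intensities are hypothetical.

```python
def estimate_copy_number(transgene_signal: float,
                         endogenous_signal: float,
                         background: float = 0.0,
                         endogenous_copies: int = 2) -> float:
    """Estimate transgene copies from band intensities in one lane.

    Assumption (not from the paper): the endogenous band corresponds to
    `endogenous_copies` gene copies, so it serves as an internal standard.
    """
    transgene = max(transgene_signal - background, 0.0)
    endogenous = endogenous_signal - background
    if endogenous <= 0:
        raise ValueError("endogenous signal must exceed background")
    return endogenous_copies * transgene / endogenous


# Hypothetical intensities, in arbitrary Phosphorimager units
copies = estimate_copy_number(3100.0, 2100.0, background=100.0)
print(f"estimated transgene copies: {copies:.1f}")
```

Using the endogenous band of the same lane as the standard makes the estimate independent of the amount of DNA loaded, which is why it is preferred here over comparing lanes.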
YAC DNA integrity in transgenic animals was assessed by Southern blot analysis following PFGE. Agarose plugs containing intact chromosomal DNA from spleen cells were prepared as described by Manson et al. (1997). One plug (approx. 8 μg of DNA) was digested overnight with 20 Units of ClaI or XhoI in a total reaction volume of 200 μl and submitted to PFGE, using 1% agarose gels in 0.5 × Tris-borate EDTA buffer at 14°C; running conditions were 6 V/cm, 3 seconds pulse interval for 22 hours. Hybridisation probes were either from nlacZ (ATG to EcoRV, 1.6 kb fragment) or from the Mrf4-Myf5 intergenic region (Fig. 1A).
Analysis of transgene expression
Heterozygous and homozygous transgenic males were crossed with non-transgenic females ([C57BL/6J × SJL] F1). Embryos were staged taking E0.5 as the day of the vaginal plug. Transient transgenic embryos were staged taking the day of reimplantation into the pseudopregnant foster mothers as E0.5. They were dissected in PBS, fixed in 4% paraformaldehyde (for 5-60 minutes depending on the age of the embryo), rinsed twice in PBS and stained in X-gal solution (Tajbakhsh et al., 1996a) at 37°C from 2 hours to overnight. Transgenic embryos were examined microscopically either as whole-mounts or after cryostat sectioning, as described by Kelly et al. (1995). In situ hybridisations on transgenic embryos were carried out as described by Tajbakhsh et al. (1997). Antisense probes used during this study were nlacZ (ClaI to the polyadenylation site, 2.4 kb fragment from the 3′ end) and Myf5 (Ott et al., 1991).
Relative quantification of transcripts
Total RNA from E14.5 transgenic embryos was prepared as described by Daubas et al. (2000). 1 mg of total RNA was used to isolate poly(A)+ mRNA with Dynabeads mRNA DIRECT kit (Dynal) to eliminate contaminating genomic DNA. cDNA was synthesised as described by Daubas et al. (2000) and amplified using an ABI PRISM 7700 Sequence Detection System (PE Applied Biosystems). Sequences of specific primers for the detection of Myf5 and nlacZ transcripts were as follows: Myf5 identical (exon 1): 5′-CCAGCCCCACCTCCAACT-3′; Myf5 complementary (exons 2/3): 5′-CTTTTATCTGCAGCACATGCATTT-3′; nlacZ identical: 5′-GCAGCCTGAATGGCGAAT-3′; nlacZ complementary: 5′-CGCATCGTAACCGTGCATC-3′. TaqMan® probes, respectively for Myf5 and nlacZ, were 5′Fam-AGCCCTGTCTGGTCCCGAAAGAACA-3′ (exon 2) and 5′Vic-CCGATACTGTCGTCGTCCCCTCAAAC-3′. The PCR reactions were done in separate tubes for the amplification of Myf5 and nlacZ transcripts, using the TaqMan® Universal PCR Master Mix (PE Biosystems), with 300 nM of each primer and 200 nM of TaqMan® probes. The cycling sequence consisted of an initial step at 50°C for 2 minutes and a denaturation for 10 minutes at 95°C, followed by 40 cycles of 15 seconds at 95°C and 1 minute at 60°C. Contaminating genomic DNA was verified not to interfere with the detection of nlacZ transcripts. Relative quantitation with data from the ABI PRISM 7700 Sequence Detection System was performed using the comparative CT method after verification that the efficiencies of target (nlacZ) and reference (Myf5) amplification were approximately equal. CT is the threshold cycle, indicating the fractional cycle number at which the amount of amplified target reaches a fixed threshold level. The number N of molecules accumulated at CT is calculated according to the following formula: $\log N = \log N_0 + C_T \log(1+E)$, where $N_0$ represents the number of molecules prior to amplification and E is the amplification efficiency. Because both amplicons reach the same threshold amount N, for two different starting concentrations $N_{Myf5}$ and $N_{nlacZ}$, $N_{Myf5}/N_{nlacZ} = 2^{C_{T,nlacZ} - C_{T,Myf5}}$, assuming that the efficiency E=1.
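The comparative CT calculation described above can be sketched in a few lines. This is a minimal illustration rather than the analysis actually run on the ABI PRISM 7700 data; the function name and the example CT values are hypothetical, and the efficiency defaults to E=1 as assumed in the text.

```python
def relative_abundance(ct_target: float,
                       ct_reference: float,
                       efficiency: float = 1.0) -> float:
    """Return N0_target / N0_reference from two threshold cycles.

    Both amplicons reach the same threshold amount, so
    N0_target * (1 + E)**ct_target == N0_reference * (1 + E)**ct_reference,
    which rearranges to (1 + E)**(ct_reference - ct_target).
    """
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    return (1.0 + efficiency) ** (ct_reference - ct_target)


# Hypothetical threshold cycles: nlacZ (target) crosses two cycles after Myf5
nlacz_per_myf5 = relative_abundance(ct_target=26.0, ct_reference=24.0)
print(nlacz_per_myf5)  # 0.25, i.e. four-fold fewer nlacZ than Myf5 transcripts
```

Swapping the two arguments gives the inverse ratio, Myf5 relative to nlacZ, which is the form written in the text.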
Construction of a series of deletions around the Myf5 gene
In order to include potential regulatory elements located at a distance from the Myf5 gene, a YAC library containing large fragments (100-1600 kb) of mouse (C57BL/6) genomic DNA (Larin et al., 1991) was screened by PCR with Myf5-specific oligonucleotides. Two YACs, Y13 (1000 kb) and Y2 (650 kb) were identified. Pulsed-field gel electrophoresis (PFGE), followed by Southern analysis, showed that Y13 and Y2 contain about 450 kb and 550 kb upstream of the linked Mrf4-Myf5 genes, respectively. All subsequent modifications were carried out on Y13.
A fragmentation vector, pM5FL, containing 3.8 kb of Myf5 3′ flanking sequence was used for targeted deletion of Y13 beyond this downstream region. This was followed by random deletion of the upstream region using the B1 mouse repeat element as the homologous sequence (Heard et al., 1994), cloned in both orientations in the fragmentation vectors pB1F and pB1R (see Materials and Methods). The resulting YAC, y200-Myf5, contains about 190 kb of upstream sequence. The nlacZ reporter gene was then targeted at the ATG of Myf5 by homologous recombination to give y200-Myf5-nlacZ (Fig. 1B). This reporter gene, which had previously been targeted into the endogenous mouse Myf5 gene (Tajbakhsh et al., 1996b), permits comparison between heterozygous Myf5nlacZ/+ embryos and transgenics expressing Myf5 on YAC constructs. To make further 5′ deletions, defined regions of y200-Myf5-nlacZ, located at 96 and 63 kb 5′ of the Myf5 Cap site, were subcloned into the fragmentation vector (see Materials and Methods). Chromosome fragmentation (Pavan et al., 1990) of y200-Myf5-nlacZ with these genomic sequences gave y96-Myf5-nlacZ and y63-Myf5-nlacZ (Fig. 1B).
An additional YAC, y240-Myf5-lacZ, was also obtained using B1 fragmentation vectors (see Materials and Methods). This contains 190 kb of sequence upstream and 50 kb of sequence downstream of the Mrf4-Myf5 locus (Fig. 1B). In this case the lacZ reporter does not contain a nuclear localisation signal, making it possible to trace the axonal tracts of neurons expressing Myf5 (Daubas et al., 2000).
All essential regulatory elements are located within 96 kb upstream of Myf5
Purified YAC DNA was injected into the pronucleus of fertilized mouse eggs and a number of independent transgenic lines were established for each construct. Embryos were examined from E8.5 until birth and compared with Myf5nlacZ/+ heterozygotes (Tajbakhsh et al., 1996b). Two out of three lines generated with y240-Myf5-lacZ gave the expected skeletal muscle expression pattern; the third line showed only very weak expression, possibly due to breakage of the YAC upon integration. All three lines generated with y200-Myf5-nlacZ are comparable to the Myf5nlacZ/+ heterozygotes. The first β-galactosidase positive (β-gal+) cells are detected in the epaxial lip of the dermomyotome at E8.5 in the recently formed somites of heterozygous knock-in and transgenic embryos (Fig. 2A,D,G). By E9.75, X-gal staining begins to be detectable in the hypaxial somitic bud (data not shown). In more mature somites, lacZ expression is observed in both dermomyotomes and myotomes, following a rostrocaudal gradient; by E11.5, β-gal+ cells extend from the epaxial (green arrowhead, Fig. 2B) to the hypaxial (red arrowhead, Fig. 2B) extremities of the myotomes. At E9, labelling is detected in the mandibular arch (results not shown). By E11.5, expression is seen in the masseter, the hyoid arch (black arrows, Fig. 2B) and faintly in the third branchial arch. At this stage, both fore and hind limb buds contain β-gal+ cells (Fig. 2B,E,H). Labelling is also detected in defined regions of the brain (cns, Fig. 2B; Tajbakhsh and Buckingham, 1995), where the neuronal tracts were identified as a result of cytoplasmic X-gal labelling in the 240-Myf5-lacZ lines (Daubas et al., 2000). All skeletal muscles in the body proper, head and limbs, are labelled at E14.5 (Fig. 2C,F,I) and this is maintained during subsequent foetal development. We therefore conclude that all essential regulatory elements required for correct Myf5 expression are located upstream of the locus within a 190 kb region.
The y96-Myf5-nlacZ construct was then analysed for expression. In the two lines obtained, X-gal labelling is again comparable to that of Myf5nlacZ/+ heterozygous embryos (Fig. 2J-L), indicating that the essential regulatory sequences are within the 96 kb upstream region. It is notable, however, that a weaker expression is observed in the muscles of the most anterior portion of the face, compared to other facial muscles at E14.5 (Fig. 2L). Additional ectopic expression is seen in one of the 96-Myf5-nlacZ transgenic lines in the distal limb mesenchyme (Fig. 2K,L). It is interesting to note that most of the transgenic lines show some degree of reporter gene expression in cells located dorsally to the somites, most notably in the interlimb region (e.g. Fig. 2H,K). In Myf5nlacZ/+ embryos, only a few cells express nlacZ transiently in this region, along the anterior-posterior axis (Tajbakhsh et al., 1996b). The nature of these cells is unclear.
Regulatory elements directing later Myf5 expression in head and ventral trunk muscles are located between −96 and −63 kb
Deletion to 63 kb upstream of Myf5 resulted in an incomplete expression pattern. Three mouse lines were established with y63-Myf5-nlacZ, one of which exhibited weak and partial X-gal labelling, again suggesting transgene breakage upon integration. In the two other lines, nlacZ expression initiates correctly in somites at E8.5 (data not shown). However, at E11.5 (Fig. 3A) the expression pattern begins to differ from that of the Myf5nlacZ/+ heterozygotes (Fig. 2B) or embryos expressing the y96-Myf5-nlacZ transgene (Fig. 3B), particularly in the branchial arches and their derivatives. By E12.5, a decrease in X-gal staining is observed in the rostral somites as well as in other differentiating muscle masses of the head and trunk (data not shown). By E14.5 (Fig. 3C), in contrast with limb muscles, the transgene is not expressed in the head and ventral trunk muscles, such as the musculus (m)-rectus abdominis and the diaphragm. The m-latissimus dorsi is stained only at the attachment site behind the shoulder. In contrast, muscles of the girdle, such as the m-pectoralis (Fig. 3G), and intercostal muscles (Fig. 3C,E,G) continue to express the transgene, as do the deep back muscles (Fig. 3E). The complete expression pattern seen with the y96-Myf5-nlacZ transgene is shown for comparison (Fig. 3D,F,H).
The integrity of y63-Myf5-nlacZ, which had been checked prior to injection, was verified by PFGE and Southern blotting once integrated into the mouse genome. The two lines with the complete pattern of early muscle expression show no indication of breakage or rearrangement of this YAC transgene (data not shown). To verify that the differential staining observed among distinct muscle masses is not due to a sensitivity problem in the detection of β-galactosidase activity, 63-Myf5-nlacZ and 96-Myf5-nlacZ embryos were stained for different times (overnight and 6 hours, respectively) to give similar staining intensities in the deep back and limb muscles. This result, shown in Fig. 3C-H, clearly demonstrates that, at E14.5, many muscles in the trunk and head are negative for y63-Myf5-nlacZ expression, whereas such a difference is not noted for limb muscles. This pattern is maintained until birth (data not shown). It is notable that, apart from this differential staining, the β-galactosidase activity appears weaker in 63-Myf5-nlacZ embryos throughout development, possibly due to the lower transgene copy number. Relative quantification of transcripts (Fig. 4) shows that nlacZ transcripts are indeed less abundant in 63-Myf5-nlacZ than in 96-Myf5-nlacZ embryos, when normalized to transgene copy number. A two-fold decrease in the level of nlacZ transcripts, normalized to those of the endogenous Myf5 gene, is observed in the trunk versus the limbs of the 63-Myf5-nlacZ embryos, compared to 96-Myf5-nlacZ embryos. Given that the whole trunk was used to prepare mRNA and that some trunk muscles continue to express y63-Myf5-nlacZ (Fig. 3), this difference is an underestimate.
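The trunk-versus-limb comparison between the two lines is a ratio of ratios, which the following sketch makes explicit. The CT values are invented so that the arithmetic is easy to follow; they are not measurements from this study, and the helper names are hypothetical. An amplification efficiency of E=1 is assumed, as in Materials and Methods.

```python
def nlacz_per_myf5(ct_nlacz: float, ct_myf5: float) -> float:
    """nlacZ transcripts normalised to endogenous Myf5 (efficiency E = 1)."""
    return 2.0 ** (ct_myf5 - ct_nlacz)


def trunk_to_limb(sample: dict) -> float:
    """Ratio of normalised nlacZ levels, trunk over limb, for one line."""
    trunk = nlacz_per_myf5(*sample["trunk"])
    limb = nlacz_per_myf5(*sample["limb"])
    return trunk / limb


# Invented (ct_nlacZ, ct_Myf5) pairs, for illustration only
line_96 = {"trunk": (25.0, 24.0), "limb": (25.0, 24.0)}
line_63 = {"trunk": (27.0, 24.0), "limb": (26.0, 24.0)}

# Fold change of the trunk/limb ratio in the 63 kb line relative to the 96 kb line
fold = trunk_to_limb(line_63) / trunk_to_limb(line_96)
print(fold)  # 0.5 with these invented values, i.e. a two-fold decrease
```

Normalising first to Myf5 and then to the limbs of the same embryos cancels differences in RNA input and in overall transgene activity, so that only the trunk-specific change remains.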
The −96/−63 region does not interact with identified proximal Myf5 regulatory elements
To test the properties of the −96/−63 region, we first placed it directly in front of a minimal Myf5 promoter, including the presumptive TATA box. Using the split-marker vectors (Fairhead et al., 1996; see Materials and Methods), we created a deletion in y96-Myf5-nlacZ between the ClaI site at −63 kb and the XbaI site at −175 bp from the Cap site of Myf5, to yield y96Δ63-Myf5-nlacZ (Fig. 5A). The pattern of expression of this YAC was examined at E14.5 in transient transgenic embryos (Fig. 5B). The absence of any muscle expression suggests that the −96/−63 element does not act as an enhancer on the minimal Myf5 promoter to direct transcription in those head and trunk muscles where the endogenous gene is expressed. We therefore investigated whether it requires more extensive proximal regulatory sequences. A proximal region that drives Myf5 expression in the branchial arches has previously been identified (Patapoutian et al., 1993) and its importance confirmed more recently (Summerbell et al., 2000). This 1168 bp region located between −1760 and −592 bp from the Myf5 Cap site directs early expression in cells that will contribute to head musculature. A second 600 bp region, located between −6.1 and −5.5 kb upstream of Myf5, has been found to direct early Myf5 expression in the epaxial somite (Summerbell et al., 2000). We therefore created a second deletion in y96-Myf5-nlacZ extending from the ClaI site at −63 kb to the XbaI site 5′ of the epaxial element (Fig. 5A). The expression pattern of the resulting YAC, y96Δ63-EpaxMyf5-nlacZ, was again analysed in transient transgenic embryos at E14.5. Only ectopic expression was observed. At E10.75, transgenic embryos showed expression in the branchial arches and in a small number of myotomal cells (Fig. 5B). These experiments confirm the early regulation of Myf5 by the epaxial and branchial arch elements and indicate that the −96/−63 region has no cis-acting effect at E14.5 in the context of these proximal sequences.
We have identified (T. C. and M. B., unpublished) an additional myogenic regulatory sequence located at −17 kb from the Myf5 gene (A17, Fig. 5A). To test the potential interaction of this element with the remote −96/−63 region, we constructed a third YAC, y96Δ63-23Myf5-nlacZ, by deleting the region from the ClaI site at −63 kb to the KpnI site at −23 kb from Myf5 (Figs 1A, 5A). Of 19 embryos dissected at E14.5, 1/19 showed only ectopic β-galactosidase activity, while 1/19 had some β-gal+ myotomal cells posterior to the hind limb, and in the caudal deep back muscles (Fig. 5C,D) in addition to ectopic expression. An additional 3/19 embryos were positive by PCR, but showed no X-gal staining.
We conclude from these experiments that the −96/−63 region does not act as a classical enhancer-type element since it shows no activity with the branchial arch element, or the early epaxial element, or other potential elements within 23 kb of the Myf5 gene. It appears therefore that the −96/−63 region, if it can act out of context, may require other regulatory sequences, potentially located between −63 and −23 kb from Myf5.
The region between −58/−48 kb contains important regulatory elements for Myf5 expression
To examine earlier expression directed by the 23 kb sequence upstream of the Myf5 gene, we derived transgenic lines with the YAC y23-Myf5-nlacZ. This YAC was created by chromosome fragmentation of y63-Myf5-nlacZ with a genomic sequence located between −23 and −20 kb (Fig. 6A). Three independent lines were generated, showing variable intensities of X-gal staining and different ectopic sites of expression. Nevertheless, in all three lines, the transgene exhibits a reproducible muscle expression pattern. From E8.5, β-gal+ cells are present in the epaxial dermomyotome of the somite (Fig. 6B) and later in the epaxial myotome (Fig. 6C). nlacZ expression decreases from E10.75 following a rostrocaudal gradient, such that only very weak staining is observed in deep back muscles by E14.5 (Fig. 6D, see also Fig. 5C). Also, as expected, early branchial arch expression is seen with this YAC; later expression in head muscles is not detected. Furthermore, no expression is seen in the limbs or in the hypaxial region of the myotome and its derivatives at all stages examined, as opposed to the 63-Myf5-nlacZ embryos. Some X-gal staining is detectable in a subset of cells of the rostral intercostal muscles in 23-Myf5-nlacZ transgenics between E11.5 and E12.5 (Fig. 6C), whereas intercostal muscles are fully labelled at E14.5 in the 63-Myf5-nlacZ embryos (Fig. 3). No expression is detected in the central nervous system with y23-Myf5-nlacZ. Head expression seen in Fig. 6C corresponds to ectopic labelling of head mesenchyme. These results suggest that the −63/−23 kb region contains regulatory sequences responsible for Myf5 expression in the limbs, the hypaxial myotome and derivatives, and the brain. From previous studies (Zweigerdt et al., 1997) and those described here, sequences driving limb expression are probably located just 3′ to −63 kb.
In fact we found that, in p58/48baMyf5-nlacZ transgenic embryos, the −58/−48 kb region, linked to a fragment containing the Myf5 minimal promoter with the branchial arch region as a positive control (see Materials and Methods), is active in the branchial arches, the limb buds (Fig. 7D), the brain (Fig. 7C), the hypoglossal chord (Fig. 7B) and the myotome from its epaxial to hypaxial extent (Fig. 7E). Myf5 expression is absent from the epaxial dermomyotome lip (Fig. 7B,E). This is particularly evident in the caudal somites (cf. Fig. 2). We therefore conclude that key regulatory elements for Myf5 expression at different sites in the embryo, including limb muscles, are localised in this distal 10 kb fragment.
The distal 10 kb fragment and the −23 kb region direct distinct aspects of Myf5 expression in the somite
It is notable that the epaxial myotome is positive for β-galactosidase activity with both y23-Myf5-nlacZ and p58/48baMyf5-nlacZ transgenes. However, a decrease in the epaxial staining in more rostral somites is observed only in the 23-Myf5-nlacZ embryos (Fig. 6C, compared with Fig. 7B). Given the stability of the β-galactosidase protein, it is possible that the X-gal staining in the epaxial myotome of the 23-Myf5-nlacZ embryos is due to a persistence of β-galactosidase activity rather than a continuous expression of the transgene. In situ hybridisation with an nlacZ probe performed on transgenic embryos shows that this is the case. At E9.75 (Fig. 6H), nlacZ transcripts are restricted to the dorsal lip of the epaxial dermomyotome in 23-Myf5-nlacZ transgenic embryos; the epaxial dermomyotome is also labelled in caudal somites of older embryos (Fig. 6I). At E10.75, a second site of transcription is seen in the central region of the myotome (Fig. 6I), which corresponds to part of the intercalated epaxial domain (see Tajbakhsh and Spörle, 1998), as shown also on a cross section (Fig. 6J). Transcripts are not detected in the hypaxial myotome (cf. Fig. 6F,G). We therefore conclude that Myf5 expression in the MPCs of the epaxial dermomyotome and in the differentiating epaxial-most myotome is regulated by two separable elements. Furthermore, this analysis identifies the intercalated domain of the epaxial myotome as a distinct site of Myf5 myotomal transcription.
Myf5 expression is regulated by multiple modules
Using a YAC and plasmid transgenic approach, we show that correct spatiotemporal expression of Myf5 in skeletal muscle and the central nervous system depends on multiple proximal and far distal regulatory elements, dispersed over 96 kb upstream of the gene (Fig. 8). In the proximal 23 kb sequence upstream of Myf5 lie elements directing early expression of Myf5 in the epaxial dermomyotome, the epaxial intercalated myotome and the branchial arches. Previous studies showed that a region lying immediately upstream of the Myf5 promoter (purple box in Fig. 8) directs robust early expression in the branchial arches (Patapoutian et al., 1993; Summerbell et al., 2000). A second sequence, located at −6.1 kb, directs expression in the epaxial dermomyotome (Summerbell et al., 2000; orange box in Fig. 8) and may also be implicated in the intercalated myotomal expression reported here. Studies carried out in other laboratories (Pin et al., 1997; Yoon et al., 1997) and our own (T. C. and M. B., unpublished) suggest that other Myf5 and Mrf4 elements are located within the 23 kb sequence. However, our analysis shows that these elements are not sufficient to confer overall robust expression with a Myf5 minimal promoter. A region between −58 and −48 kb (blue box in Fig. 8) contains important regulatory elements that direct expression from a proximal Myf5 promoter in the complete extent of the epaxial and hypaxial myotome, in the developing limb muscles and at other sites to which muscle precursors migrate from the somite, such as the hypoglossal chord. This 10 kb fragment also directs correct expression of Myf5 in discrete regions of the brain where the endogenous gene is expressed (Tajbakhsh and Buckingham, 1995; Daubas et al., 2000).
A more distal region between −96 and −63 kb, which has an effect on the overall level of expression, is also required for late Myf5 expression in a specific subset of skeletal muscles such as those of the head and ventral trunk muscles, but not the deep back, intercostal and limb muscles (Fig. 3). This region does not function as an enhancer. The uncoupling of early and late expression of Myf5 is reminiscent of MyoD regulation, where proximal and distal elements confer late and early activity respectively. Both regions, however, function as enhancers in the context of a minimal promoter and the “late” elements are required in all foetal and postnatal muscles (Asakura et al., 1995). The core enhancer for early MyoD expression, situated at −23 kb, is 258 bp in size (Goldhamer et al., 1995). It remains to be seen whether the activity of the 10 kb fragment at −58/−48 kb from Myf5 can be reduced to such an extent. Preliminary data suggest that its structure is more complex.
In a previous study of chimaeric embryos, using ES cells containing YAC transgenes, Zweigerdt et al. (1997) suggested that all muscle regulatory elements of Myf5 are present in a region spanning from −95 kb to +500 kb of the gene, and that, whereas limb muscle regulatory elements are located between −45 and −95, early muscle expression is seen with the −45 kb sequence. In contrast, we find that the −58/−48 fragment is essential for correct myotomal expression. This discrepancy may be due in part to the fact that Zweigerdt et al. (1997) analysed chimaeric embryos. Given the recent more detailed understanding of the complex spatiotemporal expression of Myf5 (see Tajbakhsh and Buckingham, 2000), we would suggest that some domains of the (dermo)myotome do not express their YAC-45 construct (most evident in the epaxial somite of the cervical region, and hypaxial somite in the tail of E12.5 embryos; Fig. 2 in Zweigerdt et al., 1997) and that sequences located between −45 and −96 kb are indeed necessary for faithful expression of Myf5 in trunk and tail muscles. Our analysis shows that this region can be subdivided into two domains with different characteristics (Fig. 8), where the most distal regulatory region is required for later expression of Myf5 in the head and some trunk muscles. A previous report on an nlacZ transgene with 5.5 kb of Myf5 5′ flanking sequence described expression in some facial muscles, in deep back muscles and in the proximal limb at E13.5 (Patapoutian et al., 1993). These observations probably reflect the presence of further regulatory elements within this 5.5 kb region, which direct low level expression, only detectable perhaps with high copy numbers, or different genetic backgrounds. Zweigerdt et al. (1997) also observed labelling in the head, but not in limbs, with a 45 kb 5′ flanking sequence. Again without a quantitative comparison of transcript levels, it is difficult to assess whether this is lower than that seen with larger constructs. 
There are also negative regulatory elements for expression in head muscles lying 5′ to the 5.5 kb region (Summerbell et al., 2000). With the 23 kb construct, we see relatively weak expression in the arches at E11.5 and head muscles derived from them are negative at E14.5.
Comparison of expression between the YAC transgenes and the endogenous Myf5 gene
The sequences that lie within 58 kb upstream of the Myf5 gene direct correct spatiotemporal transcription initially in the embryo, but not at later stages (Fig. 3). In addition to a specific effect on certain muscle masses, the absence of more upstream sequences results in a general reduction in the overall level of expression, confirmed by quantification of nlacZ/Myf5 transcripts by RT-PCR with the y63-Myf5-nlacZ transgene (Fig. 4). With an additional 30 kb of 5′ upstream sequence, the level of expression per transgene copy compared to Myf5 expression at E14.5 increases to a ratio that approaches 1. It is striking, however, that the level of nlacZ expression from YAC transgenes is lower than that of the Myf5nlacZ/+ allele. Even for multicopy transgenes, X-gal labelling is reduced and this phenomenon is confirmed by quantitative RT-PCR (data not shown). It is possible that this may be due to a minor difference in the construction (see Materials and Methods). Alternatively, since haploinsufficiency has been observed with one allele of Myf5 (Rudnicki et al., 1993), the fact that less Myf5 is being produced in the Myf5nlacZ/+ embryos may have repercussions on the regulation of the gene, although direct positive autoregulation of Myf5 has been ruled out since Myf5nlacZ/nlacZ embryos are more intensely stained than Myf5nlacZ/+ embryos (Tajbakhsh et al., 1996b).
Although the y96-Myf5-nlacZ transgene recapitulates almost all aspects of Myf5 expression, oddly, X-gal staining in the region of the snout is lacking, suggesting that sequences required for this are present in the larger y200-Myf5-nlacZ transgene. Some ectopic expression is seen even with large flanking sequences (e.g. 96 kb; Fig. 2), suggesting that insulator sequences may be distant from the locus. One unexpected site of transgene expression, which is probably not ectopic, seen clearly in some transgenic lines at about E11.5, extends dorsally in a segmented pattern from the interlimb somites. Although much less prominent, a similar phenomenon is also detectable in Myf5nlacZ/+ heterozygous embryos, where occasional β-gal+ cells are seen dorsal to the cervical somites (Tajbakhsh et al., 1996b). These cells may be precursors of subcutaneous muscles. However, the transgene pattern resembles that seen in Msx1nlacZ/nlacZ homozygous embryos (Houzelstein et al., 2000) and may therefore initially mark dermal precursors, derived from multipotent Myf5 expressing cells in the dermomyotome.
The regulation of Myf5 expression in the somite
It is becoming apparent from the analysis of different marker genes and mouse mutants that the (dermo)myotome is regionalised along the dorsoventral axis (see Tajbakhsh and Buckingham, 2000) suggesting that different regulatory elements may direct expression either in the epaxial-most, intercalated or hypaxial (dermo)myotome (Spörle and Schughart, 1998). Our observations are consistent with this notion. Analysis of the expression pattern of the y23-Myf5-nlacZ transgene by in situ hybridisation clearly identifies the intercalated myotome as a transcriptional domain distinct from the epaxial-most myotome. This transcriptional domain had been distinguished on the basis of frizzled9 expression (Wang et al., 1999). The intercalated domains of the dermomyotome and myotome are marked by the expression of Sim1 and En1 (see Tajbakhsh and Spörle, 1998). Expression of the latter is also a characteristic of adaxial cells in the zebrafish embryo, which are specified by sonic hedgehog (see Currie and Ingham, 1998). Further analysis of the 23 kb region is required to establish whether sonic hedgehog activates Myf5 in the intercalated myotome. The isolation of a sequence that specifically targets this region will also permit its manipulation and hence investigation of its significance.
The −23 kb region also directs transgene expression to the dorsal epaxial dermomyotome of the somite. Consistent with this, epaxial myotome expression is initiated correctly by the −58/−48 region, whereas expression in the epaxial dermomyotome is lacking. In sonic hedgehog mutant mice, Myf5 activation and epaxial myogenesis are compromised (Borycki et al., 1999). It is possible that this is due to an effect on the initial dermomyotome expression of Myf5 and that both this and the intercalated myotome expression depend on sonic hedgehog. Expression of Myf5 in the somite is also influenced by Wnts, produced by the dorsal neural tube and surface ectoderm; in explants, the presence of Wnt1 leads to preferential expression of Myf5 (Tajbakhsh et al., 1998).
Nature of the distal regulatory elements
Whereas the −58/−48 regulatory region has the property of an enhancer element in that it functions with a proximal Myf5 promoter, this is not the case for the −96/−63 regulatory region, which is inactive with the minimal promoter or sequences up to 23 kb 5′ of Myf5, including the epaxial and branchial arch elements (Summerbell et al., 2000). Despite the presence of the latter, expression is not maintained in head muscles that derive from the branchial arches. The −96/−63 element does not appear to act efficiently as an insulator since early epaxial expression due to the sequence at −6.1/−5.5 is not observed in all founder 96Δ63-EpaxMyf5-nlacZ embryos examined at E10.75. Silencing, presumably due to the integration site, is seen, as in the case of the transgene with the epaxial element alone (Summerbell et al., 2000). The −96/−63 region may interact with elements present in the 10 kb fragment located between −58 and −48 kb, which direct early expression of Myf5. However, preliminary experiments suggest that this is not the case. It appears therefore that the regulatory elements present in the −96/−63 region require the specific configuration of the endogenous gene. The importance of distance between promoters and regulatory elements has been highlighted notably in the case of gene clusters, such as the β-globin locus or the HoxD complex, where changing the position of a given gene in the complex affects the timing and/or pattern of its expression (Dillon et al., 1997; van der Hoeven et al., 1996).
Dissection of Myf5 regulatory regions reveals unexpected heterogeneity in hypaxial muscles
The most distal regulatory region, situated between −96 and −63 kb, is required for Myf5 expression in a subset of skeletal muscles. The first differences between 63-Myf5-nlacZ and 96-Myf5-nlacZ embryos are observed from about E11.5. By E14.5, head muscles do not express the y63-Myf5-nlacZ transgene, whereas branchial arches, from which many head muscles derive, do at earlier stages, in keeping with the presence of a branchial arch element in the 23 kb upstream of Myf5. The existence of a distinct regulatory pathway for myogenesis in the head was shown by analysis of splotch/Myf5 double mutant embryos (Tajbakhsh et al., 1997). The expression profile of y63-Myf5-nlacZ reveals an unexpected regulatory heterogeneity in body muscles: by E14.5, the differences in expression in many trunk muscles are striking when compared with the limbs, whereas others retain the y96-Myf5-nlacZ expression profile. The expressing and the non-expressing muscles cannot be simply distinguished by their epaxial or hypaxial origin. The deep back muscles, formed by elongation and in situ from the epaxial myotome (Denetclaw and Ordahl, 2000), express the transgene. y63-Myf5-nlacZ is also expressed in the intercostal muscles. The medial-most (adjacent to the sclerotome) part of these muscles has an epaxial contribution, but they are mainly formed by elongation of the somitic bud including the hypaxial myotome, ventrally (Christ et al., 1983; Cinnamon et al., 1999). Ventral body wall musculature, which is of hypaxial origin, does not express the transgene. The exact location of the precursors of ventral body wall muscles in the hypaxial somite is unknown. They may derive from the somitic bud, either from the hypaxial dermomyotome or the hypaxial myotome, or both. Christ et al. (1983) described a dispersion of the cells of the somitic bud, subsequent to its extension into the somatopleure. 
Indeed some dispersion leading to migration of precursors is suggested by the fact that ventral body wall musculature is perturbed in splotch mice (Tajbakhsh et al., 1997; Tremblay et al., 1998). This is also the case for diaphragm and tongue muscles where Myf5 transgene expression is also lacking in the absence of the −96/−63 region. However, it is clear that Myf5 expression in limb muscles and those of the limb girdle does not depend on this region. The migration of myogenic progenitor cells to the limb is Pax3 dependent and limb muscles are absent in splotch mice (see Tajbakhsh and Buckingham, 2000). However, other regulatory genes of importance in limb muscle development are not expressed at other sites where muscle will form as a result of migration of hypaxial somite derivatives. This is the case for Mox2 (Mankoo et al., 1999), which has been implicated in the regulation of Myf5 specifically in the limbs. Another example is Lbx1, which is not expressed in the interlimb region (Jagla et al., 1995). The fact that distinct regulatory sequences direct Myf5 expression in ventral body wall or limb muscles suggests that these elements respond to different regulators of the ‘migratory’ network which is implicated in myogenesis.
Spatiotemporal heterogeneity in the transcriptional regulation of expression as demonstrated for the gene encoding the myogenic determination factor, Myf5, is not a common property of downstream muscle genes. Apart from later fibre type differences, their expression tends to be directed in all skeletal muscles by the same regulatory elements. However, heterogeneity between skeletal muscles with similar characteristics is well known to clinicians, who are confronted by dystrophies affecting only a subset of muscles, such as those of the face (Mathews and Mills, 1996) or limb girdle (Bushby, 1999). The further analysis of factors regulating Myf5 expression in muscle subdomains may throw light on this phenomenon and also open the way to targeted gene expression and manipulation in specific muscles.
The authors thank R. Kelly, F. Relaix and R. Spörle for helpful discussions and critical reading of the manuscript and C. Cimper and C. Bodin for technical assistance. We also thank C. Fairhead for the gift of the pKA and pAN vectors and her and E. Heard for useful advice on yeast manipulations, and C. Huxley for advice about YAC transgenesis. This work was supported by grants from the Pasteur Institute, the CNRS (Centre National de la Recherche Scientifique) and the AFM (Association Française contre les Myopathies). J. H. was a recipient of a fellowship from the MRES (Ministère de la Recherche et de l’Enseignement Supérieur), the ARC (Association pour la Recherche sur le Cancer) and the AFM. T. C. was a recipient of a NATO (North Atlantic Treaty Organisation) fellowship and now holds an NIH (National Institutes of Health) fellowship (HD08570). M. P. was supported by fellowships from EMBO (European Molecular Biology Organisation), the Austrian Academy of Sciences and the AFM.