ABSTRACT
Immunoelectron microscopical results have shown that the Z and M bands of the sarcomere are interconnected by the long titin molecules. Here we have characterized by monoclonal antibodies, cDNA cloning and immunoelectron microscopy the two titin-associated proteins (190 and 165 kDa proteins), which seem responsible for the formation of a head structure on one end of the 0.9 μm long titin string. The human 165 kDa (1465 residues) and 190 kDa (1451 residues) proteins have unique N-terminal domains some 110 residues in length. Both proteins show 12 repeat domains with strong homology to either fibronectin type III (motif I) or immunoglobulin C2 (motif II) domains, which are arranged in the order II-II-I-I-I-I-I-II-II-II-II-II. Over these repeat domains the two proteins share 50% sequence identity (70% similarity). Epitopes situated in the C-terminal 138 or in the preceding 206 residues of the 165 kDa protein locate in immunoelectron microscopy to stripes situated 18 or 15 nm from the center of the M band. An epitope situated 277 to 129 residues prior to the C-terminus of the 190 kDa protein (i.e. repeats 10 and 11) locates to the center of the M band. Thus the head structure of the titin molecule extends into the center of the M band. Microsequence data on peptides from the titin-associated bovine 165 kDa protein and from conventionally purified bovine M-protein argue together with the reactivity of the antibodies that 165 kDa protein and M-protein are identical. The integrating structure of the sarcomere, which is based on titin and its side-on (C-protein and 86 kDa protein) or end-on (190 kDa protein and 165 kDa protein) associated proteins arises from muscle-specific members of the superfamily of immunoglobulin-like proteins.
INTRODUCTION
The giant protein titin, sometimes also referred to as connectin, has an approximate molecular mass of 2400 to 3000 kDa (Maruyama et al., 1984; Kurzban and Wang, 1988). It is thought to integrate the thick and thin filaments and to confer elastic behaviour on the sarcomere of striated muscle myofibrils (for reviews see Wang, 1985; Maruyama, 1986; Trinick, 1991). Immunoelectron microscopy with 11 monoclonal antibodies, recognizing distinct and non-repetitive epitopes, shows that the titin molecule extends from the Z band into the M band. The end of the titin molecule, which is lost due to proteolysis in the purified titin TII molecules, is anchored in the Z band (Fürst et al., 1988, 1989). Additional titin monoclonal antibodies that recognize repetitive epitopes have defined a 42 to 43 nm repeat pattern in the A band (Fürst et al., 1989), which coincides with striations of the A band known to harbor two myosin-associated proteins, the C-protein and the 86 kDa protein (Sjöström and Squire, 1977; Craig and Offer, 1976; Dennis et al., 1984; Bähler et al., 1985a,b; Fürst et al., 1989).
Purified titin II molecules have a narrow length distribution of about 900 nm and lack the Z-band anchoring domain. Individual titin molecules appear as long strings carrying a single globular head, which seems connected to the presence of two M-band proteins of apparent molecular masses of 190 and 165 kDa (Nave et al., 1989). The tour de force approach of cDNA cloning taken by Labeit et al. (1990, 1992) has so far yielded several arrays, which constitute together some 30% of the entire titin molecule. These results show that titin consists almost entirely of repetitions of two types of motifs, called class I and class II. Both are roughly 100 residues in length and are related to type III fibronectin and C2 immunoglobulin domains, respectively. This repetitive display of 100-residue domains fits the 4 nm beaded appearance seen in electron micrographs of titin (Trinick et al., 1984). The 42 to 43 nm repeat observed particularly in the A band (Fürst et al., 1989) seems to reflect an 11 domain super-repeat pattern (Labeit et al., 1992) and biochemical results prove the expected interaction of titin with the LMM part of myosin and with C-protein (Labeit et al., 1992; Fürst et al., 1992). In addition, the cDNA cloning results show that the physical end attached to the Z band (Fürst et al., 1989) is the N-terminal region (Labeit et al., 1992).
While the Z-band binding region of titin is not yet available for molecular analysis, the I-band portion shows high elasticity, which is reflected by epitope movement as sarcomere length is increased (Fürst et al., 1988; Itoh et al., 1988; Whiting et al., 1989). Along the myosin A band, titin is fixed by integration to thick filaments via myosin binding proteins such as C-protein and possibly by additional direct interactions. The C-terminal end of the titin string extends into the M-line (Fürst et al., 1989) where it binds tightly to M-band constituents of apparent molecular masses of 190 kDa and 165 kDa (Nave et al., 1989). Since C-protein is now accepted as a side-on binding protein (Fürst et al., 1992; Labeit et al., 1992) and since it is difficult to study the end-on anchorage of titin at the Z band, we have concentrated on the opposite end. Based on cDNA cloning we report full sequences for the 165 kDa protein and the 190 kDa protein. The two human proteins share 50% sequence identity and reveal the modular structure based on fibronectin type III and immunoglobulin C2 repeats that is also present in titin. Monoclonal antibodies with epitopes mapped to the C-terminal 138 or the preceding 206 residues, respectively, decorate in immunoelectron microscopy adjacent to the center of the M band. Direct implications for models of the M-band portion of the sarcomere are discussed.
MATERIALS AND METHODS
Purification of 165 kDa protein and 190 kDa proteins from bovine skeletal muscle
A 100 g sample of bovine skeletal muscle (M. iliacus) was homogenized in 700 ml LSB (100 mM KCl, 2 mM MgCl2, 4 mM EGTA, 1 mM 2-mercaptoethanol, 1 mM NaN3, 10 mM Tris-maleate, pH 6.8) containing 2 mM Na4P2O7. Myofibrillar material was collected by centrifugation (3,000 g, 15 min) and washed three times with LSB. The pellet was extracted for 40 min in 450 ml extraction buffer (600 mM KCl, 2 mM MgCl2, 2 mM EGTA, 1 mM 2-mercaptoethanol, 1 mM NaN3, 10 mM imidazole-HCl, pH 7.0) supplemented with the following protease inhibitors: PMSF (1 mM); trypsin-inhibitor II (10 mg/ml); E-64 (10 μM); and pepstatin (1 μM). The supernatant obtained after centrifugation (20,000 g, 30 min) was dialyzed extensively against buffer A (2 mM EGTA, 1 mM 2-mercaptoethanol, 1 mM NaN3, 50 mM Tris-HCl, pH 7.9). After clarification by centrifugation the supernatant was applied to a 2.5 × 27 cm Q-Sepharose column (Pharmacia) equilibrated in buffer A containing 90 mM KCl. After washing with the same buffer, bound protein was eluted with a shallow salt gradient (90 to 300 mM KCl in buffer A; total volume 750 ml). One peak contained the 165 kDa protein at a purity of 90% as judged by SDS-PAGE. A second peak contained similar amounts of the 190 kDa protein, C-protein and a protein of apparent molecular mass 120 kDa. Both fractions were used to immunize mice for the production of monoclonal antibodies.
In later experiments the 165 kDa protein was purified to homogeneity by the following method: the pooled fractions from the Q-Sepharose column were dialysed extensively against buffer A and subsequently applied to a 5’AMP-Sepharose 4B column (Pharmacia; 1 × 10 cm) equilibrated in the same buffer. The flowthrough fractions containing the 165 kDa protein were pooled and applied to a MonoQ HR5/5 column equilibrated in buffer A. Bound protein was eluted with a linear salt gradient (0 to 400 mM KCl; total volume of 20 ml). The 165 kDa protein eluted at around 180 mM KCl and contained no impurities on SDS-gels.
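As a rough aid to reading the MonoQ step, the gradient volume at which a given KCl concentration is reached can be estimated. The sketch below assumes an ideal linear gradient with no dead volume or mixing delay; the function name is illustrative and is not part of the original protocol.

```python
def volume_at_concentration(c_target, c_start, c_end, total_volume):
    """Gradient volume (ml) at which an ideal linear gradient reaches c_target (mM).

    Assumption: perfectly linear gradient, no dead volume or mixing delay.
    """
    if c_end == c_start:
        raise ValueError("gradient must span a non-zero concentration range")
    if not min(c_start, c_end) <= c_target <= max(c_start, c_end):
        raise ValueError("target concentration lies outside the gradient")
    return total_volume * (c_target - c_start) / (c_end - c_start)


# MonoQ gradient from the text: 0 to 400 mM KCl over 20 ml, elution near 180 mM.
print(volume_at_concentration(180, 0, 400, 20))
```

Under these idealized assumptions the 165 kDa protein would elute roughly 9 ml into the 20 ml gradient; the real elution volume depends on the column and the chromatography system.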
Purification of titin II and M-protein from bovine skeletal muscle
The preparation of titin followed the protocol for titin II from chicken breast muscle, excluding the final gel permeation chromatography on a TSK 6000 PW column (Nave et al., 1989). M-protein was purified essentially as described for chicken muscle by Eppenberger and Strehler (1982). In order to reduce proteolysis, the following protease inhibitors were added: PMSF (0.5 mM); TPCK (0.5 μM); TLCK (0.5 μM); pepstatin (1.25 mg/l); E-64 (5 μM); and benzamidine (1.25 mg/l). The concentration of EGTA in the wash and extraction buffers was increased from 1 to 10 mM.
Protein chemical methods
Automated sequencing was performed on an Applied Biosystems sequenator (model A470) and a Knauer sequenator (model 810). Both instruments were equipped with an on-line PTH-amino acid analyzer. All methods used have been described recently (Fürst et al., 1992).
Immunization and monoclonal antibodies
Balb/c mice were immunized with native 165 kDa and 190 kDa protein purified as described above; 30-50 μg were used per injection and a standard immunization protocol was used. Spleen cells were fused with mouse myeloma cells (line PAI) as before (Fürst et al., 1988). Supernatants were screened by immunofluorescence microscopy on frozen sections of bovine skeletal muscle, and subsequently characterized by western blots of the original antigen and of titin TII (Fürst et al., 1988). Interesting hybridomas were subcloned by limiting dilution and several antibodies were further characterized by immunoelectron microscopy on myofibrils from bovine and rat skeletal muscle (Fürst et al., 1988). Two antibodies specific for the 165 kDa protein (AA241 and AA259) and one specific for the 190 kDa protein (BB78) were suitable for immunoelectron microscopy.
Cloning of the 165 kDa protein from human skeletal muscle
The initial antibody screen using a pool of antibodies AA241 and AA259 was on a human skeletal muscle cDNA library in λgt11, prepared from poly(A)+ RNA of a skeletal muscle from a fetus in the 22nd week of gestation. This library was kindly provided by Dr H. Arnold, University of Braunschweig, FRG. The antibody screen was performed as described for a lamin cDNA clone (Stick, 1988). This screen of 10⁶ clones yielded 5 positively reacting clones with insert sizes between 1.5 and 2.3 kb. The most 5’-situated RsaI fragment (∼350 bp) of the largest insert was labelled with [32P]dCTP (Amersham) using the T7 Quick Prime Kit (Pharmacia) and employed in a second hybridization round of the same library. The hybridization conditions were: 5× SSPE, 10× Denhardt’s solution, 0.5% SDS, 100 mg/ml sonicated salmon sperm DNA at 65°C. The final wash was carried out in 0.1% SDS, 0.1× SSC. This screen yielded a clone containing ∼1000 additional nucleotides. The most 5’-situated 400 bp of the clone were amplified by PCR with the appropriate primers, labelled and used for a third screening round of the same library as described above. This resulted in the isolation of a clone with a 3.9 kb insert. The full-length sequence was obtained by screening a human skeletal muscle λgt10 library, which was randomly primed (Clontech, Palo Alto, USA). The 700 bp situated closest to the 5’-end of the longest λgt11 clone obtained in the third screening round was amplified by PCR and used as a hybridization probe. The size of the portion by which the positive clones extended beyond the 5’-end of the longest λgt11 clone was determined by PCR employing a specific primer from the known 5’-end and a primer situated close to the λgt10 cloning site. Inserts were subcloned according to standard protocols (Sambrook et al., 1989) into plasmids Bluescript (Stratagene) and M13mp18 and 19 (Yanisch-Perron et al., 1985) for sequencing. Plasmid DNA was purified on Qiagen columns (Diagen, Hilden, FRG).
Sequencing was performed by the dideoxy chain termination reaction according to Sanger et al. (1977) using the Sequenase version 2.0 DNA sequencing kit (United States Biochemical Corporation). Both strands were fully sequenced.
Cloning of the 190 kDa protein from human skeletal muscle
The λgt11 library, described above, was screened with mAb BB78. Eight clones harboring cDNA inserts of between 1.2 and 3.3 kb were isolated and analyzed further. The most 5’-situated EcoRI fragment of the largest clone (∼1600 bp) was labeled with [32P]dCTP as described above and used to screen a randomly primed human skeletal muscle λgt10 library (Clontech, Palo Alto, USA). The hybridization conditions were: 7% SDS, 0.25 M Na2HPO4, pH 7.2, 63°C. The final wash was carried out in 1% SDS, 20 mM Na2HPO4, pH 7.2, at 65°C. The length by which the positive clones extended beyond the 5’-end of the λgt11 clone was determined by PCR. Inserts were subcloned and sequenced as described above.
Miscellaneous procedures
Total cellular RNA was extracted from bovine skeletal muscle and from F9 mouse teratocarcinoma cells following the protocol of Chomczynski and Sacchi (1987). The RNA was separated on a 1% agarose/glyoxal gel (Sambrook et al., 1989) and transferred to a nylon membrane (GeneScreen Plus; DuPont, Bad Homburg) by capillary action. UV crosslinking and hybridization conditions for northern analysis were as described by the manufacturer.
Computer analyses of the cDNA sequences were carried out on a Vax 9000 microcomputer using the GCG program package. Immunofluorescence and immunoelectron microscopy were performed as described (Fürst et al., 1988). For calculating the distances of antibody decoration lines the center of these lines was considered.
RESULTS
Purification of bovine titin-associated proteins and production of mAbs
Both titin-associated proteins were obtained from bovine skeletal muscle by a novel purification protocol. The 165 kDa protein fraction was more than 90% pure as judged by SDS-PAGE. The 190 kDa protein eluted from the Q-Sepharose column together with C-protein of slow fibers (140 kDa protein) and a 120 kDa protein, which we have identified as C-protein from fast fibers (our unpublished results). The 165 kDa and 190 kDa proteins were not purified further but instead were used directly to immunize mice. Monoclonal antibodies were isolated and characterized by immunoblotting on purified titin preparations containing the associated proteins (Fig. 1) and by immunoelectron microscopic localization of the epitopes in the sarcomere (Fig. 2). Antibodies AA241 and AA259 gave in immunoblots a strong and specific decoration of the 165 kDa protein in whole muscle samples and in purified titin II preparations (Fig. 1). Immunoelectron microscopy clearly located both epitopes to the center of the M-band region. Two poorly resolved decoration lines were found on either side of the center of the M band (the M1 line in the nomenclature of Sjöström and Squire, 1977; see Fig. 2). For mAb AA241 the center of each line was some 15 nm from the M1 line (Fig. 2a). Antibody AA259 yielded two lines at approximately 18 nm distance from the M1 line (Fig. 2b). Antibody BB78 recognized the 190 kDa polypeptide in western blots of whole muscle extracts and in purified titin II (Fig. 1). Its epitope located to a single line in the central portion of the M band. The limit of resolution of the immunoelectron microscopic method, thought to be around 10 nm (Fürst et al., 1989), does not allow the visualization of two distinct decoration lines (Fig. 2c).
Since the subsequent molecular characterization showed that the 165 kDa and the 190 kDa proteins are highly related (see below), it was important to establish whether they are merely two isoforms of one protein, which are distributed in a fiber type-specific fashion, or whether they coexist independently in the same sarcomere. Both immunofluorescence and immunoelectron microscopy clearly showed a codistribution of the two proteins in the fibers of the three muscles we examined (bovine M. iliacus, rat M. iliacus and rat M. psoas). mAb AA259 revealed an immunofluorescence pattern in which the intensity of the label varied slightly between different fiber types, indicating that these contained varying amounts of the 165 kDa protein.
Isolation and sequence of the cDNA encoding the 165 kDa protein of human skeletal muscle
Monoclonal antibodies AA241 and AA259 to the bovine titin-associated 165 kDa protein were used to screen a λgt11 cDNA expression library constructed from poly(A)+ RNA of fetal human skeletal muscle (22nd week of gestation). The primary screen of around 1×10⁶ plaques yielded 5 strongly reacting clones. They contained cDNA inserts of ∼0.9 to 2.3 kb. Restriction mapping and cross-hybridization patterns indicated descent from a single mRNA template. Two further screening rounds on the same library yielded only clones of up to 3.9 kb, i.e. ∼1 kb shorter than the size expected from northern blots (result not shown). Therefore a randomly primed λgt10 cDNA library of human muscle was employed in the next screen. PCR was used to estimate by how much the inserts extended in the 5’ direction beyond the known sequence. Two clones, D1 and D2, were selected for detailed analysis. Since the λgt11 library was derived from fetal muscle and the λgt10 library from a 15-year-old child, we analyzed a longer overlapping portion for sequence identity. Clone D1 is 1763 bp long; 719 bp extend over the previously established sequence towards the 5’-end. The remaining 1044 bp comprised an overlap that matched perfectly with the λgt11 sequence. The same was true for clone D2; 1055 bp gave new sequence information, while the remaining 553 bp were identical to the sequence previously established on the λgt11 clone. We therefore conclude that all the clones we describe are derived from the same mRNA template.
The complete sequence of the cDNA contains 4939 bp (see Fig. 3). It has a single large open reading frame (ORF) that encodes a protein of 1465 amino acids. The coding region is flanked by 5’- (nucleotides 1 to 48) and 3’- (nucleotides 4444 to 4939) untranslated sequences. The ATG start codon at positions 49 to 51 lies within a sequence with a high degree of similarity to the consensus sequence required for efficient translational initiation (Kozak, 1989). The TGA stop codon at residues 4444 to 4446 marks the start of a nontranslated sequence of 492 bp. The polyadenylation signal is present at residues 4905 to 4910, and a short poly(A) tail (17 bases) follows 12 nucleotides later. The calculated molecular mass of the polypeptide predicted from the ORF is 164,883. This value is in excellent agreement with the estimate of 165,000 from SDS-PAGE. Since this value also fits the size of 5 kb for the mRNA estimated by northern blots (result not shown) we believe that we have obtained the complete cDNA sequence for the human 165 kDa protein.
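The coordinates quoted for the 165 kDa cDNA can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the numbers given in the text, using 1-based inclusive positions; the function name is illustrative, and reading "follows 12 nucleotides later" as 12 intervening nucleotides between the signal and the poly(A) tail is an assumption.

```python
def orf_coordinates(start_pos, n_residues):
    """Return (coding length, last coding nucleotide, stop codon span).

    Positions are 1-based and inclusive; start_pos is the A of the ATG.
    """
    coding_len = 3 * n_residues
    last_coding = start_pos + coding_len - 1
    stop_codon = (last_coding + 1, last_coding + 3)
    return coding_len, last_coding, stop_codon


# Values from the text: ATG at 49 to 51, 1465 residues.
coding_len, last_coding, stop_codon = orf_coordinates(49, 1465)
print(coding_len, last_coding, stop_codon)

# Poly(A) tail: signal ends at 4910; assumed 12 intervening nucleotides,
# then 17 A residues.
poly_a_start = 4910 + 12 + 1
poly_a_end = poly_a_start + 17 - 1
print(poly_a_start, poly_a_end)
```

With these inputs the stop codon falls at positions 4444 to 4446 and the poly(A) tail ends at position 4939, in line with the stated cDNA length.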
Isolation and sequence of the cDNA encoding the 190 kDa protein of human skeletal muscle
mAb BB78, directed against the bovine 190 kDa protein, was used to screen the λgt11 cDNA expression library described above. This resulted in 8 strongly reacting clones in the 1×10⁶ plaques screened. The cDNA inserts were between 1.3 and 3.3 kb. Restriction mapping and cross-hybridization patterns revealed a set of overlapping clones descending from a single mRNA template. Since the original λ clone has two internal EcoRI sites, it was important to verify their exact position. This was achieved by amplifying fragments ∼400 bp in length around these sites by PCR using the original λ DNA as template. The amplified fragments were subjected to direct sequencing according to the method of Winship (1989). To obtain the full-length cDNA sequence the randomly primed λgt10 cDNA library that was used to isolate the human 165 kDa cDNA (see above) was screened. PCR was again used to analyze how much the inserts extended beyond the known sequence. Two clones, N1 and N2, were selected for detailed analysis. Clone N1 extends 1661 bp over the previously established sequence in the 5’ direction, while the extension provided by clone N2 was only 1492 bp.
The complete cDNA sequence contains one large ORF of 4353 nucleotides coding for 1451 amino acids (Fig. 4). The ATG start codon at positions 118 to 120 lies within a sequence that reveals a high degree of similarity to the consensus sequence required for the initiation of translation (Kozak, 1989). The 3’-untranslated sequence starts with the TGA stop codon at positions 4470 to 4472 and has a total length of 479 bp. The canonical polyadenylation signal AATAAA is found at residues 4891 to 4896. A poly(A) remnant of 26 A residues is revealed 26 bases later. The ORF codes for a polypeptide of molecular mass 162,451 Da. Although this value is somewhat lower than the apparent mass expected from the mobility in SDS-PAGE (190 kDa) we believe that it represents the complete cDNA of the ‘190 kDa protein’. The length of the cDNA fits the size of 5 kb for the mRNA estimated from northern blots. In addition a second screen of the randomly primed λgt10 library using a PCR fragment close to the 5’-end as a probe provided no clones that extended further in the 5’ direction than the ones identified before.
Modular structure of two homologous M-band proteins
The N-terminal 160 residues of the 165 kDa protein form a unique region that is unrelated to the rest of the sequence or to other known proteins. It contains only two proline residues (positions 7 and 154) and is rather basic, particularly along the N-terminal 78 residues. Secondary structural prediction rules allow for α-helical elements around residues 30 and 80, and a longer α-helix between residues 100 and 150. Interestingly the N-terminal domain of the 165 kDa protein contains two potential serine phosphorylation sites for cAMP-dependent protein kinase: KKRAS at residues 35 to 39 and KRVS at residues 73 to 76 (Fig. 5).
Nearly 90% of the sequence of the 165 kDa protein consists of repetitions of two types of structural motifs, which have a pronounced homology either to the fibronectin type III domain (class I motif) or to the immunoglobulin C2-like domain (class II motif) (Fig. 5). Such motifs are also known for some other myofibrillar proteins such as titin, C-protein, 86 kDa protein, myosin kinase and twitchin (see Discussion). These domains are usually around 100 to 110 residues in length and increase only in the first and third repeat to 130 and 128 residues, respectively. There are twelve such modules in the 165 kDa and the 190 kDa protein and these show the linear arrangement II-II-I-I-I-I-I-II-II-II-II-II (Fig. 5). This arrangement is unique in the currently known list of myofibrillar proteins belonging to the superfamily of immunoglobulin-like proteins (see Discussion). When all type I or type II domains are compared, a block of about 70 residues shows the highest sequence conservation. The most conserved residues of the block are marked in Fig. 5 for both type I and type II domains.
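The module order can be summarized as runs of consecutive motifs, which makes the blocks of five class I and five class II domains explicit. The following is a minimal sketch; the arrangement string is taken from the text, while the helper name is illustrative.

```python
from itertools import groupby

# Linear arrangement of class I (fibronectin type III) and class II
# (immunoglobulin C2) motifs as given in the text.
ARRANGEMENT = "II-II-I-I-I-I-I-II-II-II-II-II"


def domain_runs(arrangement):
    """Split the arrangement into domains and collapse consecutive repeats."""
    domains = arrangement.split("-")
    runs = [(motif, len(list(group))) for motif, group in groupby(domains)]
    return domains, runs


domains, runs = domain_runs(ARRANGEMENT)
print(len(domains))
print(runs)
```

The result is twelve domains grouped as two class II, five class I and five class II motifs.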
The 190 kDa protein has a unique N-terminal domain of some 110 residues, which shows no obvious relation to other proteins or to the N-terminal domain of the 165 kDa protein (see above). Interestingly the sequence covering residues 46 to 93 in the 190 kDa protein shows 8 consecutive copies of the sequence KQSTAS (Fig. 5). This motif is absolutely conserved 4 times (repeats 1, 3, 4 and 5) while the other 4 copies contain either one (repeats 2 and 6), or three (repeats 7 and 8) conservative replacements. Secondary structural prediction rules do not assign a defined structure to this region. Currently it is not known whether the conserved serine or threonine residues of these 8 motifs are targets for a protein kinase. Past the N-terminal domain the 190 kDa protein displays the same 12 class I and II repeats as the 165 kDa protein. Over this region the two proteins share about 50% sequence identity (similarity level 70%). At the nucleotide level the sequence identity reaches even 60%. The striking similarity also holds true for the 3’-untranslated regions. They are very similar in length (476 bp for the 165 kDa protein and 453 bp for the 190 kDa protein from the stop codon to the beginning of the poly(A) tail) and reveal an identity level of 33% with a single gap of 4 nucleotides introduced into the mRNA of the 190 kDa protein for optimal alignment (data not shown). Fig. 5 also shows the minimal length variability of homologous domains in both proteins. Thus the fourth class I repeat is increased by only 3 residues in the 190 kDa protein. This striking length conservation extends nearly to the C-terminal end, where the 165 kDa protein has an extra 8 residues. When individual type I or II domains of one class are compared with domains of the same class either within the same protein or in both proteins, some interesting differences are seen.
When the repeats are situated in the same relative position in the two proteins, class I domains show 43 to 61% sequence identity (50 to 76% sequence similarity) while similar domains in different locations display usually only 26 to 51% sequence identity (50-73% sequence similarity). Overall, class II domains show a lower degree of sequence conservation than class I domains (on average 20% versus 34% identity level). For corresponding type II domains, the two proteins show 41 to 53% identity (60 to 74% similarity), while random comparison yields only 14 to 30% identity (34 to 54% similarity).
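For readers unfamiliar with how such identity values are derived, the sketch below computes percent identity between two pre-aligned sequences. It is not the GCG procedure used in this study, whose exact scoring conventions are not stated here. It assumes that gaps are written as "-" and that columns containing a gap are excluded from the denominator, which is only one of several conventions in use. The example strings are toy inputs, not sequences from Fig. 5.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over gap-free columns of a pairwise alignment.

    Assumption: gaps are '-', and gapped columns are left out of the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy alignment (hypothetical sequences): 5 of 6 columns identical.
print(round(percent_identity("KQSTAS", "KQATAS"), 1))
```

Similarity values such as those quoted above additionally require a table of conservative replacements, which this sketch deliberately leaves out.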
Epitope mapping of mAbs used in immunoelectron microscopy
Our initial antibody screen of the λgt11 expression library yielded 5 clones for the 165 kDa protein that reacted with mAb AA259. Since all clones extend from a common 3’-end and the shortest clone has an insert of 909 bp only, the epitope recognized by this antibody is confined to the protein sequence encoded by nucleotides 4027 to 4443, i.e. to the carboxy-terminal 138 amino acids (residues 1327 to 1465), which provide the C-terminal end of repeat 11 and the complete repeat 12 (see Fig. 5). mAb AA241 recognized exclusively a clone that extends by an additional 618 bp towards the 5’-end. Therefore this epitope of the 165 kDa protein involves some part of the protein sequence that is situated around residues 1121 to 1327, which form the C-terminal part of repeat 9, the complete repeat 10 and most of repeat 11 (Fig. 5). All clones reacting with mAb BB78, which is specific for the 190 kDa protein, had only the sequence between nucleotides 3642 to 4086 in common. Thus the epitope is located to amino acid residues 1154 to 1327 of the protein sequence. They form the C-terminal half of repeat 10 and most of repeat 11.
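The conversion from nucleotide positions to residue numbers used for the 165 kDa protein clones can be reproduced as follows. This is a minimal sketch that assumes 1-based numbering and the start codon position of 49 reported for the 165 kDa cDNA; the function name is illustrative, and only the 165 kDa protein clones are covered.

```python
def nt_to_residue(nt_pos, start_pos):
    """Residue number encoded by the codon containing nucleotide nt_pos.

    Both positions are 1-based; start_pos is the first base of the ATG.
    """
    if nt_pos < start_pos:
        raise ValueError("position lies upstream of the start codon")
    return (nt_pos - start_pos) // 3 + 1


START_165 = 49  # ATG position in the 165 kDa cDNA, as stated in the text

# Shortest AA259-positive clone: coding sequence from nucleotide 4027 to 4443.
print(nt_to_residue(4027, START_165), nt_to_residue(4443, START_165))

# AA241-positive clone: a further 618 bp towards the 5'-end.
print(nt_to_residue(4027 - 618, START_165))
```

These inputs give residues 1327 and 1465 for the AA259 boundaries and residue 1121 for the 5’ limit of the AA241-positive clone, matching the residue numbers quoted in the text.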
The titin-associated 165 kDa protein is most likely M-protein
Purified titin II from bovine skeletal muscle was subjected to SDS-PAGE and the separated polypeptides blotted on a PVDF membrane. The blot corresponding to the titin-associated 165 kDa protein lacked a free N-terminus upon direct sequencing. Corresponding blots of the titin-associated 165 kDa protein were digested in situ with endoproteinase Asp-N or with trypsin. Peptides released from the blot were subjected to HPLC separation and the elution profiles were screened for pure peptides by automated sequencing. Fig. 6 shows that all bovine peptide sequences could be located along the complete human sequence predicted by cDNA cloning (see above), although several peptides showed one or two conservative amino acid exchanges. The combined results cover 138 residues of the titin-associated 165 kDa protein.
Bovine skeletal muscle M-protein was purified as described (Eppenberger and Strehler, 1982) and the fraction from the 5’-AMP-Sepharose 4B column was separated by SDS-PAGE. The 165 kDa protein of this preparation, which is defined as M-protein (Eppenberger and Strehler, 1982), was recognized in immunoblots by the two monoclonal antibodies AA241 and AA259 (Fig. 1C), which react with the titin II-associated bovine 165 kDa protein. The band was not decorated by antibody BB78, which is specific for the 190 kDa protein. Similar blots on a PVDF membrane were used to obtain peptides of bovine M-protein by digestion with endoproteinase Asp-N and trypsin. Fig. 6 shows that all pure peptides could be fitted along the human sequence predicted from the complete cDNA clone for the 165 kDa protein, although again an occasional conservative amino acid exchange is noted. The combined results cover nearly 100 residues of the bovine M-protein.
Digests of the titin-associated 165 kDa protein and M-protein were obtained at different times during this study and slightly different experimental conditions were used both in digestion and HPLC separation. As the HPLC elution profiles for the peptides from high molecular mass proteins are necessarily very complex, a perfect fit for the few pure peptides present in both preparations was not expected. Nevertheless, both peptide profiles show seven identical peptides and these and all other pure peptides sequenced from both preparations follow the human sequence (Fig. 6). To obtain somewhat longer sequences, bovine M-protein isolated by preparative SDS-PAGE was subjected to partial acid cleavage, using conditions relatively specific for cleavage of aspartic acid-proline bonds (48 hours in 80% formic acid at room temperature). Fragments were subjected to SDS-PAGE. The blot of a 7 kDa fragment provided the N-terminal sequence PPEPRGKEPLMYFIEKSMVGSGXXQXVNAQXAVXSP, which fits residues 533 to 568 of the human 165 kDa protein where the residues that are underlined reflect conservative amino acid replacements. In addition a CNBr fragment of bovine M-protein (5 kDa) yielded after blotting the N-terminal sequence FFGEGQASLSFSXLNXDDEGLYTLXIVS, which corresponds to residues 332 to 359 of the human 165 kDa protein. The peptide sequence results, the reactivity with monoclonal antibodies AA241 and AA259, and the same apparent molecular masses in SDS-PAGE argue that the titin-associated 165 kDa protein and M-protein are identical. After our study of the human 165 kDa protein was complete and we had characterized the last nine repeats of the 190 kDa protein (Vinkemeier, 1992), Noguchi et al. (1992) reported the cDNA sequence for the M-protein of chicken muscle. This shows 73% identity (82% similarity) with the human 165 kDa protein characterized here (see Discussion). Although the arrangement of class I and class II domains is identical in both proteins, Noguchi et al. (1992) did not note that residues 1018 to 1087 form a class II repeat (Fig. 5) and instead left this region unassigned in their domain description.
DISCUSSION
The sarcomere has been described for a long time essentially as a structure of thin and thick filaments fixed at the Z and M bands, respectively. However, recent results on additional myofibrillar proteins have established a supramolecular organisation, which integrates the myofibril (Wang, 1985; Maruyama, 1986; Trinick, 1991). It is based on titin, which spans the half sarcomere (Fürst et al., 1988, 1989) and on titin-associated proteins, which allow for side-on and end-on interactions. We have previously identified two polypeptides of molecular masses, 190 kDa and 165 kDa, strongly associated with purified titin II. Immunoelectron microscopy tentatively located these proteins to the myofibrillar M band (Nave et al., 1989). The present work aimed at a characterization of the primary structure of these two proteins and a better understanding of the three-dimensional architecture of the sarcomeric M band.
The two M-band proteins show no obvious relation in their N-terminal domains, which cover some 110 residues, but the following 1300 residues are highly homologous (50% sequence identity, 70% similarity). They form, in both proteins, twelve modules in the linear arrangement II-II-I-I-I-I-I-II-II-II-II-II, where motifs II and I reflect immunoglobulin C2 and fibronectin type III like domains. Except for the first and third domains, each of these units is about 100 residues in length. Length variability of comparable domains of both proteins is very small and essentially restricted to the C-terminal ends of the first and the last domain. Class I repeats show a relatively strong conservation of sequence. In fibronectin highly conserved amino acid residues occur at ten positions per repeat (Peterson et al., 1989). Six out of these are retained in the type I repeats of the two M-band proteins. The first class I domain of the 165 kDa protein shares with fibronectin even nine of the conserved positions. Of the 15 highly conserved residues in the constant and variable regions of immunoglobulins (Williams and Barclay, 1988) six are retained in the type II domains of both M-band proteins (Fig. 5). The high homology of the cDNA regions encoding the twelve domains of both M-band proteins (60% sequence identity on the nucleotide level) continues, albeit at a reduced level, in the untranslated 3’ regions of the two mRNAs. A linear alignment indicates a 33% sequence identity in this region. We speculate that the genes responsible for the two M-band proteins separated relatively recently in evolution.
The small N-terminal domains of both proteins show interesting features. The 165 kDa protein displays two strong target sites for the cyclic AMP-dependent protein kinase A. The 190 kDa protein shows eight consecutive copies of the short motif KQSTAS. Four of these repeats are perfectly conserved, two have a single conservative replacement, and the last repeats are more degenerate. Whether this serine- and threonine-rich segment of 48 residues serves as a target for certain protein kinases is not known.
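A tandem repeat of this kind can be located by stepping through a sequence in motif-sized windows and counting mismatches against the consensus. The Python sketch below is a minimal illustration under stated assumptions: the test sequence `TOY_SEQ` is invented for the example and is not the N-terminal sequence of the 190 kDa protein, the function names are hypothetical, and the mismatch threshold is an arbitrary choice that does not distinguish conservative from non-conservative replacements.

```python
MOTIF = "KQSTAS"  # consensus motif named in the text

# Invented toy sequence for illustration only; NOT the real 190 kDa sequence.
TOY_SEQ = "MA" + "KQSTAS" + "KQSTAS" + "KQSSAS" + "KESTVT" + "GG"


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def tandem_copies(seq: str, motif: str, max_mismatch: int = 1) -> list[tuple[int, str, int]]:
    """Return (start, window, mismatches) for consecutive motif-sized windows.

    The walk begins at the first exact occurrence of the motif and stops at
    the first window that exceeds max_mismatch. Hypothetical helper.
    """
    hits: list[tuple[int, str, int]] = []
    pos = seq.find(motif)
    if pos < 0:
        return hits
    step = len(motif)
    while pos + step <= len(seq):
        window = seq[pos:pos + step]
        mismatches = hamming(window, motif)
        if mismatches > max_mismatch:
            break
        hits.append((pos, window, mismatches))
        pos += step
    return hits


hits = tandem_copies(TOY_SEQ, MOTIF, max_mismatch=1)
for start, window, mismatches in hits:
    print(start, window, mismatches)

# Eight copies of a six-residue motif span 8 * 6 = 48 residues, as in the text.
print(8 * len(MOTIF))
```

On the toy sequence the walk accepts two exact copies and one copy with a single replacement, then stops at the more degenerate fourth window. A stricter analysis would score replacements with a substitution matrix rather than a plain mismatch count.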
Members of the superfamily of immunoglobulin-like molecules have been known for some time to participate in a wide variety of extracellular and membrane attachment processes, such as cell-cell recognition (reviewed by Williams, 1987). Recently it was found that such molecules also participate in the architecture of the myofibril, where they form an integrating structure or scaffold. This is built by titin (Labeit et al., 1990, 1992) and the titin- and myosin-binding proteins, C-protein and 86 kDa protein (Einheber and Fischman, 1990; Epstein and Fischman, 1991; Fürst et al., 1992; Labeit et al., 1992). Interestingly, myosin light chain kinase also belongs to the superfamily (Olson et al., 1990), and a myosin kinase-like domain is found towards the C-terminal ends of both titin and nematode twitchin (Benian et al., 1989; Labeit et al., 1992). The latter molecule is essentially a mini-titin (Nave and Weber, 1990; Nave et al., 1991). We have now shown that the two titin-associated proteins of the M band also follow these sequence principles. Currently it seems that each myofibrillar member of the immunoglobulin-like superfamily has its own distinct arrangement of fibronectin type III-like and immunoglobulin C2-like domains. Five consecutive type II motifs, as in the two M-band proteins, are known to occur once in the N- and C-terminal regions of twitchin (Benian et al., 1989) and once in C-protein, with the first repeat being degenerate (Fürst et al., 1992). However, the five consecutive type I motifs of the M-band proteins so far have no counterpart in other proteins. Although we have not made a systematic comparison of the various domains in the different proteins, we note that, as in C-protein (Fürst et al., 1992), class I domains of the M-band proteins have a significantly higher conservation (up to 35% identity) than class II domains (around 20% identity).
Comparison of the domains of the M-band proteins with corresponding domains in other proteins shows that the first class II domain of the 165 kDa protein shares 38% identity with a class II domain situated at positions 2933 to 3003 of the titin clone AB5 (Labeit et al., 1992).
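Identity figures such as those quoted here are obtained by counting identical residues over the columns of a pairwise alignment. The Python sketch below shows one common convention, in which columns containing a gap in either sequence are excluded from the denominator. This convention is an assumption made for illustration; the text does not state which denominator was used. The function name `percent_identity` is hypothetical and the two aligned strings are invented toy data, not sequences from the M-band proteins or titin.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, gaps marked by `gap`).
    Assumed convention: columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gapped column: not counted
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Invented toy alignment for illustration only.
toy_a = "KVLE-GDSA"
toy_b = "KILEAGD-A"

# 6 identical residues over 7 ungapped columns.
print(round(percent_identity(toy_a, toy_b), 1))  # 85.7
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values are comparable only when the same denominator is used throughout.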
Our immunoelectron microscopy results with monoclonal antibodies of known epitope location (Figs 2 and 5) directly confirm the prediction that the globular head observed in purified titin TII molecules arises from M-band proteins (Nave et al., 1989). They show in addition that the C-terminal regions of the two titin-associated proteins locate at or around the center of the M band. Since a monospecific polyclonal antibody reacting with chicken but not with mammalian 190 kDa protein provided labelling of the M6/M6’ lines (Nave et al., 1989), we assume that the two titin-associated proteins traverse the M band. As the most C-terminal epitope of the 165 kDa protein (AA259) labels at 18 nm from the center of the M band (the M1 line), while the preceding epitope (AA241) labels at only 15 nm distance, we speculate that the 165 kDa protein extends through the M1 line. Thus, neighbouring molecules could be arranged in an antiparallel manner, with their C-terminal ends extending into the neighbouring half-sarcomere. An antiparallel and staggered arrangement is also characteristic of myosin molecules in the M band, the center of the thick filaments (Offer, 1987). Additional monoclonal antibodies of known epitope position, or tailor-made peptide antibodies, should allow a precise determination of the arrangement of the 190 kDa and 165 kDa proteins and should provide a test of the proposed staggered arrangement of these proteins in the center of the M band.
The immunoelectron microscopy results also have direct implications for the C-terminal end of titin, which is thought to locate to the M-band side of the half-sarcomere (Labeit et al., 1992). The closest apposition of titin and the M band so far relies on monoclonal antibody T33. In immunoelectron microscopy it labels a single line per half-sarcomere. This line is situated 55 nm from the center of the M band and corresponds to the M7 line (Fürst et al., 1989). On purified titin molecules, T33 marks the rod-head junction (Nave et al., 1989). Since the head contains the titin-associated M-band proteins of 190 kDa and 165 kDa (see above), it is possible that the titin rod also extends to the center of the M band, although this has not yet been shown by immunoelectron microscopy using antibodies to the C-terminal end of the molecule. Interestingly, the C-terminal end of titin is formed by 10 consecutive class II modules (S. Labeit, pers. communication). As the two M-band proteins end with 5 consecutive class II modules and have their C-termini around the center of the M band, the tight association of titin and these M-band proteins may occur by parallel class II modules, as in the immunoglobulin molecule. Future biochemical experiments using recombinantly expressed domains should be able to test this view. Binding mechanisms based on homophilic and/or heterophilic interactions have been observed in the case of NCAM and LCAM (Edelman, 1987; Hoffmann and Edelman, 1983).
ACKNOWLEDGEMENTS
We greatly appreciate the help of Uwe Plessmann in obtaining the partial protein sequence data on bovine 165 kDa protein and M-protein. Dr H. Arnold, University of Braunschweig, kindly provided the λgt11 cDNA library from fetal human skeletal muscle. This work was supported in part by a grant from the Deutsche Forschungsgemeinschaft to D.O.F. and K.W. The contributions of U.V. to this study form part of the requirements for the Ph.D. at the University of Gießen, FRG. The sequences reported above have been deposited in the EMBL/GenBank database and are available under accession numbers X69089 (165 kDa protein) and X69090 (190 kDa protein).
REFERENCES
NOTE ADDED IN PROOF
We have meanwhile purified the bovine 190 kDa protein to homogeneity. Peptide sequences document the homology with the human protein predicted by cDNA cloning. The positive reaction of the 190 kDa protein with a monoclonal antibody kindly provided by Dr H. Eppenberger indicates that this protein is the mammalian counterpart of chicken myomesin (W. Obermann, D. O. Fürst and K. Weber, unpublished).