Neural cell adhesion molecules (N-CAMs) are a family of cell surface sialoglycoproteins encoded by a single copy gene. A full-length cDNA clone that encodes a nontransmembrane phosphatidylinositol (PI) linked N-CAM of Mr 125 × 10³ has been isolated from a human skeletal muscle cDNA library. The deduced protein sequence comprises a polypeptide of 761 amino acids and is highly homologous to the N-CAM isoform in brain of Mr 120 × 10³. The size difference between the 125 × 10³ Mr skeletal muscle form and the 120 × 10³ Mr N-CAM form from brain is accounted for by the insertion of a block of 37 amino acids, called MSD1, in the extracellular domain of the muscle form. Transient expression of the human cDNA in COS cells results in cell surface N-CAM expression via a putative covalent attachment to PI-containing phospholipid. Linked in vitro transcription and translation experiments followed by immunoprecipitation with anti-N-CAM antibodies demonstrate that the full-length clone of 761 amino acid coding potential produces a core polypeptide of Mr 110 × 10³, which is processed by microsomal membranes to yield a 122 × 10³ Mr species. Taken together, these results demonstrate that the cloned cDNA sequence encodes a lipid-linked, PI-specific phospholipase C releasable surface isoform of N-CAM with core glycopeptide molecular weight corresponding to the authentic muscle 125 × 10³ Mr N-CAM isoform. This is the first direct correlation of cDNA and deduced protein sequence with a known PI-linked N-CAM isoform from skeletal muscle.
Specific homotypic and heterotypic cell-cell interactions occurring during embryogenesis and tissue formation are modulated by the temporal and spatial patterns of expression of cell adhesion molecules (CAMs) (Edelman, 1985). Two main adhesive systems operating either dependently or independently of calcium have been described (Edelman, 1986; Takeichi, 1987). Of the calcium-independent CAMs the best characterized is the neural cell adhesion molecule (N-CAM) (Edelman, 1986; Rutishauser & Goridis, 1986; Nybroe et al. 1988; Goridis & Wille, 1988; Walsh, 1988). N-CAM is a cell surface sialoglycoprotein that is involved in both homotypic and heterotypic adhesive interactions via a homophilic binding mechanism (Edelman et al. 1987). Perturbation experiments with anti-N-CAM antibodies have implicated this molecule in a variety of events including migration and guidance of axons, neural tissue formation and nerve-muscle interactions (Edelman, 1985). It is known that the N-CAM gene is present as a single copy and the diversity of N-CAM RNAs and isoforms found during development and in specific cell types can be accounted for by specific patterns of alternative splicing and polyadenylation site selection (Cunningham et al. 1987; Owens et al. 1987; Barthels et al. 1987; Santoni et al. 1987; Dickson et al. 1987). In addition to changes in amino acid sequence, N-CAM is also subject to a variety of cell and isoform-specific post-translational modifications including glycosylation, phosphorylation and sulphation (Nybroe et al. 1988).
Two categories of N-CAM forms have been described in neural and skeletal muscle tissues to date. These are, first, those with transmembrane topology of Mr 180 and 140 × 10³ in brain, and 145 × 10³ in muscle; and second, nontransmembrane species of Mr 120 × 10³ in brain and 125 and 155 × 10³ in muscle, all of which are anchored to the external surface of the plasma membrane through covalent linkage to phosphatidylinositol (PI) via a specific glycan (Cunningham et al. 1987; He et al. 1987; Nybroe et al. 1988). The mechanism generating these two different membrane-associated groups of N-CAM isotypes has been clearly shown to be based on alternative RNA splicing and differential polyadenylation site selection (Cunningham et al. 1987; Goridis & Wille, 1988). In addition, a lipid-linked isoform(s) in skeletal muscle containing an additional tissue-specific domain (MSD1) in its extracellular region has been predicted by cDNA sequencing studies (Dickson et al. 1987). This insertion occurs at a recognized splice junction according to the chick (Owens et al. 1987) and human (J. Thompson, unpublished observations) N-CAM gene structure and represents the first alternative splicing event, so far detected, within the extracellular domain that is not associated with membrane attachment. Indeed, the difference in molecular size between the brain N-CAM-120 and muscle N-CAM-125 forms is likely to reflect, in part, expression of this muscle-specific domain.
While the complete cDNA sequence corresponding to the protein coding region for brain N-CAM-120 is available (Barthels et al. 1987), the full sequence corresponding to its 125 × 10³ Mr muscle counterpart has been lacking. In the present study, a cDNA containing the complete coding segment of a muscle-specific N-CAM isoform was isolated and sequenced. Transcription, translation and processing of the cloned coding sequence in vitro, together with cell transfection, result in the synthesis of a cell surface N-CAM glycopeptide of Mr 122 × 10³. This corresponds in size to an isoform observed in skeletal muscle cells both in vitro and in vivo that can be released by PI-specific phospholipase C (Moore et al. 1987).
Materials and methods
Cloning of human skeletal muscle N-CAM cDNAs
32P-labelled cDNA probe 911 was used to isolate additional human fetal muscle N-CAM cDNA clones from a library in λgt11. The library was prepared from poly(A)+ mRNA isolated from midfusion human muscle cultures (Dickson et al. 1986) by standard procedures (Huynh et al. 1985). Hybridizations were performed as before (Dickson et al. 1987) and positive clones were plaque purified by recloning at least three times.
DNA sequence analysis
Recombinant DNA from λ phage was prepared by standard methods and EcoRI cDNA fragments were subcloned into the EcoRI site of M13mp18 and recombinants detected by using IPTG and X-gal. M13 recombinants were selected in both orientations by restriction mapping. Replicative form plasmid DNA was prepared by alkaline lysis and purified by two rounds of CsCl equilibrium density gradient centrifugation (Maniatis et al. 1982). A series of overlapping subclones was generated by XbaI/SphI double digestion of the replicative form plasmid DNA and deletions generated with exonuclease III and S1 nuclease (Henikoff, 1984). Overlapping subclones were sequenced by the dideoxy chain termination method (Sanger et al. 1977). Full cDNAs were sequenced at least twice on opposing strands. Subclone sequences were read manually, aligned and merged to generate full-length clones using the Microgenie DNA sequence analysis software package (Queen & Korn, 1984).
Northern and Southern analysis
Poly(A)+ mRNA (2 μg), purified from tissue or cultured cells as described before (Dickson et al. 1986), was fractionated by glyoxal agarose electrophoresis (Maniatis et al. 1982) and blotted onto Genescreen as described by the manufacturer. Hybridizations using 32P-labelled whole plasmid or gel-purified insert were performed as described previously (Dickson et al. 1987).
Cell-free transcription-linked translation
The two EcoRI fragments of CHB1 were isolated by partial digestion and subcloned into the EcoRI site of the bacterial expression vector pGEM (Promega Systems, Liverpool, UK). The continuity and relative orientation of the recombinant DNA were ascertained by restriction mapping. Following linearization of the recombinant plasmid, transcription was initiated with either T7 or SP6 RNA polymerase depending on the orientation of the CHB1 insert. Conditions of the transcription reaction were as described by the manufacturer with template DNA at 50 μg ml⁻¹, 1 unit of RNasin and a reaction time of 60 min at 40 °C. Synthetic RNA was separated from the DNA template and translated in a rabbit reticulocyte lysate in vitro translation system (NEN) with [35S]methionine as the radiolabel. Optimal incorporation into TCA-precipitable material was achieved by titration of both Mg²⁺ and K⁺ concentrations, with optimal concentrations of 0·7 mM and 52 mM, respectively. Synthesized products were separated on polyacrylamide gels with or without immunoprecipitation with anti-N-CAM (Moore et al. 1987). Protein bands were identified by fluorography at −70 °C.
DNA transfection studies
A HindIII fragment corresponding to the entire large HindIII portion of CHB1 plus pGEM polylinker regions was ligated into the unique HindIII site of p4.4.4 (Gunning et al. 1987), which follows the β-actin promoter. Sense-orientated constructs were identified by restriction mapping and purified plasmid DNA was used to transfect monkey kidney cells via the calcium phosphate precipitation method. Transient expression was examined after 48 h using indirect immunofluorescence with rabbit antibodies to N-CAM (Moore et al. 1987).
Results
N-CAM cDNA isolation and sequencing
N-CAM clones were isolated from a λgt11 cDNA library constructed from human fetal muscle poly(A)+ mRNA (Dickson et al. 1987). The cDNA library was screened by plaque hybridization with a mouse brain cDNA probe, 911, which was isolated by screening of a young postnatal mouse brain cDNA library with an oligonucleotide probe to the N-CAM N-terminal protein sequence (J.-C. Chaix & C. Goridis, unpublished results). The 911 probe contained both N-terminal-coding and 5′ untranslated sequence. The largest cDNA clone isolated, CHB1, was composed of two EcoRI fragments of 1·6 and 1·2 kb and these were subcloned in both orientations into M13mp18 for sequencing and detailed restriction site analyses.
Restriction endonuclease mapping clearly indicated that the small (1·2 kb) EcoRI fragment of CHB1 was similar to a previously described human muscle N-CAM cDNA, λ9·5 (Dickson et al. 1987), whose sequence spans coding regions corresponding to membrane-proximal and COOH-terminal domains of a nontransmembrane lipid-linked muscle N-CAM form (Fig. 1A). The large (1·6 kb) EcoRI fragment of CHB1 was shown by Southern blot analysis to be the source of the hybridization signal with the mouse brain probe 911, and DNA sequence analysis (see below) of the 1·6 kb fragment confirmed its identity with 5′ sequence of chick and mouse N-CAM cDNAs (Cunningham et al. 1987; Barthels et al. 1987). A Northern blot analysis using the entire CHB1 cDNA clone as hybridization probe identified characteristic N-CAM mRNA transcripts of 6·7, 5·2, 4·3 and 2·9 kb in human myotube RNA (Fig. 2). A subfragment probe encoding the COOH-terminal domain of the proposed protein hybridized to N-CAM RNAs of 5·2, 4·3 and 2·9 kb. This is consistent with previous studies (Dickson et al. 1987) indicating that these transcripts contain the necessary coding sequence to allow N-CAM attachment to membrane by a phosphatidylinositol linkage and with only RNA transcripts from muscle containing the MSD1 region.
CHB1 encodes N-CAM-125
The entire coding sequence of CHB1 (Fig. 3) was found to be highly homologous with mouse and chick cDNAs corresponding to brain N-CAM-120 (88 and 82 %, respectively) with the exception of the previously described muscle-specific domain MSD1 (Dickson et al. 1987). The major open reading frame of CHB1 predicts a 761 amino acid polypeptide of core Mr 83 × 10³. Significant discrepancies in the predicted Mr of N-CAM polypeptides and their observed migration by SDS-PAGE have been reported (Barthels et al. 1987; Cunningham et al. 1987). In this respect the difference between mouse brain N-CAM-120 (725 amino acids, predicted Mr 79 × 10³) and the putative muscle isoform N-CAM-125 is accounted for entirely by the 37 amino acids of the MSD1 domain. With the exception of this domain the percentage homology at the amino acid level increases to 92 % and 87 % for mouse and chick sequences, respectively.
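The predicted core Mr values quoted above can be checked roughly from residue counts alone. The following is a minimal sketch that assumes a mean residue mass of about 110 Da, a common rule of thumb and not a value taken from this study; the function and constant names are illustrative only.

```python
# Rough check of predicted core Mr from residue count.
# ASSUMPTION: mean residue mass of ~110 Da (rule of thumb, not from this study).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_core_mr(n_residues, residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Return an approximate polypeptide mass in Da for n_residues."""
    if n_residues < 0:
        raise ValueError("residue count must be non-negative")
    return n_residues * residue_mass


# Residue counts as stated in the text.
muscle = estimate_core_mr(761)  # CHB1 open reading frame
brain = estimate_core_mr(725)   # mouse brain N-CAM-120
msd1 = estimate_core_mr(37)     # MSD1 insert

for label, mass in (("muscle", muscle), ("brain", brain), ("MSD1", msd1)):
    print(f"{label}: ~{mass / 1000:.1f} x 10^3")
```

Under this assumption the estimates fall close to the predicted values of 83 and 79 × 10³, and well below the apparent sizes seen by SDS-PAGE.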
In the human N-CAM sequence, the selected translational initiation codon exhibits two nucleotides of the five-base initiation consensus of Kozak (1984) and is followed by a putative signal peptide of 19 predominantly hydrophobic amino acids (Fig. 1E) similar to that described for mouse brain N-CAM (Barthels et al. 1987). The subsequent 17 amino acids are identical to those found by direct protein sequencing of the NH2-terminus of rat brain N-CAM (Rougon & Marshak, 1986), with a predicted NH2-terminal leucine residue in the mature polypeptide. The central region of the molecule contains six consensus sites for N-linked glycosylation (Fig. 1B).
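Consensus sites for N-linked glycosylation are conventionally defined as Asn-X-Ser/Thr, where X is any residue except proline. A minimal sketch of how such sites might be located in a deduced protein sequence is given below; the example sequence is hypothetical and is not the CHB1 sequence, and the function name is illustrative.

```python
import re

# N-linked glycosylation sequon: Asn-X-Ser/Thr, where X is not Pro.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# HYPOTHETICAL sequence for illustration only; not the CHB1 sequence.
example = "MKNLSAANPTGGNNTSQ"
print(find_sequons(example))
```

A match indicates only a potential site; whether a given sequon is actually glycosylated cannot be inferred from sequence alone.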
At the COOH-terminus of the predicted human muscle N-CAM polypeptide a stretch of hydrophobic amino acids is found (Fig. 1E). This conforms to the requirements for covalent anchorage to the plasma membrane via a PI containing glycan moiety (Cross, 1987) as described for mouse brain N-CAM-120 (Barthels et al. 1987). In this respect, the N-CAM-125 isoform of mouse muscle has been clearly shown to be released from intact cells by PI-specific phospholipase C treatment (Moore et al. 1987). Given these structural features, its predicted Mr and similarity to N-CAM-120, it is thus likely that CHB1 carries the entire coding region for the human muscle N-CAM-125 isoform, directly comparable to brain N-CAM-120 but incorporating the MSD1 domain.
In vitro expression of CHB1 coding sequence
Further evidence that the determined sequence of CHB1 encodes a complete human skeletal muscle N-CAM polypeptide was obtained by translating mRNA from in vitro transcribed cDNA. The intact CHB1 cDNA was subcloned into the Gemini vector (Promega) via a partial EcoRI digest of the original phage clone. Capped sense and antisense RNA corresponding to CHB1 cDNA was synthesized by initiation from both SP6 and T7 promoters in the presence of cap analogue (Melton et al. 1984) and then translated in vitro using a rabbit reticulocyte lysate. A major anti-N-CAM reactive product was observed at 110 × 10³ Mr (Fig. 4). Several smaller specific translation products at 65 and 30 × 10³ Mr were also observed and may correspond to proteolytic fragments or aberrant initiation and/or termination reactions. Antisense RNA failed to generate immunoreactive N-CAM (not shown). In addition, translation in the presence of dog pancreas microsomes led to processing of the 110 × 10³ Mr primary product to yield a 122 × 10³ Mr form migrating by SDS-PAGE just below desialo N-CAM-125 (Fig. 4B). Neuraminidase treatment (not shown) of the 122 × 10³ Mr form resulted in no further mobility change, indicating that the level of sialylation is identical to the immunoprecipitated, metabolically labelled N-CAM forms in Fig. 4. Thus the CHB1 cDNA encodes a complete in-frame N-CAM polypeptide whose molecular weight corresponds to the core polypeptide of N-CAM-125 in skeletal muscle myotubes. The minor mobility difference probably arises from the failure of the in vitro system to attach a PI-tail or to perform tissue-specific glycosylation events.
In order to establish that the cloned cDNA contains the appropriate sequence to express an N-CAM isoform destined for attachment to the cell surface via a putative PI-linkage, the HindIII fragment from the pGEM vector was subcloned into a eukaryotic expression vector (Gunning et al. 1987). The resulting construct of the appropriate orientation was used to transfect monkey kidney (COS) cells by the calcium phosphate precipitation method. Two days after exposure to the DNA construct, cells were examined for transient, cell surface expression of N-CAM by immunofluorescence staining. Many cells exhibited punctate, surface-associated N-CAM immunostaining (Fig. 5) which was absent from control cells and abolished by omitting N-CAM antibody.
N-CAM-125 is an immunoglobulin gene superfamily member
On the basis of protein sequence and secondary structure predictions it has been proposed that N-CAM is a member of the immunoglobulin gene superfamily (Hemperly et al. 1986; Cunningham et al. 1987; Barthels et al. 1987; Williams, 1987; Hunkapiller & Hood, 1987). The structural features that lead to this assignment include a centrally positioned disulphide bridge within a domain of 100 amino acids and conserved residues at defined sites known to be important in maintaining the structure of the so-called antibody fold of immunoglobulin. The predicted sequence from the human skeletal muscle nontransmembrane N-CAM isoform described above exhibits similar structural features and thus can be categorized as an immunoglobulin gene superfamily member.
The primary amino acid sequence of N-CAM-125 contains five extracellular repeats with pairs of cysteine residues at positions 41–96, 139–189, 235–287, 329–385 and 426–479 (Fig. 1C) delineating five immunoglobulin homology units, numbered I-V, respectively (Fig. 1D). Alignment of homology unit segments around corresponding cysteine residues reveals four other invariant residues (Fig. 6). These include a D-X-A/G-X-Y motif immediately preceding the second cysteine in each homology unit and corresponding to a variable type domain structure. There are four residues conserved in four of the five domain segments shown and even greater numbers of conserved residues maintained in only three domains. The invariant residues are likely to be important in maintaining three-dimensional structure, since homology unit segments from other cell surface multidomain glycoproteins (Mostov et al. 1984; Salzer et al. 1987; Beauchemin et al. 1987), which lie outside the immune system but are assigned to the immunoglobulin gene superfamily, share many of these invariant residues when aligned (Fig. 6). Furthermore, positions where residues are not invariant have in many cases undergone conservative changes, thereby maintaining the chemical characteristic of a particular position and the overall secondary structure conformations characteristic of Ig-like domains.
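The cysteine positions given above allow the spacing within each homology unit to be tabulated, and the D-X-A/G-X-Y motif can be expressed as a simple pattern search. The sketch below is illustrative only: the cysteine positions are those stated in the text, whereas the sequence fragment and the function names are hypothetical.

```python
import re

# Cysteine pairs reported in the text for homology units I-V (1-based positions).
CYS_PAIRS = {
    "I": (41, 96),
    "II": (139, 189),
    "III": (235, 287),
    "IV": (329, 385),
    "V": (426, 479),
}


def cys_spacing(pairs):
    """Number of residues separating the two cysteines of each unit."""
    return {unit: second - first for unit, (first, second) in pairs.items()}


# D-X-A/G-X-Y motif; a lookahead is used so overlapping matches are not missed.
MOTIF = re.compile(r"(?=D.[AG].Y)")


def find_motif(protein):
    """Return 1-based start positions of D-X-A/G-X-Y in a sequence."""
    return [m.start() + 1 for m in MOTIF.finditer(protein.upper())]


print(cys_spacing(CYS_PAIRS))
# HYPOTHETICAL fragment for illustration only; not taken from CHB1.
print(find_motif("KAEDSGEYICVA"))
```

The spacings obtained from the stated positions lie between 50 and 56 residues for all five units.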
Discussion
N-CAM isoforms in neural and skeletal muscle tissue exist as transmembrane and peripheral PI-linked plasma membrane glycoproteins (Cunningham et al. 1987; Dickson et al. 1987). PI-specific phospholipase C digestion releases core glycopeptides of Mr 125 and 155 × 10³ from intact muscle cells (Moore et al. 1987), and a species of Mr 120 × 10³ from C6 glioma in tissue culture (He et al. 1986). While amino acid sequences exhibiting features compatible with PI-linkage have indeed been described from both tissues (Barthels et al. 1987; Dickson et al. 1987), direct demonstration of sequence correlates with authentic N-CAM polypeptide isoforms has been lacking. We describe here a complete cDNA coding sequence for a human muscle-specific N-CAM isoform. Transcription, translation and processing studies using in vitro cell-free systems clearly correlate the cloned coding sequence with an authentic 125 × 10³ Mr PI-linked and PI-specific phospholipase C releasable desialo-N-CAM isoform present in human and mouse myotube cultures.
While the deduced polypeptide encoded by CHB1 has a theoretical Mr of 83 × 10³, translation in vitro of synthetic mRNA produces an immunoprecipitable N-CAM core polypeptide migrating by SDS-PAGE with an Mr of 110 × 10³. This discrepancy between theoretical and experimentally determined Mr has been observed with brain N-CAMs from other species and is thought to reflect nonideal migration by SDS-PAGE. In the presence of dog pancreas microsomes, the primary in vitro translation product of CHB1 mRNA (110 × 10³ Mr) is processed to yield a non-sialylated core glyco- and/or lipo-peptide of Mr 122 × 10³, which migrates just below the authentic PI-linked desialo N-CAM-125 isoform of human skeletal muscle. The remaining 3 × 10³ difference between the Mr values of these species may reflect the failure of the in vitro processing system either to attach a PI-containing glycan tail or to carry out tissue-specific glycosylation events.
The ability of the CHB1 coding sequence to direct synthesis of a polypeptide destined for cell surface expression was verified by transient cellular expression using DNA-mediated transfection of monkey kidney cells. These results clearly indicate that the cloned sequence of Mr 83 × 10³ coding potential, which directs the synthesis of a 110 × 10³ Mr core polypeptide, is sufficient to confer cell surface attachment for this particular N-CAM isoform. Furthermore, direct sequence comparison of the deduced muscle N-CAM protein sequence with those predicted for chick and mouse brain N-CAM-120 (Cunningham et al. 1987; Barthels et al. 1987) indicates high colinear homology with the exception of the previously described MSD1 region (Dickson et al. 1987), indicating that CHB1 encodes the full protein sequence for a skeletal muscle N-CAM-125 isoform. However, unlike the other sequences, this was derived from a single full-length clone.
Selection between transmembrane and PI-linked N-CAMs has been shown to operate via alternative RNA processing of a single primary gene transcript (Goridis & Wille, 1988). In skeletal muscle, the extracellular MSD1 coding block is associated with mRNAs of 5·2, 4·3 and 2·9 kb, which all contain a sequence compatible with PI-linkage, and MSD1 is itself the product of a differential RNA splicing event operating in a tissue-specific and developmentally regulated manner (unpublished observation). The significance of this inserted sequence remains as yet undefined, but some similarity with the putative hinge region of immunoglobulin exists in terms of high proline, serine and threonine content (Putnam et al. 1979), and its presence is correlated with the expression of O-linked carbohydrate in appropriate N-CAM isoforms (F. S. Walsh and S. E. Moore, unpublished observations), a feature of the Ig hinge region.
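The comparison with a hinge region rests on amino acid composition, which is straightforward to quantify. As a minimal sketch, the fraction of proline, serine and threonine residues in a segment might be computed as follows; both example sequences are hypothetical and neither is the MSD1 sequence, and the function name is illustrative.

```python
def pst_fraction(protein):
    """Fraction of residues in a sequence that are Pro, Ser or Thr."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(aa) for aa in "PST") / len(seq)


# HYPOTHETICAL sequences for illustration only; neither is MSD1.
hinge_like = "PTSPSPTAPST"
globular_like = "KLEVGDAFIRW"

print(f"hinge-like:    {pst_fraction(hinge_like):.2f}")
print(f"globular-like: {pst_fraction(globular_like):.2f}")
```

Applied to a real insert and its flanking sequence, a markedly higher value for the insert would be consistent with, but would not prove, a hinge-like character.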
As in the case of brain N-CAM species from chick and mouse, homologous repeat segments in the predicted NH2-terminal portion of the present human muscle N-CAM-125 exhibit some homology with the Ig superfamily (Williams, 1987). Indeed, it has been proposed that cell-cell adhesion may be mediated by interactions between Ig-like domains (Hoffman & Edelman, 1983). Secondary structure predictions for N-CAM-125 show β-stranded regions containing cysteines and other conserved residues that might assume an Ig-like structure (Hemperly et al. 1986).
While studies of the expression of N-CAM during skeletal muscle development and following injury in the adult indicate precise regulatory mechanisms controlling quantity, isoform ratios and post-translational processing (Moore et al. 1987), assignment of biological function to the N-CAM family in myogenesis has remained hypothetical, and attempts to perturb myoblast fusion in vitro with antibodies to brain N-CAM have been negative (Rutishauser et al. 1983). The availability of full-length muscle N-CAM cDNAs offers the opportunity, via DNA-mediated stable transfection using sense and antisense constructs, to engineer mutant myoblast cell lines either overexpressing or underexpressing a particular N-CAM isoform. In this way, the cellular aspects of myoblast migration and fusion, and of myotube-neurone interaction, that involve N-CAM-mediated events may be identified.
In addition, using cells not normally expressing N-CAM, the effect of various N-CAM domains, e.g. MSD1, on adhesive properties can be examined directly by in vitro deletion and site-directed mutagenesis of cDNA constructs.
This work was supported by the Muscular Dystrophy Group of Great Britain, Wellcome Trust and Brain Research Trust. F. S. Walsh is a Wellcome Trust Senior Lecturer.