Neural cell adhesion molecules (N-CAMs) are a family of cell surface sialoglycoproteins encoded by a single-copy gene. A full-length cDNA clone that encodes a nontransmembrane phosphatidylinositol (PI)-linked N-CAM of Mr 125 × 10³ has been isolated from a human skeletal muscle cDNA library. The deduced protein sequence comprises 761 amino acids and is highly homologous to the N-CAM isoform in brain of Mr 120 × 10³. The size difference between the 125 × 10³ Mr skeletal muscle form and the 120 × 10³ Mr N-CAM form from brain is accounted for by the insertion of a block of 37 amino acids, called MSD1, in the extracellular domain of the muscle form. Transient expression of the human cDNA in COS cells results in cell surface N-CAM expression via a putative covalent attachment to PI-containing phospholipid. Linked in vitro transcription and translation experiments followed by immunoprecipitation with anti-N-CAM antibodies demonstrate that the full-length clone of 761 amino acid coding potential produces a core polypeptide of Mr 110 × 10³, which is processed by microsomal membranes to yield a 122 × 10³ Mr species. Taken together, these results demonstrate that the cloned cDNA sequence encodes a lipid-linked, PI-specific phospholipase C-releasable surface isoform of N-CAM with a core glycopeptide molecular weight corresponding to the authentic muscle 125 × 10³ Mr N-CAM isoform. This is the first direct correlation of cDNA and deduced protein sequence with a known PI-linked N-CAM isoform from skeletal muscle.

Specific homotypic and heterotypic cell-cell interactions occurring during embryogenesis and tissue formation are modulated by the temporal and spatial patterns of expression of cell adhesion molecules (CAMs) (Edelman, 1985). Two main adhesive systems operating either dependently or independently of calcium have been described (Edelman, 1986; Takeichi, 1987). Of the calcium-independent CAMs the best characterized is the neural cell adhesion molecule (N-CAM) (Edelman, 1986; Rutishauser & Goridis, 1986; Nybroe et al. 1988; Goridis & Wille, 1988; Walsh, 1988). N-CAM is a cell surface sialoglycoprotein that is involved in both homotypic and heterotypic adhesive interactions via a homophilic binding mechanism (Edelman et al. 1987). Perturbation experiments with anti-N-CAM antibodies have implicated this molecule in a variety of events including migration and guidance of axons, neural tissue formation and nerve-muscle interactions (Edelman, 1985). It is known that the N-CAM gene is present as a single copy and that the diversity of N-CAM RNAs and isoforms found during development and in specific cell types can be accounted for by specific patterns of alternative splicing and polyadenylation site selection (Cunningham et al. 1987; Owens et al. 1987; Barthels et al. 1987; Santoni et al. 1987; Dickson et al. 1987). In addition to changes in amino acid sequence, N-CAM is also subject to a variety of cell- and isoform-specific post-translational modifications including glycosylation, phosphorylation and sulphation (Nybroe et al. 1988).

Two categories of N-CAM forms have been described in neural and skeletal muscle tissues to date. These are, first, those with transmembrane topology of Mrs 180 and 140 × 10³ in brain, and 145 × 10³ in muscle; and second, nontransmembrane species of Mr 120 × 10³ in brain and 125 and 155 × 10³ in muscle, all of which are anchored to the external surface of the plasma membrane through covalent linkage to phosphatidylinositol (PI) via a specific glycan (Cunningham et al. 1987; He et al. 1987; Nybroe et al. 1988). The mechanism generating these two different membrane-associated groups of N-CAM isotypes has been clearly shown to be based on alternative RNA splicing and differential polyadenylation site selection (Cunningham et al. 1987; Goridis & Wille, 1988). In addition, a lipid-linked isoform(s) in skeletal muscle containing an additional tissue-specific domain (MSD1) in its extracellular region has been predicted by cDNA sequencing studies (Dickson et al. 1987). This insertion occurs at a recognized splice junction according to the chick (Owens et al. 1987) and human (J. Thompson, unpublished observations) N-CAM gene structure and represents the first alternative splicing event, so far detected, within the extracellular domain that is not associated with membrane attachment. Indeed, the difference in molecular size between the brain N-CAM-120 and muscle N-CAM-125 forms is likely to reflect, in part, expression of this muscle-specific domain.

While the complete cDNA sequence corresponding to the protein coding region for brain N-CAM-120 is available (Barthels et al. 1987), the full sequence corresponding to its 125 × 10³ Mr muscle counterpart has been lacking. In the present study, a cDNA containing the complete coding segment of a muscle-specific N-CAM isoform was isolated and sequenced. Transcription, translation and processing of the cloned coding sequence in vitro, and cell transfection, result in the synthesis of a cell surface N-CAM glycopeptide of Mr 122 × 10³. This corresponds in size to an isoform observed in skeletal muscle cells both in vitro and in vivo that can be released by PI-specific phospholipase C (Moore et al. 1987).

Cloning of human skeletal muscle N-CAM cDNAs

32P-labelled cDNA probe 911 was used to isolate additional human fetal muscle N-CAM cDNA clones from a library in λgt11. The library was prepared from poly(A)+ mRNA isolated from midfusion human muscle cultures (Dickson et al. 1986) by standard procedures (Huynh et al. 1985). Hybridizations were performed as before (Dickson et al. 1987) and positive clones were plaque purified by recloning at least three times.

DNA sequence analysis

Recombinant DNA from λ phage was prepared by standard methods, and EcoRI cDNA fragments were subcloned into the EcoRI site of M13mp18 and recombinants detected by using IPTG and X-gal. M13 recombinants were selected in both orientations by restriction mapping. Replicative form plasmid DNA was prepared by alkaline lysis and purified by two rounds of CsCl equilibrium density gradient centrifugation (Maniatis et al. 1982). A series of overlapping subclones was generated by XbaI/SphI double digestion of the replicative form plasmid DNA, with deletions generated using exonuclease III and S1 nuclease (Henikoff, 1984). Overlapping subclones were sequenced by the dideoxy chain termination method (Sanger et al. 1977). Full cDNAs were sequenced at least twice on opposing strands. Subclone sequences were read manually, aligned and merged to generate full-length clones using the Microgenie DNA sequence analysis software package (Queen & Korn, 1984).

Northern and Southern analysis

Poly(A)+ mRNA (2 μg), purified from tissue or cultured cells as described before (Dickson et al. 1986), was fractionated by glyoxal agarose electrophoresis (Maniatis et al. 1982) and blotted onto Genescreen as described by the manufacturer. Hybridizations using 32P-labelled whole plasmid or gel-purified insert were performed as described previously (Dickson et al. 1987).

Cell-free transcription-linked translation

The two EcoRI fragments of CHB1 were isolated by partial digestion and subcloned into the EcoRI site of the bacterial expression vector pGEM (Promega Systems, Liverpool, UK). The continuity and relative orientation of the recombinant DNA were ascertained by restriction mapping. Following linearization of the recombinant plasmid, transcription was initiated with either T7 or SP6 RNA polymerase depending on the orientation of the CHB1 insert. Conditions of the transcription reaction were as described by the manufacturer, with template DNA at 50 μg ml⁻¹, 1 unit of RNasin and a reaction time of 60 min at 40 °C. Synthetic RNA was separated from the DNA template and translated in a rabbit reticulocyte lysate in vitro translation system (NEN) with [35S]methionine as the radiolabel. Optimal incorporation into TCA-precipitable material was achieved by titration of both Mg2+ and K+ concentrations, with optimal concentrations of 0·7 mM and 52 mM, respectively. Synthesized products were separated on polyacrylamide gels with or without immunoprecipitation with anti-N-CAM (Moore et al. 1987). Protein bands were identified by fluorography at −70 °C.

DNA transfection studies

A HindIII fragment corresponding to the entire large HindIII portion of CHB1 plus pGEM polylinker regions was ligated into the unique HindIII site of p4.4.4 (Gunning et al. 1987), which follows the β-actin promoter. Sense-orientated constructs were identified by restriction mapping and purified plasmid DNA was used to transfect monkey kidney cells via the calcium phosphate precipitation method. Transient expression was examined after 48 h using indirect immunofluorescence with rabbit antibodies to N-CAM (Moore et al. 1987).

N-CAM cDNA isolation and sequencing

N-CAM clones were isolated from a λgt11 cDNA library constructed from human fetal muscle poly(A)+ mRNA (Dickson et al. 1987). The cDNA library was screened by plaque hybridization with a mouse brain cDNA probe, 911, which was isolated by screening a young postnatal mouse brain cDNA library with an oligonucleotide probe to the N-CAM N-terminal protein sequence (J.-C. Chaix & C. Goridis, unpublished results). The 911 probe contained both N-terminal-coding and 5′ untranslated sequence. The largest cDNA clone isolated, CHB1, was composed of two EcoRI fragments of 1·6 and 1·2 kb, and these were subcloned in both orientations into M13mp18 for sequencing and detailed restriction site analyses.

Restriction endonuclease mapping clearly indicated that the small (1·2 kb) EcoRI fragment of CHB1 was similar to a previously described human muscle N-CAM cDNA, λ9·5 (Dickson et al. 1987), whose sequence spans coding regions corresponding to membrane-proximal and COOH-terminal domains of a nontransmembrane lipid-linked muscle N-CAM form (Fig. 1A). The large (1·6 kb) EcoRI fragment of CHB1 was shown by Southern blot analysis to be the source of the hybridization signal with the mouse brain probe 911, and DNA sequence analysis (see below) of the 1·6 kb fragment confirmed its identity with 5′ sequence of chick and mouse N-CAM cDNAs (Cunningham et al. 1987; Barthels et al. 1987). A Northern blot analysis using the entire CHB1 cDNA clone as hybridization probe identifies characteristic N-CAM mRNA transcripts of 6·7, 5·2, 4·3 and 2·9 kb in human myotube RNA (Fig. 2). A subfragment probe encoding the COOH-terminal domain of the proposed protein hybridized to N-CAM RNAs of 5·2, 4·3 and 2·9 kb. This is consistent with previous studies (Dickson et al. 1987) indicating that these transcripts contain the necessary coding sequence to allow N-CAM attachment to membrane by a phosphatidylinositol linkage and with only RNA transcripts from muscle containing the MSD1 region.

Fig. 1.

Structure of human skeletal muscle N-CAM cDNA clone CHB1 and its predicted protein. (A) Restriction map. Restriction endonuclease cleavage sites are shown for ApaLI (A), EcoRI (E), HindIII (H), KpnI (K), NheI (N), PstI (P), SacI (S). The cDNA is orientated as indicated; also shown are the initiation ATG and termination TAG codons. The scale bar shown is 300 bp or 100 amino acids. (B) N-linked carbohydrate attachment sites. Six consensus sites for N-linked glycosylation within the central region of the derived protein sequence are shown by vertical lines. (C) Position of cysteine residues. The eleven cysteine residues are shown by vertical lines. The ten N-proximal cysteines are implicated in five intramolecular disulphide bridges with the remaining cysteine being the COOH-terminal residue of the native protein. (D) Domain structure of human skeletal muscle N-CAM-125. Two hydrophobic domains (solid) were identified from hydropathy analysis (see below), with the N-terminal being the signal peptide and the COOH-terminal region conforming to the requirements for membrane attachment via PI linkage. The five immunoglobulin homology units delineated by the typical disulphide bridges are numbered I-V and the muscle-specific domain (MSD1) is shown (vertical lines). (E) Hydropathy analysis. Hydropathy analysis was performed using the algorithm of Kyte & Doolittle (1982) with hydrophobic and hydrophilic residues above and below the line, respectively. The distinctive hydrophobic signal peptide and COOH-terminal regions are clearly within the hydrophobic region.

Fig. 2.

N-CAM transcripts identified by CHB1. Poly(A)+ mRNA from human fetal myotube cultures was subjected to Northern blot analysis and hybridized with the whole CHB1 cDNA probe (lane 1) or with a Henikoff deletion subprobe encoding the phosphatidylinositol linkage and some 3′ untranslated sequence (lane 2). The size classes of the RNAs are indicated on the left.

CHB1 encodes N-CAM-125

The entire coding sequence of CHB1 (Fig. 3) was found to be highly homologous with mouse and chick cDNAs corresponding to brain N-CAM-120 (88 and 82 %, respectively) with the exception of the previously described muscle-specific domain MSD1 (Dickson et al. 1987). The major open reading frame of CHB1 predicts a 761 amino acid polypeptide of core Mr 83 × 10³. Significant discrepancies between the predicted Mr of N-CAM polypeptides and their observed migration by SDS-PAGE have been reported (Barthels et al. 1987; Cunningham et al. 1987). In this respect, the difference between mouse brain N-CAM-120 (725 amino acids, predicted Mr 79 × 10³) and the putative muscle isoform N-CAM-125 is accounted for entirely by the 37 amino acids of the MSD1 domain. With the exception of this domain, the percentage homology at the amino acid level increases to 92 % and 87 % for mouse and chick sequences, respectively.
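
The relation between chain length and predicted core Mr quoted above can be checked with a back-of-envelope calculation. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not stated in the text, and the function name is purely illustrative. Under that assumption the 761 and 725 residue chains come out within about 1 × 10³ of the quoted values, and the 37 residue MSD1 block contributes roughly 4 × 10³.

```python
# Assumption: rule-of-thumb mean mass of an amino acid residue in a protein.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_core_mr(n_residues, residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Return an approximate unmodified polypeptide Mr (Da) from its length."""
    return n_residues * residue_mass


# Chain lengths taken from the text.
for label, length in [
    ("human muscle CHB1 open reading frame", 761),
    ("mouse brain N-CAM-120", 725),
    ("MSD1 insert", 37),
]:
    print(f"{label}: ~{estimate_core_mr(length) / 1000:.1f} x 10^3")
```

An exact value would require summing the masses of the individual residues in the sequence of Fig. 3, so this estimate is only a plausibility check.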

Fig. 3.

Nucleotide and derived amino acid sequence of the full-length N-CAM coding region for a nontransmembrane isoform from human skeletal muscle. Nucleotides are numbered on the right from the initiation ATG codon and amino acids are numbered on the left from the first methionine residue. The hydrophobic signal peptide is underlined (amino acids 1–19), the ten cysteine residues are circled, probable N-linked carbohydrate attachment sites are shown as black dots, the muscle-specific sequence (MSD1) (amino acids 598–635) is shown by double underlines and the COOH-terminal hydrophobic tail (amino acids 742–761) is boxed.

In the human N-CAM sequence, the selected translational initiation codon exhibits two nucleotides of the five-base initiation consensus of Kozak (1984) and is followed by a putative signal peptide of 19 predominantly hydrophobic amino acids (Fig. 1E) similar to that described for mouse brain N-CAM (Barthels et al. 1987). The subsequent 17 amino acids are identical to those found by direct protein sequencing of the NH2-terminus of rat brain N-CAM (Rougon & Marshak, 1986), with a predicted NH2-terminal leucine residue in the mature polypeptide. The central region of the molecule contains six consensus sites for N-linked glycosylation (Fig. 1B).
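
Consensus sites for N-linked glycosylation are conventionally defined as the sequon Asn-X-Ser/Thr, where X is any residue except proline. As a minimal sketch of how such sites could be located in a deduced protein sequence, the following scan uses a short hypothetical peptide rather than the CHB1 sequence; the function name is an illustrative choice, not part of the original analysis.

```python
import re

# N-X-S/T sequon, X being any residue except proline. The lookahead keeps the
# match to the Asn alone so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glycosylation_sites(protein):
    """Return 1-based positions of Asn residues lying in N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical test peptide (not taken from CHB1): N3 and N13 are in sequons,
# whereas N8 is followed by proline and is therefore excluded.
toy_peptide = "MKNGSAANPTLLNITQ"
print(find_n_glycosylation_sites(toy_peptide))
```

A sequon only marks a potential attachment site; whether it is used in the cell has to be established experimentally.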

At the COOH-terminus of the predicted human muscle N-CAM polypeptide a stretch of hydrophobic amino acids is found (Fig. 1E). This conforms to the requirements for covalent anchorage to the plasma membrane via a PI-containing glycan moiety (Cross, 1987) as described for mouse brain N-CAM-120 (Barthels et al. 1987). In this respect, the N-CAM-125 isoform of mouse muscle has been clearly shown to be released from intact cells by PI-specific phospholipase C treatment (Moore et al. 1987). Given these structural features, its predicted Mr and its similarity to N-CAM-120, it is thus likely that CHB1 carries the entire coding region for the human muscle N-CAM-125 isoform, directly comparable to brain N-CAM-120 but incorporating the MSD1 domain.
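
The hydropathy analysis referred to here (Fig. 1E) averages the Kyte & Doolittle (1982) hydropathy index over a window that slides along the sequence. The following is a minimal sketch of that calculation; the window length of 19 residues and the test peptide are assumptions made for illustration, since the window used for Fig. 1E is not stated.

```python
# Kyte & Doolittle (1982) hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Return the mean hydropathy of every full window along the sequence.

    The window length is an assumption; shorter sequences give an empty list.
    """
    values = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    if len(values) < window:
        return []
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


# Hypothetical peptide: a polar stretch followed by a hydrophobic tail.
toy_peptide = "DEKRSTNQDEKRSTNQ" + "LLVVAILLFAVLLAIVGLL"
profile = hydropathy_profile(toy_peptide)
print(f"first window: {profile[0]:.2f}, last window: {profile[-1]:.2f}")
```

Positive window means indicate hydrophobic stretches such as a signal peptide or a COOH-terminal tail, and negative means indicate hydrophilic regions.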

In vitro expression of CHB1 coding sequence

Further evidence to suggest that the determined sequence of CHB1 encodes a complete human skeletal muscle N-CAM polypeptide was obtained by translating mRNA from in vitro transcribed cDNA. The intact CHB1 cDNA was subcloned into the Gemini vector (Promega) via a partial EcoRI digest of the original phage clone. Capped sense and antisense RNA corresponding to CHB1 cDNA was synthesized by initiation from both SP6 and T7 promoters in the presence of cap analogue (Melton et al. 1984) and then translated in vitro using a rabbit reticulocyte lysate. A major anti-N-CAM reactive product was observed at 110 × 10³ Mr (Fig. 4). Several smaller specific translation products at 65 and 30 × 10³ Mr were also observed and may correspond to proteolytic fragments or aberrant initiation and/or termination reactions. Antisense RNA failed to generate immunoreactive N-CAM (not shown). In addition, translation in the presence of dog pancreas microsomes led to processing of the 110 × 10³ Mr primary product to yield a 122 × 10³ Mr form migrating by SDS-PAGE just below desialo N-CAM-125 (Fig. 4B). Neuraminidase treatment (not shown) of the 122 × 10³ Mr form resulted in no further mobility change, indicating that the level of sialylation is identical to that of the immunoprecipitated, metabolically labelled N-CAM forms in Fig. 4. Thus the CHB1 cDNA encodes a complete in-frame N-CAM polypeptide whose molecular weight corresponds to the core polypeptide of N-CAM-125 in skeletal muscle myotubes. The minor mobility difference probably arises from the failure of the in vitro system to attach a PI tail or to perform tissue-specific glycosylation events.

Fig. 4.

In vitro expression of human skeletal muscle N-CAM-125. The two EcoRI fragments of clone CHB1 were subcloned into the EcoRI site of the Gemini vector (Promega). Synthetic RNA was prepared by transcription with T7 RNA polymerase using the m7GpppG cap analogue to initiate synthesis. RNA was isolated and translated in a rabbit reticulocyte lysate. (A) Lane 1 shows mRNA-independent incorporation, lane 2 total incorporation and lane 3 immunoprecipitable products from lane 2 with antibody to human N-CAM. Numbers on the right relate to the apparent molecular weights of standards. (B) Translation was performed in the absence (lane 1) or presence (lane 2) of dog pancreas microsomes. Immunoprecipitated desialo N-CAM isoforms from mouse myotube cultures metabolically labelled with [35S]methionine are shown in lane 3. Molecular weight markers are shown on the left and muscle N-CAM isoform sizes on the right.

In order to establish that the cloned cDNA contains the appropriate sequence to express an N-CAM isoform destined for attachment to the cell surface via a putative PI-linkage, the HindIII fragment from the pGEM vector was subcloned into a eukaryotic expression vector (Gunning et al. 1987). The resulting construct of the appropriate orientation was used to transfect monkey kidney (COS) cells by the calcium phosphate precipitation method. Two days after exposure to the DNA construct, cells were examined for transient, cell surface expression of N-CAM by immunofluorescence staining. Many cells exhibited punctate, surface-associated N-CAM immunostaining (Fig. 5) which was absent from control cells and abolished by omitting N-CAM antibody.

Fig. 5.

Transient cell surface expression of CHB1 coding sequence in monkey kidney (COS) cells. Monkey kidney (COS) cells were transfected with the full-length coding sequence of a non-transmembrane, putative PI-linked human skeletal muscle N-CAM-125 isoform under the control of the β-actin promoter. Cells (phase-contrast view, A) were examined for transient cell surface N-CAM expression by indirect immunofluorescence (B) 48 h after transfection using rabbit antibodies to N-CAM. Bar, 25 μm.

N-CAM-125 is an immunoglobulin gene superfamily member

On the basis of protein sequence and secondary structure predictions it has been proposed that N-CAM is a member of the immunoglobulin gene superfamily (Hemperly et al. 1986; Cunningham et al. 1987; Barthels et al. 1987; Williams, 1987; Hunkapiller & Hood, 1987). The structural features that lead to this assignment include a centrally positioned disulphide bridge within a domain of 100 amino acids and conserved residues at defined sites known to be important in maintaining the structure of the so-called antibody fold of immunoglobulin. The predicted sequence from the human skeletal muscle nontransmembrane N-CAM isoform described above exhibits similar structural features and thus can be categorized as an immunoglobulin gene superfamily member.

The primary amino acid sequence of N-CAM-125 contains five extracellular repeats with pairs of cysteine residues at positions 41–96, 139–189, 235–287, 329–385 and 426–479 (Fig. 1C) delineating five immunoglobulin homology units, numbered I-V, respectively (Fig. 1D). Alignment of homology unit segments around corresponding cysteine residues reveals four other invariant residues (Fig. 6). These include a D-X-A/G-X-Y motif immediately preceding the second cysteine in each homology unit and corresponding to a variable type domain structure. There are four residues conserved in four of the five domain segments shown and even greater numbers of conserved residues maintained in only three domains. The invariant residues are likely to be important in maintaining three-dimensional structure, since homology unit segments from other cell surface multidomain glycoproteins (Mostov et al. 1984; Salzer et al. 1987; Beauchemin et al. 1987), which lie outside the immune system but are assigned to the immunoglobulin gene superfamily, share many of these invariant residues when aligned (Fig. 6). Furthermore, positions where residues are not invariant have in many cases undergone conservative changes, thereby maintaining the chemical characteristic of a particular position and the overall secondary structure conformations characteristic of Ig-like domains.
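
The regular spacing of the paired cysteines can be made explicit with a short calculation. The sketch below takes the residue positions quoted above and reports the number of residues separating the two cysteines of each homology unit; the variable and function names are illustrative only.

```python
# Cysteine pair positions for homology units I-V, as quoted in the text.
CYSTEINE_PAIRS = {
    "I": (41, 96),
    "II": (139, 189),
    "III": (235, 287),
    "IV": (329, 385),
    "V": (426, 479),
}


def cysteine_spacing(pairs):
    """Return the sequence separation between the paired cysteines of each unit."""
    return {unit: second - first for unit, (first, second) in pairs.items()}


spacing = cysteine_spacing(CYSTEINE_PAIRS)
for unit, separation in spacing.items():
    print(f"homology unit {unit}: {separation} residues between cysteines")
print(f"range: {min(spacing.values())}-{max(spacing.values())} residues")
```

With these positions, the separation falls between 50 and 56 residues in every unit, which illustrates the repeating organization of the five homology units.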

Fig. 6.

Alignment of immunoglobulin homology units of N-CAM with other multidomain structures outside the immune system. Human skeletal muscle N-CAM immunoglobulin homology units (I-V) are aligned with domains II-VII of the carcinoembryonic antigen (CEA), domains II-V of myelin associated glycoprotein (MAG) and domains I-V of the polyIg receptor protein (POLYIgR), a typical variable domain structure (Williams, 1987). The single letter amino acid code is used and residues in N-CAM are boxed where there are at least either three identical amino acids or conserved changes (Schwartz & Dayhoff, 1978). Residues in the domains from the other proteins are only boxed when they match boxed residues in N-CAM. Gaps were introduced where necessary so as to align the cysteine pair of each domain. The immunoglobulin homology domain I of MAG was not included in the analysis, since it exhibited fewer matches than MAG V.

N-CAM isoforms in neural and skeletal muscle tissue exist as transmembrane and peripheral PI-linked plasma membrane glycoproteins (Cunningham et al. 1987; Dickson et al. 1987). PI-specific phospholipase C digestion releases core glycopeptides of Mr 125 and 155 × 10³ from intact muscle cells (Moore et al. 1987), and a species of Mr 120 × 10³ from C6 glioma in tissue culture (He et al. 1986). While amino acid sequences exhibiting features compatible with PI-linkage have indeed been described from both tissues (Barthels et al. 1987; Dickson et al. 1987), direct demonstration of sequence correlates with authentic N-CAM polypeptide isoforms has been lacking. We describe here a complete cDNA coding sequence for a human muscle-specific N-CAM isoform. Transcription, translation and processing studies using in vitro cell-free systems clearly correlate the cloned coding sequence with an authentic 125 × 10³ Mr PI-linked and PI-specific phospholipase C-releasable desialo-N-CAM isoform present in human and mouse myotube cultures.

While the deduced polypeptide encoded by CHB1 has a theoretical Mr of 83 × 10³, translation in vitro of synthetic mRNA produces an immunoprecipitable N-CAM core polypeptide migrating by SDS-PAGE with an Mr of 110 × 10³. This discrepancy between theoretical and experimentally determined Mr has been observed with brain N-CAMs from other species and is thought to reflect nonideal migration by SDS-PAGE. In the presence of dog pancreas microsomes, the primary in vitro translation product of CHB1 mRNA (110 × 10³ Mr) is processed to yield a non-sialylated core glyco- and/or lipo-peptide of Mr 122 × 10³ which migrates just below the authentic PI-linked desialo N-CAM-125 isoform of human skeletal muscle. The remaining 3 × 10³ difference between the Mrs of these species may reflect the failure of the in vitro processing system either to attach a PI-containing glycan tail or to carry out tissue-specific glycosylation events.

The ability of the CHB1 coding sequence to direct synthesis of a polypeptide destined for cell surface expression was verified by transient cellular expression using DNA-mediated transfection of monkey kidney cells. These results clearly indicate that the cloned sequence of Mr 83 × 10³ coding potential, which directs the synthesis of a 110 × 10³ Mr core polypeptide, is sufficient to confer cell surface attachment for this particular N-CAM isoform. Furthermore, direct comparison of the deduced muscle N-CAM protein sequence with those predicted for chick and mouse brain N-CAM-120 (Cunningham et al. 1987; Barthels et al. 1987) indicates high colinear homology with the exception of the previously described MSD1 region (Dickson et al. 1987), indicating that CHB1 encodes the full protein sequence for a skeletal muscle N-CAM-125 isoform. However, unlike the other sequences, this was derived from a single full-length clone.

Selection between transmembrane and PI-linked N-CAMs has been shown to operate via alternative RNA processing of a single primary gene transcript (Goridis & Wille, 1988). In skeletal muscle, the extracellular MSD1 coding block is associated with mRNAs of 5·2, 4·3 and 2·9 kb, which all contain a sequence compatible with PI-linkage, and MSD1 is itself the product of a differential RNA splicing event operating in a tissue-specific and developmentally regulated manner (unpublished observation). The significance of this inserted sequence remains as yet undefined, but it shows some similarity with the putative hinge region of immunoglobulin in terms of high proline, serine and threonine content (Putnam et al. 1979), and its presence is correlated with the expression of O-linked carbohydrate in appropriate N-CAM isoforms (F. S. Walsh and S. E. Moore, unpublished observations), a feature of the Ig hinge region.
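
The comparison with the immunoglobulin hinge region rests on amino acid composition, which is simple to quantify. As a hedged illustration, the function below computes the combined fraction of proline, serine and threonine in a segment; the function name and the example peptide are hypothetical and are not taken from the MSD1 sequence.

```python
def pst_fraction(segment):
    """Return the combined fraction of Pro, Ser and Thr residues in a segment."""
    segment = segment.upper()
    if not segment:
        return 0.0
    return sum(segment.count(residue) for residue in "PST") / len(segment)


# Hypothetical hinge-like peptide used only to exercise the function.
example_segment = "TPSPTSPAKTPSE"
print(f"Pro + Ser + Thr fraction: {pst_fraction(example_segment):.2f}")
```

Applying the same calculation to residues 598–635 of Fig. 3 and to the remainder of the protein would give a direct measure of how far MSD1 is enriched in these residues.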

As in the case of brain N-CAM species from chick and mouse, homologous repeat segments in the predicted NH2-terminal portion of the present human muscle N-CAM-125 exhibit some homology with the Ig superfamily (Williams, 1987). Indeed, it has been proposed that cell-cell adhesion may be mediated by interactions between Ig-like domains (Hoffman & Edelman, 1983). Secondary structure predictions for N-CAM-125 show β-stranded regions containing cysteines and other conserved residues that might assume an Ig-like structure (Hemperly et al. 1986).

While studies of the expression of N-CAM during skeletal muscle development and following injury in the adult indicate precise regulatory mechanisms controlling quantity, isoform ratios and post-translational processing (Moore et al. 1987), assignment of biological function to the N-CAM family in myogenesis has remained hypothetical, and attempts to perturb myoblast fusion in vitro using antibodies to brain N-CAM have been negative (Rutishauser et al. 1983). The availability of full-length muscle N-CAM cDNAs offers the opportunity, via DNA-mediated stable transfection using sense and antisense constructs, to engineer mutant myoblast cell lines either overexpressing or underexpressing a particular N-CAM isoform. In this way, the cellular aspects of myoblast migration and fusion, and of myotube-neurone interaction, which involve N-CAM-mediated events may be identified.

In addition, using cells not normally expressing N-CAM, the effect of various N-CAM domains, e.g. MSD1, on adhesive properties can be examined directly by in vitro deletion and site-directed mutagenesis of cDNA constructs.

This work was supported by the Muscular Dystrophy Group of Great Britain, the Wellcome Trust and the Brain Research Trust. F. S. Walsh is a Wellcome Trust Senior Lecturer.

Barthels, D., Santoni, M.-J., Wille, W., Ruddert, C., Chaix, J.-C., Hirsch, M.-R., Fontecilla-Camps, J. C. & Goridis, C. (1987). Isolation and nucleotide sequence of mouse N-CAM cDNA that codes for a Mr 79000 polypeptide without a membrane spanning region. EMBO J. 6, 907–914.

Beauchemin, N., Benchimol, S., Cournoyer, D., Fuks, A. & Stanners, C. P. (1987). Isolation and characterisation of full-length functional cDNA clones for human carcinoembryonic antigen. Molec. Cell Biol. 7, 3221–3230.

Cross, G. A. M. (1987). Eukaryotic protein modification and membrane attachment via phosphatidylinositol. Cell 48, 179–181.

Cunningham, B. A., Hemperly, J. J., Murray, B. A., Prediger, E. A., Brackenbury, R. & Edelman, G. M. (1987). Neural cell adhesion molecule: structure, immunoglobulin-like domain, cell surface modulation and alternative RNA splicing. Science 236, 799–806.

Dickson, G., Gower, H. J., Barton, C. H., Prentice, H. M., Elsom, V., Moore, S. E., Cox, R. D., Quinn, C. A., Putt, W. & Walsh, F. S. (1987). Human muscle neural cell adhesion molecule (N-CAM): identification of a muscle specific sequence in the extracellular domain. Cell 50, 1119–1130.

Dickson, G., Prentice, H., Julien, J.-P., Ferrari, G., Leon, A. & Walsh, F. S. (1986). Nerve growth factor activates Thy-1 and neurofilament gene transcription in rat PC12 cells. EMBO J. 5, 3449–3453.

Edelman, G. M. (1985). Cell adhesion and the molecular process of morphogenesis. A. Rev. Biochem. 54, 135–169.

Edelman, G. M. (1986). Cell adhesion molecules in the regulation of animal form and pattern formation. A. Rev. Cell Biol. 2, 81–116.

Edelman, G. M., Murray, B. A., Mege, R.-M., Cunningham, B. A. & Gallin, W. J. (1987). Cellular expression of liver and neural cell adhesion molecules after transfection with their cDNAs results in specific cell-cell binding. Proc. natn. Acad. Sci. U.S.A. 84, 8502–8506.

Goridis, C. & Wille, W. (1988). The three size classes of mouse N-CAM proteins arise from a single gene by a combination of alternative splicing and use of different polyadenylation sites. Neurochem. Int. 12, 269–272.

Gunning, P., Leavitt, J., Muscat, G., Ng, S.-Y. & Kedes, L. (1987). A human β-actin expression vector system directs high-level accumulation of antisense transcripts. Proc. natn. Acad. Sci. U.S.A. 84, 4831–4835.

He, H. T., Barbet, J., Chaix, J. C. & Goridis, C. (1986). Phosphatidylinositol is involved in the membrane attachment of N-CAM-120, the smallest component of the neural cell adhesion molecule. EMBO J. 5, 2489–2494.

He, H. T., Finne, J. & Goridis, C. (1987). Biosynthesis, membrane association, and release of N-CAM-120, a phosphatidylinositol-linked form of the neural cell adhesion molecule. J. Cell Biol. 105, 2489–2500.

Hemperly, J. J., Murray, B. A. & Edelman, G. M. (1986). Sequence of a cDNA clone encoding the polysialic acid-rich and cytoplasmic domains of the neural cell adhesion molecule N-CAM. Proc. natn. Acad. Sci. U.S.A. 83, 3037–3041.

Henikoff, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28, 351–359.

Hoffman, S. & Edelman, G. M. (1983). Kinetics of homophilic binding by embryonic and adult forms of the neural cell adhesion molecule. Proc. natn. Acad. Sci. U.S.A. 80, 5762–5766.

Hunkapiller, T. & Hood, L. (1986). The growing immunoglobulin gene superfamily. Nature, Lond. 323, 15–18.

Huynh, T. V., Young, R. A. & Davis, R. W. (1985). Construction and screening of cDNA libraries in λgt10 and λgt11. In DNA Cloning, vol. 1 (ed. D. M. Glover), pp. 49–78. New York: IRL Press.

Kozak, M. (1984). Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs. Nucleic Acids Res. 12, 857–872.

Kyte, J. & Doolittle, R. F. (1982). A simple method for displaying the hydrophobic character of a protein. J. molec. Biol. 157, 105–132.

Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982). Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory.

Melton, D. A., Krieg, P. A., Rebagliati, M. R., Maniatis, T., Zinn, K. & Green, M. R. (1984). Efficient in vitro synthesis of biologically active RNA and RNA hybridization probes from plasmids containing a bacteriophage SP6 promoter. Nucleic Acids Res. 12, 7035–7056.

Moore, S. E., Thompson, J., Kirkness, V., Dickson, J. G. & Walsh, F. S. (1987). Skeletal muscle neural cell adhesion molecule (NCAM): changes in protein and mRNA species during myogenesis of muscle cell lines. J. Cell Biol. 105, 1377–1386.

Mostov, K. E., Friedlander, M. & Blobel, G. (1984). The receptor for transepithelial transport of IgA and IgM contains multiple immunoglobulin-like domains. Nature, Lond. 308, 37–43.

Nybroe, O., Linnemann, D. & Bock, E. (1988). N-CAM biosynthesis in brain. Neurochem. Int. 12, 251–262.

Owens, G. C., Edelman, G. M. & Cunningham, B. A. (1987). Organisation of the neural cell adhesion molecule N-CAM gene: alternative exon usage as a basis for different membrane associated domains. Proc. natn. Acad. Sci. U.S.A. 84, 294–298.

Putnam, F. W., Liu, Y.-S. V. & Low, T. L. K. (1979). Primary structure of an IgA1 immunoglobulin. J. biol. Chem. 254, 2865–2874.

Queen, C. & Korn, L. J. (1984). A comprehensive sequence analysis program for the IBM personal computer. Nucleic Acids Res. 12, 581–599.

Rougon, G. & Marshak, D. (1986). Structural and immunological characterization of the amino-terminal domain of mammalian cell adhesion molecules. J. biol. Chem. 261, 3396–3401.

Rutishauser, U. & Goridis, C. (1986). N-CAM: The molecule and its genetics. Trends Genet. 2, 72–76.

Rutishauser, U., Grumet, M. & Edelman, G. M. (1983). Neural cell adhesion molecule mediates initial interaction between spinal cord neurons and muscle cells in culture. J. Cell Biol. 97, 145–152.

Salzer, J. L., Holmes, W. P. & Colman, D. R. (1987). The amino acid sequence of the myelin-associated glycoproteins: homology to the immunoglobulin gene superfamily. J. Cell Biol. 104, 957–965.

Sanger, F., Nicklen, S. & Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. natn. Acad. Sci. U.S.A. 74, 5463–5467.

Santoni, M.-J., Barthels, D., Barbas, J. A., Hirsch, M.-R., Steinmetz, M., Goridis, C. & Wille, W. (1987). Analysis of cDNA clones that code for the transmembrane forms of the mouse neural cell adhesion molecule (N-CAM) and are generated by alternative RNA splicing. Nucleic Acids Res. 15, 8621–8641.

Schwartz, R. M. & Dayhoff, M. O. (1978). Atlas of Protein Sequence and Structure, vol. 5, Suppl. 3 (ed. M. O. Dayhoff), pp. 353–358. Washington, DC: National Biomedical Research Foundation.

Takeichi, M. (1987). Cadherins: a molecular family essential for selective cell-cell adhesion and animal morphogenesis. Trends Genet. 3, 213–216.

Walsh, F. S. (1988). The N-CAM gene is a complex transcriptional unit. Neurochem. Int. (In press).

Williams, A. F. (1987). A year in the life of the immunoglobulin superfamily. Immunol. Today 8, 298–303.