The unusual Chlamydomonas linkage group XIX - called the uni linkage group for the uni mutants that lack one of the paired flagella of wild-type cells - has been reported to be physically located exclusively at the basal bodies. To learn whether the structure of genes on this linkage group differs from the structure of nuclear genes in this organism, we determined the primary structure of a gene that maps to linkage group XIX. This analysis reveals the presence of nine intervening sequences; the nucleotides at exon/intron boundaries conform with nuclear gene intron junction sequences. Also typical for C. reinhardtii nuclear genes are the position and sequence of the putative polyadenylation signal. These findings suggest that transcripts from linkage group XIX are likely to be processed in the nucleus. The open reading frame, which displays weak but easily detected Chlamydomonas codon bias, potentially encodes a protein similar to a membrane anchor for cytoskeletal proteins. The observation that expression of this gene is regulated during interphase and in gametes is not consistent with the hypothesis that linkage group XIX may be expressed only during mitotic and meiotic processes.

Linkage group XIX of Chlamydomonas reinhardtii has three unusual characteristics: it is genetically circular, displays altered recombinational properties and every locus identified by mutation to date affects processes involving microtubules (Ramanis and Luck, 1986; reviewed by Dutcher, 1989). Hall and coworkers (1989) presented evidence that linkage group XIX is physically located at the basal bodies, renewing interest in the notion that these structures might contain their own genome (cf. Wheatley, 1982). Furthermore, the DNA was detected only at the basal bodies and not in the nucleus (Hall et al. 1989). If correct, these conclusions raise many questions, including several concerning the mechanisms by which genes of this linkage group might be transcribed and the RNA processed.

During an ongoing project to place molecular markers on the genetic map of C. reinhardtii (Ranum et al. 1988), the gene corresponding to cDNA clone pcf9-26 was assigned to linkage group XIX (Ranum, 1989). This clone was originally selected by differentially screening a cDNA library for mRNAs that accumulate during flagellar regeneration (Schloss et al. 1984). The mRNA is enriched in the polyadenylated fraction of total cellular RNA, prompting us to ask whether a typical C. reinhardtii nuclear polyadenylation signal was used for this gene and, furthermore, whether splicing would be required for transcript processing. Knowledge of the structure of this gene would also contribute to studies in our laboratory on the regulation of flagellar gene expression. We therefore isolated genomic DNA clones and additional cDNA clones for this mRNA, and determined their nucleotide sequences.

Plasmid pcf9-26 DNA (Schloss et al. 1984) was nick translated and used to screen a genomic DNA library as described previously (Schloss, 1990). Three different clones were purified and restriction endonuclease cleavage site mapping demonstrated that each contained similar and mostly overlapping regions of the genome. One clone, λ9-26A, containing a 13.8 kilobase (kb) segment of algal DNA, was analyzed further. A 3.8 kb KpnI fragment from the center of the insert was shown to contain the gene of interest by hybridizing the cloned cDNA to Southern blots containing fragments (generated by restriction enzyme digestion) of the cloned genomic DNA, and by hybridizing DNA fragments isolated from the genomic clone to Northern blots containing RNA from control or deflagellated Chlamydomonas cell cultures (data not shown). The available cDNA clone was clearly not full length, containing a 0.9 kb insert that hybridized to an RNA of approximately 2 kb (Schloss et al. 1984). Therefore, the 3.8 kb genomic DNA fragment was used to screen C. reinhardtii gamete cDNA libraries in λgt11 (Adair and Apt, 1990), and several additional cDNA clones were identified. The DNA sequences of the entire 3.8 kb genomic DNA fragment (both strands) and of a subset of the cDNA clones were determined by the chain termination method (Sanger et al. 1977) using Sequenase™ kits (United States Biochemical) as described earlier (Schloss, 1990). The only discrepancies between cDNA clones were in the precise position of the poly(A) tail, which began 13 (one clone), 14 (one clone) or 15 (two clones) nucleotides past the putative polyadenylation signal. Sequence data were analyzed using IBI Pustell sequence analysis software (International Biotechnologies, Inc.) and the GenBank FASTA Server.

Gene structure

The structure of the gene corresponding to cDNA clone pcf9-26, which maps to linkage group XIX, appears to be that of a typical C. reinhardtii gene. The gene contains at least nine intervening sequences that are flanked by consensus splice junction sequences (Fig. 1). If any feature of the gene structure is remarkable, it is the large number of introns (the largest number reported to date for any gene in this organism). A putative polyadenylation signal TGTAA (Silflow et al. 1985) occurs 12-14 nucleotides upstream of the polyadenylation site (Fig. 2). The length of the 3’ untranslated region (426 nucleotides) is typical for nuclear genes of this organism. The 3.8 kb gene sequence is 60% G+C (the introns, total cDNA and translated regions are 59%, 60% and 64% G+C, respectively), reflecting the base composition of C. reinhardtii genomic DNA (63%; Chiang and Sueoka, 1967). On the basis of this gene structure analysis, this RNA is most likely to be spliced and polyadenylated in the nucleus.
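The two quantities used in this paragraph, base composition and the spacing between the TGTAA signal and the polyadenylation site, are simple to compute from a sequence string. The following is a minimal sketch, not the Pustell software used for the analysis. The function names are hypothetical, and the counting convention is an assumption: distance is counted from the last base of the signal to the last base before the poly(A) tail.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G or C among unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "GC" for base in counted) / len(counted)


def signal_to_polya_distance(utr3: str, signal: str = "TGTAA") -> int:
    """Count the nucleotides between the last occurrence of `signal`
    and the end of `utr3`.

    Assumption: `utr3` ends at the last templated base before the
    poly(A) tail, so the tail itself must be trimmed beforehand.
    """
    utr3 = utr3.upper()
    start = utr3.rfind(signal)
    if start == -1:
        raise ValueError("signal not found in sequence")
    return len(utr3) - (start + len(signal))


if __name__ == "__main__":
    # Toy sequences only; these are not taken from the 9-26 gene.
    print(gc_fraction("GGCCAT"))
    print(signal_to_polya_distance("ACGTGTAA" + "C" * 13))
```

Applying `gc_fraction` separately to the introns, the cDNA and the translated region would give the per-region percentages of the kind quoted above.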

The cDNA sequence is 1987 nucleotides long, excluding the poly(A) tail. We estimate from Northern blots that the mRNA is 2 kb long and thus believe that the longest cDNA clone is virtually full length. However, we have not conducted 5′ end mapping experiments, and therefore cannot exclude the possibilities that the 5′ untranslated region may be slightly longer or that another intron might exist.

Predicted protein product

The major open reading frame is 432 codons long. This predicted translation product uses as the initiator codon the first AUG (at position 466 of the gene sequence) that is in frame with the longest potential open reading frame. We selected this AUG for several reasons. Firstly, the calculated molecular weight of the predicted polypeptide is 46 860, in agreement with preliminary mRNA hybrid selection and translation experiments that reveal a faint band at 45 000 molecular weight (J. A. Schloss, unpublished observations). Secondly, this AUG has the essential purine at −3 (Kozak, 1989); two upstream AUGs and the first two AUGs downstream of position 466 have pyrimidines at −3. The AUG at 1091 matches more closely the initiator codon context we have compiled for Chlamydomonas mRNA sequences, with an A at −3 and G at +4. However, the subsequent open reading frame is relatively short (135 codons) and exhibits none of the bias generally observed in Chlamydomonas open reading frames. The long open reading frame we have identified exhibits weak Chlamydomonas codon bias. For example, the percentage of codons with A in the third position is 0.1% for the four tubulin genes, which display strong bias (Youngblom et al. 1984; Silflow et al. 1985), 8% for the arylsulfatase gene, which displays the weakest bias reported to date (de Hostos et al. 1989), and 9% for this 9-26 gene.
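Two of the criteria used above are mechanical enough to express as code: the purine at −3 relative to a candidate AUG, and the percentage of codons with A in the third position. The sketch below is illustrative only. The function names are hypothetical, and it assumes 0-based string indexing with the A of the AUG taken as position +1, so that −3 is three bases upstream of that A.

```python
PURINES = {"A", "G"}


def has_purine_at_minus3(mrna: str, aug_index: int) -> bool:
    """Return True if the base at -3 relative to the AUG is a purine.

    `aug_index` is the 0-based index of the A of the AUG (position +1).
    DNA-style input (ATG) is accepted as well as RNA (AUG).
    """
    mrna = mrna.upper()
    if aug_index < 3:
        raise ValueError("no -3 position available upstream of this AUG")
    if mrna[aug_index:aug_index + 3] not in ("AUG", "ATG"):
        raise ValueError("aug_index does not point at an AUG/ATG codon")
    return mrna[aug_index - 3] in PURINES


def third_position_a_percent(orf: str) -> float:
    """Return the percentage of codons in `orf` whose third base is A."""
    orf = orf.upper().replace("U", "T")
    if not orf or len(orf) % 3 != 0:
        raise ValueError("ORF length must be a non-zero multiple of three")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return 100.0 * sum(codon[2] == "A" for codon in codons) / len(codons)


if __name__ == "__main__":
    # Toy inputs only; these are not taken from the 9-26 gene.
    print(has_purine_at_minus3("GCCACCATGG", 6))
    print(third_position_a_percent("ATGGCAGCCTAA"))
```

Whether the stop codon is included in the codon count is a choice the text does not specify; this sketch counts every codon it is given.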

The predicted protein product of this open reading frame (hereafter referred to as the linkage group XIX open reading frame [lg19orf]) is not readily identified by comparison (Pearson and Lipman, 1988) with entries in the GenPept or Swiss-Prot (releases 64.3 and 18) databases. The best match is to the Drosophila segment polarity protein armadillo (Riggleman et al. 1989). The arm protein contains 13 repeated segments, each 42 amino acids long. The region of lg19orf from amino acid residue 182 to 345 can be aligned either with arm protein repeat segments #1-#4 or #3-#6 (Fig. 3), and each of these alignments yields a similar score: about 20% identity and 50% similarity (sum of identities and conservative amino acid changes). The regions of these polypeptides that lie N-terminal to the repeat domains are nearly 50% similar. Some general characteristics of the polypeptide structures are also conserved: the N-terminal domain of each polypeptide is acidic and hydrophilic, and the central region (which is much smaller in the algal polypeptide) is basic and hydrophobic (cf. Riggleman et al. 1989). The C-terminal domains are both acidic, although this region is hydrophobic in the algal and hydrophilic in the fly protein. We also note that introns 4, 6 and 7 are each located a few amino acids past the positions in the lg19orf product with which the beginning of the repeat segments align, consistent with the concept that repeated protein segments may have arisen by duplication of an ancestral DNA segment, with subsequent maintenance of intron positions in the Chlamydomonas gene.

Armadillo is the fly homolog of mammalian plakoglobin (Peifer and Wieschaus, 1990; Franke et al. 1989), a component of cell membrane-associated plaques in which intermediate filaments and microfilaments are anchored (Cowin et al. 1986). The armadillo protein also localizes to plasma membranes and sometimes colocalizes with actin filaments (Riggleman et al. 1990). This prompts the intriguing speculation that the lg19orf product may also be a component of a membrane-associated cytoskeletal anchor. In this case the cytoskeletal component might be axonemal outer doublet or central pair microtubules that terminate in specific structures at the distal tip of the flagellar membrane (Dentler, 1980). Consistent with this proposed identity is the observation that the abundance of this mRNA is 30- to 40-fold lower than that of tubulin mRNA (see below), as expected for an mRNA whose product is present in a small region of the axoneme. An alternative interpretation, that the lg19orf product is related to arm only as a consequence of the presence of similar repeat segments, is not favored because the similarity between lg19orf and arm is more apparent than the similarity between repeated domains within the lg19orf product.

Gene regulation in growing and differentiated cells

Preliminary characterization of pcf9-26 included the demonstration that the corresponding mRNA accumulates in deflagellated gametic cells (Schloss et al. 1984). Fig. 4 confirms this observation and extends the analysis to vegetative cells, in which the mRNA is also expressed and induced approximately 5- to 10-fold upon deflagellation. We estimate that this mRNA is 30-fold lower in abundance than α-tubulin mRNA in deflagellated gametes (based on quantitative RNA dot hybridizations used to generate Figs 5 and 6 of Schloss et al. 1984), and 40-fold lower than α-tubulin mRNA in deflagellated vegetative cells (unpublished observations).

Conclusion

This is the first reported nucleotide sequence analysis of a gene on linkage group XIX, and it raises several important issues pertinent to the reported basal body localization of the DNA. The transcript from this gene is spliced and polyadenylated. If the gene resides in basal bodies and not in the nucleus, then RNA transcription and processing enzymes would likely need to exist in basal bodies. Alternatively, Hall and coworkers (1989) proposed that DNA of linkage group XIX might be quiescent during interphase, and replicated and expressed only during mitotic replication and meiotic reorganization, when the basal bodies themselves are reorganized (the cytological location of basal body DNA at these times was not specified). However, their proposal is inconsistent with the observation that 9-26 mRNA accumulates during flagellar regeneration in experiments that are conducted in the middle of interphase, and in gametes, which are differentiated cells not progressing through the cell cycle. Such accumulation requires either an increase in transcription rate, or continuous transcription accompanied by changes in mRNA turnover rate.

Two cytological studies have recently presented strong evidence against the presence of DNA in the C. reinhardtii basal body (Johnson and Rosenbaum, 1990; Kuroiwa et al. 1990). While this manuscript was in preparation, Johnson and Dutcher (1991) reported molecular evidence against the presence of linkage group XIX DNA in the basal body. They showed that the copy number per cell of linkage group XIX and nuclear linkage groups is the same (the nuclear genome is haploid, but each cell has two basal bodies), and that this remains true in mutant strains lacking basal bodies. The present study does not directly address the cytological location of linkage group XIX DNA, but places constraints on the cellular machinery that must be able to act on that DNA and its product, and on the timing when that machinery must act. These data need to be incorporated into any future evaluation of potential relationships between linkage group XIX and the presence or function of basal body DNA.

We thank L. Ranum, P. Lefebvre and C. Silflow for communicating the results of RFLP mapping studies, M. Gillette for the restriction map of genomic clone λ9-26B, S. Adair and S. Waffenschmidt for supplying cDNA libraries and P. Lefebvre and C. Silflow for helpful comments on the manuscript. A. Bhat provided excellent technical assistance. H.B.C. was a Fellow of the Faculty Scholars Program, University of Kentucky. This research was also supported by NIH grant GM-34837; BRSG S07 RR07114-21 awarded by the Biomedical Research Support Grant Program, Division of Research Resources, NIH; and Major Equipment grants from the University of Kentucky Graduate School to J.A.S.

Adair, W. S. and Apt, K. E. (1990). Cell wall regeneration in Chlamydomonas: accumulation of mRNAs encoding cell wall hydroxyproline-rich glycoproteins. Proc. natn. Acad. Sci. U.S.A. 87, 7355-7359.

Chiang, K.-S. and Sueoka, N. (1967). Replication of chloroplast DNA in Chlamydomonas reinhardii during vegetative cell cycle: its mode and regulation. Proc. natn. Acad. Sci. U.S.A. 57, 1506-1513.

Cowin, P., Kapprell, H.-P., Franke, W. W., Tamkun, J. and Hynes, R. O. (1986). Plakoglobin, a protein common to different types of intercellular adhering junctions. Cell 46, 1063-1073.

Dayhoff, M., Schwartz, R. M. and Orcutt, B. C. (1978). In: Dayhoff, M. (ed.) Atlas of Protein Sequence and Structure. Vol. 5, Suppl. 3, pp. 345-352. Silver Spring, MD: Nat. Biomed. Res. Found.

de Hostos, E. L., Schilling, J. and Grossman, A. R. (1989). Structure and expression of the gene encoding the periplasmic arylsulfatase of Chlamydomonas reinhardtii. Molec. gen. Genet. 218, 229-239.

Dentler, W. L. (1980). Structures linking the tips of ciliary and flagellar microtubules to the membrane. J. Cell Sci. 42, 207-220.

Dutcher, S. K. (1989). Linkage group XIX in Chlamydomonas reinhardtii (Chlorophyceae): Genetic analysis of basal body function and assembly. In: Coleman, A., Goff, L. and Stein-Taylor, J. (eds) Algae as Experimental Systems, pp. 39-53. New York: Alan R. Liss, Inc.

Franke, W. W., Goldschmidt, M. D., Zimbelmann, R., Mueller, H. M., Schiller, D. L. and Cowin, P. (1989). Molecular cloning and amino acid sequence of human plakoglobin, the common junctional plaque protein. Proc. natn. Acad. Sci. U.S.A. 86, 4027-4031.

Hall, J. L., Ramanis, Z. and Luck, D. J. L. (1989). Basal body/centriolar DNA: molecular genetic studies in Chlamydomonas. Cell 59, 121-132.

Johnson, D. E. and Dutcher, S. K. (1991). Molecular studies of linkage group XIX of Chlamydomonas reinhardtii: evidence against a basal body location. J. Cell Biol. 113, 339-346.

Johnson, K. A. and Rosenbaum, J. L. (1990). The basal bodies of Chlamydomonas reinhardtii do not contain immunologically detectable DNA. Cell 62, 615-619.

Kozak, M. (1989). The scanning model for translation: an update. J. Cell Biol. 108, 229-241.

Kuroiwa, T., Yorihuzi, T., Yabe, N., Ohta, T. and Uchida, H. (1990). Absence of DNA in the basal body of Chlamydomonas reinhardtii by fluorimetry using a video-intensified microscope photon-counting system. Protoplasma 158, 155-164.

Pearson, W. R. and Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proc. natn. Acad. Sci. U.S.A. 85, 2444-2448.

Peifer, M. and Wieschaus, E. (1990). The segment polarity gene armadillo encodes a functionally modular protein that is the Drosophila homolog of human plakoglobin. Cell 63, 1167-1178.

Ramanis, Z. and Luck, D. J. L. (1986). Loci affecting flagellar assembly and function map to an unusual linkage group in Chlamydomonas reinhardtii. Proc. natn. Acad. Sci. U.S.A. 83, 423-426.

Ranum, L. P. W. (1989). Mapping nuclear sequences in Chlamydomonas reinhardtii using restriction fragment length polymorphisms. Ph.D. Thesis. University of Minnesota, St. Paul, Minnesota.

Ranum, L. P. W., Thompson, M. D., Schloss, J. A., Lefebvre, P. A. and Silflow, C. D. (1988). Mapping flagellar genes in Chlamydomonas using restriction fragment length polymorphisms. Genetics 120, 109-122.

Riggleman, B., Schedl, P. and Wieschaus, E. (1990). Spatial expression of the Drosophila segment polarity gene armadillo is posttranslationally regulated by wingless. Cell 63, 549-560.

Riggleman, B., Wieschaus, E. and Schedl, P. (1989). Molecular analysis of the armadillo locus: uniformly distributed transcripts and a protein with novel internal repeats are associated with a Drosophila segment polarity gene. Genes Dev. 3, 96-113.

Sanger, F., Nicklen, S. and Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. natn. Acad. Sci. U.S.A. 74, 5463-5467.

Schloss, J. A. (1990). A Chlamydomonas gene encodes a G protein β subunit-like polypeptide. Molec. gen. Genet. 221, 443-452.

Schloss, J. A., Silflow, C. D. and Rosenbaum, J. L. (1984). mRNA abundance changes during flagellar regeneration in Chlamydomonas reinhardtii. Molec. cell. Biol. 4, 424-434.

Silflow, C. D., Chisholm, R. L., Conner, T. W. and Ranum, L. P. W. (1985). The two α-tubulin genes of Chlamydomonas reinhardtii code for slightly different proteins. Molec. cell. Biol. 5, 2389-2398.

Wheatley, D. N. (1982). The Centriole: a Central Enigma of Cell Biology. Amsterdam, New York, Oxford: Elsevier Biomedical.

Youngblom, J., Schloss, J. A. and Silflow, C. D. (1984). The two α-tubulin genes of Chlamydomonas reinhardtii code for identical proteins. Molec. cell. Biol. 4, 2686-2696.

Zimmer, W. E., Schloss, J. A., Silflow, C. D., Youngblom, J. and Watterson, D. M. (1988). Structural organization, DNA sequence, and expression of the calmodulin gene. J. biol. Chem. 263, 19370-19383.