The ascidian egg is a well-known mosaic egg. In order to investigate the molecular nature of the maternal genetic information stored in the egg, we have prepared cDNAs from the mRNAs in the fertilized eggs of the ascidian, Halocynthia roretzi. The cDNAs of the ascidian embryo were sequenced, and the localization of individual mRNA was examined in staged embryos by whole-mount in situ hybridization. The data obtained were stored in the database MAGEST (http://www.genome.ad.jp/magest) and further analyzed. A total of 4240 cDNA clones were found to represent 2221 gene transcripts, including at least 934 different protein-coding sequences. The mRNA population of the egg consisted of a low prevalence, high complexity sequence set. The majority of the clones were of the rare sequence class, and of these, 42% of the clones showed significant matches with known peptides, mainly consisting of proteins with housekeeping functions such as metabolism and cell division. In addition, we found cDNAs encoding components involved in different signal transduction pathways and cDNAs encoding nucleotide-binding proteins. Large-scale analyses of the distribution of the RNA corresponding to each cDNA in the eight-cell, 110-cell and early tailbud embryos were simultaneously carried out. These analyses revealed that a small fraction of the maternal RNAs were localized in the eight-cell embryo, and that 7.9% of the clones were exclusively maternal, while 40.6% of the maternal clones showed expression in the later stages. This study provides global insights about the genes expressed during early development.
The fertilized egg is a single totipotent cell that cleaves many times to give rise to a multicellular organism. Within an embryo, embryonic cells develop into various tissue types, and ultimately into all cell types and structures in a series of specification and differentiation steps. Ascidian embryos develop into tadpole-shaped swimming larvae, which have a primitive chordate body. Ascidians have been the subject of embryological studies for more than a hundred years (Chabry, 1887) and have been used as classic organisms to study what is known as ‘mosaic development’ (Conklin, 1905a; Conklin, 1905b). Of the major organs and tissues, the epidermis, muscle and endoderm are formed by cell-autonomous processes. The mosaicism of the ascidian embryo is evidence that the localized cytoplasmic factors in the egg specify the tissue precursor cells during embryogenesis (Reverberi and Minganti, 1946; see Satoh, 1994 for a review). In addition, a series of experiments has shown that localized maternal factors are responsible for controlling the axis specification, cleavage pattern, morphogenetic movements such as gastrulation (for a review, see Nishida, 1997) and the responsiveness of the notochord and mesenchyme precursor blastomeres (Kim et al., 2000). These factors are segregated into particular blastomeres by an invariant cleavage pattern of the ascidian embryo. Thus, various processes of ascidian embryogenesis are mediated by localized maternal factors in the egg. We are interested in the maternal factors in the egg cytoplasm that provide information for the diversification of cell types and for the morphogenesis during the successive stages of embryogenesis.
In the early 1980s, it was reported that actin mRNA accumulated preferentially in a region of the ascidian egg called the myoplasm, providing the first evidence for polarized mRNA distribution within a cell (Jeffery et al., 1983). This has been followed by numerous reports of mRNAs localized in the oocytes, eggs and embryos of various organisms, including both protostomes, such as the fruit fly and nematode, and deuterostomes, such as Xenopus (for reviews, see Wilhelm and Vale, 1993; Ding and Lipshitz, 1993; Glotzer and Ephrussi, 1996; Gavis, 1997; Schnapp et al., 1997). The RNAs present in a cell determine the range of functions the cell can perform. In most of the species studied, including the fruit fly, nematode, sea urchin, frog and possibly in fish, the establishment of axes, diversification of cell types and morphological changes during early embryogenesis rely partly on the maternally stored mRNAs (e.g. Davidson, 1986; Wieschaus, 1996; Wylie et al., 1996; Bowerman et al., 1997; Yamaha et al., 1998). A large-scale analysis of the functional requirements in the Drosophila germline has suggested that 75% of the 3600 lethal loci in the genome are functionally required during oogenesis (Perrimon et al., 1989). Therefore, large-scale identification of RNAs stored in the eggs and analysis of the abundance of the RNAs can give valuable insights into the basic molecular processes. Reflecting the enormous developmental potential of the egg, the egg contains the greatest transcript complexity of any known cell type (Davidson, 1986; Poustka et al., 1999; Ko et al., 2000). This fact also makes the egg a particularly useful source of cDNA clones for the identification of genes.
Because of its phylogenetic position, the ascidian is a key organism for large-scale genome-wide research. In the phylum Chordata, genome duplication events took place twice during vertebrate evolution. The ascidian, a lower chordate, has a non-duplicated genome that can be regarded as a basic set of the chordate-type genome (Holland et al., 1994; Sidow, 1996). The haploid genome of the ascidian consists of about 160 Mb and contains about 15,000 genes, which are approximately one-twentieth and one-fifth of the respective amounts in mammals (Laird, 1971; Simmen et al., 1998). As a result of the absence of genome duplication in the ascidian lineage, the number of genes and the genome size of the ascidian are in the same range as those of the fruit fly and nematode (Miklos and Rubin, 1996), for which the genomes have already been sequenced. The RNA complexity in the egg of the ascidian is therefore expected to be lower than that of other chordates. The ascidian is thus a good model system with which to investigate the functions of chordate genomes, especially as gene function can readily be analyzed by gene introduction and gene disruption in ascidians, which are suitable for microsurgical manipulation and modern molecular techniques.
One approach to the comprehensive study of RNA functions is to sequence randomly isolated cDNAs and analyze the expression pattern of the corresponding mRNAs by whole-mount in situ hybridization on a large scale, because it is generally accepted that differential gene expression is a major mechanism in cellular differentiation and development. Description of the gene expression pattern is now considered to be an essential part of the characterization of genes because it clarifies their functions. There are examples of similar research and related databases for several organisms (Tabara et al., 1996; Gawantka et al., 1998; Ko et al., 2000).
We have therefore constructed a database named MAGEST, an acronym for Maboya (the ascidian Halocynthia roretzi) gene expression patterns and sequence tags (Kawashima et al., 2000), to analyze the data produced in this project. The database is implemented in the Sybase relational database system and is accessible through the Internet. Our aim is to achieve an all-inclusive and systematic description of maternal transcripts stored in the fertilized eggs of the ascidian, Halocynthia roretzi, in terms of the cDNA sequences and their localization and zygotic expression patterns during embryogenesis. Data collection is ongoing; here we present an overview of the large-scale, global analysis carried out so far on the identification and localization/expression of the maternal mRNAs stored in the ascidian egg.
MATERIALS AND METHODS
Eggs and embryos
Adult ascidians, Halocynthia roretzi, were obtained from fishermen near the Otsuchi Marine Research Institute, the University of Tokyo, Iwate, Japan and the Asamushi Marine Biological Station, Tohoku University, Aomori, Japan. Naturally spawned eggs were artificially fertilized with a suspension of non-self sperm. The fertilized eggs were immediately collected or cultured at 12°C in Millipore-filtered sea water containing 50 μg/ml streptomycin up to the eight-cell, 110-cell and early tailbud stages, at about 4 hours, 9 hours and 24 hours after fertilization, respectively. The eggs and embryos were collected by low-speed centrifugation.
Construction of the arrayed cDNA library and EST sequencing
Ascidian eggs are surrounded by the chorion, to which many follicle cells are attached. Within the perivitelline space, numerous test cells are found. The presence of the follicle and test cells makes it difficult to analyze the properties of the eggs when they are homogenized together with the chorion (Jeffery, 1980). Therefore, prior to RNA extraction, we used eggs that had been chemically dechorionated as described by Mita-Miyazawa et al. (Mita-Miyazawa et al., 1985). Poly(A) RNA was extracted from the fertilized eggs using AGPC (Chomczynski and Sacchi, 1987) and purified using Oligotex-dT30 beads (Roche Japan, Tokyo). An oligo(dT)-primed cDNA library was constructed from 5 μg of poly(A) RNA. For the cDNA synthesis, a lambda uni-ZAP cDNA synthesis kit (Stratagene) was used up to the purification of the adaptor-ligated cDNA. The cDNAs of the largest and second-largest fractions obtained in the fractionation step were ligated into pBluescript(SK-) and electroporated into Escherichia coli DH10B cells (Gibco) using a Gene Pulser II electroporation system (Pharmacia). A total of 86,000 independent cDNA clones of the library were picked randomly and arrayed in over two hundred 384-well plates using a Q-Bot robot (Genetix, UK). Each clone was assigned a clone ID as its identification number, which combines the plate number and the well number of the clone. The arrayed library was stored at –80°C. Single-pass cDNA sequencing was conducted. The 384-well plates were thawed and the clones were withdrawn, inoculated directly into a PI-50 six-hole tube that contained 3 ml of LB broth and cultured overnight, after which the cDNAs were isolated in a PI-50 mini-prep machine (Kurabo). Their 5′ and 3′ termini were sequenced in a half-scale reaction by conventional procedures in an automated ABI PRISM 377 sequencer (Perkin Elmer Japan), using Big-Dye dideoxy chain terminators.
The sequencing primers were T3 (5′-ATTAACCCTCACTAAAGGGA-3′) or SK20 (5′-CGCTCTAGAACTAGTGGATC-3′) for sequencing the 5′ termini and T7 (5′-GCGTAATACGACTCACTATA-3′) for the 3′ termini.
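The clone ID scheme described above (plate number plus well number) can be illustrated with a short sketch. The layout assumed here, a three-digit plate number followed by a row letter (A to P) and a two-digit column (01 to 24), is inferred from the clone IDs quoted in this paper (e.g. 001A03, 003O15) and from the standard 16 × 24 geometry of a 384-well plate; the function names are hypothetical and are not part of MAGEST.

```python
import re

# Assumed layout: PPP R CC  (plate, row A-P, column 01-24), inferred from
# IDs such as 001A03; this is an illustration, not the MAGEST specification.
CLONE_ID = re.compile(r"^(\d{3})([A-P])(\d{2})$")


def parse_clone_id(clone_id):
    """Split a clone ID into (plate number, row letter, column number)."""
    match = CLONE_ID.match(clone_id)
    if match is None:
        raise ValueError(f"not a valid clone ID: {clone_id!r}")
    plate = int(match.group(1))
    row = match.group(2)
    column = int(match.group(3))
    if not 1 <= column <= 24:
        raise ValueError(f"column out of range for a 384-well plate: {column}")
    return plate, row, column


def format_clone_id(plate, row, column):
    """Build a clone ID string from its plate, row and column."""
    return f"{plate:03d}{row}{column:02d}"
```

For example, `parse_clone_id("003O15")` splits the ID into plate 3, row O and column 15, and `format_clone_id(1, "A", 3)` rebuilds the string `001A03`.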
Sequence data analysis
After removing the vector sequences and ambiguous regions that contained stretches of N (ambiguous nucleotide) from the raw sequence data, the processed sequences were registered and used as query sequences for BLAST homology searches against GenBank (Benson et al., 2000) including dbEST (Boguski et al., 1993) at the nucleotide sequence level and also against nr-aa, a non-redundant protein sequence database constructed from SWISS-PROT (Bairoch and Apweiler, 2000), PIR (Barker et al., 1999), PRF (Protein Research Foundation, Osaka, Japan) and GenPept (translated GenBank) (Benson et al., 2000) at the amino acid sequence level. Up to 10 entries above a given threshold (blast E value<e−8) were retrieved from the original databases by the DBGET/LinkDB system (Fujibuchi et al., 1998) and stored in our database, named MAGEST (Kawashima et al., 2000).
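The hit-selection rule stated above (up to 10 entries with a BLAST E value below e−8) can be sketched as follows. The input format, a list of `(subject_id, evalue)` pairs, and the function name are assumptions made for illustration only; reading the threshold as 1e-8 is also an assumption, and the sketch does not reproduce the actual DBGET/LinkDB retrieval step.

```python
def select_hits(hits, max_evalue=1e-8, max_hits=10):
    """Keep the best-scoring hits below the E-value threshold.

    hits: iterable of (subject_id, evalue) pairs (hypothetical format).
    Returns at most max_hits pairs, sorted by ascending E value.
    """
    passing = [hit for hit in hits if hit[1] < max_evalue]
    passing.sort(key=lambda hit: hit[1])
    return passing[:max_hits]


# Toy example with made-up subject IDs and E values.
example_hits = [("subjA", 1e-20), ("subjB", 1e-5), ("subjC", 1e-9)]
kept = select_hits(example_hits)
```

In this toy example only `subjA` and `subjC` pass the threshold, and they are returned in order of increasing E value.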
All the 3′ sequences were used as fingerprints of the genes. To estimate the number of redundant clones, the 3′ sequences were clustered by searching for sequence similarities in MAGEST using the entire length of each sequence. Clones judged to derive from the same gene were given a common cluster ID beginning with C. Some of the clones for which the 3′ sequence was not available were clustered by their 5′ sequences; the remaining clones that stood alone were temporarily given cluster IDs beginning with H.
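As a minimal sketch of the grouping step, suppose the similarity search has already produced a list of clone pairs whose 3′ sequences match. Clones connected by such matches can then be merged into clusters with a union-find structure. This single-linkage rule, the cluster ID format (C followed by a running number) and the clone IDs in the example are all assumptions for illustration; the criteria actually used in MAGEST are not specified here.

```python
def cluster_clones(clone_ids, similar_pairs):
    """Group clones linked by pairwise 3'-sequence matches (single linkage).

    clone_ids: list of clone IDs.
    similar_pairs: iterable of (clone_a, clone_b) pairs judged similar.
    Returns a dict mapping hypothetical cluster IDs to member lists.
    """
    parent = {clone: clone for clone in clone_ids}

    def find(clone):
        # Follow parent links to the representative, compressing the path.
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for clone in clone_ids:
        groups.setdefault(find(clone), []).append(clone)

    # Number clusters in order of first appearance in clone_ids.
    return {f"C{i:05d}": members
            for i, members in enumerate(groups.values(), start=1)}


# Hypothetical clone IDs, not taken from the library.
clusters = cluster_clones(
    ["900A01", "900A02", "900A03", "900A04"],
    [("900A01", "900A03"), ("900A03", "900A04")],
)
```

Here three of the four toy clones end up in one cluster and the unmatched clone forms a singleton, which is how the cluster size comes to reflect clone redundancy.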
Whole-mount in situ hybridization
Whole-mount specimens were mechanically dechorionated by a chorion peeler (H. I., H. K. and T. N., unpublished) after fixation. The specimens were hybridized in situ at 45°C using digoxigenin-labeled antisense probes, according to a newly developed method described by Ogasawara et al. (Ogasawara et al., 2001). We analyzed three developmental stages: the eight-cell, 110-cell and early tailbud embryos. To render the specimens transparent when necessary, they were dehydrated in ethanol and cleared in a 1:2 mixture (v/v) of benzyl alcohol and benzyl benzoate. Images of stained embryos were acquired with an HC-300Z/OL-2 digital CCD camera (Olympus) mounted on an SZX-12-3121 dissecting microscope (Olympus) and processed using Adobe-Photoshop software. These expression data were compiled in MAGEST. The photo images and the classification data can be found at http://www.genome.ad.jp/magest/
RESULTS AND DISCUSSION
Statistical overview of the maternal RNA repertoire stored in the egg
The ascidian embryo has several advantages as an experimental system (see Satoh et al., 1996 for a review). In particular, one ascidian species, Halocynthia roretzi, possesses a few additional advantages. First, it is the largest species among all the ascidians widely used in biological studies and it produces a large number of eggs. Second, in most cases of in situ hybridization with whole-mount Halocynthia specimens, signals are first detected in the nuclei of certain blastomeres. Together with the complete description of the cleavage pattern up to the gastrula stage in Halocynthia (Nishida, 1987), this enables us to ascertain clearly when and in which cells of the early embryo a gene is expressed. Third, more detailed experimental and embryological studies have been possible in this than in any other species (see Nishida, 1997 for a review). Fourth, the egg of this species is as large as 280 μm in diameter and will be easy to manipulate in future functional analyses. Therefore, we constructed a directionally cloned, arrayed plasmid cDNA library from the RNA of the uncleaved fertilized egg of H. roretzi.
We determined the sequences of the 4240 cDNA clones from the library. The length of inserts in the library was approximately 2 kb on average and lay mainly in the range from 0.5 kb to 5 kb. The average readable sequence length, on which the following analysis is based, was about 500 bp. All the sequences obtained have been deposited in the public sequence database and made available since the summer of 1999. The sequences obtained were divided into 2221 C clusters consisting of 3832 clones and 408 H clusters. Unless otherwise specified, the following sequence analyses were done based on the C clusters.
The GC contents of all the sequences obtained were calculated for each terminus. The 5′ terminus sequences, which mainly consisted of open reading frames, had 700859 GC bases and 1048999 AT bases (40.1% GC). In contrast, the 3′ terminus sequences without poly(A) tails, most of which are considered to be 3′ UTRs, had 539096 GC bases and 1048707 AT bases (34.0% GC). According to the data collected by Nakamura et al. (Nakamura et al., 2000), the overall GC content of 48447 codons from the coding sequences of Halocynthia roretzi is 43.45%, while that of the third codon positions, which are relatively free from GC/AT-usage restriction, is 41.38%. This suggests that the genome of Halocynthia roretzi has a tendency to prefer AT to GC. Given that nucleotide usage in the 3′ UTRs is not generally constrained, this tendency may explain the lower GC content in the 3′ termini we obtained.
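The two percentages above follow directly from the base counts. As a quick check (the function name is ours, introduced only for this illustration):

```python
def gc_percent(gc_bases, at_bases):
    """GC content as a percentage of all counted bases."""
    return 100.0 * gc_bases / (gc_bases + at_bases)


# Base counts quoted in the text for the 5' and 3' terminus sequences.
five_prime_gc = gc_percent(700859, 1048999)
three_prime_gc = gc_percent(539096, 1048707)

print(f"5' termini: {five_prime_gc:.1f}% GC")   # 40.1
print(f"3' termini: {three_prime_gc:.1f}% GC")  # 34.0
```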
As we used a non-normalized primary library without amplification, the cluster size, namely the clone abundance, will reflect the relative population of the corresponding maternal RNAs. Fig. 1 illustrates the prevalence distribution of the cluster sizes. The total numbers of clusters in each cluster size are shown. Four general abundance classes were identified.
(1) The most frequently represented among the redundant clones was mitochondrial 16S ribosomal RNA.
(2) There were only nine clusters that contained more than 12 clones, except for the mitochondrial 16S ribosomal RNA. These nine clusters, representing the most abundant transcripts, constituted 4.7% of the total clones (198 of 4240 clones), but represented just 0.41% of the total clusters (nine of 2221 clusters). The genes of higher prevalence classes encoded proteins such as the C9 protein (unknown mammalian protein located on human chromosome 12 and mouse chromosome 6)(Ansari-Lari et al., 1998), cyclins, ubiquitin and tubulin, most of which are involved in cell division (Fig. 1).
(3) There were 658 medium-sized clusters with two to eight members. These 658 clusters together contained 1699 clones (40.0% of the starting clones) and constituted 29.6% of the total clusters. Interestingly, a cDNA encoding a homolog of CsEPI-1 in Ciona savignyi, which is reported to be expressed only zygotically after the 110-cell stage in the epidermal blastomeres (Chiba et al., 1998), was maternally recovered as a member of this class in this study.
(4) The vast majority of the egg poly (A) RNAs were of the rare sequence class, consisting of RNA species appearing only once in the database, which were 1556 clusters containing only 36.7% of the total clones but 70.0% of the total clusters. In Fig. 1, the clusters were separated into sequences of putative mRNAs encoding known peptides, sequences with homology to ESTs or cosmids of other organisms, and sequences with no known similarity. A similar tendency was observed in the medium and low prevalence classes.
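A prevalence distribution of the kind plotted in Fig. 1 can be derived from a clone-to-cluster assignment by counting clones per cluster and then counting clusters per size. The sketch below uses a toy assignment with hypothetical clone and cluster IDs; it only illustrates the bookkeeping and does not reproduce the actual data.

```python
from collections import Counter


def prevalence_distribution(clone_to_cluster):
    """Return {cluster size: number of clusters of that size}."""
    cluster_sizes = Counter(clone_to_cluster.values())
    return Counter(cluster_sizes.values())


def singleton_share(clone_to_cluster):
    """Fraction of clusters that contain exactly one clone."""
    distribution = prevalence_distribution(clone_to_cluster)
    return distribution[1] / sum(distribution.values())


# Toy assignment: five hypothetical clones falling into three clusters.
toy = {"cl1": "C1", "cl2": "C1", "cl3": "C2", "cl4": "C3", "cl5": "C1"}
distribution = prevalence_distribution(toy)
share = singleton_share(toy)
```

In the toy data one cluster has three members and two clusters are singletons, so two-thirds of the clusters fall in the rare class.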
Overall identification and categorization of the sequences
The nucleotide sequences obtained and the predicted amino acid sequences were compared with those in the public databases, such as GenBank, to identify significant matches. Fig. 2 shows the general categories. The sequence data analyzed so far were divided into four general categories, defined by the sequence similarity searches: (1) 1610 recognized protein-coding sequences for which there is a high probability that the identifications are meaningful, totaling 42.0% of the 3832 clones and representing 934 clusters (42.1% of 2221 clusters); (2) 403 sequences showing significant similarities to unidentified ESTs or cosmids of other organisms, totaling 10.5% of the total clones, and representing 233 clusters (10.5% of the total clusters); (3) 425 sequences representing the mitochondrial 16S ribosomal RNA, totaling about 11.1% of the total clones; and (4) 1394 sequences belonging to none of the above classes, totaling about 36.4% of the total clones, and representing 1053 clusters (47.4% of the total clusters). The sequences in the last category represent unidentified mRNAs. Comparison of the number of cDNAs analyzed here with the number of genes estimated to be expressed in the egg suggests that the percentages obtained above are close to those of the RNA repertoire in the egg.
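The percentages for the four categories can be recomputed from the clone and cluster counts given above. In the sketch below, the cluster count of 1 for the mitochondrial 16S ribosomal RNA is not stated explicitly in the text; it is inferred from the fact that the other three categories account for 2220 of the 2221 clusters. The category labels are shorthand introduced here.

```python
TOTAL_CLONES = 3832
TOTAL_CLUSTERS = 2221

# (clones, clusters) per category; the "1" for mt 16S rRNA is an inference.
CATEGORIES = {
    "known protein": (1610, 934),
    "EST/cosmid match": (403, 233),
    "mt 16S rRNA": (425, 1),
    "no similarity": (1394, 1053),
}


def category_shares(categories, total_clones, total_clusters):
    """Percentage of clones and of clusters in each category."""
    return {
        name: (round(100.0 * clones / total_clones, 1),
               round(100.0 * clusters / total_clusters, 1))
        for name, (clones, clusters) in categories.items()
    }


shares = category_shares(CATEGORIES, TOTAL_CLONES, TOTAL_CLUSTERS)
for name, (clone_pct, cluster_pct) in shares.items():
    print(f"{name}: {clone_pct}% of clones, {cluster_pct}% of clusters")
```

The clone counts sum to 3832 and the cluster counts to 2221, so the four categories partition the C clusters completely.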
Table 1 lists examples of the most significant matches discovered by comparing the ascidian sequences to the public databases. Except for the previously isolated ascidian proteins and proteins highly conserved beyond species or phyla, they are not necessarily the Halocynthia homologs, but in general they suggest the probable protein families to which the ascidian proteins belong or the proteins sharing some motifs. For further analysis, we attempted a functional classification of the 934 identified clusters containing 1610 clones with significant matches, according to attributable function. The classification basically followed the system described by Lee et al. (Lee et al., 1999). Category A includes the proteins required for housekeeping functions including the cell cycle, metabolism and energy transfer. Category B consists of signaling molecules and intercellular communication proteins. Category C contains transcription factors and other nuclear proteins that affect gene regulation. Category D consists of the other proteins that are not classified into the above categories. Table 2 summarizes the proportion and the number of diverse mRNA species represented in each of these categories. Though long suspected, this provides direct evidence that a high fraction of the maternal RNAs encode housekeeping machinery. Despite the large numbers of clones and clusters in category A, it consisted mainly of a relatively high prevalence, low complexity sequence set (i.e. low cluster/clone ratio), in contrast to category C, which consisted of a low prevalence, high complexity sequence set (high cluster/clone ratio). Strikingly, 12.8% of the clusters with attributable function represent genes with a potential regulatory role in development (namely, genes in categories B and C).
Comparison of the proportion of the four categories obtained from the ascidian one-cell embryo with that of the sea urchin 7 hour cleavage-stage embryo (Lee et al., 1999) showed a significant difference in the number of genes in category B. Unlike categories A and C, whose numbers were comparable in the two analyses, the ascidian embryo expressed fewer RNAs for cell-cell communication proteins. This may reflect both the stage difference between the one-cell and cleavage-stage embryos and the embryological difference between the mosaic and regulative natures of the two species.
In Table 3, several examples of the Halocynthia clones classified into the above categories are listed. Several cDNA clones that have been reported to be expressed maternally in the egg were recovered, as expected. One of the most practical products of any EST project is the new probes obtained for interesting gene homologs of the animal. Among these, the most interesting genes are likely to be those concerning cell signaling and transcriptional control. There are several such discoveries listed in Table 3; for example, we have already reported 001A03, HrWnt-5 (Sasakura et al., 1998a), and 002C13, Hrsmad1/5, which is assumed to transduce BMP signaling (Kobayashi et al., 1999). The presence of these maternal transcripts encoding signaling molecules suggests that signaling mediated by these proteins occurs early in cleavage-stage embryos, and that these proteins may be involved in early specification events. We also found sequences similar to those of various transcription factors that have not been previously isolated from the ascidian embryos, such as groucho and Sin3. In addition, 25 clusters whose products were predicted to contain zinc-finger motifs were found (Table 4). Ten of them had zinc-finger motifs of the C2H2 type, which are expected to bind to DNA as transcription factors. As one of the most important roles performed by maternal factors stored in eggs is to control subsequent zygotic gene expression during early embryogenesis as transcriptional regulators, they may function in the cell-type diversification processes. All of these findings have new implications for the functional activities of the ascidian egg. The mechanism(s) by which these maternal transcription factors are modified and/or sequestered so that they become functional at the proper developmental stages in the proper blastomeres remains elusive.
In category D, we found a cluster consisting of two clones with a high similarity to the transposase MER37 of the human transposable element, Tigger1. Tigger1 belongs to the pogo-class DNA transposon superfamily, which is distantly related to the Tc1/mariner superfamily (Smit and Riggs, 1996). This is the first evidence that a sequence encoding a transposase, which may originate from an endogenous transposon, is transcribed in ascidian cells.
Sequences constituting about 47.4% of the total clusters displayed no significant similarity to any known sequences or ESTs in the public databases. There are probably two reasons for this. First, many ascidian sequences may be too divergent to show similarity to the sequences registered in public databases such as GenBank, which consist mainly of mammalian genes. Second, it reflects the high complexity of mRNA species in the egg, as is the case for the mammalian preimplantation embryo (Ko et al., 2000).
General classification of the localization of maternal RNAs in early embryos
Whole-mount in situ hybridization was carried out to collect information about the localization and/or expression sites of maternal mRNAs, which is important for understanding the outline of the gene functions in developmental mechanisms. To determine the distribution of the maternal RNAs, the eight-cell embryo was used, because it is easier to identify the orientation along the animal-vegetal and anterior-posterior axes in the eight-cell embryo than in the uncleaved egg. We must emphasize that large-scale expression screening by whole-mount in situ hybridization is extremely effective for gaining a global overview and identifying candidate clones that show particular localization, but is not very suitable for determining detailed expression profiles of the genes. These features are analogous to single-pass sequencing in EST projects. We classified the expression data according to each bilateral blastomere pair at this stage, a4.2 (anterior-animal), b4.1 (posterior-animal), A4.2 (anterior-vegetal) and B4.1 (posterior-vegetal). Furthermore, categories of ‘mitochondria-like’ and ‘postplasm’ were added, based on the initial preliminary results, in which we found many clones that otherwise would be classified into the category B4.1. Mitochondrial large ribosomal RNA (mtlrRNA) encoded by the mitochondrial genome contains a poly (A) tail and is abundant in egg cytoplasm. It is abundant in the myoplasm of the posterior-vegetal region in eggs and embryos of H. roretzi (Oka et al., 1998; Fig. 3A). The postplasm is a small region located at the posterior-most part of the embryo (Sasakura et al., 1998a; Fig. 3D). Photo images of the maternal localization of each clone are available and can also be searched for, according to localization patterns, using MAGEST via the Internet.
As shown in Table 5, among the 2626 clones representing 1206 clusters for which we examined the localization sites at the eight-cell stage, the largest fraction (about 40%: 1047/2626) was distributed ubiquitously in the embryo. The largest category showing localization was mitochondria-like, and contained 494 mitochondrial cDNA clones (Fig. 3A). There were, however, three large localization groups containing heterogeneous populations of clones. The first showed localization in the animal half of the embryo and totaled 43 clusters of the analyzed clones, although this number is approximate because it depends on the threshold employed for scoring localization (Fig. 3B). The second group comprised 58 clusters that were widely distributed throughout the eight-cell embryo except in the B4.1 blastomere pair (Fig. 3C). The third group showed clear localization in the postplasm of the posterior-vegetal blastomere, and totaled 28 clusters, including the previously reported type I and II postplasmic RNAs HrWnt-5, HrPOPK-1, HrZF-1, HrPet-1, HrPet-2 and HrPet-3 (Sasakura et al., 1998a; Sasakura et al., 1998b; Sasakura et al., 2000). Table 6 lists all the clones localized in the postplasm and their molecular identities. The most prevalent in this group was Hrpem, which is a Halocynthia homolog of the gene reported in Ciona savignyi by Yoshida et al. (Yoshida et al., 1996; Fig. 3D). The maternal transcripts of all the genes in this category showed the same distribution as pem throughout embryogenesis: the transcripts were segregated into the posterior-most blastomere pair during the cleavage stage and were partitioned into a pair of endodermal strand cells in the posterior tip of the tail.
As the posterior cytoplasm of the eight-cell embryo possesses multiple functions, such as muscle formation, anterior-posterior axis specification (Nishida, 1996), and generation of the difference in the responsiveness of the notochord and mesenchyme precursor blastomeres (Kim et al., 2000), these postplasmic clones are expected to play some roles in these processes. In addition to these major localization patterns, a few minor localization groups such as the a4.2 blastomere pair (Fig. 3E) and the two anterior blastomere pairs (Fig. 3F) were found.
Contrary to our initial expectation that many RNA species might be localized in each of the cytoplasmic domains such as myoplasm, ectoplasm, endoplasm, chymoplasm, caudal chymoplasm and chorda-neuroplasm (Conklin, 1905b), there were only three major and a few minor localization patterns in the eight-cell embryo, as mentioned above. Therefore, localization of maternal RNAs seems to follow a few simple patterns, and these simple patterns may be enough to generate complicated diversification of embryonic cells during development. However, this apparent simplicity may have resulted because we missed other patterns, owing to the limits of sensitivity and resolution in our in situ experiments, or because the localized activities of the ascidian embryo are not due to RNA localization at this stage, but are rather controlled mainly at the protein level.
Global complexity of gene expression during embryogenesis
It has been reported that about 90% of zygotically active genes are also functional during oogenesis in the fruit fly, nematode and sea urchin (Davidson, 1986). Simultaneously with the analysis of the eight-cell embryos, we investigated the spatial expression patterns of zygotic transcription in the 110-cell and early tailbud embryos using the MAGEST cDNAs as probes. The 110-cell embryo represents the stage just prior to gastrulation, and all of its blastomeres face the outer surface of the embryo. The developmental fates of most of the blastomeres are completely restricted to give rise to a single tissue type at this stage (Nishida, 1987). The 110-cell embryo was therefore used to determine the lineage specificity of gene expression. In the early tailbud embryo, the basic chordate body plan is established. The embryo at this stage was used to determine the tissue specificity of gene expression. To make a compromise between large-scale throughput and the depth of analysis, we had to restrict scoring only to major embryonic tissues and two additional locations: epidermis, adhesive organ, brain, nerve cord, A-line notochord, B-line notochord, muscle, mesenchyme, trunk lateral cell, trunk ventral cell, endoderm, endodermal strand, mitochondria-like and postplasm. All the photo images of the in situ results obtained are accessible and can be searched according to expression patterns using MAGEST through the Internet.
As summarized in Table 7, a substantial proportion (29.3%) of the total clusters, including 420 maternal clones, did not show any staining at any stage, including the eight-cell stage (000 in Table 7). This is mainly because many of the maternal messages belong to the rare abundance class, as mentioned above, for which the expression level is below the limit of detection of the whole-mount in situ hybridization technique. Of the genes analyzed, 7.9% of the clusters seemed to be exclusively maternal (100 and 200 in Table 7), while 40.6% of the clones showed signals in later stages. Almost all possible temporal expression profiles, ranging from 000 to 222, were observed (Table 7). Interestingly, for 14 genes, expression was detected only at the 110-cell stage (010 and 020 in Table 7). For example, one of these clones, 001D02, showed transient expression exclusively in the presumptive notochord cells in the 110-cell embryo (see Fig. 6A).
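The three-digit profile codes used above can be read programmatically. The sketch below assumes that the digits refer, in order, to the eight-cell, 110-cell and early tailbud stages, which is inferred from the examples in the text (100 and 200 for exclusively maternal clones, 010 and 020 for clones detected only at the 110-cell stage). The difference between the scores 1 and 2 is not defined in this section, so any non-zero digit is simply treated as detected signal. The function name and the returned labels are ours.

```python
def classify_profile(code):
    """Classify a three-digit temporal expression code such as '100'.

    Assumed digit order: eight-cell, 110-cell, early tailbud.
    Assumed meaning: 0 = no staining, 1 or 2 = staining detected.
    """
    if len(code) != 3 or any(digit not in "012" for digit in code):
        raise ValueError(f"not a valid profile code: {code!r}")
    eight_cell, cell_110, tailbud = (digit != "0" for digit in code)
    if not (eight_cell or cell_110 or tailbud):
        return "no signal"
    if eight_cell and not (cell_110 or tailbud):
        return "exclusively maternal"
    if cell_110 and not (eight_cell or tailbud):
        return "110-cell only"
    return "signal at later stages"
```

Under these assumptions, `classify_profile("200")` is labeled exclusively maternal and `classify_profile("020")` is labeled as detected only at the 110-cell stage, matching the groupings described in the text.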
Fig. 4 shows an overview of the frequency of gene expression in the different tissues scored in the early tailbud embryos. The largest proportion (31%) of the genes were expressed in the central nervous system (brain and nerve cord), followed by the mesenchyme and the epidermis. It should be noted that a single gene is sometimes expressed in multiple tissues. Table 8 classifies the zygotic expression according to the number and identity of the tissues in which staining was detected. One of the most useful aspects of this kind of project is finding new probes for interesting genes that are expressed in a cell type-specific manner. Of the genes mentioned above, 13.2% of the clones showed tissue-specific expression. Fig. 5 shows examples of such clones. Many genes were identified whose expression pattern is so restricted that they may serve as useful markers of tissue differentiation. Among them were markers for tissues for which no molecular probes were previously available, such as the adhesive organ, a subset of neural cells and trunk lateral cells (Fig. 5B,C,F, respectively).
The complexity of zygotic gene expression at the 110-cell and early tailbud stages is broad, ranging from cell type-specific expression, as mentioned above, to complex expression in multiple, ontogenetically unrelated regions (e.g. 003O15, Fig. 6B). Some genes showed widespread expression in many tissues but lacked expression in one tissue (e.g. 004D21, Fig. 6C). These observations can be attributed to pleiotropic functions of the gene products and/or functions that are independent of tissue type. For example, the zygotic expression of genes whose maternal RNA showed postplasmic localization was detected in a variety of cell types in the early tailbud embryo (Table 6, Fig. 6D-F). These maternal gene products may serve various purposes in the postplasm in the early stages, and the zygotic gene products may function differently in various cell types.
A discussion of the genes showing a variety of interesting expression patterns is largely beyond the scope of this report. However, as one of the fruitful results of the clustering analysis of the expression profiles, we refer to a group of genes that are zygotically expressed in the pair of B7.6 cells in the early tailbud embryo. The endodermal strand is a line of cells located in the ventral midline just beneath the notochord in the tail. These cells are derived from three pairs of blastomeres of the 110-cell embryo, B7.2, b8.17 and B7.6 (Nishida, 1987). The main source of the tissue is B7.2, which contributes most of the endodermal strand cells. There are two additional sources of this tissue. The first is a cell located at the tip of the tissue, which is derived from one of the b8.17 pair, and the second is B7.6, which is the posterior-most blastomere at the 64-cell stage. The bilateral pair of B7.6 cells does not divide further after the 64-cell stage, and gives rise to the two endodermal strand cells in the posterior region of the tail. All these cells are identified as endodermal strand cells by their position and by the expression of endoderm/endodermal strand markers, such as clone 003D22 (Fig. 5H). However, little is known about the cellular differences among the endodermal strand cells, except for their lineal origins. As a localization study has shown that B7.6 carries many maternal products, such as postplasmic RNAs, including vasa RNA (Fujimura and Takamura, 2000), which is a germ-cell marker, these cells are likely to have specific functional characteristics. It was therefore of interest to identify subset-specific zygotic genes by which endodermal strand cells can be distinguished from each other. We found 20 genes that were zygotically expressed in the B7.6 pair, but not in the other cells in the endodermal strand of the tailbud embryo (Table 9).
Maternal mRNAs of most of these genes were not localized at the posterior end, but rather were present ubiquitously during the early stages (Fig. 7A). All these genes had various expression sites other than B7.6: some genes were expressed in neural tissues, some in mesenchyme and others in the epidermis (e.g. 003D19, 003C20, 003H03 in Fig. 7B-D, respectively). Except for the expression in the B7.6 blastomere pair, there was no common feature among these genes in terms of other expression sites or the timing of the onset of zygotic expression. These observations suggest that these genes do not define cell types, but rather that they play pleiotropic roles and/or roles independent of the tissue type. To investigate the characteristics of B7.6 endodermal strand cells, further studies of these genes will be necessary.
We have described the large-scale analysis of the maternal genetic repertoire stored in the ascidian egg in terms of the sequences and localization and/or expression patterns in early development. We analyzed 4240 cDNA clones representing 2629 clusters, according to our definition. Many genes with interesting similarities to known proteins and with clearly discernible expression patterns were identified. These data will also serve as resources for conventional studies to elucidate the molecular mechanisms of development. We therefore next need to study the functions of these genes by testing their physiological relevance during development, using gain- or loss-of-function experiments. In the past decade, several significant advances in technology have allowed us to modify gene expression: gene introduction by microinjection (Hikosaka et al., 1992) and electroporation (Corbo et al., 1997), the suppression of zygotic gene expression by an antisense oligonucleotide (Swalla and Jeffery, 1996) and the overexpression of an exogenous gene by synthetic mRNA injection (Yoshida et al., 1996) have all been applied to ascidian embryology. However, further technical innovation and the development of new methods for suppressing the activities of specific maternal RNAs and proteins are needed in order to assess their functions. Recently, we have been able to deplete specific maternal RNAs using antisense oligonucleotides (Nishida and Sawada, 2001). Extensive studies of the maternal genetic information using loss-of-function techniques of this kind will shed light on the molecular mechanisms of ascidian embryogenesis.
This project will provide a resource for future research that will facilitate the investigation of the maternal factors involved in early developmental events and, consequently, comprehensive studies of the maternal genetic information. Such comprehensive studies cannot be achieved by investigations focused on particular phenomena during embryogenesis. Finally, this study should also facilitate the elucidation of the molecular mechanisms that establish the embryonic body plans of chordates, and lead to further understanding of the evolution from invertebrates to vertebrates.
We thank all members of the Asamushi Marine Biological Station of Tohoku University and of the Otsuchi Marine Research Center of the University of Tokyo for their cooperation. We also thank Prof. Nori Satoh at Kyoto University and Prof. Eric Davidson at California Institute of Technology for the generous use of their facilities, and Dr Susumu Goto at Kyoto University for support in the database management. Computational resources were provided by the Super Computer Laboratory of the Institute for Chemical Research, Kyoto University. This research was supported by Grants-in-Aid from the Ministry of Education, Science, Sports and Culture, Japan (numbers 10168213, 11149212, 12024211, 12202021, 12680714, 13202024 to K. W. M.) and by a grant from the ‘Research for the Future’ program of the Japan Society for the Promotion of Science (96L0040) to K. W. M., T. N. and H. N.