Maternal and zygotic activities of the homeodomain protein PAL-1 specify the identity and maintain the development of the multipotent C blastomere lineage in the C. elegans embryo. To identify PAL-1 regulatory target genes, we used microarrays to compare transcript abundance in wild-type embryos with that in mutant embryos lacking a C blastomere and in mutant embryos with extra C blastomeres. pal-1-dependent C-lineage expression was verified for select candidate target genes by reporter gene analysis, though many of the target genes are expressed in additional lineages as well. The set of validated target genes includes 12 transcription factors, an uncharacterized wingless ligand and five uncharacterized genes. Phenotypic analysis demonstrates that the identified PAL-1 target genes affect specification, differentiation and morphogenesis of C-lineage cells. In particular, we show that cell fate-specific genes (or tissue identity genes) and a posterior HOX gene are activated in lineage-specific fashion. Transcription of targets is initiated in four temporal phases, which together with their spatial expression patterns leads to a model of the regulatory network specified by PAL-1.
Introduction
The C. elegans embryo develops rapidly with an invariant, fully described lineage (Sulston et al., 1983), allowing determination of defined gene expression states in each cell of the developing embryo. As a first step toward this goal, we previously measured temporal profiles of transcript abundance during development with small cohorts of staged whole embryos (Baugh et al., 2003). Because whole embryos are used, expression in different lineages is integrated in the measurement. However, the phenotypes of previously described mutants offer a genetic approach with which to dissect lineage-specific patterns of gene expression from whole-embryo data.
Each of the six founder blastomeres gives rise to different cell types by characteristic patterns of cell division (lineages) (Sulston et al., 1983). Founder blastomere fates are specified by a variety of spatially and temporally restricted maternal gene activities. In addition, the embryo contains a global anteroposterior (AP) patterning system that differentially specifies the daughters of blastomeres dividing on the AP axis (Kaletta et al., 1997; Lin et al., 1998). When gastrulation commences at the 26-cell stage, all of the founder blastomeres have been born and tissue identity begins to be specified, as indicated by the initial expression of cell fate-specific genes, whose functions are required for the development of specific cell types, and increasing resistance of the embryo to cell fate transformations induced by ectopic expression of such genes (Gilleard, 2001; Zhu, 1998). It has been noted that although such fate-specific genes are expressed in multiple lineages (most tissues being polyclonal), the cells expressing them are born at about the same time and in regional domains as if in creation of a tissue or organ primordium (Labouesse and Mango, 1999). From work on the C. elegans pharynx it is clear that organ identity genes control autonomous organ-specific genetic networks (Gaudet and Mango, 2002). However, it remains to be determined how the lineage-based mechanisms in the early embryo work together with the global AP patterning system to pattern tissue and organ identity gene expression, thus causally relating the maternal genetic network that patterns the early embryo with the zygotic networks that pattern later developmental structures.
The fate of the C and D founder blastomeres, the somatic descendants of P2, is specified by the Caudal-like homeobox gene pal-1. Maternal PAL-1 activity is temporally and spatially targeted to the C and D founder blastomeres by first restricting translation of maternal pal-1 mRNA to the descendants of the posterior blastomere P1 (EMS and P2) and then by restricting the activity of the translated protein to the somatic descendants of P2 (C and D) (Hunter and Kenyon, 1996). The KH domain protein MEX-3 is required to restrict translation of maternal pal-1 mRNA to the posterior blastomeres at the four-cell stage (EMS and P2) (Draper et al., 1996; Huang et al., 2002; Hunter and Kenyon, 1996), while the bZIP transcription factor SKN-1 blocks PAL-1 function in EMS, and the zinc-finger protein PIE-1 maintains the germline blastomeres P2 and P3 in a transcriptionally quiescent state so that PAL-1 activity is restricted to their somatic descendants, C and D (Bowerman et al., 1993; Hunter and Kenyon, 1996; Mello et al., 1996; Seydoux et al., 1996).
The C lineage gives rise primarily to muscle and epidermis but also to two neuronal cells and one cell death (Sulston et al., 1983). In the absence of maternal PAL-1 activity, the C and D blastomeres fail to develop in any discernible way, while ectopic PAL-1 activity causes other blastomeres to produce muscle, epidermal and neuronal cells by a C-like lineage (Draper et al., 1996; Hunter and Kenyon, 1996). Although other somatic lineages also give rise to muscle and epidermal cells, the lack of discernible C cell fates in the absence of maternal PAL-1 function (Hunter and Kenyon, 1996) indicates that PAL-1 activates cell fate specification factors (tissue identity genes) in the C lineage.
To learn how maternal PAL-1 activity leads to the patterned specification of multiple cell fates within a single blastomere lineage, we aim to identify the genes directly and indirectly activated by PAL-1 and determine their loss-of-function phenotypes and regulatory interactions. To identify genes expressed in the C lineage by microarray, we have used pie-1 and mex-3 mutations, as well as skn-1 RNAi, to produce embryos that either lack a C blastomere or that contain almost exclusively C-like blastomeres. Our results are verified by reporter gene analysis and complemented by phenotypic analysis of PAL-1 targets, and a model for the regulatory network specified by PAL-1 is presented.
Materials and methods
Microarray sample preparation
Wild-type samples were prepared and described previously (Baugh et al., 2003) but re-hybridized here to a different microarray. Strains JJ532 [pie-1(zu154) unc-25(e156)/qC1[dpy-19(e1259) glp-1(q239)]; III] and JJ518 [mex-3(zu155) dpy-5(e61)/hT1; I] were grown at 15°C either on E. coli OP50 or on HT115 (for RNAi), and Unc and Dpy adults, respectively, were picked and cut for embryo collection. Embryo collection and staging were as described (Baugh et al., 2003) except that embryos were washed in 10 mM NaCl as opposed to water and aged in the lids of 0.6 ml tubes rather than on microscope slides. RNA was extracted and amplified as described (Baugh et al., 2001); for protocol see http://mcb.harvard.edu/hunter/Protocols/protocols.htm.
Microarray hybridization and data reduction
Biotinylated, amplified RNA (1 μg) was hybridized to the Affymetrix C. elegans microarray as described (Baugh et al., 2003). Array data were quantile normalized and reduced by the robust multi-chip average algorithm (RMA) (Irizarry et al., 2003), using the Bioconductor Affy package (version 1.0, www.bioconductor.org) for the R statistics software (version 1.5.0, www.r-project.org). All expression levels reported here were back-transformed to the linear scale, i.e. reported values are 2^RMA. All raw data have been submitted to the Gene Expression Omnibus database, Accession Number GSE2180, and averaged data and analysis are available in the supplementary material.
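The two numerical steps named above, quantile normalization and back-transformation from the log2 scale, can be illustrated with a minimal sketch. This is not the Bioconductor implementation: full RMA also includes background correction and probe-level summarization, which are omitted here. The function names and the toy input are assumptions made for illustration only.

```python
def quantile_normalize(arrays):
    """Force every array to share one reference distribution.

    arrays: list of equal-length lists of log2 intensities, one per microarray.
    Ties are broken by position (a simplification).
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices from lowest to highest
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]  # replace each value by the reference quantile
        normalized.append(out)
    return normalized


def rma_to_linear(log2_value):
    """Back-transform a log2-scale RMA value to the linear scale (2^RMA)."""
    return 2.0 ** log2_value
```

Under this sketch, a log2 value of 3.0 corresponds to 8.0 on the linear scale on which expression levels are reported.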
Clustering of gene expression profiles
Clusters were generated by a modified version of the QT clustering algorithm (Heyer et al., 1999). This algorithm assembles a series of clusters ordered by size with a defined limit on the largest pair-wise distance allowed between any two profiles in a cluster. Distance between profiles is measured as 1-R, where R is the Pearson correlation coefficient. Although we limited this distance to 0.3, some genes are included in clusters simply by chance. To reduce the spurious inclusion of these genes in the final clusters, we systematically re-sampled our data (100 times) with two forms of synthetic noise added at each iteration to generate an average correlation, Ravg. Noise was added to log2 scale RMA expression data, and was generated by a two-component model consisting of an additive Gaussian background with standard deviation 0.2, and a multiplicative Gaussian sampling error with a standard deviation of 0.05. Simulated data were floored at 1 RMA unit. Graphs plotting average expression for each cluster, and the cluster to which each gene belongs, can be found in Fig. S1 and Data S1, S2 in supplementary material.
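The distance measure and noise re-sampling can be sketched as follows. The text does not give the exact form in which the two noise components were combined, so the formula in add_noise (value × (1 + multiplicative error) + additive background) is an assumption, as are all function names. The QT cluster assembly itself is not reproduced.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient R of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # flat profile: R is undefined, treated as 0 in this sketch
    return sxy / math.sqrt(sxx * syy)


def add_noise(profile, rng, sd_additive=0.2, sd_multiplicative=0.05, floor=1.0):
    """Perturb a log2-scale profile (assumed noise form) and floor at 1 RMA unit."""
    return [
        max(floor, v * (1.0 + rng.gauss(0.0, sd_multiplicative)) + rng.gauss(0.0, sd_additive))
        for v in profile
    ]


def average_distance(x, y, n_resamples=100, seed=0):
    """Return 1 - Ravg, where Ravg is the mean R over noisy re-samplings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_resamples):
        total += pearson(add_noise(x, rng), add_noise(y, rng))
    return 1.0 - total / n_resamples


def may_share_cluster(x, y, max_distance=0.3):
    """Two profiles may share a cluster only if their distance is within the limit."""
    return average_distance(x, y) <= max_distance
```

Averaging R over perturbed copies penalizes pairs whose correlation depends on a few low-abundance measurements, which is one way to read the stated aim of reducing chance inclusion.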
ANOVA
Analysis of variance (ANOVA) was performed by using a randomization test to assess differences in expression among genotypes at each time point. Tests were performed to assess, for each gene, overall variation among all three genotypes at each time point, and variation between each of the three possible pairs of genotypes at each time point. All statistical tests were performed on the log2 scale data. Data for the pie-1(zu154) and pie-1(zu154); pal-1(RNAi) genotypes were pooled to form a single group, denoted 'pm'. Sample labels were randomly shuffled 100 times, and for each shuffling, at each time point, a null distribution of F-statistics was computed among all three genotypes, and between each of the three pair-wise combinations of genotypes [N2 versus mex-3(zu155); skn-1(RNAi), N2 versus pm, and mex-3(zu155); skn-1(RNAi) versus pm]. P-values for differential expression among groups were determined by referring the F-statistics from the observed data to the null distribution arising from the random permutations. P-values are not adjusted for multiple testing. The total number of statistical tests per gene was 40 (10 time points with three pair-wise comparisons and 1 overall comparison).
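A randomization test of this kind can be sketched for a single gene at a single time point. The sketch below is illustrative only: how null F-statistics were pooled across genes or time points is not specified here, and the add-one correction in the P-value is a common convention assumed for this example rather than the authors' stated procedure. All function names are hypothetical.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F-statistic for a list of groups of log2 expression values."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(groups, n_shuffles=100, seed=0):
    """Refer the observed F to a null distribution built by shuffling sample labels."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    sizes = [len(g) for g in groups]
    pooled = [v for g in groups for v in g]
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:  # re-split the shuffled values into groups of the original sizes
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction (assumed) avoids reporting a P-value of exactly zero.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)
```

A pair-wise comparison is the same call with two groups, and the overall comparison uses all three genotype groups.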
Target scoring
Clustering, correlation to known target genes and ANOVA were used to score the PAL-1 target potential of each gene. Only genes with maximum expression over time and genotype greater than the median of all genes over time and genotype (transcript abundance of thirteen RMA units) were considered as potential targets. Clusters 24, 50, 60, 140, 187, 141, 168, 85, 105, 130, 131, 144, 177, 195 and 88 were selected as potential target clusters. Because our selection of pal-1 target clusters was subjective, each gene belonging to one of these clusters was given a target score of only 1. We also leveraged our limited prior knowledge to give genes a target score of 1 if they were either one of the ten best correlated with vab-7 over time and genotype or one of the 100 best correlated with cwn-1. vab-7 and cwn-1 had been validated as target genes and are both expressed specifically in the C lineage. The decision to include the 10 and 100 best correlated genes for each was based on inspection of expression patterns.
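The correlation criterion amounts to ranking all genes by the Pearson correlation of their profiles (over time and genotype) with a known target and keeping the top N. A minimal sketch follows; the function names, the gene keys and the profile values are invented for illustration and are not data from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # flat profile: treated as uncorrelated in this sketch
    return sxy / math.sqrt(sxx * syy)


def top_correlated(profiles, known_target, n):
    """Return the n genes whose profiles correlate best with the known target."""
    reference = profiles[known_target]
    candidates = [g for g in profiles if g != known_target]
    candidates.sort(key=lambda g: pearson(profiles[g], reference), reverse=True)
    return candidates[:n]


# Toy profiles (hypothetical values), each concatenated over time and genotype.
toy_profiles = {
    "known_target": [1.0, 2.0, 3.0, 4.0],
    "gene_a": [2.0, 4.0, 6.0, 8.0],
    "gene_b": [4.0, 3.0, 2.0, 1.0],
    "gene_c": [1.0, 3.0, 2.0, 4.0],
}
best_two = top_correlated(toy_profiles, "known_target", 2)
```

Each gene returned would receive a correlation score of 1, with N set to 10 for vab-7 and 100 for cwn-1 as described above.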
The most rigorous approach was model-based and quantitative ANOVA, which was used to assign a target score of 1-5, depending on the P-value assigned to each gene at each time point for the observed differences in expression between the three genotypes. For each gene at each time point it was also determined if it was higher in mex-3(zu155); skn-1(RNAi) than wild type and lower in pie-1(zu154) than wild type. Genes with a P-value less than a given cut-off and appropriate differences between genotypes were noted. With this information, a target score was generated for each gene in two ways and the maximum was kept. The first relies on the gene satisfying both criteria in a pair of adjacent time points. Genes with a P-value below 10^-2 for two adjacent time points were given a score of 1, those below 10^-3 were given a score of 2, those below 10^-4 were given a score of 3, those below 10^-5 were given a score of 4, and those below 10^-6 were given a score of 5. The adjacency requirement excludes late genes that show differences between genotypes in only the last time point, and so the second target score requires the gene to satisfy both criteria in only a single time point. Genes with a P-value below 10^-4 were given a score of 1, those below 10^-5 were given a score of 2, and those below 10^-6 were given a score of 3. A nominal P-value of 10^-4 (or 10^-2 twice) with ∼10,000 genes considered at ten time points with two models (pair of adjacent time points and single time point) should result in about 20 false positives; however, the actual number of false positives should be less given the additional requirement that the genotypes differ in specific ways.
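The two scoring rules can be written compactly. In this sketch the inputs are one P-value per time point and a flag for whether the genotype differences at that time point are in the expected direction; the function names and the input layout are assumptions made for illustration.

```python
PAIR_CUTOFFS = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]  # scores 1 to 5, adjacent time points
SINGLE_CUTOFFS = [1e-4, 1e-5, 1e-6]            # scores 1 to 3, single time point


def score_from_p(p_value, cutoffs):
    """Number of cut-offs that the P-value falls below."""
    return sum(1 for c in cutoffs if p_value < c)


def anova_target_score(p_values, direction_ok):
    """Maximum of the adjacent-pair score and the single-time-point score.

    p_values: one P-value per time point, in temporal order.
    direction_ok: True where expression is higher in mex-3(zu155); skn-1(RNAi)
        and lower in pie-1(zu154) than in wild type.
    """
    single = max(
        (score_from_p(p, SINGLE_CUTOFFS) for p, ok in zip(p_values, direction_ok) if ok),
        default=0,
    )
    pair = 0
    for i in range(len(p_values) - 1):
        if direction_ok[i] and direction_ok[i + 1]:
            # Both time points must pass, so the larger P-value sets the score.
            weakest = max(p_values[i], p_values[i + 1])
            pair = max(pair, score_from_p(weakest, PAIR_CUTOFFS))
    return max(single, pair)


# Rough false-positive estimate from the text:
# nominal P of 1e-4 x ~10,000 genes x 10 time points x 2 models.
expected_false_positives = 1e-4 * 10_000 * 10 * 2
```

The estimate evaluates to about 20, matching the figure quoted above, and is an upper bound in the sense described there because it ignores the direction requirement.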
To assign each gene a final target score, the maximum of the two ANOVA-based scores is added to the score from cluster analysis (1 or 0) and the score from correlation to known targets (1 or 0), producing a maximum score of 7. The target score therefore relies heavily on the ANOVA analysis, with a score of 7 indicating that the gene is in a potential target cluster, correlates well with either of the known targets, and has a P-value below 10^-6 in a pair of adjacent time points. By contrast, genes with a score of 1 are either in a potential target cluster or well correlated with a known target or have a minimally sufficient P-value.
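The combination rule reduces to a single sum, shown here as a small worked sketch. The function and parameter names are hypothetical.

```python
def final_target_score(anova_pair_score, anova_single_score,
                       in_target_cluster, correlated_with_known_target):
    """Combine the three lines of evidence into a score from 0 to 7.

    anova_pair_score: 0 to 5, from pairs of adjacent time points.
    anova_single_score: 0 to 3, from single time points.
    in_target_cluster, correlated_with_known_target: booleans worth 1 point each.
    """
    anova = max(anova_pair_score, anova_single_score)
    return anova + int(in_target_cluster) + int(correlated_with_known_target)
```

A gene with a pair score of 5 that also sits in a selected cluster and correlates with a known target reaches the maximum of 7, whereas any single line of minimal evidence yields 1.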
Reporter analysis
Reporter constructs were made by PCR (Hobert, 2002). 5′ genomic sequence, either 5 kb or up to the next gene, was used as promoter. YFP was from pPD132.112, which includes C. elegans introns, a nuclear localization signal and the unc-54 3′UTR; for additional information see http://www.ciwemb.edu/pages/firelab.html (Fire et al., 1990). Either 4 kb from pRF4 containing a dominant rol-6 (Mello et al., 1991) or a 2.2 kb unc-119 rescue sequence from pDPMM051 (Maduro and Pilgrim, 1995) was included in the final PCR construct as co-transformation marker. The vab-7 reporter (HC14) was made by ligating a 5 kb AgeI/PstI GFP fragment from pPD104.53 (Fire et al., 1990) to a 14 kb AgeI/PstI digest of pJA15 (Ahringer, 1996) to produce pHC16. JK3363 (Mathies et al., 2003) was used rather than our hnd-1 reporter, as our reporter was too dim to score. Either N2 or CB4845 [unc-119(e2498)] was injected with PCR product diluted 10-fold in water as described (Mello et al., 1991); rol-6 plasmid (pRF4) was co-injected at 50 ng/μl where rol-6 was used as co-transformation marker, including HC14. Stable extrachromosomal lines were used for initial reporter analysis, and select reporters were chosen for chromosomal integration by gamma irradiation as described (Inoue et al., 2002). Table S1 in the supplementary material provides strain names and oligonucleotide sequences used.
RNAi
Hairpin RNAi feeding vectors were made as described (Winston et al., 2002) for pal-1, mex-3 and skn-1, and transformed into E. coli HT115; for protocol see http://mcb.harvard.edu/hunter/Protocols/protocols.htm. JJ532 was grown on both OP50 and HT115 expressing double-stranded pal-1 RNA, but no differences were detected by microarray (data not shown) and the data were merged. JJ518 was only grown on HT115 expressing double-stranded skn-1 RNA. For the reporter assay, in addition to being grown on RNAi food, worms were soaked in double-stranded RNA as described (Maeda et al., 2001) (Table 1).
ORF | Gene | Target score | Promoter size (kb) | Wild-type expression pattern | pal-1(RNAi) | mex-3(RNAi) | Target |
---|---|---|---|---|---|---|---|
C28C12.7 | spp-10 | 7 | 1.7 | C lineage | NE | EE | Yes |
F01F1.6 | | 7 | 0.3 | No expression | ND | ND | No |
R02D3.1 | | 7 | 0.8 | C lineage | NE | EE | Yes |
K10B4.6 | cwn-1 | 6 | 1.9 | Posterior C lineage | NE | EE | Yes |
B0304.1 | hlh-1 | 6 | 3.0 | Muscle | NPE | EE | Yes |
C09D4.2 | | 5 | 0.8 | C lineage | NE | EE | Yes |
C46H11.2 | | 5 | 2.1 | C epidermis | NE | EE | Yes |
R07C3.11 | | 5 | 1.1 | Epidermis | LE | EE | Yes |
T22B7.3 | | 5 | 1.8 | C epidermis then more epidermis | NE | EE | Yes |
ZK1307.1 | | 5 | 1.1 | C epidermis | NE | EE | Yes |
C55C2.1 | | 4 | 6.1 | C and D muscle (left side) | NE | EE | Yes |
F20D1.4 | | 4 | 0.4 | No expression | NE | NE | No |
F45E4.2 | plp-1 | 4 | 1.1 | No expression | NE | NE | No |
F23H12.4 | sqt-3 | 4 | 2.0 | C epidermis then more epidermis | EE | EE | Maybe |
ZK829.4 | | 4 | 0.6 | No expression | NE | NE | No |
T07C4.6 | tbx-9 | 3 | 3.8 | C and AB epidermis, MS | NPE | EE | Yes |
C38D4.6 | pal-1 | 3 | 1.4 | C and D lineage | NE | EE | Yes |
C54F6.8 | | 3 | 5.0 | Epidermis | LE | EE | Yes |
D1081.2 | unc-120 | 3 | 5.0 | Muscle | NPE | EE | Yes |
F35D2.3 | | 3 | 5.0 | Many cells | ND | ND | No |
C32F10.5 | hmg-3 | 3 | 5.0 | No expression | NE | NE | No |
C44C10.8 | hnd-1 | 3 | 1.5 | Muscle* | NPE | EE | Yes |
Y75B8A.2 | nob-1 | 3 | 4.9 | Posterior epidermis and posterior E | LE | EE | Yes |
C32E8.8 | ptr-2 | 3 | 3.0 | Too dim to score | EE | EE | No |
T07C4.2 | tbx-8 | 2 | 3.8 | C and AB epidermis, MS | NPE | EE | Yes |
ZK270.1 | ptr-23 | 2 | 5.0 | C epidermis then all epidermis | EE | ND | No |
F35G12.6 | mab-21 | 2 | 4.7 | C and AB epidermis | NPE | EE | Yes |
W09C2.1 | elt-1 | 1 | NA | Epidermis† | Unk | Unk | Probably |
M142.4 | vab-7 | 1 | 13.8 | Posterior C lineage | NE | EE | Yes |
T27B1.2 | | 1 | 5.0 | Epidermis | ND | EE | Maybe |
F57A10.5 | nhr-60 | 1 | 2.0 | Epidermis | ND | EE | Maybe |
F11C1.6 | nhr-25 | 1 | 5.0 | Epidermis‡ | NPE | ND | Yes |
F18A1.2 | lin-26 | 1 | 0.8 | C epidermis then all epidermis | ND | ND | No |
F45E4.9 | hmg-5 | 1 | 0.3 | Late epidermis | ND | ND | No |
K02B9.4 | elt-3 | 1 | 4.9 | Non-seam epidermis | LE | EE | Yes |
C13G5.1 | ceh-16 | 1 | 4.1 | No expression | ND | ND | No |
C30G4.3 | gcy-11 | 1 | 3.2 | No expression | NE | NE | No |
C25G6.5 | | 1 | 4.4 | Too dim to score | EE | ND | No |
W03D2.5 | wrt-5 | 1 | 0.8 | No expression | ND | ND | No |
Y71F9AL.17 | | 1 | 0.4 | Late epidermis | ND | ND | No |
Thirty-nine genes were selected from the list of 308 candidate targets for validation by the assay presented in Fig. 4. Genes were selected either because they have high candidate target scores, they are predicted to be involved in transcription or signaling, or they have known developmental phenotypes.
NE, no expression; NPE, no posterior expression; LE, less expression; EE, ectopic expression; ND, no difference from wild type; Unk, not determined.
Expression in the C and D lineages was not distinguished. Additional, relatively rare expression may not be reported. Results are for early embryo (∼4 hours) only (i.e. where 'No expression' is reported).
*Data for hnd-1 expression were obtained with JK3363 (Mathies et al., 2003).
†We were unable to establish elt-1::yfp reporter lines (consistent with B. Page's rescue experiments) and therefore rely on the published expression pattern (Page et al., 1997).
‡nhr-25 is reported to be expressed in the posterior epidermis but also the endoderm (Asahina et al., 2000); however, we do not see endodermal expression with either the published strain or ours.
3D imaging
Four-cell embryos were collected from cut mothers by mouth pipette and mounted on 2.0% agarose pads. An Olympus IX70 microscope equipped with Nomarski and fluorescence optics and DeltaVision Spectris software was used to image embryos 210 minutes after the four-cell stage (22°C). Generally, about 40 optical sections along the z-axis (ranging from 0.7 to 1.2 μm) were gathered at exposure times ranging from 0.2 to 1.3 seconds. Sections were rendered into three-dimensional images with softWorx Volume Viewer, and projections were generated from maximum intensity voxels every 12° around the anteroposterior axis of the embryo for a total of 30 projections. Dorsal and lateral projections are presented in Fig. 8, and all 30 projections were assembled as a .mov file in Graphic Converter to be viewed as a movie in Quicktime (see Movies 1-13 in supplementary material).
Results
Identification of lineage-specific targets of PAL-1
We have previously analyzed transcript abundance in precisely staged embryos, roughly two time points per cell cycle from the one-cell stage to mid-gastrulation (190-cell stage), using custom synthesized high-density oligonucleotide microarrays (Baugh et al., 2003). Because these wild-type data represent the baseline for all subsequent comparisons, we re-hybridized these published samples to the commercially available microarrays used for our current analysis and also by other members of the C. elegans community. The data are marginally improved by hybridization to the newer microarrays (lower coefficient of variation), and in general they are in very good agreement with their published counterparts. The overall correlation coefficient comparing the previously published data with those resulting from re-hybridization to the commercial microarray is 0.90. The differences observed tend to be small in magnitude and distributed over many genes (data not shown), though there are exceptions (including end-3). Because the precision and density of sampled time points in the wild-type data greatly increased sensitivity (Baugh et al., 2003), we collected multiple replicates at similar time points for the mutant embryos beginning at the four-cell stage (Fig. 1).
Embryos from homozygous pie-1(zu154) mothers lack pal-1-dependent C and D blastomeres because the identity of the P2 blastomere is transformed to that of its somatic sister, EMS (Fig. 1) (Mello et al., 1992). In the absence of the C and D lineages, PAL-1 targets, whether direct or indirect, should have reduced expression in pie-1(zu154) relative to wild type, and genes that are exclusively expressed in the C or D lineage should not be detected. A pal-1 mutant could not be used because maternal activity specifies the C and D fates, and pal-1(null) mutants are zygotic lethal. We chose to use pie-1(zu154) rather than pal-1(RNAi) embryos as the mutation results in higher penetrance than RNAi of pal-1 (data not shown). pie-1(zu154) embryos contain wild-type levels of PAL-1, but PAL-1 is unable to specify C-lineage identity because, in the absence of PIE-1 function, SKN-1 dominantly controls the development of the P2 lineage (Hunter and Kenyon, 1996). To be certain that the PAL-1 protein present in pie-1 mutant embryos does not affect transcript abundance, we measured transcript abundance in parallel in pie-1(zu154); pal-1(RNAi) embryos. The results from the two genotypes were indistinguishable, indicating that PAL-1 is inactive in the pie-1(zu154) embryos (data not shown). The pie-1(zu154) and pie-1(zu154); pal-1(RNAi) replicates are combined in the pie-1(zu154) data we report.
Embryos from homozygous mex-3(zu155) mothers translate pal-1 mRNA throughout the embryo, transforming the eight great-granddaughters of the AB blastomere into C-like blastomeres (Fig. 1) (Draper et al., 1996). These eight anterior blastomeres, born at approximately the same time as the C blastomere, produce eight serially homologous lineages giving rise to the muscle and epidermal cell types characteristic of the C lineage. This transformation requires pal-1 function (Hunter and Kenyon, 1996); thus, expression of PAL-1 targets should be greater in mex-3(zu155) than in wild type. To sensitize our ability to detect PAL-1 targets further, skn-1 RNAi was also used in the mex-3(zu155) background because, in the absence of skn-1 function, PAL-1 is active in the EMS lineage, transforming its daughters into C-like blastomeres (Bowerman et al., 1992; Hunter and Kenyon, 1996).
To test the feasibility of using mutants to deconvolve lineage-specific patterns of gene expression and to develop computational rules to enrich for lineage-specific transcripts, we analyzed a known, skn-1-dependent lineage-specific transcriptional cascade. In contrast to PAL-1 targets (genes regulated directly or indirectly by PAL-1), SKN-1 targets should be more abundant in pie-1(zu154) and less abundant in mex-3(zu155); skn-1(RNAi) (Fig. 1). SKN-1 directly activates its earliest zygotic targets, the redundant GATA transcription factors med-1 and med-2, in the EMS blastomere, initiating a transcriptional cascade resulting in expression of new genes after each cell division (Maduro, 2001; Maduro and Rothman, 2002). med-1 and med-2 activate the expression of two more GATA factors, end-1 and end-3, specifically in the E lineage, which activate another GATA factor, elt-2, that activates the expression of the gut esterase gene ges-1 (Fig. 2). med-1 and med-2 are expressed at too low abundance to detect quantitatively in wild type (Baugh et al., 2003), but they show an expected increase in pie-1(zu154) at 23 minutes (eight-cell stage). end-1 transcripts are about 1.5-fold more abundant in pie-1(zu154) and are not detected following skn-1 RNAi. Although end-3, elt-2 and ges-1 are not detected at increased abundance in pie-1(zu154), none of the three is detected following skn-1 RNAi. These observations indicate that skn-1(RNAi) phenocopies a null mutant with respect to target gene expression. Furthermore, the times of activation and maximum expression of the skn-1 target genes are equivalent in the mutants and wild type, suggesting that the transformed lineage is developing at the same molecular rate as its native sister. However, the fact that only end-1 shows elevated expression in pie-1(zu154) indicates that few target genes will behave as expected and that we must use flexible rules to identify candidate PAL-1 target genes.
To identify candidate PAL-1 target genes, we filtered the data by cluster analysis, a model-directed approach based on analysis of variance (ANOVA) and expression correlation to known targets. In order to maximize identification of true PAL-1 targets, we combined all three approaches, complementing the biases of each. In order to accommodate the resulting high false-positive rate, we assigned a cumulative numerical score from zero to seven for all genes (see Materials and methods) so that strong candidates can be discerned from weak ones. Genes scoring one or higher are considered candidate PAL-1 targets, of which 308 were identified. A score of one indicates that the gene appears to be a target by only one of the three analytical approaches, while a score of seven indicates that a gene appears to be a target by all three approaches. The distribution of scores for candidate target genes is highly skewed, with nearly half of them (146) scoring one and only 17 scoring five or greater (Fig. 3). Hierarchical clustering of expression patterns for the 308 candidate targets shows that expression is generally lower in pie-1(zu154) and greater in mex-3(zu155); skn-1(RNAi), and it reveals distinct patterns across time and genotype (Fig. 3), consistent with PAL-1 targets being expressed in diverse spatial patterns and encoding a variety of developmental functions.
Validation of PAL-1 targets
We expect PAL-1 targets to be expressed in complex expression patterns, some of which may correlate with the C lineage and/or muscle or epidermal cell fates. The summary of expectations given in Fig. 1D is for truly lineage-specific genes. Because of secondary and confounded effects of the homeotic transformations used to identify candidate target genes, true targets may not have received high scores and many false targets likely received low scores. To validate select candidates, we used 'reporter analysis' to determine approximately where each candidate target is expressed and whether that expression is appropriately responsive to a lack [pal-1(RNAi)] or excess [mex-3(RNAi)] of pal-1 activity. For example, pal-1 received a moderate target score of three, and maternal PAL-1 has been suspected to activate the transcription of zygotic pal-1 (Hunter and Kenyon, 1996). We find that a pal-1 promoter::YFP fusion is expressed exclusively in the C and D lineage of early wild-type embryos, but is not detected in pal-1(RNAi) embryos, and is expressed ectopically in the anterior of mex-3(RNAi) embryos (Fig. 4). Because this reporter contains no pal-1-coding sequence, RNAi of pal-1 does not directly affect reporter expression, but only the endogenous gene. This result demonstrates that maternal pal-1 activates zygotic pal-1 expression and suggests that zygotic pal-1 auto-regulates its expression.
We analyzed 39 candidate target genes by reporter analysis in wild type and following pal-1 and mex-3 RNAi (Table 1). Among the 308 unique candidate target genes, all predicted transcription factors or signaling molecules were selected for the reporter assay, as well as a random eight out of 17 high-scoring genes (scored as five or greater). Expression was detected for 31 (79%) of the reporters and 21 (68%) of those were confirmed as targets, with those scoring higher in the cumulative index validating more frequently (Table 2). Temporal expression profiles for each of the validated targets can be found in Fig. S2 in the supplementary material. Failure to be validated does not necessarily mean that the gene is not a target as there are multiple reasons why the reporter assay may result in false negatives. Nevertheless, the validation efficiency suggests that roughly 130 of the 308 candidates would be validated by this reporter assay.
Target score | Number of genes | Number tested by reporter | Expression detected | Validated as target | Validation efficiency |
---|---|---|---|---|---|
5, 6 or 7 | 17 | 10 | 9 | 9 | 90% |
2, 3 or 4 | 145 | 17 | 13 | 9 | 53% |
1 | 146 | 12 | 9 | 3 | 25% |
All | 308 | 39 | 31 | 21 | 54% |
Only genes with a 'yes' in Table 1 are included in the percentage.
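The estimate of roughly 130 validated candidates can be reproduced by extrapolating the validation efficiency of each score class in Table 2 to all genes in that class. The sketch below is one reading that is consistent with the stated figure; the text does not spell out the calculation, so the stratified sum and the function names are assumptions made for illustration.

```python
# Rows of Table 2: (target-score class, genes in class, tested by reporter, validated)
TABLE_2 = [
    ("5, 6 or 7", 17, 10, 9),
    ("2, 3 or 4", 145, 17, 9),
    ("1", 146, 12, 3),
]


def validation_efficiency(tested, validated):
    """Fraction of tested reporters in a score class that validated as targets."""
    return validated / tested


def expected_validated(rows):
    """Extrapolate each class's efficiency to all genes in that class and sum.

    Assumption: tested genes are representative of their score class.
    """
    return sum(
        n_genes * validation_efficiency(tested, validated)
        for _, n_genes, tested, validated in rows
    )


total_genes = sum(n_genes for _, n_genes, _, _ in TABLE_2)
estimate = expected_validated(TABLE_2)

for label, n_genes, tested, validated in TABLE_2:
    eff = validation_efficiency(tested, validated)
    print(f"score {label}: {eff:.0%} of {n_genes} genes -> {n_genes * eff:.1f}")
print(f"expected validated: {estimate:.1f} of {total_genes}")
```

Under this assumption the sum comes to about 129 of 308 genes, in line with the rounded value quoted in the text. Applying the overall 54% efficiency to all 308 genes would instead give a larger number, because the tested set is enriched for high-scoring genes.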
Lineage-based regulation of tissue identity genes
With the exception of intestine and germline, cell fates in the C. elegans embryo are polyclonal (Sulston et al., 1983). Nevertheless, precursor cells for a given tissue or organ are typically born near each other such that a fate map can be drawn for the mid-gastrula embryo. It has been suggested that this regional positioning of tissue and organ precursors enables their coherent development as tissue and organ primordia (Labouesse and Mango, 1999). However, it is not known what mechanisms direct the expression of tissue and organ identity genes in the appropriate cells. It is possible that the position of a cell in the embryo could influence the decision to express a given tissue identity gene (Schnabel, 1996). Alternatively, different cell-autonomous mechanisms could direct the expression of the tissue identity gene in different lineages.
The muscle-specific genes hnd-1, hlh-1 and unc-120 are expressed in muscle precursors in the MS, C and D lineages (Krause et al., 1990; Mathies et al., 2003) (Fig. 5, Fig. 8). Although the expressing cells are derived from three lineages, they are born in the ventral-posterior region of the embryo and within two cell divisions form four longitudinal stripes pre-ordaining the four quadrants of body wall muscle present at hatching (Fig. 5, Fig. 8). To determine if the decision to express these tissue identity genes is controlled in a cell-autonomous fashion, we examined reporter gene expression more carefully following RNAi of pal-1. Consistent with a lineage-based mechanism, pal-1 function is required for expression of hnd-1, hlh-1 and unc-120 reporter genes in the C and D lineages but not in MS (Fig. 5), where we presume an unknown lineage-specific factor is required for their expression.
Regulation of a posterior HOX gene in lineage-specific fashion
pal-1, also known as nob-2, was isolated from the same forward genetic screen as the Abd-B/Hox9-13 ortholog nob-1 (Van Auken et al., 2000). The Nob (no back end) phenotype suggests that the two genes may function in a common pathway to pattern posterior development. Consistent with this hypothesis, a transcriptional reporter for nob-1 is expressed in the extreme posterior of the embryo in cells derived from the AB, E and C lineages (see Figs 7 and 9). This expression pattern suggests regional, as opposed to lineage-based, regulation of nob-1, because regional regulation need only require a single mechanism, while lineage-based regulation would require nob-1 expression to be activated independently in the most posterior descendants of each lineage. We find that expression of a nob-1 reporter within the C lineage requires pal-1 function and that mex-3(RNAi) results in ectopic anterior expression of nob-1::YFP in a relatively small number of cells, consistent with nob-1 being expressed in one-eighth of the C lineage (Fig. 6). These results indicate that nob-1 expression in the C lineage requires lineage-specific pal-1-dependent information, although we cannot rule out the possibility that nob-1 expression requires pal-1-dependent signaling activity. However, given that in pal-1(RNAi) embryos nob-1::YFP is expressed in the AB and E lineages and that in mex-3(RNAi) embryos nob-1::YFP is expressed in sparse pairs of anterior cells, it is unlikely that nob-1 expression is activated by a pal-1-dependent signaling activity.
PAL-1 targets affect specification, differentiation and morphogenesis of C-lineage cells
Candidate PAL-1 targets selected for validation by reporter gene analysis were biased toward potential developmental regulators (Table 3). Twelve transcription factors representing many families, including homeodomain, zinc-finger, GATA, MADS domain, bHLH and T-box, are included, consistent with PAL-1 targets controlling diverse aspects of development. That so many previously characterized transcription factors are among the targets of PAL-1 suggests that pal-1 controls a transcriptional regulatory network.
ORF | Gene | Identification | Published RNAi* | Embryonic lethality (n) | Embryonic function | Larval phenotype |
---|---|---|---|---|---|---|
C28C12.7 | spp-10 | Predicted prosaposin | Wild type | 1% (787) | | |
R02D3.1 | | Dehydrogenase | Wild type | 0% (731) | | |
C09D4.2 | | Uncharacterized | Wild type | 2% (657) | | |
C38D4.6 | pal-1 | Homeobox transcription factor (cad subfamily) | Emb | 100% (1038) | Specification and morphogenesis of C lineage†,‡ | Nob, Vab‡ |
M142.4 | vab-7 | Homeobox transcription factor (even-skipped subfamily) | Bmd | 2% (512) | Morphogenesis of C lineage§ | Nob, Vab§ |
C55C2.1 | | Zinc-finger transcription factor | Wild type | 2% (1032) | | |
B0304.1 | hlh-1 | bHLH transcription factor | Wild type | 9% (716) | Differentiation of muscle¶ | Unc¶ |
D1081.2 | unc-120 | MADS domain transcription factor | Unc | 3% (680) | Differentiation of muscle** | Unc** |
C44C10.8 | hnd-1 | Hand bHLH transcription factor | Bmd | 1% (516) | | Posterior bulges*** |
K10B4.6 | cwn-1 | Putative wingless ligand | NA | 2% (507) | | Rare tail defects††† |
R07C3.11 | | Uncharacterized | Wild type | 1% (699) | | |
C54F6.8 | | Nuclear hormone receptor | Wild type | 2% (875) | | |
F11C1.6 | nhr-25 | Nuclear hormone receptor | Wild type | 1% (718) | | Rare tail defects‡‡ |
W09C2.1 | elt-1 | GATA transcription factor | Emb, Unc | 100% (487) | Specification of epidermis†† | |
K02B9.4 | elt-3 | GATA transcription factor | Wild type | 1% (833) | | |
F35G12.6 | mab-21 | Highly conserved novel protein | Wild type | 2% (711) | | Sqt, male tail defects§§ |
T22B7.3 | | Uncharacterized | Wild type | 1% (657) | | |
C46H11.2 | | Uncharacterized | Wild type | 1% (791) | | |
ZK1307.1 | | Uncharacterized | Wild type | 1% (698) | | |
T07C4.2 | tbx-8 | T-box transcription factor (Brachyury) | Wild type | 3% (502) | | Vab, posterior bulges, tail defects††† |
T07C4.6 | tbx-9 | T-box transcription factor (Brachyury) | Wild type | 1% (520) | | Vab, posterior bulges, tail defects††† |
Y75B8A.2 | nob-1 | Homeodomain transcription factor (posterior Hox paralog) | Emb, Bmd | 36% (761) | Morphogenesis of posterior epidermis¶¶ | Nob, Vab¶¶ |
Protein identification, published phenotypes from a genome-wide RNAi screen, embryonic lethality following RNAi, and known embryonic functions and larval phenotypes are summarized.
Emb, embryonic lethal; Bmd, body morphology defective; Unc, uncoordinated; Nob, no back end; Vab, variably abnormal morphogenesis; Sqt, squat.
*Published RNAi results are shown for Emb, Bmd and Unc only (wild type relates only to those three phenotypes).
This work (data not shown)
To begin to dissect the function and regulatory relationships among the validated PAL-1 targets, we measured embryonic lethality following RNAi by soaking. Given the pleiotropy and severity of pal-1 loss-of-function phenotypes, we were surprised that RNAi of only two of the 22 genes resulted in 100% embryonic lethality (pal-1 and elt-1), while another two resulted in significant lethality (hlh-1 and nob-1) (Table 3). The embryonic function of these four genes has been described in the literature, and they affect specification, differentiation or morphogenesis of C-lineage cells. Zygotic pal-1 function is required for proper morphogenesis of the C lineage (Edgar et al., 2001), elt-1 is required for specification of the epidermis (Page et al., 1997), hlh-1 is required for differentiation of the muscle (Chen et al., 1994) and nob-1 is required for morphogenesis of the posterior epidermis (Van Auken et al., 2000). RNAi of several of the other genes produced post-embryonic phenotypes indicative of a function in posterior morphogenesis. For example, tail defects in larvae lacking vab-7, pal-1, nob-1, tbx-8 and tbx-9 function have been shown to result from improper morphogenesis of C-lineage cells (Ahringer, 1996; Edgar et al., 2001; Pocock et al., 2004; Van Auken et al., 2000), and the rare larval defects seen following loss of function of hnd-1, nhr-25 and cwn-1 are also likely to result in part from improper morphogenesis of C-lineage cells (Table 3). Although additional subtle phenotypes may be detected in more sensitive phenotypic assays, the limited number of penetrant phenotypes following RNAi may be attributable to ineffective depletion by RNAi, functional redundancy of individual genes or network-level compensatory mechanisms.
Predicting regulatory network structure from expression
To model the regulatory network specified by PAL-1, we focused on the expression patterns of validated targets most likely to directly affect gene expression. This includes 12 transcription factors, an uncharacterized wingless ligand (cwn-1) and a conserved novel protein known to affect cell fate decisions in the male tail (mab-21). Global analysis of transcript abundance in wild-type embryos has shown that most temporally modulated embryonic genes are expressed transiently, with abundance increasing for one cell cycle and decreasing for one after that (Baugh et al., 2003), consistent with the timing of gene expression during specification of endodermal cell fate (Maduro and Rothman, 2002). The first three divisions of the C blastomere are asymmetric in either cell fate or gene expression (Ahringer, 1996; Sulston et al., 1983), suggesting that the PAL-1 network may also operate on a one-cell-cycle time scale. Consistent with this expectation, transcription of these target genes is activated in one of four temporal phases, each a cell cycle apart (Fig. 7). Twelve out of the 14 selected targets are expressed in two waves corresponding to the 4C-cell stage (phase II) and the 8C-cell stage (phase III). mab-21 is the only phase IV gene among the 14, and pal-1 is considered phase I because its zygotic transcripts are first detected by in situ hybridization at the 2C-cell stage in Ca and Cp (Hunter and Kenyon, 1996).
To generate a working model of the regulatory network, we began with the simple notion that targets from one phase regulate targets of the next, with transcription factors regulating targets expressed in the same cells and signaling molecules regulating targets in cells adjacent to those in which each is expressed. As YFP expression perdures, we compared spatial expression patterns of each target at the end of phase IV, capturing the spatial expression patterns initiated earlier. Fig. 8 shows dorsal and lateral views of volume-rendered images of transcriptional reporters (see also Movies 1-13 in the supplementary material). The expression patterns are summarized in Table 4, and are consistent with published descriptions. hnd-1, hlh-1 and unc-120 are expressed in all of the embryonic myoblasts, but hnd-1 is phase II and hlh-1 and unc-120 are phase III, suggesting that hnd-1 is a positive regulator of hlh-1 and unc-120. We have extended this logic, combining spatial expression with quantitative temporal data to identify the best candidate upstream activator for each of the 14 target genes (Table 4). This set of predictions provides a first draft model of the transcriptional network specified by PAL-1 in the C lineage (Fig. 9). Expression of all 14 targets is by definition dependent on pal-1 function, and we presume that both maternal and zygotic PAL-1 act directly on phase II targets, but we do not know if PAL-1 acts directly on phase III or IV targets. The model predicts which phase II targets activate which phase III targets, as well as the activation of the phase IV target mab-21 by the phase III target elt-3, and it can easily be extended for other validated targets.
Gene | Temporal phase | Spatial expression pattern | Supporting data | Refs | Best candidate upstream activator |
---|---|---|---|---|---|
pal-1 | I | 16 C, 4 D | Reporter, antibody, in situ hybridization | Edgar et al., 2001; Hunter and Kenyon, 1996 | pal-1 |
cwn-1 | II | C and D muscle, 2 posterior C ectoderm | | | pal-1 |
C55C2.1 | II | Left posterior muscle (C and D), unidentified ventral cells | | | pal-1 |
hnd-1 | II | All muscle (including 8 C muscle, 4 D muscle) | Reporter | Mathies et al., 2003 | pal-1 |
tbx-8 | II | Dorsal epidermis, unidentified lateral cells (including 8 C ectoderm) | Reporter | Pocock et al., 2004; Andachi, 2004 | pal-1 |
tbx-9 | II | Dorsal epidermis, unidentified lateral cells (including 8 C ectoderm) | Reporter, in situ hybridization | Pocock et al., 2004; Andachi, 2004 | pal-1 |
elt-1 | II | All major epidermis | Antibody | Page et al., 1997 | pal-1 |
vab-7 | III | 8 C: four posterior muscle, four posterior ectoderm | Reporter, in situ hybridization | Ahringer, 1996 | cwn-1 |
unc-120 | III | All muscle (including 8 C muscle, 4 D muscle) | Reporter | Waterston (unpublished) | hnd-1 |
hlh-1 | III | All muscle (including 8 C muscle, 4 D muscle)* | Reporter, antibody, in situ hybridization | Krause et al., 1990; Seydoux and Fire, 1994 | hnd-1 |
elt-3 | III | Non-seam major epidermis (including 8 C ectoderm) | Reporter | Gilleard et al., 1999 | elt-1 |
nhr-25 | III | Dorsal epidermis (including 8 C ectoderm) | Reporter | Asahina et al., 2000 | elt-1 |
nob-1 | III | Posterior endoderm (E) and ectoderm – mostly non-C | In situ hybridization | Kohara (unpublished) | cwn-1 |
mab-21 | IV | Dorsal-most epidermis (including 8 C ectoderm) | Reporter | Ho et al., 2001 | elt-3 |
Temporal phase and spatial expression pattern are summarized for 14 target genes likely to be involved in patterning. The best candidate upstream activator for each gene is the gene from the previous temporal phase with the most similar spatial expression pattern. References with supporting data measured by a variety of techniques are included where available. All reporter expression patterns are consistent with published data. *Expression is seen only in posterior muscle (C and D), where it is initiated (see Fig. 7).
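The selection rule stated in the Table 4 legend (choose the gene from the previous temporal phase with the most similar spatial expression pattern) can be sketched as a small similarity search. The text does not define how similarity was scored, so the Jaccard index used here is an assumption, and the tissue labels in `PATTERNS` are coarse, hypothetical annotations loosely paraphrased from Table 4 for illustration, not measured data.

```python
def jaccard(a, b):
    """Jaccard index of two sets: shared elements divided by all elements."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_upstream(target, candidates, patterns):
    """Return the candidate whose expression pattern is most similar to the target's."""
    return max(candidates, key=lambda gene: jaccard(patterns[gene], patterns[target]))


# Hypothetical coarse annotations (illustration only, not data from the study).
PATTERNS = {
    # phase II genes
    "hnd-1": {"C muscle", "D muscle", "MS muscle"},
    "elt-1": {"C epidermis", "non-C epidermis", "seam"},
    # phase III genes
    "hlh-1": {"C muscle", "D muscle", "MS muscle"},
    "elt-3": {"C epidermis", "non-C epidermis"},
}

PHASE_II = ["hnd-1", "elt-1"]

for target in ("hlh-1", "elt-3"):
    print(target, "<-", best_upstream(target, PHASE_II, PATTERNS))
```

With these toy annotations the muscle gene is paired with hnd-1 and the epidermal gene with elt-1, matching the corresponding rows of Table 4. A cell-by-cell comparison of reporter images would be needed to apply the rule in earnest.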
Discussion
We have shown that lineage-specific patterns of gene expression can be deconvolved from microarray data collected from whole C. elegans embryos by analyzing mutants with transformed lineage identity (Figs 2, 3). This approach can be extended with available mutants to identify genes specifically expressed in each of the somatic founder lineages of the embryo. Although reporter genes will eventually be made for the entire genome, scoring all of their embryonic expression patterns will be difficult. By contrast, microarrays can be used to predict the early embryonic expression pattern for each gene in the genome and reporters can be used for validation and refinement.
The identification of PAL-1 targets confirms previous phenotypic analysis (Hunter and Kenyon, 1996) by demonstrating that PAL-1 controls development of the C lineage, including specification of multiple cell types. PAL-1 activates expression of target genes responsible for various developmental functions such that their disruption phenocopies different aspects of the pal-1 mutant phenotype (Table 3). Although many of the target genes function in the descendants of additional founder blastomeres, their expression in the C lineage depends on PAL-1 (Table 1). We specifically show that three muscle-specific genes require pal-1 function for C-lineage expression (Fig. 5) and find the same to be true for epidermis-specific expression (Table 1). If specification of cell fate were controlled by regionalizing influences as opposed to lineage-based mechanisms, we would not expect expression of fate-specific genes to depend on pal-1 function (unless pal-1 were required to respond to regionalizing influences). Furthermore, expression of the posterior HOX gene nob-1 is complex with respect to lineage but simple with respect to region, yet we show that nob-1::YFP expression in the C lineage requires PAL-1.
We propose a model for the structure of the network specified by PAL-1 based only on temporal and spatial expression patterns in wild type (Fig. 9). The rationale behind the model is simple: genes activated in one cell cycle affect the expression of genes expressed in the next cell cycle. This premise is supported by both functional analysis of the endodermal network (Maduro and Rothman, 2002) and global analysis of expression dynamics (Baugh et al., 2003). Furthermore, whereas zygotic pal-1 transcripts are first detected at the 2C-cell stage in Ca and Cp (phase I), loss of zygotic pal-1 function results in a detectable mutant phenotype in their daughters at the 4C-cell stage (Edgar et al., 2001; Hunter and Kenyon, 1996). In addition, protein for the phase II gene elt-1 is first detected at the end of the 4C-cell stage, and transcription of its confirmed phase III target elt-3 begins in the 8C-cell stage (Fig. 7) (Gilleard and McGhee, 2001; Page et al., 1997).
That ectopic PAL-1 activity in early blastomeres is sufficient to cause complete transformation of one lineage into another indicates that the regulatory network specified by PAL-1 is modular or self-contained (Draper et al., 1996; Hunter and Kenyon, 1996). After maternal PAL-1 specifies the C lineage, embryonically expressed PAL-1 is required for C-lineage development (Edgar et al., 2001). We therefore hypothesize that PAL-1 continuously regulates target genes during patterning of the C lineage, as opposed to simply initiating a transcriptional cascade. Although it is not known how far into development PAL-1 function is required, phenotypically mutant pal-1 mosaic animals were recovered corresponding to loss of pal-1 in one Cxx cell at the 4C-cell stage, and PAL-1 expression is detectable in the C lineage until the 16C-cell stage (Edgar et al., 2001), leaving open the possibility that PAL-1 directly activates each of the target genes identified here. Combinatorial control of gene expression, where early targets regulate late targets in combination with PAL-1, offers one possible mechanism for the timing of gene expression within this modular network (Mangan and Alon, 2003; Penn et al., 2004).
There must be additional regulation not predicted by our model. Genes that are not PAL-1 targets are likely to participate in transcriptional regulation and patterning of the C lineage. For example, the Homothorax ortholog unc-62 and the Extradenticle homologs ceh-20 and ceh-40 have superficially similar phenotypes to nob-1 and pal-1, suggesting that these co-factor homeodomain proteins interact with and modify the function of PAL-1 and NOB-1 (Van Auken et al., 2002). Likewise, the Tcf/Lef factor pop-1 is thought to mediate cell-fate decisions associated with every cell division on the AP axis of the early embryo (Lin et al., 1998), and, as has been shown for development of the E lineage (Calvo et al., 2001; Maduro et al., 2002), we expect POP-1 to contribute to patterning of PAL-1 target expression, in particular where targets are expressed in only the anterior or posterior daughters following a round of C cell divisions (e.g. hlh-1, elt-1 and vab-7). Repression is completely ignored in the current model, but is probably crucial for patterning, as indicated by the fact that very few targets are expressed in all PAL-1-expressing cells. There may therefore also be genes repressed by PAL-1. We have not allowed for genes of the same temporal phase to regulate each other, though it is likely that there is mutual repression between genes specifying muscle and epidermis, leading to insulation of the two states. In addition, genes of the same temporal phase expressed in the same cells may activate the expression of one another, and we imagine multiple auto-regulatory positive feedbacks in addition to the one demonstrated for pal-1. It will be interesting to compare the structures of different developmental regulatory networks in an effort to understand better how different topological motifs contribute to the functional properties of the regulatory network (Milo et al., 2002; Shen-Orr et al., 2002) and ultimately how network structure relates to body plan.
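The premise that each gene is activated by a gene from the preceding cell cycle can be checked mechanically against the predictions listed in Table 4. The sketch below encodes the temporal phase and best candidate upstream activator of each of the 14 genes as given in that table and confirms that every predicted edge spans exactly one phase. The variable and function names are illustrative, and the pal-1 self-edge is treated as the auto-regulatory exception.

```python
from collections import defaultdict

PHASE_NUMBER = {"I": 1, "II": 2, "III": 3, "IV": 4}

# gene -> (temporal phase, best candidate upstream activator), as listed in Table 4
TABLE_4 = {
    "pal-1": ("I", "pal-1"),
    "cwn-1": ("II", "pal-1"),
    "C55C2.1": ("II", "pal-1"),
    "hnd-1": ("II", "pal-1"),
    "tbx-8": ("II", "pal-1"),
    "tbx-9": ("II", "pal-1"),
    "elt-1": ("II", "pal-1"),
    "vab-7": ("III", "cwn-1"),
    "unc-120": ("III", "hnd-1"),
    "hlh-1": ("III", "hnd-1"),
    "elt-3": ("III", "elt-1"),
    "nhr-25": ("III", "elt-1"),
    "nob-1": ("III", "cwn-1"),
    "mab-21": ("IV", "elt-3"),
}


def phase_gap(gene):
    """Number of temporal phases between a gene and its predicted activator."""
    phase, activator = TABLE_4[gene]
    return PHASE_NUMBER[phase] - PHASE_NUMBER[TABLE_4[activator][0]]


# Adjacency list of the draft network: activator -> predicted targets.
targets_of = defaultdict(list)
for gene, (_, activator) in TABLE_4.items():
    targets_of[activator].append(gene)

# Edges that do not connect consecutive phases (the pal-1 self-edge is exempt).
violations = [
    gene
    for gene, (_, activator) in TABLE_4.items()
    if gene != activator and phase_gap(gene) != 1
]

for activator, targets in targets_of.items():
    print(f"{activator} -> {', '.join(targets)}")
print("edges not spanning exactly one phase:", violations)
```

The empty list of violations simply restates how the table was constructed; the same structure could be used to test whether additional validated targets fit the one-cell-cycle rule.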
Acknowledgements
Strain JK3363 was kindly provided by Judith Kimble. We thank Shai Shen-Orr for bioinformatics support. This work was funded by NIH grant GM64429 to C.P.H.