Patterning of the facial skeleton involves the precise deployment of thousands of genes in distinct regions of the pharyngeal arches. Despite the significance for craniofacial development, how genetic programs drive this regionalization remains incompletely understood. Here we use combinatorial labeling of zebrafish cranial neural crest-derived cells (CNCCs) to define global gene expression along the dorsoventral axis of the developing arches. Intersection of region-specific transcriptomes with expression changes in response to signaling perturbations demonstrates complex roles for Endothelin 1 (Edn1) signaling in the intermediate joint-forming region, yet a surprisingly minor role in ventralmost regions. Analysis of co-variance across multiple sequencing experiments further reveals clusters of co-regulated genes, with in situ hybridization confirming the domain-specific expression of novel genes. We then created loss-of-function alleles for 12 genes and uncovered antagonistic functions of two new Edn1 targets, follistatin a (fsta) and emx2, in regulating cartilaginous joints in the hyoid arch. Our unbiased discovery and functional analysis of genes with regional expression in zebrafish arch CNCCs reveals complex regulation by Edn1 and points to novel candidates for craniofacial disorders.
The vertebrate facial skeleton is generated from cranial neural crest-derived cells (CNCCs) that populate a series of pharyngeal arches (Platt, 1893; Schilling and Kimmel, 1994). Signaling from endodermal and ectodermal epithelia, as well as from CNCCs themselves, establishes nested patterns of gene expression in arch CNCCs, in particular along the dorsoventral axis (Medeiros and Crump, 2012; Mork and Crump, 2015). CNCCs then progressively adopt a number of fates, including cartilage, bone, and ligament (Bronner and LeDouarin, 2012), with a subset of cells remaining as progenitors for later differentiation and possibly adult repair (Paul et al., 2016). The shapes and functions of distinct facial regions are inextricably tied to the selection of these cell fates and the subsequent growth and rearrangements of skeletal cells (Kimmel et al., 1998). The earliest fate adopted by arch CNCCs is cartilage, which occurs first in ventral-intermediate arch regions and then spreads to ventral and dorsal poles (Barske et al., 2016). Domain-specific differences in cartilage versus bone fates are likely to contribute to region-specific skeletal morphologies. In dorsal and intermediate domains, early cartilage differentiation must be actively suppressed to ensure proper formation of joints and later-forming intramembranous bones (Askary et al., 2015; Nichols et al., 2016). Identifying the molecular differences that prefigure regional cell fate choices and behaviors is therefore key to unraveling how the facial skeleton is assembled.
Candidate-based approaches, as well as forward genetic screens in zebrafish (Piotrowski et al., 1996; Schilling et al., 1996), have identified key members of craniofacial signaling pathways and their downstream targets (Minoux and Rijli, 2010). Edn1 signaling is required for gene expression and subsequent skeletal patterning in the intermediate and ventral-intermediate regions of the arches, including the joint-forming domain (Kurihara et al., 1994; Clouthier et al., 1998; Miller et al., 2000; Ozeki et al., 2004; Sato et al., 2008; Gordon et al., 2013). The Bmp pathway has an overlapping function in patterning the lower face (Tucker et al., 1998; Bonilla-Claudio et al., 2012), although it appears to be preferentially required for gene expression in ventralmost arch regions (Alexander et al., 2011; Zuniga et al., 2011). By contrast, Jagged-Notch signaling is required to pattern dorsal arch CNCCs, at least in the hyoid and posterior mandibular arches of zebrafish (Zuniga et al., 2010; Barske et al., 2016). Downstream targets have also been identified, including Edn1 activation and Jagged-Notch inhibition of the Dlx3/4/5/6 family in ventral-intermediate CNCCs (Beverdam et al., 2002; Depew et al., 2002; Talbot et al., 2010; Zuniga et al., 2010), and Edn1 and Bmp regulation of Hand2 in ventral CNCCs (Thomas et al., 1998; Miller et al., 2003; Yanagisawa et al., 2003; Zuniga et al., 2011; Bonilla-Claudio et al., 2012). However, the extent to which Edn1 and Notch globally regulate dorsoventral gene expression remains incompletely understood.
Recently, genome-wide expression profiling in mice has identified stage-specific expression signatures of craniofacial compartments, such as the mandibular, maxillary and frontonasal prominences (Feng et al., 2009; Fujita et al., 2013; Brunskill et al., 2014; Hooper et al., 2017). Similar studies have revealed genes regulated by Bmp4 (Bonilla-Claudio et al., 2012) and Dlx5/6 (Jeong et al., 2008). These studies largely relied on dissection of facial prominences rather than purification of the arch CNCCs that generate the facial skeleton. As the arches consist of not only CNCCs, but also endodermal and ectodermal epithelia and mesodermal cores, whether the identified genes were expressed in CNCCs was not always clear.
In the current study, we use the nested expression of hand2:GFP and dlx5a:GFP transgenes along the dorsoventral axis to identify genes with domain-specific expression in CNCCs of the zebrafish mandibular and hyoid arches. By combining this domain-specific profiling with effects of altered signaling on arch CNCCs (Barske et al., 2016), we demonstrate global roles for Edn1 and Jagged-Notch signaling in establishing intermediate/ventral-intermediate and dorsal arch gene expression, respectively, yet only a minor role for Edn1 in the ventralmost arches. We then used gene editing to test the requirements for 12 previously uncharacterized domain-specific genes and found opposing requirements for two new Edn1 targets, fsta and emx2, in coordinating skeletal development in the intermediate hyoid arch. Whereas fsta inhibits cartilage differentiation in the developing hyoid joint, emx2 promotes cartilage differentiation at the connection points between individual hyoid cartilages. Thus, in addition to providing a global description of dorsoventral gene expression in arch CNCCs, these findings uncover a complex role for Edn1 in balancing skeletal differentiation in the intermediate arches.
Generation of domain-specific arch transcriptomes by combinatorial transgene labeling
We have previously reported using dual labeling by sox10:DsRed and fli1a:GFP transgenes to purify all arch CNCCs from embryos at 20, 28, and 36 h post-fertilization (hpf), followed by mRNA isolation, cDNA library construction and deep sequencing (Barske et al., 2016). Here, we performed two additional replicates at 36 hpf to better define a minimum set of 472 arch CNCC-enriched genes. Next, we took advantage of the nested patterns of hand2:GFP and dlx5a:GFP transgenes to isolate distinct subsets of arch CNCCs along the dorsoventral axis at 36 hpf, followed by RNA sequencing (RNAseq) (Fig. 1A). Whereas fluorescence activated cell sorting (FACS) of hand2:GFP+; sox10:DsRed+ cells enriches for the ventralmost CNCCs of the arches, FACS of dlx5a:GFP+; sox10:DsRed+ cells enriches for a broader domain of ventral to intermediate arch CNCCs (and the otic vesicle).
We then compared how RNAseq data from different experiments align to the zebrafish genome. In our analysis, we included new hand2:GFP and dlx5a:GFP samples, as well as the previously described sox10:DsRed+; fli1a:GFP+ cells from wild types, edn1 mutants, jag1b mutants, and embryos with elevated Edn1 (hsp70l:Gal4; UAS:Edn1) and Notch (hsp70l:Gal4; UAS:Notch1a-ICD) signaling. For each experiment, the majority of reads (∼80-90%) could be aligned to a unique position in the zebrafish genome, and the percentage of uniquely aligned reads was independent of the total number of reads for each particular sample (Fig. 1B). The uniformly high percentage of uniquely aligned reads is a positive indication of the quality of the RNAseq data.
Transcriptomic analysis reveals genes differentially enriched in hand2:GFP and dlx5a:GFP domains
In order to compare gene expression levels between samples, read counts were normalized to yield transcript per million values (TPMs). We identified the set of 472 genes enriched in arch CNCCs by filtering for genes with average TPMs greater than 2 across the three wild-type fli1a:GFP+; sox10:DsRed+ samples at 36 hpf, as well as for those enriched 1.5-fold or higher in double-positive versus single-positive cells (i.e. non-CNCC) (Tables S4 and S5). We then took advantage of the relative levels of hand2:GFP and dlx5a:GFP to subdivide arch CNCCs into four dorsoventral domains (Fig. 1C). As previously described, hand2:GFP displays graded expression from strong in ventralmost domains to weaker in more ventral-intermediate regions, and dlx5a:GFP transitions from strong in ventral and intermediate regions to weaker in dorsal domains (Medeiros and Crump, 2012). We therefore binned genes into clusters by comparing their relative expression in cells sorted with different transgenes: hand2:GFPhigh; dlx5a:GFPhigh (ʻventral', 15 genes), hand2:GFPlow; dlx5a:GFPhigh (ʻventral-intermediate', 22 genes), hand2:GFP−; dlx5a:GFPhigh (ʻintermediate', 16 genes), and hand2:GFP−; dlx5a:GFPlow (ʻdorsal', 30 genes) (see Fig. 1C, Table 1 for cut-off values). Confirming the validity of this filtering strategy, each dorsoventral cluster includes several genes with known expression in that domain (Table S1). For example, the ventral cluster includes endogenous hand2 (Miller et al., 2003) and homologs of genes known to be expressed in the ventral/distal domains of the murine arches (foxf1 and foxf2a) (Jeong et al., 2004); the ventral-intermediate cluster includes endogenous dlx5a, as well as dlx3b, dlx4b and msxe (msx1a) (Miller et al., 2000); the intermediate cluster includes grem2b (Zuniga et al., 2011) and genes required for joints such as irx7 (Askary et al., 2015) and nkx3.2 (Miller et al., 2003); and the dorsal cluster includes jag1b (Zuniga et al., 2010) and homologs of murine genes with dorsal arch expression (pou3f3a and pou3f3b) (Jeong et al., 2008).
In situ validation of novel domain-specific gene expression in the arches
Given the inclusion of known genes with correctly predicted dorsoventral expression, we sought to validate the expression of uncharacterized genes in each domain-specific list. To do so, we conducted fluorescent in situ hybridization in 36 hpf embryos, along with sox10:GFPCAAX (membrane GFP) in a second color to highlight all arch CNCCs.
Of the eight predicted ventral genes tested, all showed some expression in the ventral mandibular or hyoid arches, yet their expression patterns were distinct (Fig. 2A). Similar to hand2 (Miller et al., 2000), we observed expression of foxf1, foxf2a, fzd9b, smad6a and skp2 in the ventralmost CNCCs of both arches, with pitx1 showing more limited ventral expression in only the mandibular arch. By contrast, sema3bl and twist1b were expressed in the ventral domain and also, to a lesser degree, in subsets of dorsal arch CNCCs, reminiscent of the published expression pattern of barx1 (Nichols et al., 2013; Barske et al., 2016), another gene on our ventral list. The weaker expression of these genes in dorsal relative to ventral arch CNCCs, together with their lower levels in the intermediate domain, is likely to explain why they were enriched in the hand2:GFP dataset and classified as ventral genes by our filtering scheme.
Among the predicted ventral-intermediate genes, two out of five tested showed expression within the ventral-intermediate domain (fgfbp2a and shox), one displayed both ventral-intermediate and some dorsal expression (stmn1a), one was expressed in the ventral mandibular arch and dorsal posterior hyoid arch cells (tmem107l), and one in intermediate regions (her6) (Fig. 2B). The ventral-intermediate list also contains several known genes with apparent arch-wide expression [hoxa2b and hoxb2a (Hunter and Prince, 2002) and dlx1a (Sperber et al., 2008)] or ventral and dorsal expression domains (gsc) (Miller et al., 2000), although a closer examination of these previous reports suggests higher ventral-intermediate expression for these genes at 36 hpf.
Of the four previously characterized genes on this list, three are expressed in the intermediate joint-forming region and required for joints (irx7, grem2b and nkx3.2) (Miller et al., 2003; Zuniga et al., 2011; Askary et al., 2015). The inclusion of dlx4a might reflect the broader expression of Dlx3-6 genes in both ventral-intermediate and intermediate domains (Talbot et al., 2010), although other members of this family were filtered into the ventral-intermediate category. All five newly tested genes showed highly specific expression in the intermediate domain, including fsta, igfbp5b, ctgfb and homologs of mouse genes expressed in intermediate arch regions – emx2 (Compagnucci et al., 2013) and foxd1 (Jeong et al., 2004) (Fig. 2C). Interestingly, fsta and ctgfb were largely restricted to the hyoid arch and igfbp5b to the mandibular arch.
We found six of nine predicted dorsal genes to be enriched in the dorsal mandibular and hyoid arches: cadherin 11 (cdh11), pou3f3a and pou3f3b [homologs of mouse Pou3f3 with dorsal expression (Jeong et al., 2008)], cd248a, kctd15a (see also Gharbi et al., 2012) and postnb (Fig. 2D). The other three genes were excluded from the dlx5a:GFP and hand2:GFP expression domains, as predicted, but they did not present typical ʻdorsal' expression patterns: sfrp2 showed dorsal-specific expression but in mesoderm, rassf10a was largely confined to the surface ectoderm rather than arch CNCCs, and osr2 was expressed in a few cells between the mandibular arch and the eye (Swartz et al., 2011). Although not tested here, previous reports also suggest dorsal-enriched expression of ednraa from 28-36 hpf (Nair et al., 2007; Zuniga et al., 2010) and expression of gata3 in the more anterior maxillary prominence (Sheehan-Rooney et al., 2013b).
Distinct regulation of domain-specific genes by Edn1 and Jagged-Notch signaling
Given their major roles in dorsoventral arch patterning, we next tested how Edn1 and Jagged-Notch signaling regulate bulk expression of genes in the dorsoventral domains defined by our RNAseq analysis. We intersected our previously published RNAseq data of gain or loss of Edn1 or Jagged-Notch signaling in arch CNCCs (Barske et al., 2016) with domain-specific genes identified based on enrichment in sorted hand2:GFP and dlx5a:GFP cells. We found that the expression of intermediate and ventral-intermediate genes, but not ventral and dorsal genes, was significantly reduced in edn1−/− embryos (Fig. 3A) and increased upon Edn1 misexpression (Fig. 3B). In particular, intermediate genes were most affected by perturbation of Edn1 signaling, consistent with the sensitivity of intermediate skeletal elements (joints, symplectic, palatoquadrate) to partial reduction of Edn1 signaling in zebrafish (Miller and Kimmel, 2001). By contrast, loss of Jagged-Notch signaling in jag1b−/− embryos resulted in a downregulation of only dorsal genes (Fig. 3C), and gain of Notch signaling resulted in a downregulation of ventral, ventral-intermediate and intermediate genes (Fig. 3D). Gain of Notch signaling also showed a trend towards increasing the expression of dorsal genes (Bonferroni corrected P=0.18).
We next used in situ hybridization to confirm the predicted regulation of a subset of genes by Edn1 and Jagged-Notch signaling (Fig. 4). Ventral-intermediate genes fgfbp2a, shox and stmn1a, and intermediate genes fsta and ctgfb, were reduced in edn1 mutants. However, the intermediate gene igfbp5b was unaffected and the intermediate gene emx2 was variably upregulated or downregulated in edn1 mutants. Consistent with RNAseq data, ventral genes smad6a, skp2 and fzd9b were unaffected, and the ventral but not dorsal expression of sema3b1 and the mandibular-specific ventral expression of pitx1 were lost in edn1 mutants. Of the dorsal genes examined, pou3f3a and pou3f3b were ventrally expanded in edn1 mutants, cdh11 expression shifted ventrally, and cd248a was largely unaffected. Reciprocally, pou3f3a, pou3f3b and cd248a were reduced in jag1b mutants, with cdh11 and kctd15a expression unaffected. In summary, we find intermediate and ventral-intermediate genes, but only a subset of ventral and dorsal genes, to be regulated by Edn1 signaling, and a subset of dorsal genes to be regulated by Jagged-Notch signaling.
Co-expression network analysis of pharyngeal arch genes
As an independent strategy to uncover genes co-expressed in arch domains, we performed a weighted gene co-expression network analysis (WGCNA) (Zhang and Horvath, 2005) across 19 of our RNAseq datasets (see Fig. 1B). We limited this analysis to the 6000 genes exhibiting the greatest variance across all datasets and showing an expression level above 2 TPM in at least one experiment. A searchable dendrogram (Fig. S5) reflects the topological overlap metric (TOM), which is a measure of the correspondence in expression between genes across samples. In order to determine the utility of TOM in uncovering novel genes within known networks, we examined five representative branches containing genes with validated dorsoventral-restricted expression (Fig. 5A). Cluster 1 is composed of six genes, including the known dorsal gene jag1b (Zuniga et al., 2010). Of these, cd248a, fgf20b and snail1a (snai1a) were detected in dorsal arch CNCCs (Fig. 2D, Fig. 5B). Cluster 2 contains 14 genes, including a known intermediate gene (grem2b) required for joint formation in zebrafish (Zuniga et al., 2011), as well as four newly validated intermediate genes (emx2, fsta, igfbp5b, foxd1) (Fig. 2C). This cluster also contains twist1a, which displays more complex expression in dorsal and ventral arch domains (Germanguz et al., 2007). Cluster 3 contains 11 genes, including five tightly clustered Dlx genes (dlx3b, dlx4a, dlx4b, dlx5a, dlx6a) known to be co-expressed in the ventral-intermediate domain (Talbot et al., 2010), as well as another known ventral-intermediate gene (msxe) (Miller et al., 2000) and a non-coding RNA (si:ch673-351f10.4) in an analogous position to the mouse Evf2 (Dlx6os1) gene, an antisense transcript that promotes the expression of the Dlx5-6 locus (Feng et al., 2006). This cluster also contains an uncharacterized gene, fgfbp2b, which we find to be expressed in a subset of ventral-intermediate first arch CNCCs (Fig. 5B). We also examined two distinct branches containing ventral-restricted genes. We verified five out of six genes in cluster 4 as being restricted to the ventralmost arches (pitx1, fzd9b, foxf1, foxf2a) or expressed more strongly in the ventral arches (sema3bl) (Fig. 2A). Cluster 5 contains a known ventral-restricted gene (satb2) (Sheehan-Rooney et al., 2013a) that tightly co-varies with mrrf, which we find to have similar ventral-restricted expression (Fig. 5B).
We next asked which RNAseq experiments informed gene co-regulation by iteratively computing the average TOM disruption caused by removing experimental groups, thus producing a TOM driver score for each experiment (Fig. 5C). Expression in dlx5a:GFP+ cells was the strongest driver for cluster 2, containing known and validated intermediate genes, and a strong driver for cluster 3, containing ventral-intermediate-restricted genes. By contrast, expression in hand2:GFP+ cells was the strongest driver for clusters 4 and 5, consistent with these clusters containing known and newly validated ventral-restricted genes. Consistently, the ventral-intermediate (3) and intermediate (2) clusters and the two ventral clusters (4 and 5) formed separate subgroups when compared across all experiments. By contrast, the dorsal cluster (1) was driven by gain-of-function Notch signaling and not relative enrichment in dlx5a:GFP+ and hand2:GFP+ cells. Interestingly, expression in edn1 mutants was a strong driver for the ventral-intermediate cluster 3, yet disrupted intermediate cluster 2 and ventral cluster 4. This finding is consistent with our in situ validation showing opposite Edn1 regulation of intermediate genes emx2 and fsta in cluster 2, and regulation of sema3bl and pitx1 but not fzd9b in cluster 4. TOM driver analysis thus represents a powerful method for understanding how sets of genes share common expression domains and/or regulation by distinct signaling pathways in the developing face.
We also applied WGCNA to our filtered dorsoventral lists to help resolve false positives (Fig. S1). For genes whose expression patterns were validated by in situ hybridization (Fig. 2), 8/8 ventral, 3/3 ventral-intermediate, 4/5 intermediate and 6/6 dorsal genes clustered together on distinct branches. The one outlier was ctgfb, although repeating WGCNA without Edn1-related RNAseq datasets revealed that this was due in part to stronger regulation by Edn1 signaling. This analysis also revealed two classes of ventral-intermediate genes: those with more restricted ventral-intermediate expression (similar to the Dlx3-6 class) and those with dual ventral-intermediate and dorsal domains (e.g. sema3bl and twist1b). We also found that 4/5 genes identified as false positives (her6, osr2, tmem107l, rassf10a) did not cluster with other genes in their predicted dorsoventral domains. The sole exception was sfrp2, which clustered with dorsal genes despite in situ validation revealing expression in dorsal mesoderm and not CNCCs. Thus, combining dorsoventral filtering and WGCNA analysis decreased the false positive rate and uncovered distinct classes of arch expression patterns.
Absence of larval skeletal defects in loss-of-function mutants for many domain-specific genes
We next sought to uncover potential requirements for novel domain-specific genes in zebrafish craniofacial development. We used TALEN and CRISPR technologies to introduce early frameshift mutations in 12 genes (cd248a, ctgfa, ctgfb, cdh11, emx2, fsta, fstb, her6, mrrf, sfrp2, osr1, osr2; see Table S2 for details) and analyzed homozygous mutant embryos for cartilage and bone defects at 5 days post-fertilization (dpf). For all mutants except fsta and emx2, no craniofacial skeletal defects were observed (data not shown). ctgfa; ctgfb and osr1; osr2 double mutants also failed to display obvious craniofacial skeletal defects. Although not displaying larval craniofacial defects, mrrf mutants grew more slowly than wild-type siblings, rarely survived past 1 month and, even before general growth defects were apparent, were unable to regenerate their tail fins (Fig. S2).
Opposite requirements for fsta and emx2 in hyoid cartilage development
Homozygous mutants for fsta and emx2, two new Edn1 targets expressed in the intermediate domain, displayed defects in the hyoid arch skeleton (Fig. 6A,B). In fsta mutants, we detected variable alterations of the hyoid joint, a compound joint in which a small interhyal cartilage makes connections to the hyomandibular and ceratohyal cartilages on either side. The interhyal cartilage was reduced and made abnormal cartilaginous connections with adjacent cartilages, the symplectic cartilage was reduced in length, and the connection between the hyomandibular and symplectic cartilages was thickened. fsta; fstb double mutants displayed a subtle enhancement of craniofacial defects compared with fsta single mutants (Fig. S3). These joint and symplectic phenotypes are similar to those reported for irx7; irx5a mutants (Fig. 6A) (Askary et al., 2015), and we correspondingly observed a reduction in arch irx7 expression in 2/7 fsta; fstb mutants (Fig. 6E).
By contrast, emx2 mutants had separated symplectic and hyomandibular cartilages, with weakly Alcian Blue-positive cells evident at the interface. As these two elements start out separate in wild types at 3 dpf (Fig. 6A), we interpret the emx2 phenotype as a failure of later cartilage fusion. These mutants also have a near complete loss of the opercle bone (a hyoid arch derivative) and abnormalities in the palatoquadrate cartilage (a mandibular arch derivative). The expression of fsta and irx7 is unaffected in emx2 mutants, and emx2 expression is unaffected in fsta mutants (Fig. 6C-E). Further, loss of emx2 did not restore normal hyoid joint formation to fsta or irx7; irx5a mutants, and, conversely, loss of fsta or irx7 and irx5a failed to rescue the opercle bone loss of emx2 mutants (Fig. 6A,B). fsta and emx2, two intermediate domain genes regulated in distinct ways by Edn1, therefore act in parallel pathways, with fsta acting upstream of irx7 to promote early joint and symplectic formation and emx2 promoting later cartilage fusion and bone development (Fig. 6F).
Our global gene expression analysis of zebrafish pharyngeal arch CNCCs revealed general principles of arch patterning and novel expression patterns and functions of genes not previously implicated in craniofacial development. The intersection of domain-specific gene expression with changes upon signaling perturbation uncovered distinct roles for Edn1 signaling along the dorsoventral axis that might help explain the complex phenotypes of edn1 mutants. In particular, we identified new roles for two distinctly regulated Edn1 target genes, fsta and emx2, in coordinating joint, cartilage, and bone morphogenesis in the intermediate regions of the developing arches.
Identification of novel domain-specific arch genes
We used two complementary methods to identify co-expressed modules of genes in mandibular and hyoid arch CNCCs. The first approach took advantage of the graded expression of hand2:GFP and dlx5a:GFP transgenes along the dorsoventral axis to group genes into four compartments, and the second approach mined co-variation across 19 RNAseq datasets to identify genes with similar expression patterns and/or regulation.
A limitation of the first strategy is that filtering thresholds are empirically determined, and that genes must pass all thresholds to be included (which is likely to account for some known genes, such as dlx6a, being excluded). Empirical shifting of thresholds, guided in part by anchoring well-characterized genes in each cluster, led to a balance between the number of false positives (genes not expressed in the predicted domains) and false negatives (genes with known domain-specific expression not being included). An advantage of the second, co-variance strategy is that it is unbiased, although both the relative enrichment in hand2:GFP+ and dlx5a:GFP+ domains and expression changes in response to signaling perturbation drive clustering. Nonetheless, considerable concordance between the approaches points to the validity of each. For example, four ventral genes (pitx1, fzd9b, foxf1, foxf2a), four ventral-intermediate genes (dlx3b, dlx4b, dlx5a, msxe) and five intermediate genes (emx2, fsta, igfbp5b, grem2b, foxd1) were similarly identified by hand2:GFP/dlx5a:GFP filtering and co-variance analysis.
Using these types of analyses, we uncovered a number of new genes with validated domain-specific expression. In the ventral domain, these included S-phase kinase-associated protein 2 (skp2) and mitochondrial ribosome recycling factor (mrrf), perhaps reflecting distal growth of this domain to elongate the lower jaw (Bonilla-Claudio et al., 2012; Medeiros and Crump, 2012). In the ventral-intermediate domain we uncovered specific expression of Fgf-binding proteins (fgfbp2a and fgfbp2b), suggesting fine regulation of Fgf signaling in this domain. In the intermediate domain, we discovered two putative Bmp inhibitors (fsta and ctgfb) with tightly restricted expression near the developing hyoid joint, consistent with prior data showing that complex regulation of Bmp signaling is important for joint specification (Salazar et al., 2016; Smeeton et al., 2017). In the dorsal domain, we uncovered selective expression of genes previously implicated in earlier neural crest and ectomesenchyme development, including a Snail transcription factor (snai1a) (LaBonne and Bronner-Fraser, 2000), cadherin 11 (cdh11) (McLennan et al., 2015), potassium channel tetramerization domain-containing 15a (kctd15a), which interacts with tfap2a (Zarelli and Dawid, 2013), and endosialin (cd248a) (Das and Crump, 2012); this signature is consistent with our previous findings that dorsal arch CNCCs differentiate later than other arch CNCCs (Barske et al., 2016). Of the 31 genes tested by in situ hybridization, 26 showed expression in the predicted domain, three showed CNCC-specific expression outside the predicted domain, and two were expressed in non-CNCC arch tissues. Further, applying WGCNA to the dorsoventral gene lists correctly predicted 4/5 false positives while only excluding 1/26 true positives. The one exception was secreted frizzled-related protein 2 (sfrp2), which showed dorsal-specific expression in arch mesoderm but not CNCCs. Its mouse homolog has also been reported to be upregulated in Dlx5/6 mutants, consistent with dorsal enrichment (Jeong et al., 2008). We do not know whether the inclusion of sfrp2 in our CNCC datasets reflects expression in CNCCs below the level of detection by in situ hybridization or contamination of our FACS-sorted populations by a few non-CNCC arch cells. Nonetheless, our analysis pipeline accurately predicted domain-specific expression for a high proportion of genes, including nine not previously implicated in craniofacial development.
Region-specific roles of Edn1 and Jagged-Notch signaling in arch patterning
Previous work had suggested greater roles for Edn1 signaling in intermediate versus more ventral domains (Alexander et al., 2011; Zuniga et al., 2011), and for Jagged-Notch signaling in the dorsal arches (Zuniga et al., 2010). By analyzing how gene modules of distinct arch domains are affected by signaling perturbations, we confirm this on a genomic scale (as summarized in Fig. 4E). A more prominent role for Edn1 in controlling gene expression in intermediate mandibular and hyoid arch domains, which generate joints and the palatoquadrate and symplectic cartilages, helps explain why these skeletal elements are most sensitive to partial reduction of Edn1 function (Miller and Kimmel, 2001) and mutation of its downstream effectors Plcb3 and Mef2ca (Walker et al., 2006, 2007). Conversely, the ventralmost elements of the mandibular and hyoid arches, such as the basihyal, are spared in severe edn1 mutants (Miller et al., 2000), consistent with the expression of most ventral genes in this study (smad6a, skp2, fzd9b) and in previous reports (satb2) (Sheehan-Rooney et al., 2013a) being unaffected by Edn1 perturbations. However, some ventral genes, such as hand2 and pitx1, are lost in edn1 mutants, although regulation of hand2 may be indirect through the Edn1 targets Dlx5/6 (Miller et al., 2003; Yanagisawa et al., 2003). Edn1-independent ventral genes might instead depend on Bmp signaling. Smad6 and Satb2 were identified as direct targets of Bmp-dependent pSmads in mice (Bonilla-Claudio et al., 2012), and satb2 is a target of Bmp signaling in zebrafish (Sheehan-Rooney et al., 2013a).
In the dorsal domain, only a subset of genes are regulated by Jagged-Notch signaling (e.g. cd248a, pou3f3a and pou3f3b, but not cdh11 and kctd15a), consistent with the relatively mild dorsal phenotypes of jag1b and notch2; notch3 mutants (Zuniga et al., 2010; Barske et al., 2016) and suggesting Notch-independent regulation of some aspects of dorsal identity.
Whereas we found generally good correspondence between changes in RNAseq values and in situ validation in edn1 and jag1b mutants, in situ validation but not RNAseq revealed differences in stmn1a, sema3bl and cdh11 expression in edn1 mutants. As stmn1a and sema3bl show broad arch expression, profiling all arch CNCCs is likely to dilute the effect of selective loss of their ventral expression domains in mutants. Likewise, a shift of cdh11 expression from dorsal to ventral domains in mutants would not necessarily result in a total expression difference throughout arch CNCCs. These findings suggest that examining expression changes in CNCCs sorted from distinct arch domains in animals with signaling perturbations might be a better way to detect how signaling affects expression patterns.
A lack of obvious craniofacial phenotypes in mutants for many arch-specific genes
The ease of genetic manipulation makes zebrafish an attractive system for performing reverse genetic analysis of craniofacial development. However, homozygous loss-of-function mutants for only two of the 12 domain-specific genes tested showed clear facial cartilage and/or bone phenotypes in larvae. There are several possible explanations for the lack of observable phenotypes. First, although we selected mutations causing premature translational termination before crucial conserved domains, it remains possible that some mutations do not create true nulls. Second, some mutants might have craniofacial defects that we failed to appreciate, for example in other arch derivatives such as ligaments or long-lived progenitors. Third, maternal contribution of mRNA and/or protein could compensate for zygotic loss-of-function. In some cases (e.g. ctgfa), maternal-zygotic null mutants did not display larval craniofacial defects. The growth delay and tail fin regeneration defects of mrrf mutants could be explained by depletion of remaining maternal stores, similar to previous reports for other mutants in mitochondrial proteins (Rahn et al., 2015). It thus remains possible that mrrf expression in the ventralmost arches reflects rapid growth and/or metabolism of this domain. Fourth, there might be genetic compensation (Rossi et al., 2015). Large-scale mutational screens in zebrafish have found a surprisingly small number of genes required for larval viability (∼6%), suggesting a high degree of genetic redundancy in zebrafish (Kettleborough et al., 2013). In addition, the identification of multiple alleles for craniofacial mutants suggests that previous screens are approaching saturation for obvious larval skeletal defects (Neuhauss et al., 1996; Piotrowski et al., 1996; Schilling et al., 1996; Nissen et al., 2006). Our findings therefore indicate that many of the single gene mutants with obvious craniofacial patterning defects in zebrafish might have already been found.
Complex regulation by Edn1 coordinates intermediate arch morphogenesis
A curious feature of edn1 mutants, as well as mutants for its effector mef2ca, is the phenotypic variability of intermediate domain-derived skeletal elements, including joints and the opercle bone (Kimmel et al., 2003; DeLaurier et al., 2014). Our analysis of two newly identified Edn1 target genes, emx2 and fsta, may shed some light on this variability. For example, the gain or loss of the opercle in edn1 mutants might reflect the observed variability in emx2 regulation, given that loss of emx2 suppressed the opercle expansion seen in some edn1 mutants without rescuing ventral cartilage loss (Fig. S4). The loss of the hyoid joint and the reduction in symplectic cartilage in fsta mutants are also similar to what is seen in hypomorphic Edn1 pathway mutants (Miller et al., 2000; Walker et al., 2006, 2007; DeLaurier et al., 2014), consistent with our finding that intermediate domain fsta expression is lost in edn1 mutants. Our previous analysis of similar phenotypes in irx7; irx5a compound mutants showed that inappropriate chondrogenic differentiation at the junction between the nascent symplectic and hyomandibular cartilages prevents these cells from rearranging and thus lengthening the symplectic (Askary et al., 2015). By contrast, the hyomandibular and symplectic cartilages fail to connect in emx2 mutants. Temporal regulation by Edn1 in the intermediate domain may therefore result in Fsta blocking cartilage differentiation at early stages to allow symplectic elongation, with Emx2 promoting cartilage differentiation at later stages to fuse the symplectic and hyomandibular into a seamless cartilage (Fig. 6F). As Bmp activity inhibits irx7 expression (Askary et al., 2015), Fsta might promote irx7 at the hyoid joint by limiting Bmp signaling. The low penetrance of irx7 loss and joint fusion in fsta/b mutants might be due to functional redundancy with other putative Bmp inhibitors expressed at the joint, including ctgfb (this study), grem2b (Zuniga et al., 2011) and chordin (Miller et al., 2003). Interestingly, Emx2 mutant mice lack the incus cartilage of the middle ear (Rhodes et al., 2003), which is homologous to the palatoquadrate affected in fish emx2 mutants. Part of the arch patterning function of Emx2 might thus be conserved from fish to mammals.
Our transcriptome-driven analysis of arch regionalization has therefore provided new insights into how Edn1 signaling regulates a delicate balance of cartilage differentiation to fine-tune skeletal shape. In the future, the analysis pipeline presented here should help to reveal regulatory changes in additional mutants that disrupt facial patterning.
MATERIALS AND METHODS
The University of Southern California Institutional Animal Care and Use Committee approved all experiments on zebrafish (Danio rerio). Published lines include Tg(hand2:eGFP) (Kikuchi et al., 2011), dlx5aj1073Et (referred to here as dlx5a:GFP) (Talbot et al., 2010), Tg(fli1a:eGFP)y1 (Lawson and Weinstein, 2002), Tg(sox10:DsRed-Express)el10 (Das and Crump, 2012), Tg(sox10:GFPCAAX), irx7el538, irx5el574 (Askary et al., 2015), sucker/edn1tf216 (Miller et al., 2000) and jag1bb1105 (Zuniga et al., 2010).
FACS and RNAseq
fli1a:GFP; sox10:DsRed and hand2:GFP; sox10:DsRed fish were incrossed to generate embryos, and dlx5a:GFP; sox10:DsRed fish were outcrossed to avoid homozygosity of the dlx5aj1073Et insertional allele. Embryos were dissociated as previously described (Covassin et al., 2006), with minor modifications (Barske et al., 2016). Cells were sorted based on GFP and DsRed expression on a MoFlo Astrios instrument (Beckman-Coulter) into RLT lysis buffer (Qiagen), and total RNA was extracted using the RNeasy Micro Kit (Qiagen). RNA integrity was assessed on Bioanalyzer Pico RNA chips (Agilent), cDNA synthesized with the SMART-Seq Ultra Low Input RNA Kit (Clontech), and libraries generated with the Kapa Hyper Prep Kit (Kapa Biosystems) and NextFlex adapters (Bioo Scientific). 75 bp paired-end sequencing was performed on a NextSeq 500 machine (Illumina).
RNAseq data analysis and statistical tests
After trimming using Partek Flow default criteria, sequencing reads were aligned to zebrafish GRCz10 (Ensembl_v80) using TopHat 2 (https://ccb.jhu.edu/software/tophat/index.shtml). Aligned reads were quantified using Partek E/M and normalized to yield TPM values, controlling for sequencing depth disparities across samples (Wagner et al., 2012). Data are accessible through GEO series accession GSE95812. To test whether log2 fold-change values for each group of genes were significantly different to zero (Fig. 3), we used the Shapiro-Wilk test for normality to determine whether a one-sample t-test or a Wilcoxon signed-rank test was appropriate. The Bonferroni correction was then applied on the resulting P-values of one-tailed tests to account for multiple comparisons. The Mann–Whitney U-test for two independent samples was performed in Excel 2016 (Microsoft) using Real Statistics Resource Pack software (release 4.9; www.real-statistics.com) to compare effects of Edn1 and Notch signaling, as the data from at least one group were not distributed normally.
In situ hybridization and immunohistochemistry
Partial cDNAs were PCR amplified with Herculase II Fusion Polymerase (Agilent), cloned into pCR_Blunt_II_Topo (ThermoFisher Scientific), linearized, and synthesized with SP6 or T7 RNA polymerase (Roche Life Sciences) as specified (Table S3). In situ hybridization was performed as described (Zuniga et al., 2010), co-staining for dlx2a (Akimenko et al., 1994), sox9a, or with rabbit anti-GFP antibody (Torrey Pines Biolabs, TP401; 1:1000) to highlight arch CNCCs or early cartilages. Imaging was performed with a Zeiss LSM800 confocal microscope and presented as optical sections or maximum intensity projections as specified. Typically, six to ten controls and three to seven mutants were imaged for each probe.
Mutant generation and skeletal staining
Twelve mutant lines were created via TALEN (Sanjana et al., 2012) or CRISPR/Cas9 (Jao et al., 2013) mutagenesis as described (Barske et al., 2016). Germline founders were detected by screening their F1 progeny by restriction digestion of PCR products, followed by sequencing to identify frameshift indels (Table S2). Alcian Blue and Alizarin Red staining of cartilage and bone were performed as described (Walker and Kimmel, 2007). Symplectic cartilage length was measured with ImageJ (NIH) and compared using unpaired t-tests.
We thank Megan Matsutani and Jennifer DeKoeyer Crump for fish care, Lora Barsky and Jeffrey Boyd at the USC Stem Cell Flow Cytometry Core Facility for FACS, Charles Nicolet at the USC Norris Cancer Center Molecular Genomics Core for sequencing, and Yibu Chen for sequencing analysis.
Conceptualization: A.A., P.X., L.B., B.B., J.G.C.; Methodology: A.A., P.X., L.B., P.B., B.B.; Software: A.A., M.B.; Validation: A.A., L.B.; Formal analysis: A.A., P.X., L.B., J.G.C.; Investigation: A.A., P.X., L.B., P.B.; Resources: A.A., J.G.C.; Data curation: A.A., P.X., L.B., M.B.; Writing - original draft: A.A., P.X., L.B., M.B., J.G.C.; Writing - review & editing: A.A., P.X., L.B., M.A.B., J.G.C.; Supervision: M.A.B., J.G.C.; Project administration: J.G.C.; Funding acquisition: M.A.B., J.G.C.
Research was funded by grants from the National Institute of Dental and Craniofacial Research (NIDCR) (DE018405) to J.G.C., National Institute of Neurological Disorders and Stroke (R00NS080913) to M.A.B., A.P. Giannini Foundation and NIDCR (K99DE026239) to L.B., and National Institute on Deafness and Other Communication Disorders (training grant T32 DC009975) to A.A. Deposited in PMC for release after 12 months.
RNAseq data are available at Gene Expression Omnibus through series accession number GSE95812.
The authors declare no competing or financial interests.