ABSTRACT
Developmental evolution and diversification of morphology can arise through changes in the regulation of gene expression or protein-coding sequence. To unravel mechanisms underlying early developmental evolution in cavefish of the species Astyanax mexicanus, we compared transcriptomes of surface-dwelling and blind cave-adapted morphs at the end of gastrulation. Twenty percent of the transcriptome was differentially expressed. Allelic expression ratios in cave X surface hybrids showed that cis-regulatory changes are the quasi-exclusive contributors to inter-morph variations in gene expression. Among a list of 108 genes with change at the cis-regulatory level, we explored the control of expression of rx3, which is a master eye gene. We discovered that cellular rx3 levels are cis-regulated in a cell-autonomous manner, whereas rx3 domain size depends on non-autonomous Wnt and Bmp signalling. These results highlight how uncoupled mechanisms and regulatory modules control developmental gene expression and shape morphological changes. Finally, a transcriptome-wide search for fixed coding mutations and differential exon use suggested that variations in coding sequence have a minor contribution. Thus, during early embryogenesis, changes in gene expression regulation are the main drivers of cavefish developmental evolution.
INTRODUCTION
Changes in the spatial and temporal regulation of gene expression are often considered the main drivers of morphological evolution and diversification (Carroll, 2005; Stern and Orgogozo, 2008). As developmental gene regulatory networks deploy and progressively establish embryonic axes and germ layers, control regional and cell-type specification, and drive terminal differentiation, gene expression must be tightly regulated in time, in space and in intensity. Thus, any viable change in non-coding cis-regulatory DNA elements and/or trans-acting factors binding them can affect gene expression patterns and in turn lead to morphological or physiological developmental variations. Unlike such evolution at regulatory level, modification(s) in protein-coding sequences supposedly have a greater functional impact, especially for developmental genes with pleiotropic functions, and are thought to be negatively selected (Carroll, 2005, 2008; Stern and Orgogozo, 2008).
Teleost fish are the most diversified vertebrate group, with more than 30,000 species adapted to various environments and displaying a wide range of morphologies (Helfman et al., 2009). However, little is known on the genome-wide regulatory evolution accompanying their diversification (Krishnan et al., 2022; Verta and Jones, 2019). The Mexican tetra, Astyanax mexicanus, has emerged as a model to study microevolution and adaptation to a drastic environmental change. This species comes in two eco-morphotypes: the sighted river-dwelling surface fish (SF); and the blind dark-adapted cavefish (CF). As of today, 33 cave populations are known and inhabit the karstic caves of North-East Mexico (Elliott, 2018; Mitchell et al., 1977). They are all blind and depigmented. Most of them result from one initial colonisation event, followed by dispersion in cavities in the underground aquifer and independent evolution in different caves (Policarpo et al., 2024). Cavefishes present a variety of anatomical, behavioural and physiological phenotypes, the genetic bases of which are beginning to be understood (e.g. Aspiras et al., 2015; Elipot et al., 2014; Ma et al., 2020; Protas et al., 2006; Riddle et al., 2018). Yet accumulating evidence points towards multiple changes in gene expression profiles during early development that could explain morphological traits (e.g. Agnès et al., 2022; Ma et al., 2014, 2018; Stahl and Gross, 2017; Torres-Paz et al., 2019; van der Weele and Jeffery, 2022). As the divergence between the two morphs is recent (Fumey et al., 2018; Herman et al., 2018; Policarpo et al., 2021), cavefish and surface fish are inter-fertile and crosses can generate viable F1 hybrids. Here, we have leveraged this property to explore the mechanisms underlying the evolution of developmental gene regulation in cavefish originating from the Pachón cave, one of the most studied Astyanax cavefish population. We first analysed transcriptome-wide allelic expression ratios in F1 hybrids, and we then focused on the regulation of the rx3 candidate gene with molecular analyses and embryological manipulations, including cell transplantations between the two morphs.
RESULTS
Cis- versus trans-regulatory changes in the evolution of developmental gene expression in cavefish
To investigate the mechanisms of early embryonic gene regulation underlying cavefish evolution, we generated bulk transcriptomes of cavefish originating from the Pachón cave (CF), surface fish originating from San Salomon Springs in Texas (SF), as well as F1 hybrid embryos resulting from reciprocal crosses between the two morphs, at neural plate stage [tailbud, i.e. 10 h post-fertilization (hpf)]. This stage corresponds to the end of gastrulation, a time when extensive variations in gene expression occur between the two Astyanax morphs, presumably to set out differential neurulation and brain/eye development (Agnès et al., 2022; Hinaux et al., 2016; Pottin et al., 2011; Torres-Paz et al., 2019). In these tailbud stage embryos, 4483 transcripts were differentially expressed between SF and CF (DEG; fold change>1.5; FDR<1%), representing 19% of the 10 hpf A. mexicanus transcriptome (Fig. 1A). Moreover, the sample-to-sample distance analysis showed that the three SF samples and the three CF samples (each obtained by pooling embryos from six different parent couples) were very similar, ensuring that the observed differences are truly morph-specific and highly reproducible (Fig. 1B).
The evolution of cavefish tailbud stage (10 hpf) gene expression mainly occurs at the cis-regulatory level. (A) Volcano plot showing differentially expressed genes (DEGs) between the two Astyanax morphs at the tailbud stage. Red dots indicate upregulated in cavefish; blue dots indicate downregulated in cavefish; grey dots indicate non-DEG. (B) Sample-to-sample distance. Similar samples are close to each other. Dark blue indicates a closer relationship between samples than light blue/white. (C) Principle of the F1 hybrids analyses to uncover cis from trans gene regulation. (Top) In sighted river-dwelling surface fish (SF) considered as wild type, gene expression depends on the binding of a transcription factor (TF, triangle) on a conserved non-coding enhancer element (CNE, oval on DNA). (Middle) A gene that is downregulated in cavefish, either because of a defective trans-acting TF that cannot bind a CNE, or because of a defective CNE in cis. (Bottom) In heterozygous F1 hybrids, the presence of a fixed SNP position [indicated by S on the SF alleles and C on the blind dark-adapted cavefish (CF) alleles] allows recognition of the origin of expressed alleles/transcripts and the calculation of allelic expression ratios. (D) Principle of attribution to the different regulatory categories. Adapted from Bao et al. (2019), where it was published under a CC-BY 4.0 license. (E,F) Scatter plots of cis regulatory divergence (y-axis; from allelic expression ratios in hybrids) versus parental divergence (x-axis; from DEGs in parents), where the different categories of genes are colour coded. (E) HybSF analysis. (F) HybSF analysis (computed from reads exclusively encompassing the variant position). (G) Venn diagrams showing the consistency of results obtained from HybCF or HybSF datasets, and with the two calculation methods (at the variant position/.snp, or on the whole transcript/.deg).
The evolution of cavefish tailbud stage (10 hpf) gene expression mainly occurs at the cis-regulatory level. (A) Volcano plot showing differentially expressed genes (DEGs) between the two Astyanax morphs at the tailbud stage. Red dots indicate upregulated in cavefish; blue dots indicate downregulated in cavefish; grey dots indicate non-DEG. (B) Sample-to-sample distance. Similar samples are close to each other. Dark blue indicates a closer relationship between samples than light blue/white. (C) Principle of the F1 hybrids analyses to uncover cis from trans gene regulation. (Top) In sighted river-dwelling surface fish (SF) considered as wild type, gene expression depends on the binding of a transcription factor (TF, triangle) on a conserved non-coding enhancer element (CNE, oval on DNA). (Middle) A gene that is downregulated in cavefish, either because of a defective trans-acting TF that cannot bind a CNE, or because of a defective CNE in cis. (Bottom) In heterozygous F1 hybrids, the presence of a fixed SNP position [indicated by S on the SF alleles and C on the blind dark-adapted cavefish (CF) alleles] allows recognition of the origin of expressed alleles/transcripts and the calculation of allelic expression ratios. (D) Principle of attribution to the different regulatory categories. Adapted from Bao et al. (2019), where it was published under a CC-BY 4.0 license. (E,F) Scatter plots of cis regulatory divergence (y-axis; from allelic expression ratios in hybrids) versus parental divergence (x-axis; from DEGs in parents), where the different categories of genes are colour coded. (E) HybSF analysis. (F) HybSF analysis (computed from reads exclusively encompassing the variant position). (G) Venn diagrams showing the consistency of results obtained from HybCF or HybSF datasets, and with the two calculation methods (at the variant position/.snp, or on the whole transcript/.deg).
To disentangle cis- from trans-regulatory divergences in cavefish developmental evolution, we used the ‘F1 hybrid test’, which is based on the detection of parental allelic ratios in heterozygous F1 hybrids (Fig. 1C). For a given transcript, the proportion of parental alleles (i.e. SF and CF alleles) should be biased in F1 hybrids in case a cis-regulatory change occurred in one morph, but not if the change occurred in trans. The method therefore requires the determination of the parental origin of transcripts in the F1 hybrids transcriptomes. This was achieved by identifying transcripts with fixed polymorphism(s) in parents. Thereafter ‘surface hybrids’ (HybSF) and ‘cave hybrids’ (HybCF) will designate individuals resulting from a cross between a surface female and a cave male, and reciprocally.
Mapped reads from SF and CF were searched for variants, and only variants that were differently fixed in all parental replicates were considered (three SF versus three CF samples; each sample contained pools of embryos from different spawning and crosses to optimize the representation of allelic diversity, see Materials and Methods). We identified 26,390 fixed variants contained in 5573 genes, among which 900 were DEGs (=16%). The similar proportion of DEGs observed transcriptome wide (19%, Fig. 1A) or among the subset of genes with fixed polymorphisms (16%) suggested that the 5573 genes subset that can be analysed with the ‘F1 hybrid test’ would provide a representative glimpse to the evolution of gene expression regulation in cavefish embryos. Many genes carried several associated variants (mean=4.7; median=4). As expected for RNAseq data, the majority of fixed variants were annotated in exons and 3′-UTR (Fig. S1A,B).
Cis-regulation was then detected as a variation in allelic expression in F1 hybrids (i.e. the ratio of cave allele versus surface allele expression in hybrids, termed H) (Fig. 1D). Trans-regulation was detected by the difference between the variation in gene expression in parents (i.e. cave/surface expression ratio, termed P) and cis-regulation (therefore, trans=P-H). The regulatory category was attributed based on the statistical result of each three parameters, as described previously (Bao et al., 2019; Landry et al., 2005; McManus et al., 2010; Wang et al., 2020), and using a FDR threshold of 1%. For a given DEG in the parents, the regulatory change can be either in cis- (category 1) or in trans- (category 2), or in both. In that case, the effect can be enhancing (cis×trans, category 3) or compensating (cis+trans, category 4). No difference in parental expression with cis- and trans-regulatory differences in hybrids suggests a compensatory effect (category 5). When no differences are detected in both parents and hybrids, regulation is conserved (category 6). Any other combination was classified as ambiguous (category 7) (Fig. 1D).
The results were similar whether parental expression counts were computed from reads exclusively encompassing the variant position or from reads encompassing the entire transcripts (compare Fig. 1E-G and Fig. S1C,D). The results shown in Fig. 1E,F were computed from reads encompassing the variant position. The majority of the 5573 genes subset analysed belonged to category 6 (conserved) in both HybSF and HybCF datasets (Fig. 1E,F, black dots). Allele-specific expression in hybrids revealed that 167 genes in HybSF and 228 genes in HybCF belonged to category 1 (cis-regulatory divergence), respectively (Fig. 1E,F, red dots). Among these, 108 were shared (=2% of the 5573 genes studied; 547 variants on those 108 genes; Fig. 1G). We considered these 108 genes as strong cis-regulatory divergent candidate genes between Pachón cave and surface Astyanax embryos, and of major interest for further analyses (list in Table S1; full raw dataset in Table S2). Very few genes were found in category 2 and category 3 (0 gene) or category 4 (cis+trans; 6 and 12 genes in reciprocal F1 hybrids, respectively, including five shared), suggesting that trans-regulatory evolution has limited impact on the evolution of cavefish development at tailbud stage (Fig. 1G). We concluded that cis-regulatory changes are the main drivers of variations in early developmental gene expression in the two Astyanax morphs at tailbud stage.
Genes with evolved cis-regulatory changes in cavefish
Gene ontology (GO) analysis of the 108 genes with cis-regulatory changes between the two Astyanax morphs showed little or no over-representation of specific molecular function and biological process GO term (Fig. S2A). Likewise, network analysis using String did not show a significant number of connections or clusters among the 108 genes (Fig. S2B), suggesting that cis-regulatory changes affect a variety of seemingly unrelated developmental processes. Of note, 16 genes were located within less than 500 kb of a previously described QTL, including seven close to an eye-related QTL (Table S1).
Some genes that have evolved in their cis-regulation in cavefish are of particular interest with respect to the literature on the evolution of eye development and degeneration (Table S1). This included Otx2b (gene ID 103046890), which is upregulated in CF tailbud but downregulated at later stages and suspected to be an important regulator of cave-associated phenotypes (McGaugh et al., 2014; Stahl and Gross, 2017). The axial mesoderm expressed admp (anti-dorsalizing morphogenetic protein) and the Wnt pathway regulator gsk3aa (glycogen synthase kinase) were downregulated in cavefish, potentially contributing to signalling variations that affect head development (Hinaux et al., 2016; Pottin et al., 2011; Torres-Paz et al., 2019; Yamamoto et al., 2004). The DNA-methyl transferase dnmt3bb.3 was also underexpressed in cavefish and might contribute to early epigenetic regulation of eye development, as suggested for dnmt3bb.1 at later stages (Gore et al., 2018).
Finally, rx3, a ‘master eye gene’ that confers optic fate in the anterior neural plate (Stigloher et al., 2006) and whose loss of function leads to eyeless phenotypes in all vertebrates examined, including Astyanax (Loosli et al., 2003, 2001; Warren et al., 2021), was also strongly downregulated with evolution in cis [log2(FC)=-1.85; FDR=8.16e-19]. In F1 hybrids, rx3 expression level was intermediate between parental levels, with a biased allelic ratio (Fig. 2AB). rx3 is located ∼320 kb away from a ‘retina thickness’ QTL (O'Quin et al., 2013). As the rx3 expression pattern is altered in many ways in 10 hpf cavefish embryos (Agnès et al., 2022), we next sought to disentangle the contribution of changes in cis-regulatory sequences to its cavefish-specific regulation.
rx3 regulatory features and landscape. (A,B) Summary of rx3 transcriptomic and allelic ratio calculation data. **P<0.01, ***P<0.001, Mann–Whitney tests). (C) Schematic of the distribution of the few fixed polymorphisms between sighted river-dwelling surface fish (SF) and blind dark-adapted cavefish (CF) encountered within a 15 kb interval around the rx3 gene. See Data S1 for complete alignments. Blue arrows indicate the position of SNPs, the red arrow indicates the SNP used in the cis and/or trans analysis, and the green arrow indicates a 6 bp deletion in cavefish. (D) Vista plot showing alignments of the rx3 genomic region (30 kb) in the two Astyanax morphs and eight additional teleost species. The baseline is the SF reference genome. Colour code: exons are in purple, UTR sequences are in turquoise (Ensembl annotation) and non-coding sequences are in pink. (E) qPCR analysis for rx3 mRNA and rx3 pre-RNA abundance in SF (blue) and CF (red) tailbud embryos, using exonic and intronic primers, respectively. *P<0.05, ***P<0.001 (Mann–Whitney tests). (F) In vivo RNA stability assay. The schematic depicts the synthetic rx3 mRNAs containing tag (FLAG or HA) sequences at the beginning of the coding sequence and the G or T SNP in the 3′UTR (G>T in C, vertical arrows on the schematic) that were co-injected into one-cell stage CF embryos. qPCR primers encompassing the respective tag sequence are indicated (horizontal arrows on the schema). qPCR was performed at 10 hpf. ns, not significant (Mann–Whitney test).
rx3 regulatory features and landscape. (A,B) Summary of rx3 transcriptomic and allelic ratio calculation data. **P<0.01, ***P<0.001, Mann–Whitney tests). (C) Schematic of the distribution of the few fixed polymorphisms between sighted river-dwelling surface fish (SF) and blind dark-adapted cavefish (CF) encountered within a 15 kb interval around the rx3 gene. See Data S1 for complete alignments. Blue arrows indicate the position of SNPs, the red arrow indicates the SNP used in the cis and/or trans analysis, and the green arrow indicates a 6 bp deletion in cavefish. (D) Vista plot showing alignments of the rx3 genomic region (30 kb) in the two Astyanax morphs and eight additional teleost species. The baseline is the SF reference genome. Colour code: exons are in purple, UTR sequences are in turquoise (Ensembl annotation) and non-coding sequences are in pink. (E) qPCR analysis for rx3 mRNA and rx3 pre-RNA abundance in SF (blue) and CF (red) tailbud embryos, using exonic and intronic primers, respectively. *P<0.05, ***P<0.001 (Mann–Whitney tests). (F) In vivo RNA stability assay. The schematic depicts the synthetic rx3 mRNAs containing tag (FLAG or HA) sequences at the beginning of the coding sequence and the G or T SNP in the 3′UTR (G>T in C, vertical arrows on the schematic) that were co-injected into one-cell stage CF embryos. qPCR primers encompassing the respective tag sequence are indicated (horizontal arrows on the schema). qPCR was performed at 10 hpf. ns, not significant (Mann–Whitney test).
rx3 variants in cis and potential effects on expression
We performed genomic alignments of rx3 locus (5 kb upstream of ATG to 5 kb downstream of 3′UTR; total ∼15 kb) from wild surface fish of two different origins (Rascon river, n=7 individuals; Choy river, n=9 individuals) against Pachón cavefish (n=9 wild individuals; all publicly available from Herman et al., 2018) (Fig. 2C). The rx3-coding sequence was identical in the two morphs. Eight fixed, morph-specific variants were found in non-coding sequences (Data S1). Along the scanned genomic region, seven of these variants corresponded to SNPs (including one in the rx3 3′UTR, used in the cis/trans analysis above) and one variant corresponded to a small deletion of six nucleotides in a CT repeat region in an intron of the 3′ neighbouring gene, grp (Fig. 2C). After VISTA analysis on a larger genomic scale (30 kb), we found that the non-coding landscape was extremely conserved in the two Astyanax morphs but not in other fish species, preventing the inference of putative conserved non-coding regulatory elements (Fig. 2D). For each of the eight variants identified, we scanned ±15 bp for putative transcription factor binding site (TFBS). No gain or loss of TFBS was detected, suggesting that the cis-regulatory change(s) responsible for the deregulation of rx3 expression in cavefish may lie further away from the genomic region analysed or that they affect other levels of regulation, such as RNA post-transcriptional regulation. We next investigated this possibility in two ways.
First, we used a qPCR approach with different sets of primers: one set targeting exonic sequences and spanning one intron, probing the level of mature mRNA; and sets of primers targeting the first and second intron, respectively, probing the level of nascent pre-mRNA. As expected, the primer set targeting the exons confirmed the decreased expression of rx3 mRNA in 10 hpf cavefish (−77%, Fig. 2E). Amplification of the two introns was also reduced in CF tailbud embryos when compared with SF (Fig. 2E). The reduction of relative levels of intron 1 in cavefish (−33%) was non-significant, whereas intron 2 was significantly and strongly under-expressed in cavefish (−78%). Of note, pre-mRNA levels were very low and the relative quantity of intron 2 was lower than intron 1. The results were confirmed by sensitive digital PCR (not shown). Together, these experiments suggest that rx3 downregulation in cavefish occurs at the transcriptional level.
We then tested the potential effect of fixed SNP in the rx3 sequence that may contribute to a decrease in mRNA stability. One such SNP (GSF>TCF) is present in the rx3 gene sequence, located at the beginning of the 3′UTR (Fig. 2D). To test the ‘stability’ hypothesis, we co-injected in one-cell stage embryos two forms of in vitro-transcribed rx3 full-length mRNAs: one with a HA-tag containing the CF 3′UTR polymorphism, the other with a FLAG-tag containing the SF 3′UTR polymorphism, respectively, and we performed RT-qPCR to compare the relative levels of each tag in developed tailbud embryos (Fig. 2G). We detected no statistical difference in quantity of the two tags, suggesting similar stability for the two RNAs. Thus, it is unlikely that the single GSF>TCF candidate SNP identified in the 3′UTR of rx3 is responsible for the decrease in rx3 expression in cavefish.
A change in cis-regulation affects rx3 expression levels but not rx3 domain size in cavefish
To analyse the regulation of rx3 expression at regional and cellular level, we turned to developmental biology approaches. At 10 hpf, the cavefish eyefield, i.e. the territory of the anterior neural plate fated to become the retinas, showed low and heterogeneous rx3 expression after cell-level quantifications of fluorescent in situ hybridisation signals, and a smaller differently shaped expression domain when compared with SF (Fig. 3A,B,E,F), as previously described (Agnès et al., 2022). In both HybSF and HybCF, eyefield cells presented high to intermediate, similar to SF, fluorescence levels (Fig. 3C-E). The size of the rx3 domain was similarly large in SF, HybSF and HybCF, but markedly smaller in CF (Fig. 3A-D,F). These results suggested a dominance of SF alleles on both rx3-related phenotypes, i.e. cellular expression levels and domain size. Moreover, as neither cell expression levels nor domain sizes differed in HybSF and HybCF reciprocal F1 hybrids, we concluded that rx3 expression was not under maternal genetic control (see Torres-Paz et al., 2019). Together, these results were compatible with a cis-regulatory change in the cavefish rx3 gene affecting the cellular expression levels in the eyefield (HybSF and HybCF cells show lower expression than SF cells) but not the size of its expression domain (HybSF and HybCF domain sizes are as large as SF).
rx3 regulation at cellular and regional level. (A-D) Confocal images of rx3 fluorescent in situ hybridisation (red) showing the eyefield at 10 hpf, in dorsal views. DAPI nuclear staining is in blue. Morphs are indicated. Anterior is upwards. (E) Cell level quantification of rx3 expression signals in sighted river-dwelling surface fish (SF), blind dark-adapted cavefish (CF), HybSF and HybCF. *P<0.05, ***P<0.001 (Kruskal–Wallis tests). (F) Box plots showing the size area of the rx3 expression domain in SF, CF, HybSF and HybCF. Kruskal–Wallis tests. (G) Experimental procedure and principle for cell transplantations and quantification of the cell-level expression signals in chimeric embryos. (H,I) Confocal images of transplantation results. In situ hybridisation for rx3 is red, grafted cells are green and cell nuclei are blue. Images on the right are reconstructed sections at the level of the light-grey lines in the images on the left, showing fluorescence channels decompositions at high magnification. Arrows point to transplanted cells that have been incorporated in the rx3+ eyefield. (J) Cell level quantification of rx3 expression signals in grafted cells relative to neighbouring host cells. The nature of the donor and host, and the number of cells and embryos analysed, are indicated. ***P<0.001 (Kruskal–Wallis tests). (K) Colorimetric in situ hybridisation (purple) for rx3 in the eyefield at 10 hpf, in dorsal views, after the indicated manipulations of the Wnt signalling pathway. (L) Box plots showing the area of the rx3 expression domain after manipulations of the Wnt pathway shown in K. **P<0.01, ***P<0.001 (Kruskal–Wallis tests). The horizontal bars, boxes and whiskers of the box plots represent the mean, the three quartile dispersion, and minimum and maximum values of the datasets, respectively. Scale bars: 20 µm (D,H,K).
rx3 regulation at cellular and regional level. (A-D) Confocal images of rx3 fluorescent in situ hybridisation (red) showing the eyefield at 10 hpf, in dorsal views. DAPI nuclear staining is in blue. Morphs are indicated. Anterior is upwards. (E) Cell level quantification of rx3 expression signals in sighted river-dwelling surface fish (SF), blind dark-adapted cavefish (CF), HybSF and HybCF. *P<0.05, ***P<0.001 (Kruskal–Wallis tests). (F) Box plots showing the size area of the rx3 expression domain in SF, CF, HybSF and HybCF. Kruskal–Wallis tests. (G) Experimental procedure and principle for cell transplantations and quantification of the cell-level expression signals in chimeric embryos. (H,I) Confocal images of transplantation results. In situ hybridisation for rx3 is red, grafted cells are green and cell nuclei are blue. Images on the right are reconstructed sections at the level of the light-grey lines in the images on the left, showing fluorescence channels decompositions at high magnification. Arrows point to transplanted cells that have been incorporated in the rx3+ eyefield. (J) Cell level quantification of rx3 expression signals in grafted cells relative to neighbouring host cells. The nature of the donor and host, and the number of cells and embryos analysed, are indicated. ***P<0.001 (Kruskal–Wallis tests). (K) Colorimetric in situ hybridisation (purple) for rx3 in the eyefield at 10 hpf, in dorsal views, after the indicated manipulations of the Wnt signalling pathway. (L) Box plots showing the area of the rx3 expression domain after manipulations of the Wnt pathway shown in K. **P<0.01, ***P<0.001 (Kruskal–Wallis tests). The horizontal bars, boxes and whiskers of the box plots represent the mean, the three quartile dispersion, and minimum and maximum values of the datasets, respectively. Scale bars: 20 µm (D,H,K).
To explore rx3 regulation at the cell level, we performed transplantation experiments (Torres-Paz and Rétaux, 2021). Cells from one morph were transplanted isochronically into embryos of the other morph, either at blastula or gastrula stage. When embryos reached 10 hpf, rx3 expression levels in grafted cells incorporated in the eyefield were compared with those of the surrounding host cells (Fig. 3G-I). Irrespective of the grafting stage, CF cells transplanted into SF hosts showed lower rx3 levels than surrounding host cells, and vice versa; SF cells grafted into a CF environment expressed higher rx3 levels than neighbouring host cells (Fig. 3H-J, Fig. S3). These results showed that rx3 expression level in eyefield cells is controlled autonomously by the rx3 alleles of the transplanted cells, and do not depend on the environment. Moreover, F1 hybrid cells grafted into SF embryos showed rx3 levels similar to adjacent SF host cells (Fig. 3J, pale blue). Finally, F1 hybrid cells grafted into CF embryos behaved like transplanted SF cells (Fig. 3J, pale red). This suggested that one copy of the SF allele(s) is sufficient to confer proper rx3 levels in eyefield cells. Together these experiments support the existence of an evolved cis-regulatory element controlling the quantitative aspect, i.e. the cellular expression level, of rx3.
Yet the above experiments could not explain the other aspect of the rx3 phenotype, i.e. the reduced size of the rx3 expression domain in 10 hpf cavefish embryos. We reasoned that rx3 domain size might rather be controlled by signalling pathways that establish early embryonic patterning (Houart et al., 2002; Schier and Talbot, 2005; Sylvester et al., 2013). As we have previously described an influence of the Wnt pathway on eye shape and size in Astyanax larvae (Torres-Paz et al., 2019), we manipulated Wnt signalling and assessed rx3 expression at 10 hpf (Fig. 3K,L). Overactivation of Wnt signalling in SF by pharmacological treatment (LiCl 0.2 M) during a short window of time in late gastrulation strongly affected rx3 domain size, but rx3 expression level remained high, homogeneous and surface fish like in the small eyefield. Conversely, inhibition of Wnt signalling in CF embryos through injection of dkk1b mRNA (a Wnt signalling antagonist) at the one-cell stage resulted in an expansion of the rx3-positive domain, but the rx3 expression level remained low, heterogeneous and cavefish like in the enlarged eyefield. We obtained similar results after manipulation of the Bmp signalling pathway, using dorsomorphin treatments: incubation in the Bmp signalling inhibitor from 30% epiboly or from 70% epiboly onwards did not affect rx3 cellular expression levels but did increase the rx3 domain size in a manner dependent on the treatment time (Fig. S4). Altogether, these experiments demonstrate that the evolution of the regulation of rx3 domain size and rx3 cellular expression level are independent and uncoupled.
rx3 mRNA export is defective in cavefish
Finally, we had previously noticed a heterogeneous distribution of rx3 mRNA in cavefish eyefield cells (Agnès et al., 2022), which was also present in our current analyses (Fig. 4A and text above). Such additional rx3-related phenotypes might correspond to a defect in mRNA sub-cellular localization. To test this possibility, we analysed the nuclear-to-cytoplasmic ratio of rx3 hybridization signals (N/C, Fig. 4B). In surface fish as well as in HybSF and HybCF, rx3 fluorescent signals were localized mostly to the cytoplasm, as expected for mRNAs. By contrast, in cavefish there was almost no enrichment of rx3 signals in the cytoplasm relative to the nucleus (N/C≈1, Fig. 4B), suggesting that the export of rx3 mRNA outside the nucleus is defective, and revealing yet another phenotype for this transcript. We reasoned that the SNP in the 3′UTR investigated above might play a role in the nuclear export defect of the CF rx3 transcript. To test this possibility, we injected in vitro-transcribed rx3 full-length mRNAs from CF (HA-tag-rx3-CF 3′UTR) or from SF (FLAG-tag-rx3-SF 3′UTR) into one cell in eight-cell stage CF embryos. Embryos were grown to 30-50% epiboly, i.e. several hours before endogenous rx3 expression is turned on, and we examined the aspect of rx3 RNA distribution in gastrula cells. We observed no difference in the N/C distribution of the two synthetic transcripts (Fig. S5B). Therefore, this assay suggests the 3′UTR SNP is not sufficient to modify the rx3 N/C ratio.
rx3 mRNA subcellular localisation defect in cavefish. (A) In situ hybridisation images of the sighted river-dwelling surface fish (SF) and blind dark-adapted cavefish (CF) eyefields at 10 hpf, where rx3 signals are red and DAPI nuclear signal is blue. Graphs on the left represent fluorescence levels along the dotted line on the image on the left. Scale bar: 20 μm. (B) Quantification of rx3 nuclear/cytoplasmic fluorescence ratios in cells from sighted river-dwelling surface fish (SF, blue), HybSF (pale blue), HybCF (pale red) and blind dark-adapted cavefish (CF, red). Each dot is a cell, and the number of cells analysed is indicated. ***P<0.001 (Kruskal–Wallis tests).
rx3 mRNA subcellular localisation defect in cavefish. (A) In situ hybridisation images of the sighted river-dwelling surface fish (SF) and blind dark-adapted cavefish (CF) eyefields at 10 hpf, where rx3 signals are red and DAPI nuclear signal is blue. Graphs on the left represent fluorescence levels along the dotted line on the image on the left. Scale bar: 20 μm. (B) Quantification of rx3 nuclear/cytoplasmic fluorescence ratios in cells from sighted river-dwelling surface fish (SF, blue), HybSF (pale blue), HybCF (pale red) and blind dark-adapted cavefish (CF, red). Each dot is a cell, and the number of cells analysed is indicated. ***P<0.001 (Kruskal–Wallis tests).
A contribution of coding mutations and alternative isoforms to cavefish developmental evolution?
To complement the analyses above, we also sought to estimate the load of coding mutations during early developmental evolution in cavefish. Among fixed exonic variants identified above (11,336 total), the majority corresponded either to synonymous (n=6054) or missense mutations (n=5282). Among the latter, a minority was predicted to have a strong effect because they affected the start/stop codon or generated a frameshift (n=20 variants in 19 genes). A closer examination of these ‘severe’ variants revealed that most of them should not change greatly the protein structure or function, as they were located in the close vicinity of other start or stop codons, respectively.
As any change in amino acid might alter protein function (e.g. Elipot et al., 2014; Riddle et al., 2018), we systematically assessed putative effects of fixed missense variants. Among them, 76% (n=4003) changed either class, charge or polarity of amino acids (Fig. S6A). To predict the impact of the missense variants on protein function, we calculated a ‘pathogenic score’ using Mutpred2 (Pejaver et al., 2017), after ancestral state reconstruction to orient the mutations in the surface or the cavefish lineage (Policarpo et al., 2021). ‘Pathogenic score’ should be understood here as a proxy of the extent of the protein change, as some coding variants might actually be beneficial, be it in the SF or the CF lineage. Hence, 2539 mutations had occurred in the CF lineage and 1464 had occurred in SF lineage. In both morphs, the vast majority of variants had a low pathogenicity score (between 0 and 0.3) (Fig. 5A). Moreover, also in both morphs, about 10% of the mutations displayed a g-score greater than 0.5, which is the Mutpred2 pathogenicity threshold (n=272/9.3% in CF and n=141/8.2% in SF). Hence, there was no enrichment for pathogenic mutations in CF compared with SF (Fig. 5A), suggesting that coding sequences remained mostly unaffected and had little contribution to the evolution of cavefish development. This result is in line with the description of only three pseudogenes related to loss of eyes and pigmentation in Pachón cavefish genome (Policarpo et al., 2021). The CF and SF pathogenic gene sets were enriched in ‘metabolic process (RNA)’ and ‘mRNA processing’ GO terms, respectively, suggesting that RNA metabolism genes are globally prone to sequence evolution (Fig. S6B). Of note, predicted pathogenic mutations were also found on genes involved in circadian rhythm (LOC103027946, C649S, g-score=0.79, in CF lineage) and developmental pathways (lrp5, M698R, g-score=0.75, in SF lineage). In summary, we found only minor differences at a functional level between SF and CF protein-coding sequences expressed at tailbud stage.
Contribution of coding changes to cavefish developmental evolution. (A) Plot showing the distribution of predicted non-deleterious (g-score=0) to highly deleterious (g-score=1) effects among coding mutations identified in the sighted river-dwelling surface fish (SF) and blind dark-adapted cavefish (CF) lineages. The two curves overlap. (B) Pie chart showing the distribution of different types of DEU between SF and CF transcriptomes. The colour code is indicated and categories correspond to examples shown in Data S2. The inset shows the example of the cct8 gene, which corresponds to the most represented type of DEU, i.e. alternative exon skipping annotated in NCBI.
Contribution of coding changes to cavefish developmental evolution. (A) Plot showing the distribution of predicted non-deleterious (g-score=0) to highly deleterious (g-score=1) effects among coding mutations identified in the sighted river-dwelling surface fish (SF) and blind dark-adapted cavefish (CF) lineages. The two curves overlap. (B) Pie chart showing the distribution of different types of DEU between SF and CF transcriptomes. The colour code is indicated and categories correspond to examples shown in Data S2. The inset shows the example of the cct8 gene, which corresponds to the most represented type of DEU, i.e. alternative exon skipping annotated in NCBI.
Last, we sought to understand the potential importance of a change in mRNA isoform use during cavefish evolution. To this end, we analysed our transcriptomic dataset to detect differential exon use (hereafter DEU), a process that can also affect protein sequence. Among all genes expressed at 10 hpf (n=23,593), only 2751 genes (11.66%) were detected with at least one change in exon use between the two morphs. After manual curation, 192 genes were considered for clear DEU (0.81% of all expressed genes). We then categorized those genes in several groups, according to the type of DEU change (Fig. 5B and Data S2). Most of them corresponded to an alternative exon use (42% of the DEU genes), of which the majority were already annotated on the Astyanax genome (64%). The second larger group of DEU (14+18=32%) corresponded to variations in exon boundaries, either through change of exon border or presence of a small intron inside an exon. Among those genes with exon boundary change, half corresponded to an alternative intron in their 3′UTR (Fig. 5B and Data S2). The other categories were a biased expression for CF or SF at the 3′ or 5′ of the transcript (9% and 6%, respectively) and alternative transcription start site (8%) or end site (3%). Such biased expression observed at the extremity of genes may reflect unannotated transcription start (or end) sites. Of note, no enrichment in GO term was detected on the curated DEU gene subset, suggesting no particular bias in the evolution of exon use and/or isoform production in Astyanax mexicanus. Although we cannot exclude the possibility that a change of exon use for some genes may have a great impact on the function of the protein produced, our analysis points to a marginal influence of DEU in the evolution of Astyanax embryos, consistent with the minimal change in protein-coding sequences identified above.
DISCUSSION
Here, we have used a combination of approaches, including comparative transcriptomics, genetics, genomics, functional tests and quantitative embryology, after fluorescent in situ hybridisation, cell transplantations and manipulations of signalling pathways. We bring insights on the leading mechanisms underlying the evolution of gene regulation in early cavefish embryos, and we provide evidence that the rx3 transcription factor is a strong candidate involved in the developmental evolution of cavefish eyes.
Evolution of the cavefish tailbud transcriptome
At tailbud stage, 20% of the transcriptome is differentially expressed between the two Astyanax morphs. This result, which is comparable with previous transcriptomic analyses at various stages (Stahl and Gross, 2017; Torres-Paz et al., 2019), illustrates the extensive regulatory variations that occur between SF and CF during precocious stages of development. As our transcriptomic datasets come from pooling at least three spawns from different genetically diverse breeding groups for each condition, we can rule out the possibility that the substantial gene expression differences we report are due to experimental bias. Among all these DEGs, some must be of strong functional relevance and worth studying. However, we think it highly improbable that all the DEGs we have found at tailbud stage would have a phenotypic effect and a role in cavefish evolution. It is likely that most deregulated genes are neutral and simply reflect variations resulting from earlier global fluctuations of gene regulatory networks though domino effects, and may be canalised or buffered during development.
Deciphering changes in developmental gene regulation is crucial to understand how species evolve and adapt to their environment. The contribution of cis- versus trans-regulation to the process of morphological evolution is increasingly studied in animals and plants (Bao et al., 2019; Bell et al., 2013; Cartwright and Lott, 2020; Krishnan et al., 2022; Landry et al., 2005; McManus et al., 2010; Verta and Jones, 2019; Wang et al., 2020). To our knowledge, this work is the first to examine transcriptome wide the regulatory changes occurring during early developmental evolution in a teleost. In cavefish, similar to domesticated cotton or diversifying Drosophilidae species, we found that cis-regulation is the main contributor to the evolution of gene expression during development. Our results are consistent with the idea that changes in trans could have rather large pleiotropic effects, whereas changes in cis fine-tune gene expression. Of note, our approach was not exhaustive and could not identify all the cis-evolved genes in the cavefish genome because only a subset of 5573 genes, those presenting fixed polymorphisms, could be analysed. The 5573 gene set analysed yielded 108 that have evolved in their cis-regulation; therefore, proportionately and genome-wide we can hypothesize that about 450 genes should have evolved in their cis-regulatory sequence and be most impactful on cavefish phenotypic and developmental evolution. Likewise, over-representation of some specific GO terms might emerge with the full complement of cis-evolved genes, which was not the case among the 108 candidates identified in the present approach.
Moreover, the impact of coding mutations is probably marginal, in agreement with the idea that the modular nature of non-coding regulatory elements is more susceptible to give rise to viable changes (Carroll, 2008; Stern and Orgogozo, 2008; Wittkopp and Kalay, 2012). Interestingly, although alternative exon use resulting from a change in splicing would be a way to modulate protein function in a more dynamic fashion (Barbosa-Morais et al., 2012), it appears that only a minority of transcripts have evolved this way between cavefish and surface fish. Overall, these results are globally consistent with the idea that coding mutations in early developmental genes are too deleterious to provide fuel to viable morphological evolution.
In their influential review on the ‘cis-regulatory hypothesis’, Stern and Orgogozo reasoned in terms of population genetics and proposed that: (1) natural selection should favour mutations with fewer pleiotropic effects over those with more pleiotropic effects; and (2) mutations in cis-regulatory regions should have relatively specific, less pleiotropic, effects than mutations in coding regions. Interestingly, they noted that these arguments might not hold in the presence of other confounding factors, such as genetic drift in populations of small effective size or strong selection (Stern and Orgogozo, 2008). Cavefish population genetics precisely corresponds to this case, with small populations and a strong environmental shift that imposes strong selection (Fumey et al., 2018; Legendre et al., 2023; Rétaux and Casane, 2013). Our results suggest that the ‘cis-regulatory hypothesis’ still holds under these circumstances.
rx3, a candidate gene for the developmental evolution of cavefish eye
rx3 is one of 108 high-confidence genes that underwent cis-regulatory evolution in cavefish, making this gene a proof of principle case for the validation of our approach and transcriptomic results. Indeed, rx3 is an outstanding candidate, with potential major consequences on the cavefish ocular phenotype (Agnès et al., 2022; Devos et al., 2021; Warren et al., 2021). This transcription factor is well known for its crucial role in eye development in zebrafish, medaka or Astyanax, where loss-of-function mutations lead to eyeless phenotypes (Loosli et al., 2003, 2001; Warren et al., 2021). As regional and cell-level information on gene expression is lost in bulk transcriptomics, we have combined insights from bioinformatics and developmental biology to decipher the ‘rx3 regulatory puzzle’. We have discovered that rx3 domain size and, hence, eyefield and future eye size, depends on non-autonomous signalling mechanisms, whereas rx3 cellular expression levels and, hence, the fate and behaviour of optic cells, are cell-autonomously controlled and cis regulated. Thus, cell fate and domain size depend on two distinct rx3 regulatory modules – and both have evolved in cavefish. They are controlled in an independent and uncoupled manner, further highlighting the modularity and the complexity of the evolution of rx3 regulation. In other words, regardless of the wnt or bmp signalling-dependent size of the eyefield, rx3 expression will be low and heterogeneously distributed in CF.
In cavefish, where the low rx3 cellular expression levels are clearly due to a cis-regulatory change – as demonstrated by allelic ratios in F1 hybrids – at first sight the small rx3 domain may seem to result from a change in trans because trans signalling mechanisms regulate its size. However, the F1 hybrid analysis classified rx3 in the ‘cis only’ category, not the cis+trans category, with strong statistical support (Fig. 1). To explain this apparent discrepancy, we hypothesize that the contribution of cellular expression variations to the quantity of rx3 transcripts assessed by RNA-seq in the embryos is by far more important than the contribution of rx3 domain size variation, resulting in a ‘hidden-trans’ situation. Of note, the variation in strength of Wnt or Bmp signalling during gastrulation is itself also probably due to an (unknown) upstream cis-regulatory change (see Fig. S7), because a coding mutation in a component of such a pleiotropic signalling pathway would have strongly deleterious effects. Therefore, cascades of cis-regulatory changes affecting gene regulatory networks may be at work for cavefish developmental evolution.
Morphogens are known to regulate the relative sizes of brain regions (Sylvester et al., 2010, 2013). Because Wnt is a known regulator of eye development in fish (Heisenberg et al., 2001; van de Water et al., 2001) and flies (Ma and Moses, 1995), we chose to test this pathway as a regulator of eyefield size. However, other morphogens are known for regulating the boundaries of the eyefield. For example, Bmp is required for protecting the telencephalon from becoming eyefield (Bielen and Houart, 2012) and variations of Shh and Fgf8 expression have previously been shown to affect the evolution of the cavefish forebrain (Pottin et al., 2011). Thus, evolution of the eyefield size in Astyanax embryos most likely results from the interaction of multiple signalling pathways, of which Wnt and Bmp were tested out in this article. In line with this idea, it is intriguing to note that all major developmental signalling pathways show change in expression of several of their components in our transcriptomic dataset (String network in Fig. S7 and Table S4).
Whether the effect of Wnt signalling on rx3 expression is direct or indirect is unknown. In zebrafish (Young et al., 2019) studied a mutant for Tcf7l1, a core Wnt/β-catenin pathway transcription factor that can activate or repress genes dependent upon the status of the Wnt signalling cascade (Cadigan and Waterman, 2012). Tcf7l1a is cell-autonomously required for the expression of rx3 and, consequently, is a bona fide eyefield gene regulatory network transcription factor that functions upstream of rx3 (Young et al., 2019). Considering that it is the repressor activity of Tcf7l1a that promotes eye formation (Kim et al., 2000), the most likely role for Tcf7l1a is to repress transcription of a gene that suppresses eyefield formation and rx3 expression; hence, the effect would not be direct (Young et al., 2019). However, other Tcf transcription factors may have different effects downstream of the Wnt signalling cascade.
Besides changes in the regulation of cell-level expression and domain size, we suggest that rx3 mRNA export and processing is also defective in cavefish eyefield cells. F1 hybrids show proper subcellular mRNA distribution, indicating that the mutated allele is recessive and suggesting that a change in the RNA sequence (in cis) would not be the causative mechanism. However, because in F1 hybrid cells the level of CF allele rx3 RNA should be negligible when compared with the level of SF allele rx3 RNA, the analysis of the N/C signal ratio for the CF allele RNA might be blurred by the SF allele RNA. The SNP we identified in the 3′UTR of rx3 seemed to be an excellent candidate for the rx3 mRNA export phenotype. The in vivo experiments we designed to test it failed to demonstrate its influence on the rx3 N/C ratio, but we cannot rule out the possibility that early blastula and gastrula cells were not ‘competent’ to process the injected synthetic transcripts. The other SNP identified in the second intron of rx3 could be another, less attractive, candidate. Finally, a recessive modification of a component of the export machinery (trans effect) is an alternative but less likely hypothesis, because it would, in principle, affect the export of many or all mRNAs in a pleiotropic manner.
Finally, the results of qPCR experiments on introns may suggest yet another defect in rx3 expression regulation. The global reduction of exon and introns amplification confirms the hypothesis that the control of rx3 expression is changed at the transcriptional level in cavefish. However, the milder reduction in the expression of intron 1 (−43%) compared with intron 2 (−78%) may represent a drop in RNA polymerase activity towards the end of the rx3 gene. The molecular mechanism is unknown, but the fixed SNP we have identified in intron 2 is a candidate for playing a role in this process.
There are famous examples of cis-regulatory changes in evolution in the literature, leading to spectacular or more discrete morphological changes in flies, fish or tetrapods (e.g. Kvon et al., 2016; Prescott et al., 2015; Prud'homme et al., 2006; Ramaekers et al., 2019; Shapiro et al., 2004). Unfortunately, in the present study we could not identify the causal cis-regulatory element change in the cavefish rx3 non-coding regions. As a single nucleotide polymorphism is able to alter temporal expression of a gene (Ramaekers et al., 2019) and cis-regulatory elements can be found at extremely long distances from the gene they regulate (Kvon et al., 2016), more-comprehensive structural and functional comparisons of the cavefish and surface fish genomes regulatory landscapes are required. Nevertheless, we suggest that the yet unidentified rx3 cis-regulatory element we have theoretically evidenced corresponds to one of the 12-15 loci identified by quantitative genetics and that control eye size in Astyanax mexicanus (Protas et al., 2007). Thus, together with cystathionine β-synthase, which is involved in disrupting cavefish larval optic vasculature and which has also evolved through regulatory change in cis (Ma et al., 2020), we propose that rx3 would be the second identified gene involved in the evolution Mexican cavefish eye defect.
Conclusions
Together, our study reinforces the idea that developmental regulatory changes rather than protein sequence changes are responsible for morphological evolution, and further extends the importance of this mechanism to very early embryonic developmental evolution in a vertebrate species. Our study also highlights the complexity of the evolution of gene regulation during development, as a focus on rx3 reveals an uncoupling of regional and quantitative regulation, as well as mRNA subcellular localization of transcripts. Further functional analyses of the 108 cis-regulated genes we have identified should reveal how cavefish lost their eyes.
MATERIALS AND METHODS
In vitro fertilization and RNA extraction
Texas surface fish and Pachón cavefish were maintained in an inverted light cycle (12/12) and induced to spawn via temperature change from 21°C to 26°C. Crosses were performed by in vitro fertilization, and surface, Pachón cave and reciprocal F1 hybrid embryos were collected at 10 h post fertilization, immersed in Trizol (Invitrogen) and stored at −80°C until RNA extraction. Hybrids from the cross of a male surface fish and a female cavefish is termed cave hybrid (HybCF). Reciprocal hybrid (cavefish male and surface fish female) is termed surface hybrid (HybSF). Each condition was obtained in triplicate, with each sample being a pool of ∼60 embryos from two distinct fertilization (one female with three males each) to ensure a good representation of allele diversity among population. RNA were extracted following a phenol-chloroform procedure.
RNA-sequencing and analysis
Library preparation and RNA-sequencing was carried out at the I2BC High-throughput sequencing platform (https://www.i2bc.paris-saclay.fr/spip.php?article399) using an Illumina NextSeq 500 sequencing instrument (paired-end). Analysis was performed on the European Galaxy server (usegalaxy.eu). Reads were mapped onto the Astyanax mexicanus 2.0 genome (NCBI) with HISAT2 (version 2.1.0+galaxy4) after filtering (paired-end single match and Q>20). Reads were counted with htseq-count (version 0.9.1) and differential gene expression was determined with Deseq2 (version 2.11.40.6; Love et al., 2014). The full dataset has been deposited as a Sequence Read Archive (SRA) on NCBI under accession BioProject ID PRJNA848099. R scripts used in analyses below are available at https://github.com/LeclercqJ/Retaux-lab.
Variant calling and filtering
Each RNA-seq experimental replicate originated from two in vitro fertilizations obtained from one female and three males. Therefore, a maximum of 48 alleles were represented in each condition (four individuals, two alleles, 2 FIV and three replicates). Variant calling was performed using FreeBayes (version 1.3.1; Garrison and Marth, 2012 preprint). Genes with variant(s) were identified using SnpEff (Cingolani et al., 2012) with custom Astyanax mexicanus database from SnpEff build and the NCBI GTF file, then filtered on quality >30 with SnpSift (Galaxy, version 4.3). Reads containing fixed variants in parents were extracted using a custom R script and summed for a single gene in case of multiple variants. Of note, fixed polymorphisms used here for the F1 hybrid test refer to the alleles sampled in our transcriptomic dataset and they were not systematically checked with respect to available population genomic data. However, it is important to mention that the rx3 3′UTR variant used in our analysis is validated and indeed present in available Pachón genomes. Parental expression (P) is the result of the combination of both cis- and trans-regulation. Cis-regulation (H) is estimated via allele-specific expression in hybrid (i.e. the ratio of cave/surface alleles in hybrid). Trans-regulation is then estimated as the parental divergence that is not explained by cis-regulation (so Trans=P-H). Parental divergence and allele-specific expression in hybrid were calculated using DESeq2 R package (Love et al., 2014). Cis- and trans-regulation was then estimated using an R script derived from Bao et al. (2019). Only genes with a shared regulation category between surface hybrids and cave hybrids, and differentially expressed in parents from the DESeq2 analysis on the whole transcript, were further considered. Unsurprisingly for RNAseq data, the majority of fixed variants were annotated in exons and 3′-UTR, with few variants found in 5′-UTR and intron (likely from unspliced mRNA, alternative isoform or error in intron/exon boundary annotation) (Fig. S1B).
Gene ontology and network analysis
Genes falling under the ‘cis-only’ category were analysed for Gene Ontology enrichment using GOEnrichment (Galaxy, version 2.0.1). Gene Product Annotation File for Astyanax mexicanus was generated in a previous study (Torres-Paz et al., 2019). Putative gene interaction network between those genes was predicted using String (v11.5; Szklarczyk et al., 2018). Input gene names were manually adjusted (using Ensembl Ids or gene descriptions) to increase the number of entries retrieved. Ninety-four out of 108 genes were retrieved and subjected to the analysis.
rx3 gene alignment and TFBS prediction
The rx3 genomic region (±50 kb) and annotation from several fish species (Mexican tetra cave and surface fish, red-bellied piranha, electric eel, channel catfish, zebrafish, fugu, medaka, stickleback and tetraodon) were downloaded from Ensembl (v106) and aligned using mVista LAGAN alignment program (https://genome.lbl.gov/vista/mvista/submit.shtml; min cons width: 20 bp). The annotation of Astyanax surface fish was slightly modified to match NCBI exon boundaries. The Pachón cavefish rx3 genomic region was extracted from the new cavefish genome assembly recently generated (Imarazene et al., 2021).
The reads encompassing the rx3 region of several individuals from the Pachón cave, as well as Choy and Rascon rivers (Mexico), were retrieved from Herman et al. (2018). Pachón cave rx3 sequences were aligned with each surface population independently with ClustalOmega (www.ebi.ac.uk/Tools/msa/clustalo/) and manually searched for fixed variants shared by both surface populations. We reasoned that a putative cis-regulatory candidate element (CRE) should be shared in both surface populations and different in the Pachón cave population. To identify a potential CRE, putative transcription factor binding sites (TFBS) were scanned around each fixed SNP detected ±15 bp with both TFBSTools R package (Tan and Lenhard, 2016; min score>0.9) and FIMO (Grant et al., 2011; https://meme-suite.org/meme/tools/fimo, P-value of 10−4). Position Weight Matrix (PFM) from JASPAR2020 (Vertebrate, Core collection) was used. A predicted TFBS common to both methods and unique to one morph were examined further.
Fixed variant effect prediction
Fixed SNPs were extracted from the VCF file using VCF-Bed intersect on Galaxy, and variant effects were predicted with SnpEff (canonical transcript only, custom database). Nonsense variants and premature start variants were manually checked (see Results). Missense variants were analysed with a R custom script. Protein sequences were retrieved from NCBI with the rentrez R package and corrected for discrepancy between sequences and annotations (150 variants/75 genes corrected; two genes/five variants removed).
For each gene with a missense variant between surface and cave morphs, orthologous sequences from nine other teleost species with annotated genomes (Danio rerio, GCF_000002035.6; Pygocentrus nattereri, GCF_015220715.1; Electrophorus electricus, GCF_013358815.1; Anguilla anguilla, GCF_013347855.1; Tachysurus fulvidraco, GCF_003724035.1; Ameiurus melas, GCA_012411365.1; Triplophysa tibetana, GCA_008369825.1; Anabarilius grahami, GCA_003731715.1; Labeo rohita, GCA_004120215.1) were retrieved using BROCCOLI (https://doi.org/10.1093/molbev/msaa159), which uses kmer clustering and phylogenetic approaches to infer orthologous groups.
Protein sequences of A. mexicanus surface and cave morphs were aligned with retrieved orthologous sequences using MAFFT v7.407 (https://academic.oup.com/mbe/article/30/4/772/1073398), and PAML v4.9 h (https://doi.org/10.1093/molbev/msm088) was used for the reconstruction of ancestral sequences. Finally, for each gene, the ancestral sequence of either cave or surface morph was used as input in MutPred2 (Pejaver et al., 2017) to assign a deleterious score (g-score) to all missense mutations in these two lineages, which can vary between 0 (non-deleterious) and 1 (highly deleterious). Variants with a g-score >0.5 were analysed for Gene Ontology enrichment using GOEnrichment (Galaxy, version 2.0.1).
Differential exon usage
Variations in exon use between the two Astyanax morphs in the 10 hpf transcriptome was detected using DEXseq package (Anders et al., 2012). The analysis was performed by following the instructions on the vignette (https://bioconductor.statistik.tu-dortmund.de/packages/3.3/bioc/vignettes/DEXSeq/inst/doc/DEXSeq.pdf). GTF annotation was prepared using the DEXseq-count tool in Galaxy (version 1.28.1.1) without the ‘aggregate gene with exon’ parameter. Reads were counted using the same tool and data were analysed in R following the vignette. Genes with differential exon use were identified with a FDR of 1% and then manually curated for a biologically meaningful change. The selected genes were then categorized according to the type of changed observed (Data S2).
In situ hybridization
Colorimetric and fluorescent in situ hybridization were performed as previously described (Agnès et al., 2022). Fluorescently labelled embryos were counterstained with DAPI, and dissected and flat mounted in Vectashield antifade mounting media (Vector Laboratories) for imaging (SP8 confocal microscope, Leica). Non-fluorescent embryos were imaged in whole mount (Nikon macroscope). Image and nuclear-to-cytoplasmic ratio analyses were performed in Fiji.
Cell transplantations and quantifications
Isochronic transplantations of FITC-labelled embryonic cells, and in situ hybridization followed by FITC revelation were performed as previously described (Torres-Paz and Rétaux, 2021). Area measurements and quantification of fluorescence intensity in individual cells, semi-automatically segmented (Agnès et al., 2022), were performed in Fiji.
Manipulation of the Wnt pathway
To manipulate Wnt signalling levels, surface embryos were incubated in LiCl2 0.2 M in embryo medium from 90% of epiboly to tailbud stage, and cavefish embryos were injected at the one-cell stage with full-length A. mexicanus dkk1b mRNA (Torres-Paz et al., 2019). Statistical tests were carried out in Graph pad Prism 5.
Tag-rx3 injection
Full rx3 mRNA sequences (including 5′- and 3′-UTR) with the surface or cave polymorphism in the 3′UTR were produced by gene synthesis by Genscript company. The sequence also contained a specific tag immediately after the ATG (FLAG for surface and HA for cavefish). The sequence were inserted into a pcDNA3.1(+) vector and XhoI/XbaI restriction sites. Plasmids containing the gene were linearized using XbaI and RNA were synthesized with the T7 mMessageMachine kit (Invitrogen). A 1:1 mix of each RNA was generated (5 or 10 ng/µl of each mRNA). Around 5 nl of the mix was injected into fertilized cavefish embryos. RNA was extracted at 10 hpf for RT-qPCR analysis.
RT-qPCR
RNA from wild-type or injected 10 hpf embryos were extracted using a phenol-chloroform procedure. RNA extracts were treated with DNaseI (ThermoFisher) before reverse transcription. cDNA were obtained by reverse transcription with the iScriptTM cDNA Synthesis kit (Bio-Rad) using 500 ng of DNase-treated RNA as starting material. The qPCR reaction mix contained Evagreen (Bio-Rad) and 10 ng of gDNA. qPCR reactions were performed on the C1000 Touch thermocycler (Bio-Rad) and Cq values were extracted from the CFX ManagerTM Software (version 3.1) and processed in R. Two reference genes were used for normalization (Tbp and Polr2a). Primer sets targeting the different genes (rx3 pre-mRNA, rx3 mRNA, TBP and Polr2a) were obtained with the NCBI Primer Designing Tool (https://www.ncbi.nlm.nih.gov/tools/primer-blast/). In case of the injected tag-Rx3 RNA, the specific tag sequence was used as forward primer and a shared rx3 sequence was used as the reverse primer (horizontal arrows in Fig. 2F). The two tag primers were tested for no amplification of the other tag. All primer sequences are available in Table S3. Relative expression was obtained using the ΔΔCt method. For each reaction, the actual efficiency was determined using the LinRegPCR web tool (https://www.gear-genomics.com/rdml-tools/; Untergasser et al., 2021). For all genes tested per condition, the mean PCR efficiency was between 1.96 and 2.03, thus the relative quantity of each gene per condition was calculated as RQ= 2-ΔΔCt. In case of tag-Rx3, the Ct values of each tag after injection were normalized with the Ct values from an RNA sample before injection to take into account of a potential bias in tag ratio in the solution. P-values were computed with a Mann–Whitney test; 95% confident intervals of the mean were obtained by applying the t.test R function to each set of RQ values.
Acknowledgements
We thank Krystel Saroul, Marie Pavie and Stéphane Père for taking care of our Astyanax breeding colony. We thank Juliette Bitard and Karine Parain for help in qPCR experiments. This work has benefited from the facilities and expertise of the sequencing platforms of I2BC, Gif sur Yvette. We thank Yan Jaszczyszyn from the I2BC sequencing platform for fruitful interactions.
Footnotes
Author contributions
Conceptualization: J.L., J.T.-P., S.R.; Methodology: J.L., J.T.-P., F.A.; Software: M.P.; Validation: J.L., J.T.-P., S.R.; Formal analysis: J.L., J.T.-P., S.R.; Investigation: J.L., J.T.-P., M.P., F.A.; Resources: J.L., S.R.; Data curation: J.L., S.R.; Writing - original draft: J.L., S.R.; Writing - review & editing: J.T.-P., M.P., F.A., S.R.; Visualization: J.L., J.T.-P., S.R.; Supervision: S.R.; Project administration: S.R.; Funding acquisition: S.R.
Funding
This work was supported by grants to S.R. from Equipe Fondation pour la Recherche Médicale (EQU202003010144), Retina France and the Agence Nationale pour la Recherche (Cavemom).
Data availability
The RNA-seq data have been deposited in the Sequence Read Archive on NCBI under BioProject ID PRJNA848099.
Peer review history
The peer review history is available online at https://journals.biologists.com/dev/lookup/doi/10.1242/dev.202610.reviewer-comments.pdf.
Special Issue
This article is part of the Special Issue ‘Uncovering developmental diversity’, edited by Cassandra Extavour, Liam Dolan and Karen Sears. See related articles at https://journals.biologists.com/dev/issue/151/20.
References
Competing interests
The authors declare no competing or financial interests.