ABSTRACT
Gene expression regulation in eukaryotes is a multi-level process, including transcription, mRNA translation and protein turnover. Many studies have reported sophisticated transcriptional regulation during neural development, but the global translational dynamics are still ambiguous. Here, we differentiate human embryonic stem cells (ESCs) into neural progenitor cells (NPCs) with high efficiency and perform ribosome sequencing and RNA sequencing on both ESCs and NPCs. Data analysis reveals that translational controls engage in many crucial pathways and contribute significantly to regulation of neural fate determination. Furthermore, we show that the sequence characteristics of the untranslated region (UTR) might regulate translation efficiency. Specifically, genes with short 5′UTR and intense Kozak sequence are associated with high translation efficiency in human ESCs, whereas genes with long 3′UTR are related to high translation efficiency in NPCs. In addition, we have identified four biasedly used codons (GAC, GAT, AGA and AGG) and dozens of short open reading frames during neural progenitor differentiation. Thus, our study reveals the translational landscape during early human neural differentiation and provides insights into the regulation of cell fate determination at the translational level.
INTRODUCTION
Based on the central dogma, genetic information flows from DNA to RNA and is finally decoded as proteins, which ultimately determine cell fate and function. Among these processes, translation is the major biosynthetic and the most energy-consuming process during rapid growth (accounting for roughly half of the energy expended) (Roux and Topisirovic, 2012). Within the cell, translation is a multistep process encompassing initiation, elongation, termination, and ribosome recycling, which are regulated by the precise orchestration of ribosomes, tRNAs and numerous translation factors (Hanson and Coller, 2018; Tahmasebi et al., 2018). Dysregulation of translational control has been identified in various human disorders, including neurodegenerative diseases, immunodeficiency, cardiovascular diseases and cancer (Bhat et al., 2015; Kapur et al., 2017; Piccirillo et al., 2014). However, most studies have focused on differences at the RNA level and transcriptional regulation owing to different transcription factor binding and chromatin status (Lee and Young, 2013). In contrast, the regulation rule at translational and post-translational levels appears to be much less clear-cut. Therefore, an investigation into how and to what extent gene expression is controlled by translation is needed and of great significance.
As the most rapidly evolving organ in the human body, the brain is responsible for complex processes, including learning, memory and other higher cognitive functions. Intriguingly, translational control has been found to be associated with these processes (Costa-Mattioli et al., 2005, 2007; Sharma et al., 2020). Many mRNAs are localized in dendrites and axons and stored in a translationally repressed state until signal transduction alleviates repressor activity (Fernandopulle et al., 2021). Dysregulated RNA localization and translation lead to neurodegenerative diseases (Agrawal and Welshhans, 2021). In addition, precise spatiotemporal gene expression through translation control seems to commence at early stages of neural development. Recently, through a series of rigorous biochemical experiments, Zhang and colleagues found that high Amd1 expression was necessary for the self-renewal of mouse embryonic stem cells (ESCs), but blocked the differentiation into neural precursor cells (NPCs), and NPC-enriched miR-762 promoted the translational repression of Amd1 at the initiation of the ESC-to-NPC conversion (Zhang et al., 2012). Lin and colleagues applied a combination of methylated RNA immunoprecipitation sequencing (MeRIP-seq) and polysome profiling on Mettl1- and Wdr4 (responsible for tRNA m7G modification)-depleted mouse ESCs and showed that the translation efficiency of some neural lineage genes, rather than mesodermal or endodermal genes, were decreased, eventually leading to abnormal embryoid body differentiation (Lin et al., 2018). In addition, the RNA-binding protein CSDE1 was found to maintain pluripotency and prevent neural differentiation through translational control (Ju Lee et al., 2017, 1). Although these reports provide some clues about translational control in the neural system, the complicated translational regulation mechanisms and dynamics are still obscure. More systematic studies are required to understand the regulation of translation during neural differentiation.
Human pluripotent stem cells (PSCs), including ESCs and induced pluripotent stem cells (iPSCs), provide an excellent model for investigating early embryonic development in vitro owing to their ability to self-renew and differentiate along multiple lineages. Various in vitro differentiation protocols have been developed over the last two decades, which has improved our understanding of the complicated developmental processes in humans, particularly in early development (Pera and Rossant, 2021; Yiangou et al., 2018). Based on a PSC model, a series of signal factors have been shown to participate in pluripotency maintenance and cell differentiation (Mikryukov et al., 2021; Ohinata et al., 2022). In addition, PSCs provide a convenient in vitro model in which to investigate the function of genes of interest in embryonic development (Costa et al., 2016; Rodrigues et al., 2016; Zhang et al., 2018). These PSC-based studies have been beneficial for improving cell differentiation and promoting biomedical applications.
Here, we combined neural differentiation and ribosome profiling and dissected the translation landscape during human early neural differentiation. We first collected undifferentiated human ESCs and differentiated NPCs using an established, highly efficient neural differentiation protocol. Then, we performed ribosome sequencing (RIBO-seq) along with RNA sequencing (RNA-seq) to analyze the dynamics of translation during early neural induction. We focused on cell type- and developmental stage-specific translational regulation. With previous knowledge of the critical cis-regulatory sequences and trans-acting factors in mRNAs translation, we examined these elements and revealed a close correlation with translation efficiency. We also analyzed the abundance and bias of individual codons. Finally, we identified small open reading frames (sORFs) in noncoding regions based on the ribosome profiling data. We hope the methodology and data presented here will pave a path for additional research into proteomics and translational control during early human neural development.
RESULTS
Generation of RIBO-seq dataset about human early neural differentiation
We differentiated human ESCs into NPCs with a well-established differentiation protocol (Chambers et al., 2009; Meng et al., 2021) (Fig. S1A). To study gene translation dynamics during early neural induction, we performed RNA-seq and RIBO-seq separately on human ESCs and NPCs with three replicates (Fig. 1A). The ribosome-protected fragments (RPFs) were sequenced to a depth of 108-125 million reads, and rRNA-depleted RNA was sequenced to a depth of 66-117 million reads in a strand-specific manner (Table S1), with mapping rates and depths comparable to those in previous reports (Ingolia et al., 2009; Zhang et al., 2022a). First, we evaluated the sequence data using Spearman's rank correlation coefficient. The results were highly reproducible between biological replicates (Fig. S1B). As expected, pluripotency genes POU5F1 and NANOG were downregulated during differentiation, whereas genes representing the neuroectoderm specification, SOX1 and PAX6, were upregulated (Fig. 1B). Gene set enrichment analysis (GSEA) showed that 13 neural-related terms in the Gene Ontology Biological Process (GO-BP) database were significantly upregulated during differentiation (Fig. S1C), confirming the success of neural induction. In addition, reads from RIBO-seq were mostly aligned to protein-coding sequences (CDSs) compared with RNA-seq (Fig. 1C, Fig. S1D), which supports the reliability of the RIBO-seq data (Ingolia et al., 2009). To check the data quality, we implemented a metagene analysis based on the characteristic of codon-by-codon movement of actively translating ribosomes (Fig. 1D). The results showed that RPFs had a distinct 3-nucleotide periodicity, with the peak abruptly starting at −12 nucleotide of the start codon and ending at −15 nucleotide of the stop codon (Fig. 1D, Fig. S1E). Additionally, 5′ ends of the RPFs displayed a striking bias toward the first nucleotide of individual codons (Fig. 1D, Fig. S1E). In contrast, the mRNA reads mapped uniformly across the mRNA, as expected for randomly fragmented mRNA, with no triplet periodicity (Calviello et al., 2016). Collectively, we generated a reliable data resource for the study of translation during early neural induction.
Generation of high quality RIBO-seq data. (A) Schematic overview of the experimental design in this study. (B) mRNA expression level of representative genes shown as transcripts per million (TPM). (C) The distribution of RNA-seq reads (left) and RIBO-seq reads (right) by genomic regions. (D) Top: Schematic overview of the start of the ribosome footprints (RFs) falling near the beginning or end of the CDS. Bottom: Metagene analysis of RFs near the start codon and stop codon used to estimate the A-site position in ESCs. The position of RFs relative to the reading frame in ESCs is shown on the right.
Generation of high quality RIBO-seq data. (A) Schematic overview of the experimental design in this study. (B) mRNA expression level of representative genes shown as transcripts per million (TPM). (C) The distribution of RNA-seq reads (left) and RIBO-seq reads (right) by genomic regions. (D) Top: Schematic overview of the start of the ribosome footprints (RFs) falling near the beginning or end of the CDS. Bottom: Metagene analysis of RFs near the start codon and stop codon used to estimate the A-site position in ESCs. The position of RFs relative to the reading frame in ESCs is shown on the right.
Changes in transcription and translation during early neural development
After confirming the reliability of our data, we compared the detected genes from RIBO-seq and RNA-seq and found that both data had high sensitivity and exhibited significant overlap (Fig. 2A). In addition, compared with previous proteomics data generated by mass spectrometry (Ju Lee et al., 2017; Ooi et al., 2019), our RIBO-seq could detect more expressed genes (12,948 versus 5027). To compare the relationship between translational and transcriptional regulation, we performed enrichment analysis for differentially expressed genes (DEGs) of RIBO-seq and RNA-seq, respectively (Fig. S2A,B, Table S1). We detected 2535 genes with differential RNA levels by RNA-seq (up: 1031; down: 1504) and 2538 genes with expression differences by RIBO-seq (up: 1421; down: 1117). The top enriched biological terms in both RNA-seq and RIBO-seq included terms associated with neural development, such as ‘axonogenesis’, ‘axon guidance’ and ‘regulation of nervous system development’ (Fig. S2A,B). Interestingly, there were indeed different enriched terms between the two datasets. To assess this difference, we integrated the DEGs of transcriptional or translational data and divided them into three groups (BOTH, RNA-seq only, RIBO-seq only; Fig. 2B). There were 1337 genes with changes in both translation and transcription (BOTH), 1198 genes with changes only in transcription (RNA-seq only) and 1201 genes with changes only in translation (RIBO-seq only). By GO analysis, the genes in the ‘BOTH’ group were highly enriched with neural development-related terms (Fig. 2C, Table S2), including ‘axon development’. A similar result was observed for genes in the ‘RNA-seq only’ group. However, genes in the ‘RIBO-seq only’ group were, interestingly, enriched in different terms, such as ‘histone modification’ (Fig. 2C, Table S2). Next, we separated the up- and downregulated genes from these three groups and performed enrichment analysis (Fig. S2C,D, Table S2). Again, ‘histone modification’ was the top translationally upregulated term. The most enriched terms in the ‘RIBO-seq only down’ group included ‘organonitrogen compound biosynthetic process’ and ‘aerobic respiration’. These analyses together indicate that RIBO-seq data could mirror the underlying biological variation during NPC differentiation and illustrate additional changes specifically in translational regulation, such as metabolism and epigenetics.
Transcriptional and translational dynamics during early neural development. (A) Venn diagram showing the detected gene products by RNA-seq, RIBO-seq and proteomics (Ju Lee et al., 2017; Ooi et al., 2019) in ESCs and NPCs. (B) Scatterplot showing the relationship between transcriptional and translational change during NPC differentiation. (C) Enriched GO terms for the different groups in B, sorted by significance for each group. The enrichment was analyzed by g:Profiler. Each cell is colored based on the odds ratio value and asterisks mark statistically significant terms.
Transcriptional and translational dynamics during early neural development. (A) Venn diagram showing the detected gene products by RNA-seq, RIBO-seq and proteomics (Ju Lee et al., 2017; Ooi et al., 2019) in ESCs and NPCs. (B) Scatterplot showing the relationship between transcriptional and translational change during NPC differentiation. (C) Enriched GO terms for the different groups in B, sorted by significance for each group. The enrichment was analyzed by g:Profiler. Each cell is colored based on the odds ratio value and asterisks mark statistically significant terms.
Translational dynamics in metabolism and epigenetics
We next investigated potential translational regulation in metabolism and epigenetics. As shown in Fig. S2E, the key enzymes involved in glycolysis were maintained at high levels during differentiation, and the primary glucose transporters SLC2A1 (GLUT1) and SLC2A3 (GLUT3) even exhibited a higher translation level in NPCs. In contrast, most tricarboxylic acid cycle (TCA)-related key enzymes were translationally downregulated in NPCs (Fig. S2E), except for PDK1 and PDK2 which were both negative regulators and inhibited pyruvate entering the TCA. These results indicate a decrease in the TCA cycle flux during early neural induction. Consistent with this, electron transport chain (ETC) and mitochondrial biogenesis-associated genes were also translationally repressed (Fig. S2E). These data suggest that the NPCs may require a high glycolytic flux, but low levels of oxidative phosphorylation, together with previous Seahorse metabolic data (Cliff et al., 2017), to support a unique energy metabolic character of NPCs.
Additionally, there was a significant translational change in genes associated with polyamine metabolism, including translational upregulation of ODC1 and SAT1 and translational downregulation of AMD1, SRM and SMS, which ultimately led to the accumulation of diamine (putrescine) and the decrease of both triamines (spermidine) and tetra-amines (spermine) (Fig. S2E,F). This observation was consistent with a previous report that the addition of spermine suppresses mouse ESC differentiation (Zhang et al., 2012). We also inspected the transcription and translation of epigenetic factors and found that the transcription of epigenetic factors was essentially unchanged, whereas the translation was upregulated (Fig. S2G). In addition, we compared our data with other translatomic data during neural differentiation (Blair et al., 2017; Rodrigues et al., 2020). In these three datasets, we discovered that there are still 87 epigenetic genes that share the same dynamic. Among the 87 genes, translationally upregulated genes were notably more prevalent than translationally downregulated genes (Fig. S2H). These results suggest that neural differentiation may involve specific translational regulation of metabolic and epigenetic genes.
Translation efficiency is distinct between different cell types
Translation efficiency, calculated as the ratio of ribosome footprint (RF) abundance to RNA expression, is a sensitive indicator representing the translation landscape. In our data, we found a symmetric distribution of translation efficiency values for ESCs and NPCs (Fig. S3A,B, Table S3). Most genes had a translation efficiency value close to 1. We next picked the top and lowest 10% genes and subjected them to enrichment analysis. Unexpectedly, genes with high translation efficiency were cell type or developmental stage specific (Fig. S3C,D, Table S3). In ESCs, genes with high translation efficiency were highly enriched with energy metabolism (such as oxidative phosphorylation and aerobic respiration), suggesting that a highly active metabolism may be essential for the ESC state (Fig. S3C). In NPCs, genes relating to RNA splicing and axon development were efficiently translated and of great importance (Fig. S3D). Unlike above, genes with low translation efficiency seemed highly conserved within different cell types and related to translation and protein processing (Fig. S3C,D). Given that these genes are essential for basic life activity, we checked the transcript abundance and found these genes to be substantially increased over others (Fig. S3E,F). Thus, the low translation efficiency but high transcription level may be a unique feature of genes involved in translation and protein processing, contributing to its fidelity.
Quantitative analysis of transcriptional and translational controls during neuroectoderm differentiation
We integrated all 3736 DEGs (either transcriptional or translational) to provide a more comprehensive picture of expression dynamics during neural differentiation (Fig. 3A, Table S4). As expected, GO-BP terms representing neural identity, such as those involved in neuron development (i.e. ‘axon guidance’, ‘regulation of neuron projection development’ and ‘neuron projection guidance’) were the most enriched. KEGG pathway analysis revealed that WNT signaling, calcium signaling and Rap1 signaling significantly contribute to neural development. In addition, Reactome pathway analysis revealed a significant change in extracellular matrix organization (ECM), along with other analyses [focal adhesion in KEGG, calmodulin-binding, and integrin binding in GO-MF (Gene Ontology Molecular Function)] implying that ECM remodeling occurs during neural differentiation. Because of the redundancy of results from pathway enrichment analyses, we applied the EnrichmentMap method (Merico et al., 2010), and generated the core network shown in Fig. 3B (nodes are colored by enrichment score, and edges are sized based on the number of genes shared by the connected pathways). These results indicate that combining the DEGs from RNA-seq and RIBO-seq improved the enrichment analysis with the refined terms and enhanced significance, which may facilitate our understanding of the dynamic gene regulatory network during differentiation.
Quantitative analysis of transcriptional and translational controls. (A) Enrichment analysis using combined DEGs from RNA-seq and RIBO-seq. GO:BP, biological processes GO terms; GO:CC, cellular component GO terms; GO:MF, molecular function GO terms; KEGG, Kyoto Encyclopedia of Genes and Genomes database analysis; REAC, Reactome pathway analysis. (B) Gene enrichment map using a term redundancy removed algorithm called EnrichmentMap in Cytoscape (Merico et al., 2010). (C) The performance of two different strategies (as described in Materials and Methods) to evaluate the contribution of translation and transcription to a particular process. (D) Examples of processes regulated mainly by translation or transcription. Specifically, cholesterol biosynthesis was translationally repressed, dendrite morphogenesis was translationally promoted, and ECM–receptor interaction was transcriptionally regulated.
Quantitative analysis of transcriptional and translational controls. (A) Enrichment analysis using combined DEGs from RNA-seq and RIBO-seq. GO:BP, biological processes GO terms; GO:CC, cellular component GO terms; GO:MF, molecular function GO terms; KEGG, Kyoto Encyclopedia of Genes and Genomes database analysis; REAC, Reactome pathway analysis. (B) Gene enrichment map using a term redundancy removed algorithm called EnrichmentMap in Cytoscape (Merico et al., 2010). (C) The performance of two different strategies (as described in Materials and Methods) to evaluate the contribution of translation and transcription to a particular process. (D) Examples of processes regulated mainly by translation or transcription. Specifically, cholesterol biosynthesis was translationally repressed, dendrite morphogenesis was translationally promoted, and ECM–receptor interaction was transcriptionally regulated.
We then investigated the proportional contribution of transcriptional and translational regulation to each biological process. To do this, we first calculated two parameters for each biological process: the correlation coefficient of gene fold-changes in RNA-seq and RIBO-seq and the proportion of transcriptional regulated genes (Materials and Methods; Fig. 3C, Table S4). Then, combing the directionality of both layers of regulation (the mean log2 fold-change values of transcription and translation), we performed principal component analysis (PCA; details in Materials and Methods; Fig. S4A). This revealed specific translational upregulation of dendrite morphogenesis genes, which manifests neural differentiation (Fig. 3C,D, Fig. S4A). In addition, cholesterol biosynthesis is largely suppressed translationally during neural differentiation (Fig. 3C,D, Fig. S4A), reflecting the rigorous regulation of cholesterol metabolism in the neural system. ECM–receptor interaction was mostly transcriptionally controlled (Fig. 3C,D, Fig. S4A), with 79% of genes having no significant change in translation efficiency between ESCs and NPCs (Fig. 3D). Other pathways, such as WNT and MAPK signaling, were subject to composite transcriptional and translational control (Fig. S4B-D), indicating that complex response mechanisms are involved in these processes. Because cholesterol is necessary for neuronal functioning and defects in cholesterol metabolism are associated with many central nervous system diseases (Zhang and Liu, 2015), we looked into 24 genes involved in cholesterol metabolism further. The heatmap and representative screenshot of DHCR24 demonstrate that their translational level did indeed decrease during differentiation (Fig. S4E,F). MEME motif analysis revealed that there are five RBP motifs enriched (Fig. S4G), which may regulate the translation of cholesterol metabolism genes. Altogether, our analyses demonstrate that translation is involved in many aspects of neural differentiation to different degrees.
The cis- and trans-elements that may affect translation efficiency
To investigate the underlying mechanism of the translation dynamics during neural differentiation, we evaluated the molecular features of mRNAs showing distinct translation states. We first classified all genes into three groups according to the translation efficiency (TE) value: TE_high: the top 10% genes; TE_low: the lowest 10% genes; TE_middle: the rest of the genes (Fig. S3A,B). Next, we compared the relationship between the length or GC content (GC%) of 5'UTR and 3′UTR to gene translation efficiency, respectively (Table S5). First, we surveyed the GC% of UTR, but there was not a remarkable correlation with translation efficiency (Fig. S5A). However, genes belonging to the ESC_TE_high group tended to have a short 5′UTR, whereas genes belonging to the ESC_TE_low group tended to have a long 5′UTR in ESCs. As a control, the group ESC_TE_middle was at the intermediate level (Fig. 4A). This finding was further corroborated by our experiment, which revealed that as the 5′UTR lengthens, translation efficiency decreased (Fig. S5B-D). Recent studies reported that mRNAs with stringent Kozak sequences and short 5′UTR could be efficiently translated by RPS26-containing ribosomes (Ferretti et al., 2017; Li et al., 2022). Therefore, we extracted the Kozak sequences of all protein-coding genes and summarized them by the position weight matrix (PWM), and then calculated PWM scores for each gene to evaluate the sequence conservation (Fig. 4B, Table S5). The result showed that the scores of the ESC_TE_high group were significantly higher than those of the ESC_TE_low group (P=0.00019) (Fig. 4C). However, this was not conserved in NPCs (P=0.36). These results suggest a specific cis-regulation in human ESCs, which is consistent with recent data in mouse ESCs (Li et al., 2022). Additionally, using published icSHAPE data from human ESCs (Sun et al., 2021), we found that genes of ESC_TE_high had higher reads signals than ESC_TE_low genes (Fig. 4D), suggesting that secondary structures within the 5'UTR represses gene translation in human ESCs. In addition, we found that 3'UTR positively correlates with the TE value in NPCs (Fig. 4A). Next, we compared the sequence features of genes in the NPC_TE_high group with those of the NPC_TE_low. The result showed that genes encoding 45 RNA-binding proteins were enriched in both groups (Fig. 4E, Fig. S5E), including poly(A) binding protein (PABP) and heterogeneous nuclear ribonucleoprotein (hnRNP) and serine/arginine-rich (SR) splicing factor protein (SRSF). However, we found that FMR1 and ESRP2 were explicitly enriched in NPC_TE_high group genes (Fig. 4E). Because FMR1 is a well-known translational regulator in the neural system, we performed GSEA and found that most NPC_TE_high genes were translationally downregulated in the FMR1 knockout (FMR1_KO) neural cortex (Das Sharma et al., 2019) (Fig. 4F). We compared our potential FMR1 targets with a previous study that used proximity-based techniques to crosslink mRNAs in brain tissue before FMR1 immunoprecipitation (Darnell et al., 2011). We found that 178/470 (37.9%) genes could be directly bound by FMR1. Among these 178 genes, 69 (38.8%) were autism-related genes (SFARI database), such as SHANK3, NRXN1 and CHD8 (Fig. 4G).
Factors affecting translation efficiency. (A) Density plots showing the association between UTR length (5′UTR and 3′UTR) and gene TE for ESCs and NPCs. P-values were calculated using Kruskal–Wallis test. (B) The ranking of Kozak PWM-based scores of protein-coding genes (PCGs) reflecting the probability of Kozak sequence match. (C) Box plot showing the Kozak PWM-based scores for distinct TE groups in ESCs and NPCs. Horizontal lines within the box plots are the median values, and box limits represent upper and lower quartiles of PWM scores of each group. The dashed line represents the mean PWM score of all genes. (D) The RNA structures of 5′UTR were derived from icSHAPE data. High signals indicate that the nucleotides in this region are not paired, whereas low signals indicate the formation of RNA secondary structure. (E) Venn diagram showing the number of unique and overlapping RBPs enriched in 3′UTR. (F) GSEA showing a positive enrichment profile of genes translationally downregulated in the FMR1 knockout (FMR1_KO) cortex. Genes were ordered according to their TE values in NPCs. NES, normalized enrichment score. (G) Venn diagram showing the overlap with FMR1 target genes (Darnell et al., 2011) and SFARI autism genes. (H) Schematic overview to analyze the codon bias using ribosome profiling data. The position of the A-site was determined by the metagene analysis shown in Fig. 1. (I) Four codons were retained (GAC and GAT for Asp, AGA and AGG for Arg) with significant changes. P-values were calculated using Kruskal–Wallis test.
Factors affecting translation efficiency. (A) Density plots showing the association between UTR length (5′UTR and 3′UTR) and gene TE for ESCs and NPCs. P-values were calculated using Kruskal–Wallis test. (B) The ranking of Kozak PWM-based scores of protein-coding genes (PCGs) reflecting the probability of Kozak sequence match. (C) Box plot showing the Kozak PWM-based scores for distinct TE groups in ESCs and NPCs. Horizontal lines within the box plots are the median values, and box limits represent upper and lower quartiles of PWM scores of each group. The dashed line represents the mean PWM score of all genes. (D) The RNA structures of 5′UTR were derived from icSHAPE data. High signals indicate that the nucleotides in this region are not paired, whereas low signals indicate the formation of RNA secondary structure. (E) Venn diagram showing the number of unique and overlapping RBPs enriched in 3′UTR. (F) GSEA showing a positive enrichment profile of genes translationally downregulated in the FMR1 knockout (FMR1_KO) cortex. Genes were ordered according to their TE values in NPCs. NES, normalized enrichment score. (G) Venn diagram showing the overlap with FMR1 target genes (Darnell et al., 2011) and SFARI autism genes. (H) Schematic overview to analyze the codon bias using ribosome profiling data. The position of the A-site was determined by the metagene analysis shown in Fig. 1. (I) Four codons were retained (GAC and GAT for Asp, AGA and AGG for Arg) with significant changes. P-values were calculated using Kruskal–Wallis test.
Owing to the balance between mutational drift and natural selection, synonymous codons are not utilized arbitrarily (Bulmer, 1991). During mammalian ontogenesis, codon usage is associated with developmental stage-related patterns of gene expression (Ren et al., 2007). Here, we collected codons from the ribosomal A-site and compared the codon usages between ESCs and NPCs (Fig. 4H, Table S5). We found there were four codons enriched (fold-change>1.2) or deprived (fold-change<0.8) with a significant P-value (P<0.05) that encoded two amino acids (Asp and Arg) (Fig. 4I). In order to explore whether specific biological processes were affected, genes enriched with these four codons were ranked and GO analysis was performed (Fig. S5F,G). It is somewhat unexpected that genes from three of these four different codons (except GAC) enriched to a common theme: RNA splicing. In addition, we noticed that a recent paper reported that depletion of the arginine tRNA gene n-Tr20 and GTP binding protein 2 (Gtpbp2) leads to the death of brain cells soon after birth in mice (Ishimura et al., 2014). Of note, whereas Gtpbp2 is ubiquitously expressed, n-Tr20 is capable of decoding the AGA codon and is expressed predominantly in the central nervous system. In our results, the codon AGA (encoding Arg) was the most enriched codon in NPCs. This raises an intriguing question of whether AGA codon anomalies can influence nervous system development or lead to neurological illnesses.
Identifying sORFs from noncoding regions
Translated sequences in noncoding regions have attracted more attention in recent years (Mackowiak et al., 2015; van Heesch et al., 2019). Therefore, we conducted a de novo recognition for actively translated ORFs using RiboTaper (Calviello et al., 2016) (Fig. 5A). Interestingly, we identified 216 upstream open reading frames (uORFs) in 178 genes and eight downstream open reading frames (dORFs) in eight genes (Table S6). One gene of great interest is SLC2A3, which encodes a significant neuronal glucose transporter (Maher et al., 1992). There was a uORF in the 5′UTR of SLC2A3 mRNA, and the uORF of SLC2A3 repressed the translation of the downstream main ORF in NPCs (Fig. 5B). Then, we put this uORF before the CDS of EGFP and then transfected the vector into HEK293T cells to determine the impact of uORF on the translation of EGFP. Interestingly, this uORF did indeed suppress the translation of the following CDS sequence (EGFP) (Fig. 5C). Like previous reports (Hinnebusch, 2005; Vattem and Wek, 2004), most uORFs identified in this study repressed the translation of the host gene (Fig. 5D).
Small ORFs from noncoding regions. (A) Summary of uORFs, CDS ORFs and dORFs detected by RiboTaper. (B) bedGraph analysis of RNA-seq and RIBO-seq at the SLC2A3 locus. The shaded area indicates the uORF of SLC2A3. (C) uORF represses the translation of downstream ORFs. Three vectors [EGFP, lncORF-EGFP (the lncORF is from AC016044.1), uORF-EGFP (the uORF is from SLC2A3)] were transfected into HEK293T cells. Left: Representative flow cytometry plots showing the percentage of GFP-positive cells. Right: Bar plot showing the mean percentages of GFP-positive cells (normalized to EGFP group, n=4). (D) Violin plot showing the decreased translation efficiency of genes with uORF in comparison to genes without uORF. (E) A two-step method to classify lncRNAs. Transcribed lncRNA (RNA-seq TPM>1) without detected sORF were classified as ‘conventional lncRNA’, lncRNA with detected sORF but without 3-nt periodicity were classified as ‘ribosome-associated lncRNA’, and lncRNA with both detected sORF and 3-nt periodicity in the sORF were classified as ‘translated lncRNA’. (F) The list of translated lncRNAs identified by the pipeline showed in E. Red indicates previous evidence at the protein level; blue indicates previously identified by RIBO-seq; black indicates newly identified in this study.
Small ORFs from noncoding regions. (A) Summary of uORFs, CDS ORFs and dORFs detected by RiboTaper. (B) bedGraph analysis of RNA-seq and RIBO-seq at the SLC2A3 locus. The shaded area indicates the uORF of SLC2A3. (C) uORF represses the translation of downstream ORFs. Three vectors [EGFP, lncORF-EGFP (the lncORF is from AC016044.1), uORF-EGFP (the uORF is from SLC2A3)] were transfected into HEK293T cells. Left: Representative flow cytometry plots showing the percentage of GFP-positive cells. Right: Bar plot showing the mean percentages of GFP-positive cells (normalized to EGFP group, n=4). (D) Violin plot showing the decreased translation efficiency of genes with uORF in comparison to genes without uORF. (E) A two-step method to classify lncRNAs. Transcribed lncRNA (RNA-seq TPM>1) without detected sORF were classified as ‘conventional lncRNA’, lncRNA with detected sORF but without 3-nt periodicity were classified as ‘ribosome-associated lncRNA’, and lncRNA with both detected sORF and 3-nt periodicity in the sORF were classified as ‘translated lncRNA’. (F) The list of translated lncRNAs identified by the pipeline showed in E. Red indicates previous evidence at the protein level; blue indicates previously identified by RIBO-seq; black indicates newly identified in this study.
Because long noncoding RNAs (lncRNAs) can also encode small peptides, we used an integrated strategy (Fig. 5E, Table S6) and divided lncRNAs into three groups: Group 1, the expressed lncRNAs without ribosome binding; Group 2, the ribosome-associated lncRNAs without coding potential; and Group 3, the bona fide translated lncRNAs. Consequently, we identified 581, 114 and 31 lncRNAs belonging to Groups 1, 2 and 3, respectively, in ESCs, and 757, 98 and 13, respectively, in NPCs. Supporting this analysis, a well-studied peptide-coding lncRNA, TUG1 (Lewandowski et al., 2020; van Heesch et al., 2019), and several other peptide-coding lncRNAs reported previously (Guo et al., 2020a; Hansji et al., 2016; van Heesch et al., 2019) appeared in the candidate list with translation potential (colored in red, Fig. 5F). To explore whether these lncRNAs are human specific, we used the phastCons100 score (Fig. S6A,B), which is calculated from the results of multiple sequence alignments of 100 vertebrate species in the UCSC genome browser database. We found that some lncORFs may be conserved across species (i.e. with high phastCons100 score; Fig. S6A,B). For example, the sORF of TUG1 has been reported to be conserved between human and mouse and can produce a stable protein that impacts mitochondrial membrane potential (Lewandowski et al., 2020). We also noticed that there are indeed some lncORFs with low phastCons100 score, suggesting that some lncRNAs/lncORFs may only exist in human (Fig. S6A,B). Finally, we compared the RNA expression of these lncRNAs in ESCs and NPCs. Some lncRNAs are differentially translated/expressed between ESCs and NPCs, such as LNCPRESS1, an ESC-specific lncRNA associated with pluripotency (Jain et al., 2016). Another interesting lncRNA is JAKMIP2-AS1, which we identified as containing a sORF in both ESCs and NPCs, having a high expression level at NPC (Fig. S6C,D). More studies are required to understand the function of JAKMIP2-AS1 and its associated sORF in neural differentiation.
DISCUSSION
Protein synthesis alters as cells differentiate or transition from quiescence to an active state (Baser et al., 2019). To investigate how translation contributes to cell differentiation, we set out to quantify the transcriptome (RNA-seq) and translatome (RIBO-seq) of human ESCs and NSCs in parallel (Fig. 1). In detail, we developed a method to highlight the discrepancies between transcriptome and translatome data. According to the results, a subset of genes is specifically regulated at the translational level. Following this finding, we investigated the sequence features of translationally regulated genes. In ESCs, we found that long 5′UTR length and 5′UTR secondary structure were correlated with low translation efficiency (Fig. 4). In NPCs, the 3′UTR appeared to be more significant (Fig. 4), which is consistent with previous reports (Miura et al., 2013; Tushev et al., 2018). We also developed an algorithm to determine how much transcriptional or translational control affects each biological function during early neural development (Fig. 3). The results showed that dendrite morphogenesis-related genes are mainly translationally upregulated, whereas cholesterol biosynthesis was mostly translationally repressed. Several crucial pathways, such as WNT and MAPK signaling, were also discovered to be under composite transcriptional and translational regulation. Our work collected transcriptome and translatome data in early neural development and established a strategy to explore stage-specific translational regulation.
Post-transcriptional regulation is essential in the nervous system. During neuronal differentiation, mRNAs undergo rapid coordinated local translation in response to external signals according to their length and complexity (Agrawal and Welshhans, 2021; Fernandopulle et al., 2021). Here, we found that translational regulation is also crucial in early neural development. We uncovered that translation specifically influences specific metabolic genes involved in glucose and polyamine metabolism (Fig. S2E). Previous studies have revealed straightforward metabolic switching during neural development. Unlike definitive endoderm and mesoderm, which heavily utilize oxidative phosphorylation for their energy requirements, high glycolytic flux is required for the ectoderm germ layer (Cliff et al., 2017; Lees et al., 2018; Lu et al., 2019; Lv et al., 2022). Translation regulation may thus play a role in the establishment of unique energy metabolism patterns in the neural lineage, which requires further investigation. Compared with glucose metabolism, the dynamic of polyamine metabolism during this process is not very clear. To our best knowledge, there is only one report that miRNA-inhibited translation of Amd1 plays a crucial role during neural differentiation (Zhang et al., 2012). Notably, supplementation with the metabolite spermine blocked neural differentiation, suggesting a potential application of metabolic regulation of neural induction. In our data, we observed not only translational downregulation of AMD1 but also translational upregulation of another rate-limiting enzyme, ODC1. Along with the downregulation of spermidine synthase (SRM) and spermine synthase (SMS) and the upregulation of spermidine/spermine N1-acetyltransferase 1 (SAT1), our data indicate that more tri- and tetra-amines were converted to diamine during neural differentiation (Fig. S2E,F). These different kinds of polyamines have distinct roles in development (Zhang et al., 2012; Zhao et al., 2012). Notably, the substrate of AMD1 is S-adenosylmethionine (SAM), which provides a methyl group for DNA/histone methylation and thus is involved in epigenetic modifications. Downregulation of AMD1 may increase the abundance of SAM in cells and contribute to epigenetic reprogramming during neural differentiation. Therefore, more studies are needed to dissect fully the complex crosslinks between translation regulation, epigenetics, metabolism, and cell fate determination.
Here, we revealed that translationally boosted genes are stage specific. In ESCs, energy metabolism genes have high translation efficiency, whereas in NPCs translation was accelerated for genes involved in RNA splicing and axon formation (Fig. S3C,D). However, genes with low translation efficiency were remarkably similar in both stages. These genes were enriched in terms related to mRNA translation and protein processing (Fig. S3C,D). We note that a recent study used a machine-learning method integrating RNA, translation and protein data to study gene expression control in lymphoblastoid cell lines (Cenik et al., 2015). They also found that genes involved in protein synthesis exhibited low translation efficiency. It is possible that the protein synthesis process generally has low translation efficiency in different cell types, but this hypothesis requires more data.
Furthermore, we found that genes with high translation efficiency in ESCs have stringent Kozak sequence, short 5′UTR and fewer RNA secondary structures in the 5′UTR (Fig. 4A-D), which may stimulate 40S ribosomes to scan for the start codon (Ferretti et al., 2017; Li et al., 2022). In NPCs, we did not observe a similar signature in 5′UTR. In addition, we found that the RNA-binding motif of FMR1 was enriched in the 3′UTR of a significant portion of genes with high translation efficiency in NPCs (Fig. 4E). More importantly, these genes were translationally downregulated in the FMR1-knockout cortex (Das Sharma et al., 2019) (Fig. 4F). Besides FMR1, we identified ESRP2, PCBP1, ZC3H10 and RBM8A as potential RNA-binding proteins involved in such translational control in NPCs. Audano and colleagues recently found that mouse Zc3h10 inhibits protein synthesis during the early stages of adipogenesis, and overexpression of Zc3h10 remarkably increased lipid droplet size (Audano et al., 2021). In addition, RBM8A has also been reported to play a role in many processes, including Alzheimer's disease (Zou et al., 2019). Thus, it would be of great interest to determine whether these RNA-binding proteins affect translation during neural development. Besides the cis-element and RNA-binding proteins, codon usage bias may also contribute to translation regulation. Non-optimal codons can reduce ribosome translocation rates, which can feed back to the synthesis of protein (Hanson and Coller, 2018; Nagayoshi et al., 2021; Ren et al., 2007). Using RIBO-seq data, we identified four codons (AGA, AGG, GAC, GAT) biasedly used in ESCs and NPCs (Fig. 4H,I). The AGA codon is of great significance because n-Tr20 (Arg-UCU-4-1), a tRNA gene specifically expressed in the brain, has been linked to neurodegeneration and early death in mice (Ishimura et al., 2014). At present, the effect of codons on gene expression has been widely explored in lower organisms (dos Reis et al., 2004; Ikemura, 1981), but for multicellular eukaryotes there are limited experimental data available and whether it is as similar as in unicellular organisms remains unclear.
lncRNAs are involved in the process of cell differentiation. Most studies have explored their functions at the RNA level, but recent studies found that lncRNAs can exert effects by encoding small peptides (Lewandowski et al., 2020; Wang et al., 2020; Zhang et al., 2022b). In 2019, van Heesch and colleagues successfully validated 44 microproteins produced by lncRNAs in the human heart (75% of 58 candidates identified by RIBO-seq data) (van Heesch et al., 2019), indicating that RIBO-seq is an excellent approach to identify translated lncRNAs. Here, we developed a discovery pipeline to identify translated lncRNAs in human ESCs and NPCs (Fig. 5E,F), and multiple lncRNAs identified have been well-characterized by their noncoding roles (Hansji et al., 2016; Young et al., 2005). In addition, we also identified 212 ribosome-associated lncRNAs, having a RF but without 3-nt periodicity (Calviello et al., 2016). These ribosome-localized lncRNAs have been poorly studied thus far (Hansji et al., 2016), although lncRNAs have been demonstrated to perform different functions based on the distinct subcellular localization (Guo et al., 2020b; Jiang et al., 2015; Yang et al., 2020; Zheng and Xiang, 2022).
MATERIALS AND METHODS
ESC culture and neural differentiation
The human ESC line HUES8 was grown in Matrigel-coated plates (Matrigel hESC-Qualified Matrix, BD Biosciences, 354277) with mTeSR1 media (STEMCELL Technologies, 1000023391) and passaged with Accutase (STEMCELL Technologies, A1110501) until the density reached roughly 80-90%. The medium was replaced every day. Dual SMAD inhibition was used for neural differentiation according to a previous protocol (Meng et al., 2021). Briefly, human ESCs were re-plated at a density of 5×104 cells per well in 24-well plates. After 2 days of cultivation in mTeSR1, neuroectoderm differentiation was triggered by neural induction media for 7 days [KnockOut™ DMEM/F12 (Gibco, 12660012), 0.5× N2 (Shanghai BasalMedia Technologies, M430721), 0.5× B27 without vitamin A (Shanghai BasalMedia Technologies, B430805), 1% penicillin/streptomycin (Thermo Fisher Scientific, 15140163), 1% 2-mercaptoethanol (Gibco, 2121115), 1% GlutaMAX™ Supplement (Thermo Fisher Scientific, 35050061), 1% NEAA (Gibco, 11140-050), plus 2.5 μM SB431542 (Selleck Chemicals, S1067) and 0.05 μM LDN193189 (Selleck Chemicals, S7507)]. The medium was changed daily, and the TGFβ inhibitor SB431542 was withdrawn from the neural induction media for the last 2 days.
RIBO-seq sample pretreatment
Harringtonine (LKT Laboratories, H0169) was added to the cell culture medium at 2 μg/ml concentration to immobilize starting ribosomes. Next, 100 μg/ml cycloheximide (Sigma-Aldrich, C48591ML) was added to the cell growth medium to stop translational elongation (100 μg/ml cycloheximide was added to all downstream buffers). Cells were mixed well and proceeded immediately to cell lysis. The resuspended extracts in lysis buffer were transferred to new microtubes, pipetted several times, and then incubated on ice. Then cells were triturated ten times through a 26-G needle. The lysate was centrifuged at 4°C (20,000 g for 10 min), and the supernatant was collected.
To prepare RFs, RNase I and DNase I were added to the lysate with gentle mixing on a Nutator mixer. First, nuclease digestion was stopped by adding SUPERase·In RNase inhibitor. Next, size exclusion columns (illustra MicroSpin S-400 HR Columns, GE Healthcare, 27-5140-01) were equilibrated with polysome buffer by gravity flow, and digested RFs were added to the column and centrifuged (600 g for 2 min at room temperature). Next, 10% (wt/vol) SDS was added to the elution, and RFs greater than 17 nt were isolated according to the RNA Clean and Concentrator-25 kit (Zymo Research, R1017). rRNA was removed using DNA probes complementary to rRNA sequences, and then RNase H and DNase I were used to digest the probes. Finally, RFs were purified using magnet beads (Vazyme). After obtaining the RFs above, RIBO-seq libraries were constructed using NEBNext® Multiple Small RNA Library Prep Set for Illumina® (E7300S, E7300L). Briefly, adapters were added to both ends of RFs, followed by reverse transcription and PCR amplification. PCR products of 140-160 bp were enriched to generate a cDNA library and sequenced using Illumina HiSeq™ 2500 by LC Biotechnology (Hangzhou, China).
RNA-seq sample pretreatment
Total RNA was extracted using Trizol reagent (Invitrogen) according to the manufacturer's protocol. RNA quality was assessed on an Agilent 2100 Bioanalyzer (Agilent Technologies) and checked using RNase-free agarose gel electrophoresis. After total RNA was extracted, rRNAs were removed. The enriched mRNAs and noncoding RNAs were broken into short fragments using fragmentation buffer and reverse transcribed into cDNA with random primers. Next, second-strand cDNA was synthesized with DNA polymerase I, RNase H, dNTP (dUTP instead of dTTP), and buffer. Next, the cDNA fragments were purified with a QiaQuick PCR extraction kit (QIAGEN), end-repaired, poly(A) added, and ligated to Illumina sequencing adapters. Then UNG (uracil-N-glycosylase) was used to digest the second-strand cDNA. The digested products were size-selected by agarose gel electrophoresis, PCR amplified and sequenced using Illumina HiSeqTM 4000 by LC Biotechnology (Hangzhou, China).
Processing of RNA-seq and RIBO-seq
The gene annotation used in this study was GRCh38 (v31) from the GENCODE database. Before read alignment, RIBO-seq reads were trimmed for remaining adaptor sequences and reads aligning to mitochondrial, ribosomal RNA, and tRNA sequences were filtered using Bowtie (Langmead et al., 2009). Next, mRNA-seq and RIBO-seq reads were aligned to the human genome (hg38) using STAR (Dobin et al., 2013) with the default parameters (except --outFilterMultimapNmax 1). Secondary alignments were filtered out, ensuring that each aligned read had just one genomic location. Then, reads were counted using featureCount (Liao et al., 2014). Custom scripts were used to calculate transcripts per million (TPM) and counts per million (CPM). deepTools (Ramírez et al., 2016) and bedtools (Quinlan and Hall, 2010) were used to create gene tracks.
RiboTaper (Calviello et al., 2016) was used to detect ORFs based on RIBO-seq data. For each read length, we performed metagene analysis of RFs near the start codon and stop codon to infer the P-site locations. Based on the results, read lengths ranging from 25 nt to 31 nt was kept and the −12 nt was used to define P-site positions.
To compare and integrate mRNA-seq and RIBO-seq data, raw read counts (RNA-seq and RIBO-seq) were subjected to read normalization, size factor estimation, and differential expression analysis between NPCs and ESCs using DESeq2 (Love et al., 2014). All genes that we deemed to be expressed are included in this analysis (average RNA-seq TPM>1 and RIBO-seq CPM>1; n=13,362). We considered a gene differentially transcribed or translated when it met genome-wide significance thresholds of false discovery rate (FDR)<0.05 and log2fold change >1.8 for translation, and >1.1 for transcription. The cutoffs were selected according to data distribution, which provides a proper comparison between RNA-seq and RIBO-seq data and then helps determine the precise contribution of transcriptional regulation to enriched processes in downstream investigations. Translation efficiency estimations were calculated by taking the ratio of RIBO-seq over RNA-seq counts. Then, differential translation evaluations were performed by deltaTE (Chothani et al., 2019).
Enrichment analysis
GO, Reactome and KEGG enrichment analyses were performed with g:Profiler (Reimand et al., 2019) (https://biit.cs.ut.ee/gprofiler/gost). Because pathway enrichment analysis frequently reveals many versions of the same pathway, such redundancy was processed with EnrichmentMap (Merico et al., 2010). To define the main contributing layer for each enriched biological process (transcription or translation), we defined two parameters: (1) the correlation coefficients between RNA-seq fold-change and RIBO-seq fold-change of genes from a particular pathway, and (2) the fractions of genes that were transcriptionally regulated (−1<log2 fold-change of translation efficiency<1). The two parameters had a good linear correlation (Fig. 3C). Then, for each enriched term, we added the mean transcription, translation, and log2 fold-change of translation efficiency and performed PCA. Each term's position in the PCA plot is thus determined by the balance of transcriptionally and translationally controlled genes and the directionality (up or down).
Translation efficiency correlations with mRNA features
The transcription FASTA sequence was obtained from the GENCODE database (https://www.gencodegenes.org). For each gene, the longest isoform was used in downstream analysis. The length, GC% of 5′UTR, and 3′UTR for each gene were calculated by Python code developed in-house. The region surrounding the start codon (from −6 to +6 nt) was extracted to test the Kozak sequence's global effect on translation efficiency. Then, the PWM was obtained using the R library ggseqlogo (Wagih, 2017). mRNA secondary structure was evaluated using previously published icSHAPE sequencing data in human ESCs (Sun et al., 2021) (GSE145805). RBP-binding motif analysis was performed with Analysis of Motif Enrichment (AME) (McLeay and Bailey, 2010) from MEME suite (http://meme-suite.org/tools/ame).
Codon bias analysis
First, ribosome A-site locations were predicted by RiboTaper (Calviello et al., 2016), and then the A-site codon was extracted using a Python script developed in-house. Next, we counted each codon frequency and for each gene the raw count number for a particular codon was divided by all codon counts mapped to this gene. We considered a codon to be differentially used when it met the following thresholds: P<0.05 and fold-change <0.8 or >1.2.
Vector construction
To generate pTY-EF1a-utr-mCherry-IRES-GFP, mCherry was cloned from PCW-Tet-puro-mCherry (modified from Addgene plasmid #50661) by PCR. In order to verify that the long 5′UTR represses translation in ESCs, we selected the 5′UTR of ALKBH5 (which had a low TE in ESCs). We cloned the full-length 5′UTR (1-416 nt) and three truncated versions (1-50 nt, 1-130 nt, 1-250 nt; Table S5) from the genome. Then, we inserted 5'UTR and mCherry in place of the lacZ in the lentiviral expression vector pTY-EF1a-lacZ-IRES-GFP (modified from Addgene plasmid #61738; lacZ was between SpeI and EcoRI restriction sites; Fig. S5B) using Gibson assembly cloning. Combined with psPAX2 (Addgene plasmid #12260) and pMD2.G (Addgene plasmid #12259) plasmids, a lentiviral vector system was used to express these vectors in ESCs. Finally, mCherry and GFP levels were quantified by measuring fluorescence intensities by flow cytometry. The translational rates were computed by mCherry levels/GFP levels as previously described (Ferreira et al., 2013).
Acknowledgements
We sincerely thank the core facility of the Medical Research Institute at Wuhan University for the technical support. We thank Mao Li, Pei Lu, Ran Liu and other laboratory members for their technical help and constructive discussion.
Footnotes
Author contributions
Conceptualization: C.Y., J.C., W.J.; Methodology: C.Y., Y.M.; Validation: C.Y.; Investigation: C.Y.; Writing - original draft: C.Y., J.Y., W.J.; Writing - review & editing: J.C., W.J.; Visualization: C.Y.; Supervision: J.C., W.J.; Project administration: W.J.; Funding acquisition: J.C., W.J.
Funding
This work was supported by the National Key Research and Development Program of China (2016YFA0503100), the Science and Technology Program of Hubei Province (2020CFA017), the Fundamental Research Funds for the Central Universities in China (2042021kf0207), and the Seed Fund Program for Sino-Foreign Joint Scientific Research Platform of Wuhan University (WHUDKUZZJJ202205).
Data availability
All data generated in this study have been deposited in Gene Expression Omnibus under accession number GSE208082. All major codes used in our analysis have been uploaded to GitHub (https://github.com/iamycc/development_translatome).
References
Competing interests
The authors declare no competing or financial interests.