ABSTRACT
Identification of cell type-specific cis-regulatory elements (CREs) is crucial for understanding development and disease, although identification of functional regulatory elements remains challenging. We hypothesized that context-specific CREs could be identified by context-specific non-coding RNA (ncRNA) profiling, based on the observation that active CREs produce ncRNAs. We applied ncRNA profiling to identify rod and cone photoreceptor CREs from wild-type and mutant mouse retinas, defined by presence or absence, respectively, of the rod-specific transcription factor (TF) Nrl. Nrl-dependent ncRNA expression strongly correlated with epigenetic profiles of rod and cone photoreceptors, identified thousands of candidate rod- and cone-specific CREs, and identified motifs for rod- and cone-specific TFs. Colocalization of NRL and the retinal TF CRX correlated with rod-specific ncRNA expression, whereas CRX alone favored cone-specific ncRNA expression, providing quantitative evidence that heterotypic TF interactions distinguish cell type-specific CRE activity. We validated the activity of novel Nrl-dependent ncRNA-defined CREs in developing cones. This work supports differential ncRNA profiling as a platform for the identification of cell type-specific CREs and the discovery of molecular mechanisms underlying TF-dependent CRE activity.
INTRODUCTION
Identification of tissue- and context-specific cis-regulatory elements (CREs) is essential for the definition of the transcriptional networks that govern physiology and disease. Although high-throughput sequencing methods to monitor histone modifications, chromatin status, and transcription factor (TF) occupancy have led to the identification of many CREs, it has been shown that many putative regulatory elements do not appear to affect gene expression, making them unlikely to represent functional CREs (Bonn et al., 2012; ENCODE Project Consortium, 2012; Creyghton et al., 2010; Dogan et al., 2015; Visel et al., 2009; Werner and Ruthenburg, 2015). The development of massively parallel reporter assays (MPRAs) has allowed for high-throughput analysis of potential CREs for enhancer activity (Hartl et al., 2017; White et al., 2013, 2016). However, most of these approaches depend on a priori knowledge of candidate CREs and often test CREs in non-native chromatin configurations. To interrogate context-specific CRE function accurately, complementary strategies for the identification of functional enhancers are necessary. One such approach is the interrogation of differentially expressed non-coding RNAs (ncRNAs) to identify active CREs in context-specific states (Mikhaylichenko et al., 2018; Xiong et al., 2018; Yang et al., 2017).
Non-coding RNAs are often transcribed from active regulatory elements and serve as a hallmark of functional enhancers (Davidovich and Cech, 2015; Engreitz et al., 2016; Natoli and Andrau, 2012; Wang et al., 2011; Wu et al., 2014; Yang et al., 2017). Although ncRNAs can be produced from active regulatory elements, the genome is pervasively transcribed, leading to an abundance of non-coding transcripts, making identification of CRE-specific transcription potentially difficult. For example, several classes of functional ncRNAs have been described in the retina, including microRNAs (miRNAs) and long ncRNAs (lncRNAs) (Blackshaw et al., 2004; La Torre et al., 2013; Zelinger et al., 2017). To isolate CRE-based ncRNA transcripts, TF-dependent ncRNA profiling has been utilized in other systems to identify highly active regulatory elements at loci that are important for cardiac rhythm (Yang et al., 2017). We hypothesized that TF-dependent ncRNA profiling may facilitate the identification of TF-dependent retinal CREs.
Over the past two decades, the gene regulatory networks (GRNs) that regulate photoreceptor development have been investigated, leading to the identification of many crucial TFs (Cepko, 2015). Retinal progenitor cells (RPCs) that express Otx2 develop into both rod and cone photoreceptors, which are subsequently distinguished by the expression of additional TFs (Koike et al., 2007; Nishida et al., 2003). Cones are specifically produced from RPCs that express Olig2 and Oc1 (Onecut1), in addition to Otx2 (Altshuler and Lillien, 1992; Emerson et al., 2013; Gonzalez-Cordero et al., 2017; Hafler et al., 2012). Rods are specified by the expression of Nrl, a basic leucine zipper TF. Nrl is required for the formation of rods as it activates a rod-specific GRN and represses a cone-specific GRN (Chen, 2005; Mears et al., 2001; Oh et al., 2007; Swain et al., 2001). The pan-retinal photoreceptor TF Crx is also required for the differentiation and maintenance of both rods and cones (Chen et al., 1997; Furukawa et al., 1997).
Defining cone-specific CREs has been challenging due to the small number of cones in the mouse retina, where rods comprise the vast majority of photoreceptors (Jeon et al., 1998; Young, 1985). However, in Nrl−/− mutant mice, cells normally fated to become rods are transformed into cone-like cells (Mears et al., 2001). The abundance of cones in the Nrl−/− mouse has been used to investigate cone-specific gene regulation. For example, previous work has profiled rod- and cone-enriched coding and non-coding transcripts (Hsiau et al., 2007; Hughes et al., 2017; Swaroop et al., 2010; Zelinger et al., 2017), and distinguished the regulatory landscape in the rod-predominant wild-type (WT) retina from the cone-predominant Nrl−/− mutant retina, by assay for transposase-accessible chromatin (ATAC)-seq and CRE-seq (Hughes et al., 2017, 2018; Mo et al., 2016).
Models for combinatorial TF activity in rod and cone GRNs have been generated by examining TF binding in WT (rod-dominant) and Nrl−/− (cone-dominant) retinas. CRX plays a key role in driving photoreceptor GRNs by promoting chromatin remodeling at specific target sites and recruiting additional TFs to direct photoreceptor differentiation and gene expression (Andzelm et al., 2015; Chen et al., 1997; Furukawa et al., 1997; Hennig et al., 2008; Hsiau et al., 2007; Hughes et al., 2017, 2018; Livesey et al., 2000; Peng and Chen, 2005; Ruzycki et al., 2018). The effect of CRX binding changes dramatically when co-bound with NRL. In the presence of NRL, CRX and NRL act cooperatively to induce gene expression robustly, whereas in the absence of NRL, CRX appears to drive gene expression to a lesser extent (Corbo et al., 2007, 2010; Hao et al., 2012; Hsiau et al., 2007; Hughes et al., 2017; Mitton et al., 2000; Mo et al., 2016; Swaroop et al., 2010; White et al., 2016). These findings are based on biochemical studies that interrogate physical interactions between NRL and CRX, their genomic localization, the impact of Nrl and Crx mutant alleles on retinal gene expression, and examination of the activity of NRL and CRX on photoreceptor CREs (Corbo et al., 2007; Hughes et al., 2017; Mitton et al., 2000; Oh et al., 2015; Swaroop et al., 2010; White et al., 2016). These data suggest a heterotypic model for photoreceptor-specific GRN selection in which combinatorial binding of NRL and CRX drive expression of a rod-specific GRN and CRX binding alone drives the expression of a cone-specific GRN.
Here, we applied ncRNA profiling to identify rod- and cone-specific regulatory elements active in the WT and Nrl−/− mutant retina, respectively. We identified thousands of Nrl-dependent ncRNAs, an order of magnitude more than Nrl-dependent ncRNAs identified in previous analyses (Zelinger et al., 2017). Differential ncRNA expression strongly correlated with local coding gene expression, consistent with the observation that altered ncRNA abundance identifies active CREs (Yang et al., 2017). Comparing Nrl-dependent ncRNAs with retinal genomic signatures nominated these loci as rod- and cone-specific ncRNA-defined CREs. We examined the effect of specific TF localization patterns on genome-wide ncRNA abundance, and found that ncRNA abundance provides quantitative support for heterotypic interactions between NRL and CRX as drivers of cell type-specific CRE activity. Direct assessment of Nrl-repressed ncRNA-defined candidate CREs showed that they are active regulatory elements in developing cones. This work provides a resource for the identification of retinal GRNs and highlights ncRNA profiling as a robust method for the detection of cell type-specific CREs.
RESULTS
Nrl-dependent coding and non-coding transcripts nominate photoreceptor CREs
We interrogated Nrl-dependent transcription in the mouse retina, using polyA+ RNA as an approximation of coding RNA, and polyA− RNA as an approximation of non-coding RNA. Nrl-dependent polyA+ RNA was identified (Fig. 1A) by sequencing cDNA libraries made from polyA+-selected RNA from retinas of litter-matched WT and Nrl−/− adult mice at postnatal day (P) 21 (n=5 and n=6, respectively). By P21, mouse photoreceptor differentiation is largely complete. Differential expression analysis revealed 4315 misregulated genes (Fig. 1A). Nrl expression was absent from Nrl−/− samples, along with numerous known Nrl targets and rod-specific genes, including Rho, Gnat1 and Nr2e3. Conversely, genes that are not normally expressed in rods, including many involved in cone phototransduction, were upregulated in Nrl−/− samples, such as Gnat2, Gngt2 and Opn1sw (Hsiau et al., 2007). These expression changes are consistent with previously identified Nrl-dependent gene expression and rod- versus cone-specific gene expression patterns in the adult retina (Brooks et al., 2011; Corbo et al., 2007; Hsiau et al., 2007; Hughes et al., 2017; Kim et al., 2016a; Mo et al., 2016).
Transcriptional profiling identifies Nrl-dependent coding and non-coding RNAs. (A) Volcano-plot [log2 (fold change) versus −log10 (PAdj)] of Nrl+/+ versus Nrl−/− coding transcripts. Significantly misregulated genes (PAdj<0.05 |log2 fold change|>1) in blue, and non-significant in gray. Selected significant coding genes known to play a role in photoreceptor differentiation are labeled (black). (B) Volcano-plot [log2 (fold-change) versus −log10 (PAdj)] of Nrl+/+ versus Nrl−/− ncRNAs]. Significantly misregulated ncRNAs (PAdj<0.05 |log2 fold change|>1) are depicted in blue and non-significant in gray. Selected transcripts are labeled by the nearest Nrl-dependent coding genes (black). (C) Pearson correlation of polyA-depleted alignment files. Highly concordant samples (>0.7) in darker purple, and weakly correlated samples (0.6>) in lighter purple. (D) Scatterplot for the Nrl-dependent ncRNAs. Differential expression fold change (log2) of nearest Nrl-dependent gene (y-axis) versus differential expression fold change (log2) of ncRNAs (x-axis). Correlation of ncRNAs with genes within TAD is shown in gray, and of genes with closest ncRNAs upstream in purple. (E) GO analysis of the nearest Nrl-dependent genes to defined Nrl-dependent ncRNAs. Odds ratio is shown by the blue scale, and −log(PAdj) on the x-axis. (F) GIGGLE (genome inquiry) plot of Nrl-dependent ncRNAs across several retinal epigenomic assays. GIGGLE score was defined and the Fisher odds multiplied by the −log(P-value).
Transcriptional profiling identifies Nrl-dependent coding and non-coding RNAs. (A) Volcano-plot [log2 (fold change) versus −log10 (PAdj)] of Nrl+/+ versus Nrl−/− coding transcripts. Significantly misregulated genes (PAdj<0.05 |log2 fold change|>1) in blue, and non-significant in gray. Selected significant coding genes known to play a role in photoreceptor differentiation are labeled (black). (B) Volcano-plot [log2 (fold-change) versus −log10 (PAdj)] of Nrl+/+ versus Nrl−/− ncRNAs]. Significantly misregulated ncRNAs (PAdj<0.05 |log2 fold change|>1) are depicted in blue and non-significant in gray. Selected transcripts are labeled by the nearest Nrl-dependent coding genes (black). (C) Pearson correlation of polyA-depleted alignment files. Highly concordant samples (>0.7) in darker purple, and weakly correlated samples (0.6>) in lighter purple. (D) Scatterplot for the Nrl-dependent ncRNAs. Differential expression fold change (log2) of nearest Nrl-dependent gene (y-axis) versus differential expression fold change (log2) of ncRNAs (x-axis). Correlation of ncRNAs with genes within TAD is shown in gray, and of genes with closest ncRNAs upstream in purple. (E) GO analysis of the nearest Nrl-dependent genes to defined Nrl-dependent ncRNAs. Odds ratio is shown by the blue scale, and −log(PAdj) on the x-axis. (F) GIGGLE (genome inquiry) plot of Nrl-dependent ncRNAs across several retinal epigenomic assays. GIGGLE score was defined and the Fisher odds multiplied by the −log(P-value).
We also performed transcriptional profiling to identify Nrl-dependent polyA− transcripts, to identify primarily ncRNAs, by deep sequencing polyA− RNA from the same control and Nrl−/− retinal samples (Fig. 1B). This approach identified approximately 30,000 retinal non-coding transcripts by de novo transcript assembly (Fig. S1). Of these transcripts, 5900 were Nrl-dependent intergenic ncRNAs (FDR<0.05, FC>2). 2553 ncRNAs were significantly downregulated in the Nrl−/− retina, due to the loss of direct activation by NRL or loss of activation indirectly dependent upon NRL. We term these ‘Nrl-activated ncRNAs’. 3489 ncRNAs were significantly upregulated in Nrl−/− retina, due to the loss of direct repression by NRL or by loss of Nrl-dependent repressors. We term these ‘Nrl-repressed ncRNAs’. The remainder of ncRNAs were not significantly misregulated (‘shared’) (Fig. 1B). We found a strong linear relationship between samples by genotype, confirming reproducibility (Fig. 1C).
We and others have hypothesized that quantitative changes in CRE transcription mirror quantitative changes in CRE activity (Li et al., 2016; Mikhaylichenko et al., 2018; Yang et al., 2017). Accordingly, we examined the quantitative correlation between the change in Nrl-dependent ncRNA transcription and the change in the most proximal Nrl-dependent gene. The average change in expression of Nrl-dependent ncRNAs and the nearest mRNAs within a shared topologically associating domain (TAD) was strongly correlated (Fig. 1D, Cor=0.76, P=2.2e−16) (Shen et al., 2012). Gene ontology (GO) analysis of the nearest genes showed enrichment for ontology terms associated with phototransduction, sensory perception, and response to light stimulus (Fig. 1E).
We examined the correlation between the epigenomic landscape of mouse photoreceptors and retinal ncRNAs, applying GIGGLE to calculate and rank the significance of the overlap between genomic locations with Nrl-dependent ncRNAs and features of retinal enhancers including accessible chromatin, decreased methylation, and H3K27Ac (Hughes et al., 2017; Layer et al., 2018; Mo et al., 2016; Mouse ENCODE Concortium et al., 2012). Nrl-activated ncRNAs originated within regions enriched for WT and rod enhancer features whereas Nrl-repressed ncRNAs originated within regions enriched for cone and Nrl−/− enhancer features (Fig. 1F). Specifically, Nrl-activated ncRNAs originated from 1340 genomic regions that overlapped with whole retina accessible chromatin as defined by ATAC-seq (P=9.642e−201) (Mo et al., 2016), 1088 locations of sorted rod ATAC-seq (P=1.10 e−200) (Hughes et al., 2017) and 820 locations of low methylation in rods (P=5.32e−108) (Mo et al., 2016). Nrl-repressed ncRNAs originated from 1580 genomic regions that overlapped with whole retina ATAC-seq (P=1.01e−200) (Mo et al., 2016), 1517 locations of sorted cone ATAC-seq (P=3.51e−201) (Hughes et al., 2017) and 1300 locations of low methylation in cones (P=1.25 e−200) (Mo et al., 2016). Shared ncRNAs originated within regions that displayed a lower enrichment score (see Materials and Methods) for retinal enhancer features relative to Nrl-activated or Nrl-repressed regions. Together, these observations indicate that Nrl-dependent ncRNA profiling correlates with the retinal epigenomic landscape and local Nrl-dependent gene expression. The association of chromatin accessibility with ncRNA abundance defined over 1000 candidate rod- and cone-specific CREs. These findings further suggest that Nrl-activated and Nrl-repressed ncRNA-defined regions specifically correlate with rod- and cone-specific features of retinal enhancers, respectively.
Nrl-dependent transcriptional profiling identifies distinct gene regulatory signatures specific to rod and cone photoreceptors
Changes in photoreceptor chromatin accessibility and changes in corresponding rod and cone gene expression are correlated (Hao et al., 2012; Mo et al., 2016; White et al., 2016). We hypothesized that the Nrl-dependent change in ncRNA abundance may correlate with cell type-specific changes in chromatin accessibility. We therefore compared the Nrl-dependent changes in ncRNA transcription with the changes in chromatin accessibility in WT versus Nrl−/− retina and in chromatin accessibility in sorted rods versus cones (Fig. 2A,B) (Hughes et al., 2017; Mo et al., 2016). We defined differential open chromatin regions as those with a fold change >1 and FDR <0.05, identifying 1040 rod-specific regions at Nrl-activated ncRNAs and 990 cone-specific regions at Nrl-repressed ncRNAs. We observed a strong positive correlation between ncRNA fold change and open chromatin fold change for whole retina, and for rods versus cones (Fig. 2A, Cor=0.82, 0.77 resp.). This result indicated that the direction and quantitative degree of Nrl-dependence of ncRNA expression correlated with the relative cell-type specificity of ATAC-seq signal, providing further evidence for the hypothesis that ncRNA expression is a useful predictor of CRE activity in specific cell types. We next assessed Nrl-activated and Nrl-repressed ncRNA association with ATAC-seq read density in whole retina and sorted rods and cones (Hughes et al., 2017; Mo et al., 2016). We found that Nrl-activated ncRNAs emanated from regions associated with higher ATAC-seq signal in WT retina and in rods than in cones or Nrl−/− retina, and the inverse was true for Nrl-repressed ncRNAs. (Fig. 2B). These observations are consistent with the hypothesis that Nrl-activated and Nrl-repressed ncRNA-defined regions represent rod- and cone-specific CREs, separately highlighting thousands of genomic locations as strong candidates for Nrl-dependent cell type-specific CREs.
Nrl-activated and -repressed ncRNAs identify cell type-specific regulatory programs. (A) Scatterplot depicting correlation of ncRNA fold change with ATAC-seq fold change from whole-retina samples [WT (dark red) versus Nrl−/− (dark blue)] and rod (pink) versus cone (light blue) whole-retina ATAC-seq (Mo et al., 2016). Non-significantly changed transcripts are shown in gray. (B) Boxplots depicting the enrichment of mean reads per million (RPM) of ATAC-seq from WT P21 mouse retinas and in Nrl−/− ATAC-seq P21 mouse retinas (top) (Mo et al., 2016) and of rod-specific ATAC-seq and cone-specific ATAC-seq (bottom) (Hughes et al., 2017) in Nrl-activated (red) and Nrl-repressed (green) compared with shared (gray). P-values calculated by Tukey test. (C) Fold change line plot of ncRNA sites shared with whole-retina differential ATAC-seq (Mo et al., 2016). Footprints top 10 families by fold change are labeled. (D) Fold change line plot of ncRNA sites shared with rod versus cone differential ATAC-seq (Hughes et al., 2017). Footprints top 10 families by fold change are labeled.
Nrl-activated and -repressed ncRNAs identify cell type-specific regulatory programs. (A) Scatterplot depicting correlation of ncRNA fold change with ATAC-seq fold change from whole-retina samples [WT (dark red) versus Nrl−/− (dark blue)] and rod (pink) versus cone (light blue) whole-retina ATAC-seq (Mo et al., 2016). Non-significantly changed transcripts are shown in gray. (B) Boxplots depicting the enrichment of mean reads per million (RPM) of ATAC-seq from WT P21 mouse retinas and in Nrl−/− ATAC-seq P21 mouse retinas (top) (Mo et al., 2016) and of rod-specific ATAC-seq and cone-specific ATAC-seq (bottom) (Hughes et al., 2017) in Nrl-activated (red) and Nrl-repressed (green) compared with shared (gray). P-values calculated by Tukey test. (C) Fold change line plot of ncRNA sites shared with whole-retina differential ATAC-seq (Mo et al., 2016). Footprints top 10 families by fold change are labeled. (D) Fold change line plot of ncRNA sites shared with rod versus cone differential ATAC-seq (Hughes et al., 2017). Footprints top 10 families by fold change are labeled.
We next searched for TF motif sequences that were enriched in ncRNA-associated ATAC-seq peaks. In Nrl-activated regions, we observed nuclear receptor motifs potentially bound by NR2E3, ESRRB and RORB, and previously identified as enriched in rod versus cone open chromatin regions (White et al., 2016) (Fig. 2C,D). In Nrl-repressed regions, we observed enrichment of the Neurod1 (a basic helix-loop-helix TF) and RXRG (a nuclear receptor) motifs, previously identified as enriched in cone- versus rod-specific open chromatin regions (White et al., 2016) (Fig. 2C,D). This recapitulation of previously identified motifs indicates that Nrl-dependent ncRNAs accurately identify genomic locations likely to be active cell type-specific CREs. We also observed motifs for the CUT, EN and POU domains in the Nrl-repressed regions, suggesting that these families of retinal TFs may play a specific role in cone gene regulation (Brunet et al., 2005; Emerson et al., 2013; Li et al., 2002; Sapkota et al., 2014; Trieu et al., 2003). Genes in these families also play a role in retinal development, suggesting that ncRNA profiling of mature tissue may also detect enhancers active in the developing retina. Finally, we interrogated putative promoter regions of Nrl-dependent ncRNAs that did not overlap open chromatin or TF-bound peaks. At these elements, we found binding sequences for both rod and cone elements, such as bZip, CUT, Homeobox (K50 HD, Q50 HD), Hox (HD) and ESSRB (NR). This observation suggests that ncRNA-profiling identifies candidate regulatory regions undetected by other assays (Fig. S2). We also found that differential motif enrichment analysis between open chromatin with differentially expressed ncRNAs and open chromatin alone identified novel enriched motifs in the ncRNA-defined locations, such as Six6 (TATCA, P=1−e37), a TF recently implicated in regulation of cone gene expression in zebrafish (Mouse ENCODE Concortium et al., 2012; Ogawa et al., 2019) (Table S1). We conclude that Nrl-dependent ncRNA profiling identifies distinct gene regulatory signatures, including cell type-specific TF motifs, at CREs that may participate in cell type-specific gene regulation.
Nrl-dependent ncRNA expression identifies functional output of combinatorial TF binding
CRX and NRL cooperatively control photoreceptor gene expression (Corbo et al., 2007; Hao et al., 2012; Hsiau et al., 2007; Hughes et al., 2017, 2018; Swaroop et al., 2010). Because the combinatorial action of CRX and NRL at least partially determines the specificity of rod versus cone CREs, the binding of these TFs also may be reflected in cell type-specific ncRNA expression. We utilized data from previous CRX ChIP-seq experiments (Corbo et al., 2010) in WT and Nrl−/− retina, and tested whether CRX binding events were associated with global changes in ncRNA expression. First, we evaluated the overlap of CRX binding with ncRNA regions in the WT and Nrl−/− retina (Fig. 3A, top). We next compared ncRNA fold change in each category of CRX binding (Fig. 3A, bottom), including absent CRX binding, WT-specific CRX binding and Nrl−/−-specific CRX binding (CRXNrl−/−) (Fig. 3B). On average, ncRNAs from regions with WT-specific CRX binding were more highly expressed in the WT retina, whereas ncRNAs from regions with Nrl−/−-specific CRX binding were more highly expressed in the Nrl−/− retina (Fig. 3B, red compared with blue). This relationship became even more apparent when we specifically examined Nrl-dependent ncRNAs; Nrl-dependent ncRNAs within regions bound by CRX in WT were more highly expressed in WT (Cohen's d=0.51, P<2.2e−16) and Nrl-dependent ncRNAs within regions bound by CRX in Nrl−/− were more highly expressed in Nrl−/− retina (Cohen's d=0.78, P<2.2e−16). ncRNAs in regions bound by CRX in both WT and Nrl−/− retina displayed negligible differences in expression between the mutant and WT (Cohen's d=0.13, P=0.005) (Fig. 3B).
Unique CRX binding events correlate with context-specific ncRNA expression. (A) Top: Venn diagram of the overlap of CRX WT ChIP-seq (red) and CRXNrl−/− ChIP-seq (blue) peaks. Bottom: Scatterplot in hexagonal binning depicting ncRNA fold change (x-axis). CHIP-seq binding events that intersect at the 5′ end of the RNA are highlighted: red for WT CRX ChIP-seq peak, blue for CRXNrl−/− ChIP-seq peak, purple for shared CRX peaks, and gray for no CRX ChIP-seq peaks (Corbo et al., 2010). (B) Boxplots depicting mean expression differences between global Nrl-dependent ncRNAs categorized by CRX ChIP-seq binding. P-values calculated by Student's t-test and effect size by Cohen's d test. (C) Box plots depicting mean expression differences between global Nrl-dependent ncRNAs categorized by CRX ChIP-seq binding. P-values calculated by Student's t-test and effect size by Cohen's d test. Whiskers represent maximum and minimum values and the width of the notches in the coloured boxes indicate the confidence interval for the mean (horizontal line) ±1.57 interquartile range divided by the square root of the number of observations.
Unique CRX binding events correlate with context-specific ncRNA expression. (A) Top: Venn diagram of the overlap of CRX WT ChIP-seq (red) and CRXNrl−/− ChIP-seq (blue) peaks. Bottom: Scatterplot in hexagonal binning depicting ncRNA fold change (x-axis). CHIP-seq binding events that intersect at the 5′ end of the RNA are highlighted: red for WT CRX ChIP-seq peak, blue for CRXNrl−/− ChIP-seq peak, purple for shared CRX peaks, and gray for no CRX ChIP-seq peaks (Corbo et al., 2010). (B) Boxplots depicting mean expression differences between global Nrl-dependent ncRNAs categorized by CRX ChIP-seq binding. P-values calculated by Student's t-test and effect size by Cohen's d test. (C) Box plots depicting mean expression differences between global Nrl-dependent ncRNAs categorized by CRX ChIP-seq binding. P-values calculated by Student's t-test and effect size by Cohen's d test. Whiskers represent maximum and minimum values and the width of the notches in the coloured boxes indicate the confidence interval for the mean (horizontal line) ±1.57 interquartile range divided by the square root of the number of observations.
We next tested whether coordinated binding by CRX and NRL correlated with a greater effect on ncRNA expression in either the WT or Nrl−/− state than CRX binding alone. We observed that ncRNAs from regions with NRL and CRX binding in the WT retina had a greater decrease in ncRNA expression in the Nrl−/− retina compared with those from regions with either NRL or CRX binding alone (Cohen's d: 0.99; P<2.2e−16) (Fig. 3C, column 4). ncRNA expression from regions with CRX binding in the absence of NRL demonstrated the inverse relationship, with ncRNA expression displaying a large increase in the Nrl−/− retina near CRX-bound regions (Cohen's d: 0.83; P<2.2e−16) (Fig. 3C, column 6). These relationships suggest that, at the genome scale, co-binding of CRX and NRL drive Nrl-activated ncRNA-defined CRE expression in a rod-predominant context, and that in the absence of NRL, unique CRX binding at CREs drive NRL-dependent ncRNA activity in a cone-predominant (Nrl−/−) context (Hughes et al., 2017; Mo et al., 2016). Together, these findings support the model that combinatorial NRL and CRX action defines cell type-specific expression patterns by providing quantitative support of the output of TF binding events on cell type-specific CREs.
Nrl-dependent ncRNAs define functional photoreceptor regulatory elements
Our data suggest that Nrl-dependent ncRNAs pinpoint thousands of functional CREs with cell type-specific activity. Specifically, Nrl-activated ncRNA-defined regions intersected with rod-specific ATAC nominated 1040 candidate rod enhancers and Nrl-repressed ncRNA-defined regions intersected with cone-specific ATAC nominated 990 candidate cone enhancers (Fig. 2B, Table S2). We tested the regulatory capacity of Nrl-dependent ncRNA-defined regions in developing mouse retinas. As Otx2 is required for the genesis of both rods and cones, understanding its regulation is of interest (Emerson et al., 2013; Koike et al., 2007; Nishida et al., 2003). We identified 24 DNaseI hypersensitivity sites (HSs) at the Otx2 locus using P0 mouse retina data from ENCODE (Fig. 4A) (Mouse ENCODE Concortium et al., 2012). Of these sites, four overlapped Nrl-repressed ncRNAs and cone-specific ATAC-seq peaks (FC>2; Figs 2A and 4B). We examined the activity of all 24 DNaseI HSs to determine if the CRE activity of these sites would correlate with areas defined by Nrl-repressed ncRNAs. We cloned sequences corresponding to each DNaseI HS into an eGFP-IRES-AP reporter (Billings et al., 2010; Emerson and Cepko, 2011) and co-electroporated the constructs with a control plasmid containing a ubiquitously expressed CAG-Cherry reporter. Electroporations were performed into explants from embryonic day (E) 14.5, a time when cones are being generated (Young, 1985). Alkaline phosphatase (AP) activity was assayed on whole mounts, as it provides a quick and sensitive assay of CRE activity. Nine out of the 24 DNaseI HSs showed AP activity, with five of these sites displaying strong AP signal (Fig. 4C). Three of the five sites that drove strong AP signal overlapped with Nrl-repressed ncRNAs, suggesting that ncRNA-defined regions can identify active CREs more efficiently than regions defined by open chromatin alone (Wilken et al., 2015; Yang et al., 2017). Cone development is also regulated by Oc1 (Emerson et al., 2013; Sapkota et al., 2014), leading us to test several DNaseI HSs near Oc1 (Fig. S3E). Similarly to Otx2-associated CREs, we found that the Oc1 region with strong CRE activity was defined by ncRNA expression (Fig. 4D, OC1 C compared with OC1 A and B).
Nrl-repressed ncRNA-defined regions display cis-regulatory activity. (A) Annotation track showing ENCODE DNaseI HSs for mouse retinal cells at various stages of development for the Otx2 locus. Twenty-four peaks were selected for further testing (O1-O23) within a ∼300 kb region centered around Otx2. (B) Nrl-dependent ncRNAs at the Otx2 locus and the nearest ATAC-seq regions from cone P21 cells. (C) DNA sequences corresponding to the DNAse I HS peaks annotated in A were cloned into the reporter plasmid, Stagia3, encoding eGFP and alkaline phosphatase (AP) and tested for AP activity (PLAP; dark purple) in E14.5 mouse explants. Constructs were electroporated, with a broadly expressed co-electroporation control plasmid, CAG-Cherry. Retinas were then cultured as explants on filters for 2 days. Positive regions are green in A, and negative regions are red. (D) As in C, but using Stagia3 constructs containing DNA sequences defined by retina DNAse I HS from ENCODE at the Oc1 locus. (E) As in C, but using Stagia3 constructs containing DNA sequences defined by cone ATAC-seq peaks associated with ncRNAs found at the Otx2 locus (ncRNA regions shown in B) and other retinal loci, as indicated in each panel.
Nrl-repressed ncRNA-defined regions display cis-regulatory activity. (A) Annotation track showing ENCODE DNaseI HSs for mouse retinal cells at various stages of development for the Otx2 locus. Twenty-four peaks were selected for further testing (O1-O23) within a ∼300 kb region centered around Otx2. (B) Nrl-dependent ncRNAs at the Otx2 locus and the nearest ATAC-seq regions from cone P21 cells. (C) DNA sequences corresponding to the DNAse I HS peaks annotated in A were cloned into the reporter plasmid, Stagia3, encoding eGFP and alkaline phosphatase (AP) and tested for AP activity (PLAP; dark purple) in E14.5 mouse explants. Constructs were electroporated, with a broadly expressed co-electroporation control plasmid, CAG-Cherry. Retinas were then cultured as explants on filters for 2 days. Positive regions are green in A, and negative regions are red. (D) As in C, but using Stagia3 constructs containing DNA sequences defined by retina DNAse I HS from ENCODE at the Oc1 locus. (E) As in C, but using Stagia3 constructs containing DNA sequences defined by cone ATAC-seq peaks associated with ncRNAs found at the Otx2 locus (ncRNA regions shown in B) and other retinal loci, as indicated in each panel.
We predicted that Nrl-repressed ncRNA-defined regions would represent CREs active in cones, but not rods. Such regions were found near other genes important for cone development and function, including Opn1sw, Rxrg, Gngt2, Gnat2 and Sall3 (Table S2). ncRNA-defined regions from near these loci were thus interrogated for cone-specific CRE activity using electroporation of embryonic explants. To determine if these CREs were active in cones, we co-electroporated the ThrbCRM1-tdTomato plasmid, a reporter with activity predominantly in cones, with some activity also in horizontal cells (HCs) and the cone/HC progenitor cells. Importantly, this reporter has no activity in rods (Emerson et al., 2013). Ten of the 12 ncRNA-defined regions were able to drive reporter expression, including those with CRE activity from the Otx2 locus (Fig. 4E).
To determine if the Nrl-repressed ncRNA regions with CRE activity were active in cones, but not rods, we examined tissue sections for expression of eGFP driven by the eGFP-IRES-AP reporters and tdTomato expression driven by the ThrbCRM1 reporter. The majority of tdTomato-positive cells were located in the apical region of the electroporated E15.5 retinas (Fig. 5A), the location of developing cones (Fei, 2003). tdTomato-positive cells also expressed the Rxrg protein, a known cone marker (Fig. 5B) (Roberts et al., 2005). Cells that expressed GFP from seven of the Nrl-repressed ncRNA constructs were located in the apical region, and they also expressed tdTomato (Fig. 5C). The only exception was the Socs3 CRE (Fig. 5C), which marked cells with a morphology and position matching those of RPCs. We further tested the expression from the Otx2 ncRNA-defined enhancer regions 1 and 2 relative to that of the Otx2 protein (Fig. 5D). Both Otx2-ncRNA-defined CREs showed strong colocalization with the Otx2 protein. Together, these findings indicate that the genomic regions nominated by Nrl-repressed ncRNAs have CRE activity in developing cones. These findings validate the hypothesis that Nrl-dependent ncRNA profiling identifies functional cell type-specific regulatory elements.
Nrl-repressed ncRNAs identify functional cone CREs. (A-D) Images of sections of murine retinas from explants electroporated at E15.5 (A) or E14.5 (B-D) and cultured for 2 days before harvest. The ThrbCRM1-dtTomato enhancer was electroporated into mouse retina with a ubiquitous CAG-GFP co-electroporation control. (A) ThrbCRM1-positive cells were located in the apical region of the developing outer nuclear layer (ONL), where photoreceptors are found (left). The co-electroporation GFP reporter was expressed in cells that were proliferating at the time of the electroporation. (B) The ThrbCRM1 enhancer (left) was expressed in cells positive for Rxrg protein (center), a marker for cones. (C) Positive regions from the AP detection screen (Fig. 4) were co-electroporated with the ThrbCRM1-dtTomato enhancer (left). The Stagia3 reporter plasmid used for AP also contains an eGFP readout (center column). All but one of the cells positive for the putative enhancers were located in the developing ONL, and showed a strong overlap with the tdTomato from the ThrbCRM1 reporter. (D) Two Nrl-dependent ncRNAs from the Otx2 locus were tested (left) for their colocalization with the Otx2 protein (center). Expression from both CREs showed a strong overlap with the protein (composite image, right).
Nrl-repressed ncRNAs identify functional cone CREs. (A-D) Images of sections of murine retinas from explants electroporated at E15.5 (A) or E14.5 (B-D) and cultured for 2 days before harvest. The ThrbCRM1-dtTomato enhancer was electroporated into mouse retina with a ubiquitous CAG-GFP co-electroporation control. (A) ThrbCRM1-positive cells were located in the apical region of the developing outer nuclear layer (ONL), where photoreceptors are found (left). The co-electroporation GFP reporter was expressed in cells that were proliferating at the time of the electroporation. (B) The ThrbCRM1 enhancer (left) was expressed in cells positive for Rxrg protein (center), a marker for cones. (C) Positive regions from the AP detection screen (Fig. 4) were co-electroporated with the ThrbCRM1-dtTomato enhancer (left). The Stagia3 reporter plasmid used for AP also contains an eGFP readout (center column). All but one of the cells positive for the putative enhancers were located in the developing ONL, and showed a strong overlap with the tdTomato from the ThrbCRM1 reporter. (D) Two Nrl-dependent ncRNAs from the Otx2 locus were tested (left) for their colocalization with the Otx2 protein (center). Expression from both CREs showed a strong overlap with the protein (composite image, right).
DISCUSSION
Enhancer transcription defines context-specific CREs
Non-coding transcriptional profiling from specific biological contexts was evaluated as an approach for nominating candidate regulatory elements in the mouse retina. A strong positive quantitative relationship between TF-dependent ncRNA expression and transcriptional activation in enhancer assays, as well as with local mRNA expression, and retinal epigenomic markers showed that differential ncRNA-seq provides a robust genome-wide assay to identify functional CREs. This approach specifically nominates candidate regulatory elements that are most likely to be functional, as shown by electroporation-based assays for CRE activity. Furthermore, adding quantitative ncRNA data to the existing genomic metrics at retinal CREs informed TF-dependent models of retinal gene expression. When incorporated with existing genomic information, ncRNA expression provided contextual and quantitative evidence for functional output of CREs and the GRNs they comprise, as demonstrated by our analysis of the transcriptional effect of combinatorial TF binding (Fig. 3). TF-dependent ncRNA profiling provides an alternative approach to large-scale reporter assays to identify putative regulatory regions, which can be difficult to apply in vivo, or when the cells of interest are not readily accessible, as is the case for cones. The example herein using the retina, and the previous study using cardiomyocytes (Yang et al., 2017), show that TF-dependent ncRNA profiling is successful as a general approach for the definition of regulatory elements.
Regulatory regions active in photoreceptors and their progenitor cells
An understanding of the GRNs that govern photoreceptor production is particularly important for therapeutic applications (Oswald and Baranov, 2018). In particular, cone photoreceptors, the cell type that mediates color and high-acuity vision, degenerate in a variety of blinding disorders, such as retinitis pigmentosa or age-related macular degeneration (Campochiaro and Mir, 2018; Shelley et al., 2009). Therapeutic approaches focus on cone replacement using cells derived from stem cells, or utilize gene therapy to target cones specifically (Oswald and Baranov, 2018). Both approaches would benefit from a greater understanding of cone versus rod GRNs (Aguirre, 2017; Jüttner et al., 2019; Sinha et al., 2016). The conversion of rods to cone-like cells in the Nrl−/− mouse has provided access to an enriched source of cone-like cells, allowing for the elucidation of molecular aspects of rods versus cones, including the identification of rod- and cone-enriched coding and non-coding transcripts and open chromatin regions (Hughes et al., 2017; Li et al., 2016; Mo et al., 2016; Swaroop et al., 2010; Zelinger et al., 2017). These datasets have provided an excellent background for our assessment of the rod- and cone-specific transcriptome and quantitative analysis of CRE activity changes in a rod-predominant state (WT retina) and a cone-predominant state (Nrl−/− retina).
The abundance of ncRNAs at regulatory regions is associated with the CRE enhancer activity (Henriques et al., 2018; Mikhaylichenko et al., 2018), suggesting that the CREs identified in this study at P21, after retinal development is complete, should be active in mature cones. Interestingly, functional assessment of enhancer activity of a subset of ncRNA-defined CREs, potentially regulating relevant cone genes, showed cell type-specific activity in developing cones, and often identified regions that were highly active by qualitative assessment (Fig. 4). These observations suggest that the presence of ncRNAs is a signature for strong enhancers and/or enhancers that are active throughout development to adult stages. Accordingly, some of the tested ncRNA-defined regions overlap with previously published ncRNAs from earlier developmental time points (P2/4/6/10) in the Nrl−/− retina, suggesting sustained activity of those regions (Zelinger et al., 2017) (Fig. S3). Furthermore, an ongoing study on the regulation of Otx2 in the postnatal retina has shown that some of our identified ncRNA-defined CREs have activity in late RPCs and other late-generated cell types, and are in the vicinity of Otx2 regulatory regions previously identified in the postnatal retina (Chan et al., 2019; Wilken et al., 2015).
A small number of tested elements had activity in other cell types or had no specific activity in retinal explants. This may be due to the fact that the CREs identified by ncRNA profiling and ATAC-seq peaks may harbor TF binding sites active in other cell types, or may rely on higher-order chromatin structure for proper regulation. Alternatively, elements that did not show activity in retinal explants may be solely active in mature cones, and not in the cone progenitor cells or immature cones that were assayed here. Interestingly, Oc1, which is upstream of Nrl (Emerson et al., 2013), is not expressed in mature photoreceptors (Wu et al., 2012); however, we identified a candidate Nrl-dependent ncRNA-defined CRE at this locus that displayed strong activity in the early retina (Fig. 4D). It is possible that ncRNA transcription may reflect an epigenetic memory of embryonic activity, similar to previous findings that vestigial enhancers may remain undermethylated in adult tissues (Hon et al., 2013; Jadhav et al., 2019; Mo et al., 2016).
Although many of the genes associated with Nrl-repressed ncRNAs were related to cone development or function (Otx2, Rxrg, Gngt2, Gnat2, Opn1sw, Sall3) (Chang et al., 2006; de Melo et al., 2011; Nishida et al., 2003; Roberts et al., 2005; Rodgers et al., 2016), we also identified CREs for genes that are generally involved in retinal development but that, until recently, had no known cone-specific role, except for their enrichment in adult cones and Nrl−/− retinas (Hughes et al., 2017; Siegert et al., 2012; Yoshida et al., 2004). For example, the ncRNA-defined CRE associated with Six6, an eye-field TF (Li et al., 2002), and the element near En2, a gene important for retinal ganglion cell differentiation (Brunet et al., 2005), both displayed cone-specific reporter activity. Recently, Six6 was shown to be more highly expressed in mouse cones than in rods (Hughes et al., 2017), and has been found to regulate middle-wavelength opsins in zebrafish (Ogawa et al., 2019). Binding sites for Six6 were specifically identified in differential motif enrichment analysis in ncRNA-defined locations (Table S1). These results suggest that non-coding transcriptional profiling of the WT and Nrl−/− retina uncovered not only rod- and cone-specific regulatory programs, but potential shared regulatory programs that warrant further investigation. Overall, the regulatory elements nominated by this analysis represent potent candidates that may be relevant for cone-replacement strategies and gene therapy, may shed light on developmental and mature CRE utilization, and may add novel elements for our understanding of developmental cone GRNs.
Heterotypic NRL/CRX localization defines ncRNA abundance at rod versus cone CREs
In retinal development, CRX has been shown to interact with NRL and recruit additional TFs, including NR2E3, to regulate photoreceptor genes (Andzelm et al., 2015; Peng and Chen, 2005; Ruzycki et al., 2018). The combinatorial action of CRX and NRL has been hypothesized to target expression specifically to rods (Hughes et al., 2017; Ruzycki et al., 2018), suggesting a model of heterotypic TF action, where CRX alone generally targets photoreceptor enhancers and the addition of NRL provides rod specificity. Using ncRNA abundance and differential expression in the WT and Nrl−/− retina, we provide quantitative transcriptional evidence for the combinatorial action of CRX and NRL. The binding of CRX and NRL together correlated with rod-specific ncRNA transcription, and in the absence of NRL, CRX was preferentially located at CREs with strong cone-specific ncRNA transcription (Fig. 3). These rod- and cone-specific binding events are distinguished from shared CRX binding in both rods and cones, which did not display cell type-specific non-coding transcription. This heterotypic model, supported by relative ncRNA abundance as a measure of altered CRE activity, may drive rod versus cone CRE choice and thereby regulate gene expression that determines cell type during retinal development. Overall, this finding highlights the utility of differential ncRNA profiling for nominating potent CREs and deepening our understanding of context-specific GRNs.
MATERIALS AND METHODS
Animals
Nrl−/− mice were generated as previously described (Mears et al., 2001). Mouse husbandry and all procedures (including euthanasia by CO2 inhalation and cervical dislocation) were conducted in accordance with the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health, and were approved by the Washington University in St. Louis Institutional Animal Care and Use Committee. For ex vivo enhancer testing, WT embryos were obtained from timed pregnant CD1 mice (Charles River Laboratories). All animal studies were approved by the Institutional Animal Care and Use Committee at Harvard University.
Coding RNA-seq library preparation and data analysis
Libraries were prepared from RNA starting with 1 μg per sample and using the mRNA-seq Sample Prep Kit (Illumina) as per recommended instructions. After Ribozero purification and removing only ribosomal RNA, barcoded libraries were prepared according to Illumina's instructions (2013) accompanying the TruSeq RNA Sample prep kit v2 (Part# RS-122-2001). Libraries were quantified using the Agilent Bio-analyzer (model 2100) and pooled in equimolar amounts. The pooled libraries were sequenced with stranded 50-bp single-end reads on the HiSeq2500 in Rapid Run Mode following the manufacturer's protocols (2013).
RNA library preparation was performed as previously discussed (Yang et al., 2017). Briefly, 22 M to 30 M reads were mapped to mouse genome with STAR (v 2.5.3a). Reads mapped to the mitochondrial genome, and with phred score <30 were excluded. Counts were retrieved with HTseq (v.0.6.0) (Anders et al., 2015) in union mode. Lastly, counts were analyzed for differential expression with R (3.4) package DEseq2 (Love et al., 2014).
Non-coding RNA-seq library preparation
Total RNA was extracted by TRIzol Reagent (Invitrogen), followed by ribosomal and polyA depletion. After RiboZero purification and oligo-dT depletion, RNA Barcoded Libraries were prepared according to Illumina's instructions (2013) accompanying the TruSeQ RNA Sample prep kit v2 (Part# RS-122-2001). Libraries were quantified using the Agilent Bio-analyzer (model 2100) and pooled in equimolar amounts. The pooled libraries were sequenced with 50-bp stranded single-end reads on the HiSEQ4000 in Rapid Run Mode following the manufacturer's protocols (2013).
Non-coding RNA-seq data analysis
About 170-186 million high-quality reads (quality score >30) for each sample were obtained. Fastq files were aligned to UCSC genome build mm10 using STAR and between 168 million and 174 million reads were successfully mapped. Transcript assembly was performed by Stringtie (version 1.3.3) and merged with the Stringtie ‘merge’ function. Counts were retrieved from the merged transcriptome for differential expression testing, performed as above. False discovery rate (FDR) was calculated after removing the coding-gene transcripts, ncRNA that overlapped coding genes in the same strand, and RNAs shorter than 100 base pairs. Significance was considered to have been reached when FDR was <0.05 and fold change was >2. Deeptools multiBamCompare and plotHeatmap were used to make and plot sample correlations.
GIGGLE index and search
We downloaded retina methylation, ChIP-seq and ATAC-seq regions from GSE72550 (Mo et al., 2016) and DNA-seq GSE37074 (Mouse ENCODE Concortium et al., 2012). TF ChIP-seq [CRX: GSE20012 (Corbo et al., 2010); NRL: https://datashare.nei.nih.gov/nnrlMain.jsp (Hao et al., 2012); OTX2: GSE54084 (Samuel et al., 2014)] and ATAC-seq (GSE83312; Hughes et al., 2017) were processed as below. Regions were placed in a directory, sorted and gunzipped with ‘giggle sort_bed’, and then indexed with ‘giggle index’ which uses tabix (samtools 1.5) (Layer et al., 2018). This was deemed the ‘retinal index’. ncRNA regions were also GIGGLE processed (sort index), and used to compare against the retinal index with command ‘giggle search -I<index> -q <ncRNA.bed> -g 272552137 –s’. Results were ranked by score, which was calculated as odds ratio multiplied by −log10(P-value). The heatmap was generated with R pheatmap (v 1.012) CRAN package (Kolde, 2019).
Nearest gene assignment
Mouse embryonic stem cell TADs were downloaded from ENCODE [chromosome.sdsc.edu, GSE29184 (Shen et al., 2012)]. Empty regions were filled in with R and assigned as ‘inter-TAD’ as described by Szabo et al. (2018). The 5′ regions of all RNAs were intersected with TADs using Bedtools intersect.
GO enrichment analysis
Enrichment of GO Biological Process terms from closest genes that share a TAD with ncRNAs was performed with Bioconductor package ClusterProfiler version 2.4670 (Yu et al., 2012).
ATAC-seq and ChIP-seq data processing
Fastq files from previously generated ChIP-seq (GSE20012; Corbo et al., 2010) and ATAC-seq (GSE83312; Hughes et al., 2017) datasets were downloaded from Gene Expression Omnibus and processed identically as previously described (Hughes et al., 2017). Briefly, adapter sequences were clipped from reads using Cutadapt (Martin, 2011), then aligned to UCSC mouse genome mm10 with bowtie version 2.3.4 (Langmead and Salzberg, 2012) in end-to-end mode. Mismatched reads, PCR duplicates, ENCODE blacklisted regions, and reads with quality <30 were removed with Samtools version 1.5 (Li et al., 2009). For ATAC-seq, fragments with width >147 base pairs were removed to enrich for nucleosome-free reads using a custom script. Peaks for both assays were called with Macs 2.11 (Zhang et al., 2008).
Associating ncRNAs and regulatory elements
Open chromatin peaks and TF-binding peaks were intersected with ncRNAs using Bioconductor package GenomicRanges (Lawrence et al., 2013) allowing for a 500 bp gap upstream the 5′ end of the RNA and 50 bp into the RNA.
Metagene analysis
To compare the coverage of ATAC-seq regions in ncRNAs, we used Bioconductor Package Metagene (2.12.1). Coverage was normalized to reads per million. We binned the position of each region to 100 bp. We modified the current Metagene source code to output boxplots as opposed to ribbons and we then tested the difference of means with R base function Tukey HSD.
Identification of differentially accessible peaks
ATAC-seq peak regions were combined and sorted with bash commands (cat sort). Counts were retrieved from each alignment file using Bedtools multicov (2.26.0) and tested for differential expression with DESeq2. Peaks were considered differentially accessible when the log2 fold change was >1 and P-adjusted value <0.05 and these regions were considered cone specific and rod specific. To assess the relationship between differentially expressed ncRNAs and differentially accessible open chromatin, we performed a global overlap of all ncRNA regions and combined ATAC-seq regions. Then differential regions from both sets were highlighted.
TF footprinting and motif enrichment analysis
We used the RGT suite (Li et al., 2019) (v 0.12.1) for footprint analysis, and sites were matched with JASPAR CORE 2016 database to assign a motif. From the differential ATAC versus differential ncRNA overlap in rods and cones (Fig. 2B), we asked what TF footprint was associated with each ATAC peak from whole retina and in sorted rods versus cones. We then plotted the ncRNA fold change and labeled the top 10 families.
Known and de novo motif scanning was performed on the 5′ end of the ncRNAs extended by 500 bp and 50 bp into the ncRNA using HOMER (4.3). We used regions that failed to overlap with any ATAC-seq or ChIP-seq peak. Differential motif enrichment was performed with HOMER between the ncRNAs and the ATAC-seq and ChIP-seq peaks combined as background (parameter -bg).
CRX ChIP-seq comparison and heterotypic binding
WT CRX, Nrl−/− CRX, and NRL ChIP-seq regions refs were combined into a master list and each region was assigned a binding category (NRL alone, NRL and CRX, etc.). All regions were overlapped with the 5′ region of ncRNA background, the fold change distribution was plotted with ggplot2, and color-coded by binding category. For each category, the average expression by genotype of the ncRNA background was compared with R by Student's t-test and Cohen's d.
Comparison with previous ncRNA analysis
De novo transcriptome and Fastq files were downloaded from GSE74660 (Kim et al., 2016b). Fastq files were aligned with STAR, and alignment files were compared with our ncRNA background using bedtools multicov. Counts were normalized with R and plotted with pheatmap. GSE74660 transcriptome was filtered by fpkm>1 and 2>exons, and overlapped with ncRNA background using R genomicRanges.
Retina electroporation and AP staining
Ex vivo retina electroporation was carried out as described previously (Cherry et al., 2011; Matsuda and Cepko, 2007), with at least three biological replicates for AP staining, or at least duplicates for immunohistochemical analyses. The chamber used for electroporation was modified as previously described (Montana et al., 2011). Stages of embryos used for the experiments are described in the main text or in the figure legends. Electroporation settings were 5×30 V pulses, 50 ms each and 950 ms apart. DNA concentration was 400-600 ng/μl for control plasmids and 1 μg/μl for constructs testing CRE activity. Retinas were harvested after 2 days in culture.
Plasmid and DNA sequences
In vivo enhancer testing was performed with the Stagia3 reporter vector (Addgene #28177) (Billings et al., 2010). Enhancer testing with the CAG-EGFP, CAG-mCherry and ThrbCRM1-tdTomato vectors were modified from our previous work (Emerson et al., 2013; Matsuda and Cepko, 2007). Coordinates of regions cloned are shown in mm10 assembly: O10>chr14:48616314-48617497; O11>chr14: 48624486-48626389; O5>chr14: 48579937-48581029; O6>chr14: 48584564-48585012; O7>chr14: 48593170-48594188; O8>chr14: 48606973-48608016; O9>chr14: 48608144-48609697; O15>chr14:48662841-48663211; O20>chr14:48740203-48742409; Oc1 A>chr9:74384085-74384740; Oc1 B>chr9 74530189-74532399; Oc1 C>chr9:74810971-74812406; Otx2 ncRNA1>chr14+48580418 48580655; Otx2 ncRNA2>chr14+48593310 48594038; Otx2 ncRNA3>chr14+48608844 48609310; Otx2 ncRNA4>chr14+48617045 48617321; Rxrg ncRNA>chr1+167438156 167438645; Pde6b ncRNA>chr5+108366779 108367075; En2 ncRNA>chr5+28145373 28145644; Socs3 ncRNA>chr11+117963224 117963787; Nab1 ncRNA>chr1+52435750 52436194; Six6 ncRNA1>chr12+72854405 72854809; Six6 ncRNA2>chr12+72831804 72832279; Opn1sw ncRNA>chr6+29394311 29394823.
Immunohistochemistry
Retinal sections (20-30 μm thick) were prepared and stained as described previously (Cherry et al., 2011). Blocking solution was 0.3% Triton X-100 in 1× PBS. Primary antibodies used in this study were: chicken anti-GFP (1:1000, Abcam, AB13970), rabbit anti-mCherry (1:1000, Abcam, 167453), rabbit anti-Otx2 (1:500, Millipore, AB9566), mouse anti-Rxrg (1:300, Santa Cruz Biotechnology, sc-514134). Secondary antibodies were from Jackson ImmunoResearch (703-545-155, 711-585-152, 715-545-150, 715-585-150).
Imaging
Retina explants were imaged on a Leica M165FC microscope. Retinal section images were acquired using a Zeiss LSM780 inverted confocal microscope from the Microscopy Resources on the North Quad (MicRoN) core at Harvard Medical School.
Footnotes
Author contributions
Conceptualization: C.P.-C., L.A.S., R.D.N., N.L., I.P.M., C.C.; Methodology: R.D.N.; Software: C.P.-C.; Validation: N.L.; Formal analysis: C.P.-C., L.A.S., N.L.; Investigation: C.P.-C., L.A.S., N.L.; Resources: C.P.-C., A.E.O.H., S.W., J.C.C., N.L.; Data curation: C.P.-C.; Writing - original draft: C.P.-C., L.A.S., R.D.N., C.C., N.L., I.P.M.; Writing - review & editing: C.P.-C., L.A.S., J.C.C., C.C., N.L., I.P.M.; Visualization: C.P.-C., L.A.S., N.L.; Supervision: C.P.-C., L.A.S., R.D.N., J.C.C., C.C., I.P.M.; Project administration: J.C.C., C.C., I.P.M.; Funding acquisition: C.C., I.P.M.
Funding
This research was supported in part by the National Institutes of Health (NIH) (R01HL126509, R01HL148719, R01HL147571 and R33 HL123857 to I.P.M.; R01EY024958, R01EY025196 and R01EY026672 to J.C.C.; F30 HL131298 to R.D.N.; R01EY029771 to C.C.), an American Heart Association Collaborative Sciences Award (AHA17CSA33610126 to I.P.M.), the Howard Hughes Medical Institute (C.C.), the Fondation Leducq (I.P.M.) and in part by the NIH through resources provided by the Computation Institute and the Biological Sciences Division of the University of Chicago and Argonne National Laboratory (grant 1S10OD018495-01). N.L. was supported by post-doctoral fellowships from the Swiss National Science Foundation (Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung) and the Human Frontier Science Program. S.W. was supported by the American Diabetes Association Pathway Program (1-16-INI-16). L.A.S. was supported by NIH Training Grants (5T32GM007197-43 and T32HL007381) from the National Heart, Lung, and Blood Institute. Deposited in PMC for release after 12 months.
Data availability
All data presented in this manuscript have been deposited in Gene Expression Omnibus under accession number GSE136565.
Peer review history
The peer review history is available online at https://dev.biologists.org/lookup/doi/10.1242/dev.184432.reviewer-comments.pdf
References
Competing interests
The authors declare no competing or financial interests.