ABSTRACT
Ocular lens development entails epithelial to fiber cell differentiation, defects in which cause congenital cataracts. We report the first single-cell multiomic atlas of lens development, leveraging snRNA-seq, snATAC-seq and CUT&RUN-seq to discover previously unreported mechanisms of cell fate determination and cataract-linked regulatory networks. A comprehensive profile of cis- and trans-regulatory interactions, including for the cataract-linked transcription factor MAF, is established across a temporal trajectory of fiber cell differentiation. Furthermore, we identify an epigenetic paradigm of cellular differentiation, defined by progressive loss of the H3K27 methylation writer Polycomb repressive complex 2 (PRC2). PRC2 localizes to heterochromatin domains across master-regulator transcription factor gene bodies, suggesting it safeguards epithelial cell fate. Moreover, we demonstrate that FGF hyper-stimulation in vivo leads to MAF network activation and the emergence of novel lens cell states. Collectively, these data depict a comprehensive portrait of lens fiber cell differentiation, while defining regulatory effectors of cell identity and cataract formation.
INTRODUCTION
Expression profiling of specific tissues and cell types has been applied for gene discovery in organogenesis (Anand and Lachke, 2017). Genome-level approaches have examined the ocular lens from human, mouse, chicken, amphibian and fish, by expression microarrays and high-throughput RNA-sequencing (Agrawal et al., 2015; Anand et al., 2015, 2018b; Audette et al., 2016; Barnum et al., 2020; Budak et al., 2018; Cavalheiro et al., 2017; Chauhan et al., 2002a,b; De Maria and Bassnett, 2015; Greiling et al., 2009; Hawse et al., 2003, 2004, 2005; Hoang et al., 2014; Ivanov et al., 2005; Kakrana et al., 2018; Khan et al., 2015, 2016, 2018; Lachke et al., 2011, 2012; Manthey et al., 2014a,b; Padula et al., 2019; Patel et al., 2022; Siddam et al., 2018, 2023; Sousounis and Tsonis, 2012; Sousounis et al., 2013, 2015; Srivastava et al., 2017; Upreti et al., 2023; Wang et al., 2017; Wolf et al., 2013a,b; Xiao et al., 2006; Zhao et al., 2018, 2019a). However, the majority of these studies were performed on whole lens tissue, with the exception of a few that generated transcriptome data on isolated lens epithelium or fiber cells (Greiling et al., 2009; Hoang et al., 2014; Nakahara et al., 2007; Zhao et al., 2018, 2019a), or specific regions within the lens (Chauss et al., 2014; De Maria and Bassnett, 2015; Disatham et al., 2019; Ivanov et al., 2005). Additionally, targeted expression profiling has been performed on individual lens fiber cells in mouse (Bhat et al., 2019; Gangalum et al., 2018). The advent of single-cell and single-nucleus RNA-sequencing (sc/snRNA-seq) offers new opportunities to examine gene expression of cells at an unprecedented throughput and resolution (Dimitriu et al., 2022; Grindberg et al., 2013; Ogbeide et al., 2022; Stuart and Satija, 2019; Vandereyken et al., 2023).
Even so, the application of these approaches to the lens have been limited (Disatham et al., 2023). Except for recent studies in zebrafish (Farnsworth et al., 2021) and mouse (Yamada et al., 2021), there are no reports of scRNA-seq or snRNA-seq on the developing lens. The zebrafish study, although broadly informative, performed scRNA-seq on the whole embryo, limiting its depth in relation to the lens (Farnsworth et al., 2021). In the mouse study, scRNA-seq was performed on the lens placode at the induction stage (Yamada et al., 2021). Thus, although informative for examining gene expression associated with early development, the mouse study does not provide any information about expression dynamics across lens epithelial populations or on differentiation into fiber cells. Although the chicken has been a model for lens development for several decades (Bhat et al., 1980; Cui et al., 2004; Ogino and Yasuda, 1998), studies defining the transcriptome of chick lenses have been limited (Chauss et al., 2014; Disatham et al., 2019). The two transcriptomics studies on the embryonic chick lens were performed on distinct regions within the tissue (Chauss et al., 2014; Disatham et al., 2019), but not at single-cell resolution. Additionally, the study of Disatham et al. also described chromatin accessibility profiles across these lens regions, allowing correlation between the state of chromatin and RNA abundance in a coarse spatial manner (Disatham et al., 2019). Other studies performed in the mouse focused on establishing correlation between chromatin state and gene expression in whole lenses (Sun et al., 2015) or in isolated epithelial and fiber cells (Chang et al., 2023; Zhao et al., 2019b).
To advance knowledge on the intersection of chromatin state dynamics and gene expression in the lens at the single-cell level, we present an integrated, multiomic approach that encompasses gene expression and chromatin accessibility associated with the chicken epithelial to fiber cell differentiation program (Fig. 1A). We generate and analyze multiple modalities and layer functional approaches to define molecular cascades underlying lens development and their significance to lens pathology. Specifically, we capture epithelial cell state dynamics and establish an inferred temporal map of fiber cell differentiation trajectory. This confirms established signaling effectors that are active in lens development, such as FGF, WNT, GLI and NOTCH, while providing further information about their precise spatiotemporal dynamics. The multimodal viewpoint allows us to construct a comprehensive atlas of the cis- and trans-regulatory networks hallmarking epithelial cell dynamics and fiber cell differentiation. To achieve this, we identify established lens transcription factors (TFs), including PAX6, FOXE3, MAF and SOX-family proteins, and predict new candidates for future study. We leverage these observations to deduce global binding events and downstream regulatory targets for the cataract-linked transcription factor MAF, a known effector of fiber cell differentiation. Additionally, this integrated analysis uncovered a previously unreported epigenetic paradigm of fiber cell differentiation that is defined by loss of the H3K27 methylation writer Polycomb Repressive Complex 2 (PRC2) and the simultaneous acquisition of the PRC2-modifying gene JARID2 (Chammas et al., 2020; Liu and Liu, 2022; van Mierlo et al., 2019). We profile the global DNA-binding activity of PRC2 and the distribution of histone modifications to uncover a potential new epigenetic mechanism for safeguarding early lens cell identity. Finally, we examine the functional effects of FGF hyper-stimulation on lens cell identity in vivo at single-cell resolution. Together, these rich datasets confirm established pathways and known cataract-linked genes in the lens, and through independent validation and functional assays, we identify a cohort of new proteins that play a key role in lens development, homeostasis and pathology.
RESULTS
Single-nuclei profiling identifies distinct cell populations in the embryonic chicken lens
We sought to examine the global regulatory profile of the embryonic chicken lens at a multimodal, single-cell level (Fig. 1A). We established a protocol to process and identify lens nuclei by performing single-nucleus RNA-seq and ATAC-seq on the whole chicken eye to investigate the transcriptome and chromatin landscape (Fig. 1B). Whole eyes were isolated from chickens at embryonic (E) days 4 and 5, processed using an in-house nuclei isolation protocol and subjected either to snRNA-seq or to combined snRNA-seq and snATAC-seq. From the snRNA-seq profiles, which comprised 20,066 nuclei, 643 nuclei were identified by a UMAP feature plot to exhibit characteristic lens gene expression, characterized by progressive ASL1 abundance, which encodes delta-crystallin, a marker of the chicken lens (Fig. 2A). Re-clustering of this isolated lens population revealed three major cell states (Fig. 2B) associated with epithelial to fiber transition. For example, epithelial markers TFAP2A, DMRTA2 and FOXE3 were progressively reduced across the three populations (Fig. 2C). Conversely, fiber cell genes MIP, BFSP1 and CRYBA4 showed an inverse pattern (Fig. 2C).
We then applied an approach termed ‘cell cycle scoring’ that assigns a specific cell cycle stage to each nucleus (Fig. 2B). These analyses led to the recognition of the subclusters of epithelial, intermediate and fiber cell origin (Fig. 2B). In total, 1417 unique genes were differentially expressed in any of the three subclusters (Table S1). Interestingly, we identified 101 gene markers for the intermediate cluster, including LHFPL1, RELN and JAG1, suggesting the presence of a discrete, transient cell state during fiber cell differentiation (Table S1). Each of these subclusters comprised nuclei from two distinct library preparation technologies, as well as from distinct biological replicates and both sexes (Fig. S1). Functional pathways associated with genes enriched in the epithelial subcluster included ‘mitotic cell cycle’ and ‘morphogenesis of an epithelium’ (Fig. 2D; Table S2). Terms associated with the intermediate subcluster-enriched genes included ‘establishment or maintenance of cell polarity’, whereas those of the fiber subcluster included ‘lens fiber cells’ (Fig. 2D). Together, these analyses demonstrate that snRNA-seq can reliably identify lens cells and classify them into populations reflecting epithelial, intermediate and fiber cell biology.
Derivation of the epithelial to fiber cell differentiation trajectory by snRNA-seq
Although clustering analysis was able to coarsely define lens cell populations, we postulated that these cells may be more aptly considered as a continuum of expression states, rather than discrete populations. To this end, we employed pseudotime analysis to establish a continuous trajectory of epithelial to fiber cell differentiation (Fig. 3A). Pseudotime represents an analytic framework that assigns a measure of how far a cell has moved across a biological process. The established trajectory concurred with previously accepted expression patterns reflecting fiber cell differentiation, allowing us to infer temporal aspects of gene regulation. For example, EFNA5 is abundant early and becomes progressively reduced in this inferred epithelial to fiber trajectory (Fig. 3B), consistent with its established expression in the lens (Cheng et al., 2017). Other genes, such as CRIM1, TGFB2 and TRPM1, which were previously shown to follow this pattern (Cheng et al., 2017; Zhang et al., 2016), were identified (Fig. 3B). Furthermore, BIN1, BNIP3L, CAPRIN2, EPHA2, GJA3, GJA8, and RBM24 exhibited progressive elevation across the epithelial to fiber continuum (Fig. 3B), in agreement with their established expression patterns (Chauss et al., 2014; Cheng et al., 2017; Dash et al., 2015, 2020; Zhao et al., 2018).
Interestingly, a group of genes exhibited a transient spike in abundance across the differentiation trajectory, substantiating the presence of an intermediate cell state. Such genes included EFNB2, the extracellular matrix-remodeling regulator MXRA5, the fibroblast growth factor receptor-like protein FGFRL1 and the kinesin motor protein-encoding gene KIF13A, among others (Fig. 3B). Consistent with literature (Rowan et al., 2008), JAG1 exhibited transient expression across the trajectory. This group of transiently expressed genes can be reconciled with a previous report documenting extensive chromatin changes concurrent with the transition from differentiating epithelial cell to nascent fiber cell (Disatham et al., 2019), suggesting a dedicated regulatory program that initiates fiber cell differentiation. The total numbers of genes associated with epithelial or fiber states were significantly larger than the number of genes correlating with the intermediate state (Table S1). Taken together, the captured trajectory encapsulates previously reported features of fiber cell differentiation, while broadly providing information about the spatiotemporal dynamics of RNA regulation.
snATAC-seq captures chromatin configuration during epithelial to fiber cell differentiation
To determine whether chromatin changes are associated with the epithelial through fiber differentiation program. To achieve this, we took a multiomics approach that facilitates simultaneous capture of RNA and accessible chromatin from the same nucleus. Using this approach, we captured ATAC profiles of 386 nuclei across the lens to identify a total of 17,591 total accessible regions. These combined RNA and ATAC profiles are summarized by UMAP representation (Fig. 4A). Notably, the combined modality UMAP preserves the spatial proximity of cells within the subclusters previously derived from the RNA-only profiles. In addition, all clusters displayed expected enrichment patterns proximal to transcription start sites (TSSs) and nucleosomal periodicity across the distribution of aligned fragment lengths (Fig. S2). Differential accessibility testing of peak regions identified 963 regions that exhibit a significant change in accessibility during fiber cell differentiation, which broadly correlated with epithelial and fiber states (Fig. 4B). Interestingly, the genomic distribution of all 17,591 accessible regions was predominantly found within promoters (∼55.9%; Fig. 4C), presumably owing to active transcription. In contrast, differentially accessible peaks were largely found within intronic or distal intergenic peaks (∼60.6%; Fig. 4D), suggesting the majority of changes are attributable to alterations at cis-regulatory elements (CREs), including enhancers.
We then used RNA abundance and chromatin accessibility to predict interactions between peak regions and nearby TSSs. This analysis identified CREs that correlate with the expression pattern of individual genes. For example, three CREs and a peak spanning the coding region were identified for FOXE3, and the accessibility of these regulatory regions correlated with FOXE3 expression (i.e. high expression and accessibility in the epithelial state that is progressively reduced in the fiber state) (Fig. 4E). Similarly, an expression-linked CRE and promoter region peak were identified for the fiber cell-expressed gene BFSP1 (Fig. 4F). Interestingly, we identified fiber-expressed genes as promising candidates for further study in the lens (e.g. the glycophorin-C encoding gene GYPC) (Fig. 4G). Importantly, this approach was sensitive enough to identify genes whose dynamic expression spiked ephemerally in the intermediate state, such as RHOJ, which encodes a Ras homolog family member GTP-binding protein, as well as a transiently accessible CRE predicted to drive RHOJ transcription (Fig. 4H). The cis-regulatory profiles of additional known and previously unreported lens regulators are displayed, including ASL1, CRYAB, CRYBA4, PAX6, TKTL1 and YBX3 (Fig. S3). In a similar manner, we report the expression profile and chromatin landscape of genes that serve as effectors of pathways with roles in mediating lens fate, including FGF, BMP, SHH and NOTCH signaling, as well as numerous mediators of intracellular cascades (Fig. S4).
Additionally, we postulated that this analysis could help to reveal the dynamic expression of genes encoding TFs in lens development. We used established repositories (Hammelman et al., 2022; Lambert et al., 2018) and manual curation to create a complete catalog of chicken TFs, based on mouse and human orthologs. This approach identified 80 TF-encoding genes with dynamic expression patterns across lens states (Fig. 5A). Importantly, we corroborated established expression patterns for key lens TFs (e.g. FOXE3, HSF4, MAF, MAB21L1, MEIS2, PAX6, PROX1, SOX1 and SOX2) (Fig. 5A) (Cvekl and Zhang, 2017). For example, PAX6 levels are progressively reduced across the inferred epithelial to fiber trajectory (Fig. 5A), consistent with its known expression pattern (Walther and Gruss, 1991). Next, we sought to relate gene expression dynamics of TFs to the chromatin signature at their cognate DNA-binding motifs. To achieve this, we used the JASPAR database (Castro-Mondragon et al., 2022) to procure vertebrate TF motifs and search for their over-representation within accessible regions across lens populations. This approach identified global accessibility reductions at genomic regions containing a FOX-family motif during fiber differentiation (Fig. 5B), correlating with FOXE3 expression. This analysis also identified shifts in motif accessibility for known TFs (e.g. E2F3) (Wenzel et al., 2011), as well as contextually novel TFs (e.g. the E2F-binding protein TFDP1) (Wu et al., 1995). Similarly, this analysis identified a global elevation in accessibility within genomic regions containing MAF and SOX13 motifs as lens cells differentiate, agreeing with observed gains in RNA abundance for these factors (Fig. 5C). SOX2 also exhibits elevated RNA abundance and motif accessibility in fiber cells (Fig. 5C). Notably, SOX2 has previously identified as a core gene regulator of lens placode formation (Donner et al., 2007; Inoue et al., 2007; Kamachi et al., 2001; Smith et al., 2009), as well as cooperating with MAF in the activation of delta-crystallin (Shimada et al., 2003).
Derivation of regulatory networks for transcription factors associated with human ocular defects
Next, we sought to apply chromatin accessibility information to predict regulatory relationships associated with transcription factors in the lens. We focused on the bZIP family protein MAF (c-Maf), mutations or deficiency of which cause cataract and ocular defects in humans (Anand et al., 2018a; Jamieson et al., 2002; Niceta et al., 2015) and animal models (Kawauchi et al., 1999; Kim et al., 1999; Ring et al., 2000). We took the following approach toward this goal: (1) identify all peak regions correlated with a nearby gene expression change, (2) identify the subset that contain Maf-binding motif(s), (3) identify those that exhibit a predicted activating relationship (i.e. upregulation of nearby gene expression) and (4) retain only peaks that have a positive correlation between accessibility and MAF RNA abundance. Cells across epithelial through fiber states exhibit a progressive increase in MAF expression, which correlates with increased accessibility of specific genomic regions containing MAF-binding motifs (Fig. 6A). We searched for linked genes within 1 million base-pairs of MAF motif-containing genomic regions; of these, 114 genes met all of the selection criteria, suggestive of an activating relationship with MAF (Fig. 6B). These candidates included key cataract/lens defect-linked genes, such as BFSP2, CRYBB1, GATA3 and TDRD7 (Fig. 6C), among others (Fig. S5). In addition, other genes were identified with important roles in lens biology (e.g. ASL1, CAP2, CAPRIN2, CRYBA1, CRYBA4, CRYBB2, JAG1, MAFA, PROX1, RBM24 and SOX13) (Fig. 6B). Using this approach, MAF itself was identified among the target genes, indicative of auto-regulatory circuitry. MAF has been demonstrated to bind proximal to its own locus and to potentiate its own expression in the mouse lens, as well as to drive expression of Crybb1 (Cvekl and Zhang, 2017; Xie and Cvekl, 2011; Yang et al., 2004). Thus, our predictive model identifies bona fide MAF targets, while expanding the repertoire of candidate MAF targets in the lens. Altogether, this represents a multiomics-based approach for predicting MAF regulatory outputs in lens development and pathology, and a similar method could be extended to other TFs.
Identification of a previously unreported epigenetic program associated with epithelial to fiber cell differentiation
Our focus then shifted to genes that exhibit variable expression across the lens pseudotime trajectory, which is marked by progressive abundance of ASL1 (Fig. 7A). Several components of the PRC2 complex, i.e. JARID2, EZH2, EED and SUZ12, were identified among the candidates with significant expression variability during fiber cell differentiation (Fig. 7A; Table S3). The PRC2 complex is a writer of the repressive histone modification H3K27me3 and is recognized as key regulator of gene silencing during development (Chammas et al., 2020; Liu and Liu, 2022). Three core members of the PRC2 complex, EZH2, EED and SUZ12, exhibited a progressive reduction in abundance during fiber differentiation. Of these, EZH2 showed a marked reduction in expression, while EED and SUZ12 represented relatively low-expressed genes with a less pronounced regulatory pattern (Fig. 7A). Conversely, JARID2 – which encodes an accessory protein that can modify PRC2 function – is expressed in all the lens populations but exhibits elevated abundance associated with differentiation. This pattern of EZH2 and JARID2 mRNA expression in the chicken lens was independently validated by hybridization chain reaction fluorescent in situ hybridization (Fig. 7B). Furthermore, this pattern was conserved between chicken and mouse, as independently validated by examination of previously described transcriptomic data from isolated mouse lens epithelial and fiber cells (Zhao et al., 2018).
Protein abundance of EZH2 in E12.5 mouse lens epithelial cells was assessed using immunofluorescence, which revealed a clear reduction in fiber cells (Fig. 7C). Notably, a reduction in SUZ12 protein was also observed, although variability was observed among biological replicates of the E12.5 mouse lens (Fig. 7C; Fig. S6A). Immunostaining of E14 mouse lenses showed marked reductions in SUZ12 (Fig. S6B), suggesting that this phenotype becomes salient during this developmental window. Interestingly, we did not find evidence of EZH1 regulation in our chick snRNA-seq dataset (Fig. S6C). EZH1 localized to extra-nuclear deposits along the apical surface of the developing mouse lens epithelium, which redistributed toward the anterior lens as fiber differentiation proceeded (Fig. S6D,E). As expected, we observed H3K27me3 immunofluorescence signal in mouse lens epithelial cells, which was mostly absent from fiber cells (Fig. 7D; Movie 1). The overlap of EZH2, SUZ12 and H3K27me3 signals suggests that the PRC2 complex functions in rendering spatiotemporal epigenetic control over chromatin structure in lens epithelial cells, an activity that is abrogated from fiber cells. These data suggest a key function for PRC2 complex proteins in cell fate control in lens development (Fig. 7E).
Genome-wide profiling of PRC2 and histone modifications in the developing lens via CUT&RUN
We then performed CUT&RUN analysis on the E4.5 chicken lens to identify the genomic regions where EZH2 and JARID2 are localized in the lens and correlate these with H3K27me3 modification. CUT&RUN uses an antibody-guided nuclease to identify global DNA-binding regions for proteins of interest (Skene and Henikoff, 2017). CUT&RUN analysis of embryonic chicken lenses showed extensive colocalization of signals for EZH2, JARID2 and H3K27me3 across peak regions, which were largely devoid of H3K4me3 (which marks active transcription) and the IgG negative control (Fig. 8A, Fig. S7A-C). Interestingly, EZH2/JARID2 colocalized peaks were found predominantly proximal to genic regions (94.7%) compared with distal intergenic regions (5.3%), implicating their direct function in mediating transcription (Fig. 8B; Table S4). Within genic regions, these peaks were largely identified near the promoter regions (∼75%) (Fig. 8B). Comparison of the top 25 EZH2- or JARID2-bound genes exhibited a 60% overlap (Fig. 8C). Almost all of these bound regions encoded master-effector transcription factors or DNA-binding factors involved in fate determination. Indeed, GO analysis of the top 200 PRC2-bound targets demonstrated that the majority of these genes are involved in transcription regulation, indicated by the enrichment of the term ‘DNA-binding TF activity’, as well as cell fate control, indicated by the terms ‘cell fate commitment. and ‘neuron fate commitment’ (Fig. 8D; Table S2).
Interestingly, loci encoding lens TFs known to function in epithelial cells, i.e. PAX6, SIX3 and FOXE3, did not exhibit enrichment of PRC2 or H3K27me3 signal, but did exhibit H3K4me3 signal, consistent with their active transcription in the epithelial cell population (Fig. 8E). We also observed a small set of genes associated with our snRNA-seq dataset that were bound by PRC2 (Fig. S7D). Conversely, a cohort TFs, generally involved in cell fate determination of non-lens ocular cell types, exhibited robust localization of EZH2, JARID2 and H3K27me3, but not H3K4me3, demonstrating the PRC2-mediated maintenance of heterochromatin domains that facilitate their repression in the lens (Fig. 8F). For example, eye field transcription factors RAX and SIX6, as well as neural retina TFs, such as ASCL1, VSX2 and PRDM14, exhibited extensive PRC2 and H3K27me3 domains across their transcript start sites (Fig. 8F). Together, these data suggest that PRC2 function is actively involved in the control of differentiation and suppression of alternate cell fates in the lens.
In this regard, localization patterns between PRC2 and the histone landscape pointed toward sophisticated control over the activation of TFs in the lens gene regulatory network. For example, GATA3 exhibits PRC2 peaks, as well as features of open chromatin and active transcription (Fig. 8E). Within the GATA3 locus, we previously observed a differentially accessible region containing a candidate MAF-binding site, for which accessibility patterns correlated with GATA3 expression (Fig. 6C). Taken together with our CUT&RUN data, we observed overlap between this site and the active transcription modification H3K4me3 (Fig. 8E). All our observations [e.g. (1) PRC2 is bound to the GATA3 locus (Fig. 8E), (2) fiber cells exhibit loss of PRC2 and H3K27me3 signals (Fig. 7), (3) there are hallmarks of active GATA3 transcription during lens development (H3K4me3; Fig. 8E), (4) progressive abundance of GATA3 mRNA is observed in the epithelial to fiber snRNA trajectory (Figs 5A and 6B), (5) there are correlative changes to chromatin architecture, as demonstrated by snATAC-seq (Fig. 6C), and (6) there are putative MAF-binding sites that we identified within the GATA3 regulatory locus (Fig. 6C)] together define a complex sequence of events reconstructed by this multiomics approach (Fig. S8). Of note, PRC2 and H3K27me3 were largely localized to the 5′ region of the GATA3 locus, which leaves open the possibility that PRC2 could also direct the use of a specific TSS during lens development (Fig. 8E).
snRNA-seq analysis of the impact of FGF on lens cell fate
We next sought to gain in vivo mechanistic insights into signaling pathways at single-cell resolution. FGF signaling plays an important role in regulating fiber cell differentiation in the lens (Lovicu and McAvoy, 2005). In this regard, our analysis indicated progressive reduction of FGFR3 throughout fiber cell differentiation and transient upregulation of FGFRL1 during fiber cell differentiation (Table S1). To assess the impact of FGF signaling on lens development, we co-opted a retina regeneration paradigm in which the retina is surgically removed (retinectomy) and replaced with beads that deliver FGF2 to the overlying lens (Fig. S9). (Coulombre and Coulombre, 1965; Tangeman et al., 2021, 2022). The lens was then subjected to retinectomy-only (control) or retinectomy and FGF2 hyperstimulation for 6 h before performing snRNA-seq, which revealed dramatic changes in cell populations (Fig. 9A). We applied automated cell type annotation using a label transfer method to categorize cells into epithelial, intermediate and fiber states based on their transcriptional similarity to our developmental reference dataset (Fig. 2B). The control and FGF2-treated samples showed a spectrum of lens cell states (Fig. 9B; Fig. 3A), whereas FGF2 treatment resulted in a clear separation of intermediate and fiber cell populations, corresponding to two previously unreported clusters (Fig. 9C, Fig. S10A). Supporting our confidence in these cell states, TFAP2A exhibited high expression in epithelial cells and reduced expression in fiber cells, whereas ASL1 exhibited the opposite pattern (Fig. 9D,E; Fig. S10B). We then assessed for differences in the odds of a cell being assigned to a particular cell state across treatments using multinomial logistic regression. This analysis revealed strong evidence of increased odds that a cell is in the intermediate state (odds factor 2.655; FDR adj. P<0.0005) or fiber state (odds factor 2.238; FDR adj. P<0.0005), as opposed to the epithelial state, when contrasting the FGF2-treated condition to retinectomy-only control (Fig. 9F). Similarly, there was strong evidence of increased odds of a cell being in an intermediate state (odds factor 2.148; FDR adj. P<0.005) or fiber state (odds factor 1.933; FDR adj. P<0.005) when contrasting the FGF2-treated condition with E4 development (Fig. 9F), supporting a role for FGF2 hyperstimulation in driving fiber cell differentiation. It should be noted that our observation derived from multinomial logistic regression are accompanied by notable caveats, including a low number of replicates and independence assumptions, among others (see Materials and Methods), and future studies may be designed to more rigorously examine these findings.
After broadly defining FGF-responsive cell populations, we focused on differential expression after FGF treatment (Table S5). Genes with a functional relation to FGF signaling (e.g. FGFR3, MAF and SPRY2) were responsive, consistent with FGF hyperstimulation (Fig. 9G). Interestingly, BNIP3L, which is involved in fiber cell organelle degradation (Brennan et al., 2018) and in a separate study is downregulated by FGF2 in H9C2 cells (Chen et al., 2016), was elevated. To further support these claims, we performed pseudotime analysis individually on our control and FGF2-stimulated lens populations (Fig. S11A,B). For both treatment groups, a clear epithelial-to-fiber trajectory was established, facilitating a side-by-side comparison of gene expression for markers of epithelial, intermediate and fiber identity. The activation of fiber cell markers (RASGEF1A, MAF and BNIP3L) was accelerated in the presence of FGF2, and that of intermediate markers (LHFPL1, JAG1, MAST4 and FGFRL1) was robustly activated, concordant with the observed increase in intermediate cell frequency (Fig. 9G, Fig. S11C). A clear reduction in TGFB2, an epithelial marker, was distinguishable in the differentiation trajectory of FGF2-stimulated lens cells (Fig. S11C). Paradoxically, MIP and VIT were reduced in abundance relative to control (Fig. 9G); these are genes associated with cataracts or predicted to be important in the lens (Berry et al., 2000; Lachke et al., 2012). This could suggest that stimulation of fiber cell differentiation with exogenous FGF2 in vivo does not faithfully recapitulate the endogenous fiber differentiation program.
We then compared the expression of these genes with the endogenous expression patterns observed in the developing E4 lens (Fig. S12). Consistent with expectations, we observed augmented expression of MAF in the FGF-stimulated condition, whereas SPRY2 expression was extended into the fiber cell population after FGF2 stimulation. Relative to the developing lens, FGF2 stimulation resulted in suppression of TGFB2 and VIT in intermediate or fiber cell populations. In addition, BMPER, BNIP3L, CAMK2D, HMP19, JAG1 and RASGEF1A, which are all expressed in the developing lens, were reduced in abundance after retinectomy and partially recovered or increased in expression after FGF2 stimulation (Fig. S12). Thus, FGF2 is able to rescue the expression of lens genes that are not sustained in the absence of influences from the neural retina.
We next performed pseudobulk analysis, which summates total RNA across lens cells in each condition, and used this approach to uncover a cohort of genes responsive to FGF2 treatment (Fig. 9H). Aggregate differential expression analysis captured a clear treatment response (Fig. S13a), identifying 471 genes significantly altered in lens cells (Table S5). This analysis captured trends in gene expression consistent with a role for FGF2 in driving fiber cell differentiation, including elevated levels of GATA3, CAPRIN2, PROX1, EPHA2, CDKN1B and CDKN1C, as well as reductions in SIX3 and PAX6, although not all of these genes reached statistical significance (adjusted P value>0.05) (Fig. S13B). Given the upregulation of MAF in response to FGF2, we examined the overlap between FGF-responsive genes and those identified as putative MAF targets (Fig. 6), which included genes linked to lens biology and defects (e.g. JAG1, TDRD7, etc.) (Fig. 9I). Together, these findings provide functional insights into the impact of FGF2-mediated gene activation in lens cells (Fig. S14).
Cataract-associated loci implicated in the present study
Next, we examined whether genes identified in these various approaches could point toward the regulatory underpinnings of lens development and pathology. Therefore, we compared candidate genes identified in these various approaches with those identified in Cat-Map (Shiels et al., 2010) (an online database of cataract-linked genes) and iSyTE (Anand et al., 2018b; Kakrana et al., 2018; Lachke et al., 2012) (an approach to identify genes with lens-enriched expression that correlate with lens biology and pathology). The comparative analysis with Cat-Map identified 72 genes that are linked to cataracts and serve as markers of cell populations in the developing lens (Fig. 10A; Table S6). We identified 15 genes in Cat-Map that are MAF targets, the majority of which (80%) are expressed in fiber cells (Fig. 10B). Analysis of PRC2 targets found in Cat-Map suggested that few are associated with lens cell populations; the majority of these genes are not significantly regulated or not appreciably expressed in lens cells at the examined developmental stage (Fig. 10C). This is consistent with the regulatory logic of PRC2 as a repressor and may explain the contribution of these cataract-linked genes to lens pathology, which could result from spurious expression secondary to defective transcriptional silencing. Similarly, numerous FGF2-responsive genes were identified to have an overlap with Cat-Map (Fig. 10D). A number of FGF2 targets were not robustly expressed or associated with specific cell populations in our development-only dataset; this could indicate dysfunctional FGF-signaling as a mechanism for aberrant gene activation and cataractogenesis. Analysis of the top 528 genes with lens-enriched expression in the iSyTE database followed a similar trend (Fig. S15; Table S6). Together, these analyses identify new genes associated with lens development, homeostasis and pathology.
DISCUSSION
Although recent studies have reported scRNA-seq of the lens in zebrafish (Farnsworth et al., 2021), mouse (Yamada et al., 2021; Giannone et al., 2023) and human (Liu et al., 2023; van Zyl et al., 2022; Zhu et al., 2023), there remains a need to apply multimodal outlooks to examine lens development. The present work encompasses a comprehensive multiomic analysis of the embryonic chicken lens. Briefly, we performed snRNA-seq, snATAC-seq and CUT&RUN-seq, as well functional assays (FGF2 hyper-stimulation) during a crucial window of lens development. To contextualize these findings and gain new biological insights, we carried out an integrative analysis in the context of lens biology, including motif analysis for key TFs (e.g. FOXE3 and MAF), and related the findings to cataract-linked genes in Cat-Map (Shiels et al., 2010) and to high-priority lens genes in the iSyTE database (Anand et al., 2018b; Kakrana et al., 2018; Lachke et al., 2012). Specifically, this report: (1) defined single-cell regulatory states in the lens, providing a spatiotemporal roadmap of fiber cell differentiation; (2) uncovered regulatory events underpinning lens development, including pathways and TFs, as well as intrinsic and extrinsic cues; (3) captured dynamic lens chromatin reconfiguration, including cis- and trans-regulatory events controlling fiber cell differentiation; (4) identified PRC2-directed epigenetic surveillance as a previously unreported mechanism for lens cell fate control, (5) profiled previously unreported lens cell states and activation of the MAF network in response to FGF hyper-stimulation; and (6) provided a developmental or mechanistic reference frame for cataract-associated genes, prioritizing previously unreported effectors of lens development in relation to pathology.
Cumulatively, this study provides a model for applying multiomic approaches and integrative analysis to tissue development. We have demonstrated how these efforts lead to the identification of conserved expression patterns during lens development, and we set the stage for capturing fundamental processes of cellular differentiation and pathology. A recent review proposed the implementation of various bulk assays, such as RNA-seq, ATAC-seq and CUT&RUN-seq, that would advance our understanding of lens biology (Disatham et al., 2023). Our study builds on these approaches by capturing multiomic readouts, and represents a seminal step toward applying these principles at single-cell resolution. Our results, importantly, add to the growing number of studies to leverage CUT&RUN-seq for the profiling of complex features of DNA-binding proteins and transcription regulators in vertebrates in vivo.
In addition to identifying PRC2 as novel regulatory complex in the lens, we demonstrate how PRC2 controls the activation of TFs, such as GATA3, which has an important role in fiber differentiation. Although PRC2 is involved in the formation and maintenance of heterochromatin (Chammas et al., 2020; Liu and Liu, 2022; van Mierlo et al., 2019), we now demonstrate the participation of this regulatory complex in lens development in vivo. Based on the distribution of PRC2 components across the lens developmental genome, we postulate that PRC2 is involved in the suppression of alternate cell fates, including non-lens ocular lineages. Indeed, studies have suggested that lens cells have a predisposition for acquiring aberrant retinal transcript profiles, and have documented perturbed H3K27 trimethylation with this phenotype (Maddala et al., 2021). In an analogous context, chromatin proteomics was used to associate PRC2 with lineage control in cultured human embryonic stem cells (Zijlmans et al., 2022). Our study uses CUT&RUN to demonstrate that this mechanism is co-opted later in development, in a tissue-specific manner to control cell fate. It should be noted that our bulk CUT&RUN-seq approach represents a mixture of signals from all populations of the developing lens, although we leverage our multimodal observations to deduce interplay between PRC2 and the dynamic chromatin landscape. Interestingly, we observe the loss of PRC2 subunits and H3K27me3 as fiber cells differentiate. Although the present study does not resolve the functional significance of these dynamics, it can be speculated that PRC2 loss may reflect a hallmark of terminal differentiation. Future studies may assess whether lens phenotypic plasticity can be augmented through PRC2 perturbation.
Our study leverages high-content, multimodal information to reconstruct complex features of gene regulation, exemplified by the prediction of regulatory outputs for the human cataract-linked TF MAF. We identify previously known MAF targets, as well as expand the connectivity of this TF by predicting new regulatory targets in the lens. It should be acknowledged that in silico deduction of gene regulatory networks is an emerging field experiencing active technical refinement (Baysoy et al., 2023; Bonneau et al., 2006; Kamimoto et al., 2023; Kartha et al., 2022). Accordingly, regulatory relationships could benefit from independent validation. As the field of in silico gene regulatory network prediction expands, this approach can be applied to other factors, e.g. FOXE3, HSF4, MAB21L1, PITX3, TFAP2A, etc., for which loss-of-function gene expression datasets exist and can be used as validation (Blixt et al., 2000; Brownell et al., 2000; Fujimoto et al., 2004; Ho et al., 2009; Min et al., 2004; West-Mays et al., 1999; Yamada et al., 2003). We intend that this study should set a foundation for future in-depth interrogations of lens development at single-cell resolution. In this regard, a potential limitation of our study is our sample size (n=643 nuclear transcriptomes for the developing lens), although it was sufficient to capture known and previously unreported features of gene regulation. Current technologies enable RNA profiling of thousands to millions of cells in a single batch. Our current analysis does not indicate rare populations that would warrant profiling on such a scale at this stage of development.
FGF signaling is a known driver of lens fiber differentiation (Lovicu and McAvoy, 2005; Thein et al., 2016; Zhao et al., 2008), and our present study reinforces its role from the perspective of single-cell dynamics. After FGF2 stimulation, we observe a reduction in the proportion of epithelial cells, as well as altered intermediate and fiber cell states. Furthermore, we document MAF activation in an FGF-dependent manner in vivo, reinforcing MAF as a FGF target (Xie et al., 2016). Our model places MAF as a key player in driving fiber cell differentiation, as we demonstrate that FGF hyperstimulation results in accelerated MAF activation and upregulation of MAF targets. Given that numerous genes are reduced in expression after retinectomy and recovered after FGF2 supplementation, our observations reinforce the notion that the retina serves as an essential source of growth factors (Thein et al., 2016) that can, in part, be rescued by exogenously supplied FGF2. In summary, our study leverages numerous nascent omics approaches to intertwine a global perspective of gene regulation with granular insights, serving to reinforce known regulatory mechanisms in the developing lens and extend our understanding into previously unappreciated facets of cellular differentiation and organogenesis.
MATERIALS AND METHODS
Chicken embryos and nuclei isolation
Fertile Specific Pathogen Free eggs (Charles River Laboratories, 10100329) were incubated in a rotating, humidified incubator at 37°C, according to institutional and responsible conduct of research standards. Whole chicken eyes were enucleated at E4 or E5 in PBS containing 0.2 U/µl RNasin Plus RNase Inhibitor (Promega, N2615). Eyes from three embryos were pooled for each condition. Nuclei were immediately extracted and fixed for snRNA-seq. For combined snRNA-seq and scATAC-seq, samples were first cryopreserved via snap freezing in an isopentane bath on liquid nitrogen and briefly stored at −80°C before nuclei extraction. In brief, nuclei extraction was carried out via an optimized variation of the demonstrated protocols described in 10X Genomics Protocol CG000124 Rev E and 10X Genomics Protocol CG000366 Rev D. For snRNA-seq, cell lysis was performed by triturating samples and incubating in an ice-cold buffer comprising 10 mM Tris-HCl, 10 mM NaCl, 3 mM MgCl2, 0.01% Surfact-Amps NP-40 (Thermo Fisher Scientific, 28324) and 0.2 U/µl RNase Inhibitor. Nuclei were washed three times by resuspending in ice-cold PBS containing 2% Bovine Serum Albumin Fraction V and 0.2 U/µl RNase inhibitor (Promega, N2615). For multiome assay, lysis was performed via trituration and 30 s incubation in a buffer containing 10 mM Tris-HCl, 10 mM NaCl, 3 mM MgCl2, 0.005% Tween-20, 0.005% Surfact-Amps NP-40, 0.0005% digitonin (Thermo Fisher Scientific, BN2006), 1% Bovine Serum Albumin Fraction V, 1 mM DTT and 1 U/µl Sigma Protector RNase inhibitor (MilliporeSigma, 3335402001). Nuclei were washed twice by resuspending in ice-cold buffer containing 10 mM Tris-HCl, 10 mM NaCl, 3 mM MgCl2, 1% Bovine Serum Albumin Fraction V, 0.1% Tween-20, 1 mM DTT and 1 U/µl Sigma Protector RNase inhibitor. Nuclei were passed twice through 20 µm cell strainers (pluriSelect, 43-10020-40) during isolation to filter debris and cell aggregates.
Single nucleus multiome profiling library preparation and data generation
Nuclei were subjected to the Chromium Next GEM Single Cell Multiome ATAC+Gene Expression library preparation workflow (10X Genomics, 1000285 and 1000230) following 10X Genomics protocol CG000338 Rev F, loading ∼16,000 nuclei per reaction. The 10X Genomics Chromium Controller was used for GEM generation. ATAC libraries were amplified with seven total PCR cycles, and RNA libraries were amplified with 15 PCR cycles. Indexes were incorporated from Single Index Kit N, Set A (10X Genomics, PN-1000212) and Dual Index Plate TT Set A (10X Genomics, PN-1000215). Libraries were sequenced at the Novogene Sequencing Core on the Illumina NovaSeq 6000. ATAC libraries were sequenced to generate 49,000-54,000 reads per nucleus and RNA libraries were sequenced to generate 29,000-56,000 reads per nucleus.
Split-pool barcoding library preparation and data generation
snRNA-seq libraries were generated via the split-pool barcoding method (Rosenberg et al., 2018) using the Parse Biosciences Single Cell Whole Transcriptome Kit Chemistry Version 1 (Parse Biosciences, SB2001). Nuclei were fixed according to the Nuclei Fixation Kit Protocol (Parse Biosciences, SB1003) and stored at −80°C. During library preparation, cDNA was amplified with 13 total PCR cycles. Final libraries were assessed with the Agilent Bioanalyzer and Qubit 4. Libraries were sequenced across two lanes of HiSeq X Ten and a lane of NovaSeq 6000 at the Novogene Sequencing Core to generate an average of 48,101 reads per nucleus across 58,906 total recovered nuclei.
Single-nuclei sequence alignment, gene annotation and data pre-processing
Downstream analysis of single-nuclei data was performed using command line interface, R version 4.2.2 and RStudio Version 2023.06.0+421. Reads were aligned to chicken genome GRCg7b and analyzed using the chicken annotation from Ensembl Release 109 (Cunningham et al., 2022). Throughout the analysis, processing of the chicken annotation was facilitated using the R package ensembldb (v2.22.0) (Rainer et al., 2019). For comparative studies, orthology was determined using Ensembl Biomart. Ensembl gene ENSGALG00010026936 is designated as FOXE3 throughout the manuscript. Reference genome indexes were built for the alignment of single-cell data using the Parse Biosciences alignment suite (v1.0.3) and 10X Genomics Cell Ranger ARC (v2.0.2). Filtered gene matrices were read into the R environment using Seurat (v4.3.0) (Hao et al., 2021). Nuclei with over 200 genes, fewer than 6000 genes, less than 10% mitochondrial reads and fewer than 15,000 RNA molecules were retained in the Parse Biosciences dataset. Nuclei with over 200 genes, fewer than 5000 genes, less than 15% mitochondrial reads and fewer than 15,000 RNA molecules were retained in the 10X Genomics dataset. Ambient RNA was subtracted from raw reads using the SoupX package (v1.6.2) (Young and Behjati, 2020), with the estimated contamination fraction set to the value calculated via Souporcell, as described below (Heaton et al., 2020).
In silico genotyping and doublet removal
In silico genotyping was employed to achieve biological replication by assigning chicken nuclei, derived from three pooled embryos per condition, back to an embryo of origin. Genotyping was facilitated by the Souporcell software (v2.0) (Heaton et al., 2020), using either the provided singularity or on a step-by-step basis. The detailed workflow for Souporcell was followed as written (https://github.com/wheaton5/souporcell). This workflow relies on additional software, including Samtools (Danecek et al., 2021), Minimap2 (Li, 2018) and Freebayes (Garrison and Marth, 2012 preprint). Single-nucleotide variants were cataloged using chicken genome GRCg6a for Parse Biosciences data and genome GRCg7b for 10X Genomics data. Before genotyping, cell barcodes in bam alignment files were given unique suffixes for each library and merged. Barcode assignments were read into the R environment and integrated into Seurat objects. Nuclei labeled as doublets, as well as a small number of nuclei with invalid genotype assignments (nuclei assigned to a condition incompatible with their sample of origin), were discarded from the analysis.
snRNA clustering, marker identification, differential expression testing and pathway enrichment analysis
Filtered nuclei datasets were processed with Seurat (Hao et al., 2021), relying on sctransform for normalization and variance stabilization (Hafemeister and Satija, 2019) before clustering nuclei and UMAP visualization. Cell cycle scoring was performed using the Seurat CellCycleScoring function, and the difference between the S-phase score and the G2M score was saved and used as a regression variable during normalization. In addition, the percentages of counts assigned to the mitochondrial and W chromosomes were calculated and regressed during normalization. Analysis of each dataset resulted in clear nuclei populations expressing ASL1 that were subset and re-processed for downstream analysis. These lens populations were integrated by applying the PrepSCTIntegration function, followed by anchor-based integration. The integrated lens dataset was subjected to PCA, UMAP and nearest-neighbor analysis with 25 dimensions of reduction and clusters, determined using Seurat FindClusters and Louvain algorithm, with the resolution parameter set to 0.2. Raw counts were log normalized and were used for downstream differential expression testing, marker identification and visualization, unless otherwise specified. Differential expression testing between nuclei clusters was performed using the Wilcoxon Rank Sum test (Table S1). Pathway enrichment analysis was performed using gprofiler2 (Kolberg et al., 2020), using the R package (v0.2.1) or online interface (Table S2). Gene lists were converted to human orthologs and run as multi-query assays for pathway enrichment analysis.
Pseudotime and trajectory analysis
Pseudotime trajectories were inferred using the Monocle 3 software (v1.3.1) (Cao et al., 2019; Qiu et al., 2017; Trapnell et al., 2014). Seurat objects were converted to Monocle cell data sets using the SeuratWrappers package (v0.3.1) function as.cell_data_set. Pseudotime trajectories were preprocessed using the Monocle workflow, including normalization, clustering, graph learning and cell ordering (Levine et al., 2015; Traag et al., 2019). Twenty-five dimensions were applied during processing and the minimal branch length parameter was adjusted to 5 during graph learning. Moran's I test was applied for differential expression testing across the trajectory using the principal graph and an adjusted P-value cut-off of 0.05 was applied (Table S3). Expression across the trajectory was visualized using the plot genes in pseudotime function with parameters min expr=1 and trend formula=“∼splines::ns(pseudotime, d.f.=3)”. Expression heatmaps were generated using the pheatmap package (v1.0.12) with columns ordered based on the pseudotime ranking assigned from the pseudotime analysis. For heatmaps, the rolling mean of log normalized counts was calculated using the rollmean function of the R package zoo (v1.8-11) (Zeileis and Grothendieck, 2005), using the neighboring 5% of cells as a rolling window.
ATAC-seq data processing, peak calling, and differentially accessible region testing
ATAC data output from Cell Ranger ARC was read into R as a merged hdf5 file using Seurat. Fragments were added to the existing RNA-only file and Multiome nuclei were subset for further analysis. Nucleosome signal and TSS signal were calculated using Signac (Stuart et al., 2021), and nuclei were retained that contained fewer than 40,000 ATAC counts, greater than 1000 ATAC counts, nucleosomal signal less than 2 and TSS enrichment greater than 1. Peak regions were defined using HMMRATAC (Tarbell and Liu, 2019) (v1.2.10), using the Cell Ranger ARC bam alignment file filtered to contain only lens nuclei. Peaks with a score <10 were removed and mitochondrial peaks discarded. ATAC counts were normalized with the term frequency inverse document frequency method (Stuart et al., 2019) using the RunTFIDF function. Partial singular value decomposition was performed with the Signac (v1.9.0) RunSVD function and UMAP was generated using the latent semantic indexing reduction. The combined modality UMAP was generated using the Seurat function FindMultiModalNeighbors to generate a weighted nearest-neighbors graph derived from the principal component analysis reduction generated for RNA and the latent semantic indexing reduction generated for ATAC. Differential accessibility testing between clusters was performed using the logistic regression and likelihood ratio ‘LR’ test in Seurat with the min.pct parameter set to 0.05. Coverage plots were generated using the Signac CoveragePlot function to display normalized Tn5 insertion frequency across loci of interest.
Linkage of accessible regulatory elements to genes, motif profiling and chromVAR analysis
The correlation of peak accessibility with nearby gene expression was carried out using the Signac LinkPeaks function (Ma et al., 2020), which takes into account GC content and peak length. Normalized RNA and ATAC values were used as input, Peak-TSS relationships were confined to 1,000,000 base pairs and a P-value cut-off of 0.05 was applied. Peaks were annotated with vertebrate motifs drawn from the 2022 JASPAR CORE database (Castro-Mondragon et al., 2022). Motifs were added to the Seurat object using the Seurat AddMotifs function and visualized. chromVAR analysis (Schep et al., 2017) was carried out using the Signac function RunChromVAR and adjusted P values were determined by applying a Wilcoxon Rank Sum test to motif activity scores. Motif logo displays were made using the Signac MotifPlot function. For the prediction of MAF binding events, the prediction of transcription factor binding events was drawn from the annotated Seurat objects. Predicted peak-TSS linkages were retained that both (1) contain a MAF motif in the peak region of and (2) exhibit a positive correlation coefficient between peak accessibility and gene expression, which was reasoned based on the accepted activity of MAF as a transcriptional activator (Takahashi, 2021). The rolling mean of normalized RNA abundance and accessibility were then calculated across the pseudotime trajectory, as described above. Peaks with a normalized accessibility that exhibited a positive correlation with MAF RNA abundance (correlation coefficient>0.6) were retained as candidate MAF-binding loci.
Hybridization chain reaction fluorescent in-situ hybridization
Fertile chicken eggs were obtained from either Michigan State University or Charles River Laboratory (10100329). Embryos were rinsed in PBS, fixed overnight in 4% PFA at 4°C, washed in PBS and stored in 100% methanol at −20°C. HCR-fluorescent in-situ hybridization was carried out using custom probes and reagents designed by Molecular Instruments, and a modified version of the Molecular Instruments protocol MI-Protocol-RNAFISH-Mouse Rev. 8 2022-07-22 (https://www.molecularinstruments.com/hcr-rnafish-protocols) was used. In brief, samples were rehydrated in graded methanol/PBST washes and permeabilized with 10 µg/ml proteinase K for 20 min before 20 min of post-fixation with room temperature 4% PFA. Sets of 20 custom probes each against ENSGALT00010019926.1 (EZH2) and ENSGALT00010023105.1 (JARID2) were introduced at 2 pmol and incubated overnight at 37°C. Hairpin incubation was performed overnight at room temperature using 60 nM of each hairpin. Samples were briefly equilibrated in 30% sucrose at 4°C before snap-freezing in optimal cutting temperature (OCT) medium on dry ice. Samples were sectioned on a cryotome, stained with 1 µg/ml DAPI (MilliporeSigma, MBD0015), washed and a coverslip applied with Fluoromount before imaging. Imaging was performed using the Zeiss LSM 710 Laser Scanning Confocal System using the Zen Black image software, and post-processing of images was carried out using the Fiji ImageJ platform (Schindelin et al., 2012). Images shown are representative of at least three embryos.
Immunohistochemistry
All protocols involving mice conformed to the protocol approved by the Miami University Institutional Animal Care and Use Committee (protocol 997_2024_Jun). FVB/N mouse embryos were collected at embryonic day 12.5 or 14, as specified, and fixed overnight in 10% formalin at 4°C before embedding in paraffin wax and sectioning. Immunostaining was performed as described previously (Luz-Madrigal et al., 2020). In brief, sections were deparaffinized and antigen retrieval was performed at 100°C for 15 min with 0.01 M sodium citrate (pH 6.0). Sections were permeabilized in 1% saponin for 5 min. Blocking was performed with 10% goat serum. Samples were incubated with primary antibodies at the specified dilutions for 4 h at room temperature; secondary antibodies were incubated for 2 h at room temperature. Images shown are representative of at least two embryos. Antibodies and dilutions employed in this study are summarized (Table S8). Imaging was performed using the Zeiss LSM 710 Laser Scanning Confocal System using the Zen Black image software, and post-processing of images was carried out using the Fiji ImageJ platform (Schindelin et al., 2012).
CUT&RUN-sequencing and analysis
Cleavage Under Targets and Release Using Nuclease sequencing (CUT&RUN-seq) (Meers et al., 2019; Skene and Henikoff, 2017; Skene et al., 2018) was performed using the CUTANA ChIC/CUT&RUN Kit Version 3 (EpiCypher, 14-1048) and CUTANA CUT&RUN Library Prep Kit (EpiCypher, 14-1002). The protocol was carried out according to the manufacturer's protocol, with slight modification. Chicken embryos were obtained from Michigan State University and lenses were dissected from embryonic day 4.5 chicken embryos. Dissected lenses were collected on ice and pooled into three biological replicates consisting of ∼44 lenses each. Nuclei were extracted from samples using the protocol described above for single-nuclei applications. Buffers were modified to contain 1× proteinase inhibitor (MilliporeSigma, 11873580001), 1.5 mM spermidine, no RNase inhibitors and no DTT, and lysis buffer was modified to contain 0.001% Surfact-Amps NP-40. Nuclei across all samples were diluted to 1000 nuclei/µl and loaded at 100,000 nuclei per reaction. All antibody reactions were performed in triplicate across independent biological replicates, and the negative control IgG antibody reaction was performed in duplicate. Antibodies employed in this study are summarized in Tables S7 and S8). Final libraries were assessed on the Agilent Bioanalyzer and Qubit 4 before pooling and sequencing at the Novogene Sequencing Core on a lane of Illumina HiSeq X Ten. Raw reads were quality and adapter trimmed using Trim Galore! (v0.6.4_dev) (Krueger et al., 2023) with parameters –phred33 –paired –length 36 -e 0.1 -q 5 –stringency 1. Trimmed reads were aligned against chicken genome GRCg7b with Bowtie 2 (v2.4.1) using parameters -X 2000 --local --no-mixed --no-discordant --no-dovetail. Peak calling and normalization of CUT&RUN-seq data was performed as described previously (Lonfat et al., 2021) with the following modifications. Peaks were called using MACS2 (v2.2.7.1) (Zhang et al., 2008) and parameters -f BAMPE -g 1070887886 --keep-dup all -q 0.05, providing IgG alignment files as background. A minimum integer score of 400 was set for defining EZH2 peaks and 125 for defining JARID2 peaks. Each sample was normalized before visualization using the deepTools (Ramírez et al., 2016) bamCoverage (v3.5.1) function with parameters --binSize 1 --normalizeUsing RPGC --effectiveGenomeSize 1070887886. RPGC normalization (reads per genomic content) corresponds to 1× normalization. Normalized replicates were merged into a single file for visualization. CUT&RUN heatmaps were produced using the deepTools computeMatrix (v3.5.1) and plotHeatmap (v3.5.1) functions using merged bigwig files. EZH2 and JARID2 colocalization was determined with the GenomicRanges R package (v1.50.2) (Lawrence et al., 2013). The annotation of peak regions to genomic elements was performed with the ChIPseeker (v1.34.1) R package (Wang et al., 2022).
Chicken surgeries and FGF2 delivery
Fertile specific pathogen-free chicken eggs were obtained from Charles River Laboratories (10100329). Retinectomies and FGF2 delivery were performed at embryonic day 4 as previously described (Spence et al., 2004; Tangeman et al., 2022). In brief, an incision was made around the anterior eye chamber with surgical scissors and the neural retina dislodged with a wire probe. For FGF2 delivery, ∼20 acrylic beads with immobilized surface heparin (MilliporeSigma, H5263) were incubated overnight in 4 µl of 250 ng/µl of bovine basic FGF2 (R&D Biosystems, 133-FB-025). At the time of surgery, FGF2 beads were deposited into the posterior chamber of the eye cup and the eye was shut. Embryos were incubated for 6 h before enucleating the eyes for use with the Parse Biosciences snRNA-seq workflow, as described above. One eye from three chickens was pooled per condition.
Analysis of FGF2-treated samples
Pre-processing of snRNA-seq data for FGF2-treated samples was performed as described above for development samples. Using this approach, a multiplet lens population was identified that exhibited abnormally high total RNA counts and total genes detected, as well as co-expression of numerous retinal and mesenchyme-specific genes (Fig. S16), and this population was discarded from downstream analysis. FGF2-treated lens cells were assigned cluster labels using the Seurat TransferData function, using the Parse Biosciences-derived nuclei from the development dataset as a reference. Pseudobulk analysis was performed by summation of the ambient RNA-adjusted raw counts associated with each embryo, and sums were rounded to the nearest whole integer. Genes with less than 10 total counts, as well as genes derived from the W chromosome, were discarded from the count matrix before analysis. Normalization and differential expression testing were performed using DESeq2 (v1.38.3) (Love et al., 2014). The assessment of differences in cell state abundance was performed using multinomial logistic regression (Hilbe, 2017) with three cell states (epithelial, intermediate and fiber) as response categories and the factor with three conditions (E4 development, retinectomy control and retinectomy+FGF2) as a predictor; the estimates of odds and FDR-adjusted P-values were reported. Multinomial logistic regression in this scenario is accompanied by several caveats, detailed as follows.
- 1.
In the process of attempting to model these data, a number of count-based models were explored, and there was evidence of overdispersion.
- 2.
ANOVA was applied as an approximate baseline analysis for this dataset using the log(proportion) of cell states in each group. ANOVA did not provide clear evidence of differences, although ANOVA is not ideal as the data are based on counts, rather than continuous response variables.
- 3.
Although there is a large total count of cells, there are only three replicates (embryos) per condition.
- 4.
Multinomial analysis assumes that each cell is independent of the others, the basis of which cannot be thoroughly explored, as cells within an embryo are constituents of the same biological system.
Acknowledgements
The authors thank Carlos M Charris Dominguez and Anil Upreti for input toward this study. Dr Byran Smucker (Miami University) provided consultation related to statistics employed in the manuscript. We acknowledge and thank the staff (Dr Andor Kiss and Ms Xiaoyun Deng) of the Center for Bioinformatics and Functional Genomics (CBFG) at Miami University for instrumentation and computational support. The authors acknowledge the use of Miami University's Redhawk HPC cluster and thank Dr Jens Mueller for computational support. We thank the Miami University Center for Advanced Microscopy and Imaging (CAMI), including Dr Zach Oestreicher and Mr Matt Duley, for imaging support. Dr Andy Fischer and Dr Heithem El-Hodiri of The Ohio State University assisted with multiome library preparation. Jackson Theile assisted with the design of schematics used throughout the manuscript.
Footnotes
Author contributions
Conceptualization: J.A.T., M.L.R., S.A.L., K.D.R.-T.; Methodology: J.A.T., S.M.R., E.G.-E., J.M.W.; Software: J.A.T., S.B.-S.; Formal analysis: J.A.T., S.B.-S.; Investigation: J.A.T., S.A.L.; Data curation: J.A.T.; Writing - original draft: J.A.T., S.A.L.; Writing - review & editing: M.L.R., K.D.R.-T.; Visualization: J.A.T., S.M.R., E.G.-E., J.M.W.; Supervision: M.L.R., S.A.L., K.D.R.-T.; Project administration: M.L.R., S.A.L., K.D.R.-T.; Funding acquisition: M.L.R., S.A.L., K.D.R.-T.
Funding
J.A.T. was supported by the National Institute of Neurological Disorders and Stroke [F99 NS129167]. K.D.R.-T. was supported by the National Eye Institute [R01 EY026816 and EY034980], the Rapid Grant Program at Miami University and the John W. Steube Endowed Professorship. S.A.L. was supported by the National Institutes of Health/National Eye Institute [R01 EY021505 and R01 EY029770]. M.L.R. was supported by the National Institutes of Health/National Eye Institute [R21 EY033471 and R21 EY031092]. S.M.R. was supported by the ‘Choose Development!’ Program from the Society for Developmental Biology and by a Miami University Chapter of the Louis Stokes Alliance for Minority Participation Research Grant.
Data availability
The data discussed in this publication have been deposited GEO under accession number GSE236905. All code and resultant output files are available upon reasonable request to the authors.
Peer review history
The peer review history is available online at https://journals.biologists.com/dev/lookup/doi/10.1242/dev.202249.reviewer-comments.pdf
References
Competing interests
The authors declare no competing or financial interests.