ABSTRACT
Pro-spermatogonia (SG) serve as the gateway to spermatogenesis. Using single-cell RNA sequencing (RNAseq), we studied the development of ProSG, their SG descendants and testicular somatic cells during the perinatal period in mice. We identified both gene and protein markers for three temporally distinct ProSG cell subsets, including a migratory cell population with a transcriptome distinct from the previously defined T1- and T2-ProSG stages. This intermediate (I)-ProSG subset translocates from the center of seminiferous tubules to the spermatogonial stem cell (SSC) ‘niche’ in its periphery soon after birth. We identified three undifferentiated SG subsets at postnatal day 7, each of which expresses distinct genes, including transcription factor and signaling genes. Two of these subsets have the characteristics of newly emergent SSCs. We also molecularly defined the development of Sertoli, Leydig and peritubular myoid cells during the perinatal period, allowing us to identify candidate signaling pathways acting between somatic and germ cells in a stage-specific manner during the perinatal period. Our study provides a rich resource for those investigating testicular germ and somatic cell developmental during the perinatal period.
INTRODUCTION
Pro-spermatogonia (ProSG) – also called gonocytes – are the precursor cells that give rise to spermatogonial stem cells (SSCs). Primary transitional (T1)-ProSG are mitotically quiescent cells that undergo genome-wide shifts in chromatin accessibility, 3D chromatin organization and epigenetic marks (Yamanaka et al., 2019). At postnatal day (P) 1, T1-ProSG begin converting into secondary transitional (T2)-ProSG, which initiate proliferation and are considered the immediate precursor cells that give rise to SSCs (McCarrey, 2013). During the time interval in which this T1-to-T2-ProSG conversion event occurs, these transitioning germ cells migrate from the center of the seminiferous tubule to its periphery (Kluin and de Rooij, 1981; McGuinness and Orth, 1992). This site is where SSCs are thought to first form from T2-ProSG, and it is also where SSCs reside throughout their lifetime (McLean et al., 2003). Thus, the periphery of seminiferous tubules is regarded as the SSC niche (Oatley and Brinster, 2012).
T2-ProSG not only give rise to SSCs but are the transient cell population that generates the ‘first wave of spermatogenesis’, which allows mice to generate sperm as early as ∼5 weeks of age (Bellve et al., 1977). Thus, T2-ProSG are unique cells that have two fates. One fate leads to the rapid production of male gametes and the other fate leads to the production of SSCs and consequent long-term spermatogenesis.
Progress on clarifying the developmental and molecular events that accompany and underlie ProSG development has been hindered, in part, by a dearth of ProSG subset-specific markers. There have been several challenges in identifying ProSG subset-specific markers, including the fact that ProSG subsets exhibit only subtle differences morphologically, they overlap with each other temporally, and they undergo asynchronous changes as they mature in vivo (Culty, 2013; McCarrey, 2013; Vergouwen et al., 1991).
Perinatal testes not only harbor germ cells but also several somatic cell types, including Sertoli cells (SCs), Leydig cells (LCs) and peritubular myoid cells (PTMs). SCs are the only somatic cells in direct contact with germ cells, allowing them to provide many ‘nurse cell functions’ (Griswold, 1998). LCs are located in the connective tissue surrounding the seminiferous tubules, where they produce testosterone, which is crucial for a variety of functions during the fetal, postnatal and adult stages (Griswold and Behringer, 2009). PTMs surround seminiferous tubules, where they provide contractile functions and secrete factors important for spermatogenesis, including GDNF, which is essential for undifferentiated SG to develop in both postnatal and adult mice (Chen et al., 2016). Although the functions of these somatic cell types in spermatogenesis is relatively well established, little is known about their development and function prior to spermatogenesis, particularly at the molecular level (Griswold and Behringer, 2009; Meroni et al., 2019; Nurmio et al., 2012).
Here, we elucidated the nature of the germ and somatic cell types that populate the mouse testes during the perinatal period using single-cell RNA sequencing (scRNAseq) analysis. Recently, several groups, including ours, used scRNAseq analysis to investigate adult spermatogenesis in humans and mice (Guo et al., 2018; Hermann et al., 2018; Sohni et al., 2019; Velte et al., 2019; Wang et al., 2018). Two recent studies used scRNAseq analysis to investigate the steps leading up to spermatogenesis (Law et al., 2019; Liao et al., 2019). Liao et al. identified early postnatal SG subsets, including a CD87+ cell population, from scRNAseq analysis of purified germ cells from P5.5 testes (Liao et al., 2019). Law et al. studied the transcriptome and cellular dynamics that accompany SSC specification through analysis of purified germ cells from multi-lineage reporter mice (Law et al., 2019). Using multiple approaches, including scRNAseq analysis, they defined heterogeneous ProSG and SG populations, along with the associated transcriptomes and putative regulatory networks. Germ-cell transplantation analysis revealed that Id4-eGFP+, but not Id4-eGFP−, E16.5 germ cells were able to rescue spermatogenesis in germ cell-deficient mice, suggesting that Id4-eGFP+ ProSG are fated to become SSCs.
To obtain an unbiased view of the developmental and molecular events occurring during perinatal testicular development, we performed scRNAseq analysis on unfractionated testicular cells from the perinatal period. This analysis, described herein, provides detailed information on both germ and somatic cells during the perinatal period.
RESULTS
Identification of the testicular cell types present during the perinatal period
We performed scRNAseq analysis on dissociated cells from freshly isolated whole testes obtained from embryonic day (E) 18.5, P2 and P7 mice. After filtering out poor-quality cells, 50,859 cells remained for subsequent analysis. Using a nonlinear dimensionality-reduction technique – uniform manifold approximation and projection (UMAP) (Becht et al., 2018) – we identified cell clusters corresponding to eight major cell types (Fig. 1A). Some of the gene markers used to define these cell clusters are shown in Fig. 1B. Biological replicates exhibited similar cell distributions (Fig. S1A). Fig. S1B shows standard quality control parameters after filtering. Genes exhibiting enriched expression in each cell subset are shown in Fig. 1C and Table S1. Gene ontology (GO) and ingenuity pathway analyses (IPAs) identified functions enriched for the different cell types (Fig. 1D; Fig. S1C). Many cell clusters exhibited dramatic shifts in gene expression between E18.5 and P7 (Fig. S1A), indicative of dynamic developmental changes during the perinatal period.
Germ cell stages traversing the ProSG-to-SG transition period
We next performed re-clustering analysis on only the germ cells, which revealed the existence of six distinct germ cell clusters (Fig. 2A). Only one cell cluster is present at E18.5 (Fig. 2A). This cell population has a cell-cycle gene expression pattern indicative of non-proliferating cells (Fig. S2A), which is consistent with the evidence that only the proliferative quiescent cell population – called ‘T1-ProSG’ – are present during late embryogenesis (McCarrey, 2013). This T1-ProSG cell cluster expresses genes encoding proteins relevant to DNA methylation (Fig. S2B), consistent with the genome-wide re-establishment of DNA methylation marks at this stage (Tseng et al., 2015). At P2, we identified two cell clusters. One P2 cluster broadly expresses cell-cycle genes (Fig. 2A), corresponding to proliferative T2-ProSG (McCarrey, 2013). The other P2 cluster, which we named ‘intermediate (I)-ProSG’, largely lacks cell-cycle gene expression and has migratory properties (Fig. 2A), as detailed below. At P7, we identified three cell clusters that we named ‘SSC-1’, ‘SSC-2’ and ‘differentiating (Diff)-SG’, based on their expression of known SSC, progenitor and Diff-SG gene markers (Fig. S2C). Genes selectively expressed by these six germ cell clusters are shown in Fig. 2B and Table S1. GO categories significantly enriched in these germ cell clusters are shown in Fig. 2B.
Inspection of the germ cell-enriched genes (Fig. 1C) suggested that Cdkn2a, which encodes a cyclin-dependent kinase inhibitor, is a pan-ProSG marker (Fig. 2C). Immunofluorescence (IF) analysis showed that its protein product, CDKN2A, is also a pan-ProSG marker, based on its co-expression with the postnatal/adult germ cell marker TRA98 (Tanaka et al., 1998) (Fig. 2C). CDKN2A is expressed on the cell surface (Zhang et al., 1998), raising the possibility that it could be used to purify ProSG. In support, fluorescence-activated cell sorting (FACS)-purified CDKN2A+ cells (∼3.7% of total P2 testicular cells) were enriched for expression of ProSG genes, and largely lacked expression of testicular somatic cells (Fig. S2D).
We found that the Dnmt3l gene is preferentially expressed in T1-ProSG (Fig. 2D), leading to the possibility that its encoded protein specifically labels this ProSG stage. In support, it was previously shown that DNMTL is expressed in ProSG (Sakai et al., 2004), but its specificity for this stage was not determined. Using an antiserum against DNMT3L, we found it strongly stained germ cells at P0 and had dramatically reduced staining at P2 (Fig. 2D), consistent with the expression pattern of T1-ProSG (Culty, 2013; McCarrey, 2013). Co-staining with an antibody against the proliferation marker MKI67 (KI-67) (Gerdes et al., 1983) showed that the vast majority of DNMT3L+ cells are non-proliferative (Fig. 2D), another key characteristic of T1-ProSG (McCarrey, 2013).
Identification of distinct ProSG subsets engaging in proliferation and migration
As described above, we identified a previously unrecognized ProSG subset – I-ProSG – present in the perinatal testis (Fig. 2A). Monocle trajectory analysis showed that I-ProSG are developmentally less advanced than T2-ProSG, placing them at an intermediate position between T1-ProSG and T2-ProSG (Fig. S2E). Many genes involved in cell migration and actin organization exhibit enriched expression in I-ProSG (Fig. S2A,F).
Two genes preferentially expressed in I-ProSG are Elmo1 and Palld (Fig. 2E), which both encode proteins involved in cell migration and cytoskeleton organization (Pogue-Geile et al., 2006; Raftopoulou and Hall, 2004). IF analysis showed that the corresponding proteins, ELMO1 and PALLD, mainly mark germ cells located at the center of seminiferous tubules and are predominantly KI-67 negative (Fig. 2F), consistent with our evidence that I-ProSG are largely non-proliferative (Fig. S2A). The few ELMO1+ and PALLD+ cells at the seminiferous tubule periphery are mostly KI-67+ (Fig. 2F), indicating that I-ProSG undergo proliferative expansion once these cells migrate to the periphery. The ELMO1 and PALLD immunosignal is much dimmer at P3 than at P2 (Fig. 2F), suggesting that these proteins transiently mark migrating ProSG and shut off soon after I-ProSG migrate to the seminiferous tubule periphery.
We found that Etv4 preferentially labels T2-ProSG (Fig. 2E). Its corresponding protein, the transcription factor ETV4, marks T2-ProSG, based on its expression in proliferative (KI-67+) cells at P2 (Fig. 2F). In contrast with I-ProSG (ELMO1+ and PALLD+ cells), which became rare by P3, ETV4+ cells remained abundant at P3 (Fig. 2F), consistent with pseudotime analysis that indicated that T2-ProSG follow I-ProSG developmentally (Fig. S2E). Furthermore, most ETV4+ cells are in the periphery of seminiferous tubules at P2, indicating that most ETV4+ cells have already undergone migration, whereas most I-ProSG (ELMO1+ and PALLD+ cells) are in the center of seminiferous tubules at P2 (Fig. 2F).
Together, these data support a model in which T1-ProSG convert into migratory I-ProSG that translocate from the center of seminiferous tubules to their periphery, whereupon the I-ProSG become proliferative T2-ProSG, which have two fates: (1) conversion into Diff-SG, which undergo the first round of spermatogenesis, and (2) conversion into SSCs for long-term spermatogenesis (Fig. 2G). We note that because ProSG migration to the seminiferous tubule periphery is asynchronous (Kluin and de Rooij, 1981; McGuinness and Orth, 1992), it is not possible, using current methods, to test whether I-ProSG are exclusively migratory.
To validate our cell cluster analysis, we took advantage of two recently published studies that performed scRNAseq analysis on perinatal germ cells (Law et al., 2019; Liao et al., 2019).
Liao et al. examined FACS-purified Oct4-GFP+/KIT− male germ cells from P5.5 mice (Liao et al., 2019). We bioinformatically combined their scRNAseq dataset with ours and found that their P5.5 germ cell cluster overlapped with our P7 cell cluster (Fig. S3A), thus validating our clustering analysis of the P7 time point.
Law et al. examined Tomato fluorescent protein-positive cells from tdTomatoflox_STOP_flox; Blimp-Cre mice (which label primordial germ cells and their descendants) purified by FACS from E16.5 to P6 mice (Law et al., 2019). We combined their scRNAseq datasets with ours (both had similar read depths), but we did not observe a cluster pattern with a coherent temporal sequence, even when using the same sequence alignment tool (data not shown). We then re-analyzed the Law et al. dataset using the same alignment, filtering, normalization and clustering parameters used for our dataset, which revealed nine clusters (Fig. S3B). Analysis of the genes preferentially expressed from each of these clusters (Table S1) revealed that C1 and C3 correspond best to the T1-ProSG cluster we defined; C3 and C4 correspond to I-ProSG; and C4 and C5 correspond to T2-ProSG (Fig. S3C). Consistent with C1 and C3 being the least advanced, most cells in these clusters have a cell-cycle gene expression pattern indicative of their being mitotically quiescent (Fig. S3D), a characteristic of T1-ProSG (McCarrey, 2013). The T1-, I- and T2-ProSG markers we identified preferentially labeled the corresponding Law et al. cell clusters (Fig. S3E). Conversely, the well-established stage-specific germ cell markers used by Law et al. exhibited the expected expression pattern in the cell subsets identified by our study: SSC-associated markers are mainly expressed in the ProSG and SSC-1 clusters, whereas progenitor and differentiation markers are mainly expressed in the SSC-2 and Diff-SG clusters, respectively (Fig. S3F).
Characterization of perinatal undifferentiated SG
As described above, we identified three cell clusters at P7 that we named ‘SSC-1’, ‘SSC-2’ and ‘Diff-SG’ (Fig. 2A). SSC-1 and SSC-2 mainly comprised undifferentiated SG, based on expression of known undifferentiated SG markers (Fig. 2B; Fig. S2C). SSC-1 expressed higher levels of many SSC marker genes (e.g. Gfra1 and Id4) than SSC-2 (Fig. S2C), suggesting that SSC-1 are more primitive than SSC-2. In contrast, SSC-2 express higher levels of progenitor marker genes (e.g. Neurog3, Nanos3 and Rarg) than SSC-1 (Fig. S2C). Although we did not perform a functional assay to ascertain the function of cells in the SSC-1 and -2 clusters, their expression pattern suggests that SSC-1 and SSC-2 are enriched for SSCs and progenitors, respectively.
Re-clustering of the P7 germ cells separated the SSC-1 subset into two cell clusters that we named SSC-s1 and SSC-s2 (Fig. 3A), which differentially express several genes (Fig. 3B; Table S1). SSC-s1 expresses the recently identified SSC markers, Eomes and Cd87 (also known as Plaur) (Fayomi and Orwig, 2018; Liao et al., 2019) (Fig. S4A; Table S1), raising the possibility that SSC-s1 is the more primitive of these two cell clusters, a possibility confirmed by Monocle pseudotime analysis (Fig. 3C,D). Genes involving ‘integrin-mediated signaling’ and ‘vessel development’ are enriched in SSC-s1 (Fig. S4B,C), suggesting these processes may play roles in SSC development.
To define the molecular events involved in perinatal SG development, we identified genes statistically enriched in cells along the trajectory pseudotime axis. This analysis identified four distinct patterns of gene expression dynamics (Fig. 3E). Group-1 genes are expressed in the most primitive SG and thus are candidates to be involved in SSC formation. Significantly enriched GO categories for group-1 genes include ‘system development’ and ‘signaling transduction’ (Fig. 3E). Group-2 and -3 genes are mainly expressed in progenitors; enriched GO categories corresponding to these two gene expression pattern groups include ‘metabolism’, ‘biosynthesis’ and ‘cell surface receptor signaling’. Group-4 genes are mainly expressed in Diff-SG; enriched GO categories include ‘metabolism’ and ‘cell cycle process’, which is consistent with the evidence that differentiating SG undergo proliferative expansion (Suzuki et al., 2012).
To identify novel SSC markers, we screened the genes in group 1 for those with known functions in stem cells. This screen identified Lhx1 and Cd82 as fulfilling this criterion (Fig. 3B,F). To check their specificity further, we leveraged a published scRNAseq dataset from P6 germ cells purified based on the Id4-eGfp signal (Hermann et al., 2018). This revealed that both Lhx1 and Cd82 are mainly expressed in Id4-eGFPbright cells (Fig. S4D,E), which are widely regarded as the most specific mouse SSC marker currently known (Helsel et al., 2017). To determine the specificity of the proteins encoded by these two genes, we examined LHX1 and CD82 protein expression relative to Id4-eGFP. IF analysis showed that virtually all LHX1+ cells were also Id4-eGFP+ (Fig. 3G), implying that LHX is highly selective for SSCs. Interestingly, only ∼69% of Id4-eGFP+ cells were LHX1+ (Fig. 3G), raising the possibility that LHX1 marks a subset of SSCs. Using two different antisera against CD82, we found that CD82 also labels Id4-eGFP+ cells, but because these antisera also stained the SC cytoplasm (Fig. S4F), we were not able to quantify the percentage of positive germ cells using IF analysis. As an alternative approach, we performed FACS analysis with one of these antisera and found that CD82 marked a highly select group of testicular cells (∼0.48% of the total), mainly Id4-eGFPbright cells (Fig. 3H). Together, these results suggest that both LHX1 and CD82 are SSC marker proteins.
Identification of stage-specific signaling pathways and transcription factors (TFs) during perinatal germ cell development
To identify signaling pathways that are potentially involved in perinatal germ cell development, we screened for signaling component genes exhibiting stage-specific expression in the seven germ clusters we identified. This revealed many signaling pathways predicted to be selectively active in specific stages (Fig. S5A; Table S1).
To define TFs that may be involved in regulating genes in these cell subsets, we took advantage of IPA, which uses known consensus DNA-binding sites to define TF targets (Krämer et al., 2014). Fig. S5B provides a list of the TF genes predicted to target genes preferentially expressed in each subset. Their putative target genes in each cell subset are listed in Table S1. Most of these TFs are expressed in the cell subsets in which they are predicted to act (Fig. S5C). Some of these TFs are predicted to act in several cell subsets [e.g. MYC and TP53 (TRP53)], but most display selectivity, including the pluripotency factor, POU5F1, which is predicted to act in I-ProSG; and the transcriptional repressor, REST, which is predicted to act in SSC-2. Some of the TFs are directly involved in signaling pathways (e.g. SMAD4 is the core factor in the TGFβ signaling pathway).
As another approach to define key TFs important during perinatal germ cell development, we identified TFs exhibiting enriched expression in each cell subset we defined (Table S1). Some of these TF genes, including Sall4, Bcl6b, Dmrtc2, Lhx1, T and Dmrt1, were previously shown to have roles in SSCs (Chan et al., 2017; Gegenschatz-Schmid et al., 2017; Oatley et al., 2006; Zhang et al., 2016).
Comparison of SSC-dependent and -independent waves of spermatogenesis
To determine whether there are differences in SG between the first (SSC-independent) and subsequent (SSC-dependent) waves of spermatogenesis, we combined our P7 testes scRNAseq dataset with published testes scRNAseq datasets from older (postnatal and adult) mice (Grive et al., 2019). The SG subsets were annotated as ‘SSC-1’, ‘SSC-2’ and ‘Diff-SG’, based on the gene markers we used to define cell clusters in our P7 scRNAseq datasets. Pearson correlation analysis showed that the SSC-1 subset from these different ages (P6 to adult) are highly related (Fig. S5D). The same was the case for the SSC-2 subset (Fig. S5D). This suggests that SSCs and progenitors exhibit few differences in gene expression as they transition from the neonatal period to adult spermatogenesis. In contrast, the Diff-SG subset exhibited detectable differences in gene expression between the first wave (before P30) and later stages (P30 and adult; Fig. S5D), indicative of marked changes in the molecular events that occur during the first and subsequent waves of SG differentiation. Table S1 lists the unique gene signatures for each SG subset at each stage.
We also analyzed spermatocytes (SPCs) for age-dependent differences in gene expression. Pearson correlation analysis showed that the expression pattern of SPCs when they first emerge (P14) (Bellve et al., 1977) is significantly different from SPCs derived from older mice, including P18 mice (Fig. S5E). This suggests that early SPCs generated during the first round of spermatogenesis differ from those generated later. Consistent with first-round SPCs exhibiting a unique pattern of expression, GO analysis revealed that ‘translation’ and ‘RNA splicing’ functions are enriched in P14 SPCs, whereas adult SPCs are enriched for ‘gametogenesis’-related functions (Fig. S5F).
Somatic cell heterogeneity and development during the perinatal period
To analyze the development of somatic cells during the perinatal period, we independently re-clustered the three known major somatic cell types from our scRNAseq datasets.
SCs
Re-clustering of the SCs from the E18.5, P2 and P7 testicular datasets generated six SC clusters (Fig. 4A), suggesting that SCs undergo discrete step-wise shifts in gene expression as they develop during the perinatal period. E18.5 and P2 SCs are in common clusters, whereas P7 SCs are in distinct clusters (Fig. 4A), suggesting that SCs undergo a dramatic change between P2 and P7. Consistent with this, we identified many more uniquely expressed genes at P7 (257) than at E18.5 and P2 (81 and 17, respectively) (q<0.01; Table S2). Consistent with the known gradual decline in SC proliferation during early postnatal development (Vergouwen et al., 1991), we observed a progressive decline in the percentage of proliferating SCs between E18.5 and P7 (as defined by the expression of cell-cycle genes; Fig. 4B).
Pseudotime trajectory analysis revealed a linear trajectory of SCs from E18.5 to P2 to P7 (Fig. 4C). Three patterns of gene expression dynamics were identified along the pseudotime trajectory, each with statistically enriched GO functions (Fig. 4D) and signaling pathways (Table S2). These three gene groups tended to correspond to the developmental age of the SCs, with group-1, -2 and -3 genes being enriched in most primitive, intermediate and later stage of SCs, respectively. We identified 148, 98 and 281 TF genes expressed in groups 1, 2 and 3, respectively (Table S2). Several group-3 TF genes, including Gata1, Dmrt1 and Nr3c1, are predicted to target other TF genes in group 3 (Fig. 4E), suggesting the existence of TF cascades in developing SCs. Indeed, IPA predicted that group-3 TFs form complex regulatory networks (Fig. 4F).
LCs
Re-clustering of LCs generated three distinct cell clusters, with each cluster largely representing a different time point (Fig. S6A). Genes involved in ‘apoptosis’ are enriched in the P2 LC cluster (Fig. S6B), consistent with the timing of fetal LC loss (Griswold and Behringer, 2009). Pseudotime trajectory analysis showed a linear trajectory of LCs from E18.5 to P2 to P7 (Fig. S6C). Three patterns of gene expression dynamics were identified along the pseudotime trajectory, each with statistically enriched GO functions (Fig. S6D) and signaling pathways (Table S2). We identified 52, 67 and 35 TF genes expressed in groups 1, 2 and 3, respectively (Table S2). Fig. S6E shows a gene network inferred by Ingenuity analysis to be formed by group-2 TFs in LCs. Key nodes in this inferred network are AKT (AKT1) and FOXO1, raising the possibility that AKT signaling and the FOXO1 TF have key roles in the fetal-to-adult LC transition (Griswold and Behringer, 2009).
PTMs
Re-clustering of PTMs generated three distinct cell clusters, with each cluster representing a different time point (Fig. 5A). Enriched genes from E18.5 and P2 partially overlapped, indicative of only a modest shift in gene expression between these time points, whereas enriched genes in the P7 cluster were largely unique, indicative of a relatively dramatic shift in PTM gene expression between P2 and P7 (Table S2). Pseudotime analysis showed a linear trajectory of myoid cells from E18.5 to P7 (Fig. 5B). Three patterns of gene expression dynamics were identified along the pseudotime trajectory, each with statistically enriched GO functions (Fig. 5C) and signaling pathways (Fig. 5D). We identified 131, 53 and 123 TF genes exhibiting enriched expression in groups 1, 2 and 3, respectively (Table S2). Several group-3 TF genes, including Clock, Mef2a and Srf, are predicted to target other TF genes in group 3 (Fig. 5E), suggesting the existence of TF cascades in developing PTMs. Indeed, group-3 TFs are predicted to form complex gene networks comprising many factors, including thyroid hormone receptor and estrogen receptor (Fig. 5F). This raises the possibility that hormone signaling is involved in PTM development during the perinatal period.
Evidence for a dramatic burst in Hippo signaling during the ProSG-to-SG transition
Our analysis, described above, indicates that both germ cells and somatic cells undergo dynamic gene expression changes during the perinatal period. To define signaling pathways temporally regulated during this period, we used IPA to identify signaling pathways predicted to be active in germ cells, SC, LCs and PTMs at E18.5, P2 and P7 (Fig. 6A). The most statistically significant signaling pathway identified in germ cells was the Hippo pathway. In support of the possibility that the Hippo pathway is active in germ cells during the perinatal period, we found that several Hippo signaling component genes are expressed in germ cells during the perinatal period, with a peak at P2 (Fig. 6B). To test directly whether there is active Hippo signaling in perinatal germ cells, we performed IF analysis of YAP (YAP1), which is regarded as the best marker of Hippo signaling, as this transcriptional co-activator translocates into the nucleus of cells to transcriptionally activate Hippo pathway target genes (Meng et al., 2016). IF analysis showed that YAP is nuclear in perinatal germ cells, indicative of active Hippo signaling. Intriguingly, its nuclear localization is temporally regulated during perinatal development (Fig. 6C; Fig. S7). Between P0 and P2, there is a dramatic switch in the pattern of YAP localization from virtually no nuclear YAP+ germ cells at P0 to nearly 100% nuclear YAP+ germ cells at P2. The percentage of cells with a bright YAP signal increased between P2 and P3, followed by a decrease by P5. By P7, only ∼10% of germ cells expressed nuclear YAP. Together, these data suggest that there is a transient burst in Hippo signaling in germ cells between ∼P2 and ∼P5.
Ligand-receptor signaling pairs expressed during the ProSG-to-SG transition
One possible role of somatic cells during the perinatal period is to participate in signaling events, just as they do in the adult testis (Takase and Nusse, 2016; Yang et al., 2013b). To investigate this possibility, we screened for ligand/receptor gene pairs expressed in germ and somatic cell populations during the perinatal period. This revealed the expression of genes involved in a number of signaling pathways, including GDNF, Hedgehog, CXC chemokine, Notch, FGF, TGFβ, WNT, PDGF, LIF and IGF (Fig. 6D; Fig. S8). Perinatal germ cells express genes encoding receptors for the GDNF, TGFβ and WNT signaling pathways (Gfra1/Ret, Bmpr1a and Fzd3, respectively), indicating that they likely respond to signals from other cells through these receptors. Perinatal germ cells also appear to secrete signaling factors, as they express genes encoding ligands for the Notch, TGFβ and PDGF signaling pathways (Dlk1, Tgfb1 and Pdgfb, respectively). Likewise, SCs express genes encoding both ligands and receptors, including ligands for the GDNF, Hedgehog, TGFβ, PDGF and LIF signaling pathways (Gdnf, Dhh, Tgfb1, Pdgfb and Ctf1, respectively), and receptors for Notch, TGFβ, WNT, and IGF signaling pathways (Notch2, Bmpr1a/Tgfbr1/Acvr1, Fzd3 and Sh2b2, respectively). LCs and PTMs also express genes encoding ligands and receptors for signaling pathways, as detailed in Fig. 6D and Fig. S8.
To ascertain when these intercellular pathways are likely to function during perinatal development, we examined the expression pattern of the ligand/receptor pairs. This revealed temporal specificity for several of these intercellular cell-cell signaling pathways. For example, Gdnf, which encodes the GDNF signaling ligand, is expressed in E18.5 SCs, whereas Gfra1 and Ret, which encode the GDNF signaling receptors, do not initiate expression until after birth (Fig. 6E), suggesting that GDNF signaling is not active in germ cells until after birth (GDNF may have non-germ-cell roles before birth). As another example, Bmp2 and Bmpr1a, which encode a BMP signaling ligand and a BMP receptor, respectively, both initiate expression between E18.5 and P2 (Fig. 6F). Because the co-expression of the ligand and receptor is induced in different cell types, this suggests coordinated gene regulation to mediate BMP signaling between LCs and ProSG in a temporally specific manner.
DISCUSSION
In this study, we used scRNAseq to molecularly characterize testicular germ and somatic cells during the perinatal period. We charted the progress of SSC precursor cells – ProSG – as they differentiate, migrate, undergo proliferative expansion, and convert into SSCs (Fig. 7). We also defined SG subsets present during the perinatal period, and defined stage-specific markers for both ProSG and SG subsets. We also characterized the development of somatic cells during the perinatal stage. Finally, we identified TFs and signaling pathways that are likely to function in and between germ and somatic cell subsets during the perinatal period. Together, our study provides a useful resource for understanding perinatal testicular development.
Identification of ProSG subsets and specific markers
Our scRNAseq analysis identified three distinct stages of ProSG development during the perinatal period. Two of these stages – T1-ProSG and T2-ProSG – were previously known (McCarrey, 2013), but no markers that specifically label these stages had been previously defined. We filled this gap by identifying both gene and protein markers for T1- and T2-ProSG, as well as a pan-ProSG marker. The T1-ProSG protein marker we identified is DNMT3L, a DNA methylation-promoting protein that restores DNA methylation marks erased during the primordial germ cell stage of germ-cell development (Hajkova, 2011) and is crucial for spermatogenesis (Bourc'his and Bestor, 2004; La Salle et al., 2007; Vlachogiannis et al., 2015; Webster et al., 2005). We identified the transcription factor ETV4 as a marker of T2-ProSG. T2-ProSG also express the Etv4 gene paralog Etv5 (Table S1), which is known to promote the proliferation of germline stem cells in vitro (Wu et al., 2011) and early postnatal SG in vivo (Tyagi et al., 2009), suggesting that this transcription factor stimulates SSC self-renewal. By analogy, Etv4 and Etv5 may participate in the proliferative expansion of T2-ProSG and SSC genesis.
Identification of a migrating ProSG subset
We identified a largely non-proliferative ProSG cell cluster – I-ProSG – that our evidence indicates undergoes migration from the center to the periphery of seminiferous tubules (Fig. 2G). As further support for this notion, I-ProSG express a large cadre of cell migration and actin organization genes (Fig. S2F). The discovery of the I-ProSG cell subset leads us to propose a refined model for SSC establishment in which the conversion of T1-ProSG to I-ProSG initiates a migratory event that places these germ cells in a niche that triggers their subsequent conversion into SSCs (Fig. 2G). Given that we found that most I-ProSG are in the center of seminiferous tubules, we cannot rule out the possibility that they are predominantly a ‘priming stage’ to prepare ProSG for migration, rather than migratory cells per se. However, we provide evidence that at least some I-ProSG are migratory, as we found some cells with I-ProSG markers that are in the periphery.
In both mice and rats, ProSG undergo migration to the seminiferous tubule periphery at approximately the same time after birth as they re-initiate mitosis (Kluin and de Rooij, 1981; McGuinness and Orth, 1992). It is not clear whether these two events are mechanistically connected. One possibility is that migration-promoting signals are received by T1-ProSG in the center of the seminiferous tubule, which transforms these cells into migratory I-ProSG. Once they migrate to the basement membrane in the periphery of the seminiferous tubule, I-ProSG receive growth signals present in this microenvironment, which drives these cells to convert into proliferative T2-ProSG. The identification of I-ProSG-specific markers will be invaluable to address this and other models. Interestingly, the two protein markers we identified as specific for I-ProSG – ELMO1 and PALLD – are both known to drive cell migration in mammalian cell lines (Goicoechea et al., 2008; Grimsley et al., 2004), which raises the intriguing possibility that ELMO1 and PALLD not only mark I-ProSG, but also promote the migration of these cells in vivo.
I-ProSG not only transiently express genes involved in migration, but other gene categories as well (Table S1). Perhaps most striking is their expression of many genes involved in intracellular signal transduction, including Hspb1, Mast4, Map4k4, Plcg2, Prkci, Pkn2, Srpk2, Stk39, Sgk1, Socs2 and Tulp4. This raises the possibility that I-ProSG activate specific signaling pathways in response to extracellular stimuli, perhaps to guide them in migration. I-ProSG also downregulate several epigenetic modification-related genes, such as Dnmt3l, Dnmt3a, Kdm1b and Piwil4. Perhaps this down-regulatory response results from a checkpoint mechanism that senses when epigenetic marks are fully restored. According to this model, only developing germ cells that pass the checkpoint are allowed to migrate to the SSC niche and form SSCs.
Undifferentiated SG heterogeneity
Several recent scRNAseq studies – focusing on the adult testis – have obtained evidence that undifferentiated SG are not a homogeneous population, but instead a collection of subsets in subtly different states, each with different patterns of gene expression (Green et al., 2018; Grive et al., 2019; Guo et al., 2018; Hermann et al., 2018; Sohni et al., 2019; Tan and Wilkinson, 2019; Wang et al., 2018). How this transcriptome complexity is initially established is not known. Using scRNAseq to analyze germ cells during the SSC establishment period, we identified two cell clusters of undifferentiated SG – SSC-s1 and SSC-s2 – that express distinct patterns of genes, raising the possibility that SSCs are heterogeneous even when first formed.
We found that the SSC-s1 cell cluster is the most naïve, based on Monocle psuedotime trajectory analysis and the expression pattern of known SSC markers (i.e. Eomes and Plaur) and SSC markers identified in this study (e.g. Lhx1 and Cd82). Although the functional significance of this difference in gene expression is not known, one possible explanation is that SSC-s1 and SSC-s2 differ in their endurance as stem cells. Given that SSC-s1 is more primitive than SSC-s2, it may contain cells that tend to retain SSC activity for longer than SSC-s2 cells. Another possibility is that SSC-s1 and SSC-s2 are interconvertible states, in alignment with studies suggesting that mouse undifferentiated SG stages are in equilibrium (Hara et al., 2014).
Among the genes enriched in the SSC-s1 cell cluster are genes involved in integrin-mediated signaling (Fig. S4B), which has roles in virtually all aspects of stem cell niche interactions (Marthiens et al., 2010). Indeed, homing of mouse SSCs to their niche in the seminiferous tubule periphery is known to depend on β1-integrin (Kanatsu-Shinohara et al., 2008). Thus, integrins may function in the initial generation of these naïve cells, their maintenance, and/or their conversion into other SSC states. Genes related to vessel development are also enriched in SSC-s1 cells (Fig. S4C), which raises the possibility that these cells are in close physical proximity to lymphatic endothelial cells and respond to extracellular factors from these cells. In support of this hypothesis, it was recently reported that SSC density is tightly regulated by the supply of FGFs from lymphatic endothelial cells (Kitadate et al., 2019).
Testicular somatic cell development
Our scRNAseq analysis of testicular somatic cells during the perinatal period provides a reference for understanding the development of these cells. In the case of SCs, GO analysis revealed that many genes involved in ‘male gonad development’ and ‘SC differentiation’ are enriched at P7 (Table S2), consistent with the evidence that SCs are differentiating during the early postnatal period (Walker, 2003). Many of the genes exhibiting transiently elevated expression mid-way through the perinatal period (group 2, as defined by pseudotime trajectory analysis) are involved in cell proliferation (Fig. 4D; Table S2), and thus are candidates to drive the proliferation of SCs during early postnatal development (Baker and O'Shaughnessy, 2001). Genes exhibiting elevated expression at a later development stage (group 3) encode proteins involved in ubiquitylation, PI3K/AKT signaling and integrin signaling (Table S2), raising the possibility that these pathways are important for SC maturation. Indeed, several of the TF genes in group 3 (Table S2) have known roles in SC development, e.g. Ar, Dmrt1, Sox8 and Sox9 (Barrionuevo et al., 2009; Raymond et al., 2000; Skinner et al., 1988), or lead to male infertility in gene-knockout experiments, e.g. Nr0b1, Rara, Rb1, Rxrb and Sox3 (Costa et al., 1997; Stickels et al., 2015; Vernet et al., 2008; Weiss et al., 2003; Yang et al., 2013a). Based on the timing of their expression, group-3 TF genes are likely to be expressed in SCs in contact with newly emergent SSCs and differentiating SG, raising the possibility that some group-3 TFs have roles in SSC establishment and the first wave of spermatogenesis, respectively.
LCs are unique in that they exist as two distinct populations – fetal and adult – both of which produce testosterone, but otherwise have distinct characteristics (Baker and O'Shaughnessy, 2001). We found that genes involved in ‘apoptosis’ are enriched in the P2 LC cluster (Fig. S6B), consistent with the timing of fetal LC loss (Griswold and Behringer, 2009). Indeed, TF genes expressed transiently in LCs during the mid-phase of the perinatal period (group-2 TF genes, as defined by pseudotime trajectory analysis; Fig. S6D; Table S2) have known roles in apoptosis, e.g. Mid1, Tnfaip3 and Erf (Collison et al., 2013; Honma et al., 2009; Oikawa and Yamada, 2003). Other group-2 TFs are involved in fetal LC differentiation, e.g. Arx (Miyabayashi et al., 2013), or lead to male infertility in gene-knockout experiments, e.g. Hoxd11, Arntl and Atm (Alvarez et al., 2008; Boulet and Capecchi, 2002; Xu et al., 1996).
Little is known about the development of PTMs (Nurmio et al., 2012). We found that P7 PTMs are enriched for genes involved in actin cytoskeleton organization (Table S2), raising the possibility that one or more of these genes drive the rearrangement of actin filaments that occurs in PTMs postnatally (Maekawa et al., 1996). Among the TF genes transiently upregulated late in PTMs (group 3, as defined by pseudotime analysis; Fig. 5C) is Ar, which is known to be important for PTM development (Welsh et al., 2012). Many cell-cycle, nuclear division, and chromatid segregation genes exhibit enriched expression in E18.5 and P2 PTMs, and thus are candidates to drive the proliferation of PTMs, which occurs around birth in mice (Vergouwen et al., 1991).
Cell signaling during perinatal testicular development
Our scRNAseq analysis allowed us to identify many signaling pathways that are candidates to be active during ProSG development and SSC genesis (Fig. S5A). Some signaling pathways enriched in I-ProSG are known to be involved in germ cell migration, including JAK/Stat (Brown et al., 2006), PDGF (Smith et al., 2005) and CXCR4 (Molyneaux et al., 2003), raising the possibility that these signaling pathways have a role in the migration of I-ProSG from the center to the periphery of the seminiferous tubule. Using a well-established Hippo signaling assay, we found that Hippo signaling is specifically activated in germ cells during the time window when ProSG undergo conversion into differentiating SG and SSCs (between ∼P2 and P5). This raises the possibility that Hippo signaling has roles in one or both of these conversion events. As evidence against this, a recent study showed that conditional knockout of Yap in primordial germ cells did not result in any obvious defects in mouse spermatogenesis (Abou Nader et al., 2019), but this may be because of rescue by the Yap paralog Taz (Meng et al., 2016).
We identified a large number of ligands and their cognate receptors expressed in perinatal somatic and germ cell subsets (Fig. 6D; Fig. S8). Several of these ligand-receptor pairs have been shown to function in spermatogenesis (Mäkelä and Toppari, 2017; Takase and Nusse, 2016; Yang et al., 2013b), but, to our knowledge, none has yet been shown to operate in events that occur in germ or somatic cells during the perinatal period. For example, we found that the gene encoding the growth factor GDNF is expressed in perinatal SCs, and its two receptors, GFRA1 and RET, are expressed in T2-ProSG and SSC-1, raising the possibility that GDNF-mediated signaling is important for the generation and initial expansion of SSCs, thereby expanding its role beyond adult SSC self-renewal (Oatley et al., 2007). We found that the CXC chemokine ligand, Cxcl12, is highly expressed in both stromal cells and PTMs, and its receptor, Cxcr4, is expressed in ProSG, LCs and innate lymphatic cells. This raises the possibility that CXC chemokine signaling may be involved in the establishment of SSC niche, as this pathway is known to promote SSC homing to their niche in the seminiferous tubule periphery in adult mice (Niu et al., 2016). A role for FGF signaling is suggested by our finding that that the FGF receptor gene Fgfr2 is expressed in germ cells, LCs, PTMs and stromal cells, and the FGF ligand genes Fgf2 and Fgf7 are expressed in both stromal cells and PTMs during the perinatal period. This is consistent with the possibility that emerging SSCs are regulated by stromal cells and PTMs, as the supply of FGFs is known to tightly regulate SSC density (Kitadate et al., 2019). Other signaling pathways implicated as being active during perinatal development include the Hedgehog, Notch, TGFβ, WNT, PDGF, LIF and IGF pathways (Fig. S8).
MATERIALS AND METHODS
Animals
This study was carried out in strict accordance with the Guidelines of the Institutional Animal Care and Use Committee (IACUC) at the University of California, San Diego. The protocol was approved by the IACUC at the University of California, San Diego (permit number: S09160). All C57BL/6 mice (female: 7-10 weeks; male: 8-16 weeks) were housed under a 12 h light:12 h dark cycle and provided with food and water ad libitum.
Testis sample preparation
Each sample analyzed by scRNAseq consisted of testicular cells pooled from two pups. Single testicular cells were isolated using a two-step enzymatic digestion protocol described previously (Sohni et al., 2019; Valli et al., 2014), with minor modifications. In brief, the tunica albugineas of fresh mouse testes were removed and the testes mechanically disrupted and enzymatically digested with 1 mg ml−1 collagenase type IV (Worthington Biochemical Corporation) in Hank's Balanced Salt Solution (HBSS; Gibco) at 37°C. The tubules were gravity sedimented and washed with HBSS and digested in 0.25% Trypsin-EDTA (Thermo Fisher) and Deoxyribonuclease I (Worthington Biochemical Corporation). The suspension was triturated vigorously ten times, incubated at 37°C for 5 min, followed by repeat trituration and incubation. The digestion was stopped by adding the same volume of αMEM+10% fetal bovine serum (FBS) medium and the cells were sequentially size-filtered through 70 μm and 40 μm strainers (Thermo Fisher) and pelleted by centrifugation at 600 g for 3 min. The dead cells were removed using the ClioCell Dead Cell Removal kit (Amsbio) following the manufacturer's instructions.
10x Genomics library preparation
Single testicular cells were washed once in PBS and re-suspended in 0.3% FBS in PBS for loading on the 10x Chromium chip. Cell capturing, and library preparation was carried as per kit instructions [Chromium Single Cell Kit (v2 chemistry)]. In brief, 10,000 cells were targeted for capture per sample; after cDNA synthesis, 12-14 cycles were used for library amplification. The resultant libraries were size selected, pooled and sequenced using 2×100 paired-end sequencing protocol on an Illumina HiSeq 4000 instrument. The libraries initially underwent shallow sequencing to access quality and to adjust subsequent sequencing depth based on the capture rate and unique molecular indices (UMI) detected. All sequencing was performed at the Institute of Genomic Medicine at UCSD.
De-multiplexed raw sequencing reads were processed and mapped to the mouse genome (mm10) using Cell Ranger software (v3.0.2) with default parameters. Quality control metrics from these data are listed in Table S3. We filtered raw count matrices by excluding cells expressing fewer than 200 detectably expressed genes and genes expressed in fewer than three cells and each library was tagged with a library batch ID and combined across independent experiments using the Seurat package (Butler et al., 2018) in R. To check the quality of the single-cell data and to remove multiplets, we performed Seurat-based filtering of cells based on three criteria: number of detected features (nFeature_RNA) per cell, number of UMIs expressed per cell (nCount_RNA) and mitochondrial content, using the following threshold parameters: nFeature_RNA (200 to 6000), and percentage of mitochondrial genes expressed (<0.2%). For the detailed analysis of each cell type, we applied more stringent filtering parameters (i.e. mitochondrial content percent cut-off was set to 0.15). In addition, we used a known lineage marker profile to ensure that the same barcode was not assigned to two cells of different lineages (multiplets), as well as to rule out the possibility of captured cell-free droplets. Gene expression values were log normalized and regressed by the mitochondrial expression (‘percent.mt’) and cell cycle (‘S.Score’ and ‘G2M.Score’) using the ‘SCTransform’ function. Batch correction was performed using the ‘JackStraw’ functions in the Seurat package.
To identify cell clusters, we employed UMAP (Becht et al., 2018). The number of the cells in each cluster are summarized in Table S4. The ‘FindMarkers’ function (a Wilcoxon rank sum test) was used to determine differential gene expression between clusters (set at minimum expression in 25% of cells). The ‘CellCycleScoring’ function was used to infer cell cycle phase, as this program determines relative expression of a large set of G2/M- and S-phase genes (Kowalczyk et al., 2015). The ‘DoHeatmap’ function was used to generate an expression heatmap for given cells and features. Pearson correlation analysis of cell clusters was performed using the ‘CellScatter’ function. GO analysis (DAVID v6.8) was done using top differentially (positively) expressed genes with a P-adjusted cut-off of 0.01. Lists of differentially expressed genes were analyzed by IPA to identify biological pathways that are significantly over-represented among the genes in each list.
Cell trajectory analysis
Single-cell pseudotime trajectories were constructed with the Monocle 2 package (v2.10.1) (Qiu et al., 2017) according to the provided documentation (http://cole-trapnell-lab.github.io/monocle-release/). UMI counts were modeled as a negative binomial distribution. Genes defining the trajectory are those with high dispersion across cells (mean_expression≥0.01; dispersion_empirical≥1). The discriminative dimensionality reduction with trees (DDRTree) method was used to reduce data to two dimensions. Differentially expressed genes were identified and used for dynamic trajectory analysis [false discovery rate (FDR)<0.05], which ordered cells in pseudotime. The ‘plot_pseudotime_heatmap’ function was used to generate heatmaps.
Immunofluorescence analysis
Sections were deparaffinized twice in xylene, followed by serial dilutions of ethanol. Unmasking was performed with IHC-TekTM epitope retrieval solution using a steamer (IHC World) for 40 min. Blocking was performed by incubating with 5% serum (from the species in which the secondary antibody was raised) for 1 h at room temperature. The sections were then incubated overnight with the primary antibody (Table S5) at 4°C and incubated with secondary antisera (see Table S5 for a list) for 1 h at room temperature. The nuclei were counterstained with DAPI, a coverslip was placed over the sections with mounting medium, and the images were viewed using a Leica DMI4000 B fluorescence microscope.
FACS and qRT-PCR analysis
After dissecting single testicular cells, cells were re-suspended in staining buffer (PBS+3% FBS) and stained with the CDKN2A antibody for 20 min on ice, washed with staining buffer, incubated with secondary antibodies for 20 min on ice, re-suspended in staining buffer and sorted by FACS. For FACS, gating was set based on size (forward scatter, FSC) to remove small debris and doublets. We then used unstained and secondary antibody only (primary omitted)-stained cells as negative controls for gating unstained and false positive-stained cells, respectively.
cDNAs were generated using the Iscript reverse transcriptase (RT) kit, according to the manufacturer's protocol (Bio-Rad). qRT-PCR was performed using an iCycler real-time PCR machine according to the manufacturer's protocol (Bio-Rad). The production of the amplicon was measured by SYBR green fluorescence and the threshold cycle (Ct) values were calculated. Ct values obtained were normalized to Ct values for the ribosomal protein Rpl19 gene. Three independent biological replicates were performed.
Statistical methods
The details of the statistical method used for identifying the differential gene expression and pseudotime trajectory analysis are provided in the detailed methods above. Quantification of the immunostainings was performed by counting the positively stained cells in different fields of view. The number of cells counted is indicated on the respective figure or its figure legend.
Acknowledgements
We thank the UCSD Institute for Genomic Medicine for technical support and the San Diego Supercomputer Center for providing data analysis resources. We also thank the Developmental Studies Hybridoma Bank (created by the NICHD and maintained at The University of Iowa, Department of Biology) for the LHX1 antibody generated by Dr David R. Soll.
Footnotes
Author contributions
Conceptualization: K.T.; Methodology: K.T., H.-W.S.; Software: K.T.; Validation: K.T.; Formal analysis: K.T.; Investigation: K.T.; Resources: K.T., H.-W.S.; Data curation: K.T.; Writing - original draft: K.T.; Writing - review & editing: K.T., M.F.W.; Supervision: M.F.W.; Project administration: M.F.W.; Funding acquisition: M.F.W.
Funding
This work was supported by the National Institutes of Health (R01 GM119128 to M.F.W.) and the Lalor Foundation (K.T.). Deposited in PMC for release after 12 months.
Data availability
The scRNAseq data generated in this study has been deposited at NCBI's Gene Expression Omnibus database under the accession number GSE130593.
Peer review history
The peer review history is available online at https://dev.biologists.org/lookup/doi/10.1242/dev.183251.reviewer-comments.pdf
References
Competing interests
The authors declare no competing or financial interests.