Recent advances in the generation of kidney organoids and the culture of primary nephron progenitors from mouse and human have been based on knowledge of the molecular basis of kidney development in mice. Although gene expression during kidney development has been intensely investigated, single cell profiling provides new opportunities to further subsect component cell types and the signalling networks at play. Here, we describe the generation and analysis of 6732 single cell transcriptomes from the fetal mouse kidney [embryonic day (E)18.5] and 7853 sorted nephron progenitor cells (E14.5). These datasets provide improved resolution of cell types and specific markers, including subdivision of the renal stroma and heterogeneity within the nephron progenitor population. Ligand-receptor interaction and pathway analysis reveals novel crosstalk between cellular compartments and associates new pathways with differentiation of nephron and ureteric epithelium cell types. We identify transcriptional congruence between the distal nephron and ureteric epithelium, showing that most markers previously used to identify ureteric epithelium are not specific. Together, this work improves our understanding of metanephric kidney development and provides a template to guide the regeneration of renal tissue.
Mammalian kidney development has been studied using the mouse for over 70 years. The developing mammalian kidney consists of three main cell lineages, all of which derive from multipotent progenitors. Foxd1-expressing progenitors give rise to most cell types in the interstitial compartments (Kobayashi et al., 2014) aside from the stroma surrounding the ureter, which derives from Tbx18-expressing cells (Bohnenpoll et al., 2013). Ret-expressing ureteric tip (UT) cells give rise to the collecting duct and ureter (Chi et al., 2009). Finally, the filtration units of the kidney, the epithelial nephrons, arise from Six2-expressing nephron progenitor cells (Kobayashi et al., 2008). During kidney development, these progenitor populations signal to each other to ensure the ongoing expansion of the organ and accumulation of nephrons, with the resulting kidney containing ∼1,000,000 nephrons in human and 16,000 in mouse (Bertram et al., 2011; Merlet-Benichou et al., 1999).
Our understanding of the molecular identity of cellular components within the mouse kidney is arguably richer than in almost any other organ system. Initial microarray analyses (Challen et al., 2005; Schmidt-Ott et al., 2005; Schwab et al., 2003; Stuart et al., 2003) were followed by some of the earliest profiling of laser-captured and sorted cell populations (Brunskill et al., 2008, 2011) with extensive section in situ hybridisation studies both validating compartment-enriched gene expression as well as further subsecting cellular domains (Georgas et al., 2009; Harding et al., 2011; Mugford et al., 2009; Thiagarajan et al., 2011). Anatomical and molecular comparisons of kidney development between human and mouse have now identified species-specific cell type markers within a developmental programme that is largely conserved (Lindström et al., 2018b,c,d). With the advent of single cell RNA sequencing (scRNA-seq), such analyses of the developing mouse (Adam et al., 2017; Brunskill et al., 2014; Magella et al., 2017) and human (Lindström et al., 2018a,b; Menon et al., 2018; Wang et al., 2018; Young et al., 2018) kidney have given further insight into the cellular composition and molecular profiles of cells in both species. The analyses performed on existing datasets have not comprehensively identified the signalling pathway components known to be operating within the mouse fetal kidney. Understanding the signals involved in specifying renal progenitors in mouse has formed the basis of current human kidney organoid protocols (Little et al., 2016) and has underpinned advances in culturing primary progenitor cells from both species (Brown et al., 2015; Li et al., 2016; Tanigawa et al., 2016; Yuri et al., 2017). A deeper understanding of unique cell type-specific marker genes and the local signalling environment for all component cell types during mouse kidney development will provide avenues to optimise the maintenance and differentiation of nephron, stromal and ureteric epithelium cell types from mouse and human cells for drug screening and disease modelling applications.
In this study, we used single cell profiling to interrogate cell types and gene expression within 6732 cells from three distinct E18.5 developing mouse kidney pairs, and 7853 sorted nephron progenitor cells from E14.5. By combining biological replication with robust clustering algorithms to define cell types, and rigorous statistical testing to determine differentially expressed cluster markers, we have generated an in-depth single cell view of the developing mouse kidney. Global analysis of receptor and ligand interactions within this dataset provides information with which to improve specification, maintenance and maturation of renal cell types in vitro. Importantly, this dataset more deeply subdivides stromal subcompartments as well as better addressing the need for unique markers of specific nephron segments.
Single cell profiling of the developing kidney identifies all major lineages and cell types including a stromal-nephron progenitor cluster
We sought to explore cell types and developmental programmes in the late fetal mouse embryonic kidney (E18.5), a time at which all progenitor populations and most mature and maturing cell types co-exist. Using three independent kidney pairs captured in parallel using the 10x Chromium system, our aggregated dataset consists of 6752 cells, 5639 of which passed quality control (see Materials and Methods), with a median of 2896 unique genes detected per cell. We used Seurat (Butler et al., 2018; Satija et al., 2015) to perform normalisation, variable gene selection and subsequent unsupervised clustering of cells, yielding 16 distinct whole kidney clusters (K0-K15, Fig. 1A). Following initial exploration of the data, we normalised for the effects of the three biological replicates as well as for cell cycle stage. After normalisation, an overlay of the three independent kidney datasets showed an even distribution of cells from each replicate among clusters, and visualisation of cell cycle state across the t-SNE projection illustrates no association between cell cycle state and any specific cluster after cell cycle normalisation (Fig. S1). TREAT tests from the edgeR package (McCarthy and Smyth, 2009; Robinson et al., 2010) were used to find genes that were differentially expressed between cells in each cluster and all other cells (log fold change>1, FDR<0.05). Genes that were enriched or specifically expressed in each cluster were cross-referenced to validated anchor genes and established markers to identify cell types (Fig. 1B, Table 1) (Georgas et al., 2009, 2008; Thiagarajan et al., 2011). Entire gene lists were also compared with available kidney cell type-specific profiling using ToppGene (toppgene.cchmc.org) (Chen et al., 2009). This provided a provisional identification for all clusters (Fig. 1A-C). The number of clusters and key markers from our dataset are generally consistent with previous single cell analyses of the developing mouse kidney (Adam et al., 2017; Magella et al., 2017), though our analysis identifies more established marker genes per cluster (Fig. S1). Lists of differentially expressed (DE) genes for each cluster are provided in Table S1, and tSNE plots of key marker genes are displayed in Fig. S2. One or more clusters representing each of the major renal lineages – stroma, nephron and ureteric epithelium – were identified in the data. Vascular endothelial and tissue-resident immune cell populations were also identified (Fig. 1A-C). We note that resident immune cells expressed Bmp2 and Tgfb1, whereas the endothelial cells expressed Igf1, Igf2, Tgfb1, Notch1 and Notch 2, which may influence cell-cell signalling within the developing kidney. Clusters corresponding to nephron progenitor cells, all major nephron segments, and a nephron progenitor-like cluster that co-expresses stromal markers including Penk and Col3a1 (Fig. 1A,B) were identified. Four additional populations with a stromal signature were identified, all expressing Meis1, Col3a1 and Pdgfra. These populations correspond to a cortical/nephrogenic zone stroma (cluster K2, Meis1+Foxd1+Wnt4−), medullary stroma (K4, Meis1+Foxd1−Wnt4−), collecting duct-associated stroma (K1, Meis1+Wnt4+Wnt11+) and a population marked by genes known to be expressed in several locations such as the cortical stroma, renal capsule, mesangium, smooth muscle cells and ureteric stroma (K13, Meis1+Foxd1+Tbx18+) indicating further heterogeneity within these clusters (Fig. S1). We next sought to examine potential signalling interactions between cell types.
Global analysis of putative ligand-receptor interactions
Known expression domains for key ligands and receptors from GDNF-RET, TGFB, Wnt and FGF pathways were observed in expected cell types in our differential expression analysis (Fig. 2A). To investigate cell communication in the entire dataset we screened all cell types for a curated list of 2422 known and inferred receptor-ligand interactions (Ramilowski et al., 2015) adapted for use in single cell data (Farbehi et al., 2019). This identified >12,000 potential interactions within and between the whole kidney clusters (Fig. 2B, Table S2). Although interactions between some cell populations are implausible due to lack of proximity, this provides an unbiased analysis of autocrine signalling and interactions between adjacent cell types. As an illustration of these results, we focus on potential paracrine interactions between the cortical stroma (CS, K2), nephron progenitor (NP, K0), and ureteric epithelium (UE, K9) cell clusters (Fig. 2C). This identifies known interactions between NP and UE through GDNF-RET, FGF, BMP and Wnt ligands and receptors and identifies additional putative interactions through NP-produced Rspo1 and Rspo3, and UE-produced Nrtn. We note that NP-produced Fgf1 was identified to signal back to the ureteric epithelium. Fgf1 is the most differentially expressed FGF ligand in the NP cluster within this data, though Fgf20, and lower levels of Fgf8, Fgf9 and Fgf10 were also detected. We previously identified Fgf1 as a candidate driver of increased NP proliferation in a heterozygous knockout of Six2 (Combes et al., 2018) and exogenous FGF1 promotes NP maintenance in culture (Brown et al., 2011). This analysis relies on a curated list of interacting factors (Ramilowski et al., 2015), which has some notable exceptions including Wnt9b; however, expression of specific genes can be interrogated in Table S1. Signalling between the CS and NP populations is important for kidney development (Das et al., 2013; Fetting et al., 2014) but our understanding of the pathways underlying these interactions is incomplete. This analysis identifies potential interactions involving Wnt5a and Bmp7. These genes have kidney phenotypes on knockout and are expressed in the CS (Wnt5a) (Nishita et al., 2014) or influence organisation of the CS (Bmp7) (Oxburgh et al., 2004). Putative NP-CS interactions involving Ntn1, Sfrp1, Fgf1, Fgf2, Pdgfc and Ntf3 were identified (Fig. 2C). Although the significance of putative interactions requires testing, these provide candidate pathways to improve nephron progenitor specification and maintenance in vitro.
Subclustering of ureteric epithelium cells identifies known subpopulations and established developmental trajectories
The ureteric epithelium in the developing mouse kidney has distinct zones of gene expression defining the tips, cortical and medullary domains of this epithelium (Rutledge et al., 2017; Thiagarajan et al., 2011). Whole kidney cluster K9 expressed genes that are characteristic of the ureteric epithelium, including Wnt11, Ret, Gata3 and Wnt9b. Cells belonging to K9 were re-clustered, resulting in the identification of three ureteric epithelium subpopulations (U), with differential expression defining marker genes corresponding to tips (U0), cortical (U1) and medullary (U2) segments of the ureteric epithelium (Fig. 3A-C, Table S3) (Thiagarajan et al., 2011). The cluster enriched for medullary collecting duct marker genes also contained genes expressed in the urothelium of the renal pelvis (Fig. 3C) (Thiagarajan et al., 2011). Testing of Gene Ontology (GO) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotations identified major signalling pathways active in these subpopulations. This indicated the activity of several pathways known to be involved in ureteric tip development such as Wnt, retinoic acid, TGFβ, FGF and Hippo signalling (Fig. 3D) (Reginensi et al., 2015; Yuri et al., 2017). This analysis also identified TGFB and PI3K-AKT pathways as active in the cortical collecting duct and phosphatidylinositol, PPAR and Notch pathways as active in the medullary collecting duct and or urothelium (Fig. 2D). These represent candidate pathways for attempts to direct differentiation of mature collecting duct from progenitors of the ureteric epithelium. Whereas clustering attempts to group single cell transcriptomes into distinct cell types, pseudotime analysis involves ordering cells along a continuous trajectory that represents progress through differentiation. This is done by maximising the transcriptional similarity between successive pairs of cells, using dimensionality reduction and minimal spanning trees. Branches can occur along the trajectory when precursor cells make cell fate decisions that result in multiple subsequent lineages (Trapnell et al., 2014). Pseudotime analysis of cells from kidney cluster K9 using Monocle 2 (Qiu et al., 2017) replicated the established developmental trajectory from tip progenitor to cortical then medullary collecting duct, and identified cohorts of genes that change during this progression (Fig. 3E,F).
Nephron lineage and relationships
Some nephron clusters within the whole kidney analysis represented multiple nephron populations such as cluster K3 (Fig. 1A), which co-expressed markers of the connecting segment (Calb1) and distal tubule (Slc12a1), which do not overlap in the embryo. We reclustered cells from the nephron lineage to gain further insight into nephron segments and subpopulations. This identified eight nephron (N) clusters representing established early nephron states and mature nephron segments (Fig. 4A), and markers including Six2, Cited1, and Meox1 defined a further five nephron progenitor clusters. All cells of the nephron arise from nephron progenitors via a mesenchyme-to-epithelial transition in response to WNT9B, produced at highest levels in the tip-stalk junction of the ureteric epithelium (Carroll et al., 2005; Kobayashi et al., 2008). The first morphological sign of this transition is clustering of progenitors into a pretubular aggregate (PTA), marked by expression of Wnt4 and Tmem100 (Rumballe et al., 2011), which then forms a polarised epithelial renal vesicle (RV), marked by elevated levels of Ccnd1, Jag1 and Fgf8 (Fig. 4B) (Georgas et al., 2009). Distinct proximal and distal gene expression is seen at RV with distal, medial and proximal segments evident in the S-shaped body (SSB) (Fig. 4B) (Georgas et al., 2008). Podocytes are located in the proximal segment of the SSB (marked by Mafb) (Fig. 4B). The SSB matures into a capillary loop nephron, which contains precursors for all major nephron segments, including a connecting segment (Calb1), distal tubule (Slc12a1), loop of Henle (Umod), proximal tubule (Lrp2, Fbp1) and podocyte-enriched glomerulus (Mafb, Podxl) (Fig. 4B).
Clusters representing all major nephron segments were present in the single cell data (Fig. 4A-C), with RV and SSB markers overlapping in cluster N3, and cluster N8 expressing markers of the distal SSB. Clusters representing the PTA, connecting segment, distal tubule and loop of Henle, early proximal tubule and proximal tubule, and podocytes were also identified (Fig. 4C). We refer to the ‘early proximal tubule’ (K5) cluster as such because it appears to represent a less mature version of the proximal tubule cluster (K8); however, K5 may also represent a distinct proximal tubule segment identity rather than a state of maturation. DE genes enriched were identified for each cluster (Table S4).
Pseudotime analysis was used to further interrogate nephron formation. This identified three main nephron states (Fig. 4D). An initial state combined nephron progenitors with early nephron up to SSB and podocytes. The trajectory subsequently forked into two arms representing the connecting segment and distal tubule on one arm, and the proximal tubule on the other (Fig. 4D). The split between proximal and distal tubule, and the association between distal tubule and connecting segment was anticipated (Georgas et al., 2009, 2008). This positioning of podocytes between RV/SSB and the branch point between proximal and distal fates is different for the separate trajectory reported in human fetal kidney (Lindström et al., 2018a), or that more closely associated with proximal nephron as reported by Hochane et al. (2019).
Mechanisms regulating nephron formation and maturation
Transcriptional regulation is a crucial mechanism for determining and maintaining cell fate during development. Segment-specific transcriptional regulators may facilitate direct reprogramming, as previously reported for nephron progenitors and proximal tubule (Hendry et al., 2013; Kaminski et al., 2016). The top differentially expressed transcription factors (TF) within each mouse nephron lineage cluster, including nephron progenitor, were identified (Fig. S3, Table S4), highlighting cell type-specific TFs such as Six2 (nephron progenitor/early nephron), Mafb (podocytes) and Hnf4a (proximal tubule) (Kaminski et al., 2016; Thiagarajan et al., 2011). Signalling pathways identified as active within the nephron progenitor cluster include several pathways shown to regulate nephron progenitor fate in vivo such as PI3K-AKT, Wnt, Hippo and MAPK signalling (Brown et al., 2015; Das et al., 2013; Karner et al., 2011; Lindström et al., 2015; McNeill and Reginensi, 2017). Likewise, signalling pathways capable of triggering nephron formation, including Notch and TGF-β signalling (Brown et al., 2015; Chung et al., 2017), were identified in early nephron cell types (Fig. 4E, Fig. S3). Novel developmentally significant signalling pathways, including Hedgehog and JAK-STAT, were implicated by this analysis and may improve methods for maintaining isolated nephron progenitors (Brown et al., 2015; Li et al., 2016; Tanigawa et al., 2016). Although nephron progenitor regulation and early nephron segmentation has been intensely studied, very little is known about the signals that are active in maturing nephron segments. Table S4 provides candidate signalling pathways that could be used to produce specific states of nephron maturation from primary nephron progenitor cells and in human kidney organoids. cAMP, cGMP-PKG and insulin signalling are associated with the distal and early proximal tubule, whereas PPAR, AMPK and glucagon signalling are associated with proximal tubule.
Identification of Notch2+Spry2+ state and nephron progenitor stromal cluster
Nephrons derive from a self-renewing mesenchymal population (Boyle et al., 2008; Kobayashi et al., 2008). The nephron progenitor population is thought to be divided into a Six2+Cited1+ uncommitted and Six2+Cited1− committing state (Brown et al., 2015; Mugford et al., 2008). However, previous time-lapse imaging of kidney morphogenesis has revealed substantial cell movement (Combes et al., 2016; Lawlor et al., 2019; O'Brien et al., 2018) and variation in cell cycle length (Short et al., 2014) within the nephron progenitor population, suggesting it may be more heterogeneous than was previously thought. As noted above (Fig. 4A), five nephron progenitor populations were identified expressing Six2, Cited1 and Meox1. Two of these clusters (N7 and N9) appeared to be driven by cell cycle genes (e.g. Cenpa, Cenpf, Pclaf, Top2a), with top DE genes also relating to cell cycle. These clusters are likely driven by cycle profile not accounted for in the cell cycle normalisation. Cell cycle clusters N7 and N9 displayed a partially committed phenotype with low expression of Wnt4 and Tmem100. This could associate cell division with priming for commitment or reflect the increase in cell proliferation that is seen in committing nephron progenitor cells (Short et al., 2014). The three remaining nephron progenitor clusters expressed cluster-specific DE markers (Fig. 5A). We define these as: (1) ‘uncommitted’ (cluster N0), with the highest levels of Cited1 and Meox1 and little to no expression of Wnt4 and Tmem100; (2) ‘primed’ (N6), with lower levels of Six2 and Cited1, expression of Notch2 and Spry2 and low levels of renal vesicle marker Jag1; and (3) a nephron progenitor-stromal population (N10) with modest expression of commitment markers and stromal characteristics, including expression of Pdgfra and Col3a1 (Fig. 5A). Cluster N6 expressed Spry2, a negative regulator of FGF signalling, with FGF signalling associated with nephron progenitor maintenance (Walker et al., 2016). Notch signalling has recently been shown to regulate early commitment to nephron formation (Chung et al., 2016, 2017). Hence, cluster N6 may represent a transitional state between nephron progenitor and PTA (N4) (Fig. 5A).
Pseudotime analysis of all cells within early nephron clusters (N0 NP to N4 PTA, Fig. 5B) reproduced the expected developmental trajectory from nephron progenitor to PTA. Cell cycle-associated cluster NP7 grouped with PTA cluster N4, representing a more committed state (Fig. 5B). Cell cycle-associated cluster N9 and a putative ‘primed’ cluster N6 were distributed along the entire trajectory. As such, N6 may represent a transient state that NP cells cycle through or reflect cells that are positioned adjacent to the stroma or ureteric tip at any point in time. The nephron progenitor/stromal state N10 diverged from this main trajectory after the undifferentiated nephron progenitor state. Of note, cells from the N6 NP cluster were also present with the N10 NP-STR cluster (Fig. 5B).
Sorted nephron progenitors recapitulate nephron progenitor cell states identified in whole kidney
To gain a deeper insight into nephron progenitor subpopulations within the developing mouse kidney, 7853 Six2GFP+ nephron progenitor cells from three pooled replicates of E14.5 kidney were isolated using fluorescence activated cell sorting (FACS) from the Six2GCE mouse line (Kobayashi et al., 2008) and profiled using scRNA-seq. Sorted cells were combined with all nephron progenitor cells from the whole kidney analysis and clustered using Seurat's dataset integration approach (Butler et al., 2018) (Fig. S4). We refer to the resulting clusters as nephron progenitor (NP) clusters 0-9 (NP0-NP9, Fig. 5C). Clusters that did not relate to cell cycle were marked by the same DE genes that were observed in the whole kidney nephron progenitor subclusters. Cluster NP0 displayed increased expression of uncommitted progenitor genes such as Cited1, NP3 had increased levels of Notch2 and Spry2, cluster NP4 represented a committing state with increased levels of Wnt4 and Tmem100, and a nephron progenitor-stromal cluster (NP7) remained. Cells from both whole kidney and sorted nephron progenitors were present in all clusters (Fig. 5D, Fig. S4). Although nephron progenitor cells across all clusters expressed stromal markers (Meis1, Lgals1 and Meg3), Pdgfra was enriched in NP7 (Fig. 5D). Trajectory analysis of nephron progenitors from clusters NP0, NP3, NP4 and NP7 (Fig. 5E) reproduced the trajectory analysis of nephron progenitors from the whole kidney data (Fig. 5B). See Table S5 for cluster markers and DE genes for NP0, NP3, NP4 and NP7. This larger dataset reinforced the nephron progenitor subpopulations identified in the whole mouse kidney.
The nephron progenitor-stromal cluster may be the result of a technical artefact
Nephron progenitors and stromal progenitors arise from the same lineage before the onset of nephron formation (Brunskill et al., 2014; Mugford et al., 2008) but are not thought to cross lineage after this time (Kobayashi et al., 2014; Naiman et al., 2017). Stochastic expression of stromal markers in nephron progenitor cells has been reported at the single cell level (Magella et al., 2017), and expression of stromal markers Foxd1 and Meis1 within our analysis suggests this may be more than random expression (Fig. 5D). However, NP7 represented a small but distinct cell cluster that expressed both nephron progenitor and a broad range of stromal markers. This combined profile could represent a technical artefact in which stromal and nephron progenitor cells are labelled by a single barcode creating a ‘doublet’, the existence of a genuine in vivo cell population transitioning between nephron progenitor and stromal lineages, or an artefactual change in progenitor identity upon dissociation. No other populations were observed with mixed signatures, but doublet-finding algorithms (doubletCells and doubletCluster functions in the scran package) did identify this cluster within the whole kidney data (K14) as having an increased probability of containing doublets (Fig. S4).
Lineage tracing was performed to investigate the possibility of these cells representing a transitional state. Using a constitutively active Six2-Cre (Six2TGC), Six2-derived cells were observed in the cortical and medullary stroma in all samples (Fig. 5F). Evidence of lineage transition was also observed deeper in the kidney. However, as this Six2-Cre is active from E11.5 or earlier, labelled stromal cells may reflect the early plasticity between stromal and nephron lineages rather than continued transdifferentiation. Using an inducible Six2-Cre (Six2GCE, induced from E12.5) to assess nephron progenitor contributions to stroma after the establishment of the proposed lineage boundary did result in rare Six2-derived cells in the nephrogenic zone that did not express SIX2 protein, but labelled cells were observed at a frequency lower than expected based on NP7 cluster size, and most labelled cells were unusually small, suggesting they may be undergoing apoptosis (Fig. 5G). Lineage tracing from an inducible Pdgfra stromal Cre activated at E13.5 and assessed at E18.5 did not label cells within the nephron progenitor population or nephron lineage (Fig. 5H). Hence, stromal cells do not appear to transition to nephron progenitor fate. SIX2 antibody staining did not overlap with a transgenic mouse line expressing nuclear PdgfraGFP (Fig. 5I) despite transcripts for Six2 and Pdgfra being co-expressed in the scRNA-seq data. A genuine discrepancy between mRNA and protein expression is improbable, as the reporters used drive inducible Cre expression from the native Pdgfra and Six2 promoters and therefore should evade mechanisms targeted at preventing production of proteins from the other lineage. Cumulatively, these data affirm the current model of boundaries between nephron and stromal lineages after early kidney development. Although this mixed signature could represent transcriptional confusion induced by dissociation, increased library size and a merged signature supports selective doublets between nephron progenitor and stromal cells. It remains unclear why this doublet was enriched; however, this may suggest differential cell-cell adhesion between these states.
Defining stromal subpopulations within developing kidney
Interrogating the role of stromal subpopulations in kidney development has been hampered by a lack of understanding of specific markers of these populations. Although ontological terms were defined for distinct anatomical regions of the kidney stroma (Little et al., 2007), definitive markers for such regions have been less well defined. Adam et al. (2017) and Magella et al. (2017) identified three stromal clusters and regionally assigned them (cortical, medullary, mesangial) with respect to in situ hybridisation data from the Allen Brain Atlas (http://portal.brain-map.org/). Stromal cell types, and signalling from them, are crucial to normal kidney development (Li et al., 2014). Reclustering of stromal clusters identified seven stromal lineage (S) clusters (Fig. 6A-C). In situ hybridisation data from the Allen Developing Mouse Brain Atlas (https://developingmouse.brain-map.org/) was mined to map the expression domains of cluster markers (Fig. S5). Stromal clusters S0 and S4 are marked by several genes expressed in the cortical and nephrogenic zone stroma (Foxd1, Ntn1, Igfbp5, Aldh1a2, Gdnf). Cluster S4 revealed a cell cycle signature, likely representing cells within the same region as S0 that are proliferating. Clusters S2 and S3 (Alx1, Wnt4, Nkd1, Wnt11) represent the Alx1+ collecting duct-associated stroma. S2 may reflect proliferating cells within S3, as the majority of genes that differ between these clusters relate to cell cycle, with the notable exception of Ren1, which may identify this cluster as perivascular. Markers within cluster S1 have a heterogenous expression with overlap in the medullary region. The S5 population expresses markers of vascular-associated smooth muscle cells and pericytes (Angpt1, Angpt2, Mef2c, Pdgfrb, Cspg4, Ren1, Gata3). DE genes for cluster S6 included genes such as Tbx18, with established profiles in the stroma surrounding the ureter (Airik et al., 2006), and Dlk1, Igf1 and Cd34, suggesting vascular-associated cells (Fig. 6B-C, Fig. S5). The DE genes from this analysis will aid in characterising stromal populations in the developing kidney (Table S6). Further integration of scRNA-seq datasets such as this one with emerging spatial transcriptomics methods (Ståhl et al., 2016) will also aid in defining more precise regions and cell types within the stroma.
Congruence between markers of the ureteric epithelium and distal nephron
In the process of defining cluster identities, a strong congruence between markers of the ureteric epithelium and distal nephron was observed. Most established markers of the ureteric epithelium, such as Hoxb7, Gata3, Calb1, Krt8, Krt18, Krt19 and Aqp2, were also expressed in the distal nephron, albeit at lower levels. Likewise, nephron markers such as Cdh16, Mal, Spp1 and Spint2 were expressed in the ureteric epithelium (Fig. 7A). Indeed, over half of the top 30 DE genes in either distal nephron or ureteric epithelium were expressed in both clusters. This has significant implications in the directed differentiation of pluripotent cells to kidney organoids, which has relied upon many of these markers to identify collecting duct versus distal nephron.
To check that these results were not due to inappropriate clustering of ureteric epithelium cells, we re-examined the presence of GATA3 protein within the distal nephron segments (connecting segment/distal tubule) in vivo using antibody staining and lineage tracing driven by a nephron progenitor-specific Six2-Cre mouse line (Six2TGC) (Kobayashi et al., 2008). As expected, all connecting segments and distal regions of the nephron tubules were derived from the nephron lineage, but these structures clearly express GATA3 protein (Fig. 7B). Indeed Hoxb7, an established marker of the ureteric epithelium, was most highly expressed in ureteric epithelium but also in distal nephron (Fig. 7C) and some endothelial cells (not shown). Expression of GATA3 and Hoxb7GFP in the distal nephron has likely been previously overlooked as in situ hybridisation and immunofluorescence focus on sites of highest expression.
Comparing genes that are upregulated in the connecting segment (N12) and ureteric epithelium (K9) clusters, and checking these against relevant clusters in the nephron and ureteric epithelium lineage clustering, identified 36 genes that represent markers specific to the connecting segment and/or expressed more broadly in the nephron lineage that could be used in combination with Gata3 expression to distinguish connecting segment from ureteric epithelium. Likewise, 29 ureteric epithelium genes not expressed in the distal nephron were identified (Table S7).
The developing mouse kidney represents an invaluable tool with which to understand the formation and maturation of each renal cell type. The single cell data presented here offers a unique opportunity to understand the mechanisms of progenitor maintenance and differentiation in the stroma, the ureteric epithelium and the nephron lineages. Dynamic changes in gene expression and signalling pathway activity from progenitor to mature cell type provide a roadmap of the signals that regulate progenitor maintenance and differentiation in vivo. Likewise, global analysis of receptor-ligand interactions between all cell clusters in the whole kidney identified potential novel interactions and interactions known to play a significant role in kidney development.
Previous scRNA-seq analyses of developing mouse kidney have been performed at E11.5-E14.5 and postnatal day (P) 1 (Adam et al., 2017; Brunskill et al., 2014; Magella et al., 2017) (Table S8). This study examined E18.5, a developmental stage that contains a broader complement of cell types compared with E11-E14.5 but precedes the cessation of nephrogenesis initiating at P1 (Hartman et al., 2007; Rumballe et al., 2011). Although the DE genes identified here correlate with these previous studies, this dataset provides a deeper insight into cluster-specific gene expression, identifying both anticipated receptors/ligand expression patterns and revealing novel relationships. Although >20,000 cells were profiled at P1 (Adam et al., 2017), several known signalling molecules with functionally validated roles in the nephrogenic niche were not highlighted in that analysis. The cross-platform study conducted at E14.5 (Magella et al., 2017) provided insight into some novel signalling interactions, including Gdnf expression in the nephrogenic zone stroma, but expression of genes encoding key ligands such as Gdnf, Fgf9 or Bmp7 did not feature in the nephron progenitor population, perhaps favouring detection of ligands with more restricted expression patterns such as Fgf20. The improved analysis of signalling interactions provided in this study may be due to sequencing depth (∼3000 genes detected per cell), biological replication and differential expression analysis with the edgeR method, which has recently been shown to be a top performer in a comparison of 36 differential expression analysis methods for scRNA-seq data (Soneson and Robinson, 2018). Crucially, we have used in vivo gene expression and lineage tracing studies to validate or dismiss novel compartments.
This analysis identifies heterogeneity within the nephron progenitor population with a Six2/Cited1 high undifferentiated state, a moderate Six2/Cited1 expression cluster co-expressing Notch2 and Spry2, and a Six2 moderate Cited1 low/off cluster with upregulated expression of early commitment markers (Wnt4, Tmem100) potentially representing PTA. These clusters reflect previously described undifferentiated, primed and PTA clusters based on regionally restricted expression of markers such as Six2, Cited1, Dpf3 and Meox1 (Brown et al., 2015; Georgas et al., 2009; Mugford et al., 2009). In contrast to previous work on nephron progenitor subpopulations, Cited1 was not absent before the upregulation of pretubular aggregate genes, though Cited1 levels were reduced between the ‘undifferentiated’ and ‘primed’ populations. Additional cell cycle-associated nephron progenitor clusters were also identified but pseudotime analysis suggests they are dividing cells within other NP populations. A nephron progenitor-stromal cluster was identified by clustering and pseudotime analyses but not supported by subsequent lineage tracing or protein colocalisation experiments. Again, this may reflect a difference in the range of expression levels detected by this analysis versus those evident by in situ hybridisation or immunofluorescence. Changes in expression patterns between nephron progenitor subpopulations were graded rather than sharp, perhaps reflecting smooth transitions between states. Further work will be required to determine whether these subpopulations correlate to distinct anatomical regions within the nephrogenic niche.
Human kidney organoids contain epithelial, stromal and endothelial cell types with transcriptional congruence to equivalent populations in the human fetal kidney (Combes et al., 2019). However, our ability to interpret the cellular composition and authenticity of engineered renal tissue depends on our understanding of the markers that define a particular cell type or state of maturation in vivo. Likewise, our capacity to generate a cell type depends on knowledge of the programmes that specify and maintain cellular identity. Here, we identify a strong transcriptional congruence between the ureteric epithelium and the distal nephron, validating the expression of GATA3 and HOXB7 (Hoxb7GFP) in the murine distal nephron, two markers that were previously thought to be specific to the ureteric epithelium. Although expression of ureteric epithelium markers such as Calb1 have been documented in the distal nephron before (Georgas et al., 2008), the extent of the similarities between these cell types has not been fully appreciated. Emerging scRNA-seq studies of human fetal kidney identify GATA3, KRT8, KRT18, KRT19, WFDC2 and CDH16 as expressed in human distal nephron and ureteric epithelium clusters (Wang et al., 2018). More definitive ureteric epithelium markers, such as RET and WNT11 (when co-expressed with GATA3), were not detected despite the fact that these genes are known to be expressed in human kidneys (Rutledge et al., 2017). We have previously described the formation of ureteric epithelium within kidney organoids based upon co-staining for PAX2+ ECAD+ GATA3+ KRT8+ and DBA+ (Takasato et al., 2015). Indeed, recent lineage tracing experiments within such kidney organoids confirmed nephron epithelium as arising from SIX2-expressing cells, but not this presumptive GATA3+ ureteric epithelium (Howden et al., 2019). In contrast, Taguchi et al. propose that the ureteric epithelium is derived from anterior intermediate mesoderm and should not arise simultaneously with the metanephric mesenchyme (Taguchi et al., 2014). We now show that the markers that were previously used to define ureteric epithelium in our kidney organoids are not specific to ureteric epithelium. Although this leaves the identity of this epithelium undefined, it provides the field with specific ureteric epithelium markers with which to improve protocols.
In summary, this study provides the most comprehensive reference of cell type-specific expression within the developing kidney to date, associating known and new signalling molecules and pathways with specific cell types. As such, this data represents a roadmap with which to improve in vitro models of the developing kidney.
MATERIALS AND METHODS
Mouse strains and embryo staging
In mouse experiments, noon of the day on which the mating plug was observed was designated E0.5. C57Bl/6 mice were used for the E18.5 embryonic kidney analysis. E14.5 Six2GCE mice were used for the sorted nephron progenitor cell analysis. Sample gender was not determined before analysis. Mouse lines used were: Six2TGC [Tg(Six2-EGFP/cre)1Amc, Jackson Laboratory, 009606]; Six2GCE [Six2tm3(EGFP/cre/ERT2), Jackson Laboratory, 009600] (Kobayashi et al., 2008); PdgfraMerCreMer (CDB0674K, RIKEN Center for Life Science Technologies) (Ding et al., 2013); LSLTdTomato [Gt(ROSA)26Sortm9(CAG-tdTomato), Jackson Laboratory, 007909] (Madisen et al., 2010); Hoxb7GFP [Tg(Hoxb7-EGFP)33Cos, Jackson Laboratory, 016251] (Srinivas et al., 1999); and PdgfraGFP [Pdgfratm11(EGFP)Sor, Jackson Laboratory, 007669] (Hamilton et al., 2003). All animal experiments were approved by the Murdoch Children's Research Institute Animal Ethics Committees and conducted under Australian guidelines for the care and use of animals for scientific purposes.
Immunofluorescence and microscopy
E18.5 embryonic kidneys were fixed in 4% paraformaldehyde (PFA) for 20 min, washed in PBS and cleared using the PACT method (Yang et al., 2014) to preserve tdTomato or GFP fluorescence. Cleared samples were stained using rabbit anti-SIX2 (1:600, Proteintech, 11562-1-AP), goat anti-GATA3 (1:600, R&D Systems, AF2605) or mouse anti-cytokeratin (1:300, Abcam, ab115959) and Alexa Fluor 488- and/or 647-labelled secondary antibodies (1:600, Thermo Fisher Scientific) (antibodies previously used in Combes et al., 2018 and Combes et al., 2019). Samples were blocked in PBST (PBS+0.1% Triton-X) with 10% normal donkey serum and incubated at room temperature with each antibody solution for at least 48 h, followed by washing for 24 h in PBST. Nuclei were stained using Draq5 (Abcam). Samples were mounted in RIMS (88% Histodenz) and imaged using an Andor Dragonfly spinning disk system with a 40 µm pinhole disk and Nikon 1.15NA 40× water-immersion objective. Images were processed using Fiji (Schindelin et al., 2012).
Single cell sample prep and sequencing
Mouse kidneys were dissected into ice-cold PBS then digested over 15 min at 37°C in Accutase (A1110501, Life Technologies), with manual dissociation via pipetting through a P1000 tip every 5 min. Following dissociation, cells were passed through a 30 µm filter and stored on ice in 50% PBS, 50% DMEM with 5% fetal calf serum (FCS). Three pairs of E18.5 mouse kidneys were run in parallel on a 10x Chromium Single Cell Chip (10x Genomics). Kidneys from multiple litters of Six2GFP+ (Six2GCE, Kobayashi et al., 2008) mouse embryos at E14.5 were pooled into three replicate tubes and dissociated in parallel using the same protocol. Six2GFP+ cells were isolated using gates for Six2GFP fluorescence, propidium iodide to exclude dead cells, and size to exclude cell debris and doublets. Isolated Six2GFP+ cells were collected and stored on ice in 50% PBS, 50% DMEM with 5% FCS, and then run in parallel on a 10x chip. Libraries were prepared using Chromium Single Cell Library kit V2 (10x Genomics) and sequenced on an Illumina HiSeq using 100 bp paired-end sequencing.
Single cell data analysis
For the whole mouse kidney samples, raw sequencing data was processed using Cell Ranger (v1.3.1, 10x Genomics) to produce gene-level counts for each cell in each sample, which were aggregated to form a single matrix of raw counts for 6752 cells. All subsequent analysis was performed in the R statistical programming language. Cells with >95% of genes with zero assigned reads were removed, leaving 5639 cells for further analysis. Genes with zero counts in more than 5589 cells (assuming a minimum cluster size of 50 cells), mitochondrial and ribosomal genes, and genes without annotation were also filtered out. The final dataset used for analysis consisted of 5639 cells and 13,116 genes. The Seurat package (v2.0.1) (Macosko et al., 2015; Satija et al., 2015) was used to normalise data, regressing out factors related to biological replicate and cell cycle. For clustering, 1962 highly variable genes were selected and the first 30 principal components based on those genes used to build a graph, which was segmented with a resolution of 0.8. This identified 16 clusters across the 5639 cells. We obtained lists of DE genes for each cluster by testing for genes that had an absolute log fold change >1 between cells in each cluster compared with the remaining cells using the glmTreat method in the edgeR package (Robinson et al., 2010). To identify corresponding cell types we focussed on genes that were significantly upregulated in each cluster. In addition, we used pathway analysis to aid our interpretation, including GO and KEGG analysis, which was performed using limma (Ritchie et al., 2015), as well as pathway analysis using the ToppGene suite (Chen et al., 2009). Trajectory analysis of the various lineages was performed using Monocle (v2.4.0) (Qiu et al., 2017; Trapnell et al., 2014).
For the sorted cap mesenchyme data, raw sequencing data was processed using Cell Ranger as above. The 7853 cells all had <95% of genes with zero assigned reads. Cells with low diversity were removed, leaving 7844 cells for further analysis. Gene filtering proceeded as described above, leaving 12,344 genes for further analysis. To identify clusters within the nephron progenitor population, we performed an integrated analysis of the cells from the sorted cap mesenchyme and the nephron progenitor populations identified in the whole mouse kidney dataset, represented by clusters 0, 4, 6 and 10 in the nephron lineage. This was carried out using the alignment technique in the Seurat package. For both datasets, biological replicate, cell cycle and the total UMI counts were regressed out using the ScaleData function in Seurat. The two datasets were merged using canonical correlation analysis on 2187 highly variable genes and 20 canonical correlation vectors. Ten clusters were identified using 20 canonical correlation vectors and the resolution parameter set to 0.6. Marker genes for the 10 clusters were defined using Wilcoxon rank sum tests in the Seurat package. Five of the 10 clusters showed strong cell cycle-related expression patterns (clusters 1, 2, 5, 6 and 9), whereas cluster 8 had high immune cell markers. Clusters 0, 3, 4 and 7 corresponded to clusters 0, 6, 4 and 10, respectively, in the nephron lineage reclustering of the whole mouse kidney dataset, hence validating these clusters in a much larger dataset. Focusing on these four clusters, differential expression analysis with edgeR and glmTreat (fold-change threshold of 20%) was performed, further refining the marker gene lists for these populations. Trajectory analysis of the cells in these four clusters was performed using Monocle (Qiu et al., 2017), which identified three states.
Ligand-receptor interaction analysis was performed according to the approach described previously (Farbehi et al., 2019). Briefly, a weighted directed graph was built linking ‘source’ cell types, defined by expression of a ligand, to ‘target’ cell types expressing a corresponding receptor, after reference to a curated map of human ligand-receptor pairs (Ramilowski et al., 2015). Source-ligand and receptor-target edges were weighted according to expression fold change in ligands and receptors, respectively. Ligand-receptor edges were weighted according to mouse-specific protein-protein association scores from STRING (Szklarczyk et al., 2017). Significant cell-cell connections were determined by network permutation testing (100,000 permutations, Padj<0.01).
We ran two doublet detection algorithms available in the scran Bioconductor package (Lun et al., 2016) on the whole mouse kidney dataset as the nephron progenitor-stromal cluster (cluster 14) proved difficult to validate with subsequent experiments. First we ran the doubletCluster function in scran, which aims to identify clusters that have intermediate expression profiles of two other clusters (Bach et al., 2017). Every possible trio of clusters (the query cluster and its two ‘parents’) were examined, and a number of statistics computed providing support for the cluster arising from doublet cells. This analysis ranked cluster 14 as the most likely to contain doublets, with the parent clusters identified as clusters 4 (medullary stroma) and 12 (pretubular aggregate). Cluster 14 had very few unique marker genes (n=11), had cells with much larger library sizes compared with cells in clusters 12 and 4, and the proportion of cells belonging to cluster 14 was low (1.4%), providing further evidence for doublets. In addition we also ran the doubletCells function, which simulates doublets from the single cell expression profiles (Dahlin et al., 2018). Thousands of doublet cells are simulated by adding together two randomly chosen single cell profiles, ignoring clustering information. Each original cell is then compared with the simulated doublets, as well as the observed cells, and a doublet score is computed for each cell. High scores indicate a greater likelihood that the cells are doublets. Once more, cluster 14 was flagged as comprising of doublet cells, as the majority of cells had high doublet scores.
The authors thank Dr Hiroshi Kataoka for access to the PdgfraCreER line. Single cell sequencing was performed at the Australian Genome Research Facility Genomics Innovation Hub with the assistance of J. Jabbari. Microscopy was performed at the Murdoch Children's Research Institute.
Conceptualization: A.N.C.; Methodology: B.P., K.T.L., R.P., A.N.C.; Software: B.P., R.P., L.Z.; Validation: A.N.C., K.T.L.; Formal analysis: B.P.; Investigation: K.T.L., A.N.C., A.D., B.P.; Resources: A.D., R.P.H., R.P.; Data curation: B.P., L.Z.; Writing - original draft: A.N.C., B.P.; Writing - review & editing: A.N.C., B.P., K.T.L., A.D., R.P., R.P.H, A.O., M.H.L.; Supervision: A.N.C., R.P.H., A.O., M.H.L.; Funding acquisition: A.N.C., M.H.L.
This work was supported by seed funding from the Murdoch Children's Research Institute (which is supported by the State Government of Victoria’s Operational Infrastructure Support Program) and the University of Melbourne, and grants from the Australian Research Council (DE150100652 to A.N.C.) and the National Health and Medical Research Council (NHMRC) (GNT1156567 to A.N.C., A.O., M.H.L.). M.H.L. is a Senior Principal Research Fellow of the NHMRC (GNT1136085). A.O. is a Career Development Fellow of the NHMRC (APP1126157). L.Z. is supported by a Department of Education and Training, Australian Government, Research Training Program (RTP) Scholarship.
The authors declare no competing or financial interests.