Much attention has focussed on the conversion of human pluripotent stem cells (PSCs) to a more naïve developmental status. Here we provide a method for resetting via transient histone deacetylase inhibition. The protocol is effective across multiple PSC lines and can proceed without karyotype change. Reset cells can be expanded without feeders with a doubling time of around 24 h. WNT inhibition stabilises the resetting process. The transcriptome of reset cells diverges markedly from that of primed PSCs and shares features with human inner cell mass (ICM). Reset cells activate expression of primate-specific transposable elements. DNA methylation is globally reduced to a level equivalent to that in the ICM and is non-random, with gain of methylation at specific loci. Methylation imprints are mostly lost, however. Reset cells can be re-primed to undergo tri-lineage differentiation and germline specification. In female reset cells, appearance of biallelic X-linked gene transcription indicates reactivation of the silenced X chromosome. On reconversion to primed status, XIST-induced silencing restores monoallelic gene expression. The facile and robust conversion routine with accompanying data resources will enable widespread utilisation, interrogation, and refinement of candidate naïve cells.
Studies of the early mouse embryo and of derivative stem cell cultures have led to the proposition that pluripotency proceeds through at least two phases: naïve and primed (Hackett and Surani, 2014; Kalkan and Smith, 2014; Nichols and Smith, 2009, 2012; Rossant and Tam, 2017). Recent reports provide evidence that the naïve phase of pluripotency characterised in rodent embryos may be present in a similar form in the early epiblast of primate embryos, albeit with some species-specific features (Boroviak et al., 2015; Nakamura et al., 2016; Reik and Kelsey, 2014; Roode et al., 2012; Takashima et al., 2014). However, mouse embryonic stem cells (ESCs) correspond to naïve pre-implantation epiblast (Boroviak et al., 2014, 2015), whereas human pluripotent stem cell (hPSC) cultures (Takahashi et al., 2007; Thomson et al., 1998; Yu et al., 2007) seem to approximate primitive streak stage epiblast (Davidson et al., 2015; Irie et al., 2015; Wu et al., 2015; Nakamura et al., 2016). In general, hPSCs more closely resemble mouse post-implantation epiblast-derived stem cells (EpiSCs) (Brons et al., 2007; Tesar et al., 2007) than ESCs. Consequently, they are considered to occupy the primed phase of pluripotency.
Mouse ESCs can be propagated as highly uniform populations that exhibit consistent and unbiased multi-lineage differentiation in vitro and in chimaeras (Martello and Smith, 2014; Wray et al., 2010; Ying et al., 2008). These attributes contrast favourably with the heterogeneity and variable differentiation propensities of primed hPSCs (Butcher et al., 2016; Nishizawa et al., 2016) and have provoked efforts to determine conditions that will support a human naïve condition (De Los Angeles et al., 2012). Early studies lacked stringent criteria for demonstrating a pluripotent identity with comprehensive resemblance to both rodent ESCs and naïve cells in the human embryo (Davidson et al., 2015; Huang et al., 2014). However, two culture conditions have now been described for sustaining reset hPSC phenotypes that exhibit a wide range of both global and specific properties expected for naïve pluripotency (Takashima et al., 2014; Theunissen et al., 2016, 2014). Furthermore, candidate naïve hPSCs can be derived directly from dissociated human inner cell mass (ICM) cells (Guo et al., 2016). These developments support the contention that the core principle of naïve pluripotency may be conserved between rodents and primates (Nakamura et al., 2016; Nichols and Smith, 2012; Smith, 2017). Nonetheless, current techniques for resetting conventional primed hPSCs to a more naïve state raise issues concerning employment of transgenes, universality, genetic integrity, and ease of use. Here, we address these challenges and provide a simple protocol for consistent resetting to a stable and well-characterised candidate naïve phenotype.
Transient histone deacetylase inhibition resets human pluripotency
To monitor pluripotent status we exploited the piggyBac (PB) EOS-C(3+)-GFP/puroR reporter (EOS) as previously described (Takashima et al., 2014). Expression of this reporter is directed by mouse regulatory elements that are active in undifferentiated ESCs: a trimer of the CR4 element from the Oct4 (Pou5f1) distal enhancer coupled with the early transposon (Etn) long terminal repeat promoter (Hotta et al., 2009). We observed that conventional human ESCs (hESCs) stably transfected with the piggyBac construct and maintained in KSR/FGF on feeders quickly lost visible EOS-GFP, although expression remained detectable by flow cytometry (Fig. S1A,B). Expression was further diminished when cells were transferred into 2iLIF (two inhibitors – the MEK inhibitor PD and the GSK3 inhibitor CH – with the cytokine leukaemia inhibitory factor LIF; see Materials and Methods) or MEK inhibitor plus LIF (PDLIF) culture (Fig. S1C). By contrast, the PB-EOS reporter is upregulated during transgene-induced resetting and visible expression is maintained in naïve-like cells (Takashima et al., 2014). These observations suggested that PB-EOS might be subject to reversible epigenetic silencing in primed hPSCs.
Histone deacetylase (HDAC) inhibitors are global epigenetic destabilisers that have been used to facilitate nuclear transfer (Ogura et al., 2013), somatic cell reprogramming (Huangfu et al., 2008) and mouse EpiSC resetting (Ware et al., 2009). We investigated whether exposure to HDAC inhibitors would promote conversion of human primed cells to a naïve state. We applied valproic acid (VPA) or sodium butyrate to Shef6 hESCs carrying the PB-EOS reporter (S6EOS cells). When cells were treated for 3 days in E6 medium supplemented with PDLIF, then exchanged to t2iLGö naïve cell maintenance medium, the EOS reporter was upregulated (Fig. 1A,B). Bright GFP-positive colonies with dome-shaped morphology emerged over several days. We varied the culture parameters and empirically determined conditions that consistently yielded EOS expression in compact spheroid colonies (Fig. 1A-C). We tested the method on H9EOS reporter cells and found that they similarly acquired bright GFP expression and formed dome-shaped colonies (Fig. S1D).
We monitored the expression of OCT4, NANOG and the primate naïve marker KLF17 (Guo et al., 2016) during resetting of S6EOS cells. RT-qPCR analysis (Fig. 1D) shows that both OCT4 and NANOG expression decrease without HDAC inhibitor treatment, consistent with differentiation in PDLIF. By contrast, in HDAC inhibitor-treated cells, OCT4 mRNA levels show a transient increase on day 3 then remain at a similar level to that in primed cells, whereas NANOG transcripts increase ∼2-fold over the first 9 days. KLF17 transcripts are not detected in conventional hESCs, but become appreciable from day 7 onwards during resetting. KLF17 protein became apparent in some cells by immunofluorescence staining from as early as day 3 of resetting (Fig. 1E).
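The relative transcript changes reported by RT-qPCR can be illustrated with the widely used 2^-ΔΔCt calculation. The sketch below is an assumption for illustration only: the quantification method, the reference gene and every Ct value are hypothetical and are not taken from this study.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    # Normalise the target gene to the reference gene within each condition
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control (e.g. primed) sample
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: a target gene in resetting cells versus primed cells,
# each normalised to a hypothetical housekeeping reference gene.
example_fc = fold_change(
    ct_target_sample=23.0, ct_ref_sample=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(f"Fold change relative to control: {example_fc:.1f}")
```

A one-cycle drop in normalised Ct corresponds to a doubling of transcript abundance, which is the scale on which an approximately 2-fold increase would be read.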
Cultures were dissociated with TrypLE after 9 days of resetting and replated in naïve culture medium, t2iLGö. Some differentiation and cell death were evident, and a few passages were required before the EOS-positive population became stable and predominant (Fig. 1F, Fig. S1E,F). From passage 5 onwards the reset phenotype was robust and could thereafter be expanded reliably.
The ability to enrich the naïve phenotype after resetting by bulk passaging in t2iLGö suggested that a reporter should be dispensable, facilitating general applicability. We therefore tested resetting without the EOS transgene on a panel of primed human ESCs and induced pluripotent stem cells (iPSCs). Stable cultures of compact colonies displaying naïve marker gene expression were established consistently (Table 1, Fig. 1G). These cell lines are denoted by the designation cR (chemically reset). Resetting efficiency varied between lines and according to initial culture status. In general, however, a single well of a 6-well plate of primed PSCs was sufficient for initial generation of multiple colonies and subsequent establishment of stable naïve cultures by passage 5. Rho-associated kinase (ROCK) inhibitor was used during resetting and initial expansion in most experiments, but was usually omitted during subsequent propagation. Together with NANOG, reset cells expressed the naïve transcription factor proteins KLF4 and TFCP2L1, which are present in the human ICM (Takashima et al., 2014) but undetectable in primed PSCs (Fig. 1H).
Feeder-free expansion of reset cells
As noted previously (Takashima et al., 2014), reset cells can be cultured on pre-coated plates without feeders. However, morphology was heterogeneous, with more differentiation and cell death than on feeders. We varied conditions and found that provision of growth factor-reduced Geltrex with the culture medium at the time of plating was more effective than pre-coating (Fig. 2A). Geltrex or laminin applied in this manner supported continuous propagation in t2iLGö of both embryo-derived HNES and chemically reset cells, with robust expression of naïve pluripotency factors (Fig. 2B-D). Moreover, aberrant expression of some mesoendodermal genes was reduced in feeder-free conditions (Fig. 2E).
In the absence of feeders we found that some reset cell lines expanded more robustly in very low (0.3 µM) or even no CH (Fig. 2F). This is in line with observations that GSK3 inhibition is optional in the alternative 5i/L/A naïve culture system (Theunissen et al., 2016). We subsequently adopted 0.3 µM CH for standard culture. Naïve cell maintenance medium with 0.3 µM CH is termed tt2iLGö. Reset cultures in Geltrex and tt2iLGö displayed homogeneous morphology and expanded continuously with a doubling time of ∼24 h (Fig. 2G,H).
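A population doubling time of this kind is typically estimated from cell counts at two time points, assuming exponential growth. The following is a minimal sketch of that arithmetic; the function name and the cell counts are hypothetical and do not come from the data in this study.

```python
import math


def doubling_time(n_start, n_end, hours):
    """Doubling time in hours, assuming exponential growth between two counts."""
    if n_start <= 0 or n_end <= n_start:
        raise ValueError("counts must be positive and increasing")
    # N(t) = N0 * 2**(t / Td)  =>  Td = t * ln(2) / ln(N(t) / N0)
    return hours * math.log(2) / math.log(n_end / n_start)


# Hypothetical example: 1e5 cells plated, 8e5 cells counted 72 h later
td = doubling_time(1e5, 8e5, 72)
print(f"Estimated doubling time: {td:.1f} h")
```

In this illustrative case an eight-fold expansion over 72 h corresponds to three doublings, i.e. about 24 h per doubling.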
We also observed that omitting CH entirely for the first 10 days of resetting increased the yield of EOS-positive cells. We therefore implemented a revised resetting routine, omitting CH initially then exchanging into tt2iLGö on feeders before transfer to Geltrex culture. PSCs reset in these conditions showed consistent feeder-free expansion, with typical naïve morphology, growth and marker profiles that were indistinguishable from cells reset in the presence of CH (Fig. 2I).
WNT inhibition stabilises resetting
As noted above, EOS-GFP-positive and KLF17-immunopositive colonies emerged within 10 days of VPA treatment (Fig. 1E). However, differentiation and cell death are ongoing for several passages and during this period we observed that the reset phenotype could not be sustained without feeders. Thus, the resetting process appears incomplete and vulnerable at early stages. We also noted a requirement for a stabilisation period following doxycycline (DOX) withdrawal during transgene-mediated resetting (Takashima et al., 2014). We used H9-NK2 cells, with DOX-dependent expression of NANOG and KLF2, to explore conditions that might stabilise resetting. We tested two candidates: the amino acid L-proline and the tankyrase inhibitor XAV939 (XAV). L-proline is reported to be produced by feeders and to alleviate nutrient stress in mouse ESCs (D'Aniello et al., 2015). XAV inhibits canonical Wnt signalling (Huang et al., 2009) and has previously been reported to facilitate the propagation of pluripotent cells in alternative states (Kim et al., 2013; Zimmerlin et al., 2016). We withdrew DOX from H9-NK2 cells and applied either L-proline (1 mM) or XAV (2 µM) in combination with t2iLGö. We assessed colony formation on feeders after the first and second passages. We saw no pronounced effect of L-proline. By contrast, addition of XAV resulted in more robust production of uniform domed colonies (Fig. 3A). RT-qPCR analysis substantiated the presence of naïve pluripotency markers in XAV-supplemented cultures and also highlighted reduced levels of lineage-affiliated markers such as brachyury (T) and GATA factors (Fig. 3B).
We investigated whether WNT inhibition would stabilise emergent cR cells. In addition to the tankyrase inhibitor XAV, we tested an orthogonal WNT pathway inhibitor, IWP2, which acts to prevent the production of functional WNT protein (Chen et al., 2009). XAV or IWP2 were added following VPA treatment on day 3 of resetting H9EOS and S6EOS cells (Fig. 3C). For both inhibitors we observed reduced numbers of differentiating or dying cells and a substantial increase in the frequency of EOS-GFP-positive cells by day 9, which increased further on passaging into tt2iLGö on MEFs (Fig. 3D, Fig. S2A). After the second passage the majority of colonies displayed domed morphology and readily visible GFP (Fig. 3E). WNT inhibitor-treated H9EOS cultures at passage 2 expressed higher levels of naïve markers and lower GATA6 and GATA3 than parallel cultures reset without WNT inhibition (Fig. 3F). Similarly, S6EOS cells reset using XAV or IWP2 progressed to stable reset cultures expressing naïve markers and minimal levels of brachyury, CDX2 and GATA6 (Fig. S2B). From passage 3, we transferred XAV-treated cells to feeder-free culture in tt2iLGö and Geltrex without XAV. Marker analysis by RT-qPCR confirmed maintained expression of signature naïve pluripotency factors after four passages at similar levels to those in reset cells generated without the use of WNT inhibitors (Fig. 3G).
We also assessed whether vitamin C was required for resetting. For the 3 day period of exposure to VPA we replaced E6 medium, which contains vitamin C, with N2B27 medium with or without addition of vitamin C. Resetting was continued in the presence of XAV as above. After two passages we observed comparable upregulation of EOS-GFP and similar expression of naïve markers with or without exposure to vitamin C (Fig. S2C,D).
Collectively, these findings establish that, following VPA treatment, WNT inhibition can improve the rate and efficiency of conversion to a stable naïve phenotype that can subsequently be propagated robustly in tt2iLGö with or without feeders or ongoing WNT inhibition. The results also indicate that vitamin C supplementation is not required for resetting. Full details of the protocol and cell lines reset are provided in the supplemental Materials and Methods and Table S1.
Global transcriptome profiling
We obtained transcriptome data by RNA sequencing (RNA-seq) of replicate samples of reset cells generated by VPA treatment. We also sequenced the embryo-derived naïve stem cell line HNES1 (Guo et al., 2016) and a parallel culture of HNES1 cells that had been 'primed' by transfer into KSR/FGF for more than ten passages. We added to the analysis published data (see Materials and Methods) from cells reset with inducible transgenes (Takashima et al., 2014), HNES cells cultured in the presence of vitamin C and ROCK inhibitor (Guo et al., 2016), naïve-like cells in 5i/L/A (Ji et al., 2016) and a variety of conventional PSCs from publicly available resources and our own studies. We applied two complementary dimensionality reduction techniques: principal component analysis (PCA), which identifies and ranks the axes of maximum variation in the underlying dataset, and t-distributed stochastic neighbour embedding (t-SNE), a probabilistic method that minimises the divergence between the distributions of pairwise similarities among data points in the original and the embedded space. Both analyses of global transcriptomes unambiguously discriminate naïve/reset samples from primed PSCs (Fig. 4A,B). In each analysis, cR cells cluster closely together with HNES1 cells that were cultured in parallel. Sample replicates are intermingled despite being from cell lines of disparate provenance and culture history. Feeder-free cultures form a slightly distinct cluster within the naïve grouping. Consistent with previous analyses (Huang et al., 2014; Irie et al., 2015; Nakamura et al., 2016; Takashima et al., 2014; Theunissen et al., 2016), two independent RNA-seq datasets for purported naïve cells cultured in 4i (NHSM) conditions (Gafni et al., 2013; Irie et al., 2015; Sperber et al., 2015) cluster with conventional primed PSCs by both PCA and t-SNE, as do cultures in 'extended pluripotency' media (Yang et al., 2017). For both naïve and primed cells, PCA component 2 appears sensitive to differences in growth conditions and/or batch effects and to capture variation between laboratories and cell lines.
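To make the PCA step concrete, the sketch below extracts the first principal component of a small samples-by-genes matrix using power iteration on the covariance matrix. It is an illustration under stated assumptions rather than the pipeline used here: the expression values, sample labels and function name are hypothetical, only the first component is computed, and t-SNE is not shown.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for PC1 of a samples-by-genes matrix."""
    n, p = len(rows), len(rows[0])
    # Centre each gene (column) on its mean across samples
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix between genes (p x p)
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges on the eigenvector with the largest eigenvalue
    vec = [1.0] * p
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Project each centred sample onto the PC1 axis
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centred]
    return vec, scores


# Hypothetical log-expression values for three genes in four samples
samples = ["naive_1", "naive_2", "primed_1", "primed_2"]
expression = [
    [8.0, 7.0, 1.0],
    [9.0, 8.0, 2.0],
    [1.0, 2.0, 9.0],
    [2.0, 1.0, 8.0],
]
loadings, pc1_scores = first_principal_component(expression)
for name, score in zip(samples, pc1_scores):
    print(f"{name}: PC1 = {score:+.2f}")
```

In this toy matrix the two hypothetical naïve samples and the two hypothetical primed samples fall on opposite sides of PC1, which is the kind of separation summarised by the first component in a PCA plot.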
Gene Ontology (GO) analysis of genes contributing to PCA component 1 shows significant enrichment of functional categories primarily associated with extracellular matrix, development and differentiation (Table S2), reflecting distinct identities associated with naïve and primed cells. We also noted upregulation of multiple genes associated with mitochondria and oxidative phosphorylation in reset cells cultured on laminin and on feeders (Fig. S3A-C), consistent with metabolic reprogramming between primed and naïve pluripotency (Takashima et al., 2014; Zhou et al., 2012). Overall, cR cells share global gene expression features with ICM-derived HNES cells and transgene-reset PSCs and are distinct from various primed PSCs. Genes highly upregulated in naïve conditions relative to conventional PSCs are highlighted in Fig. 4C.
We inspected the expression of transposable elements (TEs) – the transposcriptome (Friedli and Trono, 2015). A number of TEs are known to be transcriptionally active in early embryos and PSCs, potentially with functional significance. PCA of TE expression separated cR and HNES cells from primed PSCs (Fig. S3D,E). Notably, HERVK, SINE-VNTR-Alu (SVA) and LTR5_Hs elements were upregulated in naïve cultures (Fig. 4D). Inspection of KRAB-ZNFs, potential regulators of TE expression, revealed that many are significantly upregulated in reset cells (Fig. S3F). These include ZNF229 and ZNF534, which represses HERVH elements (Theunissen et al., 2016), ZNF98 and ZNF99, which are also upregulated during epigenetic resetting of germ cells (Tang et al., 2015), and ZFP57, which protects imprints in the mouse (Quenneville et al., 2011).
We compared relative transcript levels for a panel of pluripotency markers between cR cells and human pre-implantation embryos. For the embryo data we used published single-cell RNA-seq (Blakeley et al., 2015; Yan et al., 2013). Normalised expression was consistent between reset cells and the epiblast, more so than with earlier stage embryonic cells (Fig. 4E). Primed PSCs exhibited no or low expression of several of these key markers. A set of genes upregulated in reset cells were also expressed in the human ICM and epiblast, and their expression was low or absent in various conventional and alternative primed PSC cultures (Fig. 4E, Fig. S4). These genes encode transcription factors, epigenetic regulators, metabolic components and surface proteins, and provide several candidate markers of human naïve pluripotency. In addition, we inspected recently published transcriptome data from cynomolgus monkey embryos (Nakamura et al., 2016). Analysis of the most differentially expressed genes between reset and primed PSCs separated the cynomolgus samples into two clusters (Fig. S5). Notably, reset cells share features with the pre-implantation epiblast, whereas primed PSCs are more similar to pre-streak and gastrulating epiblast.
Global DNA hypomethylation is a distinctive characteristic of mouse and human ICM cells (Guo et al., 2014; Lee et al., 2014; Smith et al., 2012) that is manifest in candidate naïve hPSCs (Takashima et al., 2014; Theunissen et al., 2016). We performed whole-genome bisulfite sequencing (BS-seq) on primed S6EOS and on reset S6EOS and H9EOS cultures derived from independent experiments with or without addition of XAV. Methylation profiles were compared with previous datasets for primed PSCs, human ICM cells (Guo et al., 2014), transgene reset PSCs (H9-NK2; Takashima et al., 2014) and HNES1 cells (Guo et al., 2016). Primed PSCs show uniformly high levels of DNA methylation (85-95%), whereas reset cells display globally reduced CpG methylation, comparable to ICM and with a similar relatively broad distribution (Fig. 5A). Hypomethylation extended over all genomic elements (Fig. S6B) and was lower in cells that had been through more than ten passages in t2iLGö. Loss of methylation from primed to reset conditions was not uniform across the whole genome, however. Highly methylated (80-100% methyl-CpG) regions in primed cells showed divergent demethylation to between 15% and 65% methyl-CpG (Fig. 5A,B, Fig. S6C). The majority of promoters were methylated at low levels in both primed and reset S6EOS cells (Fig. 5C), including most CpG island (CGI)-containing promoters. Among methylated promoters in primed PSCs, many showed decreased methylation in reset cells in line with the global trend. However, we also identified a number of CGI and non-CGI promoters that gained methylation upon resetting (highlighted in red in Fig. 5C; >40% CpG methylation difference between primed and averaged reset cells). GO analysis of the genes associated with this group of promoters indicated enrichment for terms related to differentiation, development and morphogenesis (Fig. S6D). Transgene reset and HNES1 cells also showed significantly higher promoter methylation levels at these loci than their primed counterparts (Fig. 5D), suggesting that selective promoter methylation is a feature of naïve-like cells in t2iLGö. By contrast, we observed that many, although not all, imprinted differentially methylated regions (DMRs) are demethylated in reset conditions (Fig. 5E), in line with previous findings (Pastor et al., 2016).
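The promoter selection described above (a gain of more than 40 percentage points of CpG methylation in averaged reset samples relative to primed cells) can be sketched as a simple filter. The example is a hedged illustration: the promoter names, read counts and methylation values are hypothetical, and the function names are not taken from any published pipeline.

```python
def percent_methylation(methylated_reads, total_reads):
    """Percentage of bisulfite reads reporting a methylated cytosine."""
    return 100.0 * methylated_reads / total_reads


def promoters_gaining_methylation(primed, reset_replicates, threshold=40.0):
    """Promoters whose mean reset methylation exceeds primed by more than threshold."""
    gained = []
    for promoter, primed_pct in primed.items():
        # Average the reset replicates before comparing with the primed sample
        reset_mean = sum(rep[promoter] for rep in reset_replicates) / len(reset_replicates)
        if reset_mean - primed_pct > threshold:
            gained.append(promoter)
    return sorted(gained)


# Hypothetical promoter-level methylation percentages
primed_pct = {"promoter_A": 10.0, "promoter_B": 90.0, "promoter_C": 5.0}
reset_pct = [
    {"promoter_A": 60.0, "promoter_B": 30.0, "promoter_C": 20.0},
    {"promoter_A": 70.0, "promoter_B": 40.0, "promoter_C": 30.0},
]
gained = promoters_gaining_methylation(primed_pct, reset_pct)
print(gained)
```

Here only the hypothetical promoter_A passes the filter; promoter_B follows the global trend of demethylation and promoter_C gains too little methylation to be flagged.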
The correlation between gene expression and promoter methylation (Fig. 5F, Fig. S6E) is very weak overall, as previously noted in mouse ESCs (Ficz et al., 2013; Habibi et al., 2013). Nonetheless, some genes that are highly upregulated in reset cells and potentially functionally significant, such as KLF17, DNMT3L and ZNF534, show striking reductions in promoter methylation. Conversely, although TEs in general obeyed the genome-wide trend of hypomethylation in reset cells, substantial subsets of the HERVH and LTR7 TE families gained methylation and most of these showed reduced expression or were silenced (Fig. 5G). Finally, we noted demethylation of the piggyBac repeat sequences in cR-S6EOS cells (Fig. S6F), consistent with the proposition that the transgene is subject to epigenetic repression in primed cells that is relieved by resetting.
A major concern with manipulation of PSC culture conditions is the potential for selection of genetic variants (Amps et al., 2011). Indeed, it has previously been noted that naïve-like cells cultured in the 5i/L/A formulation are prone to aneuploidy (Pastor et al., 2016; Sahakyan et al., 2017; Theunissen et al., 2014). We therefore carried out metaphase chromosome analyses by G-banding on a selection of cR cells (Fig. S2E). The results presented in Table 1 show retention of a diploid karyotype in most cases, although in some cultures minor subpopulations of aneuploid cells are present. These data indicate that the epigenetic resetting process does not induce major chromosomal instability nor select for pre-existing variants, in line with previous observations that cultures in t2iLGö can maintain a diploid karyotype (Guo et al., 2016; Takashima et al., 2014). However, we noticed a variable incidence of tetraploid cells during expansion and one line showed a ubiquitous gain of chr19q13 after extended culture (40 passages). cR and HNES1 cells could also maintain a diploid karyotype over multiple passages in Geltrex or laminin, although abnormalities emerged in some cultures (Table 1). We also examined the transcriptome data by variant analysis for mutations in TP53 that have been detected recurrently in primed PSCs (Merkle et al., 2017). None of the loss-of-function TP53 mutations identified was found in cR cells.
To assess the multi-lineage potential of cR cells we first used embryoid body differentiation. After 3 days of floating culture in t2iL without Gö, aggregates were transferred to Geltrex-coated dishes and differentiated as outgrowths in serum. Alternatively, reset cells were transferred into E8 medium for 6 days then aggregated in serum for 3 days before outgrowth. RT-qPCR on 8 day outgrowths showed upregulation in both conditions of markers of early neuroectoderm, mesoderm and endoderm specification (Fig. S7A). Induction of these markers was lower for reset cells taken directly from t2iLGö than for cells conditioned in E8 (Fig. S7A), whereas downregulation of pluripotency markers was similar in both cell types. Immunostaining evidenced expression of protein markers of mesoderm and endoderm differentiation (Fig. S7B) and, at lower frequency, of neuron-specific β-tubulin.
We then evaluated directed lineage commitment in adherent culture. Unsurprisingly, cR cells taken directly from t2iLGö did not respond directly to definitive endoderm or neuroectoderm induction protocols (Chambers et al., 2009; Loh et al., 2014) developed for primed PSCs (Fig. S7C). After prior transfer into N2B27 for 3 days, a CXCR4/SOX17-positive, PDGFRα-negative population, indicative of definitive endoderm, could be obtained (Fig. S7D) but neural marker induction in response to dual SMAD inhibition remained low. We therefore converted cR cells into a conventional primed PSC state by culture in E8 medium on Geltrex for several passages (Fig. S7E). We then applied the protocols for germ layer specification from primed cells to three different 're-primed' cultures. We observed robust expression of lineage markers for endoderm, lateral plate mesoderm and neuroectoderm by RT-qPCR (Fig. 6A). Immunostaining for SOX17 and FOXA2, and for SOX1 and PAX6, validated the widespread generation of endoderm or neuroectoderm, respectively (Fig. 6B). Flow cytometric analysis quantified efficient induction of all three lineages (Fig. 6C, Fig. S7F). We examined further neuronal differentiation. After 29 days we detected expression of neuronal markers by RT-qPCR (Fig. 6D). Many cells with neurite-like processes were immunopositive for MAP2 and NEUN (RBFOX3) (Fig. 6E). By 40 days, markers of maturing neurons were apparent: vesicular glutamate transporter (vGlut2; SLC17A6), the post-synaptic protein SNAP25 and the presynaptic protein bassoon (Fig. 6F).
We also subjected cR-S6EOS cells to a protocol for inducing primordial germ cell-like cells (PGCLCs). Cells were transferred from t2iLGö into TGFβ and FGF for 5 days, followed by exposure to germ cell-inductive cytokines (Irie et al., 2015; von Meyenn et al., 2016). Cells co-expressing tissue non-specific alkaline phosphatase and EOS-GFP, suggestive of germ cell identity, were isolated by flow cytometry on day 9. Analysis of this double-positive population by RT-qPCR showed upregulated expression of a panel of primordial germ cell markers (Fig. S7G). These data indicate that germ cell specification may be induced from chemically reset cells, as also shown for reset cells generated by transgene expression (von Meyenn et al., 2016).
Female naïve cells are expected to have two active X chromosomes in human, as in mouse. Unlike in mouse, however, XIST is expressed from one or both active X chromosomes in human ICM cells (Okamoto et al., 2011; Petropoulos et al., 2016; Vallot et al., 2017) as well as from the inactive X in differentiated cells. Primed female hPSCs usually feature an inactive X, although this has frequently lost XIST expression, a process referred to as erosion (Mekhoubad et al., 2012; Silva et al., 2008). X chromosomes in female cR-S6EOS cells show more marked loss of methylation than autosomes (Fig. S6C), suggestive of reactivation (Takashima et al., 2014). We employed RNA FISH to assess nascent transcription from X chromosomes at the single-cell level. In parental S6EOS and H9EOS cells the presence of two X chromosomes was confirmed by RNA FISH for XACT (Fig. S8A), which is transcribed from both active and eroded X chromosomes (Patel et al., 2017; Vallot et al., 2017). No XIST signal was evident in either cell line but we detected monoallelic transcription of HUWE1, an X-linked gene typically subject to X-chromosome inactivation (Patel et al., 2017) (Fig. 7A,B). By contrast, reset cells displayed biallelic transcription of HUWE1 in the majority (90%) of diploid cells for both lines. Similar results were obtained for two other X-linked genes: ATRX and THOC2 (Fig. S8A,B). XIST was detected monoallelically in a subset of reset cells (Fig. 7A,B). This unusual feature is in line with recent reports that human naïve-like cells have two active X chromosomes, but predominantly express XIST from neither, or only one, allele (Sahakyan et al., 2017; Vallot et al., 2017).
We also examined X-chromosome status after reset cells had been reverted to a primed-like PSC state by culture in E8 medium for 30 days as above. We found that HUWE1 became transcribed monoallelically in ∼90% of 're-primed' cR-S6EOS cells and that almost all of those cells expressed XIST from the other X chromosome (Fig. 7A,B). For cR-H9EOS, 40% of re-primed cells showed monoallelic expression of HUWE1, and those cells also upregulated XIST from the other, inactive X chromosome. Similar patterns were observed when we co-stained the cells for XIST and another X-linked gene, THOC2 (Fig. S8A). These data are consistent with induction of X-chromosome silencing by XIST during pluripotency progression.
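Percentages of monoallelic and biallelic cells such as those quoted above derive from scoring nascent transcription foci in individual nuclei. A minimal sketch of that tally is given below, assuming each diploid nucleus is scored as having 0, 1 or 2 foci for an X-linked gene; the counts are hypothetical and are not the measurements reported here.

```python
from collections import Counter


def allelic_fractions(foci_per_cell):
    """Fractions of cells with 0, 1 or 2 nascent RNA FISH foci."""
    counts = Counter(foci_per_cell)
    total = len(foci_per_cell)
    return {
        "none": counts[0] / total,
        "monoallelic": counts[1] / total,
        "biallelic": counts[2] / total,
    }


# Hypothetical scoring of 20 diploid nuclei for one X-linked gene
hypothetical_reset = [2] * 18 + [1] * 2
fractions = allelic_fractions(hypothetical_reset)
print(f"Biallelic: {fractions['biallelic']:.0%}, "
      f"monoallelic: {fractions['monoallelic']:.0%}")
```

Applying the same tally to primed, reset and re-primed populations would give directly comparable fractions for each X-linked gene.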
The availability of candidate naïve hPSCs offers an experimental system for investigation of human pluripotency progression and a potentially valuable source material for biomedical applications. Our findings demonstrate that cell populations exhibiting a range of properties consistent with naïve pluripotency can readily be generated from primed PSCs by transient HDAC inhibition followed by culture in t2iLGö or tt2iLGö. WNT inhibition stabilises initial acquisition of the reset phenotype. Chemically reset cells are phenotypically stable and in many cases cytogenetically normal. They can be propagated robustly without feeders and readily be re-primed to undergo multi-lineage differentiation in vitro. We provide detailed protocols along with global transcriptome, transposcriptome and methylome datasets as resources for the community.
The mechanism by which HDAC inhibition promotes resetting is unresolved but seems likely to involve the generation of a more open chromatin environment that relieves silencing of naïve pluripotency factors. The reset phenotype is initially rather precarious but can be stabilised by inhibitors of tankyrase or porcupine that suppress the canonical WNT pathway. cR cells differ dramatically in global expression profile from primed PSCs and resemble previously described human naïve-like cells generated by inducible or transient transgene expression (Takashima et al., 2014) or by adaptation to culture in 5i/L/A/(F) (Theunissen et al., 2014). In particular, transcriptome analysis shows that cR cells share a high degree of genome-wide and marker-specific correspondence with HNES cell lines derived directly from dissociated human ICM (Guo et al., 2016). Reset cells express transcription regulators and other genes that are found in human pre-implantation epiblast but are low or absent in primed PSCs. These include the characterised naïve pluripotency factors KLF4 and TFCP2L1, along with potential new regulators and markers.
Reset and HNES cells express SVA, LTR5, HERVK and SST1 TEs. These are among the most recent entrants to the human genome and are transcribed in pre-implantation embryos (Grow et al., 2015; Theunissen et al., 2016). By contrast, HERVH families and their flanking LTR7 repeats are mostly downregulated in reset cells and exhibit increased methylation. These findings confirm and extend the recent report that specific TE expression discriminates between primed and naïve-like hPSCs (Theunissen et al., 2016). HERVH and LTR7 are reported to generate alternative and chimaeric transcripts in primed PSCs, where they display heterogeneous expression (Wang et al., 2014). Therefore, silencing in naïve cells and derepression upon progression to primed pluripotency might have functional significance. Notably, ZNF534, the postulated negative regulator of HERVH (Theunissen et al., 2016), is highly upregulated in reset cells, while increased expression of DNMT3L in human naïve-like cells, a feature not apparent in mouse ESCs, may facilitate de novo methylation at specific TE loci.
During resetting, DNA methylation is globally reduced to a level similar to that reported for human ICM (Guo et al., 2014). This is regarded as a key process for erasure of epigenetic memory in the naïve phase of pluripotency (Lee et al., 2014). Reduced methylation extends to all classes of genomic element but is non-uniform. At promoters, both loss and gain of methylation are detected. As in other cell types, there is poor overall correlation with gene expression but it is noteworthy that extensively demethylated promoters in reset cells include several associated with highly upregulated genes that are likely to be functional in naïve cells, including KLF17, as well as numerous primate- and hominid-specific TEs. Demethylation also extends to imprinted loci, however, as noted previously for other human naïve-like stem cells (Pastor et al., 2016; Theunissen et al., 2016). Loss of imprints is observed in conventional hPSCs (Nazor et al., 2012) and in mouse ESCs (Dean et al., 1998; Greenberg and Bourc'his, 2015; Walter et al., 2016), but not typically to the extent detected for human naïve-like cells. Whether failure to sustain imprints is an intrinsic feature of human naïve pluripotency during extended propagation or may be rectified by modification of the culture environment remains to be determined.
Efficient multi-lineage differentiation may be initiated from reset cells either via embryoid body formation or by ‘re-priming’ in adherent culture. It is noteworthy, however, that human cells in the t2iLGö naïve condition are not immediately responsive to lineage induction. Ground-state mouse ESCs also appear not to respond directly to lineage cues but to require prior transition through a formative stage (Kalkan et al., 2017; Mulas et al., 2017; Semrau et al., 2016 preprint). This capacitation period might be more protracted in primates given the longer window between implantation and gastrulation (Nakamura et al., 2016; Smith, 2017).
A hallmark of the transient phase of naïve pluripotency in both rodent and human ICM cells is the presence of two active X chromosomes in females (Okamoto et al., 2011; Petropoulos et al., 2016; Sahakyan et al., 2017; Vallot et al., 2017). In female cR cells, the gain of biallelic expression of X-linked genes indicates reactivation of the silent X chromosome. Moreover, expression of XIST from an active X chromosome in a subset of reset cells resembles the pattern of the human pre-implantation embryo. Upon re-priming, monoallelic expression of X-linked genes is restored in many cells. Significantly, although no XIST was observed in the original primed cells, an XIST signal is detected in re-primed cells on a silenced X chromosome. Resetting and subsequent differentiation thus offer a system to characterise X-chromosome regulation in human, which appears to diverge substantially from the mouse paradigm (Okamoto et al., 2011).
In summary, this study provides the requisite technical protocols and resources to facilitate routine generation and study of candidate human naïve PSCs. Moreover, feeder-free culture simplifies the propagation of reset cells. Nonetheless, further refinements are desirable to enhance the quality and robustness of naïve hPSCs, including preserving imprints and maximising long-term karyotype stability. Optimising the capacitation process prior to differentiation by recapitulating the progression of pluripotency in the primate embryo is an important future goal and opportunity.
MATERIALS AND METHODS
Conventional hPSC culture
Primed hPSCs were routinely maintained on irradiated mouse embryonic fibroblast (MEF) feeder cells in KSR/FGF medium: DMEM/F-12 (Sigma-Aldrich, D6421) supplemented with 10 ng/ml FGF2 (prepared in-house), 20% KnockOut Serum Replacement (KSR) (Thermo Fisher Scientific), 100 µM 2-mercaptoethanol (2ME) (Sigma-Aldrich, M7522), 1×MEM non-essential amino acids (NEAA) (Thermo Fisher Scientific, 11140050) and 2 mM L-glutamine (Thermo Fisher Scientific, 25030024). Cells were passaged as clusters by detachment with dispase (Sigma-Aldrich, 11097113001). To establish PB-EOS stable transfectants, 1 μg/ml puromycin was applied for two passages (10 days) to transfected cells on Matrigel (Roche). Some PSC lines were propagated without feeders on Geltrex (growth factor-reduced, Thermo Fisher, A1413302) in E8 medium [made in-house according to Chen et al. (2011)].
Naïve cell culture
Chemically reset and embryo-derived (HNES) naïve stem cells were propagated in N2B27 (see the supplementary Materials and Methods) supplemented with t2iLGö [1 µM CHIR99021 (CH), 1 µM PD0325901 (PD), 10 ng/ml human LIF and 2 µM Gö6983] with or without ROCK inhibitor (Y-27632) on irradiated MEF feeders. Where indicated as tt2iLGö, CH was used at 0.3 µM. For feeder-free culture, Geltrex or laminin (Merck, CC095) was added to the medium at the time of plating. Cells were cultured in 5% O2, 7% CO2 in a humidified incubator at 37°C and passaged by dissociation with Accutase (Thermo Fisher Scientific, A1110501) or TrypLE (Thermo Fisher Scientific, 12605028) every 3-5 days. Cells were cryopreserved in CryoStem (Biological Industries, K1-0640). Cell lines tested negative for mycoplasma contamination by in-house PCR. No other contamination test was performed.
Reverse transcription and real-time PCR
Total RNA was extracted using an RNeasy Kit (Qiagen) and cDNA synthesized with SuperScript III reverse transcriptase (Thermo Fisher Scientific, 18080085) and oligo(dT) adapter primers. TaqMan assays and Universal ProbeLibrary (UPL) probes (Roche Molecular Systems) are listed in Table S3A,B. Embryoid bodies were lysed in TRIzol (Thermo Fisher Scientific, 15596018) and total RNA was isolated with PureLink RNA Mini Kit (Thermo Fisher Scientific, 12183025) with On-Column PureLink DNase (Thermo Fisher Scientific, 12185010). For analyses of adherent differentiation, total RNA was extracted with ReliaPrep RNA Miniprep Kit and RT-qPCR performed using oligo(dT) primer, the GoScript Reverse Transcription System and GoTaq qPCR Master Mix (all from Promega).
Immunostaining
Cells were fixed with 4% buffered paraformaldehyde for 15 min at room temperature, permeabilised with 0.5% Triton X-100 in PBS for 10 min and blocked with 3% BSA and 0.1% Tween 20 in PBS for 30 min at room temperature. Incubation with primary antibodies (Table S3C) diluted in PBS with 0.1% Triton X-100 and 3% donkey serum was overnight at 4°C and secondary antibodies were added for 1 h at room temperature. Slides were mounted with Prolong Diamond Antifade Mountant (Life Technologies).
Karyotyping
G-banded karyotype analysis was performed following standard cytogenetics protocols, typically scoring 30 metaphases.
Transcriptome sequencing
Total RNA was extracted using the TRIzol/chloroform method (Invitrogen) and RNA integrity assessed using a Qubit 2.0 fluorometer (Thermo Fisher Scientific) and RNA Nano Chip Bioanalyzer (Agilent Genomics). Ribosomal RNA was depleted from 1 µg total RNA using Ribo-Zero (Illumina). Sequencing libraries were prepared using the NEXTflex Rapid Directional RNA-Seq Kit (Bioo Scientific, 5138-08). Sequencing was performed on an Illumina HiSeq4000 in either single-end 50 bp or paired-end 125 bp format.
RNA-seq data analysis
External datasets used for comparative analyses were obtained from the European Nucleotide Archive (ENA) under accessions ERP006823 (Takashima et al., 2014), SRP059279 (Ji et al., 2016), SRP045911 (Sperber et al., 2015), SRP045294 (Irie et al., 2015), SRP011546 (Yan et al., 2013), SRP055810 (Blakeley et al., 2015), SRP074076 (Yang et al., 2017) and ERP007180 (Wellcome Trust Sanger Institute). To minimise technical variability, reads of disparate lengths and sequencing modes were truncated to 50 bp single-end format. Alignments to human genome build hg38/GRCh38 were performed with STAR (Dobin et al., 2013). Transcript quantification was performed with htseq-count, part of the HTSeq package (Anders et al., 2014), using gene annotation from Ensembl release 86 (Aken et al., 2016). Libraries were corrected for total read count using the size factors computed by the Bioconductor package DESeq2 (Love et al., 2014). Principal components were computed by singular value decomposition with the prcomp function in the R stats package from variance-stabilised count data. Differential expression was computed with DESeq2 and genes ranked by log2 fold-change. t-distributed stochastic neighbour embedding (t-SNE) (van der Maaten and Hinton, 2008) was performed using the Barnes-Hut algorithm (van der Maaten, 2014) implemented in the R package Rtsne with perplexity 12 for 1600 iterations. For display of expression values, single-end count data were normalised for gene length to yield RPKMs and scaled relative to the mean expression of each gene across all samples. Heatmaps include genes for which a difference in expression was observed (i.e. scaled expression >1 or <−1 in at least one sample).
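The display normalisation described above can be sketched as follows. This is a minimal Python illustration, not the R code used in the study. The function names, the pseudocount and the reading of 'scaled expression' as a log2 ratio to the per-gene mean are all assumptions, as the text does not specify them.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def scale_to_gene_mean(values, pseudocount=1.0):
    """log2 ratio of each sample to the gene's mean across all samples.

    ASSUMPTION: the log2 transform and the pseudocount are illustrative
    choices; the source only says values were scaled relative to the mean.
    """
    mean = sum(values) / len(values)
    return [math.log2((v + pseudocount) / (mean + pseudocount)) for v in values]


def keep_for_heatmap(scaled_values, threshold=1.0):
    """True if scaled expression is >threshold or <-threshold in any sample."""
    return any(abs(s) > threshold for s in scaled_values)


# Hypothetical RPKM values for one gene across four samples.
example = [0.0, 0.0, 0.0, 12.0]
scaled = scale_to_gene_mean(example)
print(scaled, keep_for_heatmap(scaled))
```

Under these assumptions a gene is retained for display when at least one sample differs from the gene's mean by more than twofold in either direction.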
For functional testing, enrichment for GO terms was determined using the GOStats package (Falcon and Gentleman, 2007) based on the 1000 most upregulated and downregulated genes distinguishing naïve and primed cells, and on the most significant genes contributing to principal component 1 (Fig. 3A). RNA-seq libraries were screened for mutations in the P53 locus by processing alignments with Picard tools (http://broadinstitute.github.io/picard) and the Genome Analysis Toolkit (GATK) (DePristo et al., 2011; McKenna et al., 2010) to filter duplicate reads, perform base quality score recalibration, identify indels for realignment, and call variants against dbSNP build 150 (Sherry et al., 2001).
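GO over-representation of the kind described above is commonly assessed with a hypergeometric test. The following Python sketch shows that calculation for a single term; it is not the GOStats implementation. It ignores the GO graph structure and multiple-testing correction, and the argument names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe, annotated, selected, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    universe  : number of genes tested in total
    annotated : genes in the universe carrying the GO term
    selected  : size of the gene list (e.g. 1000 top-ranked genes)
    overlap   : selected genes that carry the GO term
    """
    upper = min(annotated, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    # Exact integer arithmetic; only the final ratio is converted to float.
    return tail / total


# Toy numbers: 10 genes, 4 annotated, 3 selected, 2 of them annotated.
print(hypergeom_enrichment_p(10, 4, 3, 2))
```

Exact integer binomials avoid floating-point underflow for large gene universes, at the cost of speed.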
Bisulfite sequencing, mapping and analysis
Post-bisulfite adaptor tagging (PBAT) libraries for whole-genome DNA methylation analysis were prepared from purified genomic DNA (Miura et al., 2012; Smallwood et al., 2014; von Meyenn et al., 2016). Paired-end sequencing was carried out on HiSeq2000 or NextSeq500 instruments (Illumina). Raw sequence reads were trimmed to remove poor quality reads and adapter contamination using Trim Galore (v0.4.1) (Babraham Bioinformatics). The remaining sequences were mapped using Bismark (v0.14.4) (Krueger and Andrews, 2011) to the human reference genome GRCh37 in paired-end mode as described (von Meyenn et al., 2016). CpG methylation calls were analysed using SeqMonk software (Babraham Bioinformatics) and custom R scripts. Global CpG methylation levels of pooled replicates were illustrated using bean plots. The genome was divided into consecutive 20 kb tiles and percentage methylation was calculated using the bisulfite feature methylation pipeline in SeqMonk. Pseudocolour scatter plots of methylation levels over 20 kb tiles were generated using R.
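The tiling step above can be illustrated with a short Python sketch that bins CpG calls into consecutive 20 kb windows. It is not the SeqMonk pipeline: the input tuple layout is hypothetical, and pooling read counts within a tile is an assumption, since SeqMonk may weight individual CpG positions differently.

```python
from collections import defaultdict

TILE_SIZE = 20_000  # consecutive 20 kb tiles, as in the text


def tile_methylation(calls, tile_size=TILE_SIZE, min_calls=1):
    """Percentage methylation per tile from per-CpG read counts.

    calls: iterable of (chrom, pos, methylated_reads, unmethylated_reads),
           with 1-based positions (hypothetical input layout).
    Returns {(chrom, tile_start): percent_methylation}.
    """
    totals = defaultdict(lambda: [0, 0])  # [methylated, total] per tile
    for chrom, pos, meth, unmeth in calls:
        tile_start = ((pos - 1) // tile_size) * tile_size + 1
        entry = totals[(chrom, tile_start)]
        entry[0] += meth
        entry[1] += meth + unmeth
    return {
        key: 100.0 * meth / total
        for key, (meth, total) in totals.items()
        if total > 0 and total >= min_calls
    }


# Hypothetical calls: two CpGs in the first tile, one in the second.
calls = [("chr1", 100, 8, 2), ("chr1", 19_999, 2, 8), ("chr1", 20_001, 0, 10)]
print(tile_methylation(calls))
```

The `min_calls` parameter is a hypothetical coverage filter that excludes sparsely covered tiles.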
Specific genome features were defined using the following Ensembl gene set annotations: Gene bodies (probes overlapping genes), Promoters (probes overlapping 900 bp upstream to 100 bp downstream of genes), CGI promoters (promoters containing a CGI), non-CGI promoters (all other promoters), Intergenic (probes not overlapping with gene bodies), non-promoter CGI (CGI not overlapping with promoters). Annotations of human germline imprint control regions were obtained from Court et al. (2014). Pseudocolour heatmaps representing average methylation levels were generated using the R heatmap.2 function without further clustering, scaling or normalisation. Correlation between promoter methylation and gene expression was computed by averaging CpG methylation across promoters or TEs and correlating these values with the respective gene expression values.
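As an illustration of the promoter definition above, the following Python sketch derives a strand-aware window of 900 bp upstream to 100 bp downstream. It assumes the window is anchored on the annotated gene start (TSS), which the text implies but does not state. The function names and inclusive coordinate convention are hypothetical.

```python
def promoter_window(tss, strand, upstream=900, downstream=100):
    """Return (start, end) of the promoter window around a gene start.

    ASSUMPTION: the window is anchored on the annotated gene start and
    'upstream' is interpreted relative to the direction of transcription.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    return max(start, 1), end  # clamp at the chromosome start


def overlaps(a_start, a_end, b_start, b_end):
    """True if two inclusive intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end


# A probe overlapping the promoter of a hypothetical minus-strand gene.
window = promoter_window(10_000, "-")
print(window, overlaps(*window, 10_850, 10_950))
```

Handling the strand explicitly matters because, for minus-strand genes, upstream sequence lies at higher genomic coordinates.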
Fluorescent in situ hybridisation (FISH)
Nascent transcription foci of X-linked genes and the lncRNAs XIST and XACT were visualised at single-cell resolution by RNA FISH as described (Sahakyan et al., 2017). Fluorescently labelled probes were generated from BACs RP11-13M9 (XIST), RP11-35D3 (XACT), RP11-121P4 (THOC2), RP11-1145J4 (ATRX) and RP11-975N19 (HUWE1). Coverslips were imaged using an Imager M1 microscope (Zeiss) and AxioVision software. ImageJ was used for collapsing z-stacks, merging different channels, and adjusting brightness and contrast to remove background. A minimum of 100 nuclei were scored for each sample. Cells that appeared to have more than two X chromosomes were excluded.
Transposable element analysis
RepeatMasker annotations for the human reference genome were obtained from the UCSC Table Browser. To calculate repeat expression, adapter-trimmed RNA-seq reads were mapped to the reference genome using bowtie (Langmead and Salzberg, 2012) with parameters ‘-M 1 -v 2 --best --strata’, i.e. two mismatches were allowed, and one alignment location was randomly selected for reads that multiply align to the reference genome. Read counts for repeat regions and Ensembl transcripts were calculated by featureCounts and normalised by the total number of RNA-seq reads that mapped to protein-coding gene regions. Differential expression of repeat copies across samples was evaluated by the R Bioconductor DESeq package (Anders and Huber, 2010).
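The normalisation of repeat counts described above might look like the following Python sketch. The per-million scaling factor, the function name and the example counts are assumptions for illustration; the text states only that counts were normalised by the total reads mapping to protein-coding gene regions.

```python
def normalise_repeat_counts(repeat_counts, protein_coding_counts, per=1_000_000):
    """Scale repeat read counts by total protein-coding read depth.

    repeat_counts         : {repeat_name: read_count}
    protein_coding_counts : {gene_name: read_count} for protein-coding genes
    per                   : ASSUMED scaling constant (reads per million)
    """
    total = sum(protein_coding_counts.values())
    if total == 0:
        raise ValueError("no reads mapped to protein-coding genes")
    return {name: count * per / total for name, count in repeat_counts.items()}


# Hypothetical counts for one library.
repeats = {"HERVH": 50, "SVA_D": 200}
coding = {"GAPDH": 1_500_000, "ACTB": 500_000}
print(normalise_repeat_counts(repeats, coding))
```

Using protein-coding reads as the denominator keeps the scaling independent of global changes in repeat transcription between cell states.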
Embryoid body differentiation
Embryoid body formation and outgrowth were performed in DMEM/F12 supplemented with 15% fetal calf serum (FCS), 2 mM L-glutamine, 1 mM sodium pyruvate, 1× non-essential amino acids and 0.1 mM 2ME as described (Guo et al., 2016). Alternatively, reset cells were aggregated in t2iLIF medium with ROCK inhibitor in PrimeSurface 96V cell plates (Sumitomo Bakelite MS-9096V) then plated after 3 days on Geltrex (Thermo Fisher Scientific, 12063569) for outgrowth in serum-containing medium. Outgrowths were fixed with 4% paraformaldehyde for 10 min at room temperature for immunostaining.
Adherent differentiation
Except where specified, reset cells were ‘re-primed’ before initiating differentiation. Cells were plated on Geltrex in t2iLGö and after 48 h the medium was changed to E8. Cultures were maintained in E8, passaging at confluence. Lineage-specific differentiation was initiated between 25 and 44 days.
Definitive endoderm was induced according to Loh et al. (2014). Cells were cultured in CDM2 medium (in-house according to Loh et al., 2014) supplemented with 100 ng/ml activin A (produced in-house), 100 nM PI-103 (Bio-Techne, 2930), 3 µM CHIR99021, 10 ng/ml FGF2, 3 ng/ml BMP4 (Peprotech) for 1 day. For the next 2 days the following supplements were applied: 100 ng/ml activin A, 100 nM PI-103, 20 ng/ml FGF2, 250 nM LDN193189.
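For record keeping, the two-stage endoderm regimen above can be written as a simple day-indexed schedule. The Python sketch below only restates the supplements listed in the text; the data layout and the helper name `supplements_on` are hypothetical.

```python
# Supplements added to CDM2, keyed by inclusive day range of the induction.
ENDODERM_SCHEDULE = [
    {
        "days": (1, 1),
        "supplements": {
            "activin A": "100 ng/ml",
            "PI-103": "100 nM",
            "CHIR99021": "3 µM",
            "FGF2": "10 ng/ml",
            "BMP4": "3 ng/ml",
        },
    },
    {
        "days": (2, 3),
        "supplements": {
            "activin A": "100 ng/ml",
            "PI-103": "100 nM",
            "FGF2": "20 ng/ml",
            "LDN193189": "250 nM",
        },
    },
]


def supplements_on(day, schedule=ENDODERM_SCHEDULE):
    """Return the supplement set for a given day of induction (1-based)."""
    for stage in schedule:
        first, last = stage["days"]
        if first <= day <= last:
            return stage["supplements"]
    raise ValueError(f"day {day} is outside the schedule")


print(sorted(supplements_on(2)))
```

Encoding the stages as data makes the switch on day 2 explicit: CHIR99021 and BMP4 are withdrawn, and LDN193189 is added.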
For lateral mesoderm induction (Loh et al., 2016), cells were treated with CDM2 supplemented with 30 ng/ml activin A, 40 ng/ml BMP4 (Miltenyi Biotec, 130-098-788), 6 µM CHIR99021, 20 ng/ml FGF2, 100 nM PI-103 for 1 day, then with 1 µM A 83-01, 30 ng/ml BMP4 and 10 µM XAV939 (Sigma-Aldrich).
For neural differentiation via dual SMAD inhibition (Chambers et al., 2009), cells were treated with N2B27 medium supplemented with 500 nM LDN193189 (Axon, 1509) and 1 μM A 83-01 (Bio-Techne, 2939) for 10 days, then passaged to plates coated with poly-L-ornithine and laminin and further cultured in N2B27 without supplements.
Flow cytometry
Flow analysis was carried out on a Fortessa instrument (BD Biosciences). Cell sorting was performed using a MoFlo high-speed instrument (Beckman Coulter).
Acknowledgements
Rosalind Drummond provided excellent technical support. We thank Nicholas Bredenkamp for sharing data. We are grateful to Peter Andrews for advice and support on karyotyping and to Valeria Orlova and Balazs Varga for advice on differentiation protocols. Andy Riddell and Peter Humphreys supported flow cytometry and imaging studies. Maike Paramor prepared RNA-seq libraries. Sequencing was conducted at the CRUK Cambridge Institute Genomic Core. We thank Felix Krueger for bioinformatics support.
Author contributions
Conceptualization: G.G., F.v.M., A.Sm.; Methodology: G.G., F.v.M., M.R., J.C., D.B., A.Sa., S.M., P.B.; Formal analysis: F.v.M., S.D., D.B., A.Sa., P.B.; Investigation: G.G., F.v.M., M.R., J.C., S.M.; Writing - original draft: G.G., A.Sm.; Writing - review & editing: G.G., F.v.M., A.Sm.; Visualization: S.D., P.B.; Supervision: K.P., W.R., A.Sm.; Funding acquisition: K.P., W.R., A.Sm.
Funding
This research is funded by the Medical Research Council of the United Kingdom (G1001028 and MR/P00072X/1) and European Commission Framework 7 (HEALTH-F4-2013-602423, PluriMes), and in part by the UK Regenerative Medicine Platform (MR/L012537/1). W.R. is supported by the Biotechnology and Biological Sciences Research Council (BB/K010867/1), Wellcome Trust (095645/Z/11/Z), European Commission BLUEPRINT and EpiGeneSys. The Cambridge Stem Cell Institute receives core funding from the Wellcome Trust and the Medical Research Council. F.v.M. was funded by a Postdoctoral Fellowship from the Swiss National Science Foundation (SNF; Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung)/Novartis. S.M. is funded by a Wellcome Trust PhD Studentship. A.Sm. is a Medical Research Council Professor. Deposited in PMC for immediate release.
Data availability
RNA-seq data are deposited in ArrayExpress under accession number E-MTAB-5674; and whole-genome bisulfite sequencing data in Gene Expression Omnibus under accession number GSE90168.
Competing interests
G.G. and A.Sm. are inventors on a patent filing by the University of Cambridge relating to human naïve pluripotent stem cells. W.R. is a consultant to, and shareholder in, Cambridge Epigenetix.