Pluripotent stem cells have broad utility in biomedical research and their molecular regulation has thus garnered substantial interest. While the principles that establish and regulate pluripotency have been well defined in the mouse, it has been difficult to extrapolate these insights to the human system due to species-specific differences and the distinct developmental identities of mouse versus human embryonic stem cells. In this Review, we examine genome-wide approaches to elucidate the regulatory principles of pluripotency in human embryos and stem cells, and highlight where differences exist in the regulation of pluripotency in mice and humans. We review recent insights into the nature of human pluripotent cells in vivo, obtained by the deep sequencing of pre-implantation embryos. We also present an integrated overview of the principal layers of global gene regulation in human pluripotent stem cells. Finally, we discuss the transcriptional and epigenomic remodeling events associated with cell fate transitions into and out of human pluripotency.
Introduction
Pluripotent cells reside naturally in the blastocyst of early mouse and human embryos and constitute the founder tissue of the embryo proper. Pluripotent stem cells (PSCs) that are capable of long-term self-renewal can be derived from the blastocyst (Evans and Kaufman, 1981; Martin, 1981; Thomson et al., 1998) or can be induced from somatic cells by direct reprogramming (Takahashi and Yamanaka, 2006; Takahashi et al., 2007; Yu et al., 2007). Given appropriate cues, embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs) can differentiate into virtually any cell type found in the adult body. Elucidating the molecular nature of pluripotency is therefore essential for designing enhanced strategies to reprogram somatic cells and for understanding how PSCs can be directed into lineages of interest in vitro.
Owing to their high proliferation rate and amenability to genetic manipulation, mouse ESCs (mESCs) provide a robust in vitro system with which to dissect the molecular regulation of pluripotency. The genome-wide DNA targets and protein-protein interaction partners of core pluripotency regulators have been mapped in mESCs (Young, 2011; Huang and Wang, 2014), and an essential transcription factor network that can explain mESC behavior has been defined using computational methods (Dunn et al., 2014). Furthermore, transcriptomic studies and functional experiments have shown that, under chemically defined conditions, mESCs closely resemble the epiblast compartment of the embryonic day (E) 4.5 mouse blastocyst (Boroviak et al., 2014). Thus, mESCs recapitulate key features of in vivo pluripotency.
It is becoming increasingly clear, however, that the regulatory principles of pluripotency cannot simply be extrapolated from mouse to human, but must be interrogated in human cells directly. Evidence obtained in recent years has revealed that extensive differences exist between mouse and human early embryogenesis, including the timing of zygotic genome activation (ZGA; see Glossary, Box 1) (Blakeley et al., 2015), divergent responses of mouse and human embryos to signal inhibitors (Kuijk et al., 2012; Roode et al., 2012), differences in the expression of key developmental regulators (Blakeley et al., 2015; Petropoulos et al., 2016), and different mechanisms to accomplish X-chromosome dosage compensation (Okamoto et al., 2011; Petropoulos et al., 2016; Vallot et al., 2017). Adding further complexity, human ESCs (hESCs) are considered to be developmentally more mature than mESCs, and to more closely resemble mouse epiblast stem cells (mEpiSCs) that are derived from the post-implantation epiblast (Brons et al., 2007; Tesar et al., 2007). Therefore, the molecular mechanisms that regulate human pluripotency are not easily inferred from studies in mice.
Box 1. Glossary
ChIP-Chip. A method to identify the genome-wide DNA targets of a protein of interest by chromatin immunoprecipitation followed by DNA microarray analysis.
ChIP-Seq. A method to identify the genome-wide DNA targets of a protein of interest by chromatin immunoprecipitation followed by massively parallel DNA sequencing.
CpG methylation. The addition of a methyl group to the fifth carbon of a cytosine base in a cytosine-phosphate-guanine (CpG) dinucleotide. The methylation of CpG-dense promoter regions is associated with gene repression.
Epiblast (EPI). The lineage of the blastocyst that gives rise to all somatic lineages and the germ line.
Expression quantitative trait loci (eQTL). Regions of the genome that contain variations in DNA sequence that correlate with the expression of one or more genes.
Extended pluripotent stem (EPS) cells. Pluripotent stem cells that can contribute to embryonic as well as to extraembryonic tissues upon injection into early mouse embryos.
Fluorescence ubiquitin cell cycle indicator (FUCCI). A system to track cell cycle progression in live cells based on cell cycle-dependent proteolysis of fluorescent ubiquitylation oscillators.
Inner cell mass (ICM). A cellular mass on the inside of the blastocyst containing the epiblast and primitive endoderm (hypoblast) lineages.
Insulated neighborhoods. Chromosomal loop structures that are formed by CTCF homodimers and co-occupied by the Cohesin complex. Such neighborhoods function to insulate genes and their regulatory elements within the loop.
Mesendoderm. A bipotential embryonic tissue layer that arises during gastrulation and gives rise to both mesoderm and endoderm.
Naive pluripotency. A state of pluripotency associated with the pre-implantation epiblast, which is characterized by an unbiased developmental potential and depletion of repressive chromatin features. Naive pluripotency is recapitulated in vitro in the form of mESCs. Recently, a number of studies have attempted to derive hESCs in a naive state.
Pioneer factors. Factors that can engage target sequences on nucleosomes or in compacted chromatin and facilitate the binding of other transcription factors.
Polycomb repressive complex 2 (PRC2). A complex of Polycomb group proteins that di- and tri-methylates lysine 27 of histone H3 (H3K27me2/3). The PRC2 complex consists of four subunits: EED, SUZ12, EZH1/2 and RBAP46/48 (RBBP7/4).
Primed pluripotency. A state of pluripotency associated with the late post-implantation epiblast, which is characterized by lineage priming and enrichment in repressive chromatin features. Primed pluripotency is recapitulated in vitro in the form of mEpiSCs. Conventional hESCs also display defining features of primed pluripotency.
Primitive endoderm (PE, or hypoblast). The lineage of the blastocyst that gives rise to the extraembryonic endoderm of the yolk sac.
Primordial germ cells (PGCs). Precursors to sperm and egg, which are specified from pluripotent epiblast and subsequently migrate to the gonads.
RNA fluorescence in situ hybridization (RNA FISH). A method to detect the presence of RNA molecules in fixed cells using complementary fluorescent probes.
Trophectoderm (TE). Outer layer of cells in the blastocyst, which gives rise to a large part of the placenta.
Zygotic genome activation (ZGA). Time point during cleavage when the embryonic genome becomes transcriptionally active and takes over control of cellular functions from maternal products.
2i/L. A serum-free medium that includes inhibitors of MEK and GSK3 together with the cytokine leukemia inhibitory factor (LIF), which maintains mESCs in a naive pluripotent state.
t2i/L+Gö. A medium for maintaining hESCs in a naive pluripotent state that includes a MEK inhibitor, a reduced dose of GSK3 inhibitor, human LIF and a protein kinase C inhibitor.
5i/L/A. A medium for maintaining hESCs in a naive pluripotent state that comprises 2i/L together with inhibitors of BRAF, SRC and ROCK, and recombinant activin A. FGF2 is added in some formulations to improve the efficiency of naive conversion.
In this Review, we provide a broad overview of the global regulatory mechanisms that operate specifically in human pluripotent cells. We first review our current understanding of the major transcriptional and epigenomic events that lead to human blastocyst formation. This provides essential background for the rest of the article, in which we discuss the key determinants of global gene expression in hESCs and the molecular changes that occur during cell fate transitions around the human pluripotent state. Where relevant, we highlight notable differences in gene regulatory networks between mouse and human pluripotent cells.
Deep sequencing of early human embryos: a molecular blueprint for pluripotency in vivo
What are the key molecular properties of human pluripotent cells in their in vivo environment? Owing to limited access to human embryos, molecular profiling of the human blastocyst has lagged behind comparable studies in mice. However, in the past 5 years, several studies have characterized the major changes that take place in the transcriptome and DNA methylome during human pre-implantation development (summarized in Fig. 1).
Fig. 1. Insights into the transcriptional and epigenetic properties of human pluripotent cells in vivo. An overview of molecular events occurring during human pre-implantation development. (A) The transcription factor DUX4 activates cleavage-specific gene and transposon expression during zygotic genome activation (ZGA) (Hendrickson et al., 2017; De Iaco et al., 2017; Whiddon et al., 2017). (B) The three lineages of the human blastocyst – epiblast (EPI), primitive endoderm (PE) and trophectoderm (TE) – form concurrently and are associated with specific markers, as inferred from single-cell RNA-Seq analyses (Blakeley et al., 2015). (C) Both X chromosomes are actively transcribed in female blastocysts and show co-expression of the lncRNAs XIST and XACT in mutually exclusive nuclear domains (Okamoto et al., 2011; Petropoulos et al., 2016; Vallot et al., 2017). Human pre-implantation development is also marked by globally reduced levels of DNA methylation (Smith et al., 2014; Guo et al., 2014).
The transcriptome of human pre-implantation embryos
The initial transcriptional profiling of whole human embryos was performed on microarray platforms (Vassena et al., 2011; Xie et al., 2010; Zhang et al., 2009; Dobson et al., 2004; Madissoon et al., 2014). However, owing to the heterogeneous nature of these samples, it was challenging to derive lineage-specific gene expression patterns from these studies. The advent of single-cell transcriptome analysis has allowed lineage-specific gene expression in human pre-implantation embryos to be delineated more precisely. For example, Tang and colleagues analyzed global gene expression in 90 single human cells from the oocyte to late blastocyst stage, and in 34 single hESCs at early and late passages (Yan et al., 2013). The expression profiles of single cells obtained from late blastocysts clustered into three distinct groups, representing the epiblast (EPI), primitive endoderm (PE) and trophectoderm (TE) lineages (see Glossary, Box 1). Dramatic transcriptional differences were observed between the EPI cells and hESCs derived in culture (Yan et al., 2013). Thus, conventional hESCs differ substantially from the pluripotent cells that are present in the human blastocyst in terms of their global gene expression patterns.
By comparing single-cell RNA-Seq data from early human embryos (Yan et al., 2013) with those from stage-matched mouse embryos (Deng et al., 2014), Niakan and colleagues reported that a single wave of ZGA occurs between the human 4-cell and 8-cell stages (Blakeley et al., 2015). This contrasts with prior studies that had reported that a minor wave of ZGA occurs before the human 4-cell stage (Dobson et al., 2004; Xue et al., 2013), and indicates a delay in the onset of ZGA in human embryos compared with mouse embryos (Flach et al., 1982). Three groups recently identified the double homeobox transcription factor 4 (DUX4) as a global regulator of ZGA in mouse and human embryos (Hendrickson et al., 2017; De Iaco et al., 2017; Whiddon et al., 2017). Mutations in DUX4 are associated with facioscapulohumeral dystrophy, a common muscular dystrophy, and forced DUX4 expression activates an early embryonic transcriptional program in primary human myoblasts (Geng et al., 2012). DUX4 mRNA is expressed from the oocyte to 4-cell stage of human embryos, and when overexpressed in hESCs, DUX4 induces the derepression of hundreds of genes, including ZSCAN4 and KDM4E, and HERVL retrotransposons. In addition, it was shown that a mouse ortholog, Dux, can induce a '2 cell-like' state in mESCs (De Iaco et al., 2017; Hendrickson et al., 2017). Hence, DUX family proteins are potent transcriptional activators of cleavage-specific gene and transposon expression (Fig. 1A).
In the mouse, lineage segregation in pre-implantation embryos occurs in two distinct waves: the TE first segregates from the inner cell mass (ICM; see Glossary, Box 1) as Cdx2 becomes restricted to outer cells of the morula, and this is then followed by segregation of the EPI and PE. In human embryos, however, CDX2 is not detected until after blastocyst formation (Niakan and Eggan, 2013). In fact, a single-cell transcriptome analysis of 1529 cells from 88 human embryos suggested that human lineage specification occurs simultaneously during blastocyst formation at around E5 (Petropoulos et al., 2016). An intermediate state in which EPI, PE and TE markers are co-expressed appears to precede the onset of concurrent lineage commitment. It is worth noting that human blastocysts seem to undergo an additional round of cell division prior to implantation (Niakan et al., 2012), which may be linked to the differences in the timing of lineage segregation. Human blastocysts also differ from their mouse counterparts in the spatial distribution of emerging EPI and PE cells. While cells expressing Nanog and Gata4/6 are scattered around the mouse ICM in a mutually exclusive 'salt-and-pepper' pattern (Chazaud et al., 2006), NANOG-positive cells assume a more tightly clustered organization in the early human ICM (Durruthy-Durruthy et al., 2016b).
How similar are the mouse and human blastocysts at the transcriptional level? Blakeley et al. (2015) identified conserved and human-specific gene expression patterns in the three lineages of the blastocyst (Fig. 1B). Conserved lineage-specific genes include NANOG, OCT4 (POU5F1) and SOX2 (in the EPI), CDX2, GATA3 and KRT18 (in the TE), and GATA4, PDGFRA and SOX17 (in the PE). Several genes are exclusively expressed in the human EPI, such as KLF17 (confirmed at the protein level), whereas the mouse EPI factors Esrrb, Klf2 and Bmp4 are not expressed in the human EPI. Another notable species-specific difference was the expression of TE-associated genes: Elf5 and Eomes are specific to mouse TE, while CLDN10, PLAC8 and TRIML1 are specific to human TE (Blakeley et al., 2015). Hence, there are fundamental differences in the expression of key developmental regulators between mouse and human blastocysts. These data indicate that the transcriptional regulation of human pre-implantation development cannot easily be inferred from studies in mice. This conclusion is further underscored by the recent finding that OCT4 may be required at an earlier stage of pre-implantation development in human embryos compared with mouse embryos (Fogarty et al., 2017).
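As a rough illustration of how the conserved markers listed above could be used to label a cell, the following sketch scores a single-cell expression profile against the three conserved marker sets and returns the best-scoring lineage. The function names, the mean-based score and the expression values are hypothetical choices made for this example; this is not the analysis pipeline used by Blakeley et al. (2015).

```python
from statistics import mean

# Conserved lineage markers as listed in the text (Blakeley et al., 2015).
# OCT4 is entered under its gene symbol POU5F1.
CONSERVED_MARKERS = {
    "EPI": ["NANOG", "POU5F1", "SOX2"],
    "TE": ["CDX2", "GATA3", "KRT18"],
    "PE": ["GATA4", "PDGFRA", "SOX17"],
}


def lineage_scores(expression):
    """Return the mean marker expression per lineage.

    `expression` maps gene symbol -> expression value (arbitrary units).
    Genes missing from the profile are treated as not expressed (0.0),
    which is a simplifying assumption of this sketch.
    """
    return {
        lineage: mean(expression.get(gene, 0.0) for gene in genes)
        for lineage, genes in CONSERVED_MARKERS.items()
    }


def assign_lineage(expression):
    """Pick the lineage whose conserved markers have the highest mean expression."""
    scores = lineage_scores(expression)
    return max(scores, key=scores.get)


# Hypothetical expression values for one cell (not real data).
example_cell = {"NANOG": 8.1, "POU5F1": 9.4, "SOX2": 6.0, "GATA3": 0.5, "SOX17": 1.2}
print(assign_lineage(example_cell))  # prints "EPI"
```

A score of this kind only uses the conserved markers; species-specific genes such as KLF17 or CLDN10 would need separate marker sets for each species.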
The epigenome of human pre-implantation embryos
The DNA methylation landscape of early human embryos has been charted by whole-genome bisulfite sequencing from the oocyte to post-implantation stages (Okae et al., 2014; Smith et al., 2014; Guo et al., 2014). Genome-wide cytosine DNA demethylation occurs shortly after fertilization, but with different kinetics in the maternal and paternal genomes. The paternal genome is actively demethylated in the zygote, while the maternal genome undergoes progressive passive demethylation during cleavage. CpG methylation (see Glossary, Box 1) is restored overall to somatic levels after implantation, although the precise timing of global remethylation in human embryos has not been established. Hence, pluripotent cells in human blastocysts are hypomethylated. Meissner and colleagues have also shown that hESCs undergo rapid remethylation upon derivation from the blastocyst (Smith et al., 2014). In addition to the aforementioned transcriptional differences, this highlights a second molecular difference between hESCs in vitro and the pluripotent cells of the blastocyst.
Changes in the epigenetic state of the X chromosome have also been studied in mouse and human embryos. The paternally inherited X chromosome is inactivated in female mouse embryos at the 4-cell stage, and this is followed by X-chromosome reactivation in the EPI at E4.5 (reviewed by Pasque and Plath, 2015). This process is mediated by the long non-coding RNA (lncRNA) Xist, which coats the inactive X chromosome. RNA fluorescence in situ hybridization (RNA FISH; see Glossary, Box 1) studies have indicated that both X chromosomes are also actively transcribed in human female blastocysts but, surprisingly, XIST is co-expressed (Okamoto et al., 2011). These observations were confirmed by allele-specific gene expression analyses from Lanner and colleagues, who proposed that a progressive dampening of X-linked gene expression occurs on both X chromosomes during human pre-implantation development (Petropoulos et al., 2016). Such a model of dosage compensation is reminiscent of that observed in hermaphrodite worms, in which expression of both X chromosomes is reduced by half (reviewed by Meyer, 2010). Why does XIST not induce X-chromosome silencing in human blastocysts? Rougeulle and colleagues showed that XIST assumes a dispersed organization in human blastocysts, which might explain why it does not induce X inactivation (Vallot et al., 2017). This study also suggested that the repressive effect of XIST in human embryos is antagonized by another X-linked lncRNA, XACT, which co-accumulates with XIST on the same actively transcribed X chromosomes (Fig. 1C).
Layers of gene regulation in hPSCs
Having summarized the pivotal molecular events occurring during human pre-implantation development, we now turn our focus to the regulation of gene expression in hPSCs. Although our understanding of gene control in human embryos remains fragmentary, the ability to map the genome-wide location of crucial nuclear regulators and perturb their expression has identified complex interactions among multiple layers of gene regulation in hPSCs (summarized in Fig. 2). Below, we focus on those regulatory mechanisms that have been shown to be functionally significant by either genetic or pharmacological manipulation.
Fig. 2. Layers of gene expression control in primed human PSCs. Overview of the major determinants of global gene expression in hPSCs cultured under conventional (primed) culture conditions. (A) The core transcription factors OCT4, SOX2 and NANOG form an autoregulatory network and repress distinct lineage fates in hESCs (Boyer et al., 2005; Wang et al., 2012). (B) Crosstalk among the major signaling pathways in hESCs as proposed by Singh et al. (2012). According to this model, activation of the PI3K/AKT pathway by IGF1 or FGF2 promotes the self-renewal of hESCs via two mechanisms. First, PI3K/AKT modulates the threshold of SMAD2/3 activity, allowing for the activation of NANOG but not mesendoderm-associated genes. As shown in A, the activation of NANOG stimulates the expression of core pluripotency genes and blocks neuroectoderm differentiation. Active PI3K/AKT also inhibits MEK/ERK and maintains GSK3β activity, which blocks β-catenin-mediated stimulation of pro-differentiation genes. Note that Wnt/β-catenin signaling has a distinct role in mESCs, where it functions to promote self-renewal instead of differentiation (ten Berge et al., 2011; Wray et al., 2011; Yi et al., 2011). (C) Chromatin-modifying enzymes with functional significance in hESC self-renewal include EZH1/2, which deposit H3K27me3 (Collinson et al., 2016; Shan et al., 2017), and enzymes that control the levels of H3K4me3 (Adamo et al., 2011; Bertero et al., 2015). (D) Various non-coding RNAs regulate the human pluripotent state. Transposon-derived lncRNAs, including HERVH and HPAT5, contribute to the self-renewal of hESCs (Lu et al., 2014; Durruthy-Durruthy et al., 2016a), as does miR-302/367 (Rosa et al., 2009; Lipchina et al., 2011; Rosa and Brivanlou, 2011). However, let-7 miRNA blocks the processing of pluripotency transcripts and is inhibited by LIN28 (Viswanathan et al., 2008; Newman et al., 2008; Rybak et al., 2008). (E) Other major determinants of gene expression in hPSCs. (Left) Metabolites, such as methionine (Met) and glucose (Gluc), generate substrates for histone modifications in hPSCs (Shiraki et al., 2014; Moussaieff et al., 2015). (Middle) The spliceosome produces a pluripotency-specific isoform of the transcription factor FOXP1, while SON ensures the accurate splicing of OCT4 and PRDM14 (Gabut et al., 2011; Lu et al., 2013). (Right) Insulated neighborhoods established by cohesin-associated CTCF loops constrain enhancer-promoter interactions at human pluripotency loci (Ji et al., 2016).
Core transcriptional circuitry
A core set of master transcription factors is considered to be essential for the maintenance of hESCs. Boyer and colleagues mapped the genome-wide binding of OCT4, SOX2 and NANOG (collectively OSN) by chromatin immunoprecipitation (ChIP) coupled with DNA microarrays (ChIP-Chip; see Glossary, Box 1) and made three major observations: (1) OSN co-occupy a large fraction of promoters in their target genes; (2) roughly half of the genes bound by OSN are actively transcribed in hESCs; and (3) OSN collaborate in a regulatory circuitry that involves extensive autoregulatory and feedforward loops (Boyer et al., 2005). The target genes of OSN are poorly conserved between mouse and human ESCs (Loh et al., 2006); however, this is partially explained by the evolution of new transcription factor binding sites by species-specific transposable elements (Kunarso et al., 2010). As we will discuss, the core pluripotency transcription factors are directly plugged into multiple layers of gene expression control in hESCs (Fig. 2A-E).
Do OSN act collectively on shared target genes or do they act individually to repress distinct cell fates? Ivanova and colleagues performed loss- and gain-of-function studies in multiple independent hESC lines and reported that OSN exercise distinct roles in suppressing lineage commitment (Wang et al., 2012). Specifically, OCT4 interacts with the BMP4 pathway to specify four distinct cell fates in a dose-dependent manner (Fig. 2A). High levels of OCT4 induce mesendoderm (see Glossary, Box 1) in the presence of BMP4, but maintain hESC self-renewal in the absence of BMP4. By contrast, low levels of OCT4 induce extraembryonic genes in the presence of BMP4, but stimulate neuroectoderm genes in the absence of BMP4. In addition, NANOG represses neuroectoderm differentiation, whereas SOX2 represses mesendoderm differentiation (Fig. 2A). Furthermore, the overexpression of OSN does not trigger hESC differentiation, indicating that OSN act primarily as differentiation repressors in hESCs (Wang et al., 2012).
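The four OCT4/BMP4 outcomes described above amount to a two-by-two lookup, which can be written out as a minimal sketch. The "high"/"low" labels and the function name are hypothetical simplifications introduced here; the underlying experiments (Wang et al., 2012) involve graded OCT4 levels rather than a binary switch.

```python
# Fate outcomes summarized in the text (Wang et al., 2012), keyed by
# (OCT4 level, BMP4 present). The binary encoding is an assumption of this sketch.
FATE_BY_CONDITION = {
    ("high", True): "mesendoderm",
    ("high", False): "self-renewal",
    ("low", True): "extraembryonic",
    ("low", False): "neuroectoderm",
}


def predicted_fate(oct4_level, bmp4_present):
    """Look up the fate reported for a given OCT4 level and BMP4 status."""
    if oct4_level not in ("high", "low"):
        raise ValueError("oct4_level must be 'high' or 'low'")
    return FATE_BY_CONDITION[(oct4_level, bool(bmp4_present))]


for level in ("high", "low"):
    for bmp4 in (True, False):
        print(f"OCT4 {level}, BMP4 {'+' if bmp4 else '-'}: {predicted_fate(level, bmp4)}")
```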
Other transcription factors that cooperate with the core transcription factors in sustaining the human pluripotent state have also been identified. For instance, Ng and colleagues performed a high-throughput RNA interference (RNAi) screen in OCT4-GFP reporter hESCs, and identified an essential role for the transcription factor PRDM14 (Chia et al., 2010), the ortholog of which is required for the establishment of primordial germ cells (PGCs; see Glossary, Box 1) in mice (Yamaji et al., 2008). PRDM14 regulates the expression of OCT4 by binding to its proximal enhancer and colocalizes with OSN across the genome in hESCs. Several Forkhead box (FOX) transcription factors have also been implicated in the control of human pluripotency. FOXO1 binds and regulates the promoters of OCT4 and SOX2 (Zhang et al., 2011). In addition, a specific level of FOXD3 is required for hESC self-renewal: overexpression of FOXD3 induces differentiation to paraxial mesoderm, whereas depletion of FOXD3 generates mesodermal and endodermal derivatives (Arduini and Brivanlou, 2012).
Growth factor signaling
There has been a concerted effort to define the signaling requirements of hESCs since their initial derivation on mouse embryonic fibroblasts (MEFs) in serum-containing media (Thomson et al., 1998). Two growth factors have been identified that can replace the requirement for MEFs or for MEF-conditioned medium: transforming growth factor β (TGFβ)/Activin/Nodal, which signals through the downstream effectors SMAD2/3 (James et al., 2005; Beattie et al., 2005); and basic fibroblast growth factor (FGF2) (Xu et al., 2005). Recombinant provision of Activin and FGF2 allows hESCs to be maintained in the absence of both MEFs and serum (Vallier et al., 2005). But how do these signaling pathways contribute to maintenance of the pluripotent state?
FGF/ERK signaling was reported to induce the expression of NANOG in hESCs, but the underlying mechanism remains unclear (Greber et al., 2010; Yu et al., 2011). On the other hand, direct binding of SMAD2/3 to the NANOG promoter suggests that the TGFβ/Activin/Nodal pathway directly regulates the expression of this core transcription factor (Vallier et al., 2009; Xu et al., 2008; Greber et al., 2008) (Fig. 2B). Interestingly, the transcription of components of the TGFβ signaling pathway is enriched in human EPI cells (Blakeley et al., 2015). Moreover, NANOG protein expression is reduced in human blastocysts treated with a TGFβ inhibitor, but this is not seen in treated mouse blastocysts (Blakeley et al., 2015). Thus, the TGFβ-mediated regulation of NANOG appears common to human pluripotent cells in vitro and in vivo. Inhibition of TGFβ/Activin/Nodal signaling in hESCs rapidly suppresses NANOG expression and induces neural differentiation (Chambers et al., 2009). This is consistent with the aforementioned role of NANOG in repressing neuroectoderm-associated genes.
Wnt signaling has also been implicated in regulating hESCs, although its role has been harder to define due to its dose-dependent effects. It was initially reported that hESC self-renewal can be maintained by providing recombinant Wnt3a or by inhibiting glycogen synthase kinase 3 (GSK3), the enzyme that targets β-catenin for degradation (Sato et al., 2004). However, Moon and colleagues showed that the long-term treatment of hESCs with Wnt3a actually compromises their self-renewal (Davidson et al., 2012). A similar phenotype was observed by increasing the concentration of GSK3 inhibitor, which resulted in a more robust induction of β-catenin transcriptional activity. Variable levels of endogenous Wnt pathway activity have been linked to the heterogeneous differentiation propensity of hESCs: whereas Wnt-high cells predominantly form mesendoderm, Wnt-low cells are biased towards neuroectoderm (Blauwkamp et al., 2012). A function for endogenous Wnt signals in driving exit from pluripotency was also observed in mEpiSCs (Kurek et al., 2015), and this thus appears to be a conserved feature of PSCs that resemble the post-implantation EPI. It should be noted, however, that the pro-differentiation role of Wnt/β-catenin signaling in mEpiSCs/hESCs is distinct from its effects in mESCs, where Wnt signals and GSK3 inhibition are known to reinforce the pluripotency network and suppress differentiation (Ying et al., 2008; ten Berge et al., 2011; Wray et al., 2011; Yi et al., 2011).
Extensive crosstalk has been proposed to occur among the major signaling pathways in hESCs (Singh et al., 2012). The model of Singh et al. centers on PI3K/AKT signaling as a node that modulates inputs from FGF/ERK, Activin/Nodal and Wnt/β-catenin signaling (Fig. 2B). Activation of PI3K/AKT by growth factors such as FGF2, IGF1 or heregulin (neuregulin 1) keeps phosphorylated SMAD2/3 levels within a tight range that allows for the activation of pluripotency genes, including NANOG. Conversely, the withdrawal of PI3K/AKT stimulation leads to elevated SMAD2/3 activity and to the induction of mesendoderm genes. According to this model, PI3K/AKT also inhibits MEK/ERK signaling and maintains GSK3β activity, which in turn blocks β-catenin-mediated stimulation of developmental genes (Singh et al., 2012).
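The logic of this model can be made explicit with a deliberately coarse Boolean sketch. Everything below is an illustrative assumption rather than a published implementation: each pathway is reduced to on/off, the GSK3β step is taken from the legend of Fig. 2B, and the behavior of MEK/ERK, GSK3β and β-catenin after PI3K/AKT withdrawal is extrapolated as the simple inverse of the active state.

```python
def signaling_state(pi3k_akt_active):
    """Coarse Boolean reading of the crosstalk model of Singh et al. (2012).

    All names and the on/off encoding are hypothetical simplifications.
    """
    # PI3K/AKT keeps SMAD2/3 activity within a tight range; without it,
    # SMAD2/3 activity becomes elevated.
    smad23_activity = "tight range" if pi3k_akt_active else "elevated"

    # PI3K/AKT inhibits MEK/ERK and maintains GSK3beta activity.
    mek_erk_active = not pi3k_akt_active
    gsk3b_active = pi3k_akt_active

    # Active GSK3beta blocks beta-catenin-mediated gene stimulation.
    beta_catenin_targets_stimulated = not gsk3b_active

    return {
        "smad23_activity": smad23_activity,
        "pluripotency_genes_supported": smad23_activity == "tight range",
        "mesendoderm_genes_induced": smad23_activity == "elevated",
        "mek_erk_active": mek_erk_active,
        "gsk3b_active": gsk3b_active,
        "beta_catenin_targets_stimulated": beta_catenin_targets_stimulated,
    }


for active in (True, False):
    print("PI3K/AKT active:" if active else "PI3K/AKT withdrawn:")
    for key, value in signaling_state(active).items():
        print(f"  {key}: {value}")
```

A Boolean encoding cannot capture the dose dependence that the model emphasizes for SMAD2/3; it only makes the direction of each proposed interaction easy to inspect.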
Chromatin-modifying enzymes
The chromatin landscape of pluripotent cells was first charted in mESCs. This work surprisingly revealed the genome-wide colocalization of active and repressive chromatin marks: tri-methylation of lysine 4 on histone 3 (H3K4me3) and tri-methylation of lysine 27 on histone 3 (H3K27me3), respectively (Bernstein et al., 2006; Azuara et al., 2006). Such a ‘bivalent’ chromatin signature is thought to mark genes that are repressed in ESCs but poised to allow for alternative fates. H3K4me3 and H3K27me3 also overlap at thousands of genes in hESCs, but few genes exhibit H3K27me3 alone (Pan et al., 2007; Zhao et al., 2007). This indicates that bivalency is the default chromatin state at key developmental control genes marked by H3K27me3 in hESCs (Fig. 2C).
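The logic behind this classification can be illustrated with a minimal sketch. It assumes that ChIP-Seq peaks have already been assigned to promoters, so that each mark is represented simply as a set of gene identifiers. The function names and the placeholder gene identifiers are hypothetical and are not taken from the cited studies.

```python
from typing import Dict, Iterable, Set


def classify_promoters(
    genes: Iterable[str], k4_marked: Set[str], k27_marked: Set[str]
) -> Dict[str, str]:
    """Label each gene by which histone mark(s) overlap its promoter."""
    labels = {}
    for gene in genes:
        has_k4 = gene in k4_marked    # H3K4me3 (active mark) present
        has_k27 = gene in k27_marked  # H3K27me3 (repressive mark) present
        if has_k4 and has_k27:
            labels[gene] = "bivalent"
        elif has_k4:
            labels[gene] = "H3K4me3-only"
        elif has_k27:
            labels[gene] = "H3K27me3-only"
        else:
            labels[gene] = "unmarked"
    return labels


def bivalent_fraction_of_k27(labels: Dict[str, str]) -> float:
    """Fraction of H3K27me3-marked genes that also carry H3K4me3."""
    k27_total = sum(
        1 for state in labels.values() if state in ("bivalent", "H3K27me3-only")
    )
    bivalent = sum(1 for state in labels.values() if state == "bivalent")
    return bivalent / k27_total if k27_total else 0.0


if __name__ == "__main__":
    # Placeholder identifiers for illustration only (not real measurements).
    genes = ["gene_1", "gene_2", "gene_3", "gene_4", "gene_5"]
    k4 = {"gene_1", "gene_2", "gene_3"}
    k27 = {"gene_2", "gene_3", "gene_4"}
    labels = classify_promoters(genes, k4, k27)
    print(labels)
    print(bivalent_fraction_of_k27(labels))
```

A value of `bivalent_fraction_of_k27` close to 1 would correspond to the situation described above, in which few genes exhibit H3K27me3 alone.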
Deposition of H3K27me3 is mediated by Polycomb repressive complex 2 (PRC2; see Glossary, Box 1) via its catalytic subunit EZH1/2. Chromatin immunoprecipitation followed by DNA sequencing (ChIP-Seq; see Glossary, Box 1) studies by the Bernstein lab have revealed that many bivalent domains in mouse and human ESCs are also occupied by the PRC1 complex, and that such types of bivalent domains more efficiently retain repressive chromatin marks during differentiation (Ku et al., 2008). The precise mechanisms governing Polycomb recruitment to specific genomic targets remain unknown (reviewed by Holoch and Margueron, 2017), but a recent study in mESCs showed that Polycomb-associated proteins have a unique DNA recognition motif that specifically binds to unmethylated CpG sites (Li et al., 2017). In addition, PRDM14 has been reported to direct PRC2 to specific developmental regulators in hESCs (Chan et al., 2013a), and a significant fraction of PRC2 targets in mouse and human ESCs is co-occupied by OSN (Boyer et al., 2005; Lee et al., 2006).
The genetic ablation of various enzymes belonging to the PRC2 complex causes the removal of repressive H3K27me3 marks, reduced self-renewal and upregulation of mesendoderm genes in hESCs (Collinson et al., 2016; Shan et al., 2017). By contrast, the loss of Eed (which encodes a PRC2 component) has little impact on the transcriptome of mESCs grown in chemically defined 2i/L conditions (see Glossary, Box 1) (Galonska et al., 2015), indicating that mESCs can self-renew in the complete absence of H3K27me3. Enzymes that regulate the levels of H3K4me3 are also required for the self-renewal of hESCs. Lysine-specific demethylase 1 (LSD1; also known as KDM1A) keeps the levels of H3K4 methylation in check at the promoters of developmental genes marked by bivalent domains (Adamo et al., 2011). By contrast, the DPY30-COMPASS methyltransferase promotes H3K4me3 deposition at a subset of pluripotency genes and bivalent genes involved in mesendoderm specification (Bertero et al., 2015). Hence, enzymes controlling both active and repressive histone marks are essential for hESC identity (Fig. 2C).
In addition to their dependence on histone-modifying enzymes, hESCs are exquisitely sensitive to the loss of DNMT1, the DNA methyltransferase responsible for maintaining CpG methylation during DNA replication. Meissner and colleagues have shown that ablating the de novo methyltransferases DNMT3A and/or DNMT3B in hESCs causes only a modest reduction in DNA methylation content, and does not compromise hESC self-renewal (Liao et al., 2015). However, the deletion of DNMT1 in hESCs causes global DNA demethylation and cell death. This dependence of hESCs on DNMT1 contrasts with the role of DNMTs in mESCs; mESCs tolerate the combined removal of all three DNMT enzymes and the complete loss of DNA methylation (Tsumura et al., 2006). Together, these examples demonstrate that certain gene-silencing mechanisms, while dispensable for the self-renewal of mESCs, are functionally required in hESCs.
Functional roles of non-coding RNAs
MicroRNAs (miRNAs) are small RNAs (∼22 nt) that mediate the post-transcriptional repression of mRNA targets (Bartel, 2009). The miR-302/367 cluster comprises the most abundantly expressed miRNA family in hESCs (Suh et al., 2004; Morin et al., 2008; Bar et al., 2008). miR-302/367 is thought to repress neural differentiation in hESCs by modulating Nodal and BMP signaling (Rosa et al., 2009; Lipchina et al., 2011) and by downregulating NR2F2 (Coup-TFII), a transcription factor that induces neural differentiation (Rosa and Brivanlou, 2011) (Fig. 2D). The expression of miR-302/367 is positively influenced by OCT4, which binds directly to the miR-302/367 promoter in mouse and human ESCs (Marson et al., 2008). Other families of miRNAs counteract the pluripotency network. For example, miR-145 post-transcriptionally silences the expression of OCT4, SOX2 and KLF4 during hESC differentiation (Xu et al., 2009). Another well-studied example of an miRNA that antagonizes pluripotency is let-7, the processing of which into mature miRNAs is blocked by the RNA-binding protein LIN28 in mouse and human ESCs (Viswanathan et al., 2008; Rybak et al., 2008; Newman et al., 2008).
Long non-coding RNAs (lncRNAs) are another class of non-coding RNAs that contribute to the transcriptional regulation of hESCs, and are associated with endogenous retroviral elements. One such lncRNA derives from retrovirally encoded HERVH transcripts and functions by recruiting OCT4 and co-activators, such as p300 (EP300), to activate the expression of nearby genes (Lu et al., 2014) (Fig. 2D). Reijo Pera and colleagues (Durruthy-Durruthy et al., 2016a) also identified three transposon-derived lncRNAs, called HPAT2, HPAT3 and HPAT5, that are found in hESCs and are also highly expressed in the human blastocyst. The injection of siRNAs that target HPAT2, HPAT3 and HPAT5 into human embryos at the 2-cell stage prevents the injected blastomeres from contributing to the ICM. Hence, these lncRNAs are required for the establishment of human pluripotent cells in vivo (Durruthy-Durruthy et al., 2016a). The exogenous expression of HPAT5 inhibits hESC differentiation by interfering with the expression and activity of let-7 miRNAs, which reveals an interesting example of functional crosstalk between miRNAs and lncRNAs (Fig. 2D). Like the miRNA families discussed above, many transposon-derived lncRNAs are under the direct transcriptional regulation of core pluripotency factors in hESCs (Santoni et al., 2012; Durruthy-Durruthy et al., 2016a; Fort et al., 2014).
Other determinants
Several additional layers of gene expression control have been implicated in hPSCs. First, numerous links have been identified between the levels of metabolites and epigenetic marks in hESCs (reviewed by Mathieu and Ruohola-Baker, 2017). Notably, methionine metabolism provides a substrate for histone methylation (Shiraki et al., 2014), while glycolysis provides a substrate for histone acetylation in hESCs (Moussaieff et al., 2015) (Fig. 2E, left). Second, it has also been shown that the spliceosome ensures the accurate splicing of pluripotency-associated transcripts (Lu et al., 2013) and is known to produce a pluripotency-specific isoform of the transcription factor FOXP1 (Gabut et al., 2011) (Fig. 2E, middle). Third, cohesin-associated chromatin loops are known to establish boundaries between insulated neighborhoods (see Glossary, Box 1), which constrain enhancer-promoter interactions at loci such as PRDM14 (Ji et al., 2016) (Fig. 2E, right). Thus, global gene expression in hESCs is regulated by a complex web of interactions among transcription factors, chromatin modifiers, metabolites and 3D chromatin structure.
Transcriptome and epigenome remodeling during the transitions into and out of pluripotency
In this final section, we examine how the regulatory network that governs human pluripotency is established and dynamically remodeled during three pivotal cell fate transitions: the induction of pluripotency by defined factors, the interconversion between distinct pluripotent states, and the exit from pluripotency. A graphical summary of the major molecular changes during these three cell fate transitions is provided in Fig. 3.
Fig. 3. Molecular dynamics during the transitions into and out of human pluripotency. Summary of the transcriptional and epigenetic changes during cell fate transitions involving hPSCs. (A) Sequence of events occurring during the reprogramming of human somatic cells to pluripotency upon overexpression of OCT4, SOX2, KLF4 and c-MYC (OSKM) (Cacchiarelli et al., 2015). (B) Events occurring during the primed-to-naive PSC transition. Collier et al. (2017) assessed transcriptional dynamics at early and late stages of naive resetting upon overexpression of KLF2 and NANOG (KN) transgenes in primed hESCs and transfer to t2i/L+Gö medium (Takashima et al., 2014) or by switching primed hESCs to 5i/L/A medium (Theunissen et al., 2014). Note that the Smith laboratory recently showed that hPSCs can also be reset to naive pluripotency by transient histone deacetylase inhibition and transfer to t2i/L+Gö medium (Guo et al., 2017). The dynamics of X-chromosome reactivation during naive resetting were analyzed by Sahakyan et al. (2017). Naive hPSCs derived in t2i/L+Gö or 5i/L/A display globally reduced DNA methylation levels and erasure of most imprinted regions (Pastor et al., 2016; Theunissen et al., 2016). (C) The exit from human pluripotency, specifically the relationship between the cell cycle and lineage commitment of hPSCs (Pauklin and Vallier, 2013; Pauklin et al., 2016). The existence of a formative phase, during which PSCs acquire competence for multilineage and germ cell induction, has recently been proposed (Smith, 2017). OxPHOS, oxidative phosphorylation; Xi, inactive X.
The induction of pluripotency in human somatic cells
Since the discovery that somatic cells can be reprogrammed to pluripotency by exogenous transcription factors (Takahashi and Yamanaka, 2006; Takahashi et al., 2007; Yu et al., 2007), there has been significant interest in elucidating how reprogramming factors act. Zaret and colleagues performed ChIP-Seq analyses of OCT4, SOX2, KLF4 and c-MYC (collectively OSKM) in human fibroblasts to examine their initial engagement with chromatin (Soufi et al., 2012). They reported that OSKM predominantly bind to distal regulatory regions after 48 h of reprogramming factor expression. While OCT4, SOX2 and KLF4 (OSK) act as ‘pioneer factors’ (see Glossary, Box 1) that open up closed chromatin, c-MYC facilitates the engagement of OSK with chromatin. Collectively, OSKM activate genes in fibroblasts that are known to promote reprogramming, such as the transcription factor GLIS1 (Maekawa et al., 2011) and miR-302/367 (Anokye-Danso et al., 2011). Early OSKM binding was also enriched at genes with roles in apoptosis and senescence, including TP53 (Soufi et al., 2012). This is consistent with prior reports that the cell death pathway is induced rapidly after expression of the reprogramming factors in human fibroblasts (Kawamura et al., 2009). A recent analysis of OSK activity during mouse reprogramming indicates that these reprogramming factors do not act in isolation, however, but rely on collaborative interactions with stage-specific transcription factors to silence somatic enhancers and to activate pluripotency-specific enhancers (Chronis et al., 2017).
Analyzing the changes in gene expression and chromatin state during reprogramming is an arduous task due to the heterogeneous and inefficient nature of the process. Mikkelsen and colleagues (Cacchiarelli et al., 2015) took advantage of immortalized ‘secondary’ fibroblasts to characterize the global transcriptional and epigenomic events that occur during human reprogramming (Fig. 3A). Secondary fibroblasts are derived from a clonal line of human iPSCs that was generated with drug-inducible transcription factors, and therefore contain a uniform stoichiometry of the reprogramming factors (Hockemeyer et al., 2008). Using these cells, it was shown that, shortly after factor induction, genes associated with fibroblast identity are downregulated, whereas genes important for proliferation and metabolic reprogramming are activated by day 5 (Cacchiarelli et al., 2015). Subsequently, the fibroblasts transiently acquire a mesendodermal gene expression signature, in agreement with a prior study (Takahashi et al., 2014). As fibroblasts are of mesodermal origin, this intermediate state marks a reversal of the normal order of development. Another transient wave of gene expression was observed at the later stages of reprogramming when certain mRNAs (e.g. DPPA3, DNMT3L, TFCP2L1) and miRNAs (miR-371) that are associated with human pre-implantation development are upregulated. However, this pre-implantation gene expression signature is lost again as the hiPSCs acquire a post-implantation identity typical of hESCs. This final step is also accompanied by a gain of DNA hypermethylation and of bivalent chromatin domains (Cacchiarelli et al., 2015). It will be of interest to investigate whether passage through these transient mesendodermal and pre-implantation states is required for successful reprogramming.
Understanding the sources of gene expression variability between iPSC lines is imperative for establishing well-controlled models of human disease. The NextGen Consortium and HipSci Initiative combined whole-genome DNA and RNA sequencing in hundreds of iPSC lines to quantify the fraction of transcriptional variability caused by genetic background (Carcamo-Orive et al., 2017; DeBoever et al., 2017; Kilpinen et al., 2017). These analyses revealed that differences between individuals contribute the largest proportion of variation between iPSC lines (50-53% of variation at the single-gene level, depending on the study). In addition, a significant fraction of inter-individual variability is driven by expression quantitative trait loci (eQTL; see Glossary, Box 1) specific to iPSCs, which are enriched at the binding sites for OCT4, NANOG and other pluripotency regulators (Carcamo-Orive et al., 2017; DeBoever et al., 2017; Kilpinen et al., 2017). By contrast, non-genetic background-associated transcriptional variability is largely driven by the skewed deposition of H3K27me3 silencing marks at Polycomb target genes (Carcamo-Orive et al., 2017). This is consistent with the findings from an RNAi screen that revealed that Polycomb enzymes are important regulators of human somatic cell reprogramming (Onder et al., 2012).
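To make the notion of a "proportion of variation" attributable to individuals concrete, the following minimal sketch computes, for a single gene, the share of the total sum of squares that lies between donors. This one-way decomposition is a deliberate simplification chosen for illustration and is not the statistical model used in the cited studies. The function name, donor labels and expression values are hypothetical placeholders.

```python
from statistics import mean
from typing import Dict, List


def donor_variance_fraction(expr_by_donor: Dict[str, List[float]]) -> float:
    """Share of the total sum of squares explained by donor identity.

    expr_by_donor maps each donor to the expression values of one gene
    measured in the iPSC lines derived from that donor.
    """
    all_values = [v for values in expr_by_donor.values() for v in values]
    grand_mean = mean(all_values)
    # Total variation of all lines around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    # Variation of donor means around the grand mean, weighted by line count.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in expr_by_donor.values()
    )
    return ss_between / ss_total if ss_total > 0 else 0.0


if __name__ == "__main__":
    # Placeholder values for illustration only (not real measurements).
    toy = {"donor_1": [1.0, 1.2], "donor_2": [3.0, 3.2]}
    print(donor_variance_fraction(toy))
```

In this toy input the lines from each donor resemble each other far more than they resemble lines from the other donor, so nearly all of the variation is assigned to donor identity; identical donor means would give a fraction of zero.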
An accompanying study from the NextGen Consortium examined variability in DNA methylation between iPSC lines obtained from multiple pairs of monozygotic twins (Panopoulos et al., 2017). This work revealed that aberrant DNA methylation is not primarily driven by genetic variation, but seems to occur at regulatory regions that contain MYC or MYC-like binding sites in a clone-specific manner. The process of epigenetic reprogramming – whether by defined factors or by somatic cell nuclear transfer – also renders hPSCs vulnerable to the loss of genomic imprinting, especially at paternally methylated imprints (Bar et al., 2017). These large-scale analyses offer a cautionary note about the extent of variability between hiPSC lines and provide a wealth of data to further dissect the molecular drivers of gene expression variability during reprogramming.
Interconversions between distinct human pluripotent states
Two distinct pluripotent stem cell states can be derived from mouse embryos: mESCs derived from the blastocyst represent an uncommitted ‘naive’ state, whereas mEpiSCs derived from the post-implantation EPI are designated as ‘primed’ (see Glossary, Box 1) (Nichols and Smith, 2009). Evidence has accumulated that hESCs and hiPSCs are in a primed pluripotent state analogous to that observed in mEpiSCs. hPSCs share basic biological properties with mEpiSCs, including a flat morphology, dependence on FGF/Activin and propensity for X-chromosome inactivation (XCI) (Brons et al., 2007; Tesar et al., 2007). Like mEpiSC lines, hPSC lines also frequently display lineage bias during differentiation (Osafune et al., 2008; Bock et al., 2011; Nishizawa et al., 2016; Blauwkamp et al., 2012). In addition, global gene expression in hPSCs correlates most strongly with that of the primate late post-implantation EPI (Nakamura et al., 2016), reaffirming their primed identity. These observations have generated widespread interest in methods to isolate naive hPSCs (Hanna et al., 2010; Gafni et al., 2013; Chan et al., 2013b; Ware et al., 2014; Takashima et al., 2014; Theunissen et al., 2014; Guo et al., 2016; Qin et al., 2016; Zimmerlin et al., 2016; Guo et al., 2017; Liu et al., 2017).
Recent studies have begun to explore the transcriptional and epigenetic dynamics that occur during the naive ‘resetting’ of conventional hPSCs and their subsequent ‘repriming’. For example, Sahakyan et al. (2017) examined the status of the X chromosome in female naive hESCs derived using two alternative culture conditions: t2i/L+Gö (Takashima et al., 2014) and 5i/L/A (see Glossary, Box 1) (Theunissen et al., 2014). These two conditions induce a global transcriptional profile that correlates strongly with that of human and monkey pre-implantation embryos (Huang et al., 2014; Nakamura et al., 2016). The conversion of primed hESCs to the naive state results in X-chromosome reactivation together with the expression of XIST, although XIST is predominantly expressed from a single X chromosome (Sahakyan et al., 2017). Unlike primed hESCs that undergo irreversible ‘erosion’ of the inactive X chromosome, naive hESCs can initiate de novo XCI upon differentiation, albeit in a non-random manner (Sahakyan et al., 2017). Lanner and colleagues defined combinations of cell-surface markers that could distinguish primed from naive hESCs (Collier et al., 2017). Using these markers, these authors tracked temporal changes in gene expression during the primed-to-naive conversion (Fig. 3B). Whereas some naive-specific genes (such as DPPA3 and TBX3) are activated relatively early, others (such as DPPA5, KLF17 and ZFP57) are fully induced only upon subsequent passaging (Collier et al., 2017). Naive resetting of hPSCs also involves profound metabolic changes, including increased oxidative phosphorylation (Takashima et al., 2014; Sperber et al., 2015) and glycolytic metabolism (Gu et al., 2016), and confers tolerance to the removal or inhibition of PRC2 enzymes (Moody et al., 2017; Shan et al., 2017).
Given their stage-specific patterns of expression during early human development, it has been proposed that transposable elements might serve as biomarkers to distinguish between distinct human pluripotent states. Initial work based on the culture of hESCs in 2i/L suggested that HERVH might be a marker of naive human pluripotency (Wang et al., 2014). However, it was reported that naive cells cultured in t2i/L+Gö or 5i/L/A show more specific upregulation of LTR5-HERVK and SVA-D integrants (Theunissen et al., 2016; Collier et al., 2017; Guo et al., 2017) (Fig. 3B). HERVK is actively transcribed during human embryogenesis from the 8-cell stage, and encodes viral proteins that are expressed in the human blastocyst (Goke et al., 2015; Grow et al., 2015). SVA-D elements are highly expressed at the human morula and blastocyst stages and are part of a hominid-specific family of retroelements (Hancks and Kazazian, 2010). The full repertoire of naive-specific transposable elements was not activated until after passaging and subsequent maturation of the naive phenotype in the course of primed-to-naive resetting (Collier et al., 2017). Whether these transposon families simply provide biomarkers of distinct human pluripotent states or also have functional significance remains to be elucidated.
The above studies indicate that naive hESCs offer a window into understanding the mechanisms of human pre-implantation development that are difficult to model in conventional hESCs, such as X-chromosome reactivation and inactivation, and the role of early embryonic transposons. However, current naive hESCs do not perfectly resemble pluripotent cells in the human blastocyst, as exemplified by the loss of imprint methylation in t2i/L+Gö or 5i/L/A (Pastor et al., 2016; Theunissen et al., 2016; Guo et al., 2017), and the observation that naive hESCs maintained in 5i/L/A are prone to genetic instability (Theunissen et al., 2014; Pastor et al., 2016; Liu et al., 2017). Thus, it remains to be determined whether strategies can be devised to capture naive hESCs that are as robust as their mouse counterparts, an advance that could facilitate a wide range of applications in regenerative medicine. While the contribution of currently available naive hESCs to interspecies chimeras has been inefficient (reviewed by Wu et al., 2016a), a recent study reported robust interspecies chimerism upon injection of human extended pluripotent stem cells (EPS cells; see Glossary, Box 1), which were derived in a newly defined chemical culture condition (Yang et al., 2017). The exact developmental identity of EPS cells remains unclear, but these data suggest that alternative pluripotent states that have enhanced proliferative capacity could enable human cells to more efficiently contribute to interspecies chimeras. Finally, the generation of hPSCs with distinct developmental identities might also facilitate other research applications, such as the induction of human PGC-like cells (von Meyenn et al., 2016; Irie et al., 2015), and more efficient genome editing (Yang et al., 2016).
Exiting the human pluripotency program
How is the human pluripotency network disassembled during the onset of lineage commitment and what are the accompanying transcriptional and epigenetic changes? In an attempt to address this, Ng and colleagues performed a high-throughput RNAi screen to identify the mechanisms that promote exit from the human pluripotent state (Gonzales et al., 2015). hESCs containing a NANOG-GFP reporter transgene were transfected with siRNAs 24 h before induction of differentiation. As the expression of NANOG is tightly linked to pluripotency, the depletion of genes that promote dissolution of pluripotency is expected to delay the downregulation of NANOG-GFP reporter activity during differentiation. This screen identified cell cycle regulators that are involved in DNA replication during S phase or in the G2-to-M transition as regulators of pluripotent state dissolution (Gonzales et al., 2015). As depletion of these factors effectively arrests the cells in S or G2, these cell cycle phases appear to be intrinsically inclined towards maintenance of pluripotency. The underlying mechanism rests on the activation of the ATM/ATR-p53 axis during S phase and of cyclin B1 during G2 phase, which stimulate the expression of TGFβ-related genes (Gonzales et al., 2015).
By contrast, the G1 phase of the cell cycle has been identified as being favorable to hESC differentiation. Using the fluorescent ubiquitination-based cell cycle indicator (FUCCI; see Glossary, Box 1) reporter system to isolate hESCs in different stages of the cell cycle, Pauklin and Vallier demonstrated that hESCs in early G1 can only initiate differentiation into endoderm, while late G1 is permissive for ectoderm differentiation. This partitioning of G1 is the result of increasing levels of Cyclin D-CDK4/6 in the course of G1, which has two major consequences: (1) the removal of SMAD2/3 from target genes by Cyclin D-CDK4/6 (Pauklin and Vallier, 2013); and (2) the direct transcriptional repression of endoderm genes and activation of ectoderm genes by Cyclin D (Pauklin et al., 2016) (Fig. 3C). The G1 phase also has a unique epigenetic configuration that renders it transcriptionally competent to exit from pluripotency: many developmental genes transiently gain bivalent domains during G1, but are only marked by H3K27me3 during the remainder of the cell cycle (Singh et al., 2015). Collectively, these studies demonstrate that the phase of the cell cycle contributes to the decision of whether to maintain or dissolve the pluripotent state in response to lineage-inductive cues.
Genome-wide studies have revealed that the lineage specification and differentiation of hESCs involve broad changes in DNA methylation, in active and repressive histone marks, and in transcription factor binding (Xie et al., 2013; Gifford et al., 2013; Tsankov et al., 2015). An important focus for future work will be to define the order in which these regulatory dynamics occur during exit from pluripotency. In view of their heterogeneity, lineage priming and post-implantation identity, conventional (primed) hESCs are an imperfect model for addressing these questions. By contrast, naive hESCs may provide a more uniform and unbiased starting point to investigate how the pluripotent state becomes disassembled. It should be noted, however, that naive hESCs derived in t2i/L+Gö or 5i/L/A appear to be less responsive to lineage-inductive cues, although they can differentiate efficiently after re-exposure to primed culture conditions (Takashima et al., 2014; Sahakyan et al., 2017; Guo et al., 2017; Liu et al., 2017). This might reflect a developmental requirement for transition through a formative phase in which pluripotent cells acquire increased capacity to respond to specification cues (Smith, 2017). This formative period may last longer in humans, as the primate post-implantation EPI maintains a relatively stable transcriptome for one week (Nakamura et al., 2016). Hence, an important objective will be to better understand the transition between naive and formative pluripotency, and determine whether it is possible to capture formative hPSCs in culture.
Future perspectives
Research in the past decade has illuminated fundamental aspects of genome regulation in early human embryos and PSCs, but some significant gaps in our understanding remain. First, while gene expression dynamics during human pre-implantation development have been studied in depth by whole-embryo and single-cell RNA sequencing, information about the post-implantation stages remains limited. The ability to culture human embryos in vitro beyond implantation paves the way for a detailed investigation of this critical period of human development (Deglincerti et al., 2016; Shahbazi et al., 2016). Second, studies of chromatin accessibility and histone modifications using limited numbers of cells have revealed that dramatic chromatin reorganization occurs during mouse pre-implantation development (Liu et al., 2016; Zhang et al., 2016; Dahl et al., 2016; Wu et al., 2016b). We anticipate that such approaches will also offer key insights into epigenetic aspects of early human embryogenesis. Third, recent advances in genome editing present an opportunity to examine the function of human pluripotency regulators in their in vivo context, as recently shown for OCT4 (Fogarty et al., 2017). Fourth, whereas the composition of pluripotency-associated protein complexes has been analyzed in mESCs (Wang et al., 2006; Rafiee et al., 2016; Costa et al., 2013; Ding et al., 2015; Gagliardi et al., 2013), comparable studies have not been performed in hESCs. Such biochemical approaches might reveal connections among seemingly disparate regulatory modules and could identify novel reprogramming factors. Finally, there remains a strong interest in deriving novel types of human PSCs that reflect distinct stages of human embryogenesis. Efforts in this regard are guided by emerging insights into the transcriptional and epigenetic status of pluripotent cells in early primate embryos, which provide a set of rigorous benchmarks to assess the developmental identity of hPSCs (Boroviak and Nichols, 2017). 
The ability to capture hPSCs with distinct embryonic identities will enable researchers to more precisely reconstruct the earliest steps of human development in vitro.
Acknowledgements
We thank Steven Lee for assistance with illustrations.
Funding
T.W.T. is supported by a Sir Henry Wellcome Postdoctoral Fellowship from the Wellcome Trust (098889/Z/12/Z) and R.J. is supported by grants from the National Institutes of Health. Deposited in PMC for release after 12 months.
Competing interests
R.J. is co-founder of Fate Therapeutics, Fulcrum Therapeutics and Omega Therapeutics.