ABSTRACT
Development and homeostasis rely upon concerted regulatory pathways to establish the specialized cell types needed for tissue function. Once a cell type is specified, the processes that restrict and maintain cell fate are equally important in ensuring tissue integrity. Over the past decade, several approaches to experimentally reprogram cell fate have emerged. Importantly, efforts to improve and understand these approaches have uncovered novel molecular determinants that reinforce lineage commitment and help resist cell fate changes. In this Review, we summarize recent studies that have provided insights into the various chromatin factors, post-transcriptional processes and features of genomic organization that safeguard cell identity in the context of reprogramming to pluripotency. We also highlight how these factors function in other experimental, physiological and pathological cell fate transitions, including direct lineage conversion, pluripotency-to-totipotency reversion and cancer.
Introduction
Maintaining cell identity is fundamentally important to ensure the function of specialized cells and tissues in organisms. The canonical model of differentiation posits that developmental plasticity is progressively restricted as cells acquire a more specialized fate (Fig. 1). However, we now know that cell fate change is not necessarily unidirectional. Indeed, pioneering work in both amphibians and mammals demonstrated that cellular plasticity can be experimentally manipulated in the context of somatic cell nuclear transfer (SCNT) (Campbell et al., 1996; Gurdon et al., 1958). Subsequent work on transcription factor (TF)-mediated reprogramming in both mouse (Davis et al., 1987; Takahashi and Yamanaka, 2006) and human (Takahashi et al., 2007; Yu et al., 2007) reinforced this view and provided a more accessible experimental method to control cell fate change, albeit with varying and relatively low efficiency (Fig. 1) (Apostolou and Hochedlinger, 2013; Gurdon and Melton, 2008; Hanna et al., 2010; Stadtfeld and Hochedlinger, 2010; Yamanaka and Blau, 2010). Given the therapeutic potential of the reprogramming approach, subsequent studies focused on improving the efficiency of TF-mediated reprogramming (Bar-Nur et al., 2014; Di Stefano et al., 2014; Esteban et al., 2010; Onder et al., 2012; Tran et al., 2019a; Vidal et al., 2014; Yamanaka, 2007). These and other studies demonstrated that, although it is possible to modulate the identity of certain cell types, many safeguarding mechanisms exist to prohibit reprogramming from occurring naturally.
Developmental plasticity. During differentiation, plasticity progressively decreases as safeguarding mechanisms are established to restrict cell fate. These safeguards must be overcome during reprogramming. Many of the same mechanisms play important roles in alternative cell fate changes (e.g. during transdifferentiation) and in cancer (e.g. during cellular transformation). ESC, embryonic stem cell; iPSC, induced pluripotent stem cell.
By exposing these safeguards, reprogramming approaches (see Box 1 and Fig. 2) have emerged as powerful tools for identifying and characterizing barriers to cell fate change. Not surprisingly, disruption of these barriers frequently results in disease, particularly cancer. In this Review, we highlight recent work that has characterized barriers to cell fate change, with a focus on factors that function in the context of induced pluripotency. Additionally, we explore the possibility that these barriers play a role during alternative cell fate transitions as well as in tumorigenesis.
Box 1. Reprogramming models
Various tissue culture models of cell fate change have emerged as crucial tools for identifying and characterizing safeguarding mechanisms. Converting differentiated cells into induced pluripotent stem cells (iPSCs) via the expression of TFs represents one of the most drastic examples of cell fate change (Fig. 2). Under standard conditions, this process – often referred to as ‘reprogramming’ – is very inefficient (0.1-3%), rendering it a valuable system that can be used to screen for safeguarding mechanisms whose suppression increases reprogramming efficiency.
Mouse pluripotent stem cells, such as mouse embryonic stem cells (ESCs) maintained in conventional culture conditions (i.e. serum/LIF), spontaneously revert towards a more-primitive totipotent-like state resembling the two-cell (2C) stage of pre-implantation development (Macfarlan et al., 2012). Given the limited increase in developmental potency from a pluripotent state (i.e. a state that can only give rise to embryonic cell types) towards a totipotent state (i.e. a state that can give rise to both embryonic and extra-embryonic cell types), we refer to this process as ‘reversion’ (Fig. 2). Acquisition of a 2C-like state in ESC cultures, like the generation of iPSCs, occurs at extremely low efficiency (∼1%) and thus provides another powerful system with which to identify potential safeguarding regulators.
The transition of epiblast stem cells (EpiSCs) into ESCs represents another example of reversion between two closely related pluripotent states (Gillich et al., 2012; Guo et al., 2009). Whereas ESCs are derived from pre-implantation embryos and resemble the ‘naïve’ preimplantation epiblast (Evans and Kaufman, 1981; Martin, 1981; Nichols and Smith, 2009), EpiSCs are derived from E6.5 embryos and resemble the post-implantation ‘primed’ epiblast, and thus are more developmentally restricted (Brons et al., 2007; Tesar et al., 2007). The reversion of EpiSCs to ESCs is reportedly as inefficient as that of ESCs to 2C-like cells or the reprogramming of somatic cells into iPSCs.
The direct conversion of one differentiated cell type into another using ectopic TF expression represents a final example of cell fate change (reviewed by Vierbuchen and Wernig, 2012). Because no increase in developmental potency is typically achieved in this type of cell fate change, it is commonly referred to as ‘transdifferentiation’. As may be expected from the closer relationship between the converted cell types, transdifferentiation is generally more efficient, ranging from 20-70%.
Notably, reprogramming and transdifferentiation have also been achieved in animals (Abad et al., 2013; Chiche et al., 2017; Ieda et al., 2010; Mosteiro et al., 2016; Ocampo et al., 2016; Ohnishi et al., 2014; Qian et al., 2012; Rouaux and Arlotta, 2013; Song et al., 2012). We do not discuss these in vivo examples of cell fate change here, as most insights into safeguarding mechanisms have so far been gained from in vitro models, which are experimentally more accessible.
Developmental stages and reprogramming. Different reprogramming methods (shown in bold; dedifferentiation, transdifferentiation, reversion and reprogramming) are depicted together with the developmental stage (i.e. the in vivo or in vitro counterpart) that each culture represents.
Chromatin-associated safeguarding mechanisms
As cells differentiate and commit to a specialized cell fate, chromatin-based gene regulation reinforces lineage-specific transcriptional programs and safeguards cell identity (Rasmussen and Helin, 2016). As a result, these chromatin-related barriers must be overcome during the process of induced cell fate change (Fig. 3) (Ang et al., 2011b; Apostolou and Hochedlinger, 2013; Hochedlinger and Plath, 2009). Below, we review the roles of select chromatin assembly pathways, specific histone modifications and DNA methylation in safeguarding cell identity.
Chromatin-based safeguarding of cell identity. Various chromatin factors and transposable elements regulate cell identity by controlling chromatin 3D structure, nucleosome occupancy, histone modifications and the expression of adjacent genes. Activating regulators are shown in green while repressive regulators are shown in red.
Chromatin assembly
Chromatin organization is a crucial mechanism that helps to establish and maintain cellular identity. In support of this notion, the chromatin assembly factor complex CAF-1 was identified as a potent roadblock to TF-mediated reprogramming in a chromatin-focused siRNA screen (Cheloufi et al., 2015). CAF-1 was initially characterized as a replication-dependent histone chaperone that promotes the deposition of histones H3.1 and H4 onto newly synthesized DNA (Smith and Stillman, 1989). Additionally, CAF-1 has been implicated in heterochromatin maintenance through its interactions with H3K9 methylation-dependent readers such as HP1-α and HP1-γ (Murzina et al., 1999). Suppression of CAF-1 subunits (Chaf1a or Chaf1b) accelerates and enhances the reprogramming of fibroblasts to induced pluripotent stem cells (iPSCs) by increasing the accessibility of reprogramming factors to their pluripotency-associated targets (Cheloufi et al., 2015). Moreover, CAF-1 suppression facilitates the opening of H3K9me3-marked heterochromatin domains that were previously shown to resist reprogramming in the context of SCNT (so-called reprogramming-resistant regions or RRRs; see below) (Ishiuchi et al., 2015). Notably, the effects of CAF-1 are dose dependent, i.e. moderate CAF-1 suppression dramatically enhances reprogramming, while strong CAF-1 suppression abrogates reprogramming. In contrast to the repressive effect of CAF-1 on reprogramming, overexpression of the histone chaperone ASF1A, which transfers newly synthesized and acetylated histones H3/H4 to CAF-1, reportedly enhances the generation of iPSCs (Gonzalez-Muñoz et al., 2014). Although this result may seem counterintuitive given that ASF1A acts upstream of CAF-1, we hypothesize that enhanced reprogramming is due to increased deposition of acetylated nucleosomes by ASF1A, leading to a more plastic chromatin state. 
Taken together, these results suggest that CAF-1 safeguards cellular identity by supporting both nucleosome assembly and heterochromatin maintenance, whereas ASF1A may facilitate cell fate change by impacting histone acetylation (Cheloufi and Hochedlinger, 2017).
Consistent with the ubiquitous expression pattern of CAF-1, modulation of its subunits was shown to impact alternative cell fate transitions (Box 1), including reversion, transdifferentiation and tumorigenesis. Specifically, CAF-1 suppression facilitates reversion of embryonic stem cells (ESCs) towards the 2C-like state (Ishiuchi et al., 2015). This process was proposed to involve reactivation of stage-specific transposable elements, which control the expression of nearby 2C-associated transcripts. Indeed, a parallel study identified CAF-1 in an siRNA screen targeting genes that silence repetitive elements in pluripotent cells (Yang et al., 2015).
In the context of direct lineage conversion, CAF-1 suppression was shown to facilitate the transdifferentiation of fibroblasts into neurons following forced expression of the neural TF Ascl1, and of B cells to macrophages following expression of the myeloid TF C/EBPα (Cheloufi et al., 2015). Suppression of the CAF-1 ortholog Lin-53, together with overexpression of the neuronal-specific TF CHE-1, leads to the transdifferentiation of C. elegans germ cells into neurons, suggesting that Lin-53 likewise safeguards germ cell fate (Tursun et al., 2011). More recently, suppression of the histone chaperone FACT, which facilitates transcription by displacing nucleosomes from chromatin, was shown to enhance germ cell-to-neuron and intestine-to-neuron transdifferentiation in C. elegans, as well as human fibroblast-to-iPSC reprogramming and fibroblast-to-neuron transdifferentiation (Kolundzic et al., 2018). These findings suggest that modulation of histone turnover is a more general and conserved mechanism to control cell fate.
Intriguingly, a connection between CAF-1 levels and cell fate has recently been established in cancer cells (Volk et al., 2018). CAF-1 overexpression in a mouse model of leukemia promotes tumorigenesis, whereas heterozygous deletion suppresses cancer formation, which may appear counterintuitive considering the role of CAF-1 in reprogramming. However, these observations share a mechanistic basis. In leukemia, increased CAF-1 levels reportedly interfere with chromatin access of the TFs, such as C/EBPα, that normally promote myeloid differentiation of leukemic cells. Conversely, heterozygous CAF-1 deletion suppresses leukemogenesis by driving leukemic cells into differentiation. Similar to observations made during iPSC reprogramming (Cheloufi et al., 2015), complete loss of CAF-1 is toxic to hematopoietic cells, emphasizing the dose-dependent effects of CAF-1 on cell fate.
Collectively, these examples suggest that CAF-1-dependent nucleosome assembly and heterochromatin maintenance safeguard cell identity, while CAF-1 suppression increases cellular plasticity across diverse physiological, experimental and pathological cell fate transitions. These data further imply that the outcome of increased plasticity following CAF-1 suppression depends on cellular context. In differentiated cells, this characteristic manifests as an increased probability to acquire pluripotency when induced to reprogram, while in leukemic cells, increased plasticity manifests as a renewed ability for cells to differentiate into mature cell types.
Histone H3K9 methylation
Histone H3K9 di- and tri-methylation are typically considered repressive and are frequently found in large regions of constitutive heterochromatin, contributing to repetitive element silencing. The association between H3K9 methylation and cell fate change has been thoroughly studied, both under experimental and physiological conditions (Becker et al., 2017; Chen et al., 2013; Epsztejn-Litman et al., 2008; Matoba et al., 2014; Soufi et al., 2012; Sridharan et al., 2013; Wang et al., 2011; reviewed by Becker et al., 2016; Nicetto and Zaret, 2019). The restrictive influence of H3K9 methylation on induced pluripotency was first appreciated following ChIP-seq analyses of reprogramming factors (Soufi et al., 2012). This work identified megabase-wide regions of the genome that are differentially bound by reprogramming factors in fibroblasts compared with ESCs. These regions, termed ‘differentially bound regions’ or ‘DBRs’, are highly enriched for the H3K9me3 mark and contain protein-coding genes, such as Nanog, that are important for the later stages of reprogramming. The authors hypothesized that H3K9me3 prevents binding of ectopic reprogramming factors at these loci, which effectively impedes the acquisition of pluripotency. Suppressing the H3K9 methyltransferase genes Setdb1, Suv39h1 or Suv39h2 was shown to increase both the kinetics and efficiency of reprogramming (Soufi et al., 2012). These results were corroborated in pre-iPSCs, which represent a late-stage reprogramming intermediate (Chen et al., 2013; Sridharan et al., 2013). Consistent with previous work, exogenously expressed pluripotency factors in pre-iPSCs do not bind canonical target sites on core pluripotency loci, which are correspondingly marked by H3K9me3 (Chen et al., 2013). However, knockdown of H3K9 methyltransferases or overexpression of H3K9 demethylases is sufficient to activate these loci and drive reprogramming of pre-iPSCs to a pluripotent state (Chen et al., 2013; Sridharan et al., 2013).
In line with the aforementioned results, H3K9 methylation acts as a barrier to reprogramming during SCNT (Chung et al., 2015; Matoba et al., 2014; Wei et al., 2017). Based on transcriptional comparison between fertilized and cloned 2C stage embryos, several hundred RRRs were identified that correlate with impaired SCNT. Analogous to DBRs, these regions are significantly enriched for H3K9me3, suggesting that they are not efficiently reactivated during the reprogramming process. Overexpression of H3K9 demethylases or knockdown of Suv39h1/2 decreases H3K9me3 in SCNT-generated embryos, activates expression from RRRs and increases the efficiency of SCNT (Matoba et al., 2014).
Regulation by H3K9 methylation appears to be a more general mechanism for safeguarding cell identity because it similarly restricts the transdifferentiation of human fibroblasts into human-induced hepatic cells (hiHeps) (Becker et al., 2017). Mechanistically, knockdown of SUV39H1 or the H3K9me3 readers RBMX and RBMXL1 facilitates transdifferentiation by relieving repression over key hepatic genes following hiHep induction. This observation further suggests that H3K9 methylation may be required for maintaining lineage fidelity in differentiated cells. In support of this notion, hepatoblasts and adult hepatocytes lacking the three H3K9me3-specific methyltransferases (Suv39h1, Suv39h2 and Setdb1) fail to establish liver-specific transcriptional programs and aberrantly express genes from alternative lineages (Nicetto and Zaret, 2019).
H3K9 methylation is likewise implicated in tumorigenesis. For example, loss of Suv39h1/2 predisposes mice to B cell lymphomas (Peters et al., 2001), while suppression of H3K9 methylation through H3K9M expression leads to T cell lymphomas (Brumbaugh et al., 2019). Moreover, inhibiting the methyltransferase G9a, which mediates H3K9 mono- and di-methylation, facilitates lung cancer progression in adenocarcinoma, whereas suppressing H3K9 demethylases disrupts this effect (Rowbotham et al., 2018). Similar results have been reported in skin tumors where G9a suppression leads to increased chromatin accessibility and genomic instability (Avgustinova et al., 2018). Consistent with observations in the lung, prolonged treatment with methyltransferase inhibitors expands a subset of cancer progenitors in squamous cell carcinoma that are more aggressive than progenitors in untreated controls (Avgustinova et al., 2018; Rowbotham et al., 2018).
Together, these results suggest that H3K9 methylation has evolved as an important epigenetic modification to stably silence gene expression programs from earlier developmental stages as well as alternative lineages. Consequently, suppressing H3K9 methylation favors a more plastic cell state, which promotes aggressive, highly metastatic cancer cell populations in the context of tumorigenesis, lowers the barrier to cell fate change in the context of reprogramming and transdifferentiation, and facilitates the expression of genes from alternative lineages in endodermal progenitors.
Histone H3K27 methylation
Trimethylation of histone H3 on lysine-27 (H3K27me3) is another repressive chromatin mark that has been implicated in silencing embryonic and lineage-inappropriate genes during cell fate specification (reviewed by Aloia et al., 2013; Schuettengruber et al., 2017). Polycomb repressive complex 2 (PRC2, comprising subunits EZH1/2, SUZ12 and EED) deposits H3K27me3, although stable gene silencing also requires the activity of the PRC1 complex (PCGF1-6, HPH1-3, RING1A/B and CBX proteins), which ubiquitylates histone H2A. Importantly, H3K27me3 can be removed from methylated histones with the aid of histone demethylases such as UTX1 (KDM6A), allowing reversal of gene silencing. Loss of Polycomb proteins typically results in embryonic lethality or homeotic transformation (Van der Lugt et al., 1994), emphasizing the important role of the Polycomb complex in safeguarding cell identity (Schuettengruber et al., 2017). Thus, it was conceivable that dysregulation of H3K27me3 levels would also impact reprogramming. Indeed, disruption of either the PRC2 complex or the demethylase UTX1 abolishes iPSC generation from mouse embryonic fibroblasts (MEFs) (Fragola et al., 2013; Mansour et al., 2012; Zhang et al., 2011), while EZH2 overexpression facilitates reprogramming to pluripotency (Buganim et al., 2012). These observations imply that both silencing of the somatic program via PRC2-mediated H3K27me3 deposition as well as activation of pluripotency genes via UTX1-mediated H3K27me3 removal are essential for establishing pluripotency. In addition, different PRC1 complex components, such as BMI1 and RING1B (Onder et al., 2012), USP26 (Ning et al., 2017) and PCGF6 (Zdzieblo et al., 2014), were shown to resist reprogramming to iPSCs, consistent with a crucial role for both PRC1 and PRC2 in maintaining differentiated cell identity.
PRC1 and PRC2 components have also been associated with alternative cell fate transitions, tissue stem cells and cancer. For example, UTX1 is required for the reversion of EpiSCs to ESCs in vitro and for the maturation of primordial germ cells in vivo (Mansour et al., 2012), which involves global resetting of the epigenome akin to iPSC reprogramming. Furthermore, components of the non-canonical PRC1 complex, including PCGF6 and RYBP, scored in an siRNA screen for barriers to the 2C-like state in mouse ESCs, indicating that Polycomb repression prevents reversion from a pluripotent to a totipotent-like state (Rodriguez-Terrones et al., 2018). The canonical PRC1 component BMI1 was recently identified in an unbiased shRNA screen for barriers to transdifferentiation from fibroblasts to cardiac cells. Mechanistically, BMI1 was shown to maintain cardiac genes such as Gata4 and Isl1 in a repressed state in fibroblasts, thus safeguarding cell identity (Zhou et al., 2016). Last, ample evidence points to an involvement of the Polycomb complex during tissue stem cell maintenance (Avgustinova and Benitah, 2016), with its disruption often leading to cancer (reviewed by Chan and Morey, 2019; Schuettengruber et al., 2017). A classic example is, again, BMI1, which is required for maintenance of hematopoietic and neural stem cells and promotes tumorigenesis in these tissues, at least in part by suppressing the cell cycle inhibitor Cdkn2a (Lessard and Sauvageau, 2003; Molofsky et al., 2003; Park et al., 2003). The latter observation highlights another notable parallel to reprogramming, as CDKN2A also functions as a potent barrier to the generation of iPSCs from fibroblasts (Li et al., 2009; Utikal et al., 2009).
Similar to BMI1, EZH1 and EZH2 are crucial for the maintenance of intestinal, skin and hematopoietic stem cells, highlighting that Polycomb regulation plays a crucial role in all major regenerative tissues (Chan and Morey, 2019; Chiacchiera et al., 2016; Ezhkova et al., 2009; Hidalgo et al., 2012; Klauke et al., 2013; Lien et al., 2011; Mejetta et al., 2011; Mochizuki-Kashio et al., 2011; Oguro et al., 2010; Park et al., 2003; Van Den Boom et al., 2013; Xie et al., 2014).
Lysine-to-methionine mutations of histone H3 globally suppress methylation by sequestering or aberrantly targeting histone methyltransferases that recognize these sites. Two such mutants, H3K27M and H3K36M, are found in pediatric gliomas, chondroblastomas, and/or head and neck cancers, where they are thought to contribute to malignancy by rewiring the H3K27 methylome and blocking differentiation (Fang et al., 2016; Funato et al., 2014; Lewis et al., 2013; Lu et al., 2016; Papillon-Cavanagh et al., 2017). Intriguingly, induction of these ‘oncohistones’ in the context of normal tissue stem cells in mice leads to major differentiation blocks, defects in tissue homeostasis and leukemia (Brumbaugh et al., 2019), further underscoring the direct role of affected chromatin marks in safeguarding cell identity in normal and pathological settings.
The NuRD complex
The NuRD complex is a ubiquitously expressed repressor complex comprising the histone deacetylases HDAC1/2, the methyl-binding protein MBD3 and the ATP-dependent chromatin remodeler CHD3 or CHD4 (Basta and Rauchman, 2015). Strikingly, knockdown of Mbd3, which functions as an essential scaffolding component of the NuRD complex, allows close to 100% reprogramming efficiency when converting EpiSCs to ESCs, or fibroblasts to iPSCs (Luo et al., 2013; Rais et al., 2013). In addition, MBD3 loss facilitates the conversion of primordial germ cells into pluripotent stem cells (Rais et al., 2013). This conversion represents another reprogramming paradigm whereby unipotent germ cells that normally give rise to oocytes or sperm in vivo acquire pluripotency at low frequency upon culture (1-3%) (Matsui et al., 1992; Resnick et al., 1992). The role of MBD3 as a guardian of cell identity appears to be conserved, as MBD3 knockdown similarly enhances reprogramming in human cells (Rais et al., 2013). However, the effects of Mbd3 on reprogramming may be context dependent. For example, genetic deletion of Mbd3 strongly impairs the reprogramming of neural stem cells into iPSCs, and Mbd3 overexpression even enhances reprogramming efficiency when accompanied by Nanog overexpression (dos Santos et al., 2014). This finding highlights the point that very subtle molecular differences can profoundly influence cell fate changes.
Mechanistically, MBD3 directly interacts with reprogramming factors via its MBD domain. Through this interaction, reprogramming factors recruit MBD3 to target genes important for pluripotency, where it exerts a repressive effect on chromatin. Releasing the pluripotency factors from this repressive regulator permits transcriptional activation at the pluripotency loci and hence allows reprogramming efficiencies approaching 100% (Rais et al., 2013). From a conceptual standpoint, this work represents an important advance as it demonstrates that reprogramming is not an inherently lengthy and inefficient process; instead, there are key barriers to reprogramming that can be identified and surmounted in essentially every cell, providing a handle for the study of regulatory mechanisms that safeguard cell fate (Zviran et al., 2019). One caveat of Mbd3 depletion is that its knockdown severely impairs cell division. This effect may in part explain the difficulty that other laboratories have had in reproducing these results. To circumvent this issue, a recent study screened for NuRD components whose suppression positively affects reprogramming but does not impair cell proliferation (Mor et al., 2018). Indeed, knockdown or genetic deletion of Gatad2a, another subunit of NuRD, similarly enhances reprogramming without inducing cell cycle arrest. While the precise mechanism is currently unknown, these findings further implicate the NuRD complex as a safeguard for cell fate.
The NuRD complex has opposing roles in tumorigenesis and has been most extensively studied in the context of colorectal cancer (Lai and Wade, 2011). In line with its role as a tumor suppressor, depletion of Mbd3 in mice enhances colon progenitor cell proliferation and increases susceptibility to tumorigenesis (Aguilera et al., 2011). The mechanistic basis for these observations is particularly interesting in light of corresponding studies on reprogramming. Under physiological conditions, MBD3 physically interacts with unphosphorylated Jun, which recruits MBD3 to the target genes of Jun and suppresses their expression (Aguilera et al., 2011). Loss of Mbd3 relieves this negative regulatory module and increases histone acetylation and transcription of cancer-promoting genes activated by Jun (Aguilera et al., 2011). This mechanism is reminiscent of the model proposed by Rais et al., whereby cell fate is safeguarded under normal conditions through the recruitment of MBD3 to pluripotency-related genes through direct interaction with reprogramming factors (Rais et al., 2013). Despite its proposed role as a tumor suppressor, MBD3 is also implicated as a tumor promoter. Indeed, multiple studies have reported that the NuRD complex is recruited to repress tumor suppressor genes and ultimately to promote tumorigenesis in colorectal cancer (Cai et al., 2014; Magdinier and Wolffe, 2001). In this context, MBD3 reinforces transcriptional silencing of CDKN1A, which encodes p21WAF/CIP1 and represents a crucial downstream mediator of signaling through the p53 axis (Choi et al., 2013). Notably, p53 suppression dramatically enhances the reprogramming potential of fibroblasts by bypassing senescence and DNA damage responses (Hong et al., 2009; Kawamura et al., 2009; Li et al., 2009; Marión et al., 2009; Utikal et al., 2009). Together, these findings suggest that Mbd3 has a complex regulatory role in tumorigenesis, analogous to its function in reprogramming.
Histone variants
In some cases, differences as subtle as histone variants are sufficient to influence cell plasticity. A prime example is MacroH2A, which differs from canonical H2A isoforms by the presence of a large, non-histone domain at its C terminus (Gaspar-Maia et al., 2013). MacroH2A is associated with repressive chromatin and is prevalent at genomic loci that must be activated early in the reprogramming process, suggesting that these chromatin features represent a first line of defense against aberrant transcription of genes that promote cell fate changes (Gaspar-Maia et al., 2013). Consistent with this notion, SSEA1, an early surface marker of reprogramming intermediates, is activated in a larger proportion of MacroH2A-null cells during iPSC induction relative to controls. Correspondingly, a larger number of iPSC colonies are observed when reprogramming MacroH2A knockout fibroblasts (Gaspar-Maia et al., 2013), confirming that MacroH2A acts as a barrier to cell fate change.
Similar conclusions have arisen from studies focusing on MacroH2A function during SCNT. Specifically, MacroH2A depletion enhances transcriptional reprogramming and reactivation of the somatically silenced X chromosome in MEF nuclei upon transfer into Xenopus oocytes (Pasque et al., 2011). In cloned mouse embryos, MacroH2A levels sharply decrease following SCNT, which coincides with the initiation of developmental programs and is consistent with the observation that maternal MacroH2A is rapidly eliminated from zygotes (Chang et al., 2010). However, MacroH2A transcription is reactivated at the morula stage of preimplantation development, suggesting a stage-specific role for this chromatin regulator (Chang et al., 2010). Interestingly, MacroH2A-deficient pluripotent stem cells retain differentiation capacity but, following differentiation, revert more readily to the pluripotent state, suggesting that MacroH2A is fundamentally important for maintaining rather than specifying somatic cellular identity (Gaspar-Maia et al., 2013). This illustrates the nuanced role of MacroH2A in cell fate changes.
Like many safeguards of cell identity, altered MacroH2A levels are implicated in various malignancies. In particular, aggressive melanomas in both human and mouse exhibit a loss of MacroH2A isoforms and, correspondingly, decondensed chromatin (Kapoor et al., 2010). Direct evidence for the role of MacroH2A as a tumor suppressor has been provided through knockdown experiments, which demonstrate that reduced MacroH2A leads to increased proliferation and metastasis in both in vivo and in vitro melanoma models (Kapoor et al., 2010). Interestingly, CDK8, a component of the Mediator complex, appears to modulate these effects and its expression increases when relieved of MacroH2A-mediated repression (Kapoor et al., 2010). Lung cancers can likewise be stratified based on MacroH2A expression (Sporn et al., 2009). Here, the effect of MacroH2A has been linked to senescence, a feature that negatively correlates with malignancy. The association between MacroH2A and senescence may also explain why its suppression increases reprogramming, as high passage primary cells are usually refractory to induced pluripotency.
DNA methylation
The methylation of DNA at CpG sites is a well-studied epigenetic modification that is largely associated with stable transcriptional repression (Li et al., 2007; Rasmussen and Helin, 2016; Reik, 2007). During differentiation, DNA methylation is progressively deposited on stem cell-related genes, and ESCs deficient for the de novo DNA methyltransferases DNMT3A and DNMT3B fail to downregulate pluripotency factors and cannot differentiate (Chen et al., 2003; Jackson et al., 2004; Meissner et al., 2008; Smith and Meissner, 2013). Following lineage specification, DNA methylation accumulates at developmental genes that are silenced in a given differentiated cell type (Meissner et al., 2008). Together, these observations suggest that DNA methylation restricts plasticity by suppressing aberrant expression of fate-instructive genes (Meissner et al., 2008). Consistent with this notion, somatic cells with experimentally reduced global DNA methylation levels (due to DNMT1 suppression) are more susceptible to reprogramming in the context of iPSC generation and SCNT (Blelloch et al., 2006; Mikkelsen et al., 2008).
Further evidence that DNA methylation protects against cell fate change comes from studies on the Ten-eleven translocation (TET) family proteins: TET1, TET2 and TET3 (Costa et al., 2013; Di Stefano et al., 2014; Doege et al., 2012; Hu et al., 2014; Sardina et al., 2018; Tran et al., 2019b). These proteins facilitate the removal of DNA methylation by oxidizing the methyl group of 5-methylcytosine (Rasmussen and Helin, 2016; Tahiliani et al., 2009). Both Tet1 and Tet2 are upregulated during reprogramming and their suppression strongly impairs iPSC colony formation (Costa et al., 2013; Doege et al., 2012; Hu et al., 2014; Sardina et al., 2018). TET1 functions synergistically with NANOG and their physical association suggests that NANOG directs TET1 to key pluripotency-related genes to initiate demethylation. Indeed, combined expression of TET1 and NANOG results in decreased DNA methylation at Esrrb and Pou5f1, which likely primes these genes for transcriptional activation during reprogramming (Costa et al., 2013). A similar mechanism was described for TET2, which interacts sequentially with C/EBPα, KLF4 and TFCP2L1 to demethylate regions important for reprogramming preB cells into iPSCs (Sardina et al., 2018). In many cases, demethylation at these loci precedes chromatin opening, suggesting that TET enzymes may mediate the pioneer function of certain TFs during reprogramming (Sardina et al., 2018). In line with this, TET2 overexpression facilitates conversion of preB cells into iPSCs (Di Stefano et al., 2014). Following the induction of pluripotency factors, MEFs deficient for all three TET enzymes fail to undergo mesenchymal-to-epithelial transition (MET), one of the earliest steps in the reprogramming process (Hu et al., 2014). Correspondingly, TET triple knockout cells do not reactivate miRNAs that negatively regulate mesenchymal genes, including Snai1, Snai2 and Zeb1, and forced expression of these miRNAs rescues the triple knockout phenotype (Hu et al., 2014). 
Notably, genetic deletion of Tdg, the enzyme that recognizes and removes the oxidation products of 5-methylcytosine, similarly blocks reprogramming (Hu et al., 2014). This suggests that complete removal of methylation and its oxidized products is required to overcome barriers to reprogramming.
DNA methylation has also been implicated in alternative cell fate decisions. For example, a recent CRISPR-Cas9 deletion screen identified DNMT1 as a barrier to conversion between ESCs and the 2C-like state (Fu et al., 2019). Genes important for the 2C-like state are methylated in ESCs and, correspondingly, DNMT1 suppression increases their expression and the appearance of 2C-like cells (Fu et al., 2019). In another recent study, TET enzymes were shown to safeguard the conversion between mouse primed and naïve pluripotent states (Fidalgo et al., 2016). TET1 reportedly collaborates with the transcription factor ZFP281 to maintain the primed pluripotent state, while TET2 maintains the naïve pluripotent state. Mechanistically, ZFP281 and TET1 activate miR-302/367, which in turn inhibits TET2 expression in EpiSCs. Consistent with these findings, overexpression of TET2 but not of TET1 is sufficient to revert EpiSCs to ESCs (Fidalgo et al., 2016). Methylation is also a potent barrier to transdifferentiation between hematopoietic cell types (Kallin et al., 2012). During the transition of preB cells into macrophages, C/EBPα activates Tet2 (Kallin et al., 2012). Accordingly, the promoters of key macrophage-related genes are demethylated, suggesting that TET2 mediates transcriptional activation at these sites (Kallin et al., 2012). In support of this notion, Tet2 suppression impairs the upregulation of macrophage-related genes in preB cells undergoing transdifferentiation and this phenotype can be rescued by treating cells with compounds that inhibit DNA methylation or shRNAs targeting DNMT1 (Kallin et al., 2012). Collectively, these studies suggest a common mechanism whereby DNA methylation acts as a barrier against cell fate change. Removing DNA methylation through the activity of TET enzymes increases the plasticity of cells in these experimental systems, which raises the intriguing possibility that it may also play an important role in regeneration in vivo.
Aberrant DNA methylation has also been associated with tumorigenesis, as reviewed extensively elsewhere (Feinberg, 2004; Jones and Baylin, 2007; Klutstein et al., 2016; Kulis and Esteller, 2010; Pfeifer, 2018; Rasmussen and Helin, 2016).
Histone H3K79 methylation
H3K79 di-/tri-methylation was first implicated as a barrier to cell fate change based on a candidate shRNA screen that targeted chromatin factors during reprogramming in human cells (Onder et al., 2012). Hairpins or small molecules against DOT1L, the sole methyltransferase responsible for methylating H3K79, are among the most consistent and efficient enhancers of reprogramming in this system. Although suppression of DOT1L during reprogramming reduces levels of H3K79me2/3 genome-wide, methylation is particularly low at genes associated with the somatic cell type of origin (i.e. SNAI1, TWIST1 and ZEB1) (Onder et al., 2012). Expression of these genes decreases following DOT1L inhibition, in line with the finding that H3K79me2/3 is largely associated with active transcription (Farooq et al., 2016). These observations suggest that decreased H3K79 methylation facilitates reprogramming by deactivating fibroblast-specific transcription programs. In support of this notion, overexpression of these genes in the presence of DOT1L inhibitor lowers reprogramming efficiency to levels commensurate with controls (Onder et al., 2012). Conceptually, these experiments make the important point that safeguarding mechanisms include not only repressive chromatin modifications that ensure silencing of embryonic and alternative lineage programs, but also chromatin modifications that are important for sustained expression of cell type-specific programs.
DOT1L alterations have been identified in a number of malignancies. For example, CRISPR/Cas9-mediated deletion of DOT1L in an ovarian cancer cell line increases cell migration, Wnt signaling and ALDH activity (Wang et al., 2019); the authors hypothesized that these features indicate a stem cell-like phenotype. Indeed, Wnt signaling reportedly maintains stemness in cancer (Liu et al., 2018; Zhan et al., 2017) and ALDH is positively associated with epithelial-to-mesenchymal transition (EMT), which has previously been associated with stem cell-like features in cancer (De Francesco et al., 2018; Krebs et al., 2017; Liu and Fan, 2015; Yang et al., 2004). These data suggest that loss of H3K79 methylation drives tumorigenesis by pushing cancer cells toward a more primitive developmental state. In contrast, genetic or pharmacological inhibition of DOT1L in MLL-rearranged leukemias (i.e. leukemias with fusions between the H3K4 methyltransferase MLL and oncogenes such as AF9) effectively prevents or abrogates tumorigenesis, indicating a highly cell context-dependent role for DOT1L. Interestingly, the effect of DOT1L inhibition on leukemia regression is in part due to the induction of terminal differentiation of leukemic cells without having a major impact on normal hematopoiesis (Bernt et al., 2011). Thus, similar to CAF-1, DOT1L appears to safeguard cell identity and resist cell fate change in a context-specific manner. That is, DOT1L safeguards fibroblast identity by resisting reprogramming into iPSCs; DOT1L safeguards cancer cell identity by resisting transition to a more aggressive stem cell-like state; and DOT1L safeguards undifferentiated leukemic cell identity by resisting differentiation in MLL-rearranged leukemia.
Post-transcriptional safeguarding mechanisms
Post-transcriptional modifications represent a dynamic and rapid way to change gene expression, and are therefore well suited to play a role in regulating cell fate changes. A variety of proteins operate at both the post-transcriptional and post-translational levels to guide and ultimately stabilize cell identity. Here, we discuss how direct RNA modifications, RNA processing and protein modifications (beyond histone methylation) have recently been implicated in reprogramming and cell fate changes.
RNA modification m6A
N6-methyladenosine (m6A) is the most common internal RNA modification and is conserved among eukaryotes (Yue et al., 2015). The m6A mark fulfills numerous functions related to RNA biology and in many cases controls mRNA stability by inducing turnover (Fig. 4A) (Batista et al., 2014). The role of m6A during reprogramming is controversial. One study reported that overexpression of the m6A methyltransferase Mettl3 leads to increased m6A levels and modestly enhances reprogramming efficiency (Chen et al., 2015). However, overexpression of Zfp217, a negative regulator of m6A, also increases iPSC induction approximately threefold (Aguilo et al., 2015); the authors of this study proposed that Zfp217 acts by directly binding to and inhibiting the activity of Mettl3. Consistent with this notion, Zfp217 knockdown leads to increased m6A levels on the reprogramming factors Sox2, Myc and Klf4. Correspondingly, expression of these genes is reduced in MEFs expressing reprogramming factors, resulting in decreased reprogramming efficiency to iPSCs.
Fig. 4. RNA-based safeguarding of cell identity. (A) m6A (red) destabilizes mRNAs encoding key developmental factors, thus influencing cell fate change. (B) Alternative polyadenylation safeguards cell identity by regulating the expression of key transcripts during reprogramming. Proximal polyadenylation leads to a shorter 3′ UTR and eliminates cis-regulatory sequences to increase protein level expression of the corresponding gene. (C) Alternative splicing generates proteomic diversity that subsequently influences cell fate change.
The negative regulation of pluripotency-related RNAs by m6A is in line with observations in pluripotent stem cells and early embryogenesis (Batista et al., 2014; Geula et al., 2015). Specifically, m6A has been identified on many pluripotency factors, including Nanog and Rex1, and is purported to promote their turnover during differentiation (Batista et al., 2014; Geula et al., 2015). Mettl3 knockout or knockdown has no overt effect on maintenance of the naïve pluripotent state. However, when Mettl3-deficient cells are induced to differentiate via embryoid body formation, cells fail to extinguish the expression of pluripotency factors normally marked by m6A (Batista et al., 2014). Similarly, suppression of Mettl14, an additional subunit of the complex responsible for m6A deposition, impairs differentiation while its overexpression facilitates exit from pluripotency (Huang et al., 2019). This form of regulation appears to be conserved in early development because genetic deletion of Mettl3 is embryonic lethal after E8 even though pre-implantation embryos appear largely normal (Geula et al., 2015). Together, these findings suggest a paradigm whereby m6A deposition decreases RNA levels for stem cell-related genes, thus facilitating differentiation in vitro and in vivo. Evidence for this model extends beyond embryonic development to tissue homeostasis, as Mettl3 deletion also blocks naïve T-cell maturation and hematopoietic stem cell differentiation (Lee et al., 2019; Li et al., 2017c). Further studies will determine whether the differentiation of other adult stem cells similarly relies upon m6A-driven regulation.
The role of m6A is similarly complex in the context of cancer (reviewed by Lan et al., 2019; Sun et al., 2019). Focusing on glioblastoma, one study found that Mettl3 expression is increased in glioblastoma stem cells (GSCs) and plays a crucial role in maintaining these cells in an undifferentiated state (Visvanathan et al., 2018). Interestingly, this phenotype is reportedly mediated via the stabilization of SOX2 mRNA following m6A modification, which is in contrast to the role of m6A in stem cells where it facilitates turnover of self-renewal-associated transcripts. However, an independent group reported that m6A loss contributes to the initiation of certain gliomas by impairing differentiation (Cui et al., 2017). Specifically, knockdown of Mettl3 or Mettl14 increases proliferation and self-renewal of GSCs (Cui et al., 2017). Correspondingly, the expression of neural markers is decreased in knockdown cells, pointing to incomplete differentiation. The authors extended their observation to an in vivo model by transplanting GSCs into mice and showing that Mettl3 or Mettl14 knockdown increases tumor growth. Of note, inhibiting the action of an m6A demethylase, FTO (ALKBH9), abrogates this tumor-promoting effect, which may provide a therapeutic target in the future (Cui et al., 2017). Inhibition of another m6A demethylase, ALKBH5, similarly suppresses GSC self-renewal, reportedly due to altered m6A deposition and expression of its target, FOXM1 (Zhang et al., 2017). Collectively, these results show the nuanced roles of m6A in stem cell and cancer biology, which likely reflects the diverse targets and context-specific action of this modification.
Alternative polyadenylation
By screening for shRNAs that lower the barrier to reprogramming, the RNA-binding protein NUDT21 was recently identified as a safeguarding molecule that helps cells resist fate changes (Brumbaugh et al., 2018). Nudt21 knockdown in MEFs leads to a greater than 30-fold increase in reprogramming efficiency and reduces the latency of iPSC formation by half, from 7-8 days to 4-5 days under standard conditions. The effect on cell fate change is also relevant to transdifferentiation as suppressing Nudt21 enhances the direct lineage conversion of fibroblasts to induced trophoblast stem cells (Brumbaugh et al., 2018). Functionally, NUDT21 is a key determinant of polyadenylation site use and helps to direct the 3′-end processing complex to distal regions of the 3′ UTR through a phenomenon called alternative polyadenylation (APA) (Masamha et al., 2014; Zhu et al., 2018). APA effectively changes the placement of the polyA tail within a transcript and frequently changes the length of the 3′ UTR (Fig. 4B) (Mayr, 2017; Shi, 2012; Tian and Manley, 2017). This form of post-transcriptional regulation is widespread as ∼70% of mammalian mRNAs are subject to APA. In general, transcripts with shorter 3′ UTRs are more stable because negative cis-regulatory sequences (e.g. miRNA seed sequences, AU-rich destabilizing elements) are excluded (Fig. 4B) (Mayr and Bartel, 2009; Sandberg et al., 2008). In the context of reprogramming, Nudt21 knockdown leads to a global shortening of 3′ UTRs, which eliminates miRNA seed sequences from transcripts encoding crucial chromatin modifiers and ultimately increases corresponding protein levels (Brumbaugh et al., 2018). For example, the H3K4me3 reader WDR5 and the PRC1 component RYBP exhibit altered polyadenylation profiles and increased expression following Nudt21 knockdown. Notably, both of these chromatin modifiers were previously identified as positive effectors of reprogramming (Ang et al., 2011a; Li et al., 2017b). 
ATAC-seq analyses revealed that the chromatin landscape in Nudt21 knockdown cells more closely resembles that of iPSCs compared with control cells, suggesting that increased expression of these targets creates a chromatin state that is more permissive for reprogramming. Whereas suppression of Nudt21 leads to shorter 3′ UTRs, suppression of another component of the 3′ end processing complex, Fip1l1, globally increases 3′ UTR length and reduces the efficiency of iPSC formation (Lackford et al., 2014). Together, these results mechanistically link APA-, chromatin- and miRNA-based regulation as previously unrecognized safeguarding mechanisms to protect cell fate.
APA has important implications for other physiological and pathological cell fate changes. For example, Pax3, which encodes an important myogenic regulator, is alternatively polyadenylated in muscle stem cells (termed ‘satellite cells’) from different muscle tissues (Boutet et al., 2012). Expression of the short 3′ UTR isoform of Pax3 effectively eliminates a miR-206 binding site, which alters PAX3 protein levels and, consequently, satellite cell proliferation and differentiation in adult muscle. APA has also been explored in the context of cancer. Using a panel of cancer cell lines from 27 different tissues, it was found that transcripts with shorter 3′ UTRs are enriched in tumor cells relative to long 3′ UTR isoforms (Mayr and Bartel, 2009), with the shorter 3′ UTRs increasing protein-level expression of the corresponding genes. The functional relevance of this observation was demonstrated by comparing the effect of expressing the short or long isoform of the oncogene IMP1 in fibroblasts or breast epithelial cells during colony formation assays. Only the shortened IMP1 isoform was capable of transforming cells in this system, providing direct evidence for the role of APA in transformation (Mayr and Bartel, 2009). Work in glioblastoma has also emphasized the connection between shortened 3′ UTR isoforms and disease (Masamha et al., 2014). Loss of NUDT21 has been reported in other diseases, although the connection to cell plasticity is currently unclear. For example, decreased levels of NUDT21 contribute to pulmonary fibrosis through APA of factors related to extracellular matrix remodeling (Weng et al., 2019). In addition, NUDT21 was recently shown to control APA and gene expression of the CpG methyl-binding protein MECP2, which is mutated in individuals afflicted with the neurodevelopmental disorder Rett syndrome. 
Although potential changes to cell fate were not examined, this report established an additional link between APA and chromatin signaling, and identified NUDT21 as an important upstream regulator of a disease-relevant gene (Gennarino et al., 2015). Collectively, this work implicates APA in tumorigenesis and highlights multiple targetable levels of regulation.
Alternative splicing
Alternative splicing is a key regulatory mechanism that contributes to protein diversity (Fig. 4C) (Cieply et al., 2016; Gabut et al., 2011). Splicing patterns change dramatically during differentiation and reprogramming (Cieply et al., 2016; Ohta et al., 2013; Salomonis et al., 2010), suggesting that this form of RNA processing influences cell fate decisions. A notable example is the forkhead family transcription factor Foxp1. Alternative splicing of Foxp1 in both human and mouse ESCs generates a pluripotent stem cell-specific isoform (Foxp1-ES) with altered DNA-binding properties (Gabut et al., 2011). Consequently, Foxp1-ES activates expression of pluripotency-related genes, including Nanog, Pou5f1 and Nr5a2 (Gabut et al., 2011). Foxp1-ES overexpression impairs ESC differentiation, while its suppression strongly reduces reprogramming efficiency (Gabut et al., 2011), supporting the notion that alternative splicing of this factor regulates cell fate transitions. A similar observation was reported for Mbd2, which likewise comprises two cell-type specific isoforms that have reciprocal effects on pluripotency (Lu et al., 2014).
Manipulating the machinery that controls these splicing events also has strong effects on reprogramming. Knockdown of MBNL proteins, which repress ESC-specific alternative splicing, increases reprogramming efficiency twofold (Han et al., 2013). Conversely, knockdown of Srsf3, which regulates splicing of chromatin remodelers and several pluripotency-related genes, decreases reprogramming efficiency (Ratnadiwakara et al., 2018). Genome-wide analyses of alternative splicing during reprogramming have identified several other splicing factors that are important for generating iPSCs at different points during the reprogramming process, including ESRP1, which mediates mesenchymal-to-epithelial transition (Cieply et al., 2016). Together, these studies highlight the important role that alternative splicing plays in enhancing or inhibiting cell fate change.
Relatively little is known regarding the role of alternative splicing in cell fate transitions other than reprogramming. However, the splicing factor SRRM4 has recently been implicated in the pathological transition between prostate adenocarcinoma and neuroendocrine prostate cancer (Li et al., 2017a). Mechanistically, SRRM4 expression drives transdifferentiation of adenocarcinoma cells by altering the splice isoforms of key genes, including REST (Li et al., 2017a). The resulting cells express a neural-specific isoform of REST and neuroendocrine prostate cancer biomarkers (Li et al., 2017a). Understanding the role of alternative splicing in this cell fate change may lead to novel therapeutic approaches, as neuroendocrine prostate cancer is typically resistant to treatments for prostate adenocarcinoma, and blocking the aberrant transition to neuroendocrine-like cells may therefore render cells sensitive to current therapies.
Sumoylation
Lysine sumoylation is a reversible post-translational modification that controls protein activity across a wide range of cellular processes, including DNA damage repair, immune responses, cancer progression, transcription and chromatin organization (Deribe et al., 2010). Similar to ubiquitylation, sumoylation is mediated by three dedicated enzymes, termed E1, E2 and E3 ligases, which catalyze the covalent attachment of SUMO1 and SUMO2 peptides to target proteins, altering their activity, localization or stability. Importantly, sumoylation was previously shown to be essential for embryonic development, demonstrating its functional relevance in vivo (Wang et al., 2014a). Additionally, SUMO2 and the SUMO E2 ligase UBC9 (Ube2i) have been identified as top-scoring hits in independent loss-of-function screens for barriers to reprogramming, suggesting that sumoylation plays a role in regulating somatic cell identity (Borkent et al., 2016; Cheloufi et al., 2015).
A recent study evaluated the functional role of sumoylation in iPSC reprogramming (Cossec et al., 2018). The authors performed ChIP-seq for SUMO1 and SUMO2 (abbreviated as SUMO) in MEFs and ESCs to map the location of chromatin-bound sumoylated proteins. This revealed that, although SUMO is largely present at active enhancer elements in MEFs, it is enriched at heterochromatic regions, including ERVs, as well as a rare subset of pluripotency-associated super-enhancers in ESCs. These context-dependent binding patterns suggest that SUMO contributes to gene repression in pluripotent cells and gene activation in somatic cells. Consistent with this, fibroblast-associated genes are downregulated more effectively in Ube2i-depleted cells undergoing reprogramming compared with control cells. Thus, sumoylation primarily stabilizes somatic cell identity during reprogramming, impeding the acquisition of a pluripotent fate.
This study also revealed the surprising finding that sumoylation is preferentially localized at repressed ERVs in ESCs (Cossec et al., 2018). Given that ERVs are typically repressed in ESCs but expressed in the 2C-like state, the authors asked whether hyposumoylation might be sufficient to reactivate ERVs and induce a 2C-like state. Indeed, they observed that Ube2i suppression in ESCs leads to the transcriptional induction of multiple ERV families as well as transcriptional regulators that are important for acquisition of a 2C-like state, such as Dux and Zscan4b/c/d/f. These results extend the importance of sumoylation as a safeguarding mechanism whereby it resists acquisition of a more primitive state. However, the mechanism by which sumoylation safeguards pluripotent cell identity (i.e. by silencing the 2C-like state) is very different from that in fibroblasts where it activates the somatic program.
Sumoylation also appears to be relevant for transdifferentiation and differentiation (Cossec et al., 2018). Specifically, Ube2i suppression or pharmacological inhibition of the SUMO pathway reportedly promotes the conversion of MEFs into neurons and B cells into macrophages. In addition, perturbing SUMO enhances the commitment of hematopoietic progenitors to mature cells (Cossec et al., 2018). Collectively, these functional experiments demonstrate that sumoylation acts as a general regulator of cell fate across diverse developmental contexts. Key questions that remain to be addressed are which proteins are sumoylated and how this post-translational modification impacts their function. Novel proteomics-based approaches (Becker et al., 2013; Wang et al., 2016) should help to answer these questions in the future.
The role of 3D chromatin structure in safeguarding cell identity
A combination of microscopy-based methods and chromosome-conformation capture-based approaches has recently enabled researchers to delineate the three-dimensional (3D) organization of the genome (Bonev and Cavalli, 2016; Denker and de Laat, 2016). This research has led to the view that chromosomes are spatially segregated into two compartments within the nucleus: the A compartment, which corresponds to active chromatin, and the B compartment, which represents heterochromatin and is enriched at the periphery of the nucleus. These subnuclear compartments can be further separated into topologically associated domains (TADs) (Dixon et al., 2012; Nora et al., 2012) and chromatin loops that are, at least in part, cell type-specific (Phillips-Cremins et al., 2013). Partitioning chromatin permits the regulation of many biological processes in a cell type-specific manner and may thus represent a previously underappreciated yet important ‘upstream’ mechanism to safeguard cell identity (Dixon et al., 2015; Ghavi-Helm et al., 2014; Gorkin et al., 2014; Gröschel et al., 2014; Jhunjhunwala et al., 2009; Lupiáñez et al., 2015). Given the importance of these various biological processes in how cells make fate decisions, it is not surprising that the 3D structure of the genome has been implicated in reprogramming. Indeed, Apostolou et al. and Denholtz et al. first reported the relationship between 3D genome conformation and transcriptional activity around select pluripotency loci in reprogramming intermediates (Apostolou et al., 2013; Denholtz et al., 2013). Interestingly, these studies found that the establishment of new chromatin loops frequently coincides with or precedes transcriptional activity, suggesting that conformational changes may have an instructive role for gene activation. However, these studies were limited to a few loci and it remained unclear whether this phenomenon reflected a general mechanism that occurs on a genome-wide scale during reprogramming.
Extending these findings, genome-wide maps of the 3D genome structure of different somatic cells (NPCs, B-cells, macrophages and embryonic fibroblasts) and multiple iPSC lines were recently catalogued (Beagan et al., 2016; Krijger et al., 2016). These studies demonstrate that the overall 3D genome of somatic cells drastically differs from that of their reprogrammed derivatives, supporting the notion that chromatin structure is tightly linked to cell identity. Moreover, many somatic-specific chromatin interactions are eliminated during iPSC formation and new pluripotency-specific chromatin loops are formed, concomitant with activation of the pluripotency transcriptional network. Intriguingly, one study also revealed that, despite exhibiting substantial structural reorganizations that resemble those seen in ESCs, early passage iPSCs retain founder cell-specific structural features during reprogramming that are lost upon continuous passaging (Krijger et al., 2016). These data provided the first evidence for the existence of a topological memory in induced pluripotency. This may further explain the previous observation that early passage iPSCs retain a transcriptional and epigenetic memory of their cell type of origin, which can affect their differentiation potential (Kim et al., 2010; Kim et al., 2011; Polo et al., 2010).
More recently, the dynamic relationship between chromatin structure and transcription has been characterized on a genome-wide scale using a highly efficient and synchronous reprogramming system that converts B cells into iPSCs (Di Stefano et al., 2014; Stadhouders et al., 2018). Using assays for chromatin topology (Hi-C), enhancer activity (ATAC-seq and ChIP-seq) and transcriptional activity (RNA-seq), the authors discovered that the dynamics of genome topology, chromatin opening and transcription are tightly linked during reprogramming. These findings led to a model in which TFs bind to gene regulatory regions, causing a change in chromatin topology that ultimately results in robust gene activation. In most cases, chromatin reorganization occurs concomitantly or before transcriptional activation, corroborating the idea that chromatin structure facilitates transcriptional changes during cell fate changes. This study also noted that transcriptional activation of the pluripotency-related genes Nanog and Sox2, which occurs late during B cell reprogramming, is preceded by a change in compartmentalization and TAD structures.
These observations raise the intriguing possibility that topological structures in somatic cells act as barriers that resist cell identity switches such as reprogramming, differentiation or transformation. In support of this hypothesis, it was recently shown that deletions, amplifications and inversions of TAD boundaries alter the expression of fate-instructive molecules through an enhancer-adoption mechanism (Andrey and Mundlos, 2017; Norton and Phillips-Cremins, 2017). Similar mechanisms have been associated with loss of cell identity in malignant transformation. For example, leukemic T-cell genomes often contain deletions that erase the boundary sites of TADs encompassing specific oncogenes such as TAL1 and LMO2. Importantly, deletion of these boundaries in non-malignant cells is sufficient to activate cancer-driving genes (Hnisz et al., 2016). Similarly, duplication of TAD boundaries can exert a pathogenic effect by inducing the formation of novel chromatin domains called ‘neo-TADs’. In this scenario, a gene in the amplified region is placed under the control of the regulatory regions of the neo-TAD and acquires a new expression pattern (Weischenfeldt et al., 2017).
Given the key role that the TF CTCF plays in the formation and maintenance of TAD boundaries, it is perhaps unsurprising that mutations in its binding sequence have also been shown to correlate with oncogene expression (Flavahan et al., 2016). Indeed, loss of methylation-sensitive CTCF-binding motifs in glioma cells disrupts insulator elements and aberrantly activates the canonical glioma oncogene PDGFRA, which ultimately contributes to tumorigenesis. Collectively, these studies reinforce the idea that 3D genome structure plays a key role in the maintenance of cell identity in both physiological and pathological contexts.
Transposable elements as safeguarding factors
The mammalian genome contains a strikingly large proportion (∼50%) of sequences derived from transposable elements (TEs) (Percharde et al., 2018). Many of these elements are no longer capable of transposition and were once classified as junk DNA. However, accumulating evidence suggests that TEs harbor important regulatory features that influence gene expression, often in a cell type-specific manner (Fig. 3) (Chuong et al., 2017). Importantly, several groups have reported that TEs modulate cell potency during early embryonic development (Garcia-Perez et al., 2016; Percharde et al., 2018; Wang et al., 2014b). For example, murine zygotes injected with antisense oligonucleotides targeting long interspersed element 1 (LINE1) arrest prior to morulation and fail to downregulate MERVL and Dux (Percharde et al., 2018). While MERVL is a transposon specifically expressed at the 2C stage, Dux is considered an upstream master regulator of the 2C-specific transcriptional program, and is required and sufficient for the induction of a 2C-like state in ESCs (Hendrickson et al., 2017). However, its deletion in mice was recently shown to be compatible with development and survival to adulthood (Chen and Zhang, 2019). This surprising observation highlights the limitations of a valuable in vitro system, such as the 2C model, compared with the complexity of developmental processes in vivo. Regardless, these reports suggest that LINE1 may function as one of the earliest safeguards to limit cell plasticity. Mechanistically, LINE1 expression does not restrict cell fate through its protein products, insertional mutagenesis or by influencing transcription of nearby genes (Percharde et al., 2018). Rather, LINE1 mRNA seems to act as a nuclear scaffold, in a manner analogous to lncRNAs, to interact directly with NUCLEOLIN and KAP1. Through these interactions, LINE1 mRNA recruits repressors to specific genomic loci, including Dux and ribosomal genes, thereby preventing expression of the 2C program. In this way, LINE1 activation during early development may help to drive differentiation from a totipotent to a pluripotent state. It will be interesting to determine whether LINE1 plays a similar physiological role in germ cells and neural cells, where it is transiently expressed (Coufal et al., 2009; Ergün et al., 2004).
As with many other barriers to cell fate change, LINE1 activation has been tied to cancer (Helman et al., 2014). For example, LINE1 is highly expressed in a variety of malignancies, particularly colorectal and other epithelial cancers, where it may contribute to tumorigenesis (Rodić et al., 2014). Indeed, TEs were previously shown to directly activate oncogenes and drive cancer in rare cases (Howard et al., 2008; Miki et al., 1992; Morse et al., 1988). Considering that epigenetic mechanisms silence TEs in normal cells, aberrant expression from TEs and nearby oncogenes is particularly prevalent in cancer cells where chromatin states are altered (Babaian and Mager, 2016; Lamprecht et al., 2010). Teasing apart whether TE activation is a cause or a consequence of tumorigenesis in these cases will be important for future studies. One intriguing possibility is that aberrant expression of LINE1 mRNA may collaborate with other tumorigenic alterations to initiate cancer through its role as a nuclear scaffold, following the model proposed by Percharde et al. (2018). This mechanism may have been overlooked in previous studies and could represent a tractable target for therapeutics. Recent advances in perturbing TE expression across the genome should help to address this issue (Fuentes et al., 2018; Pontis et al., 2019).
Conclusions and perspectives
The complex and dynamic nature of gene expression that drives cell specification and functionality requires an equally complex regulatory system. The integrity of this regulatory system is paramount for maintaining cell fate, and recent work has shown that altering key regulators facilitates induced cell fate changes. These regulators are typically widely expressed across tissues and impact diverse aspects of gene regulation, including histone methylation and nucleosome turnover, DNA and RNA methylation, alternative polyadenylation, alternative splicing and three-dimensional chromatin structure. While the majority of previously examined safeguarding mechanisms function in gene repression (i.e. DNA/RNA methylation, H3K9 methylation, Polycomb silencing, CAF-1-dependent heterochromatin maintenance, NuRD-dependent histone deacetylation and H2A.Z-dependent chromatin compaction), recent evidence shows that processes associated with active enhancers and active transcription (i.e. elongation-associated H3K79me3 deposition via DOT1L, histone turnover via FACT and the marking of active enhancers by SUMO) may be equally potent mechanisms to safeguard cell identity. These results imply that efficient safeguarding mechanisms may require both the active maintenance of cell type-specific gene expression patterns and the stable silencing of alternative lineage programs. Conversely, cellular plasticity may be best achieved by simultaneously suppressing the pathways that maintain cell type-specific gene expression and those that silence alternative lineage programs. It is therefore likely that molecules regulating these two processes will have an impact on cell fate change. Not coincidentally, many of the same regulatory mechanisms discussed in this Review go awry during tumorigenesis, setting up an interesting and important intersection between cancer and cell fate change. Expanding our understanding of experimental reprogramming systems can thus expose mechanistic details that are important for cell fate and may provide fundamental information and therapeutic targets in the context of cancer.
While this Review has focused mostly on experimental reprogramming paradigms and their relevance to cancer, recent evidence points to the importance of physiological examples of transdifferentiation and dedifferentiation in the context of regeneration (Fig. 2). For example, lineage-committed secretory, absorptive and Paneth cell progenitors of the intestinal epithelium have been shown to reacquire stem cell identity and replenish all differentiated cell types upon tissue damage (Buczacki et al., 2013; Tetteh et al., 2016; van Es et al., 2012). Similarly, secretory cells of the airway epithelium have the capacity to dedifferentiate into stable and functional stem cells following genetic ablation of stem cells that maintain normal tissue turnover (Tata et al., 2013). In certain injury contexts, such as parasitic infection of the colonic epithelium, epithelial cells can even dedifferentiate towards a fetal-like state as part of the regeneration process (Nusse et al., 2018). Finally, wound repair in the skin was recently shown to induce cellular plasticity and lineage infidelity in two separate stem cell populations, hair follicle stem cells and epidermal stem cells, and this process appears to be hijacked in cancer cells (Ge et al., 2017). Although the mechanisms underlying these transdifferentiation/dedifferentiation events remain largely unknown, recovery of developmentally primitive gene expression programs in adult tissues may in part be explained by the recent observation that enhancer hypomethylation associated with embryonic lineage specification persists into adulthood (Jadhav et al., 2019). Given that most of the safeguarding regulators we have discussed in this Review are widely expressed across tissues, it is conceivable that they may also play a role in coordinating the acquisition of a more plastic state whenever required. If this turns out to be the case, modulation of these pathways could be exploited in the future to manipulate cell fate in a clinical setting, e.g. by restraining unwanted plasticity in undifferentiated, proliferative cancer cells or enhancing plasticity in tissues where there is limited or no regenerative capacity.
Acknowledgements
We thank members of the Hochedlinger lab for critical review of this manuscript.
Footnotes
Funding
The authors' research received support from the National Institutes of Health (P01-GM099134 and R01-HD058013). Deposited in PMC for release after 12 months.
Competing interests
The authors declare no competing or financial interests.