Tissue-specific transcription factors primarily act to define the phenotype of the cell. The power of a single transcription factor to alter cell fate is often minimal, as seen in gain-of-function analyses, but synergistic cooperation among multiple transcription factors potentiates their ability to induce changes in cell fate. By contrast, the function of an individual transcription factor is often dispensable for the maintenance of cell phenotype, as is evident in loss-of-function assays. Why does this phenomenon, commonly known as redundancy, occur? Here, I discuss the role that transcription factor networks play in collaboratively regulating stem cell fate and differentiation, and provide multiple explanations for their functional redundancy.
During mouse development, a single, totipotent cell divides repeatedly to give rise to a few billion cells, which differentiate into a few hundred different cell types. Differentiation is the process by which a cell changes phenotype and becomes increasingly specialized. Cell phenotypes are defined by particular combinations of genes expressed in a cell type-dependent manner (Armit et al., 2017). The selection of these combinations is mainly driven by cell type-specific transcription factors (TFs), which in turn are regulated by other TFs that integrate and respond to extracellular signals in order to maintain cell phenotype (Davidson, 1993). Thus, TFs form a network in which each TF is reciprocally regulated to maintain its balanced expression.
A TF network often forms part of a gene regulatory network (Box 1, Glossary). Gene regulatory networks are divided into functional subcircuits (Davidson, 2010) and consist of multiple layers of regulatory mechanisms at the epigenetic, topological and transcriptional level. The epigenetic regulation of chromatin accessibility is thought to be important for maintaining the irreversibility of a cell's differentiated status under normal physiological conditions (Perino and Veenstra, 2016). The topological regulation of chromatin is also believed to control global gene expression patterns in organisms (Acemel et al., 2017); however, the degree to which these regulatory mechanisms actively determine specific cell types is unclear. By contrast, TF networks have been shown to play a pivotal role in defining cell types, which is reflected in the ability of certain TFs to instruct changes in cell phenotype when ectopically expressed in various contexts (Niwa, 2007; Morris, 2016). The discovery of MyoD (Myod1), for example, was a key finding in this field as it demonstrated the power of a single TF to define cell phenotype (Davis et al., 1987). This example might be somewhat of a rarity, however, since in general the potential of a single cell-type-specific TF to instruct fate is limited, and it often only regulates differentiation in a particular context. More commonly, specific combinations of TFs are required to instruct cell fate, most famously during the reprogramming of differentiated somatic cells to pluripotent stem cells, which requires a combination of four TFs (Takahashi and Yamanaka, 2006). Other combinations of TFs have also been used to induce direct lineage reprogramming, whereby a cell transitions from one cell type to another without returning to a pluripotent state (Morris, 2016).
2i culture. This feeder- and serum-free culture condition consists of N2B27 basal medium, MEK inhibitor and GSK3 inhibitor. Inhibition of GSK3 mimics activation of the Wnt signal.
Definitive endoderm. A cell lineage of the post-implantation embryo after gastrulation. Definitive endoderm gives rise to embryonic endoderm cell types.
Embryonic stem cells (ESCs). Pluripotent stem cells (PSCs) derived from blastocyst stage embryos. Mouse ESCs retain the character of PSCs in the epiblast at late blastocyst stage, and continue to self-renew in conventional serum-containing culture with LIF or serum-free 2i culture.
Epiblast stem cells (EpiSCs). PSCs derived from post-implantation embryos. Mouse EpiSCs retain the character of PSCs in the epiblast at late egg cylinder stage, and self-renew in the presence of activin A and Fgf2.
Gene regulatory network. The system controlling the transcriptional activities of all genes in the genome. It consists of multiple layers of molecular mechanisms of genetic and epigenetic regulation.
Jak-Stat3 pathway. Jak is a tyrosine kinase associated with the cytoplasmic domain of the LIF receptor. Jak phosphorylates Stat3, and the phosphorylated form of Stat3 translocates into the nucleus where it acts as a transcription factor.
LIF (leukemia inhibitory factor). LIF is an IL6 family cytokine. The LIF signal is received into the cytoplasm via the transmembrane LIF receptor, which consists of Il6st and Lifr, and is transduced by the Jak-Stat3, PI3K-Akt and Erk-MAPK pathways.
Mediator complex. A complex of 26 subunits that functions as a transcriptional co-activator. It interacts with tissue-specific TFs on enhancers, and with general TFs and RNA polymerase II on promoter regions. These interactions mediate transcriptional activation by TFs.
Mek-Erk pathway. Mek (Erk kinase) and Erk compose the canonical MAPK pathway. LIF signal activates the Mek-Erk pathway via LIF receptor-Grb2-Sos-Ras. The Fgf signal also activates the Mek-Erk pathway via Fgf receptor.
Mesoendoderm. A cell lineage of the post-implantation embryo after gastrulation. Mesoendoderm gives rise to both mesoderm and definitive endoderm.
Neuroectoderm. A cell lineage of the post-implantation embryo after gastrulation. Neuroectoderm gives rise to neuronal cell types.
PI3K-Akt pathway. PI3K is coupled with the cytoplasmic domain of various receptors and produces phosphatidylinositol (3,4,5)-trisphosphate upon stimulation. These lipids then recruit PDK1 and Akt, resulting in activation of the kinase activity of Akt.
Primitive endoderm. The cell lineage located at the surface of the inner cell mass of a late blastocyst stage embryo. Primitive endoderm participates in the formation of yolk sac in post-implantation development. Primitive endoderm is also known as extra-embryonic endoderm.
Super-enhancer. An enhancer is defined as a DNA element that regulates transcription in cis within a certain distance from the promoter element. A super-enhancer is defined as a cluster of enhancers within a certain range of genomic DNA that allows the recruitment of the Mediator complex with higher affinity than a conventional enhancer that lacks such clustering.
Transcriptional bursting. Discontinuous events of transcription. Gene expression occurs as a sum of episodic transcription, and the frequency of transcriptional bursting determines the expression level. The frequency of bursting is regulated by the dynamic recruitment of transcription factors at enhancers.
Trophectoderm. The cell lineage located at the outer layer of a blastocyst stage embryo. Trophectoderm gives rise to a large part of the placenta in post-implantation development.
Trophoblast stem cells (TSCs). The stem cell line derived from trophoblasts. Trophoblasts are a proliferative population of trophectoderm and retain the ability to differentiate into multiple cell types of the placental lineage. TSCs retain the characteristics of trophoblasts in vitro in the presence of Fgf4.
These findings raised a key question: why are multiple TFs required to artificially change a cellular phenotype? To answer this question, we need to know how TFs function in a cell to define a phenotype. During differentiation, multiple TFs are known to cooperate with each other to activate transcription of their target genes (Whyte et al., 2013). To stably maintain a certain cell type, multiple TFs form a network that maintains their own expression, as well as that of cell type-specific genes as a downstream subcircuit (Davidson, 2010). While these broad principles have been established for a number of years, specific issues, such as what determines the exact number of TFs required to form a cell type-specific TF network, and how TF networks are sequentially replaced during differentiation, remain unanswered. A simple model system with synchronous differentiation would provide an ideal platform to address these issues. The in vitro differentiation system of mouse embryonic stem cells (mESCs) (Box 1, Glossary; Box 2) provides one such model (Niwa, 2010). In this Review, I focus on studies that analyze the role of TFs in regulating mESC self-renewal and differentiation, and summarize the mechanisms involved in the functioning and transitioning of TF networks.
The pluripotent states at early and late developmental stages are distinct, and are designated as the naïve and primed pluripotent states (Nichols and Smith, 2009). mESCs are in the naïve pluripotent state that mimics the character of late stage epiblast of blastocyst stage embryos (Boroviak et al., 2015), whereas mouse EpiSCs are in the primed pluripotent state that mimics the character of the late post-implantation stage epiblast (Brons et al., 2007; Tesar et al., 2007). In the developmental context, the late stage epiblast of blastocyst stage embryos gives rise to the late post-implantation stage epiblast, suggesting that the primed pluripotent state could represent a direct transition from mESCs. Indeed, culturing mESCs in the culture conditions for EpiSCs (i.e. containing activin A and Fgf2) allows their gradual transition over several passages (Guo et al., 2009). However, to date, there is no way to direct a homogeneous transition of mESCs to the primed state within a few days in culture, as is observed in the developmental context. This suggests that the transition from the naïve to the primed state might not be a direct process. Recently, an intermediate state between naïve and primed was proposed. This state, designated formative pluripotency, is defined by the downregulation of the naïve-specific TFs without the activation of the lineage-primed TFs that are activated in the primed state (Smith, 2017). Although PSCs in the formative state have not been captured stably, epiblast-like cells obtained transiently by the culture of mESCs could be close to it (Hayashi et al., 2011). The dynamic changes in TF binding between mESCs and EpiSCs may support the existence of such an intermediate state as defined by a stable TF network (see figure; dashed lines indicate events that can be induced under artificial conditions) (Matsuda et al., 2017). 
The transition to the embryonic cell lineages – definitive endoderm, ectoderm and mesoderm – could occur directly from the primed state TF network, but never from the naïve state TF network.
Modes of interaction between TFs and their target sites
To elucidate the structure of the TF network, we first need to know how each TF regulates its target genes. TFs are categorized into two classes: general TFs and tissue-specific TFs (Levine et al., 2014). General TFs bind to a promoter element to recruit RNA polymerase II and initiate transcription. Tissue-specific TFs bind to either a proximal element of a promoter or a distal regulatory element, designated as an enhancer, to promote the recruitment of general TFs to a promoter. Tissue-specific TFs quantitatively regulate the frequency of transcriptional bursting (Box 1, Glossary) by influencing the binding affinity of TFs to each other and of trans-acting co-factors to their target sites, as well as the stability of transcriptional complexes (Bartman et al., 2016; Fukaya et al., 2016). Here, I focus on the mechanisms that mediate the synergistic action of multiple tissue-specific TFs.
Direct interactions among TFs
Each TF recognizes a specific DNA sequence of four to eight bases as its binding site (Fig. 1A). However, this sequence specificity is often not enough to identify the functional target sites of a TF in the genome. Even if a TF recognizes 8 bases with perfect specificity, for example, the number of potential binding sites could be as high as ∼45,000 in the mouse genome. Many TFs form homodimers and/or heterodimers to recognize their target sequence with even higher specificity/affinity (Kamachi and Kondoh, 2013). Among the canonical TFs in the cocktail used to reprogram somatic cells to induced pluripotent stem cells (iPSCs), it is well known that Oct3/4 (Pou5f1) and Sox2 form a heterodimer to recognize the pluripotent stem cell-specific target sites in the genome (Rizzino and Wuebben, 2016). Oct3/4 and Sox2 recognize 8 and 7 bases, respectively, so the heterodimer recognizes 15 bases. Theoretically, only 2.7 sites in the mouse genome contain this 15 base sequence; thus, a longer binding site ensures higher binding site selectivity (Fig. 1B). However, the number of binding sites of the Oct3/4-Sox2 heterodimer experimentally determined by ChIP-seq is ∼10,000 in mESCs, owing to diversity in the recognition of the target sequence (Hnisz et al., 2013). The interaction between Oct3/4 and Sox2 was first discovered on the enhancer element of Fgf4 (Yuan et al., 1995). In the years following that discovery, several mESC-specific enhancers were identified as being targets of the Oct3/4-Sox2 heterodimer, including the enhancers of Oct3/4 and Sox2 themselves (Tomioka et al., 2002; Okumura-Nakanishi et al., 2005). ChIP-seq analysis revealed that the major binding sites of Oct3/4 and Sox2 contain consensus binding sequences for Oct3/4 and Sox2 in tandem, separated by no or only a few bases (OCT-SOX motif), irrespective of the functional significance (Chen et al., 2008).
Crystallographic analysis confirmed that Oct3/4 and Sox2 directly interact with each other via the outer surfaces of their DNA-binding domains on the target DNA (Remenyi et al., 2003), suggesting that the heterodimer would be more stable and would bind with higher affinity to the target site than either TF bound to the target site alone.
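The site-count estimates quoted above follow from a simple expectation: a motif of k fully specified bases occurs about G / 4^k times by chance in a sequence of length G. The sketch below reproduces that arithmetic. The genome length (2.9 × 10^9 bp), the assumption of equal base frequencies on a single strand, and the function name are illustrative assumptions that are not stated in the text; the genome length was chosen only because it approximately recovers the quoted values.

```python
# Back-of-the-envelope estimate of how often a fully specified motif
# occurs by chance. Assumptions (not from the text): uniform base
# composition, one strand only, and a genome length of 2.9e9 bp.
MOUSE_GENOME_BP = 2.9e9  # assumed value, chosen for illustration


def expected_sites(motif_length, genome_bp=MOUSE_GENOME_BP):
    """Expected number of chance matches to a motif of the given length."""
    return genome_bp / 4 ** motif_length


if __name__ == "__main__":
    for label, k in [("single TF, 8 bases", 8), ("Oct3/4-Sox2, 15 bases", 15)]:
        print(f"{label}: ~{expected_sites(k):,.1f} expected sites")
```

Under these assumptions an 8-base motif yields roughly 44,000 chance matches and a 15-base motif fewer than three, which is the same order of magnitude as the figures given in the text. The ∼10,000 sites observed by ChIP-seq lie between the two estimates, consistent with tolerated variation in the recognized sequence.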
The Oct3/4-Sox2 heterodimer alters its composition during differentiation, taking on new binding partners specific to certain lineages. During the differentiation of mESCs to the definitive endoderm lineage (Box 1, Glossary), Oct3/4 changes its partner from Sox2 to Sox17 (Aksoy et al., 2013). This replacement is accompanied by a dynamic change in its target sites. During the differentiation of mESCs into neuroectoderm or trophoblast stem cells (TSCs; Box 1, Glossary), Sox2 changes its partner from Oct3/4 to Otx2 or Tfap2c, respectively (Fig. 2) (Iwafuchi-Doi et al., 2012; Adachi et al., 2013). Such changes in TF function resemble the sequential switching of TF heterodimer partners that occurs during blood cell differentiation (Sieweke and Graf, 1998). Direct interactions between TFs are common among pluripotency-associated TFs as well. For instance, it was reported that Nanog forms a homodimer (Mullin et al., 2008; Wang et al., 2008) as well as a heterodimer with Sox2 (Gagliardi et al., 2013; Mullin et al., 2017) to execute its function in mESCs.
TF interactions with super-enhancers
A super-enhancer (Box 1, Glossary) is a region in the genome where several enhancers cluster together. It is bound by different TFs, which then drive the transcription of the key genes that define cell type (Whyte et al., 2013). In total, 8794 enhancers have been identified in mESCs as being binding sites for Oct3/4, Sox2 and Nanog, which are often designated as core TFs of the mESC-specific network because of their pivotal functions (Hnisz et al., 2013). Among them, 231 are super-enhancers that consist of binding sites for Oct3/4, Sox2, Klf4, Nanog, Esrrb, Nr5a2 (Lrh1), Prdm14, Tfcp2l1 (Crtr1), Smad3, Stat3 and Tcf7l1 (Tcf3) (Hnisz et al., 2013). These binding sites sit within a ∼100 kb region and regulate target genes in mESCs. The clusters of these TFs recruit the Mediator complex (Box 1, Glossary), which functions as a transcriptional co-activator (Fig. 1C), with higher affinity than conventional enhancers that lack these clusters (Hnisz et al., 2013). Super-enhancers also recruit other co-factors that are involved in transcriptional activation, for example RNA polymerase II, cohesin, Nipbl, p300, CBP, Chd7, Brd4, esBAF and Lsd1-NuRD complexes (Hnisz et al., 2013). It is speculated that these factors form a large complex that facilitates the looping of the enhancer region such that it contacts the promoter element and activates transcription (Levine et al., 2014). The formation of such TF clusters could also give rise to higher sequence specificity for specific target sites and higher affinity for the Mediator complex, resulting in more specific and frequent transcriptional bursting (Fukaya et al., 2016). Super-enhancers are highly cell type specific, so their activation is tightly linked to the determination of cell type via specific sets of TFs (Saint-André et al., 2016).
Interconnected regulatory loops in the TF network
Genes sometimes contain binding sites for the very same TFs that they encode. Such regulatory loops might help to ensure that the entire set of TFs that cooperate to activate certain targets are co-expressed. For example, Oct3/4 and Sox2 possess enhancer elements that are activated by the Oct3/4-Sox2 heterodimer (Okumura-Nakanishi et al., 2005; Tomioka et al., 2002). Hnisz et al. (2013) found that 8 of the 11 TFs recruited to mESC-specific super-enhancers are themselves regulated by super-enhancers, indicating that they form an interconnected autoregulatory loop (Table 1). Thus, these TFs form a network in which TFs are regulated by other TFs and/or by themselves, and are stabilized by the balanced maintenance of their expression (Fig. 3). The members of the TF network are reciprocally interconnected through the transcriptional regulation of each other, resulting in either activation or repression. A TF network should itself be cell type specific in order to direct gene expression patterns in a cell type-dependent manner. The stability of the TF network is also likely to be controlled by external signals that mediate cell type change as development progresses.
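The idea of an interconnected autoregulatory loop can be made concrete by treating the network as a directed graph and asking which genes can be reached from themselves along regulatory edges. The following is a minimal sketch that encodes only the Oct3/4-Sox2 relationships mentioned in the text (activation of the Oct3/4, Sox2 and Fgf4 enhancers by the heterodimer). The dictionary layout and the function name are illustrative choices, and treating the heterodimer as two independent activators is a simplification.

```python
# Toy regulatory graph: keys are regulators, values are the genes they
# activate. Only edges described in the text are encoded; splitting the
# Oct3/4-Sox2 heterodimer into two separate regulators is a simplification.
REGULATES = {
    "Oct3/4": {"Oct3/4", "Sox2", "Fgf4"},
    "Sox2": {"Oct3/4", "Sox2", "Fgf4"},
    "Fgf4": set(),  # a target gene, not a regulator in this sketch
}


def in_regulatory_loop(gene, network):
    """Return True if `gene` can be reached from itself along regulatory edges."""
    stack = list(network.get(gene, ()))
    seen = set()
    while stack:
        current = stack.pop()
        if current == gene:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(network.get(current, ()))
    return False


loop_members = sorted(g for g in REGULATES if in_regulatory_loop(g, REGULATES))
print(loop_members)
```

In this toy graph Oct3/4 and Sox2 sit inside a loop, whereas Fgf4 is a pure downstream target. The same check could be applied to a larger edge list once the regulatory connections of the other network members are specified.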
Understanding redundancy in TF function
In functional analyses of the mESC-specific TF network, it has often been observed that the function of a single TF is sufficient to produce a dominant effect in a gain-of-function assay, but its function is not essential for maintaining the ESC state in loss-of-function assays (Table 1). The term redundancy describes an activity of a molecule that is sufficient but not essential for a particular biological function. TF redundancy in mESCs is often coupled with a key extracellular signal for maintaining mESC self-renewal in culture, for example leukemia inhibitory factor (LIF; Box 1, Glossary). Activation of the LIF signal is sufficient and essential for the stable self-renewal of mESCs in conventional culture with serum (Smith et al., 1988) but is dispensable in serum-free 2i culture (Box 1, Glossary) (Ying et al., 2008), indicating that there may be redundant pathways that substitute for the requirement of the LIF signal in mESC self-renewal. LIF mainly acts via activation of the Jak-Stat3 pathway (Box 1, Glossary) (Niwa et al., 1998). The artificial activation of Stat3 in mESCs sustains self-renewal in the absence of LIF (Matsuda et al., 1999), whereas blocking its activation abolishes the biological action of LIF (Niwa et al., 1998), highlighting its pivotal role in mediating LIF function. It has also been shown that LIF signal activates the PI3K-Akt and Mek-Erk pathways (Box 1, Glossary) in parallel (see Fig. 5) (Niwa et al., 2009).
Previous studies have shown that the overexpression of any one of the TFs Nanog, Klf2, Klf4, Klf5, Tbx3 or Esrrb supports self-renewal in conventional mESC culture without LIF (Chambers et al., 2003; Mitsui et al., 2003; Ema et al., 2008; Hall et al., 2009; Niwa et al., 2009; Martello et al., 2012) (Table 1). However, the complete inactivation of any one of these TF genes by gene targeting does not affect mESC self-renewal in the presence of LIF (Mitsui et al., 2003; Ema et al., 2008; Hall et al., 2009; Niwa et al., 2009; Martello et al., 2012; Russell et al., 2015; Waghray et al., 2015) (Table 1). As discussed below, the functional redundancy of TFs could be based on various molecular mechanisms.
Functional redundancy on the same target sites
The simplest mechanism underlying functional redundancy is where different TFs have functions that overlap on the same target sites. During animal evolution, ancestral TF-encoding genes underwent duplication, often multiple times, giving rise to TF families (Nowick and Stubbs, 2010). One such TF family is the mouse Krüppel-like factor (Klf) family, which consists of 17 members that share a conserved DNA-binding domain recognizing an almost identical target sequence (McConnell and Yang, 2010). Klf TFs are divided into three groups based on the homology of their DNA-binding domain. Among them, three group 2 Klf family members, namely Klf2, Klf4 and Klf5, function in mESCs (Jiang et al., 2008). The overexpression of any one of these genes is sufficient to support LIF-independent mESC self-renewal (Ema et al., 2008; Hall et al., 2009; Niwa et al., 2009). Moreover, these three Klf family members are functionally interchangeable in the reprogramming of somatic cells into iPSCs, when used in combination with Oct3/4, Sox2 and Myc (Nakagawa et al., 2008), and when singly overexpressed to force the conversion of primed epiblast stem cells (EpiSCs) (Box 1, Glossary; Box 2) into naïve pluripotent stem cells (Jeon et al., 2016). The single or double knockdown of any of these three Klf family members is tolerated by mESCs, but knockdown of all three abrogates self-renewal (Jiang et al., 2008). The overlapping functions of the Klfs are a typical example of the functional redundancy that exists among TFs of the same family (Fig. 4A).
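The knockdown results described above behave like a logical OR: self-renewal persists as long as at least one of the three group 2 Klfs remains. The sketch below enumerates every knockdown combination under that simplified reading. The function name and the all-or-nothing treatment of knockdown are assumptions made for illustration, and the model ignores expression levels and every other TF in the network.

```python
from itertools import combinations

# The three group 2 Klf family members that function in mESCs.
GROUP2_KLFS = ("Klf2", "Klf4", "Klf5")


def self_renewal_maintained(knocked_down):
    """OR-gate model: one remaining Klf is enough to maintain self-renewal."""
    return any(klf not in knocked_down for klf in GROUP2_KLFS)


# Enumerate all knockdown combinations, from none to all three.
outcomes = {}
for n in range(len(GROUP2_KLFS) + 1):
    for combo in combinations(GROUP2_KLFS, n):
        outcomes[combo] = self_renewal_maintained(set(combo))

for combo, ok in outcomes.items():
    label = "+".join(combo) or "none"
    print(f"knockdown {label}: {'maintained' if ok else 'lost'}")
```

Of the eight combinations, only the triple knockdown loses self-renewal in this model, matching the pattern reported for single, double and triple knockdowns.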
Klf4 has been identified as one of the TFs recruited to mESC-specific super-enhancers, and its own expression is regulated by an mESC-specific super-enhancer (Hnisz et al., 2013). Since Klf2 and Klf5 share many functions with Klf4 in mESCs, including their DNA target sequence, they might also be recruited to mESC-specific super-enhancers. Interestingly, Klf2 and Klf5 are themselves associated with mESC-specific super-enhancers (Table 1), suggesting that all three Klfs could be members of the mESC-specific TF network.
Synergistic functions at super-enhancers
Nanog belongs to the homeobox TF family (Chambers et al., 2003). Among the 279 homeobox genes in the mouse genome, Nanog is a divergent gene that shares weak homology with other homeobox family members (Holland, 2012). Therefore, the continuous self-renewal of Nanog-null mESCs (Chambers et al., 2007) should not be due to functional redundancy with other homeobox genes. Nanog has been shown to be recruited to mESC-specific super-enhancers together with other TFs, of which only Oct3/4 and Sox2 are absolutely essential for maintaining mESC self-renewal in any culture condition (Niwa et al., 2000; Masui et al., 2007), whereas the other recruited factors are dispensable, either absolutely or in a context-dependent manner (Table 1). Stat3 is essential in conventional mESC culture conditions, but serum-free 2i culture conditions allow Stat3-null mESCs to self-renew (Ying et al., 2008). This might be because TFs act synergistically on mESC-specific super-enhancers, in which case the accumulation of a certain number of TFs together with Oct3/4 and Sox2 might be sufficient to stimulate transcriptional activation (Fig. 4B). Such TFs might also share overlapping functions that enable the complex bridged by the Mediator complex to achieve DNA binding specificity and/or affinity above functional threshold levels. Alternatively, their synergistic action might confer affinity for the Mediator complex, which mediates transcriptional bursting. It has been suggested that sub-optimization of the binding motifs, which decreases the binding affinities of individual TFs, could be selected in evolution to avoid ectopic activation by any single TF and to maximize synergistic action on super-enhancers (Farley et al., 2015).
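One way to read the threshold idea above is as a counting rule: a super-enhancer fires when Oct3/4 and Sox2 are both bound and enough of the remaining TFs have accumulated alongside them. The sketch below implements that rule for the TFs listed as super-enhancer occupants. The threshold value, the equal weighting of every auxiliary TF and the function name are hypothetical and chosen purely for illustration; no such number is given in the text.

```python
# TFs reported at mESC-specific super-enhancers, split into the two
# essential factors and the remaining (individually dispensable) ones.
ESSENTIAL = {"Oct3/4", "Sox2"}
AUXILIARY = {"Nanog", "Klf4", "Esrrb", "Nr5a2", "Prdm14",
             "Tfcp2l1", "Smad3", "Stat3", "Tcf7l1"}
MIN_AUXILIARY = 6  # hypothetical threshold, not a measured value


def enhancer_active(bound, min_auxiliary=MIN_AUXILIARY):
    """Active only if both essential TFs are bound and enough others join them."""
    if not ESSENTIAL <= bound:
        return False
    return len(bound & AUXILIARY) >= min_auxiliary


all_bound = ESSENTIAL | AUXILIARY
print(enhancer_active(all_bound))               # every TF present
print(enhancer_active(all_bound - {"Nanog"}))   # one auxiliary TF lost
print(enhancer_active(all_bound - {"Oct3/4"}))  # an essential TF lost
```

With this rule, removing any single auxiliary TF leaves the enhancer active, whereas removing Oct3/4 or Sox2 silences it regardless of how many other factors are present. That is the qualitative distinction between redundant and non-redundant network members.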
The robustness of the TF network
Multiple TFs form a network with interconnected regulatory loops as discussed above, and the TF network receives multiple inputs from extracellular signals (Niwa et al., 2009). Such multiple interconnections could confer robustness to the network to protect against the loss of one or more of its components. Tbx3 is a member of the T-box family, which consists of 17 genes in the mouse genome (Papaioannou, 2014) divided into five subfamilies. Among the Tbx2 subfamily, only Tbx3 is expressed at a high level in mESCs. Tbx3-null mESCs continue to self-renew with no evidence that other subfamily members undergo compensatory upregulation (Russell et al., 2015; Waghray et al., 2015). This suggests that the functional redundancy of Tbx3 in mESC self-renewal might not involve other subfamily members. Also, Tbx3 is not recruited to the mESC-specific super-enhancers (Table 1). How can the redundant function of Tbx3 be explained? It has been shown that Tbx3 is a target of the PI3K-Akt pathway under the LIF signal (Niwa et al., 2009). The PI3K-Akt pathway functions in parallel to the canonical Jak-Stat3 pathway in mediating the action of LIF in maintaining mESC self-renewal. Thus, the parallel action of these pathways downstream of LIF might explain the functional redundancy of Tbx3 in the maintenance of mESC self-renewal. This parallel function might converge at the regulatory elements of a gene targeted both by Tbx3 and by Klf4, the latter acting downstream of the Jak-Stat3 pathway. This could also be the case for Nanog because of its possible link to the PI3K-Akt-Tbx3 axis (Niwa et al., 2009). Alternatively, Tbx3 might act in a different way that is complementary to the maintenance of self-renewal, for example by inhibiting differentiation, promoting the cell cycle or preventing cell death.
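The parallel-pathway explanation can be sketched as a reachability question: if the LIF signal can reach a shared target gene by two routes, removing a node on one route does not cut the signal. The wiring below (Jak-Stat3 to Klf4, PI3K-Akt to Tbx3, both converging on one target) is one simplified reading of the paragraph above and should be treated as an assumption. The node named "shared target" is a hypothetical placeholder, not a specific gene.

```python
# Simplified, assumed wiring of the two LIF-dependent routes.
# "shared target" is a hypothetical placeholder for a co-regulated gene.
PATHWAY = {
    "LIF": ["Jak-Stat3", "PI3K-Akt"],
    "Jak-Stat3": ["Klf4"],
    "PI3K-Akt": ["Tbx3"],
    "Klf4": ["shared target"],
    "Tbx3": ["shared target"],
}


def signal_reaches(target, removed=frozenset(), source="LIF"):
    """Depth-first search for a path from source to target that avoids removed nodes."""
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node in removed or node in seen:
            continue
        if node == target:
            return True
        seen.add(node)
        stack.extend(PATHWAY.get(node, []))
    return False


print(signal_reaches("shared target", removed={"Tbx3"}))
print(signal_reaches("shared target", removed={"Tbx3", "Klf4"}))
```

In this toy graph the loss of Tbx3 alone leaves the Jak-Stat3 route intact, while the loss of both Tbx3 and Klf4 disconnects the target from LIF. The sketch captures only connectivity and says nothing about signal strength or about the alternative roles of Tbx3 mentioned in the text.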
Considering the complexity of the interactions between different TFs in a network, the dispensability of a single component would help to ensure network robustness. The loss of a component might initially alter the balance in the expression of the other TFs in the network, but then the network would become stabilized in a different balanced state (Fig. 4C). This could also be the case for any TFs that show functional redundancy.
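The notion that a network settles into a different balanced state after losing a component can be illustrated with a deliberately simple numerical toy. In the sketch below, each TF is activated by the summed level of the others with saturating kinetics, and the levels are iterated until they stop changing. The update rule, the parameter values and the names TF_A, TF_B and TF_C are all hypothetical. The example is not fitted to any data and is meant only to show the qualitative behaviour.

```python
def steady_state(tfs, signal=2.0, steps=200):
    """Iterate a toy mutual-activation rule until the TF levels settle.

    Each TF level is set to signal * others / (1 + others), where `others`
    is the summed level of all remaining TFs (a saturating activation).
    """
    levels = {tf: 1.0 for tf in tfs}
    for _ in range(steps):
        new_levels = {}
        for tf in tfs:
            others = sum(levels[o] for o in tfs if o != tf)
            new_levels[tf] = signal * others / (1.0 + others)
        levels = new_levels
    return levels


full = steady_state(["TF_A", "TF_B", "TF_C"])   # intact toy network
reduced = steady_state(["TF_A", "TF_B"])        # TF_C removed
print({tf: round(v, 3) for tf, v in full.items()})
print({tf: round(v, 3) for tf, v in reduced.items()})
```

With these illustrative parameters the intact network balances at a level of 1.5 per TF, and the network lacking TF_C rebalances at 1.0 rather than collapsing to zero. The remaining components are expressed at altered but stable levels, which is the behaviour sketched in Fig. 4C.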
Non-redundant TFs and core TFs
Among the members of the mESC-specific TF network, only Oct3/4 and Sox2 are absolutely essential for maintenance of the mESC pluripotent state irrespective of the culture condition, prompting the question: what are the unique characteristics of Oct3/4 and Sox2 that make them indispensable, while other TFs are not? Oct3/4 is a member of the class V POU family (Frankenberg et al., 2014). The mouse genome contains only two class V POU family members, and of these only Oct3/4 is expressed in mESCs. Sox2 belongs to the Sry-related HMG box (Sox) family (Kamachi and Kondoh, 2013). Together with Sox1 and Sox3, Sox2 is a group B1 family member. Neither Sox1 nor Sox3 is expressed in mESCs and yet they are able to functionally replace Sox2 in maintaining pluripotency when expressed artificially (Niwa et al., 2016).
The formation of predominantly heterodimeric Oct3/4 and Sox2 complexes is a unique feature among members of the mESC-specific TF network. As previously discussed, heterodimer formation enables TFs to bind with higher specificity and affinity to their target sites than when binding as monomers. Such high specificity and affinity might point to a more important role of Oct3/4 and Sox2 over other TFs at the super-enhancers (Fig. 4B). They might act as essential TFs to recruit the Mediator complex, which would then allow the redundant cooperation of other TFs to achieve higher enhancer activity. Alternatively, the mESC-specific TF network might not be able to achieve a balanced state without Oct3/4 or Sox2.
Rewiring the mESC-specific TF network for differentiation
The TF networks active within a cell define its transcriptome, which in turn defines cell fate and phenotype. In the case of mESCs, TF networks need to be stably maintained in response to extracellular signals, and then to transition rapidly to another network state to accomplish differentiation. For example, when Oct3/4 is downregulated in mESCs, the cells differentiate towards trophectoderm (Box 1, Glossary) (Niwa et al., 2000). In the presence of Fgf4, these trophectoderm cells acquire the characteristics of TSCs. This efficient transition through multiple differentiation states provides a useful model in which to investigate the dynamics of the TF network during differentiation.
The downregulation of Oct3/4 in mESCs mediates the upregulation of Cdx2 (Niwa et al., 2000). Since Cdx2 overexpression is also sufficient to induce the transition of mESCs to TSCs (Niwa et al., 2005), it is likely that the regulatory link between Oct3/4 and Cdx2 is the key event that drives this transition. The genome-wide analysis of this transition event has revealed several interesting findings that shed light on how TF network transitions occur during differentiation (Adachi et al., 2013).
Changes in TF expression
Following the artificial downregulation of Oct3/4 in mESCs, most members of the mESC-specific TF network are subsequently downregulated during the transition to TSC fate. In parallel, members of the TSC-specific TF network are upregulated (Niwa et al., 2000; Adachi et al., 2013). It has been proposed that Oct3/4 and Cdx2 reciprocally repress each other to ensure the exclusive expression of either gene (Niwa et al., 2005). Such reciprocal repression might also occur between other members of the mESC-specific and TSC-specific TF networks to mediate the rapid transition from epiblast to trophoblast fate. Even during the dynamic exchange of TFs in the transition from mESC to TSC fate, some TFs maintain their expression (Adachi et al., 2013). A key example is Sox2 (Figs 2 and 5) which, as described above, is a binding partner of Oct3/4 and essential for mESC self-renewal. However, Sox2 expression is maintained in TSCs, and its function is essential for TSC self-renewal (Adachi et al., 2013). The requirement of Sox2 in both cell types in vitro is consistent with its requirement in vivo: Sox2 is required for the maintenance of the epiblast cell population (the in vivo counterpart of mESCs), as well as for the expansion of the trophoblast (the in vivo counterpart of TSCs) (Avilion et al., 2003). Expression of Esrrb is also common to both the mESC-specific and TSC-specific TF networks. However, in this case, its function is dispensable in mESCs (Martello et al., 2012), but is essential in TSCs based on an inhibitor assay (Tremblay et al., 2001). The phenotype of Esrrb-null embryos supports these in vitro findings since these mutants have a normal epiblast but die from trophoblast defects (Luo et al., 1997). The few components shared between the mESC-specific and TSC-specific TF networks seem to function by forming distinct connections with other TFs in each network.
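The reciprocal repression proposed for Oct3/4 and Cdx2 has the form of a two-gene toggle switch, a motif that can hold either gene high while keeping the other low. The sketch below integrates a generic toggle-switch model with a simple Euler scheme to show this mutual exclusivity. The equations, the rate constant, the Hill coefficient and the initial values are textbook-style assumptions for illustration. They are not parameters measured for Oct3/4 or Cdx2.

```python
def simulate_toggle(oct4, cdx2, max_rate=4.0, hill=2, dt=0.05, steps=2000):
    """Euler integration of a generic mutual-repression (toggle switch) model.

    d(oct4)/dt = max_rate / (1 + cdx2**hill) - oct4
    d(cdx2)/dt = max_rate / (1 + oct4**hill) - cdx2
    All parameter values are illustrative assumptions.
    """
    for _ in range(steps):
        d_oct4 = max_rate / (1.0 + cdx2 ** hill) - oct4
        d_cdx2 = max_rate / (1.0 + oct4 ** hill) - cdx2
        oct4, cdx2 = oct4 + dt * d_oct4, cdx2 + dt * d_cdx2
    return oct4, cdx2


# Start with Oct3/4 ahead (mESC-like) or with Cdx2 ahead (after Oct3/4 loss).
esc_like = simulate_toggle(oct4=2.0, cdx2=0.1)
tsc_like = simulate_toggle(oct4=0.1, cdx2=2.0)
print("Oct3/4-high start:", [round(v, 2) for v in esc_like])
print("Cdx2-high start:  ", [round(v, 2) for v in tsc_like])
```

Whichever factor starts ahead ends up high and suppresses the other, so the same circuit supports two exclusive states. In this picture, an experimental downregulation of Oct3/4 corresponds to pushing the system across the boundary between the two states, after which Cdx2 takes over.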
Rewiring between extracellular signalling pathways and the TF network
Exogenous Fgf4 is an essential requirement for the acquisition of TSC-like characteristics from mESCs (Adachi et al., 2013). The Fgf signal is transduced by Fgf receptors and activates multiple signal transduction pathways (Korsensky and Ron, 2016). Among them, the Mek-Erk pathway is responsible for exerting the action of Fgf4, and inhibition of this pathway inhibits differentiation along the TSC lineage (Adachi et al., 2013). Embryo-derived TSCs also require Fgf4 for continuous self-renewal in culture (Tanaka et al., 1998). As in the case of LIF signalling in mESCs, the overexpression of particular TFs can replace the requirement for Fgf4 in TSCs (Adachi et al., 2013). When Sox2 and Esrrb or Sox2 and Tfap2c are overexpressed in TSCs together, they can support long-term self-renewal without Fgf4. Since these TFs are downregulated more rapidly than others following the withdrawal of Fgf4 or the inhibition of Mek in TSCs, these TFs are likely to be targets of the Mek-Erk pathway under the Fgf4 signal (Adachi et al., 2013; Latos et al., 2015).
Although some TFs, such as Sox2 and Esrrb, are expressed both in mESCs and TSCs, they interact with different signalling pathways in a cell type-specific manner. In mESCs, the Mek-Erk pathway has a negative effect on the maintenance of the mESC-specific TF network, and the inhibition of Mek stabilizes it. Moreover, unlike in TSCs, neither the inhibition of Mek nor the addition of Fgf4 to the culture medium affects the expression of Sox2 and Esrrb in mESCs (Adachi et al., 2013). These findings indicate that a different regulatory cue from the Mek-Erk pathway is established for Sox2 and Esrrb during the transition from mESCs to TSCs (Fig. 5). In summary, the connections between extracellular signals and the TF network are likely to undergo extensive rewiring when cells differentiate into another cell type. In this way, the same TF could be regulated by different signalling cues in a context-dependent manner.
Rewiring of the interconnections among TFs in the network
In the mESC-specific TF network, Sox2 is believed to be regulated by mESC-specific enhancers that are bound by Oct3/4 and Sox2. A recent report has revealed that a distal enhancer located >100 kb downstream of Sox2 makes a greater contribution to regulating its expression in mESCs than the proximal OCT-SOX motifs (Zhou et al., 2014). This enhancer is a typical super-enhancer that is occupied by multiple TFs, such as Oct3/4, Sox2, Nanog, Smad1, Esrrb, Klf4, Nr5a2, Tfcp2l1 and Stat3, which are specific to mESCs. A different enhancer is therefore likely to be responsible for regulating the expression of Sox2 in TSCs, although this enhancer remains to be discovered. Changes in enhancer activity are often observed during differentiation (Long et al., 2016). In the case of Sox2, multiple enhancers are present in the regions upstream and downstream of this gene, each of which activates its expression in a distinct tissue-specific manner (Uchikawa et al., 2003; Okamoto et al., 2015). Such differential usage of enhancer elements allows the same TF to be incorporated into different networks (Fig. 5).
Changing TF target sites
During the transition of mESCs to TSCs, the binding sites of Sox2 change dramatically (Adachi et al., 2013). The occupancy of most Sox2 binding sites in mESCs is lost in TSCs, and new TSC-specific binding sites come into use (Adachi et al., 2013). By comparing the consensus binding site sequences for Sox2 in mESCs and TSCs, the OCT-SOX motif was found to be the most enriched motif at mESC-specific binding sites, whereas the AP-2 and Sox motifs were enriched at TSC-specific binding sites (Adachi et al., 2013). Tfap2c (AP2γ) was identified as the AP-2 family member responsible for the recruitment of Sox2 to TSC-specific sites. In this case, switching binding partners could be a primary mechanism by which Sox2 binds to different target sites (Fig. 2). How a TF interacts with other TFs at a super-enhancer might also shape how it interacts with its target genes, as is the case for Sox2. Sox2 is recruited to mESC-specific super-enhancers, including one near the Sox2 gene. Other mESC-specific TFs are co-recruited to these super-enhancers, where they act synergistically to stabilize their binding to the super-enhancer, resulting in the activation of transcription. When the cells exit from the mESC state, most of these TFs become downregulated, and as a result Sox2 binding to these mESC-specific super-enhancers is lost. Concurrently, multiple TSC-specific TFs become active and cooperate with Sox2 to bind and activate TSC-specific super-enhancers (Figs 2 and 5).
Esrrb also undergoes dynamic changes to its pattern of binding at the onset of differentiation to the TSC lineage (Latos et al., 2015). Among the 9049 peaks identified by a ChIP-seq analysis of Esrrb in mESCs, only 3027 were shared by TSCs. Esrrb is a member of the group of TFs recruited to mESC-specific super-enhancers, but it also binds to TSC-specific TF-encoding genes, such as Eomes and Elf5 (Latos et al., 2015). In TSC-specific Esrrb binding sites, the Cdx2 motif is enriched. These findings suggest that during the transition of the TF network, a single TF can alter its target binding sites in different ways, for example by changing its partner TFs or by cooperating with multiple TFs.
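To put the peak counts quoted above in proportion, the short calculation below converts them into fractions. It is only a sketch: the helper name `shared_fraction` is hypothetical, and the two counts are simply those cited in the text from Latos et al. (2015), not a re-analysis of the ChIP-seq data.

```python
def shared_fraction(total_peaks, shared_peaks):
    """Fraction of peaks in one cell type that are also found in the other."""
    if total_peaks <= 0 or not 0 <= shared_peaks <= total_peaks:
        raise ValueError("need total > 0 and 0 <= shared <= total")
    return shared_peaks / total_peaks


mesc_peaks = 9049  # Esrrb ChIP-seq peaks in mESCs, as quoted in the text
shared = 3027      # of those, peaks also present in TSCs

frac = shared_fraction(mesc_peaks, shared)
print(f"shared with TSCs: {shared} peaks ({frac:.1%})")
print(f"mESC-specific:    {mesc_peaks - shared} peaks ({1 - frac:.1%})")
```

In other words, about one-third of the mESC Esrrb peaks are retained in TSCs, and about two-thirds are not.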
Programming, reprogramming and the evolution of the TF network
The TF network functions as a whole; it defines the mode of action of each TF, including how it is regulated by extracellular signals and its specificity for target sites in the genome. By understanding TFs as a network, it is possible to generate several hypotheses that might help to answer questions concerning the functions of TFs in various contexts such as the programming of differentiation, reprogramming of cell types and the evolution of novel cell types.
Programming the TF network
In the developmental context, the TF network transitions from one state to another under the regulation of extracellular signals. Cell differentiation can occur in a short time period, and the direction of differentiation can be defined by the capacity of the original cell type. The number of extracellular signals required for differentiation can be limited, and the same signalling pathway can be used in different processes to provide different outcomes. An obvious example is found in the varied functions of the LIF signal. The LIF signal works to prevent differentiation of mESCs, but promotes terminal differentiation of M1 leukemia cells (Tomida et al., 1984), although the signal is transduced via activation of Stat3 in both cell types (Hutchins et al., 2013). How can a TF network transition from one state to another in a context-dependent manner and with such rapid kinetics? As discussed above, the function of a single TF depends on the context in which it functions. The activation of a single TF that is not a member of the network might not have any effect in that cell type, but can have an impact in another cell type. To define the direction of differentiation, TFs within the original network must cooperate with the activated TF. In the case of the transition of mESCs to TSCs, the overexpression of Cdx2 triggers this transition with an efficiency of almost 100% (Niwa et al., 2005). This response might depend on the presence of Sox2 and Esrrb as members of the mESC-specific TF network. Indeed, overexpression of Cdx2 in mouse EpiSCs, which are in the primed pluripotent state and no longer possess a naïve state-specific TF network that includes Esrrb, does not induce activation of the TSC-specific TF network (Blij et al., 2015). If this is the general rule, then the presence of certain TFs in the original network might define the capacity of a cell to undergo differentiation in a physiological context. Moreover, the newly forming TF network might advantageously share members with the original network in order to facilitate the transition to the next cell state. Thus, the capacity for a cell to transition between states might be programmed by the TF network itself.
From the programmed transition of the TF network, it is clear that TFs do indeed form a network in which each TF has a similar impact on the entire network. In the case of the transition from mESCs to TSCs, the downregulation of Sox2 induces differentiation to trophectoderm as efficiently as does the repression of Oct3/4 (Masui et al., 2007). Moreover, the overexpression of Eomes, Elf5 or Tfap2c induces TSC-like differentiation as efficiently as does the overexpression of Cdx2 (Niwa et al., 2005; Ng et al., 2008; Kuckenberg et al., 2010). In these cases, the resulting states are the same, suggesting that a programmed transition can be initiated by different triggers (Fig. 6).
The mESC-specific TF network is programmed to transition to a few different states directly. The TSC-specific TF network is one such destination, and it could be a programmed default choice since it is evoked by the artificial disruption of the mESC-specific TF network. The primitive endoderm-specific (Box 1, Glossary) TF network is an alternative choice: differentiation along this lineage can be stimulated by overexpressing one of two Gata TFs, namely Gata4 or Gata6 (Fujikura et al., 2002). It was recently reported that a combination of Activin/Nodal and Wnt signals can induce differentiation of mESCs toward primitive endoderm (Anderson et al., 2017). Interestingly, differentiation occurred only in the absence of insulin, while Activin/Nodal and Wnt signals in the presence of insulin served to sustain self-renewal (Anderson et al., 2017). These data indicate that specific combinations of the signal inputs are important and can drive different outcomes from a single mESC-specific TF network. In the case of the transition from an mESC-specific to a primitive endoderm-specific TF network, there is some evidence to suggest that Oct3/4 and Tbx3 might function as pre-existing TFs to mediate the transition, as found in the case of Sox2 and Esrrb in the transition to a TSC-specific network (Aksoy et al., 2013; Lu et al., 2011). Nishiyama et al. (2009) examined the effect of overexpressing 50 TFs in mESCs and found that Cdx2 has the highest impact on the transcriptional change and is the sole TF that can induce an obvious morphological differentiation event, supporting the hypothesis that the pre-existing TF network might define the number of networks that can form in the next cell state.
Reprogramming the TF network
Following the discovery of iPSCs, it was confirmed that the combinatorial overexpression of TFs could induce the reprogramming of multiple different cell types from one state to another, albeit with low efficiency (Morris, 2016). Forced reprogramming does not take place in a physiological context and requires very specific combinations of TFs to work, which might reflect the nature of the TF network in the starting cell. In the case of Sox2, its mode of function is defined by other TFs in the network. Co-expression of Oct3/4 allows Sox2 to bind mESC-specific target genes but it is not sufficient to activate the mESC-specific TF network. Klf4 is a minimal requirement to support the activation of the mESC-specific TF network, which might be due to sufficient activation of mESC-specific super-enhancers.
If the cells being reprogrammed already express Sox2 as a member of their TF network, then exogenous Sox2 can be omitted from the reprogramming TF cocktail (Eminli et al., 2008; Kim et al., 2008b). Neural stem cells (NSCs) express Sox2, Klf4 and Myc and can be reprogrammed by Oct3/4 alone (Kim et al., 2009). However, although the NSC-specific and mESC-specific TF networks share Sox2 as a common component, the efficiency of NSC reprogramming is as low as that of fibroblasts being reprogrammed to iPSCs: 0.11% with Oct3/4 and Klf4, and 0.014% with Oct3/4 alone (Kim et al., 2008b; Kim et al., 2009). Thus, the overlap of a few TF members between physiologically unlinked networks does not contribute to the efficiency of network reprogramming.
The nature of reprogramming highlights the important function of specific TFs in the context of networks that determine cell fate. An alternative set of TFs, consisting of Sall4, Nanog, Esrrb and Lin28, is as efficient at reprogramming somatic cells to iPSCs as the original TF reprogramming cocktail (Buganim et al., 2014). The TF networks of naïve and primed pluripotent cells share Oct3/4, Sox2, Nanog and other TFs as common components. The reprogramming of primed pluripotent stem cells into the naïve state is achieved by the ectopic expression of various single TFs that are specific to naïve pluripotent cells (Table 1), but the efficiency of reprogramming is always low (∼1%) (Weinberger et al., 2016). In these cases, different TFs result in the activation of the same naïve-specific network. Such varied effects of different TFs fit with the hypothesis that TFs are reciprocally interconnected within the network.
Unidirectional transition of the TF network
The presence of shared TFs among networks might facilitate state transitions via modulation of the activity of a single TF. However, cell fate transitions in the developmental context occur in a unidirectional manner. Indeed, the transition of mESCs to TSCs occurs efficiently in response to the manipulation of a single TF and yet the reverse transition is inefficient, as with other reprogramming events in general (Wu et al., 2011) (Fig. 6). TSCs can be reprogrammed to the mESC state using the traditional reprogramming cocktail minus Sox2 (Wu et al., 2011), but the efficiency is ∼0.1% at best with four factors. Therefore, a mechanism that defines the direction of cell state transitions must exist between two linked TF networks.
The disruption of the mESC-specific network by the inactivation of one or both of its core TFs Oct3/4 or Sox2 automatically activates the transition to the TSC-specific TF network, but not vice versa (Niwa et al., 2000; Masui et al., 2007). The elimination of Cdx2 or Esrrb in TSCs causes them to further differentiate toward the terminally differentiated cell types of the trophectoderm lineage, rather than reactivating the mESC-specific TF network (Tremblay et al., 2001; Niwa et al., 2005). A possible explanation for this unidirectionality is the existence of negative feedback between the two TF networks and the influence that this might have on transitions between them. During the mESC-TSC transition, the mESC-specific network is negatively regulated by the TSC-specific network, which effectively turns the mESC-specific TF network off. This negative regulation might facilitate rapid cell fate transition, and might be stronger than the negative regulation of the TSC-specific network by the mESC-specific network. Chicken ovalbumin upstream promoter transcription factor (Coup-TF) I and II (Nr2f1 and Nr2f2), which are members of the nuclear receptor family and have been shown to strongly suppress the Oct3/4 promoter, are expressed in non-pluripotent cells, including TSCs (Ben-Shushan et al., 1995; Adachi et al., 2013). The affinity of Coup-TFs for the Oct3/4 promoter binding site is much higher than that of the Oct3/4 activators that are expressed in mESCs and bind to the same site (Ben-Shushan et al., 1995). This simple competition might work as a mechanism to prevent the reverse transition of the TF network.
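The competition argument can be made concrete with the standard equilibrium expression for mutually exclusive binding to a single site, in which each factor's occupancy is weighted by its concentration divided by its dissociation constant. The sketch below assumes this simple model; the function name `occupancy`, the concentrations and the tenfold affinity difference are hypothetical values chosen for illustration, not measurements from Ben-Shushan et al. (1995).

```python
def occupancy(conc, kd):
    """Fractional occupancy of one site shared by competing binders.

    conc and kd map factor name -> concentration / dissociation constant
    (same arbitrary units). Assumes mutually exclusive equilibrium binding.
    """
    weights = {name: conc[name] / kd[name] for name in conc}
    z = 1.0 + sum(weights.values())  # the 1.0 term represents the empty site
    return {name: w / z for name, w in weights.items()}


# Hypothetical numbers: equal concentrations, repressor binds 10x more tightly.
alone = occupancy({"activator": 1.0}, {"activator": 1.0})
competed = occupancy({"activator": 1.0, "Coup-TF": 1.0},
                     {"activator": 1.0, "Coup-TF": 0.1})

print(f"activator alone:        {alone['activator']:.2f}")
print(f"activator with Coup-TF: {competed['activator']:.2f}")
print(f"Coup-TF:                {competed['Coup-TF']:.2f}")
```

Under these assumed values, the activator's share of the site falls from one-half to about one-twelfth once the higher-affinity competitor is present, which is the kind of asymmetry that could disfavour the reverse transition.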
The relationship between two developmentally linked TF networks can be defined by several characteristics: (1) connections between the overlapping members; (2) the capacity of extracellular signals to activate key TFs; and (3) strong negative regulation of the original TF network by the network of the new cell state. Sequential differentiation events during development would be programmed according to these principles, while (1) and (3) could be important parameters in defining the capability of reprogramming. However, the existence of a few TFs that are common to different networks does not appear to increase reprogramming efficiency, which might depend on the strength of (3). Since cell type-specific networks are programmed by combinatorial TF expression, and since the number of TF-encoding genes is limited, it might happen by chance that unlinked TF networks meet the characteristics of both (1) and (3), which might enable cells to efficiently transition between states. In support of this, Di Stefano et al. (2014) reported that mouse primary B cells can be reprogrammed to iPSCs with very high efficiency (∼95%) and rapid kinetics through the transient expression of C/EBPα, followed by the expression of four iPSC factors.
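The reprogramming efficiencies quoted in this section span several orders of magnitude, which is easier to appreciate side by side. The snippet below tabulates only the approximate figures given in the text; the labels and the helper `expected_colonies` are hypothetical, and because the values come from different studies and culture conditions the comparison is indicative only.

```python
# Approximate efficiencies quoted in the text, as fractions of starting cells.
efficiencies = {
    "NSC -> iPSC (Oct3/4 + Klf4)": 0.0011,
    "NSC -> iPSC (Oct3/4 alone)": 0.00014,
    "primed -> naive (single TF)": 0.01,
    "TSC -> mESC (best case)": 0.001,
    "B cell -> iPSC (C/EBPalpha, then 4 factors)": 0.95,
}


def expected_colonies(efficiency, n_cells):
    """Expected number of reprogrammed cells from n_cells starting cells."""
    return efficiency * n_cells


reference = efficiencies["B cell -> iPSC (C/EBPalpha, then 4 factors)"]
for label, eff in sorted(efficiencies.items(), key=lambda kv: kv[1]):
    fold = reference / eff
    print(f"{label:<45s} {eff:8.3%}  "
          f"{expected_colonies(eff, 100_000):9.0f} per 1e5 cells  "
          f"{fold:7.0f}-fold below B cells")
```

On these figures, the B cell protocol is roughly three to four orders of magnitude more efficient than the Oct3/4-based NSC and TSC conversions.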
Evolution of the TF network
In animal evolution, the number of TFs encoded in the genome has increased with the number of cell types. How is the acquisition of an evolutionarily novel cell type achieved with respect to the underlying TF network? The placenta is an evolutionary novelty in mammals, and thus the TSC-specific TF network is an evolutionary novelty of placental mammals. However, very few novel genes that are unique to mammals function in placental development. Peg10, which derives from a retrotransposon, is a rare example of this category (Ono et al., 2006). By contrast, the molecular functions of TFs tend to be highly conserved during evolution. Sox2 is a member of the group B1 Sox family, which is conserved in invertebrates. In Drosophila, the group B1 Sox family members function in neuronal development, and this neuronal function is conserved in the mammalian group B1 members Sox1, Sox2 and Sox3 (Kamachi and Kondoh, 2013). The amino acid sequences of the Drosophila and mammalian group B1 members share homology only in the HMG box (Niwa et al., 2016). However, mouse Sox2 can be replaced by Drosophila SoxN in maintaining the pluripotency of mESCs, indicating that the conserved function of the group B1 Sox family is effective even in mESCs (Niwa et al., 2016). Since there is no evidence that a pluripotent cell population exists in developing Drosophila, it follows that the pluripotent state is an evolutionary novelty in vertebrates, especially mammals. Therefore, the specific function of Sox2 in mESCs must have been co-opted from its evolutionarily conserved function.
The use of evolutionarily conserved TFs to define novel cell types suggests that the evolutionary acquisition of novelty could derive from the establishment of new combinations of TFs in a network, rather than the acquisition of new TF functions. In the new network, each TF interconnects with other TFs by acquiring new enhancers with little or no acquisition of new molecular function. The flexibility of the connection between the extracellular signals and the TF networks, and between the TF networks and the target genes, would allow a new combination of regulatory elements to evolve. The evolution of cis-regulatory elements was emphasized in a genetic theory of morphological evolution (Carroll, 2008), and a gain of function in cis-regulatory elements can occur by insertion of new elements without affecting the ancestral regulatory element, allowing co-option of the genes for novel functions (Peter and Davidson, 2011). In addition, the combined activity of multiple TFs on a super-enhancer to recruit the Mediator complex could make it relatively straightforward to achieve new synergistic functions among a novel combination of TFs. These events would enable a new combination of TFs in the network to be established, which in turn could lead to the specification of an evolutionarily novel cell type. Among the TFs of the TSC-specific network, the evolutionarily conserved functions of Sox2, Eomes and Cdx2 are found in neuroectoderm, mesoendoderm and definitive endoderm, respectively (Box 1, Glossary) (Sasai, 2001; Beck and Stringer, 2010; Probst and Arnold, 2017), but they cooperate to define an evolutionarily new cell type, the trophoblast. Such flexibility in the combination of TFs could be a result of the flexible design of a TF network.
To reveal the function of TFs in defining cell phenotype, it is essential to analyze the structure and function of TF networks. Several studies have described one or more aspects of TF networks. Pioneering studies reported the structure of interconnected TF networks based on data from ChIP-on-chip and ChIP-seq of multiple TFs (Chen et al., 2008; Kim et al., 2008a). However, knowing the structure of a TF network is not enough to enable modelling of how it functions: quantification of the parameters (e.g. binding affinity, binding frequency and binding mode) coupled with the biological outcome is also required. To monitor the activity of a TF network, perturbation experiments with quantitative outcomes (e.g. transcriptomics) are required. Dunn et al. (2014) reported a computational modelling study of the mESC-specific TF network using a set of functionally validated TFs based on gene expression data in different culture conditions. The simplest version of their model was based on Boolean network formalism [see box 2 of Sharpe (2017)] and comprised only 12 TFs with 16 interactions under three signal inputs, indicative of the existence of a small cell-type-specific TF network. More recently, Goode et al. (2016) reported that multiple steps occur during the differentiation of mESCs into macrophages that involve dynamic changes to TF networks. These authors collected multiple 'omics data (RNA-seq, DNase I hypersensitivity-seq and ChIP-seq for histone marks and TFs) and combined them to depict the transition of the TF network.
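For readers unfamiliar with the Boolean network formalism mentioned above, the following toy example shows the general idea: each TF is either on or off, and its next state is a logical function of the current states and the signal inputs. This is emphatically not the model of Dunn et al. (2014); the three nodes, the update rules and the input names (`oct34_knockdown`, `fgf4`) are invented for illustration, loosely following the qualitative relationships described in this review.

```python
def step(state, inputs):
    """One synchronous update of a toy three-node Boolean network.

    The rules are illustrative assumptions, not a published model.
    """
    return {
        # Oct3/4 persists unless repressed by Cdx2 or knocked down.
        "Oct3/4": state["Oct3/4"] and not state["Cdx2"]
                  and not inputs["oct34_knockdown"],
        # Cdx2 needs the absence of Oct3/4 and the Fgf4 signal.
        "Cdx2": (not state["Oct3/4"]) and inputs["fgf4"],
        # Sox2 is supported by either network.
        "Sox2": state["Oct3/4"] or state["Cdx2"],
    }


def run_to_attractor(state, inputs, max_steps=50):
    """Iterate until a previously visited state recurs (or max_steps)."""
    seen = []
    while state not in seen and len(seen) < max_steps:
        seen.append(state)
        state = step(state, inputs)
    return state


mesc = {"Oct3/4": True, "Cdx2": False, "Sox2": True}

stable = run_to_attractor(mesc, {"oct34_knockdown": False, "fgf4": True})
tsc_like = run_to_attractor(mesc, {"oct34_knockdown": True, "fgf4": True})
no_fgf4 = run_to_attractor(mesc, {"oct34_knockdown": True, "fgf4": False})
reverse = run_to_attractor(tsc_like, {"oct34_knockdown": False, "fgf4": True})

for name, s in [("unperturbed", stable), ("knockdown + Fgf4", tsc_like),
                ("knockdown, no Fgf4", no_fgf4), ("release knockdown", reverse)]:
    print(f"{name:<20s} {s}")
```

Even this small rule set reproduces, in caricature, three behaviours discussed in the text: the starting state is stable, the transition needs both loss of Oct3/4 and the Fgf4 input, and releasing the knockdown does not restore the original state. The transient loss of Sox2 during the run is a side effect of synchronous updating in this toy model rather than a biological claim.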
The simulations of Dunn et al. and Goode et al. raise several questions about the functioning of TF networks. The first is how to calculate the cooperative effect of multiple TFs during the activation of super-enhancers. This could be additive, synergistic or conditional. In the case of mESC-specific super-enhancers, Oct3/4 and Sox2 could be a prerequisite for mediating the functions of other TFs in either an additive or synergistic manner. It is difficult, however, to compose a rule governing such cooperation, especially considering the possible role of repressive factors such as Tcf7l1, the nucleosome remodelling deacetylase (NuRD) complex, and Gro/TLE transcriptional co-repressors (Hnisz et al., 2013; Wray et al., 2011; Reynolds et al., 2012; Laing et al., 2015). A second question raised is how to evaluate the function of TFs as proteins, which is itself a challenge. Protein levels are regulated at translational and post-translational levels. For example, the stability and translation efficiency of Sox2 mRNA are regulated by multiple microRNAs, while the stability and activity of Sox2 protein are controlled by multiple modifications such as acetylation, phosphorylation, ubiquitylation, sumoylation and methylation (Liu et al., 2013). A third question is how to account for the complex interactions of TFs that can modulate the activity of other TFs either positively or negatively, as has been shown in the case of Sox2 and Tex10 (Ding et al., 2015).
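The three candidate modes of cooperation can be written down as simple scoring rules to show how they would differ under perturbation. The functions below are one possible formalization, assumed here for illustration only: occupancies are idealized as 0 or 1, the output scale is arbitrary, and the choice of Esrrb as the third factor is simply an example drawn from the super-enhancer-bound TFs listed earlier.

```python
from math import prod


def additive(occ):
    """Each bound TF contributes independently to enhancer output."""
    return sum(occ.values())


def synergistic(occ):
    """Output requires all TFs together: product of occupancies,
    scaled by the number of TFs so the full set matches the additive case."""
    return len(occ) * prod(occ.values())


def conditional(occ, prerequisites):
    """TFs contribute additively, but only if every prerequisite is bound."""
    if any(occ[name] == 0 for name in prerequisites):
        return 0
    return sum(occ.values())


full = {"Oct3/4": 1, "Sox2": 1, "Esrrb": 1}
no_esrrb = {**full, "Esrrb": 0}
no_oct34 = {**full, "Oct3/4": 0}

for label, occ in [("all bound", full), ("Esrrb lost", no_esrrb),
                   ("Oct3/4 lost", no_oct34)]:
    print(f"{label:<12s} additive={additive(occ)} "
          f"synergistic={synergistic(occ)} "
          f"conditional={conditional(occ, ['Oct3/4', 'Sox2'])}")
```

In this sketch the three rules agree when all factors are bound and diverge only when one is removed, which illustrates why quantitative perturbation data are needed to discriminate between them.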
Many more functional studies are required to gain a better understanding of the complex functioning and interconnectivity of TF networks, and to model their dynamic nature more accurately. Importantly, future studies must try to delineate TF activity within a specific cellular or developmental context. Hopefully this will be made easier by the limited number of tissue-specific TFs that are encoded in the genome (Ravasi et al., 2010), as well as the relatively small size of each TF network. The ability to accurately simulate the sequential differentiation process by modelling the transitions of TF networks, which in some ways has been provided by systems biologists (Sharpe, 2017), is becoming a realistic reductionist vision.
I thank Dr Futatsugi-Nakai (RIKEN CDB) for critical reading and editing of the manuscript.
This work was supported by a RIKEN research grant and SICORP program from Japan Agency for Medical Research and Development (AMED) to H.N.
The author declares no competing or financial interests.