Understanding how cell identity is encoded by the genome and acquired during differentiation is a central challenge in cell biology. I have developed a theoretical framework called EnhancerNet, which models the regulation of cell identity through the lens of transcription factor-enhancer interactions. I demonstrate that autoregulation in these interactions imposes a constraint on the model, resulting in simplified dynamics that can be parameterized from observed cell identities. Despite its simplicity, EnhancerNet recapitulates a broad range of experimental observations on cell identity dynamics, including enhancer selection, cell fate induction, hierarchical differentiation through multipotent progenitor states and direct reprogramming by transcription factor overexpression. The model makes specific quantitative predictions, reproducing known reprogramming recipes and the complex haematopoietic differentiation hierarchy without fitting unobserved parameters. EnhancerNet provides insights into how new cell types could evolve and highlights the functional importance of distal regulatory elements with dynamic chromatin in multicellular evolution.

A single fertilised animal egg cell divides and differentiates into many cell types that are maintained throughout the life of the organism. Each cell type is associated with a distinct gene expression profile, which can be acquired by gradual differentiation through progenitor states or by reprogramming. Despite great advancements in the molecular characterization of cell dynamics, many fundamental questions regarding how cell types are acquired and maintained remain unanswered. Specifically, key open challenges include how cell types and their associated expression profiles are encoded by the genome, identifying the regulatory mechanisms underlying stepwise differentiation versus direct reprogramming pathways, understanding how signalling triggers the acquisition of particular cell identities and determining how new cell types with unique expression signatures can evolve.

Addressing these questions requires formulating quantitative hypotheses on the dynamics of gene expression over time and in response to perturbations. From a theoretical perspective, cell types correspond to stable attractors of the gene regulatory network of the cell (Waddington, 1957; Huang et al., 2005; Lang et al., 2014; Teschendorff and Feinberg, 2021); that is, the gene regulatory network has multiple stable configurations. These stable configurations are set by feedback interactions between hundreds of transcription factors (TFs), which are DNA-binding proteins that can modulate the expression of other genes. Specific cell types are associated with the expression of distinct combinations of TFs, and TFs play a crucial role in maintaining and modulating cell identity (Burke et al., 1995; Heinz et al., 2010; Holmberg and Perlmann, 2012; Hnisz et al., 2013; Whyte et al., 2013; Kelaini et al., 2014; Saint-André et al., 2016; Reiter et al., 2017; Reilly et al., 2020; Hobert, 2021; Wang et al., 2021; Almeida et al., 2021).

The importance of TFs is reflected in their central role in models of cell identity dynamics. Early models focused on how small transcriptional network motifs, involving only a few TFs, can generate cell fate bifurcations, such as those specific to blood lineages and early embryogenesis (Ferrell, 2002; Huang et al., 2007; Graham et al., 2010; Bessonnard et al., 2014; Sáez et al., 2022). Although these models are important for understanding gene expression dynamics at specific decision points, they cannot predict large-scale dynamics of the cell identity network, such as complex, hierarchical differentiation patterns or reprogramming between distinct cell types.

In contrast to these ‘small network’ models, pioneering work has demonstrated that aspects of cell identity behaviour can be captured by high-dimensional attractor models based on Classical Hopfield networks (Lang et al., 2014; Fard et al., 2016; Pusuluri et al., 2017; Guo and Zheng, 2017; Teschendorff and Feinberg, 2021; Yampolskaya et al., 2023; Smart and Zilman, 2023). Classical Hopfield networks are well-established models for dynamical systems that encode combinatorial attractor states (patterns) through the interactions of their components (Amari, 1972; Little, 1974; Hopfield, 1982). In the context of transcriptional networks, the state of the dynamics is represented by the activity of cell identity TFs, with the attractor states corresponding to distinct cell types. In Hopfield networks, the entire dynamics of the system can be specified solely by defining the attractor states, which are the observed cell types. Although in their original formulation Classical Hopfield networks cannot encode for correlated expression profiles, this limitation can be overcome by an orthogonal projection of the expression profiles (Amit, 1989; Lang et al., 2014). This allows for predictions about the gene regulatory network dynamics based on mere knowledge of the cell types encoded by the network. Lang et al. exploit this property to demonstrate that it is possible to capture and predict the effects of TF overexpression on cellular reprogramming (Lang et al., 2014). The Hopfield formulation also predicts that mixtures of attractor states may themselves be attractor states (known as ‘spurious attractors’). Lang et al. propose that this phenomenon explains the observed expression profiles of partially reprogrammed cells (Lang et al., 2014).

Although this work provides a compelling motivation for applying attractor networks to study cell identity dynamics, it has several important limitations that restrict its applicability. First, although Hopfield dynamics provide an attractive computational model, the relation between this model and underlying molecular mechanisms for cell type specification is unclear. The model also specifies binary expression patterns and does not capture continuous changes in expression levels, such as low-level multilineage priming in progenitor states, which is a hallmark of hierarchical differentiation (Hu et al., 1997; Miyamoto et al., 2002; Mercer et al., 2011; Nimmo et al., 2015; Kim et al., 2016; Olsson et al., 2016; Briggs et al., 2018; Zheng et al., 2018; Martin et al., 2021; Singh et al., 2022). Finally, such progenitor states must be explicitly specified in the model as attractor cell types, rather than emerging naturally from the dynamics.

Here, I develop a theoretical framework for modelling the dynamics of the cell identity network by considering the feedback regulation of TFs through enhancers, which I denote as EnhancerNet. Enhancers are regulatory elements that are crucial for cell identity specification. TFs bind to enhancers to activate cell type-specific expression patterns. Each enhancer can be bound by multiple TFs and, in turn, initiate gene expression in one or more distal target genes (Pennacchio et al., 2013; Uyehara and Apostolou, 2023). The binding of TFs to an enhancer recruits co-factors that can alter its activity by modulating epigenetic properties, such as the biochemical characteristics of its associated chromatin. This, in turn, affects the transcription initiation rate associated with the enhancer (Heintzman et al., 2009; Creyghton et al., 2010; Calo and Wysocka, 2013; Park et al., 2021; Hansen et al., 2022).

A well-documented experimental observation reveals a symmetry between TF binding and enhancer regulation (Hnisz et al., 2013; Whyte et al., 2013; Adam et al., 2015; Saint-André et al., 2016; Feng et al., 2023). The TFs that determine cell identity bind to enhancers that regulate both their own expression and that of other identity-determining TFs that are co-expressed in the same cell types. This interaction forms dense autoregulatory networks of TF-enhancer interactions. These dense autoregulatory networks likely play a crucial role in cell type specification, as their associated enhancers exhibit activation patterns that are highly cell-type specific (Hnisz et al., 2013; Heinz et al., 2015). The cell type specificity of enhancer activity contrasts with the broader activity of TFs, which may be shared across multiple cell types (Hnisz et al., 2013).

I show that the symmetry between TF binding and enhancer regulation imposes an exacting constraint on the dynamics of the cell identity network model, resulting in a highly simplified dynamical model that is predictive and captures broad experimental observations on cell type specification, enhancer selection, differentiation through progenitors, and reprogramming, without the need of fitting unobserved parameters. I demonstrate a mathematical analogy between our mechanistic model and Modern Hopfield networks (Krotov and Hopfield, 2016; Ramsauer et al., 2020 preprint) that overcomes the limitations of previous approaches. This analogy provides a mechanistic link between models for associative memories and cell identity networks, explains the role of distal regulatory elements with dynamic chromatin in multicellular evolution, and provides specific and testable predictions.

Model for enhancer activation dynamics in transcriptional feedback networks

I began by deriving a general mathematical model for the feedback regulation of TF expression through their interaction with enhancers (Fig. 1A). Enhancers consist of multiple TF binding motifs (Spitz and Furlong, 2012). The binding of TFs potentiates enhancer activity by recruiting transcriptional co-factors. These co-factors modulate the biochemical properties of enhancer chromatin through processes such as histone acetylation, leading to the recruitment of transcription initiation factors and ultimately resulting in the transcription of enhancer-associated genes (Narita et al., 2021; Panigrahi and O'Malley, 2021).

Fig. 1.

Model for the regulation of cell identity transcription factors by enhancers. (A) Enhancers are cis-regulatory elements that can initiate transcription in distant genes through interaction with specific transcription factors (TFs) and transcriptional machinery. Each enhancer can interact with several TFs, and each TF can be controlled by multiple enhancers. Binding of TFs modulates enhancer chromatin and can increase the transcription initiation rate of the enhancer. (B) Enhancer types (top) bind specific combinations of TFs (bottom) according to binding strengths specified by the matrix, and, in turn, initiate transcription according to the rate matrix . The weight vector determines baseline activity and may be modulated by binding of signalling TFs. (C) Autoregulation is the observation that TFs that are co-expressed in specific cell types co-bind their own enhancers. These enhancers are, in turn, selected and activated in these specific cell types. (D) Different cell types are associated with specific enhancers that may have overlapping TF binding. (E) Autoregulation constrains and implies reciprocity.

Fig. 1.

Model for the regulation of cell identity transcription factors by enhancers. (A) Enhancers are cis-regulatory elements that can initiate transcription in distant genes through interaction with specific transcription factors (TFs) and transcriptional machinery. Each enhancer can interact with several TFs, and each TF can be controlled by multiple enhancers. Binding of TFs modulates enhancer chromatin and can increase the transcription initiation rate of the enhancer. (B) Enhancer types (top) bind specific combinations of TFs (bottom) according to binding strengths specified by the matrix, and, in turn, initiate transcription according to the rate matrix . The weight vector determines baseline activity and may be modulated by binding of signalling TFs. (C) Autoregulation is the observation that TFs that are co-expressed in specific cell types co-bind their own enhancers. These enhancers are, in turn, selected and activated in these specific cell types. (D) Different cell types are associated with specific enhancers that may have overlapping TF binding. (E) Autoregulation constrains and implies reciprocity.

To capture these complex molecular mechanisms within an effective mathematical framework, I considered the following setup. An enhancer (indexed i) is characterized by two vectors: a vector, , denoting its binding affinity to different identity TFs (); and a vector, Qi, denoting the rates at which it can induce the transcription of the same TFs (). The binding of TFs to an enhancer modulates the state of the enhancer chromatin, which in turn sets the rate at which the transcriptional initiation machinery is recruited to the enhancer. Specifically, I assumed that the chromatin state sets the energy for the recruitment of the transcription initiation machinery and that, on the timescale of cell identity changes, the recruitment rate is captured by an equilibrium distribution (see Materials and Methods). Taken together, the dynamics of the gene regulatory network are given by:
(1)
where is the vector of TF expression whose entries are xi, is a matrix whose entries are ξi,j, Q is the association matrix whose entries are qi,j, τ is the timescale of the dynamics and β is an effective inverse temperature parameter that depends on the turnover of chromatin modifications (see Materials and Methods, Fig. 1B). The baseline activity vector depends on effects upstream of the cell identity network, such as the binding of effector TFs for signalling pathways (Hnisz et al., 2015). In this study, I will refer to the network model and its associated dynamics as EnhancerNet.

Although Eqn 1 considered different physical enhancers as separate entities, in practice, enhancers with similar binding profiles can appear throughout the genome (Saint-André et al., 2016; Kvon et al., 2021), corresponding to similar rows of . I denote these as enhancer types and note that Eqn 1 can be rewritten so that is a matrix of enhancer types (i.e. with distinct rows) and the matrix corresponds to the overall association between enhancer types and TF expression (see Materials and Methods). Hereafter, I will assume that Eqn 1 captures the dynamics of enhancer types, which can correspond to the activation of multiple physical enhancers.

Autoregulation of TF-enhancer interactions constrains the EnhancerNet model

Eqn 1 can be further constrained by taking into account well-established properties of developmental networks. TFs associated with specific cell identities form densely interconnected networks in which they co-bind adjacent enhancers, which in turn activate the same TFs (Whyte et al., 2013; Hnisz et al., 2013, 2015; Saint-André et al., 2016) (Fig. 1C,D). This implies a strong positive correlation between the rows of , which correspond to binding profiles of enhancers, and the rows of , which correspond to the effect of these enhancers on TF transcription. This was captured by taking (Fig. 1E; relaxing this to assume that are merely correlated does not affect our conclusions). This gives the following model for the dynamics:
(2)
The dynamics captured by Eqn 2 are mathematically similar to models for efficient memory storage in artificial neural networks (Ramsauer et al., 2020 preprint). This relationship can be exploited when analysing various properties of the dynamics. These dynamics are associated with a potential function (or ‘potential landscape’) (see Materials and Methods). The shape of this potential landscape determines how gene expression changes over time, including the stable gene expression patterns associated with different cell identities, which correspond to minima of the potential landscape.

Autoregulation, captured by , implies that the interactions between TFs in the developmental network are reciprocal, i.e. for each pair of TFs k, j, the effect of increasing TF j on the expression rate of TF k is the same as the effect of increasing TF k on the expression rate of TF j (). Moreover, reciprocity and autoregulation are effectively equivalent (see Materials and Methods) (Fig. 1E). This equivalence explains the observation that reciprocal interactions are ubiquitous in developmental networks (Kraut and Levine, 1991; Milo et al., 2002; Alon, 2007; Huang et al., 2007).

Reciprocal interactions between TFs can be positive (mutual activation) or negative (mutual repression), with both possibilities common in developmental networks. This is also captured in Eqn 2, despite the fact that, in the model, enhancers only enhance transcription, and the binding of TFs to enhancers increases their activity. Repression in the model is a by-product of global inhibition by competition over shared transcriptional machinery.

Robust specification of combinatorial cell types through interaction between TFs and enhancers

Cell identity is established through the process of enhancer selection (Heintzman et al., 2009; Hnisz et al., 2013, 2015; Saint-André et al., 2016). In this process, cell type-specific enhancers are marked for activation, leading to the expression of a distinct pattern of transcription factors associated with these enhancers. This is recapitulated in our model when the inverse temperature parameter β is sufficiently large (see Materials and Methods). Under this condition, the cell types encoded by the gene regulatory network, which represent stable attractors for the dynamics, are given by the rows of the enhancer-associated matrix (or, in the general case, by the rows of ; see Materials and Methods). Specifically, the stable fixed points of the network are represented by the vectors ()=() for all enhancer types (Fig. 2A,B). Each stable fixed point corresponds to a cell type with a unique transcription factor expression pattern. At these fixed points, cell type-specific enhancers exhibit maximal activity. These fixed points are robust to a large degree of asymmetry in TF-enhancer interactions (Fig. S1).

Fig. 2.

Specification of cell types and direct reprogramming. (A) The EnhancerNet model was initialized with the TF expression profiles of 45 cell types from Tabula Muris, taking 105 transcription factors (TFs) with high variability between cell types (see Materials and Methods). The resulting matrix (transposed) is plotted, with larger values corresponding to darker colours. (B) At high β each row of is an attractor of the network. PCA transformation of is shown, with transformed individual rows, corresponding to enhancer-binding patterns, plotted in grey. They are all attractor states – trajectories that start in the vicinity of each of these states converge to them. Red lines correspond to trajectories starting from rows of with 25% multiplicative noise. (C) Constitutive overexpression of TFs alters the attractor landscape and can result in a bifurcation where cell types lose stability and transition to another cell type. Arrow indicates transition between gene expression states. (D) The overexpression of Ascl1 transitions fibroblasts to neurons in the model, capturing the known effect of Ascl1 overexpression in fibroblasts. The reprogramming path is plotted in red (lower panel), with black paths capturing other known reprogramming recipes (see Materials and Methods). (E) Expression of TFs (left panel) and enhancer activity (right panel), corresponding to the probability that transcription is initiated from the enhancer.

Fig. 2.

Specification of cell types and direct reprogramming. (A) The EnhancerNet model was initialized with the TF expression profiles of 45 cell types from Tabula Muris, taking 105 transcription factors (TFs) with high variability between cell types (see Materials and Methods). The resulting matrix (transposed) is plotted, with larger values corresponding to darker colours. (B) At high β each row of is an attractor of the network. PCA transformation of is shown, with transformed individual rows, corresponding to enhancer-binding patterns, plotted in grey. They are all attractor states – trajectories that start in the vicinity of each of these states converge to them. Red lines correspond to trajectories starting from rows of with 25% multiplicative noise. (C) Constitutive overexpression of TFs alters the attractor landscape and can result in a bifurcation where cell types lose stability and transition to another cell type. Arrow indicates transition between gene expression states. (D) The overexpression of Ascl1 transitions fibroblasts to neurons in the model, capturing the known effect of Ascl1 overexpression in fibroblasts. The reprogramming path is plotted in red (lower panel), with black paths capturing other known reprogramming recipes (see Materials and Methods). (E) Expression of TFs (left panel) and enhancer activity (right panel), corresponding to the probability that transcription is initiated from the enhancer.

Signalling activity can induce specific cell fates or restrict inducible cell fates (Perrimon et al., 2012; Grover et al., 2014; Hnisz et al., 2015). This phenomenon is recapitulated in our model through the effect of signalling on , which in turn modulates the basins of attraction for each cell type. Signals that increase the activity of specific enhancers cause the cell types associated with these enhancers to become accessible from a broader range of gene expression states. As discussed later in the context of differentiation, this provides a mechanism to control the relative production of different cell types. Conversely, states with weak induction may become destabilized and thus reprogrammed to an alternative cell type. Importantly, the effect of signalling through changes in does not alter the TF identity of the cell types. Thus, the model provides a mechanism for how signalling can control cell type production without interfering with cell identities.

From a modelling perspective, the observation that cell types correspond to rows of suggests an intriguing possibility: that Eqn 2 can be parameterized using the TF identities of the observed (terminal) cell types, which can be retrieved from gene expression datasets. This leaves the inverse temperature parameter β and the signalling vector as the only free parameters. As I will show, in a variety of settings, there are strong constraints on both parameters. The model can thus provide predictions on the dynamics of the entire transcriptional network by considering only the observed cell types. In the rest of this article, I will scrutinize this possibility and demonstrate that, using this approach, the complex dynamics of the regulatory network can indeed be reconstructed in a variety of settings.

EnhancerNet recapitulates direct reprogramming between cell types and predicts reprogramming recipes

I next tested whether the EnhancerNet model could predict the dynamics of the regulatory network controlling cell identity. Our analysis focused on the two broad classes of cell identity change: direct reprogramming, in which there are direct transitions between cell types; and hierarchical differentiation processes that start from (or transition through) multipotent progenitor states.

Direct reprogramming can be experimentally carried out by overexpressing TFs (Wang et al., 2021). Direct reprogramming has been demonstrated between dozens of cell types by overexpressing a wide range of TFs and TF combinations. Two clear distinguishing features for whether specific TFs can efficiently reprogram into a target cell type are that these factors are: (1) highly expressed in the target cell type and (2) their expression is unique to that cell type (D'Alessio et al., 2015).

The very existence of direct reprogramming between distantly related cell types is beyond the scope of classical models for cell fate bifurcations that are based on competition between a small set of lineage-determining factors. However, it is a straightforward consequence of the dynamics of Eqn 2 (Fig. 2C). The overexpression of the TF xj modulates the basins of attraction of each cell type, which can then encompass other (previously stable) cell type expression patterns. If a cell has initially been placed in one of these distant patterns, i.e. , then, after TF overexpression, it can transition directly to a new pattern: .

Which transcription factors are most efficient in reprogramming to a specific cell type? The optimal transcription factor combination will uniquely increase the basin of attraction of the target cell type, while avoiding increasing the basins of attraction of other cell types. The degree of increase in the basin of attraction of cell type k by the overexpression of TF j is proportional to its binding association ξk,j, which also corresponds to its expression in the target cell type (see Materials and Methods). Thus, TFs that are effective for reprogramming to a given cell type are predicted by the model to have high expression in that cell type and that this high expression is unique to the target cell type. This captures the known properties of reprogramming factors and is consistent with computational methods to identify reprogramming factors (D'Alessio et al., 2015; Rackham et al., 2016).

To test whether the model can recapitulate known reprogramming recipes, I generated a dataset for 45 cell identities using Tabula Muris (Schaum et al., 2018) (see Materials and Methods, Fig. 2A). I then set according to the TFs used in 12 established reprogramming recipes, as well as other TFs with high expression and variability. The transient overexpression of the reprogramming factors recapitulated the known reprogramming behaviour, with the system transitioning from the original attractor state (e.g. fibroblast) to an end attractor state, in line with the experiment (Fig. 2D,E). Thus, the EnhancerNet model can quantitatively recapitulate direct reprogramming, with no fitting parameters and by only considering the observed cell types.

As a specific demonstration of direct reprogramming, consider the reprogramming of a fibroblast into a neuron by overexpression of Ascl1 (Fig. 2E) (Chanda et al., 2014). The cell begins in a fibroblast state, where a specific enhancer type is strongly activated. The transient activation of a specific TF can destabilise this state, resulting in a transition to a neuron state that is associated with the activation of a neuron-specific enhancer type. The dynamics of this transition are one-dimensional and, in the presence of noise, correspond to a stochastic barrier crossing between two potential minima, which is consistent with previous quantitative analyses (Fig. S2) (Pusuluri et al., 2017).

EnhancerNet recapitulates hierarchical differentiation dynamics and predicts progenitor identity

Direct reprogramming is an important experimental phenomenon and may also occur in natural settings after tissue perturbation (Merrell and Stanger, 2016). In homeostatic and developmental settings, however, it is more typical for differentiation trajectories to occur through a series of multipotent progenitor states (Moris et al., 2016). Such differentiation trajectories have been extensively studied in mammalian haematopoiesis, where a population of multipotent progenitors can give rise to many blood lineages (Månsson et al., 2007; Orkin and Zon, 2008). Other examples include the intestinal epithelium (Kim et al., 2016; Singh et al., 2022), stomach epithelium (Qiao et al., 2007) and skin (Toma et al., 2005), as well as throughout development (Briggs et al., 2018).

Differentiation trajectories share several common characteristics. Differentiation proceeds in a directed manner through a series of progenitor states. Each progenitor may acquire one of several terminal fates, with the range of target cell fates becoming more restricted as differentiation proceeds. For example, the cell may initially be in a multipotent progenitor state, and then transition to a bipotent state followed by a unipotent state. These dynamics are most famously conveyed by the Waddington landscape (Waddington, 1957), with the image of a ball rolling down a hill segregated by valleys capturing the progressive restriction of cell fate as it progresses through transitional progenitor states. Progenitors co-express at low levels the lineage-determining TFs associated with their target fates, which is a phenomenon known as multilineage priming (Hu et al., 1997; Olsson et al., 2016; Briggs et al., 2018; Zheng et al., 2018; Martin et al., 2021). Although the Waddington landscape image has gained great popularity for conceptualizing the dynamics of cell fate specification and for developing quantitative models for differentiation dynamics (Zhou et al., 2012; Rand et al., 2021), it is not clear how these complex hierarchical dynamics are implemented by the gene regulatory network.

Here, I show that Waddingtonian cell fate specification dynamics and the existence and identity of the progenitor states is an emergent property of the dynamics captured by Eqn 2. These are due to the second mechanism for cell-type transitions in the model, annealing, where β is transiently decreased by the cell (‘heating up’) and then slowly increased (‘cooling down’) (Fig. 3A). Recall that β is a coarse-grained parameter that depends on the regulation of chromatin by TF binding, and can be controlled by molecular mechanisms. Decreasing β results in a widespread pattern of enhancer activation and gene expression, while increasing β results in the activation of more specific enhancers, in line with experimental knowledge of differentiation hierarchies (Gulati et al., 2020). Specifically, decreasing β destabilises terminal attractor states, while increasing β restabilises them, resulting in a transition towards a new cell type (Fig. 3).

Fig. 3.

Hierarchical differentiation by annealing. (A) Annealing is the process by which β is transiently decreased and then increased. The decrease in β transitions the cell to a global attractor state corresponding to a multipotent progenitor with an averaged expression profile of terminal cell types. As β increases again, new attractor states are created, which are averages of smaller subsets of the terminal cell types, corresponding to progenitor states with limited potency until, finally, the cell transitions to a terminal cell type. (B) Progenitor states appear in subsets of cell types that have high cosine similarity. In this example, I considered a simple setup where enhancers bind overlapping transcription factors (TFs) (top left, with positive TF binding indicated in black). The dynamics begin from a cell type corresponding to the binding profile of EN1, and EN5 has a slight positive weight w5=0.05. Annealing was simulated as in the top right schematic. The dynamics proceed through progenitors that show multilineage priming (bottom left) and transition from widespread enhancer activity to cell type-specific activity (bottom right). (C) The cosine similarity of the expression profiles of the terminal blood lineages, where expression profiles were calculated using the TFs that show both sufficiently high interlineage variability (see Materials and Methods). Hierarchical clustering according to cosine similarity recreates the ‘classical’ haematopoiesis differentiation tree. (D) UMAP plot of differentiation trajectories, generated by direct simulation of annealing in an EnhancerNet model calibrated by the haematopoietic terminal lineage expression profiles. Trajectories were simulated from an initial homogeneous population with the addition of noise and with adjusted to produce a balanced differentiation profile (see Materials and Methods). Points represent samples of the trajectories at constant time intervals, with the colour corresponding to the identity of the final state. Simulation recreates the observed progenitor states in haematopoiesis, including deviations from tree-like differentiation.

Fig. 3.

Hierarchical differentiation by annealing. (A) Annealing is the process by which β is transiently decreased and then increased. The decrease in β transitions the cell to a global attractor state corresponding to a multipotent progenitor with an averaged expression profile of terminal cell types. As β increases again, new attractor states are created, which are averages of smaller subsets of the terminal cell types, corresponding to progenitor states with limited potency until, finally, the cell transitions to a terminal cell type. (B) Progenitor states appear in subsets of cell types that have high cosine similarity. In this example, I considered a simple setup where enhancers bind overlapping transcription factors (TFs) (top left, with positive TF binding indicated in black). The dynamics begin from a cell type corresponding to the binding profile of EN1, and EN5 has a slight positive weight w5=0.05. Annealing was simulated as in the top right schematic. The dynamics proceed through progenitors that show multilineage priming (bottom left) and transition from widespread enhancer activity to cell type-specific activity (bottom right). (C) The cosine similarity of the expression profiles of the terminal blood lineages, where expression profiles were calculated using the TFs that show both sufficiently high interlineage variability (see Materials and Methods). Hierarchical clustering according to cosine similarity recreates the ‘classical’ haematopoiesis differentiation tree. (D) UMAP plot of differentiation trajectories, generated by direct simulation of annealing in an EnhancerNet model calibrated by the haematopoietic terminal lineage expression profiles. Trajectories were simulated from an initial homogeneous population with the addition of noise and with adjusted to produce a balanced differentiation profile (see Materials and Methods). Points represent samples of the trajectories at constant time intervals, with the colour corresponding to the identity of the final state. Simulation recreates the observed progenitor states in haematopoiesis, including deviations from tree-like differentiation.

The annealing process proceeds as follows. The accessible and induced cell types, given by the rows of , where is sufficiently large, are all stable when β is very large, while their global average is the only stable state when β is small. This state expresses at low levels the TFs associated with all target lineages, and it corresponds to a multi-potent progenitor cell identity. As β increases, this state loses stability, and new stable states appear transiently. These new states are averages of increasingly restricted subsets of cell types, and they correspond to progenitors of restricted potential. Annealing thus recapitulates the hierarchical Waddingtonian differentiation dynamics and low-level multilineage progenitor expression patterns.

While progenitors generally correspond to averages of terminal cell types, not all averages are equally likely to be observed as stable progenitors. Rather, the stable progenitor states correspond to subsets of cell types with high internal similarity in their expression profiles (measured by cosine distance). As an illustrative example, I considered a simulation of a regulatory network with nine TFs, indexed , and six enhancers, indexed . Each enhancer can be bound by two TFs, and enhancers 1-2, 3-4 and 5-6 overlap by a single TF (Fig. 3B). The initial state corresponded to an expression profile associated with . I also take a slightly larger weight w5 for , corresponding to induction of this state by signalling. Annealing (a decrease in β followed by a gradual increase) transitions the cell through two progenitors: a ‘multipotent progenitor’ (low β), which corresponds to the average expression of all the enhancers; and a ‘restricted potential progenitor’ (intermediate β) that is associated with the average expression of and, finally, differentiation to . Thus, differentiation occurs through Waddingtonian dynamics associated with multilineage priming, allowing transition to an induced cell fate.

The model makes the specific prediction that differentiation hierarchies, including the identity of observed progenitor states, can be estimated by only knowing the identities of the terminal cell states. The model specifically predicts that they will appear only between transcriptional profiles with high internal cosine similarity. To test this, I considered differentiation in haematopoiesis, a system in which differentiation trajectories are complex and have been extensively studied. Using data on haematopoietic lineages in mice (extracted from Haemopedia; Choi et al., 2019), I considered the expression profile of all TFs that showed variability between terminal lineages and used the model to estimate the differentiation hierarchy that generated these lineages.

I first tested the hypothesis that progenitors emerge between correlated expression profiles by performing hierarchical clustering according to pairwise cosine similarity between the expression profiles of terminal cell types (Fig. 3C). Hierarchical clustering provides a heuristic approach to estimate the predicted identity of progenitor states from the terminal expression profiles. It produces a dendrogram (tree) that shows how expression profiles can be progressively grouped into clusters based on their similarity. This resulted in a tree structure that recreates the classic haematopoietic hierarchy (Fig. 3C). Branching points at the dendrogram capture the common lymphoid progenitor (CLP), common myeloid progenitor (CMP), megakaryocyte and erythroid progenitor (MEP), granulocyte and macrophage progenitor (GMP), T and NK cell progenitors (TNK), and B cell-biased lymphoid progenitor (BLP). Thus, a simple unsupervised clustering method with access to only the terminal cell fate information can reproduce a faithful estimate of the known complex differentiation landscape in haematopoiesis.

Motivated by the effectiveness of hierarchical clustering in predicting progenitor states, I developed a computational approach to model differentiation into multiple cell types. This approach is based on annealing in Eqn 2. Annealing trajectories were simulated beginning with a homogeneous initial population representing haematopoietic stem cells. To account for natural variability in differentiation paths, noise was introduced to Eqn 2 (our findings remained robust across various noise magnitudes, Fig. S3). In physiological settings, the production rate of each cell type is regulated by negative feedback, where the terminal cell population size inhibits its own production. This is exemplified by the negative-feedback regulation of red blood count through EPO signalling (Grover et al., 2014). To incorporate this regulation, I implemented a negative-feedback routine to tune , allowing each cell type to inhibit its own production (see Materials and Methods). This resulted in dynamics that generated all cell fates in a balanced manner. The differentiation trajectories were then visualized on a UMAP plot (Fig. 3D), revealing that differentiation dynamics proceed through a series of progenitors corresponding to those observed in haematopoiesis. Thus, our model quantitatively recapitulates differentiation in haematopoiesis without requiring parameter fitting.

Although the heuristic approach of hierarchical clustering produces a tree-like structure that recapitulates the main progenitor states in haematopoiesis, differentiation trajectories in the model need not be tree-like, as can be seen in the simulations of Fig. 3D. As β increases, multiple steady states can appear and disappear, and these new steady states may be averages of overlapping terminal states. The key prediction of the model is that these states will correspond to averages of terminal states with high internal similarity. For example, it is experimentally established that dendritic cells can emerge from both the myeloid lineage through macrophage-dendritic progenitors and from the lymphoid lineage through B cell-biased lymphoid progenitors (Anderson et al., 2021). This observation aligns with our model and simulations (Fig. 3D, see also Fig. S4 for simple examples of tripotent progenitors and deviations from tree structure). The gene expression profile of dendritic cells exhibits high cosine similarity to both B cells and macrophages (and, more generally, myeloid cells), whereas B cells are less similar to myeloid cells. Similarly, neutrophils may arise from distinct monocyte-neutrophil and basophil-eosinophil-neutrophil progenitors, and monocytes from distinct monocyte-neutrophil and monocyte-dendritic cell progenitors (Weinreb et al., 2020; Yampolskaya et al., 2023). These patterns are also recapitulated in our simulations and are related to the similarity structure of the cell types. Higher-level progenitor structures can be captured in the same manner by the model. For example, lymphoid and granulocyte cells have moderate cosine similarity, as do granulocytes and erythro-megakaryocytes. However, lymphoid and erythro-megakaryocytes are poorly correlated. Consequently, each of the first two pairs (lymphoid and granulocyte cells, and granulocytes and erythro-megakaryocytes) can share accessible progenitors, whereas the latter two (lymphoid and erythro-megakaryocytes) are less likely to do so. Indeed, the first two progenitors exist (lympho-myeloid primed progenitors and common myeloid progenitors), whereas the latter, to the best of our knowledge, does not. The model can thus make predictions regarding highly complex transition dynamics by using information on only the terminal cell identities.

Evolution of new cell types

I conclude this study by considering the question of how new cell types can evolve within the regulatory network. Although in principle a new cell type may arise de novo, in practice it is likely to appear by the diversification of an ancestral cell type into sister cell types, a process known as genetic individuation (Arendt et al., 2016; Hobert, 2021). In this process, a cell type associated with a specific combination of TFs evolves into new and distinct cell types associated with new TF combinations.

The EnhancerNet model has unique properties that support the process of genetic individuation of cell types; I propose that these properties may have been instrumental for the evolution of distal cis-regulatory elements with dynamic chromatin in animals. These properties are (1) the ability to support multiple coexisting, highly correlated attractor states; and (2) the ability to have multiple enhancers regulating the same genes.

The importance of the first property is clear when one considers that around the time of the initial diversification of the ancestral cell types, the sister cells have highly correlated expression patterns. At this stage, they need to co-exist as distinct attractor states to be expressed in the body. Thus, in the biological context, the ability to support multiple correlated attractor cell type states is crucial. This is supported by the mechanism underlying the EnhancerNet through the inverse temperature parameter β, which allows cells to discriminate between highly similar expression patterns.

As a concrete example, consider the specification of neuronal identity in C. elegans worms. Reilly et al. investigated the expression of TFs in the mature worm nervous system (Reilly et al., 2020). Their findings showed that each of the 118 neuron classes in C. elegans is defined by a unique combination of homeobox TFs, with 68 TFs exhibiting variable expression across neuronal classes. Many neuronal classes share highly similar TF codes: eight classes differ from another class by only a single TF, whereas 70 classes differ from another by three or fewer TFs. The mechanism underlying Eqn 2, through regulation of the β parameter, successfully supports these patterns, with around a twofold increase in β allowing full specification of all neuronal classes from a single multipotent progenitor (see Materials and Methods).

The second property allows the regulatory network to generate and modify specific cell types without interfering with other cell types that may have common TFs. This relates to the modularity of encoding cell types by enhancer sequences. I illustrate this by a simple example where an ancestral cell type expressing TFs 1, 2, 3 and 4 is diversified into two cell types that express 1, 2 and 3 or 1, 2 and 4 (Fig. 4A,B). The initial attractor state is associated with an enhancer type, denoted EN1, that binds all four TFs. In our example, this enhancer is initially placed in the genome near all four TFs. It is also assumed that multiple copies, known as shadow enhancers (Hong et al., 2008; Cannavò et al., 2016; Kvon et al., 2021), of EN1 are near TF1 and TF2 (Fig. 4C). This phenomenon is widespread and well-established experimentally, and its observed behaviour is in line with model predictions (see Materials and Methods). This setup is illustrated in Fig. 4C, and corresponds to a stable attractor where TFs 1-4 are active.

Fig. 4.

Evolution of new cell types. (A,B) I consider the evolution from a cell type associated with the binding pattern of TF1-4 (EN1 in A, left cell in B) to two sister cell types, each associated with the binding of TF1 and TF2 and either TF3 or TF4 (EN1′ and EN1″ in A, middle two cells in B). This must occur without interference with other cells in the regulatory network that may share TFs with these cell types. (C-E) The evolution occurs in the presence of other cell types in the gene regulatory network. (C) The original setup (left) stabilises only the ancestral cell type: see right three panels with simulations starting from the ancestral state, where TF1-TF4 are active; from sister cell A, where TF1-TF3 are active; and from sister cell B, where TF1, TF2 and TF4 are active. (D) Mutations in specific enhancers can stabilise states near all cell types. (E) Further mutation can destabilise the ancestral state. For all simulations, I used the full dynamics of Eqn 1, with the shade of green corresponding to expression strength of the TF.

Fig. 4.

Evolution of new cell types. (A,B) I consider the evolution from a cell type associated with the binding pattern of TF1-4 (EN1 in A, left cell in B) to two sister cell types, each associated with the binding of TF1 and TF2 and either TF3 or TF4 (EN1′ and EN1″ in A, middle two cells in B). This must occur without interference with other cells in the regulatory network that may share TFs with these cell types. (C-E) The evolution occurs in the presence of other cell types in the gene regulatory network. (C) The original setup (left) stabilises only the ancestral cell type: see right three panels with simulations starting from the ancestral state, where TF1-TF4 are active; from sister cell A, where TF1-TF3 are active; and from sister cell B, where TF1, TF2 and TF4 are active. (D) Mutations in specific enhancers can stabilise states near all cell types. (E) Further mutation can destabilise the ancestral state. For all simulations, I used the full dynamics of Eqn 1, with the shade of green corresponding to expression strength of the TF.

Let us consider the possibility that enhancers near TF3 and TF4 evolved so that the enhancer near TF3 does not bind TF4, and the enhancer near TF4 does not bind TF3 (Fig. 4D). Now, there are three stable states: one where TF1-4 are active (with reduced activity of TF3 and TF4) and two where TF1 and TF2, and either TF3 or TF4 are active, which corresponds to the new sister cells. Over time, the shadow enhancers may evolve further, leading to the destabilisation of the original attractor state (Fig. 4E). Throughout this duration, the other cell types remain stable attractors. Thus, the EnhancerNet mechanism allows the evolution of new cell types without interfering with existing cell types.

Here, I have derived a predictive mechanistic model for the dynamics of cell identity, based on interactions between TFs and enhancers. The model incorporated several features of the architecture of the regulatory network, i.e. that enhancers form dense autoregulatory networks with TFs and that enhancer activity is determined by its chromatin state, which is set by TF binding. These features are sufficient to derive a simple and tractable model that can be used without the need to fit unobserved parameters, and that recapitulates the known processes of cell type specification, reprogramming and differentiation.

The mechanism underlying enhancer selection in EnhancerNet is mathematically related to a recent model for memory storage and retrieval known as Modern Hopfield networks (Ramsauer et al., 2020 preprint; Krotov and Hopfield, 2021). Classical Hopfield networks are based on direct and additive interactions between components (Hopfield, 1982). Classical Hopfield networks have long been considered important conceptual models for memory storage and retrieval in the brain (Hopfield, 1982; Krotov and Hopfield, 2021), and, more recently, they have been employed in pioneering studies as conceptual and predictive models for cell fate specification, and as the basis of computational methods to study cell state dynamics (Lang et al., 2014; Fard et al., 2016; Guo and Zheng, 2017; Conforte et al., 2020; Boukacem et al., 2024). It was not clear how such a network could be implemented mechanistically in cells. Modern Hopfield networks extend Classical Hopfield networks and allow the storage of many patterns through higher-order interactions (Krotov and Hopfield, 2016, 2021). I show that their dynamics arise naturally in cells by the interactions of TFs and enhancers. The mathematical analogy between Modern Hopfield networks and enhancer-TF interactions provides a mechanism through which cells can retrieve many attractor patterns encoded by a biochemical regulatory network.

The model makes specific testable predictions for processes of reprogramming and differentiation, based on the identities of the terminal states. The model can predict reprogramming ‘recipes’ by which TFs can transition a cell from a given cell type to a target cell type. These predictions are consistent with known algorithms for this purpose and recapitulate established reprogramming recipes. The model thus provides a flexible computational framework to model reprogramming dynamics.

For differentiation, the model predicts the identity of progenitors and can predict complex differentiation hierarchies by using only the identities of the terminal cell types. Specifically, the model predicts that, in cases where a single multipotent progenitor gives rise to multiple cell types, more-restricted progenitors will preferentially appear between sets of cell types with similar expression profiles that are distinct from other cell types. The model recapitulates the complex and well-characterized differentiation hierarchy of haematopoiesis with no fitting parameters. In another well-studied system, intestinal stem cells differentiate into secretory progenitors that then give rise to specific progenitors for Paneth and goblet cells and enterochromaffin and non-enterochromaffin enteroendocrine cells (Singh et al., 2022). Singh et al. showed that the progenitors exhibit multilineage priming at both the transcriptional level, with low-level expression of cell type-specific transcripts, and at the chromatin level, with intermediate chromatin accessibility signatures at cell type-specific enhancers. The identity of these progenitors, along with their epigenetic and transcriptional profiles, aligns closely with the model predictions that are based on the similarities among these cell types.

Both the haematopoietic and intestinal systems are thus consistent with an annealing model for differentiation into multiple cell types. From a functional point of view, the annealing strategy corresponds to well-established techniques from physics and optimization to settle a system at a global minimum (Kirkpatrick et al., 1983; Van Laarhoven et al., 1987). From a biological perspective, this allows progenitor and stem cells to identify the cell types induced by signalling pathways (illustrated in Fig. 3B) and direct their differentiation towards them, avoiding convergence to metastable states. The mechanism also supports the balanced production of multiple cell fates, as demonstrated by the simulations for haematopoiesis (Fig. 3D). Annealing thus provides a solution to a key problem of the cell identity network – the need to encode multiple stable configurations (including stem and terminal configurations) while allowing transitions between these configurations, which are sensitive to upstream signalling.

Our analysis predicts that differentiation from a multipotent to a specified cell identity occurs through an increase in β. In our model, β is proportional to the ratio of global production to removal of latent chromatin modifications. A key candidate mechanism for this process is histone acetylation, a hallmark of active enhancers that plays a crucial role in transcription initiation (Creyghton et al., 2010; Kondo et al., 2014; Narita et al., 2021, 2023). The model predicts that a decrease in overall histone acetylation activity, corresponding to a reduction in β, will lead to loss of cell identity, whereas a decrease in the rate of histone deacetylation activity, corresponding to an increase in β, will promote cellular differentiation. These predictions align with well-established effects observed upon the respective inhibition of histone acetyltransferases (Ebrahimi et al., 2019; Gomez et al., 2009; Lipinski et al., 2020; Zhang et al., 2021; Narita et al., 2021; He et al., 2021) and histone deacetylases (Marks et al., 2000; Hsieh et al., 2004; Karantzali et al., 2008; Kondo et al., 2014; Li et al., 2014).

Other mechanisms, in addition to histone acetylation, are known to play a role the regulation of enhancer activity. These include the activity of chromatin remodellers such as the SWI/SNF, Mi-2/NuRD, SET1/MLL and Polycomb complexes, and involve various mechanisms, including the depletion and modification of histone proteins (Gao et al., 2009; Whyte et al., 2012; Iurlaro et al., 2021; Wolf et al., 2023; Chan et al., 2018). In addition to these, the methylation of DNA itself can play a role in regulating enhancer function (Angeloni and Bogdanovic, 2019). Some epigenetic mechanisms may show bistability through local positive-feedback regulation, resulting in digital activation patterns (Dodd et al., 2007; Angel et al., 2011; Haerter et al., 2014; Berry et al., 2017; Sneppen and Ringrose, 2019; Movilla Miangolarra et al., 2024). This property may confer robustness to cell types and protect against spontaneous reprogramming. Specifically, I considered the case where enhancer activity acts as a bifurcation parameter for a bistable switch that inhibits itself (see Materials and Methods). In such a case, a drop in enhancer activity below a critical threshold results in positive feedback that further decreases the basin of attraction for the corresponding cell type, resulting in a barrier for reprogramming.

The model can explain how new cell types can evolve without affecting pre-existing cell types encoded by the network. This applies both to evolutionary dynamics between generations of organisms and to the evolution of cells in diseases such as cancer. The model aligns with the experimentally observed robustness of shadow enhancers, which are thought to play a crucial role in the evolution of cellular diversity. The predictions of the model could be tested by synthetically engineering new cell types, through mutating shadow enhancers as prescribed by the model.

In conclusion, I propose that EnhancerNet provides a simple predictive framework for the dynamics of the gene regulatory network that control cell identity, and a basis for dissecting the complex processes of cell reprogramming, differentiation and evolution.

Derivation of general EnhancerNet model

I modelled transcriptional activation by considering a process whereby transcription is initiated at enhancer i at rate pi, which results in the transcription of gene j at a coupling rate of qi,j. As in established thermodynamic models for gene regulation, it was assumed that there is a rate-limiting step for transcription that involves the recruitment to a specific site in the genome of transcription initiation machinery (Ackers et al., 1982; Bintu et al., 2005), which may be of high multiplicity (the effect of multiplicity in the model would be to scale overall transcription, which is not important for our conclusions). It was assumed that enhancers compete for the binding of this machinery. I denote using i the energy for transcriptional initiation at enhancer i and assume that the transcriptional machinery is always bound to one of the enhancers. It was assumed that on the timescale of interest, which is related to changes in cell identity (days to weeks), the rate of transcription initiation is captured by equilibrium statistical mechanics. Namely, the rate of transcription initiation at enhancer i is proportional to the Boltzmann distribution:
(3)
where is inverse temperature. The initiation of transcription at enhancer i can then result in the transcription of several associated genes, and each gene can be transcribed after an interaction with one of several enhancers. The dynamics of the expression of gene j, denoted xj, is given by the sum:
(4)
where τ is the timescale of gene expression changes.

Feedback in the model occurs when TFs modulate the activation energies of the enhancers. In classical models for gene regulation in prokaryotes, the binding of activator or repressor molecules directly modulates the rate of transcription initiation, corresponding to the variables i in our model (Bintu et al., 2005). For enhancer-mediated transcription, on the other hand, the dominant mode of enhancer activation appears to be indirect, through the modulation of enhancer chromatin (Eck et al., 2020). Binding of TFs to enhancers leads to loosening of nucleosomes and to the biochemical modification of histone proteins (Calo and Wysocka, 2013; Park et al., 2021; Hansen et al., 2022), resulting in dynamic changes in enhancer chromatin that are closely linked to enhancer activity (Heintzman et al., 2009; Creyghton et al., 2010). A dominant mode of enhancer activation appears to be the TF-mediated recruitment of histone acetyltransferases, which modify enhancer chromatin, resulting in the recruitment of transcription initiation machinery (Narita et al., 2021).

From a biophysical perspective, this can be captured by a model in which i is set by a latent variable mi, which accumulates proportionally to the binding of TFs:
(5)
(6)
where ξi,j is the effective binding rate TF j to enhancer i, and where κ1, κ−1 are the associated ‘on’ and ‘off’ rates for mi, which may be related to the global activity of enzymes such as histone acetyltransferases and histone deacetylases. Experimentally, it was observed that the timescale of these dynamics is of the order of minutes (Weinert et al., 2018; Narita et al., 2021), and thus much faster than the typical timescale of changes in cell identity. The weight parameter determines the energy in the absence of mi or of effects on i that are outside the cell identity feedback network.

The mechanism described in Eqn 6 is generic and may correspond to several underlying biological processes; it is, in essence, similar to modulation of receptor activation energy by methylation in bacterial chemotaxis (Tu, 2013). Taking a quasi-steady-state of mi and denoting by , Eqn 1 is derived, with the entries of the vector given by and the effective inverse temperature given by .

It is also possible to consider a more general model where TF binding to enhancers is modulated at the level of the entire enhancer, e.g. by the binding of ‘pioneer’ factors that increase accessibility for other TFs. In such a case, Eqn 6 may be altered so that:
(7)
where ζi corresponds to enhancer-level gain, which effectively multiplies .

Equivalence of model with physical enhancers to model with enhancer types

I consider the dynamics of Eqn 1 where a subset, , of enhancers has identical association patterns, i.e. ξi,j=ξk,j=ξj for all j and all . Then:
(8)
(9)
(10)
Now:
(11)
(12)
(13)
where . Thus,
(14)
Denoting the denominator by Z, it was noted that:
(15)
(16)
(17)
(18)
(19)
where . Thus, the following may be derived:
(20)
which does not include the physical enhancers of type , as they are replaced with a combined ‘enhancer type’ with the association constant q*,j and weight w*.

Reciprocity and autoregulation in enhancer feedback networks

Here, I will show that reciprocity:
(21)
for all k, j implies that qi,j=ξi,j for all i, j. As the above relation is trivial for j=k, instead jk is taken:
(22)
where .
Thus, for an arbitrary x, the equality implies that qi,j=νξi,j:
(23)
Thus, reciprocal interactions between TFs are equivalent to the autoregulation of TFs by binding to their enhancers.

Scalar potential for transcriptional dynamics

Many derivations related to Eqn 2 appear in Ramsauer et al. (2020 preprint), including stability analysis of the fixed points, as well as storage capacity and equivalence with other machine-learning models. Here, for completeness, I will derive the important results that pertain to our paper, and refer the reader to Ramsauer et al. for more in-depth derivations that pertain to other aspects of the model. Considering the symmetric case, I will show that the dynamics are a gradient flow. A scalar potential for Eqn 2 is:
(24)
where . This function is similar to the Lyapunov function of Ramsauer et al. (2020 preprint), and accounts for the bias . To see that the dynamics are a potential flow:
(25)
observe that
(26)
(27)
(28)
(29)
which provides the dynamics of .

Rows of Q are fixed points at high β

Consider the row vectors of given by . The dynamics of Eqn 2 at are given by:
(30)
where θi,j determines the angle between and the magnitude is defined by the Euclidean norm. It was assumed that the rows of are all of magnitude unity, which is equivalent to assuming that they have comparable overall binding affinities. In this case:
(31)
where the last equality holds for sufficiently large β and assuming the rows of are distinct, because, in that case cosθi,k=1 if (and only if) i=k. Thus, for a large enough β, the rows of are fixed points of the dynamics. In practice, it is only important that cell types with high cosine similarity have similar overall binding affinities. Consider, for example, the case where, for pattern k′:
(32)
In such a case, the pattern k will be unstable even at very large β, and thus such a pattern will not be accessible. For this reason, it is expected that similar cell types will have comparable overall binding affinities. In cells, this could be achieved either through evolution of enhancer composition or by population-level negative feedback, such as the coupling of the enhancer-level gain parameter (ζi) to the production of cell type i.
What about the case where the matrix is distinct from ? Consider then dynamics at :
(33)
where now θi,j determines the angle between . Now, in order for the pattern to be a fixed point at high β, it is required that:
(34)
Under the assumption that the rows of are comparable, is a fixed point of the dynamics at sufficiently large β when there is high cosine similarity between compared with the other rows of .

Averages of rows of can be fixed points

Let us denote using a subset of indices of enhancers of magnitude and consider the averaged vector:
(35)
Setting gives the dynamics:
(36)
where in the last equality it was assumed that the rows of are also unity. In the case where all weights wi are similar, this will be zero trivially when β is very small and where is the set of all enhancers (all K rows of ), as in this case:
(37)
This can be generalized in a straightforward manner to the case where some of the weights of w are large and of comparable magnitude. Otherwise, and assuming for simplicity that , it is required that:
(38)
which, at large enough β, occurs when the averaged cosine similarity of a set element with the other elements of the set:
(39)
is both (1) comparable within the set, for , and (2) larger than outside the set, for . An averaged pattern of a subset will therefore be a fixed point if each pattern in the subset is similar to each of the other patterns, and distinct from patterns outside the subset. Condition (1) is specifically easy to satisfy for subsets with only two patterns; in this case, their average (a ‘bipotent progenitor’) will be a fixed point when they have large cosine similarity and are distinct from other patterns.

Global stability of averages depends on β

To probe the global stability of averages of rows of , the scalar potential V can be used. The dynamics proceed from high V to low V so it is expected that states with higher V are less likely to be stable than states with lower V. As the potential function is only defined for the symmetric case, the states are defined as averages of the rows of , which are analogously defined as , where is as before a set of indices:
(40)
Setting , the following may be evaluated:
(41)
Again, taking the magnitude of the rows of to be unity and (equivalent to enhancers with similar binding strengths and weights) gives:
(42)
The energy of averaged patterns thus depends only on their cosine similarity relative to all patterns (first term) and internal cosine similarity (second term). As an illustrative case, let us assume that for and for [considering that, in general, A, B are between (−1, 1)]. Then:
(43)
where K is the overall number of patterns. The case of comparable AB, corresponding to random subsets, gives:
(44)
which for large β would correspond to and will thus have high energy when B is low. On the other hand, in the case where AB, as a rough approximation:
(45)
When β is very large, the size of the subset |I| does not contribute to V and thus it is minimized when A is maximized, i.e. at single patterns with A=1 (where ). At intermediate β, however, larger subsets may have lower energy. For two subsets where and A1<A2, the larger set will have lower energy when:
(46)
which occurs for a larger range when the difference between the set magnitudes is larger and the difference between their average cosine similarity is smaller. Specifically, the differentiation from a bipotent progenitor to a terminal identity occurs at a critical β:
(47)
Thus, an annealing strategy where β is decreased and then slowly increased results in a transition from large (global) averages, which correspond to multipotent progenitors, to a set of ever more restricted progenitors, similar to the ‘Waddingtonian’ dynamics.

Temperature transitions for cell type evolution

In this section, I will estimate the temperature β required to specify a population of related cell types. These cell types may share activity patterns across many of their TFs and differ in some smaller regions. As an example that will be discussed later, neurons may have similar TF activity patterns across many TFs, yet small differences in TF activity can fine-tune the identity of specific neurons.

I will specifically consider K patterns associated with closely related cell types and K′ patterns associated with other cell types. There are N bits that differ between the related cell types, of which a fraction η are active (drawn at random), N″ bits that are inactive and N′ bits that are active in all other cell types. As patterns have magnitude unity, the activity level of active bits was set to . It was assumed that unrelated patterns have ηN+N′ active TFs drawn at random. The average inner product of the related cell types is given by:
(48)
while the average inner product between random cell types is given by:
(49)
Let denote each of the closely related patterns, and their average. The energy of each pattern is given by:
(50)
It was assumed that the patterns are sufficiently well separated (AB) and that β is sufficiently large, such that:
(51)
while the energy of the averaged pattern is:
(52)
In general, as β increases, it would be expected to see first a transition from the global average to the local average of the two patterns, which occurs when V1 drops below V2. This transition occurs at a critical value of β that satisfies (assuming K≫1):
(53)
When β is small then and the l.h.s. is equal to 1−A and thus larger than the r.h.s. When the l.h.s. is negative and thus smaller than the r.h.s. Thus, the critical β occurs around:
(54)
I will now consider the case where a new cell type evolves from . I will assume, without loss of generally, that the evolved pattern differs from the existing pattern by two bits, i.e. ξ1,1=a, ξ1,1=0, ξ1,2=0, ξ1,2=a. The energy of the new patterns is given by:
(55)
with C corresponding to the inner product of the new cell types:
(56)
whereas the energy of the averaged pattern is:
(57)
For the second transition, the case of interest is , where the contributions of the (K−1)eβB terms to the energy levels are negligible. In this case, Eqn 47 can be used to give:
(58)
Using Eqns 58 and 54, a ratio χ can be derived that captures the relative increase in β required to evolve and stabilize a new cell type from an existing population of cells:
(59)
The gain χ thus depends only on the TFs that are variable between the cells and, among these TFs, scales with the variance of the active TFs. In the case of neuronal identity specification in C. elegans, the expression code is sparse, with μ=0.1 among variable TFs, with the other parameters being K=118 neuron classes and N=68 TFs exhibiting variable expression across neuronal classes. For this system, this gives:
(60)
implying a further 80% increase in β required from the initial destabilization of a multipotent progenitor to the complete specification of all cell types.

Local positive feedback as a barrier for reprogramming

Consider a modification to the EnhancerNet model where the energy for transcriptional activation is dependent on both the original modification mi and a new modification ui:
(61)
Here, λ is a scalar and ui is autocatalytic and inhibited by enhancer activation. I propose the following dynamics for ui:
(62)
where n≫1 and f(pi) is a decreasing function of enhancer activity pi. Eqn 62 can exhibit either bistability or hysteresis, depending on the value of ρ. It was assumed that initially ui=0 and pi is sufficiently large (as in the multipotent state), maintaining stability at ui=0. As cells differentiate, the activity of specific enhancers decreases while others increase. When the activity of an enhancer drops below a critical threshold, pcrit, Eqn 62 undergoes a bifurcation, causing ui to increase to a much larger value (ui≫1). This increase may persist due to a transition to a new fixed point when the dynamics are bistable. Consequently, the energy required to initiate transcription from the enhancer increases, making reprogramming to the associated cell type (whether induced by signalling, overexpression of specific transcription factors, or noise) less likely. This mechanism effectively creates a barrier to reprogramming.

Direct reprogramming through the overexpression of TFs

Consider a transcription factor xj that is constitutively overexpressed to an extent δj. This constitutive expression alters energy levels i without feedback. This therefore takes into account only the effect of the additional TF production on the energy levels of the enhancers, without altering the coordinates of the stable fixed points. V′ denotes the potential under the perturbed dynamics. Evaluating V′ at pattern k yields, at high β:
(63)
where it was assumed that for all i. Assuming that unperturbed pattern k is stable at a given β, the summand for δj=0 takes its maximum at i=k, where:
(64)
thus:
(65)
Under the assumption that the perturbation magnitude is relatively small (and, generally, for patterns that are stable after the perturbation):
(66)
and thus:
(67)
Note that Eqn 67 readily generalizes to a perturbation in several TFs :
(68)
If the goal is to steer the dynamics toward a pattern k, then the TFs and their overexpression need to be chosen such that is sufficiently larger for i=k than for all other patterns ik. That is, the ideal expression for a TF j is such that ξi,j is large for i=k (amplifying δj) and zero for ik.

Preprocessing and simulation

The matrix was initialized from transcript counts extracted from murine cells and averaged over cell type. For the Tabula Muris dataset (Schaum et al., 2018), I used all available data across organs and averaged over cell type. For haematopoiesis, I used the Haemopedia dataset (Choi et al., 2019) and averaged all cells that belong to the same terminal lineage. The data were then log-transformed (using a log1+x transformation) and filtered for TFs with mean and std expression larger than log4, as well as for Tabula Muris the TFs that participate in the tested reprogramming pathways (a full list of cell types and TFs used can be found in Tables S1-S4). Although, in principle, a log transformation is not needed for our model, which does not assume log-transformed values of x, it has the statistical advantage of reducing variance in the rows of and thus reducing sensitivity to TF choice. Finally, the rows of were normalized to unity. Simulations were performed using Python, taking τ=1, and the (terminal) β was chosen so that all patterns were stable.

Simulation of reprogramming

To simulate forced TF expression during reprogramming, the following dynamics may be used:
(69)
where is a vector where δi corresponds to the degree of activation of TFi. In general, it was assumed that δi=1 when the TF was in the reprogramming pathway and δi=0 when it was not. However, for a few genes, setting δi as different from unity was necessary to achieve the correct reprogramming (setting δi=1 achieves reprogramming to a closely related cell type). The full list of reprogramming pathways and relevant δ values can be found in Table S5.

Simulation of balanced differentiation

Haematopoiesis was simulated by using annealing from β=0 to β=50 over 50 time units. The initial state corresponded to the averaged value of all terminal states. Simulations were performed by adding additive white noise:
(70)
with σ set at 0.01. To simulate balanced differentiation in haematopoiesis, I used the following feedback procedure that aims to imitate in vivo feedback on differentiation, where there is negative feedback on the production of each cell type by signalling. was initialized as a zero vector. Then, running from k=0 to k=kmax, we performed differentiation by annealing. If the outcome terminal cell corresponded to row i of , I adjusted wi=wi−0.5(1−k/kmax) and then mean-adjusted w to zero. This resulted in a signalling profile, w, that produced all cell fates from the initial progenitor population.

I am grateful to Anton Grishchekcin, Benjamin Simons and Tal Agranov for helpful discussions and for providing feedback on the manuscript. All figures were created with BioRender.com.

Author contributions

Conceptualization: O.K.; Methodology: O.K.; Software: O.K.; Validation: O.K.; Formal analysis: O.K.; Investigation: O.K.; Writing - original draft: O.K.

Funding

Open Access funding provided by Imperial College London. Deposited in PMC for immediate release.

Data availability

The original code for the simulations presented in this study is available from https://github.com/karin-lab/enhancernet.

Ackers
,
G. K.
,
Johnson
,
A. D.
and
Shea
,
M. A.
(
1982
).
Quantitative model for gene regulation by lambda phage repressor
.
Proc. Natl Acad. Sci. USA
79
,
1129
-
1133
.
Adam
,
R. C.
,
Yang
,
H.
,
Rockowitz
,
S.
,
Larsen
,
S. B.
,
Nikolova
,
M.
,
Oristian
,
D. S.
,
Polak
,
L.
,
Kadaja
,
M.
,
Asare
,
A.
,
Zheng
,
D.
et al.
(
2015
).
Pioneer factors govern super-enhancer dynamics in stem cell plasticity and lineage choice
.
Nature
521
,
366
-
370
.
Almeida
,
N.
,
Chung
,
M. W.
,
Drudi
,
E. M.
,
Engquist
,
E. N.
,
Hamrud
,
E.
,
Isaacson
,
A.
,
Tsang
,
V. S.
,
Watt
,
F. M.
and
Spagnoli
,
F. M.
(
2021
).
Employing core regulatory circuits to define cell identity
.
EMBO J.
40
,
e106785
.
Alon
,
U.
(
2007
).
Network motifs: theory and experimental approaches
.
Nat. Rev. Genet.
8
,
450
-
461
.
Amari
,
S.-I.
(
1972
).
Learning patterns and pattern sequences by self-organizing nets of threshold elements
.
IEEE Trans. Comp.
100
,
1197
-
1206
.
Amit
,
D. J.
(
1989
).
Modeling Brain Function: The World of Attractor Neural Networks
.
Cambridge University Press
.
Anderson
,
D. A.
, III
,
Dutertre
,
C.-A.
,
Ginhoux
,
F.
and
Murphy
,
K. M
. (
2021
).
Genetic models of human and mouse dendritic cell development and function
.
Nat. Rev. Immunol.
21
,
101
-
115
.
Angel
,
A.
,
Song
,
J.
,
Dean
,
C.
and
Howard
,
M.
(
2011
).
A polycomb-based switch underlying quantitative epigenetic memory
.
Nature
476
,
105
-
108
.
Angeloni
,
A.
and
Bogdanovic
,
O.
(
2019
).
Enhancer dna methylation: implications for gene regulation
.
Essays Biochem.
63
,
707
-
715
.
Arendt
,
D.
,
Musser
,
J. M.
,
Baker
,
C. V.
,
Bergman
,
A.
,
Cepko
,
C.
,
Erwin
,
D. H.
,
Pavlicev
,
M.
,
Schlosser
,
G.
,
Widder
,
S.
,
Laubichler
,
M. D.
et al.
(
2016
).
The origin and evolution of cell types
.
Nat. Rev. Genet.
17
,
744
-
757
.
Berry
,
S.
,
Dean
,
C.
and
Howard
,
M.
(
2017
).
Slow chromatin dynamics allow polycomb target genes to filter fluctuations in transcription factor activity
.
Cell Syst.
4
,
445
-
457
.
Bessonnard
,
S.
,
De Mot
,
L.
,
Gonze
,
D.
,
Barriol
,
M.
,
Dennis
,
C.
,
Goldbeter
,
A.
,
Dupont
,
G.
and
Chazaud
,
C.
(
2014
).
Gata6, nanog and erk signaling control cell fate in the inner cell mass through a tristable regulatory network
.
Development
141
,
3637
-
3648
.
Bintu
,
L.
,
Buchler
,
N. E.
,
Garcia
,
H. G.
,
Gerland
,
U.
,
Hwa
,
T.
,
Kondev
,
J.
and
Phillips
,
R.
(
2005
).
Transcriptional regulation by the numbers: models
.
Curr. Opin. Genet. Dev.
15
,
116
-
124
.
Boukacem
,
N. E.
,
Leary
,
A.
,
Thériault
,
R.
,
Gottlieb
,
F.
,
Mani
,
M.
and
François
,
P.
(
2024
).
Waddington landscape for prototype learning in generalized hopfield networks
.
Phys. Rev. Res.
6
,
033098
.
Briggs
,
J. A.
,
Weinreb
,
C.
,
Wagner
,
D. E.
,
Megason
,
S.
,
Peshkin
,
L.
,
Kirschner
,
M. W.
and
Klein
,
A. M.
(
2018
).
The dynamics of gene expression in vertebrate embryogenesis at single-cell resolution
.
Science
360
,
eaar5780
.
Burke
,
A. C.
,
Nelson
,
C. E.
,
Morgan
,
B. A.
and
Tabin
,
C.
(
1995
).
Hox genes and the evolution of vertebrate axial morphology
.
Development
121
,
333
-
346
.
Calo
,
E.
and
Wysocka
,
J.
(
2013
).
Modification of enhancer chromatin: what, how, and why?
Mol. Cell
49
,
825
-
837
.
Cannavò
,
E.
,
Khoueiry
,
P.
,
Garfield
,
D. A.
,
Geeleher
,
P.
,
Zichner
,
T.
,
Gustafson
,
E. H.
,
Ciglar
,
L.
,
Korbel
,
J. O.
and
Furlong
,
E. E.
(
2016
).
Shadow enhancers are pervasive features of developmental regulatory networks
.
Curr. Biol.
26
,
38
-
51
.
Chan
,
H. L.
,
Beckedorff
,
F.
,
Zhang
,
Y.
,
Garcia-Huidobro
,
J.
,
Jiang
,
H.
,
Colaprico
,
A.
,
Bilbao
,
D.
,
Figueroa
,
M. E.
,
LaCava
,
J.
,
Shiekhattar
,
R.
et al.
(
2018
).
Polycomb complexes associate with enhancers and promote oncogenic transcriptional programs in cancer through multiple mechanisms
.
Nat. Commun.
9
,
3377
.
Chanda
,
S.
,
Ang
,
C. E.
,
Davila
,
J.
,
Pak
,
C.
,
Mall
,
M.
,
Lee
,
Q. Y.
,
Ahlenius
,
H.
,
Jung
,
S. W.
,
Südhof
,
T. C.
and
Wernig
,
M.
(
2014
).
Generation of induced neuronal cells by the single reprogramming factor ascl1
.
Stem Cell Rep.
3
,
282
-
296
.
Choi
,
J.
,
Baldwin
,
T. M.
,
Wong
,
M.
,
Bolden
,
J. E.
,
Fairfax
,
K. A.
,
Lucas
,
E. C.
,
Cole
,
R.
,
Biben
,
C.
,
Morgan
,
C.
,
Ramsay
,
K. A.
et al.
(
2019
).
Haemopedia rna-seq: a database of gene expression during haematopoiesis in mice and humans
.
Nucleic Acids Res.
47
,
D780
-
D785
.
Conforte
,
A. J.
,
Alves
,
L.
,
Coelho
,
F. C.
,
Carels
,
N.
and
Silva
,
F. A. B. d.
(
2020
).
Modeling basins of attraction for breast cancer using hopfield networks
.
Front. Genet.
11
,
314
.
Creyghton
,
M. P.
,
Cheng
,
A. W.
,
Welstead
,
G. G.
,
Kooistra
,
T.
,
Carey
,
B. W.
,
Steine
,
E. J.
,
Hanna
,
J.
,
Lodato
,
M. A.
,
Frampton
,
G. M.
,
Sharp
,
P. A.
et al.
(
2010
).
Histone h3k27ac separates active from poised enhancers and predicts developmental state
.
Proc. Natl Acad. Sci. USA
107
,
21931
-
21936
.
D'Alessio
,
A. C.
,
Fan
,
Z. P.
,
Wert
,
K. J.
,
Baranov
,
P.
,
Cohen
,
M. A.
,
Saini
,
J. S.
,
Cohick
,
E.
,
Charniga
,
C.
,
Dadon
,
D.
,
Hannett
,
N. M.
et al.
(
2015
).
A systematic approach to identify candidate transcription factors that control cell identity
.
Stem Cell Rep.
5
,
763
-
775
.
Dodd
,
I. B.
,
Micheelsen
,
M. A.
,
Sneppen
,
K.
and
Thon
,
G.
(
2007
).
Theoretical analysis of epigenetic cell memory by nucleosome modification
.
Cell
129
,
813
-
822
.
Ebrahimi
,
A.
,
Sevinç
,
K.
,
Gürhan Sevinç
,
G.
,
Cribbs
,
A. P.
,
Philpott
,
M.
,
Uyulur
,
F.
,
Morova
,
T.
,
Dunford
,
J. E.
,
Göklemez
,
S.
,
Ar
,
Ş.
et al.
(
2019
).
Bromodomain inhibition of the coactivators cbp/ep300 facilitate cellular reprogramming
.
Nat. Chem. Biol.
15
,
519
-
528
.
Eck
,
E.
,
Liu
,
J.
,
Kazemzadeh-Atoufi
,
M.
,
Ghoreishi
,
S.
,
Blythe
,
S. A.
and
Garcia
,
H. G.
(
2020
).
Quantitative dissection of transcription in development yields evidence for transcription-factor-driven chromatin accessibility
.
Elife
9
,
e56429
.
Fard
,
A. T.
,
Srihari
,
S.
,
Mar
,
J. C.
and
Ragan
,
M. A.
(
2016
).
Not just a colourful metaphor: modelling the landscape of cellular development using hopfield networks
.
NPJ Syst. Biol. Appl.
2
,
1
-
9
.
Feng
,
C.
,
Song
,
C.
,
Jiang
,
Y.
,
Zhao
,
J.
,
Zhang
,
J.
,
Wang
,
Y.
,
Yin
,
M.
,
Zhu
,
J.
,
Ai
,
B.
,
Wang
,
Q.
et al.
(
2023
).
Landscape and significance of human super enhancer-driven core transcription regulatory circuitry
.
Mol. Ther. Nucleic Acids
32
,
385
-
401
.
Ferrell
,
J. E.
Jr
(
2002
).
Self-perpetuating states in signal transduction: positive feedback, double-negative feedback and bistability
.
Curr. Opin. Cell Biol.
14
,
140
-
148
.
Gao
,
H.
,
Lukin
,
K.
,
Ramírez
,
J.
,
Fields
,
S.
,
Lopez
,
D.
and
Hagman
,
J.
(
2009
).
Opposing effects of swi/snf and mi-2/nurd chromatin remodeling complexes on epigenetic reprogramming by ebf and pax5
.
Proc. Natl Acad. Sci. USA
106
,
11258
-
11263
.
Gomez
,
R. A.
,
Pentz
,
E. S.
,
Jin
,
X.
,
Cordaillat
,
M.
and
Sequeira Lopez
,
M. L. S.
(
2009
).
Cbp and p300 are essential for renin cell identity and morphological integrity of the kidney
.
Am. J. Physiol. Heart Circ. Physiol.
296
,
H1255
-
H1262
.
Graham
,
T. G.
,
Tabei
,
S. A.
,
Dinner
,
A. R.
and
Rebay
,
I.
(
2010
).
Modeling bistable cell-fate choices in the drosophila eye: qualitative and quantitative perspectives
.
Development
137
,
2265
-
2278
.
Grover
,
A.
,
Mancini
,
E.
,
Moore
,
S.
,
Mead
,
A. J.
,
Atkinson
,
D.
,
Rasmussen
,
K. D.
,
O'Carroll
,
D.
,
Jacobsen
,
S. E. W.
and
Nerlov
,
C.
(
2014
).
Erythropoietin guides multipotent hematopoietic progenitor cells toward an erythroid fate
.
J. Exp. Med.
211
,
181
-
188
.
Gulati
,
G. S.
,
Sikandar
,
S. S.
,
Wesche
,
D. J.
,
Manjunath
,
A.
,
Bharadwaj
,
A.
,
Berger
,
M. J.
,
Ilagan
,
F.
,
Kuo
,
A. H.
,
Hsieh
,
R. W.
,
Cai
,
S.
et al.
(
2020
).
Single-cell transcriptional diversity is a hallmark of developmental potential
.
Science
367
,
405
-
411
.
Guo
,
J.
and
Zheng
,
J.
(
2017
).
Hopland: single-cell pseudotime recovery using continuous hopfield network-based modeling of waddington's epigenetic landscape
.
Bioinformatics
33
,
i102
-
i109
.
Haerter
,
J. O.
,
Lövkvist
,
C.
,
Dodd
,
I. B.
and
Sneppen
,
K.
(
2014
).
Collaboration between cpg sites is needed for stable somatic inheritance of dna methylation states
.
Nucleic Acids Res.
42
,
2235
-
2244
.
Hansen
,
J. L.
,
Loell
,
K. J.
and
Cohen
,
B. A.
(
2022
).
A test of the pioneer factor hypothesis using ectopic liver gene activation
.
Elife
11
,
e73358
.
He
,
R.
,
Dantas
,
A.
and
Riabowol
,
K.
(
2021
).
Histone acetyltransferases and stem cell identity
.
Cancers
13
,
2407
.
Heintzman
,
N. D.
,
Hon
,
G. C.
,
Hawkins
,
R. D.
,
Kheradpour
,
P.
,
Stark
,
A.
,
Harp
,
L. F.
,
Ye
,
Z.
,
Lee
,
L. K.
,
Stuart
,
R. K.
,
Ching
,
C. W.
et al.
(
2009
).
Histone modifications at human enhancers reflect global cell-type-specific gene expression
.
Nature
459
,
108
-
112
.
Heinz
,
S.
,
Benner
,
C.
,
Spann
,
N.
,
Bertolino
,
E.
,
Lin
,
Y. C.
,
Laslo
,
P.
,
Cheng
,
J. X.
,
Murre
,
C.
,
Singh
,
H.
and
Glass
,
C. K.
(
2010
).
Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and b cell identities
.
Mol. Cell
38
,
576
-
589
.
Heinz
,
S.
,
Romanoski
,
C. E.
,
Benner
,
C.
and
Glass
,
C. K.
(
2015
).
The selection and function of cell type-specific enhancers
.
Nat. Rev. Mol. Cell Biol.
16
,
144
-
154
.
Hnisz
,
D.
,
Abraham
,
B. J.
,
Lee
,
T. I.
,
Lau
,
A.
,
Saint-André
,
V.
,
Sigova
,
A. A.
,
Hoke
,
H. A.
and
Young
,
R. A.
(
2013
).
Super-enhancers in the control of cell identity and disease
.
Cell
155
,
934
-
947
.
Hnisz
,
D.
,
Schuijers
,
J.
,
Lin
,
C. Y.
,
Weintraub
,
A. S.
,
Abraham
,
B. J.
,
Lee
,
T. I.
,
Bradner
,
J. E.
and
Young
,
R. A.
(
2015
).
Convergence of developmental and oncogenic signaling pathways at transcriptional super-enhancers
.
Mol. Cell
58
,
362
-
370
.
Hobert
,
O.
(
2021
).
Homeobox genes and the specification of neuronal identity
.
Nat. Rev. Neurosci.
22
,
627
-
636
.
Holmberg
,
J.
and
Perlmann
,
T.
(
2012
).
Maintaining differentiated cellular identity
.
Nat. Rev. Genet.
13
,
429
-
439
.
Hong
,
J.-W.
,
Hendrix
,
D. A.
and
Levine
,
M. S.
(
2008
).
Shadow enhancers as a source of evolutionary novelty
.
Science
321
,
1314-1314
.
Hopfield
,
J. J.
(
1982
).
Neural networks and physical systems with emergent collective computational abilities
.
Proc. Natl Acad. Sci. USA
79
,
2554
-
2558
.
Hsieh
,
J.
,
Nakashima
,
K.
,
Kuwabara
,
T.
,
Mejia
,
E.
and
Gage
,
F. H.
(
2004
).
Histone deacetylase inhibition-mediated neuronal differentiation of multipotent adult neural progenitor cells
.
Proc. Natl Acad. Sci. USA
101
,
16659
-
16664
.
Hu
,
M.
,
Krause
,
D.
,
Greaves
,
M.
,
Sharkis
,
S.
,
Dexter
,
M.
,
Heyworth
,
C.
and
Enver
,
T.
(
1997
).
Multilineage gene expression precedes commitment in the hemopoietic system
.
Genes Dev.
11
,
774
-
785
.
Huang
,
S.
,
Eichler
,
G.
,
Bar-Yam
,
Y.
and
Ingber
,
D. E.
(
2005
).
Cell fates as high-dimensional attractor states of a complex gene regulatory network
.
Phys. Rev. Lett.
94
,
128701
.
Huang
,
S.
,
Guo
,
Y.-P.
,
May
,
G.
and
Enver
,
T.
(
2007
).
Bifurcation dynamics in lineage-commitment in bipotent progenitor cells
.
Dev. Biol.
305
,
695
-
713
.
Iurlaro
,
M.
,
Stadler
,
M. B.
,
Masoni
,
F.
,
Jagani
,
Z.
,
Galli
,
G. G.
and
Schübeler
,
D.
(
2021
).
Mammalian swi/snf continuously restores local accessibility to chromatin
.
Nat. Genet.
53
,
279
-
287
.
Karantzali
,
E.
,
Schulz
,
H.
,
Hummel
,
O.
,
Hubner
,
N.
,
Hatzopoulos
,
A.
and
Kretsovali
,
A.
(
2008
).
Histone deacetylase inhibition accelerates the early events of stem cell differentiation: transcriptomic and epigenetic analysis
.
Genome Biol.
9
,
1
-
15
.
Kelaini
,
S.
,
Cochrane
,
A.
and
Margariti
,
A.
(
2014
).
Direct reprogramming of adult cells: avoiding the pluripotent state
.
Stem Cells Cloning
7
,
19
-
29
.
Kim
,
T.-H.
,
Saadatpour
,
A.
,
Guo
,
G.
,
Saxena
,
M.
,
Cavazza
,
A.
,
Desai
,
N.
,
Jadhav
,
U.
,
Jiang
,
L.
,
Rivera
,
M. N.
,
Orkin
,
S. H.
et al.
(
2016
).
Single-cell transcript profiles reveal multilineage priming in early progenitors derived from lgr5+ intestinal stem cells
.
Cell Rep.
16
,
2053
-
2060
.
Kirkpatrick
,
S.
,
Gelatt
,
C. D.
, Jr
and
Vecchi
,
M. P
. (
1983
)
. Optimization by simulated annealing
.
Science
220
,
671
-
680
.
Kondo
,
Y.
,
Iwao
,
T.
,
Yoshihashi
,
S.
,
Mimori
,
K.
,
Ogihara
,
R.
,
Nagata
,
K.
,
Kurose
,
K.
,
Saito
,
M.
,
Niwa
,
T.
,
Suzuki
,
T.
et al.
(
2014
).
Histone deacetylase inhibitor valproic acid promotes the differentiation of human induced pluripotent stem cells into hepatocyte-like cells
.
PLoS One
9
,
e104010
.
Kraut
,
R.
and
Levine
,
M.
(
1991
).
Mutually repressive interactions between the gap genes giant and krüppel define middle body regions of the drosophila embryo
.
Development
111
,
611
-
621
.
Krotov
,
D.
and
Hopfield
,
J. J.
(
2016
).
Dense associative memory for pattern recognition
.
Adv. Neural Inf. Process. Syst.
29
,
1172
.
Krotov
,
D.
and
Hopfield
,
J.
(
2021
).
Large associative memory problem in neurobiology and machine learning
.
International Conference on Learning Representations, 2021
. https://openreview.net/forum?id=X4y_10OX-hX
Kvon
,
E. Z.
,
Waymack
,
R.
,
Gad
,
M.
and
Wunderlich
,
Z.
(
2021
).
Enhancer redundancy in development and disease
.
Nat. Rev. Genet.
22
,
324
-
336
.
Lang
,
A. H.
,
Li
,
H.
,
Collins
,
J. J.
and
Mehta
,
P.
(
2014
).
Epigenetic landscapes explain partially reprogrammed cells and identify key reprogramming genes
.
PLoS Comput. Biol.
10
,
e1003734
.
Li
,
Q.
,
Foote
,
M.
and
Chen
,
J.
(
2014
).
Effects of histone deacetylase inhibitor valproic acid on skeletal myocyte development
.
Sci. Rep.
4
,
7207
.
Lipinski
,
M.
,
Muñoz-Viana
,
R.
,
Del Blanco
,
B.
,
Marquez-Galera
,
A.
,
Medrano-Relinque
,
J.
,
Caramés
,
J. M.
,
Szczepankiewicz
,
A. A.
,
Fernandez-Albert
,
J.
,
Navarrón
,
C. M.
,
Olivares
,
R.
et al.
(
2020
).
Kat3-dependent acetylation of cell type-specific genes maintains neuronal identity in the adult mouse brain
.
Nat. Commun.
11
,
2588
.
Little
,
W. A.
(
1974
).
The existence of persistent states in the brain
.
Math. Biosci.
19
,
101
-
120
.
Månsson
,
R.
,
Hultquist
,
A.
,
Luc
,
S.
,
Yang
,
L.
,
Anderson
,
K.
,
Kharazi
,
S.
,
Al-Hashmi
,
S.
,
Liuba
,
K.
,
Thorén
,
L.
,
Adolfsson
,
J.
et al.
(
2007
).
Molecular evidence for hierarchical transcriptional lineage priming in fetal and adult stem cells and multipotent progenitors
.
Immunity
26
,
407
-
419
.
Marks
,
P. A.
,
Richon
,
V. M.
and
Rifkind
,
R. A.
(
2000
).
Histone deacetylase inhibitors: inducers of differentiation or apoptosis of transformed cells
.
J. Natl. Cancer Inst.
92
,
1210
-
1216
.
Martin
,
E. W.
,
Krietsch
,
J.
,
Reggiardo
,
R. E.
,
Sousae
,
R.
,
Kim
,
D. H.
and
Forsberg
,
E. C.
(
2021
).
Chromatin accessibility maps provide evidence of multilineage gene priming in hematopoietic stem cells
.
Epigenetics Chromatin
14
,
1
-
15
.
Mercer
,
E. M.
,
Lin
,
Y. C.
,
Benner
,
C.
,
Jhunjhunwala
,
S.
,
Dutkowski
,
J.
,
Flores
,
M.
,
Sigvardsson
,
M.
,
Ideker
,
T.
,
Glass
,
C. K.
and
Murre
,
C.
(
2011
).
Multilineage priming of enhancer repertoires precedes commitment to the b and myeloid cell lineages in hematopoietic progenitors
.
Immunity
35
,
413
-
425
.
Merrell
,
A. J.
and
Stanger
,
B. Z.
(
2016
).
Adult cell plasticity in vivo: de-differentiation and transdifferentiation are back in style
.
Nat. Rev. Mol. Cell Biol.
17
,
413
-
425
.
Milo
,
R.
,
Shen-Orr
,
S.
,
Itzkovitz
,
S.
,
Kashtan
,
N.
,
Chklovskii
,
D.
and
Alon
,
U.
(
2002
).
Network motifs: simple building blocks of complex networks
.
Science
298
,
824
-
827
.
Miyamoto
,
T.
,
Iwasaki
,
H.
,
Reizis
,
B.
,
Ye
,
M.
,
Graf
,
T.
,
Weissman
,
I. L.
and
Akashi
,
K.
(
2002
).
Myeloid or lymphoid promiscuity as a critical step in hematopoietic lineage commitment
.
Dev. Cell
3
,
137
-
147
.
Moris
,
N.
,
Pina
,
C.
and
Arias
,
A. M.
(
2016
).
Transition states and cell fate decisions in epigenetic landscapes
.
Nat. Rev. Genet.
17
,
693
-
703
.
Movilla Miangolarra
,
A.
,
Saxton
,
D. S.
,
Yan
,
Z.
,
Rine
,
J.
and
Howard
,
M.
(
2024
).
Two-way feedback between chromatin compaction and histone modification state explains saccharomyces cerevisiae heterochromatin bistability
.
Proc. Natl Acad. Sci. USA
121
,
e2403316121
.
Narita
,
T.
,
Ito
,
S.
,
Higashijima
,
Y.
,
Chu
,
W. K.
,
Neumann
,
K.
,
Walter
,
J.
,
Satpathy
,
S.
,
Liebner
,
T.
,
Hamilton
,
W. B.
,
Maskey
,
E.
et al.
(
2021
).
Enhancers are activated by p300/cbp activity-dependent pic assembly, rnapii recruitment, and pause release
.
Mol. Cell
81
,
2166
-
2182
.
Narita
,
T.
,
Higashijima
,
Y.
,
Kilic
,
S.
,
Liebner
,
T.
,
Walter
,
J.
and
Choudhary
,
C.
(
2023
).
Acetylation of histone h2b marks active enhancers and predicts cbp/p300 target genes
.
Nat. Genet.
55
,
679
-
692
.
Nimmo
,
R. A.
,
May
,
G. E.
and
Enver
,
T.
(
2015
).
Primed and ready: understanding lineage commitment through single cell analysis
.
Trends Cell Biol.
25
,
459
-
467
.
Olsson
,
A.
,
Venkatasubramanian
,
M.
,
Chaudhri
,
V. K.
,
Aronow
,
B. J.
,
Salomonis
,
N.
,
Singh
,
H.
and
Grimes
,
H. L.
(
2016
).
Single-cell analysis of mixed-lineage states leading to a binary cell fate choice
.
Nature
537
,
698
-
702
.
Orkin
,
S. H.
and
Zon
,
L. I.
(
2008
).
Hematopoiesis: an evolving paradigm for stem cell biology
.
Cell
132
,
631
-
644
.
Panigrahi
,
A.
and
O'Malley
,
B. W.
(
2021
).
Mechanisms of enhancer action: the known and the unknown
.
Genome Biol.
22
,
108
.
Park
,
Y.-K.
,
Lee
,
J.-E.
,
Yan
,
Z.
,
McKernan
,
K.
,
O'Haren
,
T.
,
Wang
,
W.
,
Peng
,
W.
and
Ge
,
K.
(
2021
).
Interplay of baf and mll4 promotes cell type-specific enhancer activation
.
Nat. Commun.
12
,
1630
.
Pennacchio
,
L. A.
,
Bickmore
,
W.
,
Dean
,
A.
,
Nobrega
,
M. A.
and
Bejerano
,
G.
(
2013
).
Enhancers: five essential questions
.
Nat. Rev. Genet.
14
,
288
-
295
.
Perrimon
,
N.
,
Pitsouli
,
C.
and
Shilo
,
B.-Z.
(
2012
).
Signaling mechanisms controlling cell fate and embryonic patterning
.
Cold Spring Harbor Perspect. Biol.
4
,
a005975
.
Pusuluri
,
S. T.
,
Lang
,
A. H.
,
Mehta
,
P.
and
Castillo
,
H. E.
(
2017
).
Cellular reprogramming dynamics follow a simple 1d reaction coordinate
.
Phys. Biol.
15
,
016001
.
Qiao
,
X. T.
,
Ziel
,
J. W.
,
McKimpson
,
W.
,
Madison
,
B. B.
,
Todisco
,
A.
,
Merchant
,
J. L.
,
Samuelson
,
L. C.
and
Gumucio
,
D. L.
(
2007
).
Prospective identification of a multilineage progenitor in murine stomach epithelium
.
Gastroenterology
133
,
1989
-
1998
.
Rackham
,
O. J.
,
Firas
,
J.
,
Fang
,
H.
,
Oates
,
M. E.
,
Holmes
,
M. L.
,
Knaupp
,
A. S.
,
Consortium
,
F.
,
Suzuki
,
H.
,
Nefzger
,
C. M.
,
Daub
,
C. O.
et al.
(
2016
).
A predictive computational framework for direct reprogramming between human cell types
.
Nat. Genet.
48
,
331
-
335
.
Ramsauer
,
H.
,
Schäfl
,
B.
,
Lehner
,
J.
,
Seidl
,
P.
,
Widrich
,
M.
,
Adler
,
T.
,
Gruber
,
L.
,
Holzleitner
,
M.
,
Pavlović
,
M.
,
Sandve
,
G. K.
et al.
(
2020
).
Hopfield networks is all you need
.
arXiv
,
02217
.
Rand
,
D. A.
,
Raju
,
A.
,
Sáez
,
M.
,
Corson
,
F.
and
Siggia
,
E. D.
(
2021
).
Geometry of gene regulatory dynamics
.
Proc. Natl Acad. Sci. USA
118
,
e2109729118
.
Reilly
,
M. B.
,
Cros
,
C.
,
Varol
,
E.
,
Yemini
,
E.
and
Hobert
,
O.
(
2020
).
Unique homeobox codes delineate all the neuron classes of c. elegans
.
Nature
584
,
595
-
601
.
Reiter
,
F.
,
Wienerroither
,
S.
and
Stark
,
A.
(
2017
).
Combinatorial function of transcription factors and cofactors
.
Curr. Opin. Genet. Dev.
43
,
73
-
81
.
Sáez
,
M.
,
Blassberg
,
R.
,
Camacho-Aguilar
,
E.
,
Siggia
,
E. D.
,
Rand
,
D. A.
and
Briscoe
,
J.
(
2022
).
Statistically derived geometrical landscapes capture principles of decision-making dynamics during cell fate transitions
.
Cell Systems
13
,
12
-
28
.
Saint-André
,
V.
,
Federation
,
A. J.
,
Lin
,
C. Y.
,
Abraham
,
B. J.
,
Reddy
,
J.
,
Lee
,
T. I.
,
Bradner
,
J. E.
and
Young
,
R. A.
(
2016
).
Models of human core transcriptional regulatory circuitries
.
Genome Res.
26
,
385
-
396
.
Schaum
,
N.
,
Karkanias
,
J.
,
Neff
,
N. F.
,
May
,
A. P.
,
Quake
,
S. R.
,
Wyss-Coray
,
T.
,
Darmanis
,
S.
,
Batson
,
J.
,
Botvinnik
,
O.
,
Chen
,
M. B.
et al.
(
2018
).
Single-cell transcriptomics of 20 mouse organs creates a tabula muris: The tabula muris consortium
.
Nature
562
,
367
.
Singh
,
P. N.
,
Madha
,
S.
,
Leiter
,
A. B.
and
Shivdasani
,
R. A.
(
2022
).
Cell and chromatin transitions in intestinal stem cell regeneration
.
Genes Dev.
36
,
684
-
698
.
Smart
,
M.
and
Zilman
,
A.
(
2023
).
Emergent properties of collective gene-expression patterns in multicellular systems
.
Cell Rep. Phys. Sci.
4
,
101247
.
Sneppen
,
K.
and
Ringrose
,
L.
(
2019
).
Theoretical analysis of polycomb-trithorax systems predicts that poised chromatin is bistable and not bivalent
.
Nat. Commun.
10
,
2133
.
Spitz
,
F.
and
Furlong
,
E. E.
(
2012
).
Transcription factors: from enhancer binding to developmental control
.
Nat. Rev. Genet.
13
,
613
-
626
.
Teschendorff
,
A. E.
and
Feinberg
,
A. P.
(
2021
).
Statistical mechanics meets single-cell biology
.
Nat. Rev. Genet.
22
,
459
-
476
.
Toma
,
J. G.
,
McKenzie
,
I. A.
,
Bagli
,
D.
and
Miller
,
F. D.
(
2005
).
Isolation and characterization of multipotent skin-derived precursors from human skin
.
Stem Cells
23
,
727
-
737
.
Tu
,
Y.
(
2013
).
Quantitative modeling of bacterial chemotaxis: signal amplification and accurate adaptation
.
Annu. Rev. Biophys.
42
,
337
-
359
.
Uyehara
,
C. M.
and
Apostolou
,
E.
(
2023
).
3d enhancer-promoter interactions and multi-connected hubs: Organizational principles and functional roles
.
Cell Reports
42
,
112068
.
Van Laarhoven
,
P. J.
,
Aarts
,
E. H.
,
van Laarhoven
,
P. J.
and
Aarts
,
E. H
. (
1987
).
Simulated Annealing
.
Springer
.
Waddington
,
C
. (
1957
).
The Strategy of the Genes: A Discussion of Some Aspects if Theoretical Biology
.
George Allen & Unwin
.
Wang
,
H.
,
Yang
,
Y.
,
Liu
,
J.
and
Qian
,
L.
(
2021
).
Direct cell reprogramming: approaches, mechanisms and progress
.
Nat. Rev. Mol. Cell Biol.
22
,
410
-
424
.
Weinert
,
B. T.
,
Narita
,
T.
,
Satpathy
,
S.
,
Srinivasan
,
B.
,
Hansen
,
B. K.
,
Schölz
,
C.
,
Hamilton
,
W. B.
,
Zucconi
,
B. E.
,
Wang
,
W. W.
,
Liu
,
W. R.
et al.
(
2018
).
Time-resolved analysis reveals rapid dynamics and broad scope of the cbp/p300 acetylome
.
Cell
174
,
231
-
244
.
Weinreb
,
C.
,
Rodriguez-Fraticelli
,
A.
,
Camargo
,
F. D.
and
Klein
,
A. M.
(
2020
).
Lineage tracing on transcriptional landscapes links state to fate during differentiation
.
Science
367
,
eaaw3381
.
Whyte
,
W. A.
,
Bilodeau
,
S.
,
Orlando
,
D. A.
,
Hoke
,
H. A.
,
Frampton
,
G. M.
,
Foster
,
C. T.
,
Cowley
,
S. M.
and
Young
,
R. A.
(
2012
).
Enhancer decommissioning by lsd1 during embryonic stem cell differentiation
.
Nature
482
,
221
-
225
.
Whyte
,
W. A.
,
Orlando
,
D. A.
,
Hnisz
,
D.
,
Abraham
,
B. J.
,
Lin
,
C. Y.
,
Kagey
,
M. H.
,
Rahl
,
P. B.
,
Lee
,
T. I.
and
Young
,
R. A.
(
2013
).
Master transcription factors and mediator establish super-enhancers at key cell identity genes
.
Cell
153
,
307
-
319
.
Wolf
,
B. K.
,
Zhao
,
Y.
,
McCray
,
A.
,
Hawk
,
W. H.
,
Deary
,
L. T.
,
Sugiarto
,
N. W.
,
LaCroix
,
I. S.
,
Gerber
,
S. A.
,
Cheng
,
C.
and
Wang
,
X.
(
2023
).
Cooperation of chromatin remodeling swi/snf complex and pioneer factor ap-1 shapes 3d enhancer landscapes
.
Nat. Struct. Mol. Biol.
30
,
10
-
21
.
Yampolskaya
,
M.
,
Herriges
,
M. J.
,
Ikonomou
,
L.
,
Kotton
,
D. N.
and
Mehta
,
P.
(
2023
).
sctop: physics-inspired order parameters for cellular identification and visualization
.
Development
150
,
dev201873
.
Zhang
,
L.
,
Sheng
,
C.
,
Zhou
,
F.
,
Zhu
,
K.
,
Wang
,
S.
,
Liu
,
Q.
,
Yuan
,
M.
,
Xu
,
Z.
,
Liu
,
Y.
,
Lu
,
J.
et al.
(
2021
).
Cbp/p300 hat maintains the gene network critical for β cell identity and functional maturity
.
Cell Death Dis.
12
,
476
.
Zheng
,
S.
,
Papalexi
,
E.
,
Butler
,
A.
,
Stephenson
,
W.
and
Satija
,
R.
(
2018
).
Molecular transitions in early progenitors during human cord blood hematopoiesis
.
Mol. Syst. Biol.
14
,
e8041
.
Zhou
,
J. X.
,
Aliyu
,
M.
,
Aurell
,
E.
and
Huang
,
S.
(
2012
).
Quasi-potential landscape in complex multi-stable systems
.
J. R. Soc. Interface
9
,
3539
-
3553
.

Competing interests

The authors declare no competing or financial interests.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.

Supplementary information