ABSTRACT
Liquid–liquid phase separation (LLPS) has increasingly been found to play pivotal roles in a number of intracellular events and reactions, and has introduced a new paradigm in cell biology to explain protein–protein and enzyme–ligand interactions beyond conventional molecular and biochemical theories. LLPS is driven by the cumulative effects of weak and promiscuous interactions, including electrostatic, hydrophobic and cation–π interactions, among polypeptides containing intrinsically disordered regions (IDRs) and describes the macroscopic behaviours of IDR-containing proteins in an intracellular milieu. Recent studies have revealed that interactions between ‘charge blocks’ – clusters of like charges along the polypeptide chain – strongly induce LLPS and play fundamental roles in its spatiotemporal regulation. Introducing a new parameter, termed ‘charge blockiness’, into physicochemical models of disordered polypeptides has yielded a better understanding of how the intrinsic amino acid sequence of a polypeptide determines the spatiotemporal occurrence of LLPS within a cell. Charge blockiness might also explain why some post-translational modifications segregate within IDRs and how they regulate LLPS. In this Review, we summarise recent progress towards understanding the mechanism and biological roles of charge block-driven LLPS and discuss how this new characteristic parameter of polypeptides offers new possibilities in the fields of structural biology and cell biology.
Introduction
Biomolecules form various biological condensates within the cell that play pivotal roles in cellular homeostasis and responses. The formation and dissolution of such condensates within a cell must be carefully regulated, and collapse or dysregulation of condensates can result in the dysfunction of crucial cellular processes and, in extreme cases, cell death. Liquid–liquid phase separation (LLPS), in which biomolecules form a liquid-state condensate, is involved in the formation of various intracellular membraneless organelles, such as nucleoli and other RNA-containing granules (Banani et al., 2017; Hirose et al., 2023). The major driving forces of LLPS are multivalent interactions among polypeptides (Choi et al., 2020; Gao et al., 2022; Mittag and Parker, 2018; Wang et al., 2018b). Multivalency can be achieved either via multiple stereospecific interactions (Banani et al., 2017; Banjade and Rosen, 2014; Li et al., 2012) or via multiple relatively weak interactions between residues in intrinsically disordered regions (IDRs, also referred to as low-complexity regions), in which the polypeptide lacks secondary or tertiary structure under physiological conditions (Harmon et al., 2017; Teixeira et al., 2005; Wang et al., 2014). In the latter example, interpreted through the so-called ‘stickers and spacers’ framework, ‘stickers’ represent amino acid residues that can interact with their partners through distinct weak interactions, including electrostatic, hydrophobic, π–π and cation–π interactions, occurring between flexible ‘spacer’ regions (Wang et al., 2018b). Thus, the phase behaviour of an IDR largely depends on the composition and sequence of monomers along the chain (for example, the amino acid composition and sequence of a polypeptide) (Quiroz and Chilkoti, 2015; Simon et al., 2017; Weber, 2017).
Several studies using bioinformatics approaches, including machine learning tools, have attempted to correlate amino acid sequence with the structural ensemble of an IDR (a computational model predicting a set of possible conformations of an unstructured protein) and the propensity for LLPS (Chu et al., 2022; Lotthammer et al., 2024). Numerous studies in soft-matter physics have also investigated the correlation between charge distribution along a polymer and its coacervation, which occurs in a polymer solution when the solute separates into condensed and dilute phases. Computational studies using simulated polymers have found that a charged polymer (polyampholyte) with large charge segregation (hereafter termed ‘charge blockiness’) exhibits stronger coacervation than one with a random charge distribution (Hazra and Levy, 2020; Das et al., 2018b; Das et al., 2018a; Lin et al., 2018; Chang et al., 2017; Das and Pappu, 2013). This behaviour has been demonstrated both for complex coacervation, involving two different polymer species with opposite charges (one negative and one positive), and self-coacervation, involving a single polymer with alternating oppositely charged blocks. The sequence–phase relationship has also been investigated using polypeptide chains in vitro and with computational modelling; these studies suggest that charge blockiness influences LLPS of polypeptides in a similar manner to coacervation of simulated polymers (Dinic et al., 2021; Kapelner et al., 2021; Lytle et al., 2019; Martin and Mittag, 2018; Nott et al., 2015). The polypeptide sequence of a protein also affects the properties of its biological condensates, suggesting a mechanism of sequence- and composition-dependent regulation of membraneless organelles (Weber, 2017).
Despite a large amount of evidence supporting the involvement of charge block-driven coacervation in LLPS, a mechanism has not been fully elucidated. Recent studies have reported that many cellular systems utilise charge block-driven LLPS to achieve unique spatiotemporal regulation of processes within the intracellular milieu. A better understanding of the fundamental relationship between amino acid sequence, charge blockiness and phase behaviour of proteins is therefore warranted. In this Review, we introduce the principles of charge block-driven phase separation through molecular dynamics simulations and theoretical approaches. We then summarise recent progress in understanding the mechanism and biological roles of charge block-driven LLPS, with a focus on how post-translational modifications (PTMs) impact charge blockiness and, consequently, protein behaviours.
Mechanism of charge block-driven LLPS
A charge block can be defined as a region of charge segregation along a polymer chain. For polypeptides, this can be visualised by plotting the rolling average charge within a fixed window of amino acids against the amino acid position (Fig. 1A). A charge block can be recognised as a region above or below the neutral (net zero charge) line in this plot. The characteristics of a charge block are determined by the length of this region, which represents the number of amino acids in the block, and the height of the region, which depends on the density of the charged residues (Fig. 1B). Several parameters have been introduced or utilised to quantify charge blockiness. These include κ (Das and Pappu, 2013), which describes the sum of the charge asymmetry along a charged polymer chain; sequence charge decoration (Sawle and Ghosh, 2015), a weighted summation over all pairs of charges along a given sequence; Dseg (Yamazaki et al., 2022), which describes the statistical variation in the distribution of charged monomers; and BLC (Yamazaki et al., 2022), a measure of the segregation of like charges along the polymer chain. In this section, we will introduce and demonstrate the principles of charge block-driven phase separation using a coarse-grained molecular dynamics simulation of charged polymers and theoretical approaches from soft-matter physics.
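The charge plot described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used to draw Fig. 1: the residue charges (Lys/Arg = +1, Asp/Glu = −1, everything else neutral), the centred window that shrinks at the chain ends, and the function names `rolling_charge` and `charge_blocks` are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumed residue charges at neutral pH (illustrative, not taken from the Review):
# Lys/Arg = +1, Asp/Glu = -1, all other residues (including His) = 0.
RESIDUE_CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def rolling_charge(sequence: str, window: int = 5) -> List[float]:
    """Mean charge in a centred window around each position (window shrinks at the ends)."""
    charges = [RESIDUE_CHARGE.get(aa, 0.0) for aa in sequence]
    half = window // 2
    profile = []
    for i in range(len(charges)):
        lo, hi = max(0, i - half), min(len(charges), i + half + 1)
        profile.append(sum(charges[lo:hi]) / (hi - lo))
    return profile


def charge_blocks(profile: List[float]) -> List[Tuple[int, int, int, float]]:
    """Return (start, length, sign, height) for each run above or below the neutral line.

    length: number of consecutive positions with the same sign (block length l)
    height: largest absolute rolling charge within the run (block height h)
    """
    blocks = []
    start, sign = 0, 0
    for i, value in enumerate(profile + [0.0]):  # the trailing 0.0 closes the final run
        s = 1 if value > 0 else -1 if value < 0 else 0
        if s != sign:
            if sign != 0:
                segment = profile[start:i]
                blocks.append((start, i - start, sign, max(abs(v) for v in segment)))
            start, sign = i, s
    return blocks


# Toy sequence (hypothetical): a positive block, a neutral linker and a negative block.
profile = rolling_charge("KKKKKGGGGGEEEEE", window=1)
print(charge_blocks(profile))
```

With a window of one residue the profile is simply the per-residue charge; wider windows smooth the profile and merge short runs, so the apparent block length and height depend on the chosen window.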
Fig. 1. Defining charge blocks within polymer molecules. (A) A schematic diagram of charge blocks in a representative polymer (polyampholyte) made up of positively charged (blue), negatively charged (red) and neutral (grey) monomers (top). The corresponding charge plot (bottom) shows the rolling average charge within a fixed window of monomers (y-axis) against the monomer (e.g. amino acid) position (x-axis). Charge blocks are represented as areas above (shaded blue) or below (shaded red) the neutral charge line (y=0). (B) Charge blocks are characterised by the length (l), which depends on the number of amino acid residues within a block, and height (h), which depends on the charge density within a block.
Simulation approach
To understand how the propensity of a polymer to undergo LLPS is dependent on its monomer sequence and charge block pattern, we conducted molecular dynamics simulations in which we compared the phase diagrams of polyampholytes with different monomer sequences (see Box 1 for a detailed description of the model). Four examples of electrically neutral polyampholytes (with a net charge of zero per chain) comprising 30 monomers (N=30) with differing charge sequences are shown in Fig. 2A (Seq1, Seq2, Seq3 and Seq15). Each monomer carries either +1 or −1 elementary charge. Typical phase diagrams obtained from the simulations for solutions of these four polyampholytes are shown in Fig. 2B. Notably, polyampholytes with different charge sequences produced different phase diagrams, indicating that the propensity of the polyampholytes to undergo LLPS depends on the monomer sequence. Furthermore, we observed a clear correlation between charge blockiness and the propensity for LLPS: polyampholytes with a higher degree of charge blockiness were more prone to phase separation, which is consistent with previous studies (Chang et al., 2017; Das et al., 2018b; Dinic et al., 2021; Lin et al., 2018; McCarty et al., 2019), and the sequence with the least charge blockiness (Seq1) did not show phase separation in our simulations.
Box 1. Simulation model
Here, we detail the model used in the molecular dynamics simulations discussed in this Review. The monomers (N=30) in each polyampholyte were linearly connected by a finite extensible non-linear elastic potential (Kremer and Grest, 1990). In addition to this bonding potential between adjacent monomers, the monomers interacted with each other via a screened Coulomb potential (Landau and Lifshitz, 1980), also called the Yukawa or Debye–Hückel potential, and a purely repulsive Lennard-Jones potential (Kremer and Grest, 1990). Unlike the bonding potential, these two non-bonded potentials acted between any pair of monomers, whether in the same chain or in different chains of the system. The purely repulsive Lennard-Jones potential represents the excluded volume effect, which defines the volume around a monomer from which other monomers are excluded due to the presence of the first monomer. The particle diameter, σ, is defined in this Lennard-Jones potential, and each particle has a volume v0≃σ^3. Electrostatic interactions between charged monomers were described by the screened Coulomb potential, in which the Debye screening length was fixed at 2.0σ to mimic physiological salt conditions.
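The three potentials named above can be written out explicitly. The sketch below uses their standard textbook forms; the spring constant (30 ε/σ^2) and maximum bond extension (1.5σ) are the usual Kremer–Grest values and are assumptions here, as is setting the Lennard-Jones energy scale equal to kBT, because the text does not state them. Only the screening length of 2.0σ is taken from the box.

```python
import math

SIGMA = 1.0          # monomer diameter (length unit)
EPSILON = 1.0        # Lennard-Jones energy scale, set equal to kBT here (assumption)
K_FENE = 30.0        # FENE spring constant in EPSILON/SIGMA**2 (assumed Kremer-Grest value)
R0_FENE = 1.5        # FENE maximum bond extension in SIGMA (assumed Kremer-Grest value)
DEBYE_LENGTH = 2.0   # Debye screening length in SIGMA (value stated in Box 1)
R_CUT = 2.0 ** (1.0 / 6.0) * SIGMA  # cutoff that leaves only the repulsive part of LJ


def fene_bond(r: float) -> float:
    """FENE bonding energy between adjacent monomers; infinite at or beyond R0_FENE."""
    if r >= R0_FENE:
        return math.inf
    return -0.5 * K_FENE * R0_FENE ** 2 * math.log(1.0 - (r / R0_FENE) ** 2)


def repulsive_lj(r: float) -> float:
    """Truncated and shifted Lennard-Jones energy: repulsive below R_CUT, zero beyond it."""
    if r >= R_CUT:
        return 0.0
    sr6 = (SIGMA / r) ** 6
    return 4.0 * EPSILON * (sr6 ** 2 - sr6) + EPSILON


def screened_coulomb(r: float, qi: int, qj: int, bjerrum_length: float) -> float:
    """Screened Coulomb (Yukawa) energy in units of kBT for monomer charges qi and qj."""
    return bjerrum_length * qi * qj * math.exp(-r / DEBYE_LENGTH) / r


# Energy of an oppositely charged pair at contact (r = SIGMA) for lB = 0.9 SIGMA.
print(screened_coulomb(1.0, +1, -1, bjerrum_length=0.9))
```

In this form the Bjerrum length sets the strength of the electrostatic interaction, which is why it can serve as the interaction axis of a phase diagram.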
The molecular architectures of the four species Seq1, Seq2, Seq3 and Seq15 are illustrated in Fig. 2A. Each chain of these species consisted of 15 positively charged monomers and 15 negatively charged monomers. Positive and negative blocks are alternately positioned in each chain of the Seqj species, where one positive (or negative) block is defined as a run of consecutive positively (or negatively) charged monomers. For example, Seq3 contains repeating blocks of three positively charged monomers followed by three negatively charged monomers. Chains of the Seq15 species are equivalent to those of diblock copolymers, which are polymers composed of only two blocks.
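Block sequences of this kind are easy to generate programmatically. The helper below is a hypothetical sketch (the names `block_sequence` and `block_lengths` are not from the Review) that builds a neutral chain from whole pairs of positive and negative blocks. It reproduces the patterns described for Seq1, Seq3 and Seq15; a 30-monomer chain cannot be tiled by whole pairs of two-monomer blocks, so the exact pattern of Seq2 is not guessed here and should be taken from Fig. 2A.

```python
from typing import List


def block_sequence(block_length: int, n_monomers: int = 30) -> List[int]:
    """Charges (+1/-1) of a chain made of alternating positive and negative blocks.

    Only block lengths for which 2 * block_length divides n_monomers give a
    neutral chain of whole blocks; other cases are rejected rather than guessed.
    """
    if n_monomers % (2 * block_length) != 0:
        raise ValueError("chain cannot be tiled by whole +/- block pairs")
    repeat = [+1] * block_length + [-1] * block_length
    return repeat * (n_monomers // (2 * block_length))


def block_lengths(charges: List[int]) -> List[int]:
    """Lengths of consecutive like-charge runs along the chain."""
    runs = [1]
    for previous, current in zip(charges, charges[1:]):
        if current == previous:
            runs[-1] += 1
        else:
            runs.append(1)
    return runs


for j in (1, 3, 15):
    chain = block_sequence(j)
    print(f"block length {j}: net charge {sum(chain)}, blocks {block_lengths(chain)}")
```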
Fig. 2. Sequence-dependent phase behaviour of polyampholytes. (A) Schematic representations of four polyampholyte chains with different levels of charge blockiness (Seq1, Seq2, Seq3 and Seq15). (B) A phase diagram showing typical examples of binodal curves, also referred to as coexistence curves, of the four molecular species described in A. The x-axis (φ) indicates the volume fraction, which represents the polymer concentration in the system, and the y-axis (Bjerrum length, lB) represents the strength of interaction between monomers. The area above each curve is the parameter region where two phases appear. The volume fractions of polyampholyte within the condensed phase [φ(c)] and dilute phase [φ(d)] for Seq2 at a fixed value of lB=0.9σ (horizontal dotted line) are indicated as a representative example. Note that, assuming σ≈0.8 nm (typical size of an amino acid), the value lB=0.9σ corresponds to the Bjerrum length in water at room temperature. (C) To analyse the conformations of the four molecular species in the condensed and dilute phases, the average spatial distance between two monomers, r(s), was plotted against the positional distance along the polymer chain (s). The results for the four different sequences in the condensed (left) and dilute (right) phases at lB=0.9σ are shown. The result for Seq1 in the condensed phase is not shown because this polyampholyte does not show phase separation in this condition. The average spatial distance r(s) of an ideal chain (Kremer and Grest, 1990), which is defined as r(s)/σ=c1×s^(1/2), is represented by a dotted line. The numerical constant c1=1.26 was adopted from an earlier work (Kremer and Grest, 1990). In the condensed phase, the analysed sequences display nearly ideal chain conformations (random coil) regardless of charge block pattern. In the dilute phase, sequences with higher degrees of charge blockiness demonstrate more compact, globular conformations. (D) Schematic illustrations of polymer conformations of the four sequences described in A. The results shown in panels B and C are depicted. (E) A diagram illustrating the relationship between mesh sizes of the polymers Seq2 and Seq3 in dilute [ξ(d)] and condensed [ξ(c)] phases. Seq3, with a higher degree of charge blockiness, is predicted to have a smaller mesh size, higher compaction and, thus, increased propensity for LLPS.
Next, to understand why sequences with higher degrees of charge blockiness have greater propensity for LLPS, we performed conformational analysis of the polyampholyte molecules in the dilute (outside the condensate) and condensed (inside the condensate) phases. This provided useful information on the molecular mechanism of charge block-driven LLPS. Fig. 2C shows the average spatial distance r(s) between two points along the chain that are s monomers apart. In the condensed phase, the chain conformations of the polyampholyte molecules followed ideal chain statistics, representing random coil conformations. Notably, this tendency was sequence independent, suggesting that molecules in the condensed phase show near-random coil configurations regardless of charge blockiness. In contrast, in the dilute phase the chain conformations showed a strong sequence dependency. The molecule that did not show phase separation (Seq1) was predicted to have a random coil conformation, whereas those that showed phase separation (Seq2, Seq3 and Seq15) were predicted to have more compact (globular) conformations in the dilute phase (Fig. 2C,D). Notably, the degree of compaction increased with charge blockiness, such that the overall globule size R(d), where the superscript d indicates the dilute phase, was smaller for sequences with greater charge blockiness. Such a correlation between the single-chain molecular compactness and the propensity for phase separation has been observed not only in physiological salt conditions, as used in our simulations, but also in a salt-free solution (Lin and Chan, 2017; McCarty et al., 2019). Intuitively, both the compaction of isolated polymers and phase separation are driven by the same attractive interactions; thus, this relationship seems to be a general characteristic of polyampholyte systems that might also be relevant to the LLPS of proteins in solution.
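The quantity r(s) can be computed directly from monomer coordinates. The sketch below is one possible implementation, not the analysis code behind Fig. 2C: it takes r(s) to be the arithmetic mean of the Euclidean distance over all monomer pairs separated by s bonds (a root-mean-square average would be an equally plausible reading), and the function names are hypothetical. The ideal-chain reference uses c1=1.26 as quoted in the caption of Fig. 2.

```python
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def mean_internal_distance(chains: List[List[Point]]) -> Dict[int, float]:
    """Average spatial distance r(s) between monomers s bonds apart, pooled over chains."""
    totals: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for chain in chains:
        for i in range(len(chain)):
            for j in range(i + 1, len(chain)):
                s = j - i
                totals[s] = totals.get(s, 0.0) + math.dist(chain[i], chain[j])
                counts[s] = counts.get(s, 0) + 1
    return {s: totals[s] / counts[s] for s in sorted(totals)}


def ideal_chain_distance(s: int, sigma: float = 1.0, c1: float = 1.26) -> float:
    """Ideal-chain reference r(s) = c1 * sigma * s**(1/2)."""
    return c1 * sigma * math.sqrt(s)


# Sanity check on a fully stretched five-monomer rod, for which r(s) = s exactly.
rod = [(float(i), 0.0, 0.0) for i in range(5)]
for s, r in mean_internal_distance([rod]).items():
    print(s, r, ideal_chain_distance(s))
```

A compact globule shows r(s) levelling off below the ideal-chain line at large s, whereas a random coil follows it.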
Theoretical approaches
The relationship between molecular compactness and propensity for LLPS can also be discussed using theoretical approaches. LLPS occurs when the homogeneous state of a solution becomes unstable or metastable such that phase separation decreases the total free energy of the system (de Gennes, 1979; Doi and See, 1995). Theoretical descriptions of LLPS usually require some sort of simplification of the complex details of the system, and several theoretical approaches have been employed, each of which differs in the level of coarse graining and the approximations involved (Lin et al., 2018). One of the approaches often adopted to describe LLPS of protein solutions is the Flory–Huggins theory (described in Box 2). The advantage of the Flory–Huggins theory lies in its simple structure, which neglects the chain connectivity effect in evaluating the interaction energy in a polymer solution (the sum of monomer–monomer, monomer–solvent and solvent–solvent interactions). This approach provides a miscibility phase diagram in the plane spanned by protein concentration (or volume fraction) φ and the effective interaction strength χ (see Box 2) (de Gennes, 1979; Deviri and Safran, 2021; Rubinstein and Colby, 2003). Another well-known approach is the random phase approximation, which, unlike the Flory–Huggins theory, takes into account the correlation effect arising from chain connectivity. Therefore, by explicitly evaluating the effect of the charge sequence in an approximated way, this method has been demonstrated to provide qualitatively correct LLPS phase diagrams of polyampholyte solutions that include the correlation between charge blockiness and phase-separation propensity (Borue and Erukhimovich, 1988; Dinic et al., 2021; Kudlay and Olvera De La Cruz, 2004; Lin and Chan, 2017). Here we employ the Flory–Huggins framework to analyse our simulation results.
Box 2. Flory–Huggins theory
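A standard form of the Flory–Huggins free energy density f that matches the description of Eqn 1 in this box (a translational entropy term followed by a two-body attraction and a three-body repulsion) is sketched below. The numerical prefactors and the truncation at third order in φ are assumptions taken from textbook treatments (de Gennes, 1979; Rubinstein and Colby, 2003), with N denoting the number of monomers per chain; they should not be read as a transcription of the original equation.

```latex
\frac{f\,v_0}{k_{\mathrm{B}}T} \simeq
  \frac{\phi}{N}\ln\phi
  \;-\; \frac{1}{2}\,(2\chi - 1)\,\phi^{2}
  \;+\; \frac{1}{6}\,\phi^{3}
```

This form follows from expanding the full Flory–Huggins mixing free energy, (φ/N) ln φ + (1−φ) ln(1−φ) + χφ(1−φ), for small φ and dropping the terms linear in φ, which do not affect phase coexistence.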
where kBT and φ (=cv0) are the thermal energy and the volume fraction of the protein (with c being the monomer concentration), respectively (de Gennes, 1979; Doi and See, 1995; Rubinstein and Colby, 2003). The Flory–Huggins theory was originally developed for homopolymeric systems. Thus, when applying the Flory–Huggins description to protein solutions, the heteropolymeric nature of the protein is not explicitly considered, and as such the strength of effective interactions between amino acids is represented by a single parameter χ. Analysis of Eqn 1 reveals that a homogeneous solution (with overall volume fraction φ0) phase separates into a condensate [a protein-rich dense phase with volume fraction φ=φ(c)>φ0] and a supernatant [a dilute phase with volume fraction φ=φ(d)<φ0] when χ is larger than a critical value χc. The box figure below is a schematic illustration of the resulting binodal curve, which is shown in black. The binodal curve, representing the phase boundary, is determined by the equality of the chemical potential μ=∂f/∂c (with f being the free energy density of Eqn 1) and the osmotic pressure Π=cμ−f between the two phases. The critical point (φ=φc, χ=χc; marked by a circle in the box figure), where the difference between the two phases is infinitesimal, separates the binodal curve into the dilute (left) and dense (right) branches.
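The behaviour of the dense branch at high φ can be obtained with a short calculation. The sketch below assumes a free energy whose two-body attraction is −(1/2)(2χ−1)φ^2 and whose three-body repulsion is (1/6)φ^3 in units of kBT/v0, which is the standard truncated Flory–Huggins form; the prefactor 3/2 depends on that assumption, whereas the scaling with 2χ−1 does not. Here φ^{(c)} denotes the condensate volume fraction written φ(c) in the text.

```latex
\frac{\Pi\,v_0}{k_{\mathrm{B}}T}
  = \phi\,\frac{\partial}{\partial\phi}\!\left(\frac{f\,v_0}{k_{\mathrm{B}}T}\right) - \frac{f\,v_0}{k_{\mathrm{B}}T}
  \simeq \frac{\phi}{N} - \frac{1}{2}\,(2\chi-1)\,\phi^{2} + \frac{1}{3}\,\phi^{3},
\qquad
\Pi \approx 0,\ N \gg 1
\;\Longrightarrow\;
\phi^{(c)} \simeq \frac{3}{2}\,(2\chi - 1) \;\sim\; 2\chi - 1
```

The condition Π≈0 expresses that the condensate coexists with a very dilute phase whose osmotic pressure is negligible.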


This relationship results from the balance between the two-body attraction and three-body repulsion represented by the second and third terms in Eqn 1, respectively. The asymptotic behaviour of the binodal curve at high φ given in Eqn 2 is represented by a grey dashed line.
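The onset of phase separation in such a model can be located numerically. The sketch below assumes the truncated free energy (φ/N) ln φ − (1/2)(2χ−1)φ^2 + (1/6)φ^3, in units of kBT/v0, which is a textbook form and an assumption about the box's equations. For this form the spinodal condition ∂²f/∂φ²=0 gives χ=(1/2)[1+φ+1/(Nφ)], whose minimum is the critical point. The truncation is meaningful only for φ≪1, and the function names are hypothetical.

```python
import math
from typing import Tuple


def spinodal_chi(phi: float, n: int) -> float:
    """Interaction strength at which d2f/dphi2 = 0 for the truncated free energy."""
    return 0.5 * (1.0 + phi + 1.0 / (n * phi))


def critical_point(n: int) -> Tuple[float, float]:
    """(phi_c, chi_c): the minimum of the spinodal curve."""
    phi_c = 1.0 / math.sqrt(n)
    chi_c = 0.5 + 1.0 / math.sqrt(n)
    return phi_c, chi_c


def dense_branch_asymptote(chi: float) -> float:
    """Zero-osmotic-pressure estimate of the condensate volume fraction (valid for phi << 1)."""
    return 1.5 * (2.0 * chi - 1.0)


phi_c, chi_c = critical_point(30)  # N = 30 monomers, as in the simulated chains
print(f"phi_c = {phi_c:.3f}, chi_c = {chi_c:.3f}")
print(f"dense-branch estimate at chi = 0.75: {dense_branch_asymptote(0.75):.2f}")
```

Below χc the solution is stable at every concentration; above it, the spinodal curve bounds the region in which the homogeneous state is unstable, and the binodal lies outside that region.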
Fig. 2B shows that the region of the phase diagram where a two-phase state occurs widens as the charge blockiness increases (see the dotted line at a fixed interaction strength of lB=0.9σ). Here, the major difference among sequences is the volume fraction of polymer found inside the respective condensates. The volume fraction inside the condensate φ(c) can be connected to the size of isolated globules in the dilute phase R(d) because, based on the analysis of the free energy (Box 2), the mesh size inside the condensate ξ(c) and the mesh size inside the globules ξ(d) (see Box 3 for a detailed explanation of mesh sizes) are expected to be equal (Fig. 2D,E). Whereas ξ(c) is expected to be inversely proportional to φ(c), R(d) increases with ξ(d). Combining these two relations, we find the following: a sequence with more charge blockiness has a smaller R(d) (see the simulation results in Fig. 2C), which corresponds to a smaller ξ(d)=ξ(c) and, consequently, a larger φ(c). Thus, the modulation of charge blockiness in amino acid sequences provides an efficient way to control the phase behaviour of protein solutions. In living cells, charge blockiness can be modulated via PTMs. We will next overview recent advances in the understanding of how cells might use PTMs to exert spatiotemporal control over LLPS.
Box 3. Mesh size
Inside a condensate, protein molecules obey nearly ideal (i.e. Gaussian) conformational statistics and strongly overlap with each other, which corresponds to the semi-dilute θ solution in polymer physics terminology. Such an overlapped state is characterised by the correlation length of the concentration fluctuation, or the so-called ‘mesh size’ ξ(c)≃σ/φ(c) (de Gennes, 1979; Rubinstein and Colby, 2003). This results from the ideal-chain relation ξ(c)≈σ(g(c))^(1/2) combined with the space-filling condition that there are only g(c) monomers from a single protein inside the mesh and that the condensate is made of the dense piling of such meshes. In contrast, proteins in the dilute phase are isolated from each other and adopt compact globular conformations, which can be described by the same free energy as in Eqn 1 (see Box 2), without the first term. This indicates that the volume fraction φ(d) inside a single globular protein is given by Eqn 2 (see Box 2), which allows the size of the globule to be determined from the condition that its N monomers fill the globule at this volume fraction: R(d)≈σ(N/φ(d))^(1/3) (Eqn 3).
A physical picture of this globular conformation can be inferred by defining the length scale ξ(d)≃σ(2χ−1)^(−1). This length, called the thermal blob size, results from the ideal-chain relation ξ(d)≈σ(g(d))^(1/2) combined with the condition that the attraction accumulated over the g(d) monomers of a blob is comparable to the thermal energy kBT, which imply that inside a thermal blob [with g(d) monomers], the attraction, which drives compaction and phase separation, is a negligible perturbation. At a larger scale, attraction prevails over thermal fluctuations, causing the thermal blobs to adhere to each other. One can rederive Eqn 3 by viewing a globule as a compact stacking of thermal blobs, that is, R(d)≈ξ(d)(N/g(d))^(1/3).
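The two routes to the globule size can be checked against each other numerically. The sketch below works at the level of scaling relations only: all prefactors of order one are dropped, the volume fraction inside the globule is taken as 2χ−1, and the mapping between charge blockiness and an effective χ is left open. These choices, and the function names, are assumptions made for illustration.

```python
from typing import Tuple


def thermal_blob(chi: float, sigma: float = 1.0) -> Tuple[float, float]:
    """Return (g, xi): monomers per thermal blob and blob size, for chi > 1/2."""
    if chi <= 0.5:
        raise ValueError("no net attraction: chi must exceed 1/2")
    tau = 2.0 * chi - 1.0
    return tau ** -2, sigma / tau


def globule_size_from_density(n: int, chi: float, sigma: float = 1.0) -> float:
    """R from n monomers filling a volume ~R**3 at volume fraction phi ~ 2*chi - 1."""
    phi = 2.0 * chi - 1.0
    return sigma * (n / phi) ** (1.0 / 3.0)


def globule_size_from_blobs(n: int, chi: float, sigma: float = 1.0) -> float:
    """R from the compact stacking of n/g thermal blobs of size xi."""
    g, xi = thermal_blob(chi, sigma)
    return xi * (n / g) ** (1.0 / 3.0)


# Stronger effective attraction gives a smaller blob and a more compact globule.
for chi in (0.6, 0.75, 0.9):
    g, xi = thermal_blob(chi)
    print(chi, round(xi, 3), round(globule_size_from_density(30, chi), 3),
          round(globule_size_from_blobs(30, chi), 3))
```

Both estimates reduce to σN^(1/3)(2χ−1)^(−1/3), so a smaller blob size ξ(d) goes hand in hand with a smaller globule and, through ξ(c)≃σ/φ(c), with a denser condensate.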
Temporal regulation of LLPS coupled to PTMs
Understanding charge block-mediated polymer interactions provides not only a new LLPS-promoting mechanism from a physicochemical point of view but also a novel mechanism for spatiotemporal regulation of LLPS in a biological context. However, most theoretical studies have focused on relatively simple systems containing one (or no more than a few) species of polymer. The phase diagram of such homotypic LLPS reflects only a few parameters, such as the interaction strength (determined by temperature or salt concentration) and polymer concentration. In contrast, cellular environments are heterogeneous and comprise diverse polymers, salts and other biomolecules. Charge blocks can also be dynamically altered by PTMs that either mask or neutralise charged amino acids or add new charged groups to neutral residues (Drazic et al., 2016; Luo, 2018). Understanding the phase behaviour of such heterogeneous systems with an enormous number of parameters is highly complex and challenging. Nevertheless, given that the formation of intracellular membraneless organelles is spatiotemporally regulated (i.e. occurring at a certain location within a cell at a certain time), understanding heterotypic LLPS will be necessary for the further elucidation of fundamental cellular mechanisms. Here, we discuss the possible involvement of charge block-driven LLPS in spatiotemporal regulation of membraneless organelles. The existence of such a mechanism might help resolve the question of whether the amino acid sequence of a polypeptide can intrinsically determine the spatiotemporal occurrence of a condensate within a cell.
PTMs affect LLPS
The regulatory mechanism of LLPS in the intracellular environment is an important topic of research. As the physicochemical environment of the cellular plasma (for example, temperature and salt concentration) largely does not change in vivo, biological cues – such as signalling molecules, intracellular localisation and PTMs – are expected to influence when and where in the cell LLPS occurs. PTMs in particular are of great interest because they play pivotal roles in the regulation of many biological signalling pathways and responses. Indeed, PTMs have been demonstrated to regulate LLPS in vitro and in vivo (Beutel et al., 2019; Greig et al., 2020; Rai et al., 2018; Valverde et al., 2023; Wang et al., 2014; Wang et al., 2018a; Wippich et al., 2013), although the molecular mechanisms have remained elusive.
Acetylation and methylation occur at the amino groups of lysine and arginine residues, and these PTMs change the electrostatic properties of the modified side chain. Acetylation neutralises the +1 charge of the amino group and affects cation–π interactions (interactions between cations and aromatic ring structures), which are a major driving force of LLPS. Methylation of the amino groups of lysine and arginine residues is similarly expected to affect cation–π interactions. In contrast to acetylation, methylation does not remove the +1 charge but masks it. Both of these PTMs affect protein–protein and protein–DNA interactions, as well as the propensity for LLPS (Drazic et al., 2016; Li et al., 2021; Wang et al., 2022).
Phosphorylation occurs at serine, threonine and tyrosine residues and adds a −2 charge to the modified side chain. As these residues are not involved in cation–π or hydrophobic interactions, they affect LLPS via different mechanisms. In a classical structural biological model, the covalent addition of a phosphate group to a specific site of the target protein changes its surface structure and properties, which affects stereospecific protein–protein or enzyme–ligand interactions. These stereospecific effects of phosphorylation can explain how a single phosphorylation event changes protein function. Interestingly, a number of studies using informatics approaches have reported that phosphorylation preferably occurs in IDRs rather than in structured domains (Collins et al., 2008; Darling and Uversky, 2018; Iakoucheva, 2004; Jiménez et al., 2007; Pejaver et al., 2014; Yamazaki et al., 2022), suggesting that most phosphorylation regulates the behaviour of IDRs via a mechanism different from the classical stereospecific model. Given that LLPS is driven by multivalent interactions between IDRs, it is unlikely that a single phosphate group on a long IDR can change the propensity for LLPS. However, evidence suggests that hyperphosphorylation of IDRs might be involved in regulating LLPS.
Bulk effect of phosphorylation on LLPS
Our research group has demonstrated for the first time that hyperphosphorylation can regulate LLPS by changing the charge blockiness of an IDR (Yamazaki et al., 2022). Using quantitative phosphoproteomics to identify proteins that exhibit high levels of phosphorylation during mitosis (mitotic hyperphosphorylation), we have found that the hyperphosphorylation of some nucleolar proteins changes their charge block patterns. The IDR of the nucleolar protein nucleophosmin (NPM1) contains a clear alternating pattern of positive and negative charge blocks during interphase (when it is non-phosphorylated) and has a strong propensity for LLPS. Interestingly, mitotic hyperphosphorylation of NPM1 occurs mostly in the positively charged blocks, which neutralises them. This results in reduced self-coacervation and LLPS of hyperphosphorylated NPM1 in an in vitro assay (Yamazaki et al., 2022). Interestingly, the opposite effect is observed for another IDR-rich nucleolar phosphoprotein, antigen Kiel 67 (Ki-67, also known as MKI67). The repeat domain of Ki-67, which is predicted to be highly disordered, shows several positive charge blocks when non-phosphorylated during interphase. In this case, mitotic hyperphosphorylation occurs in relatively charge-neutral regions and thus produces new negative charge blocks. This generates alternating positive and negative charge blocks and enhances the self-coacervation of Ki-67 (Yamazaki et al., 2022).
The results described above demonstrate that hyperphosphorylation can regulate LLPS as a result of the clear relationship between alternating charge blocks and the propensity for self-coacervation. These findings suggest that when phosphorylation abolishes alternating charge blocks, LLPS is inhibited, and when phosphorylation enhances charge blockiness, LLPS is promoted. This might explain the statistical evidence that phosphorylation, as well as other PTMs, of multiple residues occurs cooperatively at proximal residues (Freschi et al., 2014; Li et al., 2009; Pejaver et al., 2014; Schweiger and Linial, 2010). Thus, although a single phosphorylation might not affect the charge block pattern of an IDR, multiple phosphorylations occurring at proximal residues can affect charge blockiness and change the propensity for LLPS. This mechanism is in clear contrast to the conventional model of phosphorylation-dependent regulation of protein function, in which the addition of a phosphate group to a specific residue positively or negatively changes stereospecific protein–protein or protein–ligand interactions.
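The two opposite outcomes of clustered phosphorylation can be illustrated with toy sequences. The sketch below is not based on the NPM1 or Ki-67 sequences: both peptides are hypothetical, the residue charges (Lys/Arg = +1, Asp/Glu = −1) are an assumed simplification, and each phosphate is assigned a charge of −2 as stated in the text.

```python
from typing import Dict, Iterable, List

BASE_CHARGE: Dict[str, float] = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}
PHOSPHO_CHARGE = -2.0  # charge added per phosphorylated Ser/Thr/Tyr, as stated in the text


def residue_charges(sequence: str, phosphosites: Iterable[int] = ()) -> List[float]:
    """Per-residue charges; phosphosites are 0-based indices of modified S/T/Y residues."""
    charges = [BASE_CHARGE.get(aa, 0.0) for aa in sequence]
    for site in phosphosites:
        if sequence[site] not in "STY":
            raise ValueError(f"residue {site} ({sequence[site]}) is not S, T or Y")
        charges[site] += PHOSPHO_CHARGE
    return charges


def window_net_charge(charges: List[float], start: int, length: int) -> float:
    """Net charge of a contiguous stretch of the chain."""
    return sum(charges[start:start + length])


# Case 1 (hypothetical): phosphosites inside a positive block neutralise it.
seq_a = "KSKKSKKSK" + "GGG" + "EEEEEE"
print(window_net_charge(residue_charges(seq_a), 0, 9),
      window_net_charge(residue_charges(seq_a, [1, 4, 7]), 0, 9))

# Case 2 (hypothetical): phosphosites in a neutral stretch create a new negative block.
seq_b = "KKKKKK" + "GSGSGS"
print(window_net_charge(residue_charges(seq_b), 6, 6),
      window_net_charge(residue_charges(seq_b, [7, 9, 11]), 6, 6))
```

In the first case the alternating pattern of positive and negative blocks is lost, whereas in the second it is created, which mirrors the inhibiting and promoting effects described above.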
LLPS-coupled robustness of the system
An advantage of charge block-dependent regulation of LLPS over stereospecific regulation of enzymatic activity might be related to the robustness of the system. In stereospecific regulation, the two structural states of a protein (for example, phosphorylated and dephosphorylated) correspond to two functional states (for example, catalytically ‘on’ or ‘off’, or bound or unbound). Thus, a phosphorylation–dephosphorylation cycle can reversibly convert protein function between these two discontinuous states (Fig. 3A, top). In a hypothetical system containing a finite pool of such a substrate protein, the population of protein in the ‘on’ state increases as phosphorylation proceeds. This results in a clear linear relationship between protein function and phosphorylation in the entire system until the response reaches a plateau (Fig. 3A, bottom), indicating that two discontinuous states of the substrate protein produce a continuous response in the system. In contrast, in charge block-driven LLPS, phosphorylation produces multiple phosphorylated states of a substrate protein. For example, a protein with three phosphosites has eight different possible phosphorylated states. This indicates that as phosphorylation proceeds, the system contains an increasingly heterogeneous ensemble of phosphorylated protein states (Fig. 3B, top). In contrast to homogeneous polymer solutions, our understanding of the phase behaviour of such heterogeneous solutions remains relatively incomplete. However, it can be speculated that the bulk amount of phosphosites present in the system can affect the phase boundary. At a low bulk amount of phosphosites, the system is in a single phase, but it separates into two phases when the bulk amount of phosphosites exceeds a threshold level (Fig. 3B, bottom).
This mechanism confers both robustness and a drastic phase transition to the system depending on the range of phosphorylation, which is in clear contrast to the single-site regulation system. These properties might be fundamental to the robustness and noise-cancelling mechanisms of many biological systems. For example, expression noise, the natural variation in expression levels of intracellular proteins, originates from fluctuations in gene expression rates and the activities of transcription- and translation-related proteins. LLPS has been speculated to buffer cells against expression noise by compartmentalising different proteins within condensates (Deviri and Safran, 2021; Stoeger et al., 2016). In addition, phosphorylation-dependent regulatory signals contain various types of noise, originating partly from the stochastic reactions catalysed by multiple kinases and phosphatases (Aledo, 2018; Cohen, 2000; Holmberg et al., 2002). Thus, charge block-driven LLPS associated with multisite phosphorylation might contribute robustness to such a noisy system.
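The contrast between the two regulatory modes can be sketched with a deliberately simplified model. The step-function response, the critical value of 0.6 and the example input values are assumptions made for illustration; they are not measurements or a model taken from the cited work.

```python
# Toy comparison of a single-site switch with a threshold (phase-transition) response.
# Assumptions (illustrative only): inputs are fractions between 0 and 1, the
# threshold response is an idealised step, and the critical value is arbitrary.
from itertools import product
from statistics import pstdev


def phospho_states(n_sites):
    """All combinations of phosphorylated (1) and non-phosphorylated (0) sites."""
    return list(product((0, 1), repeat=n_sites))


def single_site_response(fraction_phosphorylated):
    """Structured protein: output equals the fraction of molecules in the 'on' state."""
    return fraction_phosphorylated


def threshold_response(bulk_phosphosites, critical=0.6):
    """Disordered protein: no condensate below the critical level, condensate above it."""
    return 1.0 if bulk_phosphosites >= critical else 0.0


print(len(phospho_states(3)))  # 8 states for three phosphosites

# Hypothetical fluctuating kinase input, all values below the assumed critical level.
noisy_input = [0.25, 0.30, 0.35, 0.28, 0.32]
linear_output = [single_site_response(x) for x in noisy_input]
step_output = [threshold_response(x) for x in noisy_input]

print(pstdev(linear_output))  # the input noise passes straight through
print(pstdev(step_output))    # 0.0: the output is flat below the threshold
```

Under these assumptions, fluctuations in the input are transmitted directly by the single-site switch but are absorbed by the threshold response as long as the input stays on one side of the critical level.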
Fig. 3. Behaviours of heterogeneous systems of polymers. (A,B) Responses of a hypothetical system containing (A) a structured phosphoprotein or (B) a disordered phosphoprotein. In A, a single phosphorylation (P) of the substrate protein by a protein kinase switches the protein function from ‘off’ to ‘on’. The population of substrate protein in the ‘on’ state increases as the kinase activity continues until all of the substrate is in the ‘on’ state. This results in a linear relationship between kinase activity and the response (output) of the system. In B, with charge block-driven LLPS, the substrate protein undergoes multiple phosphorylations. As the kinase reaction proceeds, proteins with several different phosphorylation states are produced. A condensate does not form when the bulk amount of phosphosites is below a certain level (the critical concentration). When the bulk amount of phosphosites exceeds this threshold, protein condensates start to form (phase transition). This system results in a non-linear relationship between kinase activity and the response, wherein a steady state persists despite increasing kinase activity until the critical concentration of phosphosites is reached. (C) Several mechanisms of mixing or exclusion (de-mixing) for two species of protein liquid droplets. In heterogeneous systems such as the cytoplasm, physical properties such as the type of molecular interaction driving condensate formation (top), the elasticity of condensates (middle) and the degree of charge blockiness (bottom) are speculated to influence whether condensates comprised of different biomolecules mix or exclude one another.
Spatial control of condensates within a cell
In addition to temporal regulation, spatial regulation and intracellular localisation of biological condensates are important areas of research where a large gap remains between in vitro and in vivo studies. Given that many proteins contain both structured and disordered domains, it can be speculated that the structured domains tend to stereospecifically interact with a certain intracellular architecture, scaffold or protein complex, whereas the interactions between the disordered domains tend to drive the macroscopic behaviour of the proteins at the same subcellular location. This pattern can be observed, for example, in condensate formation near the membrane and the generation of membrane curvature by proteins such as CIP4 (also known as TRIP10), which is a multidomain protein with a classic F-BAR curvature-sensing domain and IDRs (Yu and Yoshimura, 2023; Zeno et al., 2018).
Another question related to the spatial regulation of LLPS regards the mechanism of incorporation or exclusion of two or more different species of condensates (the ‘mixed-or-excluded’ problem) (Fig. 3C) (Konishi and Yoshimura, 2020). When multiple polymers with different properties coexist in a system (such as the cytoplasm), they can coexist in the same liquid (dilute) phase, exclude each other from condensates (excluded) or encapsulate the other(s) within the same condensate (mixed) (Latham and Zhang, 2022; Liu et al., 2023; Lu and Spruijt, 2020). Such complex behaviour of multicomponent systems can be partly understood by analysing the phase diagrams of individual components. However, the system usually exhibits more complicated phase behaviour than the sum of these individual phase behaviours. In living cells, many different biological condensates assemble and disassemble in a spatially well-defined manner without mixing. Given that LLPS is driven by weak and promiscuous interactions, a specific mechanism likely exists that coordinates the spatial localisation of various condensates within a cell.
Several possible mechanisms have been proposed to explain the mixed-or-excluded problem. One mechanism involves the tendency of protein droplets formed via electrostatic interactions to exclude droplets formed via hydrophobic interactions (Konishi and Yoshimura, 2020), indicating that the type of molecular interaction that drives LLPS of a given condensate determines whether two or more condensates will mix. Another possible mechanism suggests that the physical properties of droplets are also a determinant of mixing or exclusion. Each condensate has different elastic properties depending on the physicochemical properties of its polymers, and elastic properties also change with time and other environmental factors (Franzmann et al., 2018; Jawerth et al., 2020; Shin and Brangwynne, 2017; Wang et al., 2018b). It has therefore been speculated that ‘soft’ droplets with lower elasticity might tend to fuse with each other more easily than ‘hard’ droplets with high elasticity.
A recent study has demonstrated that the charge block pattern of IDRs can also determine the partitioning pattern of condensates. Lyons et al. have demonstrated that the transcriptional regulator mediator of RNA polymerase II transcription subunit 1 (MED1) co-segregates with RNA polymerase II and excludes other transcriptional regulators (Lyons et al., 2023). The IDR of MED1 comprises alternating charge blocks, and changing this charge pattern affects the co-segregation of MED1 with RNA polymerase II and alters the gene activation profile. Interestingly, a chimeric protein in which the IDR of MED1 is replaced with an IDR of differing amino acid sequence but carrying similarly patterned charge blocks still co-segregates with RNA polymerase II and produces a similar gene expression profile. These findings suggest that charge blockiness also represents a novel mechanism for the spatial coordination of condensate formation and further underscore the biological significance of charge block-driven LLPS.
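The idea that two IDRs can differ in sequence yet share a charge pattern can be made concrete with a small sketch. The two sequences below are hypothetical and equal in length, and the per-residue comparison is an assumption made for illustration; it is not the method used by Lyons et al.

```python
# Toy sketch: sequence identity versus charge-pattern agreement for two IDRs.
# Assumptions (illustrative only): equal-length hypothetical sequences and a
# position-by-position comparison without alignment.

def charge_pattern(seq):
    """Map each residue to '+', '-' or '0' according to its side-chain charge."""
    signs = {"K": "+", "R": "+", "D": "-", "E": "-"}
    return "".join(signs.get(aa, "0") for aa in seq)


def fraction_matching(a, b):
    """Fraction of positions at which two equal-length strings agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


idr_a = "KKRKKGSGSEEDEEGSGS"  # hypothetical IDR
idr_b = "RRKRRAPAQDDEDDAPAQ"  # different residues, same arrangement of charges

print(charge_pattern(idr_a))
print(charge_pattern(idr_b))
print(fraction_matching(idr_a, idr_b))                                   # sequence identity
print(fraction_matching(charge_pattern(idr_a), charge_pattern(idr_b)))  # charge-pattern agreement
```

Here the two toy sequences share no identical residues, yet their charge patterns agree at every position, which is the kind of relationship the chimeric IDR experiment described above relies on.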
Concluding remarks and future directions
Charge block-driven LLPS is a novel macromolecular behaviour that influences protein function, particularly that of proteins containing IDRs. Recent studies have demonstrated that it regulates an increasing number of proteins and cellular events, although the theoretical study of these phenomena has only just begun. Here, we have provided an overview of work in the fields of cell biology, biochemistry and soft-matter physics that has begun to elucidate the mechanism and biological roles of charge block-driven LLPS, as well as how this behaviour might be modulated by PTMs such as bulk phosphorylation. Statistical analysis has demonstrated that nucleolar proteins contain higher fractions of IDRs (∼20%) than cytosolic proteins (∼14%) (Stenström et al., 2020) and that many of these proteins contain charge blocks (Yamazaki et al., 2022). Furthermore, the human proteome contains ∼50,000 phosphosites, ∼70% of which are located within IDRs (Darling and Uversky, 2018; Pejaver et al., 2014; Yamazaki et al., 2020). Taken together, these observations suggest that charge block-driven LLPS is involved in many diverse cellular events. Charge blocks likely play a key role in the phosphorylation-dependent regulation of proteins, and charge block-driven LLPS associated with hyperphosphorylation provides a novel regulatory mechanism that is distinct from conventional models of stereospecific regulation of protein activity or protein–protein interactions. Determining whether other PTMs that affect the charge properties of the substrate protein (e.g. methylation and acetylation) also regulate LLPS by changing charge blockiness will be the target of future studies. As discussed above, changes in charge blockiness can also influence the molecular composition of mixed condensates.
In addition, it has recently been demonstrated that biomolecular condensates can be composed of multiple distinct layers separated by interfaces with differing electrical states, with each layer containing different conformations of polymers (Farag et al., 2022; Hoffmann et al., 2023). Charge blocks might thus also play roles in the construction of such intra-condensate polymer networks. Further research is necessary to clarify the complex biological roles of this fascinating phenomenon.
Footnotes
Funding
This work was supported by Japan Society for the Promotion of Science (JSPS) KAKENHI grants JP22H05171 to S.H.Y., JP23H00369 to S.H.Y. and T.S., and JP23H04290 to T.S.; by the Japan Agency for Medical Research and Development (AMED) grant 20wm0325009s0201 to S.H.Y.; and by Joint Research of the Exploratory Research Center on Life and Living Systems, National Institutes of Natural Sciences (ExCELLS program 23EXC601-4 to S.H.Y.).
Competing interests
The authors declare no competing or financial interests.