Computational ‘microscopy’ refers to the use of computational resources to simulate the dynamics of a molecular system. Tuned to cell membranes, this computational ‘microscopy’ technique is able to capture the interplay between lipids and proteins at a spatio-temporal resolution that is unmatched by other methods. Recent advances allow us to zoom out from individual atoms and molecules to supramolecular complexes and subcellular compartments that contain tens of millions of particles, and to capture the complexity of the crowded environment of real cell membranes. This Commentary gives an overview of the main concepts of computational ‘microscopy’ and describes the state-of-the-art methods used to model cell membrane processes. We illustrate the power of computational modelling approaches by providing a few in-depth examples of large-scale simulations that move up from molecular descriptions into the subcellular arena. We end with an outlook towards modelling a complete cell in silico.
Cell membranes comprise a heterogeneous mixture of membrane proteins and lipids, and are far more complex than frequently realized. Typical eukaryotic plasma membranes, for instance, contain hundreds of different lipids that are distributed asymmetrically across the leaflets and heterogeneously in the membrane plane (Jacobson et al., 2007; van Meer et al., 2008). Furthermore, membranes are crowded with proteins. The membrane area that is covered with proteins can be as large as 30% at a lipid:protein ratio of about 1:50 to 1:100 (Engelman, 2005). Understanding the protein–lipid interplay that gives rise to the organization principles of cell membranes is essential for life and health (Holthuis and Menon, 2014), and is emerging as an exciting frontier taking place at the crossroads of biology, life sciences, physics and chemistry. However, our current understanding of the detailed organization of cellular membranes remains rather elusive. Characterization of the structural heterogeneity in vivo is very challenging, owing to the lack of experimental methods suitable for studying these fluctuating nanoscale assemblies in living cells with the required spatio-temporal resolution.
Computational modelling has emerged as a powerful alternative method and has become an indispensable tool to complement conventional experimental methods. This is illustrated by the award of the Nobel Prize in Chemistry for computational chemistry in 2013. According to Arieh Warshel, one of the laureates: “This (computational modelling) is the best tool we have to see how molecules are working”. Schulten and co-workers have recently introduced the term ‘computational microscopy’ to underline the sophistication of the field (Lee et al., 2009). In recent years, computational ‘microscopy’ has proven to be a unique approach for investigating the lateral organization of cellular membranes, in addition to traditional microscopy-based methods, such as electron microscopy, atomic force microscopy (AFM) and super-resolution microscopy. It should be stressed, however, that computational modelling is much more than merely an in silico ‘imaging’ technique. Computational ‘microscopy’ can provide information on dynamics, interactions, conformational changes, transport, etc., similar to fluorescence microscopy techniques, such as fluorescence correlation spectroscopy (FCS), dual-colour fluorescence-burst analysis (DCFBA), single-molecule fluorescence resonance energy transfer (smFRET), single particle tracking (SPT) and fluorescence recovery after photobleaching (FRAP). With this in mind, an analogy of computational ‘microscopy’ to experimental microscopy techniques can be considered.
The advantages of the computational approach are multiple. Foremost, it is not resolution limited, unlike experimental microscopy techniques. Computational ‘microscopy’ provides unrivalled spatial and temporal resolution down to the movement of individual atoms on a femtosecond time scale, literally following the movements of every atom of the system of interest. Furthermore, it allows precise control of the (virtual) laboratory environment as well as exact reproducibility – features that are difficult to achieve in real experiments. In addition, computer simulations enable the use of a variety of alchemical tricks (i.e. using unphysical pathways) to compute the thermodynamic driving forces underlying biomolecular processes, thus enabling us to unravel the fascinating lipid–protein interplay that dictates the behaviour of cell membranes.
However, there are also limitations. An important concern is the accuracy of the underlying molecular model. Poorly parameterized models can easily lead to erroneous results (i.e. the simulated system does not resemble the real one). Fortunately, substantial progress has been made over the past decades to improve the quality of these models, resulting in an increasing number of predictions from simulations that have been validated experimentally. Continuous efforts are directed at further improvement, often in close conjunction with experimental characterization of model membrane systems (Botan et al., 2015; Im et al., 2012; Kučerka et al., 2010; Pan et al., 2014; Vogel and Feller, 2012).
Another important drawback is the limited scale of the systems that can be studied. Calculating the interactions between a large number of particles is computationally demanding. Owing to continuous progress in both software and hardware development over the past decades, the accessible spatio-temporal scales have been steadily increasing. The first all-atom simulations of biological membranes, dating back to the early 90s, comprised only a few hundred phospholipids prearranged in a bilayer configuration and simulated on the sub-nanosecond time scale (Heller et al., 1993; Marrink et al., 1993; Stouch, 1993), whereas current state-of-the-art all-atom simulations can follow the motion of thousands of lipids and embedded proteins over multi-microsecond time periods, and are capable of unveiling a large variety of fundamental membrane-related processes in atomistic detail (reviewed in Marrink et al., 2009; Pluhackova and Böckmann, 2015).
To study membrane processes that exceed the microsecond time scale and involve complex membranes with multiple protein assemblies, the all-atom approach is however less suitable. A powerful alternative strategy is to neglect some of the atomistic degrees of freedom, allowing for a substantial increase in the speed of simulation. This so-called coarse-grained approach has gained popularity lately (Ingólfsson et al., 2014a; Noid, 2013; Saunders and Voth, 2013) but originates back to the early days of molecular modelling (Levitt and Warshel, 1975). The use of coarse-grained models now enables simulations of membrane patches containing tens of thousands of lipids and multiple proteins up to millisecond time scales. To take full benefit of computational ‘microscope’, the resolution of the ‘microscope’ should ideally be tuneable. Multiscaling techniques are key to this end, combining the benefit from the simulation speed gained with coarse-grained models and the accuracy inherent of all-atom (and possibly quantum mechanics) models.
In this Commentary, we will first describe the workings of the computational ‘microscope’ in more detail, introducing the underlying technique of molecular dynamics and related methods, ways to tune the resolution of the computational ‘microscope’ to the view we want to achieve, and current developments to achieve high-throughput assays. After this, we will discuss some state-of-the-art applications from our group to illustrate how computer simulations can now be used to study the interplay of lipids and proteins at conditions closely mimicking the conditions of real cellular membranes. We will conclude with a short future perspective of the field, heading toward a computational view of entire cells.
Workings of the computational ‘microscope’
At the heart of the computational ‘microscope’ lies the simulation algorithm, for which molecular dynamics is most widely used (see Box 1). Molecular dynamics simulations, in their most basic form, involve numerically solving Newton's equation of motion (F=m a) for a set of particles over a given time frame. Given initial positions and velocities, and a set of rules describing how particles interact (i.e. the force field), we can calculate the acceleration (ai) of each particle i at the time point t. From the latter, we can predict the new positions and velocities for all the particles at the next time point (t+Δt). This procedure is then iterated to generate the temporal evolution of the system. Molecular dynamics simulation algorithms have been implemented in a number of simulation software packages (see Box 2).
Following is a short glossary of commonly used molecular dynamics terms. For more detailed explanation of molecular dynamics and the terms listed here please see Allen and Tildesley, 1987; Berendsen, 2007; Frenkel and Smit, 2001; Monticelli and Salonen, 2012; Rapaport, 2014; Schlick, 2010; van Gunsteren et al., 2006.
All-atom (AA), atomic resolution where each atom is explicitly included in the simulation.
Brownian dynamics simulations, usually refers to a mesoscopic simulation methodology where stochastic forces are used to replace the explicit solvent.
Coarse-graining (CG), refers to a coarser level or representation where a number of atoms have been combined into a single interaction group or bead.
Dissipative particle dynamics (DPD), refers to a mesoscopic simulation method with pairwise stochastic forces that preserve hydrodynamic interactions.
Energy minimization, optimization of the coordinates with respect to the molecular force field; this is often done using a gradient descent optimization algorithm where the coordinates are perturbed to minimize the total energy of the system.
Force field, refers to the set of potentials that describe the inter- and intramolecular interactions used to calculate the forces in a molecular dynamics simulation.
Molecular dynamics (MD) simulations, computer simulations at the classical mechanics level to solve Newton's equations of motion.
Multiscaling, sequential or parallel modelling at different levels of resolution (e.g. QM/MM, AA/CG).
Quantum mechanics (QM) simulations, refer to any kind of simulation at the quantum level. Molecular dynamics force fields often derive some of their bonded terms from ab initio molecular orbital quantum mechanics calculations using various levels of theory.
United-atom, atomic resolution model where some H-atoms are implicitly treated as part of the heavier atom to which they are bound.
Running molecular dynamics simulations requires a force field – a set of rules that describe how all atoms, beads and molecules interact – and a program to execute the simulation. Below, we list some of the more commonly used, broad-purpose force fields and simulation packages that are available for simulations of cell membranes.
AMBER (assisted model building with energy refinement; http://ambermd.org/) (Dickson et al., 2014; Maier et al., 2015; Salomon-Ferrer et al., 2013) is both a united-atom force field and an extensive simulation package.
CHARMM (chemistry at Harvard macromolecular mechanics; www.charmm.org) (Brooks et al., 2009; Feller and MacKerell, 2000; MacKerell et al., 1998; Zhu et al., 2012) is both an all-atom force field and a versatile simulation package.
ESPResSo (extensible simulation package for research on soft matter; http://espressomd.org/) (Arnold et al., 2013; Limbach et al., 2006) is a simulation package specially focused on coarse-grained-particle simulations.
OPLS (optimized potentials for liquid simulations) (Jorgensen et al., 1996) is a force field with both united-atom and all-atom versions.
A large range of other software packages and specialized scripts are routinely used in conjunction with molecular dynamics simulations, especially for specific types of simulations (e.g. docking, folding, computational chemistry) and analysis. To assist with simulation visualization, a number of programs exist. The two most commonly used are VMD (visual molecular dynamics; www.ks.uiuc.edu/Research/vmd) (Humphrey et al., 1996) and PyMOL (www.pymol.org).
Based on classic mechanics and given appropriate initial conditions, ‘in theory’ we could calculate exactly how a system of interacting particles evolves over time, deriving any desired property of the system through statistical mechanics. There are a number of reasons why this is not feasible ‘in practice’, some of which we discuss below.
Neglect of quantum degrees of freedom
At the nanoscopic level, there will be quantum mechanics contributions that are not accounted for explicitly using classic mechanics – e.g. the breaking and forming of chemical bonds, electronic polarization effects, quantum tunnelling and others. The relevance of this limited resolution will depend on the system and the question to be addressed. For many molecular scale processes (e.g. diffusion, protein–protein or lipid–protein interactions), these degrees of freedom can be safely averaged out and included implicitly in classic models, whereas others require higher-resolution quantum-level models.
The integration of the equations of motion is computationally intensive and limits the amount of sampling we can perform. The main limitations are the need to calculate all the forces at each time point and the necessity of using a very small time step (Δt). The forces that act on each particle at a given time point depend on the exact position of the other particles. Therefore, the calculated acceleration for each particle is only valid at that specific time point and, as soon as the particles move, new forces and accelerations need to be calculated. As the simulations are performed in discrete time steps (Δt), we need to select a time step that is sufficiently small to allow the equations of motion to be properly integrated. A good rule of thumb is to use a time step that is five to ten times faster than that of the fastest oscillation in the system. For atomistic simulations, the fastest oscillations are typically H-bond vibrations (∼10 fs), resulting in a required time step of one to two femtoseconds. Therefore, 1015 iterations are needed to simulate 1 s, which is still far beyond what even the fastest computers can achieve.
The forces between particles are estimated based on a molecular mechanics force field. Most force fields decompose the forces into bonded and non-bonded parts. The bonded part represents the forces between atoms that are covalently bound – i.e. within a molecule. These are commonly treated as a series of harmonic bond, angle and dihedral potentials between closely connected atoms (normally up to a few neighbours away). Between other atoms, non-bonded forces are used; these are typically modelled as pair-potentials that are split into electrostatic forces (treated according to Coulomb's law) and van der Waals forces (approximated using Lennard–Jones potentials). To increase the speed of computation, non-bonded interactions are often limited to atoms within a specified cut-off distance. The accuracy, speed and availability of parameterized molecules vary greatly between different force fields, and typically the more accurate (high-resolution) force fields are slower to compute. It is therefore important to carefully choose the best force field that is available for the system and specific question asked. The commonly used general-purpose molecular dynamics force-fields (see Box 2) have all been continuously optimized over several years by a growing number of research groups. Parameters are mainly validated using higher-resolution simulations, experimental data and chemical intuition.
When executing a molecular dynamics simulation, in addition to choices pertaining to integrating the equations of motion and selecting an appropriate force field, a number of additional factors need to be considered.
Boundary conditions in a simulation refer to both physical boundaries of the system, as well as the way in which state variables (e.g. temperature, pressure) are being controlled. One important decision is how to treat the edges of the simulated system. There are a number of possible options, but the most common is a rectangular box with periodic boundary conditions (PBC). PBCs remove hard edges of the simulation box by virtually replicating it in every direction. This means that particles that cross the box sides re-enter the box from the opposite side, and those that approach the edge of the box sense the other side of the box instead of a vacuum or a hard wall. This effectively removes simulation box-edge artefacts. One should be aware, however, that PBCs do not eliminate the effects of a finite box size and that PBCs can influence properties, such as diffusion (Camley et al., 2015), bilayer undulation, and lipid domain sizes and dynamics.
Just as in a real experiment, we can choose under which thermodynamic conditions to perform the in silico experiments. This choice defines the ensemble – e.g. canonical (NVT), isothermal–isobaric (NPT), microcanonical (NVE) or grand canonical (µVT). The letters of the acronyms denote the macroscopic observable that is kept constant in the ensemble – N, number of particles; V, volume; T, temperature; P, pressure; E, energy; µ, chemical potential. In an NVT ensemble, the number of particles is kept constant throughout the simulation, the box size (volume) is fixed and the absolute temperature is controlled using an external thermostat. This control varies with the thermostat in question [e.g. Andersen (Andersen, 1980), Berendsen (Berendsen et al., 1984), v-rescale (Bussi et al., 2007), Nosé–Hoover (Hoover, 1985; Nosé and Klein, 1983), Langevin piston (Feller et al., 1995)], but generally involves a small tweak to the overall molecular kinetics every few time steps to maintain the temperature close to the desired reference value. In NPT simulations, an external barostat controls the pressure [e.g. Berendsen (Berendsen et al., 1984) or Parrinello–Rahman (Parrinello and Rahman, 1981)] by means of scaling the box dimensions. For membrane systems, semi-isotropic coupling schemes (separately coupling the pressure in the bilayer plane versus the perpendicular direction) are often used to keep the membrane at a surface tension of zero.
After choosing all the simulation parameters, a system can be set up by defining its initial configuration – i.e. the position of all the particles to be simulated. This initial structure is first energy minimized to remove possible strain in the system. Then, the simulation can start and run until the system has sampled the relevant part of phase space – i.e. accessed and explored the configurations that are relevant to the problem being studied. The resulting trajectory can be visualized and analysed with proper statistical mechanics tools that relate the microscopic states to macroscopic observables. Various methods exist to efficiently compute free energies and to enhance sampling when needed. Some of the more commonly used methods for membrane systems are thermodynamic integration, umbrella sampling, replica-exchange molecular dynamics (REMD), metadynamics and transition path sampling, but these will not be discussed in this review. For more background on molecular dynamics simulations, please see, for instance, Berendsen (2007), Frenkel and Smit (2001), Rapaport (2014) and van Gunsteren et al. (2006).
Tuning the ‘microscope’ to the appropriate resolution
Molecular dynamics simulations are a powerful tool because ‘in theory’ one has all the information needed to analyse any property. The caveat is that the resolution, the time and length scales have to be adequate for the desired property. This means that, depending on the question addressed, one needs to adjust the ‘zoom’ of the computational ‘microscope’ (Fig. 1), and given the limitations of current computational resources, some properties will need to be more coarsely approximated. For cell membrane modelling, a number of zoom levels that cross all biological scales are required to cover the wide range of time and length scales (Pluhackova and Böckmann, 2015).
Zooming into the highest resolution can be achieved by using quantum mechanics simulations (Berendsen, 2007; Groenhof, 2013; Merz, 2014; Schlick, 2010). At the quantum mechanics level, nuclei and electrons are explicitly included using various levels of quantum mechanical theory. At this resolution, it is possible to simulate the breaking and formation of chemical bonds as well as any charge transfer. Solving or approximating the Schrödinger wave equation is exceptionally computationally intensive and therefore only very small systems can be resolved at this scale (involving hundreds of atoms). Using various levels of approximations [e.g. Hartree–Fock, density functional theory (DFT) or Car–Parrinello molecular dynamics (Berendsen, 2007; Groenhof, 2013; Merz, 2014; Schlick, 2010)], the range of quantum mechanics simulations can be extended. Especially promising are recent density function theory algorithms that show linear scaling efficiency with the number of particles (Arita et al., 2014). Full quantum mechanics simulations of cellular membranes are, however, still out of reach. Importantly, quantum mechanics calculations are frequently used to determine bonded parameter terms and charge distributions for higher-level (i.e. atomistic) simulations, usually for small molecules or molecular fragments in vacuum.
Fortunately, most cellular membrane processes of interest do not require treatment at the quantum mechanics level and can be described using classic mechanics at the all-atom or united-atom resolution. All-atom models explicitly include all atoms in the force field, whereas united-atom models combine some atoms into a single interaction site, for example, the non-polar hydrogens of a methylene or methyl group. As outlined above, owing to the fast H-bond vibrations, atomistic simulations are normally limited to time steps of 1–2 fs. However, the highest frequency vibrations in the system can be removed using virtual interaction sites (Feenstra et al., 1999), allowing the use of larger time steps of 4–5 fs. To capture some of the neglected electronic degrees of freedom, polarizable models can be used. Parameterization of polarizable lipid models is demanding and still at the early stages of development (Chowdhary et al., 2013).
Simulations at the atomistic level of resolution are capable of unveiling a large variety of fundamental membrane-related processes in great detail. State-of-the-art examples are studies of membrane binding and the formation of pores by antimicrobial and cell-penetrating peptides (Berglund et al., 2015; Huang and García, 2013; Moiset et al., 2013; Ulmschneider et al., 2012), of the specific binding of lipids to membrane proteins (Aponte-Santamaria et al., 2012; Contreras et al., 2012; Lee and Lyman, 2012; Pöyry et al., 2013), of the cyclodextrin-mediated extraction of cholesterol from membranes (López et al., 2013a), of the dynamic organization of multi-component membranes (Martinez-Seara et al., 2010; Sodt et al., 2014; Wu et al., 2014b), of lipid–peptide interplay in membrane fusion (Blanchard et al., 2014; Larsson and Kasson, 2013) and of the functioning of membrane proteins (Dror et al., 2011; Kopfer et al., 2014; Maffeo et al., 2012; Marinelli et al., 2014; Moradi et al., 2015; Ostmeyer et al., 2013; Romo et al., 2010).
Simulations using all-atom models, however, are currently limited to about 106 atoms and microsecond time scales. Especially under crowding conditions, membrane dynamics slow down, and sampling of the motions of individual membrane proteins – let alone of protein complexes – becomes a serious issue (Goose and Sansom, 2013; Javanainen et al., 2013). The use of massively parallel computer resources or dedicated hardware can extend the range of applicability of all-atom models to a certain extent (Perilla et al., 2015). For instance, short all-atom simulations of an entire organelle (Chandler et al., 2014) or all-atom simulations totalling over 100 μs of a G-protein-coupled receptor (GPCR) embedded in a small lipid membrane patch (Dror et al., 2011) have in fact been reported, but these are associated with a computational cost that is not available to everyone.
Zooming out further reaches coarse-grained descriptions. Coarse-graining involves grouping together various atoms into functional groups, therefore effectively reducing the number of particles in a system. Apart from the obvious increase in simulation speed based on the reduced number of particles, coarse-grained force fields define a smoother energy landscape that leads to faster overall dynamics and allows the use of larger time steps (10–100 fs). This can result in an improvement in accessible length and time scales of a few orders of magnitude, but with a reduced level of accuracy. Different strategies exist to derive coarse-grained models that can be broadly divided into ‘bottom-up’ and ‘top-down’ approaches (Ingólfsson et al., 2014a; Noid, 2013; Saunders and Voth, 2013). The former uses more detailed structural data (either from all-atom models or experiments) to systematically derive the coarse-grained interactions, whereas the latter relies on directly reproducing a variety of macroscopic experimental observables.
Popular coarse-grained force fields to model cellular membranes include Martini (Marrink and Tieleman, 2013), PLUM (Bereau et al., 2014), Shinoda-DeVane-Klein (Shinoda et al., 2010) and ELBA (Orsi and Essex, 2011). When sufficient chemical information is retained, coarse-grained models can discriminate between different lipids and distinguish protein residues, and thus form a useful bridge between atomistic and macroscopic data. For instance, one can now simulate the collective lipid-mediated self-assembly of membrane proteins (Benjamini and Smit, 2013; Hall et al., 2012; Johnston et al., 2012; Periole et al., 2012; van den Bogaart et al., 2011) and their sorting between different membrane domains (de Jong et al., 2013; Janosi et al., 2012; Schäfer et al., 2011), as well as large-scale protein-induced membrane remodelling – including fusion and fission events (Baoukina and Tieleman, 2010; Braun et al., 2014; Davies et al., 2012; Fuhrmans and Müller, 2015; Kawamoto et al., 2015; Pinot et al., 2014; Risselada et al., 2014; Simunovic et al., 2013).
The use of coarse-grained models has also sparked an increase in simulation complexity, such as attempts to model real biological membranes more accurately. Key examples are multi-component lipid mixtures that model the plasma membrane (Flinner and Schleiff, 2015; Ingólfsson et al., 2014b; Koldsø and Sansom, 2015; Koldsø et al., 2014; van Eerden et al., 2015) and the dynamic model of an entire virion membrane (Reddy et al., 2015).
Supra coarse-grain resolution
Zooming out even further reaches ‘supra’ coarse-grain resolution (also denoted ‘ultra’, ‘highly’ or ‘shape-based’ coarse-graining); here, more atoms and molecules (up to entire proteins or large parts thereof) are grouped into single interaction sites. These kinds of models are nearly all run with implicit solvent and stochastic dynamics methods (e.g. Brownian dynamics or dissipative particle dynamics; see Box 1). Depending on the level of coarse-graining, they can be extremely fast and are able to handle very large systems, but their coarseness often restricts their applicability to address generic problems (Deserno, 2009). However, when carefully parameterized from higher-resolution models, semi-quantitative predictions can be made (Dama et al., 2013). Recent examples of the supra-coarse-grained approach include simulations of large-scale membrane remodelling by Bin–amphiphysin–Rvs (BAR) domains (Cui et al., 2013; Simunovic et al., 2013; Yu and Schulten, 2013), the membrane-induced formation of peptide fibrils (Morriss-Andrews et al., 2014) and the supra-molecular organization of photosynthetic membranes (Lee et al., 2015). Furthermore, the supra-coarse-grained approach allows for a natural connection to the macroscopic scale using, for instance, field-theory- or fluid-dynamics-based descriptions of cell membranes (Ayton et al., 2009; Camley and Brown, 2014; Fedosov et al., 2014; Yolcu et al., 2014).
In addition to fixed-resolution models, there have been some interesting developments in mixed resolution approaches (multiscaling), in which coarser and finer descriptions are being combined. Two types of multiscaling protocols, sequential and hybrid, are currently being pursued. Sequential multiscaling involves the switching back and forth between two or more levels of resolution. It has so far mainly been used to add back atomistic details to configurations obtained with coarse-grained models; a procedure called backmapping. A few backmapping strategies have been derived for lipid–protein systems (Rzepiela et al., 2010a; Stansfeld and Sansom, 2011; Wassenaar et al., 2014). These are commonly used to validate and/or refine predictions from coarse-grained models; for instance, those concerning lipid-binding sites on membrane proteins (Arnarez et al., 2013b; Stansfeld et al., 2013), the membrane solvation of nanoparticles (Barnoud et al., 2014) and the nature of transmembrane pores formed by antimicrobial peptides (Rzepiela et al., 2010b).
In hybrid multiscaling, all-atom and coarse-grained degrees of freedom are used concurrently in the same simulation. Hybrid models are, in principle, a very powerful simulation strategy because they combine the best of both worlds – sampling globally at a low resolution (for instance, the bulk membrane and solvent) with only a local high resolution in defined areas of interest (for instance, at the protein–lipid interface). Either a static division of the all-atom and coarse-grained degrees of freedom can be used (Han and Schulten, 2012; Rzepiela et al., 2011; Wassenaar et al., 2013), akin to the well-established hybrid quantum mechanics/molecular mechanics (QM/MM) method, or an adaptive boundary that allows particles to change resolution during the simulation (Praprotnik et al., 2008; Zavadlav et al., 2014). A number of pioneering hybrid studies of the static kind have appeared, in which all-atom/based membrane proteins are embedded in a coarse-grained model membrane environment (Han and Schulten, 2012; Orsi et al., 2011; Wassenaar et al., 2013). However, parameterization and implementation of all-atom/coarse-grained (AA/CG) hybrid models are quite challenging and have not yet been adapted to simulate large-scale cellular processes. Linking different resolution levels, in the same simulation, appears to be far from trivial, and much work is still needed in this area (Ayton et al., 2007; Goga et al., 2015; Zhou, 2014).
High-throughput and automation
Each year, molecular dynamics models are getting more accurate, simulation software faster and computational resources cheaper. Utilizing the increase in computational power, numerous methods have been developed to facilitate automation and to allow for more high-throughput simulations.
Improvement in hardware and software
In addition to the ever increasing power of general-purpose super computers, recent versions of molecular dynamics software (for example, see Box 2) implement faster and more efficient simulation algorithms, and routinely support acceleration modules, such as graphics processing units (GPUs) to assist with the calculations; they are also highly parallelizable – scaling to thousands of compute nodes. Specialized molecular dynamics super computers have also been developed, such as Anton 2 (Shaw et al., 2014), which can run molecular dynamics simulations up to two orders of magnitude faster than even the largest general-purpose machines. Distributed sampling, a method by which hundreds of thousands of computers are turned into an effective super computer, is another powerful way to increase the sampling speed (Lane et al., 2013).
Automatic topology builders
Newer generations of molecular dynamics force fields are usually more accurate and, just as importantly, an increasing number of molecules have been parameterized for those force fields. For instance, the popular CHARMM and Martini force fields currently have parameters available for over 100 different types of lipids (for example, see Feller and MacKerell, 2000; Feller et al., 2002; Klauda et al., 2010; López et al., 2013b; van Eerden et al., 2015; Venable et al., 2014; Wassenaar et al., 2015a). Additionally, methods for automatically creating initial molecular parameters have emerged, such as the automated topology builder (ATB) that follows the GROMOS building-block strategy (Malde et al., 2011), and Antechamber for AMBER (Wang et al., 2006). Automated parameterization methods for coarse-grained models are also being developed (Bereau and Kremer, 2015; Sinitskiy et al., 2012).
Automated system generation
New and improved methods have been established to help set up the initial simulation configurations. CHARMM-GUI (Wu et al., 2014a) is an easy-to-use web interface that guides the user through setting up a number of different membrane-based systems, such as multi-component bilayers and vesicles, with or without embedded proteins. CHARMM-GUI currently supports both the CHARMM and Martini force fields. A number of other similar programs exist, but most do not have a graphical interface and are controlled through the command line; this allows easy inclusion in scripts and automated pipelines. ‘INSert membrANE’ (insane) (Wassenaar et al., 2015a) is one of these tools and is associated with the Martini force field; it supports easy addition of new lipid templates based on simple building-block rules. A number of programs have also been developed that automatically set up and run simulations, such as Sidekick (Hall et al., 2014) and docking assay for transmembrane components (DAFT) (Wassenaar et al., 2015b), which are aimed at high-throughput screening of protein–protein interactions.
The recent advances in automation and high-throughput simulation methodology have greatly affected cell membrane modelling and make it possible to systematically explore different bilayer conditions (Ackerman and Feigenson, 2015; Khakbaz and Klauda, 2015; Wassenaar et al., 2015a; Zhang et al., 2015; Zhuang et al., 2014) and to also set up large-scale complex membrane systems, as discussed in the examples below.
Imaging complex cellular membranes
A realistic plasma membrane model
A typical plasma membrane contains hundreds of different lipid species that are actively regulated by the cell (Jacobson et al., 2007; van Meer et al., 2008). Using a variety of membrane and lipid imaging methods – such as AFM, nuclear magnetic resonance (NMR), X-ray, mass spectrometry and spectroscopy (Holthuis and Menon, 2014; Jacobson et al., 2007; Kaiser et al., 2009; Klose et al., 2012; Marsh, 2013; Sampaio et al., 2011; Sezgin et al., 2015; van Meer et al., 2008) – much has been learned with regard to cellular membranes, including their heterogeneous nature. However, the detailed lipid organization of these membranes remains elusive, and important questions remain, including what are the individual roles of all these lipids, and how do they interact and organize in the membrane plane?
To start to address these questions using computational ‘microscopy’, a model membrane with a realistic, or close to realistic, lipid composition is needed. Recently, we have modelled an idealized mammalian plasma membrane using the Martini coarse-grained force field (Ingólfsson et al., 2014b). In terms of lipid composition, our model is by an order of magnitude the most complex simulation to date. The membrane contains 63 different lipid types, comprising 14 different headgroups and 11 different tails that are asymmetrically distributed across the leaflets (Fig. 2). Large-scale simulations, containing approximately 20,000 lipids and simulated for up to 80 µs (Fig. 2), provide a high-resolution view of the lipid organization of plasma membranes at an unprecedented level of complexity. Based on these simulations, we obtained insights into some of the basic plasma membrane properties, such as non-ideal lipid mixing, lipid flip-flop dynamics, domain formation and coupling between the bilayer leaflets.
On the time scale of the simulation, cholesterol, ceramide and diacylglycerol lipids flip-flop between the leaflets. Cholesterol equilibrates to a slight enrichment in the outer leaflet (∼54%) due to its preferred interactions with the outer-leaflet lipid composition, which is enriched in saturated lipids. Globally, neither leaflet phase separates, but the lipids are heterogeneously mixed and show non-ideal mixing of different lipid species at different spatiotemporal scales. Patches of 5–50 nm in size of increased or decreased cholesterol density are transiently formed and correlate between the two leaflets. In the outer leaflet, ganglioside lipids form small clusters, whereas in the inner leaflet phosphatidylinositol mono-, bis- and trisphosphates form dimers and trimers much more frequently than would be expected based on their concentration (Ingólfsson et al., 2014b).
Overall, our plasma membrane simulations reveal a complex and dynamic interplay of all lipid species, and show that this gives rise to transient domains that are continuously changing in size and composition. This more heterogeneous view is in agreement with the findings of recent fluorescent imaging experiments (Kaiser et al., 2009; Sezgin et al., 2015).
Supercomplex formation in mitochondrial membranes
Mitochondria, the ‘powerhouses’ of the cell, generate most of the ATP that cells use. The synthesis of ATP is a complex and fundamental biological process occurring in and across the double membrane of mitochondria. The respiratory chain, comprising four protein complexes, synthesizes ATP. Three of these complexes are actively involved in generating a proton gradient across the inner membrane that is eventually used by the ATP synthase to produce ATP. A series of oxidoreduction reactions are performed by these enzymes, exchanging electrons through smaller molecular carriers, such as ubiquinone and cytochrome c, which diffuse into the surrounding solvent or membrane.
It has been hypothesized by Hackenbrock et al. that the kinetics of the entire process could be optimized by keeping the different partners that are involved in close proximity (Hackenbrock et al., 1986), leading to a ‘stochastic’ model that describes free diffusion and random encounters of the complexes of the respiratory chain and the electron carriers. In the early 2000s, this model was challenged by the identification of stable structures in the inner mitochondrial membrane that involve several respiratory complexes (Schagger and Pfeiffer, 2000). These structures are called supercomplexes and have been observed in different stoichiometry using various techniques (Althoff et al., 2011; Dudkina et al., 2011; Heinemeyer et al., 2007; Mileykovskaya and Dowhan, 2014; Mileykovskaya et al., 2012; Schafer et al., 2006). A ‘static’ model was formulated to account for these structures, which are thought to minimize the diffusion of the electron carriers and thus speed-up the oxidoreduction cycle. More recently, these models have been combined into the ‘plasticity’ model – reconciling both the existence of supercomplexes and the diffusion of individual complexes (Acín-Pérez et al., 2008). This model indicates that enzymes use adaptive association–dissociation regimes as a strategy to respond to different cell conditions.
To be stable, the supercomplexes require the presence of a specialized lipid in the membrane, cardiolipin, the signature lipid of mitochondria (Bazan et al., 2013; Pfeiffer et al., 2003; Wenz et al., 2009; Zhang et al., 2002, 2005). This anionic lipid is present at a high concentration in the inner membrane of mitochondria (up to 20% of the lipid population; Daum, 1985) and has been identified as contributing to many molecular mechanisms in mitochondria (Chicco and Sparagna, 2006; Pfeiffer et al., 2003; Schug and Gottlieb, 2009; Wenz et al., 2009; Zhang et al., 2005). It has been suggested that cardiolipin acts by forming part of the interface between the various complexes, and that it helps to mediate the transfer of protons and electrons between the different molecular effectors (Haines, 1983); however, the mechanism by which cardiolipin operates is unknown.
To address this question using computational ‘microscopy’, we have performed several simulations studies. First, we investigated the possibility of preferential binding of cardiolipin to the membrane-exposed surface of isolated complexes (Arnarez et al., 2013a,b). Simulations were performed at the coarse-grain resolution, allowing each cardiolipin molecule to explore the surface of the complexes and allowing us to identify its preferential binding sites. We found cardiolipin binding sites on complex I (C.A., unpublished data), complex III and complex IV, in agreement with the available experimental data (Arnarez et al., 2013a,b). The determination of the lipid-binding strength of cardiolipin for these different sites, as well as their residence times, further revealed the heterogeneity of cardiolipin binding to the different binding sites on complex IV (Arnarez et al., 2013a).
We also investigated the association of these complexes into supercomplexes using large-scale simulations (C.A., unpublished observations). Here, several replicas of complex III and complex IV were embedded in a mitochondrial model membrane and allowed to self assemble (Fig. 3). The model membranes used were cardiolipin:1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) and cardiolipin:POPC:1-palmitoyl-2-oleoyl-sn-glycero-3-phosphoethanolamine (POPE) with ratios similar to those under in vivo conditions (Daum, 1985). Cardiolipin was found at the interface of each supercomplex that formed in the simulations (Fig. 3), perfectly ‘gluing’ the complexes together, as suggested previously (Zhang et al., 2002). The simulations reveal that cardiolipin-binding sites are involved in determining the relative orientation of the complexes. The interactions between cardiolipin and the enzymes can be further refined by backmapping the conformations to all-atom resolution and equilibrating them with short atomic simulations for isolated complexes (Arnarez et al., 2013b) or supercomplexes (Fig. 3B).
Taken together, these data support a role for cardiolipin in fixing the formation of supercomplexes in the respiratory chain and reveal a mechanism that implies that supercomplex organization is regulated by cardiolipin-binding sites. Here, the computational ‘microscope’ provides information on the structure of the respiratory chain complexes at atomic resolution and complements the low-resolution images that are obtained with electron microscopy. This approach also provides insights into the role of the lipid environment in the association of the respiratory chain complexes, information that is difficult to assess experimentally.
Computational modelling has established itself as an indispensable tool for elucidating the structure and dynamics of cellular membranes. The view offered by the computational ‘microscope’ is unique and complements existing experimental techniques. The current state-of-the-art methods and applications deal with increasingly complex systems, approaching the complexity of real cell membranes. Time scales up to milliseconds and systems comprising millions of atoms can now be probed.
Given the maturing nature of the field, the question arises as to whether we could use this approach to ‘image’ an entire cell in full physiological detail. The answer is yes, in principle, but in practice, we are not there yet. For one, we still need more complexity in our models. For instance, the coupling between the cell membrane and the cytoskeleton has remained underexplored. There are numerous unknown aspects of cell structure and composition that require joint computational and experimental efforts in order to be able to arrive at a description at the molecular level. Furthermore, differentiation between membrane types has only just begun. With the anticipated further increase in experimental data for membrane composition (mainly using high-throughput lipidomics) and increasingly sophisticated techniques to resolve membrane protein structures (using nanodiscs, electron microscopy, X-ray free-electron lasers and ion-mobility–mass-spectrometry), progress can be expected in this direction. Another area of active research is the development of practical algorithms for hybrid multiscaling. Cell processes are inherently multiscale in nature; for example, the occurrence of an enzymatic reaction can trigger the assembly of protein complexes that, in turn, might lead to large-scale membrane remodelling. Methods that couple quantum-scale to all-atom and (supra-)coarse-grain resolutions are ultimately required to describe these processes.
Perhaps the largest challenge on our way to the in silico cell is to deal with the gap in spatio-temporal scales. In terms of the number of atoms, cells are huge (1010–1014, depending on cell type), and from the computational perspective, most biological processes are slow, occurring over microseconds to seconds. We can expect to close this gap in two ways – firstly, through the continued increase in computational power (soon entering the exascale era with the availability of million-core computing clusters) and secondly, through the use of smart sampling tricks, such as the removal of irrelevant degrees of freedom. As a reasonably bold prediction, within the next five to ten years, we expect to witness a computationally simulated molecular view of a relatively small and simple cell with a largely known composition, such as the red blood cell. In ten to twenty years, a variety of cell types could become alive in our virtual lab environment. However, ‘alive’ should not be misunderstood – real cells are in a non-equilibrium state, driven by chemical reactions requiring constant energy input. Eventually, we will need to integrate tools from systems biology and bioinformatics into our computational ‘microscope’ to complete our view of cell membranes.
Alex H. de Vries is acknowledged for stimulating discussions.
S.J.M. is supported by a TOP grant from the Netherlands Organisation for Scientific Research (NWO); and a European Research Council advanced grant.
The authors declare no competing or financial interests.