For decades, we have relied on population and time-averaged snapshots of dynamic molecular scale events to understand how genes are regulated during development and beyond. The advent of techniques to observe single-molecule kinetics in increasingly endogenous contexts, progressing from in vitro studies to living embryos, has revealed how much we have missed. Here, we provide an accessible overview of the rapidly expanding family of technologies for single-molecule tracking (SMT), with the goal of enabling the reader to critically analyse single-molecule studies, as well as to inspire the application of SMT to their own work. We start by overviewing the basics of and motivation for SMT experiments, and the trade-offs involved when optimizing parameters. We then cover key technologies, including fluorescent labelling, excitation and detection optics, localization and tracking algorithms, and data analysis. Finally, we provide a summary of selected recent applications of SMT to study the dynamics of gene regulation.
The ability to directly observe molecular kinetics in living cells, tissues and embryos has only recently become possible due to the maturation of single-molecule tracking (SMT) technologies. This possibility is derived from the culmination of decades of technological advancements in microscopy, fluorescent labelling, gene editing, computational tools and biophysical modelling. Over the past two decades, SMT has progressed from almost entirely in vitro applications to tracking individual protein molecules at cell surfaces (Sako et al., 2000), within bacteria (Deich et al., 2004; Elf et al., 2007), within the nuclei of cells (Liu et al., 2014; Yang et al., 2004) and in embryoid bodies (Chen et al., 2014a). While performing SMT in cells cultured on glass is now relatively mature, applications within embryos have only recently begun to be explored. Early applications of SMT in intact organisms were initially limited to surfaces, using similar optical approaches to those used for studies on single cells (Robin et al., 2014; Zhan et al., 2014; Schaaf et al., 2009). The advent of light-sheet approaches extended SMT to tracking tracer molecules in cytoplasmic regions and then finally within the nuclei of living multicellular embryos (Mir et al., 2018a; Reisser et al., 2018; Mir et al., 2017).
In the context of gene regulation, SMT has challenged classical models built on stable and hierarchical interactions of molecules in regulating transcription. Instead, SMT studies have highlighted the prevalence of transient and weak protein-DNA interactions, the local modulation of kinetics, and the important role of nuclear organization and compartmentalization in regulating transcription (Box 1) (Lionnet and Wu, 2021). To achieve such insights, the data provided by SMT must be complemented by genomic and biochemical assays, loss- and gain-of-function mutations, and other widely used approaches. This inherent interdisciplinary nature of applying SMT to study gene regulation has motivated us to write this Review.
In the context of gene regulation, single-molecule tracking (SMT) has transformed our view of regulatory mechanisms from one in which stable hierarchical interactions dominate to one in which transient interactions are the norm and the spatiotemporal distribution of protein concentrations within nuclei is crucial to function. Regulation of local concentration in the form of clustering is evident in the distribution of RNA polymerase II molecules (Cisse et al., 2013) and has been linked to the formation of clusters that incorporate other key components of the transcriptional machinery (Cho et al., 2018). Tracking of the key regulators of chromatin topology, CTCF and cohesin, has revealed that they bind to DNA with distinct dynamics, rather than forming a stable complex, as previously imagined (Hansen et al., 2017). In addition, the search dynamics of CTCF are greatly enhanced by local trapping mediated by RNA binding (Hansen et al., 2020). Studies on components of the polycomb repressive complex (a key developmental regulator of chromatin state) have shown that binding to specific sites is mediated both by sequence and by specific histone modifications (Zhen et al., 2016), and that binding leads to the formation of condensates on DNA that facilitate co-factor binding (Kent et al., 2020). Analysis of the dynamics of the pioneer factor Zelda and the anteroposterior morphogen Bicoid in Drosophila melanogaster embryos has revealed how hubs formed by Zelda create high local concentrations of Bicoid within nuclei and facilitate binding at target genomic loci (Mir et al., 2017, 2018b). The key consistent lesson from the examples above, and many other recent SMT studies, is that while the off-rates of protein-DNA interactions are usually very high, the on-rates of interactions can be drastically altered by modulating local protein concentrations.
We encourage the reader to convince themselves of this conclusion by exploring a database of measurements we have made available online (https://www.mir-lab.com/dynamics-database).
Our goal is to provide an accessible overview of the fundamental concepts and the state-of-the-art methods for each of the key steps in designing, executing and interpreting an SMT experiment. There are a multitude of available choices for protein labelling, imaging modalities and analysis methods, and the decisions made at each of these steps are interdependent. We start with an overview of what benefits SMT provides over other imaging approaches, discuss the key steps and limitations of SMT and then review advances in technologies at each key step of an SMT experiment. Our hope is that this Review enables the reader to better interpret the rapidly increasing amount of SMT data reported in the literature, as well as to inspire a greater use of rigorous SMT experiments in the context of developmental gene regulation. A number of recent reviews have been written covering various aspects of single-molecule tracking technologies and applications, and we encourage the reader to read them in addition to the primary literature (Liu et al., 2015; Elf and Barkefors, 2019; Shen et al., 2017). To facilitate understanding we have included a glossary of technical terms used in this article (Box 2).
Axial defocalization. Loss of focus of a particle during single-molecule imaging because the particle moves significantly in the z-direction, i.e. axially out of the depth of field of the imaging system.
Bessel beam. A beam whose amplitude is described by a Bessel function (a central lobe with additional side lobes). A mathematically ideal Bessel beam does not diffract and spread out as it propagates.
Brownian motion. The random freely diffusive motion of small particles suspended in a fluid, with the displacement probability following a Gaussian distribution.
Diffraction limited. The resolution limit of an optical system set by the physics of diffraction. The diffraction limit of a microscope is largely dependent on the numerical aperture of the objective being used and the wavelength of light (∼λ/2NA).
Diffractive optics. Optical elements with strategically designed microrelief structures that can be used to reshape transmitted light based on the fundamental principles of light diffraction.
Dwell time or residence time. The inverse of the average off-rate of a protein-protein or protein-chromatin interaction.
Electron multiplication. The process by which a single electron, when incident on a dynode (a secondary electron emitter), results in the emission of more than one electron. Often used in detectors such as electron multiplying charge-coupled devices to amplify a signal that may start with a single electron.
Gauss-Bessel beam. In practice, it is impossible to generate an ideal Bessel beam. In a real optical set-up, only a Gauss-Bessel beam can be generated, which propagates without diffraction over only a limited distance. Typically, the Gaussian- or Bessel-like character of the beam can be tuned: a more Gaussian-like beam propagates over shorter distances with fewer side lobes, whereas a more Bessel-like beam propagates over longer distances with more side lobes.
Gaussian beam. The amplitude of a Gaussian beam is described by a Gaussian function (a central lobe without any side lobes). In contrast to a Bessel beam, a Gaussian beam does diffract and spread out as it propagates. The distance over which a Gaussian beam maintains its minimal thickness is known as the Rayleigh distance.
Hidden Markov model. A Markov process is a statistical system that evolves over time between multiple states, with the probability of a state transition depending on the present state. Hidden Markov models use statistical analysis in order to infer the state of an unobservable Markov process from an observed variable dependent on the state of the Markov process.
Laplace transforms. A mathematical operation that converts a signal from the time domain to the frequency domain. An inverse Laplace transform converts a signal from the frequency domain back to the time domain. Laplace transforms are often used to solve differential equations, such as those used to describe diffusive motion.
Mean square displacement. An ensemble-averaged measurement of how far the position of a molecule strays from its original position over time. This value reports how much space the molecules in the ensemble explore.
Monte-Carlo single-particle tracking simulations. The use of random sampling to simulate draws from a given probability distribution. It can be used to simulate single particle diffusion.
Non-parametric Bayesian statistics. A Bayesian model with an unconfined parameter space allowing for growth in complexity of a model as more data are obtained. Such models are often applied to machine learning and regression problems.
Photobleaching. The permanent loss of the ability to fluoresce, typically caused by photochemical damage to the fluorophore, often via reactive triplet states. The rate at which this process occurs for a population of molecules depends on the excitation power and local chemical environment (see also quantum yield and photon budget).
Photon budget. The number of photons emitted by a fluorophore that contribute to an image over the course of an experiment. The quantum yield and photobleaching rate together determine the photon budget, which, along with the limits of sample viability, constrain how fast or for how long an imaging experiment can be performed with the necessary amount of signal-to-noise ratio.
Point-spread function. The image obtained from a point source of light, also known as the impulse response of an optical system. The point-spread function is a result of the diffraction-limited resolution of an imaging system.
Poisson processes. A process in which events occur repeatedly and independently at a constant average rate, such that the number of events in a fixed time interval follows a Poisson probability distribution and the waiting times between events are exponentially distributed.
Power-law model. A non-linear regression of data to a power-law function often used to model diffusive processes and fit mean-square displacement data. The models take the form ∼Dτ^α, where D is the diffusion coefficient, τ is the time lag and α is the degree of anomalous motion. The exponent α is equal to 1 for pure diffusion, is less than 1 for confined sub-diffusive motion and is greater than 1 for directed super-diffusive motion.
Quantum efficiency. The ratio of the number of electrons read out by a detector to the number of incoming photons. The quantum efficiency (QE) is a commonly used measure of the performance of a detector: the higher the QE the more efficient the detector.
Quantum yield. The efficiency of a fluorescent molecule, calculated as the ratio of the number of photons emitted to the number of excitation photons absorbed. Quantum yield is dependent on the molecular structure of the fluorophore, as well as on the solvent and local environment (see also photon budget).
Signal-to-noise ratio. The ratio of the power of desired signal to the power of unwanted background noise. A higher signal-to-noise ratio results in better contrast and higher localization precision.
Total internal reflection (critical angle and evanescent field). When a ray of light hits a surface separating one medium from another (e.g. a glass separating water from air), it bends to an extent that is determined by the refractive indices of the different media and the incident angle of the light. For fixed refractive indices, as the angle between the incident ray and the normal to the surface keeps increasing, the angle of the refracted ray in the new medium keeps decreasing. At a high enough angle of the incident ray, no more refraction takes place and the entire beam is reflected back into the original medium. This phenomenon is known as total internal reflection and the angle at which it occurs is the critical angle. When total internal reflection takes place, an electromagnetic field termed the evanescent field is generated that penetrates a few hundred nanometers into the second medium.
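Several of the concepts defined above (Brownian motion, mean square displacement, Monte-Carlo single-particle tracking simulations and the power-law model) can be connected in a short simulation. The following sketch, with purely illustrative parameter values, simulates two-dimensional Brownian trajectories, computes the ensemble-averaged MSD and fits the power-law model; for pure diffusion the recovered exponent α should be ≈1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte-Carlo simulation of 2D Brownian motion: each per-frame displacement
# is Gaussian with variance 2*D*dt along each axis.
D = 0.5          # diffusion coefficient, um^2/s (illustrative value)
dt = 0.01        # frame interval, s
n_steps = 200
n_mol = 2000

steps = rng.normal(0.0, np.sqrt(2 * D * dt), size=(n_mol, n_steps, 2))
tracks = np.cumsum(steps, axis=1)                 # trajectories, um

# Ensemble-averaged mean square displacement at each time lag tau.
lags = np.arange(1, 50)
msd = np.array([np.mean(np.sum((tracks[:, lag:] - tracks[:, :-lag])**2,
                               axis=2)) for lag in lags])
tau = lags * dt

# Fit the power-law model MSD = 4*D*tau**alpha (2D case) on log-log axes.
alpha, log_intercept = np.polyfit(np.log(tau), np.log(msd), 1)
D_fit = np.exp(log_intercept) / 4.0
print(f"alpha = {alpha:.2f}, D = {D_fit:.2f} um^2/s")
```

Running the same fit on trajectories confined within a small region would yield α<1, illustrating how the power-law exponent distinguishes diffusive regimes.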
Inferring is believing: limitations of FCS and FRAP
An oft-repeated phrase associated with imaging is ‘seeing is believing’. However, when it comes to measuring molecular dynamics in vivo, we are forced to make inferences about the underlying reality from indirect observation (Fig. 1A). Molecular populations within living samples are non-homogeneous and important kinetics can exist at diverse temporal scales ranging from seconds to hours. Being able to directly measure heterogeneity and reduce the number of assumptions that need to be made is thus crucial to developing accurate models of the underlying molecular processes in in vivo systems. Prior to the advent of in vivo SMT methods, fluorescence correlation spectroscopy (FCS) (Magde et al., 1972) and fluorescence recovery after photobleaching (FRAP) (Axelrod et al., 1976) were the most widely used tools to infer molecular-scale kinetics. Although powerful, both techniques provide ensemble-averaged information and are highly model dependent (Mazza et al., 2012). Nevertheless, FCS and FRAP continue to provide valuable insights and are useful assays to complement SMT measurements, particularly at faster temporal resolutions (for FCS) or for very stable interactions (FRAP). As such, we include a brief discussion on their principles, utility and limitations here.
In FRAP, fluorescent molecules in a small, usually diffraction limited (see Glossary, Box 2), volume are induced into a dark, non-fluorescing state in a process called photobleaching (see Glossary, Box 2) by using a high-intensity laser pulse. Subsequently, the rate at which fluorescent signal returns due to the surrounding unbleached molecules diffusing into the bleached volume is quantified (Fig. 1B). The shape of the fluorescence recovery curve is dependent on how unbleached molecules from outside the target volume replace the bleached molecules within. This recovery curve is fit to a model to infer the underlying molecular dynamics, quantified by parameters such as diffusion and binding constants. FRAP was notably used to reveal the high mobility of nuclear proteins (Phair and Misteli, 2000), the dynamics of the heat-shock response (Yao et al., 2006) and the kinetics of RNA polymerase II (Darzacq et al., 2007). FRAP was also successfully applied to study chromatin-binding kinetics (Hansen et al., 2017; Mazza et al., 2012) and works well in reaction-dominant kinetic populations. However, as FRAP relies on bleaching many molecules, it inherently provides ensemble-averaged information and conclusions can be strongly dependent on assumptions made in choosing a model (Mazza et al., 2012).
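As a minimal illustration of such model fitting (all parameter values hypothetical), a reaction-dominant recovery curve can be fit to a single-exponential model in which the recovery rate is set by the off-rate of the bound molecules:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)

# Reaction-dominant FRAP model: recovery is limited by unbinding, so the
# normalized intensity follows I(t) = A * (1 - exp(-k_off * t)),
# where A is the mobile fraction and 1/k_off is the residence time.
def frap_model(t, amplitude, k_off):
    return amplitude * (1.0 - np.exp(-k_off * t))

# Synthetic recovery curve: k_off = 0.2 /s, i.e. a 5 s residence time.
t = np.linspace(0, 30, 120)
data = frap_model(t, 0.9, 0.2) + rng.normal(0, 0.01, t.size)

(amp, k_off), _ = curve_fit(frap_model, t, data, p0=(1.0, 1.0))
print(f"mobile fraction = {amp:.2f}, residence time = {1/k_off:.1f} s")
```

The same synthetic data could often be fit comparably well by more complex diffusion-reaction models, which is precisely why conclusions drawn from FRAP depend so strongly on the model chosen.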
In FCS, a focused laser is used to illuminate a diffraction limited spot within a specimen and a point detector is used to measure fluctuations in the emitted fluorescence signal (Fig. 1C). The characteristics of the fluctuating signal are defined by the kinetics of molecules moving into and out of the illuminated volume, which depends on their diffusion coefficients, concentrations and oligomerization states. Although FCS can be applied to estimate chromatin-binding kinetics (Plachta et al., 2011; Fradin, 2017; Mazza et al., 2012), these estimates are highly model dependent. As FCS relies on picking up subtle fluctuations in a fluorescence signal, it requires a low number of emitters and a small target volume (∼1 fl), and provides ensemble-averaged data within that small volume. Although FCS is very powerful for analysing rapidly diffusing molecules and quantifying concentrations, applying it to heterogeneous kinetic populations must be approached with great care.
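As an illustration of this model dependence, the measured autocorrelation curve is typically fit to a closed-form model; the sketch below (parameter values hypothetical) uses the standard model for free 3D diffusion through a Gaussian focal volume, in which the curve's amplitude reports the average number of molecules in the volume and its decay time reports their diffusion:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)

# Standard FCS autocorrelation model for free 3D diffusion through a
# Gaussian focal volume:
#   G(tau) = (1/N) * (1 + tau/tau_D)^-1 * (1 + tau/(S^2 * tau_D))^-1/2
# N     = mean number of molecules in the focal volume (sets the amplitude)
# tau_D = average diffusion time through the volume
# S     = structure parameter (axial/lateral extent ratio), held fixed here.
S = 5.0

def fcs_model(tau, n_mol, tau_d):
    return (1.0 / n_mol) / ((1 + tau / tau_d)
                            * np.sqrt(1 + tau / (S**2 * tau_d)))

# Synthetic correlation curve: 10 molecules, 1 ms diffusion time.
tau = np.logspace(-5, 0, 100)              # time lags, s
g = fcs_model(tau, 10.0, 1e-3) + rng.normal(0, 1e-3, tau.size)

(n_fit, tau_d_fit), _ = curve_fit(fcs_model, tau, g, p0=(1.0, 1e-3))
print(f"N = {n_fit:.1f} molecules, tau_D = {tau_d_fit*1e3:.2f} ms")
```

Choosing a different model (e.g. anomalous diffusion or diffusion plus binding) would recover different parameters from the same curve, which is the crux of the model-dependence problem.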
Unlike FCS or FRAP, SMT allows us to directly visualize different kinetic populations with spatial context (Fig. 1D). However, although more direct, the challenges and trade-offs involved in conducting SMT experiments still require us to make many choices and assumptions on how we obtain and model data to infer the underlying molecular reality.
Key ingredients for a single-molecule tracking experiment
The major barriers to tracking individual molecules are differentiating them from one another and from the background noise in an image. The former barrier is due to the diffraction-limited resolution of light microscopy, which is ∼250 nm for visible light (Abbe, 1873); the latter is determined by the signal-to-noise ratio (see Glossary, Box 2) of the imaging system. The resolution barrier comes from the diffraction of point light sources through the apertures of the imaging optics to the camera, causing the light to spread into a blob known as the point-spread function (PSF; see Glossary, Box 2). To overcome both these barriers, most in vivo SMT experiments use sparse localization microscopy in which the position of individual molecules is determined with nanometre-scale precision (Thompson et al., 2002; Mortensen et al., 2010). The localization error depends on the number of photons collected and the background noise level (Bobroff, 1986). SMT experiments that rely on this principle of localization microscopy have been performed in vitro or on cell membranes for almost 40 years (Barak and Webb, 1981; Yildiz et al., 2003; Funatsu et al., 1995). Single-molecule localization microscopy relies on having sparse enough detections within a single exposure, such that there is a negligible probability of two fluorescent molecules being closer together than the resolving power of the microscope objective being used (size of the PSF). This idea of separating out molecules that are close in space into a temporal dimension (Betzig, 1995; Burns et al., 1985) via sparse excitation is also the basis of super-resolution imaging such as photoactivated localization microscopy (PALM) (Betzig et al., 2006) and stochastic optical reconstruction microscopy (STORM) (Rust et al., 2006). Rather than tracking molecules, PALM and STORM work by accumulating a sufficiently large number of localizations to reconstruct a super-resolved image.
It is important to note that high localization precision does not directly translate to high-resolution images, which depend on having dense enough labelling to sufficiently sample the structure of interest. A recently developed alternative approach for SMT is MINFLUX (Balzarotti et al., 2017), which combines the principles of stimulated emission depletion microscopy (STED) (Hell and Wichmann, 1994) with localization microscopy, and reports localization precisions down to 1 nm with higher photon efficiencies than traditional camera-based localization. However, widespread use of MINFLUX has been limited due to constraints in implementation and a lack of clear practical advantages over more accessible technologies in the context of in vivo tracking studies (Prakash, 2021). Owing to their more widespread use and applicability, in this Review we focus only on hardware and analysis technologies for camera-based localization methods.
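The dependence of localization precision on photon counts and background can be made concrete with the widely used analytical estimate of Thompson et al. (2002) for localizing a molecule by fitting a Gaussian to its PSF. The sketch below uses illustrative values for the PSF width, pixel size and background noise:

```python
import numpy as np

# Approximate localization precision for a 2D Gaussian PSF fit
# (Thompson et al., 2002):
#   sigma^2 = (s^2 + a^2/12)/N + 8*pi*s^4*b^2 / (a^2 * N^2)
# s = PSF standard deviation (nm), a = pixel size (nm),
# N = detected photons, b = background noise (photons/pixel).
# All default values below are illustrative, not measured.
def localization_precision(n_photons, psf_sd=125.0, pixel=100.0, bg=2.0):
    var = (psf_sd**2 + pixel**2 / 12) / n_photons \
        + 8 * np.pi * psf_sd**4 * bg**2 / (pixel**2 * n_photons**2)
    return np.sqrt(var)   # nm

for n in (100, 1000, 10000):
    print(f"{n:>6d} photons -> {localization_precision(n):5.1f} nm precision")
```

Note the two regimes: at high photon counts precision scales as 1/√N (shot-noise limited), whereas at low counts the background term, which scales as 1/N, dominates.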
The workflow for conducting a SMT experiment can be broken down into the following key steps (Fig. 2): (1) making your protein of interest visible to an optical microscope through fluorescence labelling; (2) imaging your sample with high enough contrast (signal-to-noise ratio) to localize single molecules with high precision and at the appropriate temporal resolution to measure the kinetics of interest; (3) localizing each molecule and connecting localizations into trajectories; and (4) modelling of trajectory data. We next discuss the challenges associated with each of these steps, along with recent technological advances that help mitigate these challenges.
Trade-offs in single molecule tracking
Although SMT is performed using a fluorescence microscope, the optimal experimental conditions for SMT differ from those required for conventional microscopy experiments. For SMT, the parameters that are generally optimized are localization precision, temporal resolution and trajectory length. Selecting which parameter to optimize intrinsically involves trade-offs for the other two (Fig. 3). The localization precision depends on the total number of photons detected in each exposure and the signal-to-noise ratio (Thompson et al., 2002; Mortensen et al., 2010). The number of photons that can be detected is set by the photon budget (see Glossary, Box 2) for a given fluorescent molecule and is influenced by its surrounding environment. In Fig. 3A, we use the analogy of a ‘bag of photons’ to represent the number of photons that can be collected from a given molecule in a certain amount of time. A fluorescent molecule's quantum yield (see Glossary, Box 2), the ratio of photons emitted to photons absorbed, dictates how quickly the bag can be emptied and thus the maximum theoretical temporal resolution that can be achieved at a given laser power and desired localization precision. The photobleaching rate dictates how long this temporal resolution can be maintained, i.e. how long before the bag is empty. While the quantum yield is determined by the choice of label (if the chemical environment is identical), the photobleaching rate is linked to the excitation power, exposure time and molecular stability. These trade-offs are well summarized in a ‘triangle of frustration’ (Fig. 3B), the vertices of which limit the dynamic range of SMT experiments in space and time. A consideration of paramount importance, which is not included in the triangle, is the effect of imaging on the viability of the specimen.
For any set of optimized parameters, the health of the specimen must be assessed empirically, although some consequences may not be detectable easily (Laissue et al., 2017; Icha et al., 2017). Thus, it is generally advisable to minimize the excitation laser power and total light exposure in any experiment.
When designing an experiment, a point within the triangle of frustration must be chosen to explore a certain aspect of the underlying kinetic population (Fig. 3C). For example, to capture trajectories from both fast- and slow-moving molecular populations, short exposure times are required to optimize for imaging speed. In order to collect enough photons to maintain adequate localization precision during short exposures, the excitation powers must be increased, leading to higher photobleaching rates and shorter trajectories in time. Additionally, higher powers will lead to more photodamage and may compromise the integrity of the specimen. Conversely, to optimize for longer trajectories, longer exposures would be used, allowing for lower laser powers; however, faster-moving molecules now blur into the background. Motion blur in SMT for fast-moving molecules is an inherent limitation of the technique due to camera speeds, as well as the quantum yield of fluorescent labels (Izeddin et al., 2014). Any molecule that moves a greater distance than the localization precision during the exposure time will be susceptible to blurring artefacts. Strategies to mitigate motion blur arising from exposure-time limits have been implemented using stroboscopic illumination (short pulses of excitation) (Elf et al., 2007), which reduces the effective exposure time at the expense of the number of photons collected. Another strategy is to use short exposure times (<10 ms) at higher laser powers to minimize motion blur, with longer (>200 ms) ‘dark’ intervals between exposures to minimize bleaching, resulting in longer trajectories at the expense of being able to quantify faster kinetics (Paakinaho et al., 2017).
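The onset of motion blur can be estimated from the expected displacement during an exposure: in two dimensions, a molecule with diffusion coefficient D moves an rms distance of √(4Dt_exp). The sketch below (the precision value and diffusion coefficients are illustrative, not measurements) flags the regimes in which this displacement exceeds a typical localization precision, i.e. where stroboscopic or interleaved 'dark interval' illumination becomes necessary:

```python
import numpy as np

# In 2D, the rms displacement during an exposure of length t_exp is
# sqrt(4 * D * t_exp); if this exceeds the localization precision,
# the PSF smears out and the molecule blurs into the background.
def rms_displacement_nm(D_um2_s, t_exp_ms):
    return np.sqrt(4 * D_um2_s * t_exp_ms * 1e-3) * 1e3   # nm

precision = 30.0   # nm; assumed in vivo localization precision
for D in (0.05, 1.0, 5.0):        # e.g. bound, slow and fast populations
    for t_exp in (1.0, 10.0, 50.0):   # exposure times, ms
        blur = rms_displacement_nm(D, t_exp)
        flag = "blurred" if blur > precision else "ok"
        print(f"D={D:4.2f} um^2/s, t_exp={t_exp:4.0f} ms: "
              f"{blur:6.1f} nm ({flag})")
```

Such back-of-the-envelope estimates make clear why a single exposure time cannot faithfully capture both chromatin-bound and freely diffusing populations.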
The trade-offs summarized in Fig. 3 capture the optimization choices available with a given fluorescent label and microscope system. Technological advances in microscopy and labelling technologies reduce the severity of these trade-offs. Recently, there have been several improvements in fluorophore stability and quantum yield, allowing faster or longer duration measurements. Simultaneously, advances in microscopy technologies have led to improved signal-to-noise ratios and the ability to image faster without increasing laser powers in more complex specimens. Together, these advances have ushered in a new era in SMT.
Boosting signal: improved fluorophores and labelling strategies
One of the most crucial steps for SMT is fluorescently labelling the molecule of interest. Successful labelling involves careful consideration of both the labelling strategy and the specific fluorescent label. Parameters to consider when deciding between options include the quantum yield of the label, stability (how quickly molecules photobleach), spectral properties, propensity to oligomerize, ease of implementation and potential perturbation to protein function. Recent improvements in fluorophores increase the photon budget, while advances in gene editing allow the endogenous labelling of proteins with relative ease. There are two major strategies for labelling: fusion to a fluorescent protein (FP) or fusion to a peptide tag that can bind small molecule fluorophores (Fig. 2A).
Photoconvertible and photoactivatable fluorescent proteins have enabled the rapid adoption of localization-based super-resolution imaging with the ability to turn fluorescence on (for activatable proteins) or switch fluorescence to a different colour (for convertible proteins), allowing the sparse detections required for localization. A number of fluorescent proteins were developed based on variants of naturally existing proteins, such as photoactivatable green fluorescent protein (PA-GFP) (Patterson and Lippincott-Schwartz, 2002), Dendra (Gurskaya et al., 2006) and EosFP (Wiedenmann et al., 2004), which enabled in vivo SMT applications. However, early versions suffered from poor stability, poor quantum yield and a tendency to oligomerize. Rational design has progressively improved these proteins for SMT. For example, EosFP was made less prone to oligomerization in the fluorescent proteins mEos3.1 and mEos3.2 (Zhang et al., 2012), and then its stability was improved in mEos4b (Paez-Segala et al., 2015), which enables the acquisition of longer trajectories (De Zitter et al., 2019). An alternative to photoactivatable proteins to satisfy the condition of sparsity is stochastic labelling by using translational read-through to express a low-copy number of tagged proteins with multiple fluorophores per protein (Liu et al., 2018a). Although a stochastic multi-labelling approach provides longer trajectories, a smaller proportion of the molecular population is sampled and there is a greater probability of perturbing protein function with the larger tags. Similarly, fluorescent protein array tags, which rely on stochastic turnover, were recently introduced to enable theoretically unlimited tracking in time (Ghosh et al., 2019). However, signal-to-noise issues and the size of the array may limit their applicability. A community resource known as FPBase (Lambert, 2019) is now available online to help compare fluorescent proteins in an interactive manner (www.fpbase.org).
We strongly encourage the reader to peruse this resource.
The development of self-labelling peptide tags, such as SnapTag (Gautier et al., 2008) and HaloTag (Los et al., 2008), allows a wide variety of synthetic fluorophore dyes to be conjugated to a single transgenic protein through covalent bonds to a shared chemical moiety on the fluorophore. Over the past 5 years, there has been a revolution in the synthesis of organic dyes that have improved quantum yields, photostability and cell permeability, and offer a wide variety of spectral bandwidths (Grimm et al., 2015, 2016; Wang et al., 2020; Frei et al., 2019). The palette of available dyes continues to expand further into redder wavelengths, broadening the ability to carry out multicolour experiments (Grimm et al., 2020; Lukinavičius et al., 2013). However, free dyes can lead to artefacts; thus, concentrations and wash steps must be optimized for each protein and cell type, and potential toxicity must be considered when selecting a dye.
For both FPs and the self-labelling systems, rigorous controls and multiple labels should be tested. The specific linker sequence used for conjugation and the site of insertion (N versus C terminus versus internal) can also play a defining role (Snapp, 2005). Multiple linkers and fusion sites should be tested with assays including protein expression levels and function. In summary, FPs provide a simpler workflow, but in almost every other way bright small-molecule dyes are superior for SMT (Banaz et al., 2019).
Reducing background noise: improved excitation strategies
Advances in imaging hardware come in the form of new excitation and detection technologies with the overall goal of reducing the severity of the trade-offs between temporal resolution, trajectory length and localization precision (Fig. 3), while enabling imaging deeper inside specimens. The earliest SMT experiments were conducted using wide-field epi-fluorescence microscopes (Barak and Webb, 1981). Wide-field illumination is often the default geometry used to conduct fluorescence microscopy and is relatively facile. However, the incident excitation light propagates through a large volume of the sample, resulting in significant background fluorescence from excited, out-of-focus molecules, reducing the signal-to-noise ratio and limiting localization precision (Fig. 4A). To limit out-of-focus excitation, total internal reflection fluorescence (TIRF; see Glossary, Box 2) microscopy (Axelrod, 1981) uses tilted illumination incident on the sample at an angle above what is known as the ‘critical angle’. Illumination above the critical angle results in total reflection of the incident light. An evanescent electromagnetic field is generated that decays ∼100 nm into a sample and results in exquisite contrast. However, because of the limited axial propagation, TIRF can only be used to study molecules at the surface of cells close to the coverslip or in in vitro systems (Sako et al., 2000) (Fig. 4B). The popularity of TIRF microscopy for in vitro applications and the wide availability of commercial hardware have resulted in the rapid expansion of in vivo SMT studies following the advent of highly inclined and laminated optical sheet microscopy (HILO) (Tokunaga et al., 2008). HILO microscopy can use the same hardware as TIRF, but decreases the angle of the incident light so it is slightly lower than the critical angle. This results in an inclined light-sheet propagating into the sample, which reduces out-of-plane excitation (Fig. 4C).
HILO is widely used for SMT in single cells grown on glass; however, contrast degrades quickly at deeper positions due to increasing out-of-plane illumination, making HILO not optimal for thicker specimens.
Further improvements in restricting the axial range of illumination and moving single-molecule imaging into the context of intact tissue, embryos and small animals are enabled by high-numerical-aperture light-sheet microscopy. Although light-sheet microscopes have been used in the context of developmental biology for many years, the first high-numerical-aperture implementations suitable for single-molecule imaging have been realized only relatively recently (Ritter et al., 2010; Cella Zanacchi et al., 2011). These initial implementations were able to demonstrate SMT up to ∼200 µm deep and sparked the development of a plethora of techniques. Two geometries of particular interest for imaging in embryos are reflected and tilted light-sheet microscopy (Gebhardt et al., 2013; Gustavsson et al., 2018; Greiss et al., 2016; Hu et al., 2013), which use mirrors or prisms instead of orthogonally oriented objectives (Fig. 4D). Reflected light-sheets were successfully used for imaging transcription factor dynamics in zebrafish embryos (Reisser et al., 2018). The idea of using tilted and reflected light-sheets was extended to create single-objective light-sheets on microscopes with inverted geometries, relying on microfabrication methods and microfluidics to place a turning mirror close to the sample (Galland et al., 2015; Meddens et al., 2016). More recently, geometries that build on remote re-imaging and re-focusing (Botcherby et al., 2007, 2008) to correct for a tilted beam offer versatile solutions for samples that must be imaged using inverted microscopes (Yang et al., 2019; Sapoznik et al., 2020). The methods discussed above use Gaussian light-sheets for illumination, which carry a quadratic trade-off between the thickness of the light-sheet and the field-of-view. For example, for a 1.1 numerical aperture objective with a depth of focus of ∼700 nm, the effective field of view would be only ∼1 µm, which is impractically small for most biological contexts (Planchon et al., 2011).
These trade-offs limit using an optimal excitation condition in which only molecules that lie within the depth of focus of the detection objective are illuminated such that there is no contribution to the background from out-of-focus fluorophores.
One of the most significant advances in light-sheet microscopy has come in the form of using alternatives to Gaussian beams (see Glossary, Box 2) for illumination to allow for the creation of thin sheets that can propagate over larger fields of view. This was initially achieved using Gauss-Bessel beams (see Glossary, Box 2) (Planchon et al., 2011; Gao et al., 2012), which enabled a light-sheet thickness of 0.5 µm over a 500 µm field of view and were used to image single molecule dynamics in embryonic stem cells (Chen et al., 2014b). Although Gauss-Bessel beams are a significant improvement over Gaussian beams in terms of optimizing field-of-view and sectioning, they carry significant energy in out-of-plane side lobes. Building off the success of Bessel-beam (see Glossary, Box 2) light-sheets, lattice light-sheet microscopy (Fig. 4E) uses interference between an array of Bessel beams to suppress side lobes and create uniform light-sheets with large fields of view, which can be tuned to match the depth of field of the detection objective (Chen et al., 2014a). The advantages of using lattice light-sheets over Gaussian beams in terms of propagation distance and thickness trade-offs were recently challenged in a systematic comparison (Chang et al., 2020). However, further empirical investigation in the context of biological applications is required. In addition to imaging in embryonic stem cells (Liu et al., 2014) and embryoid bodies, lattice light-sheet microscopy has enabled tracking of transcription factor dynamics inside the nuclei of living Drosophila melanogaster (Mir et al., 2017, 2018b) and mouse embryos (Mir et al., 2018a).
Recently, lattice light-sheet microscopy was combined with adaptive optics to correct for optical aberrations induced by the sample, mounting media and imaging solution (Liu et al., 2018b). Unlike most applications of adaptive optics in microscopy, the decoupled excitation and detection arms of light-sheet microscopes allow the independent correction of aberrations in both arms of the optical path (Wilding et al., 2016; Bourgenot et al., 2012). The ability to correct for light-sheet displacements (which cause defocus) and expansions (which cause out of plane excitation) further enhances the signal-to-noise ratio provided by lattice light-sheet excitation. A complementary approach to adaptive optics-based corrections is to use automated optical paths combined with computational methods to create ‘smart-microscopes’ that improve the signal-to-noise ratio and maintain it over long periods of time (Royer et al., 2016). Although not yet applied to SMT, adaptive optics-based approaches for optimizing excitation and other forms of adaptive microscopy supported by computational methods are likely to provide a more-efficient use of the limited photon budget for improved tracking at higher temporal resolutions within embryos and other three-dimensional (3D) tissue contexts.
Advances in detection: 3D tracking and turning photons into electrons
When imaging rapidly diffusing populations of molecules, the fast-moving molecules may be under-sampled due to a higher probability of axial defocalization (see Glossary, Box 2). Rigorous approaches have been developed to correct for this under-sampling by calculating the probability of defocalization (Hansen et al., 2018; Mazza et al., 2012). However, where the photon budget permits, it may be preferable to directly localize particles along the axial dimension to avoid making assumptions about the rate of loss due to defocalization. Axial localization may also be important when trying to analyse the motion of molecules relative to a region of interest in the sample. Two major types of approach have been demonstrated for axial localization in SMT: defocused and/or refocused detection, and PSF engineering. To estimate position from images of defocused PSFs, the size and spacing of the rings that appear as a point moves through focus are used (Fig. 5A) (Speidel et al., 2003). However, the axially symmetric shape of PSFs can make it difficult to determine the direction of movement at small axial displacements, and the spreading out of photons at larger displacements leads to poor localization precision. An extension of defocused imaging is simultaneous multiplane imaging. Early implementations used biplane imaging (Juette et al., 2008; Prabhat et al., 2004; Toprak et al., 2007) and the approach was elegantly extended to nine planes using diffractive optics (see Glossary, Box 2) (Abrahamsson et al., 2013). Although powerful, multiplane methods severely challenge the photon budget because the signal from each molecule is split across multiple planes.
An alternative to multiplane detection is shaping or engineering the detection PSF. The most popular approach is to simply insert a cylindrical lens (CL) (Kao and Verkman, 1994; Huang et al., 2008) in the detection path, which results in an ellipsoidal PSF rotated by 90° above and below the focus (Fig. 5A). The axial asymmetry provided by the astigmatic aberrations induced by the CL is then used for localization with ∼10 nm precision in the axial direction. More-complex PSF engineering is realized using phase-plates (Baddeley et al., 2011), deformable mirrors or spatial light modulators, which allow the generation of higher-order aberrations. PSFs with complex engineered structures and behaviours have been generated, such as double helices (a pair of points that rotate as the emitter moves through focus; Pavani et al., 2009), self-bending PSFs (Jia et al., 2014), and optimally information-rich complex structures termed ‘saddle points’ (Shechtman et al., 2014) and ‘tetrapods’ (Shechtman et al., 2015) (Fig. 5A). Optimally engineered PSFs also extend axial localization beyond the native focal range of the objective while maintaining lateral localization precision. The extended range comes from the ability to prevent the lateral spreading of photons as a function of axial displacement, as happens in simple defocused detection. However, more-complex PSFs tend to be larger laterally, which can limit the maximum density of single molecules that can be localized per frame. Engineered PSFs are also more prone to motion-blurring artefacts and can be challenging to use for studying fast-diffusing proteins. In addition to enabling 3D tracking, adaptive optics-based PSF engineering has also been used to correct aberrations in the detection path of the microscope to improve localization precision (Izeddin et al., 2012; Liu et al., 2018b).
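As an illustration of astigmatism-based axial localization, the measured spot widths along x and y can be compared against a calibration curve acquired by stepping a fluorescent bead through focus. The sketch below assumes a simple defocus model for the calibration curves and is not the implementation used in the cited work:

```python
import numpy as np

def z_from_astigmatism(wx, wy, calib_z, calib_wx, calib_wy):
    """Estimate axial position from astigmatic PSF widths.

    wx, wy: fitted Gaussian widths of the measured spot along x and y.
    calib_z, calib_wx, calib_wy: calibration curves measured by stepping a
    bead through focus (e.g. with the cylindrical-lens approach of
    Huang et al., 2008). Returns the calibration z that best matches the
    measured widths; square roots down-weight the large widths far from focus.
    """
    d = (np.sqrt(wx) - np.sqrt(calib_wx))**2 + (np.sqrt(wy) - np.sqrt(calib_wy))**2
    return calib_z[np.argmin(d)]
```

In practice the calibration curves are fit to smooth defocus models and the lookup is replaced by a continuous minimization, but the principle is the same.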
Although the 3D tracking methods described here have seen limited use compared with 2D tracking (because of their greater demands on the photon budget and the need for more specialized hardware), we anticipate that the advances in excitation and labelling strategies will eventually enable more widespread use of 3D tracking.
The final hardware component of a microscope for SMT is the detector or camera. Choosing the correct detector is crucial for data quality and typically represents a significant proportion of the hardware cost. The ideal camera for SMT would offer a large field of view, high temporal resolution, a large dynamic range and an excellent signal-to-noise ratio – even in low-light conditions. As with the trade-offs discussed above for microscopy, these parameters are often at odds, although the compromises are becoming less severe due to new technology. For detectors, too, the parameter of utmost importance is the signal-to-noise ratio. The signal component is determined by how efficiently incoming photons are converted to electrical output, known as the quantum efficiency (QE; see Glossary, Box 2), which is crucial given the low photon counts of SMT experiments. However, if the noise levels (which come from thermal, stochastic and electronic sources) are too high, there is insufficient contrast for localization (Long et al., 2012). Two types of detectors are used for SMT: electron-multiplying charge-coupled devices (EMCCDs) and scientific complementary metal-oxide-semiconductor detectors (sCMOSs) (Fig. 5B). Although EMCCDs and sCMOS detectors are now being produced with comparable QEs of ∼95%, EMCCDs largely remain the detector of choice for SMT due to their ability to amplify low-light signals through electron multiplication (see Glossary, Box 2). Electron multiplication does add an additional noise component (a factor of ∼1.41), but at low light levels this factor is not significant. However, because electrons in EMCCDs are read out serially, whereas sCMOS pixels are read out in parallel, EMCCDs are generally limited to lower frame rates for comparable fields of view.
The parallel readout configuration of sCMOS sensors does add pixel-to-pixel variation in the noise pattern, which can complicate image analysis and introduce artefacts, but many approaches exist for correcting for this variation (Mandracchia et al., 2020; Diekmann et al., 2021 preprint). Historically, SMT experiments were limited by the photon budget and not by the frame rate of cameras, because long exposure times were required to collect enough photons for high localization precision. However, with the improvements in fluorophores and microscopy hardware discussed above, faster SMT is now possible. With the advent of back-illuminated sCMOS cameras with high QEs, large fields of view and high frame rates, EMCCDs are beginning to lose their monopoly on SMT experiments (Wang et al., 2017).
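The regime in which each detector type wins can be illustrated with a simple per-pixel noise model: electron multiplication suppresses the effective read noise by the EM gain but multiplies shot noise by an excess-noise factor of √2 ≈ 1.41. The parameter values below (read noise, EM gain) are typical assumed figures, not specifications of any particular camera:

```python
import numpy as np

def snr_emccd(photons, qe=0.95, bg=0.0, read_noise=50.0, em_gain=300.0, f_excess=1.41):
    """Approximate per-pixel SNR for an EMCCD: the excess-noise factor
    multiplies the shot noise, while electron multiplication divides the
    effective read noise by the EM gain."""
    s = qe * photons
    return s / np.sqrt(f_excess**2 * (s + qe * bg) + (read_noise / em_gain)**2)

def snr_scmos(photons, qe=0.95, bg=0.0, read_noise=1.5):
    """Approximate per-pixel SNR for a modern sCMOS: no excess noise,
    but the full (already low) read noise applies."""
    s = qe * photons
    return s / np.sqrt(s + qe * bg + read_noise**2)
```

With these assumed numbers, the EMCCD advantage survives only at photon counts of a few per pixel; at higher counts the excess-noise factor makes the low-read-noise sCMOS the better choice, consistent with the trend described above.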
Connecting the dots: localization and tracking
Localization algorithms aim to optimize precision and computation times, and perform well in limited signal-to-noise conditions. Many early and current implementations fit detections to an assumed Gaussian PSF (Betzig et al., 2006); however, using experimentally measured PSFs (Mlodzianoski et al., 2009; Li et al., 2018) can lead to higher localization precisions while reducing computation times. New hardware technologies and engineered PSFs were accompanied by a plethora of new localization algorithms. A community effort recently evaluated 36 software packages against simulated datasets (Sage et al., 2019) in the second such competition in the past decade (Sage et al., 2015). The competition assessed localization precision while varying particle densities and signal-to-noise ratios. For 3D localization, the greatest variation in performance occurred at larger defocalization distances. The algorithms that performed well at these larger distances also behaved better at lower signal-to-noise ratios and higher particle densities. For 2D localization in sparse conditions, it was suggested that localization algorithms have now reached near optimal performance. We urge the reader to examine the report on the competition before deciding on an algorithm (Sage et al., 2019).
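As a minimal illustration of the Gaussian-PSF fitting that many localization algorithms perform, a small camera region of interest can be fit with a symmetric 2D Gaussian to obtain a sub-pixel centre estimate. This is a least-squares sketch; production packages typically use maximum-likelihood estimators and, as noted above, measured PSF models:

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian_2d(coords, x0, y0, sigma, amplitude, offset):
    """Symmetric 2D Gaussian model of a diffraction-limited PSF."""
    x, y = coords
    g = offset + amplitude * np.exp(-((x - x0)**2 + (y - y0)**2) / (2 * sigma**2))
    return g.ravel()

def localize_spot(image):
    """Fit a 2D Gaussian to a small ROI; returns the sub-pixel (x0, y0) centre."""
    ny, nx = image.shape
    y, x = np.mgrid[0:ny, 0:nx]
    # initial guesses: ROI centre, typical PSF width, peak height and background
    p0 = (nx / 2, ny / 2, 1.5, image.max() - image.min(), image.min())
    popt, _ = curve_fit(gaussian_2d, (x, y), image.ravel(), p0=p0)
    return popt[0], popt[1]
```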
Several choices are also available for tracking, and efforts to compare available software have been made for more than two decades (Cheezum et al., 2001; Chenouard et al., 2014). For situations where particles are very sparse or frame rates are low, long-established algorithms such as nearest-neighbour linking can be used (Tinevez et al., 2017), but denser localizations or more mobile populations quickly lead to tracking errors. Much of the effort in improving tracking algorithms has thus focused on optimizing performance in higher-density situations while reducing computation times (Jaqaman et al., 2008; Sergé et al., 2008). The choices and parameters can be overwhelming, but we encourage the reader to expend effort at the outset of a project to explore and deeply understand their tracking algorithm of choice, in order to understand any potential sources of artefacts that may be specific to their application.
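For the sparse, low-mobility regime where nearest-neighbour linking suffices, the idea can be sketched in a few lines. This is a greedy, illustrative implementation; the cited packages handle track gaps, merging and dense fields far more robustly:

```python
import numpy as np

def link_nearest_neighbour(frames, max_disp):
    """Greedy nearest-neighbour linking of localizations across frames.

    frames: list of (N_i, 2) arrays of (x, y) localizations, one per frame.
    max_disp: maximum allowed frame-to-frame displacement (search radius).
    Returns a list of trajectories, each an (L, 2) array of positions.
    """
    tracks = [[p] for p in frames[0]]          # open one track per first-frame spot
    active = list(range(len(tracks)))          # indices of tracks still growing
    for pts in frames[1:]:
        unused = list(range(len(pts)))
        still_active = []
        for ti in active:
            if not unused:
                continue                       # no detections left: track ends
            last = tracks[ti][-1]
            d = [np.hypot(*(pts[j] - last)) for j in unused]
            k = int(np.argmin(d))
            if d[k] <= max_disp:               # link only within the search radius
                tracks[ti].append(pts[unused[k]])
                unused.pop(k)
                still_active.append(ti)
        for j in unused:                       # unmatched detections seed new tracks
            tracks.append([pts[j]])
            still_active.append(len(tracks) - 1)
        active = still_active
    return [np.array(t) for t in tracks]
```

With denser fields, the greedy assignment above starts making wrong links, which is exactly why globally optimal assignment methods (e.g. Jaqaman et al., 2008) were developed.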
Inference is inevitable: analysis of trajectory data
The goal of most current SMT analysis is to determine the different kinetic populations that exist for each molecular species (Fig. 6A) and then quantitatively characterize each sub-population. The number of trajectories required for the different analyses depends on the distribution of molecules between populations and should be determined empirically; generally, however, thousands to tens of thousands of trajectories are required – and more is always better. The most common and oldest way to analyse diffusive particle motion is to calculate the mean square displacement (MSD; see Glossary, Box 2). As individual trajectories are noisy, and diffusion is a stochastic process, ensemble-averaged parameters must be examined. Importantly, averaging should be carried out after sorting, so that different diffusive populations are not mixed. The interpretation of MSDs at longer time lags is also noisier, because fewer displacements contribute to the average (Saxton and Jacobson, 1997). In the case of purely diffusive motion, the relationship between MSD and time is linear (Fig. 6B), and the diffusion coefficient can be calculated with a simple linear fit. However, in the crowded cellular environment, motion is often sub-diffusive and anomalous. To characterize the degree of anomalous diffusion, the MSD is fit to a power-law model (Fig. 6B; see Glossary, Box 2). Although MSD analysis is extremely popular, it can lead to inaccurate fits for rapidly diffusing molecules and short trajectories (Hansen et al., 2018). In addition, although all analysis approaches are prone to biases arising from data filtering and thresholding, MSDs are highly sensitive to these biases, which can lead to inaccurate multi-state models.
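The time-averaged MSD and the power-law fit described above can be computed as follows. This is a sketch: for 2D Brownian motion the MSD is 4Dt, so the fitted prefactor divided by 4 estimates the diffusion coefficient, and the exponent α distinguishes Brownian (α ≈ 1) from sub-diffusive (α < 1) motion:

```python
import numpy as np

def msd(track, dt, max_lag=None):
    """Time-averaged mean square displacement of one trajectory.

    track: (N, d) array of positions; dt: frame interval;
    max_lag: number of time lags to evaluate (default N - 1).
    Returns (lag_times, msd_values).
    """
    n = len(track)
    max_lag = max_lag or n - 1
    lags = np.arange(1, max_lag + 1)
    out = np.empty(max_lag)
    for i, lag in enumerate(lags):
        disp = track[lag:] - track[:-lag]
        out[i] = np.mean(np.sum(disp**2, axis=1))
    return lags * dt, out

def fit_anomalous(lag_t, msd_vals, n_points=4):
    """Fit MSD = K * t**alpha over the first n_points lags (log-log linear fit).

    alpha ~ 1 indicates Brownian motion, alpha < 1 sub-diffusion;
    for 2D Brownian motion, D = K / 4.
    """
    alpha, log_k = np.polyfit(np.log(lag_t[:n_points]),
                              np.log(msd_vals[:n_points]), 1)
    return alpha, np.exp(log_k)
```

Restricting the fit to the first few lags reflects the point above that long-lag MSD values are averaged over fewer displacements and are correspondingly noisier.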
As an alternative method, time-dependent displacement histograms can be fit to Gaussian probability density mixture models in order to determine diffusion constants and the particle fractions in each state (Matsuoka et al., 2009) (Fig. 6C). These models can be normalized against photobleaching and particle motion out of the focal plane to account for bias in the trajectories (Mazza et al., 2012). When fit to models of two- or three-state Brownian motion (see Glossary, Box 2), displacement histogram analysis can provide information on diffusion coefficients and the fraction of molecules in each population. Spot-On, an open-source implementation of this technique, is a convenient software package for multi-state single-particle tracking (SPT) analysis (Hansen et al., 2018). Monte-Carlo SPT simulations (see Glossary, Box 2) suggest that displacement histogram analysis is less subject to bias than MSD and can produce accurate determinations of diffusion parameters in two- and three-state models (Hansen et al., 2018). However, Spot-On assumes that each trajectory consists of particles in only a single diffusive state, which may lead to inaccurate parameters for particles where rapid state transitions may occur in a single trajectory. ‘Variational Bayes SPT’ is one alternative approach, which combines data from a large number of short trajectories to infer state occupancies, transition probabilities and diffusion constants (Persson et al., 2013). More recently, analytical frameworks that infer parameters and state transitions directly from distributions of diffusion coefficients (Vink et al., 2020) and using non-parametric Bayesian statistics (see Glossary, Box 2) were reported (Karslake et al., 2021; Heckert et al., 2021 preprint). Other than Spot-On, the methods discussed above do not take axial defocalization into account, which could lead to biases in the estimated parameters for fast-moving populations.
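To illustrate the displacement-distribution approach: for 2D Brownian motion, the magnitude r of a single-lag displacement follows p(r) = r/(2Dτ)·exp(−r²/4Dτ), and a two-state mixture of such densities can be fit by maximum likelihood to recover the bound fraction and the two diffusion coefficients. This minimal sketch ignores localization error and the defocalization correction that Spot-On implements:

```python
import numpy as np
from scipy.optimize import minimize

def two_state_neg_loglik(params, r, tau):
    """Negative log-likelihood of a two-state 2D Brownian displacement mixture.

    params: (f, d1, d2) = fraction in state 1 and the two diffusion
    coefficients; r: observed displacement magnitudes over lag tau.
    """
    f, d1, d2 = params
    p1 = r / (2 * d1 * tau) * np.exp(-r**2 / (4 * d1 * tau))
    p2 = r / (2 * d2 * tau) * np.exp(-r**2 / (4 * d2 * tau))
    return -np.sum(np.log(f * p1 + (1 - f) * p2 + 1e-300))

def fit_two_state(r, tau, p0=(0.5, 0.05, 2.0)):
    """Maximum-likelihood fit of the two-state mixture; returns (f, d1, d2)."""
    res = minimize(two_state_neg_loglik, p0, args=(r, tau),
                   bounds=[(1e-3, 1 - 1e-3), (1e-4, None), (1e-4, None)])
    return res.x
```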
A major limitation of all the approaches discussed here is the assumption of purely diffusive Brownian motion. In the future, analysis methods that better capture the anomalous nature of protein motion in the crowded nuclear environment are needed.
One significant step in the direction of characterizing the effects of nuclear crowding and overcoming some of the limitations of MSD and displacement histograms is the analysis of relative angles of motions (Burov et al., 2013) (Fig. 6D). Molecules that tend to redundantly revisit the same spatial locations result in an anisotropic angle distribution with a backwards (180°) bias, whereas purely diffusive molecules exhibit a more isotropic exploration of space. This type of backwards-biased motion can come from a reduction in the available dimensionality of exploration because of the crowded nuclear environment, referred to as ‘compact’ exploration (de Gennes, 1982), or from anomalous sub-diffusive motion (Fig. 6D). An angle distribution is generally calculated by first sorting the trajectories into sub-kinetic populations (bound, slow, fast, etc.) either by examining their average displacements or by using a classifier, such as a Hidden Markov Model (see Glossary, Box 2; Izeddin et al., 2014; Hansen et al., 2020). This sorting step is crucial to remove bound molecules, as they will bias the angle distribution in the backwards direction. In addition to examining angular distributions, calculation of the mean first-passage time (MFPT) of molecules provides insight into the kinetics of diffusion-limited searches (Condamin et al., 2008). Using this approach, a particle is assigned a target site, and the mean time for a particle to reach that target is calculated as the negative time derivative of the survival probability of the particle in the searching state. The MFPT depends upon the diffusion coefficient, as well as the distance searched by a particle, and can provide insight into the specific nature of various sub-diffusive behaviours.
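Relative angles between consecutive displacements can be computed directly from a (pre-sorted) trajectory, and a simple backward-to-forward count ratio summarizes the anisotropy: a ratio near 1 indicates isotropic exploration, whereas a ratio well above 1 indicates the backwards-biased motion discussed above. This is an illustrative calculation, not the exact metric used in the cited studies:

```python
import numpy as np

def relative_angles(track):
    """Angles between consecutive displacement vectors of one trajectory.

    Returns angles in degrees in [0, 180]; 180 means a direct reversal.
    """
    v = np.diff(track, axis=0)                 # displacement vectors
    a, b = v[:-1], v[1:]
    cos = np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1) *
                                   np.linalg.norm(b, axis=1))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def fold_anisotropy(angles, width=30.0):
    """Ratio of backward (180 - width to 180) to forward (0 to width) counts."""
    back = np.sum(angles >= 180.0 - width)
    fwd = np.sum(angles <= width)
    return back / max(fwd, 1)
```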
Another major application of SMT is the determination of the off-rates or residence times (see Glossary, Box 2) of DNA-binding proteins. To calculate residence times, long exposure times are often used to blur out fast-moving molecules and reduce the demands on the photon budget by allowing for reduced laser powers and thus longer trajectories (Fig. 3). Individual molecule dwell times (see Glossary, Box 2) are calculated based on trajectory lengths and combined into a survival probability distribution, which is often fit with a two-component exponential decay model to extract off-rates for each of the two components. Trajectories are cut short by the independent processes of tracking errors, dissociation, photobleaching or defocalization; for an accurate assessment of dwell times, the rate of each of these processes should be determined and subtracted from the measured off-rates to obtain the true off-rate of a molecule (Fig. 6E). In experiments on embryonic stem cells (Chen et al., 2014b), the photobleaching rate was determined through single-exponent fits to a dwell time distribution of a dye solution. To determine the contributions of photobleaching and defocalization, Hansen and colleagues introduced the approach of assuming that these contributions come from independent Poisson processes (see Glossary, Box 2) and used a molecule known to associate stably with chromatin, the histone H2B, to estimate these two rates (Hansen et al., 2017). Recently, approaches that model survival probabilities with more than three states were developed using a power-law model (see Glossary, Box 2; Garcia et al., 2021) and inverse Laplace transforms (see Glossary, Box 2; Reisser et al., 2020). Advances in modelling approaches that do not assume discrete states may better capture the full range of interactions in which nuclear proteins engage.
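The two-exponential survival analysis with a Poisson photobleaching correction can be sketched as follows. This is illustrative: the combined photobleaching/defocalization rate (k_bleach) is assumed to have been estimated separately, for example from H2B as in Hansen et al. (2017), and each apparent decay rate is then the true off-rate plus k_bleach:

```python
import numpy as np
from scipy.optimize import curve_fit

def survival_curve(dwell_times, t):
    """Empirical survival probability P(dwell >= t) at each time in t."""
    dwell_times = np.asarray(dwell_times)
    return np.array([(dwell_times >= ti).mean() for ti in t])

def two_exp(t, f, k1, k2):
    """Two-component exponential survival model (fractions f and 1 - f)."""
    return f * np.exp(-k1 * t) + (1 - f) * np.exp(-k2 * t)

def fit_off_rates(dwell_times, t, k_bleach):
    """Fit a two-exponential survival distribution and correct for bleaching.

    Assumes photobleaching/defocalization form an independent Poisson process
    with rate k_bleach, so true off-rate = apparent decay rate - k_bleach.
    Returns (fast, slow) corrected off-rates, conventionally interpreted as
    non-specific and specific binding.
    """
    s = survival_curve(dwell_times, t)
    (f, k1, k2), _ = curve_fit(two_exp, t, s, p0=(0.5, 5.0, 0.1),
                               bounds=([0, 0, 0], [1, np.inf, np.inf]))
    k_fast, k_slow = max(k1, k2), min(k1, k2)
    return k_fast - k_bleach, k_slow - k_bleach
```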
As SMT technology becomes more widely accessible, it is likely that we will see new and creative analytical approaches. For example, approaches to estimate maps of diffusivity (Xiang et al., 2020) or to map transcription factor motions onto an underlying landscape of chromatin compaction states (Lerner et al., 2020) were recently demonstrated. Another emerging area is the application of deep-learning approaches to SMT data, with recent demonstrations of the ability to classify trajectories and quantify diffusive behaviour (Granik et al., 2019; Muñoz-Gil et al., 2020; Kowalek et al., 2019). Moving forward, there is a need for tighter integration of experimentalists and theorists to advance analytical approaches for understanding the origin and nature of sub-diffusive and compact exploration in the complex cellular milieu.
The increasingly widespread use of SMT for studying developmental gene regulation and the ability to apply it in truly endogenous contexts, such as live embryos, suggests that we will continue to witness a rapid growth in technologies for better fluorophores, labelling strategies, excitation, detection and analysis. In terms of technology development, methods such as adaptive optics will play a key role in pushing SMT deeper inside embryos and complex tissue. Brighter fluorophores and new microscopy technologies hold promise for multi-colour applications and tracking with higher temporal and spatial resolutions, as the photon budget is increased and used more efficiently. Improved photon budgets will also enable more combinations of SMT with other types of functional imaging approaches, such as fluorescent lifetime measurements, nascent imaging of transcription and 5D (x, y, z, time and colours) imaging to place trajectories in the context of nuclear organization. These advances hold the promise to provide comprehensive models of gene regulation during embryogenesis and beyond, by bridging the vast spatial and temporal scales involved in development, from molecular scale dynamics at nanometre and sub-second scales to the patterning of organisms over millimetres and days.
We thank Anders S. Hansen, Nicholas Gattone, Kaeli Mathias and Adam Jabak for comments on this Review.
M.M. acknowledges funding support from the Children's Hospital of Philadelphia Research Institute.
The authors declare no competing or financial interests.