Fluorescence microscopy images should not be treated as perfect representations of biology. Many factors within the biospecimen itself can drastically affect quantitative microscopy data. Whereas some sample-specific considerations, such as photobleaching and autofluorescence, are more commonly discussed, a holistic discussion of sample-related issues (which includes less-routine topics such as quenching, scattering and biological anisotropy) is required to appropriately guide life scientists through the subtleties inherent to bioimaging. Here, we consider how the interplay between light and a sample can cause common experimental pitfalls and unanticipated errors when drawing biological conclusions. Although some of these discrepancies can be minimized or controlled for, others require more pragmatic considerations when interpreting image data. Ultimately, the power lies in the hands of the experimenter. The goal of this Review is therefore to survey how biological samples can skew quantification and interpretation of microscopy data. Furthermore, we offer a perspective on how to manage many of these potential pitfalls.
The accelerated development of optical technologies, computational tools and imaging probes has greatly raised the prominence of quantitative analysis in microscopy. This has led to a wealth of literature that addresses the important topics of acquisition (Jonkman et al., 2020), accuracy (Jost and Waters, 2019; Waters, 2009), reproducibility (Lee and Kitaoka, 2018) and quantifiability of images (Jonkman et al., 2014). The degree of technical complexity warrants in-depth consideration of (1) quantitative experimental design (Jost and Waters, 2019; Wait et al., 2020), (2) labeling strategy (Jacoby-Morris and Patterson, 2021; Snapp, 2005; Toseland, 2013), (3) the choice of fluorophore (Albrecht and Oliver, 2018; Schneider and Hackenberger, 2017; Specht et al., 2017; Thorn, 2017) and instrument (Lemon and McDole, 2020; Schermelleh et al., 2019), (4) the effects of various optical components (Jonkman et al., 2020; Lambert and Waters, 2014), (5) the consequences of inappropriate image processing (Belthangady and Royer, 2019; Swedlow, 2013), (6) the choice of image visualization and analysis approaches (Long et al., 2012; Wolf et al., 2013), as well as (7) the accurate and sufficient reporting of crucial parameters in microscopy-related experiments (Aaron and Chew, 2021; Heddleston et al., 2021; Rigano et al., 2021). These articles, as well as many others (Combs and Shroff, 2017; Dean and Palmer, 2014; Demmerle et al., 2017; Durisic et al., 2014; Stelzer et al., 2021; Waters and Wittmann, 2014), constitute a resource for life scientists to better understand how the gamut of imaging technologies impacts experimental readout. Unfortunately, such emphasis also inadvertently puts the burden of experimental fidelity squarely on the surrounding technologies; it omits an element of a microscopy experiment that can vastly affect the outcome if not properly characterized and controlled – the specimen itself.
Despite the wealth of information that can be extracted from modern optical microscopy data, most analytical metrics, such as colocalization coefficients, motion tracking, ultrastructural characterization, traction force quantification and molecular abundance, are derived from intensity and location data (Wait et al., 2020). In fact, besides a handful of specialized spectroscopic imaging techniques, such as fluorescence lifetime imaging microscopy (Datta et al., 2020) or fluorescence correlation spectroscopy (Magde et al., 1972), most fluorescence microscopy modalities only measure intensities and coordinates of light emitters. Consequently, any factor that affects the precise acquisition of these two measurements will jeopardize the accuracy of the experimental readout. Unfortunately, the biospecimen itself can contain many elements capable of skewing the light path as well as the photon count. These elements can range from the biochemical and photophysical properties of the fluorophores to the biochemical, physiological and physical properties of the microenvironment within the specimen.
While genetically encoded fluorescent proteins have revolutionized the study of biology, their widely accepted utilization is not devoid of caveats. Factors, such as molecular maturation kinetics, stabilities, photophysical properties (e.g. extinction coefficient and quantum yield), biochemical properties (e.g. pKa) and photobleaching rates, can vary widely (Heppert et al., 2016). In addition, some fluorescent proteins can form oligomers (Costantini and Snapp, 2013; Cranfill et al., 2016; Shaner et al., 2005) or cause protein mislocalization (Ghodke et al., 2016; Landgraf et al., 2012; Lee et al., 2013), confounding the very biology they are tasked to elucidate. Organic fluorescent dyes are similarly rife with idiosyncrasies, such as diverging degrees of cell permeability, specificity, fluorogenicity, bioavailability and photostability, as well as their effects (and those of necessary solvents) on the biospecimen (Grimm and Lavis, 2021). Simply introducing fluorescent probes into the biological system can trigger both (1) the ‘observer effect’, whereby the mere act of measuring biology perturbs the biology itself, and (2) the ‘uncertainty effect’, wherein attempts to improve the location accuracy of a fluorophore compromise the precision of its intensity readout, and vice versa.
Advanced microscopy techniques have elevated biological studies beyond the reductionistic context of single cells. However, the caveats inherent to imaging specimens can be compounded when the specimens present significant tissue heterogeneity, anisotropy and thickness. Nowhere is this challenge more apparent than in deep-tissue imaging where the excitation and emitted light are confounded by heterogeneous refractive indices, absorption, pH deviations, molecular crowding and autofluorescence. These challenges can be further exacerbated when these deviations change with time in live samples. Taken together, the growing desire to unravel biological processes in their near-native environment has also steered microscopists directly into the complexities of the ‘microenvironment effect’ – whereby the natural setting of a biomolecule affects the confidence of its measurement.
The importance of discerning how a biospecimen affects microscopy readout is twofold. First, without proper control and correction, such distortion can lead to erroneous measurements and data misinterpretation. Second, the detailed characterization of how the specimen alters the illumination and emission light can contain a wealth of information about the specimen. Many studies have indeed turned this otherwise problematic hindrance into an advantage in gaining novel biological insights (Campagnola and Loew, 2003; Prevedel et al., 2019).
In this Review, we focus on how microscopy data are affected by causes that are not commonly discussed – namely, factors inherent to the specimen. Specifically, we survey the various factors within a biological sample that distort intensity and location information and provide cautionary examples thereof (see the supplementary information for further details). The impact of such caveats on the measurement and interpretation of bioimaging data is discussed. Additionally, we offer corresponding approaches to detect these errors, followed by possible correction and normalization strategies.
Photobleaching and phototoxicity
The illumination intensity used in fluorescence microscopy is orders of magnitude greater than most organisms have evolved to withstand (Hobson et al., 2021). As a result, fluorescence microscopy data do not solely represent the experimental condition but also the response of the specimen to light. This has two related consequences – light can cause both toxicity to the biological specimen (Icha et al., 2017; Ojha and Ojha, 2021) and irreversible photobleaching of fluorescent labels (Magidson and Khodjakov, 2013). A less apparent contributor is that the fluorescent labels themselves can further amplify phototoxicity (Icha et al., 2017; Stephens and Allan, 2003).
Failure to quantitatively account for photobleaching can lead to subsequent errors in image analysis and data interpretation. Although photobleaching is commonly encountered in prolonged live-cell imaging experiments, it can occur in unexpected situations, even in fixed samples. As shown in Fig. 1, a progressive decrease in fluorescence intensity is evident in the direction in which the image volume of a brain section was acquired. Although this photobleaching effect is unrelated to the intrinsic biology, it is indistinguishable from a natural gradation in Ca2+ concentration. Therefore, a direct comparison of target abundance between the top and bottom of the volume will be misleading, regardless of which direction the volume is captured.
An even more catastrophic effect is phototoxicity. Common repercussions of phototoxicity include mitochondrial fragmentation (Kiepas et al., 2020), decreased cell proliferation (Mubaid and Brown, 2017), aberrant sample changes (Jemielita et al., 2013) and cell death. In extreme cases, illumination light can rapidly destroy the sample (Schloetel et al., 2019). From the standpoint of experimental accuracy, a rapid phototoxic effect is less insidious than a subtle but prolonged change in biological response. In the former situation, the severe phototoxicity is immediately apparent, and the experiment can be terminated. In the latter, the experimenter may unknowingly document a light-induced stress response and attribute the results to the experimental condition under investigation. This can be even more confusing owing to the potentially opposing consequences of photoexposure, which can negatively affect both the sample (Tinevez et al., 2012) and the treatment compound (Kolega, 2004).
Fortunately, the effects of phototoxicity can be revealed by an independent assay that measures an unrelated biological readout as a function of light exposure (Laissue et al., 2017; Tinevez et al., 2012). Time-lapse experiments can be tested for photosensitization effects by comparing biological readouts between full experiments and replicates imaged only at the end of the time course or with altered acquisition parameters. If toxicity is observed, biological results need to be validated by other means to ensure phototoxicity is not altering the biology of interest.
Although some phototoxicity is inevitable, there are many strategies that can be employed to ameliorate its effect (Icha et al., 2017; Kiepas et al., 2020; Magidson and Khodjakov, 2013; Tosheva et al., 2020). Firstly, illumination intensity should be reduced as much as practical. Bright and photostable fluorophores, which require less light energy to be effectively detected, should be used. Similarly, since longer wavelength light imparts less energy on the sample, red-shifted fluorophores will generally decrease phototoxicity (Douthwright and Sluder, 2017). Careful consideration should also be given to selecting microscopes that can suitably address the biological question with the gentlest illumination (Boudreau et al., 2016; Fish, 2009; Hoebe et al., 2007; Kiepas and Brown, 2020; Mubaid and Brown, 2017). Secondly, phototoxicity and photobleaching can be reduced by minimizing photoreactive products, such as reactive oxygen species (Tosheva et al., 2020). This can be achieved by imaging the sample in an anoxic environment or with decreased oxygen concentrations (Stephens and Allan, 2003; Tosheva et al., 2020), or by excluding photosensitizing media components such as Phenol Red (Khodjakov and Rieder, 2006). Alternatively, antioxidants [e.g. rutin (Bogdanov et al., 2012) and Trolox (Douthwright and Sluder, 2017)] or oxygen-scavengers (Jung et al., 2018; Nahidiazar et al., 2016) can be added to growth media. Finally, computational methods can also be used when photobleaching is unavoidable. For example, photobleaching in time-lapse experiments can be compensated for by various normalizations (Miura, 2020). In addition, advances in computational image processing can ease the analysis of low signal-to-noise images captured to avoid photobleaching [e.g. Noise2Void (Krull et al., 2020), content-aware image restoration (Weigert et al., 2018)]. 
Although computational approaches can be effective, they should not be used in place of optimized sample design and imaging conditions, as deceptive artifacts can be unnecessarily introduced (Belthangady and Royer, 2019; von Chamier et al., 2019).
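As an illustration of the normalization idea mentioned above, the following is a minimal sketch of a simple ratio-based bleach correction for a time-lapse stack. It is not necessarily the procedure described by Miura (2020); the function name `ratio_bleach_correct`, the `background` parameter and the synthetic data are assumptions made for illustration only. The sketch assumes that the true total signal in the field of view is constant over time, so any drop in the frame mean is attributed to photobleaching.

```python
import numpy as np

def ratio_bleach_correct(stack, background=0.0):
    """Rescale each frame so its mean matches the mean of the first frame.

    stack: array of shape (T, Y, X).
    background: constant offset subtracted before scaling (assumed known).
    Assumption: the true total signal is constant over time.
    """
    stack = np.asarray(stack, dtype=float) - background
    frame_means = stack.mean(axis=(1, 2))
    if np.any(frame_means <= 0):
        raise ValueError("frame means must be positive after background subtraction")
    scale = frame_means[0] / frame_means
    return stack * scale[:, None, None]

# Synthetic check: a static image that bleaches mono-exponentially.
rng = np.random.default_rng(0)
truth = rng.uniform(100.0, 200.0, size=(8, 8))
decay = np.exp(-0.1 * np.arange(20))
bleached = truth[None, :, :] * decay[:, None, None]
corrected = ratio_bleach_correct(bleached)
```

If the biology itself changes the total signal (for example, through protein expression or degradation), this normalization would erase that change, which is one reason such corrections should complement, and not replace, optimized acquisition.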
Absorption, scattering and refraction
Not only does light impact the sample, but the sample can reciprocally influence the light properties within it. Obstacles in a biospecimen can absorb, scatter and/or bend both incident and emitted light (Schwertner et al., 2007; White et al., 1996), resulting in image distortions and inaccurate intensity measurements. Such obstacles include melanin (Tuchin, 2015), erythrocytes (Roggan et al., 1999), chloroplasts (Vogelmann, 1993), lipid droplets (Chen et al., 2021) and hemoglobin (Weissleder, 2001). Sample-induced light distortions may not be immediately apparent and can sometimes be misconstrued as a biological phenomenon. Therefore, how the sample alters illumination and emitted light should be considered, controlled for and/or corrected to ensure proper image quantification and interpretation.
Some apparent effects of light scattering and/or absorption include a decrease in fluorescence signal with depth or an appearance of ‘stripes’ or ‘streaks’ along the illumination direction (Al-Juboori et al., 2013; Huisken and Stainier, 2007; Jacques, 2013; Salili et al., 2018; Yoon et al., 2020). Scattering causes a deviation in the light path, whereas absorption reduces the number of photons. Both phenomena alter light properties and can ultimately disrupt the accuracy of intensity measurements within a specimen. An image will be further aberrated when light passes through a sample with heterogeneous refractive indices (Schwertner et al., 2007), resulting in dimmer, distorted and poorly resolved images (Ji et al., 2010). Additionally, as refraction is wavelength dependent, colocalization readouts can be skewed, leading to difficulty in assessing biological associations (Abraham et al., 2010). Each of these artifacts is exacerbated with increased imaging depth. Therefore, accurate conclusions about the underlying biology necessitate recognizing, accounting for and/or avoiding these artifacts.
There are several sample preparation techniques that can lessen the impact of scattering, absorption and refraction. Tissue and organ samples can be sectioned to reduce imaging depth. Sectioning, however, can cause morphological changes to the sample. An alternative approach is sample clearing, which reduces absorption and scattering within the sample (Costa et al., 2019; Silvestri et al., 2016; Yu et al., 2018). Unfortunately, tissue sectioning and clearing are not compatible with live specimens; they can be used, however, to validate observations made from intact samples. In cases where in situ measurements of dynamic events within thick and/or highly heterogeneous samples are required, further steps can be taken to reduce these artifacts.
Two-photon microscopy is a common approach used to minimize optical aberrations with increasing depth (Helmchen and Denk, 2005; So et al., 2014). Traversing deep distances will nonetheless cause a decrease in both illumination and detected light intensities due to scattering and absorption. To correct for these effects, two approaches can be considered. First, many commercial confocal and two-photon microscopes can automatically adjust illumination intensity or detector gain with imaging depth. Second, signal deterioration caused by increasing depth can be corrected during post-acquisition image processing (Yayon et al., 2018). However, observations made using these compensations should be validated by other means owing to the inexact nature of the underlying assumptions. Additionally, considerable improvements in camera sensitivity can further mitigate these problems (Crosignani et al., 2012; Dvornikov et al., 2019).
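To make the post-acquisition option concrete, the sketch below fits a single exponential to the mean intensity of each z-plane and rescales deeper planes accordingly. This is an illustrative simplification and not the method of Yayon et al. (2018); the function names, the 2 µm z-step and the 80 µm attenuation length are hypothetical. It assumes that labeling is homogeneous with depth and that signal loss follows a single exponential.

```python
import numpy as np

def fit_attenuation_length(volume, z_step_um):
    """Estimate an attenuation length (in µm) from per-plane mean intensities.

    volume: array of shape (Z, Y, X) with plane 0 closest to the objective.
    Assumption: mean intensity decays as exp(-z / length) with depth z.
    """
    means = volume.mean(axis=(1, 2))
    z = np.arange(volume.shape[0]) * z_step_um
    slope, _ = np.polyfit(z, np.log(means), 1)  # log-linear fit
    if slope >= 0:
        raise ValueError("no depth-dependent signal loss detected")
    return -1.0 / slope

def correct_depth_attenuation(volume, z_step_um, length_um):
    """Multiply each plane by exp(z / length) to undo the fitted decay."""
    z = np.arange(volume.shape[0]) * z_step_um
    gain = np.exp(z / length_um)
    return volume * gain[:, None, None]

# Synthetic check: uniform sample attenuated with an 80 µm length scale.
z_step = 2.0
depth = np.arange(30) * z_step
volume = 500.0 * np.exp(-depth / 80.0)[:, None, None] * np.ones((30, 4, 4))
length = fit_attenuation_length(volume, z_step)
flattened = correct_depth_attenuation(volume, z_step, length)
```

Because the fit cannot distinguish optical attenuation from a genuine biological gradient along z, a correction of this kind will flatten both, which illustrates why such compensations need independent validation.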
A recent trend for reducing sample-induced aberrations is the use of adaptive optics (AO). By measuring the aberrations caused by the sample, AO compensates for them by introducing a counter-distortion to the light wavefront to restore the final image (Booth, 2014; Ji, 2017). Unfortunately, AO requires custom, costly and complex optical configurations that are largely inaccessible to a broad range of scientists. It is therefore more pragmatic simply to be aware of how these aberrations can confound subsequent measurements. Without AO, such caveats in the resulting image are unavoidable. Investigators are therefore strongly advised to interpret their data with circumspection.
Many endogenous biological molecules are capable of fluorescing, which leads to unwanted signals, collectively known as autofluorescence. Common sources of autofluorescence include porphyrins, flavonoids, coumarins, chlorophyll and carotenoids (Croce and Bottiroli, 2014; Donaldson, 2020). Adding to the ambiguity, autofluorescent metabolic products can fluctuate in response to experimental conditions (Maglica et al., 2015; Surre et al., 2018), causing data misinterpretation. Even common pH indicators used in tissue culture media (such as Phenol Red) can produce strong autofluorescence capable of affecting measurements.
Low-signal microscopy experiments, such as single-molecule tracking (Aaron et al., 2019; Martin-Fernandez and Clarke, 2012) or Förster resonance energy transfer (FRET) assays (Pietraszewska-Bogiel and Gadella, 2011; Pleshinger et al., 2021), are especially susceptible to autofluorescence artifacts. In extreme examples, as illustrated in Fig. 2A, the high level of autofluorescence in tissue can overwhelm exogenously added labels, hindering even initial object segmentation. One simple method to minimize autofluorescence detection is to utilize fluorophores that emit in the far-red spectral region, where autofluorescence background is typically lower (Shen et al., 2015; Wolff et al., 2006). However, this might limit the number of labels that can be introduced into a multiplexed imaging experiment.
Spectral unmixing approaches offer an attractive alternative to separate autofluorescence from the signal of interest based on differences in emission spectra (Cohen et al., 2018; Gao and Smith, 2015; McRae et al., 2019; Rossetti et al., 2020). Many commercial confocal microscopes achieve this via a prism or a diffraction grating that enables emitted light of different wavelengths to be captured by a detector array. From the resulting multi-channel images, known as lambda stacks, the signal from autofluorescence can be computationally isolated (Fig. 2B). However, these hyperspectral imaging components are sensitive to light scattering, which is common in highly autofluorescent samples. Likewise, molecules with overlapping spectra can also be separated by their signature fluorescence decay (Lakowicz, 2006), by using fluorescence lifetime imaging microscopy (FLIM) (Datta et al., 2020). Alternatively, quenching treatments, such as Sudan Black B (Sun et al., 2011) and sodium borohydride (Davis et al., 2014), can also minimize autofluorescence caused by chemical fixation. In the case of immunofluorescence, the sample can be purposely photobleached before addition of the secondary antibody. Ultimately, appropriate planning and pilot studies are essential to ensure that the biology captured in fluorescence images is not obscured by autofluorescence.
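The computational isolation of autofluorescence from a lambda stack can be sketched as a per-pixel linear least-squares problem, under the assumption that each pixel spectrum is a linear mixture of known reference spectra. The reference spectra below are invented numbers for illustration, and the function name `unmix` is hypothetical; commercial software may use different solvers (for example, non-negative least squares), so this is a simplified stand-in in which negative abundances are merely clipped to zero.

```python
import numpy as np

def unmix(lambda_stack, spectra):
    """Linear unmixing of a lambda stack.

    lambda_stack: array of shape (C, Y, X), one plane per spectral channel.
    spectra: array of shape (C, K), one reference emission spectrum per column.
    Returns an array of shape (K, Y, X) with the abundance of each component.
    """
    n_channels, height, width = lambda_stack.shape
    pixels = lambda_stack.reshape(n_channels, -1)
    abundances, *_ = np.linalg.lstsq(spectra, pixels, rcond=None)
    abundances = np.clip(abundances, 0.0, None)  # crude non-negativity
    return abundances.reshape(spectra.shape[1], height, width)

# Hypothetical reference spectra over 5 channels: column 0 = label,
# column 1 = broad autofluorescence.
spectra = np.array([
    [0.05, 0.30],
    [0.60, 0.30],
    [0.30, 0.20],
    [0.05, 0.15],
    [0.00, 0.05],
])

# Synthetic sample: a dim label on top of strong autofluorescence.
rng = np.random.default_rng(1)
label_map = rng.uniform(0.0, 50.0, size=(4, 4))
autofluorescence_map = rng.uniform(100.0, 200.0, size=(4, 4))
stack = (spectra[:, 0, None, None] * label_map
         + spectra[:, 1, None, None] * autofluorescence_map)

recovered = unmix(stack, spectra)
```

In practice, the reference spectra would need to be measured on single-component controls (for example, an unlabeled specimen for the autofluorescence spectrum) acquired with the same settings, and noise will make the recovery approximate.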
The orientation of a fluorescent molecule ultimately determines how it interacts with light. This often-overlooked attribute means that it requires more than simply the correct wavelength of light to excite a fluorophore. A fluorophore has a preferred axis along which it is most efficiently excited, termed a dipole. As such, a molecule is most effectively excited by photons that are polarized parallel to this dipole (Backlund et al., 2014).
Under most physiological conditions, fluorescent labels within biospecimens have a large degree of rotational freedom, rendering the effect of the molecular dipole negligible. Similarly, unpolarized light sources, such as light emitting diodes (LEDs), do not discriminate the orientation of molecule dipoles. However, laser light sources produce polarized light that can confound fluorescence image interpretation. For example, when imaged with laser scanning confocal microscopy, the apparent fluorescence intensity of the same actin fibers labeled with phalloidin can vary depending on their orientation relative to the direction of light polarization (Fig. 3A–C). Discrepancies caused by this effect can influence the detection of subtle features, such as filopodia (compare Fig. 3D and Fig. 3E) or impact structural quantification (Fig. 3F). This problem is similarly evident in single-molecule microscopy where the intensity of an individual label can fluctuate as it rotates through various orientations. In these cases, the effect of polarization on automated image-processing strategies should not be overlooked.
The most straightforward approach to identify whether polarization effects are influencing intensity results is to rotate the sample. Similarly, one can also compare images of the same sample captured using a non-polarized light source, such as LEDs. If intensity discrepancies are evident, the most robust method to overcome the problem is to use an unpolarized light source or convert the polarization from linear into circular with a quarter-wave plate. Unfortunately, the latter solution is often not practical with commercial instruments. Beyond this, polarization artifacts caused by anisotropic structures can also be reduced by increasing the flexibility of fluorescent protein linkers (Chen et al., 2013; Li et al., 2016) – in fact, rotational mobility is an underappreciated factor important to the accuracy of single-molecule localization microscopy experiments (Backlund et al., 2014).
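The sample-rotation test can be illustrated with the standard photoselection relationship, in which the excitation probability of a fixed dipole scales with cos² of the angle between the dipole and the polarization axis. The sketch below is an idealized model with hypothetical function names: it assumes a single immobile, in-plane dipole and linearly polarized excitation, whereas real labels have a distribution of orientations and some rotational mobility, so measured modulation will generally be weaker.

```python
import numpy as np

def relative_excitation(angle_deg):
    """Relative excitation of a fixed dipole at a given angle (degrees)
    to the polarization axis, following the cos^2 photoselection rule."""
    return np.cos(np.deg2rad(angle_deg)) ** 2

def modulation_depth(intensities):
    """(max - min) / (max + min) of intensities measured while rotating
    the sample; 0 means no orientation dependence, 1 means full modulation."""
    intensities = np.asarray(intensities, dtype=float)
    high, low = intensities.max(), intensities.min()
    return (high - low) / (high + low)

# Rotate the sample in 15 degree steps and compare two idealized cases.
angles = np.arange(0, 180, 15)
aligned = relative_excitation(angles)       # rigidly aligned label
isotropic = np.full(angles.shape, 0.5)      # freely rotating label

aligned_depth = modulation_depth(aligned)
isotropic_depth = modulation_depth(isotropic)
```

A modulation depth clearly above the measurement noise when the same structure is imaged at several rotation angles would suggest that polarization is influencing the intensity readout.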
The physical characteristics of a specimen are not the only factors that will affect the accuracy of a representative bioimage. The cellular milieu is biochemically heterogeneous, with wide variations in pH, ion concentration, hydrophobicity, redox state, voltage potential and numerous other factors that can affect fluorophore behavior (Costantini et al., 2015). Here, we detail three examples that illustrate how the physiochemical environment can change fluorophore behavior and outline strategies to identify and avoid data misinterpretation.
The concentration and location of many ions (e.g. Ca2+ and Zn2+) in living cells can be routinely assessed by fluorescent indicators (Bers, 2008; Brini et al., 2014; Carter et al., 2014; Jaimes et al., 2016; Solovyova and Verkhratsky, 2002). However, some Ca2+ indicators (O'Banion and Yasuda, 2020; Tian et al., 2012) have been shown to be sensitive to other divalent cations (Hyrc et al., 2000). In particular, Zn2+ ions play a critical role in many cellular processes (Williams, 1989) and have been shown to exhibit crosstalk with Ca2+ signaling (Maret, 2001). In one troubling example, the fluorescence intensity of Oregon Green, a common Ca2+ indicator, significantly decreased after Zn2+ chelation (Stork and Li, 2006). This suggests that the purported Ca2+ signaling indicated by this dye may in part be attributable to Zn2+, prompting a call from the authors to re-examine previous studies.
Membrane potential can also affect dye behavior. Indeed, the common mitochondrial dye tetra-methylrhodamine methylester (TMRM) is selectively taken up only by polarized mitochondria and is released upon depolarization (Lemasters and Ramshesh, 2007; Pendergrass et al., 2004). Other dyes show similar uptake but remain within mitochondria even after depolarization (Kholmukhamedov et al., 2013). Still others appear to bind mitochondria regardless of polarization state (Poot and Pierce, 1999). Similarly, mitochondrial membrane potential is disrupted during apoptosis, rendering mitochondria undetectable when stained with certain dyes (Elmore et al., 2004). As can be surmised, a naïvely applied mitochondrial stain can produce misleading results if the behavior of the stain is overlooked.
The intrinsic pKa of organic and biochemical molecules also makes most fluorescent probes sensitive to changes in pH (Hou et al., 2017). The cytoplasm exhibits a normal pH of 6.8–7.4, but disease or experimental perturbations can push it outside this range, potentially cloaking labeled biological targets. This can lead to the erroneous assumption that an experimental condition has perturbed the target, rather than simply disrupting the fluorescent label. Physiologically, cellular compartments can exist at acidic (e.g. lysosomes; Ohkuma and Poole, 1978) or alkaline (e.g. mitochondria; Porcelli et al., 2005) pH. These specialized compartments call for selective dyes that function within such pH ranges; however, Johnson et al. have shown that lysosomal pH is surprisingly heterogeneous (Johnson et al., 2016). Indeed, only a sub-population of lysosomes in U2OS cells are effectively labeled by organelle-targeting dyes that are pH-sensitive, such as LysoTracker, leaving the remainder invisible (Fig. 4).
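The link between probe pKa and apparent brightness can be illustrated with the Henderson–Hasselbalch relationship. The sketch below assumes a hypothetical probe with a single protonation site whose deprotonated form is the only fluorescent species; the pKa of 6.0 and the compartment pH values are illustrative and do not describe any specific dye, and probes that accumulate in acidic compartments (such as LysoTracker) follow different rules.

```python
def bright_fraction(ph, pka):
    """Fraction of a probe in its deprotonated form at a given pH.

    Assumption: one protonation site, and only the deprotonated form
    fluoresces (Henderson-Hasselbalch relationship).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

# Hypothetical probe with pKa 6.0 in two illustrative compartments.
PKA = 6.0
cytosol_fraction = bright_fraction(7.2, PKA)
acidic_fraction = bright_fraction(4.8, PKA)

# Apparent signal ratio for identical probe concentrations.
apparent_ratio = acidic_fraction / cytosol_fraction
```

Under these assumptions, the same number of probe molecules would appear far dimmer in the acidic compartment, which could be mistaken for a lower abundance of the labeled target.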
It is therefore critical that researchers identify and account for potential effects due to the local physiochemical microenvironment. First, manufacturers can be a valuable source of information about the pertinent characteristics of fluorescent probes (Johnson, 2010) – although a thorough peer-reviewed literature search should also accompany such efforts whenever possible. Second, proper control experiments can identify variables that affect fluorophore intensity (Stork and Li, 2006). Third, redundant labeling strategies can help gauge the environmental sensitivity of a label, as explored by Johnson and colleagues (Johnson et al., 2016) and further illustrated in Fig. 4. These examples serve as important reminders that fluorophores do not exist in isolation within a specimen. Rather, their effectiveness as biochemical reporters can be hindered by the very environment they are designed to measure. This uncertainty can lead to deviations from the biological reality.
In addition to the intrinsic biochemical and biophysical factors within the specimen, the fluorescent labels themselves can be a source of error. If fluorescent labels are brought into close proximity to one another, a considerable reduction of fluorescence emission can occur. This process, known as quenching (Jablonski, 1955; Walter, 1888), occurs when a neighboring molecule either reabsorbs fluorescence or suppresses fluorescence emission. Although quenching is reversible and may not occur as frequently as photobleaching, it can be similarly detrimental. In such cases, structures may appear dimmer than expected or completely devoid of signal (Jacobsen et al., 2017; MacDonald, 1990). Consequently, corresponding intensity measurements can give a false account of molecular abundance, going so far as to convey contradictory results.
Quenching can occur via either transient or static mechanisms. A common occurrence of transient quenching can be found in FRET (Lakowicz, 2006), where the excited-state energy of a donor molecule is non-radiatively transferred to an acceptor fluorophore, thereby reducing donor emission. In the case of FRET, the quenching effect is often anticipated and used as a key feature in the experimental design. There is a wealth of technical reviews that guide readers in accurately measuring FRET (Bunt and Wouters, 2017; Sekar and Periasamy, 2003; Vogel et al., 2006; Waters, 2009). However, researchers often neglect to anticipate this effect when it is not desired. Accordingly, critical evaluation of microscopy images of multi-labeled samples is required to appreciate the influence of quenching.
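The steep distance dependence of this donor quenching can be illustrated with the standard Förster relationship, E = 1 / (1 + (r/R0)^6), where r is the donor–acceptor distance and R0 is the Förster radius at which transfer is 50% efficient. In the sketch below, the function names and the R0 of 5 nm are illustrative assumptions and do not refer to any particular fluorophore pair.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Energy transfer efficiency for one donor-acceptor pair (Förster equation)."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)

def donor_fraction_remaining(distance_nm, forster_radius_nm):
    """Fraction of unquenched donor emission that remains in the presence
    of the acceptor, assuming every donor has exactly one acceptor."""
    return 1.0 - fret_efficiency(distance_nm, forster_radius_nm)

# Hypothetical pair with a Förster radius of 5 nm.
R0_NM = 5.0
close_pair = donor_fraction_remaining(2.5, R0_NM)   # strongly quenched donor
far_pair = donor_fraction_remaining(10.0, R0_NM)    # almost unaffected donor
```

Under these assumptions, two labels in the same multiprotein complex could lose most of the donor signal, whereas the same labels only a few nanometers further apart would be essentially unaffected.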
The more insidious quenching mechanism occurs when fluorophores physically oligomerize with light-absorbing molecules. Although this process can be reversible, the physical aggregation usually forms a more stable complex, thereby inducing static quenching. Static quenchers can be another fluorophore or a non-fluorescent molecule, with the latter known as a ‘dark quencher’ (Johansson and Cook, 2003). When static quenching occurs, the oligomer exhibits a partial or complete reduction in fluorescence emission, belying the high local concentration of the fluorescent label. In some cases, identical dyes can form self-quenching homodimers that are nonfluorescent. This is of note in hydrophobic environments where lipid-soluble dyes can accumulate in high concentrations. A number of dyes are known to exhibit this behavior, including fluorescein, rhodamine, Nile Red, DiI and DiD (Jablonski, 1955; Reisch and Klymchenko, 2016; Walter, 1888).
Naturally, quenching can skew quantitative data interpretation. For example, fluorescently labeled cell-penetrating peptides fail to label the plasma membrane or lysosomes due to self-quenching (Swiecicki et al., 2016). Similarly, although CellMask Green is useful for labeling membranes (Fig. 5A–C), high concentrations of the dye will result in a counterintuitive decrease in fluorescence signal at the plasma membrane (Fig. 5D–F). Changes in quenching can further confound time-course experiments as fluorogenic dyes accumulate over time or diffuse into fresh media following washing.
Although self-quenching can be lessened by chemically modifying dyes to prevent aggregation, such as by sulfonation or acetylation (Mujumdar et al., 1996; Swiecicki et al., 2016; Zhegalova et al., 2014), this phenomenon highlights the importance of performing pilot studies to identify unforeseen experimental shortcomings. If possible, dilution protocols should be employed to find an optimal label concentration (Swiecicki et al., 2016), or a second fluorophore can be used to validate any discrepancies caused by self-quenching. Furthermore, in multiplexed experiments, care should be taken to avoid fluorophore combinations with a high degree of spectral overlap that could result in transient quenching effects, particularly when studying multiprotein complexes (Clayton and Chattopadhyay, 2014).
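A dilution series of the kind suggested above can be screened for self-quenching by checking where the measured signal stops increasing with label concentration. The sketch below is a deliberately simple heuristic with a hypothetical function name and invented example values; it ignores measurement noise, so real data would call for replicates and a tolerance before any drop is interpreted as quenching.

```python
import numpy as np

def quenching_onset(concentrations, intensities):
    """Return the lowest concentration at which the signal is lower than
    at the preceding (lower) concentration, or None if the signal
    increases monotonically across the series."""
    order = np.argsort(concentrations)
    conc = np.asarray(concentrations, dtype=float)[order]
    signal = np.asarray(intensities, dtype=float)[order]
    drops = np.flatnonzero(np.diff(signal) < 0)
    if drops.size == 0:
        return None
    return float(conc[drops[0] + 1])

# Invented dilution series (arbitrary units) with a signal reversal.
dye_concentrations = [0.1, 0.5, 1.0, 5.0, 10.0]
membrane_signal = [10.0, 48.0, 90.0, 60.0, 20.0]
onset = quenching_onset(dye_concentrations, membrane_signal)

# A well-behaved series for comparison.
no_onset = quenching_onset([0.1, 0.5, 1.0], [10.0, 48.0, 90.0])
```

With these example values, a working concentration would be chosen below the reported onset, within the range in which signal still rises with concentration.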
Protein aggregation and maturation
As with organic dyes, fluorescent proteins, such as GFP and RFP, are particularly prone to forming higher-order oligomers, which can easily lead to misinterpretation of images (Snapp et al., 2003). This naturally occurring limitation has led to efforts to engineer monomeric fluorescent proteins (Bindels et al., 2016; Matlashov et al., 2020; Shaner et al., 2004, 2013). Importantly, oligomerization is also dependent on fluorescent protein concentration; as such, reducing the level of protein expression can minimize protein aggregation. Additionally, multiple assays exist to measure the tendency of a fluorescent protein to aggregate (Baird et al., 2000; Costantini et al., 2012; Pédelacq et al., 2006). However, final observations might need to be validated by multiple methods to ensure aggregation artifacts do not impact conclusions (Moore and Murphy, 2009).
The intrinsic biochemical environment of a biospecimen can cause fluorescent protein maturation times to deviate significantly from those listed in the literature (Chudakov et al., 2010; Lavis and Raines, 2008; Shaner et al., 2005). Such factors include temperature (Balleza et al., 2018), post-translational modifications, chaperone-folding pathways, co-translational folding (Samelson et al., 2018; Waudby et al., 2019) and oxygen availability (Chudakov et al., 2010), as well as subtle differences between sample strains (Hebisch et al., 2013). Although fluorescent protein maturation time rarely poses an issue for most steady-state biological experiments, it can negatively impact time-dependent experiments, for example, protein expression and trafficking experiments or FRET biosensing assays (Liu et al., 2018b). In these cases, selecting a fluorescent protein that has an appropriate maturation time is vitally important (Nagai et al., 2002). When consequential to the hypothesis, it is advisable to compare endogenously labeled samples to antibody-labeled controls.
The myriad of spurious specimen-related effects that can impact the outcome of an imaging experiment underscores the importance of sample preparation and characterization. Simply put, a specimen is not a passive element in a microscopy experiment. Rather, the specimen should be considered an integral optical component, capable of filtering, shaping and aberrating the light being used to observe it. Microscopes in general cannot distinguish an artifact from a faithful representation of a biospecimen. Therefore, it is incumbent upon the researcher to identify, characterize and, ultimately, correct these various pitfalls.
Conclusions and perspectives
The observation that a biospecimen can affect microscopy readout is not new. In an attempt to use shorter wavelength illumination light to attain higher resolution, August Köhler discovered that his samples would emit light under UV illumination and noted his annoyance with what turned out to be one of the earliest descriptions of autofluorescence (Köhler, 1904). Likewise, realizing that there is an associated phase distortion when light interacts with a sample, Fritz Zernike took advantage of this phenomenon to introduce contrast in brightfield imaging techniques (Zernike, 1942a,b). This simple yet elegant principle of phase contrast microscopy led to Zernike being awarded the Nobel Prize in 1953. These historical examples serve as reminders of why it is essential to understand light–specimen interactions and how they can be leveraged to reveal information about biology.
The reliance on incident light to illuminate biological samples, as well as the interaction of emitted photons with the specimen, remains incontrovertible even with advanced optical microscopes. The inconvenient truth is that a biological sample will inevitably alter the light traveling through it. Reciprocally, illuminating the biospecimen can also alter the underlying biology. It is always important to keep in mind that ‘seeing is changing’ – the mere act of observing a biological event can change the very outcome the observer is trying to measure. As a result, the evaluation of the ultimate performance of a light microscope and the subsequent image analysis is woefully incomplete if only the microscope itself is accounted for. There is a wealth of expertly written articles on how various optical components can affect the outcome and reproducibility of microscopy results. In this Review, we aim to fill a rather glaring omission in the literature by exploring the factors extrinsic to a microscope that will ultimately affect quantitative measurement.
The challenges in confronting these issues center around two fronts: (1) the difficulty in determining the occurrence and the extent of image artifacts, and (2) the technical challenges often associated with compensating for an image distortion. These challenges are exacerbated by the increasingly popular desire to observe biology in its native physiological state. This trend is fueled by the promise of progressively better optical instruments. The ultimate opportunity to study biology in the native environment that contains all the physiological signaling cues and biomechanical properties is the holy grail of life sciences. Yet, in the race to achieve higher resolution, faster acquisition, greater combinations of colors in multiplexed experiments and even brighter fluorescent labels, the concern of how photons interact with the biospecimen is, at times, sidelined. Advances in optical engineering do not always take into consideration how the biospecimen itself can affect the microscopy readout. Even when light–specimen interactions are the focus of the technology being developed, such as in AO technologies, only weakly light-scattering samples are amenable to correction (Liu et al., 2018a). The fact that a microscope is equipped with such technology does not guarantee that it will be able to correct for aberrations in every biological sample. The boundary at which a given biospecimen is deemed optically tractable or optically challenging is dependent on many factors. As biologists push their investigations further into the poorly characterized milieu of tissue microenvironments or whole organisms, the paucity of a priori knowledge of the specimen will continue to worsen. This makes it even more difficult to characterize and to ultimately compensate for image deviations.
Further complicating these problems is the growing prevalence of machine-learning algorithms. One peril is the common misconception that image imperfections, regardless of the underlying cause or severity, can be overcome by sophisticated software; the problem is dismissed as resolved if the unwanted artifacts can be eliminated and the target biological objects can be segmented for analysis. Although it is undeniable that machine learning is increasingly capable of performing sophisticated image processing and quantitative analyses, it should not be used without the necessary skepticism. Over-reliance on software disregards the fact that high-fidelity image correction through machine learning still requires an appropriate ground truth.
Recognizing the pitfalls caused by biospecimens in imaging data should never be treated as an afterthought. Part of the initial experimental design should include controls that highlight sample-induced problems, as well as the technical means needed to mitigate them (Wait et al., 2020). Taken together, no existing technology relieves the experimenter of the responsibility to characterize the source, extent and heterogeneity of image artifacts caused by the samples. In fact, accurately probing the processes of life necessitates such due diligence.
We thank Tom Hennigan, Laura Grima, Chris Obara and Ben Foster for their insightful discussions. We are also grateful to Amy Hu for preparing the mouse brain slide.
The Advanced Imaging Center at Janelia Research Campus is generously supported by the Howard Hughes Medical Institute and the Gordon and Betty Moore Foundation.
The authors declare no competing or financial interests.