Traditionally, physiologists have estimated the ability of organisms to withstand lower partial pressures of oxygen by estimating the partial pressure at which oxygen consumption begins to decrease (known as the critical PO2 or Pc). For almost 30 years, the principal way in which Pc has been estimated has been via piecewise ‘broken stick’ regression (BSR). BSR was a useful approach when more sophisticated analyses were less available, but BSR makes a number of unsupported assumptions about the underlying form of the relationship between the rate of oxygen consumption and oxygen availability. The BSR approach also distils a range of values into a single point with no estimate of error. In accordance with more general calls to fit functions to continuous data, we propose the use of nonlinear regression (NLR) to fit various curvilinear functions to oxygen consumption data in order to estimate Pc. Importantly, our approach is back-compatible so that estimates using traditional methods in earlier studies can be compared with data estimates from our technique. When we compared the performance of our approach relative to the traditional BSR approach for real world and simulated data, we found that under realistic circumstances, NLR was more accurate and provided more powerful hypothesis tests. We recommend that future studies make use of NLR to estimate Pc, and also suggest that this approach might be more appropriate for a range of physiological studies that use BSR currently.
Understanding the physiological tolerances of an organism to its environment has long been a focus of ecophysiology. Oxygen is a fundamental requirement of most organisms, and its availability may be limiting across a range of habitats (Pörtner, 2010; Verberk et al., 2011; Ferguson et al., 2013). For over 30 years, measures of oxygen tolerance have accumulated in a wide variety of taxa with variation evident at a range of scales (Greenlee and Harrison, 2004a; Mueller and Seymour, 2011; Lease et al., 2012; Ferguson et al., 2013). Surprisingly, no clear consensus over how to estimate oxygen tolerance exists, and for the most part, modern statistical approaches have not been brought to bear on this problem.
The most common estimate of oxygen tolerance is the critical partial pressure of oxygen for aerobic metabolism (Pc), which represents the lowest level of oxygen at which aerobic metabolism is independent of the ambient partial pressure of oxygen (PO2) (Hochachka and Somero, 2002). At levels of PO2 below Pc, metabolism cannot be supported by aerobic processes entirely, and metabolic rate decreases, and/or anaerobic processes that are relatively inefficient and produce potentially toxic end-products become increasingly important (Hochachka and Somero, 2002). The original method for estimating Pc is the ‘broken stick’ regression (BSR) approach (Yeager and Ultsch, 1989) – an approach that remains the most common today. The broken stick approach to estimating Pc has been applied in a range of contexts, and been used to demonstrate, for example, that the Pc of a species is generally matched to the minimum oxygen level encountered in the environment in which it lives (Childress and Seibel, 1998; Nilsson, 2007; Ferguson et al., 2013), and that mobile species show behavioural avoidance of oxygen levels below their Pc (Burleson et al., 2001). Nonetheless, not all species show clear break points in the relationship between the rate of oxygen consumption () and PO2, which complicates efforts to assess the regulatory ability these species (Mueller and Seymour, 2011).
The traditional BSR approach (piecewise linear regression) makes a number of unsupported assumptions about the underlying relationship between oxygen availability and respiration rate. First, it assumes that the functional response of an organism to decreasing PO2 is biphasic, that is, it consists of two linear elements with a clear break between these two phases. Above Pc, is assumed to be characterised by a linear function that is completely flat, while below Pc, decreases linearly with PO2 with an abrupt transition between these two functions (Chiu et al., 2006). Of course, in reality, rates of respiration are likely to be a continuous function between these two phases, and furthermore, concentration-dependent reaction kinetics make a linear relationship between and PO2 highly unlikely. As such, the BSR approach does not reflect the underlying structure of the data, violating the basic assumptions of the regression approach (Quinn and Keough, 2002). Our discussion here should not be taken as a criticism of the original progenitors of these approaches: when they were developed, those analyses represented the best approach available with the statistical and computational tools of the time. Today, however, more sophisticated approaches are available that better reflect the underlying processes that generated the data.
Since the development of the BSR technique, there have been a number of other ways in which authors have attempted to estimate Pc. These approaches have largely been subjective and lack repeatability (e.g. Portner et al., 1991; Greenlee and Harrison, 2004a; Greenlee and Harrison, 2004b; Lease et al., 2012). Ideally, any technique for estimating Pc should both represent the mechanistic process by which the data were generated and be repeatable. More generally, there are compelling reasons for describing continuous traits with functional relationships, rather than taking isolated point measures (Stinchcombe and Kirkpatrick, 2012). For example, Mueller and Seymour (Mueller and Seymour, 2011) fit a nonlinear function to estimate the ability of organisms to regulate oxygen consumption beyond simple oxy-conformity across a range of oxygen values: they propose a ‘regulation index’, which estimates a relative measure of oxyregulatory ability using either linear, quadratic (y=a+bx+cx2) or one-phase association [y=y0+(ymax −y0)(1–e−kx)] fits. Here, we apply a similar logic for estimating Pc that meets the above criteria – a nonlinear regression approach coupled with simple differential calculus. We then evaluate the performance of this new method relative to the most widely used approach to date, BSR, across real-world data and explore a broader parameter space using simulations to compare the sensitivity and reliability of these two approaches. Finally, we examine the statistical power of the two techniques to distinguish differences in oxygen tolerance among groups.
MATERIALS AND METHODS
Fits to published data
After examining the form of many published relationships between and PO2, we settled on six candidate functions that approximated the general relationship well and were reasonably tractable analytically (Table 1). While other forms could also fit the data and we encourage investigators to explore alternatives (e.g. a power function with an intercept), we chose these forms as, on first inspection, they appeared to fit real-world data reasonably well and consistently. We used standard nonlinear regression (NLR) to fit these functions to published data. It is beyond the scope of this paper to describe the general theory and approach of NLR, but we recommend Quinn and Keough [see p. 150 (Quinn and Keough, 2002)] for a general introduction and Ritz and Streibig (Ritz and Streibig, 2008) for an excellent primer on how to implement this analysis in the freely available statistical software R (R Development Core Team, 2012). The NLR analysis provides estimates of between two and four parameters (or even more if a selected function has more): with these values the relationship between and PO2 can be visualised. For ease of use and comparison among studies, however, a single metric that best represents Pc is desirable. Such a metric must be repeatable and objective. We based our metric on the underlying principles of what Pc seeks to describe: the point at which is no longer strongly affected by PO2. In other words, when the slope of the function begins to flatten out and approach zero. As the absolute values of a slope will vary according to the maximum oxygen consumption rate of that particular organism, we first standardised our data by the maximum observed for any set of values (). By standardising, we can choose one slope value and compare where that value occurs among individuals, species and studies.
The slope across function is, of course, given by the first derivative of the original function and the derivative for each function is shown in Table 1. After rearranging we can solve for Pc, at any slope value (here, denoted m), and these are shown in Table 1.
We explored a range of values of m and found that a slope of m=0.065 best approximates Pc such that the solved values for Pc are shown in Table 2.
We coupled this formula to our NLR estimates of a, b, c and d as appropriate to estimate the value of Pc for a range of published data from the literature (see Table 2). NLR relationships with additive error structures were fitted using the ‘nls’ function in R v2.15.0 (R Development Core Team, 2012), and NLR relationships with multiplicative error structures were fitted using the ‘gnls’ function. The NLR and BSR fits to each data set were then compared on the basis of Akaike's information criterion (AIC) as a measure of model fit; the fit with the lowest AIC was considered the best of the candidate set of models, given the data (Burnham and Anderson, 2008). We then also compared the performance of the NLR approach with the BSR method across a range of parameter values to evaluate how each approach coped with differences in variation and sampling resolution with a simulation approach.
We explored the influence of a suite of characteristics of the relationship between and PO2 on NLR- and BSR-derived estimates of Pc using Monte Carlo simulations. We began by generating a function relating to PO2 that was explicitly biphasic, incorporating two linear portions (one that increased with PO2 until PO2=Pc, and a second that was independent of PO2 at PO2 levels greater than Pc). Pc was set at 5 kPa and above Pc was set at 1. We then repeatedly simulated the process of sampling these data, and estimating the value of Pc by fitting both the BSR and NLR. We elected to use a biphasic function because it reflects the underlying assumption of the BSR approach, and because fitting such data should represent the greatest challenge to NLR. Put simply, if NLR outperforms BSR, even when the function is biphasic, then there can be little justification for preferring BSR over NLR, and as such our approach was highly conservative. We sampled the relationship between and PO2 at a range of resolutions (0.125, 0.25, 0.5 and 1 kPa). To each , we then added a normal deviate with a mean of zero and a coefficient of variation (CV=s.d. divided by ) of 0.025, 0.05, 0.1, 0.15 or 0.20. These values for the CV of were selected to span the range of observed values of CV in real data (N=10, mean=0.09, range: 0.04–0.21; see references in Table 1). One thousand such data sets were generated for each combination of sampling resolution and CV, and Pc was estimated using both NLR and BSR techniques. NLR estimates of Pc were derived as described above by fitting a Weibull function to each data set (excluding that of Cryptobranchus as the model would not converge) using the ‘nls’ function in R v2.15.0 (R Development Core Team, 2012) to obtain a non-linear least squares fit using a Gauss–Newton algorithm. We used the Weibull function as this function best fit the most published relationships in the literature (see Results). BSR estimates of Pc were obtained by simultaneously fitting two linear regressions constrained to meet at a specified PO2, i.e. Pc. The slope of the linear regression below the specified Pc was a free parameter, whereas the slope above the specified PO2 was set at zero. Beginning at the third-lowest PO2 in the data set, a series of specified values of PO2 were trialled, with each successive PO2 being 0.01 kPa greater than the last until the third-highest PO2 in the data set was reached. The value of the PO2 break point that minimised the sum of squared deviations from the biphasic function was considered to represent Pc.
Testing for differences among groups
We compared the ability of our best-fitting NLR and BSR to identify differences among groups that differ in Pc with Monte Carlo simulations. For two groups of six simulated data sets, each with PO2 values sampled at 1 kPa resolution, we generated a relationship between and PO2 that included a break point (Pc) at either 6.5 kPa (group 1) or 8.5 kPa (group 2). As a conservative measure, these values of Pc were deliberately chosen to be different from the Pc at which NLR performs best. As above, for each data set increased linearly with PO2 to equal 1 when PO2=Pc. For values of PO2 above Pc, was independent of PO2 and equal to 1. To each value of we then added a normal deviate with a mean of zero and CV of 0.10. We then estimated Pc for each of the data sets using BSR or NLR with m=0.065, and tested for differences among groups using t-tests. We also tested for differences among groups using the best-fitting NLR regression by pooling data sets for each group, and testing for the significance of a fixed grouping factor using likelihood ratio tests. This simulation procedure was repeated 1000 times, and statistical power was estimated by calculating the proportion of tests that reported a significant difference among groups with α set at 0.05.
Fits to published data
Both techniques provide a good fit to real-world data (Fig. 1), and the predicted Pc based on NLR was strongly correlated with the Pc estimates based on BSR (R2=0.82, N=13, P<0.001), though the relationship was not precisely one to one (regression equation: Pc(NLR)=0.646 × Pc(BSR) + 1.12). Neither consistently over- or underestimated the other; rather, the relationship between the two was idiosyncratic (Fig. 2). In eight out of the 13 cases, the NLR approach provided a better fit to the data than did the BSR approach (Table 2). For the remaining five cases where the BSR approach had a lower AIC, the differences in AIC between the BSR and NLR approaches were equivocal (ΔAIC around 2) in two cases. For nine out of 13 cases, the Weibull function (either with or without an intercept) provided a better fit to the data than any other nonlinear function. For the four remaining cases, each of the other four nonlinear functions provided the best fit for one case, suggesting that all should be considered candidate functions in future studies.
The best-fitting NLR was far more robust to variation in the data than was the BSR (Fig. 3). While both performed well when variation was very low and sampling resolution very high, the broken stick technique was far more likely to dramatically misestimate Pc when variance was high. The NLR also coped with lower sampling resolution relative to the BSR: at sampling resolutions greater than 0.5 kPa, the error rate of the BSR increased dramatically, with this approach being far more likely to misestimate Pc. For the BSR, the effects of sampling resolution and error interacted such that the most error-prone estimates came from simulated data with high levels of variation and low sampling resolutions (Fig. 3). In contrast, neither resolution nor variation had strong effects on the error rate of the NLR technique.
Testing for differences among groups
Power to detect differences among groups was lowest for t-tests comparing BSR-derived estimates of Pc (0.63), and was higher for t-tests comparing best NLR-derived estimates of Pc (0.74); power was also relatively higher for comparisons of pooled data made using likelihood ratio tests (0.83).
For both real-world and simulated data, an NLR approach to estimating tolerance to decreasing availability to oxygen was superior to a BSR approach in almost every regard. The NLR approach was more robust to variation in the data and reductions in sampling resolution, as well as offering a more powerful means of detecting differences between groups. Our simulations suggest that BSR has a tendency to overestimate Pc when data are variable, sampling resolutions are low and the underlying ‘true’ Pc is low. It is possible therefore that some species may be more tolerant to low oxygen conditions than is currently appreciated. NLR performed less well in simulations when the underlying Pc was close to the highest Pc measured in the study, and in this instance, BSR was a more accurate indicator of the Pc. The poor performance of NLR in this parameter space was due to the relationship between and PO2 being linear in our simulations (recall that we used an artificially biphasic function) and as such, it was inevitable that the nonlinear function fit the data poorly. We therefore favour the use of NLR over BSR despite this shortcoming because in nature, very high values of Pc are very rare, and genuinely linear relationships between and Po2 are similarly rare. Nevertheless, we suggest that visual inspections of the data be conducted before Pc is estimated, and on the rare occasions when the relationship between and PO2 appears to be linear, linear regression be used instead of NLR. In other words, as for any statistical model-fitting exercise, the underlying form of the relationship should dictate the form of the model that is fit (Quinn and Keough, 2002).
The use of NLR to estimate tolerance to low oxygen carries the additional advantage of allowing formal testing for differences among groups within the same statistical framework. Traditionally, comparisons of Pc among species, individuals or groups have involved estimating Pc within groups using the BSR approach, using these estimates of Pc as individual data points and then using a separate statistical test to detect differences (e.g. Ferguson et al., 2013). There are a number of disadvantages to this two-stage approach. First, estimating the Pc as a summary statistic for the entire function results in significant loss of information – essentially, a large amount of data is collected to estimate a single point, and then the bulk of the data are ignored in the hypothesis test (Stinchcombe and Kirkpatrick, 2012). Second, using the Pc summary statistic ignores error and variance associated with this estimate, and the subsuming of error will result in hypothesis tests that are more susceptible to Type I errors (Hadfield et al., 2010). In contrast, formal hypothesis tests of differences across entire functions among groups allow the use of the full complement of data that were collected as well as incorporating the appropriate error into the hypothesis test of interest (Stinchcombe and Kirkpatrick, 2012). Furthermore, the NLR method could be extended to include random effects, whereby error associated with different experimental equipment or different measurement days could be partitioned, increasing the sensitivity of this analytical approach and providing much greater scope to incorporate additional factors.
As well as the statistical advantages described above, the use of NLR affords practical advantages for comparisons among studies and meta-analyses. For example, studies from different research groups may not sample at identical values of PO2 and may sample with different resolutions. Our simulations show that differences in sampling resolution will result in systematic differences in the estimate of Pc, and furthermore, the estimate of Pc will be a product of the PO2 at which was measured. In contrast, an NLR approach is robust to these differences in sampling regime, allowing more comparable estimates of Pc among studies, as well as estimates of across a greater parameter space. Furthermore, no single function fit the data best such that comparisons among studies could be problematic. The retention of an estimate of Pc in the context of our approach provides a means of comparing equivalent slopes among studies that have fit very different functions.
We chose a point (where dV/dP=m=0.065) on our function that best approximated the Pc estimated by the BSR approach. We chose this point so that future studies can provide estimates of oxygen tolerance that are comparable with traditional measures of Pc. In practice, we recommend that future studies that utilise the NLR approach present estimates of not only Pc, but also estimates of function parameters (i.e. estimates of a, b, c and d, as appropriate), as well as their associated error, as these values will allow comparisons across the entire range of oxygen partial pressures. Standardised estimates of parameters in selection in evolutionary biology have facilitated important insights from meta-analyses (Kingsolver et al., 2001), and we hope similarly valuable insights might become available via the widespread use of the standardised approach we recommend here.
The use of approaches such as BSR to estimate biological ‘thresholds’ in nonlinear responses more generally is increasingly being scrutinized. The estimation of ecological thresholds using BSR techniques has been called into question (Toms and Lesperance, 2003), as has the distillation of continuous functions into single points in evolutionary studies (Stinchcombe and Kirkpatrick, 2012). Generally, the failure to use nonlinear functions to describe nonlinear traits results in less powerful, less accurate estimates (Griswold et al., 2008). BSR techniques are also used to estimate changes in developmental rates across temperatures (Stanwell-Smith and Peck, 1998), as well as cardinal temperatures (estimates of maximum, minimum and optimal temperature), and we suspect that NLR techniques may be more appropriate in these settings as well. While BSR techniques were once a pragmatic recognition of limited computing power and statistical programming, today we have unprecedented access to powerful statistical estimation techniques with freely available statistical software. We join others (Stinchcombe and Kirkpatrick, 2012) in calling for the increased uptake of ‘function-valued’ approaches to estimating continuous traits such as the relationship between and PO2.
We thank Hayley Cameron, Amy Hooper, Marcelo Lagos Orostica and Matthew Thompson for help in preparing the manuscript. Two anonymous reviewers provided extremely thoughtful and helpful comments that improved the manuscript.
D.J.M., M.B. and C.R.W. are all supported by grants from the Australian Research Council.
No competing interests declared.