## SUMMARY

Maximal performance is an essential metric for understanding many aspects of an organism's biology, but it can be difficult to determine because a measured maximum may reflect only a peak level of effort, not a physiological limit. We used a unique opportunity provided by a frog jumping contest to evaluate the validity of existing laboratory estimates of maximum jumping performance in bullfrogs (*Rana catesbeiana*). We recorded video of 3124 bullfrog jumps over the course of the 4-day contest at the Calaveras County Jumping Frog Jubilee, and determined jump distance from these images and a calibration of the jump arena. Frogs were divided into two groups: ‘rental’ frogs collected by fair organizers and jumped by the general public, and frogs collected and jumped by experienced, ‘professional’ teams. A total of 58% of recorded jumps surpassed the maximum jump distance in the literature (1.295 m), and the longest jump was 2.2 m. Compared with rental frogs, professionally jumped frogs jumped farther, and the distribution of jump distances for this group was skewed towards long jumps. Calculated muscular work, historical records and the skewed distribution of jump distances all suggest that the longest jumps represent the true performance limit for this species. Using resampling, we estimated the probability of observing a given jump distance for various sample sizes, showing that large sample sizes are required to detect rare maximal jumps. These results show the importance of sample size, animal motivation and physiological conditions for accurate maximal performance estimates.

## INTRODUCTION

Maximal performance is a key measurement in linking the ecology, fitness, biomechanics and morphology of animals (Arnold, 1983; Bennett and Huey, 1990). Several aspects of maximal locomotor performance are readily testable in the laboratory, show strong repeatability within individuals, and often correlate with key physiological variables (Adolph and Pickering, 2008; Bennett, 1980; Bennett and Huey, 1990; Huey and Dunham, 1987; Irschick and Garland Jr., 2001; Jayne and Bennett, 1990; Losos et al., 2002). However, maximal performance studies can be confounded by persistently sub-maximal behavior of individuals, particularly when sample size is limited (Bennett and Huey, 1990; Losos et al., 2002).

The maximal jumping ability of anurans is a mechanically simple escape behavior that has been used to study the links between performance and morphology (Zug, 1972), enzyme activity (Putnam and Bennett, 1983), muscle physiology (Lutz and Rome, 1994; Marsh and John-Alder, 1994; Peplowski and Marsh, 1997) and ecology (Phillips et al., 2006). Studies of mechanical power output during maximal jumps have revealed that many frog species consistently generate mechanical work outputs that are close to the theoretical limits for vertebrate skeletal muscle (Peplowski and Marsh, 1997). These high work outputs are facilitated by an elastic mechanism that allows muscle work to be stored slowly, followed by an explosive release of this energy to produce very high power outputs during a jump (Astley and Roberts, 2012; Peplowski and Marsh, 1997). While several frogs studied to date show evidence of an elastic power amplifier and high muscle work outputs during jumping, there is variation among species. Specifically, bullfrogs (*Rana catesbeiana*) and their smaller congenerics typically produce much lower mechanical work output during jumping when compared with hylid tree frogs. These differences are apparent in a simple comparison of maximum jump distance for Cuban tree frogs (*Osteopilus septentrionalis*) and bullfrogs. The single best jump recorded in the laboratory for a Cuban tree frog is 1.7 m, while the best bullfrog single jump distance is 1.3 m, and most measurements consistently report a maximum of 1 m or less for bullfrogs and congenerics (Lutz and Rome, 1994; Olson and Marsh, 1998; Roberts and Marsh, 2003; Zug, 1978). The consistently lower maximal jumping performance of ranid frogs has been attributed to a tradeoff between jumping and swimming performance in these semi-aquatic animals (Olson and Marsh, 1998).

Conclusions about interspecific variation in performance rely on the assumption that performance measured in the laboratory represents a true maximum. In few cases can objective criteria be applied to assess the level of effort during a maximal performance trial. Experimenters address this challenge methodologically by performing a large number of trials on as large a sample of animals as possible, and often by excluding poor performers based on subjective observations (Losos et al., 2002). Our own confidence in the effectiveness of this approach in studies of bullfrog jumping was shaken by observations recorded not in the scientific literature, but in another source for reports of animal performance extremes, *The Guinness Book of World Records* (Guinness World Records, 1997). *The Guinness Book of World Records* reports a record from a frog jumping contest, the Calaveras County Frog Jumping Jubilee, in which contestants attempt to maximize the straight-line distance covered by a bullfrog in a series of three jumps. In 1986, ‘Rosie the Ribeter’, a bullfrog jumped by contestant Lee Guidici, covered a total distance of 21 feet 5.75 inches (6.55 m) in a three-jump series. This reported value would correspond to a single jump distance of 2.18 m if the jumps were equal length, or longer if they were not. This is strikingly different from the typical values of 1 m observed in scientific studies, and well beyond the single longest jump distance of 1.295 m recorded in the scientific literature (Zug, 1978).
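The conversion behind the 2.18 m figure can be checked with a short calculation. This is only an arithmetic sketch; the helper name `feet_inches_to_m` is ours and not part of the original record.

```python
INCH_TO_M = 0.0254  # exact definition of the inch in metres


def feet_inches_to_m(feet, inches):
    """Convert a length given in feet and inches to metres."""
    return (feet * 12 + inches) * INCH_TO_M


# Reported three-jump record: 21 feet 5.75 inches.
total_m = feet_inches_to_m(21, 5.75)
# Mean single-jump distance if all three jumps were of equal length.
per_jump_m = total_m / 3

print(round(total_m, 2), round(per_jump_m, 2))  # prints: 6.55 2.18
```

Because the three jumps need not be equal, 2.18 m is a lower bound on the longest jump in that series.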

The Calaveras County Frog Jumping Jubilee, a contest inspired by the Mark Twain short story, has been held annually in Angels Camp, CA, for over 80 years. The contest consists of 3 days of qualifying rounds, followed by a day of finals to determine the winner. Contestants fall into two categories. ‘Professional’ frog jockeys bring their own locally-caught frogs and are serious competitors, often working in family groups that have passed down frog jumping secrets through generations of competition. ‘Amateurs’ compete with frogs rented from the fair organizers. We saw the contest as an opportunity to test the hypotheses that current laboratory-based measurements of frog jumping underestimate true maximal performance, and that large sample sizes are necessary to provide reliable estimates of maximal performance in bullfrogs. We used high-definition video recordings of the 84th annual contest to determine jump distance, and used this unusually large biomechanical data set to attempt to determine the sample sizes needed to observe maximum performance.

## MATERIALS AND METHODS

### Data collection

Frog jumps at the 84th annual Calaveras County Jumping Frog Jubilee were recorded with a Sony HDR-FX1 camcorder at 30 frames s^{−1} (60 fields s^{−1}) and 1440×1080 pixel resolution from a fixed position in the seating stands. Video files were de-interlaced prior to digitizing. All frogs were bullfrogs [*R. catesbeiana* (Shaw 1802)]. Contestants placed their frog on a standard starting location and induced it to jump three times in succession in order to achieve the maximum straight-line distance from the starting point. Contestants motivated the frogs by yelling, touching the frog, blowing on it, lunging towards it, or combinations thereof, although contact with the frog is forbidden after the first jump. Although only three jumps were required, some frogs jumped additional times, and these jumps were also included in our analysis. At the beginning and end of each day's filming, we recorded a brief clip of a calibration grid of six 148 cm squares placed on the stage; the grid was then digitized using a MATLAB digitizing script (Hedrick, 2008). These data were used to create a perspective transformation that was applied to digitized coordinates from jump videos. The locations of the frog's body at the first perceptible jump movement and at first body–ground contact were digitized for each jump in the sequence and perspective transformed in MATLAB, and distances and jump durations were computed from the transformed data. In several instances, the frog performed a rapid series of short, shallow jumps in which forward velocity was maintained during ground contact, similar to the ‘skittering’ behavior some species use to move across the surface of the water (Gans, 1976; Herrmann, 2006). Because these ‘skitters’ violated key assumptions underlying performance limits in frog jumping, such as that all energy for each jump is generated *de novo*, they were excluded from the data set.
To assess accuracy, at the end of each day we filmed a tape measure locked at 213 cm as it was placed in seven locations around the stage at various angles to the camera. Subsequent digitizing and transformation of tape measures showed no consistent bias in distance with a 95% confidence interval of 1.6 cm.
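A perspective transformation of this kind can be sketched as a planar homography. The Python below is a minimal illustration, not the MATLAB code used in the study: it assumes exactly four corner correspondences from a single 148 cm square (the study digitized a six-square grid), and the function names and pixel coordinates are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_homography(image_pts, world_pts):
    """Fit a 3x3 homography (h33 fixed to 1) from four point pairs."""
    a, b = [], []
    for (x, y), (u, v) in zip(image_pts, world_pts):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def to_world(h, pt):
    """Map one image point (pixels) to stage coordinates (cm)."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def jump_distance_cm(h, start_px, land_px):
    """Straight-line distance on the stage between two digitized points."""
    (x0, y0), (x1, y1) = to_world(h, start_px), to_world(h, land_px)
    return math.hypot(x1 - x0, y1 - y0)


# Hypothetical pixel corners of one calibration square and their
# known stage coordinates in cm.
image_corners = [(100, 400), (500, 400), (420, 200), (180, 200)]
world_corners = [(0, 0), (148, 0), (148, 148), (0, 148)]
H = fit_homography(image_corners, world_corners)
```

With more than four calibration points, a least-squares fit would be the natural extension; the tape-measure check described above serves as an independent validation of whichever fit is used.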

### Rental frogs *versus* professionally jumped frogs

Frogs were categorized into two discrete groups. The first group consisted of ‘rental frogs’ provided by the fair and jumped by a diverse selection of fairgoers. The second group consisted of ‘professionally jumped frogs’, fielded by highly organized teams who had competed for many years or decades. These teams collected frogs from specific sites and pre-screened them for jump ability, then maintained, prepared and stimulated the frogs to jump using methods gleaned from trial-and-error experience. Identification of ‘rentals’ and ‘pros’ was made on the basis of announcements made by fair organizers prior to each jumping trial. Although we were not allowed to take measurements of frogs in the pros group, there were no visually discernible differences in size or overall morphology. A small number of frogs were brought by independent individuals not associated with teams, or were not identified as either rentals or pros; these categories were not included as part of the rentals or pros data sets, because of uncertain background and low sample sizes, but were included in overall results.

### Derived performance variables

Video measurements allowed the direct determination of total jump distance (*D*_{jump}) and total jump duration (*T*_{j}), and from these variables we calculated the angle (θ) and takeoff velocity (*V*_{t}) of each jump using ballistic formulae. While a given jump distance can be achieved *via* many combinations of takeoff velocity and angle, each of these combinations will result in a different jump duration, only one of which will match our observed jump duration.
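The matching of estimated to measured duration can be illustrated with a deliberately simplified model. The sketch below assumes a point mass that takes off and lands at the same height and ignores the takeoff and descending phases, so it is not the set of equations used in this study; the function name `solve_takeoff`, the 0.01 deg angle grid and the value of `G` are our assumptions.

```python
import math

G = 9.81  # acceleration due to gravity (m s^-2), assumed value


def solve_takeoff(d_jump, t_jump):
    """Grid-search the takeoff angle whose predicted duration best
    matches the measured duration, then return (angle_deg, velocity).

    Simplified ballistics (assumption):
      distance = v^2 * sin(2*theta) / g
      duration = 2 * v * sin(theta) / g
    """
    best = None
    for i in range(8801):  # 1.00 to 89.00 deg in 0.01 deg steps
        theta = math.radians(1.0 + i * 0.01)
        # Velocity required to cover d_jump at this angle.
        v = math.sqrt(G * d_jump / math.sin(2 * theta))
        # Duration that this angle-velocity pair would produce.
        t_est = 2 * v * math.sin(theta) / G
        err = abs(t_est - t_jump)
        if best is None or err < best[0]:
            best = (err, theta, v)
    return math.degrees(best[1]), best[2]
```

In this simplified model the predicted duration rises monotonically with angle for a fixed distance, which is why a single angle matches a given distance and duration pair.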

In Eqn 1, *L*_{cm} is the distance from the distal toe tip to the center of mass with legs fully extended and *V*_{t} is takeoff velocity (Marsh, 1994). Aerial duration was given by Eqn 2, where *g* is the acceleration due to gravity and θ is the takeoff angle. Descending duration was approximated by Eqn 3, and the total jump duration (Eqn 4) is the sum of all three (Eqns 1, 2 and 3). *D*_{jump} is total jump distance; the expression for it can be rearranged to give Eqn 6 and substituted into Eqn 4, allowing calculation of an estimated *T*_{j} for a given angle based on actual *T*_{j}, jump distance and *L*_{cm}. These estimated *T*_{j} values for a variety of angles are then compared with measured *T*_{j}. Once the angle is known, *V*_{t} can be calculated *via* Eqn 6. Jumps with a distance of less than three times the length from the toe to the center of mass (*L*_{cm}) or with jump durations of less than 0.4 s were excluded from this analysis because of large error relative to small values.

Here, *M*_{m} is the proportion of muscle mass to body mass, assumed to be 24% of the total frog body mass based on prior measurements (Marsh, 1994). Average power per unit muscle mass was Eqn 12 divided by Eqn 1.

Peak power was calculated as twice the average power (Marsh, 1994).
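The work and power calculation can be sketched as follows. The 24% muscle fraction and the rule that peak power is twice average power come from the text; the remaining relations are our assumptions for illustration: takeoff energy per unit body mass is taken as kinetic energy plus the potential energy gained while the center of mass rises over a leg extension of *L*_{cm}, and takeoff duration is taken as 2*L*_{cm}/*V*_{t} (constant acceleration from rest). The function name is hypothetical.

```python
import math


def muscle_work_and_power(v_t, theta_deg, l_cm, muscle_fraction=0.24, g=9.81):
    """Return (work, average power, peak power) per kg of muscle.

    v_t        takeoff velocity (m/s)
    theta_deg  takeoff angle (degrees)
    l_cm       toe-to-center-of-mass distance, legs extended (m)
    """
    # Energy per kg of body mass at takeoff (assumed form).
    kinetic = 0.5 * v_t ** 2
    potential = g * l_cm * math.sin(math.radians(theta_deg))
    # Scale to muscle mass: muscle is a fixed fraction of body mass.
    work = (kinetic + potential) / muscle_fraction  # J per kg muscle
    # Takeoff duration under constant acceleration from rest (assumed).
    t_takeoff = 2 * l_cm / v_t
    avg_power = work / t_takeoff  # W per kg muscle
    peak_power = 2 * avg_power    # peak taken as twice average (Marsh, 1994)
    return work, avg_power, peak_power
```

For example, `muscle_work_and_power(4.0, 30.0, 0.2)` uses made-up inputs, not measurements from this study, and simply shows how the three quantities relate.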

### Statistics and resampling

A series of *t*-tests (JMP 7.0, SAS Institute, Cary, NC, USA) was used to determine differences between pros and rentals for jump distance, angle, takeoff velocity, work and power. We performed orthogonal regressions between the first jump distance and the jump distances of the next four jumps in a single series in order to assess fatigue between successive jumps and consistency of individual performance. Fatigue would be evident by a regression slope significantly lower than 1, while individual consistency would be reflected in the *r*^{2} value of each regression.
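For readers unfamiliar with orthogonal regression, the Python below gives a minimal sketch of a major-axis fit, which minimizes perpendicular rather than vertical distances to the line. It assumes equal error variance in both variables; the JMP procedure used here offers other variance-ratio options, so this is an illustration rather than a reproduction of that analysis. The function name is ours.

```python
import math


def orthogonal_regression(x, y):
    """Major-axis (orthogonal) regression of y on x.

    Returns (slope, intercept, r_squared). Assumes equal error
    variance in x and y and a non-zero covariance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Slope of the first principal axis of the (x, y) scatter.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared
```

Applied to first-jump distance (x) and a later jump's distance (y), a slope below 1 would indicate fatigue and the *r*^{2} value would indicate individual consistency, as described above.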

In order to assess the sample sizes needed to detect maximal performance, we shuffled all jumps of either pros or rentals, as well as a subset (‘personal best’) representing the best jump in a single series (with a minimum of three recorded jumps), and determined the maximum jump distance observed at a given sample size. From a data set of 100,000 repetitions, we determined the percent chance of observing a given jump distance for a subset of sample sizes (*N*=10, 50 and 300) for both rentals and pros.
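The resampling procedure can be sketched in a few lines of Python. This is an illustrative reimplementation under our own assumptions (sampling without replacement within each repetition, a fixed random seed, hypothetical function names), not the code used in the study.

```python
import random


def personal_bests(series, min_jumps=3):
    """Best jump from each series with at least min_jumps recorded jumps."""
    return [max(jumps) for jumps in series if len(jumps) >= min_jumps]


def prob_max_at_least(jumps, n, threshold, reps=10000, seed=0):
    """Estimate the chance that the longest of n randomly drawn jumps
    reaches at least `threshold`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Drawing n jumps without replacement is equivalent to
        # shuffling the data set and taking the first n.
        sample = rng.sample(jumps, n)
        if max(sample) >= threshold:
            hits += 1
    return hits / reps
```

Calling `prob_max_at_least(distances, 50, 1.6)` on a list of measured distances would estimate the chance of seeing a jump of at least 1.6 m in a sample of 50; passing the output of `personal_bests` instead repeats the estimate for the ‘personal best’ subset.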

## RESULTS

Many of the jump distances of bullfrogs at the Calaveras County Frog Jumping Jubilee exceeded all previous records of performance in the scientific literature (Olson and Marsh, 1998; Roberts and Marsh, 2003; Zug, 1978). Of the 3124 jumps we quantified, 1804 (58%) exceeded the maximum jump distance reported in the literature [1.295 m (Zug, 1978)] (Fig. 1A). The longest jump in our sample was 2.2 m, 70% longer than the prior maximum from the literature (Fig. 1A, Table 1) (Zug, 1978).
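The two percentages follow directly from the reported counts and distances, as this short check shows (variable names are ours):

```python
total_jumps = 3124
jumps_over_record = 1804
literature_max_m = 1.295
longest_jump_m = 2.2

fraction_over = jumps_over_record / total_jumps
relative_gain = longest_jump_m / literature_max_m - 1

print(f"{fraction_over:.0%} of jumps exceeded the literature maximum")
print(f"longest jump was {relative_gain:.0%} longer than that maximum")
```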

All jump performance variables, including jump distance and takeoff velocity, were significantly greater in professionally jumped frogs compared with rental frogs (*P*<0.0001 for all), though there was always substantial overlap (Table 1). Additionally, jump distance showed strong differences in distribution. Rental frog jump distances had a nearly normal distribution with minimal skew (Shapiro–Wilk *W*=0.97, skewness=0.23), while professionally jumped frogs had a strongly skewed distribution (Kolmogorov–Smirnov–Lilliefors *D*=0.12, skewness=−1.1; Fig. 1B).
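Skewness values such as those above can be computed from the second and third central moments. The sketch below uses the simple moment-based estimator; statistical packages may apply a small-sample correction, so values can differ slightly from those reported. The function name is ours.

```python
def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5
```

A negative value indicates a longer tail toward short jumps, with most observations concentrated near the upper end of the range.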

Jump angle values calculated from jump distance and flight time showed that frogs used a wide range of jump angles. The best jumps occurred over a narrow range of jump angles, with a plateau from 38 to 44 deg (Fig. 2B). This observation agrees well with predictions from a consideration of ballistics and force production, which predict optimal take-off angles of 39 to 42 deg (Marsh, 1994). While jump distance is determined by both takeoff velocity and angle, the relatively limited variation in jump distance achieved for a given velocity (Fig. 2A) compared with the large variation in jump distance for a given angle (Fig. 2B) suggests that takeoff velocity is the primary determinant of jump distance.

Orthogonal regressions between the first jump and subsequent jumps of the same individual show a high to moderate level of consistency in an individual's performance (Fig. 3). All regression lines had slopes that were not significantly different from 1. Individual jump performance was consistent, with *r*^{2} values ranging from 0.57 to 0.20. The correlation between the first jump distance and subsequent jump distances declined with each jump (Fig. 3). Declines in jumping performance for jumps 4 and 5 may be due to a decrease in motivation, as jockeys often did not pursue frogs after the three jumps required for the contest.

Random resampling of our data set allows us to examine the relationship between sample size and the likelihood of observing maximal performance. Resampling results show a highly non-linear relationship between sample size and jump distance observed (Fig. 4). The chance of observing jump distances close to the mean of our complete data set was close to 100% even at low sample sizes, but the chance of observing longer jumps declined rapidly (Fig. 4). For example, in a sample of 50 rental frog jumps, the chance of observing a 1.6 m jump (slightly longer than the average for pros; Table 1) is only 56%, but it is almost 100% for a sample of 50 professional frog jumps (Fig. 4). Reducing the sample size to 10 rental frog jumps reduces the chance of observing a 1.6 m jump to a mere 14%. Restricting observations to ‘personal best’ jumps increases the probability that a given jump distance will be observed in a given sample size, with an 88% chance of observing a 1.6 m jump in a sample of 50 rental personal bests, and a 68% chance of observing a 2.15 m (near-maximal) jump in a sample of 100 ‘personal best’ professional frog jumps.

## DISCUSSION

Our data confirm our hypotheses that prior estimates of bullfrog maximal jump distance significantly underestimate the capability of the species and that large sample sizes are necessary to determine maximal performance. Several potential reasons for the prior underestimate are highlighted by comparisons between professionally jumped frogs and rental frogs.

### Understanding the superior performance of professionally jumped frogs

Why do frogs jumped by ‘professional’ frog jockeys consistently jump so far? Rental frogs and professionals' frogs are caught just prior to the contest and in most cases come from the same river drainage, the Sacramento–San Joaquin River Delta. We did not take any physiological or morphological measurements on the frogs, but they appeared to be of similar size and condition. Professional frog jockeys do have favorite, and secret, locations for catching frogs, so we cannot rule out the possibility that populations of frogs with exceptional anatomy or physiology exist. However, we believe the most likely explanation for the superior jumping performance of these frogs rests in the techniques employed by the jockeys.

Professionally jumped frogs were pre-screened by teams prior to competition. Many teams catch hundreds of frogs and use a brief jumping trial to screen out individuals with low performance. This pre-screening could account for both the increased mean performance and the highly skewed distribution of jump distances (Fig. 1B). This would suggest that jumping behavior and performance are relatively consistent for an individual frog. There is no evidence that any of the frog jockeys train frogs; instead, professionals prioritize keeping the animals in captivity for as short a time as possible.

Over years of competition, professional frog jockeys have learned to exploit the thermal physiology of the frogs, and have likely settled on a thermal optimum for jumping performance. Professionals' frogs were maintained in warm environments prior to jumping, with a carefully regulated target temperature of ~29°C. Because rental frogs were stored in a shaded area, their mean body temperature during jumps was lower, ranging from 20 to 23°C. The thermal optimum for jumping performance varies among species (John-Alder et al., 1988; Knowles and Weigl, 1990; Marsh, 1994), and can vary within species depending upon acclimation temperature. Thermal performance curves measured for jump distance in congenerics are relatively shallow over the 20–30°C range. Among existing measurements the largest increase in performance from 20 to 30°C is ~25% (John-Alder et al., 1988). Thus while the influence of temperature on muscle performance (Bennett, 1984; Marsh, 1994) may explain some of the extraordinary performance of the professionally jumped frogs, it does not appear to be sufficient to explain all of the difference in jump distance between groups, and it cannot explain the skewed distribution of jump distances in this group.

The techniques used by jockeys to motivate frogs may also play a crucial role in the difference in jump performance. Rental frogs were jumped by diverse fairgoers using a wide range of stimuli. In contrast, professional frog jockeys employed a very stereotyped sequence of actions (supplementary material Movie 1), including rubbing the frog's legs, dropping the frog onto the jump pad from a short height, and lunging after the escaping frog head-first. The convergence of all teams on similar motivational behavior after decades of trial and error suggests that such methods result in improved jump performance.

### Evidence for a physiological limit

Three observations suggest that the maximum jump distance in this study may be a reliable estimate of the physiological limit for the species. First, the skewed distribution of jump distance for professionals' frogs, with a sharp drop-off in number of jumps longer than 1.9 m, would seem consistent with reaching an absolute performance limit (Fig. 1). The normal distribution of frogs jumped by amateurs, by contrast, is consistent with the idea that there is a large component of behavioral variation in this group.

The second observation that supports the idea that the best jumps in the contest represent a physiological limit comes from historical records. Since the first contest in 1930, the winning jump distance increased continuously for 50 years, finally plateauing in the early 1980s (Fig. 5). This pattern is strikingly similar to the curves of maximum recorded running speed over time for humans, greyhounds and thoroughbred horses (Desgorces et al., 2012). Just as the increase in running performance is explained in part by improvements in training techniques, historical improvements in frog jumping distance may be explained by technique, not of the frog but of the jockey. Learning to warm the frogs, to keep them in captivity for only a short time, and to execute particular techniques for motivating frogs all likely contributed to the trend of improved performance from year to year. Genetics may also contribute, as the technique of selecting the best jumpers from a large pool of frogs may have been learned over the course of the contest, and this increased sampling of the genetic pool would increase the chances of finding performance standouts. We learned nothing to suggest that frogs are bred, but it is possible that contestants have discovered subpopulations of frogs with unusual jumping performance. Having observed the importance of jockeying technique, we speculate that improvements in the skill of the jockeys are a key determinant of the historical pattern observed in Fig. 5. If this is the case, the example provides a sobering caution for investigators designing experiments to elicit maximum performance. Most investigators study a few dozen animals at most, and use a very limited trial-and-error method to sort out methods to motivate performance. In the Calaveras frog jumping contest, it took hundreds to thousands of motivated contestants 50 years to find the conditions and techniques that would produce maximal performance in jumping bullfrogs.

Third, calculations of the muscular work output in jumping also support the idea that the longest jumps in the contest represent the physiological limit for bullfrogs. Jump distance is ultimately determined by the kinetic and potential energy of the frog at takeoff, and is therefore limited by the mechanical work performed by the muscles during launch (Lutz and Rome, 1994; Marsh, 1994; Peplowski and Marsh, 1997). If we make the conservative estimate that all of a frog's hindlimb muscles are involved in jumping, and that bullfrog leg muscles comprise approximately 24% of total body mass (Marsh, 1994), the work per unit muscle mass is 44.6 J kg^{−1} for the longest observed jump (Table 1). This value is significantly higher than prior estimates for bullfrogs (Olson and Marsh, 1998), and near the theoretical upper limit for muscle work in a rapidly contracting muscle (Peplowski and Marsh, 1997).

### Large sample size is needed for accurate determination of maximal performance

While this study clearly shows the importance of sample size in determining maximal performance, similar samples may not be feasible for other studies, particularly in laboratory settings. Most studies do not have access to such large sample sizes, nor large numbers of people perfecting stimuli and conditions over multiple decades. How much sampling effort is actually necessary to achieve a reasonable estimate of maximal performance, and what steps can be taken to achieve better predictions of maximal performance from a given sample size of data?

One common strategy used to determine maximum performance is to select the ‘personal best’ of all trials for a given individual (Adolph and Pickering, 2008; Garland and Losos, 1994; Losos et al., 2002). By selecting the best jump of three or more from our data set, the sample size necessary to observe a given jump distance declined sharply (Fig. 4). While this may not reduce effort for simple performance assessments, it may allow for a more focused approach to any additional data processing in more complex studies.

An additional possibility is to track the maximum during sampling. In spite of differences in absolute values, the shapes of the curves of mean maximum jump distance observed *versus* sample size are quite similar, showing a steep slope that declines as the observed maximum asymptotically approaches the actual maximum (Fig. 4). If, during the course of sampling, the observed maximum's increase slows and remains static over many samples, this suggests a reasonable portion of the overall variation in the population has been captured and that only extensive additional sampling will further raise the observed maximum (Fig. 4).
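One way to make this tracking concrete is a simple stopping rule based on the running maximum. The sketch below is hypothetical: the `patience` threshold (how many consecutive samples without a new maximum count as a plateau) is our assumption, not a value derived from this study.

```python
def samples_until_plateau(distances, patience):
    """Track the running maximum and report when it has not increased
    for `patience` consecutive samples.

    Returns (samples_taken, observed_max), or (None, observed_max)
    if no plateau of that length occurs.
    """
    best = float("-inf")
    since_increase = 0
    for count, d in enumerate(distances, start=1):
        if d > best:
            best = d
            since_increase = 0
        else:
            since_increase += 1
        if since_increase >= patience:
            return count, best
    return None, best
```

A plateau under such a rule indicates only that further gains are likely to require extensive additional sampling, not that the true maximum has been reached.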

The exact number of animals needed, as well as the importance of ‘fine-tuning’ the environment and motivation for performance, is likely to vary with species and behavior. Even within frog jumping, our experience suggests that some species perform more consistently at high levels of performance. The observation that existing laboratory measurements of jump performance in Cuban tree frogs involve muscle work outputs near the theoretical maximum (Peplowski and Marsh, 1997) reinforces our more subjective observations that these animals perform consistently in the laboratory. The difference between the efforts necessary to obtain maximal performance in Cuban tree frogs *versus* bullfrogs may ultimately be based in behavioral traits related to their habitats. While Cuban tree frogs must jump far enough to reach another tree branch or to completely evade a predator, bullfrogs must only jump the short distance from their typical shoreline position into the water, where they can conceal themselves. Even closely related frogs with different habitat preferences show differences in escape behavior (Licht, 1986), and frogs with similar habitat preferences to bullfrogs will tolerate closer predator approaches the closer they are to the safety of the water (Martín et al., 2005).

### Conclusions

Many of the jumps of the Calaveras County Frog Jumping Jubilee exceed estimates of maximal performance for bullfrogs from the scientific literature. The skewed distribution of jump distances for frogs jumped by professional frog jockeys and high calculated values of mass-specific mechanical work suggest that the best jumps observed define a physiological limit. The historical increase in jump distances in this contest may reflect the gradual improvement of techniques for motivating maximal performance, an observation that should serve as a caution for laboratory estimates of maximal performance. Future work on maximal performance should examine the effects of optimal and sub-optimal conditions and experiment extensively with methods of motivating the animals, as well as ensuring adequate sample size.

## ACKNOWLEDGEMENTS

The authors thank Sandie Lema and the Jump Committee of the Calaveras County Fair for their enthusiastic cooperation and assistance, and Mary Bush, Gavin Crynes and Jordan Apfeld for their tireless digitizing efforts. We thank Sierra Glasheen for assistance with imaging.

## FOOTNOTES

**FUNDING**

This work was supported by the National Science Foundation (grant 642428 to T.J.R.).

**COMPETING INTERESTS**

No competing interests declared.