Toothed whales have evolved to live in extremely different habitats and yet they all rely strongly on echolocation for finding and catching prey. Such biosonar-based foraging involves distinct phases of searching for, approaching and capturing prey, where echolocating animals gradually adjust sonar output to actively shape the flow of sensory information. Measuring those outputs in absolute levels requires hydrophone arrays centred on the biosonar beam axis, but this has never been done for wild toothed whales approaching and capturing prey. Rather, field studies make the assumption that toothed whales will adjust their biosonar in the same manner to arrays as they will when approaching prey. To test this assumption, we recorded wild botos (Inia geoffrensis) as they approached and captured dead fish tethered to a hydrophone in front of a star-shaped seven-hydrophone array. We demonstrate that botos gradually decrease interclick intervals and output levels during prey approaches, using stronger adjustment magnitudes than predicted from previous boto array data. Prey interceptions are characterised by high click rates, but although botos buzz during prey capture, they do so at lower click rates than marine toothed whales, resulting in a much more gradual transition from approach phase to buzzing. We also demonstrate for the first time that wild toothed whales broaden biosonar beamwidth when closing in on prey, as is also seen in captive toothed whales and bats, thus resulting in a larger ensonified volume around the prey, probably aiding prey tracking by decreasing the risk of prey evading ensonification.
Echolocation enables toothed whales to actively probe their surroundings during navigation and prey localisation through the production of high-amplitude, high-frequency clicks, and subsequent auditory processing of much weaker returning echoes milliseconds later (Au, 1993; Surlykke et al., 2014). In structurally complex environments, many echoes from objects and environmental features will be generated and may impede detection and processing of the few relevant echoes from potential prey (Urick, 1983; Au, 1993). Both bats and toothed whales have evolved the ability to cope with such complex, self-generated auditory scenes, allowing them to navigate and hunt efficiently with echolocation as their primary sense (Moss and Surlykke, 2001; Madsen and Surlykke, 2013). Adaptations to echolocating in complex environments include fine-scale adjustments of the biosonar system to modify the information generated from acoustic sensing (Moss and Surlykke, 2010; Madsen et al., 2013). We know from studies on trained animals that these changes take place both on the receiving side (Supin et al., 2010; Linnenschmidt et al., 2012b; Supin and Nachtigall, 2013) and on the production side, where control is exerted over click rate (Morozov et al., 1972), frequency content and output level (Moore and Pawloski, 1990), beam direction (Moore et al., 2008) and beamwidth (Jensen et al., 2015; Wisniewska et al., 2015). Depending on the beamwidth and source level (SL) of the emitted clicks, and the rate at which they are produced, echolocating toothed whales can focus their acoustic gaze on particular objects in the water column to ease the interpretation of their actively generated auditory scene in a range of habitats with different noise and clutter conditions (Wisniewska et al., 2015). 
We can therefore potentially learn a lot about the function, operation and evolution of toothed whale biosonar systems by quantifying their source parameters and sampling rates for different species and habitats in the wild (Madsen and Surlykke, 2013).
However, the biosonar dynamics that allow toothed whales to perform auditory stream segregation in complex environments also inherently present a problem for researchers who wish to study such dynamics if the context of the measured biosonar parameters is poorly known. For example, an increasing number of studies have reported that many smaller toothed whales employ a form of gain control in the sense that they reduce their output levels during the approach phase to the target in a manner generally described as a 20log(R) relationship for decreasing target range, R (Au and Benoit-Bird, 2003; Jensen et al., 2009). However, some studies report a substantial deviation from that general trend (de Freitas et al., 2015) while other data show no relationship between SL and range (Jensen et al., 2013), thus apparently suggesting no gain control at all. Such variation in gaze changes across species and studies as toothed whales approach a potential target raises the fundamental question of whether the studied animals did not adjust their biosonars during the approach phase or whether the researchers failed to identify the target that the animals were interested in and to which they therefore adjusted their acoustic gaze.
The quantification of biosonar parameters of wild toothed whales is normally done using two different approaches that each have their merits and limitations: animal-borne tags and hydrophone arrays. Tags allow researchers to obtain information on individual echolocation clicks over many hours under circumstances where array recordings are often unattainable (Madsen et al., 2002; Johnson et al., 2004) and, depending on tag placement and the species tagged, returning echoes from actively pursued prey may be recorded (Johnson et al., 2004; Arranz et al., 2011; Madsen et al., 2013; Wisniewska et al., 2016). However, as echolocation clicks are highly directional (Au, 1993; Koblitz et al., 2012), tag recordings provide a highly distorted perspective on biosonar clicks from the tagged animal (Au et al., 2012) and therefore only allow relative adjustments of source parameters to be quantified (Madsen et al., 2005). Array recordings, in contrast, are very well suited to absolute quantification of source parameters as long as echolocation clicks are recorded on-axis (Au et al., 1986, 1987), but are logistically difficult to use for recording actual prey-capture situations given their bulky nature.
ASL: apparent source level
BCI: bootstrap confidence interval
EFD: energy flux density
EPR: equivalent piston radius
pp: peak to peak
RMS: root mean square
SL: on-axis source level
TWTT: two-way travel time
The ability to confidently identify on-axis clicks and estimate source parameters depends on array conformations as well as on array dimensions (Madsen and Wahlberg, 2007). The maximum range at which arrays are able to localise sound sources with an accuracy suitable for basic sound parameter estimation, such as SL and frequency content, is generally estimated to be around 10 times the maximum array dimensions under ideal conditions (Jensen et al., 2009; Kyhn et al., 2009). Linear arrays are faster to construct and deploy in the field compared with 2D or 3D arrays, but planar arrays facilitate detection of on-axis clicks (Schotten et al., 2004). Tags and arrays are therefore not suitable for answering the same questions as large discrepancies exist between the properties derived from off-axis tag recordings and those from on-axis array recordings (Wisniewska et al., 2012). Quantification of absolute source parameters requires hydrophone array recordings in front of the animals, and hence biosonar operation is described in snapshots often with very little knowledge about the behavioural context and potential targets of interest to the echolocating animals. It follows that array recordings made in different but unknown behavioural contexts of the same species in the wild may lead to very different conclusions on the performance, use and evolution of biosonar systems.
It may therefore be relevant to ask whether the various source parameters derived when using hydrophone arrays in the wild are at all representative of what toothed whales use when they approach and intercept prey. If wild toothed whales are indeed echolocating with their acoustic gaze fixed on recording arrays, as required for testing changes in output parameters as a function of range to a known target, would they then use their biosonars in the same way if they were approaching their much smaller prey targets? Alternatively, if, when recorded, they are engaged in pursuit of fish that are not co-located with the recording array, range-dependent biosonar adjustments may be derived that are not representative of toothed whales echolocating on prey, perhaps leading to the erroneous conclusion that they do not employ acoustic gaze changes when approaching and intercepting prey.
It has recently been demonstrated with array recordings that Amazon river dolphins [boto, Inia geoffrensis (Blainville 1817)] employ a low-power, high sampling rate biosonar as a likely adaptation to the often shallow and cluttered waters in which they hunt (Ladegaard et al., 2015). Here, we tested whether these wild toothed whales use different biosonar parameters and gaze changes when echolocating on actual prey rather than hydrophone arrays. To test this, an experiment was designed where wild botos approached and intercepted prey immediately in front of a star-shaped hydrophone array so that biosonar target range and acoustic localisation range would be similar. We show that botos dynamically adjust their biosonar beamwidth, click rate and output level as they approach and intercept the prey. Furthermore, we show that although botos, like other toothed whales, buzz during prey capture, they do so at much slower rates than seen for similarly sized marine species.
MATERIALS AND METHODS
The recording site was a wooden platform (6×4 m) located near São Tomé, Amazon, Brazil (3°5′S, 60°28′W), from where local guides fed wild botos with dead fish during tourist visits. During recording sessions, between one and fewer than 10 botos could be observed from the platform, with usually 1–4 animals being within a few tens of metres of the recording array at the same time. Sound speed at the site was estimated at 1516 m s−1 using the Medwin equation (Medwin, 1975), based on a measured water temperature of 33°C, an assumed animal depth of 2 m and a salinity of 62 ppm (Gibbs, 1972). Fieldwork was carried out with permission from Ministério do Meio Ambiente, Brazil (SISBio-13462-5).
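The sound speed estimate can be checked with a few lines of code. The sketch below assumes the commonly quoted simplified form of the Medwin (1975) equation and interprets the 62 ppm salinity as 0.062 parts per thousand; the function and argument names are illustrative and are not taken from the original analysis.

```python
def medwin_sound_speed(temp_c: float, salinity_ppt: float, depth_m: float) -> float:
    """Sound speed (m/s) from the simplified Medwin (1975) equation.

    temp_c       water temperature in degrees Celsius
    salinity_ppt salinity in parts per thousand
    depth_m      depth in metres
    """
    t = temp_c
    return (1449.2 + 4.6 * t - 0.055 * t ** 2 + 0.00029 * t ** 3
            + (1.34 - 0.010 * t) * (salinity_ppt - 35.0)
            + 0.016 * depth_m)


# Site values from the text; 62 ppm is assumed to mean 0.062 ppt.
sound_speed = medwin_sound_speed(temp_c=33.0, salinity_ppt=0.062, depth_m=2.0)
print(f"Estimated sound speed: {sound_speed:.0f} m/s")
```

With the site values given above, this evaluates to roughly 1516 m s−1.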
Recording array and trial protocol
The six-armed star array (Fig. S1) was constructed in solid PVC with each arm (2 cm cylinder diameter) inserted with 60 deg spacing into a centre disc (25 cm diameter). Seven TC4013 hydrophones (Teledyne RESON A/S, Slangerup, Denmark) were fixed at the end of seven PVC cylinders (1.5×20 cm diameter) that were attached to each arm and to the centre disc. The resulting planar seven-hydrophone array (Fig. 1) had three hydrophones situated 37.5 cm from the central hydrophone and three others at 77.5 cm. The TC4013 hydrophones had a calibrated receiving sensitivity of −211 dB re. 1 V µPa−1. All TC4013 hydrophones were connected through a custom-built 40 dB amplifier and filter box (1 kHz high pass and 200 kHz low pass, 2 poles) to an eight-channel analog-to-digital converter (USB-6356, National Instruments, Houston, TX, USA) sampling at 500 kHz at 16-bit resolution distributed over a ±5 V range set by a custom-written (LabView, National Instruments) recording program. The entire recording chain had a flat (±2 dB) frequency response from 1 to 150 kHz and a clipping level of 185 dB re. 1 µPa.
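The stated clipping level follows from the hydrophone sensitivity, the amplifier gain and the converter input range. The following sketch shows that arithmetic under the assumption that clipping is set by the analog-to-digital converter alone (no earlier saturation in the amplifier); the function name is hypothetical.

```python
import math


def clipping_level_db(sensitivity_db: float, gain_db: float, adc_peak_volts: float) -> float:
    """Peak sound pressure level (dB re. 1 uPa) that reaches ADC full scale.

    sensitivity_db  hydrophone sensitivity in dB re. 1 V/uPa (negative number)
    gain_db         amplifier gain in dB
    adc_peak_volts  peak input voltage of the converter
    """
    full_scale_db_re_1v = 20.0 * math.log10(adc_peak_volts)
    return full_scale_db_re_1v - gain_db - sensitivity_db


# TC4013 chain as described in the text: -211 dB sensitivity, 40 dB gain, +/-5 V range.
star_array_clip = clipping_level_db(-211.0, 40.0, 5.0)
print(f"Clipping level: {star_array_clip:.1f} dB re. 1 uPa (peak)")
```

For the star array chain this gives approximately 185 dB re. 1 µPa, in agreement with the value reported above.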
During recordings, the star array was held via a wooden stick attached to a hole in the centre disc and submerged to a depth of approximately 1.4 m relative to the centre hydrophone. A TC4034 hydrophone (−218 dB re. 1 V µPa−1 receiving sensitivity, Teledyne RESON A/S), to which a 10–15 cm-long fish was attached via an organic string, was then lowered into the water to the same depth and approximately 1 m in front of the centre hydrophone. The TC4034 hydrophone was connected to a 40 dB amplifier and filter box (1 kHz high pass and 250 kHz low pass, 1 and 3 poles), which connected to the same analog-to-digital converter as the star array hydrophones. When wild botos approached the fish, the received levels (RL) on the prey could then be recorded by the TC4034 while the star array behind allowed for acoustic localisation and hence derivation of source parameters.
Recording sessions were filmed above and below water by two HD HERO2 cameras (GoPro Inc., San Mateo, CA, USA) with the underwater camera mounted on top of the centre disc of the array to verify that pulling of the prey hydrophone cable was in fact correlated with botos grabbing the prey (Movie 1). This was otherwise not apparent because of the murky water.
On-axis click criteria
The click detector threshold was set to 60 dB below the recording chain clipping level, i.e. 125 dB re. 1 µPa (peak). To enable on-axis source parameter estimation from the star array hydrophone recordings, a set of strict on-axis selection criteria, modified from Kyhn et al. (2010), had to be fulfilled: (i) clicks were considered on-axis only if the highest envelope peak across all star array hydrophones was recorded on the central hydrophone to ensure that the acoustic beam axis had been directed within the array boundaries; (ii) in each scan (minimum 5 clicks), only the click with the highest RL on the central hydrophone was selected to reduce pseudo-replication problems, as source parameters of consecutive clicks produced by the same animal are most likely not independent; and (iii) as determined by calibration measurements (Fig. S2), sound sources had to be localised to within 10 m and at incoming angles less than 30 deg relative to the centre hydrophone.
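The three criteria can be expressed as a simple filter. The sketch below is illustrative only: the `Click` record, its field names and the order in which the criteria are applied (criteria i and iii first, then criterion ii within each scan) are assumptions, as the text does not specify an implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Click:
    rl_db: List[float]   # received level per hydrophone; index 0 is the centre hydrophone
    range_m: float       # localised source range
    angle_deg: float     # incoming angle relative to the centre hydrophone


MIN_SCAN_CLICKS = 5
MAX_RANGE_M = 10.0
MAX_ANGLE_DEG = 30.0


def select_on_axis(scan: List[Click]) -> Optional[Click]:
    """Return the single on-axis candidate of a scan, or None if no click qualifies."""
    if len(scan) < MIN_SCAN_CLICKS:
        return None
    candidates = [
        c for c in scan
        if c.rl_db[0] >= max(c.rl_db)        # (i) highest level on the centre hydrophone
        and c.range_m <= MAX_RANGE_M         # (iii) localised within 10 m
        and c.angle_deg < MAX_ANGLE_DEG      # (iii) incoming angle below 30 deg
    ]
    if not candidates:
        return None
    # (ii) keep only the click with the highest centre-hydrophone level
    return max(candidates, key=lambda c: c.rl_db[0])
```

Returning at most one click per scan mirrors the stated aim of limiting pseudo-replication.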
For each click, the time of arrival on all seven array hydrophones was measured from the amplitude envelope as the time the −6 dB amplitude relative to peak amplitude was first exceeded. This formed the basis for the calculation of the six independent time-of-arrival differences. Sound source location was then estimated from those time-of-arrival differences by applying a least-squares method (Wahlberg et al., 2001; Madsen and Wahlberg, 2007).
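A minimal sketch of the arrival-time step is given below. It assumes that the amplitude envelope of each channel is already available as a list of non-negative samples and works at sample resolution; the function names are hypothetical, and the subsequent least-squares localisation is not shown.

```python
from typing import List, Sequence


def arrival_time(envelope: Sequence[float], fs_hz: float, threshold_db: float = -6.0) -> float:
    """Time (s) at which the envelope first exceeds the peak level plus threshold_db."""
    peak = max(envelope)
    threshold = peak * 10.0 ** (threshold_db / 20.0)
    for index, value in enumerate(envelope):
        if value > threshold:
            return index / fs_hz
    raise ValueError("envelope never exceeds the threshold")


def arrival_time_differences(envelopes: Sequence[Sequence[float]], fs_hz: float,
                             reference: int = 0) -> List[float]:
    """Time-of-arrival differences of all channels relative to the reference channel."""
    times = [arrival_time(env, fs_hz) for env in envelopes]
    return [t - times[reference] for i, t in enumerate(times) if i != reference]
```

With seven channels, `arrival_time_differences` returns the six independent differences that feed the localisation.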
Biosonar parameters of on-axis clicks
Recordings were digitally high-pass filtered using a 10 kHz Butterworth filter (4 poles) before on-axis signals were extracted using a 64-point Hann window centred on the peak of their amplitude envelopes and subjected to a factor 8 interpolation. Click duration, amplitude parameters, spectral parameters (using 256-point fast Fourier transform) and interclick interval (ICI) were quantified as previously described (Ladegaard et al., 2015) using the methods of Madsen and Wahlberg (2007) and Au (1993).
Beam pattern estimation
In order to estimate off-axis angles to individual hydrophones, the acoustic beam axis first had to be estimated. This was done using a method applicable for both 2D and 3D array conformations. First, all receiver coordinates were projected onto a plane (using the centre hydrophone as pivot point) perpendicular to the axis through the localised sound source and the centre hydrophone (the receiver measuring the highest RL). This resulted in a 2D rendition of the perceived array conformation from the animal's point of view. The SL was then estimated at each projected receiver location. The error of these estimated SLs depends on true off-axis angle to the centre hydrophone, but this error (estimated to <1 dB) was ignored as the on-axis criteria only allowed selection of clicks recorded less than 30 deg off-axis (i.e. sound source alignment with centre hydrophone and fish was relatively close). Next, the 3D planar coordinates were rotated into a set of 2D coordinates through principal component analysis using the MATLAB princomp function. The estimated SLs were then smoothed on a surface overlaying the 2D coordinates using the MATLAB gridfit function on a grid spacing of 5 mm. The acoustic beam centre was then determined to be at coordinates (x0, y0), representing the amplitude peak of the fitted surface (Wisniewska et al., 2015). The acoustic beam axis was then estimated as the line intersecting the points (x0, y0, 0) and (0, 0, z0), with z0 being localised sound source range relative to the centre hydrophone. Off-axis angles were then estimated from the intersections between the estimated acoustic axis and the axes between the sound source and each 2D receiver coordinate. Finally, a composite beam pattern was calculated from estimated off-axis angles and normalised apparent source levels (ASL) from each receiver by applying a circular piston fit model routine previously described in detail (Jensen et al., 2015). 
The piston diameters tested ranged from 1 to 30 cm with 0.1 mm increments. Performance of directivity estimation was calibrated in the harbour at the Fjord&Bælt, Kerteminde, Denmark (Fig. S3).
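To illustrate the kind of model involved, the sketch below evaluates the far-field beam pattern of a flat circular piston at a single frequency. This is a narrowband simplification and not the broadband fitting routine of Jensen et al. (2015); the function names, the use of a single frequency and the default sound speed are assumptions made for illustration.

```python
import math


def bessel_j1(x: float, steps: int = 2000) -> float:
    """J1(x) from Bessel's integral: (1/pi) * integral_0^pi cos(t - x*sin(t)) dt."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = h * (i + 0.5)                    # midpoint rule
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi


def piston_amplitude(radius_m: float, freq_hz: float, off_axis_deg: float,
                     sound_speed: float = 1516.0) -> float:
    """Normalised pressure amplitude of a circular piston, |2*J1(x)/x| with x = k*a*sin(angle)."""
    wavenumber = 2.0 * math.pi * freq_hz / sound_speed
    x = wavenumber * radius_m * math.sin(math.radians(off_axis_deg))
    if abs(x) < 1e-9:
        return 1.0                           # on-axis limit
    return abs(2.0 * bessel_j1(x) / x)


def piston_level_db(radius_m: float, freq_hz: float, off_axis_deg: float) -> float:
    """Beam pattern in dB relative to the on-axis level."""
    return 20.0 * math.log10(max(piston_amplitude(radius_m, freq_hz, off_axis_deg), 1e-12))


# Example: a 3.8 cm piston radius evaluated at 90 kHz for a few off-axis angles.
for angle in (0.0, 5.0, 10.0, 15.0):
    print(angle, round(piston_level_db(0.038, 90e3, angle), 1))
```

In a fitting routine, such a pattern would be compared with the normalised ASL measured at each receiver for every candidate piston size, and the size minimising the misfit retained.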
Boto biosonar adjustments during target approach
The botos approached and intercepted a fish in front of the recording setup a total of 156 times, during which 90 clicks fulfilling the on-axis criteria were identified. Botos would in most trials start approaching the prey attached to the TC4034 hydrophone (prey hydrophone) within a few seconds of it being lowered into the water. In the example approach (Fig. 1), the levels presented were back-calculated from the prey hydrophone and thus represent the ASL of clicks primarily recorded off-axis. In the first 4 s of this approach, the peak-to-peak ASL (ASLpp) fluctuated around 180 dB re. 1 µPa with occasional gradual amplitude decreases of 10–15 dB, with these clicks of seemingly lower amplitude containing less high-frequency energy. During the course of the approach, the ICI decreased from about 30 ms to less than 15 ms (Fig. 1B). In the very last part of this approach, the sound source could unfortunately no longer be acoustically localised, probably as a result of the animal changing its orientation away from the star array. However, the clicks were still recorded on the prey hydrophone where the ICI showed a further decrease to 7 ms at the time of prey interception.
ICI, lag time and buzzing
In order to investigate the notion that botos, like bottlenose dolphins (Au, 1993), make use of a constant lag time [i.e. constant offset between two-way travel time (TWTT) and ICI], the ICIs of all 90 on-axis clicks and the 261 approach example clicks were plotted together with TWTT as a function of target range (Fig. 2). The ICI of on-axis clicks showed a mean of 16.5±5.9 ms (Table 1), whereas the 261 approach example clicks had a mean ICI of 22.0±6.6 ms. The linear regression relationships calculated separately for the on-axis clicks and approach example clicks as a function of target range were ICI=1.85R+9.43, r2=0.52, and ICI=2.08R+11.3, r2=0.90, respectively. A single outlier (8.2 m, 207 ms) was excluded in the analysis of the approach example. Range-dependent lag time was estimated by subtracting TWTT from the ICI data, thus yielding a lag time of 0.528R+9.43 ms, r2=0.08, and 0.770R+11.3 ms, r2=0.54, for on-axis clicks and approach example clicks, respectively. All regression lines demonstrated slopes significantly different from zero (P<0.01, t-test). The suggestion that lag time was independent of target range therefore had to be rejected. The shortest ICIs during all approaches were measured at the time of prey capture (Fig. 3). ICIs were generally reduced only gradually throughout approaches and during prey interception. The median minimum buzz ICI (calculated within the last 0.5 s before prey interception) was found to be 7.7 ms (N=128). Only 9% of buzzes contained ICIs shorter than 5 ms and all buzz ICIs were longer than 3.6 ms.
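The relationship between ICI, TWTT and lag time can be sketched as follows. The code assumes the site sound speed of 1516 m s−1 and uses the on-axis ICI regression reported above; the function names are illustrative.

```python
SOUND_SPEED_M_S = 1516.0  # site estimate from the Materials and Methods


def twtt_ms(range_m: float, sound_speed: float = SOUND_SPEED_M_S) -> float:
    """Two-way travel time (ms) to a target at range_m."""
    return 2.0 * range_m / sound_speed * 1e3


def lag_time_ms(ici_ms: float, range_m: float) -> float:
    """Lag time (ms): the part of the ICI remaining after the echo has returned."""
    return ici_ms - twtt_ms(range_m)


def ici_on_axis_ms(range_m: float) -> float:
    """On-axis ICI regression reported in the text: ICI = 1.85R + 9.43 (ms)."""
    return 1.85 * range_m + 9.43


for r in (1.0, 3.0, 5.0):
    print(r, round(twtt_ms(r), 2), round(lag_time_ms(ici_on_axis_ms(r), r), 2))
```

Because TWTT grows by about 1.32 ms per metre at this sound speed, subtracting it from the ICI regression leaves a lag-time slope of roughly 0.53 ms per metre, close to the fitted value.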
On-axis source parameters
Source parameters were extracted for all 90 clicks fulfilling the on-axis criteria (Table 1). These clicks had a mean duration of 19.0±4.4 µs and their peak and centroid frequencies were distributed around means of 96.7±11.6 kHz and 90.0±6.2 kHz, respectively. With a mean root-mean-square bandwidth (BWRMS) of 21.8±2.0 kHz, the resulting mean quality factor [QRMS, centroid frequency (Fc):BWRMS ratio] was 4.2±0.5. The source intensities were measured as peak-to-peak SL (SLpp), root-mean-square SL (SLRMS) and energy flux density SL (SLEFD), with means of 174.7±7.5 dB re. 1 µPa, 164.9±7.5 dB re. 1 µPa and 117.5±7.4 dB re. 1 µPa2 s, respectively. SLpp was found to decrease with decreasing target range (Fig. 4), following the linear regression line: SLpp=16.6log(R)+166.3 dB re. 1 µPa, r2=0.36 (P<0.001, t-test). Fc increased with increasing SLpp, with the linear regression line: Fc=0.56SLpp−6.7 kHz, r2=0.44 (P<0.001, t-test). In a series of two-sample t-tests, the means of all source parameters reported here were found to differ significantly (Table 1, P<0.05) from our previous array measurements of wild botos (Ladegaard et al., 2015). Effect size was quantified by calculating Cohen's d for all source parameter pairs. All Cohen's d values were higher than 0.80, indicating large effect sizes, except for the peak frequency (Fp) difference (d=0.28), where effect size was small (Cohen, 1988). Of the observed differences, we wish to highlight that all mean SL measures were more than 10 dB higher in our previous study (Ladegaard et al., 2015). We also point out that caution is necessary when comparing data directly as a significant difference was found for mean localisation range (Table 1).
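A range-dependent output adjustment of the form SL = k·log(R) + b is obtained by regressing source level on the base-10 logarithm of range. The sketch below shows an ordinary least-squares version of such a fit; the function name is hypothetical, and the data points are synthetic, noise-free values generated from the regression line reported above purely to demonstrate the calculation (they are not measurements).

```python
import math
from typing import Sequence, Tuple


def fit_log_range(ranges_m: Sequence[float], levels_db: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of level = slope*log10(R) + intercept; returns (slope, intercept, r2)."""
    x = [math.log10(r) for r in ranges_m]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(levels_db) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, levels_db))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, levels_db))
    ss_tot = sum((yi - mean_y) ** 2 for yi in levels_db)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic demonstration data generated from SLpp = 16.6*log10(R) + 166.3.
demo_ranges = [1.0, 2.0, 3.0, 5.0, 8.0]
demo_levels = [16.6 * math.log10(r) + 166.3 for r in demo_ranges]
slope, intercept, r2 = fit_log_range(demo_ranges, demo_levels)
print(round(slope, 1), round(intercept, 1), round(r2, 2))
```

With real measurements the scatter around the line lowers r2, as in the value of 0.36 reported for SLpp.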
The composite echolocation beam directivity of all on-axis clicks was best described by the piston fit model using an equivalent piston radius (EPR) of 3.8 cm having a 95% bootstrap confidence interval (BCI) of 3.6–4.0 cm. This corresponded with a directivity index (DI) of 23.1 dB (BCI: 22.7–23.5 dB), half-power beamwidth of 12.9 deg (BCI: 12.4–13.6 deg) and −10 dB beamwidth of 23.5 deg (BCI: 22.5–24.7 deg) (Table 2) using the conversion formulas described by Zimmer et al. (2005). Range-dependent analysis of on-axis clicks divided into 1 m bins (Table 2, Fig. 5B) revealed that mean DI changes significantly as a function of localisation range through the relationship DI=3.45log(R)+21.3 dB, r2=0.99 (P<0.001, t-test).
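The link between beamwidth and DI can be illustrated with the flat-piston approximations for large ka, namely a half-power beamwidth of about 185°/ka and DI of about 20log(ka). The sketch below assumes these approximations, which may differ in detail from the conversion formulas of Zimmer et al. (2005) used here; the function names are illustrative.

```python
import math


def di_from_beamwidth(half_power_bw_deg: float) -> float:
    """Directivity index (dB) from the half-power beamwidth, assuming DI = 20*log10(185/BW)."""
    return 20.0 * math.log10(185.0 / half_power_bw_deg)


def beamwidth_from_di(di_db: float) -> float:
    """Half-power beamwidth (deg) from the directivity index (inverse of the above)."""
    return 185.0 / 10.0 ** (di_db / 20.0)


def di_at_range(range_m: float) -> float:
    """Range-dependent DI regression reported in the text: DI = 3.45*log10(R) + 21.3 dB."""
    return 3.45 * math.log10(range_m) + 21.3


print(round(di_from_beamwidth(12.9), 1))
for r in (1.0, 2.0, 4.0):
    print(r, round(beamwidth_from_di(di_at_range(r)), 1))
```

Under these assumptions, the composite half-power beamwidth of 12.9 deg corresponds to a DI of about 23.1 dB, and the regression implies a beam that widens by several degrees as the range shortens to 1 m.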
Only about half of the known toothed whale species have ever been recorded, but all of these have been shown to produce clicks suited for echolocation (Surlykke et al., 2014). Yet, only for a small number of these species has echolocation been unequivocally demonstrated (Norris et al., 1961; Penner and Murchison, 1970; Evans and Awbrey, 1988) and this form of active sensing has been extensively studied in even fewer (Au, 1993). Even for the best-studied species in captivity, such as the bottlenose dolphin (Tursiops truncatus), there is very little knowledge on how they use echolocation in the wild to perform some of the most critical and basic behaviours for which this sense evolved, namely locating, choosing, tracking and capturing prey. Researchers therefore invariably face a trade-off between a potential loss of ecological validity in controlled, captive settings and a lack of control and limited ability to observe behaviour in studies of wild animals (Au, 1993; Madsen and Surlykke, 2013). Acoustic tags on animals in the wild have helped bridge that gap over the last decade by providing detailed information on relative output changes in toothed whale biosonars during search, approach and capture of prey (Madsen et al., 2002; Johnson et al., 2004; Wisniewska et al., 2016). Yet, such tags do not provide information about the source parameters of the emitted clicks that in part define the biosonar system performance. To get at source parameters, hydrophone arrays in front of an echolocating animal are frequently used to identify and quantify on-axis biosonar clicks (Møhl et al., 1990, 2000; Au and Herzing, 2003). Many of these studies have helped us understand how animals modify their biosonar amplitude (Au, 2004) and directivity (Jensen et al., 2015) in the wild, and have used these findings to predict the changes taking place during prey pursuit.
However, a fundamental assumption behind these predictions is that the recorded animals focus their attention on the array and adjust their biosonar in the same way as for a prey (Au and Benoit-Bird, 2003; Madsen et al., 2004; Jensen et al., 2009). Such an assumption may not always be supported; on the contrary, it may be argued that a stationary or slowly drifting array in the water column constitutes an uninteresting object that might even interfere with detection of prey. This may be particularly likely in already cluttered environments where animals encounter a variety of objects in pursuit of prey, but less so for an array deployed in the open ocean where outgoing echolocation clicks generate few echoes in return. Also, an echolocator emitting tens to hundreds of signals while approaching a target of interest may not strictly focus its attention solely on this primary target, but is likely to inspect other objects in the vicinity concurrently with approaching prey (Surlykke et al., 2009; Moss and Surlykke, 2010). If animals do focus their attention and hence biosonar gaze on an array, then the question is would they produce echolocation signals having the same source parameters as when adjusting to prey for capture?
In this study, we strove to ensure animal attention and hence biosonar focus by creating a recording situation where a prey target, equipped with a hydrophone, would be in line with a star-shaped array. Using botos as model organisms, this allowed for the first quantification of source parameters of echolocation clicks from wild toothed whales engaged in approaching and intercepting prey. Specifically, we sought to test the hypothesis that toothed whales echolocating for prey in the wild will employ different biosonar parameters from those derived from typical array recordings.
Biosonar behaviour of botos during prey interception
The general boto biosonar behaviour consisted of a significant decrease in SL as target range decreased, along with steadily decreasing ICIs (Fig. 1). The initial target approach therefore largely resembles the biosonar adjustments also reported for other wild toothed whales potentially adjusting to arrays (Rasmussen et al., 2002; Au and Herzing, 2003; Jensen et al., 2009) and for unrestrained animals adjusting to prey or other targets in captivity (DeRuiter et al., 2009; Wisniewska et al., 2012). During time intervals surrounding prey interceptions, the boto ICIs were at their lowest (Fig. 3), which compares to the buzz phase that characterises prey interception in other toothed whales (DeRuiter et al., 2009; Madsen et al., 2013; Wisniewska et al., 2014; Fais et al., 2016). The observation that botos also buzz during prey capture may underscore a fundamental trait about echolocation in toothed whales; even though botos have spent >10 million years adapting to life in a remarkable habitat of rivers and flooded jungles while evolving in parallel with marine toothed whales (Hamilton et al., 2001; Martin and da Silva, 2004), these different species still seem to share a basic biosonar framework that calls for comparable biosonar adjustments and high sampling rates during the critical phase of prey capture.
Even though echolocation behaviour on a broader scale is comparable across species, the habitat and prey niche may be defining factors for the ICI step change that toothed whales use during the phases of target approach and buzzing and in the transition between the two (Madsen et al., 2005; Johnson et al., 2008; Madsen and Surlykke, 2013). Interestingly, we observed that botos, which in general click much faster than similarly sized marine toothed whales (Ladegaard et al., 2015), at rates comparable to other river-dwelling species (Jensen et al., 2013), decreased their ICIs from approach to buzzing only gradually from roughly 30 to 10 ms (Figs 2 and 3). In contrast, similarly sized marine species may downregulate ICI by more than an order of magnitude, ending with buzz ICIs as short as ∼2 ms when catching prey (DeRuiter et al., 2009; Wisniewska et al., 2014). The boto pattern bears a striking resemblance to the observation that harbour porpoises exposed to a clutter situation use click rates higher than normal during the approach phase, but decrease click rate during buzzing (Miller, 2010). It may therefore be speculated that the boto biosonar sampling scheme during prey capture reflects an adaptation to echolocating in cluttered surroundings. Another potential indication of this may come from deep-diving beaked whales, although these animals, in sharp contrast to botos, use slow almost constant ICIs during the approach phase before suddenly switching to buzzing (Johnson et al., 2004; Madsen et al., 2005). However, when beaked whales target single prey, they do so using buzz rates with median minimum ICIs of 4.3 ms, whereas when targeting prey schools, this value increases to 7.1 ms (Johnson et al., 2008), which roughly compares to the 7.7 ms found for botos.
The slow buzzing in beaked whales approaching prey schools may indicate an increased processing time of complex returning echo streams or serve to maintain a larger auditory scene when manoeuvring around complex targets (Johnson et al., 2008), which might be similarly important to botos seeking out prey in a cluttered and reverberant environment.
Biosonar update rate and adjustments to prey range
Most studies of toothed whales suggest that they keep their ICIs at a longer duration than the TWTT to their target of interest (Morozov et al., 1972; Au, 1993). A first hypothesis to test is therefore whether botos adjust their click rate to prey range in a manner that will not confuse range estimation. Animals adjusting to static targets are predicted to have maximum control over biosonar adjustments as a function of target range (Wilson and Moss, 2004), and hence the current study should be well suited for investigating whether the animals attempted to adjust ICI to produce a constant lag time. The echolocating botos are likely to focus on the prey directly in front of the recording array, and this is mirrored by the ICIs being well explained as a function of target range (Fig. 2A). We show (Fig. 2) that all ICIs measured both for on-axis clicks and example approach clicks uphold the general pattern of ICIs always being longer than the TWTT to the target of assumed interest (Au, 1993). Following the arrival of target echoes, botos make use of lag times around 10 ms at the shortest target ranges while employing increasingly longer lag times for longer target ranges. This finding differs from some previous studies using stationed and actively swimming captive bottlenose dolphins where lag times at comparable target ranges have been reported as being fairly constant at around 20 ms (Morozov et al., 1972; Au, 1980), but agrees with other observations indicating no support for constant lag time usage in freely swimming animals approaching prey (Wisniewska et al., 2014).
Previous field studies of smaller toothed whales often report that some ICIs vary between several times the TWTT to approximately equal to or below the TWTT, the latter primarily at longer ranges (Jensen et al., 2009; de Freitas et al., 2015; Ladegaard et al., 2015). If the assumption that animals adjust their biosonar relative to the recording arrays holds, then such varying ICIs would suggest that small toothed whales in the wild do not attempt to match click rate to TWTT to the same extent as captive animals do (Penner, 1988; Au, 1993), and that they are less strict about avoiding range ambiguity as the negative lag times could suggest. A perhaps more likely explanation for such ICI observations could be that not all clicks recorded in array studies are adjusted relative to the arrays, even though clicks have been recorded on-axis; very long ICIs might correspond to animals focusing on objects at ranges further than the array while short ICIs with negative lag times could result from animals adjusting to objects that are closer. This might especially be true when recording animals that are engaged in activities such as hunting (Au and Benoit-Bird, 2003; Au and Herzing, 2003), where prey seems the more likely target than an array nearby, and the proportion of on-axis clicks not focused on the array could be substantial. For species where matching of click rate to target range has been shown (Penner, 1988; Au, 1993), it could be argued as acceptable to exclude clicks having either very long ICIs compared with TWTT or negative lag times to reduce the risk of including clicks emitted when the animals did not adjust biosonar gaze to an array. However, such criteria are certainly not applicable to all species (Madsen et al., 2005; Johnson et al., 2008) and this also introduces the pitfall that clicks from atypical biosonar patterns (Turl and Penner, 1989; Ivanov, 2004) may be ignored.
The current experiment allowed us to compare the click rates used by wild botos when approaching prey (Table 1) with the click rates recorded when an array was either actively or coincidentally centred in the animal's biosonar beam (Ladegaard et al., 2015). We show that half the ICI variation is explained by range (Fig. 2A), whereas our earlier array study arrived at an r²-value of 0.17 for all on-axis clicks and just 0.04 when limiting the analysis to clicks localised within 10 m (Ladegaard et al., 2015). Furthermore, the apparent adjustment of click rate to range was roughly three times lower in Ladegaard et al. (2015). This comparison indicates either that arrays are not treated similarly to prey or that biosonar gaze is not necessarily adjusted to an array even though the array is centred in the biosonar beam.
Biosonar output is adjusted during prey approach
Toothed whales possess a high degree of control over their biosonar system both on the receiving side (Supin et al., 2010; Linnenschmidt et al., 2012b; Supin and Nachtigall, 2013) and on the biosonar output levels (Moore and Pawloski, 1990; Kloepper et al., 2014). In captivity, it has previously been observed that stationed toothed whales may adjust biosonar output level as a function of range (so-called gain control) to physical targets in a manner following 11–13log(R) in bottlenose dolphins (estimated from range and SLpp data published by Au, 1980) or 10–20log(R) in harbour porpoises (Beedholm and Miller, 2007; Linnenschmidt et al., 2012a,b). Similar-sized effects of 11–19log(R) have also been found for two stationed bottlenose dolphins in a phantom echo experiment, although no gain control was used by a third animal to solve the same task (Finneran, 2013). Although such output adjustments are sometimes labelled automatic gain control (Au and Benoit-Bird, 2003), other studies suggest that output adjustments are non-automatic (Jensen et al., 2009) and are under the animal's cognitive control (Linnenschmidt et al., 2012b; Kloepper et al., 2014). Therefore, searching for general biosonar gain control rates is challenging as individual adjustment strategies can be different (DeRuiter et al., 2009; Finneran, 2013) and because biosonar behaviours are task dependent (Kloepper et al., 2014; Wisniewska et al., 2015).
Here, we show that wild botos use range-dependent biosonar output adjustments as they approach prey that is subsequently captured. This finding is different from that for foraging beaked whales, which approach prey without any apparent gain control before they switch to buzzing (Madsen et al., 2005). We found an average gain control magnitude of 16.6log(R) (Fig. 4A) which, as a result of the broad dynamic recording range, is a value unlikely to be biased by 20log(R) filtering (see Appendix 1 for a more detailed discussion of this issue). That result shows a slightly higher degree of adjustment compared with the 12.4log(R) (Ladegaard et al., 2015) or 14.7log(R) gain control (Fig. 4) found for botos where an array was the assumed biosonar target. For the relationship between SLpp and range, we further show that the regression line intercept is approximately 10 dB lower when animals approach prey than predicted from a more standard array study (Ladegaard et al., 2015) when limiting analysis to the same span of localisation ranges (Fig. 4). The observed differences suggest that biosonar output regulation does not result from automatic or stereotyped adjustments to any given object ahead of an animal, but rather that echolocation context and task will affect the biosonar parameters measured.
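As a minimal sketch of how a gain control magnitude of the form b·log(R) can be estimated, the following fits SL = a + b·log10(R) by ordinary least squares. The function name and the noise-free data are synthetic and purely illustrative; they are not the study's data or its analysis code.

```python
import numpy as np


def fit_gain_control(range_m, sl_db):
    """Least-squares fit of SL = intercept + slope * log10(R).

    Returns (slope, intercept, r2); the slope is the gain control
    magnitude expressed in dB per decade of range.
    """
    x = np.log10(np.asarray(range_m, dtype=float))
    y = np.asarray(sl_db, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return slope, intercept, r2


# Synthetic, noise-free example: a hypothetical animal following 16.6log(R)
ranges = np.array([0.5, 1.0, 2.0, 4.0, 8.0])   # m
sl = 170.0 + 16.6 * np.log10(ranges)           # dB; the 170 dB intercept is made up
slope, intercept, r2 = fit_gain_control(ranges, sl)
print(f"slope = {slope:.1f} log(R), intercept = {intercept:.1f} dB, r2 = {r2:.2f}")
```

Comparing slopes and intercepts from such fits over the same span of localisation ranges is what allows the prey-approach and array-directed data sets to be contrasted.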
Biosonar beamwidth broadens during prey capture
Beam directivity has recently been shown in captivity to increase with target range in both delphinids (Finneran et al., 2014) and phocoenids (Wisniewska et al., 2015). In the wild, marine delphinids seem to follow the same overall pattern (Jensen et al., 2015). Here, we show (Table 2, Fig. 5) that river-dwelling botos at close range make use of a mean DI of 23 dB, which is among the lowest reported DIs for any toothed whale so far, even though botos use beam directivities comparable to those of other similarly sized toothed whales when target ranges are longer (Ladegaard et al., 2015). However, the boto's relatively low DI used during short-range echolocation in the final seconds before prey capture is comparable to the DI of 22 dB that Ganges river dolphins (Platanista gangetica) use when recorded at longer ranges (Jensen et al., 2013). Although toothed whales in general seem to converge on a DI between 25 and 29 dB when recorded at longer ranges (Koblitz et al., 2012), it may be that such narrow beamwidths are more common for clicks emitted by animals in the search or early approach phase whereas this study focused on toothed whales measured in the last few seconds before prey interception. As DI depends on transmitter aperture size relative to wavelength, the beamwidth adjustments may partly be explained by range-dependent changes in Fc. However, by using the Fc mean±s.d. of 90.0±6.2 kHz (Table 1) and the relationship between Fc and range, the Fc changes would give rise to a DI change of only 1.2 dB [estimated as 20log(96.2/83.8)], which does not come close to explaining the observed overall DI adjustment of approximately 5 dB (Fig. 5B). We therefore speculate that beamwidth is primarily adjusted through conformation changes of the melon via contractions of the surrounding muscles.
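The 1.2 dB estimate can be reproduced with a short calculation. This sketch assumes a fixed transmitting aperture, so that DI scales with 20log of frequency; the function name is hypothetical.

```python
import math


def di_change_db(f_high_khz, f_low_khz):
    """DI change (dB) for a fixed aperture when frequency shifts from f_low to f_high."""
    return 20.0 * math.log10(f_high_khz / f_low_khz)


fc_mean, fc_sd = 90.0, 6.2  # kHz, Fc mean and s.d. as given in the text
delta_di = di_change_db(fc_mean + fc_sd, fc_mean - fc_sd)
print(f"DI change from Fc alone: {delta_di:.1f} dB")  # 1.2 dB
```

A frequency shift spanning the mean±s.d. of Fc therefore accounts for roughly a quarter of the approximately 5 dB DI adjustment, which is the basis for attributing most of the change to the aperture itself.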
Beam broadening prior to prey capture is hypothesised to be advantageous in order to reduce the risk of prey escaping ensonification just before the critical phase of interception (Jensen et al., 2015; Wisniewska et al., 2015). The botos' ability to efficiently regulate beamwidth over a short range (Table 2, Fig. 5) may likewise aid these animals as they navigate and track prey in their riverine and flooded forest habitats.
Here, we show how botos dynamically adjust their biosonar parameters as they close in on and capture prey. Like marine toothed whales, botos produce buzzes during prey capture, but with a less clear transition in click rate from approach phase to buzzing. We suggest that a cluttered and reverberant shallow water habitat produces a biosonar context where fast clicking is advantageous in search and approach phases, whereas relatively slow clicking during buzzing could serve to reduce clutter and reverberation problems related to short-term masking, complex acoustic scenes and range ambiguity. We further show that botos adjust their click rate and biosonar output level as they approach prey, but do not attempt to keep a constant RL on their target. Also, the magnitudes of SL and ICI adjustments are higher during prey approach than when botos echolocate towards a drifting array. Finally, the beam changes shown in this study are the first demonstration that wild toothed whales broaden their biosonar beamwidth as they approach and intercept prey.
Here, we wish to discuss and demonstrate a few critical problems when using hydrophone arrays to record toothed whale clicks from various ranges and subsequently back-calculate SL in order to make inferences of whether animals use range-dependent output adjustments, i.e. biosonar gain control. A first and critical necessity is to select on-axis clicks from the total pool of recorded clicks; however, even when that criterion is fulfilled, there are several pitfalls that may lead to erroneous conclusions (Beedholm and Miller, 2007; Madsen and Wahlberg, 2007; Villadsgaard et al., 2007; Jensen et al., 2009). In order to conclude that an animal or a group of animals is using gain control, it must be a prerequisite that the null hypothesis of no use of gain control can be tested and rejected. This requires a broad dynamic range of the recording system relative to the range of SLs used by the animals. However, system or ambient noise may be so high that it is not possible to use a click detection threshold low enough to avoid range-dependent filtering of low-amplitude clicks that will then be ignored in a 20log(R) manner if assuming spherical spreading loss (Jensen et al., 2009). Likewise, if the recording system is too sensitive to handle input from high-amplitude clicks, then clipping will occur, which leads to arbitrarily measured amplitudes following a 20log(R) pattern (Beedholm and Miller, 2007; Madsen and Wahlberg, 2007). The combination of these two effects has been simulated in Fig. 6, which shows that an observed gain control effect approximating 20log(R) might result arbitrarily because the dynamic range of the recording system is unable to handle the input data range. Fig. 7 further simulates how digital filtering may partially conceal serious clipping problems. 
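The 20log(R) filtering problem can be illustrated with a small simulation. This is a simplified sketch, not the procedure behind Fig. 6: source levels are drawn independently of range (no gain control), transmission loss is spherical spreading only, and clicks that would clip are simply discarded. All numerical values and names are hypothetical.

```python
import numpy as np


def simulate_apparent_gain_control(n_clicks=5000, threshold_db=150.0,
                                   clip_db=180.0, seed=0):
    """Return (slope_all, slope_kept) of SL regressed on log10(R).

    slope_all uses every simulated click; slope_kept uses only clicks whose
    received level lies between the detector threshold and the clipping level.
    """
    rng = np.random.default_rng(seed)
    log_r = rng.uniform(0.0, np.log10(30.0), n_clicks)  # ranges of 1-30 m
    sl = rng.uniform(150.0, 210.0, n_clicks)            # SL independent of range
    rl = sl - 20.0 * log_r                              # spherical spreading loss
    kept = (rl >= threshold_db) & (rl <= clip_db)       # limited dynamic range
    slope_all = np.polyfit(log_r, sl, 1)[0]
    slope_kept = np.polyfit(log_r[kept], sl[kept], 1)[0]
    return slope_all, slope_kept


slope_all, slope_kept = simulate_apparent_gain_control()
print(f"all clicks: {slope_all:.1f} log(R); retained clicks: {slope_kept:.1f} log(R)")
```

Under these assumptions, the slope for all clicks stays near zero whereas the slope for the retained clicks approaches 20log(R), even though the simulated animal never adjusted its output to range.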
To deal with this, we here suggest two criteria that may help identify data sets suitable for studying gain control by allowing the null hypothesis of no gain control to be tested: (i) raw data must not contain clicks that suffer from clipping and (ii) all click amplitudes must be higher than the click detector threshold plus 20log(Rmax), where Rmax is the furthest localisation range considered. These criteria were defined post hoc and it is therefore only by coincidence that they do not conflict with the data from this study (Fig. 4A).
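A minimal sketch of how the two criteria could be checked for a data set is given below. It interprets "click amplitudes" in criterion (ii) as back-calculated source levels, which is an assumption of this sketch, and it presumes that all levels use the same amplitude measure (e.g. peak-to-peak). The function and parameter names are hypothetical.

```python
import math


def meets_gain_control_criteria(sl_db, rl_db, clip_level_db,
                                detector_threshold_db, r_max_m):
    """Check the two suggested criteria for a set of on-axis clicks.

    (i)  no received level reaches the clipping level of the recording system
    (ii) every back-calculated SL exceeds detector threshold + 20log10(Rmax)
    """
    no_clipping = all(rl < clip_level_db for rl in rl_db)
    sl_floor = detector_threshold_db + 20.0 * math.log10(r_max_m)
    above_floor = all(sl > sl_floor for sl in sl_db)
    return no_clipping and above_floor


# Hypothetical values in dB: threshold 140, clipping level 175, Rmax 10 m
print(meets_gain_control_criteria([165.0, 170.0], [150.0, 160.0], 175.0, 140.0, 10.0))
print(meets_gain_control_criteria([155.0, 170.0], [150.0, 160.0], 175.0, 140.0, 10.0))
```

The second call fails criterion (ii): a 155 dB click would fall below the 140 dB threshold if emitted at 10 m, so weak clicks at long range would have been filtered out of the data set.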
It is difficult to go back in the existing literature and identify whether previous gain control studies fulfil these two criteria if the click detector threshold and clipping level were not reported. This is, for example, the case in the studies by Rasmussen et al. (2002), Au and Herzing (2003), and Au et al. (2004), who first proposed 20log(R) gain control mechanisms in wild toothed whales (Au and Benoit-Bird, 2003), but other studies suggesting gain control also do not, or only partially, supplement the data with this information (Li et al., 2006; Atem et al., 2009; Fang et al., 2015). In other studies where both the click detector threshold and clipping level are reported, the two suggested criteria are not fulfilled and hence 20log(R) filtering is a concern regarding the validity of observed gain control magnitudes (Jensen et al., 2009; Ladegaard et al., 2015). To our knowledge, only one previous field study fulfils the two suggested criteria (de Freitas et al., 2015). However, an additional caveat is that when using on-axis criteria involving selection of the highest amplitude click within a sequence exceeding a certain number of clicks, the on-axis click inclusion threshold is raised above the click detector threshold by a factor depending on the amplitude variation in each click sequence. This increase in inclusion threshold might be of the order of 5–15 dB judging from the ASL changes during scanning behaviour shown in Fig. 1A,C, but will depend on the minimum number of clicks accepted in each sequence, with fewer clicks reducing this problem. Thus, in some situations, implementation of an even more conservative criterion than the suggested criterion (ii) might be appropriate in order to reliably test the null hypothesis of no gain control. However, we hope that the two suggested criteria may serve as a stepping stone for better criteria in future gain control studies using hydrophone arrays.
In the study by Villadsgaard et al. (2007), an apparent gain control effect was observed, but as the authors identified 20log(R) filtering as a potential cause, these authors refrained from making conclusions about biosonar gain control in the studied animals. We suggest that future studies follow that example in cases where the dynamic recording range does not allow for testing the null hypothesis of no gain control. We further suggest that future studies always report the recording system clipping level and the click detector threshold used and also plot these in figures showing click amplitude as a function of range with a clear indication of whether the plotted levels refer to peak, peak-to-peak or other measures. This information will serve as a helpful platform from where to convince readers that potential gain control effects are real.
Calibration of acoustic localisation accuracy
In order to derive the range threshold for including clicks in analysis, the localisation accuracy of the star array (Fig. S1) was calibrated in Aarhus Harbour, Denmark, within sound source ranges from 2 to 30 m and incoming angles of 0, 30, 60 and 90 deg. The calibration sounds were broadband 2-cycle pulses with 90 kHz peak frequency produced by a waveform generator (model 33220A, Agilent Technologies, Santa Clara, CA, USA) and projected through an omnidirectional HS70 hydrophone (Sonar Research and Development Ltd, Beverly, East Yorkshire, UK). These calibrations were performed in shallow water of approximately 2 m depth, simulating a worst-case recording situation. With calibration pulses arriving from an angle perpendicular to the plane of the array (incoming angle of 0 deg), the source localisation proved robust out to a range of 10 m where the mean±s.d. of the estimate was 9.9±0.23 m (N=39), corresponding to a mean error of the estimated transmission loss (TL) of −0.1 dB (Fig. S2A). As the incoming angle was shifted to 30 deg, the source localisation was still robust out to 10 m (9.4±1.3 m, mean TL error of −0.5 dB, N=39), although a single localisation outlier at 1.6 m (Fig. S2B) did result. For incoming angles of 60 and 90 deg, the localisation performance gradually broke down (Fig. S2C,D) with mean estimates at 10 m of 7.3±1.3 m (mean TL error of −2.7 dB, N=39) and 1.0±0.66 m (mean TL error of −20 dB, N=39), respectively.
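The reported TL errors follow from the range estimates under spherical spreading, as this short sketch shows. It applies 20log10 to the mean range estimates at a true range of 10 m; the function name is hypothetical.

```python
import math


def tl_error_db(estimated_range_m, true_range_m):
    """Error (dB) in spherical-spreading TL caused by a range estimation error."""
    return 20.0 * math.log10(estimated_range_m / true_range_m)


# Mean range estimates at a true range of 10 m, by incoming angle (deg)
mean_estimates = {0: 9.9, 30: 9.4, 60: 7.3, 90: 1.0}
for angle, est in mean_estimates.items():
    print(f"{angle:2d} deg: {tl_error_db(est, 10.0):6.1f} dB")
```

Because TL enters directly into back-calculated SL, a range underestimate of this kind translates into an SL underestimate of the same size in dB.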
As localisation errors primarily resulted in underestimation of the true range, a maximum range criterion could not be used alone to reliably exclude poor localisation estimates. We therefore tested whether a second criterion using angle estimation could resolve this problem. A series of two-sample t-tests showed that the estimated angles at 0 deg incoming angle were significantly different from, and significantly lower than, the estimated angles in all other groups (P<0.05). The estimated angles at 30 deg incoming angle were also significantly different from, and significantly lower than, the estimated angles at 60 and 90 deg incoming angle (P<0.05). However, the tests failed to show that angle estimates at 60 deg incoming angle were either different from or lower than the estimates at 90 deg incoming angle. The final result was that angle estimates at incoming angles of 0 and 30 deg could reliably be distinguished from angle estimates at 60 and 90 deg incoming angle out to 10 m range. Accordingly, the criteria for maximum incoming angle and range were set to 30 deg and 10 m, which together provided a robust method for excluding poor localisation estimates.
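The combined criterion can be sketched as a simple filter on estimated range and estimated incoming angle. The function name and the example estimates are hypothetical and serve only to show why the range criterion alone is insufficient.

```python
MAX_RANGE_M = 10.0    # maximum accepted localisation range (m)
MAX_ANGLE_DEG = 30.0  # maximum accepted incoming angle (deg)


def accept_localisation(est_range_m, est_angle_deg):
    """Accept a localisation only if both estimated range and angle are within limits."""
    return est_range_m <= MAX_RANGE_M and abs(est_angle_deg) <= MAX_ANGLE_DEG


# Hypothetical (estimated range in m, estimated incoming angle in deg) pairs
estimates = [(9.9, 2.0), (9.4, 28.0), (1.0, 75.0), (14.0, 10.0)]
accepted = [e for e in estimates if accept_localisation(*e)]
print(accepted)
```

The third example mimics a far-off, off-axis source whose range is badly underestimated: it would pass a range-only criterion but is rejected by the angle criterion.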
Calibration of directivity estimation performance
Calibration signals were 2-cycle pulses with 100 kHz peak frequency produced by a RESON TC2130 directional transducer at ranges of 2, 5 and 10 m from the centre hydrophone. At each calibration range, the TC2130 was turned back and forth around the vertical axis to simulate a toothed whale scanning its echolocation beam across the hydrophone array in the horizontal plane, while the array was held at incoming angles of 0, 30 or 60 deg relative to the sound source. One minute of data was analysed for each recording situation. Only clicks fulfilling the on-axis criteria i and ii were included in the analysis. Table S1 shows EPR estimates along with DI and BW–3dB (half-power beamwidth) converted from EPR using the conversion formulas described by Zimmer et al. (2005). The estimated EPRs tend to be higher than those found by Jensen et al. (2015), who reported EPR measures of 2.60±0.09 cm (95% BCI: 2.50–2.79) for the TC2130 transducer when using a logarithmic error model as in this study.
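As an illustration of how an EPR can be converted to DI and half-power beamwidth, the sketch below uses the common flat circular piston approximations DI ≈ 20log10(ka) and BW–3dB ≈ 185°/ka. Whether these are exactly the forms applied for Table S1 should be checked against Zimmer et al. (2005). The sound speed of 1500 m/s, the use of the 100 kHz peak frequency to compute the wavenumber, and the function name are assumptions of this sketch.

```python
import math


def piston_di_and_beamwidth(epr_m, freq_hz, c=1500.0):
    """Return (ka, DI in dB, -3 dB beamwidth in deg) for a flat circular piston.

    Uses DI ~ 20log10(ka) and BW ~ 185/ka deg, approximations valid for ka >> 1.
    """
    ka = 2.0 * math.pi * freq_hz / c * epr_m  # wavenumber times piston radius
    di_db = 20.0 * math.log10(ka)
    bw_deg = 185.0 / ka
    return ka, di_db, bw_deg


# EPR of 2.60 cm reported by Jensen et al. (2015) for the TC2130, at 100 kHz
ka, di_db, bw_deg = piston_di_and_beamwidth(0.026, 100e3)
print(f"ka = {ka:.1f}, DI = {di_db:.1f} dB, BW = {bw_deg:.1f} deg")
```

Since DI grows with the logarithm of EPR, a higher EPR estimate than the reference value translates into a somewhat higher DI and a narrower estimated beam.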
Potential interference patterns observed during the directivity calibration
Unexpectedly large signal amplitude differences were occasionally found during visual inspection of the directivity calibration data on hydrophones equidistantly spaced from the centre hydrophone. Fig. S3 shows an example of an expected signal amplitude measurement (Fig. S3A–C) along with two examples of unexpected amplitude variation (Fig. S3E,H). The occasionally observed amplitude differences, sometimes exceeding 10 dB, may be the result of destructive or constructive interference patterns produced by the directional TC2130 transducer and might also explain part of the observed variation in EPR estimates at various ranges (Table S1).
We sincerely acknowledge the invaluable fieldwork assistance from Projeto Boto members, locals in São Tomé, Mafalda de Freitas, and Alexandre Douglas Paro. Also, thanks to Renata S. Sousa-Lima for facilitating fieldwork.
Conceptualization: M.L., P.T.M.; Methodology: M.L., F.H.J., K.B., P.T.M.; Software: M.L., F.H.J., K.B., P.T.M.; Validation: M.L., F.H.J., K.B., P.T.M.; Formal analysis: M.L.; Investigation: M.L., V.M.F.d.S., P.T.M.; Resources: M.L., F.H.J., K.B., V.M.F.d.S., P.T.M.; Writing - original draft: M.L., F.H.J., P.T.M.; Writing - review & editing: M.L., F.H.J., K.B., V.M.F.d.S., P.T.M.; Visualization: M.L., F.H.J., K.B., P.T.M.; Supervision: P.T.M.; Project administration: M.L., V.M.F.d.S., P.T.M.; Funding acquisition: M.L., F.H.J., V.M.F.d.S., P.T.M.
Fieldwork was funded by Danish National Research Foundation (Danmarks Grundforskningsfond) grants to P.T.M., Associação Amigos do Peixe-boi da Amazônia (AMPA) and Petrobras Ambiental grants to V.M.F.d.S., and Augustinus Fonden grants to M.L. M.L. was funded by a PhD stipend from the Faculty of Science and Technology, Aarhus University. M.L., K.B. and P.T.M. were funded by Danish National Research Foundation grants to P.T.M. F.H.J. was supported by a Carlsberg Foundation fellowship and an Aarhus Institute of Advanced Studies - Marie Curie COFUND fellowship.
The authors declare no competing or financial interests.