Echolocating toothed whales generally adjust click intensity and rate according to target range to ensure that echoes from targets of interest arrive before a subsequent click is produced, presumably facilitating range estimation from the delay between clicks and returning echoes. However, this click–echo–click paradigm for the dolphin biosonar is mostly based on experiments with stationary animals echolocating fixed targets at ranges below ∼120 m. Therefore, we trained two bottlenose dolphins instrumented with a sound recording tag to approach a target from ranges up to 400 m and either touch the target (subject TRO) or detect a target orientation change (subject SAY). We show that free-swimming dolphins dynamically increase interclick interval (ICI) out to target ranges of ∼100 m. TRO consistently kept ICIs above the two-way travel time (TWTT) for target ranges shorter than ∼100 m, whereas SAY switched between clicking at ICIs above and below the TWTT for target ranges down to ∼25 m. Source levels changed on average by 17log10(target range), but with considerable variation for individual slopes (standard deviation of 4.1 for the by-trial random slopes), demonstrating that dolphins do not adopt a fixed automatic gain control matched to target range. At target ranges exceeding ∼100 m, both dolphins frequently switched to click packet production in which interpacket intervals exceeded the TWTT, but ICIs were shorter than the TWTT. We conclude that the click–echo–click paradigm is not a fixed echolocation strategy in dolphins, and we demonstrate the first use of click packets for free-swimming dolphins when solving an echolocation task.
Toothed whales and bats are the only animals in which echolocation has evolved into the primary means by which they forage and navigate. Although the media in which toothed whales and bats live present different challenges, there are several parallels in the way these taxa use echolocation (Madsen and Surlykke, 2013; Surlykke et al., 2014). Generally, foraging echolocators emit signals at the slowest rates during the search phase, with the rate employed being highly species specific (Jensen et al., 2018). When a target is detected and subsequently approached, the intercall or interclick interval (ICI) typically decreases such that the ICI is kept longer than the acoustic two-way travel time (TWTT) between the animal and the target. The approach ends with a prey capture attempt, the terminal buzz, during which the echolocating predator continuously produces echolocation signals at very short ICIs, but still longer than the TWTT (Surlykke et al., 2014). This echolocation behaviour adjusted to the differing spatial dimensions of prey search, approach and capture was first outlined by Griffin (1958) and Griffin et al. (1960). The Griffin model provides the framework for how echolocation is generally understood in both bats and toothed whales; namely, that the timing of signal production is normally adjusted to target range so that following each signal emission the target echo is received before a new signal is emitted, presumably to avoid range ambiguity (Surlykke et al., 2014). The ICI therefore consists of the TWTT plus some lag time before the next signal emission. For the much-studied bottlenose dolphin (Tursiops spp.), the lag time is generally between 20 and 50 ms (Morozov et al., 1972; Au et al., 1974; Penner, 1988), which has been proposed to represent the time required for single-echo processing within the central nervous system (Au, 1980, 1993).
If the ICI is not kept above the TWTT, then range ambiguity may occur as the delay between the last outgoing signal and the returning target echoes no longer correlates with target range. Range ambiguity may be problematic for fast manoeuvring animals, and both bats and toothed whales seem to typically avoid ICIs shorter than the TWTT when actively tracking and capturing prey (Wilson and Moss, 2004; Wisniewska et al., 2016).
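The relationship between ICI, TWTT and lag time described above can be illustrated with a minimal Python sketch. It is not part of the original analysis: the function names are hypothetical, the sound speed of 1507 m s−1 is the value reported in the Materials and Methods, and `apparent_range` assumes that echo delay is measured relative to the most recent outgoing click, which may not be how dolphins process echoes.

```python
SOUND_SPEED = 1507.0  # m/s, value reported in the Materials and Methods


def two_way_travel_time(target_range_m, sound_speed=SOUND_SPEED):
    """Two-way travel time (s) between the animal and a target."""
    return 2.0 * target_range_m / sound_speed


def lag_time(ici_s, target_range_m, sound_speed=SOUND_SPEED):
    """Lag time (s), defined as ICI minus TWTT.

    A negative value means the next click is emitted before the echo
    of the previous click has returned.
    """
    return ici_s - two_way_travel_time(target_range_m, sound_speed)


def is_range_ambiguous(ici_s, target_range_m, sound_speed=SOUND_SPEED):
    """True when the ICI is shorter than the TWTT to the target."""
    return lag_time(ici_s, target_range_m, sound_speed) < 0.0


def apparent_range(ici_s, target_range_m, sound_speed=SOUND_SPEED):
    """Range (m) implied by the echo delay if that delay is measured
    from the most recent outgoing click (an assumption of this sketch)
    and the ICI is constant."""
    delay = two_way_travel_time(target_range_m, sound_speed) % ici_s
    return delay * sound_speed / 2.0


if __name__ == "__main__":
    for ici in (0.05, 0.2):
        print(ici, is_range_ambiguous(ici, 100.0), apparent_range(ici, 100.0))
```

With a constant ICI of 50 ms and a target at 100 m, the echo arrives after two further clicks have been emitted, so the delay relative to the latest click no longer corresponds to the true range; with an ICI of 200 ms the apparent and true ranges coincide.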
Bats and toothed whales also adjust biosonar source level (SL) with target range, generally decreasing call or click intensity with decreasing target range (Rasmussen et al., 2002; Au and Benoit-Bird, 2003; Surlykke et al., 2014). For toothed whales, patterns in click SL have been suggested to result from pneumatic limitations of the sound production system in which increased clicking rates at shorter target ranges and a relatively constant nasal pressurisation effectively create a transmitter-based automatic gain control (AGC) mechanism (Au and Benoit-Bird, 2003). At longer target ranges, where clicking rates generally are lower, the highest recorded SLs for bottlenose dolphins are 228–230 dB re. 1 µPa peak-to-peak (pp) (Au, 1980, 1993). These levels probably reflect the upper limit of output level from their sound production system, but it is unknown what the time constant is for adequate pressurisation within the nasal passage to reach such high levels, or even whether pressurisation time is an active constraint on output levels. For the proposed transmission-based AGC model in dolphins, SL is suggested to follow an approximately 20log10(R) relationship, where R is target range (Au and Benoit-Bird, 2003). Approximately 6 dB changes in SL have been observed for a twofold change in target range for a number of smaller toothed whales both in captivity (Wisniewska et al., 2012) and in the wild over ranges of up to a few tens of metres (Rasmussen et al., 2002; Au and Herzing, 2003; Jensen et al., 2009). For longer ranges, the sonar equation predicts that animals should be able to detect larger targets if ambient noise and clutter levels are sufficiently low, but little is known about how SL and ICI adjustments are then implemented.
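The link between a 20log10(R) relationship and a change of about 6 dB per doubling of range can be checked with a short Python sketch. The function name and the `slope` parameter are illustrative assumptions, not part of the cited work.

```python
import math


def sl_change_db(range_from_m, range_to_m, slope=20.0):
    """Change in source level (dB) predicted by SL = slope * log10(R) + constant
    when target range changes from range_from_m to range_to_m."""
    return slope * math.log10(range_to_m / range_from_m)


if __name__ == "__main__":
    # Doubling the range under a 20log10(R) relationship.
    print(round(sl_change_db(10.0, 20.0), 2))
    # Halving the range gives the same magnitude with opposite sign.
    print(round(sl_change_db(20.0, 10.0), 2))
```

Because only the ratio of the two ranges enters the calculation, the predicted change per doubling is the same at 2 m as at 200 m, which is why the relationship is often summarised as a fixed number of decibels per doubling of range.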
Although data are sparse, it has been shown that some smaller toothed whales switch to a different strategy for long-range echolocation that does not follow the click–echo–click paradigm proposed for bottlenose dolphin echolocation (Au, 1993). In this long-range mode, short groups or ‘packets’ of clicks are emitted in which the ICIs within packets are much shorter than the TWTT to the target, but interpacket intervals (IPIs) are closer to or longer than the TWTT (Turl and Penner, 1989; Ivanov, 2004; Finneran, 2013). Packet-emitting animals receive multi-echo streams following each packet emission rather than receiving a single target echo following each click. Hence, returning echo delays measured relative to the most recent outgoing click will not correspond with target range, except for echoes resulting from the last click in each packet if received prior to the production of another packet. This peculiar biosonar behaviour seemingly challenges the hypothesis that echolocating animals consistently operate to avoid range ambiguity. Packet click production also has implications for the hypothesis of pneumatic output limitation (Au and Benoit-Bird, 2003), as the short ICIs for clicks within click packets should result in low SLs which are poorly suited for long-range echolocation (Finneran, 2013).
The use of click packets has mainly been reported under laboratory conditions for stationary animals comprising a beluga whale (Delphinapterus leucas) (Turl and Penner, 1989) and a few bottlenose dolphins (Tursiops truncatus). These animals were engaged in ‘go/no go’ long-range target detection and discrimination experiments using either real (Ivanov and Popov, 1978; Turl and Penner, 1989; Ivanov, 2004) or phantom targets (Finneran, 2013; Finneran et al., 2014). Click packet emission occurred when echolocating towards static targets at ranges exceeding 75–120 m, with packet use being common beyond 200 m (Turl and Penner, 1989; Ivanov, 2004; Finneran, 2013; Finneran et al., 2014). The number of clicks per packet increases with target range for some animals (Ivanov, 2004; Finneran, 2013), and for one dolphin the target detection threshold improved 3 dB for every doubling of the number of returning echoes following packet emission (Finneran et al., 2014). Although click packets can improve detection performance, the use of packets does not appear to be triggered by a decrease in received echo amplitude with increasing target range, but instead by an increase in echo delay (Finneran, 2013; Finneran et al., 2014). In a recent study on mine hunting capabilities, Ridgway et al. (2018) observed that two bottlenose dolphins occasionally produced click packets when returning to the operator's boat having completed their tasks, whereas packets did not appear to be related to echolocation streams emitted during the search for mines.
It therefore remains an unanswered question whether the reported emission of click packets by stationary toothed whales is a genuine, but hitherto overlooked, mode of echolocation for long-range targets that is at odds with the Griffin model. To address this question, we designed a controlled study in which two free-swimming bottlenose dolphins actively approached a stationary target from distances of up to ∼400 m. Echolocation clicks were recorded using both an animal-attached sound recording tag and a synchronised hydrophone on the target, which allowed estimation of instantaneous target range for every detected click.
MATERIALS AND METHODS
Two trained bottlenose dolphins, T. truncatus (Montagu 1821) – SAY (36 year old female, ∼220 kg) and TRO (23 year old male, ∼180 kg) – with previous long-range echolocation training experience (Finneran, 2013; Finneran et al., 2014) and normal high-frequency hearing (Finneran et al., 2016b) participated in the study. The dolphins belonged to the US Navy Marine Mammal Program (MMP) population and were regular participants in Navy MMP psychophysical research. The study followed a protocol approved by the Institutional Animal Care and Use Committee at the Biosciences Division, Space and Naval Warfare Systems Center (SSC) Pacific and the Navy Bureau of Medicine and Surgery, and followed all applicable US Department of Defense guidelines.
Recording setup and trial protocol
The dolphins were trained to echolocate on a physical target consisting of two water-filled plates (30×25 cm) constructed from wooden frames with sheet metal faces (0.5 mm thickness). The plates were positioned vertically, attached together at a perpendicular angle relative to each other, and mounted at the end of a 2.4 m cylindrical metal pole of 3.8 cm diameter and 0.15 cm wall thickness. To enhance the target strength, the plates were wrapped in bubble wrap to a thickness of approximately 8–10 cm. The target had a measured target strength (TSE) of −22 to −23 dB measured with a broadband click that resulted from a 5 µs DC pulse amplified and delivered to a 5446 transducer (International Transducer Corporation, Santa Barbara, CA, USA). The acoustic signal resembled an exponentially damped sinusoid with duration of ∼100 µs and peak frequency near 160 kHz. A self-contained underwater recording system (SoundTrap 202HF, 576 kHz sampling rate and 186 dB re. 1 µPa clipping level, Ocean Instruments, Auckland, New Zealand) was mounted at the end of the target pole below the target plates. The SoundTrap hydrophone element was situated 22 cm below the lower edge of the target and 7 cm below the end of the target pole and was free to record 360 deg horizontally without being shadowed. When submerged during trials, the target depth was approximately 190 cm measured at the target plate centre and the SoundTrap hydrophone element depth was approximately 230 cm.
Experimental sessions took place in San Diego Bay (32°43′39″N, 117°12′35″W) between 7 and 21 December 2016. Both dolphins were trained to leave their ocean enclosure and follow a boat (the send boat) into the bay (5–6 m water depth), and then wait calmly near the send boat before each trial. The dolphins wore a stereo Dtag3 archival recorder (500 kHz sampling rate per channel and 190 dB re. 1 µPa clipping level, www.soundtags.org, Sea Mammal Research Unit, University of St Andrews, Scotland, UK) which was attached via suction cups dorsally with the two hydrophones located approximately 5 cm behind the blowhole. The target was lowered into the water a few seconds before the beginning of each trial from a second boat (the target boat, 22 ft Boston whaler), which had the engines turned off and drifted with the target during trials. The target boat, target pole and target plates possibly constituted a combined target for the dolphins to detect using echolocation, which might have been important especially at long target ranges. The initial animal-to-target range was read off a laser range finder immediately before a dolphin was sent towards the target. The handheld target was always presented with the perpendicular target plates oriented in an open book fashion relative to the dolphin's approach angle. SAY was trained to attend to this target orientation. During a target approach, the target orientation would suddenly be changed by ∼90 deg. Upon observing the target orientation change, SAY's task was to immediately return to the send boat. The orientation change time was restricted to periods where SAY was visible from either the send boat or target boat (i.e. within tens of metres) in order to confirm that SAY returned immediately following the target orientation change. This training regime was implemented in an effort to ensure that the biosonar beam axis was directed toward the target as much as possible during the approaches. 
Attempts were made to train TRO according to the same regime, but due to poor performance on reporting target rotation, he was trained to instead approach the target and touch it with his rostrum without attending to target orientation. Because the support pole had a theoretical TSE of about +7 dB relative to the target plates, the target plate echoes were probably not perceived as the most important target echoes to home in on, except for short ranges. A sound cue (i.e. a bridge) was played from a nearby hydrophone when TRO made contact with the target, which was observable from the target boat. This also signified the end of the trial. After each trial, the dolphins were given a fish reward upon completion of their tasks; no rewards were given for incorrect behaviours. No trials were run while other boats passed directly between the send and target boats. The dolphins were marked with zinc oxide on the dorsal surface to aid visual observations by the researchers while the dolphins swam in the bay.
Click detection, synchronisation and range estimation
All data analysis was carried out using custom scripts (MATLAB 2017a, MathWorks, Natick, MA, USA) and scripts from the tag toolbox (www.soundtags.org). Click events were detected in the Dtag recordings after applying a 50–200 kHz band pass filter (12 poles) that served to reduce the number of false detections due to snapping shrimp and other transients. An automated click detector was then applied using an adaptive threshold (i.e. relatively weak transients exceeding the detector's minimum threshold of −66 dB re. clipping level were ignored at times where high-amplitude clicks were detected) and a blanking time of 1.5 ms following each detection. Click detections from all trials were then manually inspected in 10 s windows using plots of received sound pressure, power spectrum and angle of arrival between the two hydrophones in order to add missed clicks and remove false detections.
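The role of the detection threshold and blanking time can be sketched in Python as follows. This is a simplified illustration and not the detector used in the study: the band-pass filtering and the adaptive part of the threshold are omitted, a fixed threshold is used instead, and the function names are hypothetical.

```python
def threshold_from_db(db_re_clipping, clipping_amplitude=1.0):
    """Convert a level in dB re. clipping level to a linear amplitude."""
    return clipping_amplitude * 10.0 ** (db_re_clipping / 20.0)


def detect_clicks(samples, fs_hz, threshold, blanking_s=1.5e-3):
    """Return sample indices of detections.

    A detection is the first sample whose absolute amplitude reaches the
    threshold; further samples are ignored until the blanking time has passed.
    """
    blank_samples = int(round(blanking_s * fs_hz))
    detections = []
    next_allowed = 0
    for i, x in enumerate(samples):
        if i >= next_allowed and abs(x) >= threshold:
            detections.append(i)
            next_allowed = i + blank_samples
    return detections


if __name__ == "__main__":
    fs = 500_000  # Hz, Dtag sampling rate per channel
    signal = [0.0] * 2000
    signal[100] = 1.0   # first click
    signal[300] = 0.8   # falls inside the 1.5 ms blanking time (750 samples)
    signal[1000] = 1.0  # second click
    print(detect_clicks(signal, fs, threshold_from_db(-66.0)))
```

The blanking time prevents surface or body reflections that closely follow a click from being counted as separate detections, at the cost of missing any genuine click emitted within 1.5 ms of the previous one.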
Before estimating animal-to-target range, the Dtag and SoundTrap clock offsets were calibrated on a per-trial basis by first creating an echogram (Johnson et al., 2004), which is a visual representation of incoming echoes as a function of time, to identify a returning target echo in the Dtag recordings during the last seconds of a trial and measuring the TWTT. Target echoes were often visible out to 25–30 m. By identifying the corresponding outgoing click in the SoundTrap recordings (by comparing ICI sequences), the time of arrival difference (TOAD) between the Dtag and SoundTrap was estimated. The difference between the TOAD and half the measured TWTT then served as the clock-offset estimate, which was assumed to be a fixed value within each trial (given that trials lasted only up to 2.5 min). Clicks from the approaching dolphin were then identified in the SoundTrap recordings by ensuring that only clicks that correlated with Dtag click detections were selected. To this end, echogram-like plots were created by aligning 0.5 s SoundTrap sequences synchronised and extracted following each Dtag click detection to create a visual representation of incoming clicks as a function of time. Clicks from the approaching dolphin appeared in the plots as distinct lines with tag-to-SoundTrap delay changing gradually as the animal approached the recorder and were manually selected and saved with information of the TOAD between the two recording devices. The instantaneous distance between dolphin and target was estimated from the TOADs after correcting for the trial-specific clock-offset estimate by multiplying by a sound speed of 1507 m s−1 [calculated using the Medwin equation (Medwin, 1975) based on a water temperature in San Diego Bay of 15°C, 2 m depth and a salinity of 35 ppt].
To account for occasional outlier distance estimates that were physically unrealistic, the estimates were filtered using a two-state (speed and range) Kalman filter followed by a Rauch smoother (Bar-Shalom et al., 2004).
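The clock-offset calibration and range estimation can be summarised in a short Python sketch. It is an illustration under stated assumptions rather than the original MATLAB code: the function names are hypothetical, the sound speed formula is the commonly quoted form of the Medwin (1975) equation with salinity entered in parts per thousand, and the Kalman filtering and smoothing step is not shown.

```python
def medwin_sound_speed(temp_c, salinity_ppt, depth_m):
    """Sound speed (m/s) from the commonly quoted Medwin (1975) equation."""
    t, s, z = temp_c, salinity_ppt, depth_m
    return (1449.2 + 4.6 * t - 0.055 * t ** 2 + 0.00029 * t ** 3
            + (1.34 - 0.010 * t) * (s - 35.0) + 0.016 * z)


def clock_offset(toad_s, twtt_s):
    """Clock offset (s) between tag and target recorder.

    The TOAD of a click contains the offset plus the one-way travel time,
    and the one-way travel time is half the TWTT measured on the tag.
    """
    return toad_s - twtt_s / 2.0


def range_from_toad(toad_s, offset_s, sound_speed):
    """Animal-to-target range (m) from a TOAD corrected for clock offset."""
    return (toad_s - offset_s) * sound_speed


if __name__ == "__main__":
    c = medwin_sound_speed(15.0, 35.0, 2.0)
    print(round(c))  # sound speed for the stated water conditions

    # Synthetic example: true offset of 0.25 s, calibration echo at 20 m.
    true_offset = 0.25
    offset = clock_offset(true_offset + 20.0 / c, 2.0 * 20.0 / c)
    print(range_from_toad(true_offset + 300.0 / c, offset, c))
```

Because the offset is estimated once per trial from a short-range echo and then held fixed, any clock drift during the trial would translate directly into a range error, which is why the short trial duration matters for this approach.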
Biosonar parameter estimation
ICIs were measured as the interval between each click and the preceding click detection in the Dtag recordings. In the SoundTrap recordings, the raw signals of the detected clicks were first extracted using a 1 s window centred on each detection time and then filtered with a 10 kHz Butterworth high-pass filter (4 poles). The filtered signal was then extracted in a 200 µs window centred on the detection time and the 1 ms window preceding the signal window was used later for signal-to-noise ratio (SNR) estimation. For estimating the spectrum of outgoing clicks, a 200 µs Hann window was applied to the 200 µs signal window to reduce the amplitude of reflections trailing the direct signals. For each signal, the power spectrum was then calculated (FFT size: 1024) to estimate the peak frequency (Fp), centroid frequency (Fc) and root-mean-squared bandwidth (BWRMS) following Au (1993). Signal duration was estimated from the amplitude envelope as the time between the nearest −10 dB points (relative to the peak amplitude) on either side of the amplitude peak. Received levels (RLs) were measured peak-to-peak (pp) within the signal duration. The SNR was estimated for each click by first computing the RMS level of the filtered signal amplitudes within the measured signal duration. These were divided by the noise RMS level of the 1 ms window preceding each signal and converted to decibels. The apparent pp SL (from here on referred to as SL) was estimated from the RL by adding a transmission loss (TL) given by spherical spreading [20log10(R) dB, where R is range in metres] plus an absorption loss of αR with α computed from the Fc for each click in the far field taking into account 15°C water temperature, 0 km depth, 35 ppt salinity and a pH of 8 following Ainslie and McColm (1998).
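The SNR and SL calculations described above can be sketched in Python as follows. This is a minimal illustration, not the original analysis code: the function names are hypothetical, and the absorption coefficient is passed in as a given value (the 0.04 dB m−1 used in the example is a placeholder, not a value computed from Ainslie and McColm, 1998).

```python
import math


def rms(values):
    """Root-mean-square of a sequence of amplitudes."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def snr_db(signal_samples, noise_samples):
    """SNR (dB) as the ratio of signal RMS to noise RMS."""
    return 20.0 * math.log10(rms(signal_samples) / rms(noise_samples))


def transmission_loss_db(range_m, alpha_db_per_m):
    """Spherical spreading loss plus frequency-dependent absorption (dB)."""
    return 20.0 * math.log10(range_m) + alpha_db_per_m * range_m


def apparent_source_level_db(received_level_db, range_m, alpha_db_per_m):
    """Apparent source level: received level plus transmission loss."""
    return received_level_db + transmission_loss_db(range_m, alpha_db_per_m)


if __name__ == "__main__":
    alpha = 0.04  # dB/m, placeholder value for illustration only
    print(apparent_source_level_db(150.0, 100.0, alpha))
    print(snr_db([2.0, -2.0, 2.0, -2.0], [1.0, -1.0, 1.0, -1.0]))
```

At short ranges the spreading term dominates the transmission loss, whereas the absorption term grows linearly with range and becomes a substantial part of the correction at several hundred metres.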
Click packets have previously been defined by Finneran (2013) as ‘a temporally distinct collection of clicks spaced in time so that the last click in the packet was emitted before the first echo from that group was generated’, which also encapsulates the packet characterisations presented by earlier studies (Turl and Penner, 1989; Ivanov, 2004). In this study, the aim was to use an automated process to identify click packets without using a criterion relying on the target range and TWTT, firstly because click packets may not always be produced at IPIs exceeding the TWTT (Finneran, 2013) and secondly to assist future studies in finding possible packets when TWTT information is not available. First, click packets containing two or more clicks were selected manually using plots of waveforms and ICIs to produce a data set of 4055 packets (2723 packets for SAY and 1332 for TRO), allowing subsequent comparison with detections made using an automated routine. Manual detection relied on recognition of the distinctive grouping of two or more clicks with low ICI preceded and followed by much longer ICIs; please see Movies 1 and 2 for click packet examples. Of the manually selected click packets, 95% were produced at delays exceeding 168 ms following a preceding click, 95% of within-packet ICIs were shorter than 47 ms, and 95% of packets were followed by delays exceeding 167 ms before the next click emission. Manually selected packets contained between 2 and 9 clicks. Based on these findings, the automatic click packet detection was chosen to follow the criteria that (i) a packet begins with a click produced >150 ms after a preceding click and is followed by a second click at an ICI <50 ms; (ii) all subsequent clicks with ICIs <50 ms are part of the packet; (iii) no other click is emitted within 150 ms after the last click in a packet; and (iv) a packet contains at least two but fewer than 10 clicks.
The last criterion was implemented to decrease the likelihood of categorising regular click sequences as packets, although we note that Ivanov (2004) and Finneran et al. (2014) have observed packets with up to 30 and 24 clicks, respectively.
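Criteria (i) to (iv) can be expressed as a short Python sketch operating on sorted click emission times. This is an illustrative reimplementation, not the original MATLAB routine: the function name is hypothetical, and treating a missing neighbouring click at the start or end of a recording as an infinitely long gap is an assumption of this sketch.

```python
def detect_packets(click_times_s, gap_s=0.150, max_ici_s=0.050,
                   min_clicks=2, max_clicks=9):
    """Return (first_index, last_index) pairs of clicks forming packets.

    click_times_s must be sorted in ascending order. A packet is a run of
    clicks with ICIs < max_ici_s that is preceded and followed by gaps
    > gap_s and contains min_clicks to max_clicks clicks.
    """
    packets = []
    n = len(click_times_s)
    i = 0
    while i < n:
        # Extend the run while consecutive ICIs stay below max_ici_s.
        j = i
        while j + 1 < n and click_times_s[j + 1] - click_times_s[j] < max_ici_s:
            j += 1
        size = j - i + 1
        # Assumption: no neighbouring click counts as an infinite gap.
        gap_before = click_times_s[i] - click_times_s[i - 1] if i > 0 else float("inf")
        gap_after = click_times_s[j + 1] - click_times_s[j] if j + 1 < n else float("inf")
        if min_clicks <= size <= max_clicks and gap_before > gap_s and gap_after > gap_s:
            packets.append((i, j))
        i = j + 1
    return packets


if __name__ == "__main__":
    times = [0.0, 0.3, 0.32, 0.34, 0.6, 0.9, 0.92, 1.0, 1.3]
    print(detect_packets(times))
```

In the example, the three clicks at 0.30 to 0.34 s qualify as a packet, whereas the pair at 0.90 and 0.92 s is rejected because another click follows only 80 ms later.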
On-axis click estimation
Because toothed whales have a directional biosonar beam, the back-calculated source parameters for each echolocation click will depend on the recording angle relative to the acoustic beam axis. It is therefore more appropriate to compare the SL only within the subset of clicks that are recorded on-axis. Only a single hydrophone was used for recording at the target and hence array-based methods for identifying on-axis clicks could not be used. With that in mind, clicks were classified as presumed on-axis clicks following the assumptions that the dolphin would be scanning its biosonar across the target at least once every 5 s with a roughly constant SL during each scan, and that the click having the highest recorded amplitude within a click sequence would therefore have been recorded closest to the acoustic axis (Møhl et al., 2003; Johnson et al., 2006). Within each trial, clicks were automatically selected as presumed on-axis from the SoundTrap recordings using a 5 s sliding window (75% overlap) and the following criteria: (i) the click with the highest SL within each 5 s window is the most likely on-axis candidate; (ii) the candidate click is only selected if it is not one of the first two or last two clicks in the time window to ensure that the SL is increasing and decreasing within the time window, as expected during scanning; and (iii) if the same click has the highest SL in successive overlapping 5 s windows, it is only selected once. We wish to highlight that successful implementation of these on-axis criteria relies on a high degree of biosonar focus being directed towards the target and would not recommend using those criteria for analysing single-hydrophone recordings of wild or untrained animals.
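The sliding-window selection of presumed on-axis clicks can be sketched in Python as follows. This is an illustration of criteria (i) to (iii) rather than the original code: the function name is hypothetical, click times are assumed to be sorted, and windows are assumed to start at the first click and advance in steps of 1.25 s (i.e. 75% overlap of a 5 s window).

```python
def select_on_axis(times_s, levels_db, window_s=5.0, overlap=0.75):
    """Return sorted indices of clicks presumed to be on-axis.

    times_s: sorted click times (s); levels_db: back-calculated SL per click.
    """
    if not times_s:
        return []
    step = window_s * (1.0 - overlap)
    selected = set()  # a set ensures each click is selected only once (iii)
    start = times_s[0]
    while start <= times_s[-1]:
        idx = [k for k, t in enumerate(times_s) if start <= t < start + window_s]
        # With fewer than five clicks, every click is among the first two
        # or last two in the window, so no candidate could pass (ii).
        if len(idx) >= 5:
            best = max(idx, key=lambda k: levels_db[k])  # criterion (i)
            if best not in idx[:2] and best not in idx[-2:]:  # criterion (ii)
                selected.add(best)
        start += step
    return sorted(selected)


if __name__ == "__main__":
    times = [0.5 * k for k in range(10)]
    levels = [180, 182, 185, 190, 186, 183, 181, 180, 179, 178]
    print(select_on_axis(times, levels))
```

In the example, the click at index 3 is the loudest in the first window and lies away from the window edges, so it is selected; in later windows it falls among the first two clicks and is not selected again.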
To investigate whether the two dolphins emitted more clicks per packet at longer ranges potentially to compensate for increased TL and improve detection performance, the relationship between the number of clicks per packet (for packets containing a presumed on-axis click) and target range was analysed using a generalised linear mixed-effects model (GLME, MATLAB function: ‘fitglme’). The fixed effect was target range and as random effects, the model had intercepts for dolphin (SAY or TRO) and by-dolphin random slopes for target range. The GLME was fitted using a Poisson distribution for the response variable, a log link function and Laplace fit method. Using the MATLAB function ‘compare’, a P-value was computed by likelihood ratio tests of the model including target range against a reduced model without target range.
The relationship between SL and target range was analysed using a linear mixed-effects model (LME, MATLAB function: ‘fitlme’, with fit method set to maximum likelihood estimation in order to use the MATLAB function ‘compare’ for model comparison). Fixed effects were log10(R) (where R is target range), dolphin, click type (non-packet or packet click) and an interaction term between log10(R) and dolphin. As random effects, the model had intercepts for trial and by-trial random slopes for the effect of log10(R). P-values for model comparisons were computed by likelihood ratio tests of the model including the parameter in question against a reduced model with the selected parameter removed. Because range-dependent SL adjustments might be inadequately explained with a model covering the entire target range interval from 1 to about 400 m, we also ran the SL model with target range limited to 1–25 m, 1–100 m and >100 m. These intervals were chosen to compare the log10(R) slopes estimated for short target ranges, for target ranges where click packets are uncommon and for target ranges where packets are expected to be frequently used.
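One way to write the full SL model described above is given below. The notation is introduced here for illustration and does not appear in the original analysis: i indexes clicks, j indexes trials, D_j is an indicator for dolphin (each trial involves a single dolphin), C_ij is an indicator for click type (regular or packet click), u_0j and u_1j are the by-trial random intercept and slope, and ε_ij is the residual error.

```latex
\mathrm{SL}_{ij} = \beta_0 + \beta_1 \log_{10}(R_{ij}) + \beta_2 D_j + \beta_3 C_{ij}
  + \beta_4 D_j \log_{10}(R_{ij}) + u_{0j} + u_{1j} \log_{10}(R_{ij}) + \varepsilon_{ij}
```

In this formulation, β_1 is the mean log10(R) slope, and the standard deviation of u_1j describes how much that slope varies from trial to trial.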
The relationship between lag time (ICI minus TWTT) and target range was analysed using an LME for the presumed on-axis regular clicks produced at target ranges less than 100 m. The analysis was also restricted to ICIs exceeding the TWTT (because negative lag times do not make sense in relation to the hypothesis that lag time represents some neural echo processing time) with an upper ICI threshold of 500 ms (to remove the influence of occasional ICIs of up to several seconds). Fixed effects were target range, dolphin and an interaction term between range and dolphin. As random effects, the model had intercepts for trial and by-trial random slopes for the effect of range. P-values were computed by likelihood ratio tests.
Output level adjustment of dolphin echolocation clicks has been suggested to not be a cognitive process but to be regulated passively through ICI adjustments; at shorter target ranges, the acoustic TWTT is shorter and shorter ICIs are therefore generally used, which may cause a decrease in SL as a result of less time for pressurisation in the nasal passage in between clicks (Au and Benoit-Bird, 2003). If dolphin click SL mainly depends on the time interval since the last outgoing click, then it might be hypothesised that short-ICI packet clicks (ICI <50 ms) and regular clicks produced at a similar ICI are all produced at a similar SL. Such a result would be highly interesting given that packets are used for long-range echolocation whereas regular clicks with ICIs less than 50 ms are mainly expected during short-range echolocation (50 ms corresponds to the TWTT for a target range of 38 m). To address this aspect of dolphin sound production, an LME analysis was performed to test whether presumed on-axis packet clicks (excluding the first click in each packet) had a different SL compared with presumed on-axis regular clicks limited to clicks with ICI <50 ms (as for the packet clicks compared against). Fixed effects were ICI, dolphin, click type and an interaction term between ICI and dolphin. As random effects, the model had intercepts for trial and by-trial random slopes for the effect of ICI. P-values were computed by likelihood ratio tests.
For all models, residual plots were visually inspected and did not reveal any obvious deviations from homoscedasticity or normality.
RESULTS
SAY began the target approaches from starting ranges between 35 and 444 m, and approached the target at a mean (±s.d.) rate of 3.0±0.4 m s−1 (range: 1.8–3.7 m s−1). A total of 37,724 echolocation clicks were detected in the Dtag recordings during the target approaches for SAY. This number reduced to 31,027 clicks when applying a criterion of a SNR >6 dB and when excluding buzz clicks below an ICI cut-off of 10 ms. On the target-mounted SoundTrap, a total of 16,284 click detections were made (Fig. S1A) of which 14,383 were detected at a SNR >6 dB, an ICI >10 ms and at a target range >1 m.
TRO began target approaches from ranges between 34 and 403 m, and approached at a mean rate of 3.2±0.3 m s−1 (range: 2.4–3.7 m s−1). A total of 16,284 echolocation clicks were detected in the Dtag recordings, of which 10,663 clicks were detected at a SNR >6 dB and ICI >10 ms. In the SoundTrap recordings, 7681 clicks were detected at a SNR >6 dB, an ICI >10 ms and at a target range >1 m (from a total of 9739 detections; Fig. S1B).
Click packets, presumed on-axis clicks and ICIs
From the subset of clicks identified on both the Dtag and the SoundTrap (SNR >6 dB, ICI >10 ms, target range >1 m), the automatic click packet detector identified 663 click packets for SAY (2769 total clicks) and 321 click packets for TRO (1145 total clicks). For SAY and TRO, 88% and 96% of these automatically detected packet clicks, respectively, were also identified as packet clicks during the manual selection process. Overall, the ICI within click packets tended to decrease gradually from click to click until the last click pair, in which the ICI often increased. For SAY, 433 clicks met the criteria for being presumed on-axis at the receiver, of which 292 were classified as regular (i.e. non-packet) echolocation clicks and 141 as packet clicks. For TRO, 355 clicks were presumed to be on-axis, of which 288 were regular clicks and 67 were packet clicks. Table S1 lists the source parameters estimated for all accepted clicks (i.e. recorded at unknown aspect relative to the acoustic axis) detected in the SoundTrap recordings and for the subset of presumed on-axis clicks for each dolphin.
For short target ranges below ∼25 m, SAY and TRO predominantly clicked at ICIs above the TWTT (Figs 1 and 2). For target ranges less than 100 m, SAY and TRO produced 30 and 5 packets out of the 663 and 321 total packets, respectively, and hence both dolphins predominantly echolocated using regular clicks at short (<25 m) and medium (∼25–100 m) target ranges. At medium ranges, TRO mainly clicked at ICIs above the TWTT as for short target ranges (Movie 2; Fig. 1D–F and 2B). In contrast, SAY seemingly used a bimodal range-dependent ICI adjustment at medium ranges involving ICIs exceeding the TWTT or ICIs below the TWTT (Movie 1; Figs 1A and 2A). In 16 trials, SAY produced more than 100 clicks per trial (2595 total clicks in all 16 trials) with ICIs less than the TWTT at target ranges less than 100 m. To investigate the range-dependent ICI adjustments of the ICIs shorter than the TWTT in those 16 trials, an LME analysis was performed (Table S2). The fixed effect was target range and random effects were intercepts for trial and by-trial random slopes for the effect of target range. The results showed an intercept of 10.7 ms (±1.96 ms s.e.; P=5.0×10−8) and a slope of 0.59 ms m−1 (±0.035 ms m−1 s.e., P=1.4×10−60). Fig. 3 and Fig. S2 illustrate the potential range ambiguity associated with clicking at ICIs shorter than the TWTT to the target under the assumption that echo delays are estimated relative to the most recent outgoing click, although dolphins may process echo delays differently.
The LME analysis for the relationship between lag time and target range did not find a significant effect of target range at the 5% significance level either for the full model or for reduced versions of the model (results not shown). For SAY, the mean±s.d. lag time was 41.7±65.1 ms (range: 0.06–413 ms, N=155) and for TRO the mean lag time was 49.4±48.8 ms (range: 4.2–421 ms, N=169) for the presumed on-axis regular clicks produced at ranges less than 100 m and at ICIs between the TWTT and 500 ms.
At long target ranges (>100 m), both dolphins frequently produced click packets as well as non-packet (regular) clicks with ICIs that were often shorter than the TWTT, but mostly longer than ∼150 ms (Fig. 2). The click packets were produced at IPIs exceeding the TWTT in 87% and 79% of cases for SAY (N=663) and TRO (N=321), respectively, when measuring IPI as the interval between the first click in a packet and the previous click. If measuring IPI as the interval from the last packet click to the subsequent click, 90% and 91% of IPIs exceeded the TWTT for SAY and TRO, respectively. The choice of reference click matters: IPI measured relative to the click before the packet is better suited for investigating whether SL increases as a consequence of increasing ICI, whereas IPI measured relative to the click after the packet is better suited for investigating whether all target echoes will return before the next click. The mean ICIs within packets were 19–28 ms (Table S1). Packets were produced either in sequences or as single packets in between bouts of regular click production. GLME analysis of the number of clicks per packet (for packets containing a presumed on-axis click) in relation to target range showed a minor increase by a mean of 0.0017 (±0.0006 s.e.) clicks per packet per metre (P=0.01; Table S3), i.e. an average increase of 0.6 clicks per packet from 100 m to the longest range tested. The mean packet duration (SAY: 54.8±25.4 ms, range: 30–235 ms, N=141; TRO: 77.1±34.6 ms, range: 28–254 ms, N=67) was, in a linear regression analysis, not found to change significantly (at the 5% significance level) with target range and hence the delay between the last click in a packet and the first packet echo return from the target (estimated as TWTT minus packet duration) increased as a function of target range.
Source levels of presumed on-axis clicks
The relationship between presumed on-axis click SL (Table S1) and log10(R), where R is target range, is illustrated in Fig. 4. The LME analysis showed a non-significant effect for the dolphin×log10(R) interaction term and therefore a reduced model without the interaction was selected (Table S4). SL (N=788) increased with target range following 17log10(R) (±0.8 s.e., P=5.3×10⁻⁷⁸) with the random effects showing a standard deviation of 4.1 for the by-trial slopes (Table S4). A small mean SL difference of 2 dB (±0.7 dB s.e., P=0.0032) was found between the two dolphins. The mean intercept was 173 dB (±0.7 dB s.e.) re. 1 µPa (pp) with a standard deviation of 7.1 for the by-trial random effect. The model showed that packet clicks had a 6 dB (±0.6 dB s.e.) higher intercept than regular clicks (P=1.9×10⁻²¹; Table S4). For the SL models with target range limited to 1–25 m (N=129), 1–100 m (N=406) and >100 m (N=382), the mean log10(R) slopes were found to be 16log10(R) (±2.3 s.e., P=6.7×10⁻¹¹), 16log10(R) (±1.1 s.e., P=7.1×10⁻⁴²) and 16log10(R) (±2.9 s.e., P=5.0×10⁻⁸), respectively. These results suggest that the model covering the entire target range interval provided a good approximation of the range-dependent SL adjustments.
Fig. 5 shows the relationship between presumed on-axis click SL and ICI. In the LME analysis of the relationship between SL and ICI for presumed on-axis clicks with ICI less than 50 ms, a reduced model without the dolphin×ICI interaction term and without by-trial random slopes was selected (Table S5). The packet clicks were found to be produced at 23 dB (±1.1 dB s.e.) higher SL than the regular clicks (P=9.0×10⁻⁶⁶; Table S5). This demonstrates that packet clicks are not limited in terms of SL despite being produced at short ICIs. The mean intercept was 189 dB (±1.8 dB s.e.) re. 1 µPa (pp) with a standard deviation of 1.8 dB for the by-trial random effect. The SL was found to increase by 0.13 dB (±0.05 dB s.e.) per 1 ms increase in ICI (P=0.0079; Table S5). There was a significant difference of 5 dB (±1.1 dB s.e., P=1.8×10⁻⁵) between the mean SL used by the two dolphins for clicks having ICIs shorter than 50 ms (Table S5).
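To make the fixed-effect estimates of the full-range SL model concrete, the sketch below evaluates SL = 173 + 17log10(R), with the 6 dB packet-click offset added where relevant. It deliberately ignores the between-dolphin difference and the by-trial random effects, and it assumes that the reported intercept applies to regular clicks of the reference dolphin, which the text does not specify. It is therefore an illustration of the mean trend only.

```python
import math

INTERCEPT_DB = 173.0         # reported mean intercept, dB re. 1 µPa (pp)
SLOPE_DB_PER_DECADE = 17.0   # reported mean slope on log10(target range)
PACKET_OFFSET_DB = 6.0       # reported higher intercept for packet clicks


def predicted_sl_db(target_range_m, packet_click=False):
    """Mean predicted source level from the fixed effects only (assumed reference levels)."""
    sl = INTERCEPT_DB + SLOPE_DB_PER_DECADE * math.log10(target_range_m)
    return sl + PACKET_OFFSET_DB if packet_click else sl


print(round(predicted_sl_db(10.0), 1))                      # 190.0
print(round(predicted_sl_db(100.0), 1))                     # 207.0
print(round(predicted_sl_db(100.0, packet_click=True), 1))  # 213.0
```

Given the reported by-trial standard deviations (4.1 for slopes and 7.1 for intercepts), individual approaches can depart considerably from these mean values.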
Target detection and discrimination studies have shown that dolphins can solve echolocation tasks involving targets at ranges well beyond 100 m (Au and Snyder, 1980; Ivanov, 2004; Finneran, 2013), but that such long ranges may trigger a switch to echolocation in click packet mode (Ivanov, 2004; Finneran, 2013). Similar click patterns have been reported for false killer whales (Pseudorca crassidens), Risso's dolphins (Grampus griseus) (Madsen et al., 2004) and rough-toothed dolphins (Steno bredanensis) (Rankin et al., 2015) in the wild, but all in a context where it was impossible to know whether the animals were communicating with conspecifics or echolocating, and if the latter, at what target range. In captivity, dolphin long-range echolocation capabilities have been mainly tested in experiments with stationary dolphins that might opt for different echolocation behaviours to those of moving animals. The two free-swimming bottlenose dolphins in this study frequently used click packets during target approaches when target ranges exceeded ∼100 m, demonstrating that echolocation in click packet mode is not an artefact of studying stationary animals. At short target ranges (<25 m), both dolphins decreased ICI with decreasing range. For both long (>100 m) and medium target ranges (∼25–100 m), the data show range-dependent adjustments of ICI that deviate from the traditional click–echo–click paradigm, where ICIs exceed the TWTT to the target. Overall, the dolphins adjusted SL by approximately 17log10(R) during target approaches, but with substantial variation between trials. Below, we discuss range-dependent ICI adjustments in the context of short, medium and long target ranges before moving on to discuss range-dependent SL adjustments.
Short-range (<25 m) echolocation: ICIs exceed the TWTT
Seminal studies on bottlenose dolphins in the laboratory have shown that they echolocate with click rates such that the target echo returns before another click is produced (Evans and Powell, 1967; Morozov et al., 1972; Au et al., 1974). It has been suggested that ICIs are 20–45 ms longer than the TWTT to accommodate auditory and higher order processing of echo information (Morozov et al., 1972; Au, 1993).
Here, we show that at target ranges less than ∼25 m, two free-swimming dolphins almost exclusively click at ICIs exceeding the TWTT (Figs 1 and 2), conforming to the expectations of a click–echo–click paradigm for echolocation (Au, 1993; Surlykke et al., 2014). Lag times did not change significantly with target range for the two dolphins at target ranges less than 100 m. However, the large variation observed around the mean lag times highlights that the emission of a subsequent click is unlikely to depend on the dolphins first having to perform some lengthy stereotyped neural processing of the returning target echo resulting from the previous click. This is supported by physiological recordings of auditory potentials in dolphins (i.e. representing cortical auditory processing) that appear at latencies exceeding 50 ms following sound reception (Woods et al., 1986; Hernandez et al., 2007).
Medium-range echolocation (∼25–100 m): range-dependent ICI adjustments above and below the TWTT
TRO's echolocation behaviour was similar for short and medium target ranges; namely, clicking at ICIs exceeding the TWTT to the target (Figs 1D–F and 2B,D). In contrast to TRO, SAY employed an echolocation behaviour that resulted in a bimodal ICI distribution around the TWTT, i.e. exhibiting both positive and negative lag times for medium target ranges (Figs 1A–C and 2A). SAY mostly used ICIs that exceeded the TWTT, but in about half the trials, ICIs shorter than the TWTT were frequent for target ranges from ∼25 to 100 m (Movie 1; Figs 1A–C and 2A). This observation might be interpreted as SAY occasionally diverting her biosonar attention to unknown targets at a range between herself and the experimental target. If so, then such potential targets would most likely be located at various ranges from the experimental target between trials because the target boat was not anchored at a fixed location in the bay, but often changed location and also drifted between trials. The prediction for situations where biosonar focus is on non-intended targets located closer to the dolphin is that the intercept, when modelled relative to range to the experimental target, will be highly variable between trials compared with situations where the focus is on the experimental target, whereas the range-dependent ICI adjustment slope might be similar between trials (assuming stereotyped ICI adjustments). However, the lower and upper intercept estimates from the LME analysis were relatively close together (6.9–14.6 ms, Table S2). The range-dependent ICI adjustment slope was also found to lie within a narrow range with lower and upper values of 0.52–0.66 (Table S2). These results show that both intercept and slope were similar across the subset of trials for SAY, as is also evident from the dense and narrowly distributed ICI cluster below the TWTT for medium target ranges in Fig. 1A.
We therefore posit that the stereotyped range adjustment used by SAY when clicking at ICIs less than the TWTT at intermediate target ranges of ∼25–100 m is a result of adjustments to the experimental target and that SAY therefore alternated between two different modes of range-dependent ICI adjustment representing a standard rate (ICI>TWTT) and high rate (ICI<TWTT) target inspection.
Long-range (>100 m) echolocation involves click packets
This study demonstrates that when target range exceeded ∼100 m, both dolphins routinely produced click packets with very short ICIs as they closed in on a known target. This strategy was interspersed with emissions of regular clicks at ICIs often exceeding 150 ms, but not necessarily exceeding the TWTT (Figs 1 and 2). The apparent range threshold for packet production was similar to the thresholds previously reported for stationed animals during long-range echolocation (Turl and Penner, 1989; Ivanov, 2004; Finneran, 2013), supporting the interpretation that click packet utilisation in stationed animals is not an artefact of echolocating into a relatively static echoic background. Relatively few regular clicks had ICIs in the interval from 50 to 150 ms (Figs 1 and 2A,B). Finneran (2013) previously found a similar gap in the ICI distribution at 100–200 ms for clicks produced by three dolphins engaged in long-range echolocation.
Deviating from the click–echo–click paradigm: discrimination, bearing estimation or ranging?
For increasing ranges, there was an increased tendency for the dolphins to use ICIs shorter than the TWTT to the target (Figs 1 and 2). Although this contrasts with general biosonar theory in which target echoes are assumed to be received before emission of the next click (Au, 1993), there is evidence that some toothed whales will solve echolocation tasks with ICIs shorter than the TWTT. Such observations have been made for a beluga whale (Au et al., 1987; Turl and Penner, 1989) and in a phantom-echo study showing that three bottlenose dolphins, including SAY and TRO, increasingly used ICIs below the TWTT when target ranges increased from ∼25 to 100 m (Finneran, 2013). In these studies, the tasks were either to report target presence or absence or the detection of a change in the echo, which is similar to the task that SAY performed in the present study. Although clicking at ICIs exceeding the TWTT is by far the more commonly reported observation for toothed whales (Au, 1993), our results suggest that future studies should not uncritically classify clicks as focused elsewhere than on the intended target just because ICIs are shorter than the TWTT to the target (Akamatsu et al., 2005).
In this study, the dolphins were tasked with detecting and approaching the target either to intercept it (TRO) or to detect a target orientation change (SAY). Thus, target ranging was not part of the trained echolocation tasks. It seems that detection, bearing estimation and discrimination are not inherently sensitive to the range ambiguity problems purported to stem from using ICIs shorter than the TWTT (Surlykke et al., 2014). For tasks in which target range is not needed for success (i.e. when the task can be solved despite range ambiguity problems), there might be a benefit in sidestepping the general rule that the ICI exceeds the TWTT to the target: increased echo return rates from using ICIs shorter than the TWTT could potentially improve target detection or classification performance.
Is range estimation possible when ICI<TWTT?
While it may be possible that the two dolphins were not interested in range and therefore employed ICIs shorter than the TWTT at medium and longer ranges, it is nevertheless the case that many studies with stationed dolphins where ranging is not part of the trained problem solving have shown a strong correlation between range and ICI (Au et al., 1982; Penner, 1988), giving rise to the paradigm that ICI equals TWTT plus lag time (Au, 1993). It may therefore be that echolocating animals inherently seek to establish range no matter whether it is part of the trained objective or not. If so, this raises the question of how they may establish range when producing ICIs shorter than the TWTT. Dolphin echolocation clicks are fairly stereotyped and both SL and spectral characteristics often change gradually within a click train (Au, 1993). A returning echo is therefore unlikely to contain much if any information about which specific click the echo delay should be measured from in order to correctly estimate target range, potentially making resolution of ambiguity a complex processing task. If echo delays are measured relative to the most recent click, it might be possible to cope with the potential problem of range ambiguity resulting from clicking at ICIs shorter than the TWTT provided that some ICIs occasionally do exceed the TWTT for the target of interest. The approach example in Movie 1 is replotted in Fig. 3 to show that although many of the echo delays measured from the most recent click will result in underestimated range for target ranges between ∼25 and 100 m, the ICIs contain enough variation that only short time gaps occur between correct range estimates. When a target is still tens of metres away, it might not be essential that every returning echo results in a correct range estimate, in contrast to the last few metres and last few seconds before target interception. This intermittent ranging model assumes that dolphins are able to separate correct range estimates (i.e. from clicks with ICIs exceeding the TWTT) from underestimates of target range. Fig. 3 and Fig. S2 show that such underestimations appear at ranges that are shorter by ∼10 m or more relative to the correct target range. The ∼10 m gap where no range estimates are present occurs because the last ICIs in click trains and click packets are rarely shorter than ∼13 ms (TWTT for 10 m range is ∼13 ms; Fig. S2). Assuming dolphins perform ranging as speculated here, the dolphin biosonar system merely requires a range resolution that is better than ∼10 m in order for correct range estimates to appear in range resolution bins that are separate from the underestimates.
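The intermittent ranging idea can be sketched numerically. The code below takes a list of click emission times and a target range, and returns the range implied by each echo when its delay is measured from the most recent click. The stationary target, the sound speed of 1500 m/s and the click times are all assumptions made for illustration; this is not the analysis behind Fig. 3.

```python
SOUND_SPEED_M_PER_S = 1500.0  # assumed nominal seawater sound speed (not from the study)


def apparent_ranges_m(click_times_s, target_range_m,
                      sound_speed=SOUND_SPEED_M_PER_S):
    """Range implied by each echo when delay is timed from the most recent click."""
    twtt_s = 2.0 * target_range_m / sound_speed
    estimates = []
    for emitted in click_times_s:
        echo_time = emitted + twtt_s
        # Most recent click emitted at or before the echo arrival.
        last_click = max(t for t in click_times_s if t <= echo_time)
        estimates.append((echo_time - last_click) * sound_speed / 2.0)
    return estimates


# Hypothetical train: two 50 ms ICIs (below the 80 ms TWTT), then a 100 ms ICI.
clicks = [0.0, 0.05, 0.10, 0.20]
print([round(r, 1) for r in apparent_ranges_m(clicks, 60.0)])
# [22.5, 22.5, 60.0, 60.0]
```

In this example the two echoes that arrive after a newer click has been emitted yield underestimates well separated from the true 60 m, while the echo following the longer ICI yields the correct range. The final click is also ranged correctly simply because the list contains no later click.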
The function of click packets
Irrespective of how dolphins perform target ranging, the use of click packets suggests that dolphins are able to use the combined echo information from the emitted packets for long-range detection and discrimination (Finneran, 2013; Finneran et al., 2014). Multi-echo processing is probably an inherent part of echolocation; detection thresholds have been found to improve by 20 dB for a dolphin (Altes et al., 2003) and 25 dB in bats (Surlykke, 2004) in phantom target experiments when the total number of echoes available for target detection was increased from one to eight, which suggests that echolocating animals exploit information from multiple echoes for detection and classification. In a click packet study that limited the maximum number of phantom target echo returns following click packet emission, the detection threshold decreased 3 dB for every doubling of available echoes, thus resembling an ideal energy detector (Finneran et al., 2014). Although we, in this study, saw a statistically significant increase in the number of clicks per packet with increasing target range, the average increase was merely 0.5 clicks from 100 to 400 m, which was not sufficiently high to offset the increased TL over the same range, and further seems too minor an effect to be considered biologically relevant. However, Finneran (2013) and Finneran et al. (2014) showed in phantom-echo experiments (allowing manipulation of both echo delay and echo level) that packet production is influenced more by echo delay, i.e. simulated target range, than by echo-to-noise ratio. That result may partly explain why all studies of packet production, including this one, have found a similar range threshold of 75–120 m above which click packets are used regardless of varying target strengths and potentially varying noise levels between studies. However, the underlying mechanism that establishes a threshold for packet production remains unknown.
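For reference, the 3 dB per doubling trend attributed to an ideal energy detector follows from a threshold improvement of 10log10(N) for N available echoes. The short sketch below evaluates that expression only; it is a textbook relationship and not a model fitted to any of the cited data sets.

```python
import math


def energy_detector_gain_db(n_echoes):
    """Threshold improvement of an ideal energy detector summing n equal echoes."""
    return 10.0 * math.log10(n_echoes)


for n in (1, 2, 4, 8):
    print(n, round(energy_detector_gain_db(n), 1))
# 1 0.0
# 2 3.0
# 4 6.0
# 8 9.0
```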
A possible explanation is that correct identification of the echoes returning from a dolphin's own click production increases in difficulty with increasing echo delay, i.e. listening time following click production. The use of click packets may serve to encode a recognisable echo pattern that dolphins can listen for during investigation of distant objects when very weak echoes must be detected from an acoustic background that also includes conspecific echolocation signals, ambient noise, as well as clutter and reverberation from the emitted clicks. Ridgway et al. (2018) recently observed that trained dolphins rarely used packets when searching for mines at unknown locations, but often did so when returning to the operator's boat. This potentially implies that dolphins use packets for long-range inspection of relatively large targets at already known or predictable bearings as was probably the case in this and other studies (Turl and Penner, 1989; Ivanov, 2004; Finneran, 2013).
It might also be that multi-echo processing in dolphins is restricted by a fixed integration time so that multi-echo processing deteriorates at target ranges exceeding ∼100 m when echolocating in the click–echo–click paradigm, i.e. at ICIs above the TWTT. Switching to echolocation in click packet mode, with a packet length shorter than the presumed integration time, would restore multi-echo processing and so enhance echo detection. Alternatively, packet production is mainly used at target ranges long enough to ensure that echo packets return following a close to full release from forward masking from the outgoing clicks, which may last ∼100 ms for dolphins (Finneran et al., 2013), so that perceived echo levels do not vary considerably over the time course of a returning echo packet. In this study, packet durations averaged 54 and 76 ms for the two dolphins and did not increase as a function of target range, so the prediction is that echoes from returning echo packets on average are perceived at similar levels by the dolphins when target range exceeds ∼121–137 m (provided that no head scanning occurs). This interval is relatively close to the range threshold for packet production, which could simply be a coincidence or might indicate that the multi-echo processing associated with packet production performs best at ranges long enough to ensure a close to full release from forward masking from the most recent outgoing click.
Output levels are not limited by ICI: no fixed transmitter-side AGC
Minimising forward masking of returning target echoes has been suggested as the explanation for why dolphins decrease SL with decreasing range (Supin et al., 2007; Finneran, 2013; Supin and Popov, 2015). In a paper by Rasmussen et al. (2002) and subsequently in a paper by Au and Benoit-Bird (2003), it was reported that dolphins employ a 20log10(R) gain control on the transmission side so that the SL increases by 6 dB for every doubling in target range, presumably through range-dependent adjustment of ICI. However, that assertion has never been tested on free-swimming dolphins in a long-range echolocation setting where the target of interest is known.
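A logarithmic gain-control slope can be restated as a level change per doubling of target range, which is slope×log10(2). The sketch below does this for the proposed 20log10(R) model and, for comparison, for the mean 17log10(R) slope found in this study; the function name is illustrative.

```python
import math


def db_per_range_doubling(log10_slope):
    """SL change in dB for each doubling of range, given a slope on log10(R)."""
    return log10_slope * math.log10(2.0)


print(round(db_per_range_doubling(20.0), 1))  # 6.0
print(round(db_per_range_doubling(17.0), 1))  # 5.1
```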
Packet clicks in our study were all produced at high SL despite the short within-packet ICIs (Fig. 5) and the first click in a packet was produced after at least 150 ms of silence, but did not have higher SL than subsequent clicks in the packet (Table S1). These findings are inconsistent with the notion that SL might increase passively as a consequence of increasing ICI (Au and Benoit-Bird, 2003), at least when ICIs are longer than 10–15 ms (also discussed by Finneran, 2013).
The 20log10(R) gain control model is attractive because it ties neatly into the sonar equation (Urick, 1983; Møhl, 1988) by offsetting one-way TL for a single target. In concert with a proposed 20log10(R) gain control on the auditory side of the sonar system, it has been speculated that the combination of gain control on both receiving and transmitting sides causes animals to perceive a constant echo level irrespective of range (Supin and Nachtigall, 2013). However, one concern with finding a 20log10(R) gain control is that it is the same result that may arise from signals recorded on systems with limited recording dynamics irrespective of whether the animals actually employ AGC (Beedholm and Miller, 2007; Ladegaard et al., 2017). A second concern is that it is unlikely that increases in both gain control systems would continue out to the long ranges considered here; AGC-induced changes in SL and hearing sensitivity must have a maximum range at which perceived echo level is stable. In fact, the AGC does not seem relevant in the context of long-range echolocation as improvements in hearing thresholds that roughly compensate for a one-way TL only apply for targets closer than 10–20 m (Supin and Nachtigall, 2013), although full recovery of hearing sensitivity following click production may require echo delays of up to ∼100 ms, equivalent to a target range of ∼80 m (Finneran et al., 2013, 2016a). Here, we show an average SL adjustment slope of 17log10(R) over the target ranges studied, which nevertheless is relatively close to 20log10(R). Importantly, however, the logarithmic slopes for individual target approaches vary substantially (Fig. 4) and we found a random effect with a standard deviation of 4.1 for the by-trial slopes (Table S4). Thus, this study indicates that the average ∼20log10(R) adjustment is not a fixed SL adjustment applied in every approach.
It follows that free-swimming dolphins employ a large dynamic range of SLs for the same target and range, resulting in fluctuations in received echo powers of several orders of magnitude for the same delay windows; echolocating dolphins, therefore, do not seek to stabilise perceived echo levels but perhaps to bring echo dynamics into a range that can be handled by the auditory system.
Further, at long target ranges where receiving-side gain control stemming from forward masking release from the last outgoing click becomes insignificant, a 40log10(R) adjustment of SL is necessary to offset the two-way TL if echo levels are to remain constant. This is much higher than the highest degree of adjustment observed in this study (Table S4) and therefore a dolphin does not receive or perceive constant echo levels as previously suggested by Supin and Nachtigall (2013); both received and perceived echo-to-noise ratios will therefore rapidly deteriorate for long target ranges.
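The shortfall between the observed SL adjustment and the two-way TL can be quantified with a short calculation. The sketch below assumes spherical spreading only (40log10(R) two-way, no absorption), a constant target strength and the mean 17log10(R) SL slope; under those assumptions the received echo level changes by (17−40)log10(R2/R1) between two ranges.

```python
import math


def echo_level_change_db(range_from_m, range_to_m,
                         sl_slope=17.0, two_way_tl_slope=40.0):
    """Change in received echo level between two ranges (spreading loss only)."""
    return (sl_slope - two_way_tl_slope) * math.log10(range_to_m / range_from_m)


# From 100 m to 400 m with the mean SL slope reported in this study.
print(round(echo_level_change_db(100.0, 400.0), 1))  # -13.8
```

Including frequency-dependent absorption would make the drop in echo level larger than this estimate.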
This study demonstrates that free-swimming dolphins use click packets when engaged in long-range echolocation tasks involving a known single target. The dolphins did not exclusively use packets at long target ranges, but switched between emission of packets and regular clicks with long ICIs. Despite short within-packet ICIs, subsequent packet clicks were not produced at lower SLs than the first click in each packet or the regular clicks produced at long ICIs, which seemingly contradicts the notion that range-dependent adjustment of SL is driven by ICI. We further show that click packets were produced beyond an apparent target range threshold of ∼100 m, coinciding with the upper range for which the two dolphins used range-dependent ICI adjustments. The overall adjustment of SL was approximately 17log10(R), but relatively large variations in the slope between trials (standard deviation of 4.1) highlight that the dolphins did not employ a fixed transmission-side gain control. At target ranges less than ∼25 m, both dolphins echolocated with ICIs exceeding the TWTT, thus complying with the typical click–echo–click paradigm for dolphin echolocation. The dolphins were trained to solve different tasks, which may be the reason for the differences in echolocation strategies employed for medium target ranges between ∼25 and 100 m. The dolphin TRO, tasked with approaching and touching the target, consistently used ICIs exceeding the TWTT to the target. The dolphin SAY, tasked with detecting a target orientation change, apparently operated in two different echolocation modes with ICIs either above or below the TWTT to the target. We speculate that this bimodal echolocation strategy was used to improve the target orientation change detection performance at the potential cost of poor range estimation during periods when ICIs were shorter than the TWTT.
We wish to thank the animal trainers and staff at the US Navy Marine Mammal Program for their hard work and support of animal training and research sessions. Also, thanks to Kristian Beedholm for many helpful discussions and Roger Wuerfel for help constructing the biosonar target. We thank Walter Zimmer, Sander von Benda-Beckmann and three anonymous reviewers for comments that improved the manuscript.
Conceptualization: M.L., J.M., D.S.H., F.H.J., P.T.M., J.J.F.; Methodology: M.L., J.M., D.S.H., F.H.J., M.J., P.T.M., J.J.F.; Software: M.L., F.H.J., M.J.; Validation: M.L., F.H.J.; Formal analysis: M.L., F.H.J.; Investigation: M.L., J.M., D.S.H., J.J.F.; Resources: M.L., J.M., D.S.H., F.H.J., M.J., P.T.M., J.J.F.; Data curation: M.L.; Writing - original draft: M.L., P.T.M.; Writing - review & editing: M.L., J.M., D.S.H., F.H.J., M.J., P.T.M., J.J.F.; Visualization: M.L., F.H.J., P.T.M.; Supervision: P.T.M., J.J.F.; Project administration: P.T.M., J.J.F.; Funding acquisition: M.L., J.M., D.S.H., P.T.M., J.J.F.
Financial support was provided by the US Office of Naval Research Code 32 (Mine Countermeasures, Acoustics Phenomenology & Modeling Group). M.L. and P.T.M. were funded by frame grants from the National Danish Research Council (Det Frie Forskningsråd) and by a Semper Ardens grant from the Carlsberg Foundation. M.L.’s travel expenses were covered by grants from Augustinus Fonden and DAS-Fonden (Danish Acoustical Society, Dansk Akustisk Selskab). F.H.J. was funded by an AIAS-COFUND fellowship from Aarhus Institute of Advanced Studies under the EU's Seventh Framework Programme (Agreement No. 609033).
The authors declare no competing or financial interests.