Toothed whales produce echolocation clicks with source parameters related to body size; however, it may be equally important to consider the influence of habitat, as suggested by studies on echolocating bats. A few toothed whale species have fully adapted to river systems, where sonar operation is likely to result in higher clutter and reverberation levels than those experienced by most toothed whales at sea because of the shallow water and dense vegetation. To test the hypothesis that habitat shapes the evolution of toothed whale biosonar parameters by promoting simpler auditory scenes to interpret in acoustically complex habitats, echolocation clicks of wild Amazon river dolphins were recorded using a vertical seven-hydrophone array. We identified 404 on-axis biosonar clicks having a mean SLpp of 190.3±6.1 dB re. 1 µPa, mean SLEFD of 132.1±6.0 dB re. 1 µPa2s, mean Fc of 101.2±10.5 kHz, mean BWRMS of 29.3±4.3 kHz and mean ICI of 35.1±17.9 ms. Piston fit modelling resulted in an estimated half-power beamwidth of 10.2 deg (95% CI: 9.6–10.5 deg) and directivity index of 25.2 dB (95% CI: 24.9–25.7 dB). These results support the hypothesis that river-dwelling toothed whales operate their biosonars at lower amplitude and higher sampling rates than similar-sized marine species without sacrificing high directivity, in order to provide high update rates in acoustically complex habitats and simplify auditory scenes through reduced clutter and reverberation levels. We conclude that habitat, along with body size, is an important evolutionary driver of source parameters in toothed whale biosonars.
Echolocation is an active sense that involves generation and transmission of high-intensity sound pulses into the environment, and subsequent auditory detection and processing of returning echoes to inform changes in motor patterns for navigation and foraging (Griffin, 1958). The ability to detect echoes is ultimately limited by the hearing threshold, but for most healthy animals, the threshold for echo detection is set by either the ambient background noise or the level of reverberation or clutter (Au and Turl, 1983; Turl et al., 1991). In a noise-limited scenario, the detection range can be increased by increasing the biosonar source level (SL). However, this is not true for a reverberation- or clutter-limited scenario. Reverberation and clutter consist of unwanted echoes reflected off objects in the medium and from the boundaries, thus the level of reverberation and clutter will be proportional to the outgoing SL (Au, 1992). To increase detection range, an animal might then increase the transmitting directivity to reduce the number of ensonified objects in the same delay window as the target of interest (Moss and Surlykke, 2001; Aytekin et al., 2010). Thus, the optimal SL and directivity of animal biosonars are likely to be influenced by the habitat and the behavioural context in which the biosonars are operated. This has been demonstrated for bats (Neuweiler, 1989; Schnitzler and Kalko, 2001; Surlykke et al., 2009), yet little is known about how different habitats might drive evolution and operation of biosonars in toothed whales.
Previous studies have shown that large oceanic toothed whales emit low-frequency clicks at high SLs and with long interclick intervals (ICIs), allowing for long-range target detection (Møhl et al., 2003; Zimmer et al., 2005). Smaller coastal species, by contrast, emit high-frequency clicks at lower SL and short ICIs, resulting in short-range biosonars that allow high update rates on the acoustic environment (Madsen and Surlykke, 2013). The low-frequency echolocation clicks of large toothed whales in comparison to those produced by smaller species also suggest an inverse scaling of frequency with body size. As biosonar directivity is determined by signal frequency relative to size of the emitting aperture, such inverse scaling has led to fairly similar biosonar directivity amongst toothed whales (Koblitz et al., 2012), pointing to directivity as a potential evolutionary pressure determining biosonar frequency. Thus, to separate the effects of scaling from potential effects of habitat on click source parameters it is necessary to constrain analysis to species within a restricted size range.
To this end, the paraphyletic group of extant river dolphins represents an ideal group of study subjects. The intriguing convergent evolution, where distantly related toothed whales have independently adapted to life in riverine environments (Hamilton et al., 2001) and acquired the same overall morphology, which is different to that of similar-sized marine toothed whales, raises the question of whether the various marine to freshwater transitions have also led to biosonar systems better suited for dealing with clutter. Theoretically, the best way a river dolphin can deal with clutter is by a downregulation of SL in conjunction with an ability to increase directivity, thus reducing detection range and decreasing beamwidth. However, an intricate relationship exists among SL, directivity and frequency (Moore and Pawloski, 1990; Au et al., 1995; Madsen et al., 2013a), so adjustments of one parameter are likely to affect the others. Depending on which parameters, or combinations thereof, are selected for over time, more than one evolutionary outcome may be possible for a clutter-adapted biosonar.
EFD: energy flux density
EPR: equivalent piston radius
FFT: fast Fourier transform
pp: peak to peak
QRMS: Fc to BWRMS ratio
RMS: root mean squared
TOL: third octave level
TWTT: two-way travel time
The Amazon river dolphins, commonly known as botos (Inia sp.), are regularly found in shallow river channels, and seasonally, even in flooded forests (Best and da Silva, 1989; Martin and da Silva, 2004), suggesting that they, at least at times, operate their biosonars in highly clutter-limited conditions. Botos are therefore intriguing animals to study through quantification of click source parameters, since they may help shed light on the evolutionary driving forces behind biosonar parameters in different toothed whale species. To date, botos have been subject to multiple studies, including a few in the wild, but discrepancies exist in reported signal frequencies of boto echolocation clicks (Norris et al., 1972; Nakasai and Takemura, 1975; Kamminga et al., 1993) and source parameters such as SL and directivity index (DI) are lacking from the published literature on free-ranging animals.
Given their habitat, we hypothesise that botos use a short-range biosonar with high directivity. Specifically, we further hypothesise that botos will produce clicks at lower SL and operate their biosonars at shorter ICIs compared with similar-sized marine species. This would simplify the auditory scene through a decreased sensory volume, and provide high update rates, which we hypothesise are important when navigating a complex habitat. However, clicking at low SL has been shown to decrease the peak (Fp) and centroid frequency (Fc) of emitted clicks, thus lowering the DI (Houser et al., 2005; Madsen et al., 2013a). A low DI will provide more clutter, thus complicating echo processing. We therefore additionally hypothesise that botos will produce clicks with high Fc relative to SL in order to keep their biosonar beams directional.
Here, we test these hypotheses by using a vertical seven-hydrophone array to quantify source parameters of echolocation clicks of wild botos (Inia geoffrensis Blainville 1817) in three different areas in the Amazon, Brazil. We show that botos operate a short-range biosonar system with source parameters that cannot be predicted simply from size-related scaling. Specifically, we show that in comparison with similar-sized marine toothed whales, the boto clicks at high rates, producing low SL clicks with high-frequency content, whereas beam directivity compares with that of other toothed whales, regardless of size.
Botos were recorded in three main areas containing black water, white water or a mixture. Botos were found as single animals, mother calf pairs and in small groups of usually less than five animals. In general, animals were in a state of feeding/milling behaviour, where they moved around slowly in the same area. Recordings from all three areas totalled 213 min of usable recordings containing 34,827 echolocation clicks with received levels above a threshold set for the initial screening process. Within the confident localisation range of 40 m, a total of 404 echolocation clicks fulfilled the on-axis criteria. Of those clicks, 268 were within the 21 m requirement for inclusion in piston fit modelling.
Boto clicks were broadband transients having a mean duration of 14.1±3.1 µs with a mean ICI of 35.1±17.9 ms (Table 1). An example click is given in Fig. 1A,B with its waveform and power spectrum shown. Power spectra for all 404 on-axis clicks are shown in Fig. 1C with mean energy distribution overlaid. The clicks had a mean Fp of 95.7±12.4 kHz. Energy was centred on a mean Fc of 101.2±10.5 kHz with root-mean-squared bandwidth (BWRMS) of 29.3±4.3 kHz resulting in a QRMS ratio of 3.5±0.5 (Table 1). Linear regression analysis revealed a significant positive relationship between Fp and Fc (Fig. 2A; R2=0.48, t-test, P<0.001) with Fp=0.82Fc+12.9 kHz and also between Fc and BWRMS (Fig. 2B; R2=0.25, t-test, P<0.001) with Fc=1.2BWRMS+65.0 kHz. A significant positive relationship was also found between Fc and SLpp (Fig. 2B; R2=0.11, t-test, P<0.001) with Fc=0.56SLpp−4.5 kHz. The lowest SLpp values were measured at distances less than 5 m from the recording array (Fig. 3A). A significant positive relationship was found between SLpp and log(range) (Fig. 3A; R2=0.47, t-test, P<0.001) where SLpp=12.4 log(range)+176.7 dB re. 1 µPa. A few on-axis clicks came close to the clip level of the recording array represented by the upper dashed line in Fig. 3A, but no clicks were found to have been clipped. On-axis clicks were mostly well above the selected threshold of the initial screening process of 154 dB re. 1 µPa (peak) represented by the lower dashed line in Fig. 3A. The relationship between ICI and range was also positive (Fig. 3B; R2=0.15, t-test, P<0.001) with the corresponding linear regression line ICI=0.65range+24.5 ms. To estimate effects of pseudo-replication caused by multiple measurements on the same animals, we adjusted the degrees of freedom and found that all relationships were significant (P<0.05) if at least four animals had been measured (under the assumption of an identical number of clicks performed by each animal).
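The two range-dependent regressions reported above can be evaluated directly to see what they predict across the localisation range. The following is a minimal sketch only: the function names are hypothetical, and it assumes that "log(range)" denotes the base-10 logarithm of range in metres, which the text does not state explicitly.

```python
import math

def slpp_from_range(range_m: float) -> float:
    """Predicted SLpp (dB re. 1 µPa) from the reported fit
    SLpp = 12.4 log(range) + 176.7, assuming a base-10 logarithm."""
    return 12.4 * math.log10(range_m) + 176.7

def ici_from_range(range_m: float) -> float:
    """Predicted ICI (ms) from the reported fit ICI = 0.65 * range + 24.5."""
    return 0.65 * range_m + 24.5

# Evaluate both fits at a few ranges inside the 40 m localisation limit.
for r in (1, 5, 10, 20, 40):
    print(f"{r:>2} m: SLpp ~ {slpp_from_range(r):.1f} dB re. 1 µPa, "
          f"ICI ~ {ici_from_range(r):.1f} ms")
```

Under these assumptions, the fitted SLpp rises by 12.4 dB per tenfold increase in range, which is less than the 20 dB per decade that would be needed to fully offset spherical spreading loss.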
When comparing on-axis clicks between recording areas, several of the parameters listed in Table 1 turned out to differ significantly (two-sample t-test, P<0.05). The mean SLpp, SLRMS, SLEFD, Fc, Fp and QRMS differed between São Tomé (103 clicks) and Mamirauá Sustainable Development Reserve (193 clicks) and when comparing São Tomé with the confluence of Rio Negro and Rio Solimões (108 clicks) where significant differences were also found for mean ICI. Comparing clicks from the confluence and Mamirauá Sustainable Development Reserve the differences in mean ICI and localisation range were significant. The mean value differences were <3 dB for SLpp, SLRMS and SLEFD, <5 kHz for Fc and Fp, <0.15 for QRMS, <6 ms for ICI and <3 m for localisation range. Albeit significant, potentially due to large sample sizes, the differences are so small that hypotheses about local habitat adaptations in biosonar parameters seem unsupported with the available data sets.
The vertical composite beam pattern yielding the best piston fit for all on-axis clicks acoustically localised to less than 21 m resulted in a symmetric half-power (−3 dB) beamwidth of 10.2 deg with 95% bootstrap confidence interval (BCI) from 9.6–10.5 deg and corresponding DI of 25.2 dB with 95% BCI from 24.9–25.7 dB (Fig. 4, Table 2). The best piston fit was found for all clicks when fitting with an equivalent piston radius (EPR) of 3.6 cm (Table 2). Fig. 5B presents a click example where the signals recorded by each of the seven hydrophones have been back-calculated to a reference distance of 1 m. From this example, it is seen that amplitudes decrease with increasing off-axis angles in concert with distortions of the waveforms relative to the signal recorded closest to the acoustic axis. Surprisingly, the signal recorded on the second lowest hydrophone is smaller in amplitude and energy relative to the signal recorded on the lowest hydrophone. This phenomenon was seen for several clicks both above and below the 21 m localisation range criterion for inclusion in the piston fit model. Since this was an unexpected finding and violates one of the assumptions in the piston fit procedure, the clicks localised to less than 21 m were divided into two groups for which the piston model was run separately (Table 2). One group contained only clicks with single-lobed beam patterns and a second group contained only clicks with double-lobed beam patterns. To allow for some error of measurements, clicks were characterised as having a double-lobed beam pattern if the recorded amplitude increased from one hydrophone to the next by >3 dB when moving away from the estimated acoustic axis. Separate analysis (not shown) of source parameters for on-axis clicks excluding those having double-lobed beam patterns revealed almost no difference from the source parameters presented in Table 1. 
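The reported DI values are numerically consistent with a widely used approximation that links the DI of a circular piston to its half-power beamwidth, DI ≈ 20 log10(185°/beamwidth). The sketch below applies that approximation to the beamwidths given above; whether the authors used exactly this conversion is an assumption, and the function name is hypothetical.

```python
import math

def di_from_beamwidth(beamwidth_deg: float) -> float:
    """Approximate directivity index (dB) of a circular piston from its
    symmetric half-power (-3 dB) beamwidth in degrees.
    Assumption: DI ~ 20*log10(185 / beamwidth), a narrow-beam approximation."""
    return 20.0 * math.log10(185.0 / beamwidth_deg)

# Best-fit beamwidth and the two ends of its 95% bootstrap confidence interval.
for bw in (10.2, 9.6, 10.5):
    print(f"beamwidth {bw:4.1f} deg -> DI ~ {di_from_beamwidth(bw):.1f} dB")
```

Because the relationship is inverse, the narrow end of the beamwidth interval (9.6 deg) maps to the upper end of the DI interval, and the wide end (10.5 deg) maps to the lower end.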
Clicks with double-lobed beam patterns had broader mean half-power beamwidth and lower DI than single-lobed clicks (Table 2). For the group with the expected single-lobed beam patterns, the piston fit modelling was also done after these data had been divided into seven bins based on localisation range. A tendency for range dependence was evident: clicks localised at very short ranges (0–3 m) had a significantly broader composite beamwidth (Fig. 5A) and smaller DI than clicks recorded at longer range (Table 2).
Ambient noise levels
The spectrogram of the noise recording from the river channel in Mamirauá Sustainable Development Reserve (Fig. 6A) illustrates the noise levels that botos may be exposed to on a daily basis in that general area. Small motorised boats were occasionally observed in the river channel, but the main noise contribution is likely to be biological in origin, such as the noticeable noise band around 8 kHz. The 8 kHz noise band is most pronounced during the 6 h following sunset, whereas general noise levels seem clearly elevated in the 2 h following sunset (Fig. 6A). Third octave levels (TOL) differed between the two noise recording sites (Fig. 6B) with the São Tomé recording showing almost constant TOLs at about 90 dB re. 1 µPa regardless of frequency, whereas the Mamirauá recording showed an overall trend of decreasing TOLs towards higher frequencies. Above a few hundred Hz, both noise recordings from this study had lower TOLs compared with a coastal noise recording made in tropical waters with snapping shrimp present (Fig. 6B).
Parameters of toothed whale echolocation clicks have been shown to scale with animal size (Au, 1993; Madsen and Surlykke, 2013), except for DI which seems relatively constant across toothed whale species (Koblitz et al., 2012; Madsen and Surlykke, 2014). Scaling should therefore be considered when making interspecies comparisons of source parameters. Botos are comparable in size to marine species such as pantropical spotted dolphins (Stenella attenuata), spinner dolphins (S. longirostris), Indo-Pacific bottlenose dolphins (Tursiops aduncus) and some ecotypes of common bottlenose dolphins (T. truncatus) (Jefferson et al., 2008), which all make broadband echolocation clicks like botos and for which source parameters have been reported in the wild (Schotten et al., 2004; Wahlberg et al., 2011). By hypothesising that scaling is the major driver of source parameters, then botos are expected to emit clicks similar to those of the four marine species. Alternatively, it opens up a second hypothesis where habitat is an important co-driver acting on biosonar source parameters (Jensen et al., 2013), thus making it more likely that converging source parameters are found for similar-sized species that live in acoustically similar habitats, such as rivers. In this study, we present data in favour of the second hypothesis and discuss how clutter and reverberation in rivers are likely to be major factors responsible for the lower SL and faster clicking rate at high Fc and DI of river dolphins compared with marine species.
Fast biosonar sampling rates
Toothed whales generally do not produce a new click until relevant echoes from the previous click are received to avoid range ambiguity problems, so the two-way travel time (TWTT) corresponding to the ICI is expected to represent an upper estimate of range to targets of interest (Au et al., 1974; Au, 1993; Akamatsu et al., 1998). The ICI can be divided into TWTT to furthest target of interest plus a lag time, which may be species dependent (Madsen et al., 2013b) and also task dependent (Au, 1993; Wisniewska et al., 2012). For animals recorded at close range, a recording array may itself act as a relevant target, which can be inferred by animals keeping ICIs longer than TWTT to the array (Au and Herzing, 2003; Jensen et al., 2009). In this study, all on-axis clicks but one have ICIs longer than the TWTT at localisation ranges less than 17 m; below 10 m localisation range there also seems to be a reduction of ICIs longer than 50 ms compared with the remaining ICI distribution (Fig. 3B). This might indicate that botos close to the recording array have in fact detected the array and reduced their acoustic gaze to focus their attention on the array, or at least on other objects nearby. At ranges longer than 17 m, there is no evidence of range locking in the ICI, which can be explained either by a lack of attention to the array or by the animals failing to detect it at these longer ranges.
We hypothesised that botos click at high rates, based on the assumptions that search ranges will be short in shallow water environments, where a potential added effect of clutter and reverberation might create an acoustically complex environment where high update rates will benefit prey tracking and navigation. We find the mean ICI of 35 ms for the boto to be identical to the 35 ms found for the Ganges river dolphin (Jensen et al., 2013). In comparison, the typical mean ICIs of bottlenose dolphins in the wild may be two to four times longer (Wahlberg et al., 2011). Assuming a lag time of 20 ms (Morozov et al., 1972; Au, 1993), the mean ICIs found for botos correspond to an upper search distance averaging just 11 m (sound speed: 1500 m s−1), whereas the mean ICI of 63–120 ms measured for two species of bottlenose dolphin (Wahlberg et al., 2011) corresponds to upper search distances from 32–75 m, conforming with other studies investigating active biosonar ranges (Au et al., 2007; Simard et al., 2010). We therefore conclude that botos operate a short-range biosonar system, which may be reflected in their behaviour by their relatively slow swim speeds of usually less than 1 m s−1 (Best and da Silva, 1989). Botos are likely to encounter more objects per distance covered than toothed whales at sea, so a slow swim speed together with a high update rate probably reduces the risk of colliding with obstacles such as tree trunks and vegetation.
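The conversion from ICI to upper search distance used above is simple arithmetic: subtract the lag time from the ICI to get the two-way travel time, then halve the distance sound covers in that time. A minimal sketch follows, using the 20 ms lag and 1500 m s−1 sound speed stated in the text; the function name is hypothetical.

```python
def upper_search_distance_m(ici_ms: float,
                            lag_ms: float = 20.0,
                            sound_speed_m_s: float = 1500.0) -> float:
    """Upper estimate of target range (m) implied by an inter-click interval.
    ICI = two-way travel time + lag time, so range = c * (ICI - lag) / 2."""
    twtt_s = (ici_ms - lag_ms) / 1000.0
    return sound_speed_m_s * twtt_s / 2.0

# Mean ICI for botos, and the ICI span quoted for two bottlenose dolphin species.
for ici in (35.0, 63.0, 120.0):
    print(f"ICI {ici:5.1f} ms -> upper search distance ~ "
          f"{upper_search_distance_m(ici):.1f} m")
```

With these inputs the function gives about 11 m for the boto and roughly 32 to 75 m for the bottlenose dolphins, matching the figures in the paragraph above. The result is sensitive to the assumed lag time, which may be species and task dependent.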
Low source levels in quiet, shallow waters
The boto's mean SLEFD of 132 dB re. 1 µPa2s and mean SLpp of 190 dB re. 1 µPa are slightly higher than values reported for the Ganges river dolphin, which has a mean SLEFD of 127 dB re. 1 µPa2s and mean SLpp of 183 dB re. 1 µPa (Jensen et al., 2013). The SL values for both river dolphin species are, however, low in comparison to the mean SLEFD of 132–150 dB re. 1 µPa2s and mean SLpp of 199–212 dB re. 1 µPa reported for similar-sized marine toothed whales (Schotten et al., 2004; Wahlberg et al., 2011), thus supporting the hypothesis that the boto produces clicks at relatively low SL. Interestingly, the Irrawaddy dolphin (Orcaella brevirostris), which is also found in freshwater, echolocates with SL and ICI values very similar to the boto (Jensen et al., 2013), suggesting a general evolutionary selection for short-range biosonars in riverine toothed whales.
For toothed whales found in open waters, an increased SL will result in an increased detection distance as this increases the ratio of echo levels over ambient noise (Au and Turl, 1983; Au, 1993). However, in shallow waters, an increase in SL, although increasing target echo levels, also increases clutter and reverberation levels (Au and Turl, 1983), making operation of long-range sonar in shallow and complex habitats impractical (Jensen et al., 2013). The preliminary noise recordings made in this study indicate that ambient noise levels are low in comparison to what can be encountered at sea in shallow tropical marine waters (Fig. 6). This supports the notion that boto echolocation is indeed clutter limited, where high SLs do not facilitate detectability. Rather, it may be speculated that selection for low SLs has reduced the problem of range ambiguity in a complex auditory scene of many targets in shallow water.
High frequencies despite low output levels
Changing the SL is not without consequences and may lead to alterations of the frequency content of emitted clicks, as decreases in SL correlate with decreases in Fc (Moore and Pawloski, 1990; Au et al., 1995; Madsen et al., 2013a). The frequency content is important because directivity of a sonar beam follows the relationship between wavelength and effective aperture of the sound emitter, so for a constant melon size the DI will decrease with decreasing frequency (Au, 1993; Zimmer et al., 2005; Madsen and Wahlberg, 2007). For example, if a marine toothed whale switches from production of high SL clicks to clicks with SLs closer to that of a boto, the result may be a decrease in Fc by about an octave and hence a drop in DI by roughly 6 dB (Au et al., 1995). As initially hypothesised, botos may therefore have an Fc higher than their SLs would predict in order to operate their biosonars at low SL without sacrificing directivity. If so, botos will maintain a narrow biosonar beam that will work to reduce effects of clutter and reverberation. A low SL and high Fc in combination will therefore provide botos with simpler auditory scenes to process and interpret.
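The figure of roughly 6 dB per octave follows from the standard high-frequency approximation for a circular piston, sketched here under the assumptions that the aperture radius a stays constant and that ka is much larger than 1 (k is the wavenumber, f the frequency and c the sound speed; these symbols are introduced for illustration only):

```latex
\begin{align}
  \mathrm{DI} &\approx 20\log_{10}(ka), \qquad k = \frac{2\pi f}{c},\\
  \Delta\mathrm{DI} &= 20\log_{10}\!\left(\frac{2\pi (f/2)\,a/c}{2\pi f a/c}\right)
                     = 20\log_{10}\!\left(\tfrac{1}{2}\right) \approx -6.0\ \mathrm{dB}.
\end{align}
```

Halving the frequency therefore halves ka and lowers the DI by about 6 dB, independent of the absolute aperture size.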
In this study, we report a mean Fc and mean Fp of 101 and 95 kHz, respectively, which are well above most previously reported frequencies for boto echolocation clicks (Diercks et al., 1971; Nakasai and Takemura, 1975; Kamminga, 1979; Pilleri et al., 1979; Kamminga et al., 1993), but conform with the study by Penner and Murchison on a single boto in captivity (Penner and Murchison, 1970). Many of the previous studies may have suffered from equipment limitations, but even with adequate recording bandwidth, a lack of strict on-axis criteria may explain a large part of the remaining variation given how high frequencies are radiated in narrower beams than lower ones for the same aperture size (Au, 1993).
When comparing the mean Fc around 100 kHz with three published boto audiograms, it is surprising to find a best hearing sensitivity between 70 and 90 kHz, with a steep sensitivity cut-off above 100 kHz (Jacobs and Hall, 1972; Popov and Supin, 1990; Supin and Popov, 1993). This implies that botos only hear half of the sound energy in the returning echoes, which is at odds with the general match between best hearing frequency and the Fc in clicks of echolocating toothed whales (Au, 1993). This discrepancy may be real for botos, but in our opinion it is more likely that the low high-frequency cut-offs in the measured audiograms are related to the age of the measured animals or methodology. Data for Risso's dolphins show a poor overlap between click spectra and audiograms for old animals and a very good overlap for younger animals (Madsen et al., 2004; Nachtigall et al., 2005).
The frequency distribution reported here for the boto is slightly higher than for the similar-sized marine toothed whales chosen for comparison in this study (Schotten et al., 2004; Wahlberg et al., 2011). This corroborates our hypothesis that the boto operates its biosonar at relatively low SL, but at high Fc to maintain directivity. If habitat truly influences source parameters, then the prediction would be to find energy at similarly high frequencies for other freshwater-dwelling toothed whales as well. This seems to be the case for the Irrawaddy dolphin recorded in freshwater, where Fc and Fp have been found to be 95 and 101 kHz, respectively (Jensen et al., 2013). For baiji clicks, only Fp measures are published (Akamatsu et al., 1998), but Fp is by itself a far less-robust measure of the energy distribution in a click, compared with Fc, given how power spectra often show a bimodal energy distribution at least for some toothed whale species (Au et al., 1995; Au, 2004). We can therefore only speculate on whether or not the baiji produced clicks with similar frequency content as the boto. For the Ganges river dolphin (Fig. 7), however, the available data seem to contradict the hypothesis of high frequencies being advantageous for biosonars operated in riverine habitats, as the Ganges river dolphin has a mean Fc of just 61 kHz (Jensen et al., 2013). Such frequency content is very low compared with the boto and similar-sized marine toothed whales, which have reported mean Fc values between 75 and 91 kHz (Schotten et al., 2004; Wahlberg et al., 2011). Interestingly, some evidence suggests that the unique maxillary bony crests of the Ganges river dolphin act to focus outgoing biosonar signals so that it may emit clicks with a DI of 22 dB, despite its small size and relatively low-frequency clicks (Jensen et al., 2013). 
While this is higher than what their biosonar frequency would suggest (Jensen et al., 2013), it is still lower than the DIs measured for most marine toothed whales (Koblitz et al., 2012).
Several evolutionary solutions to a shallow-water-adapted biosonar may exist; this is corroborated by results shown in Fig. 7, which displays an overview of echolocation signals for the extant river dolphins that have independently adapted to life in shallow waters during different ages of the Miocene epoch (Hamilton et al., 2001). The Ganges river dolphin, the baiji and the boto all produce very short broadband clicks, but the Ganges river dolphin clearly does so at lower frequencies than the others. Nevertheless, a recent study confirms that the Ganges river dolphin employs a short-range biosonar (Jensen et al., 2013), as we expect for all riverine toothed whales. In terms of habitat, the franciscana may be seen as an outlier amongst river dolphins as it inhabits estuarine and coastal waters. This might explain why it has evolved a narrow-band high-frequency signal, which is thought to be an adaptation against killer whale predation (Melcón et al., 2012; Kyhn et al., 2013).
Dynamic beam patterns
The composite vertical beam patterns presented in this study have half-power beamwidths of approximately 10 deg and DIs in the order of 25–27 dB (Fig. 4 and Fig. 5A, Table 2), which fall right in the middle of previous measures for toothed whales (Koblitz et al., 2012), suggesting that high Fc at low SL serves to maintain high directivity despite low biosonar output levels in botos. In a study by Pilleri and co-workers, a single hydrophone recorded echolocation clicks from various angles to produce a composite beam pattern with an estimated half-power beamwidth of 29 deg at their reported peak frequency of 80 kHz and 28 deg at 100 kHz (Pilleri et al., 1979). Such half-power beamwidths are two to three times broader than found here (Table 2) and might be explained by a lack of directional control due to recording with a single hydrophone, in addition to the unnatural setting where the animal echolocates in a small concrete pool in which it may use even lower SL and hence Fc.
When DIs and equivalent piston radii are estimated from array data, a flat piston model is often fitted to how click levels taper off with increasing off-axis angle, but for some of the data recorded here, that criterion is not supported because of multiple amplitude peaks (Fig. 5B). All planar transducers have side lobes that may be in the order of −20 dB relative to the on-axis signal for toothed whale clicks (Au, 1993); however, for some clicks in this study, the additional lobe amplitudes were within −1 dB relative to the main lobe. Further studies using more sophisticated arrays are needed to confirm whether the unexpected finding of double-lobed beam patterns is truly biological in origin or is merely an artefact of the physical environment. A beam pattern having more than one main lobe has previously been reported for the Ganges river dolphin (Pilleri, 1979), but was not found in a more recent study on this species (Jensen et al., 2013), leaving the existence of double-lobed beam patterns controversial.
For the clicks with single-lobed beam patterns, it seems that greater localisation range corresponds with slightly narrower beamwidth, higher DI and larger EPR (Table 2). This fits well with the notion that higher SLs generally are measured at longer ranges (Fig. 3A) (Jensen et al., 2009) since higher SLs are to be predicted when the sound beam is more focused. Nevertheless, the tight relations between DI, SL and frequency make it challenging to pinpoint the primary mode by which toothed whales make beam pattern adjustments. As shown in Fig. 4, botos are able to emit highly varying beam patterns, which are likely to be under acute control by these animals. Since the correlation between SLpp and Fc was rather low in this study, with an R2 of just 0.11, potential beam focusing might be achieved primarily through a fourth factor, namely conformation changes of the melon (Wisniewska et al., 2015). Melon dynamics might also relate to the sudden widening of the half-power beamwidth by about 50% when localisation range becomes less than 3 m. Such beam changes are comparable to the adjustments seen for porpoises initiating the buzz phase (Wisniewska et al., 2015) or Atlantic spotted dolphins focusing on a recording array (Jensen et al., 2015). It may be that botos dynamically change the beamwidths during close-up inspection of the array; however, at such short ranges, there is a high risk that animals are being recorded off-axis, which results in underestimated DIs, thus calling for caution when interpreting beam patterns. Assuming that the observations do reflect a variable beam, it may be advantageous for botos to employ beam adjustments when navigating densely vegetated habitats, when moving between dense and open areas, and when tracking prey in a complex auditory scene. 
Studying the active control and dynamics of the conspicuous melon that botos possess in parallel with the source parameters of their echolocation clicks might be very fruitful for the understanding of toothed whale echolocation.
Here, we have shown that Amazon river dolphins in the wild use a short-range biosonar in shallow water environments characterised by high levels of clutter and reverberation. Their biosonar system is characterised by a high frequency relative to the source level, resulting in a sonar beam of comparable directivity to those achieved by marine delphinids despite operating at high repetition rates and low output levels. We argue that low-amplitude, highly directional biosonar systems are advantageous for toothed whales in riverine habitats since these parameters serve to simplify the auditory scene and facilitate target detection and discrimination in complex, cluttered environments. These findings suggest that habitat, in addition to size, may play an important role in the evolution of toothed whale echolocation.
MATERIALS AND METHODS
Study area and animals
Recordings were carried out in the vicinity of São Tomé, Amazon, Brazil (3°6′0″S, 60°29′40″W) on 15–18 October 2013, at the confluence between Rio Negro and Rio Solimões (3°8′0″S, 59°54′0″W) on 20 October 2013 and in the Mamirauá Sustainable Development Reserve, Amazon, Brazil (3°7′45″S, 64°47′20″W) on 22–27 October 2013. São Tomé is located on the Rio Negro, which carries black water rich in humic acids, whereas Rio Solimões, which also drains the Mamirauá Sustainable Development Reserve, carries white water rich in sediment. Botos (Inia geoffrensis Blainville 1817) were recorded from small aluminium-hulled boats using a linear recording array deployed vertically after the boat had been driven slowly (1–2 knots) to a position some 10–100 m ahead of the animals. Sound speed was estimated to be 1512 m s−1 using the Medwin equation (Medwin, 1975) with a mean measured water temperature of 31°C, an animal depth of 5 m and a salinity of 62 ppm (Gibbs, 1972).
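The sound speed estimate can be checked with a short calculation. The following is a minimal sketch, assuming the simple Medwin (1975) equation with temperature in °C, salinity in parts per thousand and depth in metres; the function name is hypothetical, and converting 62 ppm to 0.062 ppt is our assumption about how the salinity value enters the equation.

```python
def medwin_sound_speed(temp_c, salinity_ppt, depth_m):
    """Sound speed in water (m/s) from Medwin's (1975) simple equation."""
    t = temp_c
    return (1449.2 + 4.6 * t - 0.055 * t**2 + 0.00029 * t**3
            + (1.34 - 0.010 * t) * (salinity_ppt - 35.0)
            + 0.016 * depth_m)

# Values from the text; 62 ppm is taken as 0.062 parts per thousand (assumption).
c = medwin_sound_speed(temp_c=31.0, salinity_ppt=62 / 1000, depth_m=5.0)
print(f"Estimated sound speed: {c:.0f} m/s")
```

With these inputs the result rounds to the 1512 m s−1 quoted above.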
The linear recording array consisted of seven Neptune Sonar D/140 spherical hydrophones (Neptune Sonar Ltd., Kelk, UK) with a nominal sensitivity of −210 dB re. 1 V µPa−1. All hydrophones were attached 60 cm apart through breakouts on the same 14 m, 16-wire cable of 8 mm diameter (Cortland Cable Company, Cortland, NY). Plexiglas cylinders (90×32 mm) filled with polyurethane encased each breakout, with the hydrophone elements suspended 50 mm below the cylinders, parallel to the cable. The array was attached to a buoy with the first hydrophone placed at 1 m depth below the surface. A 3 kg weight was attached at the end of the array, 40 cm below the lowest hydrophone, to keep the array as linear as possible. The hydrophone cable was connected to a custom-built 20 dB amplifier and filter box with high- (1 kHz, 1 pole) and low-pass (200 kHz, 4 pole) filters. From there, signals were relayed to an eight-channel analogue-to-digital converter (USB-6356, National Instruments, TX, USA), which sampled at 500 kHz with 16-bit resolution, controlled by a custom-written recording program (LabView, Metrotech, Denmark). Recordings were saved onto the hard drive of a laptop in WAVE file format, with continuous recordings being divided into files of 30 s duration. Hydrophones were calibrated against a TC-4034 hydrophone (Teledyne RESON A/S, Slangerup, Denmark) and recordings were corrected for the hydrophone resonance frequency at 160 kHz to provide a flat frequency response (±2 dB) in the range of 2–180 kHz. The entire recording chain had a clipping level of 184 dB re. 1 µPa.
On-axis click criteria
An initial screening was carried out in Adobe Audition 3 (Adobe Systems, CA, USA) to identify files containing echolocation clicks. Files were selected for further analysis if they contained clicks with received levels of >154 dB re. 1 µPa (peak), i.e. 30 dB below clipping level. This threshold was selected to standardise the screening process after initial exploratory analysis of random echolocation clicks. Recordings were analysed using custom-written scripts in Matlab 7.5 (MathWorks, Natick, MA, USA). To be accepted as an on-axis click, a click had to fulfil a set of criteria following Kyhn et al. (2010): (1) it had to be part of a click series of at least five consecutive clicks with received levels exceeding the 154 dB re. 1 µPa (peak) threshold and where received levels increased and then decreased within the click series. An unknown number of weaker on-axis clicks are therefore likely to have been ignored, which places a range-dependent lower bound on SLpp of the form x dB re. 1 µPa+20log(range). If two or more click series overlapped in time, then no on-axis clicks were selected. Click series overlap was readily identified in a given time window by inspecting ICI and received level differences between detected clicks. (2) Within a click series, the click with the highest received level was chosen, since this click was assumed most likely to have been on-axis within the horizontal plane. (3) The highest received level had to have been recorded on one of the five middle hydrophones, so that the angle of incidence in the vertical plane could be estimated. (4) Click localisation had to be robust, i.e. with intersecting hyperbolas (see next section), and within the confident localisation range determined by calibration measurements.
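Criteria (1) to (3) can be sketched as a simple selection routine. This is an illustrative sketch only, not the Matlab scripts used in the study: the function name and data layout are hypothetical, the input is assumed to be a single non-overlapping click series, and "increased and then decreased" is interpreted here as the loudest click not being the first or last in the series. The overlap check and criterion (4) are not covered.

```python
THRESHOLD_DB = 154.0               # received-level threshold, dB re. 1 µPa (peak)
MIDDLE_HYDROPHONES = range(1, 6)   # zero-based indices of the five middle channels

def pick_on_axis(levels_db):
    """levels_db: one row per click, each row holding 7 received levels (one per hydrophone).

    Returns the index of the candidate on-axis click, or None if a criterion fails.
    """
    peaks = [max(row) for row in levels_db]
    # Criterion 1: at least five consecutive clicks, all above threshold
    if len(peaks) < 5 or any(p <= THRESHOLD_DB for p in peaks):
        return None
    # Criterion 2: take the click with the highest received level
    best = max(range(len(peaks)), key=peaks.__getitem__)
    # Criterion 1 (continued): levels must rise and then fall (assumed interpretation)
    if best == 0 or best == len(peaks) - 1:
        return None
    # Criterion 3: highest level recorded on one of the five middle hydrophones
    channel = levels_db[best].index(peaks[best])
    if channel not in MIDDLE_HYDROPHONES:
        return None
    return best

# Hypothetical click series: levels peak on the third click, on hydrophone index 3
series = [
    [150, 152, 154, 156, 154, 152, 150],
    [153, 155, 158, 160, 158, 155, 153],
    [156, 159, 162, 165, 162, 159, 156],
    [153, 155, 158, 161, 158, 155, 153],
    [150, 152, 154, 157, 154, 152, 150],
]
on_axis_index = pick_on_axis(series)
```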
The time-of-arrival differences (TOADs) between the seven hydrophones were estimated for each click via cross-correlation of the seven recorded signals. For each hydrophone pair, it was then possible to calculate a hyperbola that described the possible locations of an animal given the TOAD. For a seven-hydrophone array, a total of six independent hyperbolas could be calculated, and from their crossing points the animal's location was estimated in two dimensions by applying a least-squares method following Wahlberg et al. (2001) and Madsen and Wahlberg (2007). Ranging calibration of the array was done in Aarhus Harbour, Denmark, at distances of 10 to 60 m with an HS70 hydrophone (Sonar Research and Development Ltd, Beverly, UK) acting as a transducer and playing out two-cycle pulses at 80 kHz as specified by a connected waveform generator (model 33220A, Agilent Technologies, CA, USA). The localisation calibration of the hydrophone array yielded an error of less than 2 dB for the transmission loss estimate out to a range of 40 m, which is in line with accepted localisation errors in previous studies (Kyhn et al., 2009; Jensen et al., 2013).
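The localisation principle can be illustrated with a small sketch. It estimates a TOAD by brute-force cross-correlation and then finds the (range, depth) position that minimises the squared TOAD residuals. Note that a plain grid search is swapped in here for the least-squares hyperbola method of Wahlberg et al. (2001); the hydrophone depths follow the array geometry given above, while the function names, grid limits and test source are hypothetical.

```python
import math

C = 1512.0                                   # sound speed (m/s), from the text
DEPTHS = [1.0 + 0.6 * i for i in range(7)]   # hydrophone depths (m): 1 m, then 60 cm spacing

def toad_samples(ref, sig):
    """Lag (in samples) at which sig best matches ref, found by cross-correlation."""
    n = len(ref)
    best_lag, best_val = 0, -math.inf
    for lag in range(-n + 1, n):
        val = sum(ref[i] * sig[i + lag] for i in range(n) if 0 <= i + lag < n)
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

def localise(toads_s, max_range=40.0, max_depth=10.0, step=0.1):
    """Grid search for the (range, depth) minimising squared TOAD residuals.

    toads_s holds six TOADs (s) of hydrophones 1..6 relative to hydrophone 0.
    """
    best = (None, None, math.inf)
    for ir in range(1, int(round(max_range / step)) + 1):
        r = ir * step
        for iz in range(int(round(max_depth / step)) + 1):
            z = iz * step
            d = [math.hypot(r, z - h) for h in DEPTHS]
            err = sum(((d[i] - d[0]) / C - toads_s[i - 1]) ** 2 for i in range(1, 7))
            if err < best[2]:
                best = (r, z, err)
    return best[0], best[1]

# TOAD demo: a synthetic pulse delayed by 5 samples
pulse = [math.exp(-((i - 20) / 3.0) ** 2) * math.sin(2 * math.pi * 0.2 * (i - 20))
         for i in range(64)]
delayed = [0.0] * 5 + pulse[:-5]
lag = toad_samples(pulse, delayed)

# Localisation demo: noise-free TOADs from a hypothetical source at 12 m range, 3 m depth
dist = [math.hypot(12.0, 3.0 - h) for h in DEPTHS]
toads = [(dist[i] - dist[0]) / C for i in range(1, 7)]
est_range, est_depth = localise(toads)
```

With real recordings, the lag in samples would be divided by the 500 kHz sampling rate to obtain the TOAD in seconds, and measurement noise would make the range estimate increasingly uncertain with distance.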
Source parameter estimation
Since recordings were done in shallow water, a 32-point Hann window centred on the peak of the signal envelope was applied to all signals to reduce the risk of reflections contributing substantially during parameter estimations. Signals were interpolated (Matlab interp function) by a factor of 10 in order to better estimate signal window length calculated as D duration defined by the −10 dB end points relative to the peak of the amplitude envelopes (Madsen, 2005; Madsen and Wahlberg, 2007). Received levels were calculated as peak-to-peak (pp) sound pressures, RMS pressures within the D duration and as energy flux density (EFD) calculated for each click as the sum of the squared sound pressure values within the D duration (Madsen, 2005; Madsen and Wahlberg, 2007). Corresponding SLs (on-axis levels at 1 m reference distance) were then calculated by adding estimated transmission loss to received level values. Transmission loss (dB re. 1 m) was estimated as the sum of spherical spreading (Urick, 1983) and frequency-dependent absorption loss where the absorption estimate of 0.0228 dB m−1 was based on an assumed Fc of 90 kHz and a water temperature of 31°C.
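The back-calculation from received level to source level can be written compactly. The sketch below assumes spherical spreading of 20log(range) plus the stated absorption of 0.0228 dB m−1; the function names and the example received level and range are hypothetical.

```python
import math

ALPHA_DB_PER_M = 0.0228   # absorption at an assumed Fc of 90 kHz and 31°C (from the text)

def transmission_loss_db(range_m):
    """Spherical spreading plus frequency-dependent absorption (dB re. 1 m)."""
    return 20.0 * math.log10(range_m) + ALPHA_DB_PER_M * range_m

def source_level_db(received_level_db, range_m):
    """Back-calculate the on-axis level at 1 m from a received level."""
    return received_level_db + transmission_loss_db(range_m)

# Hypothetical example: a click received at 160 dB re. 1 µPa (pp) from 20 m
tl = transmission_loss_db(20.0)
sl = source_level_db(160.0, 20.0)
print(f"TL = {tl:.1f} dB, SL = {sl:.1f} dB re. 1 µPa (pp)")
```

At these short ranges the absorption term is small compared with the spreading loss, so errors in the assumed Fc have little effect on the estimated SL.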
Spectral parameters were estimated by first applying a 32-point Hann window centred on the peak of the signal envelope to the raw signal. The power spectrum was then estimated as the squared magnitude of a 320-point fast Fourier transform (FFT) applied to the signal, resulting in a linearly interpolated spectral resolution of 1.56 kHz. Fp was calculated as the frequency of highest value in the power spectrum whereas Fc was calculated as the frequency that divides a spectrum into two halves of equal energy on a linear scale. Bandwidth was parameterised in three different ways. BW−3dB and BW−10dB were given by the two points around Fp in the power spectrum where the signal had dropped −3 or −10 dB, respectively. BWRMS was given by the standard deviation of a linear spectrum around Fc. The QRMS was calculated by dividing Fc by BWRMS, providing a measure of how resonant a click was. ICI values were calculated as the time from on-axis click to the previous click in a click series.
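The spectral definitions above can be sketched as follows. This is an illustrative standard-library implementation, not the authors' Matlab code: it assumes that the 320-point FFT is obtained by zero-padding the 32-point windowed signal at the 500 kHz sampling rate, uses a direct DFT for self-containment, and is tested on a synthetic 100 kHz tone burst rather than a real click. The function names are hypothetical.

```python
import cmath
import math

FS = 500_000.0   # sampling rate (Hz), from the text
NFFT = 320       # FFT length, giving FS / NFFT = 1562.5 Hz bin spacing

def hann(n):
    """Symmetric n-point Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def power_spectrum(x, nfft=NFFT):
    """One-sided power spectrum of x, zero-padded to nfft points (direct DFT)."""
    spec = []
    for k in range(nfft // 2 + 1):
        s = sum(v * cmath.exp(-2j * math.pi * k * n / nfft) for n, v in enumerate(x))
        spec.append(abs(s) ** 2)
    return spec

def spectral_parameters(click):
    """Return Fp, Fc, BWrms (Hz) and Qrms for a short click segment."""
    window = hann(len(click))
    p = power_spectrum([c * w for c, w in zip(click, window)])
    freqs = [k * FS / NFFT for k in range(len(p))]
    total = sum(p)
    fp = freqs[p.index(max(p))]           # peak frequency
    cum = 0.0
    for f, v in zip(freqs, p):            # Fc: splits the spectrum into equal-energy halves
        cum += v
        if cum >= total / 2:
            fc = f
            break
    # RMS bandwidth: spectral standard deviation around Fc on a linear scale
    bw_rms = math.sqrt(sum((f - fc) ** 2 * v for f, v in zip(freqs, p)) / total)
    return {"Fp": fp, "Fc": fc, "BWrms": bw_rms, "Qrms": fc / bw_rms}

# Synthetic 32-sample tone burst at 100 kHz (hypothetical test signal)
burst = [math.sin(2 * math.pi * 100_000.0 * n / FS) for n in range(32)]
params = spectral_parameters(burst)
```

For a real click, BW−3dB and BW−10dB would additionally be read off the same power spectrum as the widths around Fp at which the level has dropped by 3 and 10 dB.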
Beam pattern estimation
A composite beam pattern was estimated based on a model of a circular piston mounted in an infinite baffle which has previously been applied to describe radiation patterns of toothed whale echolocation clicks (Au, 1993; Beedholm and Møhl, 2006). First, the acoustic axis of each click was estimated based on the acoustic animal localisation and interpolation of received levels across all seven hydrophones. From the animal location, an off-axis angle could then be estimated to each individual hydrophone relative to the acoustic axis. Since the difference between off-axis angles at neighbouring hydrophones decreases with increasing distance, only on-axis clicks acoustically localised to less than 21 m were included in this part of the analysis. This stricter criterion was chosen because estimation of the angle of incidence to individual hydrophones is highly sensitive to localisation errors. For all on-axis clicks localised to less than 21 m, the received level at each of the seven hydrophones was back-calculated to estimate SL at 1 m and then normalised relative to the signal with highest back-calculated amplitude. Off-axis angles, together with normalised apparent SLs, were then used to estimate a composite vertical beam pattern through a single parametric fit in which piston diameters from 1 to 20 cm were tested in 0.01 cm increments (Kyhn et al., 2010). For each piston diameter tested, a goodness of fit was calculated as the sum of squared error between observed and predicted SLs. A best composite beam pattern was selected on the basis of the piston size that minimised the sum of squared error. Afterwards, a bootstrapping procedure was carried out to estimate confidence intervals of the composite beam pattern (Jensen et al., 2015): for each bootstrap, N clicks were drawn with replacement from the original pool of N on-axis clicks. The beam pattern was fitted as described above, resulting in a bootstrap estimate of the piston radius. 
A total of 2000 bootstrap estimates were created and the 95% bootstrap confidence intervals were calculated as the 2.5th and 97.5th percentiles of the resulting bootstrap distribution of estimated piston radius. The symmetric half-power beamwidth was then calculated for the estimated composite beam pattern and transmission DI was approximated as 20log(ka), where k is the wavenumber defined as 2π/λ and a is the piston radius (Urick, 1983; Zimmer et al., 2005).
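The piston fit can be illustrated with a simplified sketch. It makes several assumptions that differ from the analysis described above: the piston is evaluated at a single frequency (the mean Fc of 101.2 kHz) rather than for the broadband click, the diameter grid uses 0.1 cm steps instead of 0.01 cm to keep the example fast, the data are noise-free synthetic levels from a hypothetical 4.3 cm radius piston, and the bootstrap step is omitted. All function names are hypothetical.

```python
import math

def bessel_j1(x, n=200):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt, trapezoid rule."""
    h = math.pi / n
    # Endpoint terms cancel (cos(0) = 1, cos(pi) = -1), so only interior points contribute
    return h / math.pi * sum(math.cos(i * h - x * math.sin(i * h)) for i in range(1, n))

def piston_db(theta_rad, ka):
    """Single-frequency circular piston directivity (dB re. on-axis level)."""
    x = ka * math.sin(abs(theta_rad))
    if x < 1e-9:
        return 0.0
    amp = abs(2.0 * bessel_j1(x) / x)
    return 20.0 * math.log10(max(amp, 1e-6))   # floor avoids log of zero at nulls

def fit_diameter_cm(angles_rad, levels_db, k):
    """Grid search over piston diameters (1-20 cm) minimising the sum of squared error."""
    best_d, best_sse = None, math.inf
    for i in range(191):
        d_cm = 1.0 + 0.1 * i
        ka = k * d_cm / 200.0                  # radius in metres = d_cm / 100 / 2
        sse = sum((piston_db(t, ka) - lvl) ** 2 for t, lvl in zip(angles_rad, levels_db))
        if sse < best_sse:
            best_d, best_sse = d_cm, sse
    return best_d

def half_power_beamwidth_deg(ka):
    """Full -3 dB beamwidth, found by bisection within the main lobe."""
    lo, hi = 0.0, math.asin(min(1.0, 3.8 / ka))   # first null lies near ka*sin(theta) = 3.83
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if piston_db(mid, ka) > 10.0 * math.log10(0.5):
            lo = mid
        else:
            hi = mid
    return 2.0 * math.degrees(lo)

# Synthetic demonstration
k = 2 * math.pi * 101_200.0 / 1512.0             # wavenumber at mean Fc (rad/m)
angles = [math.radians(a) for a in range(0, 11)] # off-axis angles of 0-10 deg
levels = [piston_db(t, k * 0.043) for t in angles]
d_cm = fit_diameter_cm(angles, levels, k)
ka = k * d_cm / 200.0
di_db = 20.0 * math.log10(ka)                    # DI approximated as 20log(ka)
hpbw_deg = half_power_beamwidth_deg(ka)
```

In this single-frequency approximation, a piston radius of about 4.3 cm gives a DI close to 25 dB and a half-power beamwidth close to 10 deg, which is of the same order as the composite estimates reported for botos.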
Ambient noise recording
Ambient noise levels were recorded 4 km north-east of São Tomé (3°5′0″S, 60°28′0″W) on 16 October and in a minor tributary in Mamirauá Sustainable Development Reserve (3°6′0″S, 64°47′55″W) on 26 October. Botos were observed daily at both locations. A single SUDAR (Ocean Instruments, New Zealand) was deployed at a depth of 2 m, from a buoy at São Tomé and a few metres from a river bank at Mamirauá, recording for 32 h 43 min and 21 h 27 min, respectively. The SUDAR recorded with a sampling rate of 128 kHz and a clipping level of 169 dB re. 1 µPa. Third-octave levels (TOLs) were calculated for the entire duration of both recordings.
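A third-octave level can be obtained by integrating the noise power spectral density over each band. The sketch below assumes base-2 band edges at fc·2^(±1/6) and a one-sided PSD in µPa² Hz−1 on a uniform frequency grid; the function name and the flat test spectrum are hypothetical, and the study's own TOL processing is not specified in the text.

```python
import math

def third_octave_level_db(freqs_hz, psd, centre_hz):
    """TOL (dB re. 1 µPa) from a one-sided PSD (µPa^2/Hz) on a uniform frequency grid."""
    lower = centre_hz * 2 ** (-1 / 6)     # base-2 third-octave band edges (assumption)
    upper = centre_hz * 2 ** (1 / 6)
    df = freqs_hz[1] - freqs_hz[0]
    band_power = sum(p * df for f, p in zip(freqs_hz, psd) if lower <= f <= upper)
    return 10.0 * math.log10(band_power)

# Flat hypothetical spectrum of 1 µPa^2/Hz up to the 64 kHz Nyquist frequency
freqs = [float(f) for f in range(0, 64_001)]
psd = [1.0] * len(freqs)
tol_10k = third_octave_level_db(freqs, psd, 10_000.0)
```

For a flat spectrum, the TOL rises by about 1 dB per third-octave step because each band is wider than the one below it.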
This study was part of Projeto Boto, a cooperative agreement between the National Amazon Research Institute (INPA/MCTI) and the Mamirauá Sustainable Development Institute (MSDI-OS/MCTI). We wish to express our sincere gratitude to field assistants and locals in the Amazon for their dedicated help and support in making the logistics and field work a seamless operation. Thanks to Renata S. Sousa-Lima for facilitating field work and Kristian Beedholm for assistance with Matlab scripts and figures. Thanks to Mariana L. Melcón and Tom Akamatsu for most generously making their own data available for the final figure. Field work was carried out under permission SISBio-13462-5.
M.L., M.d.F. and P.T.M. designed experimental procedures. M.d.F. and P.T.M. built and calibrated the recording array. M.L., M.d.F., V.M.F.d.S. and P.T.M. carried out field work. M.L., F.H.J., M.d.F. and P.T.M. developed analytical methods. M.L., F.H.J., M.d.F., V.M.F.d.S. and P.T.M. drafted the manuscript.
Field work was funded by Danish National Research Council grants to P.T.M., Associação Amigos do Peixe Boi da Amazônia (AMPA) and Petrobras Ambiental grants to V.M.F.d.S., Augustinus Fonden grants to M.L. and a travelling fellowship awarded to M.d.F. by Journal of Experimental Biology. M.L. was funded by a PhD stipend from the Faculty of Science and Technology, Aarhus University, and National Research Council grants to P.T.M. F.H.J. was funded by a Carlsberg Foundation travel grant.
The authors declare no competing or financial interests.