SUMMARY
An evoked-potential audiogram was measured for an Indo-Pacific humpback dolphin (Sousa chinensis) living in the dolphinarium of Nanning Zoo, China. Rhythmic 20 ms pip trains composed of cosine-enveloped 0.25 ms tone pips at a pip rate of 1 kHz were presented as sound stimuli. The dolphin was trained to remain still at the water surface and to wear soft latex suction-cup EEG electrodes used to measure the animal's envelope-following evoked potentials to the sound stimuli. Responses to 1000 rhythmic 20 ms pip trains for each amplitude/frequency combination were averaged and analysed using a fast Fourier transform to obtain an evoked auditory response. The hearing threshold was defined as the zero crossing point of the response input–output function using linear regression. Fourteen frequencies ranging from 5.6 to 152 kHz were studied. The results showed that most of the thresholds were lower than 90 dB re. 1 μPa (r.m.s.), covering a frequency range from 11.2 to 128 kHz, and the lowest threshold of 47 dB was measured at 45 kHz. The audiogram, which is a function of hearing threshold versus stimulus carrier frequency, presented a U-shape with a region of high hearing sensitivity (within 20 dB of the lowest threshold) between approximately 20 and 120 kHz. At frequencies lower than this high-sensitivity region, thresholds increased at a rate of approximately 11 dB octave–1 up to 93 dB at 5.6 kHz. The thresholds at high frequencies above 108 kHz increased steeply at a rate of 130 dB octave–1 up to 127 dB at 152 kHz.
INTRODUCTION
Odontocete cetaceans (toothed whales, including dolphins and porpoises) evolved highly developed sound production systems and hearing capabilities (Au, 1993; Au et al., 2000; Nachtigall and Moore, 1988), which enable them to effectively navigate, sense and communicate within their three-dimensional and often vision-limited underwater environment. Hearing is considered to be a primary sensory modality in odontocete cetaceans to aid in navigation, orientation, foraging and communication (Au, 1993; Nachtigall and Moore, 1988; Richardson et al., 1995a; Richardson et al., 1995b). Since the first odontocete hearing was measured as a function of hearing threshold versus frequency of sound stimulus (i.e. an audiogram) in an Atlantic bottlenose dolphin, Tursiops truncatus (Johnson, 1967), audiograms of odontocete cetaceans have been measured using either psychophysical or evoked-potential methods in 16 species to date, including the harbour porpoise, Phocoena phocoena (Andersen, 1970; Kastelein et al., 2002), the killer whale, Orcinus orca (Hall and Johnson, 1972), the Amazon River dolphin, Inia geoffrensis (Jacobs and Hall, 1972), the beluga or white dolphin, Delphinapterus leucas (White et al., 1978), the Pacific bottlenose dolphin, Tursiops truncatus gilli (Ljungblad et al., 1982), the false killer whale, Pseudorca crassidens (Thomas et al., 1988), the Yangtze River dolphin, Lipotes vexillifer (Wang et al., 1992), Risso's dolphin, Grampus griseus (Nachtigall et al., 2005), the common dolphin, Delphinus delphis (Popov and Klishin, 1998), the tucuxi, Sotalia fluviatilis guianensis (Sauerland and Dehnhardt, 1998), the Yangtze finless porpoise, Neophocaena phocaenoides asiaeorientalis (Popov et al., 2005), Gervais' beaked whale, Mesoplodon europaeus (Cook et al., 2006; Finneran et al., 2009), the white-beaked dolphin, Lagenorhynchus albirostris (Nachtigall et al., 2008), the long-finned pilot whale, Globicephala melas (Pacini et al., 2010), and Blainville's beaked whale, Mesoplodon densirostris (Pacini et al., 2011). However, as there are more than 70 species of odontocete cetaceans, the species whose hearing sensitivity is entirely unknown remain an overwhelming majority. Some of these species, particularly those living in coastal or riverine ecosystems, are threatened by a wide variety of environmental factors, including climate change and anthropogenic activities. An example is the Indo-Pacific humpback dolphin [Sousa chinensis (Osbeck 1765), also called the Chinese white dolphin]. It is essential to obtain even basic information about the hearing of these animals to better understand their biology and ecology, and to guide effective conservation strategies such as mitigation of the potential effects of underwater noise, some of which fall under the term ‘noise pollution’.
The Indo-Pacific humpback dolphin is referred to as an inshore or nearshore species, and is discontinuously distributed throughout coastal waters of the Indo-Pacific oceans, from eastern Africa through the Arabian Sea, the Bay of Bengal, southern China, the Gulf of Thailand, Indonesia, to northern Australia (Corkeron et al., 1997; Jefferson and Leatherwood, 1997; Jefferson and Karczmarski, 2001). Because they inhabit shallow nearshore waters, humpback dolphins are particularly susceptible to human activities. The very significant recent increase in coastal development, which is related to economic growth in China and Southeast Asia, has resulted in the influence of human activities permeating underwater. In consequence, marine mammals are being confronted with habitat degradation and destruction, and with factors including noise pollution, harassment and overfishing of prey species (Jefferson and Hung, 2004). Recently, public knowledge of, and hence concern about, the possible effects of anthropogenic environmental noise, together with attempts to mitigate adverse effects on the humpback dolphin, have steadily grown within scientific and conservation communities (Würsig et al., 2000; Jefferson and Hung, 2004; Jefferson et al., 2009). However, in order to propose effective and scientifically based measures for noise mitigation and animal conservation, it is necessary to study the animals' hearing and the possible effects of environmental noise on it. Unfortunately, to date, nothing is known about the hearing sensitivity of the humpback dolphin.
To address this, in the present study, we measured the audiogram of a captive Indo-Pacific humpback dolphin by using an auditory evoked potential (AEP) method. This enabled measurement of key audiometric variables within a short time (typically in a few days during approximately 100 min of recordings) and without the lengthy training of the animals that is required in traditional behavioural techniques using psychophysical procedures (Supin et al., 2001; Nachtigall et al., 2000; Nachtigall et al., 2007). Previous studies of odontocete cetaceans have suggested that accuracy and precision of the audiograms were comparable when they were obtained via audiometric measurements using the AEP method or traditional behavioural techniques (Yuen et al., 2005; Houser and Finneran, 2006). The AEP method has been widely used for audiometry in odontocete cetaceans. It has been successfully used for audiogram investigation of odontocetes in captive conditions (Popov et al., 2005; Pacini et al., 2010), catch-and-release scenarios (Nachtigall et al., 2008), and even in the wild for stranded animals (Mann et al., 2010).
Previous audiometric investigations of odontocetes have usually used sinusoidally amplitude-modulated (SAM) signals as sound stimuli to provoke an AEP response (Supin et al., 2001; Nachtigall et al., 2007). The SAM stimuli evoked a rhythmic sequence of auditory brainstem responses (ABRs), i.e. envelope-following responses (EFRs), following the modulation rate of the SAM stimuli, which was chosen to be approximately 600 to 1000 Hz in odontocetes (Supin et al., 2001; Nachtigall et al., 2007; Mann et al., 2010). Although the SAM stimuli have many advantages and have yielded efficient and fairly reliable audiometric information in odontocetes, there is a noteworthy disadvantage, as demonstrated by Supin and Popov (Supin and Popov, 2007): the EFR evoked by the SAM stimuli with sound pressure levels (SPLs) within 20 dB of the hearing threshold was usually small and hardly visible. If measurements were made in an environment with high electrical background noise levels, the estimated hearing threshold could be in error by >30 dB (Supin and Popov, 2007). The low response amplitude at the near-threshold SPLs was attributed to the narrow frequency bandwidth of the SAM stimuli, which was ±600 to ±1000 Hz (at the half-level), corresponding to a modulation rate of 600 to 1000 Hz (Popov and Supin, 2001; Supin and Popov, 2007). An effective solution to the problem is to enlarge the frequency bandwidth of the stimuli, which could be achieved using rhythmic pip trains as the sound stimuli, with each pip appropriately shorter than the modulation period (Supin and Popov, 2007). In the present study, rhythmic pip trains, composed of 0.25 ms pips, were used as the sound stimuli to provoke the AEP response of the subject.
MATERIALS AND METHODS
Ethical statement
This research was conducted under China's Wildlife Protection Act, 1989, Implementation By-law on Aquatic Wildlife Conservation.
Subject
The experimental subject was a male Indo-Pacific humpback dolphin that was rescued from stranding on the coast of Beihai Bay, China (Fig. 1) in August 2007. The animal was transported to the dolphinarium of Nanning Zoo, Nanning, China (Fig. 1A) on 25 August 2007, approximately 1 week after the stranding, for further treatment and rehabilitation. Thanks to the great efforts of the veterinarians and other staff in the zoo, the animal's health became normal and stabilized within a few months. Subsequently, the dolphin was trained to perform in shows for the public. The hearing experiment and data collection were conducted between 18 and 22 December 2011. Prior to the experiment, the animal was trained (over a few days) to remain still at the water surface and wear soft latex suction cups in order to examine its hearing using the AEP method (Fig. 1B). During the time of the experiment the dolphin was fed four times per day with thawed small fish and also participated in two shows per day: 11:20–11:40 h and 15:00–15:20 h. Experimental sessions were conducted during the first and last feedings, 08:10–08:30 h and 17:10–17:30 h, respectively, well before or after the daily shows. The dolphin was approximately 2.25 m in length and 130 kg in mass, and was estimated to be 13 years old at the time of the study.
Experimental facility and background noise measurements
The hearing experiment was conducted in the main pool (Fig. 1B,C) of the dolphinarium, which was mainly used for dolphin training and shows. The pool was a kidney-shaped concrete structure 14 m in width, 30 m in length and 5 m in depth, and filled with seawater transported from nearby coastal waters. Background noise in the pool was measured during the experiment using a Reson TC-4013 hydrophone (–212 dB re. 1 V μPa–1; Reson, Slangerup, Denmark) with 50 dB gain within a frequency range of 0.1 to 200 kHz by an EC6081 pre-amplifier (VP2000; Reson). The amplified noise was input to a 16 bit analog-to-digital converter of a data acquisition card (NI USB-6251 BNC, National Instruments, Austin, TX, USA) and recorded by a standard laptop computer (PC) with a custom-made program designed using LabVIEW software (National Instruments) with a sampling rate of 512 kHz. The recorded noise was analysed and averaged using a customised MATLAB algorithm (MathWorks, Natick, MA, USA).
Experimental setup and sound stimuli presentation
The experimental setup is shown in Fig. 1C and the data flow chart is presented in Fig. 2. Each experimental session began with the primary trainer positioning the dolphin at the water surface parallel to the side of the pool and approximately 80 cm away from the pool wall. The dolphin was positioned in such a way that the dorsal fin and the dorsal surface of the head with the blowhole remained above the water surface (Fig. 1C), while the ‘acoustic windows’ located at the lower jaw and/or external auditory meatus of the subject, where the sounds were assumed to travel to the inner ear (Norris, 1968; Popov et al., 2008), were maintained underwater throughout the session. Three suction-cup electrodes were then attached to the back of the dolphin for AEP recording. Sound stimuli were presented using a Reson TC-4040 hydrophone as a projector, which was positioned at a distance of approximately 2 m and a depth of 50 cm in front of the subject's ‘acoustic windows’.
The sound stimuli were rhythmic pip trains composed of cosine-enveloped 0.25 ms tone pips with a 1 kHz pip rate and variable carrier frequency. Each pip train was 20 ms in duration followed by a silence of 30 ms so that the sound stimuli were presented at a rate of 20 s–1. The 1 kHz pip rate was chosen based on previous publications for other odontocetes (Supin et al., 2001; Popov et al., 2005; Nachtigall et al., 2007; Supin and Popov, 2007; Pacini et al., 2010) and a pre-established modulation rate transfer function of the experimental subject. The stimuli were digitally synthesised using a customised LabVIEW program at an update rate of 512 kHz, and the digital-to-analog conversion was accomplished by the NI USB-6251 BNC (National Instruments) data acquisition card connected to a laptop computer. The analogue signals were then attenuated by an HP-350D attenuator (Hewlett-Packard, Palo Alto, CA, USA) and amplified by an HP-465A power amplifier (Hewlett-Packard) to vary the signal amplitude. The signals were monitored by an oscilloscope (Tektronix TDS1002C, Beaverton, OR, USA) before being projected by the Reson TC-4040 hydrophone. SPLs (dB re. 1 μPa) of the projected sound stimuli were measured and calibrated as the root mean square (r.m.s.) of the whole pip train, including both the pips and inter-pip pauses (Supin and Popov, 2007), by positioning a calibrated receiving hydrophone at the same location as the animal's ‘acoustic windows’. Carrier frequencies varied from 5.6 to 152 kHz, separated by one-octave steps within a range of 5.6 to 22.5 kHz, half-octave steps within a range of 22.5 to 32 kHz, quarter-octave steps within a range of 32 to 128 kHz, and eighth-octave steps within a range of 128 to 152 kHz, which are (rounded to 0.1 kHz): 5.6, 11.2, 22.5, 32, 38, 45, 54, 64, 76, 90, 108, 128, 139 and 152 kHz.
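The octave spacing of the carrier frequencies can be checked with a short calculation. The sketch below is illustrative only: it assumes that the nominal values are conventional roundings of a geometric series anchored at 32 kHz (the anchor and the names NOMINAL_KHZ, OFFSETS_OCT and exact_carriers are assumptions, not taken from the original stimulus software).

```python
# Nominal carrier frequencies listed in the text (kHz).
NOMINAL_KHZ = [5.6, 11.2, 22.5, 32, 38, 45, 54, 64, 76, 90, 108, 128, 139, 152]

# Offsets in octaves from an ASSUMED 32 kHz anchor:
# one-octave and half-octave steps below 32 kHz, quarter-octave steps
# from 32 to 128 kHz, then two eighth-octave steps above 128 kHz.
OFFSETS_OCT = [-2.5, -1.5, -0.5] + [k / 4 for k in range(9)] + [2.125, 2.25]


def exact_carriers(anchor_khz=32.0):
    """Return the exact geometric series implied by the octave offsets."""
    return [anchor_khz * 2 ** offset for offset in OFFSETS_OCT]


exact = exact_carriers()
# Largest relative difference between the exact series and the nominal values.
max_rel_dev = max(abs(e - n) / n for e, n in zip(exact, NOMINAL_KHZ))

for nominal, value in zip(NOMINAL_KHZ, exact):
    print(f"nominal {nominal:6.1f} kHz   exact {value:7.2f} kHz")
```

Under this assumption, every nominal frequency lies within about 1% of the exact series, which is small relative to the stimulus bandwidth.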
The waveforms (left) and corresponding spectra (right) of the pip train segments with carrier frequencies of 5.6, 11.2, 45, 108 and 152 kHz are presented in Fig. 3 as examples of the received stimuli at the animal's ‘acoustic windows’. The frequencies of the received stimuli were fairly centered at the expected carrier frequencies even for the low-frequency stimuli, where the projector's transmitting sensitivity is relatively low (Fig. 3).
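A pip train of this kind can be sketched as follows. This is a minimal illustration, not the LabVIEW routine used in the experiment: the raised-cosine (Hann) form of the envelope, the restart of the carrier phase at each pip onset, and the function names are assumptions. The timing values (512 kHz update rate, 0.25 ms pips, 1 kHz pip rate, 20 ms train) are taken from the text.

```python
import math

FS = 512_000        # update rate, Hz (from the text)
PIP_RATE = 1_000    # pips per second (from the text)
PIP_DUR = 0.25e-3   # pip duration, s (from the text)
TRAIN_DUR = 20e-3   # pip train duration, s (from the text)


def pip_train(carrier_hz, amplitude=1.0):
    """Synthesise one 20 ms train of cosine-enveloped tone pips.

    The envelope is ASSUMED to be one cycle of a raised cosine; the
    carrier phase is ASSUMED to restart at each pip onset.
    """
    n_total = round(TRAIN_DUR * FS)
    n_pip = round(PIP_DUR * FS)
    period = FS // PIP_RATE          # samples between pip onsets
    train = [0.0] * n_total
    for start in range(0, n_total, period):
        for i in range(n_pip):
            env = 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n_pip))
            carrier = math.sin(2.0 * math.pi * carrier_hz * i / FS)
            train[start + i] = amplitude * env * carrier
    return train


def rms(samples):
    """Root mean square over the whole record, pauses included."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


train = pip_train(45_000)
rms_re_peak_db = 20.0 * math.log10(rms(train))
print(f"r.m.s. of whole train re. pip peak: {rms_re_peak_db:.1f} dB")
```

Because the r.m.s. value is computed over the pips and the inter-pip pauses together, it falls well below the pip peak amplitude; this is the convention that the SPL calibration in the text follows.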
AEP recording
The animal's AEP responses to the sound stimuli were picked up by three electroencephalography (EEG) electrodes (Grass Technologies, West Warwick, RI, USA): gold-plated disks 10 mm in diameter mounted within latex suction cups 60 mm in diameter. The recording electrode was attached with conductive gel to the dorsal head surface, located on the midline, approximately 5–7 cm behind the blowhole. The reference electrode was also attached to the animal's dorsal fin using conductive gel. The third EEG electrode acted as a grounding device and was positioned on the back of the animal between the recording and reference electrodes (Fig. 1C, Fig. 2). The AEP responses were conducted by shielded cables to an EEG amplifier (Grass CP511 AC Amplifier, Grass Technologies) and amplified 20,000 times within a frequency band of 300 to 3000 Hz. The amplified signal was monitored by the Tektronix TDS1002C oscilloscope and input to a 16 bit analog-to-digital converter of the same NI USB-6251 BNC data acquisition card that generated the synthesised sound stimuli (Fig. 2). The AEP response triggered by the sound stimulus onset was then digitised at a sampling rate of 25 kHz and transmitted to the laptop computer. To extract the AEP response from noise, AEPs were collected by averaging 1000 individual AEP records, which were 30 ms in duration, using the same customised LabVIEW program that synthesized the sound stimuli.
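The benefit of averaging 1000 records can be illustrated with a small simulation. The sketch below is hypothetical throughout: the response amplitude, the noise level and the assumption of independent Gaussian noise are chosen only for illustration, and only the 25 kHz sampling rate, the 30 ms record length and the count of 1000 records come from the text.

```python
import math
import random

FS_AEP = 25_000     # AEP sampling rate, Hz (from the text)
N_SAMPLES = 750     # 30 ms record at 25 kHz
N_SWEEPS = 1000     # records averaged per AEP (from the text)


def simulate_average(efr_amp_uv=0.1, noise_sd_uv=1.0, seed=0):
    """Average N_SWEEPS noisy records of a 1 kHz response.

    Amplitude and noise values are HYPOTHETICAL; noise is ASSUMED to be
    independent Gaussian noise in every sample and sweep.
    """
    rng = random.Random(seed)
    signal = [efr_amp_uv * math.sin(2.0 * math.pi * 1000.0 * n / FS_AEP)
              for n in range(N_SAMPLES)]
    acc = [0.0] * N_SAMPLES
    for _ in range(N_SWEEPS):
        for n in range(N_SAMPLES):
            acc[n] += signal[n] + rng.gauss(0.0, noise_sd_uv)
    avg = [a / N_SWEEPS for a in acc]
    residual = [a - s for a, s in zip(avg, signal)]
    residual_rms = math.sqrt(sum(r * r for r in residual) / N_SAMPLES)
    return avg, residual_rms


avg, residual_rms = simulate_average()
print(f"residual noise after averaging: {residual_rms:.4f} uV r.m.s.")
```

With uncorrelated noise, the residual noise falls roughly as the square root of the number of records, i.e. by a factor of about 32 (approximately 30 dB) for 1000 records, whereas the stimulus-locked response is preserved.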
Hearing threshold determination
To estimate a hearing threshold for each carrier frequency, typically six to nine AEP records with a series of stimulus SPLs were recorded and measured. The initial stimulus SPL for each frequency was chosen based on previously published audiograms of other odontocetes (Supin et al., 2001; Popov et al., 2005; Nachtigall et al., 2007) and was usually 20–40 dB higher than the estimated threshold. The stimulus presentation level was then attenuated in 5–10 dB steps until no evoked potential was observed. For each frequency and stimulus SPL, a 15 ms (375 point) window of the EFR to the rhythmic sound stimulus, from 5 to 20 ms in the AEP record, was fast Fourier transformed (FFT) to obtain a frequency spectrum. The magnitude at 1 kHz in the spectrum was used to estimate the response of the subject to the sound stimulus. For each frequency, the magnitudes at 1 kHz were measured and plotted as a function of stimulus SPL, and the near-threshold portion of the plot was approximated by a linear regression line (Supin et al., 2001; Nachtigall et al., 2007). The SPL at which the regression line crossed zero response amplitude (the zero crossing point of the response input–output function) was adopted as the threshold estimate.
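The two analysis steps described above can be sketched as follows. This is an illustration under stated assumptions, not the original analysis code: a single-bin discrete Fourier transform stands in for the full FFT, the r.m.s. scaling of the magnitude is a choice made here, and the input–output points are synthetic values placed on a line with the slope (0.0131 μV dB–1) and threshold (62 dB) reported for the 108 kHz stimulus, not measured data.

```python
import cmath
import math


def magnitude_at(record, fs_hz, f_hz, t_start_s=0.005, t_stop_s=0.020):
    """R.m.s. magnitude of the component at f_hz in the 5-20 ms window.

    A single-bin DFT is used here in place of a full FFT; at 25 kHz the
    window holds 375 points, so 1 kHz falls exactly on bin 15.
    """
    i0 = round(t_start_s * fs_hz)
    i1 = round(t_stop_s * fs_hz)
    window = record[i0:i1]
    n = len(window)
    acc = sum(x * cmath.exp(-2j * math.pi * f_hz * k / fs_hz)
              for k, x in enumerate(window))
    return math.sqrt(2.0) * abs(acc) / n


def zero_crossing_threshold(spl_db, amp_uv):
    """Least-squares line through (SPL, amplitude); return its zero crossing."""
    n = len(spl_db)
    mean_x = sum(spl_db) / n
    mean_y = sum(amp_uv) / n
    sxx = sum((x - mean_x) ** 2 for x in spl_db)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(spl_db, amp_uv))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope, slope


# SYNTHETIC near-threshold points (not measured data), built from the
# slope and threshold reported in the text for the 108 kHz stimulus.
SPL_DB = [66, 71, 76, 81, 86, 91, 96, 101]
AMP_UV = [0.0131 * (spl - 62) for spl in SPL_DB]

threshold_db, slope_uv_per_db = zero_crossing_threshold(SPL_DB, AMP_UV)
print(f"threshold = {threshold_db:.1f} dB, slope = {slope_uv_per_db:.4f} uV/dB")
```

In practice, only the points up to the inflection of the input–output function would be passed to the regression, as the extrapolated zero crossing is sensitive to points taken from the flatter, higher-level branch.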
Vocalisation recording of the subject
For comparison of the frequency range between hearing and biologically produced sounds, vocalisations by the experimental subject freely swimming (alone) in the main pool were recorded before or after the hearing study sessions. The animal's vocalisations were recorded in the same way as noise recording described above. In outline, the sound was picked up by the Reson TC-4013 hydrophone with 50 dB gain within a frequency range of 0.1 to 200 kHz by the EC6081 preamplifier. The amplified sounds were input to a 16 bit analog-to-digital converter of the NI USB-6251 BNC data acquisition card and recorded by a standard PC with a custom-made LabVIEW program at a sampling rate of 512 kHz. The recorded sounds were analysed using a customised MATLAB algorithm.
RESULTS
AEP response and hearing audiogram
Each hearing experimental session lasted approximately 20 min, with 10–15 AEP records being collected. Examples of the recorded AEP responses to the rhythmic sound stimulus (0.25 ms tone pips with carrier frequency of 108 kHz) are presented in Fig. 4A. The stimulus SPL (dB re. 1 μPa) calibrated near the animal's ‘acoustic windows’ is indicated with the corresponding AEP response. The zero point of the time scale in Fig. 4A corresponds to the time point when the sound stimulus was projected and the AEP recording was triggered. The tone pips evoked a sequence of evoked potentials following the 1 kHz pip rate, which was the EFR. The EFR showed a temporal lag of approximately 3–4 ms relative to both the onset and offset of the sound stimulus. Both the sound transmission time from the projector to the animal's ‘acoustic windows’, which was approximately 1.3 ms for a 2 m distance (Fig. 2), and the 2–3 ms latency of the evoked potential following presentation of the stimulus contributed to this lag. The latter served as a predictable electrophysiological feature, confirming that the recorded AEP was a direct response to the sound stimulus and not an artifact. The EFRs were discernible well above the electrical noise level even at near-threshold SPLs (Fig. 4A). As the stimulus SPL decreased, the EFR magnitude decreased correspondingly until the response disappeared in noise (Fig. 4A).
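The two contributions to the response lag can be added explicitly. The short calculation below assumes a nominal sound speed of 1500 m s–1 in seawater, a value that is not stated in the text; the variable names are illustrative.

```python
SOUND_SPEED_M_S = 1500.0   # ASSUMED nominal sound speed in seawater
DISTANCE_M = 2.0           # projector-to-animal distance (from the text)

# Acoustic travel time from the projector to the 'acoustic windows'.
travel_ms = DISTANCE_M / SOUND_SPEED_M_S * 1e3

# Physiological latency range of the evoked potential (from the text).
neural_latency_ms = (2.0, 3.0)

# Total expected lag between stimulus onset and EFR onset.
total_lag_ms = tuple(travel_ms + latency for latency in neural_latency_ms)
print(f"travel time {travel_ms:.2f} ms, total lag "
      f"{total_lag_ms[0]:.1f}-{total_lag_ms[1]:.1f} ms")
```

The resulting range is consistent with the lag of approximately 3–4 ms seen in the records.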
The frequency spectrum of the corresponding AEP response between 5 and 20 ms, which contained a major part of the EFR record but did not contain the latency and the initial transient part of the response, was calculated by FFT and is presented in Fig. 4B. The consistent peak at 1 kHz reflected the animal's EFR, and thus the neurophysiological ‘following’ of the carrier tone pips at the 1 kHz pip rate. The amplitude of the AEP response was reflected in the magnitude of the peak at 1 kHz in the spectrum. As the stimulus level was attenuated, the peak magnitude of the response decreased correspondingly. Fig. 4B shows that at the stimulus SPL of 66 dB, the peak of the response spectrum was comparable to the electrical noise level, which was typically lower than 0.04 μV r.m.s. at 1 kHz in the spectrum in the present experimental condition. The peak magnitude of each spectrum at 1 kHz was measured as an estimate of the EFR amplitude and plotted as a function of stimulus SPL. Examples for the stimuli with carrier frequencies of 45 and 108 kHz are presented in Fig. 5. The functions of EFR amplitude versus stimulus SPL in Fig. 5 showed that near threshold, with stimulus SPL from 49 to 59 dB and 66 to 101 dB for the 45 and 108 kHz stimuli, respectively, the EFR amplitude increased fairly steeply; an inflection point appeared at a stimulus SPL of 59 dB and 101 dB for the 45 and 108 kHz stimuli, respectively, after which the EFR amplitude increased at a reduced rate. In determining hearing threshold, the near-threshold portion of the plot, up to the inflection point, was approximated by a linear regression line (Fig. 5). In most cases, the linear regression was satisfactory over a near-threshold range of 20–45 dB (35 dB for the 108 kHz stimulus shown in Fig. 5), with a high r²-value, typically from 0.96 to 1. The slope of the linear regression line was typically between 0.01 and 0.02 μV dB–1.
The theoretical zero-response SPL of the regression line was adopted as the hearing threshold for the corresponding carrier frequency, which was 47 and 62 dB in the examples shown in Fig. 5 for carrier frequencies of 45 and 108 kHz, respectively. Hearing thresholds determined for each of the 14 examined carrier frequencies ranging from 5.6 to 152 kHz are presented in Table 1.
The resulting audiogram, which is a function of hearing threshold versus stimulus carrier frequency, is shown in Fig. 6. The spectrum density of the pool background noise (means ± s.d.; dB re. 1 μPa² Hz–1), which was calculated by FFT of 10 ms noise windows for each sample and averaged over 1000 samples, is also shown. The spectrum density indicated that the experimental pool was acoustically quiet, with a background noise level of less than 50 dB for all the examined frequencies and even lower than 40 dB for frequencies higher than 45 kHz. This quiet environment provided an excellent opportunity for hearing threshold measurement. The audiogram demonstrated that most of the thresholds were lower than 90 dB, covering the frequency range from 11.2 to 128 kHz, and the lowest threshold of 47 dB was measured at 45 kHz. The audiogram presented a U-shape with a region of high hearing sensitivity with thresholds below approximately 70 dB (within 20 dB of the lowest threshold) between approximately 20 and 120 kHz. For frequencies lower than this high-sensitivity region, thresholds increased at a rate of approximately 11 dB octave–1 up to 93 dB at 5.6 kHz. The thresholds at high frequencies above 108 kHz increased steeply at a rate of 130 dB octave–1 up to 127 dB at 152 kHz. Within the high-sensitivity region, there was a plateau at 64–76 kHz between the two regions of highest sensitivity at 32–54 and 90–108 kHz.
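The high-frequency roll-off rate follows directly from the two end-point thresholds quoted in the text (62 dB at 108 kHz and 127 dB at 152 kHz). The helper name below is illustrative.

```python
import math


def slope_db_per_octave(f1_khz, thr1_db, f2_khz, thr2_db):
    """Threshold change per octave between two audiogram points."""
    return (thr2_db - thr1_db) / math.log2(f2_khz / f1_khz)


# End points of the high-frequency branch, as quoted in the text.
high_slope = slope_db_per_octave(108, 62, 152, 127)
print(f"high-frequency roll-off: {high_slope:.0f} dB per octave")
```

The 65 dB rise is spread over slightly less than half an octave, which gives a rate close to the 130 dB octave–1 quoted above.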
Vocalisation of the subject
Three sessions of sound recordings were conducted just before or after the hearing experimental session, when the dolphin was swimming freely and alone in the pool. Each session lasted approximately 10 min. Most of the time, the dolphin was acoustically silent. Occasionally, the animal produced a short click train consisting of single clicks, probably exploring the experimental equipment deployed underwater. No ‘whistles’ were detected. Examples of two click trains are presented in Fig. 7. The waveform and power spectrum of one of the clicks from the click train shown in Fig. 7A demonstrate that the clicks possess the high-frequency (peak frequency >100 kHz) and short-duration (<50 μs) characteristics typical of odontocete cetacean echolocation clicks. However, the clicks from the click train shown in Fig. 7B had peak frequencies lower than 15 kHz and were of relatively long duration. Three examples of click waveforms and corresponding spectra from the click train shown in Fig. 7B indicate that the click waveform and spectrum changed from click to click.
DISCUSSION
Instead of using SAM signals, the present study used rhythmic pip trains with 0.25 ms pips, shorter than the 1 ms period of the 1 kHz modulation rate, as the sound stimuli. Previous work (Supin and Popov, 2007) has demonstrated that rhythmic pip trains composed of appropriately short tone pips as sound stimuli were capable of achieving a more reliable and confident estimation of the hearing threshold and thus audiogram measurement relative to the SAM sound stimuli. Our measurements showed that the present sound stimuli provoked robust AEP responses even in the near-threshold range (Figs 4, 5), which is ideal for reliable estimation of the hearing threshold. The robust AEP responses were presumably attributable to the wide bandwidth of the rhythmic pip trains composed of 0.25 ms pips with a 1 kHz modulation rate. In theory, the half-level (i.e. 6 dB) frequency bandwidth of the present stimuli in the spectra is ±4 kHz (1/0.25 ms), wider than that of the SAM stimuli with a 1 kHz modulation rate, which is ±1 kHz. The 6 dB frequency bandwidth of approximately ±4 kHz was confirmed by monitoring and measuring the sound stimuli produced by the projector in the experimental pool (Fig. 3). Although shorter pips with a wider stimulus spectrum provoke higher AEP response amplitudes and result in steeper amplitude dependence on stimulus SPL in the near-threshold range, and thus more precise hearing threshold estimation (Supin et al., 2001; Supin and Popov, 2007), the wider the spectrum, the more ambiguous the attribution of the estimated threshold to a certain carrier frequency. In the present study, for the stimuli with carrier frequencies higher than 45 kHz, the approximately ±4 kHz frequency bandwidth may be considered narrow enough to distinguish one stimulus from another.
At the low frequency range, although the carrier frequencies were selected in frequency steps of one octave (from 5.6 to 22.5 kHz) or a half-octave (from 22.5 to 32 kHz), the spectra of the stimuli still slightly overlapped each other (see Fig. 3F,G). However, previous direct measurements indicated that the hearing thresholds of dolphins, estimated based on stimulus SPL in long-term r.m.s. (computed throughout the stimulus duration, including both the pips and inter-pip pauses), were almost independent of pip duration and thus the stimulus spectrum bandwidth (Supin and Popov, 2007). Supin and Popov (Supin and Popov, 2007) also demonstrated that the audiogram measured using a rhythmic pip train with 0.25 ms pips, shorter than the modulation period, as the stimulus was comparable to but less scattered than that measured using SAM stimuli in an ideal background noise environment. Assuming that the present subject had an auditory mechanism comparable to that of the dolphin measured in Supin and Popov's study, we would not expect the present sound stimuli to have introduced obvious ambiguity into the threshold estimation for a certain carrier frequency, even at the low frequency range.
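The theoretical ±4 kHz half-level bandwidth discussed above can be reproduced numerically from the envelope of a single pip. The sketch below assumes a raised-cosine (Hann) envelope of 0.25 ms duration, which is one reading of ‘cosine-enveloped’; the function names and the 50 Hz search step are illustrative choices.

```python
import cmath
import math

FS = 512_000       # update rate, Hz (from the text)
PIP_DUR = 0.25e-3  # pip duration, s (from the text)


def envelope_spectrum_ratio(offset_hz):
    """Envelope spectrum magnitude at offset_hz, relative to its peak.

    The envelope is ASSUMED to be one cycle of a raised cosine. Offsets
    are measured from the carrier frequency, where the peak lies.
    """
    n = round(PIP_DUR * FS)
    env = [0.5 * (1.0 - math.cos(2.0 * math.pi * i / n)) for i in range(n)]
    peak = sum(env)
    acc = sum(e * cmath.exp(-2j * math.pi * offset_hz * i / FS)
              for i, e in enumerate(env))
    return abs(acc) / peak


def half_level_offset(step_hz=50):
    """Smallest offset at which the spectrum falls to half its peak (-6 dB)."""
    offset = 0
    while envelope_spectrum_ratio(offset) > 0.5 + 1e-9:
        offset += step_hz
    return offset


half_width_hz = half_level_offset()
print(f"half-level bandwidth: +/-{half_width_hz / 1000:.1f} kHz")
```

For this envelope, the spectrum falls to half of its peak at an offset equal to the reciprocal of the pip duration, in agreement with the 1/0.25 ms estimate; the 1 kHz pip rate then samples this envelope as spectral lines spaced 1 kHz apart.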
For most of the tested carrier frequencies, the EFR amplitude increased steeply with the stimulus SPL over a near-threshold range of 20–45 dB. The functions of EFR amplitude versus stimulus SPL within the near-threshold range were approximated by linear regression lines with high r²-values and fairly steep slopes, typically between 0.01 and 0.02 μV dB–1. The examples in Fig. 5 show that for the stimulus with a carrier frequency of 108 kHz, the EFR amplitude increased with the stimulus SPL at a rate of 0.0131 μV dB–1 within 35 dB of the near-threshold range up to the stimulus SPL of approximately 100 dB, after which the rate reduced. However, for the stimulus with a carrier frequency of 45 kHz, the EFR amplitude increased steeply with the stimulus SPL only within a relatively short range of the stimulus SPL, which was approximately 10 dB in the near-threshold range. The reason is unclear. Nevertheless, the function of EFR amplitude versus stimulus SPL featured two to three branches composed of a steeply rising near-threshold branch, a quasi-horizontal branch and/or an oblique branch, a typical form of EFR amplitude dependence on stimulus SPL (Supin et al., 2001; Supin and Popov, 2007).
The U-shaped audiogram shown in Fig. 6 indicates that the studied dolphin had hearing abilities similar to those of many other odontocete species (Au et al., 2000; Supin et al., 2001). The main characteristic was that the animal had low hearing thresholds, below approximately 70 dB, for sound stimuli with carrier frequencies in a wide band from approximately 20 to 120 kHz. The lowest threshold of 47 dB at 45 kHz represented a fairly low threshold measured by the evoked-potential method (Supin et al., 2001; Popov et al., 2005). However, it is comparable to the lowest thresholds in the hearing investigations of other odontocetes acquired by the same evoked-potential method, which were commonly close to or lower than 50 dB (Popov et al., 2005; Nachtigall et al., 2008; Pacini et al., 2010; Pacini et al., 2011). Such low thresholds indicated that the background noise environment in the present study was suitable for measurement of hearing and that masking effects on the animal's hearing were negligible. The steep increase in hearing thresholds at high frequencies above 108 kHz suggested that the hearing cut-off frequency of the investigated animal was between approximately 110 and 130 kHz, similar to most investigated odontocetes (Au et al., 2000; Supin et al., 2001). Although in all investigated odontocetes the hearing cut-off frequency is higher than or, in a few cases (Pacini et al., 2010; Pacini et al., 2011), close to 100 kHz, in many species it does not exceed 120–130 kHz (Au et al., 2000; Supin et al., 2001). However, in some species, such as the harbour porpoise (Andersen, 1970; Kastelein et al., 2002), the Yangtze finless porpoise (Popov et al., 2005) and the white-beaked dolphin (Nachtigall et al., 2008), the hearing cut-off frequencies were greater than 150 kHz.
The extra high cut-off frequency in the hearing of the porpoise family and the white-beaked dolphin might be related to the high-frequency clicks they transmit, assuming that the animals could hear the sounds they produce. Both the harbour porpoise (Villadsgaard et al., 2007) and the Yangtze finless porpoise (Li et al., 2005) produce echolocation clicks with peak frequencies typically greater than or close to 130 kHz. White-beaked dolphin clicks were reported to have a secondary energy peak at 250 kHz (Rasmussen and Miller, 2002) and contain energy up to 305 kHz (Mitson, 1990). Indo-Pacific humpback dolphin clicks recorded opportunistically from a group of animals in Hong Kong waters were shown to have spectral energy extending up to at least 200 kHz (Goold and Jefferson, 2004). However, sound recordings from the experimental subject in the present study indicate that typical high-frequency clicks of this humpback dolphin had a peak frequency around 120 kHz with no obvious spectral energy above 150 kHz (Fig. 7A). The animal also produced click trains consisting of clicks with relatively longer duration and peak frequencies lower than 15 kHz (Fig. 7B). This suggests that the experimental subject produced a variety of clicks with different frequency components. Alternatively, the low frequency clicks in Fig. 7B could be artifacts originating from ‘off-axis’ patterns of the ‘on-axis’ high-frequency clicks (Au, 1993). In either case, the measured audiogram of the experimental animal with a high-frequency hearing cut-off between 110 and 130 kHz approximately matched the frequency range of the animal's high-frequency clicks that, as measured, had a peak frequency of approximately 120 kHz. At frequencies outside the high sensitivity region of 20–120 kHz, the animal would still be able to hear the sound stimuli, but with relatively higher hearing thresholds (Fig. 6). 
Given that most mammalian audiograms are U-shaped, hearing thresholds for stimuli at frequencies higher than 152 kHz were not measured. The plateau at 64–76 kHz between the two most sensitive regions of 32–54 and 90–108 kHz in the measured audiogram was similar to a phenomenon observed in the audiograms of the harbour porpoise (Popov et al., 1986), the Amazon River dolphin (Popov and Supin, 1990) and the Yangtze finless porpoise (Popov et al., 2005), where a plateau also appeared between two high-sensitivity regions. The cause and biological significance of this phenomenon remain unexplained.
The present data represent the first hearing measurements for the Indo-Pacific humpback dolphin. The investigated animal, with an estimated age of 13 years, should be considered adult (Jefferson and Karczmarski, 2001). Medical records of the dolphin indicate that the animal had not received ototoxic medicines (such as antibiotic medication), which might have adversely affected its hearing. The high-frequency click production with a peak click frequency of approximately 120 kHz and high-frequency hearing with a cut-off frequency between approximately 110 and 130 kHz suggest matching and healthy sound production and hearing capabilities. This audiogram of a healthy adult could form the baseline of hearing information for the Indo-Pacific humpback dolphin. However, although the measured audiogram was U-shaped and similar to many odontocete audiograms, which were themselves usually collected from only one or two individuals per species (Nachtigall et al., 2000), we should be cautious when interpreting the present hearing data from one individual and extending them to the species as a whole. Many factors, including the age of the subject, its physical condition, medical administration and the background noise environment, could influence hearing measurements. Hearing measurements in groups of Atlantic bottlenose dolphins, Tursiops truncatus (Popov et al., 2007), and Pacific bottlenose dolphins, Tursiops truncatus gilli (Houser et al., 2008), showed that intraspecific variation in the hearing capability of odontocetes does exist. Although the present study provides basic hearing information for the Indo-Pacific humpback dolphin, it is essential, whenever possible, to measure hearing in more than one individual, at different ages and under various environmental conditions, to learn more about individual variation and to better assess potential environmental effects on the hearing and behaviour of the species.
FUNDING
The study was supported by the National Natural Science Foundation of China (31070347), the Ministry of Science and Technology of China (2011BAG07B05-3), the State Oceanic Administration of China (201105011), and the Marine Mammal Research Laboratory, Tropical Marine Science Institute, National University of Singapore.
ACKNOWLEDGEMENTS
We are grateful to the staff and students at the Baiji Aquarium, Institute of Hydrobiology of the Chinese Academy of Sciences, and to the trainers and staff of the dolphinarium of Nanning Zoo, Nanning, China, for their assistance during data collection and travelling. The LabVIEW program used for stimulus synthesis and evoked-potential recording was originally developed by Dr Alexander Ya. Supin. The first author thanks Drs Alexander Ya. Supin and Paul E. Nachtigall for their continuous help and guidance on hearing research in marine mammals. We thank Drs Paul James Seekings and Brahim Hamadicharef and other staff at the Marine Mammal Research Laboratory of the Tropical Marine Science Institute for their encouragement and help during the study.