Aversiveness of sounds and its underlying physiological mechanisms in mammals are poorly understood. In this study we tested the influence of psychophysical parameters, motivation and learning processes on the aversiveness of anthropogenic underwater noise in phocid seals (Halichoerus grypus and Phoca vitulina). We compared behavioural responses of seals to playbacks of sounds based on a model of sensory unpleasantness for humans, sounds from acoustic deterrent devices and sounds with assumed neutral properties in different contexts of food motivation. In a captive experiment with food presentation, seals habituated quickly to all sound types presented at normalised received levels of 146 dB re. 1 μPa (r.m.s., root mean square). However, the fast habituation of avoidance behaviour was also accompanied by a weak sensitisation process affecting dive times and place preference in the pool. Experiments in the wild testing animals without food presentation revealed differential responses of seals to different sound types. We observed avoidance behaviour at received levels of 135–144 dB re. 1 μPa (sensation levels of 59–79 dB). In this experiment, sounds maximised for ‘roughness’ perceived as unpleasant by humans also caused the strongest avoidance responses in seals, suggesting that sensory pleasantness may be the result of auditory processing that is not restricted to humans. Our results highlight the importance of considering the effects of acoustic parameters other than the received level as well as animal motivation and previous experience when assessing the impacts of anthropogenic noise on animals.
Aversiveness of biological sounds has been studied in detail in the context of predator avoidance (Deecke et al., 2002; Tuttle and Ryan, 1981). By contrast, factors influencing aversiveness of other sounds are poorly understood and have only been investigated with respect to stimulus amplitude (Campbell, 1957; Kastelein et al., 2006a), practical applications (Blackshaw et al., 1990; Kastelein et al., 2001; Talling et al., 1998) or the use of sound as a reinforcing stimulus (Campbell and Bloom, 1965). An aversive stimulus is an unpleasant or noxious stimulus, which induces an avoidance response in an animal. Such behavioural responses to sounds are influenced by a variety of psychophysical factors relating to sound perception, the motivational state of an animal and basic learning processes (e.g. habituation, conditioning). Elucidating the role of these factors is not only relevant for animal welfare and conservation (Nowacek et al., 2007) but can also provide answers to fundamental questions of sound and music perception in mammals (Hauser and McDermott, 2003).
Anthropogenic noise has been found to elicit avoidance responses in marine mammals (Johnston, 2002; Kastelein et al., 2006a; Kastelein et al., 2006b; Morton and Symonds, 2002; Nowacek et al., 2004), terrestrial mammals (Talling et al., 1998) and birds (Mackenzie et al., 1993). Few studies have tested the effects of sound characteristics on aversiveness (Kastelein et al., 2001; Talling et al., 1998), so detailed information on what causes aversiveness in animals is not available. However, models developed for humans could provide a first indication of which psycho-physiological parameters influence the degree of pleasantness or aversiveness of sound in mammals. Zwicker and Fastl developed a model that can be used to predict unpleasantness of sounds in humans (Zwicker and Fastl, 1990). In this model, a decrease in tonality and an increase in sharpness, loudness and roughness contribute to ‘unpleasantness’.
Tonality depends on the waveform of the sound and is highest for pure tones whereas the sensation of sharpness is caused by signals with centre frequencies close to the upper edge of the hearing range (Zwicker and Fastl, 1990). Loudness is a complex psychophysiological parameter that depends among other factors on the hearing threshold of the test subjects. Experiments on humans showed that the contours of perceived equal loudness are roughly parallel to the hearing threshold within the most sensitive hearing range but are compressed at the high and low frequency edge of the hearing range (Fletcher and Munson, 1933; Robinson and Dadson, 1956). In other words, sounds of different frequency but similar sensation levels (i.e. pressure levels in dB by which a sound exceeds the hearing threshold) cause similar perceived loudness within the most sensitive hearing range (Yost, 2000). However, perceived loudness also depends to some extent on bandwidth (Zwicker et al., 1957) and stimulus duration (Zwislocki, 1969). In humans, changes in electro-physiologically measurable parameters that are indicative of stress or discomfort were correlated with sensation levels of about 70–80 dB (Spreng, 1975).
A sensation of roughness is caused by fast frequency or amplitude modulations. In humans, modulation frequencies of around 70 Hz cause the strongest effect (Zwicker and Fastl, 1990). Roughness perception has received considerable attention in humans since it has also been suggested as the physiological basis for musical consonance preferences (Plomp and Levelt, 1965). In humans, sounds that consist of partial tones, which are related by complex frequency ratios, are perceived as unpleasant or dissonant whereas sounds that consist of partials related by simple ratios are perceived as pleasant or ‘consonant’ (Helmholtz, 1853). Modern classical composers (e.g. Arnold Schönberg) and musical psychologists (Stumpf, 1883) tended to argue that consonance perception is a result of culture but physiologists expected more general properties of the auditory system to be responsible (Helmholtz, 1853). Plomp and Levelt developed the so-called critical band theory of consonance perception based on their findings that in musically untrained subjects dissonance is maximised if two partial tones fall within 25% of the critical bandwidth (= cochlea filter bandwidth) (Plomp and Levelt, 1965). Some evidence for a genetic rather than a cultural basis of consonance preference also comes from experiments on human babies (Zentner and Kagan, 1996). However, results from animal experiments remain controversial; while a two-alternative forced choice experiment revealed clear preferences for consonant musical intervals in rats (Borchgrevink, 1975), consonance preference could not be demonstrated in place preference experiments with tamarin monkeys (McDermott and Hauser, 2004).
Studies assessing impacts of noise on animals usually use behavioural avoidance responses as a measure of aversiveness or severity of disturbance (Nowacek et al., 2007). This is problematic because motivation and learning can minimise such responses while detrimental effects remain unchanged. For example, while seals in British Columbia showed diminishing (Mate and Harvey, 1987) or a lack of aversive responses to acoustic predator deterrent devices used to protect fish farms (Jacobs and Terhune, 2002), cetaceans were deterred by these devices for several consecutive years (Morton and Symonds, 2002). As the cetaceans did not feed on fish in farms, their motivation to stay in the area may have been lower than that of the seals. However, the signals could still have had an effect on the hearing abilities of the seals. Thus, it is important to elucidate the role of motivation and learning in the control of avoidance responses.
Our study aimed to test how stimulus properties, motivation and learning influence aversiveness of sound in phocid seals. We chose seals as test subjects for several reasons. Seals have sensitive underwater hearing over a very large frequency range covering almost eight octaves (Kastak and Schusterman, 1998; Møhl, 1968; Terhune, 1988; Terhune and Ronald, 1975). Light is much less efficient than sound at conveying information underwater, and some echolocating toothed whales rely largely on sound for prey detection (Gannon et al., 2005) and communication (Janik, 2000). While seals might use hydrodynamic cues to detect prey over short ranges (Dehnhardt et al., 2001), sound plays an important role in their underwater communication system (Hanggi and Schusterman, 1994), and passive listening has been suggested to aid foraging (Schusterman et al., 2000). In addition, seals are not closely related to humans within the mammalian line but have evolved adaptations to aquatic hearing (Schusterman et al., 2000). Therefore, if we find that similar sound characteristics cause aversiveness in seals and in humans, it suggests that the responsible mechanism is an evolutionarily ancient characteristic within the mammalian line. Finally, there is increasing concern about the impact of anthropogenic noise on the behaviour of marine mammals (Nowacek et al., 2007), and it has been suggested that in some species mass strandings might be a secondary result of an overt behavioural response to aversive sound (Jepson et al., 2003).
In order to investigate the physiological basis of aversiveness of sounds in seals, we tested three different classes of stimuli which were presented to seals underwater at received sensation levels that were below the expected pain and acoustic startle thresholds: sounds designed to be unpleasant based on a psychophysical model of sound perception in humans (Zwicker and Fastl, 1990), control sounds with assumed neutral properties regarding perceived pleasantness, and sounds recorded from commercially available acoustic deterrent devices (ADDs) for seals. To test how motivation modifies behavioural responses, the animals were tested under three different conditions: (1) with a known accessible feeding apparatus present, (2) with a known feeding apparatus present that does not provide food, and (3) without a food source. Tests were conducted with captive and wild animals.
MATERIALS AND METHODS
Subjects and their environment
For the captive tests, we used six grey seals (Halichoerus grypus, Fabricius 1791) and two harbour seals (Phoca vitulina, Linnaeus 1758). All grey seals and one harbour seal were wild captured at a haul-out site at Abertay Sands in the UK (56°25.59′N, 2°45.59′W). The other harbour seal was caught in an estuary close by (~56°21.7′N, 2°51′W). All seals were housed in outdoor pools filled with seawater. Four out of the six grey seals were sexually mature adult females and two were juveniles (one male, one female). The juveniles were approximately 6–11 months old at the time of the experiments. The two harbour seals were adult males. The harbour seals had been in captivity for two weeks and one month, respectively, before being used in the experiment while the tested grey seals had been in captivity for a time ranging from 3 to 8 months prior to the experiments. Experiments were carried out in a 3 m-diameter, 1.5 m-deep circular, seawater-filled test pool.
Tests on wild seals were conducted at Abertay Sands. All tests on wild animals were conducted in the vicinity of one of four sites where grey seals hauled out. Haul-out sizes during the experiments ranged from approximately 20 to 200 animals.
We tested three different types of sounds. The first class (PPM=psychophysical model sounds) were sounds predicted to be aversive based on the psychophysiological model developed by Zwicker and Fastl (Zwicker and Fastl, 1990). PPM sounds were designed to maximise roughness through selected frequency modulation patterns and had a relatively high loudness due to their broad bandwidth. The second class were control sounds with assumed neutral properties, and the third class were ADD sounds. All sounds are shown in Fig. 1. For playbacks each stimulus was presented in a continuous sound burst of 6 s duration.
PPM and control sounds were synthesised using Cool Edit Pro software (Syntrillium Software Corporation, Phoenix, AZ, USA) with rise and fall times of 50 ms. ADD sounds had been recorded from active acoustic deterrent devices at sea except for the Lofitech sound, which was synthesised based on a field recording. ADD sounds were played at a lower level than the original devices produce (see below).
PPM (psychophysical model) sounds
(1) Square 500/530 FM. This stimulus consisted of two concurrent 70 Hz frequency modulated (FM) square-wave tones with carrier frequencies of 500 Hz and 530 Hz. Modulation depth was 50% of the carrier frequency. A 70 Hz frequency modulation pattern was found to cause maximum roughness in humans (Zwicker and Fastl, 1990). We tried to enhance aversiveness by selecting two partials that lay in the same critical band for auditory analysis. Critical bandwidths for harbour seals range from 20% to 40% of the test frequency (Southall et al., 2003).
(2) Square 500/507 FM. This stimulus was identical to ‘Square 500/530 FM’ except that the carrier frequencies of the two partials were 500 Hz and 507 Hz, respectively. The frequency ratio of the partials for this stimulus was chosen to reflect 25% of the critical bandwidth calculated from underwater critical ratios in pinnipeds (Southall et al., 2000), which are 3% and 9% of the test frequencies.
(3) Square 500 FM. 70 Hz FM square-wave tones with a carrier frequency of 500 Hz. Modulation depth was 50% of the carrier frequency. This stimulus is the base pattern of stimuli 1 and 2.
(4) Square variable. 100–300 ms-long, constant frequency, square-wave pulses (some of which were FM) with the carrier frequency of each individual pulse ranging from 500 Hz to 1.5 kHz. As with ‘Sweeps FM’ (stimulus 5), spectral variability was used to make the sound less predictable.
(5) Sweeps FM. A complex sound consisting of FM square-wave up- and down-sweeps. The frequency modulation applied to the square waves ranged from 0 Hz (no modulation) to 100 Hz with modulation depth between 30% and 60%. Sweeps (1–4 s duration) covered a frequency range from 400 Hz up to 3.5 kHz. This temporal and spectral variability was implemented to make the sound less predictable and to prevent habituation.
Control sounds
(6) White noise (400 Hz–20 kHz), which was slightly modified during playback due to the frequency response of the speaker.
(7) Sine-wave pure tone (500 Hz).
ADD sounds
(8) Pulse train consisting of 2–5 ms long pure tones (10 kHz) recorded from an Airmar dB Plus (Milford, NH, USA).
(9) Complex, broadband sounds with a peak frequency between 7 kHz and 9 kHz produced by a Terecos ADD (Glasgow, UK).
(10) Short tone pulses at varying frequencies with peak frequencies of either around 15.4 kHz or 9.6 kHz recorded from an Ace-Aquatec ADD (Dingwall, UK).
(11) Pulse train consisting of 495–500 ms-long sine-wave pulses as used in a Lofitech ADD (Leknes, Norway).
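As a rough illustration of how a stimulus such as ‘Square 500 FM’ or ‘Square 500/530 FM’ could be generated digitally, the following sketch builds a square wave whose instantaneous frequency is modulated sinusoidally at 70 Hz. It is not the procedure used in Cool Edit Pro. The sample rate, the sinusoidal shape of the modulator, the reading of ‘modulation depth 50%’ as a peak frequency deviation of 0.5 × carrier, and all function names are assumptions made for this example.

```python
import math


def fm_square_wave(carrier_hz=500.0, mod_hz=70.0, depth=0.5,
                   duration_s=1.0, sample_rate=44100):
    """Return a list of +1.0/-1.0 samples of an FM square wave.

    Assumption: the instantaneous frequency is
        f(t) = carrier_hz * (1 + depth * sin(2*pi*mod_hz*t)),
    i.e. 'depth' is the peak deviation as a fraction of the carrier.
    """
    samples = []
    phase = 0.0                      # accumulated phase in radians
    dt = 1.0 / sample_rate
    for n in range(round(duration_s * sample_rate)):
        t = n * dt
        inst_freq = carrier_hz * (1.0 + depth * math.sin(2.0 * math.pi * mod_hz * t))
        phase += 2.0 * math.pi * inst_freq * dt
        # Square wave: sign of the sine of the accumulated phase.
        samples.append(1.0 if math.sin(phase) >= 0.0 else -1.0)
    return samples


def two_partial_stimulus(f1_hz=500.0, f2_hz=530.0, **kwargs):
    """Mix two concurrent FM square-wave partials at equal weight."""
    a = fm_square_wave(carrier_hz=f1_hz, **kwargs)
    b = fm_square_wave(carrier_hz=f2_hz, **kwargs)
    return [0.5 * (x + y) for x, y in zip(a, b)]
```

Calling `two_partial_stimulus(duration_s=6.0)` would yield a 6 s burst. Rise and fall ramps, level normalisation and the equalisation for the transducer response are deliberately left out of the sketch.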
Transducer, sound field and source level
Sounds were presented underwater through a Lubell 9162 loudspeaker (Lubell Labs Inc., Columbus, OH, USA). The loudspeaker was powered by a Phonic MAR 2 power amplifier (Taipei, Taiwan) and playback sounds were played from a Panasonic SL-S120 CD player (Osaka, Japan). The loudspeaker was calibrated using all playback stimuli and a variety of test signals at broadband source levels ranging from 120 dB re. 1 μPa to 160 dB re. 1 μPa. The amplitude of some playback sounds was then readjusted in the digital domain using the calibration data in order to ensure normalised root mean square (r.m.s.) source levels.
Transducer calibration and sound field measurements were conducted using a calibrated Brüel & Kjær 8103 hydrophone connected to a Brüel & Kjær charge amplifier 2635 (Nærum, Denmark). The output from the charge amplifier was recorded on a Toshiba Satellite Pro laptop (Tokyo, Japan) using its sound card, which showed a flat response (±1.5 dB) from 70 Hz to 15 kHz. The sound card was calibrated using a Thurlby Thandar TG 230 signal generator (Huntingdon, UK). The output of the signal generator was confirmed with a Tektronix TDS 3022 digital oscilloscope (Beaverton, OR, USA). Recordings were made using Cool Edit Pro 1.2 software (Syntrillium Software Corporation). r.m.s. and peak-to-peak (p–p) voltages of the recorded sound and calibration signals were measured in Avisoft SAS Lab Pro v 4.32 (Avisoft Bioacoustics, Raimund Specht, Berlin, Germany). Sound pressure levels (SPL, in dB re. 1 μPa) were calculated as SPL = 20 log10(sound pressure/1 μPa).
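The SPL formula can be written out as a small helper. This is only a sketch of the arithmetic; the function names are illustrative, and the conversion from recorded voltage to pressure (hydrophone sensitivity and amplifier gain) is not shown.

```python
import math

REFERENCE_PRESSURE_UPA = 1.0  # underwater reference pressure: 1 micropascal


def spl_db(pressure_rms_upa):
    """Sound pressure level in dB re. 1 uPa for an r.m.s. pressure in uPa."""
    if pressure_rms_upa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_rms_upa / REFERENCE_PRESSURE_UPA)


def pressure_from_spl(spl):
    """Inverse of spl_db: r.m.s. pressure in uPa for a level in dB re. 1 uPa."""
    return REFERENCE_PRESSURE_UPA * 10.0 ** (spl / 20.0)
```

For example, an r.m.s. pressure of 1 Pa (1,000,000 μPa) corresponds to 120 dB re. 1 μPa.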
Sound types that contained significant energy below 600–700 Hz were equalised using the calibration data to compensate for the low-frequency response decline (<700 Hz) of the transducer using fast Fourier transform (FFT) filters in Cool Edit Pro (Syntrillium Software Corporation). The actual peak frequencies of the five ‘PPM sounds’ broadcast through the loudspeaker were between 750 Hz and 800 Hz. The −20 dB power points were between 600 Hz and 2.5–3.5 kHz, respectively.
For sound field measurements in captivity, the loudspeaker was placed at the test position in the test pool (Fig. 2) with no seal in the pool. Received levels of all playback stimuli were set to values equal to or just below 146 dB re. 1 μPa and were measured four times at 11 different positions of the pool. Mean received levels (r.m.s.) in the pool ranged from 142 dB re. 1 μPa to 146 dB re. 1 μPa. Assuming the hearing threshold of harbour seals to be 72 dB re. 1 μPa at 1 kHz (see composite underwater audiogram in Fig. 3), these sounds would have a maximum sensation level of 74 dB. The sensation level is the relative SPL (in dB) by which a sound exceeds the hearing threshold of a species. Our chosen sensation level of 74 dB exceeds the discomfort but not the pain threshold in humans (see Spreng, 1975). It was also below the startle threshold in terrestrial mammals [rats (Pilz et al., 1987); humans (Berg, 1973)].
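The sensation level is simply the received level minus the hearing threshold at the relevant frequency. A minimal sketch, with an illustrative function name:

```python
def sensation_level_db(received_level_db, hearing_threshold_db):
    """Level (dB) by which a sound exceeds the hearing threshold.

    Both inputs must use the same reference (here dB re. 1 uPa) and the
    threshold must be the one for the frequency band of the sound.
    """
    return received_level_db - hearing_threshold_db


# Values from the text: 146 dB re. 1 uPa received level and an assumed
# harbour seal threshold of 72 dB re. 1 uPa at 1 kHz.
max_sensation_level = sensation_level_db(146.0, 72.0)  # 74.0 dB
```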
In the wild, signals were played at a broadband source level of 172 dB re. 1 μPa (r.m.s.). Sound field measurements in the wild were conducted at the haul-out site on the outer sandbars in the mouth of the river Tay (Tayport, UK) where 75% of the playbacks were carried out. All playback sounds were played consecutively, and measured received levels were averaged over all eight sounds. Received levels were measured along two depth profiles: the first parallel to the shore and a second one from the boat to the shore. Water depth along the profiles ranged from 3.5 m to 5 m for the first profile and from 4.5 m to 1 m for the second profile. The measured received levels along both profiles were also used to determine avoidance thresholds (received level at the edge of the deterrence range).
As we wanted to test the aversiveness of sounds based on parameters other than received level, we chose sound exposure levels (SELs) that were below the threshold where a temporary threshold shift (TTS) could be expected to occur in harbour seals. Kastak et al. showed that the onset of TTS occurs at SELs of 183 dB re. 1 μPa² s (Kastak et al., 2005). Our exposures did not reach this level. A single emission of our sounds in the field experiment (10 s burst) would result in a SEL of 182 dB re. 1 μPa² s (source level), and a single emission (6 s) in the captive trials would amount to a SEL of 154 dB re. 1 μPa² s (received level at the position of the animal's head when playback starts).
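For a burst of constant r.m.s. level, the SEL follows from the standard relation SEL = SPL + 10 log10(T / 1 s). The sketch below applies it to the levels and burst durations given in the text; the constant-level assumption and the function name are ours.

```python
import math

TTS_ONSET_SEL_DB = 183.0  # dB re. 1 uPa^2 s, harbour seals (Kastak et al., 2005)


def sound_exposure_level_db(spl_rms_db, duration_s):
    """SEL (dB re. 1 uPa^2 s) of a burst with constant r.m.s. SPL."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return spl_rms_db + 10.0 * math.log10(duration_s)


# Field experiment: 172 dB re. 1 uPa source level, 10 s burst -> 182 dB
field_sel = sound_exposure_level_db(172.0, 10.0)

# Captive trials: 146 dB re. 1 uPa received level, 6 s burst -> about 153.8 dB
captive_sel = sound_exposure_level_db(146.0, 6.0)

# Both single-burst values stay below the TTS onset threshold.
below_tts = field_sel < TTS_ONSET_SEL_DB and captive_sel < TTS_ONSET_SEL_DB
```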
Ambient noise measurements were carried out using a low-noise Reson TC 4032 hydrophone connected to a Reson VP 2000 (EC6081) voltage amplifier (RESON A/S, Slangerup, Denmark). The output from the preamplifier was recorded on an Edirol UA-25 sound card (Roland Corp., Hamamatsu, Japan) connected to a Toshiba laptop (Tokyo, Japan). In the pool and in the wild, ten 5 min sections separated by 10 min intervals were recorded. Ambient noise measurements in the pool were carried out on two days, the first with a wind strength of Beaufort (BF) 1–2 and the second with BF 3–4. In the wild, ambient noise was recorded at two haul-out sites used for playbacks on a day with sea state (SS) 1–2. The recording day and time were chosen to reflect the typical playback conditions: low wind, SS 1–2, ±2 h around low water, no rain and no boat traffic within 1 km of the playback site. Power spectral density was calculated in Avisoft SAS Lab Pro v 4.32 (Avisoft Bioacoustics, Raimund Specht, Berlin, Germany) using an 8192 step FFT. The calibrated values were calculated by taking the sensitivity of the hydrophone and the gain from the preamplifier into account.
In captivity, seals were tested individually with only one seal in the test pool at a time. Experiments were started by providing a fish through an underwater feeder, which was 1 m below the surface. Animals had previous experience with the feeder and would approach it when the edge of a metal cup (that contained a fish) became visible. The cup was then lowered 2 s after the playback started making the fish accessible to the seal. Playbacks started when the tip of the animal's nose was within 40 cm of the feeder. Every playback lasted one minute with sound being presented as four sound bursts of 6 s duration each. This resulted in an effective duty cycle of 40% over the 1 min period with a 12 s interval between the presentations. A playback session consisted of 1 min playbacks of each of the 11 described sound stimuli, each separated by a quiet 5 min interval from the next one. In addition, a 1 min observation period with no sound presentation (a no sound control) was carried out. Different versions of the recorded sounds were used in different playback sessions to prevent pseudo-replication and the sound presentation sequence differed for each playback session and individual. We carried out three playback sessions with food presentation as described above, followed by one session with no food, in which playbacks still started when the animal was within 40 cm of the feeder. In the fifth playback session we provided food again while the last one was another no food session. This allowed us to investigate how motivational state affected the behaviour.
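The duty cycle and inter-burst interval follow directly from the burst timing. The sketch below reproduces the arithmetic, assuming that each 1 min playback starts and ends with a burst so that four bursts are separated by three equal silent gaps; the function names are illustrative.

```python
def duty_cycle(burst_s, n_bursts, period_s):
    """Fraction of the playback period during which sound is on."""
    return burst_s * n_bursts / period_s


def silent_gap_s(burst_s, n_bursts, period_s):
    """Silent interval between bursts, assuming the period starts and
    ends with a burst (n_bursts - 1 equal gaps)."""
    return (period_s - burst_s * n_bursts) / (n_bursts - 1)


captive_duty = duty_cycle(6, 4, 60)    # 0.4, i.e. 40%
captive_gap = silent_gap_s(6, 4, 60)   # 12.0 s
```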
Playbacks were monitored using an HTI-96-MIN hydrophone (High Tech Inc., Gulfport, MS, USA), an analogue VN37CPH colour underwater camera (RF Concepts, Dundonald, UK) focused at the feeding station and a camera of the same model mounted 4 m above the pool that was used to view the whole pool area (Fig. 2). Video tracks from both cameras were linked to a multiplexer (CK-70C-4, Camtek-CCTV, Taipei County, Taiwan) and, together with the audio track from the hydrophone, recorded on a Sony (Tokyo, Japan) DV video recorder (GVD 1000E or MVX 350i). The experimenter and all equipment were hidden from the animal in a hut next to the pool. Behavioural responses were measured from the video recordings.
Eight of the 11 stimuli from the captive experiment were also tested in the wild. We used all control and ADD sounds but only the two most efficient PPM sounds from the captive sessions (Sweeps FM, Square 500/530 FM). We approached the haul-out site from sea with a 6.5 m boat with two outboard engines at idle speed. The boat was anchored between 80 m and 250 m from the haul-out. The playback source was deployed at a depth of 1.5 m at the stern of the boat. We observed all animals in the water within a 100 m radius of the boat. A playback trial consisted of observations 5 min prior to playback (pre), 5 min during playback (sound) and 5 min following playback (post). A 15 min recovery period separated each trial. We used only one sound type in each playback trial. Not more than five playback trials were carried out per day. Sounds were played for 10 s followed by 10 s of silence during the 5 min playback period resulting in a duty cycle of 50%. We increased the duty cycle and trial length for the experiments in the wild to ensure that animals, which were spread out over a large area and were often very close to the surface, would be exposed for a sufficient amount of time to exhibit an avoidance response. As the main goal of the study was to investigate the effects of specifically chosen control and PPM sounds these were all tested 10 times on separate days within a period of several months. ADD sounds, which can contain a variety of complex features, were only tested six times each. Playbacks were only carried out if at least one animal was seen within 50 m of the boat during the 5 min pre-playback period. We also conducted 14 control observations with no sound playbacks in which equipment was deployed but no sound was played during the 5 min between the pre- and post-observation period. The order in which sound types were played on a given day was pseudo-randomised. No sound type was tested in more than one playback on each day. 
As eight different stimuli were tested, not all stimuli were tested each day, but sound stimuli were distributed evenly between playback days and haul-out sites.
In captivity, an index of aversiveness was used to describe the animals' responses. The scale ranged from 0 (not aversive) to 4 (highly aversive) and was of an ordinal nature. Aversive response at a certain level always included all aversive responses at a lower level (e.g. level 3 means that the animal also exhibited a level 1 and 2 response). After reviewing the tapes, each 1 min playback was allocated one value. The levels were:
(1) Seal turns away from underwater loudspeaker: a change in the orientation of the line between shoulder blades and the tip of the nose by at least 100 deg. from the original position (nose pointing towards feeding station).
(2) Escape/flight response: seal increases distance to underwater speaker at speeds of more than 3 m s⁻¹. Value 2 was allocated if the animal crossed the pool diagonally swimming away from the feeding station in less than 1 s.
(3) Foraging behaviour (fish take) prevented: seal does not re-approach the feeding station after flight response and fish remains in feeder for the whole minute.
(4) Haul-out behaviour for at least 30 s after an initial flight response.
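Because each level presupposes all lower ones, the index for a playback can be read as the highest level reached without a gap. A minimal sketch of that scoring rule, in which the boolean inputs and the function name are assumptions about how observations might be coded from video:

```python
def aversiveness_index(turned_away, fled, fish_not_taken, hauled_out):
    """Ordinal index from 0 (not aversive) to 4 (highly aversive).

    A level only counts if all lower levels were also observed, so the
    score is the length of the unbroken run of True values from level 1.
    """
    score = 0
    observations = (turned_away, fled, fish_not_taken, hauled_out)
    for level, observed in enumerate(observations, start=1):
        if not observed:
            break
        score = level
    return score
```

For instance, a seal that turned away and fled but later took the fish would score `aversiveness_index(True, True, False, False)`, which is 2.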
Additionally, the following continuous response variables were measured: (a) time the animal's head was underwater and within 1.5 m of the feeding tube, and (b) dive time during playback defined as head being completely submerged. All response variables were measured from the videotapes during the 1 min sound exposure.
Because phocid seal species have similar underwater audiograms (cf. Terhune, 1988; Terhune and Ronald, 1972; Terhune and Ronald, 1975) we pooled data for all seals in the captive experiments to allow statistical testing. However, we give information on species differences in the text. Calculations of sensation levels were based on a composite behavioural audiogram using data for harbour seals from Møhl (Møhl, 1968), Kastak and Schusterman (Kastak and Schusterman, 1998) and Terhune (Terhune, 1988) (see Fig. 3 and Table 1). A general linear model (GLM) and two multifactorial analyses of variance (ANOVAs) were calculated to determine which covariables influenced the behaviour of seals in the pool. We used a modified Bonferroni method (Cross and Chaffin, 1982) to adjust the overall P-values for the models and the P-values of those covariables/covariable combinations that were used in more than one model (treatment, individual, food/no food). Thus, all P-values in the text and figures are already adjusted if this was required. Statistical tests were calculated in Systat 11 (Systat Software Inc., Chicago, IL, USA) with the exception of the GLMs, which were calculated in JMP 4 (SAS, Cary, NC, USA). We pooled data within each sound category for some of the analyses to allow statistical testing. The term ‘treatment’ is used to refer to exposure to either: (1) PPM sounds, (2) control sounds, (3) ADD sounds, or (4) no sound.
In the wild, surface positions of seals were measured continuously relative to the playback boat using a laser range finder (Bushnell Yardage Pro 1000, Overland Park, KS, USA) and a handheld compass. The observer would continuously and slowly rotate around his axis, resulting in a scan sampling of the area. The response measure was the number of surfacings observed, except in cases where recognisable animals exhibited a quick series of surfacings, in which case only the closest approach was used. Playbacks were conducted on 18 separate days in 2006 and 2007. The data were analysed using repeated-measures ANOVAs to compare the number of seals between pre-, sound- and post-observation periods in distance classes comprising 20 m each. This was found to be a suitable method to detect seal movement around the playback boat in a pilot trial where we observed behaviour of well-marked individuals (recognisable by the pelage pattern on their head). Deterrence ranges were calculated by analysing the data in 20 m distance classes up to a distance of 100 m. The deterrence range was defined as the outer edge of the distance class within which there was a consistent, statistically significant reduction of animals during sound exposure (repeated-measures ANOVA, P<0.05).
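The binning step that precedes the statistical comparison can be sketched as follows. Each measured surfacing distance is assigned to a 20 m class labelled by its outer edge, and surfacings are counted per class for one observation period. Treating the classes as half-open intervals (0 m up to but not including 20 m, and so on) and discarding sightings at 100 m or beyond are assumptions of this example; the repeated-measures ANOVA itself is not shown.

```python
from collections import Counter

CLASS_WIDTH_M = 20
MAX_RANGE_M = 100


def distance_class(distance_m):
    """Outer edge (m) of the 20 m class containing distance_m,
    or None if the sighting lies outside the 0-100 m analysis range."""
    if distance_m < 0 or distance_m >= MAX_RANGE_M:
        return None
    return (int(distance_m // CLASS_WIDTH_M) + 1) * CLASS_WIDTH_M


def count_by_class(distances_m):
    """Number of surfacings per distance class for one observation period."""
    counts = Counter()
    for d in distances_m:
        edge = distance_class(d)
        if edge is not None:
            counts[edge] += 1
    edges = range(CLASS_WIDTH_M, MAX_RANGE_M + 1, CLASS_WIDTH_M)
    return {edge: counts.get(edge, 0) for edge in edges}
```

Counts obtained this way for the pre-, sound- and post-observation periods of each trial would then form the repeated measures for each distance class.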
RESULTS
Captive seals showed median aversive responses up to level three (turn away, flight and prevention of fish catch) in response to the first sound exposure events in the pool even though food was presented to them at the same time (Fig. 4A). All sounds had a similar aversiveness in the first trial causing the animals to move away from the loudspeaker. None of the sounds elicited a startle reflex as would have been visible by a rapid neck or body flinch at the onset of sound exposure. There was a significant difference in the index of aversiveness among the four treatments (no sound, control sounds, PPM sounds, ADD sounds) in the first playback session (Kruskal–Wallis H=9.383, P=0.025, d.f.=3). Median aversive responses were zero for all sounds in the second playback session (Fig. 4B) and in all subsequent sessions. There were no apparent species differences as the median response score calculated over all responses in the first playback session was 1 even if species were analysed separately. The median response score in the first playback session was 1 for adults and 1.5 for the two juveniles.
The position of each specific sound within the first playback sequence had a larger effect on the index of aversiveness than the sound type. Fig. 5 shows the median responses ordered by playback position within the first playback session independent of sound type. There was a strong decline of the responses over the first 3–4 playbacks with median responses reaching zero in all trials following the seventh playback, no matter what sound type was played in that position (Kruskal–Wallis, H=25.126, P=0.005, d.f.=10). Furthermore, a Spearman rank correlation test revealed that there was a highly significant negative correlation between the median response score and playback position within the first playback session (t=−6.36, P=0.00013, R2=0.82; Fig. 5), indicating fast habituation to hearing a playback sound independent of what the sound was. Playback position did in fact explain 82% of the variation in the index of aversiveness. Therefore, the response magnitude to a certain sound primarily depended on when it was played to a seal within the first playback session, with a sound being most likely to elicit an aversive response if it was among the first 2–5 sounds a seal had heard in the test pool (Fig. 5).
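For readers who want to reproduce this type of test, the sketch below computes a Spearman rank correlation (Pearson correlation of tie-averaged ranks) and the associated t statistic, t = ρ·sqrt((n − 2)/(1 − ρ²)). It is a generic pure-Python illustration, not the Systat routine used in the study, and all function names are ours.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def t_statistic(rho, n):
    """t statistic (n - 2 d.f.) for a correlation coefficient; needs |rho| < 1."""
    return rho * math.sqrt((n - 2) / (1.0 - rho ** 2))
```

As a plausibility check, a negative correlation with ρ² = 0.82 over the 11 playback positions gives `t_statistic(-math.sqrt(0.82), 11)` of roughly −6.4, close to the reported value.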
In contrast to the findings for the index of aversiveness, sound exposure maintained some effect on dive times and the time spent close to the feeder. Exposure to any of the three sound treatments reduced the time an animal spent close to the feeding station and caused a reduction of dive time over the course of several playback sessions (Fig. 6). To elucidate potential factors that might influence swimming and diving behaviour in the pool, we calculated GLMs for these response variables over all of the sessions that involved food presentation (Table 2). The models included playback session number, individual identity, treatment and all three interaction terms as variables. The model for the time spent close to the feeder was highly significant (F64,124=8.14, P<0.0003), explaining 71% of the variance in the data. Individual variation in behaviour was the most important explanatory variable, followed by treatment (effect of sound exposure) and, to a lesser degree, playback session number. The effect of the individual was not caused by species differences. The interaction term for playback session number and individual identity was also significant. Generally, seals reduced the time spent close to the feeder slightly in later playback sessions in all four treatments. However, the interaction term of treatment and playback session was not significant, showing that the effect of sound exposure on behaviour did not change over time, i.e. there was no clear habituation for the time spent close to the feeder. The parameter estimates from the model revealed that the effect of treatment was due to the difference between the no sound control and sound exposure, while there was no significant difference between the sound types. The model for dive times explained 85% of the variance and was highly significant (GLM, F64,124=12.22, P<0.0003). Similar to the previous model, the most important explanatory variable was individual identity (irrespective of species). However, in contrast to the previous model, the second most important factor was playback session number, followed by treatment. This shows that the seals decreased dive time in later playback sessions in all four treatments.
To test for differences in behaviour between consecutive playback sessions with and without food presentation, we used multifactorial ANOVAs including individual ID, treatment and food presentation schedule (food vs no food) as covariates (Table 3). The comparison model for playback sessions 3 (food) and 4 (no food) was significant for both the time spent close to the feeder (F11,63=19.748, P<0.0003, R2=0.77) and dive time (F11,63=19.175, P<0.0003, R2=0.76). The models showed strong inter-individual variability (irrespective of species) in these variables as well as an effect of treatment on the time spent close to the feeder, but no effect of the food presentation regime was found (Table 3). The comparison models for playback sessions 5 and 6 were also significant for both response variables (dive time: F11,63=10.42, P<0.0003, R2=0.62; time spent close to the feeder: F11,63=18.00, P<0.0003, R2=0.75). In contrast to the previous models, food presentation regime (food vs no food) had an influence on both variables (i.e. dive time and time spent close to the feeder). This means that seals dived longer and spent more time close to the feeder when no food was presented (Fig. 6). However, individuals again showed strong differences in their general diving and swimming behaviour.
In the wild, we found a significant decrease in the number of animals in at least one of the distance classes for almost all tested sound types. From observations of well-marked animals we found that this was an indicator of animals having moved away from the sound source during sound exposure (Fig. 7, repeated-measures ANOVAs all P<0.05). Deterrence ranges for the two PPM sounds were 60 m (Sweeps FM) and 80 m (Square 500/530 FM) while ranges for the control sounds were 40 m (sine 500 Hz) and 60 m (white noise), respectively. The sounds of the Ace-Aquatec and Lofitech ADDs yielded a deterrence range of 60 m while the deterrence range for the Airmar sounds was 40 m. No significant deterrence range was found for the sound of the Terecos ADD. However, ADD sounds were only played six times resulting in a lower statistical power of these tests than for the tests of other sound types, which were played 10 times. The distribution of animals in the five distance classes did not differ significantly between the three 5 min observation periods for the no sound control (Fig. 8). This shows that the experimental setup and the behaviour of the observer did not result in changes of seal distribution. Fig. 8 also shows that while the detection rates of seals were similar at distances between 40 m and 80 m, the likelihood of sighting seals at distances of 80–100 m was lower.
To test whether animals left the overall observation area after playbacks, the number of surfacing animals in all distance classes (closer than 100 m) was compared between observation periods within each trial. A significant drop in seal numbers closer than 100 m in the playback phase compared with the pre-playback phase was found only for the Square 500/530 sound (Friedman test, P<0.004). PPM sounds were also the only sounds capable of reducing seal numbers in the post-playback phase compared with the pre-playback phase (Friedman tests with Bonferroni adjustments, Square 500/530 FM: P=0.04, Sweeps FM: P=0.04). None of the other sounds had a significant effect on seal distribution after sound exposure had ceased.
Given that two sound types caused a deterrence effect that extended to at least 5 min post-sound exposure over the whole observation area, it is in theory possible that not all animals returned during the 15 min recovery periods. This could have potentially biased the following playback. However, a comparison of all 5 min pre-sound exposure observation periods for each playback day revealed that the mean number of animals within the observation area did not differ between consecutive playbacks, meaning that seal numbers did not drop over the course of a playback day (ANOVA, F4,63=1.44, P=0.23). This showed that, while not all animals returned during the 5 min after sound exposure ceased (post periods), the 15 min recovery time was sufficient for the animals to return to the observation area. Alternatively, it is possible that the area filled up with new arrivals during the post-playback phase. To test for habituation effects to sound exposure of any kind within one playback day, the number of animals closer than 60 m from the playback source was counted for all playback sessions. No significant difference in the number of animals between playback sessions on a given day was found (Kruskal–Wallis, H4,17=8.820, P=0.116).
Data from sound field measurements are presented in Fig. 9. In the profile measured from the sound source towards the haul-out site on shore, received levels (in dB re. 1 μPa) at different depths did not vary much. Transmission loss was higher than would be expected by either cylindrical or spherical spreading in the first 20 m but then tailed off as predicted from spherical spreading. In the profile measured parallel to the shore, transmission loss was closer to cylindrical than spherical spreading. Received levels in the top layer (0.2 m depth) tended to be lower compared with measurements at greater depth. Underwater ambient noise levels in the pool and in the wild did not differ by more than 10 dB at any of the frequencies (Fig. 10). Mean noise levels dropped off from values of 55 dB re. 1 μPa2 Hz−1 at 0.5 kHz to around 35 dB re. 1 μPa2 Hz−1 at 5 kHz when wind speed and SS were low (Fig. 10). The noise level in the test pool showed some spikes at frequencies between 800 Hz and 2 kHz, particularly when wind speed was high. At frequencies above 10 kHz ambient noise was below 35 dB re. 1 μPa2 Hz−1 in the wild and in the test pool.
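The two reference models against which the measured transmission loss is compared above can be sketched as follows. This is a minimal illustration using the standard geometric spreading laws (20 log10 r for spherical and 10 log10 r for cylindrical spreading, relative to 1 m); the function names and the example source level are our own assumptions and are not values from the sound field measurements.

```python
import math


def spreading_loss_db(range_m, model="spherical", ref_m=1.0):
    """Geometric transmission loss in dB relative to the level at ref_m.

    spherical:   20 * log10(r / ref)
    cylindrical: 10 * log10(r / ref)
    """
    if range_m <= 0 or ref_m <= 0:
        raise ValueError("ranges must be positive")
    factor = {"spherical": 20.0, "cylindrical": 10.0}[model]
    return factor * math.log10(range_m / ref_m)


def received_level_db(source_level_db, range_m, model="spherical"):
    """Received level predicted from a source level referenced to 1 m."""
    return source_level_db - spreading_loss_db(range_m, model)


# Hypothetical source level (dB re. 1 uPa at 1 m), for illustration only.
example_source_level = 170.0
for r in (20, 40, 60, 80, 100):
    sph = received_level_db(example_source_level, r, "spherical")
    cyl = received_level_db(example_source_level, r, "cylindrical")
    print(f"{r:>3} m: spherical {sph:.1f} dB, cylindrical {cyl:.1f} dB")
```

Measured levels that fall below the spherical prediction at a given range indicate losses beyond geometric spreading alone, as reported for the first 20 m of the profile towards the haul-out site.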
Habituation and food motivation
In the captive experiments that involved food presentation, seals did not respond differentially to the sound types while wild animals exhibited differential responses. Ambient noise levels were on average between 10 dB and 20 dB below the hearing threshold and did not differ by more than 10 dB in the field and in the pool at frequencies between 200 Hz and 10 kHz (Figs 3 and 10, Table 1). The difference in behaviour is therefore more likely to be caused by the animal being motivated to approach the feeder and food acting as a reinforcing stimulus overriding any possible aversiveness of sounds in the captive experiments. Food presentation is also the most likely explanation for the fast habituation process observed in the captive experiment. A study on captive sea lions that provided a foraging opportunity also found that animals habituated quickly to artificial sounds at SPLs of 165 dB re. 1 μPa (Akamatsu et al., 1996). Groves and Thompson developed a ‘dual-process’ theory of habituation suggesting that ‘…the strength of the behavioural response elicited by a repeated stimulus is the net outcome of the two independent processes of habituation and sensitisation’ [p. 442 in Groves and Thompson (Groves and Thompson, 1970)]. This is consistent with our data. In captivity, the most aversive responses like flight and prevention of food retrieval habituated within the first playback session. However, the impact of sound exposure remained significant in more subtle response variables and may even indicate a weak sensitising component. Playback session number was a significant factor in the model for the food presentation trials and seals decreased their dive time and the time spent close to the feeding station in later playback sessions. An alternative explanation could be that seals learnt to retrieve fish faster with food acting as a reinforcing stimulus. 
This is, however, less likely because all individuals increased the time spent close to the feeder in consecutive training sessions (without sound exposure) prior to the start of the experiment.
Our data also showed that variable stimulus design was not successful in delaying habituation of flight behaviour when food was provided, as habituation occurred within the first playback session (consisting of 11 different stimuli). According to Groves and Thompson's habituation theory, such stimulus generalisation will depend on whether common features in the stimulus–response pathway are shared between stimuli (Groves and Thompson, 1970). Our results are in line with their predictions because all stimuli used in the present study were perceived through the auditory pathway and had similar sensation levels, which differed by no more than 15 dB.
Another possible explanation for the apparent habituation of food avoidance can be found in learning theory. Food presentation can be interpreted as an unconditioned stimulus while the playback of the sound right before foraging or the lowering of the fish cup can be interpreted as a conditioned stimulus. Thus, the animals could have been conditioned in the Pavlovian sense (Pavlov, 1927). In addition, an operant component was present in the experimental setup as the animals learnt to position themselves in front of the feeder and manipulate the cup in order to retrieve a fish. The food rewards would therefore have acted as a reinforcement of approach and retrieval behaviour. Such apparent conditioning has been observed in the wild where seals can be attracted to an ADD in the so-called dinner bell effect (Jefferson and Curry, 1996). It was also striking that seals spent more time close to the feeder in the last playback session when no food was provided compared with the preceding food session. Thus, non-foraging seals may remain in an area and ignore sound exposure if they have found food there before.
Previous studies (Kastelein et al., 2006a; Kastelein et al., 2006b) on captive harbour seals yielded no evidence for habituation over several consecutive playback sessions even though received levels were substantially lower than the ones in our experiment. However, these studies did not provide food when presenting sounds. In our experiment with wild animals, where food motivation was likely to have been low, there was also no evidence for habituation. A simple explanation for the lack of habituation in the wild could be that animals were displaced by our sound exposure and replaced by new arrivals. However, since we also observed some well-marked individuals in several playbacks, this would not explain the behaviour of all animals. Our data therefore show that food motivation or reinforcement accelerates habituation to aversive stimuli.
Aversiveness and unpleasantness of sounds
The aversiveness of each individual sound stimulus is best evaluated from our experiments with wild animals, where no food presentation was involved and ambient noise levels were generally 10–20 dB below the known hearing threshold (Figs 3 and 10, Table 1). The following discussion is based on the assumption that avoidance behaviour in the field was not caused by longer dive times but by animals moving away. We think this is justified because we commonly observed well-marked individuals surfacing at greater distances than before when the sound was playing. In two further cases, a seal was seen underwater close to the boat moving away quickly when the sound was switched on. To evaluate the aversiveness of sound features other than received level, we have to control for the frequency-dependent hearing sensitivity of seals. To achieve this, we need to consider that our test stimuli had different frequency spectra. We therefore use the sensation level, which is the level in dB by which a sound exceeds the composite hearing threshold (Figs 3 and 10, Table 1) at a given frequency, to compare the effects of different sound stimuli on the animals.
The maximum sensation level caused by each sound in an animal at 1 m distance was calculated by measuring the maximum difference between a composite hearing threshold (see Fig. 3 and Table 1) and the referenced power spectrum of the sound type in 1/3 octave bins (from 100 Hz up to 24 kHz). Deterrence ranges were defined as the upper edge of the distance class furthest away from the loudspeaker within which the number of animals was significantly reduced during sound exposure. The avoidance threshold in units of sensation levels therefore gives the sound pressure level in dB above the hearing threshold at which a sound causes a deterrence effect. Avoidance thresholds expressed in sensation levels were calculated by subtracting the measured transmission loss (Fig. 9) from the maximum sensation level. Table 4 summarises these features for all of the tested sounds. For the ADD sounds, it is important to note that deterrence ranges given here are based on the features of the sound played at a much lower source level than in an actual ADD. Thus, our results do not describe the effectiveness of the actual ADDs in the field.
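The three quantities defined above (maximum sensation level, deterrence range and avoidance threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce Table 4: the function names are our own, band levels are assumed to be given per 1/3 octave centre frequency in a dictionary, and all numbers in the example are hypothetical.

```python
def max_sensation_level(spectrum_db, threshold_db):
    """Largest excess (dB) of the sound's band level over the hearing
    threshold, taken across the bands present in both inputs."""
    shared = spectrum_db.keys() & threshold_db.keys()
    if not shared:
        raise ValueError("no common frequency bands")
    return max(spectrum_db[f] - threshold_db[f] for f in shared)


def deterrence_range(class_upper_edges_m, significantly_reduced):
    """Upper edge of the farthest distance class with a significant
    reduction in animal numbers, or None if no class was affected."""
    affected = [edge for edge, flag
                in zip(class_upper_edges_m, significantly_reduced) if flag]
    return max(affected) if affected else None


def avoidance_threshold(max_sl_at_1m_db, transmission_loss_db):
    """Sensation level (dB) remaining at the deterrence range."""
    return max_sl_at_1m_db - transmission_loss_db


# Hypothetical band levels at 1 m and hearing thresholds (dB re. 1 uPa),
# keyed by 1/3 octave centre frequency in Hz. Illustrative values only.
spectrum = {500: 150.0, 1000: 145.0, 2000: 130.0}
threshold = {500: 85.0, 1000: 75.0, 2000: 70.0}

sl_max = max_sensation_level(spectrum, threshold)
rng = deterrence_range([20, 40, 60, 80, 100],
                       [True, True, True, False, False])
# Hypothetical measured transmission loss (dB) at the deterrence range.
print(sl_max, rng, avoidance_threshold(sl_max, 10.0))
```

Taking the maximum over bands means that the most audible frequency component of a stimulus determines its sensation level, which is why sounds with equal received levels can differ in this measure.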
The maximum sensation level of our stimuli at 1 m distance (~110 dB) was below the sensation level threshold for a temporary auditory threshold shift in harbour seals [132.5 dB SEL-sensation level (Kastak et al., 2005)]. We found little avoidance beyond the first trial when seals were motivated to forage in our captive experiments. However, in the wild, we found that seals repeatedly avoided sounds when sensation levels ranged from 59 dB to 79 dB (depending on sound type) with a mean value of 70 dB. Interestingly, this mean value of 70 dB above the hearing threshold matches the discomfort thresholds obtained from electro-physiological measurements in humans (Spreng, 1975). The initial avoidance responses in captivity and the sustained avoidance behaviour in the wild could therefore be caused by a physiological mechanism marking the onset of discomfort and stress. It is important to note that the initial responses in captivity and most responses in the wild were unlikely to have been the result of a startle reflex because the mean avoidance threshold (sensation level of 70 dB) and the maximum avoidance threshold (79 dB sensation level) were below the startle threshold measured in rats (sensation level: 87 dB) (Pilz et al., 1987) and humans (sensation level: 92 dB) (Berg, 1973). In addition, the rise times of 50 ms used in the control and PPM sounds would have been too long to elicit a startle reflex (Fleshler, 1965). It is also important to note that avoidance thresholds in captive harbour seals and harbour porpoises when no food was presented were found at sensation levels below 50 dB (Kastelein et al., 2005; Kastelein et al., 2006a). This is similar to what has been found in rats where sensation levels of only 50 dB caused signs of aversive responses (Campbell, 1957). Further experiments are needed to explain the differences in avoidance thresholds between these studies.
Avoidance thresholds ranged from sensation levels of 59–79 dB (re. hearing threshold) depending on sound type. Some of the differences in deterrence ranges can be attributed to differences in the hearing thresholds at the different frequencies of the test sounds (Table 4). For instance, the sine 500 Hz sound had a lower deterrence range than white noise but the sensation level at which it caused deterrence was in fact lower than for white noise. Nevertheless, the data also demonstrate the influence of features deemed unpleasant in humans following the model by Zwicker and Fastl (Zwicker and Fastl, 1990). In the field trials, the number of seals within the overall observation area (<100 m) was lower during the 5 min post-playback observation period compared with the pre-sound exposure period for PPM sounds but not for any of the other sounds. This shows longer lasting deterrence effects caused by the PPM sounds. Also, the most aversive sound type was the Square 500/530 stimulus, which caused the largest deterrence ranges (up to 80 m). By contrast, the control sound sine 500 Hz caused deterrence effects up to 40 m and white noise did so up to 60 m. Square 500/530 was able to deter seals at a sensation level of 59 dB while control sounds needed to have sensation levels of 64–74 dB to cause a similar effect. Thus, roughness appears to be an aversive feature of sounds in seals, similar to what was found by Zwicker and Fastl in humans (Zwicker and Fastl, 1990). Roughness sensation can be caused by frequency or amplitude modulation of a signal at modulation frequencies between 20 Hz and 300 Hz (Terhard, 1976). Amplitude modulation patterns originating from the mixing of two partial tones whose frequency difference is less than a critical band also give rise to roughness and are likely to be the cause of music being perceived as dissonant in humans (Helmholtz, 1853; Plomp and Levelt, 1965).
Dissonance perception appears in fact to be maximised if two partial tones fall within 25% of the cochlear filter bandwidth (Plomp and Levelt, 1965). Roughness therefore originates when the amplitude or frequency fluctuation rate of a signal falls well within the critical band at a certain carrier frequency. If we find behavioural evidence for such perceptual similarities between pinnipeds and humans, these sensations may also be common in other mammals. It is therefore possible that some aspects of human art are not purely a result of culture but have been primed by how our sensory systems evolved in order to process information. This is also supported by recent findings from humans who perceive such roughness as unpleasant independent of their culture (Fritz et al., 2009). Some evidence for the aversiveness of roughness in other mammals may come from right whales, which exhibited strong aversive responses to FM stimuli (some of which are capable of causing roughness) but no response to playbacks of ship noise (Nowacek et al., 2004). However, the animals might have been habituated to boat noise. Habituation could also be a factor explaining the mixed results for ADD sounds in our study. In ADD sounds, the degree of unpleasant features as predicted by the Zwicker and Fastl (Zwicker and Fastl, 1990) model did not correlate with their deterrence effects (Table 4). We think that the most likely explanation for this is a varying degree of previous experience with these sounds in the wild, leading to habituation to some ADD sounds but not to others.
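The beat-rate argument for roughness made earlier in this discussion can be illustrated with a small numerical sketch. Two partial tones mixed together produce amplitude fluctuations at a rate equal to their frequency difference, and roughness is expected when that rate lies between 20 Hz and 300 Hz. The function names below are our own, and the example assumes that the name "Square 500/530" denotes components at 500 Hz and 530 Hz, which is an assumption not stated in this section.

```python
def beat_rate_hz(f1_hz, f2_hz):
    """Amplitude fluctuation rate produced by mixing two partial tones."""
    return abs(f2_hz - f1_hz)


def in_roughness_band(mod_hz, low_hz=20.0, high_hz=300.0):
    """True if a modulation rate lies in the 20-300 Hz range given in the
    text as capable of causing a roughness sensation."""
    return low_hz <= mod_hz <= high_hz


# Assumed components of the Square 500/530 stimulus (see lead-in).
rate = beat_rate_hz(500.0, 530.0)
print(rate, in_roughness_band(rate))
```

Under this assumption the two components beat at 30 Hz, inside the stated roughness range, whereas a single 500 Hz sine tone has no such fluctuation.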
Behavioural responses observed in this study were surprisingly consistent with predictions obtained from human psychophysiological studies. This indicates that some aspects of sound perception such as roughness may result primarily from physiological properties of the cochlea that evolved early in the mammalian line and have been conserved in spite of specific adaptations to the aquatic habitat. Place preference experiments or two-alternative forced-choice experiments with captive animals would help to further investigate the evolution of sound perception in mammals.
This study was funded by the Scottish Government. V.M.J. was supported by a Royal Society University Research Fellowship and by a fellowship of the Wissenschaftskolleg zu Berlin. Writing of the manuscript was supported by a Tim Waters scholarship to T.G. Seals were held under HO license number 60/3303. Field trials were conducted under SNH research permits number Dec05/01 and Feb06/01. We would like to thank Simon Moss for technical support and equipment design for the pool experiments and Steven Laing, Kate Grellier and Sabrina Brando for their support. We would also like to thank Thorben Selm, Ana Catarina Alves, Kathryn Ball, Valentina Islas, Gordon Hastie, Lars Boehme, Nicola Quick, Rene Swift, Ricardo Antunes, Tess Gridley, Ewan Edwards, Julia Engstrom and Arliss Winship for help during the field trials.