Many anuran amphibians use multimodal signals (acoustic+visual) during courtship displays. The relative degree to which each signal component influences female mate choice, however, remains poorly understood. In this study we used a robotic frog with an inflating vocal sac and acoustic playbacks to document responses of female túngara frogs to unimodal signal components (acoustic and visual). We then tested female responses to a synchronous multimodal signal. Finally, we tested the influence of spatial and temporal variation between signal components on female attraction. Females failed to approach the isolated visual cue of the robotic frog, and they showed a significant preference for the call over the spatially separate robotic frog. When presented with a call that was temporally synchronous with the vocal sac inflation of the robotic frog, females did not show a significant preference for this over the call alone; when presented with a call that was temporally asynchronous with vocal sac inflation of the robotic frog, females discriminated strongly against the asynchronous multimodal signal in favor of the call alone. Our data suggest that although the visual cue is neither necessary nor sufficient for attraction, it can strongly modulate mate choice if females perceive a temporal disjunction relative to the primary acoustic signal.
Multimodal communication has received much attention in recent years (Candolin, 2003; Hebets and Papaj, 2005; Partan and Marler, 2005), but the role of individual signal components in composite displays remains poorly understood. Further, few data are available to show how perceived temporal and spatial variation in multimodal signal components affects receiver responses. This paucity of knowledge stems largely from the inherent complexity involved in multimodal communication and the difficulty of experimentally disentangling individual signal components. As a result, many studies lack information on receiver responses to unimodal components in comparison to the composite signal. To understand multimodal signal function and evolution, it is important to test receiver responses to unimodal components as well as responses to the composite signal (Leger, 1993; Partan and Marler, 2005).
In this study we tested how individual components of a multimodal courtship signal influence female responses in the túngara frog, Physalaemus pustulosus Cope 1864. In addition, we examined female responses to variation in temporal and spatial synchrony of the composite signal. Vocalizations are a critical component of mate attraction in nearly all anuran amphibians (Ryan, 2001; Gerhardt and Huber, 2002). In the túngara frog, males produce vocalizations that consist of a simple call (whine only) or a complex call (whine plus one or more chucks appended to the end of the call). Females express a strong preference for the complex call (Ryan, 1985; Rand et al., 1992; Ryan and Rand, 2003a). Several studies have shown that visual cues are also used in the courtship behaviors of anurans (Summers et al., 1999; Hödl and Amézquita, 2001; Amézquita and Hödl, 2004; Taylor et al., 2007; Vasquez and Pfennig, 2007; Gomez et al., 2009). As with most anurans, the male túngara frog vocalization is accompanied by a conspicuous, simultaneous inflation of the vocal sac (Pauly et al., 2006). When calls are broadcast at relatively low amplitude, female túngara frogs express a preference for the multimodal signal (visual cue of an inflating vocal sac plus vocalization) over the vocalization alone (Rosenthal et al., 2004; Taylor et al., 2008).
Many frogs, including túngara frogs, breed in dense choruses that result in a cacophony of mating calls and a complex auditory scene from which females must acoustically detect, localize and assess appropriate mates. When females use vocal sac inflation as a visual cue, the complexity of the auditory scene is compounded by the complexity of the visual scene. This ‘communication scene’ presented to the receiver is rife with auditory interference among adjacent calling males and visual obstructions in the environment, both of which could disrupt the perception of spatial and temporal synchrony of the acoustic and visual courtship components.
In communication systems, the perceptual requirements of a receiver for signal recognition might not be obvious. For example, female túngara frogs do not require spatial synchrony of different call components for attraction to the chuck of a complex call (Farris et al., 2002). Little is known, however, in any communication system about the degree to which receivers must integrate signal components across different sensory modalities during communication (but see Narins et al., 2005). Although female túngara frogs preferentially respond to the multimodal courtship signal (Taylor et al., 2008), it is unknown how they respond to the unimodal visual component of the vocal sac, the multimodal signal under higher playback amplitudes, or asynchrony of the multimodal signal. In this study we: (1) examined female responses to two unimodal signal components, (2) tested female responses to a synchronous multimodal signal played back at a higher amplitude than in our previous study (Taylor et al., 2008), and (3) tested the hypothesis that temporal synchrony between signal components is required for attraction to the multimodal signal.
MATERIALS AND METHODS
In some systems, disentangling the influence of space and time on signal perception is not tractable (e.g. chemical plus auditory signals). In systems where signal components are communicated in visual and auditory modalities, however, several techniques are available for experimental signal presentation (Rosenthal, 1999; Knight, 2005; Patricelli et al., 2006; Taylor et al., 2007; Taylor et al., 2008). In this study we used a robotic frog (hereafter referred to as a robofrog) for the visual cue in conjunction with acoustic playbacks to elucidate how the visual and acoustic components interact in influencing female responses.
We conducted all experiments at the Smithsonian Tropical Research Institute in Gamboa, Panama (9°7′0″N, 79°42′0″W). We collected amplectant pairs of túngara frogs from small, temporary pools around Gamboa. Individual pairs were placed into plastic bags that were deposited into a cooler for transport back to the laboratory. Pairs remained there in total darkness for a minimum of 1 h prior to testing to ensure that the frogs' eyes were dark adapted. In these experiments we tested only females. On occasion, females oviposited prior to experimentation; we did not test these females because they exhibit marked decreases in responsiveness to courtship signals.
For each trial, we separated a female from her mate and placed her under a funnel in a testing arena. We removed sections of the plastic funnel and covered the remaining ribs with clear, polyethylene food wrap, ensuring that the female could receive both visual and acoustic stimuli. We positioned the funnel 80 cm from one or two speakers (depending on the experiment) and we placed a highly realistic robofrog with an inflatable vocal sac in front of one of the speakers (Fig. 1). We inflated the vocal sac remotely by a pneumatic pump that was actuated by the computer producing the acoustic stimulus. A delay switch on the pump apparatus allowed us to vary the timing of the inflation cycle relative to the acoustic stimulus.
The vocal sac shape and coloration, and the timing of inflation of the robofrog's vocal sac provided a realistic, but not perfect, representation of a calling male. For example, the robofrog vocal sac did not perfectly replicate the degree of lateral bulge and shape produced by living males (Fig. 1). The strong responsiveness to motion and high visual sensitivity of nocturnally active frogs likely produces low spatial resolution (Lettvin et al., 1968; Land and Nilsson, 2002). Thus, the robofrog, or even the vocal sac movement alone, provides a representation that is realistic enough to evoke responses in female frogs (Taylor et al., 2008). The robofrog also provides a 3D stimulus and can be lit from above. Compared with video playback, this controls for variation in light output by computer monitors and provides a more realistic visual stimulus than a 2D computer representation.
The arena was lit from above by a single GE night light (model no. 55507; Fairfield, CT, USA). Most of the light's surface was covered with duct tape to reduce light output, yielding an irradiance in the test arena of ca. 5.9×10⁻¹⁰ W cm⁻². This is similar to the downwelling irradiance at a typical nocturnal breeding site (Cummings et al., 2008). We observed frogs using an infrared viewer. Once the female was under the funnel, we initiated broadcasts of a digitally synthesized male vocalization and began inflating the robofrog's vocal sac. For all experiments we used the same synthetic, complex call (whine+chuck) broadcast at 82 dB sound pressure level (SPL, re. 20 μPa) measured at the position of the female's release point. This call is based on the average of acoustic parameters for frogs in this population and is no more or less attractive than natural calls (Rand et al., 1992). Broadcasting the call at 82 dB SPL (the standard for typical túngara frog phonotaxis studies) provided a comparison with our previous multimodal study in which the call amplitude was 76 dB SPL (Taylor et al., 2008). In that study, playback amplitudes were lowered to increase the probability that females would attend to the visual cue.
We exposed the female to vocalizations and the inflating robofrog for a 2 min habituation period under the funnel. After this period, we raised the funnel and allowed the female to move. A choice was recorded when she approached to within 5 cm of a speaker or speaker/robofrog combination and remained there for 5 s. We allowed the calls and robofrog vocal sac inflation to proceed during the trial until the female made a choice. In two-speaker experiments, we alternated the side on which the robotic frog was presented. For all experiments, we scored only responsive females. If a female failed to move for 2 min after the funnel was raised or failed to make a choice after 10 min, we interpreted this as a lack of motivation and discarded the trial from the data set. At the end of the night, we released the frogs at the sites where they were collected. The test arena was identical to that used previously, and detailed methods, particularly regarding arena lighting and robotic frog assembly, are described elsewhere (Taylor et al., 2008).
Female responses to unimodal components
Vocalizations are sufficient for mate attraction in many anurans (Ryan, 2001; Gerhardt and Huber, 2002) and female phonotaxis preferences are well documented in túngara frogs (Ryan, 1985; Ryan and Rand, 2003b). In this experiment, we tested the hypothesis that the visual cue of an inflating vocal sac (without a vocalization) is sufficient for mate attraction. First, we presented females with a complex call to determine sexual receptivity. An individual female was placed under the funnel 80 cm from the speaker and allowed to listen to the playback for 2 min. We then raised the funnel and scored a female as responsive when she approached the speaker broadcasting the vocalization. If a female exhibited phonotaxis to the vocalization, she was then retested with an inflating robotic frog in front of a silent speaker (Fig. 2A). If a female responded to the call but not the robofrog, we assumed her lack of response in the latter case was due to a lack of signal saliency and not a lack of motivation.
Female responses to spatial separation of signal components
Farris and colleagues demonstrated that female túngara frogs exhibit auditory grouping and respond as if spatially separated call components (a whine and a chuck) are produced at the same location (Farris et al., 2002). We conducted this experiment to test signal dominance when the two signal components (acoustic and visual) are spatially separated. We presented females with a single speaker broadcasting a digitally synthesized call (82 dB SPL) and placed the robofrog 15 cm to one side of the speaker, alternating sides between trials. The robofrog's vocal sac was inflated synchronously with the vocalization at the speaker. We placed individual females under the funnel, raised it, and scored a choice when they approached either the speaker or the robofrog (Fig. 2A).
Female responses to temporal separation of signal components
In this experiment, we conducted four sets of trials to test the hypothesis that the vocal sac inflation (visual cue) must be synchronized with the call (acoustic signal) to enhance the attractiveness of the call. In the first treatment, we allowed females to choose between two speakers broadcasting the same synthetic call antiphonally at a rate of one every 2 s. We placed the robofrog 1 cm in front of one speaker and the vocal sac was inflated synchronously with the call broadcast from that speaker; the other speaker lacked a robofrog (Fig. 2B). The inflation/deflation sequence of the robofrog was approximately 450 ms, resulting in the terminus of the deflation occurring about 50 ms after the end of the 400 ms call. We refer to this treatment as 100% overlap (100% OL) as the vocalization was temporally synchronous with the vocal sac inflation/deflation sequence; this mimicked a live calling male (Fig. 3).
We conducted the second treatment in the same manner as the first. In this case, however, the robofrog's vocal sac inflation was initiated 100 ms after the start of the 400 ms call. This resulted in approximately 75% temporal overlap (75% OL) between the call and inflating/deflating vocal sac (Fig. 3).
In the third treatment, the vocal sac inflation was initiated 200 ms after the start of the 400 ms call. This produced approximately 50% overlap (50% OL) between the call and vocal sac (Fig. 3).
In the final treatment, the vocal sac inflation began approximately 100 ms after the end of the call such that there was no overlap between the call and the inflating vocal sac (0% OL; Fig. 3). In this treatment, the inflating vocal sac also did not overlap with the call at the other speaker.
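The overlap percentages for the four treatments follow from simple interval arithmetic. The sketch below is illustrative only. It assumes that overlap is expressed as the fraction of the 400 ms call during which the roughly 450 ms inflation/deflation cycle is under way; this is one reading of the treatment descriptions, not a formula stated in the text, and the function and variable names are hypothetical.

```python
def overlap_fraction(delay_ms, call_ms=400.0, inflation_ms=450.0):
    """Fraction of the call during which the vocal sac cycle is under way.

    The call occupies [0, call_ms]; the inflation/deflation cycle occupies
    [delay_ms, delay_ms + inflation_ms]. Durations are the approximate
    values given in the text; the formula itself is an assumption.
    """
    start = max(0.0, delay_ms)
    end = min(call_ms, delay_ms + inflation_ms)
    return max(0.0, end - start) / call_ms


# Onset delays (ms) of inflation relative to call onset for each treatment.
# The 0% OL delay is the 400 ms call plus the ~100 ms gap after its end.
treatments = {"100% OL": 0, "75% OL": 100, "50% OL": 200, "0% OL": 500}

for name, delay in treatments.items():
    print(f"{name}: {overlap_fraction(delay):.0%} of the call overlapped")
```

Under this assumption, onset delays of 0, 100, 200 and 500 ms give the nominal 100%, 75%, 50% and 0% overlap values.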
We did not retest females; thus each female was a unique datum. We predicted a priori that if temporal overlap of the visual and auditory components is necessary for enhancing the attractiveness of the call, females would exhibit a significant preference for the temporally synchronous multimodal stimulus. We further predicted that when the visual and auditory components were temporally decoupled, females would fail to exhibit a significant preference for the multimodal stimulus and would instead choose at random.
For the experiment testing female responses to unimodal components, we conducted a 2×2 contingency table analysis for dependent proportions (McNemar's test). This analysis compares the binomial response (approach vs non-approach) of females to the call and the robofrog and accounts for non-independence due to retesting females. In all other experiments, females were presented with a two-choice test and no female was tested more than once in any experiment. In these experiments, we compared the binomial distribution of female responses for each stimulus pair against an equiprobable distribution (0.5 vs 0.5).
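As an illustration of the two-choice comparison against an equiprobable distribution, a conventional exact two-tailed binomial test can be sketched as follows. This is a minimal sketch, not the analysis code used in the study: the function name is hypothetical, the example counts are invented, and because the software and tail convention are not stated here, values from this sketch need not match the reported P values exactly.

```python
from math import comb


def binomial_two_tailed(k, n):
    """Exact two-tailed binomial P value against p = 0.5.

    With p = 0.5 the distribution is symmetric, so the two-tailed value is
    twice the probability of the smaller tail (capped at 1).
    """
    smaller = min(k, n - k)
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical example: 18 of 20 females choose one stimulus.
p_value = binomial_two_tailed(18, 20)
print(f"P = {p_value:.4f}")
```

Doubling the smaller tail is appropriate only because the null proportion is 0.5; for other null proportions the two tails would have to be summed separately.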
This research complied with all requirements of the animal care and use protocols of the University of Texas IACUC no. 4031701. All necessary permission and permits were obtained from the Smithsonian Tropical Research Institute and the government of Panama.
RESULTS
Female responses to unimodal components
In the first experiment we tested the hypothesis that the visual cue alone is sufficient for mate attraction. None of the 20 females that initially responded to a speaker broadcasting a call responded to the non-calling (inflating without the vocalization) robofrog (McNemar's test for dependent proportions, χ²=18.05, P<0.0001).
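The reported test statistic is consistent with the continuity-corrected form of McNemar's test, assuming 20 discordant pairs in one direction (approached the call but not the robofrog) and none in the other. The sketch below shows that arithmetic; whether the correction was actually applied is an inference from the reported value, and the function name is hypothetical.

```python
from math import erfc, sqrt


def mcnemar_corrected(b, c):
    """McNemar chi-square with continuity correction, and its P value (1 d.f.).

    b and c are the two discordant counts of the paired 2x2 table.
    """
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 d.f.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Assumed discordant counts: 20 females approached the call but not the
# silent robofrog; none did the reverse.
chi2, p = mcnemar_corrected(20, 0)
print(f"chi2 = {chi2:.2f}, P = {p:.1e}")
```

With these counts the statistic is (|20 − 0| − 1)²/20 = 18.05, and the corresponding P value falls below 0.0001.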
Female responses to spatial separation of signal components
In the second experiment we tested signal dominance when the two components were spatially separated by 15 cm. Nineteen females responded to the speaker and one female responded to the inflating robofrog (2-tailed binomial test, P<0.0001; Fig. 4).
Female responses to temporal separation of signal components
We next tested the hypothesis that females require temporal synchrony of the visual and acoustic components in order for the multimodal signal to be more attractive than the call alone. In the 100% OL treatment, females did not express a significant preference for the multimodal signal (12 multimodal:8 call only; 2-tailed binomial test, P=0.3833). In the 75% OL treatment females also did not express a significant preference (8 multimodal:12 call only; 2-tailed binomial test, P=0.2631). In the third treatment, when females were presented with a robofrog inflating at 50% OL with the call, females showed a significant discrimination against the multimodal stimulus (4 multimodal:16 call only; 2-tailed binomial test, P=0.0026). In the final treatment, where females were presented with a 0% OL multimodal stimulus, five females chose the multimodal stimulus and 15 chose the call only (2-tailed binomial test, P=0.0118; Fig. 5).
In sum, the vocal sac alone is not sufficient for mate attraction. The acoustic signal dominates the visual cue when there is 15 cm of spatial displacement. At playback levels of 82 dB SPL, females do not express a preference for the synchronous (100% OL) multimodal signal over the call only, but they discriminate strongly against the multimodal signal when the visual and acoustic components are temporally asynchronous by 50% or more.
DISCUSSION
Previous studies have demonstrated that female túngara frogs attend to visual cues in conjunction with vocalizations during mate assessment (Rosenthal et al., 2004; Taylor et al., 2008). In this study we documented the role of each of these unimodal signal components in the multimodal courtship display. The complete lack of response to the non-calling robofrog demonstrates that the vocalization is the dominant signal component and is both necessary and sufficient for mate attraction.
In the present study, female preference for the synchronous (100% OL) multimodal stimulus was diminished compared with previous experiments conducted at lower sound pressure levels. At 76 dB SPL (Taylor et al., 2008), females expressed a significant preference for the multimodal stimulus. At 82 dB SPL (this study), females failed to show a preference for it. This suggests that the contribution of the vocal sac to mate attraction depends on call amplitude and, because amplitude attenuates with distance, on the distance of the female from the male. The chorus environment of túngara frogs is highly variable, ranging from one to hundreds of males calling in a given area (Ryan, 1985). The sound pressure level experienced by a female in a chorus is also variable, depending on the distance of the female from a male, the number of males in a chorus, and possible constructive/destructive interference of overlapping calls. The data from this study and our previous study (Taylor et al., 2008) suggest that the vocal sac enhances call attractiveness only under relatively low sound pressure levels and thus at greater distances. This implies that the visual component of the signal might be more important in detection and spatial localization of the signal than in detailed mate assessment.
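To give a rough sense of what the 6 dB difference between the two playback levels means in terms of distance, the sketch below applies ideal spherical spreading, under which the level falls by about 6 dB per doubling of distance. This free-field model is an assumption introduced here for illustration only; attenuation at real breeding sites will differ, and the function names are hypothetical.

```python
from math import log10


def level_at_distance(level_ref_db, dist_ref_cm, dist_cm):
    """Sound pressure level at dist_cm, given the level at dist_ref_cm.

    Assumes ideal spherical spreading (a 6 dB drop per doubling of distance);
    real breeding sites will deviate from this.
    """
    return level_ref_db - 20 * log10(dist_cm / dist_ref_cm)


def distance_for_level(level_ref_db, dist_ref_cm, target_db):
    """Distance at which the level falls to target_db under the same assumption."""
    return dist_ref_cm * 10 ** ((level_ref_db - target_db) / 20)


# 82 dB SPL at the 80 cm release point; where would the call reach 76 dB SPL?
d76 = distance_for_level(82.0, 80.0, 76.0)
print(f"76 dB SPL at about {d76:.0f} cm")
```

Under this assumption, the lower playback level corresponds to roughly a doubling of the distance between the female and the calling male.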
It is possible that the lack of preference for the synchronous multimodal signal found in this study is an artifact of variation in experimental design or variance in female preference across study years, but this is unlikely. Between our previous (Taylor et al., 2008) and our current study, we used identical equipment, followed the same protocols and adjusted the light environment to be similar. Female preferences for some aspects of male vocalizations have also been shown to remain stable over a period of more than 20 years (Gridi-Papp et al., 2006). Further, our results are consistent with other studies showing increased attendance to a visual cue when sound pressure levels of an acoustic signal are reduced or are near threshold (Rowe, 1999; McDonald et al., 2000).
When males of most anurans vocalize, the vocal sac is by necessity inflated synchronously with the call, producing a ‘fixed’ signal (Smith, 1977). In at least two anuran species, males inflate the vocal sac without calling (Hirshman and Hödl, 2006; Grafe and Wanger, 2007), but this appears to be an unusual behavior among anurans. Even though túngara frog females are quite able to perceive the vocal sac in the low light environment of the túngara frog's nocturnal choruses (Cummings et al., 2008), in large choruses they are probably restricted to viewing only a subset of calling males because of habitat heterogeneity or male position (facing away from a female receiver). Complex chorus environments present a discrimination challenge for female frogs (Gerhardt and Klump, 1988; Schwartz, 1993; Wollerman, 1999; Schwartz et al., 2001; Bee and Micheyl, 2008); interference, masking and differential visibility of closely spaced males limit the ability of a female to assign the movement of every vocal sac to the male emitting the call. Our data suggest that in these acoustically complex situations, females are likely to discriminate against a male where there is a perceived asynchrony between signal modalities, rather than merely finding the asynchronous multimodal signal no more attractive than the vocalization alone.
Two possible explanations could account for this reversal of preference. First, lack of temporal synchrony might alter a female's perception of the acoustic signal, changing its relative attractiveness. For example, the perception of phonemes in humans is influenced by the pattern of lip movements that co-occur with the vocalization, known as the McGurk effect (McGurk and MacDonald, 1976). As a result, a perceived asynchrony may reduce the perceptual attractiveness of the call, causing females to discriminate against that male. A second possibility is that the meaning of the vocal sac itself may depend on temporal synchrony. In the absence of such synchrony, the inflation/deflation cycle of the vocal sac could be perceived as movement related to one of the many túngara frog predators such as snakes, crabs, turtles, frogs and bats (Ryan et al., 1981; Ryan, 1985) known to hunt at breeding sites. Regardless of the mechanism, when the vocal sac is visually available, temporal synchrony is required for attraction.
Narins and colleagues found that males of the diurnally active frog Epipedobates femoralis exhibited an agonistic response to the visual cue of a robofrog intruder when the acoustic signal was spatially displaced (Narins et al., 2005). They also found that agonistic responses to the robofrog persisted when there was a temporal asynchrony between the acoustic and visual cues. In this study, however, spatial separation resulted in little response to the robofrog and partial temporal asynchrony resulted in strong discrimination against the multimodal signal. These differences suggest that the relationship of the acoustic and visual components has evolved in different contexts of communication among anuran species. The dominance of the visual signal in E. femoralis, a diurnally active frog in which visual signals appear to be at a premium, in contrast to the dominance of acoustic signals in the nocturnally breeding túngara frog might be expected. Contrasts between other diurnal and nocturnal species would be required, however, to conclude that diel activity patterns lead to convergent evolution of higher-level cognitive processes such as cross-modal integration. Similar variance in signal dominance and multimodal signal function is seen among closely related spiders (Hebets and Uetz, 2000; Hebets, 2005; Hebets, 2008), suggesting that selection can alter signal function even within taxonomic groups that share a similar ecology or life-history strategy.
In túngara frogs, the vocal sac might have been incorporated into the auditory courtship signal through efficacy-based selection (Hebets and Papaj, 2005). It is difficult for túngara frog females to assign different acoustic signal components (whine and chuck) to the correct individual (Farris et al., 2002; Farris et al., 2005). Thus, our results suggest that the vocal sac improves the ability of females to discriminate among individual males within the noisy chorus environment. In addition, the two communication modes, acoustic and visual, also show intersignal interaction, where the production of one signal alters the perception or response to the second signal (Hebets and Papaj, 2005). In this system, the vocalization is dominant and is both necessary and sufficient for mate attraction. The secondary signal component of the vocal sac, while neither necessary nor sufficient, can strongly modulate female responses if there is a perceptual disjunction of signal components.
The process by which multiple signal components may interact to influence receiver responses, intersignal interaction (Hebets and Papaj, 2005), has received relatively little attention, especially with respect to temporal order effects. What literature is available shows considerable variation in receiver responses to temporal ordering of signal components both within and across modalities (Wilczynski et al., 1999; Martins et al., 2005; Narins et al., 2005; Gerhardt et al., 2007). An approach to understanding the evolution of complex signaling championed by Partan and Marler is to test receiver responses to all unimodal signal components as well as the full composite signal (Partan and Marler, 1999; Partan and Marler, 2005). We agree that this approach is crucial. The data presented here suggest that the additional step of examining responses to temporal variation, even in a ‘fixed signal’, can provide important information that may illuminate evolutionary processes acting on complex signals.
ACKNOWLEDGEMENTS
We thank Keri Athanas, Jessica Morris, and Karine Posbic for help in collecting data. Eileen Hebets and two anonymous reviewers provided valuable comments on an earlier version of the manuscript. The Smithsonian Tropical Research Institute provided logistical support. This work was supported by the National Science Foundation (IBN 0517328 to M.J.R. and R.C.T.).