ABSTRACT
Although it has been proposed that birds acquire visual depth cues through dynamic head movements, behavioral evidence on how birds use motion parallax depth cues caused by self-motion is lacking. This study investigated whether self-generated motion parallax modulates pecking motor control and visual size perception in pigeons (Columba livia). We trained pigeons to peck a target on a touch monitor and to classify it as small or large. To manipulate motion parallax of the target, we changed the target position on the monitor according to the bird's head position in real time using a custom-built head tracker with two cameras. Pecking motor control was affected by the manipulation of motion parallax: when motion parallax signified a target position farther than the monitor surface, the head position just before pecking was nearer to the monitor surface, and vice versa. By contrast, motion parallax did not affect how the pigeons classified target sizes, implying that motion parallax might not contribute to size constancy in pigeons. These results indicate that motion parallax via head movements modulates pecking motor control in pigeons, suggesting that head movements of pigeons have the visual function of accessing motion parallax depth cues.
INTRODUCTION
Pigeons and other avian species show flexible head movements compared with humans and other primates. They swing their head back and forth while walking, in a movement known as head bobbing. Numerous studies have revealed that head bobbing serves to stabilize visual images (Dunlap and Mowrer, 1930; Friedman, 1975; Frost, 1978). While the head moves backward relative to the body, it is actually locked in space as the body moves forward (hold phase), and while the head moves forward relative to the body it moves forward faster than the body (thrust phase).
Whereas head movements in the hold phase serve visual stabilization, it has been proposed that head movements in the thrust phase have another visual function (Frost, 1978; Davies and Green, 1988; Troje and Frost, 2000). Davies and Green (1988) showed that, when running, pigeons swing their head even though this no longer stabilizes retinal images. The authors suggested that the forward-thrust movement amplifies relative motion in retinal images, serving as a motion parallax depth cue. Pigeons also move their heads when flying in the presence of obstacles (Ros et al., 2017) and when landing on a perch (Davies and Green, 1988), suggesting that they utilize motion parallax for flight motor control. However, there is little behavioral evidence that birds use motion parallax obtained via head movements for visual depth information. In one study on owls (Van der Willigen et al., 2002), birds were first trained to discriminate concave and convex figures of random dot stereograms on a monitor, with only binocular depth cues. When faced with novel displays in which the depth structure was defined only by self-generated motion parallax, the owls successfully extracted depth information by spontaneously moving their head, and they transferred learning from the binocular to the motion parallax cue. Fux and Eilam (2009) observed that, in a more naturalistic situation, owls move their head before attacking prey; they suggested that owls obtain depth information via self-generated motion parallax. Given the evidence that frontal-eyed owls use motion parallax for depth information, it is likely that lateral-eyed pigeons also use motion parallax, because they are considered to depend less on binocular depth cues owing to their narrow binocular visual field (Martin and Young, 1983; Martinoya et al., 1981), and behavioral and physiological evidence of stereopsis is scarce in pigeons compared with owls and raptors (McFadden and Wild, 1986; Martinoya et al., 1988). The aim of the current study was to investigate whether and how pigeons use motion parallax depth cues obtained via head movements in two cognitive contexts: visuo-motor control and visual size perception.
To experimentally manipulate self-generated motion parallax, it is necessary to track the subject's head position and move a visual target with respect to the subject's changing viewpoint (Poteser and Kral, 1995; Goodale et al., 1990; for review, see Kral, 2003). In pioneering work by Wallace (1959), a platform was manually moved with or against the peering movement of a locust preparing to jump. The locusts were induced to undershoot the target when it moved against their self-motion, which signifies motion parallax of a nearer position. In recent years, computer-based trackers have enabled the tracking of fast and complex movements and the display of motion parallax cues based on these movements (Sobel, 1990; Van der Willigen et al., 2002; Stewart et al., 2015). Stewart et al. (2015) manipulated the motion parallax of a visual target on a monitor by automatically tracking the flying position of a butterfly and moving the target according to its current position. The butterfly's position was estimated using the triangulation method: two cameras continuously tracked the butterfly, and its 3D position was calculated from the combination of 2D positions in the camera images. In the current study, we used the same tracking method as Stewart et al. (2015) to manipulate the motion parallax of a visual target on a monitor. By changing the target position according to the head position of the experimental pigeons, we created a virtual target whose depth was farther or nearer than the monitor surface.
Given that most previous studies on the use of motion parallax have focused on visuo-motor control (e.g. jumping behavior), here we analyzed the motor control of pecking in pigeons. The pecking movement is composed of thrust and stop phases, like head bobbing during walking, and the hold phase before pecking occurs at a constant distance from the pecking target (Goodale, 1983). Theunissen et al. (2017) found that visual features of the target modulate the position and duration of the hold phase, and suggested that pigeons look longer and from nearer positions when a target is difficult to peck. In contrast to studies examining pecking behavior toward a physical object (Goodale, 1983; Theunissen et al., 2017), the current study used a visual stimulus with virtual depth to investigate the pure effect of motion parallax on pecking control. If pigeons control their head position with regard to its distance from a target, they should hold their head nearer to the monitor surface just before pecking when the virtual depth of the target is far, and vice versa (Fig. 1C).
In addition to pecking motor control, we investigated the effect of self-generated motion parallax on visual size perception. In humans and other primates, visual size perception is modulated by depth cues so that size is perceived as constant, irrespective of changes in retinal angular size at different viewing distances (Fineman, 1981; Barbet and Fagot, 2002, 2007; Imura et al., 2008; Imura and Tomonaga, 2009). When two objects of identical angular size are presented in a picture containing pictorial depth cues, primates perceive the object at the apparently farther position as larger than the object at the nearer position. We previously demonstrated that pigeons also show size constancy (Hataji et al., 2020). Pigeons were trained to classify the size of visual targets as ‘large’ or ‘small’ against a background corridor drawing. Pictorial depth cues consistently changed the pigeons’ responses: they overestimated the target size when pictorial cues signified a farther position, and vice versa. However, motion parallax did not affect their responses. One possible reason is that motion parallax in that study was not synchronized with the pigeons’ head movements but emerged simply from the difference in motion speed between the target and background. Such object-produced motion parallax is less reliable than self-generated parallax in humans (Wexler et al., 2001) and owls (Van der Willigen et al., 2002). Fixational eye movements in relation to self-movements are crucial for motion parallax processing in primates (Nadler et al., 2008). Therefore, in the current study, we re-examined the effect of motion parallax on visual size perception using self-generated parallax stimuli. We first trained pigeons to peck an object on a monitor and classify its size, using a differential reinforcement method. After the training phase, we manipulated the virtual depth of the target using the head-tracking system. If size constancy occurs with self-generated motion parallax, pigeons should overestimate size when the virtual depth is farther than the monitor surface (Fig. 1C).
MATERIALS AND METHODS
Subjects
Three male pigeons (Columba livia Gmelin 1789) were evaluated in the study (aged 4, 7 and 7 years). The number of subjects was determined by the maximum number that could be trained consecutively in daily sessions using one operant chamber. They were individually housed on a 12 h:12 h light:dark cycle with light onset at 08:00 h. Each bird was maintained at 85–95% of its free-feeding weight. Water and grit were freely available in the home cage. All experimental protocols conformed to the animal welfare guidelines of the Japanese government (notice no. 71 of the Ministry of Education, Culture, Sports, Science and Technology; no. 88 of the Ministry of the Environment; no. 40 of the Prime Minister's Office). The experiments were conducted with the approval of the animal experiment committee of the Graduate School of Letters, Kyoto University (no. 18-33).
Apparatus
The experiments were conducted in an operant chamber (41×31×40 cm) equipped with a 15 in LCD monitor (EIZO, FlexScan L357) and a touch-sensitive frame (Minato Holdings, ARTS-015N-02B) (Fig. 1A). Two gigabit Ethernet cameras (Allied Vision Technologies, Prosilica GE680) were placed behind the chamber for head tracking. The monitor and cameras operated at 60 Hz. A grain hopper with a hopper light delivered food reward through an opening in the left-side wall. The hopper light was used as a secondary reinforcer (for details, see the ‘Procedure’ section). The experiments were controlled by a personal computer (ThirdWave Corporation, Diginnos Series) running MATLAB with the Psychtoolbox extensions (Brainard, 1997).
Tracking
Before daily experimental sessions, the optical distortions of each camera and the position and angle of one camera relative to the other were calibrated using the ‘stereoCameraCalibrator’ function embedded in MATLAB. A red circular sticker (7 mm diameter) attached to the back of the pigeon's head served as the tracking marker. The centroids of the marker region in the images from the two cameras were extracted by thresholding RGB values. The center of the marker in 3D space was estimated using the triangulation method: the centroid in each camera image defined a specific direction of a straight line in 3D space originating from the camera position, and the point at which the lines from the two cameras came closest was taken as the position of the marker in 3D space. The estimated head position was used to create a visual target with virtual depth in real time (see ‘Stimuli’ section) and for offline analysis of the pigeons’ pecking trajectories (see ‘Analysis’ section).
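For illustration, the per-frame marker localization can be sketched in MATLAB as follows. This is a minimal sketch, not our actual task code: it assumes the Computer Vision Toolbox, stereoParams is the output of ‘stereoCameraCalibrator’, and the RGB thresholds and function names are hypothetical values that would be tuned to the real marker and lighting.

```matlab
% Minimal sketch of per-frame marker localization (illustrative only).
% img1/img2 are simultaneous RGB frames from the two calibrated cameras.
function xyz = locateMarker(img1, img2, stereoParams)
    c1 = redCentroid(img1);             % 2D centroid in camera 1 (pixels)
    c2 = redCentroid(img2);             % 2D centroid in camera 2 (pixels)
    if isempty(c1) || isempty(c2)
        xyz = [];                       % marker lost in this frame
        return
    end
    % triangulate() back-projects each centroid to a 3D ray and returns
    % the point nearest to both rays, in calibration units relative to
    % camera 1 (Computer Vision Toolbox).
    xyz = triangulate(c1, c2, stereoParams);
end

function c = redCentroid(img)
    % Simple RGB thresholding for the red marker; thresholds hypothetical.
    bw = img(:,:,1) > 150 & img(:,:,2) < 80 & img(:,:,3) < 80;
    s  = regionprops(bw, 'Centroid', 'Area');
    if isempty(s), c = []; return; end
    [~, k] = max([s.Area]);             % keep the largest red blob
    c = s(k).Centroid;                  % [x y] centroid in pixels
end
```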
Stimuli
A white Gaussian circle was depicted as the target on a black screen. The target size (2σ of the Gaussian function) varied: 14.9, 17.8, 20.5, 21.7, 24.7 or 29.7 mm. A background of white grid lines was also depicted to emphasize the change in target position according to the head position. The background was composed of 25 horizontal and vertical lines and measured 148.5×148.5 mm. Two square keys with different textures were depicted to record the pigeons’ size classification responses.
This closed-loop system simulates the motion parallax of a virtual position nearer or farther than the monitor surface, provided the response of the system is sufficiently rapid (in humans, less than 485 ms; Yuan et al., 2000). We quantified the latency to detect the head marker in the camera images, calculate the 3D position of the marker and change the stimulus position according to the detected marker position, as in Stewart et al. (2015). A light-emitting diode (LED) was placed in the experimental chamber so that the two cameras imaged it. The program was modified to calculate the latency from LED onset to the stimulus drawing according to the detected LED position, repeated 1000 times. The mean delay and its s.d. were 74 and 11 ms, respectively (Fig. S2).
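The mapping from tracked head position to on-screen target position follows from simple ray geometry: the target is drawn where the line from the head to the target's virtual 3D position crosses the monitor plane. A minimal sketch in our own notation (not the original code): head = [xh yh d] is the tracked head position with d the distance from the monitor plane, anchor is the virtual target's frontal position, and zt is the virtual depth (positive for farther than the monitor surface).

```matlab
% Illustrative closed-loop projection of a target at virtual depth zt (mm).
function onScreen = projectTarget(head, anchor, zt)
    d = head(3);                          % head distance from monitor plane
    t = d / (d + zt);                     % ray parameter at the screen plane
    onScreen = head(1:2) + t * (anchor - head(1:2));
end
% The on-screen displacement gain relative to head translation is
% zt/(d + zt): positive (target moves with the head) for zt > 0 (far),
% negative (against the head) for zt < 0 (near), and zero for zt = 0,
% in which case the target is stationary, as in the training phase.
```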
Procedure
The pigeons were trained on a size classification task (Fig. 1B, Movie 1). A trial started with a white self-start key (14.9×14.9 mm) appearing at the center of the display on the background grid. Pecking the key immediately eliminated it. After 2 s, a white target circle appeared on the background. Five pecks at the target replaced it with two choice keys below the background grid. To induce head movements during presentation of the target, the virtual position of the target was changed after each peck, with the depth position (zt) held constant within a trial. The pigeons were differentially reinforced for pecking different keys according to the target size. One key (left key for Birds 1 and 2; right key for Bird 3) was for targets of 14.9, 17.8 and 20.5 mm, and the other was for 21.7, 24.7 and 29.7 mm. If they pecked the correct key, the pigeons were rewarded with 3–6 s access to mixed grain, granted with a probability of 50%. Regardless of whether food was delivered, the light above the grain hopper came on whenever the subject responded correctly, serving as a secondary reinforcer. This procedure increased the number of trials possible in a daily session. Pecking the wrong key resulted in a 5–7 s timeout, and a correction trial was inserted before the next trial. In a correction trial, the same stimulus appeared and pecks at the wrong key were not counted, so that the subject eventually responded to the correct key. This procedure equalized the number of reinforcements for the left and right choice keys in a session, thus eliminating response biases. The durations of timeout and food access varied according to the weight and motivation of each subject. The self-start key, target and choice keys were not depicted if the head marker was not detected by the tracking cameras.
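As a compact restatement, the trial flow can be sketched as MATLAB-style pseudocode. Every function name below is a placeholder we introduce for illustration; only WaitSecs is a real Psychtoolbox timing call.

```matlab
% Schematic flow of one trial (illustrative pseudocode only).
showSelfStartKey(); waitForPeck();   % white start key, erased by a peck
WaitSecs(2);                         % 2 s delay before the target appears
for p = 1:5                          % five pecks required at the target
    drawTargetAtVirtualDepth(zt);    % closed-loop placement; zt fixed per trial
    waitForPeck();
end
choice = presentChoiceKeys();        % 'small' vs 'large' keys
if isCorrectChoice(choice, targetSize)
    lightHopperLamp();               % secondary reinforcer on every correct trial
    if rand < 0.5                    % food on 50% of correct responses
        raiseGrainHopper();          % 3-6 s access to mixed grain
    end
else
    runTimeout();                    % 5-7 s timeout
    scheduleCorrectionTrial();       % same stimulus repeats; errors not counted
end
```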
In the training phase, the virtual depth of the target was always 0 mm, meaning that it was located on the monitor surface and therefore stationary irrespective of head movements. Daily training sessions consisted of 30 blocks of six trials. Within each block, the six size conditions were presented in random order. Subjects advanced to the test phase when their performance in two successive sessions exceeded 80%.
In the test phase, the virtual depth of the target was manipulated. Each test session consisted of 180 training trials and 32 probe trials. In probe trials, the virtual depth of the target was ±10 mm. When the virtual position was farther than the monitor surface, the target moved in the same direction as the head movements; when it was nearer, the target moved in the opposite direction. The pigeons' responses were always reinforced in probe trials, irrespective of their choices, to prevent them from learning to respond to probe stimuli using cues or criteria other than those in training trials. Probe trials were inserted in semi-randomized order such that they never occurred on two successive trials. Between test sessions, at least one training session was conducted to confirm that performance remained above 75%; if performance was lower, training sessions were conducted until the criterion was again reached.
Analysis
Data from 15 test sessions (3180 trials) for each subject were used for further analysis, after excluding some sessions owing to errors in marker tracking (four and eight sessions for two subjects). Marker trajectories were recorded while the pigeons pecked at the target in test sessions. Marker positions were smoothed with a 17 ms-wide Gaussian kernel for the x, y and z coordinates. If the marker position was lost for more than 50 ms, the kernel was applied separately to each trajectory segment longer than five frames. As in previous studies (Goodale, 1983; Theunissen et al., 2017), we found that pigeons' head movements consisted of hold and thrust phases (Fig. 2, Movies 1 and 2). A hold phase was defined as a period in which head velocity remained below 30 mm s−1 for three frames. Mean marker distances from the center of the target on the monitor were calculated for each hold phase. If there were fewer than three hold phases between successive pecks, these data were excluded from further analysis because the pigeons were not considered to have adjusted their head to the target precisely before pecking. The effects of the virtual depth and size of the target on head distance from the target in the hold phase were examined with a linear mixed-effects model (LMM) with a random effect of subject ID. The same effects on hold-phase duration were examined with a generalized linear mixed-effects model (GLMM) with a Poisson distribution. Because target size had a non-linear relationship with hold duration, target size was converted to a ratio relative to the classification boundary (21.0 mm) for the GLMM fitting.
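For illustration, the hold-phase extraction can be sketched in MATLAB as follows. This is a minimal sketch under stated assumptions: xyz is an N×3 marker trajectory sampled at 60 Hz, smoothdata requires R2017a or later, the kernel window size is our choice, and we read the criterion as runs of three or more consecutive slow frames.

```matlab
% Illustrative hold-phase detection (thresholds follow the text above).
dt    = 1/60;                                   % camera frame period, s
sigma = 0.017 / dt;                             % 17 ms Gaussian kernel, in frames
xyzS  = smoothdata(xyz, 1, 'gaussian', ceil(6*sigma));  % smooth x, y and z
speed = [0; sqrt(sum(diff(xyzS).^2, 2)) / dt];  % head speed, mm s^-1
slow  = speed < 30;                             % below 30 mm s^-1
edges = diff([0; slow; 0]);                     % run-length edges of slow frames
starts = find(edges == 1);                      % run onsets
stops  = find(edges == -1) - 1;                 % run offsets
keep   = (stops - starts + 1) >= 3;             % runs of >= 3 frames only
holdPhases = [starts(keep), stops(keep)];       % [onset offset] frame indices
```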
To examine the relationship between hold duration and discrimination performance, an LMM analysis of the mean hold duration within a trial was performed with fixed effects of target size ratio and correctness, and a random effect of subject ID. The effects of virtual depth and target size on size classification (key choice) responses were tested by GLMM analysis with a binomial distribution and a random effect of subject ID. Because target depth had a non-linear relationship with the proportion of large-key choices, target depth (−10, 0, 10 mm) was treated as a categorical variable (‘near’, ‘baseline’, ‘far’).
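A sketch of the key-choice model using MATLAB's fitglme (Statistics and Machine Learning Toolbox) is shown below; the table and variable names are illustrative, not those of our analysis scripts.

```matlab
% Illustrative GLMM for size classification; T has one row per trial with
% LargeChoice (0/1), TargetSize (mm), Depth and Subject columns.
T.Depth   = categorical(T.Depth, {'near', 'baseline', 'far'});
T.Subject = categorical(T.Subject);
mdl = fitglme(T, 'LargeChoice ~ TargetSize * Depth + (1|Subject)', ...
              'Distribution', 'Binomial');
disp(mdl)    % fixed effects of size, depth and their interaction
% The hold-distance LMM above would use fitlme with an analogous formula,
% e.g. fitlme(T, 'HoldDistance ~ TargetSize * Depth + (1|Subject)').
```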
RESULTS
Distance to the target in hold phase decreases as its virtual depth increases
Head movements of pigeons consisted of thrust and hold phases, as reported in previous studies (Goodale, 1983; Theunissen et al., 2017; Fig. 2, Movies 1 and 2). We investigated the effects of the size and virtual depth of the target on the duration of the hold phase and the head distance to the target on the monitor. Head distance decreased as the motion parallax of the target simulated a farther position (F1,55584=32.1, P<0.0001; Fig. 3A, Table 1). This suggests that pigeons attempted to hold their head at a constant distance from the target before pecking. However, this result may reflect not the depth information itself but the motion characteristics of the target on the monitor: the target moved in the same direction as the head movement when the motion parallax was far, which may have resulted in the shorter head distance. To exclude this possibility, we measured the vertical distance of the head from the monitor surface, which is unaffected by the 2D motion of the target on the monitor, and applied the same statistical analysis. The vertical distance also decreased as the virtual depth of the target increased when the target was large, whereas the effect of virtual depth was weakened for smaller targets owing to a floor effect (F1,55584=39.78, P<0.0001; Fig. S3, Table S1). This indicates that the results reflect the pigeons' visuo-motor response using self-generated motion parallax, not the physical characteristics of the motion parallax stimulation.
Distance to target in hold phase decreases as target size decreases
Head distance increased as target size increased (F1,55584=27.0, P<0.0001; Fig. 3B, Table 1). This suggests either that pigeons held their head at a closer position before pecking as the target became more difficult to peck, or that they strove to keep the visual angle subtended by the target constant. Their pecking positions peaked around the center of the target (Fig. S4), excluding the possibility that they aimed at the peripheral region of the target, which would have increased head distances to the center of the target as target size increased.
Hold-phase durations are correlated with task difficulty
There appeared to be a non-linear relationship between hold-phase duration and target size: hold duration increased as the target size became more ambiguous for size classification. We therefore converted the target sizes into size ratios relative to the classification boundary (21.0 mm) by calculating the absolute difference between the log-transformed values of the target size and the classification boundary. The size ratios for 14.9, 17.8, 20.5, 21.7, 24.7 and 29.7 mm were 0.35, 0.16, 0.025, 0.025, 0.16 and 0.35, respectively. We used a GLMM to investigate the effects of size ratio and virtual depth of the target on hold duration. Hold duration increased as the size ratio decreased (F1,55585=58.1, P<0.0001; Fig. 4A, Table 2), indicating that pigeons held their head longer when the target was difficult to discriminate. To further investigate whether longer hold durations reflected better discrimination performance, the effects of size ratio and correctness on mean hold duration within a trial were analyzed. Whereas size ratio had little effect on mean hold duration when the response was incorrect, mean hold duration increased with task difficulty when the response was correct (F1,7734=18.6, P<0.0001; Fig. 4B, Table 3). This indicates that pigeons looked at the target longer when its size was difficult to classify, particularly when they responded correctly.
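In formula form (our notation; we assume natural logarithms), the size ratio for a target of size s is:

```latex
% Size ratio relative to the classification boundary (natural log assumed):
\[
  r(s) = \left| \ln s - \ln 21.0 \right|
       = \left| \ln \frac{s}{21.0} \right| ,
  \qquad \text{e.g. } r(29.7\,\mathrm{mm}) = \left| \ln(29.7/21.0) \right| \approx 0.35 .
\]
```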
Visual size perception is not affected by virtual depth simulated by motion parallax
Fig. 5 shows the proportion of large-key choices as a function of target size. The pigeons chose the large key more often as target size increased (F1,7732=266.9, P<0.0001; Table 4). If size constancy occurred with the motion parallax depth cue, the pigeons would overestimate target size when its virtual depth was farther, and vice versa. However, the pigeons' key choices were not affected by the virtual depth of the target (main effect, F1,7732=0.5, P=0.6; interaction with target size, F1,7732=0.7, P=0.5; Table 4). Therefore, we found no evidence of size constancy based on motion parallax in pigeons.
DISCUSSION
The aim of our study was to test the effects of motion parallax resulting from head movements on visuo-motor control and visual size perception in pigeons. We tracked each pigeon's head and moved a visual target on a monitor in synchrony with the real-time head movements, allowing us to simulate the motion parallax of an object positioned nearer or farther than the monitor surface. The main finding is that pigeons held their head before pecking at more distant positions as the virtual position of the target got closer. Given that pigeons hold their head at a constant distance before pecking (Goodale, 1983), this suggests that they attempt to maintain viewing distance using motion parallax from head movements. Pigeons and other birds often move their head to change their gaze direction and to keep their retinal images stabilized (Dunlap and Mowrer, 1930; Friedman, 1975; Frost, 1978; Troje and Frost, 2000). It has therefore been proposed that they access motion parallax depth cues from these head movements (Frost, 1978; Davies and Green, 1988; Troje and Frost, 2000). To our knowledge, this is the first experimental demonstration that pigeons use the motion parallax depth cue. Unlike owls, which compute depth information from binocular disparity (Pettigrew and Konishi, 1976), pigeons are thought to rely more on motion parallax because of their laterally oriented eyes. Pigeons use this depth information at least for visuo-motor control, as found in other vertebrate and invertebrate species (Wallace, 1959; Goodale et al., 1990; Poteser and Kral, 1995). This is consistent with the finding that pretectal neurons in pigeons are selective for visual motion directions simulating motion parallax of different depths (Xiao and Frost, 2013). These neurons belong to the accessory optic system, which processes the motion of large patterns caused by self-motion.
In contrast to the results on visuo-motor responses, we found no evidence that visual size perception is affected by the motion parallax depth cue in pigeons. This indicates that pigeons use motion parallax selectively for visuo-motor control and not for size constancy. We previously found that this species' visual size perception was not affected by motion parallax caused by relative object motion (Hataji et al., 2020). The difference between our previous study and the present one is that motion parallax in the stimuli was object produced in the former and self-generated in the latter. Our data suggest that size constancy does not operate with the motion parallax cue regardless of whether it is object produced or self-generated. Our earlier study (Hataji et al., 2020) demonstrated size constancy based on pictorial depth cues in pigeons. Given that this might reflect not depth information but two-dimensional structures inherent in these cues, such as repulsion and attraction mechanisms (Fujita et al., 1991, 1993), it seems plausible that size constancy does not function with the motion parallax depth cue. Previous human studies also suggest that motion parallax is less important for size constancy than other visual depth cues, such as binocular disparity and pictorial cues (Luo et al., 2007; Watt and Bradshaw, 2003). Therefore, the selective use of depth cues, with motion parallax for visuo-motor control and pictorial cues for size constancy, might be widespread among animals.
Analysis of the trajectories of pecking movements also revealed that head distance to the target in hold phases is regulated by target size. This result is consistent with the findings of Theunissen et al. (2017). Pigeons visually control head distance to peck a target precisely: they keep their head at a controlled distance using motion parallax depth cues, pecking from a closer distance when the target is small and difficult to peck. Given that target size affected head distance, the observed effect of motion parallax might have derived from size information modulated by size constancy rather than from depth information. If the effect of motion parallax derived from size constancy, head distance should have been larger when motion parallax signified a farther position, because size constancy would increase the perceived target size in the farther condition. Contrary to this prediction, we found that head distance decreased as motion parallax signified a farther position, indicating that the effect of motion parallax derived from the depth information itself.
We found that head-holding duration before pecking was longer when the target size was difficult to discriminate, and that longer head-holding durations reflected the pigeons' performance in size classification. This suggests that pigeons' head movements are not fixed patterns of thrust and hold phases, but rather that viewing durations during hold phases are controlled by various cognitive demands, such as task difficulty and stimulus ambiguity.
Several potential confounds might have affected our data. First, our tracking system might not have properly simulated the motion parallax of virtual positions. The delay of the virtual system was 74 ms, which corresponds to four to five monitor frames. We also observed noisy movements of the target on the monitor when the virtual depth was not 0 mm, possibly caused by marker-tracking errors (Movie 1). These tracking delays and noise could have affected the pigeons' motor control and perceptual performance. We cannot say whether, when the virtual depth of the target was not 0 mm, the pigeons perceived a stationary target positioned nearer or farther than the monitor surface, or a moving target positioned at the monitor surface. However, we argue that the motion parallax presented here was effective for their visuo-motor control system, because the virtual depth of the target did affect the head distance to the target in hold phases.
A second possible confound is that the potential changes in perceived size caused by the motion parallax manipulation were not suprathreshold for pigeons. Given that the average distances from the marker to the target and from the marker to the eyes were ∼100 and ∼25 mm, respectively, the average viewing distance was ∼75 mm. Changes in virtual depth were ±10 mm. If size constancy worked perfectly, perceived sizes would change at ratios of 65:75 and 85:75 for the nearer and farther conditions, respectively. These ratios are comparable to the manipulation of pictorial cues in our previous study (115:135 and 155:135; Hataji et al., 2020). They are also comparable to the discrimination thresholds of 65.7:73.9 and 73.9:84.9, calculated from the ratios of target size at which the proportion of large-key choices was 25% and 50%, and 50% and 75%, respectively. However, this does not ensure that the ±10 mm change in virtual depth was sufficient to induce size constancy in pigeons, because size constancy does not perfectly correct perceived size. In our previous study, which showed size constancy with pictorial depth cues (Hataji et al., 2020), the ratio of the perceptual change in target size to the size change predicted by the pictorial cue manipulation was below 1.0 (0.92 and 0.95 when the pictorial cue was far and near, respectively). If the contribution of motion parallax to size constancy is smaller than that of pictorial cues, then a motion parallax manipulation of ±10 mm may not be sufficient. It might be necessary to replicate the current study with larger changes in virtual depth.
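In compact form, the perfect-constancy prediction above is simply the ratio of virtual to actual viewing distance (our notation, with d ≈ 75 mm and Δz = ±10 mm):

```latex
% Predicted perceived-size ratio under perfect size constancy:
\[
  \frac{\hat{s}}{s} = \frac{d + \Delta z}{d},
  \qquad
  \frac{75 - 10}{75} = \frac{65}{75} \approx 0.87 \;(\text{near}),
  \quad
  \frac{75 + 10}{75} = \frac{85}{75} \approx 1.13 \;(\text{far}).
\]
```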
Finally, we consider the effect of repeated experiences of pecking the monitor surface. These experiences potentially provided feedback that the target was in fact always located at the monitor surface, weakening the effect of virtual depth across training and testing sessions. This could explain why the changes in head distance in hold phases were much smaller than the changes in the virtual depth of the target. However, we do not consider that the pecking experience hindered size constancy by motion parallax, because we previously demonstrated that size constancy occurs with pictorial cues using a similar touch panel training protocol (Hataji et al., 2020). We also confirmed, in the current study, that there was no effect of session number on the changes in head distance or size classification according to virtual depth.
Conclusion
This study investigated the effect of self-generated motion parallax via head movements on visuo-motor control and visual size perception in pigeons. The results showed that pigeons adjusted their head distance to the target in hold phases according to the virtual depth simulated by motion parallax cues, whereas visual size perception was not affected by the virtual depth. This suggests that head movements of pigeons serve the function of accessing motion parallax depth cues. The results additionally showed that head distance was adjusted according to target size and that head-holding duration was adjusted according to task difficulty.
Acknowledgements
We thank James R. Anderson for editing an earlier version of the manuscript.
Footnotes
Author contributions
Conceptualization: Y.H.; Software: Y.H., K.F.; Validation: Y.H.; Formal analysis: Y.H., K.F.; Investigation: Y.H.; Writing - original draft: Y.H., H.K.; Project administration: H.K., K.F.
Funding
This study was financially supported by the Japan Society for the Promotion of Science (KAKENHI grant numbers 15J02739 to Y.H., and 16H01505 and 16H06301 to K.F.).
Data availability
Data and MATLAB codes are available in the Open Science Framework repository (https://osf.io/b6y3u/).
Competing interests
The authors declare no competing or financial interests.