Ants can use visual information to guide long idiosyncratic routes and accurately pinpoint locations in complex natural environments. It has often been assumed that the world knowledge of these foragers consists of multiple discrete views that are retrieved sequentially, breaking routes into sections and controlling approaches to a goal. Here we challenge this idea using a model of visual navigation that does not store or use discrete views, and show that it replicates the results of paradigmatic experiments that have been taken as evidence that ants navigate using such discrete snapshots. Instead of sequentially retrieving views, the proposed architecture gathers information from all experienced views into a single memory network, and uses this network all along the route to determine the most familiar heading at a given location. This algorithm is consistent with the navigation of ants in both laboratory and natural environments, and provides a parsimonious way of dealing with visual information from multiple locations.
The visual navigation of ants is a beautiful demonstration of how small brains can produce impressive and robust behaviours in complex environments, providing a prime opportunity to investigate parsimonious mechanisms underlying visual guidance. We know from many experiments that an experienced ant forager will preferentially use visual cues for guiding long idiosyncratic foraging routes, and can also use visual cues to control the search for a specific location such as a nest entrance (for a review, see Wehner et al., 1996). Our understanding of how insects might use vision for these tasks has been shaped by the seminal paper of Cartwright and Collett (Cartwright and Collett, 1983). They showed that navigation to a discrete location can be achieved using just a single visual memory or ‘snapshot’, stored at the target location. The key insight was that animals could home by moving to reduce the retinotopic mismatch between their current view and the retrieved view from the target location. In this case, the stored visual memory acts as a point-attractor to the location where the memory was stored, drawing the animal home even from novel locations. The stored view need not be a pictorial representation of the world; it could equally consist of a set of extracted features or a simple parameterisation of the scene. Nonetheless, the defining characteristic of models of this class, commonly termed ‘snapshot models’, is the storage and retrieval of a single view memorised at a goal location for comparison with the current view.
Observations of ants suggest that the views used to guide a search for a discrete location, such as the nest entrance or a feeder, are learnt during specialised behaviours called learning walks (Nicholson et al., 1999). Learning walks are performed by naïve ants when first leaving a goal as well as by experienced ants in response to a change to the familiar visual scene. Müller and Wehner (Müller and Wehner, 2010) observed that Ocymyrmex ants departing the nest display a characteristic spiral walk around the nest and at several points turn back and pause whilst facing the inconspicuous nest entrance (Fig. 1A). The occurrence of multiple points at which the ant fixates the nest entrance suggests that visual homing may not be based on a single memorised view of the goal, but on several discrete views memorised whilst facing the target. Similarly, Judd and Collett (Judd and Collett, 1998) showed that wood ants departing from a hidden feeder marked by a black cone also turned back and fixated the cone at discrete locations (Fig. 2A). During subsequent approaches to the feeder (Fig. 2C), ants showed clear switches in their heading, as if successively matching the edges of the cone with the discrete views stored during learning walks (Fig. 2E). Taken together, these results suggest that ants use multiple discrete views to guide their approach to, and search for, important locations. Likewise, ants will need to learn visual information from multiple locations to navigate long, visually guided routes through cluttered environments. Because of the success of snapshot-type models for short-range navigation, an orthodoxy has developed that navigation is achieved by matching the current world view with a sequentially retrieved series of discrete visual memories sampled along familiar routes (e.g. Collett and Collett, 2002).
Despite the simplicity of the idea, in practice it has proved difficult to extend snapshot-type models of navigation to the use of multiple discrete views in this way [for a discussion, see Smith et al. (Smith et al., 2007)]. One reason is that, because each view acts as an attractor to a discrete location in space, the views need to be used independently. Thus models must deal with the difficulty of retrieving the appropriate visual memory at the appropriate location along the route. To date this remains an open problem: no plausible model in which discrete views are sequentially retrieved and matched has been able to provide guidance over long ranges through complex environments.
Baddeley et al. (Baddeley et al., 2012) recently developed an alternative model of visual navigation that avoids the need to sequentially retrieve and match discrete visual memories. The model uses a neural network trained with all views experienced during a single, path-integration mediated training route. Rather than storing each view independently, the network gathers information across all views experienced into a single memory network – referred to as a holistic memory. To subsequently recapitulate the route, the trained network assesses the familiarity of views perceived by the simulated ant as she scans the world by rotating on the spot. As ants are constrained to move in the direction they are viewing, a familiar view specifies a familiar direction to move in. After each scan, the simulated ant navigates by simply taking a step in the most familiar direction. The problem of view retrieval is avoided because the same holistic memory provides guidance all along the trip.
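To make the idea of a holistic memory concrete, the following Python sketch shows a single weight matrix that is adjusted by every training view, with no view stored individually. It is an illustration only and not the MATLAB code used in this study. The update is an Infomax-style rule written from the general form of such rules; the present text does not state the rule, so its exact form, the learning rate, the tiny network size and the convention that lower summed activity means a more familiar view are all assumptions. The class and method names are hypothetical.

```python
import math
import random


class FamiliarityMemory:
    """One weight matrix shaped by all training views; no view is stored."""

    def __init__(self, n_inputs, n_hidden, learning_rate=0.01, seed=0):
        rng = random.Random(seed)
        self.lr = learning_rate
        # Small random initial weights (an arbitrary choice for this sketch).
        self.w = [[rng.uniform(-0.5, 0.5) / math.sqrt(n_inputs)
                   for _ in range(n_inputs)] for _ in range(n_hidden)]

    def _hidden(self, view):
        # Linear activation of each hidden unit: h = W x
        return [sum(wij * xj for wij, xj in zip(row, view)) for row in self.w]

    def train(self, view):
        """Assumed Infomax-style update: W += lr * (W - (tanh(h) + h) h^T W)."""
        h = self._hidden(view)
        g = [math.tanh(hi) + hi for hi in h]
        n_inputs = len(view)
        # hw[j] = sum_k h[k] * W[k][j], computed from the weights before update
        hw = [sum(h[k] * self.w[k][j] for k in range(len(h)))
              for j in range(n_inputs)]
        for i, row in enumerate(self.w):
            for j in range(n_inputs):
                row[j] += self.lr * (row[j] - g[i] * hw[j])

    def unfamiliarity(self, view):
        """Summed absolute hidden activity; lower is treated as more familiar."""
        return sum(abs(hi) for hi in self._hidden(view))


# Toy usage: two made-up 4-pixel "views" are presented one by one.
memory = FamiliarityMemory(n_inputs=4, n_hidden=4)
training_views = [[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 1.0, 0.0]]
for _ in range(50):
    for view in training_views:
        memory.train(view)
score = memory.unfamiliarity(training_views[0])
```

The point of the sketch is structural: after training, only the weight matrix remains, and the same matrix scores any view encountered anywhere along the route.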
This familiarity-based approach has been shown to capture many of the properties of visual navigation in ants and successfully explains route following in complex environments. However, it is unclear whether it can account for the evocative observations of Müller and Wehner (Müller and Wehner, 2010) and Judd and Collett (Judd and Collett, 1998) that suggest the independent use of multiple discrete stored views. Here, we apply the familiarity-based model to (1) the classic landmark manipulation experiment of Wehner et al. (Wehner et al., 1996) with realistic learning walks according to Müller and Wehner (Müller and Wehner, 2010) and (2) the key experimental scenario of Judd and Collett (Judd and Collett, 1998) with the learning walks reported in the original paper. We show that navigation driven by a single holistic memory is capable of reproducing both of these key experimental results. This provides proof-of-concept that these observations, as well as visual navigation in ants more generally, can be explained without the need for storage and independent retrieval of discrete memorised views.
MATERIALS AND METHODS
Simulated environments, and the familiarity network
All simulations were undertaken in MATLAB (The MathWorks, Natick, MA, USA) using code adapted from Baddeley et al. (Baddeley et al., 2012) and any parameters not listed here were as in the original study. The major difference is that, in the original study, a neural network was trained with the visual input experienced along an entire route whereas here we trained the network using only the visual input experienced during the portions of learning walks where ants turn and fixate the goal (Fig. 1B, Fig. 2B). These points are clearly identified in the original papers of Judd and Collett (Judd and Collett, 1998) and Müller and Wehner (Müller and Wehner, 2010). During training, the views are presented one by one and the network gathers information by adjusting weights between the visual input and 800 hidden units so as to maximise the extraction of information across the whole set of training views. After training, the overall activity of the network provides a measure of the familiarity of the current view with respect to the trained weights. Route navigation is then achieved by an iterative procedure in which the simulated ant scans the world from left to right in 1 deg increments (−90 deg to +90 deg from the previous heading) and takes a 1 cm step in the direction that produces the most familiar view across the scan.
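The iterative scan-and-step procedure can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the MATLAB implementation: `familiarity` and `render_view` are hypothetical callables supplied by the caller (the first returns a higher score for a more familiar view, the second returns the view seen from a position when facing a given heading), and ties are resolved in favour of the first heading scanned.

```python
import math


def scan_and_step(position, heading_deg, familiarity, render_view,
                  scan_range_deg=90, scan_step_deg=1, step_cm=1.0):
    """Rotate on the spot, choose the most familiar heading, step forward."""
    best_heading, best_score = heading_deg, -math.inf
    # Scan from -90 deg to +90 deg around the previous heading in 1 deg steps.
    for offset in range(-scan_range_deg, scan_range_deg + 1, scan_step_deg):
        candidate = (heading_deg + offset) % 360
        score = familiarity(render_view(position, candidate))
        if score > best_score:
            best_heading, best_score = candidate, score
    x, y = position
    rad = math.radians(best_heading)
    new_position = (x + step_cm * math.cos(rad), y + step_cm * math.sin(rad))
    return new_position, best_heading


# Toy stand-ins (hypothetical): the "view" is just the heading, and the
# most familiar heading is 30 deg.
def toy_render(position, heading_deg):
    return heading_deg


def toy_familiarity(view):
    return -abs((view - 30 + 180) % 360 - 180)


position, heading = scan_and_step((0.0, 0.0), 0, toy_familiarity, toy_render)
```

Because the walking direction is coupled to the viewing direction, the loop needs no estimate of where the goal is; it only compares familiarity scores across the scan.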
Simulated environments were created to replicate the experimental scenarios described in Wehner et al. (Wehner et al., 1996) (Fig. 1) and Judd and Collett (Judd and Collett, 1998) (Fig. 2) with landmarks coloured black on a white background. The visual input of virtual ants was simulated by sampling panoramic images at ant-eye resolution (4 deg of acuity) before passing through an edge detection algorithm (edge function in MATLAB with default parameters). Baddeley et al. (Baddeley et al., 2012) used a simple intensity-based retinotopic visual system. As we are here replicating biological data, we have included an edge enhancement, which is found in all animal visual systems (Sanes and Zipursky, 2010).
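The two preprocessing stages, coarse sampling followed by edge extraction, can be illustrated with the Python sketch below. It assumes a panoramic intensity image stored as rows of values at 1 deg per pixel (white = 1.0, black = 0.0), which is an assumption of this example. A simple gradient threshold stands in for MATLAB's edge function; the two are not equivalent, and the threshold value is arbitrary.

```python
def downsample(image, block=4):
    """Average non-overlapping block x block patches (1 deg -> 4 deg pixels)."""
    rows, cols = len(image) // block, len(image[0]) // block
    return [[sum(image[r * block + i][c * block + j]
                 for i in range(block) for j in range(block)) / block ** 2
             for c in range(cols)] for r in range(rows)]


def edges(image, threshold=0.25):
    """Mark pixels whose horizontal or vertical intensity change is large.

    Columns wrap around because the image is a 360 deg panorama;
    rows are clamped at the top and bottom.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dx = image[r][(c + 1) % cols] - image[r][c - 1]
            dy = image[min(r + 1, rows - 1)][c] - image[max(r - 1, 0)][c]
            if max(abs(dx), abs(dy)) > threshold:
                out[r][c] = 1
    return out


# Toy panorama: 8 rows x 360 columns, one black landmark on a white background.
panorama = [[0.0 if 100 <= c < 140 else 1.0 for c in range(360)]
            for _ in range(8)]
low_res = downsample(panorama)   # 2 rows x 90 columns at 4 deg per pixel
edge_map = edges(low_res)        # 1 where a landmark boundary is detected
```

After this step only the boundaries of the landmark remain in the input, so the memory network is driven by edge positions rather than by filled regions.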
Replication of Wehner et al., 1996
The model was assessed in the three conditions described in Wehner et al. (Wehner et al., 1996) (Fig. 1C–E). The network was trained with views that match the typical frequency and orientation of fixations recorded by Müller and Wehner (Müller and Wehner, 2010). Twenty-four virtual ants were then tested in three environments: the training environment (Fig. 1C,F), landmarks at twice the distance (Fig. 1D,G) and landmarks twice the size at twice the distance (Fig. 1E,H). Trials were randomly initialised from a series of positions around the fictive nest to remove any positional bias, and each path was recorded for 10 m. The search data across all simulated ants were summed and are presented for each condition (Fig. 1F–H).
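Summing search data across simulated ants amounts to pooling the sampled positions of all paths into a spatial histogram. The Python sketch below shows one way to do this; the function name and the 10 cm bin size are hypothetical choices for illustration, as the binning used for the figures is not specified here.

```python
from collections import Counter


def search_density(paths, bin_cm=10.0):
    """Count path samples per square bin, pooled over all paths."""
    counts = Counter()
    for path in paths:
        for x, y in path:
            counts[(int(x // bin_cm), int(y // bin_cm))] += 1
    return counts


# Two short made-up paths (coordinates in cm relative to the fictive nest).
paths = [[(0.0, 0.0), (5.0, 5.0), (12.0, 3.0)],
         [(-1.0, 0.0), (11.0, 9.0)]]
density = search_density(paths)
```

Bins with high counts correspond to places where the simulated ants spent most of their search.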
Replication of Judd and Collett, 1998
Judd and Collett (Judd and Collett, 1998) describe the portions of paths where ants turn back and walk towards the feeder during nestward routes (Fig. 2A, dark segments). We replicated these portions of the routes and used the views experienced along these segments to train the network (Fig. 2B). After training, a single simulated ant was released 30 cm from the feeder, corresponding to the commencement of tracking in the original study. The path was terminated when the simulated ant reached 0.25 cm from the goal. The path followed by the simulated ant and the retinal position of the cone's edges were recorded.
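The retinal position of the cone's edges follows from simple geometry once the ant's position and heading are known. The Python sketch below treats the cone as a circle in plan view, which is a simplifying assumption of this example, and reports edge positions in degrees relative to the heading, with positive values to the left. The function name and the numerical values are hypothetical, and the ant is assumed not to be at the cone's centre.

```python
import math


def edge_retinal_positions(ant_xy, heading_deg, cone_xy, cone_radius):
    """Return (left_edge, right_edge) in deg relative to the heading."""
    dx, dy = cone_xy[0] - ant_xy[0], cone_xy[1] - ant_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dy, dx))
    # Half the angle subtended by the cone, clamped for very close positions.
    half_width = math.degrees(math.asin(min(1.0, cone_radius / distance)))
    # Bearing of the cone centre relative to the heading, in (-180, 180].
    centre = (bearing - heading_deg + 180) % 360 - 180
    return centre + half_width, centre - half_width


# Made-up example: cone of radius 5 cm, 10 cm straight ahead of the ant.
left, right = edge_retinal_positions((0.0, 0.0), 0.0, (10.0, 0.0), 5.0)
```

Recording these two values at every step of a simulated approach gives the kind of edge-position trace that can be compared with the behavioural data.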
RESULTS AND DISCUSSION
No need for discrete snapshot memories
The aim of our modelling was to query the pervasive view that ants store and retrieve discrete views of the world independently. We first asked whether a model based on a single holistic memory could reproduce the observations of homing ants searching for their nest in triangular arrays of landmarks (Fig. 1C–E). The network that instantiates the holistic memory was trained with a set of views selected to mimic the nest-focused views generated by the learning walks of desert ants (Fig. 1B). The model accurately reproduces the search distributions across all three conditions (Fig. 1F–H). That is, simulated ants search at the fictive nest position when the visual panorama viewed from the fictive nest matches that of the training situation, and the search pattern loses its accuracy when the landmarks are moved to twice the distance without changing their size. Crucially, this is achieved without the agent storing a view from the nest position itself. That is, the stored views are not acting as point attractors to discrete locations in space.
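The logic of the landmark manipulation can be checked with a short calculation. A landmark of width w viewed from distance d subtends an angle of 2·atan(w / 2d), so doubling both size and distance leaves the view from the nest unchanged, whereas doubling distance alone shrinks it. The Python sketch below uses made-up dimensions, not those of the original experiment.

```python
import math


def angular_width_deg(width, distance):
    """Angle subtended by an object of a given width at a given distance."""
    return math.degrees(2 * math.atan(width / (2 * distance)))


# Hypothetical landmark: 0.4 m wide, 1 m from the nest.
training = angular_width_deg(0.4, 1.0)
far_same_size = angular_width_deg(0.4, 2.0)    # twice the distance
far_double_size = angular_width_deg(0.8, 2.0)  # twice the size and distance
```

Here `far_double_size` equals `training`, while `far_same_size` is smaller, which is why only the third condition restores the training panorama at the fictive nest.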
Having shown that a single memory network can take information from multiple views experienced in training and produce goal searches that match those observed with ants, we then attempted to replicate the results that provide the strongest support for the independent retrieval and matching of multiple discrete views. Judd and Collett (Judd and Collett, 1998) observed that an ant approaching a single black cone (Fig. 2C) will hold the edges of the cone at several discrete positions on its retina, as if matching discrete retrieved views in turn (Fig. 2E). We show here that this pattern of behaviour can arise from the use of a single memory network (Fig. 2F) supplied with training paths taken directly from the original Judd and Collett paper (Fig. 2A).
Connecting input to output
The explanation of how a holistic memory can reproduce patterns of behaviour suggestive of the retrieval and use of discrete views is straightforward. The use of a single network to learn views and drive subsequent navigation makes an explicit connection between the paths taken during learning and subsequent behaviour. Thus the distinct preferred retinal positions of edges during approaches to the cone are a consequence of the discrete nature of the views used for training and not a result of the system forming, storing and retrieving discrete memories. In other words, we show that the discrete nature of the output behaviour reflects the discrete nature of the input (or training) data. The philosophy of ‘embodied cognition’ explains how an intelligent interaction between the physical agent and its environment can simplify the neural processing required. Here, some of the processing has been outsourced to the active sensing behaviour of the learning walks, allowing navigation without the cognitive machinery required to store, retrieve and use discrete views.
In summary, our data demonstrate that retrieval-type memory of the views from discrete places in the world is not a prerequisite for visual navigation to those places. Instead, a single holistic memory structure can store sufficient visual information to allow navigation from a range of locations by simply following the most familiar direction. By avoiding the problem of retrieving appropriate memories, the familiarity-based approach is a parsimonious method that enables both the pinpointing of a specific location and the following of long routes through complex environments. Furthermore, the model provides accurate replications of behavioural data reported in key research papers.
When might additional mechanisms be required?
The use of familiarity as a criterion for choosing a direction is an attractive and viable scheme for navigating ants because of the constraints of the task (i.e. moving between physical goals) and their motor systems (i.e. coupled viewing and walking direction). However, in its current form, this model cannot explain all view-based behaviours in insects. For instance, hoverflies (Collett and Land, 1975) and waterstriders (Junger, 1991) use views to maintain a fixed position in a fluid space, so there may be a requirement for a view-based mechanism that acts as an attractor. Conceivably this could be implemented using an absolute familiarity threshold as a stop signal, or by the use of a snapshot in the traditional sense (Cartwright and Collett, 1983). Similarly, our current model does not capture all that we know about the sensorimotor implementation of visual navigation in ants. For instance, Lent et al. (Lent et al., 2009) show how ants can perform some form of mental image rotation to generate corrective saccades during visual orientation, whereas our model relies on an exhaustive rotational search. The inclusion of such a saccadic mechanism would improve the efficiency of the model but would not alter the pattern of results presented.
Another issue not currently addressed in our modelling concerns how ants modulate learning. During learning walks, the views used to guide a return to a nest are learnt at the beginning of the outward journey and, reciprocally, views used to pinpoint the food source are learnt at the beginning of the return journey. However, we know that ants form distinct memories for foodward and nestward navigation, and that those memories are insulated from each other (Wehner et al., 2006). It is possible that ants learn continuously but switch between foodward and nestward motivational contexts to decide to which memory the current visual information is allocated. Switching motivational context would also lead the ant to turn and face the appropriate goal, by means of path integration.
This work has been driven by the philosophy of trying to produce parsimonious hypotheses for observed behaviours. In this spirit, we proposed a simple solution that can explain key observations of visual navigation in ants from both experimental and natural conditions. Our belief is that ant experiments and insect-inspired modelling can be used to generate valuable hypothetical mechanisms for understanding animal navigation in general (Wystrach and Graham, 2012). Given that the idiosyncratic routes that are characteristic of ant navigation are also seen in many vertebrate navigators as they move through familiar terrain, one should ask whether the ideas presented here may apply to vertebrates. Certain lines of evidence suggest that this is not an entirely fanciful notion. Route following in humans does not have to engage the map-like memory formed in the hippocampus (Hartley et al., 2003), and familiarity-type memories are also independent of the hippocampus (Fortin et al., 2004). This makes the familiarity-based solution proposed here an interesting candidate to explain route following in vertebrates. The next step is to design experiments that conclusively test whether animals use such a familiarity-based memory for navigation.
We would like to thank Bart Baddeley for providing MATLAB code, and Tom Collett and Barbara Webb for comments on an earlier version of the manuscript.
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
No competing interests declared.