ABSTRACT
As babies rapidly acquire motor skills that give them increasingly independent and wide-ranging access to the environment over the first two years of human life, they decrease their reliance on habit systems for spatial localization, switching to their emerging inertial navigation system and to allocentric frameworks. Initial place learning is evident towards the end of the period. From 3 to 10 years, children calibrate their ability to encode various sources of spatial information (inertial information, geometric cues, beacons, proximal landmarks and distal landmarks) and begin to combine cues, both within and across systems. Geometric cues are important, but do not constitute an innate and encapsulated module. In addition, from 3 to 10 years, children build the capacity to think about frames of reference different from their current one (i.e. to perform perspective taking). By around 12 years, we see adult-level performance and adult patterns of individual differences on cognitive mapping tasks requiring the integration of vista views of space into environmental space. These lines of development are continuous rather than stage-like. Spatial development builds on important beginnings in the neural systems of newborns, but changes in experience-expectant ways with motor development, action in the world and success–failure feedback. Human systems for integrating and manipulating spatial information also benefit from symbolic capacities and technological inventions.
Introduction
Every mobile species must navigate through the world in order to find food and avoid danger. Another requirement for effective adaptation is to reproduce. There are different ways to accomplish both of these vital functions. Each species' distinctive ecological niche and evolved sensory capacities circumscribe the possible solutions to the navigational problem. Evolutionary forces have also created dramatic differences across species in many aspects of reproduction, including whether (and how long) adults care for the young. Across species, modes of navigation and modes of reproduction likely influence each other. For precocial species, navigation must be largely innate for the young to survive, and consequently may follow a fixed pattern. Sea turtles, for example, have to fend for themselves from the moment they hatch, and they are equipped with strong pre-programmed mechanisms that direct them to the sea, then to feeding grounds, and eventually back to their home beaches. Changes in the environment may disrupt or redirect these patterns, but basic navigational systems are unlikely to adjust much based on environmental input. By contrast, for altricial species, notably including humans with our extremely extended period of immaturity and adult caregiving, inborn capabilities are more likely to interact with post-birth experiences during a period in which adults protect and feed their young. That is, like other mammals, we possess adaptive systems that change flexibly in response to environmental demands.
Plausible as this broad-brush portrait of the development of navigation in precocial and altricial species may seem, there has been considerable debate regarding the strength of the spatial endowments present at birth in humans and the nature and importance of environmental influences. Pendulum swings between the extremes of nativism and empiricism, anchored in philosophy by the writings of Immanuel Kant and John Locke, have preoccupied the whole field of human cognitive development. The Swiss psychologist Jean Piaget proposed one kind of solution, positing that babies begin with more than the ‘blank slate’ of which Locke is accused but with less than the full-blown adult capabilities that Kant implied. Piaget also saw transitions from the infant state as due to more than either the empiricist proposal of passive receipt of environmental contingencies or the nativist idea of maturational unfolding. In his view, children's development arises from their active exploration of the world, and their internal reflection on their experiences. The developing child aims to have the world make sense. Piaget's constructivist approach provides one potential solution to the nativist–empiricist debate.
Piaget did more than outline a general constructivist approach, however. He also proposed specific characterizations of particular lines of cognitive development, including spatial development along with other kinds of conceptual development. With respect to initial endowment, Piaget credited babies with only rudimentary sensory and motor capabilities and thought that children do not reach cognitive maturity until adolescence, after a series of stage transitions. In the spatial domain, these ideas translated into a theory in which early spatial representations are topological, i.e. coding only spatial relations of touching, enclosure, overlap or disjunction, with no notion of metric distance or angle. He emphasized the idea that babies and young children code location primarily in relation to their bodies (Piaget, 1954; Piaget and Inhelder, 1956). For example, in the famous ‘three mountains’ task evaluating perspective taking, children younger than 9 or 10 years often selected their own view as the view of another observer with a different vantage point, showing egocentrism and a lack of what Piaget termed projective spatial representations. Additionally, Piaget thought that children did not begin to show evidence of Euclidean spatial relations, i.e. use of coordinate systems to organize their knowledge of the spatial world, until the age of 9 or 10 years.
In the decades since Piaget wrote, investigators in various cognitive domains have questioned his theory. They have gathered extensive evidence to show that infant cognitive endowment is much richer than he thought, and that cognitive change is less abrupt and stage-like. They have also sought to specify more precisely what factors propel developmental change. However, these revisions and challenges have not led to consensus about spatial development. Contending accounts have recently returned us to the nativist–empiricist debate. Data from paradigms studying spatial reorientation, i.e. how disoriented children (or non-human animals) recover their bearings in the world, have been seen as supporting a modern version of nativism, in which babies are simply born with two abilities relevant to navigation, a ‘geometric module’ and a capacity to use inertial navigation systems (e.g. Hermer and Spelke, 1996; Spelke et al., 2010). The additional use of featural information to reorient that appears at around the age of 5 or 6 years is attributed to the acquisition of human symbolic capacities. In an alternative neoconstructivist approach to spatial development in general (Newcombe and Huttenlocher, 2000, 2006) and to reorientation in particular (Xu et al., 2017), the use of boundary shape for spatial reorientation is not an encapsulated module, and symbolic function is only one of the many relevant forces propelling development.
This review aims to provide an overview of the historical and theoretical landscape in which researchers have studied spatial development over the past 50 years, as well as to provide a summary of the empirical findings. Over the course of the article, several themes emerge. First, the acquisition of motor capabilities plays a central role in human spatial development, just as Piaget surmised. Second, success or failure in spatial tasks affects the development of navigation through an internal (and perhaps Bayesian) process of adaptive change and combination of a variety of sources of relevant information. This conceptualization is broadly similar to Piaget's constructivism, but adds computational mechanisms and testable fine-grained hypotheses. It is also more compatible with contemporary models of spatial navigation in which various sources of information are blended rather than opposed (e.g. Barry and Burgess, 2014). Third, although human symbolic capacities are important drivers of age-related change in navigation, they are not exclusively responsible for it.
Challenging and changing Piaget's account of infancy
Research on infants' spatial representations began by addressing two questions deriving from Piaget's characterization of infant spatial representation: whether babies and toddlers show any evidence of allocentric as well as egocentric coding and of metric as well as topological coding. The answer to both questions turned out to be 'yes, there is at least some rudimentary capability for both allocentric and metric coding', contrary to what Piaget thought. These findings led to two more lines of investigation. One set of experiments addressed whether babies combine allocentric and metric coding to locate objects, i.e. use distances from a framework of distal landmarks independent of the self. Success on this task, sometimes called place learning, is not evident in babies but appears during the second year of life and improves thereafter. Another set of experiments addressed whether babies can adaptively combine information from different sensory systems (e.g. visual and auditory information regarding spatial localization) and across frames of reference (e.g. allocentric and egocentric frameworks), or choose between them when they conflict. I will review these four questions in turn.
Egocentric versus allocentric coding
Research on infant allocentric coding has used two paradigms. One technique involves the A not B error described by Piaget in tracing the sequence of development of the object concept over the first 18 months of life. He observed that babies at 8 or 9 months are typically able to find a hidden object. However, after several experiences of retrieving the object at one location (the A location), they often fail to retrieve an object hidden at another location (the B location), despite the fact they have observed the hiding event at B. This behavior is ‘egocentric’: the location is defined in relation to a body in a fixed position. There are hundreds of papers on this error, and several contending theories of why it appears and disappears. These theories all involve the roles of memory, strength of representation, motor habit and inhibition, although to varying degrees and in different ways (Diamond, 1998; Munakata et al., 1997; Smith et al., 1999). One way to see the task is as an index of conflict between a motor habit system (or response learning, probably based on the striatum) and an immature allocentric spatial coding system (probably based on the hippocampus).
A different paradigm, derived from work with rodents, explicitly pits motor habits and response learning against allocentric learning (based on beacons, proximal or distal landmarks, or on using body movement to update position, a process called inertial navigation); in the rodent literature, this contrast is often described as place versus response learning. Note that this usage of 'place learning' is different from the use of the term to refer to tasks in which distances are involved (see the next section for a discussion of these tasks). For example, babies might experience an entertaining event on one side of a room following an alerting noise and quickly begin to turn their heads following the sound in order to see the event (Acredolo, 1978). However, when experimenters rotate the infant by 180 degrees, babies often continue to turn their heads as they had initially, for example to the left, rather than compensating for the turn by reversing the direction of the head turn, and looking to the right (see Fig. 1). In a similar paradigm, babies reaching for one of two locations from one side of a table, and then moved to the other side, make an egocentric or habit-based choice by continuing to reach as before, or an allocentric choice by using landmarks and/or accounting for motion (Bremner and Bryant, 1977). Initial studies show that infants succeed when there are very salient visual landmarks (e.g. flashing stars around a window), or if tested in their own home, where they may be more emotionally secure (see review by Newcombe and Huttenlocher, 2000). Subsequent experiments reinforce and strengthen this conclusion. For example, 6-month-olds can use indirect as well as direct landmarks to locate an interesting event (Lew et al., 2004) and 4-month-olds given experience with passive motion expect that objects will occur at allocentrically defined locations following rotation (Kaufman and Needham, 2011). These paradigms may also index how babies cope with conflict between a striatal habit system and other systems, including hippocampal ones.
Setup for paradigm used by Acredolo (1978) to distinguish motor habits and response learning from allocentric learning. A baby is initially placed at position S1 and a buzzer in the center of the room announces the imminent arrival of an interesting display at window X (or Y). Then, the experimenter moves the baby to position S2 and the direction of the baby's head turn in anticipation of the display is recorded after the buzzer sounds.
Development may continue well past infancy. Children teleported in a virtual environment, and thus not experiencing the sensory input required for inertial navigation, did not use visual cues to localize an object until the age of 4 years (Negen et al., 2018), thus indicating that success at earlier ages may require the presence of inertial cues. However, the task used in this study was more complex than the two-choice tasks used with babies, and children had to code distances from somewhat undistinctive visual landmarks. In sum, there is evidence of allocentric learning beginning in the first year of life, although it is not as consistent or powerful as it will become later, and may depend on very salient cues and/or the availability of redundant cues.
Topological versus metric coding
The egocentric–allocentric studies with infants involve the use of locations defined by wells, containers, cloths or windows, i.e. pre-defined locations. Thus, they do not allow evaluation of whether babies' spatial coding is metric. Evaluating metric coding requires the use of tasks in which there is a continuous surface under which to hide objects or upon which to place objects based on memory. Work with infants in the first year of life has used looking-time techniques in which objects disappear behind stages or are hidden under sand but later reappear in the wrong location. Babies look longer on these surprising trials as compared with trials on which the objects appear where they ought to be, providing evidence of some degree of metric coding (Newcombe et al., 1999).
However, active search is more convincing evidence than changes in looking time, and often success on search tasks appears later than success on looking-time paradigms (Keen, 2003). In search tasks, 12-month-old babies are able to look in the general areas where an object was hidden (Bushnell et al., 1995). Although that study only tested search from a fixed vantage point, by 18 months, even when shifted between observation of hiding and search, toddlers look for objects hidden in a sandbox with impressive accuracy (Huttenlocher et al., 1994).
These findings indicate early beginnings, but do not investigate developmental change. Studies of children 18 months or older show a number of lines of developmental improvement in metric coding. First, 18-month-olds can remember the location of only one object, and only briefly, but there is a marked transition between 18 and 24 months to greater capacity and durability (Sluzenski et al., 2004). Second, metric precision increases with age over the preschool years (Lambert et al., 2015; Simmering et al., 2008). Third, over the preschool years and into elementary school (from approximately 2 to 10 years of age), children become less reliant on the local reference frame of a sandbox or a stage to encode extent, and more able to use other cues or an internal 'ruler' (Huttenlocher et al., 2002; Duffy et al., 2005). Fourth, the use of categorical information to correct uncertain metric information (described by the category adjustment model; Huttenlocher et al., 1991) is apparent in toddlers, but the categories change with age to finer gradations that permit greater accuracy and reduced bias (Huttenlocher et al., 1994). Categories may become richer as well as finer with development, especially in the natural environment, where categories have semantic content (Holden et al., 2010; Hund and Plumert, 2003). Increases in working memory capacity seem to allow for combinatorial processes involving the use of more than one category at a time (Sandberg et al., 1996).
The category adjustment model seems to apply to categories in environmental space as well as in small-scale space (Holden et al., 2013; Uttal et al., 2010) and perhaps also to knowledge of spatial relations at geographic scale (Friedman and Brown, 2000), although the latter claim is controversial (Friedman et al., 2012). However, categories at geographic scale only begin to emerge in late childhood, probably coincident with increasing use of maps and geography lessons at school (Kerkman et al., 2003).
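To make this combination concrete, the sketch below illustrates one common reading of the category adjustment model: an imprecise fine-grained memory is pulled towards a categorical prototype in proportion to their relative precisions. The Gaussian assumptions and all numerical values are illustrative only, not taken from the cited studies.

```python
# Illustrative sketch (not code from the cited studies): the category adjustment
# model read as Bayesian combination of an inexact fine-grained memory with a
# categorical prior. All numbers below are hypothetical.

def category_adjusted_estimate(memory_mean, memory_sd, category_mean, category_sd):
    """Precision-weighted average of fine-grained memory and category prototype."""
    w_memory = (1 / memory_sd**2) / (1 / memory_sd**2 + 1 / category_sd**2)
    return w_memory * memory_mean + (1 - w_memory) * category_mean

# Toy example: an object hidden 30 cm along a 100 cm sandbox, remembered noisily;
# the category prototype is taken to be the centre of the box (50 cm).
print(category_adjusted_estimate(memory_mean=30, memory_sd=8,
                                 category_mean=50, category_sd=20))
# -> roughly 33 cm: biased towards the category centre; the bias shrinks as
#    memory_sd shrinks, mirroring the developmental increase in metric precision
#    and the finer categories described above.
```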
Metric coding with respect to allocentric cues
Mature place learning involves the use of metric coding of distance with respect to allocentric landmarks or boundaries (rather than from the self) to specify location. That is, it relies on both allocentric and metric coding. We might speculate that such place learning is not present in the first year of life for several reasons: allocentric information must be strong and routine, not merely occasional and difficult to evoke; several distances must be noticed at once, not just one; and this set of information must be maintained over time, which only seems to be possible in the second half of the second year (Sluzenski et al., 2004). In line with this speculation, place learning is first evident around 21 months (Balcomb et al., 2011; Newcombe et al., 1998). A simpler version may appear earlier when motivation is high (Clearfield, 2004). However, only older children succeed when there are more demands on the system. We see later development when children need to code several locations at once (Ribordy et al., 2013), to navigate in virtual environments (Laurance et al., 2003), to use substantial distances in large-scale space (Overman et al., 1996) or to use different strategies concurrently (Bullens et al., 2010b). Thus, development extends in a graded form from 1 to 8 years of life, but the basic elements appear in the second year.
Rodent studies also show that place learning takes time to appear. In rats, successful search in the Morris Water Maze appears at 21 days after birth, about the same time as weaning and fully independent exploration (Tan et al., 2017). Success is the culmination of a sequence of events over the first weeks of the baby rat's life, beginning around postnatal day 12 and unfolding over the next 10 days. Elegant and detailed work has delineated this developmental sequence at the cellular level. The earliest cells to mature are the head direction cells, followed by the place cells, the boundary cells and lastly the grid cells. Fig. 2, taken from the review by Tan et al. (2017), summarizes this sequence. Tan et al. also discuss studies that are beginning to show, in fine-grained detail, how innate endowment and environmental input interact during development. For example, head direction cells predate movement and even eye opening in baby rats, but these cells seem to require input from vision and motion to stabilize (Tan et al., 2015). Place cell networks show basic properties as soon as baby rats start to explore (Muessig et al., 2015) but take close to 2 months to exhibit mature stability and accuracy (Wills et al., 2010). Even in adult animals, input is required to maintain basic elements of the system. Grid cells exhibit some basic firing properties in the dark, but require visual input to exhibit hexagonal symmetry (Chen et al., 2016).
Sequence of development of spatially relevant cell systems in the infant rat. Image reproduced from Tan et al. (2017), with permission. Head direction cells develop first, followed by place, boundary and grid cells. Each system matures after its first appearance. P, post-natal day.
Integration across reference frames
The navigation system typically uses both inertial and allocentric information, as instantiated in most current computational and neural models (e.g. Barry and Burgess, 2014). Working within these models, developmentalists need to address how each system develops separately and how mature patterns of interaction between them emerge. Sensory and motor input is likely to influence all three lines of development (inertial coding, allocentric coding and their interaction). For human babies, 3 months marks the onset of vision with good acuity and accommodation, 6 months sees the beginning of stable independent sitting with rotational trunk movement, 8 months is the average age of crawling, and 12 to 14 months is the onset of walking. Fig. 3 shows the average age and range of onset of these motor capacities, although it does not cover visual development. Each of these motor milestones carries with it implications for the spatial information babies receive. For example, before babies can crawl, caregivers typically carry them, and the statistics of optic flow and motion are quite distinctive for the carried infant (Raudies et al., 2012). Thus, for a carried infant, obtaining information from the inertial navigation system is challenging and perhaps impossible. As another example, the onset of crawling (or placement in a walker, i.e. a seat that allows an infant to sit upright and self-propel using the wheels of the walker) gives experience that leads to more focus on distant landmarks (see Fig. 4). Upright walking brings with it the ability to see distal landmarks even better as motion through the world occurs at will; the view of the world obtained by the crawling infant is limited (Adolph and Tamis-LeMonda, 2014). Seeing distal landmarks and realizing that they can aid in navigation is thus likely to blossom as these experiences accumulate.
Motor milestone chart for human infants. Chart shows average age (black bar) and range (gray) for emergence of each motor event. Data from Bayley (1969) and Frankenburg et al. (1992).
Time spent looking towards far objects versus vacant space in human infants aged 8.5 months. Data are mean percentage values for infants aged 8.5 months, split into three locomotor categories: pre-crawlers, crawlers and infants with experience of a baby walker (upright locomotion). From Campos et al. (2000) with permission.
There is empirical evidence for the influence of these motor milestones on spatial behavior (see review by Campos et al., 2000). Unfortunately, most studies have used either the A-not-B task or the 180 degree rotation task. They convincingly indicate an impact of locomotor experience on a shift towards more use of allocentric information, but do not address topological versus metric coding, the use of distances from an allocentric framework, or the combination of reference systems. Only a few studies exist on the relation of motor milestones to these kinds of spatial behavior, notably Clearfield (2004), who linked a simple form of place learning to the onset of both crawling and walking. Perhaps less relevant to navigation, motor development correlates with mental rotation of objects by infants (Frick and Möhring, 2013) with relations still evident for 5- and 6-year-old children (Jansen and Heil, 2010).
Children continue to grow in height, gain in strength, increase speed of locomotion and refine their motor capabilities until late adolescence. In tandem, they refine their ability to infer direction and distance from proprioceptive and visual cues during locomotion, i.e. to employ inertial navigation, from the age of 2 years until at least past the age of 4 years (Rider and Rieser, 1988; Rieser and Rider, 1991) and likely well into middle childhood (Smith et al., 2013). Brain responses to optic flow information may be mature by middle childhood (Gilmore et al., 2016), although physical growth and continued changes in gait and speed may require recalibration until growth ceases. In addition, and surprisingly, the integration of depth cues from binocular disparity and relative motion in visual cortex (V3B) may not occur until age 10 or 11 years (Dekker et al., 2015). Given these ongoing developments, it is perhaps unsurprising that children do not combine inertial navigation and allocentric information until around 8 to 10 years of age (Nardini et al., 2008). Even then, children may behave differently from adults, for example by combining cues that adults regard as competitive and hence choose between (Petrini et al., 2016), or by ignoring irrelevant cues (Petrini et al., 2015).
Stable solutions to weighting inertial navigation and allocentric information may require time to emerge. However, this conclusion does not entail the idea that young children are incapable of combining other kinds of spatial information. As discussed above, 18-month-old toddlers arguably combine categorical and metric information (Huttenlocher et al., 1994); 4-year-olds clearly do so, as well as making sensible choices among conflicting allocentric cue systems (Waismeyer and Jacobs, 2013) and integrating visual and auditory information for spatial localization (Nardini et al., 2016). However, refinement in coordination of categorical and metric information continues over the first decade (Sandberg et al., 1996), along with increases in speed, efficiency and selectivity of auditory–visual integration (Nardini et al., 2016; Petrini et al., 2015). Some of these patterns may reflect variability related to physical growth. A study of the combination of vision and proprioception to perform a simple pointing task showed an intriguing pattern in which combination did not occur in 4- to 6-year-olds, emerged in 7- to 9-year-olds, and disappeared in early adolescence, perhaps due to the adolescent growth spurt (Nardini et al., 2013). When sensory and motor systems change due to maturation, their calibration and integration must adjust.
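For readers unfamiliar with what 'combining' means operationally in these studies, the sketch below shows the reliability-weighted (maximum-likelihood) benchmark against which cue combination is typically tested, here for a self-motion estimate and a landmark-based estimate of a target location. The means and variances are hypothetical, not values from any of the cited studies.

```python
# Illustrative sketch of the reliability-weighted (maximum-likelihood) benchmark
# used in cue-combination studies such as Nardini et al. (2008). All values are
# hypothetical.

def combine_cues(mu_inertial, var_inertial, mu_landmark, var_landmark):
    """Minimum-variance weighted average of two independent cues."""
    w_inertial = var_landmark / (var_inertial + var_landmark)
    mu = w_inertial * mu_inertial + (1 - w_inertial) * mu_landmark
    var = (var_inertial * var_landmark) / (var_inertial + var_landmark)
    return mu, var

# Single-cue estimates of a target's position (in metres) and their variances:
mu, var = combine_cues(mu_inertial=1.2, var_inertial=0.09,
                       mu_landmark=1.0, var_landmark=0.04)
print(mu, var)  # the combined variance (~0.028) is lower than either cue alone:
                # the signature of optimal combination shown by adults but not
                # by children younger than about 8 years
```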
Summary
Research on spatial representations in babies, toddlers and young children has delineated detailed sequences of development from the initial appearance of basic capabilities in simple situations to more sophisticated capabilities towards the end of the first decade of life. Babies are allocentric as well as egocentric, metric as well as topological. Toddlers can use distances from allocentric landmarks in place learning paradigms, and combine categorical and metric information. However, the accuracy and durability of all these abilities continues to improve, along with the adaptive combination of various kinds of input, at least until the age of 10 years or so, and perhaps into adolescence. Each of these lines of development depends at least in part on the maturation of sensory and motor systems, combined with environmental feedback.
An important limitation of research on the development of navigation is that almost all of these studies have used small-scale displays. Many of them are within reaching space, and most others are contained within a room, and hence in 'vista space' (Montello, 1993). We know relatively little about whether these generalizations apply to navigation through environmental space, which offers only partial and sequential views of spatial relations. In the next sections, we begin to turn to that challenge. A task for the future is to implement the kinds of research covered so far in paradigms that use large-scale space. For example, we could examine the combination of categorical and metric information in the real world, as done for adults by Holden et al. (2013).
Challenging and changing Piaget's account of childhood
Studies examining allocentric and metric coding in infancy and childhood have shown a gradual increase in skills relevant to navigation over the first decade. This work undermines Piaget's idea of a stage-like transition from topological to projective and Euclidean systems, while potentially supporting his idea that 10 years of age represents the beginning of mature spatial functioning. Studies of perspective taking and cognitive maps allow us to evaluate this picture further. We also begin to grapple more directly with navigation in spaces at environmental scale.
Spatial perspective-taking
The Three Mountains task is (still) in every textbook on developmental psychology. As shown in Fig. 5, children sit on one side of a display and must select one of four pictures as the view of an observer in a different position. Each picture shows the view from one side of the table. Piaget observed that, until the age of 9 or 10 years, children not only have difficulty selecting the correct view, but also frequently err by selecting what they see – the 'egocentric' error. Three key points have emerged from the research on this task, although many factors affect how difficult the task is, including the nature of the display, the instructions and so forth, as thoroughly researched decades ago (see review by Newcombe, 1989).
The ‘Three Mountains’ test used to assess egocentric perspective-taking. Three mountains display seen from the side (top panel) and from above (bottom panel). Children sit on one side of a display (positions A–D) and must select one of four pictures as the view of an observer in a different position. Each picture shows the view from one side of the table. Until the age of 9 or 10 years, children have difficulty selecting the correct view and frequently make errors by selecting what they see (the ‘egocentric’ error).
First, a very simple form of perspective taking is present by 2 or 3 years (e.g. Masangkay et al., 1974). Young children know that observers see something different from what they themselves see, for example different colors if looking through tinted glasses (Liben, 1978). They can also extrapolate line-of-sight views so that, for example, they show other people pictures by turning the relevant side away from themselves and towards the observer, or they can help someone find an object occluded from the other's vantage point (Moll and Tomasello, 2006). However, in this simple kind of perspective taking, there is no need to work out the spatial relations of various objects seen from different views.
Second, accomplishing perspective-taking tasks that require computation of spatial relations is especially hard with Piaget's picture choice task. When children answer questions about what object would be where, instead of selecting among various pictorial views, there is some success by 4 years of age (Newcombe and Huttenlocher, 1992) and excellent performance by the age of 8 or 9 years, much better than with picture selection (Huttenlocher and Presson, 1973, 1979). The picture choice task is challenging because it requires children to deal with a conflict between their current frame of reference (the room in which they are sitting) and what the framework would look like from the different vantage point of the observer. Children see the correct picture while sitting in a position in which it is misaligned with the allocentric framework. Interestingly, coding with respect to room cues, which causes problems with picture choice, may be especially prevalent at age 3, when perspective taking is virtually non-existent. In a different paradigm from perspective-taking tasks, in which children directly searched for hidden objects, 5-, 6- and 7-year-olds used more body-based representations (Nardini et al., 2006), which might help performance in perspective-taking by leading to less interference. There is a mixture of coding strategies in 4-year-olds (Negen and Nardini, 2015).
Third, Piaget's basic phenomenon is stunningly replicable. Tasks that do use picture choice develop on the timetable Piaget described, and may index a functionally relevant ability to deal with conflicting frames of reference. Thus, such tests may tap a spatial skill important to navigation and prove useful in studies of individual differences (e.g. Frick et al., 2014), even though a small-scale simulation is quite different from the real-world phenomenon. Observers can view a tabletop display at a glance and with simultaneous visual access to the essential elements, whereas in real-world environments, different views unfold over time and the sequential views require integration. Walking benefits such integration compared with passive movement, but the task is nevertheless challenging (Holmes et al., 2018). However, even small-scale perspective-taking tasks engage navigational areas of the brain (Lambrey et al., 2011) and correlate with success on cognitive map tests (Nazareth et al., 2018). Thus, although perspective-taking tasks use small-scale arrays, they may provide clues regarding the development of navigation.
Cognitive maps
Encoding locations using a common coordinate system lies at the heart of the idea of a cognitive map. Thus, Piaget's timing of the acquisition of a Euclidean coordinate system at 10 years would seem to demarcate the beginning of the ability to form such maps. However, Piaget's ways of assessing a Euclidean system did not involve large-scale navigation. One of his primary indicators was children's ability to reproduce invariant horizontal and vertical lines, defined with respect to gravity and instantiated in drawing tasks in which children drew the water level in a half-filled bottle or the way an electric light on a wire would hang in a caravan going uphill. These tasks have generated a great deal of research, largely because they turned out to tap interesting individual differences in adults (see review by Vasta and Liben, 1996). Another indicator devised by Piaget, although less used in subsequent research, was somewhat more relevant to navigation. Children reproduced a small layout of a terrain showing elements such as farms, toy animals and roads. Scoring examined whether they used the metric coordinates of the original to make an accurate model. This task is quite challenging and heavily symbolic. We do not know if it relates to real-world navigation, although there is some resemblance to the use of maps to navigate.
Siegel and White's (1975) proposal of a sequence from landmark learning to route learning to survey maps led to research more relevant to the development of navigation and the formation of cognitive maps. Studies in this tradition typically evaluated children's knowledge of their own environments or their learning of new large-scale environments, either natural or made for standardized research (Anooshian and Young, 1981; Cornell et al., 1994; Cousins et al., 1983; Herman and Siegel, 1978). In general, integrated and accurate representations appear by early elementary school but mature skills build over the subsequent 6 years or so, firming up at the onset of adolescence. These findings make sense in terms of the rest of the research reviewed so far. It seems to take a decade for children to become attuned to combining sensory modalities and frames of reference, to imagine other perspectives in the face of interference and (not yet mentioned) to acquire a mature level of spatial working memory (Gathercole et al., 2004).
Recent research confirms this picture. For example, 4- and 5-year-olds do not show flexible recall of spatial frameworks from multiple vantage points but 6- to 8-year-olds do (Nardini et al., 2009). High levels of correct route reversals after following a route in one direction are uncommon until 10 or 12 years of age (Lingwood et al., 2018). Strikingly, fewer than half of 10-year-olds were able to find shortcuts in a virtual environment, although this was an increase over the third or fewer of children aged 8 years and younger who could do so (Broadbent et al., 2014). There were similar results in an apparently more difficult task, also sampling children between the ages of 5 and 10 years (Bullens et al., 2010a). As late as 11 years of age, children were not using distal landmarks as adults do in a maze task, preferring intramaze landmarks (Buckley et al., 2015), and even up to 13 years of age, intramaze landmarks are prioritized over boundary information (Thurm et al., 2019).
Research on the development of cognitive maps that suggests success by early adolescence sits uneasily in the context of skepticism regarding the existence of such a representation in adults, voiced periodically over the years since Tolman (1948) introduced the term. Investigators have suggested that spatial representations of large-scale environments may be simply associative, route-based or locally metric but with only directional indicators linking the local clusters (e.g. Shettleworth, 2010; Warren et al., 2017; Wang, 2016; Wang and Spelke, 2002). There are several counter-arguments (e.g. see Burgess, 2006, for a critique of the Wang and Spelke proposal). One possibility is that forming cognitive maps is effortful and shows individual differences (Weisberg and Newcombe, 2016). If so, one question is when (and why) such individual differences emerge. In line with the research on normative development, 12 years of age may be the time not only when cognitive maps become widely evident but also when the adult variation pattern stabilizes. By that age, there is a distribution of mappers, route learners and people with imprecise spatial representations similar to that seen in adults (Nazareth et al., 2018).
Summary
Research on the perspective-taking task and on cognitive maps converges on the view that the first decade and beyond is a time of extended and intricate development in which children gradually acquire the abilities to deal with conflicting frames of reference, to integrate cues at both the sensory and cognitive levels, and to use common coordinate systems. Each of these lines of development includes substantial individual variability that persists in adults. An important priority for future research is to devise paradigms that we can use effectively across a wide range of ages to chart this development in detail and to link it to underlying neural changes and to variations in environmental demands. Investigators have not explored these linkages to any substantial extent. In addition, researchers should focus on individual variation and trajectories of change as well as normative development.
The geometric module debate
The theme of this review so far has been that research originally inspired by Piaget's specific claims about spatial development has guided an unfolding research tradition. New research has led researchers to abandon his terminology and his notions of stage-like change, to substantially augment his vision of infant starting points, to vastly enrich his accounts of the mechanisms underlying change and to make better contact with the developing neuroscience literature. Throughout these revisions, there has been a widespread assumption that a prepared learning system retains plasticity as it engages with the world, i.e. constructivism or experience-expectant learning as the solution to the nativist–empiricist debate. Against this backdrop, a very different and widely cited nativist claim emerged, namely that an inborn 'geometric module' uniquely guides spatial reorientation, with a single transition in development created by the acquisition of symbolically coded spatial concepts (Hermer and Spelke, 1996).
Evidence regarding the geometric module claim comes from a very simple paradigm, pioneered by Ken Cheng (1986) with rats (see Fig. 6). Rats learn to search for food in one of four corners of a rectangular arena. Subsequently, spinning disorients them, making it impossible for them to use an inertial navigation system to find the correct corner. After disorientation, they concentrate their searches for food on the two corners that have the same geometric description, one correct and one identical to it geometrically, e.g. the long wall is to the left of the short wall. This discovery has stood the test of time; boundaries (and their relative length) provide powerful spatial cues, for reorientation as well as in other tasks. A second finding is more controversial. The addition of various features to the arena, such as a shaded wall or distinctive odors, did not lead to choosing the correct corner over the geometrically congruent corner, despite the fact that logically, they would have supported a correct choice and thus it would seem adaptive to use them. This second finding led to the claim that geometric information is modular, or informationally encapsulated (Gallistel, 1990). Furthermore, when disoriented children searched for hidden toys in a similar paradigm, those aged 5 years and younger showed the same response as that observed in rats, but by the age of 6 years, children began to use the visual feature to guide correct search (Hermer and Spelke, 1996). This transition correlates with the acquisition of spatial terms, specifically with learning the words 'left' and 'right' (Hermer-Vazquez et al., 2001).
Geometric module test pioneered by Cheng (1986). Rats learn to search for food in one of four corners of a rectangular arena. Subsequently, spinning disorients them, making use of an inertial navigation system impossible. After disorientation, they concentrate their searches for food on the two corners that have the same geometric description, one correct and one identical to it geometrically, e.g. the long wall is to the left of the short wall. (A) Basic paradigm used when features are available. (B) An enclosure without features. Adapted from Twyman et al. (2013a) with permission.
There are several overview articles on the extensive and still growing literature using the reorientation paradigm. The reviews have concentrated on data summaries (Cheng and Newcombe, 2005), on challenging the claim of modularity (Cheng, 2008; Twyman and Newcombe, 2010) and on evaluating the range of extant theories (Cheng et al., 2013). One crucial finding that challenges the modular view is the enclosure size effect: failure to use visual features appears only in small enclosures, such as those used in the original studies, not in larger ones (e.g. Learmonth et al., 2002). Another crucial finding is malleability. Brief exposure affects the choices adults make between geometric and featural information in a conflict situation (Ratliff and Newcombe, 2008) and leads younger children to use features even in small enclosures (Learmonth et al., 2008; Twyman et al., 2007). In studies with mice, rearing babies in circular enclosures reduces their use of geometry and enhances their use of features, indicating substantial plasticity in the mammalian system (Twyman et al., 2013b). The enclosure size effect is especially important in understanding the pattern of data, because it points to the need to understand the factors that determine the salience and usefulness of the available cues.
The neuroscience literature provides evidence suggesting that multiple cues interact in the navigation system, and that processing of smaller enclosures will be different from navigation in larger ones. However, with respect to cue interaction, an initial study using fMRI seemed to suggest independence, not interaction (Doeller et al., 2008). The right dorsal striatum is activated when learning location with respect to small moveable interior landmarks and the right posterior hippocampus is activated when boundary information is relevant. These systems seem to interact only in a ‘top down’ fashion in the ventromedial prefrontal cortex (Doeller et al., 2008). Doeller et al. identified their boundary processing with the geometric module. However, oddly, the boundary in this study was circular, with no geometric information available, and the small interior landmarks were not the sort of feature usually regarded as important in reorientation. In another fMRI study directly implementing the reorientation paradigm in the scanner, different results emerged (Sutton et al., 2010). The bilateral hippocampus and the left parahippocampal cortex were more engaged when a feature wall was present than when there was only geometric information, suggesting processing of the wall color and its conjunction with geometry within the hippocampal system.
A model of data from rodents using cellular recordings of head direction cells, which seem particularly relevant to reorientation, emphasizes weighted cue integration (Knight et al., 2014). On the other hand, Keinath et al. (2017) report that the behavior of rodents in reorientation is well aligned with a hippocampal map dependent only on the geometry of an enclosure. A limitation of that study is its use of the small enclosure that gives priority to geometric information. Boundary and border cells are obviously more active close to walls. A successful model of human location judgments close to walls or in the center of rectangular enclosures of varying sizes used data on the firing of rodent place cells as a function of distances from boundaries (Hartley et al., 2004, 2014). In larger environments, it is likely that grid cells have more input to the firing of place cells while boundary cells have less (Wang et al., 2015). Geometric boundaries drive grid cells but so do salient remote cues (Savelli et al., 2017). It would be intriguing to see work at the cellular level on rodent navigation and reorientation in larger enclosures.
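As a rough intuition for the boundary-distance coding on which such models build, the toy sketch below implements a cell whose firing is tuned to the distance from the nearest wall of a rectangular enclosure. This is an illustrative simplification with hypothetical parameters, not the Hartley et al. model itself.

```python
import numpy as np

# Toy illustration (not the Hartley et al. model): a cell whose firing rate is
# tuned to the distance from the nearest wall of a rectangular enclosure.
# All tuning parameters are hypothetical.

def boundary_tuned_rate(x, y, width, height, preferred_dist=0.3, tuning_sd=0.15):
    """Gaussian tuning of firing rate to distance from the nearest boundary."""
    dist_to_nearest_wall = min(x, width - x, y, height - y)
    return np.exp(-(dist_to_nearest_wall - preferred_dist) ** 2 / (2 * tuning_sd ** 2))

# Firing is high along a band ~0.3 m from the walls and falls off towards the
# centre; in a larger enclosure most locations lie far from any wall, so
# boundary-driven input constrains fewer locations, one intuition for why
# enclosure size matters.
print(boundary_tuned_rate(0.3, 1.0, width=2.0, height=2.0))  # near-maximal
print(boundary_tuned_rate(1.0, 1.0, width=2.0, height=2.0))  # low (centre of box)
```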
What is the alternative to modularity? One possibility is an adaptive combination model (Newcombe and Huttenlocher, 2006). However, one weakness of that proposal has been that it is not computationally specified (Cheng et al., 2013). In response to this problem, Xu et al. (2017) have recently proposed a Bayesian model of spatial reorientation that accounts for the data from the central papers in the literature on human studies. It performs better than a modular account and better than an associative model proposed by Miller and Shettleworth (2007), although a learning model may be necessary to account for the animal literature, in which there are numerous learning trials. The model involves four independent cues: geometry, an associative feature cue, a polarizing effect of features, and linguistic information. It also includes a noise function and an assumption that noise declines with age. In addition, the availability of linguistic information changes with age, abruptly rather than continuously. Language is thus included in this model, but it is not the only element changing with development, and the abrupt change could instead come from some other parameter of the model, such as hippocampal development. Many next steps lie ahead: generating and testing new predictions, and extending the range of results evaluated against the model. Extensions could include application to other lines of human spatial development or to the literature on reorientation in non-human animals, where paradigms require long periods of learning and where linguistic information is unavailable.
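To give a hypothetical flavour of how a weighted, noisy mixture of cues of this kind can reproduce the developmental pattern (geometry-based search in young children, feature use later), the sketch below combines a geometric cue, a gated featural cue and age-dependent noise. It is loosely inspired by the description above, not a reimplementation of the Xu et al. (2017) model, and all parameter values are invented.

```python
import numpy as np

# Hypothetical sketch, loosely inspired by the description above; not a
# reimplementation of the Xu et al. (2017) model. Cue 'votes' over the four
# corners of a rectangular enclosure are combined with noise.

CORNERS = ["correct", "rotational_twin", "near_error", "far_error"]

def choice_probabilities(feature_usable, noise_sd, rng):
    # Geometry cannot distinguish the correct corner from its 180 degree twin.
    geometry = np.array([1.0, 1.0, 0.0, 0.0])
    # A featural cue (e.g. a coloured wall) singles out the correct corner, but
    # only if the child can recruit it (crudely gated here by a single flag).
    feature = np.array([1.0, 0.0, 0.0, 0.0]) if feature_usable else np.zeros(4)
    evidence = geometry + feature + rng.normal(0.0, noise_sd, size=4)
    p = np.exp(evidence) / np.exp(evidence).sum()  # softmax over corners
    return dict(zip(CORNERS, np.round(p, 2)))

rng = np.random.default_rng(0)
# Young child: feature not recruited, high noise; searches split between the
# correct corner and its geometric twin, as in the classic findings.
print(choice_probabilities(feature_usable=False, noise_sd=1.0, rng=rng))
# Older child: feature recruited, lower noise; the correct corner dominates.
print(choice_probabilities(feature_usable=True, noise_sd=0.3, rng=rng))
```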
Conclusions
Human spatial development occurs along the extended developmental timetable that characterizes our species, with its long period of immaturity. This immaturity is widely considered to be adaptive because it allows for considerable learning driven by curiosity and experimentation (the model of child-as-scientist) and by cultural norms (the model of child-as-apprentice). Fig. 7 shows a schematic view of spatial development. Babies can initially move very little, and their visual systems lack acuity and accommodation. Over the first 2 years of human life, babies rapidly acquire sensory and motor skills that give them independent and wide-ranging access to the environment. As each sensory and motor milestone arrives, it supports spatial development. Babies begin to use their inertial navigation system and to rely on sources of allocentric information, although both lines of development will continue for many more years. Initial place learning is evident towards the end of the period. From 3 to 10 years of age, children expand their ability to encode and combine various sources of spatial information (inertial information, geometric cues, beacons, proximal landmarks and distal landmarks). In another line of development between 3 and 10 years of age, children build the capacity to think in terms of frames of reference different from their current one (i.e. to perform perspective taking). By around 12 years of age, we see adult-level performance and adult patterns of individual differences on cognitive mapping tasks requiring the integration of vista views of space into environmental space. Human systems for integrating and manipulating spatial information also benefit from symbolic capacities and technological inventions, although they do not solely depend on them. For a more extended treatment of development in spatial symbolic systems than is possible in this review, see Newcombe et al. (2013). Thus, spatial development builds on important beginnings in the neural systems of newborns, and changes in experience-expectant ways with motor development, action in the world, feedback from successes and failures, and symbolic development.
Footnotes
Funding
Work on this paper was supported by grants from the National Science Foundation SBE1041707 and EHR1660996.
Competing interests
The author declares no competing or financial interests.