Introduction to Part II: The biology of language evolution: anatomy, genetics and neurology
Abstract
This article focuses on the evolution of language along with its anatomy, genetics, and neurology. The concepts of instinct and innateness are actually quite useful for describing behaviors that routinely characterize all members of species or at least all species members of specific sex and age classes. Thus, they tend to be favored by scientists with a primary focus on the distinctive behaviors of individual species. To many developmental biologists and developmental psychologists, however, instinct and innateness are fallacious concepts because all behaviors develop through gene-environment interactions. The solution to this dilemma, in Fitch's view, is to abandon the terms “instinct” and “learning” in favor of other terms that more accurately describe the phenomena in question, such as “species-specific” or “species-typical” to describe behaviors routinely displayed by all members of a species, and “canalization” to explain the species-typical gene-environment interactions that produce behavioral regularities. From this perspective, language is a species-specific human behavior that is developmentally canalized via interactions of genes and predictable environmental impacts such as typical adult-infant interactions. In sum, evidence indicates that language evolution probably demanded changes in multiple interacting genes and involved expansions in multiple parts of the brain, as well as changes in the vocal tract and thoracic spinal cord.
Some of us have long assumed that instinct versus learning controversies met their demise back in the 1960s with publications such as ‘How an instinct is learned’ (Hailman 1969) and the insightful behavioural analyses of Robert Hinde (1966). Not so. In Fitch's view (Chapter 13), these controversies continue, both because they reflect interdisciplinary gaps and because of the tendency of scientists to black‐box issues not of their own immediate concern. Concepts of instinct and innateness are actually quite useful for describing behaviours that routinely characterize all members of species or at least all species members of specific sex and age classes. Thus, they tend to be favoured by scientists with a primary focus on the distinctive behaviours of individual species. To many developmental biologists and developmental psychologists, however, instinct and innateness are fallacious concepts because all behaviours develop through gene–environment interactions. The solution to this dilemma, in Fitch's view, is to abandon the terms ‘instinct’ and ‘learning’ in favour of other terms that more accurately describe the phenomena in question, such as ‘species‐specific’ or ‘species‐typical’ to describe behaviours routinely displayed by all members of a species, and ‘canalization’ (Waddington 1942) to explain the species‐typical gene–environment interactions that produce behavioural regularities. From this perspective, language is a species‐specific human behaviour that is developmentally canalized via interactions of genes and predictable environmental impacts such as typical adult–infant interactions.
Although all animals display species‐specific behaviours, most also exhibit behavioural plasticity in response to learning and/or in response to environmental conditions that may directly impact on brain development or physiological status (West‐Eberhard 2003). Some animals can even, if subject to unusual rearing conditions, develop behaviours not considered typical of their species. Great apes reared in human homes or subject to language‐training experiments, for example, develop a number of behaviours not found in wild apes. In other words, dissimilar phenotypes (i.e. observable behaviours and characteristics) can develop from similar genotypes (i.e. genetic endowment), a phenomenon termed phenotypic plasticity. As Számadó and Szathmáry (Chapter 14) note, phenotypic plasticity plays important evolutionary roles. Specifically, those phenotypes which prove adaptive and the genes that facilitate their development are subject to positive selection, hence, increase in the population (see also West‐Eberhard 2003). Ultimately, these phenotypes may become fixed in the population (Baldwin effect; Baldwin 1902). If the genes producing them also become fixed, genetic assimilation will have occurred (Waddington 1953).
Each species occupies physical environments that can change in response to numerous external events such as climate change, earthquakes, or volcanic eruptions. Species, however, also modify and create their own environments, and hence the selective pressures that impinge upon them, a process termed niche construction (Odling‐Smee et al. 2003). Although external environmental events have undoubtedly influenced human evolution, niche construction has arguably played an even greater role in shaping the selective forces that help mould the modern human mind, and perhaps the human body as well, because our lineage has repeatedly created and adapted to new technological, cultural, and linguistic environments. It is sometimes thought that genetic change is too slow for our genes and brain to have adapted to selective pressures posed by ever‐changing languages and cultures. Számadó and Szathmáry counter this argument by presenting numerous examples of rapid genetic change in humans and other species. They also argue that the pace of language change, like technological change, was probably considerably slower during Pleistocene times than it is today. The result of the combined processes of potentially rapid genetic change and an earlier, somewhat slower, pace of language change is that genes, languages, and the brain have co‐evolved, and to some extent may be continuing to do so. On the one hand, genes and brains enable language; on the other, language change selects for further, linguistically‐conducive, changes in genes and brains.
12.1 Developmental plasticity and genes
Számadó and Szathmáry (Chapter 14) also suggest that some biological systems, such as the immune system, are specifically adapted to enable rapid responses to environmental change. They suggest, for example, that the brain has been specifically shaped by selection to function as a rapid responder to linguistic change (and we would add cultural change as well). This postulate draws clear support, not only from our species' well‐recognized learning and problem‐solving capacities, but also from the plasticity that characterizes all developing and mature mammalian brains. First, during early developmental periods, all mammalian brains routinely overproduce neurons; those neurons that fail to achieve full functionality are subsequently pruned (Edelman 1987). In humans, neuronal production primarily occurs prenatally, as does much neuronal pruning. Similarly, all mammalian brains overproduce synapses during certain periods of development. Again, those that fail to achieve full functionality are later pruned. Our species typically overproduces synapses in the first several postnatal years and again just prior to puberty. One unexpected result is that the typical human adolescent has more synapses than most adults, at least in the frontal lobes (Blakemore and Choudhury 2006). Although the production and pruning of neurons and synapses is primarily a maturational phenomenon, these processes never truly cease. New cortical synapses continue to be produced and pruned throughout life (Zito and Svoboda 2002), and the hippocampus, a region of the brain concerned with declarative and episodic memory, continues to produce new neurons throughout life (Eriksson et al. 1998).
These processes have demonstrable functional effects. For example, in rats, final adult brain size as well as performance on laboratory learning exercises varies depending on experience during the maturational process (Bennett et al. 1964). Similarly, humans who practise particular skills such as piano‐playing or taxi‐driving develop enlarged neural structures pertinent to those tasks (Amunts et al. 1997; Maguire et al. 2000). Language‐related functional reorganizations are also known to occur in humans in relationship to environmental inputs. For example, in congenitally deaf subjects who master sign language at a young age, regions of the temporal lobe that normally mediate auditory functions become more attuned to visual input, including visual gestures (Neville 1991). Similarly, the visual neocortex of congenitally blind subjects assumes tactile functions, if such individuals master Braille at a young age (Sadato et al. 1998). Even literacy changes brain functions, and may, in fact, sharpen the neural perception of phonemes (Dehaene et al. 2010). Recognition of the environmentally‐induced developmental plasticity of mammalian brains helps explain why chimpanzees, bonobos, and other apes, reared from infancy in human homes, can, within limits, develop protolanguage‐like behaviours, whereas wild apes and/or apes captured in adulthood usually cannot.
Brain plasticity, of course, has its limits. All brains of a given species strongly resemble each other in overall structure and function. This must reflect considerable genetic programming. As Számadó and Szathmáry note, numerous genes impact on brain development, and these genes appear to evolve at a rapid pace, thereby potentially contributing to rapid evolutionary changes in behaviour. Diller and Cann (Chapter 15) focus on specific genes thought to influence the evolution of language and the brain. FOXP2, a regulatory gene, helps determine when and where other genes are expressed. In humans, certain FOXP2 mutations produce orofacial dyspraxia (possibly by disrupting motor sequencing behaviours), some language deficits, and mal‐development of several neural structures (Lai et al. 2003). In other animals, depending on the species, FOXP2 may exhibit increased or decreased activity during periods of vocal learning. Hence, although no evidence indicates that FOXP2 directly controls vocal behaviour, the gene does, apparently, impact on the development and functions of neural structures that do. Specific human mutations in the FOXP2 gene were once thought to have occurred in the last 120,000 years. Re‐evaluations of the genetic data now suggest a much earlier date of about 1.8 to 1.9 million years ago (Diller and Cann, Chapter 15).
Diller and Cann also review variants of two additional genes that, when mutated in modern humans, result in microcephaly (microcephalin and ASPM). Dysfunctional mutations in these genes result in abnormally small brains. Hence, it has been suggested that both played key roles in the evolutionary enlargement of the brain. Brain development, however, is a complex process involving hundreds, possibly thousands, of genes. Functional disruptions in any of these can cause developmental neural pathologies. This does not mean that earlier, different mutations in the same genes caused increased brain size, only that normal, fully‐functional genes are needed for brain development. Other evidence cited by Diller and Cann, however, indicates that certain variants of ASPM and microcephalin have increased in frequency in the last 37,000 and 5800 years respectively. Some have interpreted this to mean that these genes are currently experiencing positive selection for their roles in brain function or development, but after reanalysing the data, Diller and Cann conclude that the increased gene frequencies could equally well represent genetic drift. In their view, in‐depth analysis also fails to support reports of correlations between the distribution of these genes and tonal languages. Ultimately, Diller and Cann conclude that language evolution is likely to have resulted from interactions of a multiplicity of genes, rather than from a single mutation in a ‘magic’ language gene. In sum, despite increasing research in this area, our understandings of the genetic basis of language and of human‐specific neural developmental pathways remain vague.
12.2 Adult brains
Even though human children speak in full sentences by the time they are about 2½ years old, most research on the neurological basis of language focuses on the anatomy of the adult brain and, then, mostly on brain size or on the anatomy of neocortical structures, some of which reach full functionality only in adolescence or later. Brain size, both absolute and relative to body size, did increase steadily from about 2,000,000 to 300,000 years ago (Mann, Chapter 26). It is likely that these size increases were functionally adaptive; otherwise, they would have been selected against. Large brains, after all, are metabolically expensive (Aiello and Wheeler 1995). Specific language‐related neural structures have also increased in size in human evolution, as delineated in a number of the chapters in this section. Consequently, increased brain size almost certainly contributed to the evolution of language. However, no one‐to‐one correlation exists between language and overall brain size, and no specific brain size Rubicon separates the linguistically capable from the linguistically inept. Indeed, given that microcephalics do not entirely lack linguistic abilities (Diller and Cann, Chapter 15) it is clear that overall brain size is not the sole determinant of language capacity.
Most investigators have worked on the assumption that language evolution primarily involved the neocortex, either the differential expansion of neocortical areas and connections already present in non‐human primates and/or the addition of new neocortical structures. Gibson (Chapter 16) takes a somewhat different stance. Following on from Gibson and Jessee (1999) and P. Lieberman (1991, 2000, 2002), she reminds us that lesions in structures such as the cerebellum and basal ganglia often produce speech and language deficits. These areas have greatly expanded in human evolution and they mature earlier than many areas of the neocortex (Gibson 1991). In addition, a greater percentage of descending cortical fibres terminate directly on brainstem and spinal cord motor neurons in humans than in monkeys and apes, providing for finer control of lip, tongue, and finger movements (Kuypers 1958). Consequently, neural areas and connections other than those confined to the neocortex deserve far greater scrutiny from the language origins community.
Donald (Chapter 17) also emphasizes the role of neural circuitry involving the basal ganglia, cerebellum, and neocortical areas (especially the premotor and dorsolateral prefrontal cortex). These circuits enable procedural learning, that is, the acquisition of motor skills that require much practice, including those needed for mimesis and tool‐making, both of which, in his view, preceded language evolutionarily (see also Arbib, Chapter 20). Donald further notes that although mimesis is, in large part, a sensorimotor function, the social contexts in which it is used are amodal. Hence, once mimesis became an integral part of human behaviour, through, for example, mime, it would have selected for enhanced amodal cognitive capacities, such as those needed for language and mediated by the inferior parietal and frontal lobes (see also Wilkins, Chapter 19).
Most vertebrate brains exhibit functional lateralization (Rogers and Andrew 2002). In humans, the left hemisphere controls the right arm and hand and, as Hopkins and Vauclair (Chapter 18) note, it is also dominant for language and speech in 96% of right‐handed and 70% of left‐handed individuals. Although it has long been known that individual primates prefer specific hands, until recently it was assumed that population‐wide preferences for the right hand were a uniquely human trait. Indeed, the coincidence of two left hemisphere‐controlled, largely species‐specific human behaviours (right‐handedness and language) has led to hypotheses that cerebral lateralization, language, and right‐handedness evolved together in a causally interconnected manner (Corballis 1993; Crow 2004). Such views long received support from studies indicating that monkeys and apes fail to display population‐level handedness in simple manual reaching tasks. More recently, however, Hopkins' group has found population‐wide right‐handedness in captive chimpanzees, when they were tested on complex manipulative tasks requiring that an object be held in one hand and manipulated in the other (Hopkins 1995).
In Chapter 18, Hopkins and Vauclair also report that captive chimpanzees, bonobos, gorillas, and baboons all exhibit population‐wide biases for the use of the right hand for communicative gestures. In contrast, judging by asymmetrical facial expressions, the majority of vocalizations and facial expressions in non‐human primates are controlled by the right hemisphere. The few exceptions, controlled by the left hemisphere, include marmoset twitters and the novel raspberry sounds and extended food grunts made by some captive chimpanzees. Hence, lateralization in non‐human primates may be greater for communicative gestures than for manipulative behaviours, and for voluntary, as opposed to emotional, vocalizations. These findings suggest that left‐hemisphere dominance for speech and language may have been preceded evolutionarily by left‐hemisphere dominance for voluntary gestures and vocalizations in other primates.
In humans, language and handedness are usually thought to be accompanied by differential expansion of some left‐hemisphere areas, in comparison to similar areas on the right. A literature review by Hopkins and Vauclair finds that Broca's area is somewhat inconsistently expanded in the left hemisphere in both apes and humans. In contrast, the left temporal plane is usually expanded not only in humans, but in apes as well. The left Sylvian fissure is also somewhat longer than the right in both apes and humans. This fissure, which separates the temporal lobe from the parietal and frontal lobes, is surrounded by neocortical areas known to have language functions. In sum, anatomical and behavioural data indicate that neural asymmetry is neither unique to humans nor a specific language specialization. That great apes and monkeys exhibit greater lateralization with respect to gestural usage and voluntary vocalizations may, however, provide clues to possible behavioural precursors to speech and language.
Wilkins (Chapter 19) addresses the anatomy and functions of Broca's area, the POT (parieto‐occipito‐temporal junction), the inferior parietal lobe, and tracts that interconnect these areas. Since one of her aims is the delineation of ape/human neural differences potentially visible in the fossil record, her primary focus is on those species differences that can be seen on the external surface of the brain. This is a critical point, because historically three different parameters have been used to identify neural regions: external anatomy, internal cellular architecture (cytoarchitecture), and function. The three do not always provide identical results. For example, a Broca's area homologue was identified in monkey and ape brains via cytoarchitecture as early as the 1940s (von Bonin and Bailey 1947; Krieg 1954), but most investigators continued to insist, based on external morphology, that Broca's area was unique to humans until the discovery, in the 1990s, of mirror neurons, in what many now accept as the monkey homologue of Broca's area (see Arbib, Chapter 20). Similarly, rhesus monkey and chimpanzee brains contain cytoarchitectonic areas that these earlier neuroanatomists considered homologous to the human POT. Externally, however, the anatomy of the POT region is quite different in apes and humans. In apes, the lunate sulcus separates the occipital lobe from the parietal and temporal lobes, while all three lobes merge in the human brain. Wilkins accepts that much of the parietal cortex and Broca's area have homologues in monkey and ape brains, but she considers the POT to be uniquely human. Nonetheless, she concludes from fossil evidence that the POT evolved early in our lineage, prior to speech. These findings present an evolutionary quandary. The so‐called language areas of the human brain apparently evolved long prior to language. From this, she concludes that language evolution involved exaptation, that is, the re‐appropriation of pre‐existing functions to new uses.
For the most part, Wilkins focuses on the spatial functions of the parietal lobes and their interactions with Broca's and other motor areas in the frontal lobe. Specifically, she notes that the human POT plays an active role in the formation of modality‐free conceptual structures (see also Coolidge and Wynn, Chapter 21; Donald, Chapter 17). In her view, these functions represent a natural expansion of primate posterior parietal lobe functions, which include the construction of modality‐neutral spatial concepts and the spatial orientation of arm and hand movements. In non‐human primates, these posterior parietal functions are coordinated with motor functions of the frontal lobes to produce object‐related actions. She hypothesizes that the expansion (or emergence) of the POT permitted the enhanced spatial analyses required for the coordination of arm, hand, and thumb movements with respect to tool use and throwing (see also Calvin 1985). Since many linguistic structures are spatially and thematically organized, POT expansion also provided the necessary conceptual structure for critical components of the language function; hence, in her view, spatial skills that developed initially in tool‐using situations were later co‐opted for language.
Arbib (Chapter 20) also pursues issues of primate/human neural homologues and neural exaptations. He accepts that the human Broca's area is homologous with similar areas in monkeys and apes, but notes that it has no obvious vocal functions in other species. In contrast, neural areas that do mediate primate vocalizations have no known linguistic role in the human brain. Rather, he posits that mirror neurons found in Broca's area of non‐human primates served as the foundation stones upon which imitation, gesture, and language were built. These neurons fire when a monkey executes a particular manual action and when it observes another individual performing the same action. Mirror neurons thus provide an essential language function—parity; that is, assuring that communicator and recipient have similar perceptions. In Arbib's view, mirror neurons serve as essential components of language and imitation, but are not, by themselves, sufficient to mediate behaviours which require the hierarchical integration of multiple actions and concepts. Since the earliest mirror neurons to be identified were related to manual actions, Arbib adopts a gestural model of language origins and delineates how such a system may have evolved. More recent research indicates that mirror neurons are also found in the inferior parietal lobe; may represent oral movements as well as manual; and may also be of an audiovisual nature. Hence, the mirror neuron story continues to unfold.
Coolidge and Wynn (Chapter 21) focus on the neurological and cognitive correlates of indirect speech, that is, intentionally ambiguous utterances that must be interpreted with regard to social context and that are used primarily in situations that require diplomacy. In their view, indirect speech requires working memory, executive control structures, and theory of mind. Working memory, in turn, is composed of phonological storage capacity, a visual spatial sketchpad and an episodic buffer which allows the contents of phonological storage and the visuospatial sketchpad to be simultaneously held in conscious thought, manipulated, and combined and recombined with respect to each other. Hence, it facilitates the construction of complex plans and mental models. Executive functions monitor these activities via selective inhibition and attention. Since indirect speech is a product of multiple interacting cognitive components, it must also be a product of multiple interacting regions of the brain. In particular, Coolidge and Wynn note the involvement of the inferior parietal and superior temporal lobes and the dorsolateral frontal cortex.
Taken as a whole, the chapters in this section indicate that nearly all higher neural processing centres play some role in the mediation of language or speech. The complexity of the neural interactions required for indirect speech, in particular, suggests that whatever neural changes may have been needed to initiate and sustain protolanguage and/or the language of human infants, fully developed ‘diplomatic’ language capacities reflect the interactions of much of the neocortex. In her paper on ape language (Chapter 3), Gibson notes that great apes fall short of humans in their ability to construct linguistic, technical, and other hierarchies, and it is widely accepted that many aspects of language, most strikingly syntax, are hierarchically structured. Consequently, it would seem of prime interest to determine which neural areas mediate hierarchical abilities. Greenfield (1991) assigned that role to Broca's area. Arbib (Chapter 20) follows her lead. Wilkins (Chapter 19) remarks that the POT is structured to automatically create mental hierarchies. Coolidge and Wynn (Chapter 21) do not use the term ‘hierarchical’, but they do suggest that the ability to hold multiple images in mind in order to combine and recombine them is mediated by the dorsolateral frontal lobe. That the various authors in this section assigned hierarchical processing or components thereof to different parts of the neocortex would appear to validate earlier suggestions by Gibson that the creation of linguistic and other hierarchies, like indirect speech, requires the interactions of multiple cortical processing areas (Gibson 1996a; Gibson and Jessee 1999).
12.3 The vocal tract
Although changes in brain function almost certainly played central roles in language evolution, as MacLarnon notes (Chapter 22), other critical anatomical changes occurred as well. For example, the larynx is lower in humans than in apes and the oral cavity is differently structured. Together, these changes allow humans to produce sounds not readily produced by apes. This vocal tract reorganization may have been facilitated by bipedalism, diet, or a combination of the two. In addition, humans have far greater neural control of their breathing than do apes, and, unlike apes, they have no laryngeal air sacs. MacLarnon suggests that increased control of the respiratory apparatus involved expansion of the numbers of neurons in the thoracic spinal cord. While it is impossible to determine laryngeal position from fossils, the very few hyoid bones that have been found suggest that modern human laryngeal structure, including absent air sacs, may have been present in the common ancestor of Neanderthals and anatomically modern humans, but not in australopithecines. Similarly, fossil vertebrae indicate that both Neanderthals and anatomically modern humans had achieved the modern size of the thoracic spinal cord, but Homo erectus had not (see Wood and Bauernfeind, Chapter 25, for a contrary view on the thoracic cord).
In sum, evidence indicates that language evolution probably demanded changes in multiple interacting genes and involved expansions in multiple parts of the brain, as well as changes in the vocal tract and thoracic spinal cord. Given our current understandings of exaptation, niche construction, the Baldwin effect, and neural plasticity, neural changes probably built upon precursor neurobehavioural functions in non‐human primates and occurred over a lengthy period of time. In some cases both neural and vocal changes may have occurred in response to the selective pressures exerted by language and culture, as opposed to strictly external environmental circumstances; see also Bickerton (2009a). Nothing that we know about genetic or neural functions would suggest that language arose in response to a sudden mutation or the sudden appearance of a new neural module.