Introduction: The evolution of language
Abstract
As the title suggests, this work chronicles the evolution of language with special reference to animals. Most organisms communicate with conspecifics, whether intentionally or not, and such communication encompasses all conceivable mechanisms. Vocal and other sound-based signals, such as clicking wings or legs, are common in animals. Visual signals are also widespread, including those associated with humans and other primates: manual and facial signals, and bodily postures. The volume is divided into five parts. Part I, on insights from comparative animal behavior, examines animal communication systems and cognitive capacities of potential relevance to the evolution of language and speech. Part II, which details the biology of language evolution, including anatomy, genetics, and neurology, offers various views of the physical components of a language faculty. Part III is about the prehistory of language, and in particular asks: When and why did language evolve? It presents current interpretations of the selective events that may have led to the evolution of language. Part IV is on launching language and looks especially at the development of a linguistic species; it presents articles dealing with central properties to be accounted for in language evolution, and issues surrounding the forces that shaped the language faculty. Finally, the articles in Part V look at language change, creation, and transmission in modern humans. This part of the book examines a number of putative “windows” on language evolution; for instance, modern events involving language emergence or change, for which there exists reasonably concrete evidence, might shed light on the evolution of language itself.
1.1 Animal Communication Systems and the Uniqueness of Language
Most organisms communicate with conspecifics, whether intentionally or not, and such communication encompasses all conceivable mechanisms. Vocal and other sound-based signals, such as clicking wings or legs, are common. Visual signals are also widespread, including those most associated with humans and other primates: manual and facial signals, and bodily postures. Visual signalling with feelers or other body parts occurs in many species. Colouration is a common type of visual signalling: for instance, some species have distinct juvenile versus adult colouration, or special breeding plumages. Often, changes in colour occur according to context, for instance in some fish, in octopuses, and in chameleons; social cephalopods, such as squid and cuttlefish, deploy changes in skin colour and pattern to signal messages such as readiness to mate. Tactile signals are widely employed, such as touching with legs, trunks, or feelers. Communication via chemical signals is widespread; for example, moths use pheromones to attract conspecifics in the dark. Electric organ discharge (for instance, in electric fish such as mormyrids) also occurs, but is much rarer. Numerous social insect species have highly sophisticated communication systems, such as the ‘dance’ used by honey bees, where complex movements produced by a returning forager indicate the distance and direction of pollen and nectar resources. Animal communication systems are thus immensely varied in form.
Some animals communicate not only with conspecifics, but also understand at least some of the signals produced by other species; for example, Diana monkeys learn the message conveyed by the alarm calls of neighbouring Campbell's monkeys (Zuberbühler, Chapter 5). Moreover, symbiotic relationships sometimes produce communication across unrelated species. The honey guide bird (Indicator indicator) leads honey badgers to bees' nests by making a sound that attracts the badger, which then breaks into the nest, allowing both animals to reap the rewards.
Mammals employ extensive vocal communication with conspecifics, often in addition to using visual display, chemical messages, and tactile communication. Humans are no exception here, and our ancient primate communication system encompasses many non‐verbal signals, including laughter, crying, smiles, frowns, cries of pain, and postures of aggression and appeasement (Burling, Chapter 44). With a very small amount of cultural diversity, these signals are human universals. But all animal communication systems, including our own non‐verbal signalling, share a (negatively‐defined) property: these systems cannot combine signals to produce new meanings. In fact, they generally do not combine signals at all. For language, though, this property is fundamental. The combinatorial principle, exploited at different levels of organization, is a crucial, distinctive attribute of language. The overarching property, shared by no other animal system, is the open‐ended productivity of language: humans combine signals to produce an infinite set of distinct meanings, and can convey to conspecifics any topic that can be thought of, including absent, hypothetical, and fictitious events and entities.
In every meaningful sense, language is an autapomorphy, i.e. a derived trait found only in our lineage, and not shared with other branches of our monophyletic group (say, the group of primates, or the group of great apes). We also have no definitive evidence that any species other than Homo sapiens ever had language. However, it must be noted straightaway that ‘language’ is not a monolithic entity, but rather a complex bundle of traits that must have evolved over a significant time frame, some features doubtless appearing in species that preceded our own. Moreover, language crucially draws on aspects of cognition that are long established in the primate lineage, such as memory: the language faculty as a whole comprises more than just the uniquely linguistic features. We do not know, though, if and how language has itself shaped properties such as memory (or vice versa), so the role of extra‐linguistic factors is hard to evaluate.
We anticipate that both animal communication and animal cognition will shed light on the evolution of language, but in exactly what ways is hotly debated. Biologists do not expect evolution to throw up a radically new complex system with no evolutionary precursors, so it is good scientific practice to look for relevant primitive features in closely-related species, i.e. phylogenetically ancient features that may have preceded language or been prerequisites for language; see Hopkins and Vauclair, Chapter 18; Wilkins, Chapter 19. Another established methodology is to search for examples of convergent evolution: functionally or formally similar features appearing in species that do not share a recent common ancestry. Unfortunately, for language such features are not easily detected. For specific traits, there are indeed both analogues (unrelated but superficially similar features) and homologues (features with a shared common ancestry) in other animal systems. These include such common features as vocalization and cultural transmission. But it seems clear that language as a system has no ‘simpler’ analogues or homologues in other animals. In fact, language is exceptional in almost all aspects.
Language obtains its unique expressive power by exploiting a few distinct formal principles that operate over numerous subsystems and at different levels of organization. These tools have little or no parallel in the animal kingdom. First, and perhaps most critically, language combines elements at all levels. Starting with sound systems (MacNeilage, Chapter 46), each language combines elements from an individual set of digitized sounds known as phonemes: discrete, contrastive sound segments. In turn, phonemes are often considered to be combinations of a discrete set of phonological features, such as [+/− voice] and [+/− sonorant] (though see MacNeilage, Chapter 46, for discussion). Phonemes combine in language‐specific permutations to form syllables, and syllables combine to form morphemes and words; sign languages have equivalent manual systems (Goldin‐Meadow, Chapter 57). Words are combined, both in morphology, to form compounds (greenhouse, dog‐house, icehouse, outhouse), and in syntax, to form headed phrases.
Second, elements are ordered in predictable ways. Each language has its own set of phonotactic constraints, governing which sound sequences are permissible and which are not; English, for instance, allows word‐initial sequences of /pl/ (play) and /kl/ (clay) but not */tl/. Morphemes are ordered in fixed ways (un+afraid, but fear+less; hand+ful+s, not *handsful); see Carstairs‐McCarthy, Chapter 47, on morphology. Words are also sequenced predictably, for instance by having a usual order of heads and complements across phrasal categories, though languages are not always dogmatic about this.
Third, language exploits hierarchical structure at several levels. Syllables are hierarchically structured, with a nucleus and coda combining to form a rhyme (say, vocalic nucleus /i/ plus coda /ŋk/ forming ink), and then an onset, say dr or sl, combining optionally with the rhyme, forming drink or slink. Hierarchical structure also operates at the level of morpheme combinations: given the morphemes un+friend+ly we get the structure [un[friendly]] rather than *[[unfriend]ly], whereas with untidily (un+tidy+ly) the structure is [[untidy]ly]. Semantics exploits hierarchical structure, for instance by partitioning meanings into different levels of specificity: spaniel, setter, retriever are types of dog, dogs are a type of mammal, mammals a type of vertebrate and so on. Syntax exploits hierarchical structure by combining words into phrases, and phrases into larger phrases and clauses (Tallerman, Chapter 48). It is also widely argued that texts in discourse (e.g. conversations or stories) are hierarchically structured. In addition to these organizational principles, mappings occur between all linguistic levels, including most broadly between sound and meaning. This, then, is the formal basis of language.
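The contrast between the two morphological bracketings above can be made concrete with a small sketch in Python. It rests on an illustrative assumption that is not made in the text itself: a morphological structure is modelled as a binary tree whose leaves are morphemes, written here as nested pairs. The function names bracket and morphemes are hypothetical and chosen only for this example.

```python
# Assumption for illustration: a morphological tree is either a string
# (a single morpheme) or a pair (left, right) of subtrees.

def bracket(tree):
    """Return the bracket notation for a tree, e.g. [un[friendly]]."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    return "[" + bracket(left) + bracket(right) + "]"

def morphemes(tree):
    """Return the flat left-to-right sequence of morphemes in a tree."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return morphemes(left) + morphemes(right)

# un + (friend + ly): the suffix attaches to the root first.
unfriendly = ("un", ("friend", "ly"))
# (un + tidy) + ly: the prefix attaches to the root first.
untidily = (("un", "tidy"), "ly")

print(bracket(unfriendly), morphemes(unfriendly))
print(bracket(untidily), morphemes(untidily))
```

Both words have the same flat shape (a prefix, a root, and a suffix), yet the grouping differs. The linear order of morphemes alone therefore does not determine the structure, which is the sense in which the organization is hierarchical rather than merely sequential.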
Even at the lowest levels of organization, there are strikingly few parallels in animal communication systems. In human phonological systems, a relatively small, closed set of meaningless elements (sound segments and their visual equivalents in sign languages) combine to produce meaningful elements. Though bird song is sometimes described as having ‘phonological syntax’ (Marler 1977), the term is rather misleading. In bird song, discrete acoustic units also exist, and are combined and sequenced in rule‐governed ways, but there is no compositionality; whatever the sequences of notes or motifs, the message never changes and no new information is produced (Hilliard and White 2009; Slater, Chapter 8). There is no productivity in the combinations. Bee dances display a limited compositionality (Kirby, Chapter 61) but again, no productivity. As we move up the levels of linguistic organization, we find fewer parallels still in animal systems. Hierarchical structure exists in some bird song and some whale song (Janik, Chapter 9), but it is always limited (Hurford 2011). There is no recursion (self‐embedding) and no semantic compositionality.
A priori, we might expect that the natural communication systems of our closest living relatives, the great apes, would be nearest to language—perhaps rather like language, but with a smaller vocabulary and a simpler grammar. But this is absolutely not the case. Even at the most fundamental level, that of sound production, we have a different morphology of the supralaryngeal vocal tract from that of chimpanzees, with humans showing clear specializations for speech production (MacLarnon, Chapter 22). Moreover, humans have evolved far greater neurological control over their vocalizations than other primates. Although much is still unknown about the subtleties of communication systems in other primates, it is clear that there is really nothing analogous to human sound systems, lexicon, semantics, or grammar. Some call combinations do seem to occur in wild chimpanzees (Slocombe, Chapter 7) but as yet there is no evidence that these acquire a compositional meaning; see Tallerman, Chapters 48 and 51.
Certain animal systems have something that at first glance seems to resemble a primitive vocabulary. For instance, among other calls, some monkey species have a small set of distinct alarm calls, each produced in response to a different predator (such as the well-known vervet monkey calls, ‘eagle’, ‘leopard’, and ‘snake’; Seyfarth and Cheney, Chapter 4). These have attracted much interest in the language evolution literature, doubtless because of the relatively close relationship between humans and monkeys. It should be noted, though, that domestic chickens also have distinct alarm calls for different predators, a system as sophisticated as those of monkeys, and, additionally, have referential food calls; moreover, prairie dogs (which are rodents) employ perhaps the most highly sophisticated systems of animal alarm calls (Gibson, Chapter 11). Are alarm calls or food calls parallel to words? They share one property, arbitrariness, with human vocabulary: they are non-iconic, meaning that they do not sound like the entities they represent; see Deacon, Chapter 43. Alarm calls are often described as having functional reference; that is, they are prompted by external events (such as the appearance of a leopard) rather than merely conveying an animal's internal state, such as fear or aggression. We might then assume that the leopard alarm in some way ‘means’ leopard. But this is not necessarily so: the leopard call may alternatively be associated in the hearer's mind with the leopard-specific escape route, climbing a tree; see Hurford, Chapter 40; Tallerman, Chapter 51, for discussion.
Alarm calls differ from words in all other respects; see Tallerman, Chapter 48. They are not formed from different permutations of a discrete set of sounds, but rather, are holistic. Both the calls themselves and the broad contexts that provoke them are innate. Conversely, words, both forms and meanings, are learned by human infants, and crucially, new words are learned by each speaker throughout life. Monkey alarm calls are primarily used when the particular predator is present, and sometimes to deceive conspecifics into thinking a predator is around. Even in the latter case, calls are wholly situation‐specific; calls cannot be used predicatively to discuss a predator, past, present, or hypothetical. Alarm calls are indexical, meaning that they have a causal link to what they represent—normally, the presence of the predator induces the appropriate alarm call. Alarm calls thus also lack the property of displacement that is crucial in language: words are not tied to or produced in a precise context, but can be used whenever the concept they represent floats into our minds.
Words are thus true symbols, whereas animal calls, even if functionally referential, are not; symbolic reference, which must be acquired by learning, is explored in detail by Deacon, Chapter 43, and by Harnad, Chapter 42. Critically, word meanings are established between a community of speakers and agreed by convention. Part of what this entails is that the meaning of a word can change very quickly, providing other members of the language community adopt the new meaning (think of net, web, or drive). Alarm calls, in contrast, have a fixed meaning and essentially form a closed set. The total repertoire of calls in any animal species is tiny, numbering no more than a few dozen distinct calls, whereas the vocabulary of all human languages numbers tens of thousands of items (Tallerman, Chapter 48). The gulf between the small set of essentially innate and relatively inflexible calls found in all other primate species and the massive, open-ended, and learned vocabularies of human beings reveals one of the major innovations in language evolution that must be accounted for. This is no mere matter of degree: in vocabulary, something altogether different has evolved from anything seen in the natural communication systems of other animals (Burling, Chapter 44).
The evolution of a massive, learned vocabulary store (Tallerman 2009) is just one of the unique aspects of language. Only in language do we find the extensive categorization that, for instance, divides the lexicon into discrete categories such as noun, verb, adjective, each category with its own distinctive behaviour. The categories themselves are unlikely to be innate, since they differ from language to language, but the ability to categorize in this way, and on the basis of little data, appears to be uniquely human. Evidence of children's abilities in generalizing over categories has been well known at least since Berko (1958). In the wug test, children are told that a mythical creature in a picture is a wug; when asked what two such creatures are called, they have no difficulty in replying /wʌgz/, thus using the correct plural form of the noun.
Only language conveys propositional meanings (‘idea units’; loosely, what we call sentences, such as The bird flew past the window), and of course these are unlimited in scope. Only language exhibits all the paraphernalia of syntax (Tallerman, Chapter 48), including headed phrases, recursion, long‐distance dependencies, and expressions such as each other or themselves that can only be interpreted using other expressions (Kim and Mel hurt each other). Only language displays the property of duality of patterning (see Tallerman, Chapter 51), with combinations on two levels of organization: meaningless units (phonemes) are combined into meaningful morphemes/words, and words are combined into phrases. Other highly distinctive properties arise on each level of linguistic organization; even the speech signal itself displays significant adaptations both in production and perception (see Pinker and Jackendoff 2005 for an overview).
Given the limited nature of the evidence obtainable from studying animal communication systems, how do researchers hope to break into the evolutionary puzzle that is language?
1.2 What Counts as Evidence in Language Evolution?
The previous section introduced the major novelties of the language faculty, which includes notable discontinuities with animal communication systems. This gives rise to a fundamental dilemma in the field of language evolution. Language seems to display many features with no precursors, yet general evolutionary principles suggest that a complex trait like language, which is not under the control of any single gene or related group of genes, must have evolved in large part from simpler precursors. Frustratingly, we have no direct evidence for any aspect of language evolution, and no uncontroversial indirect evidence. Moreover, what is considered possible evidence differs from discipline to discipline, as we now discuss.
The comparative method is an obvious place to start; see chapters in Part I, also Fitch (2010a). There are two ways in which this method can be employed. The first involves comparing similar traits within a clade. For humans, the set of primates as a whole or the smaller set of great apes would be most relevant; for instance, tool use by other great apes is an established trait, so it seems likely that the last common ancestor of all great apes, including humans, was able to use simple tools. This trait is thus a homologue, involving a shared common ancestry. (See Wood and Bauernfeind, Chapter 25, for discussion of likely features of the last common ancestor between panins, i.e. chimpanzees/bonobos, and hominins, i.e. creatures that are on the human line of descent, though not necessarily direct ancestors to Homo sapiens.) Relevant here too are the natural communication systems of closely‐related species, and also their latent language‐related abilities, such as the ability to learn arbitrary symbols under human instruction (Gibson, Chapter 3). The alternative way of employing the comparative method involves comparing the convergent evolution of similar traits across a number of unrelated lineages. For instance, bipedal locomotion in humans, kangaroos, and birds is not due to common ancestry, so is an analogous trait across the three lineages. Analogues are useful because they may have evolved in different lineages under comparable selection pressures, such as a similar habitat, diet, or predation pattern. The problem, as noted above, is that homologues and analogues to essential properties of language are not easily established in animal systems, and no other species has a language faculty, so the comparative method is difficult to apply straightforwardly. For discussion in this volume, see especially Arbib (20); de Waal and Pollick (6); Gibson (3 and 11); Hopkins and Vauclair (18); Hurford (40); Pepperberg (10); Slocombe (7); Tallerman (48); Zuberbühler (5).
The discipline of palaeoanthropology examines the fossil record, and from skull endocasts may uncover anatomical evidence of brain structure of potential relevance to language, including brain size, external cortical reorganization, and hemispheric asymmetries (Wilkins, Chapter 19). Unfortunately, we cannot study past stages of brain evolution in any depth, since endocasts provide no evidence of internal brain structure. Similarly, with the exception of an occasional hyoid bone, we have no fossilized remains of the vocal tract (see MacLarnon, Chapter 22; Wood and Bauernfeind, Chapter 25). However, even if we had clear evidence of the emergence of modern vocal tract structure, we would not necessarily know how to interpret it. A broadly modern structure, for instance, could initially have evolved as a spandrel, that is, as a by-product, perhaps of bipedalism or changes in dental function (MacLarnon, Chapter 22). In that case, speech capabilities could still have been lacking if neural adaptations had not yet occurred. Even if we were certain that a modern vocal tract provided full speech capabilities, this would not necessarily imply the presence of a full language faculty, since this encompasses far more than speech.
Recently, molecular biology has provided another possible source of physical evidence: the use of genetic material from hominin fossils, and the tracking of changes in DNA in hominin populations (Cann, Chapter 24; Pakendorf, Chapter 59). Again, however, these methods are fraught with difficulties and controversies. For instance, which class of DNA (mitochondrial, Y‐chromosome, or autosomal) should be considered the most reliable? This is as yet a young field, and new discoveries and constant developments in technology should provide more answers in future decades.
Another line of enquiry looks not at hominin fossil remains themselves, but at the artefacts left by our ancestors. Archaeologists have argued that inferences can be made about the development of symbolic communication and linguistic complexity by looking at tools and other implements, or personal ornaments such as beads, thus assuming some link between linguistic skills and cognitive sophistication, as evidenced in the material record. Moreover, if a certain level of cultural complexity is attested both in known societies and in prehistoric societies, it seems reasonable to assume that a similar level of complexity occurs in cognition too. In this volume, the chapters by Boeckx (52); Botha (30); d'Errico and Vanhaeren (29); Donald (17); Mann (26); Mithen (28); and Wynn (27) discuss the relevance of the archaeological record and the difficulties inherent in interpreting it; see also Cann, Chapter 24. There are many possible drawbacks to using technological advances to infer the presence of the language faculty, not least because crucial artefacts made of degradable materials may be absent from the record: for instance, early hominins may have utilized plant materials, including bark, leaves, wood, grass, and reeds, and animal soft parts, including hides, furs, and feathers. We only have to think of the exponential increase in the complexity of our own artefacts between 1850 and 2010 to realize that there is no simple chain of inference between sophistication in the archaeological record and the presence of language (see also Botha, Chapter 30). Moreover, new archaeological findings readily overturn previous conclusions. For example, it was once thought that art, beads, and some forms of stone-flaking appeared only in Upper Palaeolithic times, i.e. after about 35,000 BP. We now know that beads, other putative forms of symbolism, and advanced flaking techniques long predate the Upper Palaeolithic (Brown et al. 2009; d'Errico and Vanhaeren, Chapter 29; McBrearty and Brooks 2000). And until quite recently, tools comprising more than one component, such as harpoons or bows and arrows, were thought to have originated only within the last 20,000 years (Coolidge and Wynn 2009b; Wynn, Chapter 27), but a recent find suggests that arrows were being produced as much as 64 kya (thousand years ago) (Lombard and Phillipson 2010). Thus, the archaeological record requires us to frequently re-evaluate evidence and arguments.
Given the problems with the physical record, both of hominin fossils and associated artefacts, many have turned to other potential sources of evidence for language evolution. Jackendoff (2010: 65) suggests that reverse engineering provides the most productive methodology: ‘We attempt to infer the nature of universal grammar and the language acquisition device from the structure of the modern language capacity’. Thus, one form of evidence comes from the study of normal language and its acquisition; in this volume, see in particular the contributions by Bickerton (49); Burling (44); de Boer (33); Falk (32); Goldin-Meadow (57); Graf Estes (64); Locke (34); MacNeilage (46); Studdert-Kennedy (45); Tallerman (48 and 51). However, the danger of confusing ontogenetic and phylogenetic processes must always be guarded against here. There is no reason to think that any specific evidence concerning the origins of language can be gained from studying the acquisition of modern languages. Moreover, infants learning language today have a full language faculty, which clearly is not the case for the earliest hominins.
Linguists often suggest that evidence can be obtained by extrapolating from other modern contexts; for instance, from observable ‘language genesis’ in adults and children, including the formation of pidgins and creoles (in this volume, see Bickerton, Chapter 49; Carstairs‐McCarthy, Chapter 47; Roberge, Chapter 56), homesign, and emergent sign languages (Goldin‐Meadow, Chapter 57). A very strong line of linguistic research (and one of the few areas widely considered to provide good evidence by practitioners of disparate linguistic theories) involves the study of grammaticalization. It is widely argued that putative prehistoric stages of language can be reconstructed by studying known linguistic trajectories of change—specifically, the ways in which grammatical elements are formed from lexical elements. In this volume, see especially the contributions of Bybee (55); Carstairs‐McCarthy (47); and Heine and Kuteva (54). The importance of grammaticalization is also emphasized by Bickerton, Chapter 49; Corballis, Chapter 41; and Chater and Christiansen, Chapter 65.
There are also, however, applications of reverse engineering in spheres other than the narrowly linguistic. In the field of cognition, Coolidge and Wynn (Chapter 21) investigate the evolution of modern thinking, specifically the emergence of indirect speech acts. There are also applications of reverse engineering in evolutionary biology; in this volume, see in particular Számadó and Szathmáry, Chapter 14, who discuss the co‐evolution of language and the brain. Since there must be some relationship between genes and language, it has sometimes been assumed that there are readily identifiable genes ‘for’ language, or aspects of language (for example, FOXP2 was often carelessly reported in the popular press as the ‘language gene’). Diller and Cann, Chapter 15, evaluate the evidence concerning genetic correlates for language, and conclude that language is highly unlikely to be associated with any single genetic mutation; see also Cann, Chapter 24.
Two related, relatively new applications of reverse engineering have become major fields of investigation in language evolution: computational and mathematical modelling, and robotics, including embodied agent models. Formal models allow the predictions of theories of language evolution to be tested empirically, by building in the assumptions to be tested and seeing whether the predicted outcomes indeed result from the model: if the model produces these results, the assumptions are borne out. In this volume, see especially Cangelosi (62); Kirby (61); and Smith (60); also de Boer (63); MacNeilage (46); Studdert-Kennedy (45). As both Kirby and Smith discuss, results and predictions obtained from the formal models can further be tested on human subjects in the laboratory.
In sum, though there will inevitably be much speculation in a field of this nature, we believe that serious advances have been made in the past few decades in terms of building an evidence‐based discipline. In the next section we consider in more detail the properties of language as a biological system.
1.3 Language Evolution and Biology
We start by examining the uniqueness of language in biological terms, in comparison with other animal communication systems. Language is a complex amalgam of lifelong learning (nonetheless including a critical period) and innateness; see Fitch, Chapter 13. Most researchers agree that both aspects are crucial to language, but many controversies arise over where the line should be drawn (see the following section). The aspects uncontroversially considered to be learned are, of course, vocabularies and idiosyncratic lexical properties of distinct languages, transmitted from generation to generation (a trait known as traditional transmission). Vocabulary is added beyond the critical period for language acquisition, a feature with few clear analogues in other animal communication systems.
Simple communication systems which combine vocal learning and innateness are found in some animals (notably, songbirds), but the contributions made by each aspect are easier to tease apart, since experiments can be performed which would be impossible with human subjects. Here we see a marked contrast with the communication systems of non-human primates, in which learning plays a minimal part: it is more a case of fine-tuning the acoustic properties of calls, and of learning the specific contexts in which it is appropriate to use each call. Vocal learning does play a vital role in the communication systems of some non-primates, however, especially in bird species (Slater, Chapter 8), in a number of marine mammals (whales, dolphins, sea lions etc.; Janik, Chapter 9), and in some bats. Among vocal learning birds, there are certain parallels to language learning: song learning can involve a sensitive period, outside of which the learned system will either be incomplete or abnormal; song is traditionally transmitted (i.e. learned from adult models); and a stage analogous to babbling (known as sub-song) occurs. However, Gibson (Chapter 11) points out that language learning is actually very different from the learning of bird song, which often has a sensitive period followed by a long ‘quiet’ period, with song emerging only in adulthood; see Hurford (2011). Moreover, young birds raised without an appropriate adult model (e.g. solely female instead of male relatives) will nonetheless sing, even though the song is not adult-like. This shows that there is an innate stratum, some basis for the song which is not entirely learned.
In the case of language, the child undoubtedly brings crucial cognitive contributions to the learning process, yet without linguistic input, full language does not develop. Despite the necessity for learning, the language faculty (whatever it contains) provides such a powerful drive that in the absence of normal linguistic input, something language‐like can emerge. A clear example comes from the deaf children of hearing parents: from the non‐linguistic gestures that the parents make to communicate with the child, a structured system—known as homesign—develops spontaneously in the child's communication (see Goldin‐Meadow, Chapter 57). This is not language, but has indisputable linguistic properties. And tellingly, when speakers of different homesign systems get together in a naturalistic setting, a shared system with more linguistic properties soon emerges, as in the well‐known case of Nicaraguan Sign Language, also discussed by Goldin‐Meadow. Over the course of a couple of ‘generations’ of schoolchildren, this sign system developed into full language. It is also well documented throughout the world that when contact occurs between groups with no shared language, restricted linguistic systems develop, known as pidgins (Roberge, Chapter 56), and these may in due course become full languages, learned natively by children. Given such evidence, it is difficult to conclude that language has no genetic component. We will assume, then, that there have been significant adaptations in our species with respect to a language faculty.
Strikingly, though, and unlike animal communication systems, language differs radically in its superficial form in distinct geographical locations—we have mutually unintelligible ‘languages’ in different regions, rather than distinct ‘dialects’, as is the case in some bird song and whale song systems. The superficial diversity of language systems has no discernible consequences for language learning; infants seem equally capable of learning any ambient language (or indeed, learning half a dozen or more languages in their environment), and take around the same amount of time to get to the same stages, whatever language they are learning. This fact alone suggests the presence of an innate predisposition for language learning.
Another biologically distinctive property of language concerns its function (see also below). If indeed language has biological ‘function’ at all, it is difficult to discern what the primary function might be, or might have been while language was evolving. The function of animal communication systems, on the other hand, typically revolves around reproduction, including mate attraction, pair bonding, and defence of territory. Even learned animal systems thus have a very limited message. There is a biological imperative for songbirds to learn their songs: birds that don't learn the species‐appropriate song are less likely to be chosen as mates, and song quality is perceived by potential mates as a fitness indicator (Slater, Chapter 8). Song is, thus, an honest signal (Zahavi and Zahavi 1997), in a way that language is not. Song is ‘costly’, as it takes time to produce, and so reveals the quality of the singer: only a bird that is already in good condition can afford to spend time singing rather than feeding; song may also draw the attention of predators. Conversely, producing language requires virtually no calorific expenditure above and beyond that needed for overall brain growth and maintenance; it doesn't take up valuable time that could be used to forage and it doesn't require that the speaker be in good condition. One of the questions surrounding the origins of language is therefore how a ‘cheap’ communication system of this nature might have arisen; for discussion in this volume, see especially Donald, Chapter 17; Dunbar, Chapter 36; Falk, Chapter 32; Knight and Power, Chapter 37.
It is also notable that language involves developments on three distinct but interacting timescales (Kirby, Chapter 61; Carstairs‐McCarthy, Chapter 47; Számadó and Szathmáry, Chapter 14). The first of these is biological evolution: whatever is in the language faculty (and its precursors in early hominins) must be genetically transmitted under selective pressures—the transition in hominins from no language to language is a biological fact, so must conform to known biological processes. Language in its earlier forms can therefore be assumed to have been adaptive, i.e. to have conferred fitness benefits on its users. At the very least, whatever neurological, physical, or other changes accompanied an evolving language faculty had to have no negative impact on selection.
Biological change thus encompasses all the phylogenetic changes in hominins which are prerequisites for language, including the evolution of crucial abilities not shared with other primates, or at best, only minimally developed in non‐human primates. For the speech modality, these prerequisites include full vocal control (the ability both to vocalize and to suppress vocalization at will), vocal imitation, and vocal learning; see MacLarnon, Chapter 22; MacNeilage, Chapter 46. A certain amount of vocal flexibility is in fact attested in modern primates (see Slocombe, Chapter 7; Zuberbühler, Chapter 5), taking the form of acoustic modification of calls. It is now also known that some primates can both vocalize volitionally and suppress vocalization under certain circumstances. Slocombe also reports on some evidence showing that both vocal imitation and learning occur, for instance in different ‘dialects’ of vocalizations in chimpanzees; if there are novel vocalizations, they must be learned and must spread via imitation. But even before those traits emerged, our ancestors must have developed the ability to understand that conspecifics are communicating deliberately; to infer the mental states of other individuals (see Gibson, Chapter 3; Hurford, Chapter 40; Knight and Power, Chapter 37); and to engage cooperatively in all sorts of tasks, which eventually included discourse (Tomasello 2008). These traits are all expressed to some extent in modern apes, so probably existed in the last common ancestor of apes and humans. Language placed a premium on these abilities. For a protolanguage to emerge, hominins needed to develop expanded abilities in such domains, leading ultimately to the ability to learn, store, and retrieve a vast intersecting network of arbitrary symbols (words; Deacon, Chapter 43), and the crucial property of displacement (the ability to refer to entities remote in time or space; Hurford, Chapter 40); see also Tallerman (2009).
Using these conventional symbols relies in turn on the capacity to imitate, rehearse, and refine the practical skills required (Burling, Chapter 44; Corballis, Chapter 41; Donald, Chapter 17). The use of vocabulary also relies on a shared conceptual system, which probably developed directly from primate cognition (Hurford, Chapter 40), and requires ‘the ability to associate gestures or vocalizations with concepts’ (Burling, Chapter 44; see also Corballis, Chapter 41): this must be one of the critical first steps in language evolution. For full language, more is needed—the major development being the compositional syntactic abilities which are the main impetus in generative grammar for assuming an innate language capacity; see Bickerton, Chapter 49; Tallerman, Chapter 48, for an outline of syntactic processes.
The second timescale involves cultural transmission: individual languages are transmitted across generations, and within populations of speakers. This has led to proposals that languages themselves adapt to become more learnable (Christiansen and Chater 2008; Chater and Christiansen, Chapter 65). Many developments on this timescale are known from attested language change, in particular the processes known as grammaticalization, whereby lexical items evolve into functional items (auxiliaries, complementizers, demonstratives, determiners, and so on). Most linguists assume that similar processes were operative in the evolution of the full language faculty, so that the earliest protolanguages—simpler precursors to language—may well have distinguished no categories other than protonouns and protoverbs (Hurford 2003a; Heine and Kuteva 2007, Chapter 54; Tallerman, Chapter 51).
Cultural transmission involves not only vertical transmission, between parents and children, but also horizontal transmission of various kinds, both within and across communities. This includes transmission between speakers of different languages, in cases of language contact: see Pakendorf, Chapter 59. Such contact can lead to interesting mismatches between the genetic and linguistic heritage in a population, as Pakendorf outlines. Since population contact is likely to have been extensive throughout our evolution, language contact between linguistic groups has very likely contributed much to language evolution itself (Nichols, Chapter 58). Horizontal transmission between groups of deaf children also occurred in the development of Nicaraguan Sign Language (Goldin‐Meadow, Chapter 57).
The third timescale is that of individual learning—the growth of language in children. MacNeilage, Chapter 46, argues that there is one respect in which ontogeny recapitulates phylogeny, and that is in speech production; infants and early hominins must share the same biomechanical constraints on mouth movements, and these lead in both cases to initially simple syllable patterns (as seen in babbling; see Studdert‐Kennedy, Chapter 45, for discussion). Even prelinguistic infants possess impressive statistical learning abilities which provide cues for segmentation, enabling the internal structure of words and phrases to be detected in the continuous stream of speech (Graf Estes, Chapter 64). Of course, these learning mechanisms also evolved (Számadó and Szathmáry, Chapter 14), thus relating back to the biological timescale.
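One way to picture how statistical cues could support segmentation is the following minimal Python sketch. It rests on assumptions that are not taken from the chapters cited: it uses syllable-to-syllable transitional probabilities as the cue, a toy stream built from three invented words, and an arbitrary threshold. The function names `transitional_probabilities` and `segment` are hypothetical, and the sketch is not a model of what infants actually do.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) for each adjacent pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor in the stream.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold):
    """Posit a word boundary wherever the transitional probability dips
    below `threshold` (an illustrative, hand-picked value).
    Assumes a non-empty list of syllables."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Toy input: three invented words repeated in varying order, then flattened
# into an unbroken syllable stream (two letters per syllable).
toy_words = "bidaku padoti golabu bidaku golabu padoti bidaku".split()
stream = [w[i:i + 2] for w in toy_words for i in range(0, 6, 2)]
print(segment(stream, threshold=0.75))
```

In this toy stream, syllable pairs inside a word always co-occur, while pairs that straddle a word edge are less predictable, so the dips in probability line up with the word edges. Real speech is far noisier, and learners presumably combine several cues.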
As mentioned above, the complexity of the language faculty precludes any simple account of language evolution relying on a few, recent genetic mutations. The sequencing of the human genome (International Human Genome Sequencing Consortium 2004) revealed, rather surprisingly, that humans only have around 20,000–25,000 genes—far fewer than was anticipated. (A microscopic roundworm, Caenorhabditis elegans, has over 19,000 genes.) Since humans share approximately 99% of their genome with chimpanzees and bonobos, this suggests that relatively few genes determine all of the biological differences between humans and panins (Gibson 2002). Two factors may account for this. First, most genes are pleiotropic, which means that they have control over more than one trait. Second, many genes (such as FOXP2) are also regulatory in nature; that is, they serve as switches that turn multiple downstream genes on or off. Regulatory genes that are active early in development can have profound effects on developing phenotypes. Given the small number of genetic differences between panins and humans, it is likely that many of the phenotypic (i.e. observable) differences reflect functional differences in regulatory genes. The small number of total genes in the human genome—coupled with the small number of probable genetic differences between other apes and humans—also argues against views that each aspect of distinctively human neurology, behaviour, or language is controlled by a distinct gene (Gibson 2002; Diller and Cann, Chapter 15). Chater and Christiansen, Chapter 65, even doubt that the language faculty has genetic underpinnings at all, arguing that aspects of language fluctuate far more quickly than genetic changes could accommodate; but see Számadó and Szathmáry, Chapter 14, for an alternative view. 
We do not doubt that cultural transmission has shaped the language faculty to some extent: since the earliest forms of protolanguage must have been culturally transmitted (Nichols, Chapter 58), just as languages themselves are, learnability seems likely to have played an important role in evolution. But we also see an evolving language faculty itself as clearly adaptive.
It is also important to acknowledge epigenetic factors in human evolution. Genes interact with their environment, so that the same genotype (i.e. the same DNA) occurring in two individuals does not necessarily produce the same phenotype (observable properties or behaviour) in both, a phenomenon termed phenotypic plasticity (Számadó and Szathmáry, Chapter 14). A simple example is height. The ultimate height reached by any individual is a product of interacting genetic and environmental effects, including intrauterine environment, postnatal diet, and overall health. As noted by Számadó and Szathmáry, and outlined in the introduction to Part II, phenotypic plasticity is built into mammalian, including human, brain developmental processes. It is possible, via the Baldwin Effect and genetic assimilation, for initially plastic phenotypes, including learned behaviours, to ultimately become genetically fixed (Fitch, Chapter 13; Gibson and Tallerman, Chapter 12; Számadó and Szathmáry, Chapter 14). Although an ‘instinct’ to learn languages has clearly been incorporated into the human genome, no specific lexical items or specific syntactic constructions are universal. This suggests that language, as opposed to most animal calls, is a specific adaptation for communicating about highly variable events.
1.4 Language: what are we trying to account for and (how) did it evolve?
Do we even agree what language is? Although we have written so far as if it is clear what the term ‘language’ refers to, the likelihood is that readers have quite disparate ideas on this topic. We started this introduction by comparing language as a communication system with animal communication systems, something that is controversial for those who do not regard language as primarily ‘for’ communication at all. If ‘language’ suggests different things to researchers from distinct fields, then we won't necessarily agree about what there is to account for in ‘language evolution’. We therefore need to consider how various terms have been defined and used in the field.
A primary distinction often made in linguistics is that between E‐language and I‐language (Chomsky 1986). E‐language (‘external’ language) refers to linguistic behaviours, also indicating an observable set of languages (living or extinct); E‐language is in some sense ‘out there’ in a language‐speaking community. I‐language is a cognitive entity; it refers to the speaker's (‘internal’ and ‘individual’) knowledge of language, and is regarded by many linguists as the proper object of biological study. Broadly, I‐language can be equated with the better‐known term, competence; I‐language is a property of an individual's brain, while E‐language is the output of a set of I‐languages. For evolutionary linguistics, the relevance lies in the distinction between the evolution of language as a human faculty, and the subsequent development of various languages (linguistic systems) over historical time, which is generally not thought to involve evolution in a biological sense.
There is, of course, no way to see I‐language directly, and few methods are available for investigating the nature of the language faculty apart from studying the outputs of I‐language, including interrogating native speaker grammaticality judgements: indeed, linguistics journals are full of papers investigating various aspects of (E‐)‘languages’. Whether or not these analyses shed light on the language faculty itself is often a matter of interpretation and of theoretical assumptions. Moreover, I‐language itself is not transferred from parent to child, but must crucially pass through what Hurford (1990b) terms the ‘Arena of Use’. Children, like linguists, have no direct access to their parents' language faculty, and their own I‐language is in part the product of learning from incomplete linguistic data produced in a social setting. There seems little doubt that processes of socio/cultural transmission play a role in shaping languages (Kirby, Chapter 61). An objection here might be that such transmission has shaped E‐languages but does not affect I‐languages. For instance, it is likely that traditional transmission is involved in forming vowel systems that keep segments as far apart as possible within the acoustic space available (de Boer, Chapter 63), and also involved in linearizing words and phrases in ways that aid processing (Hawkins 1994, 2004). But cultural transmission operating while the language faculty was still evolving may also have shaped I‐languages themselves. The very fact of having to be learnable by human brains may determine structural properties of language (Anderson, Chapter 39; Chater and Christiansen, Chapter 65).
A second distinction (Hauser et al. 2002) is between FLN and FLB—the faculty of language in the narrow and in the broad sense. FLN is a component of the wider FLB, but refers to whatever is uniquely linguistic—‘the abstract linguistic computational system alone’ (Hauser et al. 2002: 1571)—and thus, uniquely human. FLB contains many additional capacities, including memory, respiration, and the auditory system; traits that are used in language but are not necessarily uniquely human. Jackendoff (2010) usefully refines this distinction. Some aspects of the broader language faculty are uniquely human, but have a wider function than the purely linguistic, such as a full theory of mind. Other aspects of the language faculty are both uniquely human and uniquely linguistic, yet have evolved directly from existing primate features; a clear instance is the specialized human vocal tract (MacLarnon, Chapter 22). In FLN remains whatever is radically new in the primate lineage—aspects of the language faculty that are so specialized or distinctive that they appear to have no primate precursors. From a biological perspective, as little as possible should be ascribed to this last category.
To step back a little, these distinctions also raise questions. What do we mean by a language faculty? Does it even exist? Most linguists, psychologists, and biologists assume, at a minimum, that humans have some genetic endowment for language acquisition; a biological trait which, in the presence of socially‐presented linguistic data, ensures that language will develop in all children with normal biological endowments and normal socio/cultural experience. Briscoe (2009: 369) defines such an innate language acquisition device as ‘nothing more or less than a learning mechanism which incorporates some language‐specific inductive learning bias in favour of some proper subset of the space of possible grammars’. Linguists often refer to this biological endowment as universal grammar (UG). UG in this sense addresses the issue of the species‐specificity of the language faculty; even when raised by humans, other great ape infants do not acquire language (though some may acquire rudimentary, protolanguage‐like communication systems). Even if ‘an innate, species‐specific and domain‐specific faculty for language’ (Kirby, Chapter 61) is rejected, it is presumably impossible to deny some crucial involvement of our genetic code in language acquisition. As well as referring to the ‘initial state’ of the language faculty, for many authors UG also constrains the design of languages, providing ‘restrictions on search space’ (Chomsky 2010: 61). For instance, the infant language‐learner can assume that syntactic processes are structure‐dependent, so that, say, a fronting construction operates on a whole constituent rather than part of a constituent (Whose book did you buy? vs. *Whose did you buy __ book?).
The concept of UG itself is frequently misunderstood (see Jackendoff 2002: ch. 4 for useful discussion and Goldberg 2008 for commentary on various interpretations). Although not everyone who uses the term UG has exactly the same conception of it, various aspects should be clear. UG is not ‘what all languages have in common’, nor a set of language universals (contra Tomasello 2009a). Nor is it an abstract semantic structure common to all languages. UG is often now seen as ‘the “toolkit” that a human child brings to learning languages’ (Jackendoff 2002: 75). Under this conception, UG provides a set of tools, or basic principles, for building languages, which each language customizes in specific ways; see also Culicover and Jackendoff (2005) for more details on the Toolkit Hypothesis. There is no expectation here that everything the toolkit can build is found in all languages (see Carstairs‐McCarthy, Chapter 50), but it does constrain what can be built.
Many questions arise. A major issue concerns whether there is a specialized (domain‐specific) language acquisition device at all. If a UG of this nature does exist, what aspects of the language faculty does it contribute to? Does it contain linguistically specific principles? As an alternative, can we do away with UG, so that every aspect of language learning is subsumed under more general learning mechanisms? There are probably two polarized extremes in this area. At one end of the spectrum lies the view that there is a completely specialized, innate language faculty, centring around—or even consisting solely of—a narrow syntactic core (see Piattelli‐Palmarini 1989, 1994, 2010; Anderson and Lightfoot 2002). Much recent debate focuses on the content of ‘narrow syntax’ (Hauser et al. 2002). Historically within generative linguistics, a great deal was attributed to an innate UG, including very specific autonomous syntactic principles such as subjacency and other island constraints, and numerous other filters such as the empty category principle or the specified subject constraint. Crucially, such properties were seen as highly abstract and arbitrary, rather than functionally motivated, and this arbitrariness was a central plank in the argument; if these domain‐specific principles don't make language more ‘useful’ or usable—and may even be dysfunctional (Chomsky 1995)—then they can't be adaptive and thus can't have evolved by natural selection (see Lightfoot 1991, Chapter 31; Anderson, Chapter 39). Both Anderson and Lightfoot stress that it is extremely unlikely that natural selection accounts for every aspect of the language faculty.
Although UG is still a central concept in more recent Minimalist theorizing in linguistics, its role and hypothesized content are much reduced. The bulk of the machinery associated with the heyday of the Principles and Parameters framework is no longer considered part of UG (Hauser et al. 2002; Chomsky 2005, 2010; see Boeckx, Chapter 52). Chomsky (2005, 2010) proposes that the central syntactic operation is Merge, which takes items X and Y and combines them to form Z, thus building ‘recursive’ hierarchical structure; this, in Chomsky's view, is the main component of genetically‐determined UG (see Bickerton, Chapter 49).
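The way a single combinatory operation can yield hierarchical structure may be easier to see in a small sketch. The Python below is an illustration only and not Chomsky's formalism: it assumes that Z can be represented as an ordered pair (Merge is often characterized as forming an unordered set), and the names `Node`, `merge`, `bracket`, and `depth` are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Node:
    """The object Z formed by combining two items X and Y."""
    left: "Tree"
    right: "Tree"


# A syntactic object is either a lexical item (a string) or a merged Node.
Tree = Union[str, Node]


def merge(x: Tree, y: Tree) -> Node:
    """Combine X and Y into a new object Z. Because Z is itself a Tree,
    it can be fed back into merge, which is where the recursion lies."""
    return Node(x, y)


def bracket(t: Tree) -> str:
    """Render a structure as a labelled-free bracketing."""
    if isinstance(t, str):
        return t
    return f"[{bracket(t.left)} {bracket(t.right)}]"


def depth(t: Tree) -> int:
    """Number of levels of embedding below the root."""
    if isinstance(t, str):
        return 0
    return 1 + max(depth(t.left), depth(t.right))


noun_phrase = merge("the", "book")       # X = the, Y = book
verb_phrase = merge("read", noun_phrase)  # the output of merge is reused as input
print(bracket(verb_phrase))  # [read [the book]]
print(depth(verb_phrase))    # 2
```

The point of the sketch is simply that nothing beyond repeated application of one binary operation is needed to produce nested constituents of arbitrary depth.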
At the other end of the spectrum of views on UG lies a disparate body of work from various disciplines, including various ‘usage‐based’ and ‘emergentist’ approaches to language (MacWhinney 1999; see Bybee, Chapter 55). Proponents of such approaches may deny that there is any UG: thus, no domain‐specific properties pertaining to language, and no genetic endowment that is language‐specific. In this vein, Tomasello (2009a: 471) comments that ‘the idea that there is a biological adaptation with specific linguistic content…is dead’; see also Tomasello (2003a, 2005, 2008) and Christiansen and Chater (2008), who ‘conclude that a biologically determined UG is not evolutionarily viable’ (2008: 489); also Arbib, Chapter 20. This is not necessarily to deny that the child brings species‐specific abilities to the language‐learning task, but crucially, these are seen as general cognitive and pragmatic analytical capacities (Tomasello 2003a, 2005). Such factors constrain languages to conform to the patterns we find, but are not domain‐specific to language. Bybee and McClelland (2005: 396) also ‘view language structure as emerging from forces that operate during language use’; such forces include frequency effects, whereby common sequences of words come to be treated as a single unit and then undergo phonological reduction (e.g. dunno from I don't know). From a somewhat different perspective, Christiansen and Chater (2008) also argue that rather than ‘language evolution’ involving phylogenetic change in humans, language itself has adapted (literally) to fit the learner's brain: there are thus biological constraints, involving human learning biases, ‘but these constraints emerge from cognitive machinery that is not language‐specific’ (2008: 507); see also Evans and Levinson (2009). Bickerton, Chapter 49, evaluates such ‘cultural invention’ accounts.
Common to these ‘usage‐based’ approaches (also Croft 2001) is the idea that grammatical constructions themselves shape language and contribute to the appearance of design. Chater and Christiansen, Chapter 65, argue that human learning and processing capacities can account for many aspects of language structure. This view does not rule out the existence of linguistic universals (i.e. properties found in all languages), though of course these would have nothing to do with UG, but would be due to common ‘aspects of human cognition, social interaction and information processing’ (Tomasello 2009a: 471). Learners are sensitive to the communicative functions of language, which are the same across all human cultures, and stem from the fact that humans basically conceptualize the world in the same way (Tomasello 2008). Similarly, Christiansen and Chater ‘adopt a non‐formal conception of universals in which they emerge from processes of repeated language acquisition and use’ (2008: 500).
Amongst those who support the UG hypothesis, a central tenet has been ‘poverty of the stimulus’ arguments, which claim that the language data which children receive as input are too limited, haphazard, and imperfect to allow them to infer the grammar of the ambient language without innate, language‐specific learning mechanisms (Chomsky 1965, 1986; Piattelli‐Palmarini 1989; Anderson, Chapter 39; Boeckx, Chapter 52). Much disagreement with the entire concept of ‘poverty of the stimulus’ has arisen. MacNeilage, Chapter 46, contends that arguments of this kind are inapplicable to phonology, since children ‘hear all the sound patterns’ they need to learn, though other phonologists may well disagree with this view. Extensive work has been undertaken on learning and other areas of cognition since UG was initially proposed, and it undoubtedly indicates that there is less work for the child to do than was once supposed. Various kinds of ‘head start’ that do not involve language‐specific principles are widely proposed. Goldberg (2008: 523) cautions against overlooking ‘the power of statistics, implicit memory, the nature of categorization, emergent behaviour, and the impressively repetitive nature of certain aspects of the input’ (see Graf Estes, Chapter 64, on statistical learning). Moreover, the input is likely not as unhelpfully degenerate as once assumed, and interactions between child and caregiver are probably highly significant in language acquisition (see de Boer, Chapter 33, on infant‐directed speech; also Falk, Chapter 32; Locke, Chapter 34). To an extent, discoveries of this nature undermine arguments from the ‘poverty of the stimulus’, but they do not negate them: we are far from knowing all the methods employed by infants in learning any one of the world's 6000 or so languages, including the child's abilities to abstract beyond a limited set of data and to extrapolate the correct generalizations.
Tomasello's work (e.g. 2008, 2009b) also discusses the relevance of collaboration, cooperation, and shared intentionality in the evolution of human cognition, including language. He suggests that shared goals and collaborative actions arose in the service of new hunting and/or foraging techniques, under conditions not shared by other great apes (in this volume see Gibson, Chapter 35, on foraging; Knight and Power, Chapter 37; also see Bickerton 2009a). Specific behaviours which are undeveloped in other great apes, such as extensive pointing, establishing common ground, and joint attention are all highly relevant to language acquisition, especially acquisition of vocabulary (Burling, Chapter 44; Hurford, Chapter 40).
A large body of broadly ‘functionalist’ work within linguistics also suggests that grammars themselves are shaped by the functions they perform, lessening the need to posit arbitrary constraints in UG; see Newmeyer (1998, 2005) for extensive discussion. The most concrete of these proposals provide evidence from language processing. For instance, Hawkins (1994, 2004) demonstrates that the requirements of language processing link directly to the form of grammars themselves. In head‐initial languages, such as English, short constituents generally precede long ones: normal English word order has Adj‐Noun, as in a yellow book, but in a book yellow with age, the adjective phrase is ‘heavy’, so must follow the noun, hence the ungrammaticality in English of *a yellow with age book. Head‐final languages, such as Japanese, reverse the preferred order: long constituents precede short ones. Sometimes these principles are merely strong preferences, but they are also widely grammaticalized, meaning that a language disallows dispreferred orders. In a similar vein, Christiansen and Devlin (1997) show that the universally strong tendency for a word order that is fixed across all phrases within a language (head‐initial vs. head‐final) need not be due to an innate, language‐specific principle, but instead may be accounted for by human sequential learning mechanisms. It seems likely, then, that processing requirements have been responsible for much language structure, thus obviating the need for many arbitrary constraints in UG.
The concept of UG is therefore subject to various lines of attack, including arguments that languages themselves adapt to learners' brains, that usage shapes grammar, that language processing shapes grammar, and that learning of grammar is aided by many domain‐general cognitive processes. Does this mean that what we might broadly call performance factors, i.e. factors involving the use of language, can explain and predict all aspects of language structure, so that domain‐specific principles and/or UG can be dispensed with entirely? For some, the answer is yes; Tomasello (2005) argues that poverty of the stimulus arguments are void, and that no language‐specific principles need be postulated. But as Jackendoff (2002: 79) points out, ‘if language is indeed a specialized system, one should expect some of its functional principles to be sui generis’. This debate essentially boils down to the question of whether general cognitive principles and domain‐general learning mechanisms can ‘buy’ the kinds of language structure that recur cross‐linguistically (Bybee, Chapter 55), as well as the child's ability to acquire these structures so readily, or whether domain‐specific linguistic principles are required (see Anderson, Chapter 39; Boeckx, Chapter 52, for discussion).
In fact though, much recent work in linguistics draws on both traditions—on mainstream generative grammar and on cognitive/functionalist work. For example, the Construction Grammar approach (Goldberg 1995, 2006) influences not only the work of Joan Bybee and Michael Tomasello, working firmly within the usage‐based tradition, but also the ‘Simpler Syntax’ of generative linguists Peter Culicover and Ray Jackendoff (2005). Chomsky (2010: 9) also notes that ‘we need no longer assume that the means of generating structured expressions are highly articulated and specific to language’. Moreover, Chomsky's recent work explicitly proposes that language acquisition depends not only on genetic endowment (UG) and linguistic data, but also on ‘third factor’ principles not specific to the language faculty, such as the capacity for data analysis, and more general biological principles, including developmental constraints.
In conclusion, it is fairly clear that UG should no longer be conceived of as a large set of highly specific (and purely syntactic) principles, in the sense of the Principles and Parameters approach. But this does not entail that there is no specific biological endowment relating to the language faculty.
We turn next to the role of natural selection in the evolution of language, and the question of how, why (and if) a language faculty evolved at all. Clearly, if any domain‐specific principles are required to account for the language faculty, then the appearance of such principles must be accounted for. There are, broadly, two views here, adaptationist and non‐adaptationist (see Bickerton, Chapter 49; Chater and Christiansen, Chapter 65; Gibson, Chapter 35 for more discussion). Under the former view, the language faculty evolves gradually via natural selection; all stages are adaptive, so confer increased fitness on their possessors. The seminal paper here is Pinker and Bloom (1990); the target article and the following commentaries set out many of the important issues. Pinker and Bloom aim to counter the position that ‘the evolution of the human language faculty cannot be explained by Darwinian natural selection’ (1990: 707) by providing arguments from design, and by showing that language displays all the signs of adaptive complexity.
The non‐adaptationist or exaptationist alternative (e.g. Piattelli‐Palmarini 1989, 1990, 2010; Lightfoot 1999, 2000, Chapter 31) argues that natural selection may have played a minor role, or even no role at all, in the formation of the language faculty. Critical aspects of language may have arisen as a spandrel (an evolutionary by‐product), for instance of increased brain size (Chomsky 2010), or via general physical or biochemical or developmental constraints (‘laws of form’) still dimly understood (see Boeckx, Chapter 52), or as a result of ‘macroevolutionary changes that are caused by single point mutations in regulatory genes’ (Piattelli‐Palmarini 2010: 156). This view is essentially saltationist too, suggesting that the language faculty appeared very suddenly and without primate precursors, and denying that there has been gradual evolution of a language faculty under selective pressure. In part, this goes hand‐in‐hand with the view that the language faculty has not (p. 22) evolved ‘for’ (and hence was not shaped by) communication, but rather, involved a new kind of recursive thought process; see the discussion below.
The very existence of a language faculty is called into question by some. For instance, Tomasello (1990, 1999, 2008) argues that language is purely a human cultural invention, and is also acquired culturally, without help from an innate UG. Evans and Levinson (2009: 446) claim that ‘The fact that language is a bio‐cultural hybrid is its most important property’; this leads to what they see as extreme language diversity, rather than a homogeneous, but abstract, universal linguistic template. Supporters of this view also argue against a Pinker‐and‐Bloom‐type of gradually evolving language faculty, but on the grounds that no domain‐specific abilities have evolved: ‘children learn language using general‐purpose cognitive mechanisms, rather than language‐specific mechanisms’ (Christiansen and Chater 2008: 507). This is not necessarily to deny that there were adaptations, including biological adaptations, but these are for general human cognition (which includes language).
In large part, the answer to the question of ‘what, if anything, evolved?’ depends on what one thinks is uniquely linguistic; what the language faculty contains. As just noted, some recent work claims that general cognition handles everything linguistic, while other work (e.g. Culicover and Jackendoff 2005) regards the linguistic ‘toolkit’ as quite extensive, and certainly as containing more than just syntactic principles. For Hauser et al. (2002) and Chomsky (2010), FLN contains very little, perhaps only the operation Merge, which creates syntactic structure by combining lexical items (Bickerton, Chapter 49; Boeckx, Chapter 52). Thus, Chomsky can suggest that the ‘simplest speculation about the evolution of language’ (i.e. FLN) is that ‘rewiring of the brain took place in some individual’ to yield Merge (2010: 59). We should note, though, that to date there is no neuroanatomical evidence that any such rewiring occurred. In any event, this putative development is not intended to account for the entire language faculty—it doesn't need to: if FLN contains only one critical syntactic operation, then everything left over is regarded as part of FLB, which as noted above includes properties that are not even species‐specific. If what is unique in language can be minimized in this way, then not much needs to be explained. Thus, Chomsky suggests that there may even have been a single ‘genetic event’ causing language (Chomsky 2010: 58; but see Diller and Cann, Chapter 15). Under this view, the role of natural selection in shaping the language faculty is also minimized (as noted by Pinker and Jackendoff 2005: 219). Chomsky (2010: 61) further suggests that ‘solving the externalization problem’ (i.e. getting I‐language out of the brain into E‐language, via phonology and morphology) ‘may not have involved an evolutionary change—that is, a genomic change’. 
One problem with this approach is that numerous additional aspects of the language faculty apart from Merge appear to be not only uniquely human, but also domain‐specific, and hence must be part of UG; crucial examples (two among many) are duality of patterning and the ability to learn and store a vast (p. 23) lexicon with highly specific linguistic properties; see Pinker and Jackendoff (2005), Jackendoff and Pinker (2005), and Jackendoff (2010) for extensive discussion and illustration; also Tallerman, Chapters 48 and 51.
In sum, we agree that it seems highly likely that many factors apart from ‘mere’ natural selection have shaped the language faculty, including some organizational principles that have nothing to do with genetic changes—but has anyone ever doubted this? Darwin certainly did not deny the existence of alternative evolutionary mechanisms, though of course he could not have known of such biological factors as random genetic drift or gene flow. In any case, natural selection crucially interacts with such factors, since selection pressures maintain or remove random mutations. Since the language faculty in its entirety is a complex collection of traits, not a single trait, the likelihood is that different factors shaped distinct parts. It is biologically—if not linguistically—feasible to argue that some single aspect is critical, and that this is not due to natural selection. But it is surely indefensible to propose that the whole language faculty is attributable to non‐selective factors.
We turn next to the question of why language evolved, in other words, what selection pressures were involved. This of course also brings up the question of function, as mentioned above. Language is, as noted from the start, a collection of interrelated features, some of which are uniquely linguistic (either with or without obvious primate precursors), and others which have a broader application than the linguistic (see Jackendoff 2010). Discussion of function, selection, and adaptation may seem to suggest that we can disentangle all these factors, but this is almost certainly impossible.
One question regarding function that may be amenable to scientific investigation is why only one primate lineage developed a language faculty. To many, the answer seems clear: to enhance communication (in this volume, see especially Dunbar, Chapter 36; Falk, Chapter 32). But the issue is not whether communication has played some role in shaping the language faculty; there is probably near unanimous agreement that it has. What is at issue is what language evolved ‘for’—communication or thought. The two main positions can be characterized thus: 1) language evolved as—and is uniquely adapted for—communication, therefore traits enhancing communication were selected for from the start; or alternatively, 2) language evolved in the service of internal thought and was only later ‘externalized’, therefore selective pressures for more efficient communication came later. Under this view, ‘the earliest stage of language would have been […] a language of thought, available for use internally’ (Chomsky 2010: 55; see Boeckx, Chapter 52). Purely internal language would provide ‘capacities for complex thought, planning, interpretation’, etc. (Chomsky 2010: 59). Chomsky has consistently taken the view that communicative needs did not provide a major selection pressure (see Piattelli‐Palmarini 1989, 2010; Chomsky 2005, 2010; Fitch et al. 2005), and language is not seen primarily as a communication system (Chomsky 2000a). Moreover, current (p. 24) utility (e.g. the use of language for communication) does not explain the functional origins of a trait (Fitch et al. 2005).
Although we do not share the view that the ‘language of thought’ came first, it is instructive to consider how ‘inner speech’ might nonetheless be adaptive (see also Coolidge and Wynn, Chapter 21). For example, Gary Lupyan's work (e.g. 2006) suggests how vocabulary might arise without communicative pressure. In experiments with adults, having mental labels for new concepts is shown to aid categorical learning; see also Harnad, Chapter 42. This is not inherently a communicative function, but it is adaptive; for instance, it would help a hominin to distinguish between two similar‐looking mushrooms, one of which is nutritious and the other poisonous. Of course, labels (words) are now the prima facie instance of what is learned from the environment, but this need not (in fact, cannot) have been the case at the very start of the evolutionary process: the earliest arbitrary symbols did not exist in a community until someone put them out there. (See Harnad, Chapter 42, on the ‘symbol grounding problem’; also Cangelosi, Chapter 62; Deacon, Chapter 43.) Others have proposed that critical selection pressures arise from the kind of thought which language makes possible; Penn et al. (2008) suggest that an ability to reason about higher‐order relations would be adaptive, more so even than being able to communicate with conspecifics. On this view, some kind of simple pre‐language, used for communication, might in fact have evolved first; subsequently, more complex linguistic constructions might arise in a ‘language of thought’, where they would be highly adaptive in terms of problem‐solving.
The alternative view, that the primary adaptation was for communication, is held by many, including Pinker and Bloom (1990), Hurford (2002), Jackendoff and Pinker (2005), Pinker and Jackendoff (2005), Christiansen and Chater (2008), Levinson's contributions to Enfield and Levinson (2006), and Tomasello (2008). Pinker and Jackendoff (2005: 224) state that ‘the key question in characterizing a biological function is not what a trait is typically used for but what it is designed for, in the biologist's sense—namely, which putative function can predict the features that the trait possesses’. The works cited argue that the design features of language are specialized to handle the mapping between meaning and sound, which is exactly what is involved in communication (i.e. rather than inner speech). As Pinker and Jackendoff note (2005: 231), ‘the argument that language is designed for interior monologues rather than communication fails to explain why languages map meaning onto sounds and why they must be learned from a social context’.
In fact, languages are replete with features that only service communication, and play no role in mental representation. These features include (but are not limited to) speech production and perception; the entirety of the systems of phonology and morphology, including duality of patterning; devices that regulate the linearization of phrases and propositions, including unmarked word orders and ‘movement’ constructions such as focalization (e.g. Beans, I like, but spinach, I hate) and question formation; principles of interpretation of expressions with no (p. 25) independent meaning, including pronouns and anaphors; the formation of grammatical functions such as ‘subject’ and ‘object’ and the ability to change these functions by means of such processes as passivization, anti‐passivization, dative shift, and so on. More or less the whole point of syntactic operations is to express different pragmatic effects; we wouldn't need constituent questions or a passive construction unless we were trying to put a message across. Along with these properties, all known languages have some set of purely grammatical or ‘functional’ elements (such as complementizers, auxiliaries, determiners, and pronouns; see Carstairs‐McCarthy on complexity, Chapter 50) and morphosyntactic features, such as number and grammatical gender, case, and agreement. Some of these elements have a clear role in terms of communication, such as complementizers marking clause boundaries; and case and agreement, which mark grammatical relations (thus showing who did what to whom). But none of them appear to play any role in conceptual structure, though it might be argued that these functional elements are relatively recent innovations in language (Heine and Kuteva, Chapter 54), thus played no part in the human ‘environment of evolutionary adaptedness’, and are not part of the initial adaptation.
The issue of critical initial selection pressures relates to, but should not be conflated with, the continuity issue. There are three distinct views of continuity (see also Hauser et al. 2002). One, language is totally dissimilar both to animal communication and cognition; there is thus no continuity at all with prior systems, and the language faculty (or some subpart deemed essential) is a saltation (Piattelli‐Palmarini 1989, 2010). Two, direct continuity with animal communication systems is the only biologically feasible solution. Saltations giving rise to exceptional complexity are impossible. Our ancestors are primates, so language must originate in primate call/gesture systems (Wray 1998, 2000). Three, language is totally dissimilar to modern great ape communication, so cannot evolve from similar ancestral systems, but instead its origins lie in primate cognitive/conceptual systems (Bickerton 1998, 2000; Hurford 2002, 2007; Fitch et al. 2005; Newmeyer 2005; Pinker and Jackendoff 2005; for discussion in this volume, see Arbib, Chapter 20; Seyfarth and Cheney, Chapter 4; Wilkins, Chapter 19).
Proponents of the first view note that evolutionary novelties may arise, not by gradual adaptation, but by apparently quite sudden shifts, a view drawing on the punctuated equilibrium approach of Gould and Eldredge (1977); see Lightfoot, Chapter 31. Not all changes are adaptive, though they may subsequently acquire adaptive value and be selected for. (We note, though, that contrary to older assumptions, natural selection is now known to produce some very fast adaptations; see Számadó and Szathmáry, Chapter 14.) Under this first view, the language faculty (whatever it contains) may even be a spandrel, a feature not directly selected for, but which originally arose for non‐adaptive reasons, as a by‐product of evolutionary change (Gould and Lewontin 1979). Adopting this view, Piattelli‐Palmarini (1989) claims that looking for language precursors in other apes is (p. 26) useless; once the gradualist/adaptationist view is discarded, biologists need not search for intermediate stages of language: none exist. But, of course, there is no way to test the validity of any hypothesis except by examining evidence. In this case, the only way to determine the validity of the ‘no language precursors in apes’ hypothesis is to study apes.
The remaining two views do assume some continuity with animal precursors, but cover a wide range of possibilities.
To an extent, we believe there is some evidence to support all three views. The problem is, which aspects of language are deemed to be critical? Since language is a complex system with crucially interacting subparts, it is unhelpful to promote one aspect at the expense of others. Some aspects of language may well be spandrels. For instance, Fitch (2000a) suggests that the lowered larynx may have evolved primarily for size exaggeration, and was only subsequently selected for with regard to speech sound production. But other aspects have been directly selected for, and have clear primate precursors with a similar function. There are certainly trivial senses in which primate communication systems give rise to language; for instance, we use a modified version of the same vocal tract, and the same auditory system, to produce and perceive sound (and mutatis mutandis for the gestural and visual systems in the case of sign languages). However, there are few similarities between language and the natural communication of other apes—not even in the sound system, which might be expected to show some traces of its evolutionary history. Therefore, the second view, if understood in any but the weakest sense, is unsupported. The third view, that (aspects of) language are rooted in primate cognition, seems to us to have empirical support (for discussion of evidence, see Hauser 1996; Hurford 2007, Chapter 40; Gibson, Chapter 3; Seyfarth and Cheney, Chapter 4). It also seems a promising line of further enquiry, since the cognitive capabilities of other modern apes can at least be investigated scientifically; we hope that this is not the biological parallel to the drunk looking for his keys under the lamp‐post. In contrast, Penn et al. (2008) argue that there is a radical discontinuity between human and animal cognition, though they do not doubt that both human cognition and language ‘evolved through standard evolutionary mechanisms’ (2008: 129).
In the following section, we briefly examine relevant stages in hominin evolution.
1.5 Human evolution and language
Although all great ape species (Gibson, Chapter 3; see also Knight and Power, Chapter 37) have sufficient mental and communicative capacities to use rudimentary protolanguages, none do so in the wild. Hence, protolanguage almost certainly (p. 27) dates to a period subsequent to the hominin phylogenetic split from chimpanzees and bonobos, around 5–8 million years ago (mya) (Cann, Chapter 24). Language, then, is an innovation in hominins. Although a number of hominin and possible hominin fossils date to the lengthy period lasting from 2–7 mya (Wood and Bauernfeind, Chapter 25), we do not know for sure if any of these early hominins, such as Paranthropus, Ardipithecus, Australopithecus, or indeed early Homo were our direct ancestors. Moreover, we cannot tell whether any of these hominins had any form of protolanguage or anything we would recognize as speech. If they did, this is not obvious from fossil anatomy, as most species had brain sizes only slightly larger than those of apes, and at least one had an ape‐like hyoid bone.
However, nearly all australopithecines and early forms of Homo had made definite strides in the direction of human‐like adaptations. Although most retained some (ape‐like) adaptations for an arboreal lifestyle, all were also bipedal to some degree; bipedalism is a trait often hypothesized to serve as a preadaptation to the descended larynx that characterizes modern humans (MacLarnon, Chapter 22). Most of these hominins also had enlarged molar teeth with thicker enamel layers than is generally present in apes (Wood and Bauernfeind, Chapter 25), suggesting dietary adaptations. By 2.6 mya, some were clearly manufacturing sharp‐edged stone tools (Semaw et al. 1997) which were used for cutting meat and tendons (Mithen, Chapter 28; Wynn, Chapter 27). Finds of bashed long bones and crania indicate that hammerstones were used even earlier, and what are possible cut marks on bones suggest that manufactured or naturally sharp‐edged stone tools may also have been used prior to 3 mya (McPherron et al. 2010). Taken together, this evidence suggests that by the period between 2 and 4 mya, early hominins had adopted some non‐ape‐like foraging strategies (Gibson, Chapter 35). They had perhaps also begun the long evolutionary trek to a hominin feeding adaptation that was eventually characterized by cooking (Wrangham 2009), and by the exploitation of foods with high nutrient density, which were hard to acquire, largely extractive or hunted (Parker and Gibson 1979; Lancaster and Kaplan 2007). These altered dietary strategies both enabled and were accompanied by major changes in life history and social structure. For example, humans are less mature at birth than apes, have a longer period of growth and maturation, and live longer (Locke, Chapter 34). Also, human adults—unlike ape adults—form male/female food‐sharing bonds, and routinely provision the young (Deacon 1997; Hrdy 2009).
Possibly these dietary, life history, and social changes played a role in the selection of various aspects of protolanguage, babbling, or speech (Falk, Chapter 32; Locke, Chapter 34; Parker and Gibson 1979).
Dietary changes may also have helped provide the nutrients and energy necessary to sustain the brain enlargement which characterizes early Homo, with brain sizes of 500–725 cc, as opposed to 400–525 cc in australopithecines (Wood and Bauernfeind, Chapter 25). This was the initial step in a two‐million‐year‐long period of continuing neural expansion. In a human adult, the brain uses 20% of the body's (p. 28) metabolic energy; in newborns the figure is closer to 60% (Aiello et al. 2001). Growing brains also have very high essential fatty acid requirements (Singh 2005). From comparative studies of animal tool users, Parker and Gibson (1979) hypothesize that the earliest hominin tool users would have acquired increased nutrients by practising omnivorous extractive foraging; that is, they would have adaptations enabling them to use tools to remove a wide variety of high energy, nutrient‐dense foods from outer casings, including hard‐shelled nuts, beans, tubers, embedded insects, honey, shellfish, brains, bone marrow, tortoises, burrowing animals, and meat that required extraction from tough outer hides (see also Lancaster and Kaplan 2007). Later work by Aiello and Wheeler (1995) provided a compatible ‘expensive tissue’ hypothesis. Using evidence of relative gut and brain sizes in various animal species, they suggest that hominin brain evolution required increased consumption of high energy, easy‐to‐digest foods, such as animal products, nuts, or underground tubers. Both hypotheses appear to have been verified by findings of bones that were broken open to allow extraction of marrow, and stone tools that could be used to cut meat in early hominin times (Wynn, Chapter 27). Wrangham et al. (2009) also suggest that early hominins were exploiting water‐based underground storage organs (see also Tobias 2001).
Aiello and Wheeler (1995) further suggest that cooking may have been a critical adaptation for meeting the energy requirements of the brain, a hypothesis that has been greatly developed by Wrangham (2009). The cooking adaptation, however, almost certainly arose at a later time than the basic shift to a high quality, nutrient‐rich diet.
By between 1.8 and 1.9 mya, we see the emergence of a new grade of Homo, generally referred to as Homo erectus, though African specimens are sometimes classified separately as Homo ergaster. These hominins are characterized by full bipedalism and greatly enlarged brains (900–1000 cc) in comparison to those of apes; see Mann, Chapter 26. Not only were Homo erectus brains absolutely large in comparison to those of apes or earlier hominins, they were also relatively large in comparison to body size, and they exhibited a hominin‐specific enlargement (‘reorganization’) of the parieto‐occipital‐temporal region (Wilkins, Chapter 19). It has long been debated which factor has played the greatest role in the evolution of enhanced human intelligence: brain reorganization, absolute brain size, or relative brain size (Holloway 1968; Jerison 1973). Homo erectus, however, had them all. Perhaps this is not surprising since most postulated reorganizational changes in the human brain (such as increased numbers of gyri and sulci) increased neuronal connectivity, and proportionally greater increases in the size of some brain regions rather than others actually correlate with increased brain or cortical size (Jerison 1982; Passingham 1975). Of course, a postulated change in one parameter does not indicate that other changes lack relevance. Even if key changes in the brain that are not size‐related have occurred in human evolution, this does not negate the potential cognitive importance of increased neural information‐processing capacities (Gibson and Jessee 1999).
(p. 29) So the increased brain size and obligate bipedalism of Homo erectus clearly indicate an entry into a new, non‐arboreal niche that was not ape‐like, and that involved the exploitation of higher‐energy foods than was the case for apes or earlier hominins, perhaps including increased quantities of animal foods such as scavenged (Bickerton 2009a) or hunted meat, along with tubers and/or cooked foods (Aiello and Wheeler 1995; Wrangham 2009). Whether this move into a new niche was propelled by climatic changes, population pressures, or simple opportunism is unclear. It is unknown whether the increases in brain size in Homo erectus had yet to result in the earlier births of their young (Falk, Chapter 32). It is also not clear whether dietary changes in this period had already necessitated social adaptations such as the adult provisioning of young, male/female food-sharing bonds, and cooperative foraging endeavours. Eventually, such social adaptations may have contributed to the supply of increased sustenance for growing brains (Marlowe 2010).
Evidence from the archaeological record may also provide indications of cognitive changes. Bilaterally symmetrical Acheulean handaxes (Mithen, Chapter 28; Wynn, Chapter 27) are produced by African Homo erectus, beginning about 1.6 mya. These indicate that in comparison to apes and earlier hominins, Homo erectus had increased spatial intelligence (Wilkins, Chapter 19), increased procedural learning capacities (Coolidge and Wynn, Chapter 21; Wynn, Chapter 27), possibly an increased tendency to practise new skills (Corballis, Chapter 41; Donald, Chapter 17), the ability to hold greater amounts of information in mind (Gibson and Jessee 1999), and greater social learning and imitative skills (Donald, Chapter 17; Mithen, Chapter 28). We cannot, of course, know for sure whether Homo erectus also had any form of language or protolanguage. Taken as a whole, though, evidence including the invasion of a new niche by this species, its expanded brain size, and its ability to manufacture spatially symmetrical tools does make it seem quite likely that erectus possessed a pre‐syntactic protolanguage, though there is less agreement about the properties this might have had (Tallerman 2007, 2008a, Chapter 51). Dunbar (Chapter 36) argues on the basis of correlations between social group size and neocortical size that erectus also had enhanced vocal capacities in comparison to apes. MacLarnon (Chapter 22), on the other hand, argues that the narrow thoracic spinal canal of Homo erectus suggests an absence of enhanced breathing control for speech (though see Wood and Bauernfeind, Chapter 25).
A further major transition occurred about 400 kya with the appearance of advanced hominins, which some palaeoanthropologists classify as archaic Homo sapiens—humans—and others as Homo heidelbergensis (Gibson and Tallerman, Chapter 23; Mann, Chapter 26). These hominins had an almost modern brain size of about 1200–1300 cc. Advanced lithic technology also emerges at this time, and we find the earliest clear evidence of hunting (not merely scavenging) in the form of carefully made wooden spears from Schöningen, Germany, dating to 380–400 kya (Thieme 1997); evidence also comes from the remains of hunted roe deer, horses, (p. 30) and giant elk from various English sites (Wynn, Chapter 27). From the same era, 400 kya, comes the first plausible evidence of deliberate interment of human remains, from the ‘Pit of Bones’ in Atapuerca, Spain, as well as the first evidence for the use of mineralized pigments, perhaps for body ornamentation (see Wynn, Chapter 27).
There are widely differing estimates of when language (let alone protolanguage) arose, depending on what evidence is considered relevant (see Gibson and Tallerman, Chapter 23). Some archaeologists, as well as some linguists, regard the emergence of the full language faculty as extremely recent; see, for instance, Tattersall (1998b, 2010) and Chomsky (2005, 2010). The cited works by Chomsky speculate that the language faculty is part of a broader ‘human capacity’ which only appeared around 50 kya, and certainly less than around 100 kya. The idea that language is not the product of a long and slow evolution, but rather, is a recent saltation ties in with the Minimalist assumptions discussed above. Chomsky consistently suggests that only one crucial step was required, the appearance of Merge, seen as the product of ‘brain rewiring’. In Chomsky's view, this is incompatible with a gradual evolution of the full language capacity. However, the mere existence of the capacity to ‘merge’ information does not necessarily imply the presence of the expanded working memory capacities that would be needed to merge large amounts of information (Coolidge and Wynn, Chapter 21). It has also been argued that the ‘merge’ capacity is not unique to language, but is also necessary for the manufacture of constructed tools (e.g. hafted tools) and thus was present considerably earlier than 50 kya (Gibson and Jessee 1999).
What has been considered evidence for a very recent emergence of language is the massive increase in technological sophistication seen in the archaeological record in Europe, starting at around 35–40 kya, a period known as the Upper Palaeolithic. This technology led to composite tools such as harpoons and spearthrowers; highly adaptable, specialized stone blades and microliths, i.e. small blades used in backed tools; a wider variety of materials worked on, including ivory, bone, and antler; as well as an explosion in forms of art, including personal ornaments, an increased use of pigments, carvings in various materials, cave paintings, musical instruments, and grave goods, all of which have been associated with symbolic cognition (e.g. Tattersall 2010). According to this view, a ‘human revolution’ occurred within the last 50 ky, producing as a package advanced technology, symbolic thought, and language (Klein 1989, 1992, 1998; Mellars 1989a, b; Mithen 1996; Klein and Edgar 2002).
However, many recent archaeological discoveries, and more accurate methods of dating previous discoveries, have changed the picture radically; see d'Errico and Vanhaeren, Chapter 29; also Wynn, Chapter 27. What was once thought to be specifically European technology is now known to have been present in Africa, starting perhaps 300 kya, but in any case significantly pre‐dating the European Upper Palaeolithic (McBrearty and Brooks 2000 is the seminal reference, but much (p. 31) work in the past decade also supports this view; see contributions to Mellars et al. 2007). The use of finely‐made blades and points in the African Middle Stone Age dates to at least 280 kya (McBrearty 2007), as do grindstones for processing plant foods and pigments used in colouration. Since Middle Stone Age points were routinely hafted (McBrearty and Brooks 2000), it can no longer be assumed that composite thought is a recent innovation in our species. Personal ornaments such as beads date to between 70 and 100 kya, and abstract art, in the form of carved bone and ochre, dates to the same period. Brown et al. (2009) report on the controlled use of fire in the heat treatment of stones, in order to improve their flaking properties, apparently dating as far back as 164 kya in Pinnacle Point, South Africa. The authors point out that a sophisticated knowledge of fire is required to complete the process, which is cognitively highly demanding. This location also provides the earliest evidence for the consumption of shellfish (164 kya, ± 12 ky), argued to have been collected by predicting tides (Marean et al. 2007; Marean 2010a). And Lombard and Phillipson (2010) report on findings from Sibudu Cave, South Africa, showing that stone‐tipped arrows were in use 64 kya, pushing the use of bows back at least 20,000 years from previous discoveries.
Interestingly, recent genetic findings support these lines of evidence. Behar et al. (2008) report that mitochondrial DNA (mtDNA—which is transmitted via maternal inheritance only) in the Khoisan peoples of South Africa diverged from mtDNA in the rest of the human gene pool between 90 and 150 kya, and remained separate for a long period: introgression from other lineages did not occur until about 40 kya. Since the Khoisan peoples have a normal human language faculty, this strongly suggests that full language was already in place at the time of the split. See also the discussion in Cann, Chapter 24.
In sum, the picture now emerging from archaeological, palaeoanthropological, geological, and genetic evidence (Cann, Chapter 24; Mann, Chapter 26) indicates that ‘modern’ human anatomy, behaviour, and cognition are significantly older than once believed, implying that far from being a recent innovation, symbolic thought and some form of language may have been in place 200 kya, or, indeed, significantly earlier.
1.6 Organization of the volume
The volume is divided into five parts. Part I, Insights from comparative animal behaviour, examines animal communication systems and cognitive capacities of potential relevance to the evolution of language and speech. Part II, The biology of language evolution: Anatomy, genetics, and neurology, offers various views of the physical components of a language faculty, including the evolution of a language‐ready brain, the potential relevance of genetic changes in the hominin lineage, and the evolution of the vocal tract. Part III, The prehistory of language: When and why did language evolve?, centres on palaeontological and archaeological evidence of human evolution and presents current interpretations of the selective events that may have led to the evolution of language. Part IV, Launching language: The development of a linguistic species, presents the most immediately ‘linguistic’ chapters, dealing with central properties to be accounted for in language evolution, and issues surrounding the forces that shaped the language faculty. Finally, the chapters in Part V, Language change, creation, and transmission in modern humans, examine a number of putative ‘windows’ on language evolution; for instance, modern events involving language emergence or change, for which we have reasonably concrete evidence, might shed light on the evolution of the language faculty itself.
Chapters in Part I focus both on non‐human primates and on other more distantly related species. In the case of non‐human primates, the central question concerns homologous traits, that is, properties shared by humans and other primates by virtue of common ancestry. For instance, all apes (gibbons, siamangs, chimpanzees, bonobos, gorillas, orang‐utans, and humans) lack tails, and apart from orang‐utans, live in social groups. Hence, we can infer that the last common ancestor of these species (which probably lived 14–18 mya; Cann, Chapter 24) was also tailless and social. Similar methods can also shed light on changes within specific lineages. For instance, great apes apart from humans are all well adapted to tree‐climbing and have air sacs, so we can infer with confidence that our distant ancestors were largely tree‐dwelling and had air sacs, and it is the hominin lineage that diverged.
When it comes to cognitive characteristics, it is far harder to determine homologies, because unlike physical characteristics, these are not readily available for inspection in the fossil record and, indeed, it has often proven difficult to determine cognitive capacities in living species. Thus, in the case of language evolution, we have many more questions than answers concerning possible primitive traits, i.e. those inherited from a common ancestor. Nonetheless, progress is being made. From the chapters in Part I, it seems fairly clear that the common ancestor of great apes and humans, although almost certainly lacking protolanguage, would have possessed a number of protolanguage‐pertinent cognitive and communicative skills, including the ability to create novel referential gestures and, if sufficiently motivated, to use such gestures in cooperative endeavours. It remains uncertain whether or not the common ancestor of all great apes and humans, or even the common ancestor of chimpanzees, bonobos, and humans (about 5–8 mya) had the capacity to create novel vocalizations. However, it now appears that extant great apes have more vocal flexibility than previously thought (Slocombe, Chapter 7).
Although great apes may have most, or even all, of the essential cognitive prerequisites for a protolanguage of some kind, they clearly do not possess the full complement of neural, cognitive, and physical characteristics (such as the uniquely human vocal tract) needed for fully developed language and speech, including syntax and extensive hierarchical capacities. In some cases, apes may possess cognitive or neural traits which they use in non‐communicative contexts, but which were exapted for language; that is, traits that were co‐opted for language at some point during hominin evolution, such as social cognition (Seyfarth and Cheney, Chapter 4). In other instances, essential components of language or speech lack any primate homologues at all, but rather evolved anew in the hominin line. Disentangling these issues is fraught with problems, but, nonetheless, stands as one of the most critical goals in the reconstruction of language evolution.
Examining communication and cognition in distantly related lineages such as birds and cetaceans (marine mammals) is unlikely to reveal relevant homologous traits, other than those that might be common to all vertebrates. However, it may shed light on analogous traits: functionally similar features that have evolved separately in more than one lineage, usually in response to similar environmental pressures, a phenomenon termed convergent evolution. For example, wings have evolved separately in birds and bats. Both types of wing permit flight, but because these lineages share no recent common ancestry, their wings are structurally very different. Although, at least from the perspective of most linguists, nothing in the rest of the animal kingdom is remotely analogous to language, a number of animals do have vocal learning capacities that are far more sophisticated than those of non‐human primates. What we can hope for from animal studies is to find out something about the selection pressures that led to the evolution of advanced vocal learning capacities and other traits absent in apes, but present in humans and some distantly related taxa. For example, what selective pressures led humans to a communication system that is fundamentally innate, since all normal infants clearly have a language faculty, yet which in all its fine details (all the specific elements of each ‘language’) is learnt, transmitted from parent to offspring, just like some bird song?
Turning now to Part II, The biology of language evolution: Anatomy, genetics, and neurology, we note that scientists from different disciplines ask different, but equally valid, questions. In doing so, as Fitch notes, they often ignore or ‘black box’ subject matter beyond their specific realms of inquiry and expertise. It is common, for example, for animal behaviourists to focus entirely on visible behaviours, while ignoring the genetic, physiological, and neurological mechanisms that make those behaviours possible. Similarly, many language evolutionary theorists treat genes and brains as black boxes. Black boxing often results in theories that make sense, until one examines the contents of the box. For example, on the basis of archaeological data, some have proposed that language arose from a sudden rewiring of the brain, perhaps caused by a specific mutation (see Klein and Edgar 2002; note also remarks above). Although such proposals may seem perfectly reasonable if viewed solely from the context of archaeological remains, to many geneticists and neurobiologists they seem untenable. Consequently, a full understanding of language evolution will require the opening of many boxes and the integration of their contents into comprehensive, synthetic frameworks. To aid in this endeavour, the chapters in Part II focus on biological aspects of language evolution primarily from the perspectives of anatomy, neurobiology, and genetics, but also to some extent on modern understandings of neural developmental processes and evolutionary mechanisms that can result in species‐wide fixations of what were initially plastic phenotypes.
Part III is The prehistory of language: When and why did language evolve? Chapters here address questions of language origins from genetic, palaeoanthropological, archaeological, and linguistic perspectives, and offer some selective scenarios for the evolution of language. Incontrovertible evidence for the existence of language dates only to the emergence of written scripts a few thousand years BP. Though it is certain that language arose long prior to that, the time period of its emergence is unknown. Widely diverse information drawn from a number of academic disciplines can, however, provide clues, which, taken in combination, may eventually yield plausible answers. Genes and fossils, in combination, help us chart phylogenetic pathways and times of origin of both early hominins and modern humans. From the fossils, we can also determine the size and external configuration of the brain of each hominin species, as well as the structure of the hyoid bone, oral cavity, thorax, and locomotor organs. From archaeological remains we glean evidence of hominin lifestyles, diet, technology, and art. From historical linguistics, we gain insights about language history as well as possible answers to key questions such as whether modern languages derive from single versus multiple early languages (see also Nichols, Chapter 58). The papers in Part III focus largely on two interconnected issues. First, when did the language faculty emerge? And when are protolanguages and languages likely to have initially appeared? Second, why did language evolve? Did language, for example, arise as a mere accident of genetic drift or as a spandrel (by‐product) of some other evolved trait(s)? Alternatively, perhaps the language faculty, speech, or even particular linguistic features were the specific targets of selective agents; if so, what might those agents have been?
In Part IV, Launching language: The development of a linguistic species, the chapters deal with the various aspects of the language mosaic that must be accounted for in language evolution. Here, the cognitive capacities required as prerequisites for language are discussed, including the symbolic capacity, the development of linguistic meaning, and the ability to learn and store vocabulary. Questions surrounding the modality for language externalization are discussed: were the earliest pre‐languages gestural or vocal? How did the uniquely human system of duality of patterning emerge? A central issue in evolutionary linguistics concerns the existence, and appearance, of what is generally termed ‘protolanguage’: a putative pre‐syntactic stage which, like full language, depended critically on socio/cultural transmission, and thus involved learning. Protolanguage thus constitutes a radical break with the primate communication systems that must have preceded it: it is entirely volitional and (in its details) entirely learned. Whatever the actual form of such protolanguages was, how did they subsequently turn into fully syntactic languages? These issues are debated in Part IV.
Part V is Language change, creation, and transmission in modern humans. A number of chapters here discuss instances of observable language creation and change, including the formation of pidgins and creoles; homesign and emergent sign languages; the processes of grammaticalization, which occur in all languages; and the role of contact between individuals and groups in triggering language change and language shift. Plausibly, what is known about the way languages emerge and change in recorded history and in the present day may offer evidence for the trajectory of evolution in the earliest pre‐languages and languages. Most of the chapters in Part V emphasize the critical role played by the socio/cultural transmission of language. Since the details of individual languages can only be transmitted in a social context, it seems highly likely that this mode of transmission has shaped the language faculty itself, and thus the form that languages themselves take. Some authors in Part V suggest that languages are themselves adaptive systems, akin to independent biological entities but residing in human brains. Evidence of a fairly new kind is offered to support the hypotheses in a number of chapters in Part V: work from computational, mathematical, and robotic modelling. As is the case with molecular evidence, modelling is a relatively young field, and it is certain that many advances will occur in this sphere in the coming decades.