Genetic influences on language evolution: an evaluation of the evidence
Abstract
This article concentrates on the three genes of recent interest in the literature on language origins. These genes are microcephalin and ASPM, which cause microcephaly when disabled, and FOXP2, which causes a severe speech and language disability when disrupted. The FOXP2 gene was isolated, sequenced, classified as a member of the forkhead box family, and named FOXP2 by the year 2001. The protein products of forkhead genes have forkhead DNA binding domains, which bind to specified regulatory sequences in other genes and regulate the expression of these other genes. FOXP2 is expressed in the mouse brain during development, but is also expressed in a wide variety of mouse tissues. The gene has many essential roles in mammalian development and function that are totally unrelated to language. It was announced in the year 2005 that two genes essential for proper brain growth, microcephalin and ASPM, are undergoing a change: new variants of these genes have been gaining frequency. Microcephalin and ASPM proteins are crucial for proper brain development. Microcephalin is involved in regulating the cell cycle, especially in relation to DNA repair before cell division. ASPM helps to align the mitotic spindles in the cell so that it divides symmetrically. The defective versions of microcephalin and ASPM result in microcephaly, a genetic disorder in which people have small heads and small brains.
It is commonly accepted by biologists that language is a defining characteristic of Homo sapiens and that the biological capacity for language was present in the earliest anatomically modern humans. Chimpanzees and bonobos, the species most closely related to us, on the other hand have only limited linguistic capabilities, using manual, visual, and auditory modalities, and they have minimal conscious neuromuscular control of the vocal tract, not being able to speak or even mimic words. Now that both the human and the chimpanzee genomes have been sequenced, we can begin to ask seriously what the genetic changes were that gave humans the ability to speak and to use our grammatically elaborated languages. We are seeing that the answer is much more complicated than such simplifications as ‘the grammar gene’ or ‘a mutation for language’.
We concentrate in this chapter on the three genes of recent interest in the literature on language origins: two genes that cause microcephaly when disabled (microcephalin and ASPM), and FOXP2, which, among other things, causes a severe speech and language disability when disrupted.
In evaluating the evidence, two important principles should be kept in mind: (1) many genetic processes which may interfere with normal language use are not related to the origin or evolution of language in any major way; and (2) the presence or absence of a large brain does not guarantee the presence or absence of language.
For an example of the first principle, the fragile X syndrome in humans can lead to a syndrome including delays in speech acquisition, repetitive speech, or problems with vocabulary, syntax, and labelling. In addition to communication difficulties, the children can show varying levels of mental retardation, distinctively large ears and testicles, and long faces, and they may suffer from excessive shyness, tremors, or seizures. In the disease state, the FMR‐1 gene cannot produce the FMRP protein that is necessary for proper development of neural connections in the developing brain. The mouse model of fragile X syndrome shows that this syndrome does not primarily target language, but targets underlying processes of neurodevelopment at a very basic level.
An example of the second principle, concerning large brains, is that microcephalics with a chimpanzee‐size human brain can learn language at least to the level of 6-year-olds by the time they are 12, showing that the organization of the brain, not its size, is crucial for language. Note also that a human mutation in the CMAH gene about 2.7 million years ago stopped production of an enzyme that apparently inhibits brain cell growth (Chou et al. 2002). Releasing this brake on brain cell growth may have been a major factor in tripling the size of the human brain since that time. This growth may have been a prerequisite for the reorganization of the brain to give it the capacities for language, but it does not in itself explain this reorganization or the origin of language capabilities.
With regard to FOXP2 and the microcephaly genes, discoveries have sometimes led to false hopes as well as premature and erroneous conclusions, but evidence is building up and is giving us leads for further study. Let us review this evidence.
In 1990 there was much excitement that the ‘grammar gene’ might have been found. Half of the members of a three‐generation family in London, the KE family, had a severe speech and language disorder which showed the inheritance pattern consistent with a dominant autosomal gene (Hurst et al. 1990). Gopnik and Crago (1991) did a linguistic study on this family and found special difficulty with paradigmatic grammar. In the popular mind this translated to the notion of a ‘grammar gene’. Noam Chomsky had been arguing for more than three decades that universal grammar was innate in humans, and here, finally, was possible genetic evidence.
Further studies of the KE family demonstrated that grammar was not the central issue; the core deficit was one involving sequential articulation and orofacial praxis (Watkins et al. 2002a). In affected members of the KE family we see disruption of normal brain development resulting in increases and decreases of a wide range of brain structures, including subcortical structures and the cerebellum (Watkins et al. 2002b).
By 2001 the gene was isolated, sequenced, classified as a member of the forkhead box family, and named FOXP2 (Lai et al. 2001). The protein products of forkhead genes have forkhead DNA binding domains which bind to specified regulatory sequences in other genes, and regulate the expression of these other genes. More often than not FOXP2 turns other genes off (Spiteri et al. 2007). FOXP2 is expressed in the mouse brain during development, but is also expressed in a wide variety of mouse tissues; it has many essential roles in mammalian development and function that are totally unrelated to language.
Enard et al. (2002), in their study of the ‘Molecular evolution of FOXP2, a gene involved in speech and language’, found that there are three amino acid changes in human FOXP2 protein compared with mouse, and that two of these changes come in the human line in the 6 million years since our common ancestor with chimpanzees. Later work showed that the one difference between chimpanzee and mouse occurred on the mouse line. FOXP2 is one of the most conserved mammalian genes, with no FOXP2 protein amino acid changes in chimpanzee, gorilla, or macaque going back some 90 million years since the common ancestor of primates and rodents. Yet there are two changes between humans and our common ancestor with chimpanzees. Enard et al. speculated that these relatively recent changes in the human line may have affected ‘a person's ability to control orofacial movements and thus to develop proficient spoken language’. Then using computer simulation with a likelihood model they concluded that the most likely date for the fixation of these human mutations was zero years ago, that is, so unlikely, by their model, as to almost not have happened. They calculated a 95% confidence interval going back 120,000 years on a chi‐square distribution highly skewed toward zero years ago. This, they said, was consistent with Klein's speculation that a mutation for language caused a cultural revolution 50 kya that led to art and to human migration out of Africa (Klein 1989).
McBrearty and Brooks (2000), in their article ‘The revolution that wasn't’, effectively refuted Klein's hypothesis of a cultural revolution. In a paper presented at the Cradle of Language conference in 2006, Diller and Cann showed that the date of zero years ago was based on a flawed and inappropriate model. Using genomic evidence, we proposed a date of 1.8 or 1.9 mya for the mutations in FOXP2, approximately the time when the genus Homo (Homo habilis, H. ergaster, H. erectus) emerged (Diller and Cann 2009). A year later, in 2007, a team including Enard sequenced the Neanderthal FOXP2 gene and announced that Neanderthals have the same mutations in FOXP2 that modern humans have (Krause et al. 2007). The date for the Neanderthal/modern human common ancestor is some 660–500 kya, so the mutations in FOXP2 occurred some time before that split, consistent with our date of 1.8 or 1.9 mya. Hardly anybody believes any more that the human mutations in FOXP2 occurred in the last 200,000 years, i.e. since the emergence of anatomically modern humans.
In several species, FOXP2 is related to vocalization (see references in Diller and Cann 2009): it is upregulated in neurodevelopment during seasonal periods of song circuit growth for canaries, and during the time when zebra finches learn their song in infancy. In adult zebra finches, FOXP2 protein is downregulated during singing. In mice with only one copy of FOXP2, ultrasonic vocalization of infants is greatly decreased. In echolocating bats there is an unusually large number of mutations in FOXP2. Thus it would seem likely that FOXP2 is more important for developmental circuits for vocalized speech than for something as complex as grammar.
Speech and certain aspects of grammar, however, are closely related to each other from the standpoint of human neural function. The KE family with its disruptive FOXP2 mutation has a disruption of both speech and certain aspects of grammar in the wider context of orofacial dyspraxia. Broca's aphasia, stemming from lesions in the motor association cortex for the vocal tract, involves both effortful, distorted speech and agrammatism. Grammar may have been dependent on speech for its neural origin, the neural mechanisms for grammar in Broca's area being elaborated on the motor association cortex for vocalization.
The date of 1.8 or 1.9 mya for the human FOXP2 mutations is just at the time when the human brain began to triple in size, from the 450cc of australopithecine and chimpanzee brains to the 1350cc of modern human brains. If the elements of vocal speech began early in this evolution, then symbolic speech, grammatical language, and the spectacular brain growth would have evolved together, the type of co‐evolution a biologist would expect.
Mutations in a single gene can cause disease and great disruption of function, as in the KE family, but biologists do not expect single mutations to cause complex innovations such as the origin of language; they expect long periods of co‐evolution with many genetic changes.
15.2 Tone languages, microcephalin, and ASPM
In 2005 it was announced that two genes essential for proper brain growth, microcephalin and ASPM, are currently undergoing a change: new variants of these genes have been gaining frequency in the last ~37,000 years (for microcephalin) or ~5800 years (for ASPM) (P. Evans et al. 2005; Mekel‐Bobrov et al. 2005). Dediu and Ladd noticed a negative correlation between these new gene variants and tone languages (Dediu and Ladd 2007). They argued that tonality is phylogenetically older, and that these new gene variants give populations a bias towards non‐tonality in language. Although Dediu and Ladd present statistically significant correlations, the pattern is almost certainly a statistical artefact, and the accompanying speculation goes far beyond the evidence, as discussed below.
We need some background on these two genes because of the controversies in the press about intelligence and race when it was announced in 2005 that the brain is still evolving, and that certain favourable new mutations are common in Eurasia but not in Africa. Bruce Lahn and the University of Chicago wanted to patent these genes for a potential intelligence test in case it could be shown that there was a cognitive advantage to these mutations (Balter 2005). No cognitive advantages have been found and the patent process was stopped. People with these mutations have no advantages on tests for general or social intelligence, and have no differences in brain size or head circumference. If Dediu and Ladd were right that these new mutations caused a bias against tone languages, this could actually be seen as a cognitive disadvantage.
The question of whether there was recent selection at the microcephalin and ASPM loci is a serious one. The original evidence came with comparison to simulated data. A Harvard group, noting wryly that ‘detection of selection solely by comparison with simulated data has had a mixed record’, compared ASPM with a number of other loci empirically, and found no evidence of selection (Yu et al. 2007). Population genetic models with population structure combined with population growth could also produce the patterns found at ASPM and microcephalin (Currat et al. 2006). That is to say, genetic drift without selection could explain the spread of these genes. If there is no selection at the ASPM or microcephalin loci, arguments based on selection would no longer be viable.
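The point that drift alone can spread a variant can be illustrated with a minimal sketch. The code below is a textbook Wright–Fisher model of neutral drift in a single population of constant size; it is not the structured, growing-population model of Currat et al. (2006), and the population size, number of generations, and 30% threshold are illustrative assumptions rather than estimates from the literature.

```python
import random


def wright_fisher_neutral(pop_size, start_count, generations, rng):
    """Simulate neutral drift of one allele in a diploid population.

    Each generation, all 2 * pop_size gene copies are resampled from the
    previous generation with no selection. Returns the list of allele
    frequencies, stopping early once the allele is lost or fixed.
    """
    copies = 2 * pop_size
    count = start_count
    trajectory = [count / copies]
    for _ in range(generations):
        if count == 0 or count == copies:
            break  # lost or fixed: the frequency can no longer change
        p = count / copies
        # Binomial sampling: each new copy carries the allele with probability p.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        trajectory.append(count / copies)
    return trajectory


def fraction_reaching(threshold, runs, pop_size, start_count, generations, seed=0):
    """Share of independent runs in which the allele ever reaches `threshold`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        trajectory = wright_fisher_neutral(pop_size, start_count, generations, rng)
        if max(trajectory) >= threshold:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    # Hypothetical parameters: 50 diploid individuals, one new mutant copy.
    share = fraction_reaching(threshold=0.30, runs=200, pop_size=50,
                              start_count=1, generations=200)
    print(f"Runs in which a neutral variant reached 30%: {share:.1%}")
```

Most new neutral variants are lost within a few generations, but a minority rise to substantial frequency with no selective advantage at all. That is the qualitative behaviour the drift explanation relies on.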
Defective versions of microcephalin and ASPM result in microcephaly, a genetic disorder in which people have small heads and small brains, brains the size of chimpanzee brains. With primary microcephaly, or microcephaly vera, there are no other physical signs except for the small brain and head and its attendant mental retardation. The structure of the small brain is grossly similar to the normal human brain, but there are subtle differences, and the cerebral cortex is somewhat smaller than in a perfectly scaled-down brain. Microcephalics can learn language (at least to the 6‐year‐old level) in spite of a chimpanzee‐size brain, showing that it is the organization of the brain, not its size, that is important for language.
When it was discovered that mutations in microcephalin and ASPM caused microcephaly, the immediate speculation was that these genes might ‘control’ (P. Evans et al. 2004a), ‘regulate’ (P. Evans et al. 2005), or even ‘determine’ (Jackson et al. 2002) brain size, and thus be key to the evolution of the large brain in modern humans.
This trap of reverse thinking was present in the first major work on microcephaly, Charles Vogt's 1867 monograph Memoire sur les microcéphales ou homme‐singes (homme‐singes = ape men), which examined almost 200 known European cases (Vogt 1867). Vogt proposed that microcephaly involved an ‘atavistic formation’, or a winding backwards of evolution. Given modern genetic knowledge this argument is untenable. Nevertheless it is a common‐sense trap that is easy to fall into even in the 21st century.
Microcephalin and ASPM proteins are crucial for proper brain development: microcephalin is involved in regulating the cell‐cycle especially in relation to DNA repair before cell division. ASPM helps to align the mitotic spindles in the cell so that it divides symmetrically. It is not clear that these genes ‘regulate’, ‘control’, or ‘determine’ brain size any more than air pressure in a tyre regulates, controls, or determines speed—a flat tyre causes speed to plummet, but more air than normal doesn't do much, and if air pressure remains normal, something else determines speed.
The data on the global frequency of the ASPM and microcephalin variants (ASPM‐D and MCPH‐D, respectively), as presented in the maps of P. Evans et al. (2005) and Mekel‐Bobrov et al. (2005), show that both mutations arose in Eurasia, with ASPM‐D much more recent and not spreading as much to East Asia.
Dediu and Ladd (2007) present a figure which shows that in populations where there is a low frequency of both MCPH‐D and ASPM‐D, the ten languages sampled are all tone languages. This represents the reality of sub‐Saharan Africa. In places with an ASPM‐D frequency of at least 30% (western Eurasia) there are no tone languages. Elsewhere there are approximately equal numbers of tone and non‐tone languages. The statistical algorithms used by Dediu and Ladd ruled out geography as a significant factor in the relationship between tone languages and these gene variants. But the mutations arose outside of Africa, and it is clear by inspection that if one eliminated Africa from the analysis the results would be quite different.
Dediu and Ladd exclude the Americas from their analysis because they were ‘too poorly sampled for their genetic and linguistic diversity’. Of 1002 native American languages (Gordon 2005), there is genetic data for ASPM‐D and MCPH‐D for only five population groups. But Africa has an estimated 2092 indigenous languages (Gordon 2005) and there are only 11 in Dediu and Ladd's analysis, almost exactly the same low sample ratio as in the Americas. Australia and India have no representative languages in the analysis. Besides the low sampling of languages, the genetic sampling from each language population is very low. The typical language group analysed had only 19 people giving genetic information; a third of the groups had between 7 and 10 people in the sample. This statistical base is not adequate to test Dediu and Ladd's hypothesis.
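The comparison of sample ratios can be checked with a few lines of arithmetic. This sketch uses only the counts quoted above and treats each sampled population group as standing for one language, which is a simplifying assumption; the function name is ours.

```python
def sampling_ratio(sampled: int, total: int) -> float:
    """Fraction of a region's languages represented in the genetic data."""
    return sampled / total


# Language totals from Gordon (2005) as quoted in the text;
# sampled counts are those reported for the analysis.
americas = sampling_ratio(5, 1002)
africa = sampling_ratio(11, 2092)

print(f"Americas: {americas:.2%}")  # prints "Americas: 0.50%"
print(f"Africa:   {africa:.2%}")    # prints "Africa:   0.53%"
```

On these figures both regions are sampled at roughly one language in two hundred, so the stated reason for excluding the Americas would apply equally to Africa.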
An additional statistical problem is that the ASPM‐D mutation originated so recently, an estimated 5800 years ago, that two‐thirds of the samples in this analysis have a frequency of this variant of less than 30%, and only one sample is significantly over 50%. Is that small proportion of the population large enough to bias language change so strongly as to drive out tone languages in 5800 years? The frequency of ASPM‐D 5800 years ago would have been 0%, and presumably it has been increasing at an exponential rate. This means that most of the increase has been in recent centuries in the historical period. Is there evidence that there has been widespread loss of tone systems in tonal languages during the historical period in populations where we find the ASPM‐D haplotype?
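The claim that most of the increase falls in recent centuries follows directly from the presumed exponential growth, and can be made concrete with a short sketch. The 5800-year age and the 30% frequency come from the text; the starting frequency of 0.0001 is a hypothetical value chosen only for illustration, and pure exponential growth is the simplification stated above, not a fitted population-genetic model.

```python
import math


def exponential_frequency(t, f0, f_now, age):
    """Frequency t years after origin, growing exponentially from f0 to f_now over `age` years."""
    rate = math.log(f_now / f0) / age
    return f0 * math.exp(rate * t)


def share_of_increase_in_last(years, f0, f_now, age):
    """Fraction of the total rise (f_now - f0) that occurred in the final `years` years."""
    f_before = exponential_frequency(age - years, f0, f_now, age)
    return (f_now - f_before) / (f_now - f0)


AGE = 5800     # estimated age of ASPM-D in years (from the text)
F_NOW = 0.30   # present-day frequency used as the threshold in the text
F0 = 1e-4      # hypothetical starting frequency (assumption)

for years in (500, 1000, 2000):
    share = share_of_increase_in_last(years, F0, F_NOW, AGE)
    print(f"Last {years:>4} years: {share:.0%} of the total increase")
```

Under these assumptions roughly three-quarters of the rise to 30% takes place in the last thousand years, which is why the absence of documented tone loss in the historical period bears directly on the hypothesis.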
Dediu and Ladd claim that the correlation of the distribution of tone languages with the distribution of ASPM‐D and MCPH‐D is not ‘spurious’, but the statistical base is not convincing. They do not present linguistic evidence that ASPM‐D and MCPH‐D have driven out tone systems in the historical period, and they do not specify any genetic or biological mechanism of causality.
15.3 Gene networks relating to the capacity for language: FOXP2 and CNTNAP2
Although FOXP2 does not have a simple direct causal role in the evolution of language, there is suggestive evidence of a role in vocalization and control of speech organs both in humans and other animals, in addition to a role in neurodevelopment. Since FOXP2 is a transcription factor regulating the expression of other genes, two groups have looked for other genes important for speech and/or language by investigating the downstream targets of FOXP2, that is, the genes regulated in some way by FOXP2 (Spiteri et al. 2007; Vernes et al. 2007). Vernes et al. (2007) suggest that they ‘would expect a minimum of 1.5% of promoters in the human genome (i.e. at least several hundred genes) to be occupied by FOXP2’ in their cell‐based models. In turn, these genes would affect several hundred more genes downstream in various biochemical pathways.
One intriguing target of FOXP2 is CNTNAP2, a gene involved in cortical development and axonal function (Vernes et al. 2008). High levels of CNTNAP2 have been found in language‐related circuits. Polymorphisms in CNTNAP2 have been associated with specific language impairment (as tested by nonsense word repetition), and with language delays in children with autism. Vernes et al. found that FOXP2 protein binds to and directly downregulates CNTNAP2, so that in the developing human cortex, the laminae that contain the most FOXP2 protein have the lowest levels of CNTNAP2. But if FOXP2 and CNTNAP2 are negatively correlated, how can high levels of both be important for language? This simple finding demonstrates the complexity of the gene networks regulated by FOXP2.
It is not likely that any single mutation caused the origin of language, or even speech. This is suggested by the complex relationship between FOXP2 and CNTNAP2, and by the fact that FOXP2 regulates several hundred genes, many of which have functions unrelated to language. These functions are so important that there were no amino acid changing mutations in FOXP2 for 90 million years, from chimpanzee back to its common ancestor with rodents. We are only at the beginning of understanding the role of genetic processes in building the neuroanatomical structures necessary for human language.