PRINTED FROM OXFORD HANDBOOKS ONLINE ( © Oxford University Press, 2018. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in Oxford Handbooks Online for personal use (for details see Privacy Policy and Legal Notice).

date: 27 January 2020

A House of Sound Structure, of Marvelous form and Proportion: An Introduction

Abstract and Keywords

This article begins by defending the so-called Proposition 1, which claims that for the linguist Arabic is the most interesting language in the world. It then describes the handbook’s scope and choice of topics, followed by a discussion of the real world of research on Arabic. It suggests that the study of a language must be more than the sum of its parts. However, as far as Arabic goes, a holistic linguistic tradition remains an unrealized desideratum. A number of factors continue to militate against this development, including the fact that Arabic is a very large language, the stovepiping characteristic of contemporary academia and the clash of academic and cultural traditions. Finally, the article discusses how the only approach that does justice to Proposition 1 is one grounded in radically open-minded empiricism.

Keywords: Arabic, language, linguistics, Proposition 1, empiricism

1.1 The Interest of Arabic: Proposition 1

Arguably, for the linguist, Arabic is the most interesting language in the world. I will term this “Proposition 1.” This claim will certainly strike most as either arrogant or woefully wrong-headed, otiose, and lacking any measurable basis of substantiation. It furthermore runs afoul of deeply embedded beliefs in linguistics itself.

In particular, there is the assumption that all languages are equal for purposes of linguistic analysis and for the insight they give into the universal properties of language. Indeed, on this basis one can agree only that there is no a priori reason to think that the structure of Arabic will tell us more about language than will, say, the structure of Dγweďe, a Central Chadic language spoken by perhaps 40,000 speakers. In terms of its grammatical properties alone, Arabic has no more claim to the attention of linguists than does any other language.

To hypothetically formulate a second objection, it might be argued in some circles that Arabic should have a special linguistic place due to being the language of Quranic (p. 2) revelation. While this position may have its partisans among some, it in fact has no inherent connection to its status within linguistics, as indeed was recognized by many of the Classical Arabic grammarians themselves (e.g., Ibn al-Nadim, cited in [Owens, “History”]).

A third objection is simply that there is no basis for defining what “interesting” means. This brings us to the defense of the proposition.

1.1.1 The Geographical, Demographic, Chronological, Cultural Gestalt

First, and most basically, once one factors away the grammatical, semantic, pragmatic, and formal aspects of languages, it is clear that not all languages are equal.

This can be measured first of all with simple quantitative standards. There are large languages and small languages. Arabic is one of the world’s largest, spoken natively by about 300 million speakers, and as a second language (L2) by perhaps another 60 million. It is by a large margin the largest language in Africa (nearly 200 million speakers) and one of the biggest in Asia (120 million). It has been estimated to be the fifth largest language in the world in terms of native speakers. Strength of numbers alone guarantees it communicative centrality in the world language system (de Swaan 1998, 2001).

Arabic is equally spoken over one of the largest land areas of any native language. It is spoken continuously in the east from Iraq and Khuzistan in southwest Iran, all the way to Morocco and to northeastern Nigeria in the west, an area covering nearly a seventh of the latitudinal distance of the globe. In addition, a number of Arabic-speaking Sprachinseln can be found outside of this area (see Map 1.4 in the Appendix at the end of this chapter).

Arabic is furthermore the language of the Quran, Islam being the only one of the large religions whose holy book is revealed in a specific language. Hence, it is learned to one degree or another for religious, ritual, cultural, and legal purposes by nearly all Muslims,1 and equally important, is therefore revered as the purveyor of God’s word. It is the language of the great texts of Arabic–Islamic culture. “Arabic” thus binds the (p. 3) communicative, intellectual, and emotional in one linguistic gestalt, in a way perhaps no other language in the world today does.

The history, both written and orally reconstructible, of the Arabic-speaking peoples is, compared with most languages, well documented, even if from the specialist’s perspective gaps in the history are perhaps more prominent than what is available. The first reference to Arabs, which may be inferred to be a reference to Arabic speakers, dates from 853 BCE; North-Arabic clan names are mentioned even earlier (Lipiński 2000: 101, 457). Arabic begins spreading with great rapidity out of its core Middle East location in the Arabian peninsula, Iraq, and the Syrian and Jordanian desert at the beginning of the Islamic era (nominally, 622 CE). By 92/711, relatively large and self-contained groups of Arabic speakers stretch from Uzbekistan in the east to Spain (Andalusia) in the west. A further significant expansion out of Upper Egypt into the Lake Chad area around 800/1400 extends this region. With the exception of Spain, and allowing for modern, “global” diasporas, this essentially defines the limits of the Arabic-speaking world until today (see Owens 2009, chapter 1, for broader summary).

The linguistic consequences and challenges of this geo-history are self-evident. While Arabic has even in pre-Islamic times always been dialectally diverse (Rabin 1951), this diversity has probably increased in the wake of the great Arab–Islamic expansion. If till today simple models for classifying Arabic dialects elude us [Behnstedt and Woidich, “Dialectology”], it is no doubt in large part because an originally diverse proto-situation has continued to diversify across the vast geographical region where Arabic is spoken. Hand in hand with cataloguing the dialectal diversity goes the challenge of developing an historical linguistic model that accounts for the present-day situation. If, as argued in this volume [Owens, “History”], traditional accounts of Arabic language history have generally failed to provide linguistically adequate models of historical development, work on a comprehensive account is largely in its incipient stages.

Not surprisingly, in its expansion across a seventh of the earth’s latitudinal distance, speakers of Arabic have come into contact with a large number of languages. The degree to which spoken Arabic itself has been globally affected by this contact is a matter of ongoing debate, with some scholars, such as Versteegh (particularly 1984), arguing that the effects have been profound, whereas others, including Kossmann [“Borrowing”], see Arabic often as the dominant, hence imposing, language in contact situations. Certainly the latter perspective receives support from those well- or fairly well-documented extreme situations where unquestionably, or arguably, new varieties arise from the contact. One of these concerns the emergence of Pidgin and Creole varieties in the Sudanic region and East Africa, varieties that emerged from a common ancestor in the 19th century, today variously known as Turku, Juba Arabic, and Nubi or Kinubi. Since Versteegh’s (1984) argument that the structure of Arabic dialects is to be accounted for by having passed through a stage of Pidginization, a counterconsensus ([Tosco and Manfredi, “Creoles”]) has developed that these Pidgin/Creole varieties are indeed entirely new languages, following the classical model of creolization, with little implication for understanding mainstream historical developments of contemporary Arabic dialects. Relatively underdebated are Uzbekistan and Afghanistan Arabic, spoken by (p. 4) very small populations. Whereas these varieties have classic features of Arabic verbal morphological structure, in other areas of grammar they display marked deviations from any other variety of Arabic, for instance, in having a fixed subject–object–verb (SOV) word order. All deviations are readily explicable as influence from the Dari, Tajik, and Uzbek adstrates, and therefore the question can be raised as to whether these varieties are typologically mixed languages ([Tosco and Manfredi, “Creoles”]).

Before adducing more evidence in favor of Proposition 1, it is relevant here to take stock of the argument to this point. Beginning from older, classical perspectives on language, issues in Arabic dialectology and language history are multifarious, the challenge of building a comprehensive descriptive database remains high, and questions of language contact all along the vast geographical expanse of Arabic are open. Each of these domains represents a significant linguistic challenge, certainly descriptively but also methodologically and theoretically: what is the role of contemporary dialects in reconstructing language history; what determines direction of influence (van Coetsem 2000); what domains of language are more liable to contact influence; why do ostensibly similar global social conditions among communities of Arabic speakers lead to radically different linguistic outcomes (Owens 2000: 23); indeed, does a definable construct “Arabic” exist [Retsö, “Arabic?”]. But matters become even more interesting linguistically when the two peripheral varieties, Juba Arabic/Nubi and Uzbekistan Arabic, are added. Arabic is the only language in the world from which have emerged both Creole varieties and, arguably, mixed-language varieties. Arabic thus provides a living model for linguistics as a whole to address classic questions of historical and contact linguistics: what happens structurally to a language in the case of normal transmission (in general, the end product of the contemporary dialects) versus, by way of comparison, extreme situations of sociopolitical upheaval or cases of intense contact in a minority situation (Thomason and Kaufman 1988). Interim positions along the continuum formed by these poles can be integrated into linguistic typologies (e.g., Maltese, Kormakiti Arabic in Cyprus, Anatolian Arabic). Certainly, in the domains of phonology and morphology and also to some degree syntax, rigorous measures of core (necessary, not sufficient) Arabic could be constructed. Lurking in the background is the question of how inferences can be drawn from today’s situations to interpret issues of Arabic historical linguistics and how, proceeding from contemporary sociolinguistics methodologies, determining factors in such developments can be extrapolated.

1.1.2 The Classical Language, the Linguistic Tradition

The factors summarized in the previous section alone are of enticing interest to linguistics, without mention even having been made of what is unquestionably the most central icon of Arabic: the classical language. It is remarkable that what today is for some the form of Arabic (the ʕArabiyya, or the Fuṣħaa, popularly known as Standard or Modern Standard Arabic) is by and large identical to the form of Arabic broadly described by the late 2nd-/8th-century grammarian Sibawaih.

(p. 5) The functions of the ʕArabiyya are legion. Most centrally, it is, roughly, the variety of Quranic revelation. It is the variety that came to symbolize the remarkable intellectual and cultural flowering in the Islamic era and the variety around which the Arabic script developed [Daniels, “Writing”]. It is the variety that became a central cultural and political pillar of the Arabic nahḍa “renaissance” movement of the 19th century ([Newman, “Nahḍa”]) and enjoys the status of official language in 23 nation states today (see Map 1.3 in the chapter Appendix), with its concomitant importance in modern educational systems. It is the variety typically taught in non-Arab universities [Ryding, “Acquisition”], and it continues to be an essential element in any debate on Arab identity [Suleiman, “Folk Linguistics”].

Each and every one of these associations implies linguistic issues of different types: descriptive, historical, political, second-language acquisition. What is most remarkable, however, is the Arabic linguistic tradition itself, which was built on the basis of one of the true classics of linguistics, the Kitaab of Sibawaih (Baalbaki 2008; [“ALT I”]). The very first book on Arabic grammar (so far as our documented record of transmission goes) is a comprehensive (nearly 1,000 pages) descriptive work built on a highly elaborated grammatical theory. While opinions differ as to the origin of the post-Sibawaih Arabic linguistic tradition, it is clear that a highly sophisticated and differentiated theoretical grammatical and pragmatic discourse continued to develop for at least the next 500 years [Larcher “ALT II”]. No less interesting and significant is the voluminous lexicographical tradition that developed in tandem with the grammatical [Sara, “Classical Lexicography”].

Students of Arabic therefore deal not only with the varieties of Arabic themselves but also with a metadiscourse, as it were, which was established within Arabic–Islamic culture. Arabic texts were passed down to us, along with a theoretical framework for analyzing them, constitutive of the Arabic–Islamic tradition, which continues to be of central importance in the contemporary teaching of Arabic and which challenges the interpretive acumen of linguists studying this tradition.

Thus, with respect to Proposition 1, it is not only that Arabic is one of the few languages of the world within which developed a linguistic tradition; also, it is a tradition that continues to exercise its influence on today’s Arabs and Arabic society and beyond to Islamic society.

1.1.3 Arabic and Arab Identities

The two previous points set the stage for the inherent language tension that exists in contemporary Arabic societies. Arabic, the mother tongue of its approximately 300 million speakers, is not the same Arabic as the Arabic that is codified and has official political status and cultural centrality through its association with the Quran and with pan-Arab identity.

On one hand, these two broadly defined varieties can be represented as mutually opposed: official versus unofficial, written versus spoken, formal versus (p. 6) informal, pan- versus local, learned formally versus acquired as a first language (L1). The functional contrasts were made famous by Ferguson (1959). Equally, one can emphasize the complementarity of the codes. The native colloquial is the language not only of home and friends but also of all that is informal, unofficial, spontaneous, and intimate. The growing entertainment industry in its diverse media manifestations is thus wholly dominated by the colloquials, as is the informal world of texting and twittering [Holes, “Orality”]. Blogging, a domain awaiting comprehensive linguistic research, appears to cover a spectrum of styles.

The difference between the two is also one of ideology versus practice, of ideal versus real. The fuṣħaa, even if in its perceptions and usage it is a variety of fuzzy contours (Kaye 1972; Parkinson 1991) and is rarely2 used in the real world in its prescribed form, is the variety of preeminent cultural importance [Suleiman, “Folk Linguistics”].

Sociolinguistics, a subdiscipline of linguistics of relatively recent provenance closely related to the older dialectology, shows the degree to which ideal and real can differ in the realm of spoken Arabic. The careful microdocumentation of speech communities consistently has shown (studies from the Arabian–Persian Gulf, Saudi Arabia, Jordan, Damascus, Bethlehem, Cairo, Casablanca, and northeast Nigeria) that features of spoken colloquial varieties are what drive language change [Al-Wer, “Sociolinguistics”]. Moreover, when Arabic meets other languages bilingually, it is again the colloquial that always forms the basic matrix of contact [Davies et al., “Codeswitching”]. Even in mixed colloquial–fuṣħaa exchanges such as on media talk shows, the colloquial can have a dominant role.

The vibrant co-existence of quite differentiated varieties, a situation hardly unique to Arabic, nonetheless takes on a special, perhaps unique status in the world’s languages, precisely because each variety, beyond its linguistic profile, embodies a different history, a different symbolism, a different legitimization. While these differences are of central interest to students of linguistics, they extend beyond the academic lecture hall to the real world of language teaching and language policy. To which variety, for instance, should a program of second-language teaching be tailored, or, if the varieties have different cognitive profiles, what are the implications for L1 teaching? These are questions best not answered by policy fiat. Indeed, the experience of Arabic in post-9/11 America represents probably the sorriest example ever of huge resources expended for developing language teaching programs, largely divorced from the fundamental research on (p. 7) the language being taught that would make for a more rational and efficient teaching program [Ryding, “Acquisition”]. Research from across the spectrum of linguistics is implicated in any academicization of Arabic teaching, whether as an L1 or L2.

1.1.4 Grammar

Arabic is thus a language of rare breadth and extension in the world, a language like perhaps no other in the degree to which it embodies the culture and politics of its speakers. It is, however, a language, and it has been studied from a number of classical grammatical perspectives. Even here Arabic has structural features that set it apart from many, sometimes most, of the world’s languages.

The phenomenon of emphasis (pharyngealization) of consonants is a hallmark of the language and has engendered numerous studies both in phonetics [Embarki, “Phonetics”] and in phonology [Hellmuth, “Phonology”]. What is emblematic of Arabic, however, hardly exhausts the interest of Arabic for linguistics. As Hellmuth points out, for instance, stress in Arabic has been of central interest in phonological theory.

In morphology, an ongoing debate surrounding Arabic and many other Semitic languages is the status of the consonantal root as a morphemic element. As Ratcliffe [“Morphology”] points out, the Arabic grammatical tradition itself viewed the stem, not the root, as the basis of morphology, and arguments from within contemporary morphological theory have been developed for this as well. But equally, psycholinguistic studies on the basis of carefully constructed experiments have interpreted the consonantal root as having a crucial role in morphological processing [Boudelaa, “Psycholinguistics”].

Besides the Arabic grammatical tradition itself (1.1.2), there are two further prominent approaches to Arabic grammar. The older one is the philological tradition [Edzard, “Philology”], with which the study of Arabic grammar in the West began. Besides its general interest in Arabic grammar, this tradition incorporates cultural issues and has been present at the interface of Arabic texts of all genres and language. The other is more recent and is based on the precepts of theoretical grammar, particularly syntactic theory in the generative tradition, which endeavors to locate what is specifically Arabic within a broader program of universal grammar [Benmamoun and Choueiri, “Syntax”].3 All of the formal grammatical domains feed into the growing domain of computational linguistics and into the broader field of natural language processing [Ditters, “Computational”].

Finally, the classical lexicographical tradition has its counterpart in contemporary lexicography, a field increasingly drawing on the vast online publishing industry in Arabic (p. 8) for its sources [Buckwalter and Parkinson, “Modern Lexicography”]. Here again one experiences the special challenges confronting the Arabic lexicographer, for instance, whether to lemmatize according to root or stem, how to sublemmatize parts of speech, and whether to lump polysemously or to differentiate identical forms.

The articles in this handbook describe a language that, when looked at in its totality, is of rare thematic linguistic differentiation.

1.2 Scope and Choice of Chapter Topics

Proposition 1 encapsulates an ideal. The handbook is intended to reflect the full breadth of research on Arabic linguistics in the West. Realistically, this implies that it includes only chapters on topics judged to have a critical mass of background research. The reader will therefore miss domains that might be expected in a linguistics handbook. Asymmetries will be noticeable. There is a chapter on L2 acquisition but none on L1 acquisition, a chapter on sociolinguistics, but none on oral discourse, a number of chapters on grammar but none on semantics. The gaps are regrettable but unavoidable so long as the focus of the chapters is on the domains of Arabic linguistics that do indeed enjoy a fairly broad and deep coverage rather than on Arabic-flavored general linguistics, as it were. 4

The chapters themselves reflect domains of research with great disparities of detail. In some cases the chapter is able to cover nearly all of the published research on a given domain, for instance, the chapter on Pidgins and Creoles and even, surprisingly (see remark at end of 1.1.3), work on L2 Arabic language acquisition. In others the breadth of available material has meant that authors could summarize only broad lines of research, illustrating the topic in greater detail with selected examples. Arabic language contact, particularly as reflected in loanwords, for instance, has a very large literature; the research on Arabic dialects is immense, and the research on the Arabic grammatical tradition is large. As far as Western research goes, these disparities to some degree reflect the relative age of the subdomain. In general, codeswitching, psycholinguistics, sociolinguistics, and pidgin and creole linguistics, for instance, are barely 30 or 40 years old as independent specializations of linguistics. Research on Arabic dialects, on the other hand, was already well established in the 19th century. This does not, however, imply (p. 9) that any domain of Arabic linguistics has been exhaustively treated. As Behnstedt and Woidich point out [“Dialectology”], many dialects, for instance, are poorly described, and the integration of dialectology and sociolinguistics, an essential element of sociolinguistics in the West, has seen only modest progress in the case of Arabic, while historical dialectology, a part of the general field of Arabic historical linguistics, is meager at best.

Gaps should certainly be seen as a challenge to open up wider avenues of research.

1.3 The Real World of Research on Arabic: a Critical Look

Given the current state of research on Arabic, it may be asked: if Proposition 1 is correct, does the linguistic research match the inherent interest of the language?

Here I would answer with only a very conditional “yes.” On one hand, as noted in the previous section, there are areas of research with a large literature and well-established research tradition. On the other hand, there are topics central to the study of any language with only modest research traditions in Arabic. Studies on spoken Arabic discourse are rare (see note 4), while more recent domains of linguistics such as psycholinguistics, sociolinguistics, or the study of spoken Arabic pragmatics, though growing, are still in their incipient stages.

Ultimately, however, the study of a language must be more than the sum of its parts. It will be suggested here that, as far as Arabic goes, a holistic linguistic tradition remains an as yet unrealized desideratum. In the past and currently, a number of factors militate against this development. Three factors can be identified.

1.3.1 Arabic Is Large

The first is simply the immensity of the field itself. Arabic presents prima facie anything but a unified domain of inquiry. Consider, for instance, the two basic media that Arabic linguistics works with: the written and the spoken word, the former of which is associated with the Classical and Standard language and the latter with the dialects. These two media are in important respects of a different nature. The written domain is a learned domain, one that itself continues a heritage dating back to the 2nd/8th century, whose standard and norms have been long established. While one might be able to change certain aspects of the Standard language, such as the idiomatic domain ([Newman, Kossmann]), one cannot change its morphology or syntax. The spoken domain, on the other hand, is beholden to contemporary methods of descriptive and field linguistics, associated with, inter alia, corpus collection and language documentation, work with expert consultants, and instrumental phonetics of the spoken language. Norms, such as there are in this domain, emerge from the individual research studies undertaken in it.

(p. 10) Experience, moreover, has shown that in the Western tradition these two domains exist largely in parallel universes, with scholars linked to one or the other but not both. Those concerned with the written language, for instance, to the extent that they move outside the field of the linguistics of the written varieties, gravitate toward the other literary domains of Arabic such as Arabic literature, law, and medical texts. Many such individual cases could be cited, but quite typical in this respect is Carl Brockelmann, whose Grundriss der semitischen Sprachen (1908, 1913) remains a standard reference work. After publishing this work, he went on to write another well-regarded book, Geschichte der islamischen Völker (1943) (History of Islamic Peoples). Brockelmann never studied a spoken variety of Arabic, and his Grundriss, while a work of compendious scholarship, is marked by a decided antipathy toward theoretical issues in historical and contact linguistics (Owens 2009: 43), precisely two areas where Arabic is particularly implicated, as discussed already.5

Those working in the realm of the spoken language, on the other hand, are faced initially with a plethora of challenges, for instance, which aspect of language to concentrate on or which varieties of Arabic to try to delineate. Finding a format to integrate these in turn with the Classical or Standard varieties may imply defining variables that are central to neither tradition.

Edzard [“Philology”] correctly notes that there is in principle no contradiction between a philological (written) orientation and a “theoretical” linguistic one; experience has nonetheless shown that relatively few scholars not only work in both domains but also, more importantly, attempt a synthesis of the two.

1.3.2 Stovepiping

The problem is at once abetted and exacerbated by the stovepiping characteristic of contemporary academia. Whereas 30 years ago one could claim to be a linguist, today it is more likely that one will be a sociolinguist, psycholinguist, or general or specialized syntactician. Certainly these developments follow their own internal logic, as methods and theoretical perspectives have become more specialized during this period. At the same time, in this there is the danger that the academic apparatus defines the language rather than the language being served by the apparatus.

To take an example from sociolinguistics, one can ask how many studies are needed to define the social status of the “qaf” variable. On one hand, the fact that there have been fruitful studies on this variable means that it provides a necessary and interesting comparative breadth; on the other hand, certainly many other variables, some of broad (p. 11) comparative potential and others of particular local interest, await treatment. Added to this, embedding the findings on a comparative basis in the vast Arabic world is a challenge that has received relatively little attention from Arabic sociolinguists.6 Beyond this is the ever-present danger of calling the game over as soon as a sociolinguistic phenomenon has been studied from within a particular theoretical perspective, as often as not one initially defined from outside of the Arabic-speaking world. Al-Wer’s perspective in [“Sociolinguistics”] is better; she shows that ultimately constructs need to be interpreted within a context that does justice to the particularities of a given part of the Arabic world, illustrating her point with the interpretation of the ostensibly universal or at least very general “education” variable as a proxy for other, community-immanent variables.

1.3.3 Clash of Traditions

Complementing the two previously defined issues is that academic and cultural traditions provide ready barriers for synthetic perspectives. Within the West, for instance, Carter (1988: 207) attempts to dissociate Arabic linguistics from Arab linguistics. “… ‘Arabic linguistics’.… detaches the language entirely from its environment so that it becomes a pure abstraction.” On the other hand, Arab linguistics, the legitimate study of the Arabic language, is “… the vast and continuing output of traditional works, both editions of texts and secondary sources, which remain wholly within the historical norms of Islamic scholarship” (ibid.). In Carter’s terms, a handbook of Arabic linguistics that has at its core questions about the Arabic language, however defined, is suspect from the start.

To be fair, one of Carter’s objections to an Arabic linguistics deserves attention. “Solving” a problem in Arabic within a general linguistic theory runs the danger of importing an issue, a technique of inquiry, a focus on a grammatical construction whose ultimate interest is dictated from outside of Arabic and whose “solution” offers little to those interested in the complex structure of Arabic. At the same time, however, as noted already, trivial an observation though it is, Arabic is simply a language, so linguistic approaches will want to understand it within general theories of language. Moreover, as argued in Sections 1.1.1 and 1.1.2, Arabic itself has unique geographical, social, historical, and cultural properties that have, as it were, pushed the language in directions hardly encountered elsewhere. Linguistic theory can hardly avoid it, even if, in practice, non-Arabicist linguists often do so (see, e.g., criticisms in [Tosco and Manfredi, “Creoles”] or Ryding [“Acquisition”] on the barriers confronting researchers (p. 12) of second-language acquisition due simply to lack of language knowledge). It is easy to formulate a solution to this problem: practitioners need to be as well versed in Arabic in all its linguistic ramifications as they are in the methodologies and theories of linguistics. Nonetheless, its implementation implies a commitment of both individual and institutional time and intellectual resources, which are not necessarily easy to come by.

Perhaps more pernicious than the delegitimization of a linguistic approach to Arabic is Mahdi’s (1984: 37) admonition to study dialects to be rid of their debilitating influence on the Standard (fuṣħaa).7 This perhaps well-intentioned perspective derives most directly from a normative 19th-century tradition (see [Newman, “Nahḍa”]), which attempts to lay the blame for the ill learning of the Standard language on the use of dialects and can justify the study of dialects only against a possible benefit for the Standard. Such a perspective is not uncommon in the Arabic world.8 Leaving aside the cultural and political issues inherent in this position [Suleiman, “Folk Linguistics”], adopting this perspective would necessarily mean excluding Chapters 10, 12, 13, 14, 15, and 22 from this volume while requiring severe reductions in most others, since the dialect is nothing less than the mother tongue. It is not so much an approach foreign to general linguistic inquiry as it is a rejection of the scientific and empirical study of the world, defining in narrow political-cultural terms the goals of research on one of the most ineffable and undefined domains of human experience: language.

1.4 Attitudes

The reader may be confused at this point. On one hand, Proposition 1 claims that Arabic is, for the linguist, an intellectual challenge like no other. On the other, this challenge is often met by traditions, theories, academic structures, and attitudes that at best ensure a fragmented understanding of the language and at worst succeed in a holistic characterization of “Arabic” only at the expense of defining whole domains of language experience into nonexistence.

It can be suggested, without exaggerating the professional and even ideological differences that accrue in the study of Arabic, that the only approach that does justice to Proposition 1 is one grounded in radically open-minded empiricism.

(p. 13) For precedent, one need go no further than the medieval Arabic grammatical tradition itself. The following quote comes from the mid-4th-/10th-century Zajjaji, in one of the earliest works of metareflection in the Arabic tradition. In Chapter 5 (al-baab al-xaamis), he reflects on the nature of linguistic causes. After identifying three types of linguistic causes (pedagogical, analogical, theoretical-speculative; see Versteegh 1995: 89), he approvingly summarizes the approach to language study attributed to al-Xalil ibn Aħmad, the polymath contemporary of and teacher of Sibawaih (see [Sara, “Classical Lexicography”]). In the passage, Xalil is said to have likened the scholar trying to ascertain the nature of (Arabic/language) to one trying to understand the construction of a house:

If a wise person were to enter a house of sound structure, of marvelous form and proportion, whose builder’s wisdom appeared correct to him according to reliable information, the unmistakable lines of proof and clear arguments. And each time the man stopped and pondered a part of the house, he said, “[the builder] did it this way for such and such a reason and such and such a cause.” That is what occurred to him and appeared reasonable to him. Now it might be that the builder did build it for the reason the man inspecting the house thought, but it is equally possible he did it for another reason, even if the inspector’s reason might be correct. If a different grammatical reason should occur to another person than myself, which is more appropriate than my explanation, so let him suggest it. (p. 66)

With this passage, there are obviously interpretative issues that go beyond an introductory chapter. In particular, the passage is enticingly ambiguous as to what a “more appropriate” explanation might be. The history of the Arabic tradition itself shows that an explanation in the 5th/11th century might be more nuanced than one in the 3rd/9th and that one in the 6th/12th century might add further elements [Larcher, “ALT II”], not to mention the classic grammar-internal differences of the Basrans and Kufans (Sibawaih vs. Farra’; Owens 1990). It would be a grave mistake, however, to stop with the classical tradition. The recent history of linguistics is marked not only by the continual reappraisal of classic linguistic ideas and traditional issues but also by new theoretical, methodological, and, increasingly, technical advances, many described in this volume, that promise to transform, expand, and enrich the very idea of grammatical explanation to such an extent that a genius such as Xalil, if he were alive today, would be envious.

Xalil’s metaphor unmistakably sets a basic ground rule for linguistic research, namely, that no possible explanatory aspect be excluded on a priori grounds. Since explanations are, ultimately, explanations of linguistic substance, facts, observations, summaries of (p. 14) data, measurements, and reinterpretations of previous explanations, Xalil’s approach implies setting no preconditions as to what comes under the purview of Arabic linguistics.9

It is in this spirit that the current handbook should be read; it is a reference work that brings together different approaches and scholarly traditions, an invitation to the reader to explore the multifaceted world of Arabic linguistics. The articles in this volume expertly explore the nature of the house of Arabic from many angles. Many argue for specific points of view, others give descriptions of synoptic breadth, while others provide exhaustive overviews of the state of the art. The parts may or may not come together to describe a common structure; they do provide blueprints for a better understanding of it.

Note to References

Chapters 9 “Issues in Arabic Computational Linguistics” and 13 “Dialects and Dialectology” have very comprehensive bibliographies. They are, however, too large to be included in their entirety in the print version of the handbook. Rather than edit away this very valuable resource, it was decided to include the complete bibliographies to these two chapters in the online version of the handbook while including a selected bibliography in the print version.


This Appendix gives basic background information about Arabic as well as a brief discussion of the transcription and transliteration conventions used in this book.

A.1 Maps

The bulk of the native Arabic-speaking population lives within countries with majority Arabic-speaking populations. Sizable non-Arabic minorities include Berbers (Amazigh), with large minorities in Algeria and Libya and up to half the population of Morocco, where in fact Arabic shares its status as official language with Berber. Other minorities are speakers of the various South Arabian languages in Yemen (and a small population in Oman) and Kurds and Aramaic speakers in Iraq and to a lesser degree Syria. Even after the South Sudan, which has few native Arabic speakers, recently split off from the North, the Sudan has a large and diverse linguistic minority population. Finally, Mauretania has a not insignificant (p. 15) non–Arabic-speaking population (Wolof, Fulfulde) in the south of the country. Map 1.1 shows countries with majority Arabic-speaking populations. It can be noted that although the main lingua franca of South Sudan, Juba Arabic, historically derives from Arabic, by linguistic measures it is a different language [Tosco and Manfredi, “Creoles”] and therefore is not included on Map 1.1.

Maps 1.2 and 1.3 illustrate the lack of complete isomorphy between political status of a language and the native language of its inhabitants. The Arab League (Map 1.2) comes close to being composed entirely of countries with Arabic as a majority language. There are only two exceptions: Somalia, where the native language of the vast majority of the population is Somali, a Lowland East Cushitic language genetically very distantly related to Arabic; and the Comoro Islands, whose native Bantu language is closely related to Swahili.

Besides being the official language of all countries in the Arab League, Arabic is also the official language of Eritrea (majority native language Tigrinya; Hailemariam 2002: 75), a country with a tiny population of Arabic native speakers. In addition it is, along with French, an official language in Chad, which does have a sizable native Arabic-speaking minority. In these two countries, Arabic attained official status under quite different circumstances and at different times. In Eritrea, for instance, it was during the brief British rule from 1941 to 1952 that Arabic was introduced as the official language, a status it has maintained until today, whereas in Chad Arabic was adopted as an official language well after independence (1960) in the 1990s, and only after considerable debate (de Pommerol 1997).

Finally, Map 1.4 shows that for the most part Arabic-speaking minorities live on the political borders of majority Arabic-speaking countries. Even the exceptions in this regard, the tiny Arabic-speaking populations of Uzbekistan, Afghanistan, and Khorasan in eastern Iran, were, at the time of their settlement in the 2nd/8th century, a part of a continuous migration of Arabs into Central Asia. It can be noted that, while from the perspective of genetic linguistics Maltese can be considered a variety of Arabic (Owens 2010), on a sociopolitical basis and as an official language of the European Union it is an independent language.

A.2 Genetic Affiliation of Arabic

While a definitive classification of Arabic within a Stammbaum representation may be impossible [Retsö, “Arabic”], within traditional genetic models the following two models are the most widely discussed (based on Faber 1997: 5, 6):





Ancient Egyptian



 Variant 1

  East Semitic: Akkadian, Eblaitic

  West Semitic

   Northwest Semitic

    Canaanite: Hebrew, Phoenician, Moabite


(p. 16)     South Semitic


     Southeast Semitic

      Modern South Arabian: Jibbali, Mehri, Harsŭsi, Soqotri


      OSA: Sabean, Qatabanian, Hadramauti, Minean

      Ethiopian Semitic

 Variant 2

  East Semitic



  West Semitic

   Central Semitic


    Northwest Semitic


     Canaanite: Hebrew, Phoenician, Moabite, Ammonite, El-Amarna



  South Semitic



    Mehri, Harsŭsi, Jibbāli


    Old South Arabian

    Ethiopian Semitic

     North Ethiopic: Ge’ez, Tigre, Tigrinya

     South Ethiopic

      Transverse SE

      Amharic, Argobba

      Harari, East Gurage

     Outer SE

      n group: Gafat, Soddo, Goggot

      tt group


       West Gurage

A.3 Transcription and Transliteration Conventions

The representation of Arabic in Latin script is beholden to different conventions. Rather than try to force standardization in this volume, the various systems used are taken over intact in different chapters. Having said this, the editor is strongly biased toward the use of the International Phonetic Alphabet (IPA), or modified IPA symbols, for representing any spoken text. Nothing, moreover, speaks against using it for transliterated written texts, though here other traditions have developed different conventions.

Map 1.1 Countries with Arabic as a majority language.

Map 1.2 The Arab League.

Map 1.3 Arabic as official language.

Map 1.4 Arabic as minority language.

Ultimately, moreover, justification can be asked of each set of conventions. For instance, representing a long “i” as [ī, i:, ii, or iy] implies different phonological interpretations of the nature of vowel length. It can be noted that IPA conventions themselves should hardly be (p. 17) (p. 18) (p. 19) (p. 20) (p. 21) regarded as sacrosanct. The multiexponential phenomenon of “emphasis,” for instance, is now represented by C + ˤ, such as tˤ, that is, C + pharyngealization. As the two relevant articles in this volume make clear, however ([Embarki, “Phonetics”; Hellmuth, “Phonology”]), pharyngealization (tongue retraction toward pharynx, pharyngeal constriction) is but one gesture defining the phenomenon and is not necessarily the most prominent one.10 Equally relevant would be, for instance, a symbol based on the articulatory metaphor developed in the Arabic tradition of likening the flattened tongue body to a plate or pot cover (iṭbaaq, muṭbaq).

In any case, the multiplicity of transcription/transliteration conventions means that the reader’s indulgence is needed for the treatment of proper Arabic names, where the same person will appear in different orthographic guises, according to the conventions of the chapter: for example, Ibn Jinni, Ibn Jinnī, Ibn Ğinnī, Ibn Ğinni. Would that he could comment on the matter.


Amara, Mohammad. 2005. Language, migration and urbanization: The case of Bethlehem. Linguistics 43: 883–901.

Baalbaki, Ramzi. 2008. The legacy of the Kitaab. Leiden: Brill.

Brockelmann, Carl. 1908–1913. Grundriss der vergleichenden Grammatik der semitischen Sprachen. 2 vols. Berlin: Reuther and Reichard.

——. 1977² (1943). Geschichte der islamischen Völker. Hildesheim: Olms.

Carter, Michael. 1988. Arab linguistics and Arabic linguistics. Zeitschrift für Geschichte der arabisch-islamischen Wissenschaften 4: 205–218.

de Pommerol, Patrice. 1997. L’arabe tchadien: émergence d’une langue véhiculaire. Paris: Karthala.

de Swaan, Abram. 1998. A political sociology of the world language system (1): The dynamics of language spread. Language Problems and Language Planning 22: 63–75.

——. 2001. Words of the world. Cambridge, UK: Polity Press.

Faber, Alice. 1997. Genetic subgrouping of the Semitic languages. In The Semitic languages, ed. Robert Hetzron, 3–15. London: Routledge.

Faiq, Said. 2006. Coherence. In Encyclopedia of Arabic Language and Linguistics, vol. I, ed. Kees Versteegh, Associate Editors: Mushira Eid, Alaa Elgibali, Manfred Woidich, Andrzej Zaborski, 427–430. Leiden, Netherlands: Brill.

Ferguson, Charles. 1959. Diglossia. Word 15: 325–340.

Hachimi, Atiqa. 2007. Becoming Casablancan: Fessis in Casablanca as a case study. In Arabic in the city: Issues in language variation and change, ed. Catherine Miller, Enam Al-Wer, Dominique Caubet, and Janet Watson, 97–122. London: Routledge.

Haeri, Niloofar. 1996. The sociolinguistic market in Cairo: Gender, class, and education. London: Kegan Paul International.

Hailemariam, Chefna. 2002. Language and education in Eritrea. Amsterdam: Aksant.

Holes, Clive. 1987. Language in a modernising Arab state: The case of Bahrain. London: Kegan Paul International.

Kaye, Alan. 1972. Remarks on diglossia in Arabic: Well-defined vs. ill-defined. Linguistics 81: 32–48.

(p. 22) Khalil, Esam. 2006. Cohesion. In Encyclopedia of Arabic Language and Linguistics, vol. I, ed. Kees Versteegh, Associate Editors: Mushira Eid, Alaa Elgibali, Manfred Woidich, Andrzej Zaborski, 430–433. Leiden, Netherlands: Brill.

Lipiński, Edward. 2000. The Aramaeans, their ancient history, culture, religion. Leuven: Peeters.

Mahdi, Muhsin (ed.). 1984. Alf Layla wa Layla. Leiden: Brill.

Al-Marzuqi, Munṣif. 2011. ʒ ayy luƔa sayatakallam al-ʕArab al-qarn al-muqbil? Jazira Net, November 6, 2011.

Mejdell, Gunvor. 2006. Mixed styles in spoken Arabic in Egypt. Leiden: Brill.

Owens, Jonathan. 1990. Early Arabic grammatical theory: Heterogeneity and consolidation. Amsterdam: Benjamins.

——. 1995. Language in the graphics mode: Arabic among the Kanuri of Nigeria. Language Sciences 17: 181–199.

——. 1998. Neighborhood and ancestry: Variation in the spoken Arabic of Maiduguri (Nigeria). Amsterdam: John Benjamins.

——. 2000. Introduction. In Arabic as a minority language, ed. J. Owens, 1–43. Berlin: Mouton de Gruyter.

——. 2009. A linguistic history of Arabic. Oxford: Oxford University Press.

——. 2010. Review article: “What is a language?: Review of Bernard Comrie, Ray Fabri, Elizabeth Hume, Manwel Mifsud, Thomas Stolz & Martine Vanhove (eds.), Introducing Maltese Linguistics. Selected papers from the 1st International Conference on Maltese Linguistics, Bremen, 18–20 October 2007.” Journal of Language Contact (Varia): 103–118.

Owens, Jonathan, and Alaa Elgibali (eds.). 2010. Information structure in spoken Arabic. London: Routledge.

Parkinson, Dilworth. 1991. Searching for modern Fuṣḥā: Real life formal Arabic. Al-’Arabiyya 24: 31–64.

Procházka, Stephan. 2006. Arabic. In Encyclopedia of language and linguistics, vol. I, ed. Keith Brown, 423–431. Oxford: Elsevier.

Rabin, Chaim. 1951. Ancient West Arabian. London: Taylor’s University Press.

Sallam, A. 1980. Phonological variation in educated spoken Arabic. BSOAS 43: 77–100.

Thomason, Sarah, and Terrence Kaufman. 1988. Language contact, Creolization, and genetic linguistics. Berkeley: University of California Press.

van Coetsem, Frans. 2000. A general and unified theory of the transmission process in language contact. Heidelberg: Winter.

Versteegh, Kees. 1984. Pidginization and Creolization: The case of Arabic. Amsterdam: Benjamins.

——. 1995. The explanation of linguistic causes. Amsterdam: Benjamins.

“List of languages by number of native speakers.”


(1) For native speakers, Procházka’s (2006) estimate of 280 million strikes us as reasonable, if perhaps slightly low. In addition, Arabic is spoken fluently as a second-language lingua franca in particular in Algeria, Morocco, Mauretania, Libya, Yemen, Chad, Tunisia, and the Sudan.

An estimate of 452 million “total” speakers, such as found at languages_by_number_of_native_speakers#30_to_50_million_native_speakers, should be treated with great caution. Estimating total number of speakers in a language like Arabic begs the question of what a language speaker is. In a survey carried out among Kanuri, one individual reported to me that she uses Arabic “often” (Owens 1995). When I thereupon addressed her in Arabic, she could not understand a word. She explained that she began many acts with bi sm illaahi (“in the name of God”). Defining “total” (of what?) is no less a slippery task than defining “often.”

(2) The crucial adverb rarely should be understood as follows. Arabic is spoken by, conservatively, 300 million individuals. Each individual, probably conservatively, speaks for two hours per day, at 10,000 words per hour (slightly low probably), giving 6 trillion words of Arabic per day. The only forums where a normative, spoken Standard Arabic is used are certain media broadcasts (e.g., the excellent news channels al-ʕArabiyya or al-Jaziyra, national and commercial channels mainly for information-orientated topics such as news and documentaries) and in various official meetings, including some but hardly all educational formats (see Mejdell 2006; also [Holes, “Orality”]). Of the 300 million speakers, only a tiny minority of them are engaged at any one time in a function prescribing the use of Standard Arabic. Otherwise, for most individuals nearly always, and for all at some time, the basis of everyday speech is a colloquial variant.
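
The arithmetic behind the figure of 6 trillion words per day can be checked with a minimal sketch. The three inputs are the rough estimates given in this note, not measured data, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check of the daily word count estimated in note (2).
# All three inputs are the note's own rough figures, not measured data.
speakers = 300_000_000      # "conservatively, 300 million individuals"
hours_per_day = 2           # "two hours per day"
words_per_hour = 10_000     # "10,000 words per hour"

words_per_day = speakers * hours_per_day * words_per_hour
trillions_per_day = words_per_day / 10**12

print(f"{words_per_day:,} words per day ({trillions_per_day:g} trillion)")
```

Any of the three estimates can be varied to see how sensitive the total is to the assumptions; halving the speaking time, for instance, still leaves a total in the trillions.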

(3) Chapter 6 is a double chapter; the original intent was to have two separate chapters, one on the standard language and the other on dialects. Individual circumstances required conflating the two into one.

(4) For instance, the justifiably well-regarded Encyclopedia of Arabic Language and Linguistics has a chapter on “Cohesion” (Khalil 2006) with nine non-Arabic items in the bibliography and ten on Arabic. Unfortunately, this breakdown realistically reflects the dearth of material on spoken Arabic discourse, for instance, only one book-length work, an edited volume (Owens and Elgibali 2010), which is too little in the editor’s view to merit a separate chapter here. The article preceding Khalil’s on “Coherence,” a central topic equally in literary and spoken texts, treats the subject only as it is reflected in the Classical literary tradition (Faiq 2006). The limitation is regrettable but does reflect the unbalanced state of the art in this domain.

(5) Indeed, it is striking that while comparative Semitic and comparative Indo-European literature both came of age in the same era, the 19th century, and to a large degree in the same region—Central Europe— the theoretical contribution of the former to the development of general principles of historical linguistics was negligible whereas that of the latter was essential.

(6) For instance, despite relatively well-documented accounts of “qaf” variation covering thirty years of research in the Arabic world from the Gulf to Morocco (e.g., Sallam 1980; Holes 1987; Haeri 1996; Amara 2005; Hachimi 2007), no studies have synthesized these accounts with a view toward defining the extent to which a common social dynamic lies behind “qaf” usage. It is, for instance, no sociolinguistic accident that the “qaf” variable is of such marginal interest in Nigerian Arabic, a distinctly minority language in northeast Nigeria, that it was not included as a variable in Owens (1998).

(7) Mahdi speaks of the sicknesses of the dialects, which require treatment. The passage in fact comes in the Introduction to a well-edited edition of 1001 Nights, which left the original “Middle Arabic” style intact rather than classicizing out its authenticity, as is the current custom (e.g., the version on arabicorpus).

Another popular approach is the regulation of language use by legal fiat. Munṣif al-Marzuqi, who writes an occasional column for Jazira Net, for instance, would (article of Nov. 6, 2011) criminalize the use of what he terms “Creole” Arabic, by which he intends, in the parlance of contemporary linguistics, a codeswitched variety of Arabic (tajriym istiʕmaal luƔat al-kriyuwl).

(8) For instance, generally speaking, “Arabic” in Arabic departments in the Arabic world stops with the classical language.

(9) An extreme though in today’s world by no means uncommon situation is when Arabic needs to be studied in tandem with other languages in the domain of codeswitching [Bentahila et al., “Codeswitching”; also Kossmann, “Borrowing”; Newman, “Nahḍa”].

(10) For instance, Embarki [“Phonetics”], summarizing Al-Ani (1970), identifies four traits of consonantal “emphasis,” only one of which involves pharyngeal space.