PRINTED FROM OXFORD HANDBOOKS ONLINE. © Oxford University Press, 2018. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in Oxford Handbooks Online for personal use (for details see Privacy Policy and Legal Notice).

date: 21 January 2021

Introduction: Status and Definition of Compounding

Abstract and Keywords

This chapter begins by addressing the question: do we really know what a compound is? It then describes the most important of the criteria for distinguishing compounds: (i) stress and other phonological means; (ii) syntactic impenetrability, inseparability, and unalterability; and (iii) the behaviour of the complex item with respect to inflection. An overview of the remainder of the book is also presented.

Keywords: compounds, stress, phonological criteria, syntactic impenetrability, inseparability, unalterability, inflection, word formation

Most of us are familiar with the parable of the blind men and the elephant. Each of the men developed a theory of ‘elephanthood’ on the basis of a limited perception: one fellow's elephant was like a rope, another's like a broad leaf, a third's like a tree trunk, and so on. We conceived of this volume in the fear that our current picture of ‘compoundhood’ might be like the blind men's elephant, and in the hope that by putting together the disparate pieces of what we know, something like the whole elephant might appear. Each of us might have a limited perception, illuminating and interesting in its own way, but no one perspective gives the whole story. What we endeavour to do in this Handbook is to give a variety of pictures of compounding that both complicate and deepen our understanding of this important means of extending the lexicon of a language.

Our intention is to complicate our view both theoretically and descriptively. In terms of theory, we consider compounding from disparate frameworks, both generative and non-generative, and from different perspectives: synchronic, diachronic, psycholinguistic, and developmental. Descriptively, we hope to sharpen our understanding of what constitutes a compound by looking not only at familiar languages, but also at a range of typologically and areally diverse languages. The two views are complementary.

We will offer a brief overview of the volume in section 1.2, but first we try to take our own first pass at this distinctive species of word formation: do we really know what a compound is?

1.1 The Problem of Definition: What's a Compound?

Compounding is a linguistic phenomenon that might at first glance seem straightforward: in his introductory text Bauer (2003: 40) defines a compound as ‘the formation of a new lexeme by adjoining two or more lexemes’. But Marchand, in ‘Expansion, transposition, and derivation’ (1967), presents another view, in effect saying that compounds don't exist as a separate sort of word formation; indeed, he distinguishes only two basic categories of word formation: expansion and derivation. Whether a complex word belongs to one or the other category depends on whether what he calls the ‘determinatum’—in effect, the head of the complex word—is an independent morpheme or not. For Marchand, an expansion is a complex word in which the determinatum is an independent morpheme. Expansions might have either a bound or a free morpheme as their ‘determinant’—in current terms, their modifier or non-head element. This allows Marchand to class prefixed items like reheat or outrun as the same animal as compounds like steamboat or colourblind. Words in which the determinatum/head is bound are derivations; in effect, suffixed words constitute one category of word formation, compounds and prefixed words another.

The reader might ask why a handbook on compounding should begin by contrasting an apparently straightforward definition of compound with such a non-canonical view of compounds. The answer is precisely that there has always been much discussion of exactly what a compound is, and even of whether compounds exist as a distinct species of word formation. We can identify two main reasons why it is difficult to come up with a satisfying and universally applicable definition of ‘compound’. On the one hand, the elements that make up compounds in some languages are not free-standing words, but rather stems or roots. On the other, we cannot always make a clean distinction between compound words on the one hand and derived words or phrases on the other. We might term these the ‘micro question’ and the ‘macro question’.

Let us look at the ‘micro question’ first. In the 1960 edition of his magnum opus, Marchand, for example, assumes that ‘[w]hen two or more words are combined into a morphological unit, we speak of a compound’ (1960: 11). But this definition of compound is rooted in the analytical features of English, in particular, its lack of inflectional morphemes. In inflectional languages like Czech, Slovak, or Russian, the individual constituents of syntactic phrases are inflected. Compounds result from the combination not of words but of stems, that is, uninflected parts of independent words that do not themselves constitute independent words. It is the compound as a whole that is inflected. In Slovak, for example, we know that rýchlovlak ‘express train’ is a compound because the left-hand constituent (the adjective rýchly ‘fast’) is devoid of an inflectional morpheme and displays a linking element -o. On the other hand, we know that rýchly vlak ‘fast train’ (any train that goes fast) is a syntactic phrase and not a compound because the adjective rýchly is inflected to agree with the noun. It is only the lack of inflectional morphemes in English that makes surface forms of English compounds and free syntactic groups identical in terms of their morphological forms (compare, for example, blackboard and black board).

In light of this issue, it would seem that defining a compound as a combination of two or more lexemes, as Bauer does, is the safer way to go: the term lexeme would seem specific enough to exclude affixes but broad enough to encompass the roots, stems, and free words that can make up compounds in typologically diverse languages. But with Bauer's definition we have to be clear about what we mean by ‘lexeme’.

One problem hinges on how we distinguish bound roots from derivational affixes. One criterion that we might use is semantic: roots in some sense have more semantic substance than affixes. But there are languages in which items that have been formally identified as affixes have as much, or nearly as much, semantic substance as items that might be identified as roots in other languages. Mithun (1999: 48–50) argues, for example, for what she calls ‘lexical affixes’ in many Native American languages. Bearing meanings like ‘clothes’, ‘floor’, ‘nape’, ‘knob’ (in Spokane) or ‘eat’, ‘say’, ‘fetch’, ‘hit’ (in Yupʼik), they might look semantically like roots, but their distribution is different from that of roots, and they serve a discourse function rather different from that of roots (they serve to background information that has already been introduced in a discourse). So distinguishing lexemes from non-lexemes might not be possible in semantic terms.

Another criterion must therefore be formal: we might say that bound roots can be distinguished from affixes only by virtue of also occurring as free forms (inflected, of course, in languages that require inflection). But that means that words like overfly and outrun in English must be considered compounds, rather than prefixed forms (as Marchand would like). There are two problems with this conclusion. First, the status of verbal compounds in English is highly disputed, and these items are clearly verbal.1 Second, even though over and out also occur as free morphemes in English, the form that attaches to the verbs fly and run behaves rather differently from the first element of a compound. Specifically, the first element of a compound in English is typically syntactically inert: it does not affect the syntactic distribution of the complex word. Yet over- and out- have clear effects on verbal diathesis:

  (1) a. *The plane flew the field ~ The plane overflew the field.

      b. *Yitzl ran Bonki ~ Yitzl outran Bonki.

In light of this, it's not clear that we should be any more satisfied with the formal criterion than with the semantic criterion.

Bauer's definition also runs afoul of what we have called the ‘macro question’: how do we distinguish compounds from phrasal forms? Recall that Bauer defines a compound as a ‘new lexeme’. But how do we know when we have a ‘new lexeme’? It is relatively clear that in English a blackboard is a different lexeme from a black board: the former is sufficiently lexicalized that it can refer to just about any object that one can write on with chalk, regardless of its colour. But does a form like tomato bowl constitute a new lexeme if I use that term in pointing to a bowl on the counter that just happens at this moment to be holding tomatoes? In other words, so-called deictic compounds (Downing 1977) seem no more like new lexemes than some syntactic phrases do, and yet they still show some of the characteristics that we attribute to compounds (stress pattern, for one). Even more fraught is the question of whether items like a floor of a birdcage taste or a wouldn't you like to know sneer are compounds. Certainly we would be hesitant to call items like these lexemes. And yet some theorists have argued that they are indeed compounds.

It would seem that the only way to answer such questions would be to come up with a list of criteria some number of which forms would have to satisfy in order to be considered compounds. But here too we run into problems. In spite of extensive research into compounds and compounding processes, there are hardly any universally accepted criteria for determining what a compound is. Donalies (2004: 76), for example, analyses Germanic, Romance, Slavic, Finno-Ugric, and Modern Greek constructions in terms of ten postulated criteria. Compounds:

  • are complex

  • are formed without word-formation affixes

  • are spelled together

  • have a specific stress pattern

  • include linking elements

  • are right-headed

  • are inflected as a whole

  • are syntactically inseparable

  • are syntactico-semantic islands

  • are conceptual units

Some of these criteria deserve serious consideration, others are far less plausible, or at least have far less cross-linguistic applicability. For example, it goes without saying that compounds are complex, but this does not in and of itself distinguish them from derived words, which are also complex. Donalies' second criterion is also not very useful: apparently all it means is that compounding is different from derivation, but what that difference is is precisely what is at issue. Spelling and headedness differ from one language to the next, and even within a single language, as we will see, as does the presence or absence of linking elements. These criteria might have limited utility within a particular language or group of languages, but cross-linguistically they cannot be definitive. And we have already seen that it's not so easy to decide what constitutes a ‘conceptual unit’ if we take ‘conceptual unit’ to be similar to what Bauer calls a ‘lexeme’. A potential criterion that Donalies does not mention is lexicalization or listedness. But it does not take long to dismiss that either as a criterion for compoundhood: the more productive the process of compounding in a language, the less chance that individual compounds will be lexicalized.

Spelling also cannot be taken as a plausible criterion for determining compoundhood. Spelling is generally rejected as a criterion of compoundhood in English because the spelling of compounds is so inconsistent.2 Although there might seem to be a tendency for institutionalized compounds to be spelled as one word or hyphenated (cf. blackboard vs. a black board) this is hardly a hard-and-fast rule. As Szymanek has argued, ‘[o]rthography, i.e. spelling convention for compounds cannot be taken seriously … the orthography of English compounds is notoriously inconsistent: some compounds are written as single words (postcard, football), in others the constituents are hyphenated (sound-wave, tennis-ball), and in still others the constituent elements are spaced off, i.e. written as two separate words (blood bank, game ball)’ (1998: 41). Some compounds occur in all three variants: flowerpot, flower-pot, flower pot. So spelling cannot be used to determine compoundhood in English. In Czech and Slovak, in contrast, spelling has sometimes been considered an important criterion because all compounds are spelled as one word, whereas syntactic phrases are spelled as separate words. But this puts the cart before the horse: if we acknowledge that the spoken language is primary, and the writing system only an artificial system designed to capture the spoken word, there must clearly be some criteria which lead writers to write a sequence as one word rather than two (e.g. lack of inflection on the first item, see above). In other words, spelling cannot be taken as a criterion of compoundhood because it only secondarily reflects the situation in the spoken language.

That leaves us with the most important of the criteria for distinguishing compounds: (i) stress and other phonological means; (ii) syntactic impenetrability, inseparability, and unalterability; and (iii) the behaviour of the complex item with respect to inflection. We will start with the issue of stress, and then look more carefully at the other phonological and syntactic criteria mentioned above.

1.1.1 Phonological criteria

Stress in English

Stress is a more relevant criterion for determining compoundhood, at least for English, and has been the focus of intensive research in recent decades. Nevertheless, we will see that it is still quite problematic. It is often said that English compounds bear stress on the left-hand constituent, whereas syntactic phrases carry a level stress or are stressed on the head, i.e. the right-hand constituent. But there are numerous problems with this generalization.

Some of these problems stem from factors that seem purely idiosyncratic. On the one hand, individual native speakers can vary in their pronunciation of particular forms, as can specific groups of speakers (Pennanen 1980). Kingdon (1966:164), for example, claims that in American English ‘there is a stronger tendency towards the simple-stressing of compounds’, by which he means left-hand stress, although whether this is true or not is unclear to us. On the other hand, context and the pragmatic conditions under which a word is pronounced can influence the pronunciation of particular compounds. Kingdon (1958:147), Roach (1983:100), Bauer (1983:103), and most recently Štekauer, Valera, and Diaz (2007) all point out that the position of stress in isolation may differ from that when such words are pronounced in sentence context. Spencer (2003) notes as well that stress can occasionally be used to distinguish between different readings of the same combination of constituents: for example ʼtoy factory is probably a factory where toys are made, but a toy ʼfactory is a factory which is also a toy. And as Bauer (1998b: 70–2) points out, even individual dictionaries can differ in the way they mark stress on particular compounds.

Nevertheless, theorists have repeatedly tried to find systematic explanations for why some English compounds bear left-hand stress and others do not. Some of these explanations take syntactic form as relevant. Marchand (1960: 15), for example, counts compounds with present or past participles as the second stem as a systematic exception to left-hand stress, giving examples like easy-going, high-born, man-made. We question the systematicity of this ‘exception’ though: of the three examples Marchand gives, two have left-hand stress for many speakers of American English, and it is easy to add other examples of compounds based on participles that follow the prevalent left-hand stress pattern of English compounds (truck driving, hand held). Indeed, Olsen (2000b) notes that synthetic compounds (including these participle-based forms) are systematically left-stressed.

Giegerich (2004, and this volume) also attempts to relate stress to the structural characteristics of N + N constructions. He argues that most attribute-head N + N constructions are phrases rather than compounds, and therefore bear stress on the right-hand constituent. For example, in steel bridge the noun steel modifies bridge, so the construction is a phrase and has right-hand stress. On the other hand, N + N combinations that exhibit complement-head structures (e.g. battlefield, fruit market, hand cream) are compounds, and therefore bear stress on the left-hand constituent, as do attribute-head collocations that are lexicalized. Plag (2006) shows experimentally that complement-head collocations generally do exhibit left-hand stress, but that attribute-head collocations do as well, albeit less frequently. This applies to novel compounds as well as to lexicalized ones, so left-hand stress in attribute-head collocations cannot be attributed to lexicalization, as Giegerich suggests. It therefore seems difficult to find a structural principle that explains the variability of stress in English compounds.

Other researchers have attempted to find semantic principles which influence the stress pattern of English compounds. For example, Jones (1969: 258) presents three semantic criteria conditioning the presence of a single main stress on the left-hand constituent:

  (2) a. The compound denotes a single new idea rather than the combination of two ideas suggested by the original words, i.e. the meaning of the compound is not a pure sum total of the meanings of its constituents.

      b. The meaning of the compound noun is the meaning of the second constituent restricted in some important way by the first element (ʼbirthday, ʼcart-horse, ʼsheepdog). When the second compound constituent is felt to be especially important the compound is double stressed (ʼbow ʼwindow, ʼeye ʼwitness) (Jones 1969: 259).

      c. The first element is either expressly or by implication contrasted with something (ʼflute-player as opposed to, for example, piano-player).

But these criteria are highly problematic. Does apple pie with right-hand stress denote any less of a single new idea than apple cake, which has left-hand stress? And what exactly do we mean by ‘new idea’?3 Similarly, is pie any less importantly restricted by apple than cake is? Is window more important in bow window than in replacement window? Bauer (1983:107) rightly calls in question the last criterion (as do Marchand 1960: 16–17 and Kingdon 1958:151): if it were the case that left-hand stress obtains when a contrast is intended, all compounds might be expected to be pronounced with left-hand stress. But this is certainly not the case: ʼcherry ʼbrandy is contrasted with ʼapricot ʼbrandy, ʼpeach ʼbrandy, and ʼgrape ʼbrandy, although all are double stressed.

Sampson (1980: 265–6) suggests that compounds in which the first stem describes what the second stem is made of receive right-hand stress, but only when the second stem denotes a solid and fixed artefact. So iron saucepan and rubber band are right-stressed, but sand dune, wine stain, water droplet, and oil slick are not, even though the first stem expresses what the second is made of. This would be a curious observation, if it were true, and we might ask why only this semantic class receives right-hand stress. But there are exceptions to this generalization, as even Sampson himself points out—for example, rubber band is typically left-stressed in American English. If composed foods constitute artefacts, apple pie is well-behaved, but apple cake is not.

Olsen (2000b) makes another attempt to find semantic criteria that distinguish right-stressed N + N collocations from left-stressed ones. In addition to the ‘made of’ criterion that Sampson proposes for right-hand stress, she adds compounds whose first stems express temporal or locational relations, citing examples such as summer ʼdress, summer ʼnight, and hotel ʼkitchen.4 Left-hand stress, on the other hand, occurs in compounds ‘where a relational notion is not overtly expressed by the head noun, but is inferable on the basis of the meaning of one of the constituents’ (2000b: 60). So, for example, she argues that the compound mucus cell is left-stressed because ‘we … know that mucus generally “lines” cells’ (ibid.). There are a number of problems with her hypothesis, however. On the one hand, it is easy enough to find examples of left-stressed compounds whose first stems express temporal or locational meanings (ʼrestaurant ʼkitchen, ʼwinter coat, ʼsummer school5). And on the other, it is not entirely clear to us (knowing virtually nothing about biology!) that a mucus cell is so-called because it is lined with mucus, rather than made of mucus (in which case we would predict right-hand stress).6

Plag (2006b: 147–8) shows experimentally that stress assignment in novel compounds sometimes seems to depend on analogy to existing N + N constructions in the mental lexicon, with the head determining the analogical pattern. This effect has been observed in the literature for street and avenue compounds, where the former are left-stressed (Fifth Street) and the latter right-stressed (Fifth Avenue) (Bauer 1983). But Plag's experimental data extend this result to compounds based on musical terms such as symphony and sonata: the former are typically right-stressed, the latter left-stressed. Plag, however, leaves it as an open question ‘how far such an analogical approach can reach’ and what combination of factors can be held responsible for this kind of analogical behaviour.

What we are forced to conclude is that for English, at least, left-hand stress is often a mark of compoundhood, but certainly cannot be taken as either a necessary or a sufficient condition for distinguishing a compound from a phrase. As Spencer (2003: 331) so aptly puts it, ‘there is a double dissociation between stress and structure’. There are phrases with left-hand stress and compounds with right-hand or double stress. We therefore need to look at other criteria that have been proposed for identifying compounds.

Phonological criteria in other languages

Languages other than English of course have other phonological means for distinguishing compounds from syntactic phrases, among them: distinctive tonal patterns (Bambara [Bauer, Chapter 17, this volume]; Hausa [Newman 2000:190, 116]; Konni [Štekauer, Körtvélyessy, and Valera 2007: 66]); vowel harmony (Chuckchee [Bauer, Chapter 17, this volume]); stress patterns (German, Danish, modern Greek, Polish, Hebrew [see articles in this volume]; Ket [Štekauer, Körtvélyessy, and Valera 2007: 66]); segmental effects like fricative voicing (Slave [Rice, this volume]) or voicing (Japanese [Bauer, Chapter 17, this volume]); vowel deletion (Hebrew [Borer, this volume]) or vowel reduction (Maipure, Baniva [Zamponi, this volume]). In most cases, however, the literature does not allow us to tell how consistently these criteria distinguish compounding as a type of word formation. Much further research is needed to determine if in at least some languages compounding is phonologically distinctive in specific ways.

1.1.2 Syntactic criteria

Among the syntactic criteria that have been suggested for distinguishing compounds from phrases in English are inseparability, the inability to modify the first element of the compound, and the inability to replace the second noun of a nominal compound with a pro-form such as one.

Certainly the most reliable of these is the inseparability criterion: a complex form is a compound (as opposed to a phrase) if no other element can be inserted between the two constituents. While it is possible to insert another word into the phrase a black bird, e.g. black ugly bird, no such insertion is permitted with the compound blackbird. Ugly can only modify the compound as a whole: ugly blackbird. Although this criterion is almost foolproof, there are nevertheless cases that might call it into question. For example, if we were to consider phrasal verbs (i.e. verb plus particle combinations) as compounds of a sort (e.g. he took his hat off), this criterion would not hold for them. Of course, the solution here is easy: we need only decide that phrasal verbs are not compounds; Jackendoff (2002b), for example, considers them to be constructional idioms. Our second example is not so easily dismissed. In particular, it appears that the first constituents of items that we would otherwise have reason to call compounds can sometimes be coordinated, for example wind and water mills or hypo- but not hyperglycaemic (Spencer 2003). One might argue here, however, that these coordinated forms are really phrasal compounds, but of course that raises the issue of whether phrasal compounds really are compounds (see Lieber 1992a).

Yet another syntactic criterion of compoundhood might be modification of the first stem: the first stem of a compound does not admit modification, whereas in a syntactic construction modification is possible. With regard to adjective + noun complexes it appears that only phrases and not compounds can be modified by very. Therefore, we can say a very black bird if, say, we are pointing at a crow, but not *a very blackbird if it is genus agelaius we are pointing out. But this test is not foolproof: relational adjectives can never be modified by very (*a very mortal disease), so the test can only be used if the adjective in question is a qualitative one. Bauer (1998b: 73) considers a broader version of this criterion for noun + noun complexes, pointing out that it seems impossible to modify the first stem in river-bed with an adjective: *swollen river-bed (where it is the river that is meant to be swollen). But in other cases, modification seems possible. He cites attested examples such as Serious Fraud Office and instant noodle salad.

Bauer (1998b: 77) also suggests as a test for compoundhood the inability to replace the second stem with a pro-form. In a phrase, it should be possible to replace the head noun with one, but not in a compound. So a black one can refer to our crow, but *a blackone cannot be our agelaius. Nevertheless, Bauer shows that this criterion is also not foolproof. Although rare, examples like He wanted a riding horse, as neither of the carriage ones would suffice are attested, with riding horse and carriage horse otherwise having the appearance of compounds.

There are also language-specific syntactic criteria for distinguishing compounds from phrases. For example, Bauer (Chapter 21, this volume) notes that in Danish only a single N can take a postposed definite article. Therefore if a postposed definite article is possible, we have evidence that a sequence of two roots must be a compound. Fradin (this volume) points out that word order gives us evidence for compoundhood in French: if a sequence of lexemes displays an order that cannot be generated for syntactic phrases, we are likely dealing with a compound.7

1.1.3 Inflection and linking elements

A final criterion for compoundhood concerns inflection, in those languages in which nominal inflection figures prominently. In one scenario, we recognize a compound when the head of the compound bears inflection, but the non-head does not. In another possible scenario we know that we are dealing with a compound when the non-head bears a compound-specific inflection. The former situation holds in Dutch, as Don shows (this volume), where the non-head of the compound is uninflected, and the latter in Yimas, where the non-head is marked as Oblique (Bauer, Chapter 17, this volume, citing Foley 1991).

As has frequently been pointed out, however, there are many languages in which non-compound-specific inflection does sometimes occur on the non-head of a compound, and discussion centres on how to interpret such cases. In English the plural marker is not infrequently found inside a compound: overseas investor, parks commissioner, programs coordinator, arms-conscious, sales-oriented, pants-loving (Selkirk 1982: 52). Selkirk tries to explain some of these exceptions: ‘it would seem that the actual use of the plural marker … might have the function (pragmatically speaking) of imposing the plural interpretation of the non-head, in the interest of avoiding ambiguity. This is probably the case with programs coordinator or private schools catalogue, for the corresponding program coordinator and private school catalogue are easily and perhaps preferentially understood as concerning only one program or private school’ (1982: 52).8 But surely this does not explain all cases. On the one hand, with respect to the initial stem of a compound like dress manufacturer, as Selkirk points out, it makes no sense to think of a manufacturer of one dress. On the other hand, a compound like programmes list doesn't seem to have any possible contrast: a programme list wouldn't be a list if it didn't involve more than one programme. It would seem that a plural is possible but not necessary in a compound to denote plurality of the first stem. Similarly, with possessive marking; it is not necessary to mark possessive on the initial element of a compound (for example, gangster money), but it is nevertheless possible (e.g. children's hour).

The issue of compound-internal inflection is inevitably bound up with that of so-called linking elements (also called interfixes or intermorphs in the literature). A linking element is a meaningless extension that occurs between the first and second elements of compounds. In some languages it is completely clear that this element is not an inflectional morpheme. For example, as Ralli (this volume) argues for modern Greek, the first element of a compound is always followed by -o which is semantically empty and is the historical remnant of a no-longer-existent theme vowel. For other languages, such as German and Dutch, there has been extensive discussion of whether the linking elements can ever be construed as inflectional. The consensus is that while they may trace back historically to case or number markers, their status is quite different in the synchronic grammar: as mentioned above, they are meaningless, and often they do not correspond to any of the current inflectional forms of the nouns they occur on, although occasionally they may plausibly be interpreted as adding a plural flavour to the first element of the compound. See the chapters by Don and Neef in Part 2 of this volume for suggestions as to how linking elements in Dutch and German are to be handled.9

1.1.4 Summary

The picture that emerges here may seem a bit dispiriting: what are we to think if there are (almost) no reliable criteria for distinguishing compounds from phrases or from other sorts of derived words? One possible conclusion is that there is no such thing as a compound; Spencer (2003) argues this position for English. Another conclusion might be that there is simply a cline of more compound-like and less compound-like complexes, with no clear categorical distinction. As Bauer has put it (1998b: 78), ‘none of the possible criteria give a reliable distinction between two types of construction. The implication is that any distinction drawn on the basis of just one of these criteria is simply a random division of noun + noun constructions, not a strongly motivated borderline between syntax and the lexicon.’ A third conclusion is Olsen's (2000b): all noun + noun collocations are compounds for her. Plag (2006) simply remains agnostic on whether there is a distinction between noun + noun compounds and phrases.

So we return to the blind men and the elephant: not only are we not sure what the elephant looks like, but some of us are not even sure there's an elephant at all. There may be a single species, a range of related species, or the whole thing might be a delusion. Nevertheless, the majority of theorists, ourselves among them, seem to believe that it's worth looking further. We would not have been able to assemble so many interesting papers in this volume if this were not the case. Our approach is to broaden our focus and look for both theoretical perspectives that might bring new insight to the questions that compounds raise, and data from widely diverse languages that might be brought to bear on the issue of definition.

1.2 Prospectus

We have divided this handbook into two parts. In the first, we look at compounds from a broad range of perspectives, both methodological and theoretical. We consider the issue of compounds from both a synchronic and a diachronic point of view, and through the lenses of a number of contemporary theoretical frameworks. We have tried to be eclectic in our choice of theoretical perspectives, tapping both the more familiar traditions of Western Europe and North America, and the rather less visible (for Western readers, at least) traditions of Central European linguistics. And we have tried to bring in as well the views of specialists in psycholinguistics and language acquisition.

Chapters 2 and 3 continue our focus on issues of definition and classification. In his chapter on idiomatology, Kavka assesses the relationship between compounds and idioms, arguing that both exhibit a gradience from mildly to wildly idiosyncratic interpretation that begs us to consider them together. Scalise and Bisetto consider a wide range of systems that have been proposed for classifying different types of compounds and, finding them all inconsistent, propose a new scheme of classification that has better potential for cross-linguistic utility. They rightly suggest that having a broadly applicable schema for classifying compounds will inevitably help us to see typological and areal trends in compounding.

In Chapter 4, ten Hacken reviews the early history of the treatment of compounding in generative theory, starting with Lees's seminal monograph (1960) and tracing theoretical developments through the beginning of the Lexicalist movement in the late 1970s. His chapter sets the stage for Chapters 5–9, each of which approaches the analysis of compounding from a different theoretical perspective.

Several of these theories fit roughly into the generative tradition, but differ in where they locate compounding in the architecture of the grammar. Giegerich's view (Chapter 9) is an extension of the Lexicalist tradition. He explores a multi-level analysis in which he treats as compounds in English only those N + N combinations with initial stress, leaving all other N + N combinations to the syntax. Giegerich considers as well the extent to which the meanings of various types of N + N combination can be correlated with their stress behaviour. In their chapters Jackendoff (Chapter 6) and Di Sciullo (Chapter 8) propose frameworks that locate the analysis of compounds in the morphology. Both seek to derive compounds by mechanisms that are parallel (Jackendoff) or similar to (Di Sciullo) syntactic processes, but are not part of the syntax per se. Harley (Chapter 7), on the other hand, takes a purely syntactic approach to compounding, within the framework of Distributed Morphology, treating compounding as a species of syntactic incorporation.

In contrast to both the morphological and the syntactic approach, Lieber (Chapter 5) suggests that a perspicuous analysis of compounds is available within her framework of lexical semantic representation, and by inference, that the debate over whether to analyse compounding as morphology or as syntax is orthogonal to the most pressing issues in the analysis of compounds: what matters is not where compounds are derived, but how compound interpretation is to be arrived at. Booij (Chapter 10) approaches compounding from the perspective of Construction Grammar, another off-shoot of the generative tradition that has become prominent in the last decade or so, but which stresses the importance of hierarchically organized constructional schemas, along with the role of analogy for the generation of compounds.

Moving away from the generative tradition, in Chapter 11 Grzega maps various views of compounding from the ‘onomasiological’ perspective, a revived theoretical approach with a long tradition, especially in Central Europe and Germany. Heyvaert (Chapter 12) reviews analyses of compounding from the point of view of Cognitive Linguistics, which looks at language from the point of view of the cognitive abilities of the language user.

The next three chapters take a view towards the language user as well, looking at the perception, processing, and acquisition of compounds. In Chapter 13, Gagné reviews the copious psycholinguistic literature considering the nature of the mental representations of compounds, including such interrelated issues as the extent to which the individual lexemes of compounds or the whole compounds themselves are represented in the mental lexicon, the ways in which those representations are or are not linked, and the nature of lexical access in perception of compounds (for example, whether decomposition is involved or not). Štekauer, in Chapter 14, interrelates the word-formation and word-interpretation models in order to explain the predictability of the meanings of novel context-free compounds; his model explores the idea that there are always one or two dominant meanings ‘winning’ the competition among different potential meanings as a result of interplay between linguistic and extralinguistic factors. In Chapter 15, Berman looks at the acquisition of compounds, primarily in English and Hebrew, from the perspectives of both naturalistic and experimental data, and considers the extent to which the age and sequence of acquisition of compounds might be a matter of typological differences in languages.

Chapter 16 takes a diachronic view of compounding. There, Kastovsky looks both at the history of compounding in Indo-European, considering the origins of compounds, and at the treatment of compounding in the historical-comparative tradition.

The second part of this volume begins with the question of how to arrive at a typology of compounding given our present state of knowledge. Looking at a variety of languages, Bauer points out, perhaps surprisingly for some, that compounding is not a linguistic universal,10 and moreover that at our present state of knowledge we know of no special correlations between aspects of compounding and other typological characteristics of languages cross-linguistically. In the remainder of this section, we look at the facts of compounding in a range of languages. Although we can make no claim to representative coverage of the languages of the world either in terms of typology or geographic area, we have nevertheless made an effort to represent both well-known and well-researched languages and lesser-known and under-documented languages. Our sample is heavily weighted towards Indo-European, where, as is often the case, the majority of published work has been done. Within Indo-European, Germanic (English, Dutch, German, Danish) and Romance (French, Spanish) are again disproportionately represented, partly because there has been longstanding and active debate about compounding in these subfamilies. Outside of Germanic and Romance in Indo-European, we cover Modern Greek and Polish. We have also attempted to sample a number of other language families, including Sino-Tibetan (Mandarin Chinese), Afro-Asiatic (Hebrew), Finno-Ugric (Hungarian), Athapaskan (Slave), Iroquoian (Mohawk), Arawakan (Maipure-Yavitero), Araucanian (Mapudungan), and Pama-Nyungan (Warlpiri), as well as one language isolate (Japanese). We might have wished for a wider range of languages, but we were limited both by issues of space in this volume, and by the dearth of research on compounding in many lesser-known and under-described languages. In spite of its limitations, this section nevertheless contains what we believe is the largest sample of descriptions of compounding in the languages of the world that has been brought together in a single volume, and we hope for that reason that it will be useful.

We have tried to arrange the volume so that each chapter is self-contained and can be read separately. We hardly expect our readers to plough through all the articles consecutively. But for those who are truly engrossed by the study of compounds and who have the stamina to do so, we hope that they will see that certain intriguing themes emerge time and time again, regardless of framework or language. Among those themes are the following:

  • The definitional problem: Just about every article in this volume starts out by mentioning the difficulty in figuring out what constitutes a compound, either cross-linguistically or language-specifically. It seems almost impossible to draw a firm line between compounds on the one hand and phrases or derived words on the other, perhaps suggesting, as many in the tradition of Cognitive Linguistics have done, that compounding is a gradient rather than a categorical phenomenon, with prototypical examples and fuzzy edges.

  • The problem of interpretation: This is perhaps better stated as a range of problems, including the issues of distinguishing compounds from idioms, of how compounds are assigned their interpretations, and of the extent to which those interpretations may be predicted.

  • The component problem: Given the difficulties above, what sorts of analyses are most appropriate for compounds, either cross-linguistically or language-specifically? What does the study of compounds tell us about what Jackendoff so aptly calls ‘the architecture of grammar’? The study of compounds has the potential to illuminate large and long-standing (and perhaps ultimately unresolvable) issues about the relationship among major components of the grammar, especially of course morphology and syntax, and about the nature of ‘wordhood’.

There are smaller leitmotifs that recur as well: issues of headedness, the nature of exocentricity, the relationship between compounding and other morphosyntactic phenomena such as incorporation, serial verbs, phrasal verbs, and the like.

We hope that ultimately this volume will serve as a resource for scholars and as a starting point and inspiration for those who wish to continue the work that still needs to be done on compounding. We suspect that we will never agree on what this particular elephant looks like, but if this volume accomplishes only one purpose, it should be to improve our certainty that there really is an elephant. We may still be blind, but we're pretty sure we're not delusional.


(1) Kiparsky (1982), for example, makes no distinction between verbal compounds and other compounds in terms of their derivation, and Štekauer and Valera (2007) also suggest that the generation of verbal compounds in English is not always a matter of back-formation. Indeed, Bauer and Renouf (2001) cite a few new verbal compounds in their corpus-based studies. On the other hand a number of scholars, including Pennanen (1966) and Adams (1973), treat verbal compounds like air condition as a product of back-formation.

(2) For a different view, see, for example, Arnold (1966) and Achmanova (1958).

(3) A variation on this proposal can be found in Ladd (1984), who proposes that we find left-hand stress when the left-hand constituent designates a subcategory of the right-hand constituent. This, of course, still runs afoul of the apple cake, apple pie problem.

(4) The last of these is actually left-stressed for one of the present authors.

(5) The last of these examples is also cited by Plag (2006).

(6) Spencer (2003) provides a very useful appendix in which he lists particular semantic fields in which some compounds are left-stressed and others right-stressed, suggesting that semantic field can have nothing to do with the determination of stress. Spencer in fact argues that it is impossible to distinguish compounds from phrases in English on any basis, and indeed that within a theory like Chomsky's Bare Phrase Structure, there is no need to do so.

(7) Marchand (1960) suggests this criterion for English as well, for compounds with present or past participle as the second constituent: easy-going, high-born, man-made. In these cases, Marchand (1960: 15) employs the following compoundhood criterion: the first constituent cannot syntactically function as a modifier of the right-hand constituent. The same principle is applied to the grass-green type characterized by two heavy stresses: an adjective cannot syntactically be modified by a preceding substantive (the corresponding syntactic phrase is ‘green as grass’).

(8) Allen (1978) tries to argue that the s in cases like craftsman is a linking element rather than a plural marker. She claims that the meaning of the first constituent in craftsman, tradesman, oarsman, helmsman, etc. is singular, and some of the first constituents of words of this class (kinsman, deersman) do not even have a plural. But it seems clear to us, as to Selkirk, that some of these forms indeed do denote plurals.

(9) See also Štekauer and Valera (2007) for examples of linking elements in other languages. Interestingly, they point out that the stem + stem type of compounds without any linking element is much more frequent than that with a linking element, but that in languages that have both types, the type with linking element is generally more productive.

(10) Indeed, in Štekauer, Körtvélyessy, and Valera's (2007) core sample of fifty-four languages, only forty-nine displayed compounding.