Borrowing: Loanwords in the Speech Community and in the Grammar

Shana Poplack

Print publication date: 2018

Print ISBN-13: 9780190256388

Published to Oxford Scholarship Online: November 2017

DOI: 10.1093/oso/9780190256388.001.0001





1 Rationale

Shana Poplack

Oxford University Press

Abstract and Keywords

This chapter identifies the rationale behind this volume: the enduring controversy over how to theorize language-mixing strategies. Relating this controversy to discrepancies in the conceptualization and treatment of the data of language mixing, it outlines a method to distinguish among other-language phenomena based on spontaneous bilingual performance, quantitative analysis, and rigorous standards of proof. It justifies the focus on the three quantitatively predominant manifestations of language mixing: nonce borrowing, lexical retrieval of previously borrowed words and code-switching. It introduces and defines integration, the major tool in characterizing language-mixing types. Ensuing chapters identify and illustrate an array of integration strategies, whereby the vast majority of lone other-language items are adapted to the morphological and syntactic patterns of a recipient language, in a variety of language pairs.

Keywords: language mixing, nonce borrowing, loanword retrieval, code-switching, lone other-language items, recipient language, donor language, loanword integration, bilingual performance

1.1. Introduction

The last few decades have seen intensified interest in bilingual behavior, especially in its most elusive manifestation, code-switching. Literally dozens of volumes have been published on the subject, and new theses, articles, and handbooks continue to appear. Yet curiously little of this massive body of work has had a decisive impact on the field, in the sense of leading to consensus on even such elementary notions as what code-switching is and what constrains its usage.

Throughout this flurry of scholarly activity, with relatively few notable exceptions (e.g., Backus 2014; Field 2002; Haspelmath and Tadmor 2009; Haugen 1950a, 1950b; Heath 1989; Muysken 2000; Treffers-Daller 2010; Van Hout and Muysken 1994), short shrift has been given to the major manifestation of language contact: lexical borrowing. In virtually every bilingual situation empirically studied, borrowed items make up the overwhelming majority of other-language material. Here too, controversy reigns. Scholars have long been divided over whether borrowing and code-switching are distinct processes or instantiations of a single one. The conflicting code-switching theories that characterize the field today are due in no small part to the decisions their proponents take with regard to the identification and treatment of borrowings, especially when these occur spontaneously. To be sure, either position hinges on the ability to recognize them in the complex stream of bilingual speech, which many scholars have claimed is impossible. Comparisons with established loanwords are often of little avail; these are end products, but borrowing on the fly may display little resemblance to them. Correct identification of the various manifestations of language contact is an empirical question, requiring highly ramified empirical methods, yet it has thus far received relatively little systematic empirical attention. This volume, which synthesizes, builds on, and in some cases reinterprets more than three decades of original research on language-mixing strategies as they actually occur in the discourse of bilingual speakers, aims to fill this lacuna. It seeks to contribute to scholarly debate by characterizing in detail the phenomenon of lexical borrowing, in the speech community as well as in the grammar, both synchronically and, more unusual, diachronically. 
In so doing, we propose falsifiable hypotheses about established loanwords and nonce borrowings and test them empirically on a wealth of unique datasets on a wide variety of typologically similar and distinct language pairs. These are described in chapter 3. A major focus is the detailed analysis of integration, the principal mechanism underlying the borrowing process. Though the shape the borrowed form assumes may be colored by community convention, we show that the act of transforming donor-language elements into native material is universal.

Thus, in contrast to most other treatments of language mixing, which deal with the product of borrowing (if they consider it at all), the work presented here focuses on the process: how speakers go about incorporating foreign items into their bilingual discourse, how they adapt them to recipient-language grammatical structure, how these forms diffuse across speakers and communities, how long they persist in real time, and whether they change over the duration. Particularly original here are the vast quantities of spontaneous bilingual performance data at our disposal and the scientific apparatus we bring to its analysis. Together, these enable us to test for the first time (and in a number of cases disprove) many long-standing beliefs about the way lexical borrowing works, which have until now lacked this kind of scientific validation.

This focus on borrowing further distinguishes this monograph from the bulk of the contact literature since the early seminal work of Haugen (1950a, 1950b) and Weinreich (1953/1968). Rather than analyze it in its own right, most current treatments view borrowing as ancillary to code-switching, if not a category concocted out of whole cloth to salvage certain code-switching theories. Schatz’s (1989, 129) appreciation that borrowing and nonce borrowing “often seem to play the role of garbage can designed to throw in data that does not fit neatly defined constraints on code-switching” and Boumans’ (1998, 159) assertion that “the label B[orrowing] has no explanatory value and also creates some confusion” are illustrative in this regard. As the analyses assembled in these chapters will demonstrate, such remarks belie a profound incomprehension of the actual behavior of bilinguals and the workings of the bilingual speech community. Here borrowing, in its various guises, will be seen to constitute not only the preferred mixing strategy by far but one which is heavily constrained by community norms. This motivates our decision to devote the bulk of this work to the analysis of borrowing, free from the controversies surrounding code-switching. In chapter 9, we return to the issue of distinguishing these two types of language mixing.

How do we know when we are in the presence of a loanword? Criteria offered to identify them typically include synonym replacement, cultural reference, persistence over generations, phonetic modification, frequency, and speakers’ own perceptions of the status of the word in the language, among others. In practice, however, none of these have been applied systematically, either for lack of pertinent data or appropriate analytical methods. This is why, in contrast to many other treatments, we turn to usage itself for the answer. We have pioneered the recourse to objective, empirically established criteria of frequency, recurrence, and diffusion to identify the different types of borrowings in the community (Poplack, Sankoff, and Miller 1988; chapter 4) prior to analyzing their linguistic structure and social distribution. The goal here is not to identify the loanwords in a given language (as per Haspelmath and Tadmor [2009], or Schultz [2012], among many others). Indeed, such lists are liable to be outdated by the time they are published, since lexical items, both borrowed and native, are notoriously volatile (chapter 8). Moreover, items functioning as bona fide loanwords in the lexicon of one cohort of bilinguals (the young, the English-dominant, or those residing in a given neighborhood) may not have currency in another (chapter 11). Instead, this work aims to shed light on the process of borrowing, in terms of its major concomitant: integration into the structure of a recipient language. Here the bank of established loanwords (however constituted) provides crucial evidence in the form of snapshots of the outcome. The work presented in this volume enhances our understanding of that outcome by tracking the trajectory that spontaneously borrowed elements would have followed to achieve it.
This kind of information, culled from community-based bilingual production data, synchronic and diachronic, is (necessarily) opaque to the historical linguist, whose access is limited to the product and who must therefore reconstruct both process and transition period.

In elucidating the mechanism of integration, the analyses assembled here have established that it cannot be understood without taking into account the fine details of inherent variability, in both donor and recipient languages. Variability is a hallmark of speech, and although this is usually overlooked in the contact literature, bilingual speech is no exception. The originality of this work resides in the systematic use it makes of such variability to elucidate the processes involved in language mixing. The analyses that constitute this volume are thus naturally integrated into the framework for linguistic analysis known as variation theory, with its focus on spontaneous speech in social context and dedicated quantitative methods of analyzing it (chapter 2). The variationist perspective is most evident in the bilingual performance data on which the analyses are based and the quantitative reasoning and standards of proof underlying the claims that are made. The bulk of the data comes from the Ottawa-Hull French Corpus (Poplack 1989), arguably still the only major bilingual corpus constituted according to scientific principles of representativeness. As described in chapter 3, it was collected in a well-defined bilingual speech community and incorporates, as potentially explanatory independent variables, individuals of differing socio-demographic cohorts and bilingual abilities. This yields speaker and speech samples that are large, varied, and representative enough to carry out the kinds of quantitative analyses necessary to detect language-mixing patterns. These in turn enable us to characterize the community norms which often trump predictions made on a purely linguistic basis (chapter 11).

A key methodological hallmark of this approach is comparison. Adopting the comparative sociolinguistic method (Poplack and Meechan 1998b; chapter 2), we confront the grammatical structure of borrowed items with that of the recipient language, the donor language, and where available, other mixed-language counterparts. Diachronic comparisons with older corpora afford the first characterization of pathways of loanword development in bilingual speech and concomitant assessment of long-held ideas about how integration proceeds over time. Access to a number of smaller datasets on typologically diverse language pairs enhances this endeavor, enabling us to elaborate a variety of diagnostics of language membership (chapters 5, 6, 7, and 9). These may involve the same donor language incorporated into different recipients (e.g., French into Wolof, Fongbe, and Tunisian Arabic, or English into Ukrainian, Japanese, and Tamil), or a single language alternately in the guise of donor and recipient (French, English). Such multifaceted comparisons are the source of our contention that the integration of borrowed words is universal, though it may not always assume the same form, even in the same language pair. They also reveal that this process is best understood in terms of recipient-language patterns, which are often invisible to any but systematic quantitative analysis. Crucially, these patterns can only be fully apprehended by taking account of variability. To be sure, we do encounter some seemingly idiosyncratic loanword incorporation devices, but their touted uniqueness as products of bilingual grammar often dissipates once the peculiarities of the relevant recipient and donor benchmark varieties are factored into the analysis. Indeed, elements purportedly specialized for introducing other-language items not infrequently turn out to constitute the only productive avenue for incorporating neologisms of all kinds, not just borrowed ones.

As will be immediately apparent to anyone who has worked with naturally occurring bilingual speech or read the ample literature on bilingual mixing, many, if not most, of its manifestations are difficult to identify a priori and out of context. There are of course unambiguous cases, but relatively few would qualify as such. This is because the vast majority of other-language material consists of lone items, as in (1). Judging by surface indicators alone, many of these could be borrowed or code-switched or arguably, the product of yet another mixing mechanism. If only by virtue of their sheer quantitative importance, the behavior of these lone items (if in fact they all behave in the same way) has the capacity to skew any trends in the data. This makes it imperative to determine their status before attempting to arrive at a theory of their behavior and, by extension, that of any other mixing strategy.



In this context, perhaps the most contentious issue in the language mixing literature revolves around the concept of nonce borrowing, first adumbrated in Weinreich (1953/1968) but catapulted into scholarly debate by Sankoff, Poplack, and Vanniarajan (1990; chapter 5). The identity, nature, and even existence of nonce borrowings have been the subject of enduring controversy in the field. Some scholars insist that they are code-switches (e.g., Myers-Scotton 2002); some declare that they do not exist (Gardner-Chloros 2009; Gardner-Chloros and Edwards 2004); while others (reluctantly) acknowledge their existence but maintain that they are indistinguishable from code-switches. Some consider them redundant (Stammers and Deuchar 2012) or circular (Haspelmath and Tadmor 2009), and still others misinterpret the original proposals and their ramifications (e.g., the claim that nonce borrowings exhibit no integration at all [Simango 2000, 489]). Virtually all of the many papers written on code-switching since 1990 position themselves with respect to the phenomenon of nonce borrowing, but most do so through reference to secondary or tertiary sources, creating a whole self-sustaining narrative line which bears little relationship to the original claims and proofs. Over the years, a number of dedicated empirical methods, ranging from data collection to manipulation to analysis, have been developed to deal with these issues. A major goal of this monograph is to synthesize this research within a consistent theoretical, methodological, and terminological framework.

The working assumption underlying the analyses presented here is that the same lexical material is fair game for either borrowing or code-switching, and—linguistic and extra-linguistic conditions permitting—speakers may avail themselves of one, the other, or both in the course of a single bilingual interaction. This means that the proper identification of these strategies for combining languages ultimately resides not in the dictates of theory but in the way speakers themselves handle the material. As we will be at pains to demonstrate in the chapters making up this volume, independent comparative analysis is required to make this determination. Other-language elements cannot be identified a priori as borrowings (or code-switches for that matter, pace the decision of many scholars to simply designate them as such). Drawing on the well-documented observations that established loanwords are equivalent to recipient-language counterparts in their structural linguistic characteristics and that code-switches retain those of the language from which they are drawn, we marshal these two polar opposites as benchmarks against which to situate the mass of lone donor-language items that make up the vast majority of any bilingual corpus. One contribution of this research, then, is the validation of distributional and contextual criteria for distinguishing established loanwords from nonce borrowings, and both from code-switches.

1.2. Definitions

The above discussion has made only informal use of the terminology of bilingual discourse; before proceeding, it will be useful to define our understanding of these terms. Language mixing is used here as an umbrella term for various combinations of overt lexical material from two or more languages; structural borrowing, semantic extensions, and other contact-induced changes with no overt lexical footprint are not considered here. An array of phenomena fulfill the above definition (see, e.g., Deuchar, Muysken, and Wang 2007; Gullberg, Indefrey, and Muysken 2009; Muysken 2015), but our focus in this volume is on the three that quantitative analysis has shown to account for nearly all the LD-origin material in the 16 corpora analyzed for this work. These are lexical borrowing, retrieval of previously borrowed loanwords, and code-switching. We operationally define borrowing as the process of transferring (Clyne 2003) or incorporating (Thomason and Kaufman 1988) lexical items originating from one language into discourse of another. We refer to these under the general label, other-language items. The focus is on lone other-language items (or compounds functioning as such), the canonical size of the prototypical loanword. The language that provides these items is referred to as the donor language (LD), or occasionally the source language, while the one that hosts them is the native, or recipient language (LR). In this work, LD and LR refer to the unmixed or monolingual stretches of these languages that are usual concomitants of mixed discourse. These serve as benchmarks against which the behavior of the lone LD-origin items is situated. Crucially, the specific donor and recipient languages invoked throughout are those of the same participants who produced the mixed data. Their benchmark varieties may be standard or not, but in no case is an idealized version of LD or LR appealed to for purposes of comparison with the behavior of borrowed items.
As we will see, this minimizes the chances of misidentification, since an individual cannot be expected to integrate an other-language item into a variety s/he does not speak.

The lone LD items are further subdivided into those that figure in dictionaries or published wordlists of LR, to which we refer as attested or occasionally bona fide loanwords, and those that do not. Lexical retrieval of attested loanwords differs crucially from active borrowing: in the former, recourse to LD need not be involved. By virtue of having already been borrowed, such LD items are now constituents of the LR lexicon, independent of their ultimate etymological origin. This explains why they can be used by LR monolinguals with no knowledge of LD. Insofar as these items represent the product of the borrowing process alluded to above, they too qualify as benchmarks against which the behavior of spontaneously incorporated LD-origin items may be situated.

Other cross-cutting distinctions concern frequency or recurrence, which refer to the number of times a word was uttered, and diffusion, which captures the number of speakers to whom it has spread. Given the size of the corpus on which the major analyses of this volume are based (chapter 3), lone LD-origin items could be meaningfully classified according to these criteria. Those that are widely dispersed across the community (here operationalized as having occurred spontaneously in the speech of more than 10 individuals [chapter 4]) are designated established or widespread loanwords. Contrasting with established loanwords are LD-origin items of varying lesser frequencies, including those that occur only once. The latter are the contentious nonce words that have challenged the field for so long. Nonce words are operationally identified using objective quantitative measures (chapters 4, 8, and 9), but their status as borrowings or code-switches can only be confirmed as a result of detailed and systematic comparison with the behavior of relevant counterparts in the benchmarks: LR and LD, as well as established loanwords and unambiguous code-switches, where available. LD-origin items occurring only once will therefore be referred to as nonce words or nonce items pending assessment of their status, and nonce borrowings only if they have been found to pattern with LR (and by extension established loanwords). In contrast to the informal use often made in the literature, here the term nonce borrowing is restricted to LD-origin items that simultaneously respond to both these frequency and integration criteria. Indeed, identification of nonce items as borrowed or code-switched is a major contribution of this volume.
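The frequency and diffusion criteria just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (one speaker-item pair per occurrence), the function name `classify_items`, and the residual label "intermediate" are hypothetical and are not taken from the analyses reported here. Only the threshold of more than 10 speakers and the single-occurrence definition of a nonce word come from the text. Note that the label "nonce" marks a nonce word pending assessment; whether it is a nonce borrowing depends on the integration analysis, which this sketch does not attempt.

```python
from collections import defaultdict

# Threshold stated in the text: an item counts as widespread when it occurs
# spontaneously in the speech of more than 10 individuals.
DIFFUSION_THRESHOLD = 10


def classify_items(tokens, threshold=DIFFUSION_THRESHOLD):
    """Classify lone LD-origin items by frequency and diffusion.

    tokens: iterable of (speaker_id, item) pairs, one pair per occurrence
            (a hypothetical input format chosen for this sketch).
    Returns a dict mapping item -> (frequency, diffusion, label).
    """
    frequency = defaultdict(int)   # number of times the item was uttered
    speakers = defaultdict(set)    # distinct speakers who uttered it

    for speaker_id, item in tokens:
        frequency[item] += 1
        speakers[item].add(speaker_id)

    result = {}
    for item, freq in frequency.items():
        diffusion = len(speakers[item])
        if diffusion > threshold:
            label = "widespread"      # established loanword by diffusion
        elif freq == 1:
            label = "nonce"           # nonce word, status still to be assessed
        else:
            label = "intermediate"    # hypothetical label for everything else
        result[item] = (freq, diffusion, label)
    return result
```

Keeping frequency and diffusion as separate counts reflects the point that the two measures are partly independent: an item may recur often in the speech of one person without having spread across the community.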

Frequency, attestation history, and level of diffusion are alternate measures of entrenchment (whether in the lexicon or in the community) of a borrowed word. To some extent these are independent: the word may be attested and infrequent or established but unattested (chapter 4). Nonetheless, we operationally consider frequent, widespread, and/or dictionary-attested LD-origin items to form part of the LR lexical stock. These are the criteria we appeal to in identifying such items as loanwords, as distinct from borrowings (including eventual nonce borrowings), which, although also in use, may be infrequent, idiosyncratic, and/or unattested. In what follows, then, the term loanword is generally reserved for the product of the borrowing process, while the nominal use of borrowing refers to items that have not (yet) achieved this status.

Both types of LD-origin item tend to be inserted into LR structure; code-switching, in contrast, refers to alternation (cf. also Muysken [2000]) of stretches of one language with stretches of another, each retaining the morphology, syntax, and optionally the phonology of LD. The notion of LR is not pertinent to code-switching, because the process involved is not insertion but juxtaposition (Poplack 1980, 2015a). There is theoretically no length limitation on code-switches (though the analyses in ensuing chapters will confirm that single-word switches are exceedingly rare), but given the specific goal of ascertaining whether they can be distinguished from borrowing in their ambiguous (i.e., single-word) instantiations, in the first instance we target unambiguous code-switches as a benchmark for comparison. Thus, unless otherwise specified, the term code-switching as used here refers to multiword stretches of the other language. Confrontation of the behavior of lone LD-origin words with that of LD words contained in code-switches to LD, each analyzed independently in its own right, is thus the thread that binds much of the work presented here.

Our major tool in characterizing language-mixing types is analysis of integration, the process by which bilingual speakers adapt LD items to the phonological, morphological, and syntactic patterns of LR. Ensuing chapters identify and illustrate an array of integration strategies, which vary from language pair to language pair, depending on the particular properties of LR. A focus on conflict sites (Poplack and Meechan 1998b), areas where the contact languages differ quantitatively or qualitatively on some measure, enables us to distinguish voluntary from inadvertent integration. These will be shown to represent a key diagnostic of system membership.

1.3. Plan of this volume

The remainder of this volume is organized as follows. Chapter 2 reviews the variationist perspective on language and outlines its specific applications to the study of language mixing. Key among them are the focus on actual bilingual performance data, contextualization of its manifestations across speakers, mixing strategies and language pairs, systematic quantitative analysis of usage patterns, and incorporating checks on the validity and reliability of the results.

Chapter 3 describes the constitution of the bilingual “mega-corpus” which provides the data on which the major analyses of chapters 4, 8, 9, 10, and 11 are based, and introduces the 11 corpora of typologically distinct language pairs whose analysis provides corroborating evidence of many of the claims made there.

Chapter 4 reports the results of the first large-scale community-based study of borrowing as it transpires synchronically in the course of regular bilingual interactions. This work also represents an initial attempt to furnish an empirical basis for going beyond the traditionally invoked attested loanwords to characterize the borrowing process. Departing from distinctions among lone LD-origin items of varying frequencies, we carry out detailed structural analyses to ascertain whether English-origin nonce words incorporated into French display different structural properties from established loanwords. Among the linguistic diagnostics examined are gender assignment, plural marking, verb morphology, word order, and phonological realization. Consideration of all lone English-origin items regardless of frequency, recurrence, or attestation history enabled us to recruit bona fide loanwords, in conjunction with what we know of LR grammar, as benchmarks against which to assess whether the contentious lone LD-origin items are behaving like their established counterparts (and thus could be inferred to have been borrowed). This inaugurated the comparative sociolinguistic method, illustrated throughout this volume, which will be seen to be crucial in the identification and analysis of bilingual behavior. Results lead to the first corpus-based definition of nonce borrowing and the unexpected findings that 1) nearly all of the lone LD items studied, nonce or widespread, displayed a high level of integration into LR; 2) this was achieved very early (almost immediately!) at the morphosyntactic level, while phonological integration was variable; and 3) the linguistic behavior of nonce items paralleled that of their established loanword counterparts. We revisit these results, expanding upon them synchronically and diachronically, in chapter 8 and test them on a variety of typologically distinct language pairs in chapters 5, 6, and 7.

The robust finding of chapter 4—that lone LD-origin words, whether novel or longstanding, display virtually identical linguistic behavior to established loanwords—led to the Nonce Borrowing Hypothesis. This captures the empirical observation that not only do speakers code-switch spontaneously, they may also borrow spontaneously, and these spontaneous borrowings assume the morphological and syntactic identity of LR prior to and independent of achieving the social characteristics of established loanwords. A corollary to the Nonce Borrowing Hypothesis is that borrowing, whether nonce or established, is a phenomenon of language mixture distinct from code-switching and is operationally distinguishable as such. Chapter 5 provides the reasoning behind this distinction. If the Nonce Borrowing Hypothesis is correct, it should apply not only to typologically similar language pairs like French-English but equally well to typologically different ones. In chapter 5, we characterize in some detail the process of nonce borrowing in just such a pair, Tamil, a head-final language with primarily SOV word order and associated features, and English, which shares few of those properties. This is an ideal testing ground for the Nonce Borrowing Hypothesis (and by extension the code-switching vs. borrowing distinction), since when Tamil is combined with English, numerous potential morphological and syntactic conflicts arise.

The work reported in chapter 5 also revisits the issue of residual variability observed in chapter 4, which was tentatively associated with the variability inherent in LR. Chapter 5 takes this insight beyond the casual comparisons of chapter 4, initiating a long-term research program aiming to analyze the properties of LR in conjunction with those of the borrowed material. The goal here is to test a much stronger loanword integration hypothesis: that LD-origin material that has been borrowed will display variability in morphosyntactic integration paralleling that of LR. Accordingly, another innovation first implemented in the work reported in chapter 5 involves explicitly marshaling the specific LR variety spoken by the bilinguals under study as the benchmark for comparison. Since that variety had rarely if ever been taken into account before, let alone studied from a variationist perspective, many key elements of the integration process had until then been obscured.

The conflicts examined in chapter 5 are word order and case-marking of English-origin nouns functioning as direct and indirect objects of Tamil verbs. If these were Tamil objects, they would appear in the canonical Tamil pre-verbal order and feature accusative and dative case markers, respectively. Simple token counts of the kind effected in chapter 4 reveal that while virtually all of them did precede the verb, consistent with Tamil grammar, and indirect objects were overwhelmingly inflected with Tamil dative markers, speakers usually “failed” to mark English-origin direct objects. Much attention has been lavished on such apparently bare LD nouns, which have been attested in many other language pairs as well. The general consensus, as emerges from reliance on their surface form alone, is that they are code-switches. But such assumptions leave unanswered the question of why speakers would elect to borrow most indirect objects but code-switch most direct objects, especially when both “require” Tamil case inflections inadmissible in English. The comparative analyses in this chapter locate the answer in the properties of (unmixed) LR, in particular its (apparently previously unattested) propensity to case-mark variably. The finding that the distribution of overt and null case-marking on English-origin nouns closely parallels that on their native Tamil counterparts led to a crucial realization: where bare forms result from variability in LR, the facts of that variability can be used to identify which forms have been borrowed into LR and which have been switched to LD. Specifically, if the distribution of null case-marking on LD-origin nouns parallels that of native nouns, we are compelled to recognize even the apparently bare forms as borrowed.
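The logic of this comparison can be illustrated with a minimal Python sketch. It is not the procedure used in chapter 5: the token format, the function names `overt_marking_rates` and `parallels_recipient`, and especially the fixed tolerance are assumptions made purely for illustration. A real variationist analysis would compare the conditioning of the variability as well as the overall rates, and would rely on appropriate statistical tests rather than an arbitrary cutoff.

```python
from collections import defaultdict


def overt_marking_rates(tokens):
    """Return the rate of overt case-marking for each (context, origin) pair.

    tokens: iterable of (origin, context, overtly_marked) triples, where
            origin is "LR" (native noun) or "LD" (lone other-language noun),
            context is a label such as "accusative" or "dative", and
            overtly_marked is a bool. This format is hypothetical.
    """
    marked = defaultdict(int)
    total = defaultdict(int)
    for origin, context, overtly_marked in tokens:
        total[(context, origin)] += 1
        if overtly_marked:
            marked[(context, origin)] += 1
    return {key: marked[key] / total[key] for key in total}


def parallels_recipient(rates, context, tolerance=0.10):
    """True when LD-origin and native rates differ by at most `tolerance`.

    The tolerance is an arbitrary illustrative cutoff, not a value from
    the source.
    """
    gap = abs(rates[(context, "LD")] - rates[(context, "LR")])
    return gap <= tolerance
```

The design point is that the benchmark is the observed rate in the unmixed LR data, not the categorical rule of a prescriptive grammar. A low rate of overt marking on LD-origin nouns counts as evidence of borrowing only when native nouns in the same context are marked at a comparably low rate.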

Chapter 6 examines the identification of bare forms in more detail. Previous chapters relied heavily on morphological integration of borrowed material, finding that it tended to occur early and overwhelmingly in the French-English pair, and although morphology was often absent in Tamil accusative contexts, this too reflected LR marking patterns, signaling that LR grammar was operating on the LD words. But not all languages or contexts lend themselves to such comparisons. For example, in many languages, including Tamil, nominatives are null-marked, rendering this context a coincidence site, i.e., one that is not diagnostic when combined with non-case-marking languages like English. Rejecting the option of simply classifying such bare nouns according to theory-internal criteria, we recognize that their status is inherently ambiguous: absent any overt indication to the contrary, their surface behavior could be as consistent with null-marked borrowings as with single-word code-switches. Chapter 6 explores the use of syntactic criteria to disambiguate even such formally ambiguous elements, illustrating with an analysis of intra-clausal mixing of French with Wolof and Fongbe. These are isolating languages featuring virtually no overt morphology on the noun, obviating the morphological criterion for loanword integration. Here we appeal instead to the syntax of nouns and NPs, focusing on their variable distribution across different types of modification structure.

This chapter expands and systematizes the comparative endeavor adumbrated in chapter 4 by considering in greater detail not just the rates of occurrence of some diagnostic feature but also the variable structure of each of the languages involved in the contact situation. Because the LRs have no history of sociolinguistic description and hence no independent documentation of the way the languages are spoken, this again involves first establishing the (variable and invariant) patterns of NP modification independently in each and then comparing their behavior with that of LD congeners. Lone French-origin nouns are shown to pattern with their unmixed Wolof or Fongbe counterparts in ways far too specific to be due to chance, while at the same time patterning differently from French nouns in unmixed French contexts. The only explanation for these differences from each other and from unmixed LD stems from recognizing that the LD-origin nouns have been borrowed and integrated into different recipients.

Chapter 7 corroborates the findings of foregoing chapters by reviewing a series of replications of the studies reported there on eight typologically distinct language pairs, making use of a wide array of phonological, morphological, and syntactic diagnostics. These include vowel harmony, word order, case-marking, adjectival expression, nominal determination patterns, and verb incorporation strategies, among others. Wherever a conflict site between LR and LD could be determined, lone items were systematically shown to behave like the former, often to the point of assuming the fine details of its variable quantitative conditioning. In the aggregate, the loanword integration hypotheses of chapter 5 were confirmed, although in some cases an option other than the dominant LR strategy was chosen. In the Tunisian Arabic-French pair, for instance, bilinguals avoided morphologically inflecting lone LD-origin items altogether, but in every case the selected avenue of integration favored by the community derived solely from LR and was absent from LD. Such strategies cannot be predicted on the basis of the linguistic characteristics of the languages in contact, foreshadowing the crucial role of the community in establishing and implementing pathways of loanword integration (chapter 11).

Making use of a unique series of speech corpora collected between the 1940s and 2007, chapter 8 traces, for the first time, the diachronic trajectory of nonce forms in bilingual production over a real-time period of 61 years and nearly a century and a half in apparent time. The length of the time frame coupled with the quantity of data together enable us to test and refute two widely embraced standard assumptions about such forms that had heretofore lacked empirical support: 1) lone LD-origin items introduced as nonce words typically increase in frequency and diffusion, and 2) lone LD-origin items are introduced as code-switches and only gradually converted to loanwords as they increase in frequency and diffusion. Adopting a strict definition of nonce word and systematically tracking those that persisted over more than one time period, we show that they generally do not go on to become established loanwords. Nor does it seem possible to predict which ones will “stick,” since no correlations with categories like “core” versus “cultural” or “luxury” versus “need” could be made. This is consistent with the primacy of the speech community in regulating the stock of borrowed words.

Consideration of the linguistic trajectory of those nonce words that did increase in frequency and diffusion over the 61-year period investigated shows that these items are not integrated gradually. On the contrary, perhaps the most startling result of these analyses is that nonce words assume LR grammatical structure abruptly. This suggests that when speakers go to access a lone LD-origin item, they make an instantaneous decision about how to treat it. They may opt to borrow it, in which case they assign it all the requisite LR structure, including its variable properties, independent of considerations of frequency, diffusion, or attestation. The other alternative is to leave it as is. This implies incorporating it along with the grammatical properties associated with LD, a process we have been referring to as code-switching. The research reported in this chapter reveals that this almost never occurs with single words.

Having characterized the borrowing process empirically in its own right, in chapter 9 we confront it explicitly with the multiword (i.e., unambiguous) code-switches produced by the same French-English bilinguals. On each of the diagnostics examined in chapter 8, speakers are shown to treat their language-mixing types in diametrically different ways. They imbue their multiword code-switches with the morphosyntactic structure of LD while integrating their borrowings into that of LR, even to the extent of mirroring its variable patterning. Also measured is speakers’ relative propensity to engage in these mixing types in the first place, reasoning that if code-switching can in fact be equated with nonce borrowing, as many claim, speakers who make copious use of one should be equally likely to use the other. No such correlation could be established, further attesting to the distinction among these different mixing strategies in the behavior of bilinguals. Corroborating evidence comes from three additional language pairs and one triplet, in which, regardless of diagnostic or language, lone LD-origin items, nonce and established, are seen to behave in parallel, and differently from code-switches.

The variable phonetic integration of LD-origin material reported in chapter 4 suggested that this is not a reliable indicator of other-language status, in striking contrast to the morphosyntax (chapters 4 and 8), explaining the conspicuous absence of phonetic diagnostics from ensuing chapters. Capitalizing on a series of methodological innovations, chapter 10 revisits the question of whether speakers marshal phonetic integration as a strategy to distinguish language-mixing types. Observing that a key problem is that much of what is conventionally construed as integrated is involuntary (e.g., because a speaker’s inability to pronounce LD elements results by default in LR realizations), the work reported in chapter 10 targets speakers who are demonstrably able to produce both, thus ensuring that integration is by choice. Systematic comparison of the behavior of individuals, diagnostics, and language-mixing types reveals variability at every level of the phonetic adaptation process, providing strong confirmation of the suggestion of chapter 4 that individuals do not systematically integrate their LD words phonetically into LR. This applies not only when the LD words are nonce borrowings (perhaps unsurprisingly to some) but astonishingly, given well-entrenched assumptions in this regard, to dictionary-attested loanwords as well! Nor do the individuals studied share a phonetic strategy for handling any of their language-mixing types. This is in striking contrast to the morphosyntactic treatment they afford this same LD material when borrowing it: immediate, quasi-categorical, and consistent adaptation community-wide. This confirms that phonetic and morphosyntactic integration are independent. Only the latter is a reliable metric for distinguishing language-mixing types.

How do borrowed words diffuse across the speech community, and what is the role of socio-demographic factors in their adoption and spread? Chapter 11 revisits the results of the first community-based study of this question. Exploiting the methodology described in chapter 2, here we systematically assess the effects of age, gender, social class membership, level of education, individual bilingual proficiency, minority versus majority status, and neighborhood of residence on the adoption and distribution of borrowed material. Making use of a sharedness measure, we also infer the channels of diffusion of specific words: how many borrowed types any pair of individuals uses in common and how many each uses that the other does not. Among the novel findings is that, with respect to the overall rate of borrowed words as a proportion of total vocabulary, social class is more important than either environmental (i.e., neighborhood or language status) effects or individual bilingual proficiency. The social class effect may be equated with normative pressures on speaking “well” (via avoidance of LD-origin material). Yet when we examine the specific preference for nonce borrowing, we find that the norms of the community override not only social class considerations but, more surprisingly, individual bilingual proficiencies too. This striking finding confirms that borrowing behavior is acquired and not simply a function of lexical need. Were it otherwise, individual abilities would outweigh other factors. Instead, both borrowing rates and type correspond to wider community norms, evidenced as (implicit) community-level sanctions against the elevated use of borrowing or as a community-wide preference for a particular type of borrowing.
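
As a rough illustration of the pairwise counts behind such a sharedness measure, the following Python sketch tallies, for every pair of speakers, the borrowed types they have in common and those each uses that the other does not. The speaker labels and word lists are invented for illustration, and the function name `sharedness` is hypothetical; how the study itself combines these counts into a measure is not specified here.

```python
from itertools import combinations

# Hypothetical speakers and borrowed word types, invented for illustration.
BORROWED_TYPES = {
    "speaker_A": {"fun", "job", "party", "boss"},
    "speaker_B": {"fun", "job", "truck"},
    "speaker_C": {"boss"},
}


def sharedness(a: set[str], b: set[str]) -> tuple[int, int, int]:
    """Return (types used by both, types used only by a, types used only by b)."""
    return len(a & b), len(a - b), len(b - a)


# Compare every pair of speakers once.
for first, second in combinations(sorted(BORROWED_TYPES), 2):
    common, only_first, only_second = sharedness(
        BORROWED_TYPES[first], BORROWED_TYPES[second]
    )
    print(f"{first} / {second}: {common} shared, "
          f"{only_first} and {only_second} unshared")
```

Working with sets of types rather than token counts keeps the comparison about which words a pair of speakers share, independent of how often either speaker uses them.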

Chapter 12 concludes with some reflections on the implications of this work for the study of language mixing.


(1.) Codes in parentheses identify corpus, speaker, and address of the utterance. See “Abbreviations and conventions” for transcription and glossing protocols.

(2.) This basically corresponds to Grosjean’s (1982) distinction between speech and language borrowing, but is the opposite of Haspelmath’s (2009) categorization of loanwords and borrowings.