Codeswitching: a bilingual toolkit for opportunistic speech planning.

Anne L. Beatty-Martínez*

  • 1 Department of Psychology, McGill University, Montreal, QC, Canada
  • 2 Department of Language Science, University of California, Irvine, Irvine, CA, United States
  • 3 Center for Language Science, The Pennsylvania State University, University Park, PA, United States
  • 4 Department of Spanish, Italian and Portuguese, The Pennsylvania State University, University Park, PA, United States

The ability to engage in fluent codeswitching is a hallmark of the flexibility and creativity of bilingual language use. Recent discoveries have changed the way we think about codeswitching and its implications for language processing and language control. One is that codeswitching is not haphazard, but subject to unique linguistic and cognitive constraints. Another is that not all bilinguals codeswitch, but those who do exhibit usage patterns conforming to community-based norms. However, less is known about the cognitive processes that regulate and promote the likelihood of codeswitched speech. We review recent empirical studies and provide corpus evidence that highlight how codeswitching serves as an opportunistic strategy for optimizing performance in cooperative communication. From this perspective, codeswitching is part and parcel of a toolkit available to bilingual codeswitching speakers to assist in language production by allowing both languages to remain active and accessible, thereby providing an alternative means to convey meaning, with implications for bilingual speech planning and language control more generally.


Traditionally, the study of codeswitching production and bilingual speech more generally has been carried out within separate disciplines, where cognitive psychologists and psycholinguists have primarily centered on exogenously-cued language switching, 1 and sociolinguists have focused on the analysis of codeswitching patterns within discourse of members of a given speech community. Formal disciplinary differences aside, one recurrent cross-disciplinary finding is that even when highly proficient bilinguals retain full control over the choice of how to use the two languages, switching is cognitively more demanding or costly than staying in one language (e.g., Gollan and Ferreira, 2009; Fricke et al., 2016; Gollan and Goldrick, 2016; cf. Johns and Steuck, 2018). This finding appears counterintuitive given the ubiquity of codeswitching in many bilingual communities, and thus raises the question of why bilinguals codeswitch in the first place. Here we put forth the proposal, based on quantitative analyses of spontaneous codeswitched speech, that codeswitching serves as a toolkit, or an opportunistic strategy for optimizing task performance in cooperative communication. While previous research has focused largely on the costs that codeswitching brings to language processing (Guzzardo Tamargo et al., 2016; Adamou and Shen, 2017; Beatty-Martínez and Dussias, 2017; Byers-Heinlein et al., 2017; for reviews see Van Hell et al., 2015, 2018), we consider the possible advantages that codeswitching may offer to language producers during bilingual language interactions. Critical to this endeavor is the view that codeswitching offers a unique flexibility that is driven by an interplay of bottom-up and top-down processes, but through which resources from both languages are ultimately recruited to convey speakers’ communicative intentions.

We refer to codeswitching patterns as the tendency to switch at particular syntactic or prosodic boundaries, or as proposed by Torres Cacoullos and Travis (2018) “…of the places where bilinguals can switch, where they prefer to do so” (p. 175; see also Poplack, 1993 ). It is important to note that bilingual speakers use their languages in different ways, and not all contexts of language use incur the same cognitive demands in speech production ( Green and Abutalebi, 2013 ; Luk and Bialystok, 2013 ; Green and Wei, 2014 ). Differences in codeswitching experience can affect not only language abilities ( Beatty-Martínez and Dussias, 2017 ; Valdés Kroff et al., 2018 ), but have also been proposed to mediate the relation between language and cognitive processes ( Beatty-Martínez et al., 2019 ). Furthermore, while not all bilinguals regularly codeswitch, those who do exhibit usage patterns conforming to community-based norms ( Beatty-Martínez et al., 2018 ; Torres Cacoullos and Travis, 2018 ; Ramírez Urbaneja, 2019 ).

Although codeswitching serves a variety of discourse functions, intentions to codeswitch are likely subject to pragmatic and interactional constraints. Poplack (1987) compared codeswitching behaviors of Spanish-English Puerto Ricans living in New York City to those of French-English bilinguals in Ottawa-Hull, Canada, and observed differences in how the communities engaged in codeswitching. While Puerto Ricans adopted an open discourse mode, opportunistically threading together words and phrases from each language in order to convey the intended meaning, Ottawa-Hull bilinguals maximized the salience of switch points to fulfill rhetorical functions such as contrast and emphasis (see also Myslín and Levy, 2015, for a similar observation with Czech-English bilinguals). Importantly, these findings suggest that bilinguals may plan speech differently as a function of their communicative goals (Gardner-Chloros et al., 2013).

Codeswitching patterns are also constrained by bilingual ability. Whereas highly proficient bilinguals typically favor complex intra-sentential codeswitches and exhibit greater consistency of codeswitching occurrences, less proficient bilinguals tend to limit switching to freely movable constituents (e.g., tag items such as “I mean” or “you know”; Poplack, 1980 ), and show less voluntary control of their switching behavior ( Lipski, 2014 ). This observation is particularly relevant for bilingual speech planning because it shows that “fluent bilinguals codeswitch because they can, and not because they cannot speak any other way” ( Lipski, 2014 , p. 24). It follows that a better understanding of the processes that mediate codeswitching requires the consideration of bilinguals’ habits of language use as well as the interactional demands of their language environment.

This paper is not intended as a comprehensive review of the literature on codeswitching. Instead, we attempt to take stock of recent empirical findings from spontaneous language use that highlight how codeswitching enables bilinguals to handle cognitively demanding aspects of speech planning. We first consider the influence of bottom-up processes (i.e., structural priming) in codeswitching behavior, and argue that, while codeswitching may be sensitive to priming, bottom-up processes are ultimately modulated by top-down influences so as to convey speakers’ communicative intentions ( Green, 2018 ). As a first approximation, we provide corpus evidence of our own, focusing on complex noun phrases (NPs) in Spanish-English bilinguals who have extensive codeswitching experience, to exemplify how speaker intentions guide production choices in codeswitched speech. While it is beyond the scope of this article to fully evaluate our proposal, we hope to demonstrate the potential of this approach to highlight the value of naturalistic data and improve our understanding of how proficient bilinguals manage to use their two languages opportunistically in production.

The Contributions of Bottom-Up Factors in Codeswitching

Speakers’ production choices are not independent of their past experiences, as evidenced by the tendency (commonly referred to as structural persistence or priming) to reuse structures that they have recently produced or comprehended themselves ( MacDonald, 2013 ; Dell and Chang, 2014 ; Torres Cacoullos and Travis, 2018 ). Priming effects are widespread in spontaneous speech and have been observed both within individual languages (within-language priming) and cross-linguistically, where producing/hearing a structure in one language increases the probability of producing a related structure in the other language (see Pickering and Ferreira, 2008 ; Gries and Kootstra, 2017 , for reviews). Priming has been proposed as an important mechanism for speech planning, serving a facilitative function in processes related to selection and retrieval ( MacDonald, 2013 ). In the case of bilinguals, priming may provide a unique lens with respect to the strength of associations between cross-linguistic representations and the levels of processing at which cross-language activation can occur.

Priming effects are generally stronger when the prime and target are similar, which has led to the hypothesis that words with overlapping form and meaning across languages (e.g., cognates) may precipitate codeswitching ( Clyne, 2003 ; Broersma and de Bot, 2006 ; Broersma, 2009 ; de Bot et al., 2009 ). The logic is that cognate words can enhance the likelihood of a codeswitch by triggering a relatively high degree of cross-language activation, and in so doing, allowing the language system to switch from output in one language to output in another language. Indeed, cross-language priming effects are generally stronger when there is lexical overlap and shared word order across languages ( Kootstra et al., 2010 ), which is congenial to the idea that linguistic representations vary in their degree of activation in bilingual speech production ( Green, 2018 ). In an analysis of the Bangor Miami Corpus ( Deuchar et al., 2014 ), Fricke and Kootstra (2016) found that priming influenced not only the tendency to codeswitch, but the type of codeswitch as well. Importantly, they observed that other-language words, irrespective of whether they share the same word form, influenced the likelihood of codeswitching.

The Fricke and Kootstra (2016) results illustrate how bottom-up processes influence codeswitching behavior. That said, the scope of these effects in explaining codeswitching behavior is likely limited for several reasons. First, cross-language priming is weaker and shorter-lived than within-language priming (Schoonbaert et al., 2007; Travis et al., 2017). In a study of coreferential subject priming, Torres Cacoullos and Travis (2018) reported that within-language priming was nearly four times stronger than cross-language priming. This result is also consistent with the observation of Myslín and Levy (2015) that words are generally more likely to reoccur in the language of most recent mention. Second, it has been established that speakers’ tendency to codeswitch is primed more by their own speech (i.e., within-speaker priming) than by the speech of others (i.e., between-speaker priming, also referred to as comprehension-to-production priming), indicating that priming decreases as a function of the referential distance 2 between the prime and the target (Fricke and Kootstra, 2016; see also Gries, 2005). Lastly, while spontaneous codeswitching is often deemed characteristic of bilingual discourse, the vast majority of utterances bilinguals produce are unilingual. For example, in the Bangor Miami Corpus, Fricke and Kootstra (2016) reported that of the 42,291 utterances bilinguals produced, the bulk of them (94.2%) were in a single language (see also Beatty-Martínez and Dussias, 2019, for the proportion of unilingual and codeswitched NPs across four bilingual corpora). Taken together, these factors provide strong evidence that even habitual codeswitchers produce utterances in one language despite high levels of cross-language activation. Thus, bottom-up processes alone, no matter how robust, are not sufficient to account for codeswitching behavior in its entirety. Below, we consider how the speaker’s intentions may exert top-down control over codeswitching practices to achieve communicative goals.

Codeswitching as a Repair Strategy

The ease of producing speech with little conscious effort and few errors belies the complexity of its underlying cognitive processes. Speech disfluencies (e.g., pauses, false starts, and/or hesitations) are direct evidence of production difficulty ( Arnold et al., 2000 ); the fact that speakers make errors while planning utterances and sometimes correct them evinces the need for monitoring and control in production ( Nozari and Novick, 2017 ). 3 As a result, speakers may learn implicit strategies to mitigate production difficulty ( MacDonald, 2013 ; Dell and Chang, 2014 ). Here, we consider the idea that increased cognitive demands in language production may promote codeswitching as a deus ex machina of sorts: proficient bilinguals who have extensive codeswitching practice resort to such behavior as a way to mitigate speech planning demands that arise during the normal course of developing a speech plan (e.g., MacDonald, 2013 ). For bilinguals, speech planning is subject to the parallel activation of the two languages ( Kroll et al., 2006 ), creating many opportunities for cross-language interference, and increasing the potential for within-language interference ( Abutalebi and Green, 2007 ). Bilinguals must, therefore, develop language regulatory strategies to help them manage the relative activation of the two languages when planning goal-oriented speech ( Bogulski et al., 2019 ). Such strategies may include actively suppressing one language to enable fluent speech in the other language when the desire (or requirement) is to use one language alone, but they may also include codeswitching when the desire is to use both languages opportunistically ( Green, 2018 ).

One way to examine this issue is by identifying the types of phonetic and prosodic variation that arise in codeswitched speech. In an analysis of the Bangor Miami Corpus of Spanish-English codeswitching (Deuchar et al., 2014), Fricke et al. (2016) found that lexical items involving a spontaneously-produced codeswitch had reduced speech rate and were more disfluent, relative to matched unilingual control lexical items. To a large extent, one can view these acoustic features as proxies for production difficulty, where slower speech rate and decreased fluency are associated with reduced automaticity (e.g., Segalowitz, 2010). Fricke et al.’s analysis of voice onset time (VOT) further revealed that low-level phonetic modulations often occur in anticipation of a codeswitch: English voiceless stops /ptk/ were produced with more Spanish-like VOTs the closer they were to Spanish words, suggesting that these processing costs may more adequately reflect changes in the relative activation of the two languages (see also Balukas and Koops, 2015, for a similar result with codeswitching bilinguals from New Mexico). It is possible that these phonetic changes arise due to the unintended activation of the non-target language, forcing the speaker to switch languages to maintain fluidity in the conversation. Conversely, speakers may have a strong desire to switch languages, and the anticipation of the switch leads to a momentary reorganization of the language system.

To dissociate these two explanations, we turn to a recent study by Johns and Steuck (2018) on the prosodic structure of codeswitched speech in the New Mexico Spanish-English Bilingual (NMSEB) corpus (Torres Cacoullos and Travis, 2018). They observed that codeswitching was more likely to occur toward the end of a prosodic sentence, suggesting that harder-to-produce elements, i.e., those that tend to be produced later in utterances (MacDonald, 2013), will often co-occur with codeswitched speech. Critically, however, they also observed faster speech rates within codeswitched prosodic sentences, relative to unilingual control utterances. This latter finding is important because it suggests that codeswitching is not a source of production costs per se. On the contrary, it may help bilingual speakers circumvent difficulties that are inherent to speech planning more generally, which may be why it is more likely to occur toward the end of a planned utterance.

It is important to reiterate that, whereas Johns and Steuck (2018) focused on the speech rate within a prosodic sentence, Fricke et al. (2016) focused on the speech rate of words preceding codeswitches. This contrast reveals how codeswitching may come to affect bilingual speech at different levels of planning and raises the question of how to interpret the production costs observed in Fricke et al.’s study. We believe they reflect a momentary reorganization of the prosodic and phonetic systems, and that this reorganization is driven by a deliberate intent to switch languages. From this perspective, codeswitching serves two important functions in production. First, it enables speakers to negotiate lexical competition in a way that minimizes the impact of within-language and cross-language lexical interference. These prosodic and phonetic changes observed within single lexical items may in turn facilitate planning at higher levels, with the goal of maximizing fluency at the discourse level (see Hopp, 2015 , 2016 , for a similar account on how lexical processing impacts sentence comprehension in bilinguals). Second, the fact that codeswitching leads to systematic variation in speech means that listeners can reliably exploit these cues to facilitate comprehension ( Fricke et al., 2016 ; Guzzardo Tamargo et al., 2016 ; Valdés Kroff et al., 2017 ; Beatty-Martínez, 2019 ; Shen et al., 2020 ).

Codeswitching and the Problem of Variable Equivalence

If codeswitching enables bilinguals to successfully navigate linguistic interference in production, what are the strategies that reliably promote a codeswitch? One possibility is that bilinguals rely on cross-linguistic convergence to ensure that a codeswitch is successfully deployed. Research on codeswitching constraints (e.g., the equivalence constraint; Poplack, 1980) and cross-linguistic priming (see section “The Contributions of Bottom-Up Factors in Codeswitching”) provides some basis for this idea but is insufficient to explain the overall pattern of data available to date. Interestingly, such an account predicts that bilinguals will consistently avoid “conflict sites” (Poplack and Meechan, 1998, p. 132) across the two languages when attempting to switch. But since we have argued that codeswitching is a tool to negotiate speech planning difficulties, we would expect opportunistic use of the languages at sites of variable equivalence, where the languages partially overlap (Torres Cacoullos and Poplack, 2016). One way to tease this apart is by examining the prosodic structure of unilingual and codeswitched speech.

Recent evidence suggests that, to execute a codeswitch, bilinguals strategically employ prosodic distancing at codeswitch junctures where the two languages sometimes differ due to independent, but inherently variable, processes (Torres Cacoullos and Travis, 2018). Like Johns and Steuck (2018), this area of research examines prosodically-transcribed spontaneous bilingual data where the speech stream is segmented not at the boundaries of major syntactic constituents but rather into stretches of speech uttered under a single intonation contour (e.g., intonation units; henceforth, IUs; Du Bois et al., 1993). Prosodic boundaries are perceptually delimited by a set of acoustic features (e.g., a pause, an initial rise in overall pitch level, and final phrase lengthening), and have been presented as evidence that speakers plan their speech in relatively large chunks, corresponding to IUs (Krivokapić, 2012; Bishop and Kim, 2018). Given that it has been argued that speakers plan speech at prosodic boundaries (Krivokapić, 2014), it is likely that linguistic material in the same prosodic unit is planned differently from material occurring in different units.

We illustrate this argument with recent developments in the prosodic positioning of complement clauses. Whereas main clauses typically co-occur in different IUs, main and complement clauses, which share a tighter syntactic relationship, tend to co-occur in the same IU ( Du Bois, 1987 ; Croft, 1995 ; Steuck, 2016 ). Steuck and Torres Cacoullos (2019) observed the same pattern in the speech of Spanish-English bilingual speakers when speaking in either of their two languages. Interestingly, main and complement clauses appeared to be prosodically less integrated when bilinguals codeswitched at the clause boundary, a result that could be interpreted as evidence for prosodic distancing (see example 1a below). However, Steuck and Torres Cacoullos also reported that when codeswitching occurred elsewhere (i.e., within the main or complement clause, see example 1b), the rate of prosodic integration of the two clauses was no different than unilingual IUs. Thus, prosodic distancing is not an inherent consequence of codeswitching, but rather serves as a strategy for negotiating cross-linguistic differences between the two languages: the complementizer “that” is present variably in English, while the complementizer “que” is present always in Spanish ( Torres Cacoullos and Travis, 2018 ).


Perhaps most telling is that bilinguals overwhelmingly prefer to codeswitch at prosodic boundaries rather than within IUs despite cross-linguistic differences ( Shenk, 2006 ; Durán-Urrea, 2012 ; Myslín and Levy, 2015 ). For example, Steuck and Torres Cacoullos (2019) reported that 60% of codeswitches involving main and complement clauses were at the boundary between the two clauses. Plaistowe (2015) extends this pattern more broadly too: in the NMSEB corpus ( Torres Cacoullos and Travis, 2018 ), speakers switched at IU boundaries 93% of the time. Why might this be? We consider the following possibility: the tendency of codeswitching at IU boundaries may reflect the outcome of a competitive process between active items of both languages and where codeswitching is best understood as an opportunistic response of the most active and most easily retrieved items ( Green and Wei, 2014 ). We infer that the pattern will depend, first and foremost, on how speakers manage the relative activation of their languages, as shaped by their habits of language use and the control demands of their interactional context ( Green and Abutalebi, 2013 ; Green and Wei, 2014 ; Beatty-Martínez et al., 2019 ). For example, bilinguals in single-language contexts engage language control competitively (i.e., where language membership is maximized and the activation of one language is suppressed at the expense of the other). In turn, bilinguals in codeswitching contexts engage language control cooperatively (i.e., where language membership is minimized and coactivation is maintained all the way through speech planning so that items from both languages make themselves available for selection).

Codeswitching as an Opportunistic Strategy

Recently, Green and Abutalebi (2013) and Green and Wei (2014) proposed that bilinguals in a dense-codeswitching context make use of processes related to opportunistic planning (e.g., Hayes-Roth and Hayes-Roth, 1979; Patalano and Seifert, 1997), spontaneously taking advantage of unforeseen opportunities to achieve their communicative goals. Despite growing interest in this idea, there is little empirical research directly examining how bilinguals make use of such a strategy in spontaneous discourse. Below we provide evidence for opportunistic planning by examining the production preferences in the modification of complex NPs of Spanish-English bilinguals living in San Juan, Puerto Rico. Before describing the distributions themselves, we provide a brief overview of the interactional context, participants, and data collection methodology. While Spanish remains the predominant language of Puerto Rico, the use of English is loosely supported in many contexts of everyday life (e.g., in education, media, and other societal domains). Importantly, codeswitching is very common among bilinguals, especially those of the younger generations (Casas, 2016; Pousada, 2017; Beatty-Martínez, 2019; Guzzardo Tamargo et al., 2019). It follows that bilinguals in this context may be able to use whichever words and structures are most active to achieve their communicative goals with little-to-no interactional cost (Green and Abutalebi, 2013; Beatty-Martínez et al., 2019). In other words, “their skill lies less in avoiding language conflict than in utilizing the joint activation of both languages and adapting their utterances appropriately” (Green, 2011, p. 2). Codeswitching in this context therefore represents a device for taking advantage of the more efficient of the two languages (Gibson et al., 2019), through which the cost in time and resources can be minimized.

The data under study here were obtained from the Puerto Rico subset of the Codeswitching Map Task (PR-CMT) corpus ( Beatty-Martínez et al., 2018 ; Beatty-Martínez and Dussias, 2019 ; Królikowska et al., 2019 ), a corpus of unscripted, task-oriented dialogs designed to assess codeswitching behaviors in bilingual speakers. The corpus consists of approximately 2.5 h of recordings with 10 Spanish-English bilinguals (6 female). All participants were native Spanish speakers who had acquired Spanish at birth and English either simultaneously or in early childhood. Participants assessed their own proficiency to be equally high in both languages (see Table 1 for a summary of participant characteristics).


Table 1. Participant self-reported characteristics.

Participants also answered questions about overall language exposure to Spanish and English and their frequency of use in various contexts in daily life. As depicted in Figure 1 , participants reported more exposure to Spanish when interacting with family, more exposure to English in the media, but being exposed to both languages equally among friends. Descriptively, these data exemplify how participants’ interactional context supports the use of both languages.


Figure 1. Participants’ self-reported exposure to Spanish and English across different social domains. Ratings were made on a 10-point scale ranging from 0 (no exposure) to 10 (high exposure). Error bars indicate standard error of the mean.

In the map task, director-matcher pairs took turns describing visual scenes (i.e., maps) to one another within a designated time limit. Participants played the role of the director, sitting at a table opposite a confederate matcher who was both a close friend and an in-group member from the same speech community (i.e., San Juan, Puerto Rico). This is important, as previous research has shown that speakers may produce four times as many codeswitches in informal contexts when they are paired with an in-group interlocutor (Poplack, 1983). Furthermore, unlike other guided production tasks where the data distribution is typically controlled and participants are either forced to switch languages or familiarized with object names before the interaction takes place, dialogs were completely unscripted and conversational partners were free to use whichever language they wanted. This sacrifice in experimental control is compensated by the opportunity to offer insights into non-standard language use within the speech community (Sankoff, 1988; Torres Cacoullos and Travis, 2018).

Director and matcher maps differed only in terms of the way the objects were arranged on a computer screen. Visual scenes contained background objects that were fixed; moveable objects were placed in reference to fixed objects, creating the need to describe them in terms of their spatial arrangement (see Figure 2 for an example). Visual maps required to replicate the experiment are included as Supplementary Material Files. All objects were presented in color to elicit more detailed descriptions. Additionally, some objects appeared more than once in the same slide, but with different qualities (e.g., a series of faces differing in their facial expressions; see Gullberg et al., 2009; Pivneva et al., 2012; Valdés Kroff and Fernández-Duque, 2017, for similar procedures) as evidenced in excerpt (2) below:


Figure 2. A visual panel from the Codeswitching Map Task.


Our quantitative analysis abides by the principle of accountability (Labov, 1972), comparing the rate of codeswitching across different types of constructions by contextualizing them with respect to the contexts where they could have occurred but did not (i.e., by circumscribing the variable context; Labov, 2005). This approach has been widely employed in corpus analyses of codeswitched speech by extracting not only codeswitched tokens across the different types of constructions, but also their unilingual counterparts in Spanish and English (Poplack, 1980, 2017; Torres Cacoullos and Travis, 2018; Steuck and Torres Cacoullos, 2019). Table 2 summarizes the distribution of unilingual and mixed NPs extracted from the corpus. We begin by examining the distribution of simple NPs (composed only of a determiner and a noun) across unilingual and mixed phrases. As shown in Table 2, the vast majority of NPs in the corpus were unilingual (Unilingual, Mixed: χ² = 321.14, df = 1, p < 0.001), with roughly half of them produced in Spanish and about a third in English. This finding is congenial to past studies showing that codeswitched utterances constitute a small proportion of corpus data, even in communities where codeswitching is a regular communicative practice (Beatty-Martínez and Dussias, 2017, 2019; Green, 2019). For simple mixed NPs, all but three tokens (“la balloon,” “la guitar,” “the rueda”; English balloon, guitar, and wheel, respectively) were composed of a Spanish masculine determiner and an English noun, replicating the well-documented asymmetry with respect to grammatical gender and switching direction (Poplack, 1980; Valdés Kroff, 2016; Beatty-Martínez et al., 2018; Casielles-Suárez, 2018; cf. Blokzijl et al., 2017).
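The accountable rate described above can be sketched as a simple proportion over the full variable context, that is, over every token where a switch could have occurred. The snippet below is a minimal illustration only: the function name `language_rates` and the ten-token sample are hypothetical and are not drawn from the PR-CMT corpus.

```python
from collections import Counter


def language_rates(tokens):
    """Return the share of each language label among all tokens.

    `tokens` is the full variable context: every NP of the construction
    type, whether it was realized in Spanish, English, or mixed.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical toy sample (NOT corpus data): 5 Spanish, 3 English, 2 mixed NPs.
sample = ["spanish"] * 5 + ["english"] * 3 + ["mixed"] * 2
rates = language_rates(sample)
print(rates["mixed"])  # share of mixed NPs relative to all NPs in the context
```

The point of the design is the denominator: counting only the mixed tokens would say nothing about how often speakers could have switched but did not.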


Table 2. Number and proportion of noun phrase utterances across languages in the PR-CMT corpus.


Table 3. Number and proportion of complex Adj + N/N + Adj constructions across languages in the PR-CMT corpus.

Next, we examine bilinguals’ structural and language choices in the modification of complex NPs (e.g., the black dog), a site of variable equivalence between English and Spanish, relative to the mixed Determiner + Noun baseline shown in Table 2. Critically, examining the distributional patterns of complex mixed NPs will allow us to explore whether there are opportunistic behaviors in how codeswitching bilinguals manage to negotiate their two languages.

In English, adjectives typically precede the noun (Adj + N; e.g., the[Det] yellow[Mod] house[N]). In Spanish, most adjectives are placed post-nominally (N + Adj; e.g., la[Det] casa[N] amarilla[Mod]), although a small group of modifiers occurs prenominally (e.g., quantitative modifiers such as ordinals and cardinals; e.g., la[Det] primera[Mod] casa[N], “the first house”). A further cross-linguistic difference is that English makes use of compounding freely and productively (i.e., N + N constructions such as “the diamond ring”) whereas compounding in Spanish is much more limited, preferring left-headed noun-prepositional-phrase (N + PP) constructions (e.g., “el anillo de diamante”; Liceras et al., 2002; Varela, 2012). Lastly, Spanish differs from English in that Spanish agreement rules require that other grammatical elements (e.g., determiners, adjectives, etc.) match the gender of the noun they modify. Against this background, one possibility is that complex mixed NPs should be generally avoided in contexts that require overt gender marking (e.g., Otheguy and Lapidus, 2003; Balam and Parafita Couto, 2019) or “strictly limited” (Pfaff, 1979, p. 306) due to cross-linguistic differences in word order (for Adj + N and N + Adj constructions) and lexicalization preferences (for N + N and N + PP constructions). If this were the case, we would expect to find a decrease in the proportion of codeswitching in complex NPs relative to the proportion of codeswitching in simple NPs. However, in our data, the opposite is true. 4

While all-Spanish utterances predominate when bilinguals produce simple (Det + N) NPs (Spanish, English: χ² = 40.034, df = 1, p < 0.001; Spanish, Mixed: χ² = 113.39, df = 1, p < 0.001), they are not preferred when modifiers (i.e., adjectives) are used (Spanish, English: χ² = 22.469, df = 1, p = 1.00; Spanish, Mixed: χ² = 3.504, df = 1, p = 0.969), as shown in Table 3. This shift in language choice cannot be due to differences in proficiency or exposure, since Spanish is the native and predominant language of this community of speakers.
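The p-values in this paragraph can be checked against the reported statistics. The following is a minimal sketch that assumes, since the text does not say so explicitly, that each comparison is a directional test on a chi-square statistic with df = 1: the one-sided p-value is half the two-sided tail probability when the difference goes in the predicted direction, and one minus that half otherwise. Under that assumption, a statistic of 3.504 in the non-predicted direction gives approximately 0.969, in line with the value reported above. The function names and direction flags are illustrative.

```python
import math


def chi2_upper_tail_df1(stat):
    """Two-sided p-value: upper tail of a chi-square distribution with df = 1."""
    # For df = 1, P(X > stat) equals erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def one_sided_p(stat, predicted_direction):
    """Directional p-value under the assumed reading of the reported values."""
    half = chi2_upper_tail_df1(stat) / 2.0
    return half if predicted_direction else 1.0 - half


# Statistics reported in the text; the direction flags are an assumption.
print(round(one_sided_p(3.504, predicted_direction=False), 3))   # Spanish vs. Mixed, complex NPs
print(round(one_sided_p(22.469, predicted_direction=False), 3))  # Spanish vs. English, complex NPs
```

On this reading, a p-value near 1 does not mean the two rates are similar; it means the difference runs opposite to the tested direction.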

One potential explanation, following Myslín and Levy (2015), is that the use of English (participants’ less frequent and therefore more salient language) offers a distinct encoding that signals novel information. Such an account would predict an increase in the use of English within complex mixed NPs regardless of the type of modifier and of the type of construction. An alternative hypothesis, and one that we endorse here, is that speakers will adopt strategies from both languages that are advantageous within a given communicative context. In this case, we would expect speakers to prefer the use of prenominal modification strategies (i.e., Adj + N or N + N constructions), which are overwhelmingly preferred in English but can also appear in Spanish with some types of modifiers (e.g., quantitative modifiers). Such a strategy would help disambiguate between competing sources of information in the map task. For example, when referring to duplicate objects such as the gloves displayed in Figure 2, participants could describe the target glove as having a specific color (e.g., “The brown/gray glove” in English or “El guante marrón/gris” in Spanish) or as being made of a specific material (e.g., “The leather/cotton glove” in English or “El guante de cuero/algodón” in Spanish). While it is difficult to determine at which point disambiguation is achieved when using English (i.e., listeners could initially consider other brown/gray items such as the brown purse displayed in the figure), what can be said with more certainty is that for Spanish utterances, disambiguation between the target and non-target gloves cannot be achieved until after the noun is spoken (e.g., el guante marrón/de cuero). Therefore, bilinguals’ language and structural choices should favor prenominalization in duplicate contexts to facilitate referent identification (Fukumura, 2018), and thus optimize task performance.
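The disambiguation argument can be made concrete with a toy incremental listener. The sketch below is illustrative only: the scenes, feature labels, and function names are hypothetical, and the model simply discards candidate referents that do not match each content word as it is heard. It is not a model of the map task itself.

```python
def candidates_after_each_word(scene, utterance):
    """Filter the scene word by word and record how many referents remain."""
    remaining = list(scene)
    counts = []
    for word in utterance:
        remaining = [obj for obj in remaining if word in obj]
        counts.append(len(remaining))
    return counts


def words_until_unique(scene, utterance):
    """1-based position of the first content word that leaves one candidate, or None."""
    counts = candidates_after_each_word(scene, utterance)
    for position, count in enumerate(counts, start=1):
        if count == 1:
            return position
    return None


# Hypothetical scenes: each object is a set of language-neutral features.
gloves_only = [{"glove", "brown"}, {"glove", "gray"}]
with_purse = gloves_only + [{"purse", "brown"}]

prenominal = ["brown", "glove"]   # English-style order: modifier, then noun
postnominal = ["glove", "brown"]  # Spanish-style order: noun, then modifier

for scene in (gloves_only, with_purse):
    print(words_until_unique(scene, prenominal), words_until_unique(scene, postnominal))
```

With only the two gloves in view, the prenominal order singles out the target at the first content word, whereas the post-nominal order needs both words. Once a brown purse is added, both orders need two words, which mirrors the caveat about English noted above.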

Indeed, a comparison of the proportion of complex mixed NPs in duplicate against singleton items confirmed that the proportion of codeswitches was greater for duplicate items (Duplicate, Singleton: χ² = 4.588, df = 1, p = 0.016). Moreover, as the data in Table 4 show, complex mixed NP constructions were overwhelmingly made up of an English prenominal modifier followed by an English noun (e.g., el red car; Prenominal, Post-nominal: χ² = 50.330, df = 1, p < 0.001; English, Spanish: χ² = 47.573, df = 1, p < 0.001), suggesting that the use of prenominalization increased across the board. That said, we note that not all complex mixed NPs were opportunistic, as there was a smaller subset of tokens containing Spanish modifiers after the noun (e.g., el car rojo). Importantly, however, the pattern of results reported here is consistent with the distributions reported for Spanish-English bilinguals in Miami (Parafita Couto and Gullberg, 2019) 5 and Northern Belize (Balam and Parafita Couto, 2019).
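The text does not state how the test statistics were computed. As a hedged illustration of one common option, the sketch below computes a Pearson chi-square statistic (df = 1, without continuity correction) for a 2 × 2 table of the kind implied by the duplicate versus singleton comparison. The counts are placeholders chosen for illustration and are not taken from the corpus; the function name is hypothetical.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (df = 1, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Placeholder counts, NOT corpus data.
# Rows: duplicate / singleton items; columns: complex mixed NPs / other complex NPs.
duplicate_mixed, duplicate_other = 30, 70
singleton_mixed, singleton_other = 15, 85

stat = pearson_chi2_2x2(duplicate_mixed, duplicate_other, singleton_mixed, singleton_other)
print(round(stat, 3))
```

A statistic of zero corresponds to identical proportions in the two rows; larger values indicate a larger difference relative to sampling variability.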


Table 4. Distribution of complex mixed NP modifiers across languages and word order in the PR-CMT corpus.

As we mentioned earlier, quantitative modifiers (N = 32) occur prenominally in Spanish, and as such, these were examined separately. At this point one could speculate that bilinguals simply prefer to produce complex mixed NPs with English modifiers. However, if prenominalization, rather than the use of English per se, is key to bilinguals’ structural and language choices, we should then expect a relative increase in the proportion of Spanish modifiers in complex mixed NPs with quantitative modifiers. And, indeed, this is what we observe in Table 5 (Quantitative, Non-Quantitative: χ² = 46.178, df = 1, p < 0.001). Moreover, Spanish modifiers were more prevalent relative to English modifiers in this context (Spanish, English: χ² = 11.281, df = 1, p < 0.001), demonstrating that bilinguals will capitalize on the dominant language when it converges with the optimal strategy (i.e., prenominalization).


Table 5. Distribution of Spanish and English quantitative modifiers in complex mixed NPs in the PR-CMT corpus.

The second pattern of results concerns bilinguals’ structural and language choices in N + N and N + PP constructions. Recall that N + N compounds are highly productive in English but dispreferred in Spanish; the opposite is true for N + PP constructions. Notwithstanding, when bilinguals codeswitch, they are able to opportunistically make use of both Spanish and English strategies. Following the same logic as described above, one possibility is that bilinguals will show a preference for English lexicalization strategies, given that the use of the N + N construction allows the speaker to focus on what is perhaps more important or conceptually salient earlier in the utterance (MacDonald, 2013; Fukumura, 2018). Because Spanish is the dominant language, we can interpret the switch from Spanish into English in mixed N + N constructions as reflecting an opportunistic response, suggesting that the English strategy was most active and most easily retrieved. As Table 6 shows, bilinguals are actively making use of the N + N construction. In all codeswitched tokens, both the head noun and the modifier were produced in English and were preceded by a Spanish masculine determiner. Remarkably, the rate of mixed N + N constructions is nearly identical to that of unilingual English utterances (English, Mixed: χ² = 0.115, df = 1, p = 0.367) and is higher than the codeswitching rate reported previously (N + N, Adj + N: χ² = 6.662, df = 1, p = 0.005). We speculate that this increase may be related to chunking, the process by which frequently co-occurring sequences of words are grouped together in cognitive representation (Bybee, 2013; Christiansen and Chater, 2016). Because chunking is a gradient phenomenon, Adj + N and N + N constructions (e.g., “blue shoe” and “tennis shoe,” respectively) can be conceptualized as falling on a continuum, where instances with stronger collocational associations are more likely to be accessed as a single unit rather than compositionally (Bybee, 2010).


Table 6. Number and proportion of N + N constructions across languages in the PR-CMT corpus.

Consistent with the prediction that bilinguals would capitalize on language structures with prenominal modification, N + N constructions are produced at a much higher rate in the corpus relative to N + PP constructions (N + N, N + PP: χ² = 37.895, df = 1, p < 0.001). As shown in Table 7, the majority of N + PP constructions were produced in Spanish (Spanish, Mixed: χ² = 3.062, df = 1, p = 0.040). This can be taken as further evidence for how bilinguals are able to accommodate their production choices to optimize task performance. Notwithstanding, we do not take this finding to indicate that bilinguals disregard the use of Spanish-preferred constructions when codeswitching. The few codeswitches that did occur in the corpus indicate that bilinguals do consider and make use of alternative forms of expression that would be competing in monolingual contexts. We believe that, in this particular communicative context, N + PP constructions serve as a “just-in-time” or deus ex machina resource to circumvent potential pitfalls of the speech plan. An important implication is that bilinguals can use (or switch into) one language while the other language stands at the ready as future challenges and opportunities emerge.


Table 7. Number and proportion of N + PP constructions across languages in the PR-CMT corpus.

Altogether, these data provide initial empirical support for opportunistic planning during codeswitching. Contrary to the prediction that bilinguals would avoid switching in contexts of variable equivalence due to differences in word order and lexicalization preferences, we observed increased rates of codeswitching despite any potential costs, consistent with Steuck and Torres Cacoullos (2019). This finding also speaks to bilinguals’ intention to codeswitch as a means to achieve their communicative goals. Specifically, we observed that codeswitching bilinguals capitalize on what is optimal for the current situation (i.e., prenominal modification) by switching languages when circumstances call for such a change. Codeswitching thus may serve as an opportunistic strategy to make use of whatever is most readily available, all the while conforming to the goals of the speaker.

Closing Remarks

The studies reviewed here, together with the data we examined, provide critical evidence for the way in which the language system is controlled. In line with contemporary theoretical models of bilingual speech production and language control (Green, 2011, 2018, 2019; Green and Abutalebi, 2013; Green and Wei, 2014), these data support the notion of a cooperative control state, where both languages may openly contribute to production. This stands in contrast with other forms of language use in which language control is engaged competitively and where the “gate” for non-target language items is locked (Green and Wei, 2014, p. 502). Although research on bilingual language production has shown that bilinguals demonstrate difficulties in language fluency, due perhaps to reduced functional use of the languages (e.g., Gollan et al., 2008), increased cross-language competition (e.g., Sullivan et al., 2018), or limited proficiency (Bialystok et al., 2008), our data suggest that codeswitching might aid language fluency by allowing both languages to remain active and accessible, and therefore providing an alternative means to convey meaning. It remains to be determined what the role of cognitive control is in spontaneous codeswitched speech relative to unilingual speech (Nozari and Novick, 2017). For now, we note that while such flexibility may not be impervious to production costs that arise during normal speech production (e.g., Green, 2019), having the option to either explore or restrict language control states throughout the planning process may alleviate many cognitive demands. In this way, this finding provides support for the more general notion that speakers adopt implicit strategies to mitigate production difficulty (MacDonald, 2013). While the precise mechanisms underlying codeswitching are yet to be fully understood, we hope this will be an active area of research in years to come.

Data Availability Statement

All datasets generated for this study are included in the article/Supplementary Material.

Ethics Statement

The studies involving human participants were reviewed and approved by The Pennsylvania State University Institutional Review Board (Approval Number: 34810). The participants provided their written informed consent to participate in this study.

Author Contributions

All authors equally conceived the theory and hypotheses presented here and wrote the manuscript.

Funding

The writing of this manuscript was supported in part by NIH grant F32-AG064810 to AB-M, NSF Grant BCS-1824072 and NIH Grant F31HD098783 to CN-T, and by NSF Grants BCS-15351241 and OISE-1545900 to PD.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Acknowledgments

We are grateful to Rena Torres Cacoullos for helpful comments and discussions during the preparation of this manuscript. We would also like to thank the editor and the reviewers for their insightful comments on earlier versions of the manuscript.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2020.01699/full#supplementary-material

PRESENTATION S1 | Director slides for the Codeswitching Map Task.

PRESENTATION S2 | Matcher slides for the Codeswitching Map Task.

Footnotes

  1. Although cued language-switching studies provide a direct bridge to the more general phenomenon of task switching and non-verbal cognitive control (e.g., Monsell, 2003; Prior and Gollan, 2011; Zhang et al., 2015), whether the same cognitive and neural processes that underlie cued language switching are also deployed for spontaneously-produced codeswitches is an open question. For present purposes, we treat language switching and codeswitching as qualitatively different phenomena, and thus focus exclusively on codeswitching research.
  2. Under usage-based approaches, priming effects are typically evaluated in terms of “referential distance” (Givón, 1983; Myhill, 2005, p. 473), where distance is measured in terms of the number of intervening clauses between the target and the previous mention of the referent as well as the presence or absence of intervening human subjects (Torres Cacoullos and Travis, 2018).
  3. Recently there have been a number of studies that have examined disfluencies in codeswitched speech while reading aloud (e.g., Gollan and Goldrick, 2016; Gollan et al., 2017; Halberstadt, 2017). However, it is beyond the scope of this article to determine the extent to which the cognitive processes engaged in a reading-aloud paradigm are generalizable to spontaneous speech production (cf. Guaïtella, 1999).
  4. A reviewer raised the possibility that the absence of English-to-Spanish mixed NPs might affect the predictions regarding bilinguals’ production choices. We hypothesize that where codeswitching norms differ, opportunistic strategies may manifest differently. The codeswitching patterns of Nicaraguan bilinguals are an interesting test case as they seem to differ from other Spanish-English bilingual communities, exhibiting a marked preference for English determiners in simple mixed NPs (e.g., “the perro” instead of “el dog”; Blokzijl et al., 2017). Given that prenominal modification is most optimal (in terms of greater discriminatory efficiency), the more opportunistic strategy would be to avoid switching within complex mixed NP structures altogether (preferring unilingual English complex NPs instead). Our hope is that the proposal put forth here will inform and shape future research directions.
  5. Note that this study also reports a similar pattern for two other language pairs (Welsh-English and Papiamento-Dutch) with the same conflict regarding the relative order of the adjective and the noun.

References

Abutalebi, J., and Green, D. W. (2007). Bilingual language production: the neurocognition of language representation and control. J. Neurolinguist. 20, 242–275. doi: 10.1016/j.jneuroling.2006.10.003

Adamou, E., and Shen, X. R. (2017). There are no language switching costs when codeswitching is frequent: processing and language switching costs. Int. J. Biling. 23, 53–70. doi: 10.1177/1367006917709094

Arnold, J. E., Losongco, A., Wasow, T., and Ginstrom, R. (2000). Heaviness vs. newness: the effects of structural complexity and discourse status on constituent ordering. Language 76, 28–55. doi: 10.1353/lan.2000.0045

Balam, O., and Parafita Couto, M. C. (2019). Adjectives in Spanish/English code-switching: avoidance of grammatical gender in bi/multilingual speech. Spanish in Context 16, 21–35. doi: 10.1075/sic.00034.bal

Balukas, C., and Koops, C. (2015). Spanish-English bilingual voice onset time in spontaneous code-switching. Int. J. Biling. 19, 423–443. doi: 10.1177/1367006913516035

Beatty-Martínez, A. L. (2019). Revisiting Spanish Grammatical Gender In Monolingual And Bilingual Speakers: Evidence From Event-Related Potentials And Eye-Movements , Doctoral thesis, The Pennsylvania State University, University Park, PA.

Beatty-Martínez, A. L., and Dussias, P. E. (2017). Bilingual experience shapes language processing: evidence from codeswitching. J. Mem. Lang. 95, 173–189. doi: 10.1016/j.jml.2017.04.002

Beatty-Martínez, A. L., and Dussias, P. E. (2019). Revisiting masculine and feminine grammatical gender in Spanish: linguistic, psycholinguistic, and neurolinguistic evidence. Front. Psychol. 10:751. doi: 10.3389/fpsyg.2019.00751

Beatty-Martínez, A. L., Navarro-Torres, C. A., Dussias, P. E., Bajo, M. T., Guzzardo Tamargo, R. E., and Kroll, J. F. (2019). Interactional context mediates the consequences of bilingualism for language and cognition. J. Exp. Psychol. Learn. Mem. Cogn. 46, 1022–1047. doi: 10.1037/xlm0000770

Beatty-Martínez, A. L., Valdés Kroff, J. R., and Dussias, P. E. (2018). From the field to the lab: a converging methods approach to the study of codeswitching. Languages 3:2. doi: 10.3390/languages3020019

Bialystok, E., Craik, F. I. M., and Luk, G. (2008). Lexical access in bilinguals: Effects of vocabulary size and executive control. J. Neurolinguist. 21, 522–538. doi: 10.1016/j.jneuroling.2007.07.001

Bishop, J., and Kim, B. (2018). Anticipatory shortening: articulation rate, phrase length, and lookahead in speech production. Paper Presented at the 9th International Conference on Speech Prosody , Poznan.

Blokzijl, J., Deuchar, M., and Parafita Couto, M. C. (2017). Determiner asymmetry in mixed nominal constructions: the role of grammatical factors in data from Miami and Nicaragua. Languages 2:4. doi: 10.3390/languages2040020

Bogulski, C. A., Bice, K., and Kroll, J. F. (2019). Bilingualism as a desirable difficulty: advantages in word learning depend on regulation of the dominant language. Biling. Lang. Cognit. 22, 1052–1067. doi: 10.1017/S1366728918000858

Broersma, M. (2009). Triggered codeswitching between cognate languages. Biling. Lang. Cognit. 12, 447–462. doi: 10.1017/S1366728909990204

Broersma, M., and de Bot, K. (2006). Triggered codeswitching: a corpus-based evaluation of the original triggering hypothesis and a new alternative. Biling. Lang. Cognit. 9, 1–13. doi: 10.1017/S1366728905002348

Bybee, J. (2010). Language, Usage, And Cognition. New York, NY: Cambridge University Press.

Bybee, J. (2013). “Usage-based theory and exemplar representation,” in The Oxford Handbook of Construction Grammar , eds T. Hoffman and G. Trousdale (Oxford: Oxford University Press), 49–69. doi: 10.1093/oxfordhb/9780195396683.013.0004

Byers-Heinlein, K., Morin-Lessard, E., and Lew-Williams, C. (2017). Bilingual infants control their languages as they listen. Proc. Natl. Acad. Sci. U.S.A. 114, 9032–9037. doi: 10.1073/pnas.1703220114

Casas, M. P. (2016). “Codeswitching and identity among Island Puerto Rican bilinguals,” in Spanish-English codeswitching in the Caribbean and the U.S., eds R. E. Guzzardo Tamargo, C. M. Mazak, and M. C. Parafita Couto (Amsterdam: John Benjamins), 37–60. doi: 10.1075/ihll.11.02per

Casielles-Suárez, E. (2018). Gender assignment to Spanish-English mixed DPs: singleton vs. multiword switches. Span. Context 15, 392–416. doi: 10.1075/sic.00020.cas

Christiansen, M. H., and Chater, N. (2016). The now-or-never bottleneck: a fundamental constraint on language. Behav. Brain Sci. 39:E62. doi: 10.1017/S0140525X1500031X

Clyne, M. G. (2003). Dynamics of Language Contact: English and Immigrant Languages. Cambridge, MA: Cambridge University Press.

Croft, W. (1995). Intonation units and grammatical structure. Linguistics 33, 839–882. doi: 10.1515/ling.1995.33.5.839

de Bot, K., Broersma, M., and Isurin, L. (2009). “Sources of triggering in code switching,” in Multidisciplinary Approaches To Code Switching , eds L. Isurin, D. Winford, and K. de Bot (Amsterdam: John Benjamins), 85–102. doi: 10.1075/sibil.41.07bot

Dell, G. S., and Chang, F. (2014). The P-chain: relating sentence production and its disorders to comprehension and acquisition. Philos. T. Roy. Soc. B. 369, 20120394. doi: 10.1098/rstb.2012.0394

Deuchar, M., Davies, P., Herring, J., Parafita Couto, M. C., and Carter, D. (2014). “Building bilingual corpora,” in Advances in The Study Of Bilingualism , eds E. M. Thomas and I. Mennen (Bristol: Multilingual Matters), 93–110.

Du Bois, J. W. (1987). The discourse basis of ergativity. Language 64, 805–855. doi: 10.2307/415719

Du Bois, J. W., Schuetze-Coburn, S., Cumming, S., and Paolino, D. (1993). “Outline of discourse transcription,” in Talking Data: Transcription and Coding In Discourse Research , eds J. Edwards and M. Lampert (Hillsdale, NJ: Lawrence Erlbaum), 45–89.

Durán-Urrea, D. E. (2012). A Community-Based Study Of Social, Prosodic, And Syntactic Factors In Code-Switching , Doctoral thesis, The Pennsylvania State University, University Park, PA.

Fricke, M., and Kootstra, G. J. (2016). Primed codeswitching in spontaneous bilingual dialogue. J. Mem. Lang. 91, 181–201. doi: 10.1016/j.jml.2016.04.003

Fricke, M., Kroll, J. F., and Dussias, P. E. (2016). Phonetic variation in bilingual speech: a lens for studying the production-comprehension link. J. Mem. Lang. 89, 110–137. doi: 10.1016/j.jml.2015.10.001

Fukumura, K. (2018). Ordering adjectives in referential communication. J. Mem. Lang. 101, 37–50. doi: 10.1016/j.jml.2018.03.003

Gardner-Chloros, P., McEntee-Atalianis, L., and Paraskeva, M. (2013). Code-switching and pausing: an interdisciplinary study. Int. J. Multiling. 10, 1–26. doi: 10.1080/14790718.2012.657642

Gibson, E., Futrell, R., Piantadosi, S. T., Dautriche, I., Mahowald, K., Bergen, L., et al. (2019). How efficiency shapes human language. Trends Cogn. Sci. 23, 389–407. doi: 10.1016/j.tics.2019.02.003

Givón, T. (1983). “Topic continuity in spoken English,” in Topic Continuity In Discourse: A Quantitative Cross-Linguistic Study , ed. T. Givón (Amsterdam: John Benjamins), 343–363.

Gollan, T. H., and Ferreira, V. S. (2009). Should I stay or should I switch? A cost-benefit analysis of voluntary language switching in young and aging bilinguals. J. Exp. Psychol. Learn. Mem. Cogn. 35, 640–665. doi: 10.1037/a0014981

Gollan, T. H., and Goldrick, M. (2016). Grammatical constraints on language switching: language control is not just executive control. J. Mem. Lang. 90, 177–199. doi: 10.1016/j.jml.2016.04.002

Gollan, T. H., Montoya, R. I., Cera, C., and Sandoval, T. C. (2008). More use almost always means a smaller frequency effect: Aging, bilingualism, and the weaker links hypothesis. J. Mem. Lang. 58, 787–814. doi: 10.1016/j.jml.2007.07.001

Gollan, T. H., Stasenko, A., Li, C., and Salmon, D. P. (2017). Bilingual language intrusions and other speech errors in Alzheimer’s disease. Brain Cogn. 118, 27–44. doi: 10.1016/j.bandc.2017.07.007

Green, D. (2011). Language control in different contexts: the behavioral ecology of bilingual speakers. Front. Psychol. 2:103. doi: 10.3389/fpsyg.2011.00103

Green, D. W. (2018). Language control and code-switching. Languages 3:8. doi: 10.3390/languages3020008

Green, D. W. (2019). “Language control and attention during conversation: An exploration,” in The Handbook of the Neuroscience of Multilingualism , eds J. W. Schwieter and M. Paradis (Hoboken, NJ: Wiley and Sons), 427–446. doi: 10.1002/9781119387725.ch21

Green, D. W., and Abutalebi, J. (2013). Language control in bilinguals: the adaptive control hypothesis. J. Cogn. Psychol. 25, 515–530. doi: 10.1080/20445911.2013.796377

Green, D. W., and Wei, L. (2014). A control process model of code-switching. Lang. Cogn. Neurosci. 29, 499–511. doi: 10.1080/23273798.2014.882515

Gries, S. T. (2005). Syntactic priming: a corpus-based approach. J. Psycholinguist. Res. 34, 365–399. doi: 10.1007/s10936-005-6139-3

Gries, S. T., and Kootstra, G. J. (2017). Structural priming within and across languages: a corpus-based perspective. Biling. Lang. Cognit. 20, 235–250.

Guaïtella, I. (1999). Rhythm in speech: What rhythmic organizations reveal about cognitive processes in spontaneous speech production versus reading aloud. J. Pragmat. 31, 509–523. doi: 10.1016/s0378-2166(98)00079-4

Gullberg, M., Indefrey, P., and Muysken, P. (2009). “Research techniques for the study of code-switching,” in The Cambridge Handbook Of Linguistic Code-Switching , eds B. Bullock and A. J. Toribio (Cambridge: Cambridge University Press), 21–39. doi: 10.1017/CBO9780511576331.003

Guzzardo Tamargo, R. E., Loureiro-Rodríguez, V., Acar, E. F., and Vélez Avilés, J. (2019). Attitudes in progress: Puerto Rican youth’s opinions on monolingual and code-switched language varieties. J. Multiling. Multicul. 40, 304–321. doi: 10.1080/01434632.2018.1515951

Guzzardo Tamargo, R. E., Valdés Kroff, J. R., and Dussias, P. E. (2016). Examining the relationship between comprehension and production processes in code-switched language. J. Mem. Lang. 89, 138–161. doi: 10.1016/j.jml.2015.12.002

Halberstadt, L. P. (2017). Investigating Community Norms And Linguistic Mechanisms In Codeswitching: Bridging Linguistic Theory And Psycholinguistic Experimentation , Doctoral dissertation, The Pennsylvania State University, University Park, PA.

Hayes-Roth, B., and Hayes-Roth, F. (1979). A cognitive model of planning. Cognit. Sci. 3, 275–310. doi: 10.1016/S0364-0213(79)80010-5

Hopp, H. (2015). Individual differences in the second language processing of object-subject ambiguities. Appl. Psycholinguist. 36, 29–73. doi: 10.1017/S0142716413000180

Hopp, H. (2016). Learning (not) to predict: grammatical gender processing in second language acquisition. Sec. Lang. Res. 32, 277–307. doi: 10.1177/0267658315624960

Johns, M. A., and Steuck, J. (2018). Evaluating “easy first” in codeswitching: a corpus approach. Paper Presented at 9th International Workshop on Spanish Sociolinguistics , Queens College, NY.

Kootstra, G. J., van Hell, J. G., and Dijkstra, T. (2010). Syntactic alignment and shared word order in code-switched sentence production: evidence from bilingual monologue and dialogue. J. Mem. Lang. 63, 210–231. doi: 10.1016/j.jml.2010.03.006

Krivokapić, J. (2012). “Prosodic planning in speech production,” in Speech Planning and Dynamics , eds S. Fuchs, M. Weihrich, D. Pape, and P. Perrier (München: Peter Lang), 157–190.

Krivokapić, J. (2014). Gestural coordination at prosodic boundaries and its role for prosodic structure and speech planning processes. Philos. Trans. R. Soc. B 369:20130397. doi: 10.1098/rstb.2013.0397

Królikowska, M., Bierings, E., Beatty-Martínez, A., Navarro-Torres, C., Dussias, P., and Parafita Couto, M. C. (2019). “Gender-assignment strategies within the bilingual determiner phrase: four Spanish-English communities examined,” in Poster Presented at the 3rd Conference on Bilingualism in the Hispanic and Lusophone World (BHL) , Leiden.

Kroll, J. F., Bobb, S. C., and Wodniecka, Z. (2006). Language selectivity is the exception, not the rule: arguments against a fixed locus of language selection in bilingual speech. Biling. Lang. Cognit. 9, 119–135. doi: 10.1017/S1366728906002483

Labov, W. (1972). Sociolinguistic Patterns. Oxford: Basil Blackwell.

Labov, W. (2005). “Quantitative reasoning in linguistics,” in Sociolinguistics/Soziolinguistik: An International Handbook of the Science of Language and Society , Vol. 1, eds U. Ammon, N. Dittmar, K. J. Mattheier, and P. Trudgill (Berlin: Mouton de Gruyter), 6–22.

Liceras, J. M., Díaz, L., and Saloma-Robertson, T. (2002). “The compounding parameter and the word-marker hypothesis: accounting for adult L2 acquisition of Spanish N-N compounding,” in The Acquisition of Spanish Morphosyntax , eds A. T. Pérez-Leroux and J. M. Liceras (Dordrecht: Kluwer Academic Publishers), 1–32.

Lipski, J. M. (2014). Spanish-English code-switching among low-fluency bilinguals: towards an expanded typology. Sociolinguist. St. 8, 23–55. doi: 10.1558/sols.v8i1.23

Luk, G., and Bialystok, E. (2013). Bilingualism is not a categorical variable: Interaction between language proficiency and usage. J. Cogn. Psychol. 25, 605–621. doi: 10.1080/20445911.2013.795574

MacDonald, M. C. (2013). How language production shapes language form and comprehension. Front. Psychol. 4:226. doi: 10.3389/fpsyg.2013.00226

Meechan, M., and Poplack, S. (1995). Orphan categories in bilingual discourse: Adjectivization strategies in Wolof-French and Fongbe-French. Lang. Var. Change 7, 169–194. doi: 10.1017/S0954394500000971

Monsell, S. (2003). Task switching. Trends Cogn. Sci. 7, 134–140. doi: 10.1016/s1364-6613(03)00028-7

Myhill, J. (2005). “Quantitative methods of discourse analysis,” in Quantitative Linguistik: Ein Internationales Handbuch, eds R. Köhler, G. Altmann, and R. Piotrowski (Berlin: Walter de Gruyter), 471–497.

Myslín, M., and Levy, R. (2015). Code-switching and predictability of meaning in discourse. Language 91, 871–905. doi: 10.1353/lan.2015.0068

Nozari, N., and Novick, J. M. (2017). Monitoring and control in language production. Curr. Dir. Psychol. Sci. 26, 403–410. doi: 10.1177/0963721417702419

Otheguy, R., and Lapidus, N. (2003). “An adaptive approach to noun gender in New York contact Spanish,” in A Romance Perspective on Language Knowledge and Use , eds R. Núñez-Cedeño and R. Cameron (Amsterdam: John Benjamins), 209–232.

Parafita Couto, M. C., and Gullberg, M. (2019). Code-switching within the noun phrase: evidence from three corpora. Int. J. Biling. 23, 695–714. doi: 10.1177/1367006917729543

Patalano, A. L., and Seifert, C. M. (1997). Opportunistic planning: being reminded of pending goals. Cogn. Psychol. 34, 1–36. doi: 10.1006/cogp.1997.0655

Pfaff, C. (1979). Constraints on language mixing: intrasentential code-switching and borrowing in Spanish/English. Language 55, 291–318. doi: 10.2307/412586

Pickering, M. J., and Ferreira, V. S. (2008). Structural priming: a critical review. Psychol. Bull. 134, 427–459. doi: 10.1037/0033-2909.134.3.427

Pivneva, I., Palmer, C., and Titone, D. (2012). Inhibitory control and L2 proficiency modulate bilingual language production: evidence from spontaneous monologue and dialogue speech. Front. Psychol. 3:57. doi: 10.3389/fpsyg.2012.00057

Plaistowe, J. (2015). Coordinated Code-Switching? An Investigation Of Language Selection In Bilingual Conversation , Honors thesis, Australian National University, Canberra, AU.

Poplack, S. (1980). Sometimes I’ll start a sentence in Spanish y termino en español: toward a typology of codeswitching. Linguistics 18, 581–618. doi: 10.1515/ling.1980.18.7-8.581

Poplack, S. (1983). “Bilingual competence: Linguistic interference or grammatical integrity?,” in Spanish in the US Setting: Beyond the Southwest , ed. E. Olivares (Arlington: National Clearinghouse for Bilingual Education), 107–131.

Poplack, S. (1987). “Contrasting patterns of code-switching in two communities,” in Aspects of Multilingualism: Proceedings of the Fourth Nordic Symposium on Bilingualism, eds E. Wande, J. Anward, B. Nordberg, L. Steensland, and M. Thelander (Uppsala: Borgström), 51–77.

Poplack, S. (1993). “Variation theory and language contact,” in American dialect research , ed. D. R. Preston (Amsterdam: John Benjamins), 251–286. doi: 10.1075/z.68.13pop

Poplack, S. (2017). Loanwords in the Speech Community and in the Grammar. Oxford: Oxford University Press.

Poplack, S., and Meechan, M. (1998). Introduction: how languages fit together in codemixing. Int. J. Biling. 2, 127–138. doi: 10.1177/136700699800200201

Pousada A. (ed.) (2017). Being Bilingual in Borinquen: Student Voices from the University of Puerto Rico. Newcastle upon Tyne: Cambridge Scholars Publishing.

Prior, A., and Gollan, T. H. (2011). Good language-switchers are good task-switchers: evidence from Spanish-English and Mandarin-English bilinguals. J. Int. Neuropsych. Soc. 17, 682–691. doi: 10.1017/S1355617711000580

Ramírez Urbaneja, D. (2019). ¿Tú tienes una little pumpkin?’ Mixed noun phrases in Spanish-English bilingual children and adults. Int. J. Biling 1–16. doi: 10.1177/1367006919888580

Sankoff, D. (1988). “Variable rules,” in Sociolinguistics: An International Handbook of the Science of Language and Society , Vol. 2, eds U. Ammon, N. Dittmar, and K. J. Mattheier (Berlin: Walter de Gruyter), 984–997.

Schoonbaert, S., Hartsuiker, R. J., and Pickering, M. J. (2007). The representation of lexical and syntactic information in bilinguals: evidence from syntactic priming. J. Mem. Lang. 56, 153–171. doi: 10.1016/j.jml.2006.10.002

Segalowitz, N. (2010). Cognitive Bases Of Second Language Fluency. New York, NY: Routledge, doi: 10.4324/9780203851357

Shen, A., Gahl, S., and Johnson, K. (2020). Didn’t hear that coming: effects of withholding phonetic cues to code-switching. Biling. Lang. Cognit. 1–12 doi: 10.1017/S1366728919000877

Shenk, P. S. (2006). The interactional and syntactic importance of prosody in Spanish- English bilingual discourse. Int. J. Biling. 10, 179–205. doi: 10.1177/13670069060100020401

Steuck, J. (2016). “Exploring the syntax-semantics-prosody interface: complement clauses in conversation,” in Inquiries in Hispanic Linguistics: From Theory To Empirical Evidence , eds A. Cuza, L. Czerwionka, and D. Olson (Amsterdam: John Benjamins), 73–94. doi: 10.1075/ihll.12.05ste

Steuck, J., and Torres Cacoullos, R. (2019). “Complementing in another language: prosody and code-switching,” in Language Variation: European Perspectives, VII , eds J. A. Villena Ponsoda, F. D. Montesinos, A. Ávila Muños, and M. Vida Castro (Amsterdam: John Benjamins), 217–230. doi: 10.1075/silv.22.14ste

Sullivan, M., Poarch, G., and Bialystok, E. (2018). Why is lexical retrieval slower for bilinguals? Evidence from picture naming. Biling. Lang. Cogn. 21, 479–488. doi: 10.1017/S1366728917000694

Torres Cacoullos, R., and Poplack, S. (2016). Code-Switching In Spontaneous Bilingual Speech. Alexandria: National Science Foundation.

Torres Cacoullos, R., and Travis, C. E. (2018). Bilingualism in the Community: Code-Switching And Grammars In Contact. Cambridge, MA: Cambridge University Press.

Travis, C. E., Torres Cacoullos, R., and Kidd, E. (2017). Cross-language priming: a view from bilingual speech. Biling. Lang. Cognit. 20, 283–298. doi: 10.1017/S1366728915000127

Valdés Kroff, J. R. (2016). “Mixed NPs in Spanish-English bilingual speech: using a corpus-based approach to inform models of sentence processing,” in Spanish-English Code-Switching In The Caribbean and the US , eds R. E. Guzzardo Tamargo, C. M. Mazak, and M. C. Parafita Couto (Amsterdam: John Benjamins), 281–300. doi: 10.1075/ihll.11.12val

Valdés Kroff, J. R., Dussias, P. E., Gerfen, C., Perrotti, L., and Bajo, M. T. (2017). Experience with code-switching modulates the use of grammatical gender during sentence processing. Linguist. Approaches Biling. 7, 163–198. doi: 10.1075/lab.15010.val

Valdés Kroff, J. R., and Fernández-Duque, M. (2017). “Experimentally inducing Spanish- English code-switching: a new conversation paradigm,” in Multidisciplinary Approaches To Bilingualism in the Hispanic and Lusophone world , eds K. Bellamy, M. Child, P. González González, A. Muntendam, and M. C. Parafita Couto (Amsterdam: John Benjamins), 209–231. doi: 10.1075/ihll.13.09val

Valdés Kroff, J. R., Guzzardo Tamargo, R. E., and Dussias, P. E. (2018). Experimental contributions of eye-tracking to the understanding of comprehension processes while hearing and reading code-switches Linguist. Approach. Biling. 8, 98–133. doi: 10.1075/lab.16011.val

Van Hell, J. G., Fernández, C. B., Kootstra, G. J., Litcofsky, K. A., and Ting, C. Y. (2018). Electrophysiological and experimental-behavioral approaches to the study of intra-sentential code-switching. Linguist. Approach. Biling. 8, 134–161. doi: 10.1075/lab.16010.van

Van Hell, J. G., Litcofsky, K. A., and Ting, C. Y. (2015). “Intra-sentential code-switching: cognitive and neural approaches,” in The Cambridge Handbook Of Bilingual Processing , ed. J. W. Schwieter (Cambridge: Cambridge University Press), 459–482. doi: 10.1017/CBO9781107447257.020

Varela, S. (2012). “Derivation and compounding,” in The Handbook of Hispanic Linguistics , eds J. I. Hualde, A. Olarrea, and E. O’Rourke (Oxford: Blackwell), 209–226. doi: 10.1002/9781118228098.ch11

Zhang, H., Kang, C., Wu, Y., Ma, F., and Guo, T. (2015). Improving proactive control with training on language switching in bilinguals. Neuroreport 26, 354–359. doi: 10.1097/WNR.0000000000000353

Keywords : codeswitching, language production, speech planning, opportunistic planning, language control

Citation: Beatty-Martínez AL, Navarro-Torres CA and Dussias PE (2020) Codeswitching: A Bilingual Toolkit for Opportunistic Speech Planning. Front. Psychol. 11:1699. doi: 10.3389/fpsyg.2020.01699

Received: 05 February 2020; Accepted: 22 June 2020; Published: 17 July 2020.

Copyright © 2020 Beatty-Martínez, Navarro-Torres and Dussias. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Anne L. Beatty-Martínez, [email protected]

2003 Articles

Issues in Code-Switching: Competing Theories and Models

Boztepe, Erman

This paper provides a critical overview of the theoretical, analytical, and practical questions most prevalent in the study of the structural and the sociolinguistic dimensions of code-switching (CS). In doing so, it reviews a range of empirical studies from around the world. The paper first looks at the linguistic research on the structural features of CS focusing in particular on the code-switching versus borrowing distinction, and the syntactic constraints governing its operation. It then critically reviews sociological, anthropological, and linguistic perspectives dominating the sociolinguistic research on CS over the past three decades. Major empirical studies on the discourse functions of CS are discussed, noting the similarities and differences between socially motivated CS and style-shifting. Finally, directions for future research on CS are discussed, giving particular emphasis to the methodological issue of its applicability to the analysis of bilingual classroom interaction.

  • Linguistics--Research
  • Code switching (Linguistics)
  • English language--Study and teaching--Foreign speakers
  • Linguistics

College & Research Libraries ( C&RL ) is the official, bi-monthly, online-only scholarly research journal of the Association of College & Research Libraries, a division of the American Library Association.

Information Code-Switching: A Study of Language Preferences in Academic Libraries

Frans Albarillo *

Initially coined by sociolinguists, the term code-switching refers to the alternation of languages by multilinguals. Code-switching is an active research area that has significant implications for academic libraries. Using data from focus groups and a survey tool, this paper examines language preferences of foreign-born students for particular information tasks. The main finding is that students’ culture and language represent an active influence on and important part of their identity, information consumption, and academic socialization. The author discusses the practical implications of these findings for academic library services in relation to ACRL’s 2012 Diversity Standards: Cultural Competency for Academic Libraries, with an emphasis on standard 6, linguistic diversity.


This paper is about foreign-born students’ language preferences and information use in academic libraries. Based on a quantitative and qualitative data set collected by the author, this study examines a concept called code-switching, a linguistic phenomenon where speakers change between two or more languages or between varieties of a language within a speech act or discourse. 1 The author was concerned with how individuals switch languages for different information tasks. Since information is coded in language, the author will refer to the phenomenon as simply information code-switching, referring to switching languages for a particular information task. The author coined the term “information code-switching” to distinguish it as a kind of code-switching that refers specifically to the preferences of a person when it comes to choosing a language or dialect for a particular kind of information behavior.

The basic research question the author investigates is whether there are types of information activities and places where multilingual students code-switch. The answer is yes. The data analyzed here, from focus groups and a survey, show how code-switching works in relation to topics of concern to academic libraries. Librarians are interested in providing access to resources, creating a friendly and welcoming environment, and accommodating the behaviors and preferences of the library’s users. This paper provides qualitative and quantitative evidence that foreign-born students who use academic libraries do switch languages when communicating and use different languages for different kinds of information tasks.

This triangulated approach gives a nuanced picture of the language preferences and information use of foreign-born students. For students who use a language other than English, the data show that language is intertwined with information seeking and use. Even though English is widely considered to be the language of the academy, non-English languages are spoken in academic libraries all the time and used for a variety of information seeking and media consumption. Code-switching has many implications for library service, especially relating to professional standard 6, Linguistic Diversity, of ACRL’s 2012 Diversity Standards: Cultural Competency for Academic Libraries, as will be examined in the discussion section.

Literature Review

This literature review is a selective review of the vast literature on code-switching (CS), covering some key studies and works in CS. In the literature, code-switching can also be referred to as “codeswitching” or “code switching.” This author chooses to use the hyphenated version, as it appears in a seminal article by Jan-Peter Blom and John Gumperz describing code-switching in Norway between two language varieties, Ranamål and Bokmål. In their study, Blom and Gumperz look for “social meaning” in how people use their “linguistic repertoire.” 2 Ranamål is a “prestige” dialect and Bokmål is a standard language taught in schools in northern Norway. 3 Blom and Gumperz describe CS: “In their everyday interactions, they [people in the village of Hemnesberget] select among the two as the situation demands. Members view this alternation as a shift between two distinct entities, which are never mixed. A person speaks one or the other.” 4 Carol Myers-Scotton, a linguist writing in the 1990s about research in code-switching, summarizes Gumperz’s characterization of CS as the use of language codes as a type of “social strategy” and extends it to include not only changing languages for content, but speakers using it to frame discourse to show social meanings like solidarity or power relations. 5 Similarly, in the data for this paper, the author will discuss how languages are used strategically for different kinds of information including normal situational and demographic contexts like language choice at home as an expression of cultural identity.

Another concept that is important to this discussion is the notion of a language domain, or the context in which a certain language is chosen. Joshua Fishman traces the history of this concept in his article on the topic, and defines domains as created by “institutional contexts” and their “congruent behavioral co-occurrences.” 6 Fishman further elaborates that domains can have different settings, have a “sociopsychological” component, and, most important, the “domain is a sociocultural construct abstracted from topics of communications, relationships between communicators, and locales of communication, in accord with institutions of a society and the spheres of activity of a speech community, in such a way that individual behavior and social patterns can be distinguished from each other and yet be related to each other.” 7 Fishman’s domain model of language use is the construct used in this study. While Fishman did not initially consider libraries as a separate domain from school, in this study the author looks at information code-switching as an extension of the domain model to language use in libraries. An extended analysis of Fishman’s concept of linguistic domain is given by L.B. Breitborde: “A domain is not the actual interaction (the setting), but an abstract set of relationships between status, topic, and locale which gives meaning to the events that actually comprise social interaction.” 8 In the case of academic libraries in the United States, the status of non-English languages and their presence in library collections, signage, and user experiences are rarely discussed, and, in most cases, languages other than English are invisible.

Fishman includes several domains in his language census of 431 individuals in a Puerto Rican neighborhood in Jersey City, such as “home, church, school, work place, and beach.” 9 In his language census, Fishman asks his respondents about their language abilities in English and Spanish, including “reading, writing, understanding, and speaking, writing letters, language of instruction, conversational languages, etc.” 10 In this paper, the language questions were designed in a similar manner using Fishman’s work and the language questions in the Children of Immigrants Longitudinal Study (CILS). 11 Language domains allow bilinguals to use their linguistic repertoires for a variety of social functions, according to Breitborde: “The use of their linguistic repertoire by bilingual speakers has been linked to situation, setting, social relationships, identity, and topic.” 12 The relationships, identity, and topics in this paper involve the academic library and the relationships foreign-born students have to information topics, tasks, and people as mediated through language.

Another oft-cited code-switching study is by Shana Poplack, who examines Puerto Rican Spanish and English in New York City. 13 Poplack provides examples of “intra-sentential” CS, where English and Spanish are mixed in the same sentence. For example, “Me iban a layoff” is translated as “They were going to lay me off” 14 and includes both English and Spanish in the same sentence. Poplack also notes the influence of Spanish phonology on the perfectly formed grammatical English when “That’s what he said” is pronounced “[da ‘wari se].” 15 And Poplack reports that most of the CS discourses were grammatical: “Perhaps the most striking result of this study is that there were virtually no instances of ungrammatical combinations of L1 and L2 [first language and second language] in the 1,835 switches studied, regardless of the bilingual ability of the speaker.” 16 Another important finding by Poplack is the positive attitude of the speakers who code-switched more toward their language. 17 This landmark research underscores how important CS is for multilingual individuals in defining social relations when speakers have access to more than one language. CS has a situational context, in terms of location, prestige, and institutional support. CS is fluid and natural and is not something that is planned ahead of time. Furthermore, switches have grammatical patterns and should not be seen as “broken” utterances. A significant theme in CS research aims at dispelling these myths about ill-formed languages and dialects. The languages in an individual’s linguistic repertoire will be important to the individual’s or community’s identity, and it is important to think about how libraries recognize or do not recognize languages through their signage, public services, collection development, and implementation of technology.

CS as a research topic today has crossed from linguistics into fields including education, English composition studies, and cultural studies. A search using “codeswitching OR code switching OR code-switching” in the Scopus Social Science and Humanities Index (limiting the search to articles) shows that there were 67 articles published about code-switching in 2016. For a look at CS in hip-hop, the author recommends the book Global Linguistic Flows: Hip Hop Cultures, Youth Identities, and the Politics of Language, edited by Samy Alim, Awad Ibrahim, and Alastair Pennycook. Sociolinguist Jannis Androutsopoulos’s chapter “Language and the Three Spheres of Hip-Hop” has a fascinating discussion of how English, when code-switched or code-mixed with a non-English language, is used to exemplify style, social identity, and glocality [global and local practices]. 18 While hip-hop is one artistic genre where language and code-switching play a major role in global media communities, the localization of English into specific local varieties is a rich and thriving academic field of study called World Englishes. This field of study recognizes that “The unprecedented spread of English has not [led] to a uniform global language; English is indigenizing into new vernaculars and specializing into national and international varieties of the lingua franca. As Mufwene puts it, ‘rather than driving the world towards monolingualism, differential evolution of English appears to be substituting a new form [of] diversity for an older one’ (2013:50).” 19 Although this paper does not focus on World Englishes or code-switching between varieties of a single language, an instance of this can be seen in the qualitative data when students dialect-shift to British Standard English, their preferred variety of English when looking for news media.

Language teaching is another field that actively researches code-switching. A recent example is the work of Marta Fairclough and Flavia Belpoliti, who examine code-switches in the writing of English/Spanish heritage language learners to look at the transfer of vocabulary from English to Spanish. 20 A heritage language learner (HLL) is “an individual who is raised in a home where a non-English language is spoken.” 21 Fairclough and Belpoliti investigate how code-switching between English and Spanish improves HLLs’ Spanish literacy by analyzing essays written in response to a prompt measuring a learner’s vocabulary in the language they are trying to learn. 22 As an occurrence closely associated with globalization and transnationalism, CS is a popular phenomenon to study as a topic that involves multilingualism, language instruction, and education. The topic of CS is very relevant today, as other disciplines incorporate this behavior into their research areas. The current study is not a linguistic study or education study; rather, it looks at how an understanding of CS can be applied to academic libraries as an important site to study the intersections of culture, language, information, and immigration.

The same search described above for “codeswitching OR code switching OR code-switching,” when done in Library, Information Science & Technology Abstracts and Library & Information Science Source, returned only two results. The first article, by Magdalena Malechová, discusses code-switching as an intercultural communication trend and a contact linguistic phenomenon. 23 Contact linguistics is a subdiscipline of linguistics that has an active history of research in CS. For data, Malechová’s article counts occurrences of “grammatical code-switching” between English and German in two online German newspapers, concluding that CS is a “strong communication trend.” 24 The second result is an article by Bettina Kümmerling-Meibauer, who analyzes the “visual codes” 25 in Korean and Iranian bilingual picture books for children. She concludes that, in addition to having two separate languages, multilingual picture books have “an elaborate visual code, that are both universal and cultural,” which needs to be accounted for in how children are taught to read. 26 Malechová’s work and Kümmerling-Meibauer’s study show how CS is being extended to analyze media and visual codes, respectively. At the time of writing, the author could not find articles in any library and information science journals on code-switching. This study aims to fill that gap by providing examples of CS by foreign-born students in an academic library context. This study also aims to persuade librarians that information code-switching, or the use of different languages for different information tasks, is a research area that librarians can pursue to understand the role that language plays in the information behavior of multilingual individuals. For librarians working with multilingual populations, it is important to be aware of CS and how populations use their languages to consume information and media in academic libraries. 
While there are no articles specifically on code-switching in library and information science journals, there is a range of works that look at language issues in libraries.

Language in Libraries

Research about language in libraries exists in the literature and can be found as a topic associated with populations like international students 27 and immigrants. 28 Considering the complexity of the relationship between language and information, very few studies look directly at language preferences, language attitudes, bilingual outreach, and linguistic diversity. At the time of writing, a study by Ignacio Ferrer-Vinent on language preferences at the reference desk 29 is the only study that directly investigates language preferences in academic library reference interactions. This is a rich area that needs further exploration. In relation to library instruction, work by scholars like Karen Bordonaro 30 on students of English as a second language is an important example of how language learning happens in the academic information-seeking context. Sara Luly and Holger Lenz apply the model for Language Oriented Library Instruction (LOLI) 31 to learners of German as a foreign language, whose pedagogical needs differ from those of international students and immigrant students. A critical gap that no one has addressed is how librarians need to vary their instruction according to the different library populations: international students, immigrant students, generation 1.5 students, and foreign language learners. The author has written a literature review about these different kinds of English language learners 32 and how important it is to distinguish between these populations. In addition to this variation across kinds of English language learners, it is also important that librarians learn about the varieties of English that students may speak, with attention to World Englishes, as more than one kind of English may be used in the classroom and the library.
Sonia Smith’s article “Library Instruction for Romanized Hebrew” discusses her experiences at McGill University in Canada creating a library instruction session to help students in an advanced Hebrew class navigate the romanized Hebrew catalog records. 33 Smith emphasizes how important the role of library instruction 34 is to scholars who wish to access scholarly materials that are romanized in the library catalog. There is a lively discussion on romanization, language, and access of titles in non-Latin scripts in the journal Cataloging & Classification Quarterly ; these languages include Persian, 35 Korean, 36 and Japanese. 37 The author has also discussed issues of romanization and transliteration as barriers to accessing content in library databases in his article “Evaluating Language Functionality in Library Databases.” 38 For outreach to international students, academic librarians Xiang Li, Kevin McDowell, and Xiaotong Wang write about their experiences creating videos about the library in Arabic, Chinese, English, Japanese, and Korean to help international students “navigate new systems and to bridge the gap between past library experiences and US academic library settings.” 39

Language is an important aspect of cultural competency. In a survey on perceptions of culturally competent library services, health sciences librarians Misa Mi and Yingting Zhang found that “…those who spoke another language in addition to English rated their own levels of cultural competency higher than those who only spoke English. Those with the ability to speak another language might have an advantage of better understanding a given culture, which could lead to higher levels of cultural awareness and sensitivity.” 40 Mi and Zhang go on to argue that “Culturally competent librarians should regard the ability to speak a second language as an asset that demonstrates greater cognitive ability [33], rather than a deficiency [5]. It would be worthwhile for librarians to develop awareness and knowledge of language differences (which does not require an ability to speak that language) that are reflected in verbal and nonverbal communication processes and norms for effective cross-cultural interactions with and service provision for users from different backgrounds.” 41 The author agrees with Mi and Zhang that, in general, second languages should be viewed as a form of cultural capital and a cognitive advantage, and that multilingual individuals are more sensitive to the cultural differences they negotiate daily as speakers of minority languages. At the same time, it is important to be wary of essentializing culture into language: while language is an important facet of culture, other dimensions such as race, ethnicity, and religion are also meaningful cultural factors. However, the focus of this article is language, and because language is a part of cultural competency, the author recommends that managers hire multilingual individuals, or at least include multilingualism as a preferred qualification in job advertisements.
Managers should also provide training for monolingual staff who work with multilingual populations.

Linguistic diversity training is often ignored in organizational contexts. By linguistic diversity training, the author means education on accents, creoles, code-switching, relevant varieties of English (World Englishes), and other linguistic facets that characterize the patron population. It is also important for librarians to understand that languages like Russian, French, Spanish, or Mandarin often serve as lingua francas for other linguistic minorities. Ideally, multilingual colleagues could talk about their language use with their monolingual colleagues to increase awareness of linguistic diversity. For example, the Spanish language is incredibly diverse; in areas where it is spoken, cross-cultural trainings could be led by library staff who speak it, so that colleagues who are not aware of these differences can be mindful of them and at the very least know that differences in spoken Spanish can indicate race, ethnicity, identity, religion, and other categories that a speaker may self-identify with. Knowing about linguistic diversity and linguistic behaviors like code-switching is important to the goal of becoming a culturally competent organization. Medical librarians have created a roadmap for hospitals 42 to move their organizations closer to that goal, and the author believes that it is important for academic libraries to do the same by studying demographic factors like language, race, ethnicity, income, age, gender, and able-bodiedness, and how these factors affect or influence library use.


With approval from the City University of New York Institutional Review Board and using a small grant awarded by the PSC-CUNY Research Award Program, the author conducted focus group interviews in the spring of 2014 and a survey in the fall of 2014.

Population and Data Collection Procedures

The author used SurveyMonkey, an online survey tool, to create a survey to screen candidates for focus groups. A link to the screening survey was posted on flyers around campus to recruit foreign-born students. The total number of responses for the screening survey was 66, with 33 complete replies and 4 respondents who did not meet the main selection criteria (at least 1 year of high school in their home country and a current student at Brooklyn College at the graduate or undergraduate level). There were a total of 29 qualified respondents and a 45 percent response rate (# interviewed / # eligible to participate). Thirteen students participated in the focus groups. Recruiting for the focus groups was difficult, and the focus groups were very small, ranging from 2 to 4 individuals per group. Participants received $20 each to take part in the focus group. The interviews were semistructured, took 45 minutes to 1 hour to complete, and were based on the following prompts: What languages do you use in your daily life? What does research mean to you? How do you do research? The author recorded the data using a Zoom H2 audio recorder and transcribed the recordings in NVivo.

The main survey was designed during and after the 2014 summer Institute for Research Design in Librarianship. Before launching the survey, the author pilot-tested it with students who had participated in the focus groups. The author hired a research assistant for the survey portion of the study, and together the researchers piloted the survey, used a screening survey to recruit survey participants, and collected data during the fall 2014 semester. Flyers were distributed on campus advertising the study with a link to the screening survey. The researchers also staffed an informational table about the survey in various spaces around the campus, including the library. Several offices agreed to promote the screening survey on their Facebook pages and their mailing lists (the Women’s Center, the Office of Graduate Admissions, and the Office of Student Activities). The screening survey collected demographic and educational background information, which allowed researchers to include participants based on the following criteria:

  • Participants are foreign-born
  • Participants are undergraduate and graduate students

Qualifying participants received an e-mail with a link to the full survey within two days of completing the screening survey. The full survey took 30 to 40 minutes to complete and was divided into the following parts: demographics, educational background, language use, library use, and cultural questions (including views on American-style research). For this paper the author examines the data from the language use and demographics sections.

Because the survey was linked to their e-mail, respondents could finish one part of the survey and return to the survey at another time to complete the other parts. The researchers sent out e-mails to remind participants that they needed to complete the full survey. Once SurveyMonkey indicated that a survey was complete, the researchers made an appointment with the student via e-mail to distribute the $10 incentive.

Participants also had the option of doing an in-person survey, where the researcher would help the participant complete the survey in a classroom setting, after which they would immediately receive their incentive, though none of the eligible participants chose this option.

There were 3,004 foreign-born students at Brooklyn College in the fall 2014 semester. 43 A total of 274 students were screened, and 123 eligible students were invited to participate in the full survey. Of these, 103 responded and participated. Ten surveys contained partial responses and were discarded, and one survey was discarded because the person self-reported low English reading and writing ability. For this paper, 92 complete surveys were analyzed. The survey response rate was 74 percent (# complete surveys / # eligible to participate).
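The arithmetic behind these counts can be retraced with a short sketch. The variable and function names below are illustrative only and do not come from the study. The quotient works out to roughly 74.8 percent, which the text reports as 74 percent.

```python
# Counts reported in the text for the fall 2014 survey.
eligible_invited = 123       # eligible students invited to the full survey
responded = 103              # students who responded and participated
partial_discarded = 10       # surveys with partial responses
low_english_discarded = 1    # self-reported low English reading/writing ability


def analyzed_count(total_responded, *discarded):
    """Surveys left for analysis after removing the discarded ones."""
    return total_responded - sum(discarded)


def response_rate(complete, eligible):
    """# complete surveys / # eligible to participate, as a percentage."""
    return 100 * complete / eligible


complete = analyzed_count(responded, partial_discarded, low_english_discarded)
print(complete)                                             # 92
print(round(response_rate(complete, eligible_invited), 1))  # 74.8
```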

Survey Data and Analysis

The author downloaded the survey data from SurveyMonkey and analyzed CS and language use data in Excel, and then in SPSS 21 using independent samples t-tests to look for associations between mean scores in language use variables (CS, language domains, information tasks, and language ability) and demographic grouping variables, which include student status (undergraduate or graduate), immigration status (permanent or temporary status), first-generation student status, gender, and race/ethnicity. T-tests were not conducted for grouping variables that had fewer than 10 respondents (for example, there were only 5 respondents who identified as Hispanic and 2 respondents who identified as Middle Eastern). The author used Somers’ d test to look for associations between language use variables and the following dependent variables: age, age arrived in the United States, years lived in the United States, and median income (estimated from the 2014 American Community Survey 44 using respondent zip codes).
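For readers unfamiliar with the test, the following is a minimal sketch of an independent samples t statistic, written in plain Python rather than SPSS. It is not the author's procedure: the function names are hypothetical, the pooled-variance (equal variances) form is an assumption, and applying the minimum size of 10 to both groups is also an assumption. The sketch returns only the t statistic and degrees of freedom; the p-value would come from a statistics package.

```python
from math import sqrt
from statistics import mean, variance

MIN_GROUP_SIZE = 10  # the text skips grouping variables with fewer than 10 respondents


def pooled_t(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


def compare_groups(scores, indicator):
    """Split scores by a 0/1 grouping variable and compare the two means.

    Returns None when either group is smaller than MIN_GROUP_SIZE
    (an assumption about how the size rule would be applied).
    """
    with_attribute = [s for s, g in zip(scores, indicator) if g == 1]
    without_attribute = [s for s, g in zip(scores, indicator) if g == 0]
    if min(len(with_attribute), len(without_attribute)) < MIN_GROUP_SIZE:
        return None
    return pooled_t(with_attribute, without_attribute)
```

Here `scores` would hold a language use variable such as the reported CS score, and `indicator` a dummy-coded grouping variable such as student status.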

The author has chosen to treat the Likert-type variables as continuous rather than ordinal. The survey data meet all the assumptions for an independent samples t-test 45 and the Somers’ d test for associations. 46 The author has carefully followed Arlene Fink’s survey methods book 47 in this analysis.

The author created a variable called code-switching based on the following survey question:

I switch between English and my non-English language(s)

b. Very frequently

c. Occasionally

e. Very rarely

The author coded this variable in SPSS 21 (1 = Never, 6 = Always).

The grouping variables for language domains and information tasks were created from a matrix as shown in figures 1 and 2, with Language 2 (L2) referring to the main language the student spoke in addition to English.

The responses were dummy-coded (0 = no attribute, 1 = presence of the attribute).

The variable for language ability was created from the questions below with responses coded in a five-point Likert-type scale (1 = not at all, 5 = very well).

How well do you speak English?

How well do you understand English?

How well do you read English?

How well do you write English?

How well do you speak L2?

How well do you understand L2?

How well do you read L2?

How well do you write L2?

The first of the demographic variables, student status, was determined by the following question:

Are you a high school, undergraduate, or graduate student?

a. I am not a student [disqualify]

b. I am a high school student [disqualify]

c. I am an undergraduate student

d. I am a graduate student

The author used two questions in the survey that asked about immigration status to construct two variables: permanent immigration status and temporary immigration status. The first question was asked in the screening survey: Are you an international student with one of the following visas: F, J, M, A, H1B, or K? (a. Yes b. No) The second question was asked in the main survey:

What is your immigration status?

a. U.S. citizen

b. U.S. citizen by naturalization

c. Permanent resident

d. Not a U.S. citizen

e. Dual citizenship or nationality

f. Deferred Action for Childhood Arrivals (“DACA”)

g. I do not wish to answer this question

h. I don’t know

Respondents who chose a, b, c, or e in the main survey were grouped as permanent, and dummy-coded in SPSS (0 = not permanent, 1 = permanent). Respondents who chose a in the screening survey question and d or f in the main survey question were grouped as temporary status.
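The grouping rule just described can be expressed as a small function. This is a sketch under stated assumptions: the function and constant names are hypothetical, and returning `None` for combinations the text does not mention (for example, answers g or h) is an assumption, not something the study specifies.

```python
PERMANENT_ANSWERS = {"a", "b", "c", "e"}  # main survey: citizen, naturalized, resident, dual
TEMPORARY_ANSWERS = {"d", "f"}            # main survey: not a U.S. citizen, DACA


def immigration_group(visa_answer, status_answer):
    """Classify a respondent from the screening visa question (a = Yes, b = No)
    and the main survey immigration status question (a through h)."""
    if status_answer in PERMANENT_ANSWERS:
        return "permanent"
    if visa_answer == "a" and status_answer in TEMPORARY_ANSWERS:
        return "temporary"
    return None  # assumption: combinations not covered by the text stay ungrouped


def permanent_dummy(visa_answer, status_answer):
    """Dummy code as in the text: 0 = not permanent, 1 = permanent."""
    return int(immigration_group(visa_answer, status_answer) == "permanent")


print(immigration_group("b", "c"))  # permanent
print(immigration_group("a", "d"))  # temporary
```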

Gender, age, age arrived in the United States, years in the United States, and zip code were constructed from the survey questions below.

What is your gender? a. Male b. Female

What year were you born? [numerical text field]

What year did you move to the United States? [numerical text field]

What is your current zip code? [numerical text field]
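The text does not spell out how the time-based variables were derived from the two year questions. One plausible reading, offered only as a sketch, is simple subtraction against the survey year. The reference year of 2014 is an assumption based on the fall 2014 data collection, and the function name is hypothetical.

```python
SURVEY_YEAR = 2014  # assumption: the data were collected in fall 2014


def derive_time_variables(birth_year, arrival_year, survey_year=SURVEY_YEAR):
    """Derive age, age on arrival, and years in the United States from two years."""
    if not birth_year <= arrival_year <= survey_year:
        raise ValueError("expected birth_year <= arrival_year <= survey_year")
    return {
        "age": survey_year - birth_year,
        "age_arrived": arrival_year - birth_year,
        "years_in_us": survey_year - arrival_year,
    }


# A made-up respondent, not a record from the study.
print(derive_time_variables(1992, 2008))
# {'age': 22, 'age_arrived': 16, 'years_in_us': 6}
```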

First-generation college students were identified using the following question: Ideally, what’s your intention for completing a degree? Check all that apply. Nineteen answers were available, and respondents could choose as many as applied; answer L, “I am the first in my family to get a college degree,” was used to identify first-generation college students. The author dummy-coded the variable (0 = not a first-generation college student, 1 = first-generation college student).

Race and ethnicity categories were adapted from a White House document on reporting race. 48 The question appeared as follows in the survey:

Please indicate your race:

a. Hispanic or Latino (a person of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin, regardless of race)

b. American Indian or Alaska Native (a person having origins in any of the original peoples of North and South America, including Central America, who maintains cultural identification through tribal affiliation or community attachment)

c. Asian (a person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian Subcontinent, including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam)

d. Black or African American (a person having origins in any of the black racial groups of Africa)

e. Native Hawaiian or Other Pacific Islander (a person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific Islands)

f. White (a person having origins in any of the original peoples of Europe)

g. Middle Easterner (a person having origins from the Middle Eastern countries)

h. North African (a person having origins from the North African countries)

i. From multiple races

j. Other (please specify)

The author dummy-coded all the race and ethnicity variables.
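Dummy-coding a single-choice question of this kind means creating one 0/1 indicator per answer option. The sketch below illustrates the idea in plain Python. The names are hypothetical, and the sample answers are made up for illustration rather than taken from the survey data.

```python
from collections import Counter

ANSWER_LETTERS = "abcdefghij"  # the ten answer options listed in the survey question


def dummy_code(answer):
    """Return one 0/1 indicator per answer letter (1 = respondent chose that option)."""
    if answer not in ANSWER_LETTERS:
        raise ValueError(f"unknown answer: {answer!r}")
    return {letter: int(letter == answer) for letter in ANSWER_LETTERS}


def category_counts(answers):
    """Tally how many respondents fall into each category."""
    return Counter(answers)


# Made-up answers from five respondents.
sample = ["c", "d", "c", "f", "j"]
print(dummy_code("c")["c"], dummy_code("c")["d"])  # 1 0
print(category_counts(sample)["c"])                # 2
```

A tally like `category_counts` is also what would reveal which categories have too few respondents to support a group comparison.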

Figure 3 shows the binary grouping variables.

Finally, reading preferences were created from the matrix in figure 4. The author created variables for language preference for academic and nonacademic reading to determine whether students who are more likely to code-switch prefer a particular language for academic and leisure reading.

The data illustrate that students who speak a language besides English are more likely to do so in certain language domains such as home, school, or with friends. The ability of students to speak in English and their second language is also associated with how likely these students are to switch between languages.

Independent samples t-tests (hereafter t-tests) were not conducted for race and ethnicity categories with fewer than 10 respondents, which included Hispanic, American Indian, Native Hawaiian, Middle Easterner, North African, and multiple races. The t-tests showed no statistically significant difference in the mean reported CS score of the multilingual respondents for the following variables:

  • Immigration status
  • Student status
  • First-generation status
  • White or nonwhite (race and ethnicity variable)
  • Other (race and ethnicity variable)

The author found no associations using Somers’ d test between reported CS and the following variables:

  • Years in the United States
  • Median income based on zip code
  • Ability to speak English
  • Ability to understand L2 [language spoken other than English]
  • Ability to write L2

Table 2 shows the variables for which the author found that foreign-born students are very likely to switch to L2 (the non-English language), both for information tasks and in linguistic domains such as home, school, and with friends.

Table 3 shows associations between language ability and CS using Somers’ d test. According to Laerd Statistics, “A value of –1 indicates that all pairs of observations are discordant and a value of +1 indicates that all pairs of observations are concordant.” 49 Table 3 shows moderate to weak effects.
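To make the concordant and discordant pairs concrete, here is a minimal pure-Python sketch of Somers' d. It is not the SPSS routine used in the study. The function name is hypothetical, and treating the second argument as the dependent variable is an assumption, since SPSS reports the statistic in both directions. The statistic is the number of concordant pairs minus the number of discordant pairs, divided by the number of pairs that are not tied on the independent variable.

```python
from itertools import combinations


def somers_d(x, y):
    """Somers' d of y given x: (concordant - discordant) / pairs not tied on x."""
    concordant = discordant = untied_on_x = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # pairs tied on the independent variable are excluded
        untied_on_x += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0: tied on y only; counts in the denominator but not the numerator
    if untied_on_x == 0:
        raise ValueError("every pair is tied on x")
    return (concordant - discordant) / untied_on_x


print(somers_d([1, 2, 3], [1, 2, 3]))  # 1.0
print(somers_d([1, 2, 3], [3, 2, 1]))  # -1.0
```

Ties on the dependent variable pull the value toward zero, which is one reason associations between Likert-type items, where ties are common, rarely approach either extreme.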

Reported CS is positively associated with higher L2 speaking and reading ability with a moderate effect strength. Additionally, reported CS has an inverse association with the students’ ability to understand, read, and write English. The strength of the inverse association is weak (–0.305, –0.299, –0.237).

Focus Group Data and Analysis

In the focus groups, all but one of the participants were bilingual (counting English and Jamaican Creole English as separate languages), and the single monolingual speaker of English spoke British English. There were 10 instances of students reporting code-switching in the focus group transcripts. The author also counted instances of switching from American to British English as CS. Below are select quotes from the transcripts when participants were asked about how they used their languages. The following languages were discussed in the focus groups: English, Vietnamese, Hindi, French, French Creole (Kreyol), Cantonese, Mandarin, Hakka, Armenian, Bengali (Bangla), British English, Patwa (Jamaican Creole), Portuguese, Spanish, Russian, and Ukrainian. The data collected and described in this section illustrate the complexity of code-switching, the use of different languages for different types of information, the importance of community, and the presence of dialects both in English and non-English languages.

A quote that exemplifies the complexity of code-switching is from a Chinese student whose parents do not speak the same variety of Chinese. The student referred to the Hakka language as a “dialect,” describing her upbringing in a multi-Chinese dialect household, where neither of her parents spoke English very well. She explained that if she doesn’t want her parents to understand what she is saying she switches to English as her secret language. This quote from her is also illustrative of the linguistic diversity of Chinese languages:

“Yeah, when I am in school. Of course, most of the time I speak English in school. When I go home, I speak to my dad, my mom. I talk with my mom in dialect. Because she doesn’t know how to speak Mandarin. My dad speak to me in dialect. I respond back in Mandarin because my dad is not much good. Not much fluent in the dialect. But my dad I can hear and listen. You know, for my sisters, my brothers, I speak in like half English, half dialect, and maybe some has just like all everything mixed one or I don’t want to know what my parents to know what I am talking about, I speak English.”

A female Colombian undergraduate student described which languages she uses for different types of information:

“I do like Spanish literature. When it comes to reading for pleasure, 50/50 for Spanish and English. I read a lot of religion in Spanish. Research, I don’t. And it’s a little limited in Spanish, because my entire college career has been in English. So even when I’ve tried, it’s difficult. I’ve done it, like when I travel back home. Or if I’m trying to read a text that’s in Spanish, and it’s academic, it takes a little bit of work. Because I have the information in my head in English.”

The same student also said:

“I pretty much read what I come across. As long as it’s interested, is interesting, and I’m interested in it. I don’t really have a preference with the news per se. I have always read the Bible in Spanish. Family and friends, and music. It’s like 80% in Spanish, 20% in English. Television is mostly English, just because lack of good stuff (laughter). Seriously. All those telenovelas, get out! So that’s out.”

Academic vocabulary, whether in English or in Spanish, has to be learned. This student is clearly a fluent Spanish speaker and reader, but she knows her academic discipline only in English, so she cannot readily code-switch to Spanish for academic tasks.

Community plays an important role in CS and, because this male Indian graduate student does not have a community of Bengali speakers, he has limited ability in Bengali. But his comprehension is maintained through preference for Bengali music:

“For research, basically I do everything in English because I have little family and friends here and back home so I can barely speak in Bengali. Other than that, I use English everywhere. For research, I follow the Wall Street Journal and the Economist. I don’t follow Bengali newspapers. Per se about music, I prefer Bengali, and Hindi, and English as well.”

This quote illustrates how a language may not manifest in one ability (like speaking) but can still be used and enjoyed for other kinds of information (music, movies, oral poetry, etc.). And when probed about mixing Hindi and English, the male Bengali student said, “Yeah. It’s Hinglish. Hindi and English. Yeah.”

English, like any language, has dialects. An undergraduate student from Grenada described how British English influences the kinds of information sources she seeks: “Yeah, in Grenada we’re more influenced by the British, so the spelling of our words are along the British standard.” And when probed about her preferences for British English sources, she said:

“Yes, maybe because I’m familiar with it. If I go on the Internet, I would look up bbc.uk. I’m only here three years, so I’ll go back and forth. So it’s not like I’m going to assimilate into the culture. I’m just like an outsider looking in. I feel most comfortable with the British way of news, expressing, so their writing. I don’t know how to explain. It’s so different here. When I did A levels, years ago. It’s just a whole different experience. I started at a community college, BMCC [Borough of Manhattan Community College]. It was like A level all over again, although it was an associate degree. I had to focus on the spelling of the words, how we spell color and you spell color. C-o-l-o-u-r, we spell it different. It was just, no other difficulty really.”

This is an example of how culture influences information seeking. This preference for British sources was corroborated by the two Jamaican students in the focus groups, who spent some of their formative education in Jamaica.

This female Bengali undergraduate student uses Bengali exclusively with her family:

“So in school I speak in English. And I live with my aunt. And she works 14 hours a day, so I hardly see her. Whenever I see her for half an hour or an hour, we’re going to talk in Bangla [Bengali], but we don’t like mix stuff. Some of the words are in English. Like we don’t have a word for chair that’s in Bangla. So some of the words has to be in English. But um, most of the cases we don’t mix two languages. And even when I’m talking to my parents I’ll be talking Bangla.”

In response to a probe about making friends in school, this student noticed that the variety of Bengali spoken by the students she meets in Brooklyn has an unusual accent. This is an example of the linguistic diversity that can exist within the same language between diaspora speakers and recently arrived speakers. Her CS shows that she enjoys and uses her languages for a variety of information consumption:

“I have met like a couple of Bangladeshi people in the school. But as she was saying, they don’t speak Bangla. What I feel they have really weird accent when they speak in Bangla. I feel weird when they speak in Bangla. So they like speak okay, if you speak comfortable speaking in English, I don’t have a problem. And in case of music, I feel like I’m more into Bangla and Hindi music. More than English music. English music is okay,…. If I listen to a song I’ll only listen to like, once or twice. But if I’m gonna listen to music, like the whole day, it has to be Bangla and Hindi. And even in the case of movies or soaps. I watch a lot of English soaps. Like I watched How I Met Your Mother, the whole 9th season. And I watch like Scrubs, and Big Bang Theory. And what’s the other one? I like watching CSI and Breaking Bad. …Yeah sorry, Breaking Bad.”

The following quote from a male Vietnamese undergraduate student shows his sophistication when looking for news about his home country: “I do read Vietnamese news and writings that’s about Vietnamese because I want to know what’s going on at the university. We still have government controlled media. So sometimes the news has bias. I like Vietnamese pop, for my personal life.” This male Chinese undergraduate student, after multiple probes, simply stated that both English and Chinese are important in his life: “I think the language is important, because I prefer the both language in my life for my enjoyment.”

The focus group data clearly show that multilinguals are using languages in different ways to consume and communicate information; they are switching readily in their daily lives before going to school, at school, and after school. These data highlight the complex and fluid ways language and culture influence how multilingual foreign-born students consume information. It’s important to note that within large language categories exist smaller subgroupings or varieties, whether they are varieties of English or Chinese dialects. These language groupings are not homogeneous populations, and care should be taken by academic librarians to acknowledge these differences in both accent and grammaticality. Interactions with foreign-born students should focus on communicating and not correcting.

The survey data provide evidence of CS among foreign-born students in academic environments like the library at Brooklyn College (“I use L2 at school,” P < .001, MD = 5) and most likely with co-ethnic friends (“I speak L2 with my friends,” P < .001, MD = 1.81). The data show that students who report a second language are more likely to use it for a variety of information tasks like searching the Internet, consuming media, and communicating through social media or similar texting mediums, as shown in table 2. These data show that home is the place where L2 speakers will most likely code-switch ( P < .001, MD = 2.559). They also show that L2 is used for academic purposes like reading ( P < .05, MD = .644), especially for news and current events ( P < .001, MD = 1.194). The moderate association in table 3 between code-switching, L2 speaking, and L2 reading needs to be further investigated. Does it suggest that people who speak and read a language are more engaged in that language? It’s difficult to tell and is further complicated for languages that don’t have a strong print culture (Creoles, for example). The inverse association between English ability and CS is also interesting in that the higher the ability reported in English comprehension, reading, and writing, the lower the likelihood of CS. Brooklyn College library is a multilingual space, as the survey data show, and the language environment of foreign-born students is complex and fluid.

An important detail about the data collected in the survey is that 38 of the 92 respondents self-identified as Asian, and they are more likely than any other demographic category surveyed to code-switch ( P < .001, MD = 1.283). Additionally, there were respondents from the Caribbean who did not self-identify with the category of black/African American; they chose the “other” response to write in Caribbean. Similarly, there were Bengali students who did not choose to self-identify as Asian (the author considered, but did not include, a category for South Asian), and several of the respondents wrote in Bengali by selecting the “other” category. It was unfortunate that there were very few Hispanic foreign-born students, since Spanish is widely spoken in New York City.

Furthermore, it’s important to think about how Creole languages do not really appear in more formal academic language domains. For example, there are no calculus books written in Haitian Kreyol as there would be calculus books written in Vietnamese, Mandarin, or Spanish. Another question that these survey data cannot effectively address is CS for speakers of other varieties of English, like British English: do those speakers code-switch across varieties of English? For example, would the Trinidadian student from the focus group have reported in the survey, or even been conscious of, Americanizing her spelling and accent to function in an American English academic environment, and is that a form of code-switching? These fine, qualitative distinctions are difficult to tease out in a survey. Fortunately, the focus group data show and identify some of these experiences and processes.

Synthesis of the Survey and Focus Group Data

This section discusses two instances in which the quantitative findings are complemented by the qualitative findings and one instance in which the quantitative findings are contradicted by the qualitative data.

Both data sets in this study capture the importance of CS at home. The survey data indicate that home is an important place for language use, and the narrative from the Chinese student who speaks various dialects of Chinese and English with her parents, brothers, and sisters reveals the complexity of her language use. Both data sets show that there is diversity within Chinese; for librarians who value linguistic diversity, it is important to be aware that Chinese is not a monolithic language or cultural category. The application to practice is clear: if a library has large groups of Chinese speakers, it will be important to know and identify which cultural and linguistic groups use the academic library.

The second pattern found in both data sets is the influence of language on preferences for consuming information such as academic reading and news. The findings suggest that it is likely that students will code-switch in academic reading contexts and when reading news. The focus group data give us more insight into this information behavior. For example, the data from the Colombian undergraduate student tell us that her code-switching is in very specific domains, like leisure reading and religious reading, and that she has difficulty reading academic Spanish. This information is important to know for building collections and for supporting students who may want to take their academic experiences to their home country for an internship, for a job, or for pursuing graduate work. In this study we also see how dialect preferences in English lead Caribbean speakers to gravitate toward British Standard English when consuming news and current events, which is also an example of dialect-switching. These data showing how English dialects can affect information behavior are important for helping librarians understand that, not only is there diversity across languages, there are dialectal differences within major language groups like English, Chinese, Spanish, and others.

The third pattern found in the survey, that students are more likely to use L2 with friends, is contradicted by the example of the female Bengali student who finds that Bengali accents spoken by Brooklyn students are unusual. Again, while students are likely to switch with friends, students who have just arrived may not have friends who speak the same dialect. This is important when distinguishing between immigrant students and international students when it comes to conducting further studies of co-ethnic language use in academic libraries.

Future studies involving code-switching should focus on capturing both quantitative and qualitative data. It’s just as important to understand individual experiences and processes around code-switching, because this helps in interpreting the statistical data. Qualitative data can also reveal missed or new variables that will give the researcher better insight into the relationships between language and information.


This survey and analysis contain some limitations that should be kept in mind when considering the findings. The first limitation is the nature of self-reported usage data, though the author attempted to increase the internal reliability of the data by using focus groups to triangulate the results. The second limitation is related to sampling: the focus groups were very small, and the survey was a convenience sample. Third, income as a variable was collected as a rough estimate using ACS 2014 data by zip code. This is not the best way to capture income data, but it does provide some information. Finally, it should be noted that many of the terms used throughout this article, including race, gender, ethnicity, and immigration status, are social science terms often used in survey research; their use is not meant to give offense or intentionally exclude any groups. There is always some aspect of reductionism that occurs in survey-based research, and the author welcomes feedback on how to make the survey categories more inclusive.

Implications for Academic Library Services

CS is a well-established phenomenon outside of library and information science, and this paper aimed to introduce librarians to this concept and document this behavior in an academic library setting. There are many implications of CS for academic library services, especially in the area of linguistic diversity, standard 6 of ACRL’s Diversity Standards: Cultural Competency for Academic Libraries, which reads: “Librarians and library staff shall support the preservation and promotion of linguistic diversity, and work to foster a climate of inclusion aimed at eliminating discrimination and oppression based on linguistic or other diversities.” 50

The data from this study provide evidence that language influences the information behavior of students in the form of code-switching and dialect switching. More research could be done to investigate information code-switching, which the author has broadly defined as changing languages or dialects for particular information tasks. This kind of research would allow librarians to map language use, language choice, and language preferences of students to actual library collections, services, and resources. Furthermore, this research would be valuable for serving first-generation college students, generation 1.5 students, international students, and immigrant students.

Another area of critical importance is being inclusive of non-English languages in collection development: “collection managers should be attentive to represent the linguistic needs of library constituents, and assure that library resources in print or electronic formats are available, especially to support the academic curricula reflecting all diversity issues, including those of visually disabled constituents.” 51 Increasing the visibility of non-English scholarly sources can be as simple as creating library guides that show students how to access peer-reviewed journals and open access indexes in non-English languages. Engaging with the scholarly literature in non-English languages is particularly important in the social sciences. For example, librarians could create a guide that would allow Spanish-speaking students of urban sociology to engage with and synthesize sociological ideas in Spanish language journals with concepts from English language journals. The practice of incorporating non-English sources into English-language papers is a long scholarly tradition in the humanities. Language access is also an important concept in the diversity standards: “Provide and advocate for the provision of information, reference, referrals, instruction, collection management, and other services in the language appropriate to their constituencies, including the use of interpreters.” As the CS data suggest, our libraries are not monolingual spaces, so making sure that printers can print in different scripts (and in general having technology capable of supporting users’ linguistic preferences), as well as having welcome signage in other languages are steps that libraries can take to make non-English speakers feel more included when using the library as a space for studying or meeting with classmates. 
Ideally, supporting linguistic diversity in academic libraries would include multilingual staff who could create library instruction and other academic library services that cater to large linguistic populations served by the academic library.

The purpose of this study was to explore language use, language choice, and language preferences in academic libraries, and the author found evidence for code-switching patterns in both qualitative and quantitative data. In the analysis, the author maintains that code-switching patterns are correlated with information tasks and argues that more research could be done on information code-switching to give librarians data on language use and apply those data to library services. For the foreign-born students analyzed in this study, it is clear that their culture and their non-English language represent an active and important part of their identity, information consumption, and academic socialization. In their language choice for information, there is enough statistically significant evidence for information code-switching, when students switch languages for a particular information task. Yet, in most of the academic library literature, these active language communities, their patterns of use, and their preferences have not been the subject of research. These data show that Brooklyn College Library is a rich multilingual space, yet there are only a few studies that discuss multilingualism in academic libraries. What can academic librarians do with these kinds of data?

There are many practical recommendations relating to linguistic diversity, including creating a multilingual-friendly environment. Reference assistance could include offering specialized library instruction or orientations for immigrant students, first-generation college students, and international students. Public services staff could also receive linguistic diversity training that includes information on ESL and EFL populations. Linguistic diversity training might also focus content on creating sensitivity and awareness of patrons who are linguistic minorities (for example, Spanish speakers who speak other Central American languages such as Mayan languages), as well as information about Creoles and pidgins, how accents work, nonwritten languages, and varieties of spoken English that may be relevant to the patron population. Computer labs can be language-friendly, with a variety of keyboard formats and printers available for people who need to print e-mails, share notes, and look up concepts in languages other than English. In the focus group interview with the Colombian undergraduate student, she spoke about wanting to gain some experience working in Colombia; however, she lacked the Spanish academic vocabulary to be competitive. To help students like her, academic librarians could create LibGuides for non-English scholarly sources that include Latindex, for example ( www.latindex.unam.mx/latindex/inicio ). Libraries could focus on hiring multilingual librarians. There is clearly a need for more research in transnationalism, especially in academic libraries that have a high number of foreign-born students. Are these students trying to use their American college degrees and create transnational careers that take advantage of their cultural capital? Are academic library spaces currently treated as monolingual rather than multilingual spaces? How does this affect our practice, and how can libraries change to support these students in their information needs?

Monica Jacobe, director of the Center for American Language & Culture at The College of New Jersey, speaks about immigration trends in student success:

First-generation college students will no longer be primarily American-born students from working class families. Instead, many more students in that category will be recent immigrants, born all over the world, who completed high school in the U.S. For many schools, they will “look” on paper like domestic applicants, but the support they need will be very different.

How can academic librarians imagine a shift from monocultural and monolingual approaches to multilingual approaches? And what services will need to be rethought? These are the kinds of questions additional studies on language use and libraries can answer. CS is just one conceptual tool from sociolinguistics that has very practical applications in our work with international and immigrant students.

1. Barbara E. Bullock and Almeida Jacqueline Toribio, “Themes in the Study of Code-Switching,” in The Cambridge Handbook of Linguistic Code-Switching , eds. Barbara E. Bullock and Almeida Jacqueline Toribio, Cambridge Handbooks in Linguistics (Cambridge; New York: Cambridge University Press, 2009), 1–19.

2. Jan-Peter Blom and John J. Gumperz, “Social Meaning in Linguistic Structure: Code-Switching in Norway,” in Directions in Sociolinguistics: The Ethnography of Communication , eds. John J. Gumperz and Dell Hymes (New York: Blackwell, 1986), 409.

3. Ibid., 411.

4. Ibid.

5. Carol Myers-Scotton, “The Rise of Codeswitching as a Research Topic,” in Social Motivations for Codeswitching: Evidence from Africa (Oxford: Clarendon Press, 1995), 57.

6. Joshua Fishman, “Domains and the Relationship between Micro- and Macro-Sociolinguistics,” in Directions in Sociolinguistics: The Ethnography of Communication , eds. John J. Gumperz and Dell Hymes (New York: Blackwell, 1986), 441.

7. Ibid., 442–43.

8. L.B. Breitborde, “Levels of Analysis in Sociolinguistic Explanation: Bilingual Code Switching, Social Relations, and Domain Theory,” International Journal of the Sociology of Language 1983, no. 39 (Jan. 1983): 19.

9. Fishman, “Domains and the Relationship between Micro- and Macro-Sociolinguistics,” 447.

10. Ibid., 448.

11. Alejandro Portes and Rubén Rumbaut, “Children of Immigrants Longitudinal Study (CILS), 1991–2006,” Children of Immigrants Longitudinal Study (CILS), 1991–2006 (ICPSR 20520) , 2009, available online at www.icpsr.umich.edu/icpsrweb/RCMD/studies/20520 [accessed 24 March 2015].

12. Breitborde, “Levels of Analysis in Sociolinguistic Explanation,” 5.

13. Shana Poplack, “Sometimes I’ll Start a Sentence in Spanish Y Termino En Español: Toward a Typology of Codeswitching,” Linguistics 18, no. 7/8 (1980): 581–618, doi:10.1515/ling.1980.18.7-8.581 .

14. Ibid., 583.

15. Ibid., 584.

16. Ibid., 600.

17. Ibid., 610.

18. Jannis Androutsopoulos, “Language and the Three Spheres of Hip Hop,” in Global Linguistic Flows: Hip Hop Cultures, Youth Identities, and the Politics of Language , eds. H. Sammy Alim, Awad Ibrahim, and Alastair Pennycook (New York: Routledge, 2009), 54–58.

19. Elena Seoane, “World Englishes Today,” in World Englishes: New Theoretical and Methodological Considerations , eds. Elena Seoane and Cristina Suárez Gómez, Varieties of English around the World G57 (Amsterdam; Philadelphia: John Benjamins Publishing Company, 2016), 1.

20. Marta Fairclough and Flavia Belpoliti, “Emerging Literacy in Spanish among Hispanic Heritage Language University Students in the USA: A Pilot Study,” International Journal of Bilingual Education and Bilingualism 19, no. 2 (Mar. 3, 2016): 185–201, doi:10.1080/13670050.2015.1037718 .

21. Ibid., 186.

22. Ibid., 189.

23. Magdalena Malechová, “Multilingualism as a Sociolinguistic Contact Phenomenon with Regard to Current Forms of Multilingual Communication Code-Switching as One of the Contemporary Communication Trends,” Višejezičnost Kao Sociolingvistički Fenomen Kontakta Imajući U Vidu Suvremene Oblike Višejezične Komunikacije Mijenjanje Kodova Kao Suvremeni Komunikacijski Trend 49, no. 1/2 (July 2016): 86–93.

24. Ibid., 91.

25. Bettina Kümmerling-Meibauer, “Code-Switching in Multilingual Picturebooks,” Bookbird: A Journal of International Children’s Literature (Johns Hopkins University Press) 51, no. 3 (July 2013): 12–21.

26. Ibid., 19.

27. Amanda B. Click, Claire Walker Wiley, and Meggan Houlihan, “The Internationalization of the Academic Library: A Systematic Review of 25 Years of Literature on International Students,” doi:10.5860/crl.v78i3.16591 .

28. Nadia Caidi, Danielle Allard, and Lisa Quirke, “Information Practices of Immigrants,” Annual Review of Information Science and Technology 44, no. 1 (Jan. 1, 2010): 515, doi:10.1002/aris.2010.1440440118 .

29. Ignacio J. Ferrer-Vinent, “For English, Press 1: International Students’ Language Preference at the Reference Desk,” Reference Librarian 51, no. 3 (Sept. 2010): 189–201, doi:10.1080/02763871003800429 .

30. Karen Bordonaro, “Is Library Database Searching a Language Learning Activity?” College & Research Libraries 71, no. 3 (May 2010): 273–84.

31. Sara Luly and Holger Lenz, “Language in Context: A Model of Language Oriented Library Instruction,” Journal of Academic Librarianship 41, no. 2 (Mar. 2015): 140–48, doi:10.1016/j.acalib.2015.01.001 .

32. Frans Albarillo, “Is the Library’s Online Orientation Program Effective with English Language Learners?” College & Research Libraries 78, no. 5 (July 2017): 656–59, doi:10.5860/crl.78.5.652.

33. Sonia Smith, “Library Instruction for Romanized Hebrew,” Journal of Academic Librarianship 41, no. 2 (Mar. 2015): 197–200, doi:10.1016/j.acalib.2014.08.003 .

34. Ibid., 199.

35. Molavi Fereshteh, “Main Issues in Cataloging Persian Language Materials in North America,” Cataloging & Classification Quarterly 43, no. 2 (Dec. 8, 2006): 77–82, doi:10.1300/J104v43n02_06 .

36. Kim SungKyung, “Romanization in Cataloging of Korean Materials,” Cataloging & Classification Quarterly 43, no. 2 (Dec. 8, 2006): 53–76, doi:10.1300/J104v43n02_05 .

37. Hikaru Nakano, “Non-Roman Language Cataloging in Bulk: A Case Study of Japanese Language Materials,” Cataloging & Classification Quarterly 55, no. 2 (Feb. 2017): 75–88, doi:10.1080/01639374.2016.1250853 .

38. Albarillo, “Is the Library’s Online Orientation Program Effective with English Language Learners?” 3.

39. Xiang Li, Kevin McDowell, and Xiaotong Wang, “Building Bridges: Outreach to International Students via Vernacular Language Videos,” Reference Services Review 44, no. 3 (July 2016): 325, doi:10.1108/RSR-10-2015-0044 .

40. Misa Mi and Yingting Zhang, “Culturally Competent Library Services and Related Factors among Health Sciences Librarians: An Exploratory Study,” Journal of the Medical Library Association 105, no. 2 (Apr. 2017): 135, doi:10.5195/jmla.2017.203 .

41. Ibid., 136.

42. Ibid., 133.

43. Institutional Research and Data Analysis at Brooklyn College, City University of New York, “Fall 2014 Final Enrollment Reports, Enrollment Table 21 Country of Birth,” available online at www.brooklyn.cuny.edu/bc/offices/avpbandp/ipra/enrollment/F14/Enrollment-Table21.pdf [accessed 20 November 2016].

44. United States Census Bureau, “American FactFinder,” available online at http://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml [accessed 20 November 2016].

45. Laerd Statistics, “Independent-Samples T-Test Using SPSS Statistics,” Statistical Tutorials and Software Guides (2015), available online at https://statistics.laerd.com [accessed 21 November 2016].

46. Laerd Statistics, “Somers’ D Using SPSS Statistics,” Statistical Tutorials and Software Guides (2016), available online at https://statistics.laerd.com [accessed 21 November 2016].

47. Arlene Fink, How to Conduct Surveys: A Step-by-Step Guide (Los Angeles: SAGE, 2013), 45.

48. Executive Office of the President, Office of Management and Budget, “Revisions to the Standards for the Classification of Federal Data on Race and Ethnicity,” The White House, available online at https://www.whitehouse.gov/node/15626 [accessed 21 November 2016].

49. Laerd Statistics, “Somers’ D Using SPSS Statistics.”

50. Association of College and Research Libraries, “Diversity Standards: Cultural Competency for Academic Libraries” (2012), available online at www.ala.org/acrl/standards/diversity [accessed 30 August 2016].

51. Racial and Ethnic Diversity Committee Members, “Diversity Standards: Cultural Competency for Academic Libraries (2012),” Association of College and Research Libraries, available online at www.ala.org/acrl/standards/diversity [accessed 21 November 2016].

* Frans Albarillo is Assistant Professor/Reference & Instruction at Brooklyn College, City University of New York. ©2018 Frans Albarillo, Attribution-NonCommercial (http://creativecommons.org/licenses/by-nc/4.0/) CC BY-NC.

© 2024 Association of College and Research Libraries , a division of the American Library Association

Print ISSN: 0010-0870 | Online ISSN: 2150-6701

A Survey of Code-switching: Linguistic and Social Perspectives for Language Technologies

A. Seza Doğruöz, Sunayana Sitaram, Barbara E. Bullock, Almeida Jacqueline Toribio

A. Seza Doğruöz, Sunayana Sitaram, Barbara E. Bullock, and Almeida Jacqueline Toribio. 2021. A Survey of Code-switching: Linguistic and Social Perspectives for Language Technologies. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1654–1666, Online. Association for Computational Linguistics. https://aclanthology.org/2021.acl-long.131

A curated list of research papers and resources on code-switching


Code-Switching Research Resources

This is a list of tutorials, workshops, papers, and resources on computational linguistic approaches to code-switching research. The list will be updated over time. You are welcome to send a pull request to update the list and become one of the contributors!

📌 I plan to collect theses and books on code-switching and list them here. If you have one, don't hesitate to contact me or send a pull request!

🚀 Highlights

  • If you are new to code-switching or looking for a new research direction, we have written a comprehensive survey paper on code-switching: The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges [Paper]. Feel free to read it and let us know if you have any suggestions! Thanks to Alham Fikri Aji, Zheng-Xin Yong, and Thamar Solorio for making this possible 😊
  • We are organizing the code-switching workshop at EMNLP 2023! [Website]
  • We (I, Marina Zhukova, and Sudipta Kar) organized a birds-of-a-feather session at EMNLP 2022 in Abu Dhabi. Around 30 people joined (in person and online). Thanks for coming!
  • 📔 Microsoft Research (Monojit Choudhury, Kalika Bali, Anirudh Srinivasan, and Sandipan Dandapat) gave a comprehensive tutorial on code-mixing at EMNLP 2019; you can check the following link.

🏫 Workshops

This is the list of the code-switching workshop series:

  • First Workshop on Computational Approaches to Code-switching, EMNLP 2014 [Website]
  • Second Workshop on Computational Approaches to Code-switching, EMNLP 2016
  • Third Workshop on Computational Approaches to Linguistic Code-switching, ACL 2018 [Website]
  • Fourth Workshop on Computational Approaches to Linguistic Code-switching, LREC 2020 [Website]
  • First Workshop on Speech Technologies for Code-switching in Multilingual Communities, Interspeech 2020 [Website]
  • Fifth Workshop on Computational Approaches to Linguistic Code-switching, NAACL 2021 [Website]
  • Sixth Workshop on Computational Approaches to Linguistic Code-switching, EMNLP 2023 [Website]

📑 Research Papers

Survey Papers

  • Winata, et al. (2023) The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges . ACL Findings [Paper]
  • Doğruöz, et al (2021) A Survey of Code-switching: Linguistic and Social Perspectives for Language Technologies . ACL [Paper]
  • Jose, et al. (2020) A Survey of Current Datasets for Code-Switching Research . International Conference on Advanced Computing and Communication Systems (ICACCS) [Paper]
  • Sitaram, et al. (2019) A Survey of Code-switched Speech and Language Processing . Arxiv [Paper]

Large Language Models

  • Yong, et al. (2023) Prompting Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages . Arxiv [Paper]

Language Identification and POS Tagging

  • Ostapenko, et al. (2022) Speaker Information Can Guide Models to Better Inductive Biases: A Case Study On Predicting Code-Switching . ACL [Paper]
  • Nguyen, et al. (2021) Automatic Language Identification in Code-Switched Hindi-English Social Media Text . Journal of Open Humanities Data [Paper]
  • Tarunesh, et al. (2021) From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text . ACL [Paper]
  • Gustavo Aguilar and Thamar Solorio. (2020) From English to Code-Switching: Transfer Learning with Strong Morphological Clues . ACL [Paper] [Code]
  • Mager, et al. (2019) Subword-Level Language Identification for Intra-Word Code-Switching . NAACL [Paper]
  • Zhang, et al. (2018) A Fast, Compact, Accurate Model for Language Identification of Codemixed Text . EMNLP [Paper]
  • Kelsey Ball and Dan Garrette. (2018) Part-of-Speech Tagging for Code-Switched, Transliterated Texts without Explicit Language Identification . EMNLP [Paper]
  • Zeynep Yirmibesoglu and Gulsen Eryigit. (2018) Detecting Code-Switching between Turkish-English Language Pair . Workshop W-NUT, EMNLP [Paper]
  • Mave, et al. (2018) Language Identification and Analysis of Code-Switched Social Media Text . 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Victor Soto and Julia Hirschberg. (2018) Joint Part-of-Speech and Language ID Tagging for Code-Switched Data . 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Bullock, et al. (2018) Predicting the presence of a Matrix Language in code-switching . 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Soto, et al. (2018) The Role of Cognate Words, POS Tags, and Entrainment in Code-Switching . Interspeech [Paper]
  • Barman, et al. (2016) Part-of-speech Tagging of Code-mixed Social Media Content: Pipeline,Stacking and Joint Modelling . 2nd Workshop on Computational Approaches to Code-Switching, ACL [Paper]
  • Vyas, et al. (2014) POS Tagging of English-Hindi Code-Mixed Social Media Content . EMNLP [Paper]
  • Heba Elfardy and Mona Diab. (2012) Token Level Identification of Linguistic Code Switching . COLING [Paper]
  • Thamar Solorio and Yang Liu. (2008) Learning to Predict Code-Switching Points . EMNLP [Paper]
  • Dau-Cheng Lyu and Ren-Yuan Lyu. (2008) Language Identification on Code-Switching Utterances Using Multiple Cues . Interspeech [Paper]
  • Whitehouse, et al. (2022) EntityCS: Improving Zero-Shot Cross-lingual Transfer with Entity-Centric Code Switching . EMNLP [Paper] [Code]
  • Lovenia, et al. (2022) ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation . LREC [Paper] [Dataset]
  • Nguyen, et al. (2020) CanVEC-the Canberra Vietnamese-English Code-switching Natural Speech Corpus . LREC [Paper]
  • Umapathy, et al. (2020) Investigating Modelling Techniques for Natural Language Inference on Code-Switched Dialogues in Bollywood Movies . First Workshop on Speech Technologies for Code-switching in Multilingual Communities, Interspeech 2020 [Dataset]
  • Xiang, et al. (2020) Sina Mandarin Alphabetical Words:A Web-driven Code-mixing Lexical Resource . AACL-IJCNLP [TBC]
  • Chakravarthi, et al. (2020) Corpus Creation for Sentiment Analysis in Code-Mixed Tamil-English Text . Spoken Language Technologies for Under-resourced languages) and CCURL (Collaboration and Computing for Under-Resourced Languages Workshop, LREC [Paper]
  • Khanuja, et al. (2020) A New Dataset for Natural Language Inference from Code-mixed Conversations . 4th Workshop of Computational Approaches to Linguistic Code-switching, LREC [Paper]
  • Barik, et al. (2019) Normalization of Indonesian-English Code-Mixed Twitter Data . W-NUT, EMNLP [Paper] [Dataset]
  • Singh, et al. (2018) A Twitter Corpus for Hindi-English Code Mixed POS Tagging . Sixth International Workshop on Natural Language Processing for Social Media, ACL [Paper]
  • Li, et al. (2012) A Mandarin-English Code-Switching Corpus . LREC [Paper]
  • Lyu, et al. (2010) SEAME: A Mandarin-English Code-Switching Speech Corpus in South-East Asia . Interspeech [Paper]
  • Lyu, et al. (2010) An Analysis of a Mandarin-English Code-switching Speech Corpus: SEAME . Age [Paper]

Language Modeling and Speech Recognition

  • Kumar, et al. (2020) Machine Learning based Language Modelling of Code Switched Data . International Conference on Electronics and Sustainable Communication Systems (ICESC) [Paper]
  • Madhumani, et al. (2020) Learning not to Discriminate: Task Agnostic Learning for Improving Monolingual and Code-switched Speech Recognition . Arxiv [Paper]
  • Shah, et al. (2020) Learning to Recognize Code-switched Speech Without Forgetting Monolingual Speech Recognition . Arxiv [Paper]
  • Winata, et al. (2020) Meta-Transfer Learning for Code-Switched Speech Recognition . ACL [Paper] [Code]
  • Chandu, et al. (2020) Style Variation as a Vantage Point for Code-Switching . Arxiv [Paper]
  • Ganji Sreeram and Rohit Sinha (2020) Exploration of End-to-End Framework for Code-Switching Speech Recognition Task: Challenges and Enhancements . IEEE Access [Paper]
  • Winata, et al. (2019) Code-Switched Language Models Using Neural Based Synthetic Data from Parallel Sentences . CoNLL [Paper]
  • Hila Gonen and Yoav Goldberg (2019) Language Modeling for Code-Switching:Evaluation, Integration of Monolingual Data, and Discriminative Training . EMNLP [Paper]
  • Lee, et al. (2019) Linguistically Motivated Parallel Data Augmentation for Code-switch Language Modeling . Interspeech [Paper]
  • Victor Soto and Julia Hirschberg (2019) Improving Code-Switched Language Modeling Performance Using Cognate Features . Interspeech [Paper]
  • Chang, et al. (2019) Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation . Interspeech [Paper]
  • Zeng, et al. (2019) On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition . Interspeech [Paper]
  • Taneja, et al. (2019) Exploiting Monolingual Speech Corpora for Code-mixed Speech Recognition . Interspeech [Paper]
  • Shan, et al. (2019) Investigating End-to-end Speech Recognition for Mandarin-english Code-switching . IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) [Paper]
  • Grandee Lee, Haizhou Li. (2019) Word and Class Common Space Embedding for Code-switch Language Modelling . IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) [Paper]
  • Hamed, et al. (2019) Code-Switching Language Modeling with Bilingual Word Embeddings: A Case Study for Egyptian Arabic-English . International Conference on Speech and Computer [Paper]
  • Winata, et al. (2018) Learn to Code-Switch: Data Augmentation using Copy Mechanism on Language Modeling . Arxiv [Paper]
  • Winata, et al. (2018) Towards End-to-end Automatic Code-Switching Speech Recognition . Arxiv [Paper]
  • Nakayama, et al. (2018) Speech Chain for Semi-Supervised Learning of Japanese-English Code-Switching ASR and TTS . IEEE Spoken Language Technology Workshop (SLT) [Paper]
  • Jesse Emond, Bhuwana Ramabhadran, Brian Roark, Pedro Moreno, and Min Ma. (2018) Transliteration Based Approaches to Improve Code-Switched Speech Recognition Performance , IEEE Spoken Language Technology Workshop (SLT) [Paper]
  • Ganji Sreeram and Rohit Sinha. (2018) Exploiting Parts-of-Speech for Improved Textual Modeling of Code-Switching Data . 2018 Twenty Fourth National Conference on Communications (NCC) [Paper]
  • Garg, et al. (2018) Code-switched Language Models Using Dual RNNs and Same-Source Pretraining . EMNLP [Paper]
  • Ewald van der Westhuizen and Thomas R. Niesler. (2018) Synthesised bigrams using word embeddings for code-switched ASR of four South African language pairs . Computer Speech and Language [Paper]
  • Biswal, et al. (2018) Multilingual Neural Network Acoustic Modelling for ASR of Under-Resourced English-isiZulu Code-Switched Speech . Interspeech [Paper]
  • Winata, et al. (2018) Code-Switching Language Modeling using Syntax-Aware Multi-Task Learning . 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper] [Code]
  • Chandu, et al. (2018) Language Informed Modeling of Code-Switched Text . 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Pratapa, et al. (2018) Language Modeling for Code-Mixing: The Role of Linguistic Theory based Synthetic Data . ACL [Paper]
  • Sivasankaran, et al. (2018) Phone Merging For Code-Switched Speech Recognition . 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Garg, et al. (2018) Dual Language Models for Code Switched Speech Recognition . Interspeech [Paper]
  • Baheti, et al. (2017) Curriculum Design for Code-switching: Experiments with Language Identification and Language Modeling with Deep Neural Networks . ICON [Paper]
  • Adel, et al. (2015) Syntactic and Semantic Features For Code-Switching Factored Language Models . IEEE Transactions on Audio, Speech, and Language Processing [Paper]
  • Ying Li and Pascale Fung. (2014) Code switch language modeling with Functional Head Constraint . ICASSP [Paper]
  • Ying Li and Pascale Fung. (2014) Language Modeling with Functional Head Constraint for Code Switching Speech Recognition . EMNLP [Paper]
  • Adel, et al. (2013) Combination of Recurrent Neural Networks and Factored Language Models for Code-Switching Language Modeling . ACL [Paper]
  • Adel, et al. (2013) Recurrent neural network language modeling for code switching conversational speech . ICASSP [Paper]
  • Vu, et al. (2012) A First Speech Recognition System for Mandarin-English Code-Switch Conversational Speech . ICASSP [Paper]
  • Ying Li and Pascale Fung. (2012) Code-switch Language Model with Inversion Constraints for Mixed Language Speech Recognition . COLING [Paper]
  • Li, et al. (2011) Asymmetric acoustic modeling of mixed language speech . ICASSP [Paper]
  • Sravani, et al. (2021) Political Discourse Analysis: A Case Study of Code Mixing and Code Switching in Political Speeches . Proceedings of the 5th Workshop on Computational Approaches to Code Switching (CALCS), NAACL [Paper]
  • Gupta, et al. (2020) A Semi-supervised Approach to Generate the Code-Mixed Text using Pre-trained Encoder and Transfer Learning . Findings of EMNLP [Paper]
  • Bryan Gregorius and Takeshi Okadome (2022) Generating Code-Switched Text from Monolingual Text with Dependency Tree . The 20th Annual Workshop of the Australasian Language Technology Association [Paper] [Code]

Speech Synthesis

  • Sai Krishna Rallabandi and Alan W Black (2019) Variational Attention using Articulatory Priors for generating Code Mixed Speech using Monolingual Corpora . Interspeech [Paper]
  • Sai Krishna Rallabandi and Alan W Black (2017) On Building Mixed Lingual Speech Synthesis Systems. Interspeech [Paper]
  • Chandu, et al. (2017) Speech Synthesis for Mixed-Language Navigation Instructions. Interspeech [Paper]
  • Guzman, et al. (2017) Metrics for modeling code-switching across corpora . Interspeech [Paper]

Representation Learning

  • Prasad, et al. (2021) The Effectiveness of Intermediate-Task Training for Code-Switched Natural Language Understanding . Proceedings of the 1st Workshop on Multilingual Representation Learning, EMNLP [Paper]
  • Winata, et al. (2021) Are Multilingual Models Effective in Code-Switching? . Proceedings of the 5th Workshop on Computational Approaches to Code Switching (CALCS), NAACL [Paper]
  • Rizal, et al. (2020) Evaluating Word Embeddings for Indonesian–English Code-Mixed Text Based on Synthetic Data . Proceedings of the 4th Workshop on Computational Approaches to Code Switching (CALCS), LREC [Paper]
  • Winata, et al. (2019) Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition . EMNLP [Paper] [Code]
  • Pratapa, et al. (2018) Word Embeddings for Code-Mixed Language Processing . EMNLP [Paper]

Machine Translation

  • Gaser, et al. (2023) Exploring Segmentation Approaches for Neural Machine Translation of Code-Switched Egyptian Arabic-English Text . EACL [Paper]
  • Vivek Srivastava and Mayank Singh (2020) PHINC: A Parallel Hinglish Social Media Code-Mixed Corpus for Machine Translation . W-NUT, EMNLP [Paper] [Dataset]
  • Thoudam Doren Singh and Thamar Solorio. (2017) Towards Translating Mixed-Code Comments from Social Media . CICLing [Paper]
  • Krishnan, et al. (2021) Multilingual Code-Switching for Zero-Shot Cross-Lingual Intent Prediction and Slot Filling . MRL, EMNLP [Paper]

Named Entity Recognition

  • Priyadharshini, et al. (2020) Named Entity Recognition for Code-Mixed Indian Corpus using Meta Embedding . 6th International Conference on Advanced Computing and Communication Systems (ICACCS) [Paper]
  • Winata, et al. (2019) Learning Multilingual Meta-Embeddings for Code-Switching Named Entity Recognition . RepL4NLP, ACL [Paper] [Code]
  • Aguilar, et al. (2018) Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task . 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Wang, et al. (2018) Code-Switched Named Entity Recognition with Embedding Attention . 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Winata, et al. (2018) Bilingual Character Representation for Efficiently Addressing Out-of-Vocabulary Words in Code-Switching Named Entity Recognition . 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Aguilar, et al. (2017) A Multi-task Approach for Named Entity Recognition in Social Media Data . 3rd Workshop on Noisy User-generated Text, EMNLP [Paper]


  • Li Nguyen. (2018) Borrowing or Code-switching? Traces of community norms in Vietnamese-English speech. Australian Journal of Linguistics 38.4 (2018): 443-466. [Paper]
  • Fairchild, Sarah, and Janet G. Van Hell. (2017) Determiner-noun code-switching in Spanish heritage speakers. Bilingualism: Language and Cognition 20.1 (2017): 150-161. [Paper]
  • Bhatt, Rakesh M., and Agnes Bolonyai. (2011) Code-switching and the optimal grammar of bilingual language use. Bilingualism: Language and Cognition 14.4 (2011): 522-546. [Paper]
  • Lipski (2005) Code-switching or Borrowing? No sé so no puedo decir, you know. Second Workshop on Spanish Sociolinguistics [Paper]
  • Roberto R. Heredia and Jeanette Altarriba (2001) Bilingual Language Mixing: Why Do Bilinguals Code-Switch? SAGE Publications [Paper]
  • Belazi, et al. (1994) Code switching and X-bar theory: The functional head constraint . Linguistic inquiry Vol 25 No.2 Spring [Paper]
  • Shana Poplack (1980) Sometimes I’ll start a sentence in Spanish y termino en español: toward a typology of code-switching . Linguistics 18(7-8) [Paper]
  • Pfaff, Carol W. (1979) Constraints on language mixing: intrasentential code-switching and borrowing in Spanish/English. Language: 291-318. [Paper]
  • Shana Poplack (1978) Syntactic structure and social function of code-switching . Vol. 2. Centro de Estudios Puertorriqueños, City University of New York [Paper]
  • Gumperz, J. J., & Hernandez, E. (1969) Cognitive aspects of bilingual communication . Institute of International Studies, University of California [Paper]

Affective Computing

  • Chakravarthi, et al. (2021) DravidianCodeMix: Sentiment Analysis and Offensive Language Identification Dataset for Dravidian Languages in Code-Mixed Text . Arxiv [Paper] [Code and Dataset]
  • Siddharth Yadav (2020) Unsupervised Sentiment Analysis for Code-mixed Data . Arxiv [Paper] [Code]
  • Wang, et al. (2017) Emotion Analysis in Code-Switching Text With Joint Factor Graph Model . IEEE/ACM Transactions on Audio, Speech, and Language Processing [Paper]
  • Wang, et al. (2016) A Bilingual Attention Network for Code-switched Emotion Prediction . COLING [Paper]
  • Sophia Lee and Zhongqing Wang (2015) Emotion in Code-switching Texts: Corpus Construction and Analysis . Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing [Paper]
  • Wang, et al. (2015) Emotion Detection in Code-switching Texts via Bilingual and Sentimental Information . ACL [Paper]

Dialog and Conversational System

  • Gupta, et al. (2018) Uncovering Code-Mixed Challenges: A Framework for Linguistically Driven Question Generation and Neural based Question Answering . CoNLL [Paper]
  • Sravani, et al. (2021) Political Discourse Analysis: A Case Study of Code Mixing and Code Switching in Political Speeches . CALCS Proceedings of the 5th Workshop on Computational Approaches to Code Switching (CALCS), NAACL [Paper]
  • Kodali, et al. (2022) SyMCoM - Syntactic Measure of Code Mixing A Study Of English-Hindi Code-Mixing . Findings of ACL [Paper]
  • Özlem Çetinoglu and Çagrı Çöltekin (2019) Challenges of Annotating a Code-Switching Treebank . SyntaxFest [Paper]

Adversarial Attack

  • Samson Tan and Shafiq Joty (2021) Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots . NAACL [Paper]

Social Linguistics

  • Bolock, et al. (2020) Who, When and Why: The 3 Ws of Code-Switching . International Conference on Practical Applications of Agents and Multi-Agent Systems [Paper]
  • Yoder, et al. (2017) Code-Switching as a Social Act:The Case of Arabic Wikipedia Talk Pages . Proceedings of the Second Workshop on Natural Language Processing and Computational Social Science, ACL [Paper]
  • Agarwal, et al. (2017) I may talk in English but gaali toh Hindi mein hi denge: A study of English-Hindi code-switching and swearing pattern on social networks . International Conference on Communication Systems and Networks (COMSNETS) [Paper]
  • Khanuja, et al. (2020) GLUECoS : An Evaluation Benchmark for Code-Switched NLP . ACL [Paper]
  • Aguilar, et al. (2020) LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation . LREC [Paper]

Social Media

  • Bali, et al. (2014) “I am borrowing ya mixing ?” An Analysis of English-Hindi Code Mixing in Facebook . Proceedings of The First Workshop on Computational Approaches to Code Switching [Paper]

Text Normalization

  • Dwija Parikh and Thamar Solorio (2021) Normalization and Back-Transliteration for Code-Switched Data . Proceedings of the 5th Workshop on Computational Approaches to Code Switching (CALCS), NAACL [Paper]

Synthetic Data Generation Toolkit

  • Jayanthi, et al. (2021) CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Mixing . CALCS Proceedings of the 5th Workshop on Computational Approaches to Code Switching (CALCS), NAACL [Paper] [Code]
  • Rizvi, et al. (2021) GCM: A Toolkit for Generating Synthetic Code-mixed Text . EACL (System Demonstrations) [Paper] [Code]

Annotation Toolkit

  • Shah, et al. (2019) CoSSAT: Code-Switched Speech Annotation Tool . Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP [Paper]


  • Mehnaz, et al. (2021) GupShup: Summarizing Open-Domain Code-Switched Conversations . EMNLP

Question Answering

  • Gupta, et al. (2020) A Unified Framework for Multilingual and Code-Mixed Visual Question Answering . AACL-IJCNLP [TBA]
  • Bawa, et al. (2020) Do Multilingual Users Prefer Chat-bots that Code-mix? Let's Nudge and Find Out! . ACM on Human-Computer Interaction [Paper]
  • Banerjee, et al. (2018) A Dataset for Building Code-Mixed Goal Oriented Conversation Systems . COLING [Paper]

Position Paper

  • Nguyen, et al. (2022) Building Educational Technologies for Code-Switching: Current Practices, Difficulties and Future Directions . Languages [Paper]
  • Torres Cacoullos and Travis (2018) Bilingualism in the Community . Cambridge University Press
  • Genta Indra Winata (2021) Multilingual Transfer Learning for Code-Switched Language and Speech Neural Modeling . [Thesis]
  • Gustavo Aguilar (2020) Neural Sequence Labeling on Social Media Text . [Thesis]
  • Victor Soto Martinez (2020) Identifying and Modeling Code-Switched Language . [Thesis]


Title: A Survey on the Memory Mechanism of Large Language Model based Agents

Abstract: Large language model (LLM) based agents have recently attracted much attention from the research and industry communities. Compared with the original LLMs, LLM-based agents are distinguished by their self-evolving capability, which is the basis for solving real-world problems that need long-term and complex agent-environment interactions. The key component supporting agent-environment interactions is the memory of the agents. While previous studies have proposed many promising memory mechanisms, they are scattered across different papers, and a systematic review that summarizes and compares these works from a holistic perspective is lacking, so common and effective design patterns that could inspire future studies have not been abstracted. To bridge this gap, we present a comprehensive survey on the memory mechanism of LLM-based agents. Specifically, we first discuss "what is" and "why do we need" the memory in LLM-based agents. Then, we systematically review previous studies on how to design and evaluate the memory module. In addition, we present many agent applications in which the memory module plays an important role. Finally, we analyze the limitations of existing work and point out important future directions. To keep up with the latest advances in this field, we maintain a companion repository.


Code Switch


Exclusion, resilience and the Chinese American experience on 'Mott Street'

Lori Lizarraga

B.A. Parker


Leah Donnella

Author Ava Chin poses next to the cover of her recent book, Mott Street: A Chinese American Family's Story of Exclusion and Homecoming. Author headshot via Tommy Kha.

From the time she was a young child, Ava Chin heard stories about her family's roots in the United States. Some of the most vivid stories centered on her family's involvement in the building of the transcontinental railroad in the 1800s. But when she saw pictures of the railroad's construction back in grade school, she says, not a single Chinese face was staring back at her. That was one of the moments that shaped her decision to become a writer – when she realized that there were huge chunks of American history simply not being told. This week on the podcast, we're revisiting a conversation we had with Chin about her book, Mott Street. Through decades of painstaking research, the fifth-generation New Yorker discovered the stories of how her ancestors bore and resisted the weight of the Chinese Exclusion laws in the U.S. – and how the legacy of that history still affects her family today.



  1. (PDF) Classroom Code-Switching: Three Decades of Research

  2. (PDF) Code-Switching in Linguistics: A Position Paper

  3. (PDF) Code-Switching/Mixing in ESL Contexts: Challenges and Opportunities

  5. (PDF) Code-Switching: A Pedagogical Strategy in Bilingual Classrooms

  6. (PDF) Code-Switching Pragmatics




  1. Issues and Functions of Code-switching in Studies on Popular Culture: A Systematic Literature Review

    This paper highlights and reviews code-switching in studies of which the scope is its usage in popular culture, with the aims to explore the most frequently discussed issues in the realm of code ...

  2. Acculturation and attitudes toward code-switching: A bidimensional

    Furthermore, research assessing code-switching attitudes has employed direct data collection methods, such as surveys and questionnaires (e.g. Dewaele & Wei, 2014b ... here. Following Clément and Noels (1992; see also Clément et al., 1993; Damji et al., 1996), our focus in the present paper is the actual cultural identity of the individual, ...

  3. Code-Switching in Linguistics: A Position Paper

    Abstract: This paper provides a critical review of the state of the art in code-switching research being conducted in linguistics. Three issues of theoretical and practical importance are ...

  4. (PDF) Code Switching: Linguistic

    In the world, code-switching has been studied since the 1970s (in Stell and Yakpo, 2015: 2) and received 'serious scholarly attention in the last few decades' (Poplack, 2001: 2062 ...

  5. (PDF) Code-switching

    PDF | On Jan 1, 2012, Angel Lin and others published Code-switching | Find, read and cite all the research you need on ResearchGate

  6. Code-Switching in Linguistics: A Position Paper

    This paper provides a critical review of the state of the art in code-switching research being conducted in linguistics. Three issues of theoretical and practical importance are explored: (a) code-switching vs. borrowing; (b) grammaticality; and (c) variability vs. uniformity, and I take a position on all three issues. Regarding switching vs. borrowing, I argue that not all lone other-language ...

  7. Worldwide Trend Analysis of Psycholinguistic Research on Code Switching

    As a result, research on code switching has garnered growing attention, leading to a substantial number of published papers in recent decades. To gain insights into the current status and potential trends of psycholinguistic research on code switching, this study conducted a bibliometric analysis of 1,293 articles focusing on code switching ...

  8. Codeswitching: A Bilingual Toolkit for Opportunistic Speech Planning

    Introduction. Traditionally, the study of codeswitching production and bilingual speech more generally has been carried out within separate disciplines, where cognitive psychologists and psycholinguists have primarily centered on exogenously-cued language switching, 1 and sociolinguists have focused on the analysis of codeswitching patterns within discourse of members of a given speech community.

  9. [2212.09660] The Decades Progress on Code-Switching Research in NLP: A

    Code-Switching, a common phenomenon in written text and conversation, has been studied over decades by the natural language processing (NLP) research community. Initially, code-switching is intensively explored by leveraging linguistic theories and, currently, more machine-learning oriented approaches to develop models. We introduce a comprehensive systematic survey on code-switching research ...

  10. PDF The Decades Progress on Code-Switching Research in NLP: A Systematic

    Before the proliferation of social media platforms, it was more common to observe code-switching in spoken language and not so much in written language. This is not the case anymore, as multilingual users tend to combine the languages they speak on social media; 2) The increasing release of voice-operated devices.

  11. [PDF] Issues and Functions of Code-switching in Studies on Popular

    Code-switching is a linguistic phenomenon often associated with the architecture of discourse varieties. A good number of studies in the bilingual and multilingual contexts have zoomed in on the use of code-switching primarily analysing its roles and functions in varied discourse settings. This paper highlights and reviews code-switching in studies of which the scope is its usage in popular ...

  12. The Decades Progress on Code-Switching Research in NLP

    Winata, Genta; Aji, Alham Fikri; Yong, Zheng Xin; Solorio, Thamar. The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges. In Rogers, Anna; Boyd-Graber, Jordan; Okazaki, Naoaki (eds.), Findings of the Association for Computational Linguistics: ACL 2023, July 2023. Association for Computational ...

  13. PDF A Survey of Current Datasets for Code-Switching Research

    ... able corpora for code switching research played a major role in advancing this area of research. In this paper, we propose a set of quality metrics to evaluate the dataset and categorize them accordingly. Index Terms: code switching, natural language processing, dataset. I. Introduction: With the advent of social media, users prefer to mix multiple ...

  14. Issues in Code-Switching: Competing Theories and Models

    This paper provides a critical overview of the theoretical, analytical, and practical questions most prevalent in the study of the structural and the sociolinguistic dimensions of code-switching (CS). In doing so, it reviews a range of empirical studies from around the world. The paper first looks at the linguistic research on the structural features of CS focusing in particular on the code ...

  15. Information Code-Switching: A Study of Language Preferences in Academic

    Code-switching is an active research area that has significant implications for academic libraries. Using data from focus groups and a survey tool, this paper examines language preferences of foreign-born students for particular information tasks. ... The main finding of this paper is that students' culture and language represent an active ...

  16. A Survey of Code-switching: Linguistic and Social Perspectives for

    Abstract The analysis of data in which multiple languages are represented has gained popularity among computational linguists in recent years. So far, much of this research focuses mainly on the improvement of computational methods and largely ignores linguistic and social aspects of C-S discussed across a wide range of languages within the long-established literature in linguistics.

  17. PDF Code-Switching in Linguistics: A Position Paper

    Abstract: This paper provides a critical review of the state of the art in code-switching research being conducted in linguistics. Three issues of theoretical and practical importance are explored: (a) code-switching vs. borrowing; (b) grammaticality; and (c) variability vs. uniformity, and I take a position on all three issues.

  18. (PDF) Code Switching and Students' Performance in English

    Abstract and Figures. This study determined the influence of code switching to the academic performance of students in English. A total of 40 incoming Grade 10 students participated in this study ...

  19. GitHub

    If you are new to code-switching or looking for a new research direction, we have written a comprehensive survey paper on code-switching: The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges. Feel free to read and let us know if you have any suggestions!

  20. PDF Code-switching as a Result of Language Acquisition

    Gumperz (1982b) defined code-switching as "the juxtaposition within the same speech exchange of passages of speech belonging to two different grammatical systems or subsystems" (p. 59). The emphasis is on the two grammatical systems of one language, although most people refer to code-switching as the mixed use of ...

  21. Code-Switching in the Classroom: Research Paradigms and Approaches

    Classroom code-switching refers to the alternating use of more than one linguistic code in the classroom by any of the classroom participants. This chapter provides a review of the historical ...

  22. PDF Pedagogic Code-Switching: A Case Study of the Language Practices of

    This qualitative research was conducted to investigate the language practices of two Filipino high school teachers during instruction in English language classrooms. Specifically, this study aimed to determine the types of code-switching ... code-switching in their EFL curriculum as a tool in various language learning activities (Kasperczyk ...

  23. PDF A study of code-mixing and code-switching (Urdu and Punjabi) in ...

    The problem under investigation is to examine socio-ethnic with the linguistic occurrence of code-mixing and code-switching. This research paper demonstrated parsing of the code-mixed and code-switched discourse structures uttered by children aged 2 to 5 in everyday conversations and discussions.

  24. Prompting Towards Alleviating Code-Switched Data Scarcity in Under

    Many multilingual communities, including numerous in Africa, frequently engage in code-switching during conversations. This behaviour stresses the need for natural language processing technologies adept at processing code-switched text. However, data scarcity, particularly in African languages, poses a significant challenge, as many are low-resourced and under-represented. In this study, we ...

  25. [2404.13501] A Survey on the Memory Mechanism of Large Language Model

    Large language model (LLM) based agents have recently attracted much attention from the research and industry communities. Compared with original LLMs, LLM-based agents are featured in their self-evolving capability, which is the basis for solving real-world problems that need long-term and complex agent-environment interactions. The key component to support agent-environment interactions is ...

  26. Ava Chin explores her Chinese American family history in her book ...

    Ava Chin explores her Chinese American family history in her book Mott Street : Code Switch This week on the podcast, ... Through decades of painstaking research, the fifth-generation New Yorker ...

  27. (PDF) code-switching and code mixing

    moving between distinct varieties is known as code switching. ... The codes are part of the accepted research paper: "Modified parameter-setting-free harmony search (PSFHS) algorithm for ...