1. Introduction
The present study investigates the overall rate of occurrence and duration of intrusive vowels (hereafter IVs) in Spanish, with particular emphasis on their realization among monolingual speakers of Spanish and Maya-Spanish bilinguals residing in Yucatan, Mexico. An intrusive vowel – also sometimes referred to as a svarabhakti vowel – is a short, vowel-like fragment that can be inserted between an obstruent consonant and an adjacent liquid (Bradley, 2006; Navarro Tomas, 1918; Quilis, 1970, 1993). IVs can occur in both hetero- and tautosyllabic consonant clusters in which the adjacent consonant is a rhotic or lateral, similar to those presented in example (1) a-d (Schmeiser, 2009, p. 193). The superscript [ə] represents the IV.
(1) Tautosyllabic /Cɾ/ clusters
a. pronto [pəɾ] ‘soon’
b. otro [təɾ] ‘other’
Heterosyllabic /ɾ.C/ clusters
c. parte [ɾət] ‘part’
d. porque [ɾək] ‘because’
Unlike the process of epenthesis, an IV is not part of the syllabic structure of the word, but instead is argued to be the result of gestural timing (Bradley, 2006; but see Hall, 2006, for an alternate proposal). Many speakers may be unaware that they produce IVs in certain phonetic environments, resulting in highly variable rates of production within the same word produced by the same speaker (Gili Gaya, 1921, cited in Schmeiser, 2006).
Despite the variability, however, a considerable amount of research on IVs in Spanish suggests that certain phonetic environments are more favorable to the occurrence of IVs, and that IV duration in particular is influenced by factors such as consonant place of articulation, voicing, stress, and position in word (Blecua, 2001; Colantoni & Steele, 2007; Ramírez, 2006; Schmeiser, 2006, 2020, among others). The first goal of our study is to examine how frequently IVs occur in a contact dialect of Spanish – namely Yucatan Spanish (hereafter YS) – which is characterized by a series of unique phonetic traits that are often attributed, in part, to its extensive contact with Yucatec Maya (see Michnowicz, 2015, for an overview). A second goal is to examine IV duration and explore how the length of IVs, as well as their rate of occurrence, might be influenced by both linguistic (e.g., cluster type, stress, position in word) and extralinguistic factors (e.g., age, sex, language background). A third and final goal entails describing what the combined results of frequency and duration might reveal about the trajectory of phonetic processes in YS, given that previous research has found rapid generational changes in the dialect (Michnowicz, 2015). In order to better understand any potential changes in IVs across generational groups, data are included from both real and apparent sociolinguistic time. Apparent time compares behavior across generations within the same time period (here, older vs. middle-aged vs. younger speakers in data collected in 2005), while real time involves data collected at a later date with the goal of increasing the time-depth of the analysis (here, an additional group of younger speakers interviewed roughly a decade later, in 2016) (Labov, 1994). These four age groups (older-2005, middle-2005, younger-2005, younger-2016) form the basis of the present analysis of IVs.
After reviewing some of the overall trends regarding IV rate of occurrence and duration in Spanish (section 2), section 3 describes general traits of YS and Yucatec Maya, focusing primarily on the phonetic uniqueness of the dialect. The methodology employed in the study is presented next, followed by separate sections describing the analyses of overall rate of occurrence (6.1) and duration (6.2). The study concludes with a discussion of our findings, along with closing remarks and considerations for future research.
2. Intrusive Vowels in Spanish
Research on vowel intrusion in Spanish has approached the phenomenon from a variety of angles, including phonological analyses (e.g., Bradley, 2002; Hall, 2006), acoustic-phonetic approaches (e.g., Colantoni & Steele, 2005; Ramírez, 2000; Schmeiser, 2009, 2020), and even perception studies (e.g., Ramírez, 2006). While in conjunction all three approaches provide a comprehensive view of vowel intrusion, we adopt an acoustic-phonetic approach, focusing on the overall rate of occurrence of IVs in our corpus, as well as their duration. Analyses of both rate of vowel intrusion and IV duration provide a more holistic view of the YS phonetic/phonological system, allowing for comparisons with rates of vowel intrusion established in other varieties of Spanish, as well as permitting a detailed acoustic characterization of the IVs themselves that also lends itself well to cross-dialectal comparisons.
2.1. Rate of Occurrence
Regarding overall rate of occurrence, Blecua (2001) reports that IVs are more common in tautosyllabic (/Cɾ/) clusters than heterosyllabic (/ɾC/) clusters, and additional studies indicate a greater prevalence of vowel intrusion in the context of rhotics than laterals (Bradley, 2006; Colantoni & Steele, 2005; Hall, 2003; Ramírez, 2006). For these reasons, the envelope of variation in our work is defined as only tautosyllabic clusters consisting of a stop consonant /ptkbdg/ and a rhotic tap /ɾ/ similar to those presented in (1) a-b.[1]
The impact of consonant place of articulation and voicing on the rate of vowel intrusion is somewhat less consistent. Colantoni and Steele (2007) reported no significant differences in rate when comparing labial, coronal, and dorsal consonant clusters in Argentine Spanish, but Ramírez (2006) found a greater prevalence in dental clusters based on productions from speakers throughout Latin America. Similar conflicting results concern voicing characteristics of the adjacent consonant, with reports of higher rates in the context of voiced consonants in some cases (Colantoni & Steele, 2005, 2007), but voiceless consonants in others (Ramírez, 2006).
Rate of occurrence of IVs is also highly variable cross-dialectally, with ranges from 25% to nearly 100%, as shown in Table 1.
In summary, while most studies concur that vowel intrusion is more likely to occur in tautosyllabic clusters containing a rhotic, results vary regarding the influence of consonant place of articulation, voicing, and dialect.
2.2. IV Duration
Various works have also examined the duration of IVs, revealing consistent trends for some variables and more mixed findings for others. Regarding overall duration of IVs, Quilis (1970) reports a range of 8-56 milliseconds, with an average duration of 20-30 milliseconds. This range has been attested in several more recent studies (e.g., Colantoni & Steele, 2007; Schmeiser, 2020), although Ramírez (2006) argues for a more specific metric, stating that the duration of the IV is approximately one third that of the duration of the full vowel.
One of the most consistent findings throughout the literature concerns the role of voicing characteristics of the adjacent consonant, indicating that IVs are longer in the context of a voiced consonant than in clusters with a voiceless consonant (Bradley & Schmeiser, 2003; Colantoni & Steele, 2007; Kilpatrick et al., 2006; Schmeiser, 2020). Also consistent is the general tendency for lengthened duration in contact with dorsal consonants relative to labials (Bradley & Schmeiser, 2003; Schmeiser, 2007, 2020).
The influence of position in word (initial versus medial) and prosodic stress, however, is less clear. While some investigations have reported lengthened duration of IVs in word-initial position (Bradley & Schmeiser, 2003), others report the reverse (Colantoni & Steele, 2007), and still others find no significant difference based on position (Kilpatrick et al., 2006; Schmeiser, 2007, 2020). Regarding prosodic stress and IV duration, results are equally mixed, with some reports of lengthened duration at the onset of a stressed syllable (Bradley & Schmeiser, 2003; Colantoni & Steele, 2007), while other studies cite no significant differences between stressed and unstressed syllables (Kilpatrick et al., 2006; Schmeiser, 2006, 2020).
In conjunction, previous investigations of several varieties of Spanish confirm general tendencies regarding the influence of place of articulation and voicing on the rate of vowel intrusion and IV duration, yet less consistent findings regarding aspects such as position in word and stress. The present work will address these same phonetic traits with the aim of broadening our understanding of the process of vowel intrusion, while also providing an initial characterization of the phenomenon in YS.
3. Yucatan Spanish
Yucatan Spanish (YS), spoken in the Mexican States of Yucatan, Quintana Roo and, to a lesser extent, parts of Campeche, is a distinct regional variety characterized by lexical, morphosyntactic and phonetic/phonological features that distinguish it from other varieties of Mexican Spanish (Michnowicz, 2015). Importantly, many of the distinguishing features of YS have been attributed to direct or indirect influence of the indigenous contact language, Yucatec Maya. Phonetic/phonological features with possible Maya origins include the aspiration of /ptk/ (Michnowicz & Carpenter, 2013); occlusive productions of intervocalic /bdg/ (Michnowicz, 2009; Michnowicz et al., 2023); glottal stop insertion before vowel-initial words (Michnowicz & Kagan, 2016); the labialization of absolute final nasals (Michnowicz, 2008, 2021; Uth, 2022); and a more stress-timed prosodic rhythm (Michnowicz & Hyler, 2020).
Importantly, while these studies find differing levels of possible Maya influence, there is a consensus that speakers of YS are moving from regional forms to more pan-Hispanic forms across age groups, with the youngest speakers essentially matching Central Mexican norms for many variants (Michnowicz, 2015). As a whole, studies suggest that age group is a greater predictor of linguistic variation than bilingualism, at least in the capital city of Mérida and surrounding regions (Michnowicz, 2015). Still, features that are more likely to be direct transfers from Maya (e.g., glottal stop insertion) do show important differences between Spanish monolinguals and Maya-Spanish bilinguals (Michnowicz & Kagan, 2016).
While no studies exist on vowel intrusion in either YS or Maya, examining the phonological system of Yucatec Maya will enable us to make some predictions about how Maya-Spanish bilingualism might affect intrusive vowel realization in YS. Yucatec Maya itself has a five-vowel system that mirrors Spanish (/i/, /e/, /a/, /o/, /u/), with vowel length being a phonemic feature, as well as the non-phonemic [ə] that is used as an epenthetic vowel to resolve illicit consonant clusters, both those created through morphemic processes internal to the language, as well as in loanwords (Bricker & Orie, 2014). The phonetic status of [ə] in Maya as an epenthetic vowel to repair illicit consonant clusters can be seen in the word (kche’ /k-t͡ʃeʔ/ > [kət͡ʃeʔ] ‘our tree’) (Bennett, 2016, p. 472). Here, a schwa is inserted between the clitic pronoun k ‘our, we’ and the initial consonant of the following noun (Bricker & Orie, 2014, p. 182).
In addition to [ə] epenthesis, Maya also possesses other epenthetic processes, such as glottal stop insertion, whereby vowel-initial words receive an epenthetic /ʔ/, as vowel-initial contexts are not permitted in Maya (Lope Blanch, 1987; Michnowicz & Kagan, 2016). While IVs have not been examined in YS, glottal stop insertion does occur in the Spanish of the region, follows patterns found in Maya, and is more likely to be produced by Maya-Spanish bilingual speakers, which Michnowicz & Kagan (2016) attribute to direct transfer from the contact language. While IV epenthesis, unlike glottal stop epenthesis, does exist in (non-contact) Spanish, we might hypothesize that patterns of vowel intrusion in YS may mirror those of glottal stop insertion, as patterns of schwa use in Maya reinforce existing vowel epenthesis in Spanish, particularly for bilingual speakers.
As has been noted, in both Maya and Spanish, intrusive vowels serve to repair difficult or illicit consonant clusters. The consonant clusters in question for the present study involve the rhotic alveolar tap, /ɾ/. To complicate matters, Lipski (2012) suggests that bilingual speakers in Yucatan can variably produce a retroflex rhotic similar to the [ɻ] in English (p. 303), and this vowel-like realization may bleed the required obstruent+liquid context for IV insertion. Additional facts about the distribution of /ɾ/ in Maya suggest that bilinguals may produce /Cɾ/ clusters in a different way than monolinguals. In Maya, /ɾ/ is largely restricted to intervocalic position in polysyllabic expressive roots (Bennett, 2016, p. 482). An example proposed by Kidder (2013) (sitriiyo > si.trii.yo ‘species of bird’) shows the only acceptable consonant cluster /tɾ/, “which is exceedingly rare in native Mayan words, as is the sound /ɾ/, which is thought to be a borrowing from Spanish” (p. 50). Bennett (2016) also points out that the Mayan rhotic is in fact realized as a flap or tap, yet when devoiced, may become a retroflex [ɻ] or a post-alveolar fricative [ʃ] (p. 16). In this way, the rhotic realization in /Cɾ/ clusters by Maya speakers could be a tap, a trill, a fricative, or a sound approximating /ɾ/, thus affecting the intrusive vowel production in said cluster. Whether the possible retroflex-like /ɾ/ produced in the Spanish of some Maya-Spanish bilinguals has been favored by the variable production of /ɾ/ in Maya, or whether it has arisen independently, is a question that requires future research. The fact that the /ɾ/ in Maya is thought to be a borrowing from Spanish suggests that there could be differences in its production in bilingual speech, and high rates of approximant or fricative /ɾ/ would remove contexts in which intrusive vowels are produced, therefore possibly predicting fewer instances of IVs among bilinguals.
4. Research Questions
The previous research described in sections 2 and 3 motivates the need for further exploration of vowel intrusion in a unique contact variety of Spanish, taking into consideration the linguistic factors that have been shown to influence IV rate and duration as well as those for which no consistent pattern has yet emerged. An additional contribution is the direct examination of how extralinguistic factors might influence rate and duration, which in turn offers insight into the complex and evolving nature of the phonetic/phonological system of YS. Thus, the present study is guided by the following research questions:
1a) How frequently do IVs occur in YS?
1b) How do linguistic factors (i.e., cluster type, position in word, stress) and extralinguistic factors (i.e., age group, speaker sex, language background) influence their frequency?
2a) What is the typical duration of IVs in YS?
2b) How do linguistic and extralinguistic factors influence the length of IVs?
3a) Do the distribution, frequency and duration of IVs show changes across generations as measured by real and apparent time, as has been found for numerous other variables in YS?
3b) What might comparisons within and across groups reveal about IVs in YS, and how they might be evolving in this dialect region?
Given that previous research has revealed several consistent trends regarding the influence of linguistic factors on IV frequency and duration, such as the greater likelihood of occurrence and lengthened duration in clusters containing a voiced consonant, we predict similar findings in our study due to the general nature of intrusive vowels. The potential influence of extralinguistic factors such as age and language background on IV frequency and duration, however, is somewhat less clear. On the one hand, we may expect that bilingual speakers, who already speak a language which exhibits vowel insertion, might produce more intrusive vowels, as they are accustomed to this process. On the other hand, bilingual individuals may (consciously or subconsciously) suppress vowel intrusion as a means to distinguish it from the phonological process of epenthesis characteristic of Maya. Both possibilities will be explored here. Likewise, data from both real and apparent sociolinguistic time will shed light on the current and future trajectory of IVs in YS.
5. Methods and Materials
5.1. Corpora and Participants
The data for the present study were obtained from two sets of spontaneous sociolinguistic interviews conducted in and around the capital city of Mérida, Yucatan, Mexico. The first set of interviews was conducted in 2005 and consisted of 24 interviews with both male and female speakers divided into three distinct age groups: younger speakers (ages 22-29; born between 1975 and 1983), middle-aged speakers (ages 30-49; born between 1956 and 1974), and older speakers (ages 50-89; born between 1916 and 1955). Twelve additional interviews were collected in 2016 and were conducted with men and women who were considered younger speakers (ages 18-29; born between 1987 and 1998). In addition to their sex and age, all participants were asked to indicate their language background, resulting in subgroups who self-reported that they were monolingual speakers of Spanish or bilingual in Spanish and Maya.[2] All interviews lasted between 30 and 60 minutes. The breakdown of participants is presented in Table 2.
In total, 36 interviews were conducted across the two collection times, resulting in a roughly equal distribution of male (n=16) and female (n=20) speakers, as well as a nearly even split between monolingual Spanish speakers (n=19) and Maya-Spanish bilinguals (n=17). The presence of three distinct age groups in the 2005 data makes it possible to analyze linguistic phenomena in apparent time, with older speakers representing an earlier phase and younger speakers a later phase in dialect formation. The inclusion of a group of young speakers interviewed in 2016 – more than 10 years later – allows for the assessment of linguistic variation and change in real time as well, thus permitting a deeper understanding of whether and how language and dialect contact may be shaping the unique speech of those residing in Yucatan.
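The grouping just described can be sketched in a few lines of code. This is only an illustration: the function name `assign_age_group` and the lookup structure are assumptions, while the age ranges, collection years, and group labels are those given in the text.

```python
from typing import Optional

# Inclusive age ranges per collection year, as described for the two corpora.
AGE_GROUPS = {
    2005: [("younger-2005", 22, 29), ("middle-2005", 30, 49), ("older-2005", 50, 89)],
    2016: [("younger-2016", 18, 29)],
}


def assign_age_group(collection_year: int, age: int) -> Optional[str]:
    """Return the age-group label, or None if the speaker falls outside the sampled ranges."""
    for label, low, high in AGE_GROUPS.get(collection_year, []):
        if low <= age <= high:
            return label
    return None


# Apparent time: the three groups recorded in the same year (2005).
apparent_time = [label for label, _, _ in AGE_GROUPS[2005]]
# Real time: younger speakers recorded at the two collection dates.
real_time = ["younger-2005", "younger-2016"]

print(assign_age_group(2005, 63))
print(apparent_time, real_time)
```

Keeping the collection year in the label makes the two comparisons explicit: the apparent-time contrast holds the year constant, and the real-time contrast holds the age bracket roughly constant.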
5.2. Data Analysis and Coding
In order to examine the overall frequency of IVs, as well as their duration, all potential contexts in which an IV could occur were identified and delimited using PRAAT software (Boersma & Weenink, 2020). The envelope of variation was limited to tautosyllabic /Cɾ/ clusters, and only included the voiced and voiceless stop series (i.e., /ptkbdg/). The presence of an intrusive vowel was determined with a combination of auditory and visual cues present in PRAAT, including a visible vocalic segment with a somewhat consistent formant structure that was situated between the initial consonant and the following tap. A waveform and spectrogram of the word tres (‘three’), in which an intrusive vowel is present, are presented in Figure 1.
In the cases in which an IV was present, the total duration of the vocalic segment was measured using an automated procedure in PRAAT. IVs were subsequently coded for the linguistic and extralinguistic factors presented in Table 3.
A total of 4,328 contexts in which an IV could occur were identified in the interviews. Of those, 58 tokens were excluded due to the presence of background noise, mispronunciations, or stutters, resulting in 4,270 /Cɾ/ clusters.
5.3. Statistical Analysis
Two separate statistical analyses were conducted, one to assess IV rate of occurrence and the second to examine IV duration. For the analysis of rate, the 4,270 total tokens were coded as either “yes” or “no” to indicate the presence or absence of an IV, respectively. Analyses of IV duration were carried out on the 2,180 tokens in which an IV did occur. In both sets of analyses, each example was coded for all of the variables in Table 3, although only “cluster”, and not “/C/ place” or “/C/ voice”, was included in the statistical analyses.[3] The impact of the linguistic (i.e., cluster, position in word, tonicity) and extralinguistic factors (i.e., age group, language background, speaker sex) was analyzed via a series of mixed-effects logistic regression models (for rate of occurrence) and mixed-effects linear regressions (for IV duration) with speaker as a random intercept in R (R Core Team, 2023). The minimal model was determined via model comparison with ANOVA and AIC. In order to better understand interactions within the data, conditional inference trees were generated with the partykit package (Hothorn & Zeileis, 2015). Model descriptions and goodness-of-fit measures were obtained with the report package (Makowski et al., 2023).
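Because each cluster label fixes both the place and the voicing of its stop, the two coded variables can be recovered from “cluster” alone. The sketch below illustrates that relationship and the split between the two datasets. The `Token` class, its field names, and the example tokens are hypothetical and are not taken from the study's coding scheme or data.

```python
from dataclasses import dataclass
from typing import List, Optional

# Stop consonant -> (place of articulation, voicing).
STOP_FEATURES = {
    "p": ("labial", "voiceless"), "b": ("labial", "voiced"),
    "t": ("dental", "voiceless"), "d": ("dental", "voiced"),
    "k": ("dorsal", "voiceless"), "g": ("dorsal", "voiced"),
}


@dataclass
class Token:
    """One /Cɾ/ context (hypothetical record layout)."""
    cluster: str                      # e.g. "tɾ"
    speaker: str                      # grouping variable for the random intercept
    iv_present: bool                  # outcome for the rate analysis
    duration_ms: Optional[float] = None  # outcome for the duration analysis

    @property
    def place(self) -> str:
        return STOP_FEATURES[self.cluster[0]][0]

    @property
    def voice(self) -> str:
        return STOP_FEATURES[self.cluster[0]][1]


def duration_subset(tokens: List[Token]) -> List[Token]:
    """Keep only tokens in which an IV occurred (input to the duration analysis)."""
    return [t for t in tokens if t.iv_present]


# Invented example tokens, for illustration only.
tokens = [
    Token("gɾ", "S01", True, 25.0),
    Token("pɾ", "S01", False),
    Token("tɾ", "S02", True, 19.5),
]
print([(t.cluster, t.place, t.voice) for t in tokens])
print(len(duration_subset(tokens)))
```

The rate analysis would use every token with `iv_present` as the binary outcome, whereas the duration analysis would use only the subset returned by `duration_subset`.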
6. Results
6.1. Intrusive vowel rate of occurrence
Of the total 4,270 potential contexts within the envelope of variation, 2,180 tokens were identified as containing an IV, for an overall rate of 51%. This rate of occurrence falls within the range reported for several Latin American varieties (Kilpatrick et al., 2006; Ramírez, 2006; Schmeiser, 2006) but is slightly lower than the rates reported for Peninsular and Argentine varieties (Blecua, 2001; Colantoni & Steele, 2005). Rate of occurrence across speakers was highly variable, ranging from 17.2% to 90.1% (see Appendix A, Table A1 for individual rates).
The results of the minimal mixed-effects logistic regression revealed that while no extralinguistic factors (i.e., age group, speaker sex, language background) significantly influenced rate, cluster type, position in word, and tonicity emerged as significant. The model’s total explanatory power is moderate (conditional R2 = 0.20) and the part related to the fixed effects alone (marginal R2) is 0.07, indicating that inter-speaker variation played an important role in the presence or absence of IVs. The full results are presented in Table 4.
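The token counts and the overall rate can be checked with simple arithmetic. The sketch below uses only the figures reported in the text; the function name `iv_rate` is an assumption made for illustration.

```python
def iv_rate(n_with_iv: int, n_contexts: int) -> float:
    """Proportion of potential contexts in which an IV was produced."""
    return n_with_iv / n_contexts


identified = 4328   # contexts identified in the interviews
excluded = 58       # removed for noise, mispronunciations, or stutters
analyzed = identified - excluded
with_iv = 2180      # tokens containing an IV

rate = iv_rate(with_iv, analyzed)
print(f"{analyzed} contexts analyzed; IV rate = {rate:.1%}")
```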
In examining the relative frequencies based on cluster type (i.e., /ptkbdg/+/ɾ/), IVs occurred in clusters with voiceless consonants significantly less frequently than the reference level (/gɾ/).[4] The effect of cluster type can more easily be visualized via a conditional inference tree that includes just cluster type (Figure 2).
The first, right-branching node of the tree indicates that IVs were significantly more frequent in /gɾ/ and /dɾ/ clusters (66.5% and 64.6%, respectively) (node 5). The left-branching node shows an additional split in the data, with significantly more IVs occurring in /bɾ/ (52.2%), /kɾ/ (49.2%), and /tɾ/ (50.5%) clusters (node 3) than in /pɾ/ clusters (44.3%) (node 4) (see also Appendix A, Table A2).
Position in word also significantly influenced frequency, in that IVs were more likely to occur in word-initial clusters than in word-medial clusters. That is, of the total number of word-initial clusters, 57.1% contained an IV, whereas only 46.1% of medial clusters contained an IV (see also Appendix A, Table A3).
Finally, tonicity significantly influenced the presence or absence of an IV. At the onset of the stressed syllable (empresas “companies”), IVs were present in 60.0% of the tokens analyzed, compared to 51.8% in pre-tonic syllables (trabajar “to work”) and only 41.7% in post-tonic syllables (hombre “man”) (see also Appendix A, Table A4).
In conjunction, the analyses indicate that linguistic factors significantly influenced IV rate of occurrence, revealing a general trend for greater rates of intrusion in clusters with voiced consonants and in non-labial clusters. It is worth noting, however, that important, albeit nonsignificant, differences were found for language background (p = .092). An examination of the percentages produced by both groups indicated that the monolingual speakers produced IVs in 55.4% of the possible contexts, whereas vowel intrusion occurred at a rate of 44.6% in bilingual speech. This trend will be explored further in the discussion.
6.2. Intrusive Vowel Duration
The second analysis consisted of an examination of the duration of the 2,180 tokens in which an IV did occur. Averaging across all phonetic contexts and speaker groups, IV duration ranged from 7.95 to 71.83 milliseconds, with a mean duration of 23.02 milliseconds (SD = 7.55) and a median of 21.80 milliseconds. While six productions fall outside of the 8-56 ms range established by Quilis (1970), the mean is consistent with multiple works on this topic.
Results of the minimal mixed-effects linear regression revealed significant effects of cluster type, tonicity, age group, and language background. The model’s total explanatory power is moderate (conditional R2 = 0.25) and the part related to the fixed effects alone (marginal R2) is 0.24, indicating that speaker variation was not a major factor in the duration of IVs.
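The contrast between the two models can be made concrete by comparing the conditional and marginal R2 values reported for each. The sketch below treats the difference between the two as a rough indicator of the variance associated with the speaker random intercept, which is an interpretive assumption rather than a statistic reported in the study. The function name is hypothetical.

```python
def random_effect_share(conditional_r2: float, marginal_r2: float) -> float:
    """Explained variance not attributable to the fixed effects (rounded to 2 decimals)."""
    return round(conditional_r2 - marginal_r2, 2)


# Values reported for the rate model and the duration model, respectively.
rate_model = random_effect_share(0.20, 0.07)
duration_model = random_effect_share(0.25, 0.24)

print(f"rate model: {rate_model}, duration model: {duration_model}")
```

On this reading, speaker identity accounts for a noticeable share of the explained variance in whether an IV occurs, but for almost none of the explained variance in how long it lasts.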
Similar to the analysis of IV frequency, cluster type emerged as significant in the analysis of duration, revealing that IVs produced in clusters consisting of /gɾ/ were significantly longer than those produced in all other clusters with the exception of /dɾ/. Table 6 presents the mean duration of IVs by cluster.
Also mirroring the analysis of IV rate of occurrence, IVs produced in clusters at the onset of the tonic syllable were significantly longer (M = 25.44 ms, SD = 7.15) than those situated in pre-tonic (M = 21.20 ms, SD = 6.34) and post-tonic syllables (M = 22.10 ms, SD = 8.25) (see also Appendix B, Table B1).
Unlike the analysis of rate of vowel intrusion, however, both age group and language background emerged as significant. Regarding age, the oldest speakers produced IVs that were significantly longer (M = 24.73 ms, SD = 7.87) than those produced by the other three age groups (2005 younger, M = 21.61 ms, SD = 6.23; 2005 middle, M = 22.38 ms, SD = 6.74; 2016 younger, M = 21.97 ms, SD = 7.75). In terms of language background, monolingual speakers produced IVs that were significantly longer (M = 23.79 ms, SD = 7.02) than those produced by the Maya-Spanish bilinguals (M = 21.64 ms, SD = 7.73) (see also Appendix B, Tables B2-B3).
In order to gain a more complete understanding of the influence of both linguistic and extralinguistic factors on IV duration, a conditional inference tree (Figure 3) was generated including all of the variables analyzed in the regression. Variables higher up in the tree have a stronger impact on IV duration, and variable influence decreases moving down the tree.
The first significant split in the data occurs with regard to cluster type (node 1), indicating that IVs produced in /bɾ/, /dɾ/, /gɾ/, and /kɾ/ clusters (left branch) were significantly longer than those produced in /pɾ/ and /tɾ/ clusters (right branch). Continuing to the left, an additional split at node 2 indicates a difference between IVs produced in /bɾ/ and /kɾ/ (left branch, node 2) and /dɾ/ and /gɾ/ clusters (right branch, node 2), the latter having longer duration. Language background emerged as a significant factor only for /bɾ/ and /kɾ/ clusters (node 3), in which monolingual speakers’ IVs (node 5) were significantly longer than those produced by the bilingual speakers (node 4).
Moving on to the right branch of node 1, which concerns the duration of IVs produced in /pɾ/ and /tɾ/ clusters only, the role of extralinguistic factors becomes apparent. The IVs produced in /pɾ/ and /tɾ/ clusters by the (2005) older speakers were significantly longer than those produced by the other three age groups (node 7), with an additional split at node 15 based on tonicity: IVs produced in tonic syllables were significantly longer (node 17) than those produced in pre- and post-tonic positions (node 16). Moving back to the left branch of node 7, which concerns the duration of IVs produced in /pɾ/ and /tɾ/ clusters among the three younger age groups, language background is significant (node 8). Monolingual speakers of Spanish (right branch) produced significantly longer IVs than bilingual speakers, and also produced longer IVs in /pɾ/ clusters (node 13) in comparison to /tɾ/ clusters (node 14). Among the bilingual speakers (left branch of node 8), an additional significant split based on age is observed, in which the middle-aged speakers (2005 Middle) produced significantly longer IVs (node 10) than the younger speakers from both collection times (node 11).
In combination, the analyses of IV duration revealed that cluster type was the most important factor influencing duration, as it is positioned at the top of the tree. A general trend emerges for lengthened duration in clusters with voiced consonants (/dɾ/ and /gɾ/ relative to the others), and there is also some suggestion of lengthened duration in dorsal and dental clusters relative to labials, but the pattern is not fully uniform. Language background and age group are relatively less important and somewhat inconsistent, varying based on cluster type. While language background does prove significant at lower levels of the tree for all clusters, with monolingual speakers producing longer IVs than bilingual speakers, age group emerged as significant only for /pɾ/ and /tɾ/. Furthermore, older speakers produced the longest IVs in these clusters regardless of their language background, while the middle and younger speakers’ productions differed based on their language background.
7. Discussion
To summarize the main findings, analyses revealed that both the rates of vowel intrusion as well as the duration of IVs themselves in YS and bilingual speech fall within ranges established for other (non-contact) varieties of Spanish. In both analyses, linguistic factors, especially place and voicing (as analyzed via cluster type) and stress, were significant, with more frequent and longer IVs occurring in clusters with voiced consonants and at the onset of stressed syllables. The role of position in word was only significant for rate of occurrence, with intrusive vowels occurring more frequently in word-initial position. In conjunction, our results support previous findings concerning the nature of IVs in Spanish, confirming that rates and duration are consistent with established norms even in naturalistic speech samples. From a broader perspective, the present study sheds light upon the possible phonetic outcomes of Spanish-indigenous language contact.
The relationship between extralinguistic factors and vowel intrusion is considerably more complex. None of the extralinguistic factors were statistically significant in the model analyzing IV rate of occurrence, although language background (monolingual vs. bilingual) did reveal a notable trend: monolingual speakers produced more IVs than bilinguals. The analysis of duration, in contrast, did reveal a significant impact of language background in that monolingual speakers’ IVs were significantly longer than bilinguals’ IVs. The conditional inference tree also indicated that for specific clusters, namely /pɾ/ and /tɾ/, older speakers produced longer IVs overall, but that the three younger groups varied based on language background: among the bilinguals, the middle-aged speakers produced significantly longer IVs in these clusters than both groups of younger speakers. That language background and age emerged as significant in different ways and in different cluster types paints a complicated picture of how patterns of vowel intrusion may manifest in YS and bilingual speech communities within this region.
Given that YS is characterized by a series of unique phonetic traits not observed in most other varieties of Spanish (outlined in section 3), one of the goals of this investigation was to assess the status of vowel intrusion among distinct groups of speakers in both real and apparent time, with the overall aim of understanding its role within the phonetic/phonological system of YS. At the outset of the investigation, based on the findings of previous works examining differences between monolingual and bilingual speakers, differences among age groups, the phonetic/phonological system of Maya, and the status of rhotics, we made several distinct – and potentially contradictory – predictions concerning the influence of extralinguistic factors on IV rate of occurrence and duration.
Regarding the relative importance of age group versus language background, previous studies have indicated that age, rather than level of bilingualism, has a greater impact on the adoption of specific phonetic traits, with younger speakers patterning more towards a pan-Hispanic norm (Michnowicz, 2015). That is, apart from glottal stop insertion (Michnowicz & Kagan, 2016), age group was found to be the principal extralinguistic factor in studies on YS: young speakers of YS differ from older speakers in their production of the voiced and voiceless stop series and in the labialization of absolute final nasals, and they exhibit a more syllable-timed rhythm that closely aligns with patterns established for other varieties of Mexican Spanish (Michnowicz, 2008, 2009, 2021; Michnowicz et al., 2023; Michnowicz & Carpenter, 2013; Michnowicz & Hyler, 2020).
We expected, then, that younger speakers of YS might pattern differently regarding vowel intrusion as well, either in terms of frequency of occurrence, the duration of the IVs themselves, or both. While age group did not emerge as significant in the analysis of rate of intrusion, IV duration in /pɾ/ and /tɾ/ clusters did differ significantly across age groups, with older speakers (regardless of their level of bilingualism) producing the longest IVs, and the middle-aged bilinguals producing longer IVs than both of the younger bilingual groups. That the effects emerged in such a limited context and for only the subgroup of bilinguals, however, does not provide strong evidence that age plays a vital role in the process of vowel intrusion among the speakers analyzed here. Furthermore, although a partial trend for duration emerges in apparent time (Older speakers > Middle-aged bilinguals > Younger bilinguals), the two youngest groups of bilinguals pattern together with respect to duration, suggesting that if a change is underway regarding IVs, that change appears to be stabilizing among the youngest groups.[5]
In terms of level of bilingualism, previous studies on the phonetic outcomes of contact with Maya suggest that glottal stop insertion is one of the few processes that is clearly attributed to direct transfer from Maya (Michnowicz & Kagan, 2016). Coupled with additional traits of the vowel system and schwa epenthesis in Maya (Bricker & Orie, 2014), we hypothesized two distinct, possible outcomes: 1) Bilinguals in particular will produce more intrusive vowels than monolinguals, mimicking the patterns described for glottal stop insertion and further reinforced by vowel epenthesis in Maya, or 2) Bilinguals will produce fewer intrusive vowels than monolinguals as a means to distinguish the epenthetic processes present in Maya from the (primarily) phonetic process of vowel intrusion that occurs in Spanish.
Overall rates of IV frequency revealed a tendency for the monolingual speakers to produce more IVs (55.4%) than bilingual speakers (44.6%), but the difference was not statistically significant. Regarding duration, however, we did observe a significant effect of language background in that monolingual speakers produced longer IVs in some, but not all, consonant clusters. In light of these findings, we might conclude that bilingual speakers in the Yucatan (sub)consciously suppress vowel insertion in Spanish tautosyllabic /Cɾ/ clusters. When vowel intrusion does occur, the vocalic element is relatively short, perhaps again to distinguish it from epenthesis in Maya.[6] As with the impact of age, however, the effects are not fully consistent across the system, indicating that vowel intrusion in both monolingual and bilingual varieties in the Yucatan is highly variable in nature.
The notion of active suppression of vowel intrusion, however, would imply that IVs are salient enough to be perceived by listeners and that their presence in an individual’s speech carries some kind of social value. Research on the perception of intrusive vowels has revealed that native Spanish listeners can perceive vocalic elements in /CC/ clusters that are as short as 17 milliseconds (Widdison, 2004–2005). Furthermore, Ramírez (2006) found that listeners rated /dɾ/, /gɾ/, and /kɾ/ clusters as sounding more natural when they contained an IV, but that /tɾ/ was judged as more natural when an IV was absent. The patterns described in Ramírez’s study are particularly interesting in light of ours: the longest IVs in our study were produced in /dɾ/, /gɾ/, and /kɾ/ clusters, and the shortest IVs in /tɾ/ clusters. Whether the overlap between the two sets of results is coincidental, rooted in purely physiological factors of gestural timing, or a combination of the two cannot be determined here, but the overlap certainly provides motivation to examine the ways in which IVs might be perceived by speakers of YS and whether they contribute to the perception of a Yucatan or bilingual accent.
Perhaps the most crucial – and missing – piece of the puzzle concerns the nature of the rhotics in YS. Impressionistic accounts of the rhotics in Yucatan, combined with the limited number of permissible consonant clusters containing /ɾ/ in Maya, suggest a wide range of variability in their pronunciation, including variants with retroflection and frication (Bennett, 2016; Kidder, 2013; Lipski, 2012). As the most suitable environment for the emergence of an IV is before a tap rhotic, speakers of YS, and especially bilinguals, may not produce the stop+rhotic clusters in a way that permits consistent patterns of vowel intrusion. Future studies of the rhotic element of the /Cɾ/ cluster may offer additional insight into whether and how its realization influences the rate of occurrence and duration of IVs. In conjunction with what is already understood about increased aspiration of /ptk/ and more occlusive realizations of /bdg/, acoustic analyses of Yucatan rhotics are an essential next step in providing a more refined view of speakers’ treatment of consonant clusters, as well as the phonological system as a whole.
8. Conclusions
To conclude, our preliminary analysis of vowel intrusion in YS provides the first acoustic characterization of the phenomenon in this dialect. Combined with works carried out on other varieties of Spanish, the present study broadens our understanding of vowel intrusion in general while also contributing to the extant body of literature concerning the unique nature of YS. Overall, linguistic factors such as cluster type, stress, and position in word were found to have a strong and consistent influence on the prevalence and durational variability of IVs, and our results are aligned with most previous descriptions of other dialects of Spanish. The relationship between vowel intrusion and extralinguistic factors such as language background and age, however, was less clear. Level of bilingualism appears to have some influence on IV duration but no significant influence on frequency, findings which merit additional exploration in the future. Along with acoustic studies of the rhotics in this variety and analyses of how IVs are perceived by speakers of different backgrounds, research on YS will continue to offer insight into phonetic outcomes in situations of language and dialect contact.