© 2013 by Marianna Nadeu Rota. All rights reserved.

THE EFFECTS OF LEXICAL STRESS, INTONATIONAL PITCH ACCENT, AND SPEECH RATE ON VOWEL QUALITY IN CATALAN AND SPANISH

BY

MARIANNA NADEU ROTA

DISSERTATION

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Spanish in the Graduate College of the University of Illinois at Urbana-Champaign, 2013

Urbana, Illinois

Doctoral Committee:

Professor José Ignacio Hualde, Chair
Professor Jennifer Cole
Associate Professor Anna María Escobar
Associate Professor Chilin Shih
Assistant Professor Ryan Shosted

Abstract

Lexical-stress languages tend to display stress-induced vowel quality variation. In some languages the effect is very salient, resulting in a smaller vowel inventory in unstressed syllables and stress-conditioned rule-governed vowel alternations within paradigms (phonological vowel reduction). Other languages exhibit only slight phonetic variation between vowels in stressed and unstressed syllables (phonetic vowel reduction). Regarding the latter, two main hypotheses of how prosodic prominence affects vowel production have been proposed. Because these hypotheses are based on experimental data from Germanic languages, their empirical basis is still rather limited. A deeper understanding of the effects of prosodic prominence on vowels requires careful experimentation on an expanded set of languages.

This dissertation investigates the effects of lexical stress and intonational pitch accent on vowel production in Spanish and Catalan. Although closely related, these languages differ importantly in their phonology. The five Spanish vowels can appear in stressed and unstressed syllables. In Central Catalan, however, seven stressed vowels alternate with only three unstressed vowels. Comparing these languages allows us to observe how the existence of phonological vowel reduction conditions the operation of phonetic reduction.

Stress-induced vowel centralization or reduction has sometimes been attributed to decreased vowel duration in weak prosodic positions. Thus, the role of duration is further investigated by manipulating speech rate.

The results show that in both languages vowels produced at a faster rate are shorter and less peripheral than those produced at a normal rate. Interestingly, the effects of stress differ in these languages. For Catalan, unstressed vowels are shorter and more centralized than stressed vowels. On the other hand, Spanish speakers exhibit individual effects of stress, suggesting that the use of vowel quality to signal stress is not conventionalized. In addition, in Catalan, the presence of a prenuclear accent in a broad focus utterance does not affect vowel quality or duration of lexically stressed vowels. Yet, lexically unstressed vowels are longer and have more extreme vowel formants under emphatic accent.

This dissertation provides a comprehensive description of prosodic effects on vowel production in Catalan and Spanish, hence contributing to a body of cross-linguistic research dealing with the influence of prosody at the segmental level.

Acknowledgments

I would like to extend my heartfelt thanks to everyone who contributed to this dissertation in one way or another. First and foremost, I am especially grateful to José Ignacio Hualde. His guidance, immediate feedback, approachability, trust, vast knowledge, and linguistic jokes have made working on this dissertation very enjoyable. It has been a true honor and pleasure to be José Ignacio’s advisee. I want to express my gratitude to Ryan Shosted for very challenging assignments that, as he predicted, ended up being extremely rewarding and for always finding time for my questions. I have benefited enormously from Jennifer Cole’s constructive criticisms and thought-provoking questions at various stages of my research. I am very thankful to Anna María Escobar for introducing me to very exciting subfields of linguistics and for her feedback and encouragement during my years at UIUC. I would like to thank Chilin Shih for very stimulating courses and for her emphasis on scientific rigor.

In addition to the members of my committee, I am very indebted to all the people who participated in the experiments, for their patience and generosity, and to Catalunya Ràdio and Oriol Camps for providing me with data so quickly and efficiently. During my trips to Barcelona and Madrid for data collection, I was very fortunate to receive help from Marta Bosch, Joan Carles Mora, and Joe Hilferty (Universitat de Barcelona); Pilar Prieto (Universitat Pompeu Fabra); Juana Gil (Centro Superior de Investigaciones Científicas); Eva Estebas (Universidad de Educación a Distancia); and Maria del Mar Vanrell (Universidad Autónoma de Madrid). I truly appreciate their kindness. These trips were made possible by the School of Literatures, Cultures and Linguistics Summer Fellowship, the Anthony M. Pasquariello Award for Graduate Student Research, the Tinker Foundation Field Research Grant for Graduate Student Research in Latin America and Iberia, and the Graduate College Dissertation Travel Grant. The Graduate College Dissertation Completion Fellowship has helped me finish writing this dissertation in a timely manner.

I feel very lucky to have met so many wonderful people at UIUC who have made me feel at home. Thanks to all of them, and especially to Doug Eddy, Enric Xargay, Israel de la Fuente, Itxaso Rodríguez, Lee Ragsdale, Luján Stasevicius, Mario López, Miriam Pérez, and Olatz Mendiola, for making this not only an academically but also personally enriching experience.

Finally, I want to extend an enormous hug to my friends and family for staying in touch and keeping my homesickness at bay. I am very grateful to my parents, Ignasi and Montserrat, for giving me the freedom to decide, for their encouragement, support, and (although we do not say this because we are Catalan) love. Special thanks go to my sister, Laura, and Àlex and Arç Asensio for keeping countdowns and for letting me invade their home repeatedly. They have been the true sponsors of my summer research. My deepest gratitude goes to Romà Rofes for too much to mention.

Table of Contents

List of Abbreviations

1 Introduction
1.1 Sources of Variation
1.2 Phonetic and Phonological Vowel Reduction
1.3 Stress, Accent, Speech Rate, and Phonetic Vowel Reduction
1.4 Definitions: Stress, Accent, and Speech Rate
1.5 Stress and Accent in Spanish and Catalan
1.6 The Languages under Study
1.7 Goals and Outline of the Dissertation

2 Effects of Stress and Speech Rate on Vowel Quality
2.1 Introduction
2.2 Research Questions and Hypotheses
2.3 Methods
2.4 Results
2.5 Discussion

3 Is Phonetic Vowel Reduction Caused by Absence of Stress or Accent (or Both)?
3.1 Introduction
3.2 Research Questions and Hypotheses
3.3 Methods
3.4 Results I: Lexical Stress and Accent
3.5 Results II: Lexical Stress Only
3.6 Results III: Accent Only
3.7 Results IV: Variable Application of Phonological Vowel Reduction in Catalan Compounds
3.8 Discussion

4 Emphatic Stress in Central Catalan
4.1 Introduction
4.2 Research Questions and Hypotheses
4.3 Methods
4.4 Results
4.5 Discussion

5 Conclusions
5.1 Review of the Findings
5.2 Discussion
5.3 Conclusion

A Appendix: Questionnaires

B Appendix: Linguistic Profile of Catalan Participants

C Appendix: Speech Rate Manipulation

D Appendix: Target Sentences
D.1 Catalan. Accented Condition
D.2 Catalan. Deaccented Condition
D.3 Spanish. Accented Condition
D.4 Spanish. Deaccented Condition

E Appendix: Supplementary Analyses
E.1 Supplement to Section 3.4
E.2 Supplement to Section 3.5

References

List of Abbreviations

Adj. Adjective

Cat. Catalan

Comp. Compound

Cond. Conditional

Dim. Diminutive

Fem. Feminine

Fut. Future

N. Noun

Pl. Plural

Sg. Singular

Str Stressed

Sp. Spanish

Uns Unstressed

Uns Full Unstressed Full

V. Verb

1 Introduction

1.1 Sources of Variation

One of the difficulties encountered in the study of speech production and perception is the lack of invariance that characterizes speech, i.e., the fact that a given phoneme does not present a set of invariant acoustic cues in all of its realizations. Both linguistic and non-linguistic factors contribute to this variability. Among the non-linguistic factors we find speaker characteristics (e.g., gender and age, as well as individual differences in physiology and behavior; Peterson & Barney, 1952; Chládková, Boersma, & Podlipský, 2009; Fox & Jacewicz, 2012), socio-indexical factors (Labov, 2001), regional origin (e.g., Escudero, Boersma, Schurt Rauber, & Bion, 2009; Chládková, Escudero, & Boersma, 2011; Kim, 2011), emotional state (Williams & Stevens, 1972; Waaramaa, Laukkanen, Airas, & Alku, 2010), or emotional expression (Tartter, 1980; Ohala, 1984). The acoustic and articulatory characteristics of speech sounds are also affected by changes in speaking conditions (e.g., clear speech or citation form vs. normal speech [Moon & Lindblom, 1994; Ferguson & Kewley-Port, 2002]; faster vs. slower speech rate [Pickett, Blumstein, & Burton, 1999; Smith, 2002; Sommers & Barcroft, 2006]; laboratory/preplanned vs. spontaneous speech [Harmegnies & Poch-Olivé, 1992; Calamai, 2002; Toledano, Moreno Sandoval, Colás Pasamontes, & Garrido Salas, 2005]).

Another important source of variation is the context in which a segment occurs. The identity of adjacent segments may exert a strong influence on a given sound (Recasens, 1991a; Shaiman, 2002; Chládková et al., 2011). Similarly, the position that a segment occupies within a syllable, word, or phrase may also condition its realization (Byrd, 1996; Fougeron, 2001; Keating, Cho, Fougeron, & Hsu, 2003; Cho & Keating, 2009; Georgeton, Audibert, & Fougeron, 2011). Some authors have also observed variation in the production of segments that could be attributed to the word class (content vs. function word; van Bergem, 1993; Meunier & Espesser, 2011), semantic predictability (Clopper & Pierrehumbert, 2008; McAuliffe & Babel, 2012), or frequency (Bybee, 2001, 2006; Pierrehumbert, 2001; Pluymaekers, Ernestus, & Baayen, 2005) of the word in which they were embedded. The absence vs. presence of prosodic prominence has also been identified as a source of variation.

1.2 Phonetic and Phonological Vowel Reduction

In some stress-accent languages, stress has a very salient effect on vowel quality, with certain vowel contrasts neutralizing in unstressed position (phonological vowel reduction). In other languages, stress causes only slight subphonemic changes in vowel quality (phonetic vowel reduction).

In languages with phonological vowel reduction, fewer vowels can appear in unstressed syllables compared to stressed syllables. This is due to the fact that two or more vowels that are contrastive in the latter context neutralize in unstressed position (Crosswhite, 2001). Hence, the term “reduction” refers here to a decrease in the number of phonological contrasts available in unstressed position, but not necessarily to a reduction in the size of the space that the vowels occupy. Phonological vowel reduction, also known as lexical vowel reduction (van Bergem, 1993), is a categorical phenomenon (i.e., it occurs regardless of the speech conditions) that is restricted to certain vowels and languages.

Harris (2005) described two phonological vowel reduction tendencies or routes. Vowels affected by centripetal reduction move to more central positions in the vowel space, whereas centrifugal reduction consists in vowels moving away from the center. Crosswhite (2000, 2001, 2004) drew a similar distinction between prominence-reducing and contrast-enhancing reduction. According to the author, the former type is articulatory-based and responds to a desire to avoid long and sonorous vowels in unstressed position.¹ Since vowels tend to be shorter in unstressed syllables and sonorous vowels are incompatible with extreme shortness, in languages with this type of phonological vowel reduction, stressed low vowels alternate with unstressed higher vowels (e.g., stressed /e, a, o/ becoming unstressed [i, ə, u] respectively). On the other hand, contrast-enhancing reduction targets non-corner vowels, which are eliminated in unstressed position, yielding unstressed vowel systems with maximal dispersion (e.g., [i, a, u]). Unlike prominence-reducing reduction, this phenomenon is perceptually-based and results from the speaker’s attempts to avoid producing a speech sound that might be easily misperceived by the listener. Speakers select a sound with more robust defining cues instead, and retain the easily misinterpretable sounds only in contexts which will enhance their accurate perception (i.e., stressed position).

¹ In Crosswhite (2004, p. 207), sonority is understood as low-frequency amplitude (below 3 kHz). Low vowels are more sonorous than high vowels.

As follows from these mechanisms, the unstressed vowel systems of languages with phonological vowel reduction incorporate certain vowels (/i, ə, a, u/) more frequently than others (e.g., mid vowels). This may be due to a cross-linguistic preference (Johnson, 2003, pp. 111–112) for these vowels due to their relative acoustic stability over varying places of articulation (Stevens, 1972, 1989). That is, some vowels admit more articulatory variability than others, whereas, for other vowels, smaller changes in articulation can have a significant impact on the acoustic output, requiring more articulatory precision. In addition, by eliminating non-corner vowels, vowel systems can maintain maximal and sufficient contrast between vowels (Liljencrants & Lindblom, 1972; Lindblom, 1986), even in the case of vowel space compression (a common manifestation of phonetic vowel reduction).

A number of recent approaches have established a link between phonological and phonetic vowel reduction, conceiving of the former as a result of the phonologization of the latter (Flemming, 1995, 2004; Crosswhite, 2001, 2004; Barnes, 2006). Unlike phonological vowel reduction, phonetic (or acoustic) vowel reduction is described as a gradient phenomenon (Fourakis, 1991; Padgett & Tabain, 2005) motivated by factors such as consonantal context, speech rate, or stress. Phonetic vowel reduction often results from articulatory constraints: A target may not be fully reached under adverse conditions, such as extreme duration shortening or conflicting requirements for the target segment and the surrounding ones. In this sense, the conditions that may give rise to target undershoot can be described as universal.
All vowels in all languages may potentially be subject to this phenomenon, although languages vary in whether they exhibit systematic target undershoot or not.

A crucial difference, then, between phonetic and phonological vowel reduction is that, in cases of phonological vowel reduction, stressed and unstressed vowels have different targets, whereas, in phonetic vowel reduction, targets are assumed to remain invariant. Phonetic vowel reduction is also known as vowel centralization, because it involves displacement of unstressed or unaccented vowels toward the center of the F1 × F2 vowel space. It has also been understood as increased assimilation or coarticulation with the surrounding consonants in unstressed position (see Lindblom, 1963; Padgett & Tabain, 2005). In other words, an idealized vowel target may not be realized due to reduced magnitude, overlap, or truncation of the articulatory gestures, resulting in formant undershoot (de Jong, 1995; Mooshammer & Geng, 2008). In any case, the outcome of phonetic vowel reduction tends to be a compressed acoustic vowel space.
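As a purely illustrative aside, the notion of a compressed acoustic vowel space can be made concrete with two simple measures: the mean Euclidean distance of each vowel from the centroid of the F1 × F2 space, and the area of the polygon formed by the vowels. The sketch below is not the analysis used in this dissertation. The formant values are hypothetical, the function names are invented for the example, raw Hz is used where studies often prefer an auditory or normalized scale, and the "reduced" system is generated artificially by pulling every vowel 20% of the way toward the centroid.

```python
from math import hypot

# Hypothetical (F1, F2) values in Hz for a five-vowel system.
# Illustrative only; these are NOT measurements from this dissertation.
STRESSED = {
    "i": (300.0, 2300.0),
    "e": (450.0, 2000.0),
    "a": (750.0, 1400.0),
    "o": (480.0, 950.0),
    "u": (330.0, 800.0),
}

# Order in which the vowels are visited around the edge of the vowel polygon.
ORDER = ["i", "e", "a", "o", "u"]


def centroid(vowels):
    """Return the mean (F1, F2) point of a vowel system."""
    n = len(vowels)
    f1 = sum(p[0] for p in vowels.values()) / n
    f2 = sum(p[1] for p in vowels.values()) / n
    return f1, f2


def mean_dispersion(vowels):
    """Mean Euclidean distance of the vowels from the centroid (in Hz)."""
    c1, c2 = centroid(vowels)
    dists = [hypot(f1 - c1, f2 - c2) for f1, f2 in vowels.values()]
    return sum(dists) / len(dists)


def polygon_area(vowels, order):
    """Area of the vowel polygon (shoelace formula), vertices taken in `order`."""
    pts = [vowels[v] for v in order]
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def centralize(vowels, factor):
    """Artificially move every vowel toward the centroid.

    factor=1.0 leaves the system unchanged; factor=0.8 keeps 80% of each
    vowel's distance from the centroid (a toy stand-in for undershoot).
    """
    c1, c2 = centroid(vowels)
    return {
        v: (c1 + factor * (f1 - c1), c2 + factor * (f2 - c2))
        for v, (f1, f2) in vowels.items()
    }


# Toy "unstressed" system: the stressed system shrunk toward its centroid.
REDUCED = centralize(STRESSED, 0.8)

if __name__ == "__main__":
    for label, system in (("stressed", STRESSED), ("reduced", REDUCED)):
        print(label,
              "dispersion:", round(mean_dispersion(system), 1),
              "area:", round(polygon_area(system, ORDER), 1))
```

Because the toy reduced system is a uniform shrinkage toward the centroid, its mean dispersion is 0.8 times that of the stressed system and its polygon area is 0.64 times as large. Real data need not centralize uniformly; as discussed in Section 1.3, shrinkage may be confined to the vertical (F1) dimension.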

1.3 Stress, Accent, Speech Rate, and Phonetic Vowel Reduction

Unstressed syllables have a less pivotal role in word recognition processes and carry less information than stressed ones (Altmann & Carter, 1989). In addition, research has shown that unstressed and unaccented vowels have shorter duration, decreased coarticulatory resistance, lower muscle activity, and higher stiffness when compared to stressed or accented vowels (Fowler, 1995). The less critical role of unstressed and unaccented syllables and their articulatory characteristics explain why they may undergo changes in vowel place features. That is, deviations from the idealized targets in unstressed syllables may not have a strong impact on speech recognition. On the other hand, lexically stressed syllables are critical for word recognition, and it is crucial that the listener does not misperceive them. de Jong (2000, p. 72) described stress “as a convention in which both speakers and listeners pay more attention to certain syllables than to others.” Similarly, through the use of syntactic and intonational focus, the speaker may choose to highlight some parts of the utterance for communicative purposes (de Jong, 2004). Whether directing the attention to a particular sequence is conventionalized (as in lexical stress) or depends on the communicative context (as in accent or focus²), the result is that segments under prominence will be more carefully articulated.

² Vowels in accented contexts serve as “points of information focus” (Harrington, 2010, p. 191).

The view that prosodic prominence will affect segments by enhancing their features is encapsulated in de Jong’s (1995) Localized Hyperarticulation Hypothesis, based on the Hyperarticulation Hypothesis (Lindblom, 1963, 1990). Lindblom (1990) introduced the notions hypospeech and hyperspeech to represent the endpoints of a continuum ranging from economization to maximization of the articulatory gestures, respectively. At one extreme, hypospeech (or hypoarticulation) may be explained by the fact that “unconstrained, a motor system tends to default to a low-cost form of behavior” (Lindblom, 1990, p. 413). In other words, a principle of physical “economy” would constrain articulatory movements (as well as non-speech movements) in certain speech conditions or contexts. At the other extreme, hyperspeech (or hyperarticulation) was claimed to be driven by the need or desire to maximize the distinctiveness of the acoustic signal (in order to aid speech perception and lexical access). In this sense, hypospeech would constitute speaker-oriented behavior (given that it minimizes articulatory activity), whereas hyperspeech would be listener-oriented (because it emphasizes contrast, which facilitates perception and correct identification).

Lindblom’s proposal was not devised specifically to account for the effects of prosodic prominence at the segmental level, but rather referred to varying (more global) behaviors in different contexts or speaking styles. de Jong (1995) adapted this explanation to the specific case of stress by speaking of localized hyperarticulation in prosodically prominent syllables. The distinction between hypo- and hyperarticulation can be seen as constituting planned speech behavior, resulting from selectively modulating attention to different points of the speech signal (see also Harris, 2005). That is, the system’s tendency to economization is selectively counteracted because the speaker modulates his or her attention to the articulation of a particular linguistic unit (i.e., a stressed syllable over unstressed syllables). Thus, speakers enhance more prominent (and more informative) syllables over less prominent and less informationally-loaded ones, and listeners pay more attention to those as well (de Jong, 2000, 2004). According to the Localized Hyperarticulation Hypothesis,³ stressed or accented vowels will be hyperarticulated, meaning that their place features will be enhanced. Hence, distinctions between the vowels in the system will be magnified (paradigmatic enhancement; Cho, 2005), and the vowel space will expand with respect to the unstressed vowel system.

On the other hand, the Sonority Expansion Hypothesis (Beckman, Edwards, & Fletcher, 1992; de Jong, Beckman, & Edwards, 1993) was formulated to account for the finding, in English, that stressed or accented vowels were produced with a more open vocal tract than less prominent vowels (Beckman et al., 1992). A lower jaw and tongue position results in decreased impedance, and thus greater coupling of the oral cavity to the outside atmosphere, causing increased loudness. By increasing loudness and sonority, prominent vowels become more distinct from surrounding consonants (syntagmatic enhancement; Cho, 2005), rather than from other vowels in the system. The more open vocal tract has another effect regarding the spectral properties of vowels, as it tends to result in higher F1.⁴

³ In the remainder of the dissertation, when the Hyperarticulation Hypothesis is mentioned, it is meant to refer to de Jong’s adaptation of this hypothesis to the particular case of prosodic prominence.

⁴ Following Stevens (2000, p. 261), I assume that F1 variations are dependent on tongue height position (a higher tongue body position causing lower F1). Nevertheless, F1 can also be modulated by other mechanisms, such as pharyngealization, labialization, lip-rounding, or nasalization.

In sum, both the Hyperarticulation Hypothesis and the Sonority Expansion Hypothesis were put forward to account for prosodically-induced variation in vowel articulation. Both hypotheses assume that prosodic prominence (the presence of stress or accent) enhances the distinctiveness of the speech signal. Experimental research dealing with the effects of stress, accent, and focus on vowel production has provided evidence for both hypotheses.

Lindblom (1963) proposed a model in which vowel quality modifications (reflected in F1 and, most clearly, in F2) in Swedish were modulated by duration and consonantal context, thereby establishing a causal relationship between temporal and spectral reduction (see also Barnes, 2006).⁵ Dutch unstressed vowels displayed a higher degree of coarticulation than stressed vowels (van Bergem, 1993), as in Swedish (Lindblom, 1963), although Van Son & Pols (1992) did not find a clear effect of stress on Dutch vowel formants. For English, de Jong et al. (1993) found larger jaw displacements for /ɑ/ when accented than when unaccented, as predicted by both hypotheses. However, a more retracted tongue position and more protruded lips for accented /ʊ/ were only compatible with the Hyperarticulation Hypothesis. Similarly, de Jong (1995), who specifically set out to evaluate these two hypotheses, observed lower jaw position for vowels and higher position for consonants. Yet, other articulatory patterns (tongue retraction and lip protrusion for /ʊ/) could only be accounted for by the Hyperarticulation Hypothesis. Evidence for vowel space expansion in English as predicted by this hypothesis (i.e., prosodically prominent vowels being more peripheral) also comes from more recent studies (Erickson, 2002; de Jong, 2004; Lindblom, Agwuele, Sussman, & Cortes, 2007; Cole, Hualde, Blasingame, & Mo, 2010; Jacewicz, Fox, & Salmons, 2011).⁶

⁵ Later work (Fourakis, 1991; Moon & Lindblom, 1994) determined that other factors also contribute to phonetic vowel reduction.

⁶ In Erickson (2002), tongue dorsum position as well as F1 and F2 revealed more peripheral vowels under emphatic stress with respect to vowels without emphatic stress. However, regardless of vowel identity and tongue dorsum height, vowels with emphatic stress were also realized with a lower jaw position. This was also found for one of the four speakers in Erickson, Suemitsu, Shibuya, & Tiede (2012).

Mooshammer, Fuchs, & Fischer (1999) examined the issue of prosodically-motivated vowel quality variation in German using electromagnetic midsagittal articulography (EMMA) and observed that unstressed vowels showed a greater degree of truncation of the opening gesture, resulting in shorter durations and reduced movement amplitudes (i.e., target undershoot). In another study on German vowels (Mooshammer & Geng, 2008), acoustic data revealed higher low vowels in unstressed unaccented contexts, but centralization along the F2 dimension was minimal (vertical shrinkage of the vowel space only). The articulatory (EMMA) data, however, displayed reduction of tense vowels both in the horizontal and vertical dimension (back and low vowels centralized). Furthermore, in a study on the articulatory correlates of focus, Baumann, Becker, Grice, & Mücke (2007) provided evidence that vowels occupied more peripheral positions in the vowel space when they were produced with narrow and contrastive focus than when they received broad focus.

This type of variation has also been observed in a few Romance languages. The comparison between stressed and unstressed Pisa Italian vowels indicated more variability, shorter durations, and more centralization in unstressed syllables (Calamai, 2001). A more compressed unstressed vowel space was also found in other Italian varieties (Savy & Cutugno, 1998). Tendencies in the same direction have been described for Brazilian Portuguese (Fails & Clegg, 1992; Ferreira, 2008) and a significantly higher F1 for stressed /a/ than for its unstressed counterpart was reported in Arantes (2010). Ronquest (in press) also found unstressed vowel centralization and shortening in the speech of heritage Spanish speakers (bilingual in English).

Cho, Lee, & Kim (2011) investigated the effects of clear speech, prosodic phrasing, and lexical focus on the realization of vowels /i, a, u/ in Korean, a language that does not have lexical stress. /a/ was significantly lower and more anterior in the focused condition than in the unfocused condition. In addition, the high vowels had more peripheral F2 values (/i/ was more anterior and /u/ was more posterior) in the prominent condition. These results also indicate an expansion of the vowel space under prosodic prominence in a language without stress.
In all the studies reviewed above, prominent vowels were found to have “more distinctive articulations” (de Jong et al., 1993, p. 198). Conversely, vowels that were non- or less prominent departed from the “ideal” or “canonical” realizations.⁷ Yet, others have shown that prominent vowels are more open than their non-prominent counterparts. Prominence may thus be associated with a more open vocal tract, which offers less impedance to the airflow and results in vowels having more sonority (Sonority Expansion Hypothesis).

⁷ Van Son (1993, p. 2) noted that “[v]owels spoken in isolation or in a neutral context, such as /hVd/ in English, are considered to approach the ideal with regard to vowel quality. Such ideal vowel realizations are called canonical realizations. Numerous factors change these canonical realizations to the realizations actually found in natural speech, e.g. speaking style, prosody, context.” Similarly, Moon & Lindblom (1994, p. 40) described undershoot as “systematic shifts away from hypothetical target values.”

The Hyperarticulation Hypothesis and the Sonority Expansion Hypothesis make the same predictions for low vowels (i.e., low vowels will be lower under prominence). For example, the findings in de Jong & Zawaydeh (1999, 2002) for Arabic, in Cho & Keating (2009) for English, and in Meunier & Espesser (2011) for French (as well as those in Arantes, 2010, for Brazilian Portuguese) are compatible with both hypotheses. de Jong & Zawaydeh (2002) found higher F1 for vowel /a/ when it was stressed (in line with the findings in de Jong & Zawaydeh, 1999), as well as when it received lexical or segmental focus. Similarly, in Meunier & Espesser (2011), French /a/ exhibited higher F1 in final position (locus of the accent phrase) than in other positions. In an electropalatographic and acoustic study (Cho & Keating, 2009), English stressed /ɛ/ displayed longer duration, smaller linguopalatal contact, and higher F1 (indicating more opening) than secondarily-stressed /ɛ/. In addition, when primarily-stressed /ɛ/ was accented, it was also more open than when it was unaccented.

The two hypotheses, however, diverge in their predictions for high vowels (as discussed in de Jong, 1995 and Harrington, Fletcher, & Beckman, 2000). If high vowels were hyperarticulated under stress, we would expect them to be produced with a higher tongue position, resulting in a lower F1. However, if the goal was to make stressed high vowels louder, then a lower tongue/jaw position might be required. Clearly, these two strategies are incompatible. In addition, the Sonority Expansion Hypothesis does not make any prediction regarding vowel anteriority/posteriority, whereas the Hyperarticulation Hypothesis does.

The Sonority Expansion Hypothesis was first proposed by Beckman et al. (1992) after observing that jaw opening and closing gestures were longer, larger (involving more displacement), and faster for accented than unaccented realizations of the English vowel /ɑ/ in the syllable /pɑp/. Similar results were obtained by measuring lip gestures during the realization of the same vowel in the same syllable: Accented /ɑ/ presented larger and faster opening movements with respect to its unaccented counterpart, which in turn showed larger and faster opening movements than the same vowel in unstressed position (Beckman & Edwards, 1994). In Erickson et al. (2012), jaw displacement for the English diphthong /aɪ/ reflected the metrical structure of the utterance (with larger displacements for more metrically prominent syllables). Jaw displacement correlated positively with F1. Hermes, Becker, Mücke, & Grice (2008) explored the articulation of the German long vowels /iː, aː, oː, uː/ in four focus conditions: background (postfocus), narrow, broad, and contrastive focus. Larger lip displacements in opening and closing gestures occurred in the contrastive focus condition when compared to the background and broad focus conditions.

Harrington et al. (2000) found that Australian English vowels /i, æ, a/ were produced with a significantly more lowered jaw and greater root mean square (RMS) amplitude when they were accented, revealing a “heightened contrast in loudness between the vowel and the preceding stop closure” (p. 43). In addition, accented tokens of /i/ had significantly higher F2. Whereas the jaw height and RMS amplitude findings point to the Sonority Expansion Hypothesis, the fact that accented /i/ was more anterior was consistent with the Hyperarticulation Hypothesis. “Mixed” results like these (with findings in the vertical dimension being accountable by the Sonority Expansion Hypothesis and those in the horizontal dimension being predicted by the Hyperarticulation Hypothesis) are described in other articles examining the effects of prosodic prominence on vowel quality and articulation in American English. In Cho (2005), vowels /i, ɑ/ also involved larger lip and jaw openings when accented (see also Cho, 2006), and /i/ was also more anterior in the accented condition. Similarly, in Mo, Cole, & Hasegawa-Johnson (2009), F1 correlated positively with perceived prominence, regardless of vowel identity. In addition, some vowels’ F2 showed hyperarticulation under prominence. These patterns exemplify both enhancing strategies occurring simultaneously. A smaller opening of the vocal tract and higher non-prominent vowels point to the Sonority Expansion Hypothesis, whereas the less extreme F2 values indicate target undershoot in the anterior-posterior dimension, as suggested by the Hyperarticulation Hypothesis.
Some of the studies just mentioned reported vowel centralization in the absence of prosodic prominence, which results from the articulators failing to reach an articulatory target under certain conditions. Target undershoot may also be caused by faster speaking rates (Miller, 1981). Increasing speech rate results in shorter segment durations. Thus, at faster speaking rates, articulatory gestures may be truncated or more extensively overlapped due to a decreased temporal window during which they can be executed. If this happens, formant undershoot is likely to occur as well. Agwuele, Sussman, & Lindblom (2008), who found a compressed vowel space at faster than normal speech rate in English (as did Turner, Tjaden, & Weismer, 1995), placed fast speech rate toward the hypoarticulation end of the continuum. Because of the time constraints in producing speech at a fast tempo, it seems logical to assume that the output may be less carefully produced than in normal speaking conditions. Indeed, a compression of the vowel space at fast speech rates has been attested in languages other than English. Jaworski (2009) analyzed vowel quality at three speech rates (fast, normal, and slow) in Russian, Polish, and Spanish. He described a decrease in the size of the acoustic space as rate increased (the largest space was for vowels produced at slow rate, followed by those produced at natural rate, and finally those produced at fast rate). Gendrot & Adda-Decker (2007) did not directly manipulate rate, but they classified vowels according to their duration into three categories (short, mid, and long vowels). Vowel space expanded with longer durations for all eight languages analyzed (Arabic, European Portuguese, French, German, Italian, Mandarin Chinese, Spanish, and US English), although for Arabic no difference was observed between long and mid duration vowels. In Japanese, a language with contrastive vowel length, increasing speaking rate yielded more peripheral short /e, o/ (Hirata & Tsukada, 2004, 2009). Long vowels were not affected by speech rate in the same manner, and the authors confirmed a ceiling/floor effect for F2 in vowels longer than 200 ms. Pitermann (2000) showed that, at faster speech rates, vowels may be more assimilated to their neighboring segments. Examination of the steady state in vowels /ɛ, a/ in the sequences [iɛi] and [iai], extracted from French sentences produced at nine (participant 1) and ten (participant 2) speaking rates, revealed a progressive F2 increase and F1 decrease as speech rate increased. Both vowels became more fronted and higher (more /i/-like) as they reduced their duration, thus showing contextual assimilation. Finally, other authors (Gay, 1977; Weismer & Berry, 2003; Stack, Strange, Jenkins, Clarke III, & Trent, 2006) did not find a robust or systematic effect of speech rate on vowel quality in English, but this might be due to an unsuccessful speech rate manipulation, not triggering enough temporal reduction.
It is important to note that this effect would be restricted to languages in which increasing tempo affects vowel length substantially. Because of different metrical and rhythmic properties of languages, it may be the case that vowel length is not affected similarly in all languages when speech rate is manipulated.
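
One common way to quantify the compression of the vowel space discussed in this section is to compute the area of the polygon whose vertices are the mean (F2, F1) values of the vowels. The following is a minimal sketch of that computation using the shoelace formula. It is offered only as an illustration and is not the procedure of any of the studies cited; the function name `polygon_area` and all formant values are hypothetical.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    `points` is a list of (F2, F1) pairs in Hz, ordered along the
    perimeter of the vowel space (here: /i, e, a, o, u/).
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical mean (F2, F1) values in Hz for a five-vowel system.
# These numbers are invented for illustration only.
normal_rate = [(2300, 300), (2000, 450), (1400, 750), (1000, 480), (800, 320)]
fast_rate = [(2150, 340), (1900, 460), (1400, 680), (1050, 480), (900, 350)]

# A ratio below 1 indicates a smaller (compressed) space at the fast rate.
ratio = polygon_area(fast_rate) / polygon_area(normal_rate)
print(f"fast/normal vowel space area ratio: {ratio:.2f}")
```

With these invented values the fast-rate polygon is smaller than the normal-rate one, which is the pattern described as vowel space compression. Area in Hz² is only one possible metric; studies may instead use perceptually scaled units or distances from a centroid.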

1.4 Definitions: Stress, Accent, and Speech Rate

Before continuing further, some definitions are in order. “Stress” and “accent” are two terms that have received several definitions in the literature. In fact, in the studies reviewed in the previous section, “stress” was more commonly used to refer to phrasal stress (especially in those studies focusing on English) and, less often, to refer to lexical stress. From now on, unless otherwise noted, “stress” will be used to refer to lexical primary stress, a word-level abstract property (see the discussion in Ladd, 2009, pp. 48–55). The characteristics of the stress system in the two languages under study are sketched below. “Accent” is to be understood as intonational pitch accent; i.e., actual sentence- or phrase-level prominence, typically an F0 event (Bolinger, 1958). A distinction has been drawn between speech rate and articulation rate (see Barik, 1977). The former refers to the number of syllables per minute considering total utterance duration (pauses included), whereas the latter excludes pauses. Here the term “speech (or speaking) rate” is used more loosely to refer to relative variations in speed.
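
The distinction between speech rate and articulation rate can be made concrete with a short sketch. The code below assumes that the syllable count, the total utterance duration, and the durations of the individual pauses are already known; the function names and the example figures are hypothetical and serve only to illustrate the two definitions.

```python
def speech_rate(n_syllables, total_duration_s):
    """Syllables per minute over the whole utterance, pauses included."""
    if total_duration_s <= 0:
        raise ValueError("total duration must be positive")
    return n_syllables * 60.0 / total_duration_s


def articulation_rate(n_syllables, total_duration_s, pause_durations_s):
    """Syllables per minute over speaking time only, pauses excluded."""
    speaking_time = total_duration_s - sum(pause_durations_s)
    if speaking_time <= 0:
        raise ValueError("pauses cannot cover the whole utterance")
    return n_syllables * 60.0 / speaking_time


# Hypothetical utterance: 30 syllables in 10 s, including two 1 s pauses.
print(speech_rate(30, 10.0))                     # 180.0
print(articulation_rate(30, 10.0, [1.0, 1.0]))   # 225.0
```

Because pauses are removed from the denominator, the articulation rate is never lower than the speech rate for the same utterance.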

1.5 Stress and Accent in Spanish and Catalan

The stress systems of Catalan and Spanish present many similarities (for an extensive description of Catalan stress, see Hualde, 1992; Bonet & Lloret, 1998; Wheeler, 2005; for Spanish, see Hualde, 2005, 2012). In both languages, stress is contrastive, as illustrated in (1).⁸ Minimal pairs based on stress are rather scarce in Catalan due to the existence of phonological vowel reduction (Badia Margarit, 1972), which makes it difficult for two words to differ exclusively in the position of stress.⁹

(1) Cat. culli [ˈkuʎi] ‘to pick up, 1st person sg., subj.’ vs. collir [kuˈʎi] ‘to pick up’
    Sp. íntegro ‘entire’ vs. integro ‘(I) integrate’ vs. integró ‘(s/he) integrated’

Although the existence of lexical contrasts based only on the position of stress shows that stress is free and not predictable (even if some patterns are more common than others), there are some restrictions on stress placement. The domain of stress is the prosodic word (a morphological word plus its enclitics and proclitics; Oliva, 1992; Hualde, 2012), and, in both languages, stress is restricted to the last three syllables of a word (except in verb forms with enclitics, as shown in (2)). In these examples, stress falls on the fourth syllable from the end of the word, due to the addition of two unstressed clitics. Catalan and Spanish words can be classified into three groups, depending on the position of stress with respect to the end of the word: oxytones (stress on the last syllable), paroxytones (stress on the penultimate syllable), or proparoxytones (stress on the antepenult).

⁸ In this and the following examples, stressed syllables are underlined.
⁹ The situation described is true for Central Eastern Catalan, but not for other varieties of Catalan (e.g., Valencian Catalan) that exhibit less radical phonological vowel reduction (see Section 1.6.2).

(2) Cat. compra-me-la ‘buy it (fem.) for me (imperative)’
    Sp. cómpramelo ‘buy it (n.) for me (imperative)’

Stress position is not predictable based on syllable weight or morphological structure (except in the case of verb forms; Hualde, 2005, 2012; Wheeler, 2005), but certain regularities exist. The most frequent patterns in both languages are oxytonic words ending in a consonant (e.g., Sp. caracol, Cat. cargol ‘snail’) and vowel-final paroxytones (Sp. madre, Cat. mare ‘mother’), although all possibilities are attested, as exemplified in Table 1.1.¹⁰ The spelling conventions of both languages recognize these tendencies, as stress marks are placed on words that deviate from the most common patterns.

Table 1.1: Stress patterns in Catalan and Spanish.

                        Final Vowel                 Final Consonant
Oxytones         Cat.   cafè ‘coffee’               vestit ‘dress’
                        sofà ‘sofa’                 cavall ‘horse’
                 Sp.    rubí ‘ruby’                 color ‘color’
                        papá ‘dad’                  pastel ‘cake’
Paroxytones      Cat.   cama ‘leg’                  àpat ‘meal’
                        festa ‘party’               càrrec ‘charge’
                 Sp.    codo ‘elbow’                árbol ‘tree’
                        coche ‘car’                 trébol ‘clover’
Proparoxytones   Cat.   pàgina ‘page’               currículum ‘curriculum’
                        fonètica ‘phonetics’        espècimen ‘specimen’
                 Sp.    fábrica ‘factory’           régimen ‘regime’
                        fábula ‘fable’              hipótesis ‘hypothesis’
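
The three-way classification into oxytones, paroxytones, and proparoxytones depends only on how far the stressed syllable lies from the end of the word. The sketch below illustrates this with a few of the Spanish words cited in the text. It is an illustration only: the function name `classify_stress` is hypothetical, and the syllable divisions and stressed-syllable positions are supplied by hand as assumptions rather than derived automatically.

```python
# Distance of the stressed syllable from the end of the word -> label.
LABELS = {1: "oxytone", 2: "paroxytone", 3: "proparoxytone"}


def classify_stress(syllables, stressed_index):
    """Classify a word by the position of its stressed syllable.

    `syllables` is a list of syllables; `stressed_index` is the 0-based
    index of the stressed one. Positions further back than the antepenult
    arise only in verb forms with enclitics.
    """
    if not 0 <= stressed_index < len(syllables):
        raise ValueError("stressed_index out of range")
    from_end = len(syllables) - stressed_index
    return LABELS.get(from_end, "pre-antepenultimate (verb + enclitics)")


# Hand-syllabified examples (assumed divisions, for illustration only).
print(classify_stress(["ca", "ra", "col"], 2))         # oxytone
print(classify_stress(["ma", "dre"], 0))               # paroxytone
print(classify_stress(["fá", "bri", "ca"], 0))         # proparoxytone
print(classify_stress(["cóm", "pra", "me", "lo"], 0))  # pre-antepenultimate (verb + enclitics)
```

Counting from the end of the word rather than from the beginning mirrors the way the restriction on stress placement is stated: the last three syllables form the window within which lexical stress normally falls.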

As noted, clitics do not have lexical stress and attach to a content word to form a prosodic word. Some clitics (such as the definite article, certain prepositions, and pronouns) are purely unstressed (in Central Catalan, they have a schwa: en [ən], *[en] ‘in’). However, other function words do have a lexically stressed syllable (like Cat. entre and Sp. para and nuestros in Example (3)) that can attract a pitch accent in citation form or when they are focused, even if they are normally destressed and, thus, unable to receive a pitch accent in running speech (for Spanish, see Hualde, 2005, 2007, 2009; for Catalan, see Wheeler, 2005). In Catalan, the full vowel in these function words is maintained (see Section 1.6.2). A third group comprises those function words that are stressed and receive a pitch accent when used in context (see (4)). With the exception of words that undergo destressing in connected speech, lexically stressed syllables serve as anchors for pitch accents. In fact, pitch accents can only be associated with stressed syllables (with an exception that is the focus of Chapter 4). Therefore, stress and accent tend to covary in most circumstances (Hualde, 2005), although see Chapter 3.

¹⁰ Certain Catalan oxytonic words ending in a vowel (e.g., fuster [fusˈte] ‘carpenter’, germà ‘brother’) are actually classified as oxytones ending in a consonant (Mascaró, 1986; Hualde, 1992; Bonet & Lloret, 1998). Diachronically, these words lost a final consonant (in some cases still preserved in the orthography), which results, synchronically, in two forms of the root, with or without final consonant (cf. fusteret [fustəˈɾɛt] ‘carpenter, dim.’, germans ‘siblings’).

(3) Cat. entre les muntanyes ‘between the mountains’
    Sp. para nuestros amigos ‘for our friends’

(4) Cat. la meva germana ‘my sister’
    Sp. esta palabra ‘this word’

1.5.1 Acoustic Correlates of Stress and Accent in Catalan and Spanish

Studies on the production and perception of stress and accent in Spanish have singled out duration as an important correlate of stress, and F0 as the most salient correlate of accent. In Quilis (1971), duration, but not intensity, was found to be a good correlate of stress (stressed vowels having longer duration; see also Kim, 2011). Alfano, Savy, & Llisterri (2008) noted the same effect, although it was restricted to oxytonic words. Unlike in other studies, Alfano et al. (2008) compared the duration of different syllables within the same word. Whereas in oxytonic words the stressed syllable was longer than the unstressed one, stressed and unstressed vowels showed no durational differences in proparoxytonic and paroxytonic words. Note that in oxytonic words the stressed syllable may present extra lengthening given that it is also word-final. In fact, the stressed syllables in oxytonic words were longer than stressed syllables in paroxytonic and proparoxytonic words. In earlier work, both production and perception data suggested that F0 was the most important correlate of stress (Quilis, 1971; Llisterri, Machuca, de la Mota, Riera, & Ríos, 2005). However, these studies examined the correlates of stress in accented contexts, and, therefore, there was covariation of stress and accent. In order to tease apart the acoustic correlates of stress and accent, Ortega-Llebaria & Prieto (2007) analyzed the acoustic correlates of stressed and unstressed vowels in accented and unaccented conditions separately (following Sluijter & van Heuven, 1996a,b). Their results revealed that F0, intensity, and spectral tilt cued accent (rising pitch accent on stressed syllable vs. flat F0, higher vs. lower overall intensity, and higher vs. lower spectral tilt values for the accented and unaccented conditions respectively). Stress was conveyed by means of duration, vowel quality, and spectral tilt (longer vs. shorter duration, less vs. more central /o/, and higher vs. lower spectral tilt values for the stressed and unstressed conditions respectively). A linear discriminant analysis pointed to duration as the most effective correlate of stress, followed by vowel quality and spectral tilt. Overall intensity did not help predict stress. In a later article, the same authors (Ortega-Llebaria & Prieto, 2011) confirmed that stress was primarily cued by duration. Accented syllables were also found to be longer than unaccented ones (see also Kim, 2011). Overall intensity could also distinguish between stressed and unstressed syllables, but only in the accented condition. Vowel quality and spectral tilt did not emerge as relevant cues to stress and accent in Spanish. This latter finding was consistent with the results of an experiment on the perception of stress in unaccented contexts (Ortega-Llebaria & Prieto, 2009). Duration and overall intensity (but not spectral tilt) contributed to the perception of stress, although differently for vowels /i, a/.

For Catalan, Astruc & Prieto (2006) found that duration, spectral balance, and vowel quality cued the presence of lexical stress in contexts of phrasal deaccentuation, whereas accent was cued by pitch and overall intensity (and less robustly vowel quality and duration). Ortega-Llebaria & Prieto (2011) reported a significant effect of stress on duration, but not on overall intensity. Accent did not affect duration or intensity. A study on the perception of stress in unaccented contexts in Catalan (Ortega-Llebaria, Vanrell, & Prieto, 2010) revealed that listeners rely on duration and intensity, although duration is clearly the most robust correlate. This paper provided evidence that Central Catalan speakers can perceive stress even in the absence of important cues such as F0 and vowel quality.

1.6 The Languages under Study

The languages under study in this dissertation are Northern-Central Iberian Spanish and Central Eastern Catalan. These two languages are spoken in the Iberian Peninsula, and are genetically related, both having evolved from Latin. Spanish is an Ibero-Romance language, whereas the classification of Catalan has been more debated (Moll, 2006[1952]; López del Castillo, 1991), with some authors grouping it with the Ibero-Romance languages and others classifying it as a Gallo-Romance language (for example, Griera, 1965). In another view, it has been considered a transition language between the two groups. The most accepted view nowadays positions Catalan within the Gallo-Romance subfamily from its origins up to the 15th century, and within the Ibero-Romance branch from then on (Montoya Abat, 2002). Historically, the Catalan language originated in the northernmost area of the territory where it is currently spoken, on both sides of the Pyrenees and in contact with Occitan. It spread southwards with the territorial expansion of the Crown of Aragon and Catalonia through formerly Islamic lands where other Romance varieties (Mozarabic) and Arabic were spoken, coming into contact with Aragonese and Castilian (Spanish) in the process. This explains the apparent change in linguistic affiliation (from Gallo-Romance to Ibero-Romance) that has been postulated. The Spanish language developed in the original territory of the medieval Kingdom of Castile, which had its capital in Burgos. Its territorial expansion followed the pattern explained for Catalan, with the difference that Castilian eventually became the dominant language of all of Spain, thus receiving also the name of ‘Spanish’. Increased centralization in Spain has led to the current situation where other languages, such as Catalan, are spoken in a situation of bilingualism with Spanish.

1.6.1 Spanish

Iberian or Peninsular Spanish can be divided into two broad dialectal varieties: Southern Peninsular Spanish (spoken in Andalusia, Murcia, and part of Extremadura) and Northern-Central Peninsular Spanish. Two very salient features that distinguish the latter from Southern Peninsular Spanish and from Latin American varieties include the preservation of the /θ/-/s/ phonemic distinction (e.g., caza /ˈkaθa/ ‘hunt’ vs. casa /ˈkasa/ ‘house’) and the strident post-velar or uvular realization of /x/ (rojo [ˈroχo] ‘red’). For more details, see Hualde (2005) and Lipski (2012).

Vowel Phenomena in Spanish

Spanish has a symmetrical vowel inventory, with two high vowels (/i, u/), two mid vowels (/e, o/), and one low vowel (/a/). These five vowels can appear in stressed and unstressed position. The Spanish vowel system departs from the common Romance seven-vowel system, which presents another set of mid vowels (/ɛ, ɔ/). Late Latin stressed vowels /ɛ, ɔ/ diphthongized to /ie, ue/ respectively in all contexts in Spanish, giving rise to morphophonological alternations as in (5).

(5) pienso ‘(I) think’, but pensar ‘to think’ (cf. Cat. penso [ˈpɛnsu])
    bueno ‘good’, but bondad ‘kindness’ (cf. Cat. bo [ˈbɔ])

In spite of important phonological differences in the consonant inventory among varieties of Spanish, the vowel system is quite stable phonologically, with only minor phonetic differences (Chládková et al., 2011; Kim, 2011). There are, however, certain phenomena affecting Spanish vowels. In Puerto Rican Spanish, a variable phenomenon involving raising of unstressed (mostly word-final) /e, o/ (> [i, u]) in the speech of older speakers in rural regions has been reported (Oliver, 2008). This alternation between unstressed mid and high vowels is also found in areas of Mexico and parts of Extremadura (Viudas Camarasa, Ariza Viguera, & Salvador Plans, 1987, p. 28) and Andalusia (Becerra Hiraldo & Vargas Labella, 1986, p. 14). Mid vowel raising has also been observed in mid-low vowel sequences, resulting in diphthongization, in Mexican and, less frequently, Colombian Spanish (Garrido, 2008). Andean Spanish has been characterized as having “unstressed vowel reduction”. This term, in this case, encompasses different related phenomena including vowel shortening, devoicing, and complete elision, especially when preceding /s/ (Lipski, 1990). In an experimental study, Delforge (2008) noted that in Peruvian Spanish /i, e, u/ were frequently devoiced, and that vowel reduction understood as centralization did not occur. Lope Blanch (1963) reported sporadic elision of unstressed vowels, especially in contact with /s/ in Mexican Spanish. Finally, Eastern Andalusian Spanish clearly departs from other Iberian and Latin American varieties in its opening of /e, o/ to [ɛ, ɔ] in word-final position before an elided /s/ (Hualde, 2005), so that vowel opening functions as a plural marker. This more open quality may spread to other vowels in the word, by a process of vowel harmony.

1.6.2 Catalan

Catalan is spoken in four different states. It is spoken in the Spanish autonomous communities of Balearic Islands, Catalonia, and Valencia (where it is co-official with Spanish), and in the regions of El Carxe (Murcia) and la Franja de Ponent (Aragon). Outside of Spain, Catalan is the sole official language of the Principality of Andorra, and it is also spoken, without official status, in the South of France (Languedoc-Roussillon region) and in the town of Alghero on the isle of Sardinia (Italy). Catalan is divided into two broad dialectal blocks: Western and Eastern Catalan (see the map in Figure 1.1). The Western block encompasses the varieties spoken in Aragon, Murcia, Valencia, and Western Catalonia. The other varieties constitute the Eastern block. The variety under study here is Central Catalan (in dark grey in Figure 1.1), which occupies a geographically central position within the Eastern Catalan block.
