Mind your /ti/’s and q’s: A subsegmental approach to affrication in Québec French

This paper presents experimental evidence for an additional phase in affrication in /ti, ty/ sequences in Québec French. Namely, beyond the standard stop release and fricative-like portion of what is standardly transcribed [ts], a phase resembling a partially voiceless vowel or aspiration frequently manifests itself before (or in the absence of) the full vowel. Phonetic correlates of this phase are intermediate voicing and a mid-point decline in centre of gravity, in stark contrast with target fricative /s/. This sort of multi-phased affricate appears to lack counterparts in the literature on unaspirated affricates. While the final representation of these segments and the motivation of their internal composition are left for future work, the potential consequences of the addition of this intermediate phase are briefly explored, in particular with reference to Q Theory.


Introduction
Affrication in Québec French (QF) is the well-known process by which, according to traditional descriptions, coronal stops are pronounced with a fricative-like release before high front vocoids, that is, /t, d/ ⟶ [ts, dz] before /i, j, y, ɥ/. This process, which is categorical within words but optional across word boundaries (Dumas 1987; Ostiguy and Tousignant 1993), nevertheless presents some complications in the literature. Namely, frication (whether due to an affricate or an underlying fricative) is an independent environment for a number of processes targeting following high vowels, such as lenition, devoicing and deletion (e.g., Gendron 1966; Cedergren and Simoneau 1985). Additionally, partial or total /d/-devoicing has been noted in QF affricate sequences, especially in the speech of young women (Bento 1998).
The present paper expands upon this literature by proposing the existence of an additional phase resembling a devoiced vowel as a part of affrication in /ti, ty/ sequences. In order to motivate such a process, an experiment involving a reading task of these sequences and /s/ was performed. The non-vocalic portions of /ti, ty/ sequences were then segmented, and voicing, centre of gravity (COG) and dispersion were measured. After meeting /s/-like frequencies, the affricates of QF show a significant decline in COG over time with, for most speakers, an accompanying rise in dispersion. Voicing is also absent from the traditional fricative-like phase of affricates but intermediate in the following phase, where present. Meanwhile, the COG of target /s/ consonants remains stable over time. We deduce from this information that the output of affrication of /t/ in QF should not be considered a simple or target [ts] but rather something closer to an aspirated affricate [tsʰ] or an affricate followed by a partially voiceless vowel.
The rest of this paper is structured as follows: Section 2 explores the linguistic and phonetic background of affrication in QF and more generally. Section 3 presents the methodology of the experiment, whose results are laid out in §4. Section 5 discusses these results and their consequences for theories of representation, forwarding the argument that Q Theory may be especially appropriate for the complex sequences generated by this process. This paper ends with a summary in section 6.

Background

Linguistic background
Affrication has long been noted in descriptions of QF. For instance, Dunn (1880: 53, 180) states: "On serait tenté de dire que le d n'existe pas dans la langue franco-canadienne, car, dans la prononciation, nous remplaçons cette lettre par une autre qui renferme un son sifflant et que l'on pourrait indiquer par dz…. Au t comme aux d les Canadiens-fr[ançais] donnent un son sifflant." See also Rousseau (1935) for an early phonetic description, as well as a dialectal survey of the same phenomenon in Hexagonal French at the beginning of the twentieth century. Though one of the more stereotypical aspects of QF pronunciation (e.g., Friesner 2010), this process has long been and remains unstigmatized within Québec and varies little with social class (Dunn 1880; Dumas 1987).
Similarly, affrication is present in almost all geographical areas of Québec with the exception of the Charlevoix region, where it is more variable (Poirier 1994), and varieties of Acadian French spoken in Gaspésie, Côte-Nord and les Îles de la Madeleine (Dumas 1987). Affrication is largely absent from Acadian French in general, with the noted exceptions of Prince Edward Island (King and Ryan 1989) and Northeast New Brunswick (Cichocki and Perreault 2018), where additional, local variants of affrication include the palatalized [dʒ] and aspirated [tʰ] (in the more traditional sense of the word).

Phonetic background
According to an X-ray study by Charbonneau and Jacques (1972), the articulation of coronal stops in affrication settings can be distinguished from those in neutral contexts by several factors. First, both the active and passive articulators are slightly different: whereas they show simple stops (i.e., /t, d/ not before high front vocoids) to be articulated with the tongue tip against the alveolar region, in affrication settings, these stops are articulated with the tongue predorsum further back in the postalveolar or prepalatal region. The two kinds of stops are additionally distinguished by the rapidity of tongue blade lowering after stop release, being much slower in the case of affricated stops. Finally, later portions of affricated stops are articulated with the tongue tip pointed down towards the lower teeth, whereas the tongue blade is fairly flat in the case of unaffricated stops. In both cases, the tongue body bunches almost immediately after stop release to form the gestures of the following vocalic segment; again, what distinguishes the two kinds of stops is the direction and velocity of the concurrent tongue tip movement and the resultant surface area in the anterior oral tract.
Typologically speaking, what is called affrication in QF is not entirely rare, and it has a well-documented phonetic grounding. The production of an affricate from a coronal stop before a high (front) vocoid is just one of several possible outcomes of assibilation more generally; others include /t/ ⟶ [s] and [tʃ] (Hall et al. 2006). Assibilation targets are typically, but not necessarily, coronal, while triggers are most often high and front vocoids (Kim 2001). This process is favoured in coronal stop + high front vocoid sequences in particular because of the similar loci of stop release and relatively high degree of closure in the vocoid, creating the requisite conditions for turbulence (Jäger 1978). While all released stops necessarily have some small degree of friction release before all vowels, this release is demonstrably longer before high vowels (e.g., Ohala 1983; Clements 1999). Hall et al. (2006) distinguish two phases within affrication, namely, burst friction (as mentioned above) and aspiration. The former corresponds to the release of the plosive element and therefore shows spectral properties characteristic of its place of generation. In particular, the energy of this phase is typically in the 3500 to 7000 Hz range. This phase necessarily precedes and is shorter than the aspiration phase, which shows higher and less dispersed spectral energy. Spectral peaks resembling the formants of the adjoining vowel can also be observed, commensurate with the positioning of articulators to produce the vocoid's constriction. In order to avoid confusion surrounding the term aspiration, the rest of this paper will primarily refer to this phase as friction or fricative-like, in comparison with a stop release phase (not considered in the experiment). We consider here evidence from QF for the need to subdivide the fricative-like phase in two, resulting in an additional but optional phase.
Such a proposal is based on the dynamic but fairly abrupt behaviour of both spectral energy and vowel-like formants between burst friction and so-called pure vowels (i.e., not mixed with any frication), as illustrated in Figures 1a and 1b. Specifically, there appears to be an abrupt lowering and/or dispersion of higher energy in the spectrum, as well as the appearance or strengthening of formants associated with the following vowel.
Figure 1. Affricates in /ty/ (têtu) and /dy/ (dûment) sequences, speaker 1, with the proposed division between frication and an intermediate phase (panels a and b).
Whether this phase or the former, much more fricative-like phase should be considered as more analogous with the aspiration phase of Hall et al. (2006) is still unclear, as are the exact criteria for identification of boundaries. This paper examines some, but certainly not all, quantifiable means of subdividing affrication, as discussed in §3.
In this paper, we examine the nature of this phase in /ti, ty/ sequences in QF and how it may be distinguished from the preceding phase as well as from the underlying fricative /s/. Where present, this phase is considered to be a voiceless variant of the following vowel, i.e., [i̥] or [ẙ], though aspiration of the affricate is another possibility. Evidence from the voiced affricate sequences (not discussed here) also suggests the term fricativized high vowel may be more appropriate, with a transcription along the lines of [s̩] and [s̩ʷ]. Regardless of what exactly this intermediate phase may be and how exactly to define and delineate it, the evidence strongly suggests the aspiration phase is far from stable.

Methodology
A reading list of French words was constructed for the experiment. Target sequences included tokens of /ti/ and /ty/ in the word-initial and word-final contexts. The list comprised one word per sequence, per context for each of the five following consonant types: voiceless plosive, voiced plosive, voiceless fricative, voiced fricative and sonorant. Three additional words included the target sequences in absolute word-final position. As a suitable word-initial /ty/ + voiced fricative (other than /ʁ/) could not be found, this design in the end yielded 25 words. An additional 24 targets involving /di, dy/ were included in the stimuli but are not analyzed here. Finally, 50 distractor words not containing the target sequences were added to the reading list. These items included, among others, series of /p/- and /l/-initial words to distract from the /t, d/-initial words, as well as words containing intervocalic /s/ which served as controls (as well as /z/, for future research). This list was randomized four times, and each randomization was incorporated into a slideshow presentation.
Ten native speakers of QF were recruited for the purposes of this study; the results of the first five speakers are presented here. These five speakers were all female, with an average age of approximately 24. No participants came from non-affricating areas of Québec. Each speaker read the four randomized lists aloud at a self-directed pace into a Samson Meteor microphone. Recordings were performed in Praat in mono with a sampling rate of 44.1 kHz.
Target /ti, ty/ sequences were then divided into the following phases, where present: fricative release, voiceless vowel and vowel. Stop release was not included in measurements, though the onset of frication was defined as the end of this phase. The voiceless vowel phase was distinguished visually from the frication phase based on abrupt changes in spectral energy and in formant structure. The reader is referred back to Figures 1a and 1b for examples of this segmentation scheme, in comparison with Figure 1 of Hall et al. (2006: 64). All in all, 387 sequences were analyzed, as two lists from speaker 2 had to be excluded due to microphone error.
Voicing of each phase was extracted automatically using information from the Praat Voice Report (pitch range: 75-500 Hz; otherwise standard settings) and expressed as a percentage. COG was extracted from each phase at 5 ms intervals based on a spectrogram with a maximum frequency of 11 kHz (otherwise standard settings) after application of a 500 Hz high-pass filter. The standard deviation of these spectral slices provided the dispersion measurement. Mean COG was calculated for each phase for each token. Finally, timestamps were scaled within word, speaker and reading in order to allow for the COG and dispersion measurements to be passed to an SSANOVA function in R using the gss package (Gu 2014).
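The two spectral measures and the time scaling described above can be sketched as follows. This is a minimal illustration in Python rather than the Praat and R procedure actually used in the study; the function names, the power weighting of the spectrum and the use of a simple band limit (500 Hz to 11 kHz) in place of a true high-pass filter are all assumptions made for the example.

```python
import math

def spectral_moments(freqs_hz, power, low_hz=500.0, high_hz=11000.0):
    """Return (COG, dispersion) in Hz for one spectral slice.

    freqs_hz and power are parallel sequences describing the slice.
    Power weighting and the band limits are assumptions of this sketch.
    """
    # Keep only the components inside the analysed band.
    pairs = [(f, p) for f, p in zip(freqs_hz, power) if low_hz <= f <= high_hz]
    total = sum(p for _, p in pairs)
    if total <= 0:
        return None  # no energy in the band, so the moments are undefined
    # Centre of gravity: power-weighted mean frequency.
    cog = sum(f * p for f, p in pairs) / total
    # Dispersion: power-weighted standard deviation around the COG.
    variance = sum(((f - cog) ** 2) * p for f, p in pairs) / total
    return cog, math.sqrt(variance)

def normalize_times(times_s):
    """Scale the timestamps of one token to the interval [0, 1]."""
    start, end = min(times_s), max(times_s)
    span = end - start
    if span == 0:
        return [0.0 for _ in times_s]
    return [(t - start) / span for t in times_s]
```

Under these assumptions, a slice with equal energy at 3000 Hz and 7000 Hz has a COG of 5000 Hz and a dispersion of 2000 Hz, so a spectrum that gains lower-frequency energy while keeping its higher-frequency energy shows a falling COG together with a rising dispersion.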

Results
The results for voicing and spectral characteristics are presented in this section. Duration was also measured, but given the subjective nature of phase identification and given other factors which may influence vowel duration (namely, the presence of lengthening and laxing consonants in closed, word-final syllables), these numbers should be treated with some skepticism. However, it was found that the intermediate phase was on average half as long as the friction phase (0.04 s vs. 0.08 s), both of which were shorter than true vowels (0.12 s). Similarly, another result which must be considered tentative (due again to the subjective nature of segmentation) but which is still reported here is the frequency of various affrication types. The maximally four-phased sequence (stop release, friction, voiceless vowel and vowel) was evidenced quite often (318 of 387 cases), while the full vowel was missing from this sequence in 29 cases. Various other profiles comprised the remaining 40 cases.

Voicing
The average percentage of voicing by phase is provided by speaker and vowel in Table 1. Trends were fairly uniform between vowels, with the exception of speaker 2, whose intermediate phase was on average far less voiced for /y/ than for /i/ (20.2% voiced vs. 64.1%, respectively). Otherwise, we observe no important vowel-specific differences. In addition, we observe little phase-internal variation in the averages. That is, regardless of speaker or vowel (the above noted exception aside), the friction phase showed nearly no voicing, while the pure vowel phase was almost at ceiling rates. In between, the proposed additional phase showed intermediate levels of voicing. The boxplot in Figure 2 illustrates these same trends in more detail. Note that while the averages remained similar within the intermediate phase, some potentially important variation in ranges of voicing of this phase can be ascertained. Speaker 1 aside (who was fairly consistent in her devoicing), within vowel categories, some speakers appear to devoice more often than others. For instance, speakers 4 and 5 devoiced the intermediate phase of /i/ more frequently than speakers 2 and 3. This trend reverses somewhat for /y/, with speaker 2 devoicing more than speaker 5 and speakers 3 and 4 somewhere in between. The amount of data and the number of speakers do not currently allow for a statistical exploration of these effects, but future work may be able to take them into account.

Spectral characteristics
In this section, we consider mean COG by segment type: first, /t/ before /i, y/ and second, /s/, in order to provide a point of comparison with previous studies. Then we look at changes in COG over time (along with dispersion), comparing intervocalic fricative /s/ with the frication and intermediate partially voiceless vowel phases of /ti, ty/ sequences. The (pure) vowels of neither /s/ nor the affricate sequences are included. Note that the discussion of COG over time does not presuppose or depend on the existence of phases. As frequency was not normalized, we consider individual, rather than group trends. Table 2 presents the mean COG of /t/ (without burst friction) before /i/ and /y/ separately and of intervocalic /s/. Mean COG was lower for each participant in the consonantal phases of /t/ before /y/ than of those before /i/. In fact, for certain speakers, the /t/ in /ti/ sequences had a mean COG similar to that of /s/. The boxplot in Figure 3 breaks the /t/ in /ti, ty/ sequences down further into friction and the intermediate voiceless vowel phases, in comparison with that speaker's /s/. Colour indicates vowel type. When broken down into phases, the initial phase of /ti/ sequences (stop release aside) is quite similar to /s/. In /ty/ sequences, this phase has lower COG for all speakers except speaker 4. For all speakers, the intermediate phase has lower COG than the preceding phase and /s/, regardless of the vowel. Figure 4 presents the SSANOVA results for each speaker, using a three-way interaction of normalized time, segment type and speaker. The y axis represents the COG in Hz. Solid lines indicate COG and dashed lines (towards the bottom of the graphs) dispersion, while colour indicates segment type. A common characteristic is that centre of gravity of the non-vocalic phases of /ti, ty/ sequences showed an important decline, in comparison with the relatively stable centre of gravity of /s/. 
Speakers 2, 3 and 4 showed a later and steeper decline in comparison with that of speakers 1 and 5. Meanwhile, increases in dispersion in /ti, ty/ sequences seemed to correlate with the decrease in centre of gravity, with the exception of speaker 5, whose dispersion remained relatively flat for both segment types.
A linear mixed effects model was fitted in R (via RStudio) with the nlme library (Pinheiro et al. 2018) to determine the relationship between centre of gravity and normalized time according to segment type (/s/ vs. the non-vocalic portions of assibilated /t/). The latter two factors were entered as fixed effects with an interaction between the two. By-subject random intercepts and slopes were also included. The main effect of time was not significant (β = 434.1, SE = 344.1, p = 0.2072), but the main effect of segment found /t/ to have a generally higher COG than /s/ (β = 310.9, SE = 79.2, p < 0.001). This significantly higher COG, by only 310.9 ± 79.2 Hz, is at best suggestive of a slight difference in place of articulation, or perhaps an artefact of the model design. The interaction between time and segment found the COG of /t/ to fall significantly over time relative to that of /s/ (β = -3563.7, SE = 136.2, p < 0.001). The model, however, explains relatively little of the variation present in the data (R² = 0.271).
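To make the interaction term easier to read, the reported fixed-effect estimates can be combined as in the sketch below. It assumes treatment coding with /s/ as the reference level and normalized time running from 0 to 1; neither assumption is stated in the text, the intercept is not reported (so only offsets from it are computed), and the function names are hypothetical.

```python
# Fixed-effect estimates as reported in the text (Hz).
BETA_TIME = 434.1            # main effect of normalized time
BETA_SEGMENT = 310.9         # main effect of segment (/t/ relative to /s/)
BETA_INTERACTION = -3563.7   # time x segment interaction

def cog_offset(time, is_t):
    """Predicted COG relative to the unreported intercept, in Hz.

    Assumes /s/ is the reference level (is_t=False) and that time is
    normalized to [0, 1]; both are assumptions of this sketch.
    """
    seg = 1 if is_t else 0
    return BETA_TIME * time + BETA_SEGMENT * seg + BETA_INTERACTION * time * seg

def slope(is_t):
    """Predicted change in COG across the whole normalized interval, in Hz."""
    return BETA_TIME + (BETA_INTERACTION if is_t else 0.0)

if __name__ == "__main__":
    print("slope for /s/:", round(slope(False), 1))
    print("slope for /t/:", round(slope(True), 1))
    print("/t/ minus /s/ at onset:", round(cog_offset(0.0, True) - cog_offset(0.0, False), 1))
    print("/t/ minus /s/ at offset:", round(cog_offset(1.0, True) - cog_offset(1.0, False), 1))
```

Under these assumptions, the fixed effects alone predict that /t/ starts about 311 Hz above /s/ and ends about 3253 Hz below it, with a net slope of roughly -3130 Hz over the normalized interval, against a non-significant slope of about 434 Hz for /s/.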

Discussion
On one hand, the results provide yet more evidence for the long-documented fricative-like pronunciation of /t/ before /i, y/ following its initial release. Namely, this portion of the segment begins with a profile quite similar to that of /s/ in voicelessness, centre of gravity and dispersion. On the other hand, the evidence also suggests that this phase is perhaps less unitary than suggested by its transcription as [ts] or [tˢ].
Regardless of how one segments such sequences (i.e., whether we accept the notion of an intermediate phase), COG falls at a significant rate in these segments, in addition to showing sporadic or incomplete voicing. The apparently concomitant rise in dispersion points to a widening of the range of spectral energy in addition to its overall decline, suggestive of the gradual addition of lower energy to the spectrum (potentially associated with vowel formants), all while maintaining, at least for some duration, its initial higher energy (associated with fricatives). In this sense, the (roughly) latter half of these segments is neither fully fricative-like nor fully vowel-like. However, additional measurements need to be performed in order to confirm the origin of these effects. In particular, formant activity and skewness are likely to be informative as to how both extremes of the spectrum behave. Additionally, the effect of the following vowel (/i/ vs. /y/) must be more robustly investigated.
It is unclear to what degree these so-called affricates in QF are different from presumably stable affricates in other languages. This is primarily due to methodological differences between this study and others in the literature, which tend to employ averaging of spectra and mid-point measurements of COG. Differences in terminology also require a closer look. This study used a deliberately dynamic approach, taking spectral slices and COG at 5 ms intervals. In one study with a similar methodology, Butler (2012) finds a gradual rise in the COG of affricate [tʃ] in comparison with that of fricative [s] in Khmer, while the COG of both falls at similar rates at the right boundary of the segment.
Otherwise, we can at least infer that, even if some differences are documented between the COG of /s/ and /ts/, their ranges or variation are not as extremely different as seen here. The closest evidence found for another instance of similarly multi-phased affricates comes from Nyagrong Minyag (Van Way 2018), in which the interquartile range of COG values for /ts/ appears to extend from 5000 to 8750 Hz (p. 80), versus from 7500 to 8750 Hz for the fricative portion of /sʰ/ (p. 87), as estimated from boxplots. The larger range of COG in /ts/, measured at its "stable portion of frication" (p. 78) without periodic energy, may point to a late decline similar to that noted here. Meanwhile, in both Eastern and Western Catalan (Recasens and Mira 2018: 153), /s/ and /ts/ showed both similar COG means and ranges. Taking Eastern Catalan as an indicative example, /s/ had a mean and range of 3960.6 and 591 Hz, respectively, while /ts/ had a mean and range of 4095.9 and 611.3 Hz.
All in all, more work is required to determine what points of comparison there may be between QF affricates and others. Based on examples from Van Way (2018), however, it may be that the so-called voiceless vowel portion of QF affricates discussed here is in fact true aspiration, e.g., [tsʰi] rather than [tsi̥i]. These may, however, functionally be notational variants of the same entity. This question is left open for now, especially in the absence of /di, dy/ sequences, which are tentatively more suggestive of fricative-vowel mixing than breathy aspiration.
Regardless of what exactly intervenes between the [s]-like phase and the vowel of /ti, ty/ sequences, we have seen evidence that something must intervene, as their consonantal (or at least not entirely vocalic) portions are more often than not unstable in the locus and dispersion of their frication. That is, while the [s] portion is evidenced, the whole segment does not behave as such. Even in the absence of such an intervening phase, but especially so now with its argued presence, affrication in QF regularly creates complex segments with identifiable internal structure, and it may do so within both the consonantal and vocalic elements of these sequences.
In this sense, Q Theory (e.g., Inkelas and Shih 2016) is particularly well suited to model affrication. In this theory, the traditional segment (here, Q) is maintained but divided into subsegments (represented by q) which frequently, but not necessarily, number maximally three per segment. When combined with Agreement by Correspondence (ABC) Theory (e.g., Walker 2000) to give ABC+Q Theory (e.g., Inkelas and Shih 2016), we can model phonetically-motivated interactions between subsegments. This is particularly desirable if we consider the intermediate phase as evidenced here to be a partial assimilation which, unless relegated to phonetics, proves problematic in optimality theoretic frameworks, if not the majority of bimodal frameworks. In the case of [tsi̥i], we may consider assibilation of the stop, due to aerodynamic constraints linking /t/ and /i/, as feeding vowel devoicing, due to the affinity of fricatives with voicelessness (Ohala 1997) and their well-documented likelihood of leading to vowel devoicing in QF (Cedergren and Simoneau 1985; Bayles 2016). In Q-theoretic terms, this would correspond to a derivation along the following lines: C(t₁t₂t₃)V(i₁i₂i₃) ⟶ C(t₁s₂s₃)V(i₁i₂i₃) ⟶ C(t₁s₂s₃)V(i̥₁i₂i₃).
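The derivation above can be pictured with a small data-structure sketch in which each segment Q holds three subsegments q. This is only a toy, rule-based illustration: ABC+Q Theory evaluates correspondence constraints rather than applying ordered rewrites, and the class, the function names and the trigger set used here are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Voiceless counterparts: i/y followed by a combining ring (below/above).
DEVOICED = {"i": "i\u0325", "y": "y\u030A"}
HIGH_FRONT = {"i", "y"}  # simplified trigger set for this sketch

@dataclass(frozen=True)
class Q:
    """A segment made of three subsegments (q1, q2, q3)."""
    kind: str                          # "C" or "V"
    subsegments: Tuple[str, str, str]

    def __str__(self):
        return f"{self.kind}({' '.join(self.subsegments)})"

def assibilate(cons, vowel):
    """C(t t t) becomes C(t s s) before a high front vowel."""
    if cons.subsegments == ("t", "t", "t") and vowel.subsegments[0] in HIGH_FRONT:
        cons = replace(cons, subsegments=("t", "s", "s"))
    return cons, vowel

def devoice_vowel_onset(cons, vowel):
    """The first vowel subsegment devoices after a fricative subsegment."""
    if cons.subsegments[-1] == "s":
        first = vowel.subsegments[0]
        new = (DEVOICED.get(first, first),) + vowel.subsegments[1:]
        vowel = replace(vowel, subsegments=new)
    return cons, vowel

def derive(cons, vowel):
    """Apply assibilation, which then feeds vowel-onset devoicing."""
    return devoice_vowel_onset(*assibilate(cons, vowel))

if __name__ == "__main__":
    c, v = derive(Q("C", ("t", "t", "t")), Q("V", ("i", "i", "i")))
    print(c, v)
```

The point of the sketch is only that the partial changes target individual subsegments while the Q-level segments C and V remain intact, which is the property the text appeals to.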

Conclusion
This paper examined the prevalence and ramifications of a proposed additional phase of affrication of /t/ in QF resembling a partially devoiced vowel (or potentially, aspiration). This phase was characterised by abrupt changes in higher spectral energy and the appearance or strengthening of vowel-like formants. It was found in a reading task experiment that such a phase was more often present than not. Additionally, the non-vocalic portions of affricates demonstrated a significant decline in centre of gravity, in comparison with the stable profile of /s/, as well as a rise in dispersion for most speakers. Finally, this phase demonstrated middling rates of voicing, while the friction phases showed next to zero percent voicing and the pure vocalic phases near-ceiling rates of voicing. It may be that the behaviour of these affricates differs from that of (unaspirated) affricates in other languages. More studies, especially ones with a dynamic approach to centre of gravity, need to be consulted in the future.
At this stage, it is difficult to take a firm stance on what this phase must be, or whether or not it constitutes a real target in QF. However, affricate sequences show strong evidence that either such a phase exists as a target, or that some aspect of these sequences allows for, if not prefers, loose interpolation. It is advanced for the time being that /ti, ty/ sequences in QF are routinely realized either as affricate + partially devoiced vowel sequences or as affricates that are additionally aspirated, in the classic sense. This paper argues that the former in particular can be elegantly captured in Q Theory, given its ability to formalize partial assimilations. The latter analysis may equally be achievable in this theory if underspecification of subsegments is to be implemented. This, however, requires a more robust notion of the specification (featural or otherwise) of subsegments themselves, and is thus left for future work.
Finally, the behaviour of /di, dy/ sequences is likely to shed further light on the nature of affrication in QF. Of particular interest is the partial or total stop devoicing documented in the literature (Bento 1998) and how this may interact with the voicing effects noted here. If an intermediate phase is observed, we may expect it to have similar properties to the partially devoiced, vowel-like phase described here. Alternatively, in the absence of devoicing, we may observe fricative-vowel mixing. Either will help better establish the results documented here and the phenomenon more generally.