To recap as briefly as possible: in English, when a word ends in a consonant cluster whose final segment is /t/ or /d/, that /t/ or /d/ is sometimes deleted. This deletion can affect a whole host of different words, but the ones which have been of most interest to the field are the regular past tense (e.g., packed), the semiweak past tense (e.g., kept) and morphologically simplex words (e.g., pact), which I'll call mono. Other morphological cases which can be affected, and which I believe have occasionally and erroneously been categorized with the semiweak, are the no-change past tense (e.g., cost), the "devoicing" (or something) past tense (e.g., built), the stem-changing past tense (e.g., found), etc. For the sake of this post, I'm only looking at the main three cases: past, semiweak, and mono.
Now, Guy (1991) came up with a specific proposal: if you describe the proportion of pronounced /t d/ for past as p, for semiweak as p^j, and for mono as p^k, then j = 2 and k = 3. It is specifically whether or not j = 2 and k = 3 that I'm interested in here. If you've calculated the proportions of pronounced /t d/ for each grammatical class, you can calculate j as log(semiweak)/log(past) and k as log(mono)/log(past). The trick is in how you decide to calculate those proportions.
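To make the arithmetic concrete, here is a minimal Python sketch of the log-ratio calculation. The retention rates are hypothetical, chosen so that the exponential pattern holds exactly, and the helper name `exponent` is just for illustration.

```python
import math

def exponent(class_rate, past_rate):
    """Return x such that class_rate == past_rate ** x."""
    return math.log(class_rate) / math.log(past_rate)

# Hypothetical retention rates (not from any corpus), built to satisfy
# semiweak = past^2 and mono = past^3 exactly
past = 0.8
semiweak = past ** 2
mono = past ** 3

j = exponent(semiweak, past)
k = exponent(mono, past)
print(round(j, 2), round(k, 2))
```

With real data the three proportions are estimated separately, so j and k will only land on 2 and 3 if the data actually follow the proposed pattern.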
For this post, you can play along at home. Here's code to get set up. It'll load the Buckeye data I've been using, and do some data prep.
So, how do you calculate the rate at which /t d/ are pronounced at the end of the word when you have a big data set from many different speakers? Traditional practice within sociolinguistics has been to just pool all of the observations from each grammatical class across all speakers.
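As a rough illustration of what pooling means, here is a Python sketch on a tiny made-up table of counts. Everything in it (speakers, words, counts, the name `toy_counts`) is invented for the example and is not the Buckeye data.

```python
import math
from collections import defaultdict

# Made-up counts, NOT the Buckeye data:
# (speaker, word, grammatical class, tokens with /t d/ pronounced, total tokens)
toy_counts = [
    ("s1", "packed", "past",     9, 10),
    ("s2", "missed", "past",     8, 10),
    ("s1", "kept",   "semiweak", 7, 10),
    ("s2", "left",   "semiweak", 8, 10),
    ("s1", "just",   "mono",    20, 50),  # one very frequent word
    ("s1", "pact",   "mono",     4,  5),
    ("s2", "just",   "mono",     5, 10),
    ("s2", "fact",   "mono",     4,  5),
]

# Pool every token in a class, ignoring which speaker or word it came from
pronounced = defaultdict(int)
total = defaultdict(int)
for _speaker, _word, gram_class, n_pronounced, n_total in toy_counts:
    pronounced[gram_class] += n_pronounced
    total[gram_class] += n_total

pooled = {c: pronounced[c] / total[c] for c in total}

j_pooled = math.log(pooled["semiweak"]) / math.log(pooled["past"])
k_pooled = math.log(pooled["mono"]) / math.log(pooled["past"])
print(pooled)
print(round(j_pooled, 2), round(k_pooled, 2))
```

The toy numbers only show the mechanics. The estimates reported in this post come from the Buckeye data, not from this table.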
With the Buckeye data, pooling comes out to j = 1.91, k = 3.1, which is a pretty good fit to the proposal of Guy (1991).
The problem is that this isn't really the best way to calculate proportions like this. Some words are super frequent, and they therefore get more "votes" in the proportion of their grammatical class. Some speakers also talk more than others, so they get more "votes" towards making the overall proportions look more like their own. One way to address this is to first calculate the proportion for each word within a grammatical class within a speaker, then for each grammatical class within a speaker, then for each grammatical class. Here's the code for this nested proportion approach.
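The three nesting steps can be sketched in Python on a small made-up table. This is an illustration under two assumptions of mine: the counts are invented (not the Buckeye data), and each level is combined with a simple unweighted mean.

```python
import math
from collections import defaultdict
from statistics import mean

# Made-up counts, NOT the Buckeye data:
# (speaker, word, grammatical class, tokens with /t d/ pronounced, total tokens)
toy_counts = [
    ("s1", "packed", "past",     9, 10),
    ("s2", "missed", "past",     8, 10),
    ("s1", "kept",   "semiweak", 7, 10),
    ("s2", "left",   "semiweak", 8, 10),
    ("s1", "just",   "mono",    20, 50),  # one very frequent word
    ("s1", "pact",   "mono",     4,  5),
    ("s2", "just",   "mono",     5, 10),
    ("s2", "fact",   "mono",     4,  5),
]

# Step 1: proportion for each word within a class within a speaker
word_props = defaultdict(list)
for speaker, _word, gram_class, n_pronounced, n_total in toy_counts:
    word_props[(speaker, gram_class)].append(n_pronounced / n_total)

# Step 2: one proportion per class within a speaker (assumed: unweighted mean over words)
speaker_props = defaultdict(list)
for (_speaker, gram_class), props in word_props.items():
    speaker_props[gram_class].append(mean(props))

# Step 3: one proportion per class (assumed: unweighted mean over speakers)
nested = {c: mean(props) for c, props in speaker_props.items()}

j_nested = math.log(nested["semiweak"]) / math.log(nested["past"])
k_nested = math.log(nested["mono"]) / math.log(nested["past"])
print(nested)
print(round(j_nested, 2), round(k_nested, 2))
```

In this toy table, pooling the same counts would give a mono rate of 33/70 (about 0.47), because the 50 tokens of "just" from one speaker dominate. Nesting gives 0.625 instead, which pulls k from about 4.6 down to about 2.9. The direction of the shift depends entirely on the made-up counts. The point is only that every word and every speaker now counts once.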
All of a sudden, we're down to j = 1.34 and k = 2.05, and I haven't even dipped into mixed-effects models black magic yet.
But when it comes to modeling the proposal of Guy (1991), calculating the proportions is really just a means to an end. I asked Cross Validated how to directly model j and k, and apparently you can do so using a complementary log-log link. So here is the mixed effects model for j and k directly.
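Why this link works can be checked with a little arithmetic. The Python sketch below is not the mixed-effects model itself, just a numeric check under an assumption of mine: that the modeled outcome is deletion, so the probability passed to the link is 1 minus the retention rate. Then cloglog(1 - p^k) = log(-log(p^k)) = log(k) + log(-log(p)), which means the grammatical class coefficient on the link scale is log(k), and exponentiating it gives the exponent back.

```python
import math

def cloglog(prob):
    """Complementary log-log link: log(-log(1 - prob))."""
    return math.log(-math.log(1.0 - prob))

# Hypothetical retention rate for the regular past tense (not from any corpus)
p = 0.8

recovered = {}
for name, true_exponent in [("semiweak", 2), ("mono", 3)]:
    deletion_past = 1 - p                    # deletion probability for past
    deletion_class = 1 - p ** true_exponent  # deletion probability under p^j or p^k
    # Difference on the link scale, which plays the role of a class coefficient
    shift = cloglog(deletion_class) - cloglog(deletion_past)
    recovered[name] = math.exp(shift)
    print(name, round(recovered[name], 6))
```

If retention were the modeled outcome instead, the same algebra would go through with a log-log link, log(-log(prob)), in place of the complementary one.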
The model estimates look very similar to the nested proportions approach, j = 1.38, k = 2.11.
What if we fit the model without the by-word random intercepts?
Now we're a bit closer to the original pooled estimates again: j = 1.57, k = 3.19.
My personal conclusion from all this is that the apparent j = 2, k = 3 pattern is driven mostly by the lexical effects of highly frequent words. This table recaps all of the results, plus the estimates of two more models. One has just a by-speaker random intercept. The other is a flat model with no random effects, which looks just like the maximum likelihood estimate of the fully pooled approach, because it is.
The lesson is that it can matter a lot how you calculate your proportions.