Sunday, February 17, 2013

A difference between men and women.

This post was originally going to be a lot more mathy, with some explanation of the source-filter model of speech production, an aside about dead dog heads mounted on compressed air tanks, and a full description of my methods, but I felt like I was burying the lede. Instead, I'm focusing on how interested people are in magnifying the difference between men and women.

It started off with me estimating the vocal tract lengths of the speakers in the Philadelphia Neighborhood Corpus. Given sufficient acoustic data from a speaker, and making some simplifying assumptions, and taking into account the acoustic theory of speech, you can roughly estimate how long a person's vocal tract (meaning distance from vocal cords to lips) is. I went ahead and did this for the speakers in the PNC, and plotted the results over age.
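As a rough illustration of that estimate, here is a minimal sketch of the quarter-wavelength formula the author gives in the comments below, L = ((2n-1)c) / (4F_n), which treats the vocal tract as a uniform tube closed at one end. The function names, the speed-of-sound value, and the choice to average the per-formant estimates are assumptions for illustration, not the author's actual code, and the formant values are hypothetical schwa-like numbers rather than PNC data.

```python
# Assumed speed of sound in cm/s; the post does not say which value was used.
SPEED_OF_SOUND_CM_S = 35000.0


def vtl_from_formant(freq_hz, n, c=SPEED_OF_SOUND_CM_S):
    """Vocal tract length (cm) implied by the n-th formant of a schwa-like vowel.

    Uses L = ((2n - 1) * c) / (4 * F_n), the resonance formula for a
    uniform tube closed at one end.
    """
    if freq_hz <= 0 or n < 1:
        raise ValueError("freq_hz must be positive and n must be >= 1")
    return (2 * n - 1) * c / (4.0 * freq_hz)


def estimate_vtl(formants_hz, c=SPEED_OF_SOUND_CM_S):
    """Average the per-formant estimates (averaging is an assumption here)."""
    estimates = [
        vtl_from_formant(f, n, c) for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


# Hypothetical formants with F2/F1 = 3 and F3/F1 = 5, as for an ideal schwa.
example_vtl = estimate_vtl([500.0, 1500.0, 2500.0])
print(f"Estimated VTL: {example_vtl:.2f} cm")
```

With formants in exact 1:3:5 ratios, the three per-formant estimates coincide, which is why the formula is only applied to vowels close to schwa.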

Pretty cool, right? There's nothing especially earth-shattering here. It's known that men, on average, have longer vocal tracts than women. I was a little surprised by how late in age the bend in vocal tract growth was.

Here's the density distribution of vocal tract lengths for everyone over 25 in the corpus.

That's a pretty big effect size. Mark Liberman has recently posted about the importance of reporting effect sizes. He was focusing on how even though people are really obsessed with cognitive differences between men and women, the distributions of men and women are almost always highly overlapping.

Following Mark on this, I went ahead and calculated Cohen's d for these VTL estimates.
So, 1.71 is a fairly large Cohen's d effect size. I had heard that the difference in vocal tract length between men and women was disproportionately large given just body size differences. I managed to find some data on American male/female height differences, but the height effect size is not impressively smaller than the VTL effect size (1.64, about 96% of the VTL effect size).
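For readers who want to reproduce this kind of number, here is a minimal sketch of Cohen's d computed from group summary statistics. It is not the original code from this post; the function names and the `unbiased` switch are assumptions for illustration, and the example inputs are toy values rather than the PNC or height data. Note that the standard deviations are squared before pooling, and that the denominator can be taken as either (n1 + n2 - 2) or (n1 + n2).

```python
import math


def pooled_sd(sd1, n1, sd2, n2, unbiased=True):
    """Pooled standard deviation of two groups.

    The individual standard deviations are squared (turned into variances)
    before pooling. The denominator is n1 + n2 - 2 when unbiased=True and
    n1 + n2 otherwise; with large samples the two barely differ.
    """
    numerator = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    denominator = (n1 + n2 - 2) if unbiased else (n1 + n2)
    return math.sqrt(numerator / denominator)


def cohens_d(mean1, sd1, n1, mean2, sd2, n2, unbiased=True):
    """Difference in means expressed in pooled-standard-deviation units."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2, unbiased)


# Toy values: two groups of 50 whose means differ by exactly one shared SD.
d = cohens_d(mean1=10.0, sd1=2.0, n1=50, mean2=8.0, sd2=2.0, n2=50)
print(f"Cohen's d = {d:.2f}")
```

Working from means, standard deviations, and group sizes rather than raw observations makes it easy to apply the same function to published summary tables, such as height statistics.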

Compared to the effect that Mark was looking at (science test scores), these effect sizes are enormous. The effect size of height between men and women is about 23 times larger than the science test score differences which warranted a writeup in the New York Times.

Yet, still not big enough.

As I was thinking about how height difference is perhaps one of the largest statistical differences between men and women, it also struck me how often it is still not big enough for social purposes. Sociological Images has a good blog post about how, even though Prince Charles was about the same height as Princess Diana, if not shorter, in posed pictures he was made to look much taller than her. Here's an example of them on a postage stamp:

And in another post, they provide this picture of a reporter being comically boosted to appear taller than the woman he's interviewing.

My takeaway point is that when it comes to socially constructing large and inherent differences between men and women, even the largest statistical difference out there is still not good enough for people, and needs to be augmented and supported. Then take into account that most other psychological and cognitive differences have drastically smaller effect sizes, and it really brings into focus how the emphasis on gender differences must draw almost all of its energy from social motivations, rather than from evidence or data or facts.


  1. Daniel Ezra Johnson, February 17, 2013 at 7:21 PM

    Your takeaway point is sort of a third-order one. The first order takeaway from data like this is to compare means - and in an ideal world, demonstrate statistical significance - and say "men do X more than women". Usually the part where it's like "0.04% more" is omitted.

    The second-order takeaway looks at the (usually high) degree of overlap between the two distributions - inversely related to Cohen's d - and says "it's kind of ridiculous to say 'men do X more than women' given this amount of overlap. It makes it sound like all men do X more than all women, and that's very far from the case."

    And I agree that the motivation for the emphasis on gender differences is social. If you could measure how important gender differences are in most people's lives, Cohen's d would be off the charts. An additional motivation is backlash against various claims that men and women are the same.

    In your first extract of code (there's probably a word for that), there's a mistake in the formula for pooled standard deviation. You have to square the individual standard deviations, as you've done in your second code extract. If you do, you end up with a Cohen's d of 1.71 rather than 1.89.

    (There's also some question about whether the denominator there should be (n1 + n2) or (n1 + n2 - 2), but that obviously makes less of a difference.)

    I was also wondering why you used the height data for 50-59-year-olds. If you use the data for all adults (20 and over) you get a larger pooled standard deviation and a smaller effect size: 1.25 instead of 1.64.

    While I'm here, I thought I'd ask what formula you used for converting acoustic data into vocal tract length. I found a very simple one, VTL = speed of sound / (4 * F1), at the amusing site But I also found much more complex calculations like and

  2. Thanks for catching the Cohen's-d error. I've updated it here. Also, I went for height of 50-59 year olds because that age range contained the median age of speakers older than 25 in the PNC.

    As for estimating VTL, I went for the method described here:

    The formula for L is ((2n-1)c) / (4F_n), where n is the formant number, c is the speed of sound, and F_n is the frequency. For that to work, though, you have to use formant measurements of schwa. So, I took vowels that were reasonably close to having formant ratios of F2/F1=3 and F3/F1=5. With these, I estimated the VTL based on F1, F2 and F3. The estimates based on the three formants for the most part agreed within 0.5 cm.

  3. The usual way of thinking about sex differences in human vocal-tract length vs. sex differences in other linear dimensions compares means, not effect sizes; and the numbers I usually cite are an 8% difference in average general linear dimensions vs. a 15% difference in average vocal tract length.

    Your data gives

    176.6/162.2 = 1.088779

    16.98/14.91 = 1.138833

    so 9% vs. 14%, which is roughly the same.

  4. Like the way you've tested whether or not the vocal difference btw the men and women is disproportionate to the difference in size. Small query though: I'm familiar with work that suggests the difference in boys' and girls' pitch is disproportionate to their size, but was not aware anyone had made that claim for, e.g., 59 year olds.

