Sunday, February 17, 2013

A difference between men and women.

This post was originally going to be a lot more mathy, with a bit of explanation about the source-filter model of speech production with an aside about dead dog heads mounted on compressed air tanks thrown in there, and a whole description of my methods, but I felt like I was sort of burying the lede there. Instead, I'm focusing more on how people are interested in magnifying the difference between men and women.

It started off with me estimating the vocal tract lengths of the speakers in the Philadelphia Neighborhood Corpus. Given sufficient acoustic data from a speaker, and making some simplifying assumptions, and taking into account the acoustic theory of speech, you can roughly estimate how long a person's vocal tract (meaning distance from vocal cords to lips) is. I went ahead and did this for the speakers in the PNC, and plotted the results over age.


Pretty cool, right? There's nothing especially earth shattering here. It's known that men, on average, have longer vocal tracts than women. I was a little bit surprised by how late in age the bend in the growth of vocal tracts was.

Here's the density distribution of vocal tract lengths for everyone over 25 in the corpus.



That's a pretty big effect size. Mark Liberman has recently posted about the importance of reporting effect sizes. He was focusing on how even though people are really obsessed with cognitive differences between men and women, the distributions of men and women are almost always highly overlapping.

Following Mark on this, I went ahead and calculated Cohen's-d for these VTL estimates.
So, 1.71 is a fairly large Cohen's-d effect size. I had heard that the difference in vocal tract length between men and women was disproportionately large given just body size differences. I managed to find some data on American male/female height differences, but the effect size is not impressively smaller than the VTL effect size (1.64, about 95% the VTL effect size).
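
For readers who want to see the arithmetic behind an effect size like this, here is a minimal sketch of Cohen's d in Python. It assumes the common pooled-standard-deviation version of the statistic, which may not be exactly what was computed for the post, and the measurements are made-up illustrative numbers, not values from the PNC.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical vocal tract lengths in cm (illustration only, NOT corpus data).
men = [16.0, 17.0, 18.0]
women = [14.0, 15.0, 16.0]

d = cohens_d(men, women)
print(f"Cohen's d = {d:.2f}")
```

The point of dividing by the pooled standard deviation is that d measures how far apart the two means are relative to the spread within each group, which is why it says something about how much the two distributions overlap.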



Compared to the effect that Mark was looking at (science test scores), these effect sizes are enormous. The effect size of height between men and women is about 23 times larger than the science test score differences which warranted a writeup in the New York Times.

Yet, still not big enough.

As I was thinking about how height difference is perhaps one of the largest statistical differences between men and women, it also struck me how often it is still not big enough for social purposes. Sociological Images has a good blog post about how even though Prince Charles was about the same height as, if not shorter than, Princess Diana, in posed pictures he was posed to look much taller than her. Here's an example of them on a postage stamp:

And in another post, they provide this picture of a reporter being comically boosted to appear taller than the woman he's interviewing.


My take away point is that when it comes to socially constructing large and inherent differences between men and women, even the largest statistical difference there is out there is still not good enough for people, and needs to be augmented and supported. Then take into account that most other psychological and cognitive differences have drastically smaller effect sizes, and it really brings into focus how the emphasis on gender differences must draw almost all of its energy from social motivations, rather than from evidence or data or facts.

Thursday, February 7, 2013

I recommend Lexicon Valley

Perhaps the most frustrating thing about being a linguist is the enormous gap, among educated people, between how little they actually know about language and how confident they are that they know a lot about it. If you keep up with this blog, you know I spend a lot of time venting this frustration here (etc. etc. etc.).

But I didn't start blogging in order to complain about how other people are getting it wrong. I started blogging to have an informal outlet for my passion for linguistics! I've been a little concerned about the negative tone of a few of my recent posts, so here's a more positive one.

But... it does start off with a complaint. At the LSA this year, David Pesetsky's plenary focused on the failure of linguistics (and more specifically, generative linguistics) to penetrate the popular science press. Instead, stories about physicists discovering the most common English word is "the," and psychologists arguing that the structure of language is really words like beads on a string get a lot more play. At the Q&A, Ray Jackendoff made the point that there is a folk linguistics that is intricately tied up in social politics that acts as a major roadblock to the popular advancement of real linguistic research. I've said similar things before.

What is to be done about this state of affairs is the topic of another blog post. Right now, I'd like to bring attention to a bright light of potential linguistics popularization.

Lexicon Valley


Lexicon Valley is a podcast hosted by Slate. I've been listening to it off and on since it started, and I have to say I've always enjoyed it. The hosts play two roles in a dialectic. Mike Vuolo is the patient intellectual, and I've always been impressed by the background research he's done. Bob Garfield is the voice of the untutored establishment, and, well, I think that description adequately sums up my opinion of what he brings to the show. It's actually an important role he plays, because without a vocal foil, Vuolo's research would lie rather flat. It's also important for the cause of linguists to have people hear brash knee jerk reactions rebuked by careful research.

They have covered a few topics I know a little bit about, and I've always started listening to each show bracing myself for frustration and disappointment. It's a learned reaction I have from every other discussion of language in popular media. But Lexicon Valley usually carries through for me. They've done great shows on African American English, grammatical gender, and the English epicene pronoun, speaking to actual linguists in each case, and most recently they've just done a really good portrayal of Labov's department store study (Part 1, Part 2).

They did catch a lot of flak recently for their show on creaky voice. I was so nervous when I started listening to it, because the recent coverage creaky voice has gotten has been worse than terrible. Per usual, though, Vuolo's research and discussion were excellent. Garfield, on the other hand, spouted some really negative attitudes, and I think he deserves every criticism of sexism that he got. Even within the dialectic of the show, Garfield brought a net negative contribution that time round. On the subsequent show, though, Vuolo read out some pretty harsh commentary about Garfield. Garfield offered a nonpology (something about how he can't be sexist, he has daughters), but it was good to have some of the criticism read out loud.

On average, modulo Garfield's frustrating attitudes, I would highly recommend the podcast, and would recommend recommending the podcast.

Could it be better?


While I think Lexicon Valley has done some great work so far, I don't think it has yet provided coverage of linguistics in quite the way Pesetsky dreams of. So far, they've mostly covered topics that are reactive to popular gripes or misconceptions about language. In some respect, it'd be hard for them to do otherwise, because the popular understanding of language science is far below that of almost any natural science, or so it seems from this angle.

I hope, though, that they might find a way to approach linguistic topics which are not just reactive. Just addressing the idea that there are functional elements which have no phonological realization would be enormous. Garfield could play the skeptic, believing that what you see is what you get.

So linguists, listen in, get a feel for the show, and maybe if you have a topic which could be nicely formatted into a 20 minute conversation, send it in to them!

Sunday, February 3, 2013

Does language "cool"?

A few months ago, I posted about how I was relatively unimpressed by a paper arguing that the observed Zipfian distribution of words in a corpus is due to "preferential attachment" aka the Matthew Effect aka the rich get richer. The author of that paper is apparently also a co-author of a paper called "Languages cool as they expand: Allometric scaling and the decreasing need for new words." The writeup in Inside Science summarizes it like this:
[A] recent analysis has found that as a language grows over time, it becomes more set in its ways. New words are always being added, according to this study, but few become widely used and part of the standard vocabulary.
My linguist hackles were immediately raised by this statement, and that's because there is a large and fundamental difference between what a linguist understands the term "language" to refer to, and what the authors of the column and paper understand it to refer to. What the physicists and the reporter mean by "language" is roughly "a set of words," and in the context of the paper, they almost seem to mean "the set of words which have been published."

This "language is words" axiom is part of most people's folk linguistics that we have to train people out of when they take Intro to Linguistics. That's why it's a little hard to take the work of these physicists seriously at first glance. It is as if they were trying to write a serious paper on biological evolution with the assumption that traits acquired by an organism during its life were inheritable.

But there is an aspect of linguistic knowledge relating to the set of words and morphemes a speaker knows, which linguists call the "lexicon". So, I'll just go ahead and reread the paper mentally replacing each instance of "language" with "lexicon" in order to get through it.

Overall Thoughts

This paper seems to be a relatively competent (modulo Mark Liberman's concerns about OCR errors) description of the statistical properties of large corpora. But that's really as far as I think any of the claims can go. I am totally unconvinced that their results shed any light on language change, development, evolution, etc. I'm not even sure that the simplest statement that "the lexicon of languages has grown over the past 200 years" can be supported by the results reported.

The key problem that I see with the paper is the conflation of "new to the corpus" and "new to the lexicon." Here's how the problem of sampling language was described to me, and I believe it goes back to Good (1953) and is key to Good-Turing Smoothing. Say you are an entomologist working in a rain forest, trying to make a survey of insect life. You put out your net for a night to collect a sample, then count up all the species in your net. Some bug species are going to be a lot more frequent than others. You'll have some species that show up many times in the net, but even more species will show up in the net with only one member. Now, let's say that you come back to the same rain forest two years later, and repeat the sample. You are nearly guaranteed to observe new species in your net this time around, but the key question is whether they are just new to the net, or are they new to the rain forest. If they're new to the rain forest, did they migrate in, or are they hybrids of two other species, or has a species you saw previously evolved really rapidly so that you're seeing it as different now?

These are really interesting and important questions for our entomologist to answer, but you cannot arrive at a definitive answer based simply on the fact that this new species has now shown up in your net. In fact, depending on a few factors, the answer with the highest probability is that the new species is simply new to your net. The Good-Turing estimate of the probability that the very next bug you catch will be a new species is that it's roughly equivalent to the proportion of bugs you've already caught that belong to a species you've only seen once.
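
The Good-Turing estimate described above is simple enough to sketch in a few lines of Python. This is only an illustration of the basic "proportion of singletons" idea, not a full smoothing implementation, and the net contents are invented for the example.

```python
from collections import Counter

def prob_next_is_new(sample):
    """Good-Turing estimate of the chance the next observation is an unseen type.

    It is the number of types seen exactly once divided by the total
    number of observations in the sample.
    """
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)

# Hypothetical contents of one night's net (made-up data).
net = ["ant", "ant", "ant", "beetle", "beetle",
       "moth", "wasp", "cricket", "ant", "beetle"]

p_new = prob_next_is_new(net)
print(f"Estimated probability the next bug is a new species: {p_new:.2f}")
```

Here three of the ten bugs (the moth, the wasp, and the cricket) belong to species seen only once, so the estimate is 3/10. The same function works unchanged on a list of word tokens from a corpus.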

The situation gets even more confusing if you come back to the same rain forest two years later with a net twice the size.

The paper has a figure plotting the increase in lexicon size over time. My first thought when I saw it was that it must be the case that the overall size of the corpus at each time point must also be going up. Coming back to the entomologist in the rain forest, the number of species in his net is merely a sample of how many species there are in the forest. In the same exact way, the number of words in a lexicon can only be estimated by the words which people happened to write down. As you increase the size of the net, you're going to find more species which were already in the forest, but not in your net. As you increase the size of your corpus, you're going to find more words which were already in the lexicon, but not in the corpus.
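
To make the net-size point concrete, here is a small simulation sketch in Python. It assumes a fixed, unchanging lexicon whose word probabilities follow a Zipf-like 1/rank distribution. Both the lexicon size and the distribution are assumptions made for illustration, not anything estimated from the paper's data. The number of distinct words observed keeps climbing as the corpus grows, even though nothing about the lexicon changes.

```python
import random

def sample_tokens(lexicon_size, n_tokens, seed=0):
    """Draw tokens from a FIXED lexicon with Zipf-like (1/rank) probabilities."""
    rng = random.Random(seed)
    words = range(lexicon_size)
    weights = [1.0 / (rank + 1) for rank in words]
    return rng.choices(words, weights=weights, k=n_tokens)

LEXICON_SIZE = 5000  # hypothetical size of the true lexicon
tokens = sample_tokens(LEXICON_SIZE, 40000)

# Count distinct word types in ever larger prefixes of the same token stream.
types_by_corpus_size = {n: len(set(tokens[:n])) for n in (5000, 10000, 20000, 40000)}

for n, n_types in types_by_corpus_size.items():
    print(f"corpus of {n:>6} tokens -> {n_types} distinct words observed")
```

Every "new" word in the larger samples was in the lexicon all along. A real analysis would need to separate this sampling effect from genuine lexical growth.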

Now, you need to add to this that at any given point in time, the true maximum number of possible words you could potentially observe in any given language is ∞. Yes, in fact, the whole reason language is interesting to study is because given a finite set of mental objects, and a finite set of operations to combine them, you can come up with an infinite set of strings, and that goes for words too, not just sentences. In 1951, "iPod" was a possible word of English, it just wasn't used, or at least not for the same purpose it is now.

Regarding the question of whether the "active" (as I'll call it) lexicons of languages have grown over the past 200 years, well, indeed, the overall number of printed words has also increased. Almost all of their results seem to have more to do with the technological development of publishing than they do with any other linguistic or cultural development. It is as if the entomologist said that over the past decade, the biodiversity in his rainforest has exploded, when really what's going on is his nets have been getting progressively larger.

Now, it might be the case that the active lexicon has grown more than would be expected given the increase in the size of the corpus year over year, but as far as I can tell, the authors did not try to estimate whether this was the case.

What about this cooling down?

The "cooling" effect referred to by the paper is the suggestion that as a language "grows" (which as I just said is dubious), the frequency with which particular words are used becomes more stable. Some words are more frequent than others, but words are less likely to move up and down in frequency over time/as the lexicon grows. Back to entomology, the suggestion is that as more species cram into a rainforest, each species is less likely to become more or less populous.

Again, though, the frequency, even relative frequency, of a word in a corpus is merely an estimate of its true frequency. As the size of the corpus increases, so should the reliability of its frequency estimates, and we would predict decreasing volatility of those frequency estimates. The authors check for this, and find exactly this relationship between corpus size and frequency volatility, but I can't tell whether there was excess "cooling" left over. I wish they had said, "there was x proportion of cooling left unaccounted for by simply accounting for the size of the corpus," but I think this is perhaps another symptom of the assumption that the corpus=the lexicon=the language that I complained about before.
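
A rough sketch of why bigger corpora should look "cooler" on their own: if each token is treated as an independent draw (a simplifying assumption, since real text is not independent draws), the standard error of a word's estimated relative frequency is sqrt(p(1 - p) / N) and shrinks with the square root of the corpus size N. The true frequency and corpus sizes below are hypothetical.

```python
import math

def relative_freq_std_error(true_freq, corpus_size):
    """Standard error of an estimated relative frequency under binomial sampling."""
    return math.sqrt(true_freq * (1.0 - true_freq) / corpus_size)

TRUE_FREQ = 1e-4  # hypothetical: a word used once per 10,000 tokens

se_small = relative_freq_std_error(TRUE_FREQ, 1_000_000)    # 1 million tokens
se_large = relative_freq_std_error(TRUE_FREQ, 100_000_000)  # 100 million tokens

print(f"standard error with 1M tokens:   {se_small:.2e}")
print(f"standard error with 100M tokens: {se_large:.2e}")
print(f"ratio: {se_small / se_large:.1f}")
```

A corpus 100 times larger gives frequency estimates that bounce around 10 times less from sample to sample, with no change at all in how the word is actually used. Any claim about "cooling" would need to show volatility dropping faster than this baseline.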

The Allure of Big Data

The reporter who wrote the Inside Science article did what it appears the editors of Scientific Reports did not: ask a linguist to comment on the paper. Bill Kretzschmar was "underwhelmed," saying that most of these results are not new to linguists. I would take this as a word of warning about the allure of big data. The results discussed in this paper are not, by and large, new; the analyses have just never been done with data of this scale. But unfortunately, a fact which is already known does not get more interesting when it is reestablished with data 100 or 1000 times larger than before.

Tuesday, November 27, 2012

To take "Zombie Nouns" seriously, you must've had your brains eaten.

At first, I didn't feel like blogging about the NYT Column on "Zombie Nouns" because I feel like I've been spending too much time being critical here, arguing against usage advice like this is futile, and I knew Mark Liberman would cover it. In fact, I drafted this post all the way back during the summer, and just let it sit. But now, I've seen the column, nearly verbatim, pop up on TED-Ed as a fully animated "lesson", which presumably means some educators are actually assigning it to classrooms of fertile and impressionable minds! It really can't pass without comment now.

Helen Sword says that you should avoid using nominalizations, which she calls "zombie nouns." They're nouns that have been made out of other parts of speech. To take one of her examples, calibrate + ion = calibration.

What is so wrong about nominalizations? Not exactly clear. She seems to take aim at unnecessarily jargonistic writing, which frequently contains novel coinings of words of all types, including nominalizations. So sure, being jargonistic to obscure your other intellectual shortcomings is not so good. But is it really, actually, the mere use of nominalizations that's doing the damage there?

She also seems to take a page out of the anti-passive voice book, saying, "it fails to tell us who is doing what," which just like the passive, is just not true. For example, in the sentence
  • My criticism of her column is a day late and a dollar short.
It's very clear who is doing what, even though I used a nominalization ("criticism").

But on top of the half baked usage advice, there are some more reprehensible social attitudes being expressed. For example, she lists epistemology as a useful nominalization for expressing a complex idea, but heteronormativity as one only out of touch academics who are enchanted by jargon use. First off, I would not want to use epistemology as an example when explaining what nominalizations are. What's it derived from? Episteme? Episteme has a Wikipedia page, so I guess it's that. Which brings me to the next issue here. It's embarrassing for me to admit, but whenever someone says or writes epistemology, I have to go look it up on Wikipedia. How does using epistemology not count as being out of touch with how ordinary people speak? Heteronormativity, on the other hand, is pretty easy to wrap your mind around. From Wikipedia:
Heteronormativity is a term to describe any of a set of lifestyle norms that hold that people fall into distinct and complementary genders (man and woman) with natural roles in life. It also holds that heterosexuality is the normal sexual orientation, and states that sexual and marital relations are most (or only) fitting between a man and a woman. Consequently, a "heteronormative" view is one that involves alignment of biological sex, sexuality, gender identity, and gender roles.
That's a pretty complex idea. But you know what? It's pretty easy to decode most of that meaning from the word itself, at least, if you're vaguely familiar with the politics of the time. Hetero(sexual) + normative + ity. It seems to me that she's saying more about her position on sex and gender politics here than she is about usage advice.

But who is this person, and why is she writing an opinion column in the New York Times, and getting the full TED treatment? Just like everyone, she's selling something: the icing on the cake, and my reason for blogging about this at all. She has a book out called The Writer's Diet, which has an accompanying online Writer's Diet Test. No, it's not diet as in "food for thought and inspiration," like a Chicken Soup for the Writer's Soul. It's diet as in dieting as in "drop 20 lbs and get the six pack abs you always wanted." Just paste a paragraph of your writing into the test, and it'll rate you along a five-point scale labeled:

lean | fit & trim | needs toning | flabby | heart attack territory

Ain't nothing like exploiting the collective dysmorphia of a nation to push your quarter-baked usage decrees. But in doing so, Sword actually clarifies the role that books like hers play. The analogy to the diet and weight loss industry is entirely apt. The dieting industry makes their money by sowing seeds of personal insecurity, then reaps their harvest with offers of unfounded, unscientific, and ultimately futile dieting pills, products, methods, 10 step plans, meals, regimes, books, magazines, etc.

I won't mince words. The NYT column and the TED-Ed video have the equivalent intellectual content of the magazines in the supermarket aisle promising you 5 super easy steps to trim your belly fat to get a sexy beach bod in time for the summer. And they serve the same purpose: to undermine the confidence of every-day folk, so that they may be taken advantage of by self-appointed gurus.

Thursday, November 15, 2012

Creative Work

Whenever I hear "creative" people describe their creative process, or more precisely their creative woes, I am always struck by the strong similarities to my own experiences trying to do science. I do consider myself as trying to do science.

Take, for example, this excellent statement on self-disappointment at the early stages of your career from Ira Glass.

Ira Glass on Storytelling from David Shiyang Liu on Vimeo.

This almost perfectly sums up how I felt about almost all of the early work I did in graduate school. I can't say that I've actually gotten to the point where the work I produce meets up with my own personal standards, but it has been on an upward trend, and I'd say Ira Glass' advice is spot on. If you want to write good papers, just write a lot of papers, and if you want to be good at giving talks, give a lot of talks, preferably in a context where you feel comfortable being bad or mediocre.

That last bit, being comfortable with being bad, is really reminiscent of things Brother Ali says in this interview.

Ill Doctrine: Brother Ali Meets the Little Hater from ANIMALNewYork.com on Vimeo.

There are a few things Brother Ali says that really resonate with me.
There was a moment where I was so stressed out. And I'm like, "Man, everything that I ever did that people liked, I just got lucky. I'm a fraud."
...
It's a weird weird thing to have what you create also be your livelihood. What we create is also our sense of self. What we create is also the way the world views us.
...
And so I start thinking about it. Ok, it's not that I'm blocked. It's not that I don't have anything to say. It's that I don't know how to say what I need to say. Or it's that I don't think that it's going to be received well. Or it's that the people that love me and have supported me and have, you know, gave me the little bit of freedom in my life that I have, I don't want to let them down and I don't want to hurt their feelings by saying what needs to be said.
I think almost all academics of any variety feel this way from time to time.

But I wonder if some people might not be surprised that I would feel so similarly to creative artists in the pursuit of my science, or maybe take it as evidence that what I do is not science. It is certainly doubted about Linguistics occasionally. But I think these people (probably strawmen) are mistaken in thinking that science is not a creative process. This was recognized by Max Weber in his 1918 essay "Science as a Vocation" (which I've blogged about before).
[I]nspiration plays no less a role in science than it does in the realm of art. It is a childish notion to think that a mathematician attains any scientifically valuable results by sitting at his desk with a ruler, calculating machines or other mechanical means. The mathematical imagination of a Weierstrass is naturally quite differently oriented in meaning and result than is the imagination of an artist, and differs basically in quality. But the psychological processes do not differ. Both are frenzy (in the sense of Plato's 'mania') and 'inspiration.'
He also suggests that the best science and the best art is produced by individuals devoted to the science and art for their own sake, rather than being driven by the express goal of producing something new, for the sake of novelty.

The distinction that Weber draws between art and science is that science is necessarily committed to the abandonment of old science. That is, art from the Renaissance is still, and always will be, art, but science from the same period is no longer science. It has been superseded by more recent developments.

Anyway, here's the song Brother Ali was talking about, which I'm sure almost all academics can identify with, except for the suicidal ideation, hopefully.

Wednesday, November 7, 2012

Nate Silver vs. the baseline

The 2012 election has been declared a victory for Nate Silver. As Rick Reilly said:
For me, as a data geek, this is nothing but good news. There's been a lot of talk about how Silver's high profile during the election could have broader effects on how every day people think about data and prediction. There's also talk about how Silver's performance is challenging to established punditry, as summed up in this XKCD comic.


Coming at this from the other side, though, I'm curious as a data person about how much secret sauce Silver's got. Sure, in broad qualitative strokes, he got the map right. But quantitatively, Silver's model also produced more detailed estimates about voting shares by state. How accurate were those?

Well, to start out, there is not some absolute sense of accuracy. When it comes to predicting which states would go to which candidates, it's easy to say Silver's predictions were maximally accurate. But what's trickier is to figure out how many he could have gotten wrong and still have us call his prediction accurate. For example, Ohio was a really close race. If Ohio had actually gone to Romney, but all of Silver's other predictions were right, could we call that a pretty accurate prediction? Maybe. But now let's say that he got all of the conventional battleground states right, but out of nowhere, California went for Romney. It's the same situation of getting one state wrong, but in this case it's a big state, and an anomalous outcome that Silver's model would have missed. Would his prediction be inaccurate in that case? What if it was Rhode Island instead? That would be equally anomalous, but would have a smaller impact on the final election result. Now let's imagine a different United States where all of the races in all of the states had razor thin margins, and Silver correctly predicted 30 out of 50. In that case, we might say it was an accurate prediction.

All of this is to say that the notion of "accuracy" is really dependent upon what you're comparing the prediction to, and what the goal of the prediction is.

So what I want to know is how much Silver's model improves his prediction over what's just immediately obvious from the available data. That is, I want to see how much closer Silver's prediction of the vote share in different states was than some other baseline prediction. For the baseline, I'll take the average of the most recent polls from that state, as handily provided by Nate Silver on the 538 site. I also need to compare both the averaging method and the 538 method to the actual outcomes, which I've copy-pasted from the NPR big board. (Note: I think they might still be updating the results there, so I might have to update this post at some future date with the final tally.)

First I'll look at the Root Mean Square Error for the simple average-of-polls prediction and the 538 prediction. I'll take Obama and Romney separately. The "Silver advantage" row is just the RMSE of the poll averaging prediction divided by the RMSE of the 538 prediction.

                   Obama   Romney
Averaging Polls    3.3     4.1
538                1.8     1.7
Silver Advantage   1.8     2.4

So it looks like Silver has definitely got some secret sauce, effectively halving the RMSE of the stupid poll averaging prediction. I also tried out a version of the RMSE weighted by the electoral votes of each state, for a more results oriented view of the accuracy. I just replaced the mean of the squared error by a weighted average of the squared error, weighted by the electoral votes of the state. The results come out basically the same.

                   Obama   Romney
Averaging Polls    3.2     3.1
538                1.5     1.5
Silver Advantage   2.2     2.0
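
For anyone who wants to try this sort of comparison without opening the R code, here is a minimal Python sketch of the plain RMSE, the electoral-vote-weighted RMSE, and the "advantage" ratio described above. It is an illustration under stated assumptions, not the code used for this post: the function name and all of the vote shares and electoral vote counts are hypothetical.

```python
import math

def rmse(predicted, actual, weights=None):
    """Root mean square error; with weights, a weighted average of squared errors."""
    if weights is None:
        weights = [1.0] * len(predicted)
    total_weight = sum(weights)
    weighted_sq_err = sum(w * (p - a) ** 2 for p, a, w in zip(predicted, actual, weights))
    return math.sqrt(weighted_sq_err / total_weight)

# Hypothetical vote shares (percent) for three made-up states.
actual = [50.0, 60.0, 40.0]
poll_average = [46.0, 58.0, 44.0]  # baseline: simple average of recent polls
forecast = [48.0, 59.0, 42.0]      # stand-in for a model-based forecast
electoral_votes = [10, 20, 10]     # hypothetical weights

advantage = rmse(poll_average, actual) / rmse(forecast, actual)
weighted_advantage = (rmse(poll_average, actual, electoral_votes)
                      / rmse(forecast, actual, electoral_votes))

print(f"unweighted advantage: {advantage:.1f}")
print(f"weighted advantage:   {weighted_advantage:.1f}")
```

An advantage above 1 means the forecast's errors were smaller than the baseline's. Weighting by electoral votes just makes a miss in a big state count for more than the same miss in a small one.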

So what was it about the 538 forecast that made it so much better than simply averaging polls? I think these plots might help answer that. They both plot the error in the 538 forecast against the error in poll averaging.


It looks like for both Obama and Romney, the 538 forecast did more to boost up the prediction in places where they outperformed their polls than tamping them down where they underperformed. The effect is especially striking for Romney.

So, Silver's model definitely outperforms simple poll watching & averaging. Which is good for him, because it means he's actually doing something to earn his keep.

You can grab the data and R code I was working with at this github repository. There's also this version of the R code on RPubs.

Friday, July 27, 2012

Teens and Texting and Grammar

I'm just one man, one linguist, impotently shouting into the vast mediascape, "PLEASE POPULAR MEDIA! PLEASE DON'T RUN WITH THE TEEN TEXTING GRAMMAR STORY!"

There is a paper out in New Media and Society called Texting, techspeak, and tweens: The relationship between text messaging and English grammar skills. If you are a linguist, and you winced at the title, I have to warn you, you're not done wincing yet.

Is the key problem that the authors collected data on text messaging behaviors from self reports? No.

Is the key problem that the authors did not directly assess whether or not the teens in the study used "techspeak"? No. (Let's set aside the fact that high volumes of txtspeak are increasingly associated with out of touch adults).

Is the key problem that the authors didn't include any figures plotting the relationship between any of their measures? No.

Is the key problem that the authors included no control group of teens who don't text, or adults who adopted texting late in life? No.

The key problem is that the authors appear to have no idea what grammar or language are. I quote:
Similar to synchronous online communications such as instant messaging, the speed, ease, and brevity of text messaging have created a perfect platform for adapting the English language to better suit attributes of the technology. This has led to an evolution in grammar, the basis of which we shall call ‘techspeak.’ This language differs from English in that it takes normal English words and modifies them [...]
The depth of misunderstanding and naiveté present in this quote about the relationship between actual language and grammar and the way we write is equivalent to thinking that the sun revolves around the Earth, and that stars are bright dots on a large dome in the sky. Mind you, the Earth-centric, skydome model of the universe is a perfectly reasonable one until you are exposed to the most basic, rudimentary scientific understanding of how the world works.

The authors of this paper appear not to have been exposed to the most basic, rudimentary scientific understanding about how language and grammar work.

From Appendix A of the paper, I present to you the 20 point "grammar" assessment used in the study.
  1. There (is, are) two ways to make enemies.
  2. One of the men forgot to bring (his, their) tools.
  3. Gail and Sue (make, makes) friends easily.
  4. The coach thought he had (tore, teared, torn) a ligament.
  5. During the flood, we (dranked, drank, drunk, drunked) bottled water.
  6. The boy called for help, and I (swum, have swam, swam) out to him.
  7. Fortunately, Jim’s name was (accepted, excepted) from the roster of those who would have to clean bathrooms because he was supposed to go downtown to (accept, except) a reward for the German Club.
  8. I don’t know how I could (lose, loose) such a big dress. It is so large that it is (lose, loose) on me when I wear it!
  9. The man around the corner from the sandlots (come, comes) to our meetings.
  10. The man and his little girls (was, were) not injured in the accident.
  11. The pictures in this new magazine (shows, show) the rugged beauty of the West.
  12. The orders from that company (is, are) on your desk there.
  13. The (boys, boys’, boy’s, boys’s) hats were lost in the water because they were careless in not tying them to the side of the boat.
  14. (Its, It’s, Its’) an honor to accept the awards certificates and medals presented to the club.
  15. Worried, and frayed, the old man paced the floor waiting for his daughter. (Correct/Incorrect)
  16. The boy yelled, ‘Please help me’! (Correct/Incorrect)
  17. She got out of the car, waved hello, and walked into the house. (Correct/Incorrect)
  18. When Suzie arrived at the dance, no one else was there. (Correct/Incorrect)
  19. Dad and I enjoyed our trip to new york city. (Correct/Incorrect)
  20. The boy’s mother picked him up from school. (Correct/Incorrect)
To quote what it was the authors were trying to assess:
The first portion of the assessment consisted of 16 questions designed to test the student’s grasp of verb/noun agreement, use of correct tense, homophones, possessives, and apostrophes. [...] The second portion of the assessment asked participants to indicate whether or not a sentence was correct, such as ‘The boy yelled, “Please help me”!’ (Correct/ Incorrect). This portion tested the student’s understanding of comma usage, punctuation, and capitalization.
Virtually none of these points (homophones, apostrophes, comma usage, punctuation and capitalization) fall under the purview of what is scientifically understood to be "grammar". Arnold Zwicky has suggested the term "garmmra" for such things. Punctuation, comma rules, spelling conventions, etc. are all only arbitrary decisions settled upon a long time ago, and have nothing, nothing to do with human language. You could, by fiat, swap periods and commas (like many cultures do with their numeral systems), insist that sentence-initial adverbs be followed by a semicolon, and decide to revert to the symbols <þ> and <ð> to spell the sounds we currently both spell with <th>, and you know how many things that would change about English grammar? Zero things.

The remaining points of assessment could be considered to be well within the domain of grammar (tense and subject/verb agreement), except the authors chose really poor, very variable items for the evaluation. The very first item involves verbal agreement with an expletive subject, and the rest involve cases of coordination and agreement attraction! These are items which really lie on the outside edges of linguistic processing abilities, and there is no way that they could serve as reliable measures of fluency and grammatical competence. Search the work of any good writer, and I'm sure you'll find examples of both kinds of usage.

And then there's the second item: "One of the men forgot to bring (his, their) tools." Both possibilities are acceptable English, and have been for a long time.

The most depressing thing about this grammar assessment is where the researchers say they got it.
This assessment was adapted from a ninth-grade grammar review test.
I'm reminded of a piece I read called For Ebonics, the New Millennium Is Pretty Much Like the Old One, which said: "This suggests to me a catastrophic failure of the public school 'language arts' curriculum: people spend years in various language arts classes and leave with the same 19th-century folk notions that they started with."


So what have these authors actually found? Well, maybe it's the case that the more people write in a broader range of contexts for a broader range of purposes, the more the arbitrary, conventionalized aspects of the writing system of English will undergo natural drift. What effect will this have on English grammar, as it is represented in the minds of everyday English users? Probably just as much as the current writing system does: a minimal one.



And what about my plea to the popular media? Even if someone of note finds this post and reads it, I already know that it won't matter at all. Per my commentary on the coverage of vocal fry, no one is going to report on this piece because they care about science or facts. This research fits snugly into pre-existing biases about young people and the general decline of society, and frankly, these biases seem to have more to do with why these researchers did the study in the first place than science or facts. And there is no way that something so trivial as a bunch of experts on language and grammar is about to derail this train of garbage and nonsense.

UPDATE! There is, in fact, an actual paper on the topic of Instant Messaging and Grammar by Sali Tagliamonte and Derek Denis from 2008 called "Linguistic Ruin? LOL! Instant Messaging and Teen Language." Remember hearing about that in the news? Here are selections from their conclusions.
In a million and a half words of IM discourse among 71 teenagers, the use of short forms, abbreviations, and emotional language is infinitesimally small, less than 3% of the data.
Our foray into the IM environment through quantitative sociolinguistic analysis, encompassing four areas of grammar and over 20,000 individual examples, reveals that IM is firmly rooted in the model of the extant language, reflecting the same structured heterogeneity (variation) and the same dynamic, ongoing processes of linguistic change that are currently under way in the speech community in which the teenagers live.

UPDATE! See also Enregistering internet language by Lauren Squires (2010)