Wednesday, August 17, 2011

Does blogging do me any good? A quantitative analysis.

I've been wondering if blogging does me any good. I don't mean for the heart and soul. I enjoy blogging and am going to keep it up (except for those end-of-semester hiatuses). But I've been wondering if blogging does me any good professionally, or whatever. Obviously, "a professional or whatever good" is hard to define, so I'll define it according to the data that I have.

I maintain, along with this blog, an academic website where I keep all of my more serious research stuff. I've got Google Analytics set up on both my blog and my academic site, keeping track of page views. So, if I can detect that page views of my blog drive some page views to my academic website, then I'll conclude that blogging is doing me some professional good. This makes a certain kind of sense, since what matters to me at this particular stage of my professional life is getting my ideas out there, and my ideas are catalogued on my academic site.

The raw data

Here is one year's worth of traffic to Val Systems. Those two huge spikes are thanks to Mark Liberman, who reblogged my post about Britney Spears' tongue, and to the Car Talk Guys, who linked to my post about their short-a system on the Car Talk site for a bit. There was also some traffic from Sociological Images, where I guest posted about a "grammar" book.

Now here is the traffic to my academic site, and to my research page on that site, from the same time period.


As you can see, my academic site gets far fewer page views than my blog. Prospects are not very bright.

Autocorrelation

My first step of analysis was to figure out how correlated each site's page views were with its own page views on later days. That is, how correlated are page views on my blog with page views from one day later on my blog, or two days later, etc. To calculate this, I used the acf() function in R. Here's the autocorrelation function for my blog. The x-axis represents how many days into the future you're comparing page views, and the y-axis represents the correlation between page views separated by that many days.


It looks like page views on my blog are pretty well correlated with the page views from one day before (0.45). After that, the correlation drops off, which I'll interpret as new-post-decay. It seems like the influence that a single new post has on my blog traffic is fairly minimal after five days.
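For anyone curious what that calculation amounts to, here is a minimal sketch in Python rather than R. It is not the code behind the plots. It assumes the usual sample autocorrelation estimator (deviations from the overall mean, normalized by the total sum of squares), and the page-view numbers are made up for illustration.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation of a daily series at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    total = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / total
        for k in range(max_lag + 1)
    ]

# Made-up daily page views with a weekday/weekend rhythm (not real data).
toy_views = [120, 130, 125, 128, 110, 60, 55] * 8
toy_acf = autocorrelation(toy_views, 7)
```

With a series like this, the value at lag 7 is high and the value at lag 3 is negative, which is the kind of weekly pattern discussed further down.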

Here's the autocorrelation function for my academic site.


As you can see, the overall size of the correlations is much smaller than for the blog. This is most likely because each new post is a new event that happens on my blog, which can have an effect that lasts for a few days, whereas nothing happens on my academic site in the same way. However, there is an apparently cyclic pattern, where page views are most positively correlated at 7 day intervals, and most negatively correlated at 3 to 4 day intervals.

Duh! Who does work on the weekends?

To factor out this cyclic pattern, I fit linear regressions of page views for my academic site and for my research page, each with weekday as a categorical predictor. I'll use the residuals from these regressions for doing the cross-correlation.
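A regression whose only predictor is a categorical weekday variable fits each day's page views with the mean for that weekday, so the residuals are just deviations from the weekday mean. Here is a rough Python sketch of that idea. The function name and the numbers are hypothetical, and the actual analysis was done in R.

```python
from collections import defaultdict

def weekday_residuals(views, weekdays):
    """Subtract each weekday's mean page views from that day's count."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, d in zip(views, weekdays):
        totals[d] += v
        counts[d] += 1
    means = {d: totals[d] / counts[d] for d in totals}
    return [v - means[d] for v, d in zip(views, weekdays)]

# Two made-up weeks of page views (not real data).
toy_days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"] * 2
toy_counts = [30, 32, 31, 29, 25, 8, 6,
              34, 30, 33, 31, 27, 12, 10]
toy_resid = weekday_residuals(toy_counts, toy_days)
```

After this step the weekend dip is gone from the series: each weekday's residuals average out to zero, and what's left is how unusual a given day was for that day of the week.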

Cross-correlation

Next, I checked the cross-correlation of (residualized) page views. This checks to see how correlated page views are between any two of the sites at different time lags. First, here's the cross-correlation of my main academic site and my research page. I knew these would have to be highly correlated, since my research page is the most clicked link on my main page.



Correlations with negative lag indicate that visits to my research page were correlated with visits to my main academic site a few days later. Correlations with positive lag indicate that visits to my main academic site were correlated with visits to my research page a few days later. The correlation at lag 0 indicates how correlated visits to my main academic site and my research page were on the same day.
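To make the lag bookkeeping concrete, here is a small Python sketch, again not the R code behind the plots. The convention is the sketch's own, chosen to match the description above: at lag k, day t of the first series is paired with day t + k of the second. Tools differ on which direction counts as positive, so check the documentation of whatever function you use. The data are made up.

```python
from math import sqrt

def cross_correlation(first, second, max_lag):
    """Correlation of first[t] with second[t + k] for k in -max_lag..max_lag."""
    n = len(first)
    mean_f = sum(first) / n
    mean_s = sum(second) / n
    dev_f = [x - mean_f for x in first]
    dev_s = [x - mean_s for x in second]
    scale = sqrt(sum(d * d for d in dev_f) * sum(d * d for d in dev_s))
    result = {}
    for k in range(-max_lag, max_lag + 1):
        # Keep only the days where both t and t + k fall inside the series.
        result[k] = sum(
            dev_f[t] * dev_s[t + k] for t in range(n) if 0 <= t + k < n
        ) / scale
    return result

# Made-up series: "follower" repeats "leader" two days later.
base = [20, 20, 20, 60, 20, 20, 20, 20, 20, 20, 60, 20, 20, 20]
leader = base[2:]
follower = base[:-2]
toy_ccf = cross_correlation(leader, follower, 4)
peak_lag = max(toy_ccf, key=toy_ccf.get)
```

Here the peak lands at lag +2, meaning spikes in the first series show up in the second series two days later. If blog traffic were driving traffic to another site, this is the kind of peak you'd hope to see at some small positive lag.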

Unsurprisingly, the only strong correlation between visits to my main academic site and my research page is on the same day. That spike around 10 days makes no sense, so it's probably just noise.

So, drum-roll please, how correlated are visits to my blog and my main academic site?


I would analyze this as bupkis. Likewise for my research page.


To sum up

It looks like blogging is just a fun diversion for me right now. Even though it would have been a lot of fun to come to my advisor or department chair with strong results that blogging is professionally fruitful, I'm fine with the way things turned out.

However, I shouldn't have been surprised. If I was trying to use blogging as a platform for promoting my professional work, I wasn't doing it very well. If you're looking at my blog now (vs. an RSS subscription), you may notice that I've added some links to the right, which lead to my academic site and to my GitHub site. Why not try to make blogging work for me a little bit?
