February 1, 2006

Blog Anniversary and Statistics

Posted by Arcane Gazebo at February 1, 2006 2:38 PM

This blog is now three years old. I'm always sort of surprised that I've been able to stick with it, as I'm usually terrible at keeping up with long-term projects. Coincidentally, the day I started blogging was also the day Ryan North started posting Dinosaur Comics. Indeed, it was a beautiful day to be stomping on things.

(It's also the birthday of commenter JSpur, who is 49 +/- 3 today. He's in Hawaii right now and therefore, I think, not reading this.)

Anyway, on this auspicious occasion I have some random trivia about the evolution (er, intelligent design) of this blog over the years:

Ever since Google started indexing me, the most popular search leading to this page has been "gazebo". At first I was somewhere around the 60th hit for this search string; now I'm typically the 10th or 11th. (I get a lot more of these hits when I'm on the first page of search results.)

Also, the second most popular search string from last month was the title to this post; for a period of a few weeks Google decided that I was among the top five experts on this phrase, ranked just behind Fleshbot and Xeni Jardin. Fortunately they seem to have recalibrated and I no longer get these hits.

Because I am a big nerd, I have plotted monthly totals for comments and the average number of comments per post, over the blog's three-year history. Happily, the trend is increasing. (Comments are the best part of blogging, after all.)

And finally, the top five most-commented posts:

4. (tie) Everyone's talking about it [Open Thread] on October 3, 2005 (24 comments)
4. (tie) Finish Line on October 16, 2005 (24 comments)
2. (tie) Shyness and serotonin on September 26, 2005 (31 comments)
2. (tie) Halloween thread on October 31, 2005 (31 comments)
1. Essential 90's Albums/New Year's Resolution on January 6, 2006 (36 comments)

Thanks to everyone who comments here; hopefully we'll have another good year of obscure music and liberal ranting.

I could almost draw a straight line through the data... :)

I think my most commented post remains "Angry Hornets" with 18 comments. (A couple of the D & D scheduling comments got close and maybe one of them broke the record, but those don't really count the same way.) There's something wrong about that.

Posted by: Mason | February 1, 2006 4:30 PM

I get a slightly better fit to an exponential, which predicts that the number of comments per post should double every 12.8 months. Hopefully this saturates somewhere. Alternatively I could imagine a quadratic model.

If I were actually going to model this, I would assume that (a) the bulk of comments come from regular commenters; (b) visitors to the site convert to regular commenters at a constant rate (per visitor); (c) the number of visitors scales with the number of blogs linking to me, which increases at a rate proportional to my current visibility. So that would be exponential. Also, I would expect the number of comments to scale faster than linear in the number of commenters, as discussions become livelier.
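For the curious, here is a minimal sketch of how a doubling time falls out of an exponential fit: take the log of the comments-per-post series, fit a straight line, and divide ln 2 by the slope. The function name and the data are assumptions made up for illustration; the series below is synthetic and noise-free, not the blog's actual numbers.

```python
import math

def fit_doubling_time(months, comments_per_post):
    """Least-squares fit of log(y) = a + b*t; returns the doubling time ln(2)/b."""
    logs = [math.log(y) for y in comments_per_post]
    n = len(months)
    mean_t = sum(months) / n
    mean_log = sum(logs) / n
    cov = sum((t - mean_t) * (ly - mean_log) for t, ly in zip(months, logs))
    var = sum((t - mean_t) ** 2 for t in months)
    slope = cov / var  # growth rate per month
    return math.log(2) / slope

# Synthetic data (NOT the real blog statistics), constructed to double
# every 12.8 months so the fit has a known answer to recover.
months = list(range(36))
synthetic = [2.0 * 2 ** (t / 12.8) for t in months]

doubling = fit_doubling_time(months, synthetic)
print(round(doubling, 1))
```

With real, noisy monthly data the same fit would apply, though months with zero comments would have to be dropped or handled separately before taking the log.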

Ok, when I said above that I am a big nerd, I was understating the point.

Posted by: Arcane Gazebo | February 1, 2006 4:52 PM

We have Internet access on the Big Island, dude. And even when I don't have my laptop booted up I often read you and Quantum Chaotic Thoughts on my BlackBerry; you're both bookmarked. Just haven't figured out how to comment wirelessly yet.

Anyway, happy birthday to the blog, and I am proud to share this day with it.

Thoroughly enjoyed the foaming at the mouth over the SOTU. You rock.

Your mom and I hiked the lava fields today. And on that note, happy New Year of the Dog and I'll check in again tomorrow.

Posted by: JSpur | February 1, 2006 11:22 PM

So you could say you get Moore comments every 12.8 months?

You could probably expect the number of posts to scale roughly as N^2 in the number of posters. Actually, depending on what format you've got your data in, it might be fun to try and compare it to a few different models...

Do you have data on, say, average number of posts per person, or perhaps counting the unique posters per thread rather than posts?

Posted by: Lemming | February 2, 2006 10:01 AM

Oh, and exponential growth is quite reasonable when a website is small relative to teh intarweb.

Not that I'd call it small or anything, *coughcough*.

Seriously though, teh intarweb is a big place. Once a website is large wrt teh intarweb, the growth would look more like (sizeOfIntarweb - alpha^(-t)).

In reality, of course, sizeOfIntarweb is growing.

What's a function that looks kinda like an inverse tangent, and to either side behaves suspiciously like an exponential (locally, at least, far to each side)?

I could generate a C-inf (but non-analytic) function that behaves as described trivially and uglyly, but is there something nice? Analytic, even, maybe?
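One candidate that seems to fit this description is the logistic curve, which is analytic on the whole real line. The symbols K (the saturation level, playing the role of sizeOfIntarweb), r (the growth rate), and t_0 (the midpoint) are introduced here for illustration only:

```latex
f(t) = \frac{K}{1 + e^{-r(t - t_0)}}
     = \frac{K}{2}\left[1 + \tanh\!\left(\frac{r(t - t_0)}{2}\right)\right]

% Far to the left it grows like a pure exponential:
f(t) \approx K\, e^{\,r(t - t_0)} \qquad (t \ll t_0)

% Far to the right it approaches K with an exponentially shrinking gap:
K - f(t) \approx K\, e^{-r(t - t_0)} \qquad (t \gg t_0)
```

It is also the solution of df/dt = r f (1 - f/K), that is, exponential growth that slows as f gets close to K. A growing sizeOfIntarweb would need K to depend on t, which this simple form does not cover.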

Posted by: Lemming | February 2, 2006 10:08 AM

Lemming: I have access to the database with all the comments and posts, so in principle I can get those data with appropriate SQL queries.
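As a rough sketch of what such queries might look like, here is a small example against an in-memory SQLite database. The `comments` table, its `post_id` and `author` columns, and the sample rows are all hypothetical; the blog's actual schema is not shown here and will differ.

```python
import sqlite3

# Hypothetical schema and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, author TEXT);
    INSERT INTO comments (post_id, author) VALUES
        (1, 'alice'), (1, 'bob'), (1, 'alice'),
        (2, 'alice'), (2, 'carol'), (2, 'bob'), (2, 'carol');
""")

# Total comments and distinct commenters for each post.
per_post = conn.execute("""
    SELECT post_id,
           COUNT(*)               AS n_comments,
           COUNT(DISTINCT author) AS n_commenters
    FROM comments
    GROUP BY post_id
    ORDER BY post_id
""").fetchall()

# Total comments per person, highest volume first.
per_author = conn.execute("""
    SELECT author, COUNT(*) AS n
    FROM comments
    GROUP BY author
    ORDER BY n DESC, author
""").fetchall()

print(per_post)
print(per_author)
```

Joining against a posts table with a date column would then allow grouping either result by month.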

The distribution of comments per person is interesting. There are a handful of high-volume commenters with 150-600 comments over the history of the blog, on the order of 20 other regulars with between 10 and 50 comments each, and then a long tail of people who have only posted once or twice.

Posted by: Arcane Gazebo | February 2, 2006 1:39 PM

There are various so-called "preferential attachment" models of some of this stuff available (the original use is for paper citations, despite what the authors of some of these papers [especially Barabasi, who claims to have invented this model but didn't] state). When I read your back-of-the-envelope calculations, my mind immediately went in that direction.

I am not aware of a study pertaining directly to blog comments (although I'm almost positive some network theorists have done it), but the observation of a power-law distribution (which has the heavy tails you describe) in user comments is what I would expect based on similar studies that I've read.

Posted by: Mason | February 2, 2006 2:03 PM

Ok, the data for number of distinct commenters per post is here. If we assume that regular commenters are produced at an exponential rate, and total comment count scales with the square of the number of commenters, this should be exponential with a time constant twice that of the comment count. And indeed, an exponential fit gives a doubling time of 28.0 months.
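Spelling out that arithmetic, with T_N and T_C as shorthand (introduced here) for the doubling times of the commenter count N and the comment count C:

```latex
N(t) \propto 2^{\,t/T_N}, \qquad
C(t) \propto N(t)^2 \propto 2^{\,2t/T_N} = 2^{\,t/T_C}
\quad\Longrightarrow\quad T_N = 2\,T_C

% With the fitted comment doubling time T_C = 12.8 months:
T_N^{\text{predicted}} = 2 \times 12.8 = 25.6 \ \text{months}
\qquad \text{vs.} \qquad
T_N^{\text{fitted}} = 28.0 \ \text{months}
```

So the fitted value is about 9% longer than the N^2 assumption predicts.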

Incidentally, I think the sharp drop in this data at month 23 can be attributed directly to the departure of one high-volume commenter! Around the same time one of the other main regulars arrives so it recovers pretty quickly.

Posted by: Arcane Gazebo | February 2, 2006 3:55 PM

So when is the PRL coming out?

Posted by: Mason | February 2, 2006 5:12 PM

Heh, maybe I'd have better luck getting this into PRL than we did with our qubit results.

Posted by: Arcane Gazebo | February 2, 2006 5:43 PM
