
Which words do you own?–Raincoaster March 17, 2007

Posted by caveblogem in 3QD, Blogs and Blogging, daily kos, dailykos, linguistics, Neil Gaiman, Other, Three Quarks Daily, vocabulary, writing.

I first saw raincoaster’s blog in WordPress’s list of fast-growing blogs or popular blogs or something like that. I check in once in a while and am always entertained. We share a passion for H. P. Lovecraft and squids and a couple of other things.

I took a sample of her blog posts yesterday and processed them, and I was a little bummed out at first about the fact that the largest words in the cloud are tag words, and so they present little new information.  Many of the others have to do with current events, so they don’t seem like the timeless blogger fingerprints I had envisioned a day or two ago. 

Nevertheless, if you look at some of the smaller words that pop out, distinguishing her vocabulary from those of Three Quarks Daily, Daily Kos, Pretty Good on Paper, and Neil Gaiman’s Journal, you’ll see some interesting nuggets (below, click to enlarge).
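For the curious, one way to pick out words that set one blog apart from a group of others could be sketched as follows. This is a minimal illustration in Python, not the method used to build the cloud in this post; the function names, the add-one smoothing, and the frequency-ratio score are all assumptions made for the example.

```python
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its words (hypothetical helper)."""
    return Counter(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))


def distinctive_words(target_text, reference_texts, top_n=10):
    """Rank words in target_text by how much more often they appear
    there than in the combined reference texts.

    Assumption: the score is a simple ratio of smoothed relative
    frequencies. Add-one smoothing keeps words that never appear in
    the references from causing a division by zero.
    """
    target = word_counts(target_text)
    reference = Counter()
    for text in reference_texts:
        reference.update(word_counts(text))

    target_total = sum(target.values())
    reference_total = sum(reference.values())
    vocab_size = len(set(target) | set(reference))

    scores = {}
    for word, count in target.items():
        target_rate = (count + 1) / (target_total + vocab_size)
        reference_rate = (reference[word] + 1) / (reference_total + vocab_size)
        scores[word] = target_rate / reference_rate

    # Highest ratio first: these are the words the target blog "owns".
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Calling `distinctive_words(her_posts, [tqd_posts, kos_posts, gaiman_posts])` with plain-text samples (the variable names here are placeholders) would return the words most over-represented in the first sample. Tag words and current-events words will still float to the top with a score like this, which matches what the cloud shows.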


On a side note about Neil Gaiman, Dan somebody (whose relationship to Mr. Gaiman I did not quite get) has done some interesting analysis of Mr. Gaiman’s blog over a long span of time. The links, if you should wish to pursue them, are in the comment thread to the post on Mr. Gaiman here. The analysis and method Dan used seem more sophisticated than mine. For example, he passes Mr. Gaiman’s words through something called a “Yahoo Term Extraction API.” If I remember my Latin roots correctly, it seems to have something to do with bees. At any rate, his analysis also takes a slightly different tack, examining “terms” rather than words, and chopping off words that have fewer than five letters. So what you will see are dynamic clouds of what I suspect are topical concerns of Mr. Gaiman, rather than his individual word usage. They are fascinating to look at, however, and quite clever.

The day before yesterday I stood for several minutes looking at a book on Java at my local Barnes and Noble. If it hadn’t had water damage, which made that funny noise and wouldn’t let me flip pages easily, I would have actually bought it. If anyone is interested in saving me the time and trouble it would take to learn some sort of dynamic HTML, I would happily partner up, supplying data and analysis for the creation of some sort of acceptable widget.

Up next, Alabaster Crippens.


1. Dan Guy - March 17, 2007

I am a long-time fan of Mr. Gaiman, and he sometimes allows me to do various bits of web work for him — most notably the Oracular Orb.

I don’t think my methods are more sophisticated than your own; we’re just looking at different things. Your analysis of what my tools provide is accurate.

The Yahoo! Term Extraction API takes a block of text and returns the statistically significant phrases, be they terms with a large search engine result set or statistically improbable combinations of words.

2. caveblogem - March 17, 2007

Dan Guy,

I remember the Oracular Orb. That was nice work, really cool.

If you don’t mind my asking, how does the Yahoo Term Extraction API decide what constitutes a phrase? Does it only look at word groups entered into a search engine (and thus user-generated)? Or is it a tagging thing (and thus author generated)?

I guess I’m assuming that it can’t ferret them out using grammatical rules that distinguish phrases from clauses and that sort of thing and I almost hope that I’m wrong.

3. raincoaster - March 17, 2007

This is interesting analysis, and not just because it’s about me. I would have hoped that “metaphysics” and “allegory” would have been higher in the rankings, but then I’ve been blogging about Anna Nicole Smith and crap lately, so I guess that’ll teach me.

There are also some great tools for telling you which grade level your writing is at: Spy Magazine tested some and found that The New Yorker used too much jargon, and that Juggs porn mag was Grade 12, whereas Harper’s was, I believe, Grade 8 level.

Also, the Great Firewall of China blocks me, even though I’m communist: it allows the major right-wing blogs in, though. That doesn’t relate to anything here, but it’s interesting, isn’t it?

4. caveblogem - March 18, 2007


Thanks, and thanks for the link. I think your sample covered a shorter period of time than some of the others, which might have skewed the results a little. You post pretty frequently.

I have a feeling that those tools that analyze the level of one’s writing are a little suspect (although they obviously nailed The New Yorker and Harper’s). I’ve never seen Juggs. Is it Canadian? In general they probably get it right. But some people’s writing is lucid even though they use bigger words. And some people just write kookety prose, no matter what size their words.

Sorry to hear about being blocked in China. I suspect that you are not the Democratic Socialist sort of Communist that they are looking for out there. If I were them I would let the major right-wing blogs in too. They have become the sort of idiotic parody that is sure to frighten any who might question the virtues of censorship.

5. raincoaster - July 26, 2007

I think Juggs is American, although most of the Juggular material they feature is manufactured in Japan and only implanted in America.

I think China is much more anarcho-capitalist than they are communist of any persuasion lately. “PersuAsian?”
