Which words do you own?–Neil Gaiman March 16, 2007

Posted by caveblogem in Blogs and Blogging, bookmooch, Books, Cartooning, fiction, literature, Neil Gaiman, Other, vocabulary, web 2.0, writing.

[Note: This is part of a continuing series on the actual vocabulary in use in the blogosphere.  Posts on this subject started here.]

I began to read the work of Neil Gaiman last year when somebody suggested I read Good Omens, a collaboration between Mr. Gaiman and Terry Pratchett.  Then I read American Gods and Neverwhere and everything else I could get my hands on.  The only thing I haven’t been able to get ahold of is his latest, Fragile Things, which nobody has posted on Bookmooch or Paperbackswap (I have to be a little frugal this year, I’m afraid).  Anyway, Mr. Gaiman is a tremendously talented writer of creepy and interesting tales.  And he writes a darn good blog, too, which I subscribe to and read whenever I can.

Yesterday morning I sampled 22,000 words from Mr. Gaiman’s site, spanning the period January 6 – March 14.  I had to run the spell-check a little differently from the way I normally do, because Mr. Gaiman uses the British spellings of words like color, organize, check (cheque, a draft on one’s checking account), favorite, and orangutan.  So I just changed these to the Americanized versions in his list so that I could merge it with the others.
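For anyone who wants to try the same thing at home, here is a rough sketch of that normalize-then-merge step in Python.  It is not the procedure I actually used (I did it by hand in the spell-checker); the mapping table and the function names are made up for illustration, and the table lists only a few of the spellings mentioned above.

```python
from collections import Counter

# Hypothetical mapping, for illustration only: British spelling -> American spelling.
BRITISH_TO_AMERICAN = {
    "colour": "color",
    "organise": "organize",
    "cheque": "check",
    "favourite": "favorite",
}

def normalize(word, mapping=BRITISH_TO_AMERICAN):
    """Lowercase a word and swap in the American spelling if we know one."""
    word = word.lower()
    return mapping.get(word, word)

def merge_counts(*word_lists):
    """Merge several word lists into one frequency count after normalizing."""
    merged = Counter()
    for words in word_lists:
        merged.update(normalize(w) for w in words)
    return merged

counts = merge_counts(["colour", "cheque"], ["color", "Color"])
print(counts["color"], counts["check"])
```

Normalizing before merging matters because otherwise "colour" and "color" would each count as a separate word in the combined list.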

I have started to add some words to my spell-checker, and with Mr. Gaiman’s blog I added googled, blog, blogger, blogging, edamame, and perhaps a couple of others that I forgot to write down at the time but which I was absolutely certain were correctly spelled words.
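The spell-checker is doing the work of a filter here: a word counts only if it is in the dictionary or in my list of additions.  A minimal sketch of that idea, assuming the dictionary is just a set of words (the function name and the tiny stand-in dictionary are hypothetical, not anything from MS Word):

```python
import re

def recognized_words(text, dictionary, additions=()):
    """Return the tokens in text found in the dictionary or the additions list."""
    allowed = {w.lower() for w in dictionary} | {w.lower() for w in additions}
    # Crude tokenizer: runs of ASCII letters and straight apostrophes only.
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t in allowed]

tiny_dictionary = {"i"}              # stand-in for a real dictionary
my_additions = ["googled", "edamame"]
print(recognized_words("I googled edamame recipies", tiny_dictionary, my_additions))
```

The misspelled "recipies" is dropped rather than counted as a new word, which is the reason for filtering in the first place.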

The Blogger’s Vocabulary List is getting larger with each blog I incorporate.  The latest, which includes samples from Three Quarks Daily, Daily Kos, this blog (Pretty Good on Paper) and Neil Gaiman’s Journal, contains 9,383 different words.  In a couple of months I should be able to make a pretty good estimate of the size of the vocabulary in actual use out there (here?) in the blogosphere.  Check this space for updates.

Mr. Gaiman added 1,112 words to the list, an impressive feat at this point for an individual blogger.  Here is a vocabulary cloud composed of the words Mr. Gaiman added to the list, with each word’s font size in points set to twice the number of times it appeared in his 20,000-word sample (click for a larger image).
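The sizing rule is simple enough to sketch in a few lines of Python.  Again, this is an illustration under my own assumptions rather than the tool I used: the function name and the sample data are invented, and it assumes the words have already been cleaned up and spell-checked.

```python
from collections import Counter

def new_word_sizes(sample_words, existing_vocabulary, points_per_use=2):
    """Map each word not already in the list to a font size in points.

    The size is points_per_use times the number of appearances in the sample.
    """
    counts = Counter(w.lower() for w in sample_words)
    known = {w.lower() for w in existing_vocabulary}
    return {w: n * points_per_use for w, n in counts.items() if w not in known}

sizes = new_word_sizes(["tea", "tea", "tea", "the", "Cat"], {"the"})
print(sizes)
```

A word that appears three times comes out at 6 points and a word that appears once at 2 points, so the cloud is dominated by whatever the blogger says often that nobody else in the pool has said yet.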

cloudng.jpg

I’ve decided to stop estimating the size of the vocabularies of individual blogs in this study because such estimates are too artificial.  Even bloggers and writers use most of their words in conversation.  And since your vocabulary is altered by each conversational partner (your partner asks a question about broccoli or oysters and you find yourself using these words too, if only to ask for clarification), estimates of this sort don’t seem all that relevant.

What does Mr. Gaiman’s vocabulary cloud say about him as a blogger?  What does it say about the bloggers to whom his words were compared?  What will Raincoaster’s vocabulary cloud say about her or us or anything, when it is added to this growing pool tomorrow?

Anybody?
Anybody? 
Anybody?
Bueller?

Comments»

1. Alabaster Crippens - March 16, 2007

I’m finding all this increasingly intriguing I must say. Especially the fingerprint you were talking about the other day. What words do I use that others don’t? Is it our differences that make us different? (erm…what?)
I’m very intrigued by this ongoing study. I’ve always assumed I’ve got a quite varied vocabulary…but then occasionally I look at what I say and note that in fact I often use similar sentence structures (If I had a penny for every time I split the infinitive to try and imitate my speaking style…I’d be minted).
Anyway, thanks for the investigation…keep it up.

2. caveblogem - March 16, 2007

Alabaster,

Thanks for stopping by. Link to me and I’ll do yours next.

3. Alabaster Crippens - March 16, 2007

Cool, I’ll point a post in your direction shortly.

4. Dan Guy - March 17, 2007

Neil has frequency clouds for all 6+ years of his blog.

http://www.neilgaiman.com/journal/labels/clouds/words.php
Total words: 954,414
Unique words: 35,189
Displayed words: 28,220 (It doesn’t display words shorter than 5 characters, combines similar words, and removes common articles.)

http://www.neilgaiman.com/journal/labels/clouds/terms.php
Then I pass the words through the Yahoo Term Extraction API to get only the “significant” terms:

http://www.neilgaiman.com/journal/labels/clouds/dynamic_term_cloud.php
And because the terms didn’t show trending, I created a dynamic version that displays them by month with a slider to make them grow and shrink with frequency.

5. Which words do you own?--Raincoaster « Pretty Good on Paper - March 17, 2007

[…] The links, if you should wish to pursue them, are in the comment thread to the post on Mr. Gaiman here.  The analysis and method Dan used seems more sophisticated than mine.  For example, he […]

6. EelKat - March 17, 2007

One comment asks “What words do I use that others don’t?”. I can guarantee that I use words no one else does! Here are a few:

Thw Twighlight Manor
EelKat
Crystonia
fantabulous
manofastaka
varsa
ratzi

You’ll find those words (and many more) on my blogs, and you won’t find them anywhere where you do not also find me… the joys of having created a new language

7. caveblogem - March 17, 2007

EelKat,

This series of posts originated in an attempt to quantify the size of the vocabulary in use in the blogosphere. So I have necessarily limited the analysis to words in the MS Word dictionary, or that I recognize as being in some dictionary somewhere. If I expand it to analyze words that I don’t recognize I can’t pick out mis-spellings and stuff like that. So there would be a new word added each time a word was mis-spelled.

I also love to create new words and see if I can spread their use. Unfortunately, I’m not quite technically skilled enough to figure out how to adapt the analysis to accommodate made-up words (or even recognize them, I think).

