
A short digression March 17, 2007

Posted by caveblogem in Blogs and Blogging, linguistics, literature, Other, vocabulary.

I always put posts up without doing significant research into what other people are doing on whatever subjects I happen to be writing about.  Luckily, this makes me look careless and ignorant, instead of lazy and self-absorbed, which is probably closer to the truth. 

Anyway, this morning I got a comment somewhere in this vocabulary thread from kuipercliff, who pointed me to two very interesting sites that perform and display research along similar lines.

WordCount™ is an artistic experiment in the way we use language. It presents the 86,800 most frequently used English words, ranked in order of commonness. Each word is scaled to reflect its frequency relative to the words that precede and follow it, giving a visual barometer of relevance. The larger the word, the more we use it. The smaller the word, the more uncommon it is.
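For the curious, the basic idea (count words, rank them by commonness, and draw the common ones bigger) can be sketched in a few lines of Python. This is only a rough illustration, not how WordCount itself works: the function names `rank_words` and `font_sizes`, the sample sentence, and the point sizes are all made up here, and the scaling is a simple linear one across the whole list rather than WordCount's scaling relative to neighboring words.

```python
import re
from collections import Counter


def rank_words(text):
    """Return (word, count) pairs, most frequent first; ties sorted alphabetically."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


def font_sizes(ranked, smallest=10, largest=72):
    """Map each word to a size between smallest and largest, linear in its count.

    Assumption: a plain linear scale over the whole list, chosen for simplicity.
    """
    if not ranked:
        return {}
    top = ranked[0][1]
    low = ranked[-1][1]
    spread = (top - low) or 1  # avoid dividing by zero when all counts match
    return {
        word: smallest + (largest - smallest) * (count - low) / spread
        for word, count in ranked
    }


# Hypothetical sample text, standing in for a real corpus.
sample = "the cat and the dog and the bird"
ranked = rank_words(sample)
sizes = font_sizes(ranked)

for word, count in ranked:
    print(f"{word:>5}  count={count}  size={sizes[word]:.0f}pt")
```

With a real corpus in place of the sample sentence, the same ranking would give a list in order of commonness, with the most frequent words drawn largest.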

The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of current British English, both spoken and written.

The BNC database is proprietary, and English, too, so I’ll keep using my own, for now.  But both of these sites are pretty interesting, as is kuipercliff’s blog.

