The word's out: Statistics meets linguistics

How many words do you know?

I've been reading a wonderful little book called Geekspeak that gives a rather nice means of working this out.

And it's also a nice illustration of the power of sampling!

One approach would be to find a big dictionary, and go through every entry, ticking off every word that you know.

But there is a better way, which can arguably give a good estimate in a much shorter time.
Simply sample the population of words by opening the dictionary at random 100 (or 200, or 500) times. Each time, look at (say) the first entry at the top of the page. Do you know the meaning of this word? If yes, then add one to your word score. At the end of the exercise, divide your score by the sample size (100, 200, etc.) to get an estimate of the fraction of words in the dictionary that you know. Then multiply that fraction by the total number of words in the dictionary to get an estimate of your vocabulary size!
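The arithmetic of this sampling exercise can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `estimate_vocabulary` and the figures used (62 known words out of 100 sampled, a 50,000-word dictionary) are hypothetical, and the margin of error is a standard addition for a simple random sample rather than something taken from the book.

```python
import math


def estimate_vocabulary(known, sample_size, dictionary_words):
    """Estimate vocabulary size from a random sample of dictionary entries.

    Returns (estimate, margin), where margin is an approximate 95% margin
    of error using the normal approximation to the binomial.
    """
    if sample_size <= 0 or not 0 <= known <= sample_size:
        raise ValueError("need sample_size > 0 and 0 <= known <= sample_size")
    # Fraction of sampled words that you knew.
    fraction = known / sample_size
    # Standard error of that fraction for a simple random sample.
    std_error = math.sqrt(fraction * (1 - fraction) / sample_size)
    # Scale both up to the size of the whole dictionary.
    estimate = fraction * dictionary_words
    margin = 1.96 * std_error * dictionary_words
    return estimate, margin


# Hypothetical figures, for illustration only.
estimate, margin = estimate_vocabulary(known=62, sample_size=100,
                                       dictionary_words=50_000)
print(f"About {estimate:,.0f} words, give or take {margin:,.0f}")
```

For the same fraction of known words, quadrupling the sample size halves the margin of error, which is why 500 random openings give a noticeably tighter estimate than 100.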

You will also need to work out the total number of words in the dictionary, preferably without having to count them. To do this, you could look at the number of the last page in the dictionary, and take that as the number of pages (not perfect, but close enough). Then open the dictionary a couple of times at random, count the number of different words on each page, and take the average. Then multiply that by the number of pages and voilà! You have your estimate of the number of words in the dictionary.
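The page-count shortcut is just as easy to sketch. The function name `estimate_dictionary_size`, the last page number (1,600) and the per-page word counts below are all hypothetical figures chosen for illustration, not measurements from any real dictionary.

```python
from statistics import mean


def estimate_dictionary_size(last_page_number, words_per_sampled_page):
    """Estimate total entries as (number of pages) x (average entries per page)."""
    if last_page_number <= 0 or not words_per_sampled_page:
        raise ValueError("need a positive page count and at least one sampled page")
    return last_page_number * mean(words_per_sampled_page)


# Hypothetical counts from four randomly opened pages.
page_counts = [28, 35, 31, 26]
total_words = estimate_dictionary_size(last_page_number=1600,
                                       words_per_sampled_page=page_counts)
print(f"Roughly {total_words:,.0f} words in the dictionary")
```

Counting more than a couple of pages costs little time and makes the average less sensitive to one unusually dense or sparse page.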

Some modification of your final estimate might be required to allow for the fact that the dictionary may include lots of extensions of the stem of each word (e.g. abstract, abstractedness, abstractedly).
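One possible way to make that modification, offered here as an assumption rather than as the book's method, is to note for each sampled word you knew whether it was merely an extension of another entry's stem, and then discount the estimate by that share. The function name `adjust_for_derived_forms` and all figures below are hypothetical.

```python
def adjust_for_derived_forms(vocabulary_estimate, known_word_is_derived):
    """Discount a vocabulary estimate by the share of known sampled words
    that were derived forms (e.g. 'abstractedly' alongside 'abstract').

    known_word_is_derived: one boolean per known sampled word,
    True if that word was just an extension of another entry's stem.
    """
    if not known_word_is_derived:
        raise ValueError("need at least one known sampled word")
    share_derived = sum(known_word_is_derived) / len(known_word_is_derived)
    return vocabulary_estimate * (1 - share_derived)


# Hypothetical: of 20 known sampled words, 5 were derived forms.
flags = [True] * 5 + [False] * 15
adjusted = adjust_for_derived_forms(31_000, flags)
print(f"About {adjusted:,.0f} distinct word stems")
```

What counts as a derived form is a judgment call, so the adjusted figure is best read as a rough lower bound on distinct stems rather than a precise count.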

Geekspeak also deals with lots of other fascinating (and geeky) topics, such as:
• How heavy is your house?
• Do the dead outnumber the living?
• How powerful is a fly?
• How fast is a fart?
• How many people on a treadmill would it take to power an electric kettle?

As the blurb to the book says, it allows you to ‘... exercise your brain, and impress your friends, in ways you never thought possible’.
I recommend it.

Scott MacLean, Nulink Analytics, FAMSRS

Geekspeak - How Life + Mathematics = Happiness by Dr Graham Tattersall was published by HarperCollins in 2008 (ISBN 978-0-00-726338-7).


Research News, February 2010 edition



    Australian Market & Social Research Society