The word's out: Statistics meets linguistics
How many words do you know? I've been reading a wonderful little book called Geekspeak that gives a rather nice means of working this out. And it's also a nice illustration of the power of sampling!

One approach would be to find a big dictionary and go through every entry, ticking off every word that you know. But there is a better way, which can arguably give a good estimate in a much shorter time. Simply sample the population of words by opening the dictionary at random 100 (or 200, or 500) times. Each time, look at (say) the first entry at the top of the page. Do you know the meaning of this word? If yes, add one to your word score. At the end of the exercise, divide your score by the sample size (100, 200, etc.) to get an estimate of the fraction of words in the dictionary that you know. Then multiply that fraction by the total number of words in the dictionary to get an estimate of your vocabulary size!

You will also need to work out the total number of words in the dictionary, preferably without having to count them. To do this, you could look at the number of the last page in the dictionary and take that as the number of pages (not perfect, but close enough). Then open the dictionary a couple of times at random, count the number of different words on each page, and take the average. Multiply that average by the number of pages and voila! You have your estimate of the number of words in the dictionary.

Some modification of your final estimate might be required to allow for the fact that the dictionary may include lots of extensions of the stem of each word (e.g. abstract, abstractedness, abstractedly, etc.).

Geekspeak also deals with lots of other fascinating (and geeky) topics, such as:
• How heavy is your house?
• Do the dead outnumber the living?
• How powerful is a fly?
• How fast is a fart?
• How many people on a treadmill would it take to power an electric kettle?

As the blurb to the book says, it allows you to ‘... exercise your brain, and impress your friends, in ways you never thought possible’. I recommend it.

Scott MacLean, Nulink Analytics, FAMSRS

Geekspeak - How Life + Mathematics = Happiness by Dr Graham Tattersall was published by Harper Collins in 2008 (ISBN 978-0-00-726338-7)
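For readers who would like to see the arithmetic of the dictionary exercise spelled out, here is a minimal sketch in Python. It is an illustration only and is not taken from Geekspeak: the function names are invented for this example, and every number in it (a 1,500-page dictionary, the three per-page word counts, a score of 65 out of 100) is made up.

```python
def estimate_dictionary_size(last_page_number, words_on_sampled_pages):
    """Estimate the number of entries in the dictionary.

    Takes the last page number as the page count and multiplies it by
    the average number of words counted on a few randomly opened pages.
    """
    average_per_page = sum(words_on_sampled_pages) / len(words_on_sampled_pages)
    return average_per_page * last_page_number


def estimate_vocabulary(word_score, sample_size, dictionary_size):
    """Estimate vocabulary size from the sampling exercise.

    word_score is how many of the sampled words you knew;
    sample_size is how many times you opened the dictionary.
    """
    fraction_known = word_score / sample_size
    return fraction_known * dictionary_size


# Hypothetical figures, chosen only to show the calculation.
dictionary_size = estimate_dictionary_size(
    last_page_number=1500,
    words_on_sampled_pages=[38, 42, 40],
)
vocabulary = estimate_vocabulary(
    word_score=65,
    sample_size=100,
    dictionary_size=dictionary_size,
)

print(f"Estimated words in dictionary: {dictionary_size:,.0f}")
print(f"Estimated vocabulary size: {vocabulary:,.0f}")
```

With these invented figures the dictionary estimate is 40 words per page times 1,500 pages, and the vocabulary estimate is 65% of that. The sketch makes no allowance for stem extensions such as abstractedness and abstractedly, so any adjustment for those would still need to be made by hand.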
Research News, February 2010 edition