Friday, December 17, 2010

Finally, we can get some important work done...

***Spoiler Alert - there is some vulgarity in this post, but it's all for science.***

Google recently released a new tool that gives users access to millions of digitized books published over several centuries, so they can see how frequently a word or phrase appeared over time. The implications of this sort of technology are immense; students, researchers and even laypeople can now plug in words or simple phrases and gauge what authors, as a whole, were thinking at various times.
We can now make charts that measure the "published zeitgeist" and, from that, gain a better understanding of how opinions were trending. For instance, the chart below illustrates the historical frequency of the more polite words "penis" and "vagina" against the coarser, more offensive "dick" and "pussy" between 1920 and 2008.



Upon review of the data, it becomes clear that somewhere around 1980 the publishing community became overrun with vulgar animals, as use of the more genteel/clinical nouns began to level off and published instances of the harsher terms started to tick up. (Of course, this study was hastily assembled and its implications haven't yet been fully explored. It is possible that beginning in the 1980s there was a jump in Richard Nixon biographies and cat ownership; the Barryrides research staff will conduct a follow-up study to confirm this. However, based on projections from this data, it is reasonable to believe that in another 10-15 years medical texts will trade the words "penis" and "vagina" for "dick" and "pussy". Completely reasonable.)
Another quick illustration of the awesome potential of this tool can be found in the chart below, which attempts to pin down when a Clint Eastwood/Monkey movie would most likely be made:



One can see that over time, "Clint Eastwood" references in published literature (indicated by the blue line) remained relatively flat, while use of the word "monkey" (shown in red) peaked in 1976. With interest in Clint Eastwood and monkeys at their respective all-time highs, one would expect Hollywood to take notice. It should therefore come as no surprise that in 1978 the film "Every Which Way but Loose" was released (script approval, filming, and distribution would account for the two-year gap).

The possibilities for this technology are limitless. More studies to come...

1 comment:

Justin said...

This is absolute brilliance. I'm so happy I found your blog.