757 Monkeys, Typewriters, and Shakespeare — Project GorillaSpeare
by enferex on Apr.10, 2008, under cool ideas, humor, IRC, lulz
I am sure many of you have heard of the thought experiment relating monkeys, typewriters, and Shakespeare to the concept of entropy. Monkeys, typewriters, and Shakespeare, you say!? How much cooler can things get? Well, this creative thought experiment goes as follows:
“The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a particular chosen text, such as the complete works of William Shakespeare” [Source: http://en.wikipedia.org/wiki/Infinite_monkey_theorem].
I am not going to go into the history of that thought experiment, or much more. The Wikipedia link above should do it justice. So what do monkeys and typewriters have to do with the 757ers? Well, I’ll let you take a look for yourself, as I should not impose any bias:

That’s right: nerds, computers, and text generation. So I had the idea: if there is a potential for monkeys to produce such a marvelous work as Shakespeare, surely my fellow Homo sapiens should be able to generate something of equivalent brilliance. Thus, the birth of Project GorillaSpeare. The idea was to gather a log in #proto on the 757 IRC server, and eventually compare the log to Hamlet. Thanks to Project Gutenberg, I obtained a plain-text copy of Shakespeare’s Hamlet, from which I stripped the lines indicating who was to say what (yep, Hamlet is written as a play). I also removed newlines and some of the stage directions of a form similar to: [Ham. exits]. Once that was parsed, I wrote some code that took each character of Hamlet in turn and looked for the first instance of that character in the IRC log file. Also captured was the user who typed that character (spaces included). The processing job ended when the IRC log ran out. Now I must say, my parsing job was not perfect, nor can I credit the findings as being anything of scientific worth. But enough with the wordy foreplay and on to the results:
- Parsed Hamlet Text: 164642 characters
- Parsed IRC Log: 32365 characters from January 11, 2008 till April 5, 2008. (log gathering only when I was logged in).
- We banged out about 19.657% of Hamlet
- About every 5.087 characters we plopped out 1 character of Hamlet.
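
As a quick sanity check, both ratios above follow from the two character counts alone. Here is a minimal Python sketch; the variable names are mine and are not taken from the original code. Note that the 5.087 figure is simply the Hamlet length divided by the log figure.

```python
hamlet_chars = 164642  # parsed Hamlet text, from the list above
log_chars = 32365      # figure reported as "Parsed IRC Log"

# Share of Hamlet covered, as a percentage.
percent_of_hamlet = 100 * log_chars / hamlet_chars

# Hamlet length divided by the log figure.
ratio = hamlet_chars / log_chars

print(f"{percent_of_hamlet:.3f}%")  # 19.658% (19.657% above is the same value, truncated)
print(f"{ratio:.3f}")               # 5.087
```
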
| Index | Handle | Hamlet Character Matches |
| 1 | telmnstr | 2140 |
| 2 | count | 1027 |
| 3 | enferex | 549 |
| 4 | remad | 379 |
| 5 | sean | 294 |
| 6 | derez | 284 |
| 7 | skhisma | 198 |
| 8 | chad | 196 |
| 9 | zotobot | 193 |
| 10 | Fister | 144 |
The rest of the results can be obtained here.
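
For the curious, the matching procedure described earlier can be sketched roughly as follows. This is a reconstruction in Python, not the original code. It assumes the log is available as a list of (handle, message) pairs in chronological order and that characters must match exactly, case included. The function name `match_hamlet` and the sample handles are made up for illustration.

```python
from collections import Counter

def match_hamlet(hamlet, log):
    """Walk Hamlet one character at a time, scanning forward through the log.

    `log` is assumed to be a list of (handle, message) pairs, oldest first.
    Returns the number of Hamlet characters matched and a per-handle tally.
    """
    # Flatten the log into a single forward-only stream of (handle, character).
    stream = ((handle, ch) for handle, msg in log for ch in msg)
    credits = Counter()
    matched = 0
    for target in hamlet:
        # Resume scanning where the previous match left off.
        for handle, ch in stream:
            if ch == target:
                credits[handle] += 1  # credit whoever typed the character
                matched += 1
                break
        else:
            break  # the log ran out before this Hamlet character was found
    return matched, credits

# Tiny made-up example: two users accidentally type "to be".
matched, credits = match_hamlet(
    "to be",
    [("alice", "what to do"), ("bob", "maybe later")],
)
print(matched)        # 5
print(dict(credits))  # {'alice': 3, 'bob': 2}
```

Because the stream is only ever consumed forward, every skipped log character is gone for good, which is why the job ends as soon as the log is exhausted.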
So what does this “study” tell us about our entropy? Well, for one, I would think that a 1/5 ratio of Hamlet to Nerds is pretty efficient, but that’s my opinion. The results do not tell us too much; I just figured it would be interesting to see how efficient the IRC room is at generating a novel without setting out to do so. Granted, we are not communicating a novel per se; rather, what our blabberings have generated is still somewhat ordered, and it is being compared to a text that we were not trying to generate. In the thought experiment, the monkeys are typing pseudo-randomly. The next phase (GorillaSpeare 2.0) is to compare our writings to those of monkeys and measure what I assume the original point of the monkeys was: a fairly good quality of pseudo-randomness. My conclusion is that monkeys, our brethren, are awesome, and we as Homo sapiens are no higher. If we were asked to bang on some keyboards without a premise, I’m sure we could do just as good a job.
-Matt (enferex)

April 10th, 2008 on 5:06 pm
Awesome! We’re the best monkeys around.
December 4th, 2008 on 5:04 pm
Hey, I noticed that your 19.657% of Hamlet was also the ratio of your log’s characters to the characters in the Hamlet text. Doesn’t that mean there was a 1:1 correlation of characters entered to characters matched? That is, your log file could only have gotten through 19.657803% of the Hamlet text.
December 5th, 2008 on 1:54 am
IIRC, Hamlet was parsed one character at a time. The first time that same character was found in the IRC log, a match was considered successful, and the process continued with the next Hamlet character, picking up in the log where we had previously left off. This iterated until either the log or Hamlet ran out. To get through all of Hamlet, we would need to collect much more text from the channel. I did not clarify this properly. So I can only assume, without double checking, that the IRC log was exhausted well before Hamlet.
-Matt (enferex)
December 7th, 2008 on 4:26 am
Parsed IRC Log: 32365 characters from January 11, 2008 till April 5, 2008. (log gathering only when I was logged in).
I had assumed that the IRC log was originally 32365 characters long, but I suppose that is the number of characters matched against the Hamlet file instead. That number is the only thing that is causing me trouble.
December 7th, 2008 on 11:12 am
No worries, my friend. I would hope the log was larger than the 32365 characters I posted. The value attributed to the “Parsed IRC Log” should have been the matched count, or at least I would hope it is. It has been a while since I conducted the test, and it was just some geeky humor; it does not really hold any mathematical merit. I cannot say with 100% certainty that the values I concluded with were accurate.
-Matt (enferex)