The Enron Mobile Email Dataset

This dataset consists of sentences written by Enron employees on BlackBerry mobile devices. We provide a series of test sets we recommend for use in text entry evaluations.

The sentences and sentence fragments were found by searching for emails that end with the default BlackBerry signature. All the sentences were manually reviewed and corrected. For each sentence, we provide metadata about its category (business, personal, Enron-specific), how easy it was to remember, and how quickly and accurately it was typed on full-sized keyboards.
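
As a rough illustration of the signature-based search described above, the following Python sketch keeps only messages whose text ends with a signature line. The signature string, the function name `body_before_signature`, and the exact matching rule are assumptions made for this example; the actual extraction procedure is documented in the paper and readme, and the extracted text was manually reviewed afterwards.

```python
# ASSUMPTION: the exact default signature text is not given on this page;
# this string is a placeholder for illustration only.
ASSUMED_SIGNATURE = "Sent from my BlackBerry Wireless Handheld"


def body_before_signature(message, signature=ASSUMED_SIGNATURE):
    """Return the text preceding a trailing signature, or None if absent.

    Hypothetical helper: it only handles the simple case where the
    signature is the very last non-whitespace text in the message.
    """
    text = message.rstrip()
    if not text.endswith(signature):
        return None
    # Drop the signature and any whitespace separating it from the body.
    return text[: -len(signature)].rstrip()


mobile = body_before_signature(
    "Running late.\n\nSent from my BlackBerry Wireless Handheld\n"
)
desktop = body_before_signature("No signature here")
```

Messages for which the helper returns None would simply be skipped as not written on a mobile device.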

For further details, see our paper, "A Versatile Dataset for Text Entry Evaluations Based on Genuine Mobile Emails."

Files:
enronmobile.zip Zip file containing the full Enron Mobile Dataset (everything listed below and much more).
readme.txt Readme describing the dataset

Memorable test sets:
mem1.txt 40 easy to remember sentences, set 1
mem2.txt 40 easy to remember sentences, set 2
mem3.txt 40 easy to remember sentences, set 3
mem4.txt 40 easy to remember sentences, set 4
mem5.txt 40 easy to remember sentences, set 5
mem.zip All 200 memorable sentences.
mem_wav.zip All 200 memorable sentences with WAV audio recordings of each.

Character combination sets:
bi40.txt 40 sentences with representative character bigram frequencies
bi80.txt 80 sentences with representative character bigram frequencies
bi160.txt 160 sentences with representative character bigram frequencies
bi320.txt 320 sentences with representative character bigram frequencies
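
For readers unfamiliar with the term, a character bigram is a pair of adjacent characters, and a set is "representative" when its bigram frequencies resemble those of the larger collection. The Python sketch below shows one way to compute relative bigram frequencies for a list of sentences. The function name `bigram_frequencies`, the lowercasing step, and the sample sentences are assumptions for illustration; this is not the selection procedure used to build the sets.

```python
from collections import Counter


def bigram_frequencies(sentences):
    """Return the relative frequency of each adjacent character pair.

    Illustrative helper (not part of the dataset): sentences are
    lowercased and stripped, and bigrams do not span sentences.
    """
    counts = Counter()
    for sentence in sentences:
        text = sentence.strip().lower()
        counts.update(text[i:i + 2] for i in range(len(text) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bigram: n / total for bigram, n in counts.items()}


# Made-up sample sentences, not taken from the dataset.
sample = ["see you at noon", "are you free"]
freqs = bigram_frequencies(sample)
```

To apply this to one of the files above, one could read its lines into a list first, assuming the file holds one sentence per line (check readme.txt for the actual format).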

Memorable character combination set:
mem_bi.txt 40 memorable sentences with representative character bigram frequencies