Combining textual and non-textual features for e-mail importance estimation

conference paper
In this work, we address a binary classification problem: identifying the e-mail messages that the receiver will reply to. The longer-term goal is to develop a tool that informs a knowledge worker which e-mails are likely to need a reply. We extracted training examples from the Enron corpus and analysed the word n-grams that characterize the messages that the receiver replies to. Additionally, we compared a Naive Bayes classifier with a decision tree classifier on the task of distinguishing replied from non-replied e-mails. We found that textual features are well suited to obtaining high accuracy. However, recall and precision differ in interesting ways across the various feature selections.
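The abstract does not specify how the n-gram features were extracted or how the classifiers were configured, so the following is only an illustrative sketch of the general idea: word n-grams as textual features, a multinomial Naive Bayes classifier with add-one smoothing, and precision and recall for the "replied" class. All function and class names, the smoothing choice, and the toy messages are assumptions made for illustration and are not taken from the paper. The decision tree classifier is omitted for brevity.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a lower-cased text."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts (add-one smoothing assumed)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.gram_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.gram_counts[label].update(word_ngrams(text))
        self.vocab = set()
        for counts in self.gram_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus smoothed log likelihood of each known n-gram.
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.gram_counts[c].values()) + len(self.vocab)
            for gram in word_ngrams(text):
                if gram in self.vocab:  # n-grams unseen in training are skipped
                    score += math.log((self.gram_counts[c][gram] + 1) / denom)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive ("replied") class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy messages invented for illustration; 1 = replied, 0 = not replied.
train_texts = [
    "can you send me the report",
    "please reply with your availability",
    "newsletter weekly update",
    "automatic notification do not reply",
]
train_labels = [1, 1, 0, 0]
model = NaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict(text)` on a new message returns the predicted label, and `precision_recall(true_labels, predicted_labels)` makes the trade-off between the two measures visible when the feature set (for example, the value of `n_max`) is varied.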
TNO Identifier
745688
Source title
Proceedings of the 25th Benelux Conference on Artificial Intelligence, 2013
Pages
1-7