Malcolm Corney, Alison Anderson and George Mohay
Queensland University of Technology
Australia
Olivier de Vel
Defence Science and Technology Organisation
Australia
This paper describes an investigation into mining e-mail text documents to attribute the gender of their authors. We used an extended set of e-mail document features that are predominantly independent of topic content, such as style markers, structural characteristics and gender-preferential language features, together with a Support Vector Machine learning algorithm. Experiments using a corpus of e-mail documents written by a large number of authors of both genders gave promising results for author gender categorisation.
Keywords: computer forensics, authorship attribution, e-mail, data mining