PRINTED FROM OXFORD HANDBOOKS ONLINE (www.oxfordhandbooks.com). © Oxford University Press, 2018. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in Oxford Handbooks Online for personal use (for details see Privacy Policy and Legal Notice).

date: 22 January 2019

Abstract and Keywords

Author profiling is the analysis of people’s writing to determine which classes they belong to, such as gender, age group or native language. Many of the techniques for author profiling are derived from the related task of author identification, so we will look at this topic first. Author identification is the task of finding out who is most likely to have written a disputed document, and there are a number of computational approaches to it. The three main subtasks are the compilation of corpora of texts known to be written by the candidate authors, the selection of linguistic features to represent those texts, and the use of statistics to determine which of those features are most indicative of a particular author’s writing style. Plagiarism is the unacknowledged use of another author’s original work, and we will look at software for its detection. The chapter will cover the types of text obfuscation strategies used by plagiarists, commercial plagiarism detection software and its shortcomings, and recent research systems. Strategies have been developed for both external plagiarism detection (where the original source is searched for in a large document collection) and intrinsic plagiarism detection (where the source text is not available, necessitating a search for inconsistencies within the suspicious document). The specific problems of plagiarism by translation of an original written in another language, and the unauthorized copying of sections of computer code, are described. Evaluation forums and publicly available test data sets are covered for each of the main topics of this chapter.

Keywords: computational stylometry, disputed authorship, author identification, plagiarism, age, gender
