Abstract and Keywords
This chapter presents an overview of methods for measuring the similarity of words and texts, using corpus-based and knowledge-based measures of semantic similarity. It describes several word-to-word similarity measures that draw on knowledge resources such as WordNet or on large corpora of raw or encyclopedic text, and shows how these word-based measures can be combined into a text-to-text similarity metric that can be effectively applied to similarity or paraphrase recognition. The metrics are evaluated and compared in a number of experiments performed on several word and text similarity datasets. The chapter concludes by discussing the main techniques proposed to date and the emerging trends.