PRINTED FROM OXFORD HANDBOOKS ONLINE. © Oxford University Press, 2022. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in Oxford Handbooks Online for personal use (for details see Privacy Policy and Legal Notice).

date: 27 June 2022

Abstract and Keywords

There is a considerable literature on applications of statistical methods in natural-language processing. This chapter focuses on two types of applications: (1) recognition/transduction applications based on Shannon’s Noisy Channel such as speech recognition, optical character recognition (OCR), spelling correction, part-of-speech (POS) tagging, and machine translation (MT); and (2) discrimination/ranking applications such as sentiment analysis, information retrieval, spam email filtering, author identification, and word sense disambiguation (WSD). Shannon’s Noisy-Channel model is often used for the first type, and linear separators such as Naive Bayes and logistic regression are often used for the second type. These techniques have produced successful products that are being used by large numbers of people every day: web search, spelling correction, translation, etc. Despite successes such as these, it should be mentioned that all approximations have their limitations. At some point, perhaps in the not-too-distant future, the next generation may discover that the low-hanging fruit has been pretty well picked over, and it may be necessary to revisit some of these classic limitations.

Keywords: Shannon’s Noisy-Channel model, Naive Bayes, logistic regression, term weighting, statistical methods, linear separators, web search, spelling correction, machine translation, word sense disambiguation
