Automatic Keyword Extraction from Individual Documents

May 3, 2010

Book Chapter

Abstract

This paper introduces a novel and domain-independent method for automatically extracting keywords, as sequences of one or more words, from individual documents. We describe the method’s configuration parameters and algorithm, and present an evaluation on a benchmark corpus of technical abstracts. We also present a method for generating lists of stop words for specific corpora and domains, and evaluate its ability to improve keyword extraction on the benchmark corpus. Finally, we apply our method of automatic keyword extraction to a corpus of news articles and define metrics for characterizing the exclusivity, essentiality, and generality of extracted keywords within a corpus.

Revised: May 11, 2010 | Published: May 3, 2010

Citation

Rose S.J., D.W. Engel, N.O. Cramer, and W.E. Cowley. 2010. Automatic Keyword Extraction from Individual Documents. In Text Mining: Applications and Theory, edited by M.W. Berry and J. Kogan. 3-20. Chichester: John Wiley & Sons. PNNL-SA-67401.