When an internet user clicks on a result in a search engine, the request submitted to the destination web server includes a referrer field containing the search terms the user entered. Using this information, website owners can analyze the search terms leading to their websites to better understand their visitors’ needs. This work explores some of the features that can be used for classification-based analysis of such referring search terms. We present initial results for the example task of classifying HTTP requests’ countries of origin. A system that can accurately predict the country of origin from query text may be a valuable complement to IP lookup methods, which are susceptible to obfuscation by dereferrers or proxies. We suggest that the addition of semantic features improves classifier performance in this example application. We begin by looking at related work and presenting our approach. After describing initial experiments and results, we discuss paths forward for this work.
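As a rough illustration of the referrer mechanism described above, the following minimal sketch pulls search terms out of a referrer URL. It is not the authors' method: the function name, the example URL, and the set of query-string keys (`q`, `query`, `p`) are assumptions made for illustration, and real search engines may use other parameter names or omit the terms entirely.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed query-string keys that may carry the search terms (illustrative only).
QUERY_KEYS = ("q", "query", "p")


def search_terms_from_referrer(referrer):
    """Return the lowercased search terms found in a referrer URL, or [] if none."""
    # parse_qs decodes percent-escapes and turns "+" into spaces.
    params = parse_qs(urlsplit(referrer).query)
    for key in QUERY_KEYS:
        if key in params:
            return params[key][0].lower().split()
    return []


# Hypothetical referrer value, as it might appear in an HTTP request header.
example = "http://search.example.com/search?q=Pacific+Northwest+hiking&hl=en"
terms = search_terms_from_referrer(example)
```

The resulting token list is the kind of input from which a classifier's features could then be derived.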
Revised: June 28, 2012
Published: May 11, 2012
Citation
May C.J., M.J. Henry, L.R. McGrath, E.B. Bell, E.J. Marshall, and M.L. Gregory. 2012. Semantic Features for Classifying Referring Search Terms. In Proceedings of Northwest Natural Language Processing Conference (NW-NLP 2012), May 11, 2012, Redmond, Washington. Seattle, Washington: University of Washington. PNNL-SA-87895.