Title
Annotating URLs with query terms: What factors predict reliable annotations?
Author
TNO Informatie- en communicatietechnologie
Verberne, S.
Hinne, M.
van der Heijden, M.
Kraaij, W.
D'hondt, E.
van der Weide, T.
Publication year
2009
Abstract
A number of recent studies have investigated the relation between URLs and associated query terms from search engine log files. In [5], the query terms associated with the domain of a URL were used as features for a URL classification task. The idea is that query terms that lead to successful classification of a URL are reliable semantic descriptors of the URL content. We follow up on this work by investigating which properties of a URL and its associated query terms predict the classification success. We construct a number of URL and query properties as predictors and analyze them in depth. We conclude that the classification success, and thus the reliability of the query terms as URL descriptors, cannot easily be predicted from properties of the URL and the queries.
Subject
Language Modeling
Attention Metadata
To reference this document use:
http://resolver.tudelft.nl/uuid:d7e86907-42ed-49ce-a149-738c679c7082
TNO identifier
445980
Source
Proceedings of the Understanding the User (UIIR) workshop at SIGIR 2009
Document type
conference paper