G. Ramirez

The Role of Non-Content Features in XML Retrieval

ABSTRACT

The research presented here investigates the use of non-content features for effective information retrieval. We use the term non-content features to refer to the structural markup within a document or collection and to a document's surface features, i.e., its (derived) metadata (e.g., size). Our main hypothesis is that the best use an information retrieval system can make of this type of information is determined by the type of search task and by contextual factors. We focus our investigation on three main aspects: (1) the analysis of existing retrieval strategies that use non-content features and the creation of new ones, (2) the use of relevance feedback techniques to refine the non-content information given a user need, and (3) the study of the relationships between user search tasks and contextual factors, on the one hand, and the structural characteristics of the relevant information, on the other.