Incorporating terminology evolution for query translation in text retrieval with association rules
Document Type
Conference Proceeding
Publication Date
12-1-2010
Journal / Book Title
International Conference on Information and Knowledge Management Proceedings
Abstract
Time-stamped documents such as newswire articles, blog posts and other web pages are often archived online. When these archives cover long spans of time, the terminology within them can undergo significant changes. Hence, when users pose queries about historical information over such documents, the queries need to be translated to account for these temporal changes so that users receive accurate responses. For example, a query on Sri Lanka should automatically retrieve documents that use its former name, Ceylon. We call such concepts SITACs, i.e., Semantically Identical Temporally Altering Concepts. To discover SITACs, we propose an approach based on a novel framework that integrates natural language processing, association rule mining and contextual similarity as a learning technique. The proposed approach has been evaluated on real data and has been found to yield good results with respect to efficiency and accuracy. © 2010 ACM.
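The idea of relating terms through contextual similarity can be illustrated with a small sketch. The code below is not the framework from the paper: it is a toy illustration that assumes two terms are SITAC candidates when they never appear in the same document but co-occur with nearly the same set of other terms. The corpus `DOCS`, the function names, the use of Jaccard similarity and the 0.5 threshold are all assumptions made for this example, and document timestamps are omitted for brevity.

```python
# Toy corpus (hypothetical data): each document is reduced to a set of terms.
DOCS = [
    {"ceylon", "colombo", "tea", "island"},
    {"ceylon", "colombo", "tea", "export"},
    {"sri_lanka", "colombo", "tea", "island"},
    {"sri_lanka", "colombo", "tea", "export"},
    {"india", "delhi", "export"},
]


def context_of(term, docs):
    """Return every term that co-occurs with `term` in at least one document."""
    context = set()
    for terms in docs:
        if term in terms:
            context |= terms - {term}
    return context


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sitac_candidates(term, docs, threshold=0.5):
    """Rank terms whose co-occurrence context resembles that of `term`.

    Terms that appear alongside `term` in some document are skipped:
    they are associated with it rather than alternative names for it.
    """
    target = context_of(term, docs)
    vocabulary = set().union(*docs)
    scored = []
    for other in vocabulary - {term}:
        if other in target:
            continue
        score = jaccard(target, context_of(other, docs))
        if score >= threshold:
            scored.append((other, score))
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def expand_query(term, docs, threshold=0.5):
    """Translate a single-term query into the term plus its candidates."""
    return [term] + [t for t, _ in sitac_candidates(term, docs, threshold)]


print(expand_query("sri_lanka", DOCS))  # → ['sri_lanka', 'ceylon']
```

On this toy corpus, "ceylon" and "sri_lanka" share an identical context and never co-occur, so the query is expanded to cover both names, while "india" and "delhi" fall below the threshold.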
DOI
10.1145/1871437.1871730
Montclair State University Digital Commons Citation
Kaluarachchi, Amal; Varde, Aparna S.; Bedathur, Srikanta; Weikum, Gerhard; Peng, Jing; and Feldman, Anna, "Incorporating terminology evolution for query translation in text retrieval with association rules" (2010). Department of Computer Science Faculty Scholarship and Creative Works. 677.
https://digitalcommons.montclair.edu/compusci-facpubs/677