Date of Award

5-2010

Document Type

Thesis

Degree Name

Master of Science (MS)

College/School

College of Science and Mathematics

Department/Program

Computer Science

Thesis Sponsor/Dissertation Chair/Project Chair

Aparna Varde

Committee Member

Anna Feldman

Committee Member

Jing Peng

Abstract

With the exponential growth in archives of time-stamped documents such as newswire articles, blog posts, and other web pages, information retrieval (IR) has become a challenging task. The complexity of this task increases when the archives cover long time spans and their terminology has changed significantly. When users pose queries about historical information over such document collections, the queries need to be translated to account for temporal changes so that accurate responses are returned. For example, a query on Sri Lanka should automatically retrieve documents that use its former name, Ceylon. We call such concepts SITACs, i.e., Semantically Identical Temporally Altering Concepts. To discover SITACs in a given corpus, we propose a methodology that integrates natural language processing, association rule mining, and contextual similarity. The discovered SITACs allow historical queries over text corpora to be addressed effectively. The proposed methodology was evaluated on the Gutenberg corpus, which contains speeches of American presidents from the first speech of Mr. George Washington in 1795 to the speech of Mr. George W. Bush in 2006. Search engines and IR systems can benefit from the techniques presented in this research.

File Format

PDF

Recommended Citation

Kaluarachchi, Amal, "The SITAC Approach for Time-Aware Query Translation in Text Archives" (2010). Theses, Dissertations and Culminating Projects. 894.
https://digitalcommons.montclair.edu/etd/894

Included in

Computer Sciences Commons

Theses, Dissertations and Culminating Projects

The SITAC Approach for Time-Aware Query Translation in Text Archives
