Tracking semantic change

The problem of tracking semantic change when searching document archives has been addressed by, e.g., Berberich et al. (2009), who reformulate a query into terms prevalent in the past by comparing term contexts, captured by co-occurrence statistics, across time. The approach requires a recurrent computation, which can affect efficiency and scalability. Kaluarachchi et al. (2010, 2011) proposed discovering semantically identical concepts (or named entities) that are used at different times by applying association rule mining to events (sentences containing a subject, a verb, objects, and nouns) associated with two distinct entities. Two entities are considered semantically related if their associated events occur multiple times in a document archive. The approach relies on linguistic properties and events, which are themselves subject to change over time. Kanhabua and Nørvåg (2010) tracked named entity changes from anchor texts in Wikipedia and associated each version of a term with a period of validity, using the Wikipedia history as well as the New York Times Annotated Corpus. Unfortunately, the method has limited applicability because link information, such as anchor texts, is not always available in other document archives. In more recent work, Mazeika et al. (2011) extracted named entities from the YAGO ontology and tracked their changing usage patterns using the New York Times Annotated Corpus. As with the work by Kanhabua and Nørvåg (2010), relying on ontological knowledge is expensive and requires human annotators.
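To make the idea of comparing co-occurrence contexts across time concrete, the following is a minimal sketch, not the method of Berberich et al. (2009) itself. It assumes sentence-level co-occurrence counts and cosine similarity as the comparison measure; the function names (`context_vector`, `cosine`, `best_past_term`) and the toy sentences are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def context_vector(term, sentences):
    """Count the words that co-occur with `term` in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        if term in tokens:
            counts.update(t for t in tokens if t != term)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_past_term(query, present, past, candidates):
    """Return the candidate whose past context is closest to the query's present context."""
    query_ctx = context_vector(query, present)
    scores = {c: cosine(query_ctx, context_vector(c, past)) for c in candidates}
    return max(scores, key=scores.get), scores


# Toy, made-up sentences standing in for two time slices of an archive.
present = ["ipod plays portable music", "ipod stores music for travel"]
past = [
    "walkman plays portable music",
    "walkman cassette music for travel",
    "typewriter prints letters on paper",
]

best, scores = best_past_term("ipod", present, past, ["walkman", "typewriter"])
print(best, scores)
```

In this sketch the context vectors must be recomputed for every query and candidate pair, which gives a rough sense of why a recurrent computation of this kind can become costly on large archives.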