Models of Co-occurrence

Author: Melamed I. Dan
Publication date: 01/01/1998

A model of co-occurrence in bitext is a boolean predicate that indicates whether a given pair of word tokens co-occur in corresponding regions of the bitext space. Co-occurrence is a precondition for the possibility that two tokens might be mutual translations. Models of co-occurrence are the glue that binds methods for mapping bitext correspondence with methods for estimating translation models into an integrated system for exploiting parallel texts. Different models of co-occurrence are possible, depending on the kind of bitext map that is available, the language-specific information that is available, and the assumptions made about the nature of translational equivalence. Although most statistical translation models are based on models of co-occurrence, modeling co-occurrence correctly is more difficult than it may at first appear.
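The idea of a co-occurrence model as a boolean predicate over token pairs can be illustrated with a minimal Python sketch. It assumes the simplest kind of bitext map, a list of aligned segment pairs, and treats two tokens as co-occurring when they fall in the same aligned pair. The names `cooccur` and `cooccurrence_counts` and the token representation are hypothetical and are not taken from the paper, which considers other bitext maps and assumptions as well.

```python
from collections import Counter
from typing import List, Tuple

# Assumed representation (not from the paper): a bitext is a list of
# aligned segment pairs, and a token is identified by
# (segment_index, position_in_segment).
Segment = List[str]
AlignedPair = Tuple[Segment, Segment]
TokenId = Tuple[int, int]


def cooccur(src_token: TokenId, tgt_token: TokenId) -> bool:
    """Boolean co-occurrence predicate under the segment-alignment assumption:
    two tokens co-occur iff they lie in the same aligned segment pair."""
    return src_token[0] == tgt_token[0]


def cooccurrence_counts(bitext: List[AlignedPair]) -> Counter:
    """Count, for each (source word, target word) pair, how many token
    pairs satisfy the predicate."""
    src_tokens = [((i, p), w) for i, (src, _) in enumerate(bitext)
                  for p, w in enumerate(src)]
    tgt_tokens = [((i, p), w) for i, (_, tgt) in enumerate(bitext)
                  for p, w in enumerate(tgt)]
    counts: Counter = Counter()
    for s_id, s_word in src_tokens:
        for t_id, t_word in tgt_tokens:
            if cooccur(s_id, t_id):
                counts[(s_word, t_word)] += 1
    return counts


# Toy bitext, invented for illustration only.
bitext = [
    (["the", "house"], ["la", "maison"]),
    (["the", "cat"], ["le", "chat"]),
]
counts = cooccurrence_counts(bitext)
```

In this toy example, "house" co-occurs with both "la" and "maison" but never with "chat". A translation model estimated from such counts would still have to decide which co-occurring pairs are actual translations.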

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Fixed versus Dynamic Co-Occurrence Windows in TextRank Term Weights for Information Retrieval

Authors: Cheng Qikai, Lioma Christina, Lu Wei
Publication date: 06/04/2017

TextRank is a variant of PageRank typically used in graphs that represent documents, and where vertices denote terms and edges denote relations between terms. Quite often the relation between terms is simple term co-occurrence within a fixed window of k terms. The output of TextRank when applied iteratively is a score for each vertex, i.e. a term weight, that can be used for information retrieval (IR) just like conventional term frequency based term weights. So far, when computing TextRank term weights over co-occurrence graphs, the window of term co-occurrence is always fixed. This work departs from this, and considers dynamically adjusted windows of term co-occurrence that follow the document structure on a sentence- and paragraph-level. The resulting TextRank term weights are used in a ranking function that re-ranks 1000 initially returned search results in order to improve the precision of the ranking. Experiments with two IR collections show that adjusting the vicinity of term co-occurrence when computing TextRank term weights can lead to gains in early precision.
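The contrast between a fixed window of k terms and a window that follows sentence boundaries can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the graph is undirected and unweighted, the damping factor of 0.85 and the 50 iterations are conventional choices rather than values from the abstract, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]


def fixed_window_graph(terms: List[str], k: int) -> Graph:
    """Link each term to the terms that follow it within a window of k terms,
    ignoring sentence boundaries."""
    graph: Graph = defaultdict(set)
    for i, t in enumerate(terms):
        for u in terms[i + 1 : i + k]:
            if u != t:
                graph[t].add(u)
                graph[u].add(t)
    return graph


def sentence_window_graph(sentences: List[List[str]]) -> Graph:
    """Link terms only when they occur in the same sentence, so the
    window size follows the document structure."""
    graph: Graph = defaultdict(set)
    for sent in sentences:
        for i, t in enumerate(sent):
            for u in sent[i + 1 :]:
                if u != t:
                    graph[t].add(u)
                    graph[u].add(t)
    return graph


def textrank(graph: Graph, damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Unnormalised PageRank-style iteration over the co-occurrence graph;
    the resulting score per vertex serves as a term weight."""
    nodes = list(graph)
    scores = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        scores = {
            n: (1 - damping)
            + damping * sum(scores[m] / len(graph[m]) for m in graph[n])
            for n in nodes
        }
    return scores


# Toy document, invented for illustration only.
sentences = [["a", "b", "c"], ["d", "e"]]
flat = [t for sent in sentences for t in sent]

fixed = fixed_window_graph(flat, k=2)
dynamic = sentence_window_graph(sentences)
fixed_scores = textrank(fixed)
dynamic_scores = textrank(dynamic)
```

With the fixed window, "c" and "d" become neighbours even though they belong to different sentences; with the sentence-level window they do not, which changes the resulting term weights.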

arXiv.org e-Print Archive

CiteSeerX

Multilingual learning with parameter co-occurrence clustering

Authors: Bane Max, Kirby James, Riggle Jason, Sylak John
Publication date: 02/12/2011

Edinburgh Research Explorer