Models of Co-occurrence
A model of co-occurrence in bitext is a boolean predicate that indicates
whether a given pair of word tokens co-occur in corresponding regions of the
bitext space. Co-occurrence is a precondition for the possibility that two
tokens might be mutual translations. Models of co-occurrence are the glue that
binds methods for mapping bitext correspondence with methods for estimating
translation models into an integrated system for exploiting parallel texts.
Different models of co-occurrence are possible, depending on the kind of bitext
map that is available, the language-specific information that is available, and
the assumptions made about the nature of translational equivalence. Although
most statistical translation models are based on models of co-occurrence,
modeling co-occurrence correctly is more difficult than it may at first appear.
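To illustrate the idea of co-occurrence as a boolean predicate over token pairs, here is a minimal sketch. It assumes the simplest possible bitext map, a one-to-one segment alignment in which source segment i corresponds to target segment i. The token representation and the names `cooccur`, `tokens`, and `cooccurrence_counts` are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import product

# Hypothetical representation: a token is (segment_index, position, word).
# Assumption: the bitext map is a plain segment alignment, so segment i on
# the source side corresponds to segment i on the target side.

def cooccur(src_token, tgt_token):
    """Boolean co-occurrence predicate: True iff both tokens lie in aligned segments."""
    return src_token[0] == tgt_token[0]

def tokens(segments):
    """Flatten tokenised segments into (segment_index, position, word) triples."""
    return [(s, p, w) for s, seg in enumerate(segments) for p, w in enumerate(seg)]

def cooccurrence_counts(src_segments, tgt_segments):
    """Count, per word-type pair, how many token pairs satisfy the predicate."""
    counts = Counter()
    for s, t in product(tokens(src_segments), tokens(tgt_segments)):
        if cooccur(s, t):
            counts[(s[2], t[2])] += 1
    return counts

src = [["the", "house"], ["the", "car"]]
tgt = [["la", "maison"], ["la", "voiture"]]
counts = cooccurrence_counts(src, tgt)
```

Counts of this kind are the usual input to a translation-model estimator. A finer bitext map or different assumptions about translational equivalence would change only the predicate, not the counting loop.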
Co-occurrence Filter
Co-occurrence Filter (CoF) is a boundary preserving filter. It is based on
the Bilateral Filter (BF) but instead of using a Gaussian on the range values
to preserve edges it relies on a co-occurrence matrix. Pixel values that
co-occur frequently in the image (i.e., inside textured regions) will have a
high weight in the co-occurrence matrix. This, in turn, means that such pixel
pairs will be averaged and hence smoothed, regardless of their intensity
differences. On the other hand, pixel values that rarely co-occur (i.e., across
texture boundaries) will have a low weight in the co-occurrence matrix. As a
result, they will not be averaged and the boundary between them will be
preserved. The CoF therefore extends the BF to deal with boundaries, not just
edges. It learns co-occurrences directly from the image. We can achieve various
filtering results by directing it to learn the co-occurrence matrix from a part
of the image, or a different image. We give the definition of the filter,
discuss how to use it with color images, and show several use cases. Comment: accepted to CVPR 201
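The weighting scheme described in this abstract can be sketched roughly as follows. This is a simplified illustration and not the authors' implementation. It assumes a grayscale image stored as nested lists of already-quantised integer values. It uses a square box window in place of any spatial Gaussian, and it normalises pair counts by the frequency of each value. The function names and the `source` parameter are hypothetical.

```python
from collections import Counter

def neighbourhood(img, y, x, radius):
    """Yield in-bounds coordinates in a square window around (y, x), centre included."""
    h, w = len(img), len(img[0])
    for ny in range(max(0, y - radius), min(h, y + radius + 1)):
        for nx in range(max(0, x - radius), min(w, x + radius + 1)):
            yield ny, nx

def cooccurrence_matrix(img, radius=1):
    """Pair counts within the window, normalised by the frequency of each value."""
    value_counts = Counter(v for row in img for v in row)
    pair_counts = Counter()
    for y, row in enumerate(img):
        for x, a in enumerate(row):
            for ny, nx in neighbourhood(img, y, x, radius):
                pair_counts[(a, img[ny][nx])] += 1
    return {(a, b): c / (value_counts[a] * value_counts[b])
            for (a, b), c in pair_counts.items()}

def cooccurrence_filter(img, radius=1, source=None):
    """Weighted average of neighbours, weighted by how often the two values co-occur.

    `source` (hypothetical parameter) lets the matrix be learned from another image.
    """
    m = cooccurrence_matrix(source if source is not None else img, radius)
    out = []
    for y, row in enumerate(img):
        out_row = []
        for x, a in enumerate(row):
            num = den = 0.0
            for ny, nx in neighbourhood(img, y, x, radius):
                b = img[ny][nx]
                weight = m.get((a, b), 0.0)
                num += weight * b
                den += weight
            # Fall back to the input value if no neighbour has a known weight.
            out_row.append(num / den if den else float(a))
        out.append(out_row)
    return out

# Left three columns: a 0/1 checkerboard "texture"; right three columns: constant 9.
image = [[(y + x) % 2 if x < 3 else 9 for x in range(6)] for y in range(4)]
weights = cooccurrence_matrix(image)
filtered = cooccurrence_filter(image)
```

In this toy image the values 0 and 1 co-occur often inside the texture, so `weights[(0, 1)]` is larger than `weights[(1, 9)]`, the weight of a pair that meets only at the texture boundary.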
Fixed versus Dynamic Co-Occurrence Windows in TextRank Term Weights for Information Retrieval
TextRank is a variant of PageRank typically used in graphs that represent
documents, and where vertices denote terms and edges denote relations between
terms. Quite often the relation between terms is simple term co-occurrence
within a fixed window of k terms. The output of TextRank when applied
iteratively is a score for each vertex, i.e. a term weight, that can be used
for information retrieval (IR) just like conventional term frequency based term
weights. So far, when computing TextRank term weights over co-occurrence
graphs, the window of term co-occurrence is always fixed. This work departs
from this and considers dynamically adjusted windows of term co-occurrence
that follow the document structure at the sentence and paragraph level. The
resulting TextRank term weights are used in a ranking function that re-ranks
1000 initially returned search results in order to improve the precision of the
ranking. Experiments with two IR collections show that adjusting the vicinity
of term co-occurrence when computing TextRank term weights can lead to gains in
early precision.
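The contrast between a fixed window and a structure-following window can be sketched as below. This is a minimal illustration under stated assumptions: the graph is undirected and unweighted, the damping factor is 0.85, and the iteration count is fixed at 50. None of these choices is specified in the abstract, and the function names are hypothetical.

```python
from collections import defaultdict

def fixed_window_edges(tokens, k):
    """Link terms that fall within the same window of k consecutive terms."""
    edges = set()
    for i, u in enumerate(tokens):
        for v in tokens[i + 1:i + k]:
            if u != v:
                edges.add(tuple(sorted((u, v))))
    return edges

def sentence_edges(sentences):
    """Dynamic window: link terms that appear in the same sentence."""
    edges = set()
    for sent in sentences:
        for i, u in enumerate(sent):
            for v in sent[i + 1:]:
                if u != v:
                    edges.add(tuple(sorted((u, v))))
    return edges

def textrank(edges, damping=0.85, iterations=50):
    """Iterate the PageRank-style update on an undirected, unweighted term graph."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    scores = {term: 1.0 for term in neighbours}
    for _ in range(iterations):
        scores = {
            term: (1 - damping) + damping * sum(
                scores[n] / len(neighbours[n]) for n in neighbours[term])
            for term in neighbours
        }
    return scores

tokens = "a b c a d".split()
sentences = [["a", "b", "c"], ["a", "d"]]
fixed_scores = textrank(fixed_window_edges(tokens, k=2))
dynamic_scores = textrank(sentence_edges(sentences))
```

Only the edge-building step differs between the two variants. The resulting scores can then be used as term weights in a ranking function.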
Co-occurrence Vectors from Corpora vs. Distance Vectors from Dictionaries
A comparison was made of vectors derived by using ordinary co-occurrence
statistics from large text corpora and of vectors derived by measuring the
inter-word distances in dictionary definitions. The precision of word sense
disambiguation by using co-occurrence vectors from the 1987 Wall Street Journal
(20M total words) was higher than that by using distance vectors from the
Collins English Dictionary (60K head words + 1.6M definition words). However,
other experimental results suggest that distance vectors contain some different
semantic information from co-occurrence vectors. Comment: 6 pages, appeared in the Proc. of COLING94 (pp. 304-309)
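As a rough illustration of what a co-occurrence vector is, the sketch below counts context words within a symmetric window and compares words by cosine similarity. The window size, the toy corpus, and the function names are assumptions made for the example. The paper's actual statistics and its disambiguation procedure are not reproduced here.

```python
import math
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][sent[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

# Toy corpus, purely illustrative.
corpus = [
    ["bank", "money", "loan"],
    ["bank", "river", "water"],
    ["credit", "money", "loan"],
]
vectors = cooccurrence_vectors(corpus)
sim_credit_bank = cosine(vectors["credit"], vectors["bank"])
sim_credit_river = cosine(vectors["credit"], vectors["river"])
```

Here "credit" comes out closer to "bank" than to "river" because it shares the context words "money" and "loan" with "bank".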
Semantic Concept Co-Occurrence Patterns for Image Annotation and Retrieval.
Describing visual image contents by semantic concepts is an effective and straightforward way to facilitate various high-level applications. Inferring semantic concepts from low-level pictorial feature analysis is challenging due to the semantic gap problem, while manually labeling concepts is impractical because of the large number of images in both online and offline collections. In this paper, we present a novel approach that automatically generates intermediate image descriptors by exploiting concept co-occurrence patterns in the pre-labeled training set, which makes it possible to depict complex scene images semantically. Our work is motivated by the fact that multiple concepts that frequently co-occur across images form patterns that can provide contextual cues for individual concept inference. We discover the co-occurrence patterns as hierarchical communities by graph modularity maximization in a network whose nodes and edges represent concepts and co-occurrence relationships, respectively. A random walk process operating on the inferred concept probabilities with the discovered co-occurrence patterns is applied to obtain the refined concept signature representation. Through experiments in automatic image annotation and semantic image retrieval on several challenging datasets, we demonstrate the effectiveness of the proposed concept co-occurrence patterns as well as the concept signature representation in comparison with state-of-the-art approaches.
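The refinement step can be sketched loosely as a random walk with restart over a concept co-occurrence graph. This is only an illustration of the general idea. It omits the modularity-based community discovery entirely, and the restart weight `alpha`, the step count, the toy labels, and the function names are all assumptions rather than details from the paper.

```python
from collections import Counter
from itertools import combinations

def concept_cooccurrence(labels_per_image):
    """Symmetric counts of how often two concepts label the same training image."""
    counts = Counter()
    for labels in labels_per_image:
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts

def refine(probs, counts, alpha=0.5, steps=20):
    """Random walk with restart: mix propagated scores with the initial probabilities."""
    concepts = list(probs)
    out_total = {c: sum(counts[(c, o)] for o in concepts) for c in concepts}
    current = dict(probs)
    for _ in range(steps):
        current = {
            c: alpha * sum(current[o] * counts[(o, c)] / out_total[o]
                           for o in concepts if out_total[o])
               + (1 - alpha) * probs[c]
            for c in concepts
        }
    return current

# Toy training labels and initial per-concept probabilities (illustrative only).
training_labels = [["sky", "sea", "beach"], ["sky", "sea"], ["car", "road"]]
initial = {"sky": 0.9, "sea": 0.1, "beach": 0.1, "car": 0.1, "road": 0.1}
counts = concept_cooccurrence(training_labels)
refined = refine(initial, counts)
```

In this toy case a confident "sky" prediction raises the score of "sea", which often co-occurs with it, while "car" receives no such support.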
