A Machine learning approach to POS tagging
We have applied inductive learning of statistical decision trees
and relaxation labelling to the Natural Language Processing (NLP)
task of morphosyntactic disambiguation (Part Of Speech Tagging).
The learning process is supervised and obtains a language
model oriented to resolve POS ambiguities. This model consists
of a set of statistical decision trees expressing the distribution of
tags and words in some relevant contexts.
The acquired language models are complete enough to be directly
used as sets of POS disambiguation rules, and include more complex
contextual information than simple collections of n-grams usually
used in statistical taggers.
We have implemented a quite simple and fast tagger that has been
tested and evaluated on the Wall Street Journal (WSJ) corpus with
a remarkable accuracy.
However, better results can be obtained by translating the trees
into rules to feed a flexible relaxation labelling based tagger.
In this direction we describe a tagger which is able to use
information of any kind (n-grams, automatically acquired constraints,
linguistically motivated manually written constraints, etc.), and in
particular to incorporate the machine learned decision trees.
At the same time, we address the problem of tagging when only a small
amount of training material is available, which is crucial in any process
of constructing an annotated corpus from scratch. We show that quite
high accuracy can be achieved with our system in this situation.
Postprint (published version)
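As an illustration of how a relaxation labelling tagger can combine constraints, here is a minimal sketch. It is not the authors' implementation: the function name `relax_tags`, the bigram-only constraints, the compatibility values, and the clamping of support to [-0.99, 0.99] are all assumptions made for the example. Each word carries a probability distribution over its candidate tags, and each iteration rescales a tag's probability by the support it receives from the tags of neighbouring words.

```python
def relax_tags(candidates, compat, iterations=10):
    """Toy relaxation labelling over a sentence.

    candidates: list of dicts (tag -> initial probability), one per word.
    compat: dict ((left_tag, right_tag) -> compatibility in [-1, 1]);
            hypothetical values, missing pairs count as 0.
    Returns the most probable tag for each word after the iterations.
    """
    probs = [dict(c) for c in candidates]
    for _ in range(iterations):
        new_probs = []
        for i, dist in enumerate(probs):
            scored = {}
            for tag, p in dist.items():
                support = 0.0
                if i > 0:  # support from the tags of the word on the left
                    support += sum(q * compat.get((lt, tag), 0.0)
                                   for lt, q in probs[i - 1].items())
                if i + 1 < len(probs):  # support from the word on the right
                    support += sum(q * compat.get((tag, rt), 0.0)
                                   for rt, q in probs[i + 1].items())
                # Assumption: clamp so that (1 + support) stays positive.
                support = max(-0.99, min(0.99, support))
                scored[tag] = p * (1.0 + support)
            total = sum(scored.values())
            new_probs.append({t: s / total for t, s in scored.items()})
        probs = new_probs  # synchronous update of all words
    return [max(d, key=d.get) for d in probs]


# "the can": "can" is ambiguous between modal (MD) and noun (NN).
sentence = [{"DT": 1.0}, {"MD": 0.7, "NN": 0.3}]
constraints = {("DT", "NN"): 0.8, ("DT", "MD"): -0.8}
tags = relax_tags(sentence, constraints)
```

In this toy input the lexical prior prefers MD for "can", but the contextual constraints from the preceding determiner shift the distribution towards NN. Constraints derived from decision trees or written by hand could be added to `compat`-like tables in the same way, provided they are expressed as compatibility values.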
Experiments on applying relaxation labeling to map multilingual hierarchies
This paper explores the automatic construction of a multilingual
Lexical Knowledge Base from preexisting lexical resources and
presents a new approach for linking already existing hierarchies. The
relaxation labeling algorithm is used to select, among all the
candidate connections proposed by a bilingual dictionary, the right
connection for each node in the taxonomy.
Postprint (published version)
Geometrical aspects and connections of the energy-temperature fluctuation relation
Recently, we have derived a generalization of the known canonical fluctuation
relation between heat capacity and energy fluctuations, which can account for
the existence of macrostates with negative heat capacities. In this work, we
present a panoramic overview
of direct implications and connections of this fluctuation theorem with other
developments of statistical mechanics, such as the extension of canonical Monte
Carlo methods, the geometric formulations of fluctuation theory and the
relevance of a geometric extension of the Gibbs canonical ensemble that has
been recently proposed in the literature.
Comment: Version accepted for publication in J. Phys. A: Math and The
On the accuracy of pixel relaxation labeling
There are no author-identified significant results in this report
A graph theoretic approach to scene matching
The ability to match two scenes is a fundamental requirement in a variety of computer vision tasks. A graph theoretic approach to inexact scene matching is presented which is useful in dealing with problems due to imperfect image segmentation. A scene is described by a set of graphs, with nodes representing objects and arcs representing relationships between objects. Each node has a set of values representing the relations between pairs of objects, such as angle, adjacency, or distance. With this method of scene representation, the task in scene matching is to match two sets of graphs. Because of segmentation errors, variations in camera angle, illumination, and other conditions, an exact match between the sets of observed and stored graphs is usually not possible. In the developed approach, the problem is represented as an association graph, in which each node represents a possible mapping of an observed region to a stored object, and each arc represents the compatibility of two mappings. Nodes and arcs have weights indicating the merit of a region-object mapping and the degree of compatibility between two mappings. A match between the two graphs corresponds to a clique, or fully connected subgraph, in the association graph. The task is to find the clique that represents the best match. Fuzzy relaxation is used to update the node weights using the contextual information contained in the arcs and neighboring nodes. This simplifies the evaluation of cliques. A method of handling oversegmentation and undersegmentation problems is also presented. The approach is tested with a set of realistic images which exhibit many types of segmentation errors.
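The association-graph idea can be sketched in a few lines. The abstract does not give the update rule, so everything specific here is an assumption: the function names, the use of pairwise distance as the only relation, the compatibility formula 1 / (1 + |difference|), and the max-based support with per-region normalisation. Node (r, o) means "observed region r maps to stored object o"; its weight is reinforced when other regions have well-weighted, compatible mappings.

```python
from itertools import product


def compatibility(d_obs, d_sto):
    # Hypothetical arc weight: 1 when the two distances agree, lower otherwise.
    return 1.0 / (1.0 + abs(d_obs - d_sto))


def fuzzy_relaxation_match(obs_dist, sto_dist, regions, objects, iterations=10):
    """Toy fuzzy relaxation on an association graph (needs >= 2 of each).

    obs_dist / sto_dist: dict frozenset({a, b}) -> distance between a and b.
    Returns, for each region, the stored object with the highest node weight.
    """
    # No unary evidence in this sketch: every mapping starts equally likely.
    weights = {(r, o): 1.0 / len(objects) for r, o in product(regions, objects)}
    for _ in range(iterations):
        updated = {}
        for (r, o), w in weights.items():
            supports = []
            for r2 in regions:
                if r2 == r:
                    continue
                # Best compatible mapping of the other region r2.
                supports.append(max(
                    compatibility(obs_dist[frozenset((r, r2))],
                                  sto_dist[frozenset((o, o2))]) * weights[(r2, o2)]
                    for o2 in objects if o2 != o))
            updated[(r, o)] = w * sum(supports) / len(supports)
        for r in regions:  # renormalise the weights of each region
            total = sum(updated[(r, o)] for o in objects)
            for o in objects:
                weights[(r, o)] = updated[(r, o)] / total
    return {r: max(objects, key=lambda o: weights[(r, o)]) for r in regions}


# Stored objects A, B, C and three observed regions in a different order.
stored = {frozenset("AB"): 1, frozenset("AC"): 5, frozenset("BC"): 4}
observed = {frozenset((0, 1)): 5, frozenset((0, 2)): 4, frozenset((1, 2)): 1}
mapping = fuzzy_relaxation_match(observed, stored, [0, 1, 2], ["A", "B", "C"])
```

After a few iterations the weights concentrate on one mapping per region, so the best clique can be read off directly instead of being searched for. The sketch does not handle oversegmentation or undersegmentation.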
Maximum Weight Matching via Max-Product Belief Propagation
Max-product "belief propagation" is an iterative, local, message-passing
algorithm for finding the maximum a posteriori (MAP) assignment of a discrete
probability distribution specified by a graphical model. Despite the
spectacular success of the algorithm in many application areas such as
iterative decoding, computer vision and combinatorial optimization which
involve graphs with many cycles, theoretical results about both correctness and
convergence of the algorithm are known in few cases (Weiss-Freeman, Wainwright,
Yedidia-Weiss-Freeman, Richardson-Urbanke).
In this paper we consider the problem of finding the Maximum Weight Matching
(MWM) in a weighted complete bipartite graph. We define a probability
distribution on the bipartite graph whose MAP assignment corresponds to the
MWM. We use the max-product algorithm for finding the MAP of this distribution
or equivalently, the MWM on the bipartite graph. Even though the underlying
bipartite graph has many short cycles, we find that surprisingly, the
max-product algorithm always converges to the correct MAP assignment as long as
the MAP assignment is unique. We provide a bound on the number of iterations
required by the algorithm and evaluate the computational cost of the algorithm.
We find that for a graph of size $n$, the computational cost of the algorithm
scales as $O(n^3)$, which is the same as the computational cost of the best
known algorithm. Finally, we establish the precise relation between the
max-product algorithm and the celebrated auction algorithm proposed by
Bertsekas. This suggests possible connections between dual algorithms and the
max-product algorithm for discrete optimization problems.
Comment: In the proceedings of the 2005 IEEE International Symposium on
Information Theor
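The message-passing scheme described in this abstract can be illustrated with a small sketch. This is a simplified log-domain form of max-product for bipartite matching written from memory of the general technique, not code from the paper: the function name, the initialisation of messages to the edge weights, and the fixed iteration count are assumptions. Each left node i and right node j exchange one number per edge, and after enough rounds each left node picks the right node whose incoming message is largest.

```python
def max_product_matching(weights, iterations=50):
    """Toy max-product (log domain) for maximum weight bipartite matching.

    weights: n x n list of lists, weights[i][j] for left node i, right node j,
             with n >= 2. The optimum is assumed to be unique.
    Returns a list m where m[i] is the right node chosen by left node i.
    """
    n = len(weights)
    # a[i][j]: message from left i to right j; b[j][i]: message from right j to left i.
    a = [[weights[i][j] for j in range(n)] for i in range(n)]
    b = [[weights[i][j] for i in range(n)] for j in range(n)]
    for _ in range(iterations):
        # Each message is the edge weight minus the best competing offer.
        new_a = [[weights[i][j] - max(b[l][i] for l in range(n) if l != j)
                  for j in range(n)] for i in range(n)]
        new_b = [[weights[i][j] - max(a[l][j] for l in range(n) if l != i)
                  for i in range(n)] for j in range(n)]
        a, b = new_a, new_b  # synchronous update
    # Each left node selects the right node with the largest incoming message.
    return [max(range(n), key=lambda j: b[j][i]) for i in range(n)]


# Greedy choice would take the 4 first; the optimum is the anti-diagonal (3 + 3).
small = max_product_matching([[4, 3], [3, 1]])
diag = max_product_matching([[5, 1, 1], [1, 5, 1], [1, 1, 5]])
```

Each round touches every edge once and computes a maximum over n - 1 neighbours, which is where the per-iteration cost comes from in a naive implementation like this one. The update "edge weight minus the best competing offer" also hints at the resemblance to price updates in auction-style methods mentioned above.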