906 research outputs found
Resource Constrained Structured Prediction
We study the problem of structured prediction under test-time budget
constraints. We propose a novel approach applicable to a wide range of
structured prediction problems in computer vision and natural language
processing. Our approach seeks to adaptively generate computationally costly
features during test-time in order to reduce the computational cost of
prediction while maintaining prediction performance. We show that training the
adaptive feature generation system can be reduced to a series of structured
learning problems, resulting in efficient training using existing structured
learning algorithms. This framework provides theoretical justification for
several existing heuristic approaches found in the literature. We evaluate our
proposed adaptive system on two structured prediction tasks, optical character
recognition (OCR) and dependency parsing, and show strong performance in
reducing feature costs without degrading accuracy.
Ontologies on the semantic web
As an information technology, the World Wide Web has enjoyed spectacular success. In just ten years it has transformed the way information is produced, stored, and shared in arenas as diverse as shopping, family photo albums, and high-level academic research. The "Semantic Web" was touted by its developers as equally revolutionary but has not yet achieved anything like the Web's exponential uptake. This 17,000-word survey article explores why this might be so, from a perspective that bridges both philosophy and IT.
Having Your Cake and Eating It Too: Autonomy and Interaction in a Model of Sentence Processing
Is the human language understander a collection of modular processes
operating with relative autonomy, or is it a single integrated process? This
ongoing debate has polarized the language processing community, with two
fundamentally different types of model posited, and with each camp concluding
that the other is wrong. One camp puts forth a model with separate processors
and distinct knowledge sources to explain one body of data, and the other
proposes a model with a single processor and a homogeneous, monolithic
knowledge source to explain the other body of data. In this paper we argue that
a hybrid approach which combines a unified processor with separate knowledge
sources provides an explanation of both bodies of data, and we demonstrate the
feasibility of this approach with the computational model called COMPERE. We
believe that this approach brings the language processing community
significantly closer to offering human-like language processing systems.
Comment: 7 pages, uses aaai.sty macr
Discovering latent structures in syntax trees and mixed-type data
Gibbs sampling is a widely applied algorithm for estimating parameters in statistical models. This thesis uses Gibbs sampling to solve practical problems, especially in natural language processing and mixed-type data. It includes three independent studies. The first study presents a Bayesian model for learning latent annotations. The technique is capable of parsing sentences in a wide variety of languages, producing results that are on par with or surpass previous approaches in accuracy, and shows promising potential for parsing low-resource languages. The second study presents a method that uses Gibbs sampling to automatically complete annotations from partially annotated sentence data. The algorithm significantly reduces the time required to annotate sentences for natural language processing, without a significant drop in annotation accuracy. The last study proposes a novel factor model for uncovering latent factors and exploring covariation among multiple outcomes of mixed types, including binary, count, and continuous data. Gibbs sampling is used to estimate model parameters. The algorithm successfully discovers correlation structures of mixed-type data in both simulated and real-world data.
Operations Research and Industrial Engineerin
Three Algorithms for Competence-Oriented Anaphor Resolution
In the last decade, much effort went into the design of robust third-person pronominal anaphor resolution algorithms. Typical approaches are reported to achieve an accuracy of 60-85%. Recent research addresses the question of how to deal with the remaining difficult-to-resolve anaphors. Lappin (2004) proposes a sequenced model of anaphor resolution according to which a cascade of processing modules employing knowledge and inferencing techniques of increasing complexity should be applied. The individual modules should only deal with and, hence, recognize the subset of anaphors for which they are competent. It will be shown that the problem of focusing on the competence cases is equivalent to the problem of giving precision precedence over recall. Three systems for high-precision, robust, knowledge-poor anaphor resolution will be designed and compared: a ruleset-based approach, a salience threshold approach, and a machine-learning-based approach. According to corpus-based evaluation, there is no unique best approach. Which approach scores highest depends upon the type of pronominal anaphor as well as upon the text genre.
Carroll's Autonomous Induction Theory: Combining Views from UG and Information Processing Theories
Without other mechanisms such as induction and parsers, UG-based approaches to linguistic cognition seem to fail to explain the logical problem of language acquisition. Hence, a property theory has to be adopted to combine UG views with other cognitive mechanisms like information processing and restructuring (Ellis, 2008). Pienemann's (1998, 2003) Processability Theory, Levelt's (1989) psycholinguistic theory of speech production, Jackendoff's (1987, 1997, 2002) MOGUL, and Carroll's (2001, 2002) Autonomous Induction Theory (AIT) are among the models which try to add new views to the UG-based approaches. Although it has drawn a number of criticisms and is highly abstract, AIT, with its major premises and conceptions related to the role of induction, attention, input, input processing, feedback, learning, and UG, seems able to explain some of the UG enigma in second language acquisition.