Search CORE

677 research outputs found

Semi-supervised SRL system with Bayesian inference

Author: A. Björkelund
A. Haghighi
D. Das
D. Dowty
D. Jain
L. Màrquez
M. Palmer
M. Surdeanu
N.N. Pise
S.S. Pradhan
Publication venue: HAL CCSD
Publication date: 01/01/2014

We propose a new approach to semi-supervised training of Semantic Role Labeling models with a very small amount of initial labeled data. The proposed approach combines supervised and unsupervised training in a novel way, by forcing the supervised classifier to over-generate potential semantic candidates, and then letting unsupervised inference choose the best ones. Hence, the supervised classifier can be trained on a very small corpus and with coarse-grained features, because its precision does not need to be high: its role is mainly to constrain Bayesian inference to explore only a limited part of the full search space. This approach is evaluated on French and English. In both cases, it achieves very good performance and outperforms a strong supervised baseline when only a small number of annotated sentences is available, even without using any previously trained syntactic parser

Crossref

INRIA a CCSD electronic archive server

Transforming Graph Representations for Statistical Relational Learning

Author: Aha David W.
McDowell Luke K.
Neville Jennifer
Rossi Ryan A.
Publication date: 01/01/2012

Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed

arXiv.org e-Print Archive

CiteSeerX

Minimal supervision for language learning: bootstrapping global patterns from local knowledge

Author: Connor Michael
Publication date: 01/12/2011

A fundamental step in sentence comprehension involves assigning semantic roles to sentence constituents. To accomplish this, the listener must parse the sentence, find constituents that are candidate arguments, and assign semantic roles to those constituents. Each step depends on prior lexical and syntactic knowledge. Where do children begin in solving this problem when learning their first languages? To experiment with different representations that children may use to begin understanding language, we have built a computational model for this early point in language acquisition. This system, BabySRL, learns from transcriptions of natural child-directed speech and makes use of psycholinguistically plausible background knowledge and realistically noisy semantic feedback to begin to classify sentences at the level of "who does what to whom." Starting with simple, psycholinguistically motivated representations of sentence structure, the BabySRL is able to learn from full semantic feedback, as well as from a supervision signal derived from partial semantic background knowledge. In addition, we combine the BabySRL with an unsupervised Hidden Markov Model part-of-speech tagger, linking clusters with syntactic categories using background noun knowledge so that they can be used to parse input for the SRL system. The results show that the proposed shallow representations of sentence structure are robust to reductions in parsing accuracy, and that the contribution of alternative representations of sentence structure to successful semantic role labeling varies with the integrity of the parsing and argument-identification stages. Finally, we enable the BabySRL to improve both an intermediate syntactic representation and its final semantic role classification.
Using this system we show that it is possible for a simple learner in a plausible (noisy) setup to begin comprehending simple semantics when initialized with a small amount of concrete noun knowledge and some simple syntax-semantics mapping biases, before acquiring any specific verb knowledge

Illinois Digital Environment for Access to Learning and Scholarship Repository

05051 Abstracts Collection -- Probabilistic, Logical and Relational Learning - Towards a Synthesis

Author: De Raedt Luc
Dietterich Tom
Getoor Lise
Muggleton Stephen H.
Publication venue: Dagstuhl Seminar Proceedings. 05051 - Probabilistic, Logical and Relational Learning - Towards a Synthesis
Publication date: 01/01/2006

From 30.01.05 to 04.02.05, the Dagstuhl Seminar 05051 "Probabilistic, Logical and Relational Learning - Towards a Synthesis" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

Dagstuhl Research Online Publication Server

Emergent limits of an indirect measurement from phase transitions of inference

Author: Nagata Kenji
Okada Masato
Tokuda Satoru
Publication date: 05/01/2020

Measurements are inseparable from inference, where the estimation of signals of interest from other observations is called an indirect measurement. While a variety of measurement limits have been defined by the physical constraint on each setup, the fundamental limit of an indirect measurement is essentially the limit of inference. Here, we propose the concept of statistical limits on indirect measurement: the bounds of distinction between signals and noise and between a signal and another signal. By developing the asymptotic theory of Bayesian regression, we investigate the phenomenology of a typical indirect measurement and demonstrate the existence of these limits. Based on the connection between inference and statistical physics, we also provide a unified interpretation in which these limits emerge from phase transitions of inference. Our results could pave the way for novel experimental design, enabling assessment of the required quality of observations according to the assumed ground truth before the indirect measurement in question is actually performed

arXiv.org e-Print Archive


Learning to Extract Action Descriptions from Narrative Text

Author: Cavazza Marc
Do Quynh Ngoc Thi
Ludwig Oswaldo
Moens Marie-Francine
Smith Cameron
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 25/01/2017

This paper focuses on the mapping of natural language sentences in written stories to a structured knowledge representation. This process yields an exponential explosion of instance combinations, since each sentence may contain a set of ambiguous terms, each one giving rise to a set of instance candidates. The selection of the best combination of instances is a structured classification problem that yields a highly demanding combinatorial optimization problem which, in this paper, is approached by a novel and efficient formulation of a genetic algorithm that is able to exploit the conditional independence among variables while improving the parallel scalability. The automatic rating of the resulting set of instance combinations, i.e. possible text interpretations, demands an exhaustive exploitation of the state-of-the-art resources in natural language processing to feed the system with pieces of evidence to be fused by the proposed framework. In this sense, a mapping framework able to reason with uncertainty and to integrate supervision and evidence from external sources was adopted. To improve the generalization capacity while learning from a limited amount of annotated data, a new constrained learning algorithm for Bayesian networks is introduced. This algorithm bounds the search space through a set of constraints which encode information on mutually exclusive values. The mapping of natural language utterances to a structured knowledge representation is important in the context of game construction, e.g. in an RPG setting, as it alleviates the manual knowledge acquisition bottleneck. The effectiveness of the proposed algorithm is evaluated on a set of three stories, yielding nine experiments. Our mapping framework yields performance gains in predicting the most likely structured representations of sentences when compared with a baseline algorithm

Greenwich Academic Literature Archive

Kent Academic Repository