15 research outputs found

    Identifying high-impact sub-structures for convolution kernels in document-level sentiment classification

    Get PDF
    Convolution kernels support the modeling of complex syntactic information in machine-learning tasks. However, such models are highly sensitive to the type and size of the syntactic structures used. It is therefore an important challenge to automatically identify high-impact sub-structures relevant to a given task. In this paper we present a systematic study investigating (combinations of) sequence and convolution kernels using different types of sub-structures in document-level sentiment classification. We show that minimal sub-structures extracted from constituency and dependency trees, guided by a polarity lexicon, yield a 1.45-point absolute improvement in accuracy over a bag-of-words classifier on a widely used sentiment corpus.
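    The lexicon-guided extraction step described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual method: the tree encoding (`(label, children)` tuples), the toy lexicon, and the function name `minimal_fragments` are all assumptions introduced here for illustration.

    ```python
    # Toy polarity lexicon (an assumption; the paper uses a real sentiment lexicon).
    POLARITY_LEXICON = {"great", "awful", "boring"}

    def minimal_fragments(tree, lexicon=POLARITY_LEXICON):
        """Collect minimal head-modifier fragments around polarity words.

        A tree node is a (label, children) tuple. A fragment is kept when
        the head itself, or one of its direct modifiers, is a polarity word.
        """
        label, children = tree
        frags = []
        child_labels = [c[0] for c in children]
        if label in lexicon or any(l in lexicon for l in child_labels):
            frags.append((label, tuple(child_labels)))
        for ch in children:
            frags.extend(minimal_fragments(ch, lexicon))
        return frags
    ```

    On a toy dependency tree such as `("movie", (("great", ()), ("the", ())))`, this keeps the head-modifier fragment around "great" and discards sentiment-neutral parts of the tree.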

    Distributional lexical semantics: toward uniform representation paradigms for advanced acquisition and processing tasks

    Get PDF
    The distributional hypothesis states that words with similar distributional properties have similar semantic properties (Harris 1968). This perspective on word semantics was discussed early on in linguistics (Firth 1957; Harris 1968), and then successfully applied to Information Retrieval (Salton, Wong and Yang 1975). In Information Retrieval, distributional notions (e.g. document frequency and word co-occurrence counts) have proved a key factor of success, as opposed to early logic-based approaches to relevance modeling (van Rijsbergen 1986; Chiaramella and Chevallet 1992; van Rijsbergen and Lalmas 1996).

    Cross-language frame semantics transfer in bilingual corpora

    Get PDF
    Abstract. Recent work on the transfer of semantic information across languages has been applied to the development of resources annotated with Frame information for different non-English European languages. These works are based on the assumption that parallel corpora annotated for English can be used to transfer the semantic information to the other target languages. In this paper, a robust method based on a statistical machine translation step augmented with simple rule-based post-processing is presented. It alleviates problems related to preprocessing errors and the complex optimization required by syntax-dependent models of the cross-lingual mapping. Different alignment strategies are investigated here against the Europarl corpus. Results suggest that the quality of the derived annotations is surprisingly good and well suited for training semantic role labeling systems.

    Because Syntax does Matter: Improving Predicate-Argument Structures Parsing Using Syntactic Features

    Get PDF
    Parsing full-fledged predicate-argument structures in a deep syntax framework requires graphs to be predicted. Using the DeepBank (Flickinger et al., 2012) and the Predicate-Argument Structure treebank (Miyao and Tsujii, 2005) as a test field, we show how transition-based parsers, extended to handle connected graphs, benefit from the use of topologically different syntactic features such as dependencies, tree fragments, spines or syntactic paths, which bring much-needed context to the parsing models, improving notably over long-distance dependencies and elided coordinate structures. By confirming this positive impact on an accurate 2nd-order graph-based parser (Martins and Almeida, 2014), we establish a new state-of-the-art on these data sets.

    Structured lexical similarity via convolution kernels on dependency trees

    Get PDF
    A central topic in natural language processing is the design of lexical and syntactic features suitable for the target application. In this paper, we study convolution dependency tree kernels for automatic engineering of syntactic and semantic patterns exploiting lexical similarities. We define efficient and powerful kernels for measuring the similarity between dependency structures whose lexical nodes have partly or completely different surface forms. Experiments with such kernels for question classification show unprecedented results, e.g. a 41% error reduction over the former state of the art. Additionally, semantic role classification confirms the benefit of semantic smoothing for dependency kernels.
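    The convolution tree kernels studied above count shared substructures between two trees. A minimal Collins-Duffy-style sketch of that idea is below; the tuple-based tree encoding, the helper names (`tree_kernel`, `_nodes`, `_c`), and the simplified matching condition are assumptions for illustration, not the paper's implementation (which additionally smooths over lexical similarity).

    ```python
    def tree_kernel(t1, t2, lam=0.5):
        """Sum decayed counts of common subtrees over all node pairs.

        Trees are (label, (child, ...)) tuples; lam is the decay factor
        that down-weights larger matching fragments.
        """
        return sum(_c(a, b, lam) for a in _nodes(t1) for b in _nodes(t2))

    def _nodes(t):
        # Flatten the tree into a list of its nodes (pre-order).
        label, children = t
        out = [t]
        for ch in children:
            out.extend(_nodes(ch))
        return out

    def _c(n1, n2, lam):
        # Nodes match only when labels and child-label sequences agree
        # (the "same production" condition of the original kernel).
        if n1[0] != n2[0]:
            return 0.0
        c1, c2 = n1[1], n2[1]
        if [c[0] for c in c1] != [c[0] for c in c2]:
            return 0.0
        if not c1:          # matching leaves
            return lam
        prod = lam
        for a, b in zip(c1, c2):
            prod *= 1.0 + _c(a, b, lam)
        return prod
    ```

    Comparing a tree with itself yields the sum of its (decayed) subtree counts, while trees differing in a single lexical node lose every fragment that node participates in; lexical smoothing, as proposed in the paper, replaces the hard label-equality test with a graded similarity.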

    Enhanced discriminative models with tree kernels and unsupervised training for entity detection

    Get PDF
    This work explores two approaches to improving the discriminative models that are commonly used nowadays for entity detection: tree kernels and unsupervised training. Feature-rich classifiers have been widely adopted by the Natural Language Processing (NLP) community because of their powerful modeling capacity and their support for correlated features, which allows separating the expert task of designing features from the core learning method. The first proposed approach consists in leveraging fast and efficient linear models with unsupervised training, thanks to a recently proposed approximation of the classifier risk, an appealing method that provably converges towards the minimum risk without any labeled corpus. In the second proposed approach, tree kernels are used with support vector machines to exploit dependency structures for entity detection, which relieves designers of the burden of carefully designing rich syntactic features by hand. We study both approaches on the same task and corpus and show that they offer interesting alternatives to supervised learning for entity recognition.

    Discriminative Reranking for Spoken Language Understanding

    Full text link

    Kernel engineering on parse trees

    Get PDF
    Ph.D. (Doctor of Philosophy)