Identifying high-impact sub-structures for convolution kernels in document-level sentiment classification
Convolution kernels support the modeling of complex syntactic information in machine-learning tasks. However, such models are highly sensitive to the type and size of the syntactic structure used. It is therefore an important challenge to automatically identify high-impact sub-structures relevant to a given task. In this paper we present a systematic study investigating (combinations of) sequence and convolution kernels using different types of sub-structures in document-level sentiment classification. We show that minimal sub-structures extracted from constituency and dependency trees, guided by a polarity lexicon, yield a 1.45-point absolute improvement in accuracy over a bag-of-words classifier on a widely used sentiment corpus.
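The lexicon-guided idea above can be sketched in a few lines. This is a toy illustration, not the paper's exact method: the tree representation, the pruning rule, and the `POLARITY` lexicon are all hypothetical. It keeps only the minimal sub-structure of a parse tree that spans words found in a polarity lexicon, which could then be fed to a tree kernel instead of the full tree.

```python
# Toy sketch of lexicon-guided sub-structure extraction (hypothetical
# details, not the paper's actual algorithm). Trees are (label, children)
# tuples with children as a tuple of subtrees.

POLARITY = {"great", "awful"}  # toy polarity lexicon (assumption)

def prune(tree):
    """Keep a node iff its subtree contains a polarity-lexicon word."""
    label, children = tree
    kept = [p for c in children if (p := prune(c)) is not None]
    if kept or label.lower() in POLARITY:
        return (label, tuple(kept))
    return None

# "the plot was great" as a toy constituency tree
t = ("S", (("NP", (("the", ()), ("plot", ()))),
           ("VP", (("was", ()), ("great", ())))))
print(prune(t))  # -> ('S', (('VP', (('great', ()),)),))
```

The pruned tree retains only the spine from the root down to the polarity word, which is one way to read "minimal sub-structures guided by a polarity lexicon".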
Convolution Kernels for Subjectivity Detection
Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011).
Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa.
NEALT Proceedings Series, Vol. 11 (2011), 254-261.
© 2011 The editors and contributors.
Published by the Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt
Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/16955
Root-Weighted Tree Automata and their Applications to Tree Kernels
In this paper, we define a new kind of weighted tree automata where the weights are only supported by final states. We show that these automata are sequentializable and we study their closures under classical regular and algebraic operations. We then use these automata to compute the subtree kernel of two finite tree languages in an efficient way. Finally, we present some perspectives involving root-weighted tree automata.
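For context, the quantity the automata compute efficiently is the classic subtree kernel: the number of pairs of identical complete subtrees shared by two trees. A minimal brute-force sketch (not the paper's automaton-based construction; the tuple tree representation is an assumption):

```python
# Naive subtree kernel: K(t1, t2) = number of pairs of identical complete
# subtrees. Trees are (label, children) tuples; each subtree is reduced to
# a canonical hashable form and counted.
from collections import Counter

def collect(tree, bag):
    """Add the canonical form of every complete subtree of `tree` to `bag`."""
    label, children = tree
    canon = (label, tuple(collect(c, bag) for c in children))
    bag[canon] += 1
    return canon

def subtree_kernel(t1, t2):
    b1, b2 = Counter(), Counter()
    collect(t1, b1)
    collect(t2, b2)
    return sum(b1[s] * b2[s] for s in b1 if s in b2)

# Example: S(NP(a), VP(b)) vs S(NP(a), VP(c)) share the subtrees a and NP(a)
t1 = ("S", (("NP", (("a", ()),)), ("VP", (("b", ()),))))
t2 = ("S", (("NP", (("a", ()),)), ("VP", (("c", ()),))))
print(subtree_kernel(t1, t2))  # -> 2
```

This brute-force version is quadratic in the number of subtrees; the point of the weighted-automata construction is precisely to compute the same kernel over whole (possibly large or infinite) tree languages more efficiently.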
Handling Tree-Structured Values in RapidMiner
Attribute value types play an important role in almost every data-mining task. Most learners, for instance, are restricted to particular value types and can only be used after special forms of preprocessing. RapidMiner most commonly distinguishes between nominal and numerical values, which are well known to every RapidMiner user. Although these cover a large fraction of the attribute types present in today's data-mining tasks, nominal and numerical attribute values are not sufficient for every type of feature. In this work we focus on attribute values containing a tree structure. We present their handling and, in particular, the possibilities of using tree-structured data for modelling. Additionally, we introduce particular tasks which offer tree-structured data and might benefit from using those structures for modelling. All methods presented in this paper are contained in the Information Extraction Plugin for RapidMiner.
Exploiting Entity BIO Tag Embeddings and Multi-task Learning for Relation Extraction with Imbalanced Data
In practical scenarios, relation extraction needs to first identify entity pairs that have a relation and then assign a correct relation class. However, the number of non-relation entity pairs in context (negative instances) usually far exceeds that of the others (positive instances), which negatively affects a model's performance. To mitigate this problem, we propose a multi-task architecture which jointly trains a model to perform relation identification with cross-entropy loss and relation classification with ranking loss. Meanwhile, we observe that a sentence may have multiple entities and relation mentions, and the patterns in which the entities appear in a sentence may contain useful semantic information that can be utilized to distinguish between positive and negative instances. Thus we further incorporate the embeddings of character-wise/word-wise BIO tags from the named entity recognition task into character/word embeddings to enrich the input representation. Experiment results show that our proposed approach can significantly improve the performance of a baseline model, with a more than 10% absolute increase in F1-score, and outperform the state-of-the-art models on the ACE 2005 Chinese and English corpora. Moreover, BIO tag embeddings are particularly effective and can be used to improve other models as well.
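The input-enrichment idea is easy to sketch. The following is a minimal illustration, not the paper's model: the vocabularies, embedding dimensions, and the `enrich` helper are all hypothetical. Each token's word embedding is concatenated with an embedding of its BIO tag from NER, so downstream layers see where entities begin and end.

```python
# Sketch of BIO-tag-enriched input representations (hypothetical vocab/dims).
import numpy as np

rng = np.random.default_rng(0)
word_vocab = {"John": 0, "works": 1, "at": 2, "Google": 3}
bio_vocab = {"O": 0, "B-PER": 1, "I-PER": 2, "B-ORG": 3, "I-ORG": 4}

word_emb = rng.normal(size=(len(word_vocab), 50))  # 50-d word embeddings
bio_emb = rng.normal(size=(len(bio_vocab), 10))    # 10-d BIO tag embeddings

def enrich(tokens, tags):
    """Concatenate word and BIO-tag embeddings per token: shape (n, 60)."""
    w = word_emb[[word_vocab[t] for t in tokens]]
    b = bio_emb[[bio_vocab[t] for t in tags]]
    return np.concatenate([w, b], axis=1)

x = enrich(["John", "works", "at", "Google"], ["B-PER", "O", "O", "B-ORG"])
print(x.shape)  # -> (4, 60)
```

In the paper this enrichment is applied at both the character and word level; the sketch shows the word-level case only.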