48,772 research outputs found

Efficient Multi-Template Learning for Structured Prediction

Author: Mao Qi
Tsang Ivor W.
Publication date: 01/01/2013

Conditional random fields (CRFs) and Structural Support Vector Machines (Structural SVMs) are two state-of-the-art methods for structured prediction that capture the interdependencies among output variables. The success of these methods is attributed to the fact that their discriminative models are able to account for overlapping features on the whole input observations. These features are usually generated by applying a given set of templates to labeled data, but improper templates may lead to degraded performance. To alleviate this issue, in this paper, we propose a novel multiple template learning paradigm to learn structured prediction and the importance of each template simultaneously, so that hundreds of arbitrary templates could be added into the learning model without caution. This paradigm can be formulated as a special multiple kernel learning problem with an exponential number of constraints. We then introduce an efficient cutting plane algorithm to solve this problem in the primal, and present its convergence. We also evaluate the proposed learning paradigm on two widely studied structured prediction tasks, i.e., sequence labeling and dependency parsing. Extensive experimental results show that the proposed method outperforms CRFs and Structural SVMs due to exploiting the importance of each template. Our complexity analysis and empirical results also show that our proposed method is more efficient than OnlineMKL on very sparse and high-dimensional data. We further extend this paradigm for structured prediction using generalized $p$-block norm regularization with $p>1$, and experiments show competitive performances when $p \in [1,2)$.

arXiv.org e-Print Archive

OPUS - University of Technology Sydney

Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology

Author: Balahur A.
Bautin M.
Dryer M. S.
Esuli A.
Güngördü Z.
Krizhevsky A.
Lang P.
Lee J. H.
McCarthy E. D.
Mesquita B.
Mihalcea R.
Mikolov T.
Plutchik R.
Schmid H.
Vessel E. A.
You Q.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/08/2015

Every culture and language is unique. Our work expressly focuses on the uniqueness of culture and language in relation to human affect, specifically sentiment and emotion semantics, and how they manifest in social multimedia. We develop sets of sentiment- and emotion-polarized visual concepts by adapting semantic structures called adjective-noun pairs, originally introduced by Borth et al. (2013), but in a multilingual context. We propose a new language-dependent method for automatic discovery of these adjective-noun constructs. We show how this pipeline can be applied on a social multimedia platform for the creation of a large-scale multilingual visual sentiment concept ontology (MVSO). Unlike the flat structure in Borth et al. (2013), our unified ontology is organized hierarchically by multilingual clusters of visually detectable nouns and subclusters of emotionally biased versions of these nouns. In addition, we present an image-based prediction task to show how generalizable language-specific models are in a multilingual context. A new, publicly available dataset of >15.6K sentiment-biased visual concepts across 12 languages with language-specific detector banks, >7.36M images and their metadata is also released. Comment: 11 pages, to appear at ACM MM'1

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Establishing a New State-of-the-Art for French Named Entity Recognition

Author: Dupont Yoann
Muller Benjamin
Romary Laurent
Sagot Benoît
Suárez Pedro Javier Ortiz
Publication date: 11/05/2020

The French TreeBank developed at the University Paris 7 is the main source of morphosyntactic and syntactic annotations for French. However, it does not include explicit information related to named entities, which are among the most useful pieces of information for several natural language processing tasks and applications. Moreover, no large-scale French corpus with named entity annotations contains referential information, which complements the type and the span of each mention with an indication of the entity it refers to. We have manually annotated the French TreeBank with such information, after an automatic pre-annotation step. We sketch the underlying annotation guidelines and provide a few figures about the resulting annotations.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes