2,045 research outputs found

    Universal, Unsupervised (Rule-Based), Uncovered Sentiment Analysis

    We present a novel unsupervised approach for multilingual sentiment analysis driven by compositional syntax-based rules. On the one hand, we exploit some of the main advantages of unsupervised algorithms: (1) the interpretability of their output, in contrast with most supervised models, which behave as black boxes, and (2) their robustness across different corpora and domains. On the other hand, by introducing the concept of compositional operations and exploiting syntactic information in the form of universal dependencies, we tackle one of their main drawbacks: their rigidity on data that are structured differently depending on the language concerned. Experiments show an improvement both over existing unsupervised methods and over state-of-the-art supervised models when the latter are evaluated outside their corpus of origin. Experiments also show how the same compositional operations can be shared across languages. The system is available at http://www.grupolys.org/software/UUUSA/
    Comment: 19 pages, 5 tables, 6 figures. This is the authors' version of a work that was accepted for publication in Knowledge-Based Systems.
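    The idea of a compositional operation over a dependency tree can be illustrated with a small sketch. Everything below is an assumption made for illustration: the toy lexicon, the weights, the `Node` class, and the choice of `advmod` as the relation that triggers intensification and negation. It is not the UUUSA implementation, only one way such rules might look.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Toy resources: illustrative assumptions, not values from the paper.
LEXICON = {"good": 2.0, "bad": -2.0, "great": 3.0}
INTENSIFIERS = {"very": 0.25, "slightly": -0.5}
NEGATORS = {"not", "never"}
NEGATION_SHIFT = 4.0  # hypothetical amount by which negation shifts a score


@dataclass
class Node:
    """A word in a dependency tree with its relation to its head."""
    form: str
    deprel: str = "root"
    children: list[Node] = field(default_factory=list)


def score(node: Node) -> float:
    """Compute a semantic orientation bottom-up over a dependency subtree."""
    total = LEXICON.get(node.form.lower(), 0.0)
    boost = 0.0
    negated = False
    for child in node.children:
        word = child.form.lower()
        if child.deprel == "advmod" and word in INTENSIFIERS:
            boost += INTENSIFIERS[word]      # intensification operation
        elif child.deprel == "advmod" and word in NEGATORS:
            negated = True                   # negation operation
        else:
            total += score(child)            # default: add the child's score
    total *= 1.0 + boost
    if negated:
        # Shift the score toward the opposite polarity instead of flipping it.
        if total > 0:
            total -= NEGATION_SHIFT
        elif total < 0:
            total += NEGATION_SHIFT
    return total


not_very_good = Node("good", children=[Node("not", "advmod"), Node("very", "advmod")])
print(score(not_very_good))
```

    Because the rules only look at dependency relations and lexicon entries, the same function could in principle be reused for another language by swapping the lexicon, which is the kind of sharing the abstract describes.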

    One model, two languages: training bilingual parsers with harmonized treebanks

    We introduce an approach to train lexicalized parsers using bilingual corpora obtained by merging harmonized treebanks of different languages, producing parsers that can analyze sentences in either of the learned languages, or even sentences that mix both. We test the approach on the Universal Dependency Treebanks, training with MaltParser and MaltOptimizer. The results show that these bilingual parsers are more than competitive: most combinations preserve accuracy, and some even achieve significant improvements over the corresponding monolingual parsers. Preliminary experiments also show the approach to be promising on texts with code-switching and when more languages are added.
    Comment: 7 pages, 4 tables, 1 figure
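    Merging harmonized treebanks can be sketched as plain concatenation, since both sides are assumed to share one annotation scheme. The sketch below assumes a CoNLL-style format with 10 tab-separated columns, the dependency label in the eighth column, and blank lines between sentences. The function names are hypothetical, and the label comparison is an added sanity check rather than a step described in the abstract.

```python
def read_sentences(conll_text: str) -> list[str]:
    """Split CoNLL-style text into sentence blocks separated by blank lines."""
    blocks = [block.strip() for block in conll_text.strip().split("\n\n")]
    return [block for block in blocks if block]


def dependency_labels(conll_text: str) -> set[str]:
    """Collect the dependency labels (8th column) used in a treebank."""
    labels = set()
    for line in conll_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        columns = line.split("\t")
        # Keep ordinary word lines only (plain integer IDs, 10 columns).
        if len(columns) == 10 and columns[0].isdigit():
            labels.add(columns[7])
    return labels


def label_mismatch(treebank_a: str, treebank_b: str) -> tuple[set[str], set[str]]:
    """Return the labels found only in the first and only in the second treebank."""
    labels_a = dependency_labels(treebank_a)
    labels_b = dependency_labels(treebank_b)
    return labels_a - labels_b, labels_b - labels_a


def merge_treebanks(*treebanks: str) -> str:
    """Concatenate the sentences of several harmonized treebanks into one corpus."""
    sentences = [s for treebank in treebanks for s in read_sentences(treebank)]
    return "\n\n".join(sentences) + "\n\n"
```

    The merged text could then be passed to a parser trainer as a single training file. Labels reported by `label_mismatch` are worth inspecting first, because they may be a sign that the two treebanks are not fully harmonized.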

    Towards Syntactic Iberian Polarity Classification

    Lexicon-based methods that use syntactic rules for polarity classification rely on parsers that depend on the language and on treebank guidelines. The rules therefore inherit these dependencies and require adaptation, especially in multilingual scenarios. We tackle this challenge in the context of the Iberian Peninsula, releasing the first symbolic syntax-based Iberian system with rules shared across five official languages: Basque, Catalan, Galician, Portuguese and Spanish. The model is made available.
    Comment: 7 pages, 5 tables. Contribution to the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA-2017) at EMNLP 2017

    How important is syntactic parsing accuracy? An empirical evaluation on rule-based sentiment analysis

    This version of the article has been accepted for publication, after peer review, and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements or any corrections. The Version of Record is available online at: https://doi.org/10.1007/s10462-017-9584-0
    [Abstract]: Syntactic parsing, the process of obtaining the internal structure of sentences in natural languages, is a crucial task for artificial intelligence applications that need to extract meaning from natural language text or speech. Sentiment analysis is one example of an application for which parsing has recently proven useful. In recent years, there have been significant advances in the accuracy of parsing algorithms. In this article, we perform an empirical, task-oriented evaluation to determine how parsing accuracy influences the performance of a state-of-the-art rule-based sentiment analysis system that determines the polarity of sentences from their parse trees. In particular, we evaluate the system using four well-known dependency parsers, including both current models with state-of-the-art accuracy and less accurate models that require fewer computational resources. The experiments show that all of the parsers produce similarly good results in the sentiment analysis task, without their accuracy having any relevant influence on the results. Since parsing is currently a task with a relatively high computational cost that varies strongly between algorithms, this suggests that sentiment analysis researchers and users should prioritize speed over accuracy when choosing a parser, and that parsing researchers should investigate models that improve speed further, even at some cost to accuracy.
    Carlos Gómez-Rodríguez has received funding from the European Research Council (ERC), under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, Grant Agreement No 714150), Ministerio de Economía y Competitividad (FFI2014-51978-C2-2-R), and the Oportunius Program (Xunta de Galicia). Iago Alonso-Alonso was funded by an Oportunius Program Grant (Xunta de Galicia). David Vilares has received funding from the Ministerio de Educación, Cultura y Deporte (FPU13/01180) and Ministerio de Economía y Competitividad (FFI2014-51978-C2-2-R).

    Retrieval of bilingual Spanish-English information by means of a standard automatic translation system

    This paper describes our participation in bilingual retrieval (queries in Spanish on documents in English), by means of an information retrieval system based on the vector model. The queries, formulated in Spanish, were translated into English by means of a commercial automatic translation system; the terms extracted from the resulting translations were filtered to remove stop words ("empty words") and then normalised by stemming. Results are slightly more than 15% poorer than those obtained through monolingual retrieval with the original queries in English.
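    The pipeline described above (stop-word filtering, stemming, then matching in a vector model) can be sketched as follows. The query is assumed to have been translated into English already. The stop-word list, the crude suffix stripper standing in for a real stemmer, and the TF-IDF weighting with cosine similarity are all illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter

# Tiny stop-word list: an illustrative assumption.
STOP_WORDS = {"the", "of", "in", "a", "and", "to", "on", "for"}


def stem(token: str) -> str:
    """Crude suffix stripper standing in for a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalise(text: str) -> list[str]:
    """Lowercase, keep alphabetic tokens, drop stop words, then stem."""
    tokens = [t for t in text.lower().split() if t.isalpha()]
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query: str, documents: list[str]) -> list[int]:
    """Return document indices ordered by decreasing similarity to the query."""
    doc_counts = [Counter(normalise(d)) for d in documents]
    n_docs = len(documents)
    doc_freq = Counter(term for counts in doc_counts for term in counts)
    idf = {term: math.log(1 + n_docs / df) for term, df in doc_freq.items()}

    def weigh(counts: Counter) -> dict[str, float]:
        # Terms never seen in the collection are ignored.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    query_vec = weigh(Counter(normalise(query)))
    scored = [(cosine(query_vec, weigh(c)), i) for i, c in enumerate(doc_counts)]
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

    In this sketch, translation errors would show up as query terms that are missing from the collection or that match the wrong documents, which is one plausible source of the gap with respect to monolingual retrieval.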

    John's ellipsoid and the integral ratio of a log-concave function

    We extend the notion of John’s ellipsoid to the setting of integrable log-concave functions. This will allow us to define the integral ratio of a log-concave function, which will extend the notion of volume ratio, and we will find the log-concave function maximizing the integral ratio. A reverse functional affine isoperimetric inequality will be given, written in terms of this integral ratio. This can be viewed as a stability version of the functional affine isoperimetric inequality.
    Funding: Ministerio de Economía y Competitividad; Fondo Europeo de Desarrollo Regional; Consejería de Industria, Turismo, Empresa e Innovación (Comunidad Autónoma de la Región de Murcia); Coordenação de aperfeiçoamento de pessoal de nivel superior; Instituto Nacional de Matemática Pura e Aplicada

    Creating a universal dependencies treebank from existing resources for closely related languages: the case of Galician

    [Abstract] In this work we present a new strategy for building treebanks for languages with few resources for syntactic parsing. The method consists of adapting and combining different treebanks annotated with universal dependencies for closely related language varieties, with the aim of training a syntactic parser for the chosen language, in our case Galician. During the selection and adaptation of the source treebanks, we analyze the impact of properties at three different levels: (i) the distance between the source and target languages, (ii) the adaptation of lexical and orthographic features, and (iii) the annotation guidelines of the treebanks. Using the proposed strategy, we train a statistical parser to annotate, with promising results and without any prior Galician data, a small corpus of this language. Manually correcting this corpus, which then served as a gold standard, allowed us to test the effectiveness of the proposed method.
    Funding: Ministerio de Economía y Competitividad, FFI2014-51978-C2-1-R; Ministerio de Economía y Competitividad, FJCI-2014-22853; Ministerio de Economía y Competitividad, FFI2014-51978-C2-2-

    The validity of Rhetorical categories in audiovisual culture

    Communication, society, and art have evolved as a consequence of cultural change and the development of new technologies. However, in today's mainstream audiovisual culture, classical Rhetoric remains valid: it has adapted to new communicative models that integrate visual, linguistic, and acoustic elements, which can be analyzed and understood through Cultural Rhetoric.
    This work is the result of research carried out in the METAPHORA project (reference FFI2014-53391-P), a research project funded by the Secretaría de Estado de Investigación, Desarrollo e Innovación.