22 research outputs found

    Combiner espaces sémantiques, structure et contraintes.

    This paper presents the methods that we developed for tasks 1 and 4 of the DEFT'14 Text Mining contest. In task 1, the goal was to automatically categorise the literary genre of short texts, while in task 4 the goal was to assign a scientific paper to the conference session in which it is presented by analysing its content. The methods we developed rely on a common representation of the input texts in semantic spaces constructed using Random Indexing. In these high-dimensional spaces, each text and each term is represented as a vector. For this edition of DEFT, we addressed the proposed tasks by designing methods that combine classical machine learning algorithms for clustering and categorisation with (i) rule-based methods to represent, for instance, the patterns of poetic texts in task 1, and (ii) constraint-solving methods to take into account the information we had about the organisation of the sessions in task 4. The results obtained, NDCG=0.4278 (rank 2) in task 1 and FScore=1 (rank 1) in task 4, show the strong performance of these hybrid methods. JRC.G.2-Global security and crisis managemen

    Resource Creation and Evaluation for Multilingual Sentiment Analysis in Social Media Texts

    Sentiment analysis (SA) concerns the classification of texts according to the polarity of the opinions they express. SA systems are highly relevant to many real-world applications (e.g. marketing, eGovernance, business intelligence, behavioral sciences) and also to many tasks in Natural Language Processing (NLP), such as information extraction, question answering and textual entailment, to name just a few. The importance of this field is shown by the large number of approaches proposed in research, by the interest it has raised in other disciplines, and by the applications that have been built on its technology. In our case, the primary focus is to use sentiment analysis in the context of media monitoring, to enable tracking of global reactions to events. The main challenge we face is that tweets are written in different languages, and an unbiased system should be able to deal with all of them in order to process all (possible) available data. Unfortunately, although many linguistic resources exist for processing texts written in English, data and tools are scarce for many other languages. Following our initial efforts described in (Balahur and Turchi, 2013), in this article we extend our study of the possibility of implementing a multilingual system that is able to a) classify sentiment expressed in tweets in various languages using training data obtained through machine translation; b) verify the extent to which the quality of the translations influences sentiment classification performance, in this case on highly informal texts; and c) improve multilingual sentiment classification using small amounts of data annotated in the target language. To this end, varying sizes of target-language data are tested. The languages we explore are Arabic, Turkish, Russian, Italian, Spanish, German and French. JRC.G.2-Global security and crisis managemen

    Description of six scenarios and of the results of six validated trials

    This deliverable presents and analyses the processes of elaboration and validation of the PALETTE scenarios. After defining these two processes and situating them within the PALETTE methodology, it presents the scenarios. For each scenario, the specific methodology of elaboration and validation is described, with a special focus on the participation of the Communities of Practice (CoPs) concerned. The results of the validation are then presented, together with reports on the technical feasibility of the scenarios and on the usability of PALETTE services from a user perspective. Finally, we reflect on and discuss the whole process of validating the scenarios, and we describe the next steps towards the development of the scenarios and their trials with the CoPs.

    Abstracts from the 3rd International Genomic Medicine Conference (3rd IGMC 2015)

    Multimodal auto-tagging of music title using estimator aggregation

    This paper presents a participation in the MusiClef 2012 Multimodal Music Tagging task. It describes an approach that aggregates estimators in order to combine different sources of information.

    A model driven approach for bridging ILOG Rule Language and RIF

    Many companies now run their business using Business Rule Management Systems (BRMS), which offer a clear separation between decision logic and procedural structure; the ability to modify a rulebase rather than processes; and the reusability of rules across applications. All these factors allow a company to react quickly and align its policies with ever-changing market needs. Despite these advantages, different BRMSs can be found on the market, each implementing a proprietary business rule language (e.g. JBoss uses Drools, IBM uses Ilog Rule Language). This heterogeneity makes rules less reusable and interchangeable. Rule Interchange Format (RIF) is a W3C open standard that aims to reduce the heterogeneity among business rule languages. Our work focuses on providing an implementation, based on a Model Driven approach, for bridging Ilog Rule Language to RIF.