83 research outputs found

    System combination using machine learning in NLP tasks

    System combination is a widely studied research area in Pattern Recognition, where many techniques have been developed to exploit the diversity of classification methods now available thanks to Machine Learning. This PhD thesis surveys the existing combination techniques and the extent to which they have been applied to NLP tasks. It also presents work on specific tasks and a comparative study of the results obtained by many of these techniques, implemented and applied to the POS-tagging task. The large number of different corpora used and the experiments carried out allow us to draw some conclusions that we believe will be very useful for the future use of these techniques in NLP.

    A Technique for Distributed Systems Specification

    In this paper we show how an object-oriented specification language is useful for the specification of distributed systems. The main constructs in this language are objects. An object consists of a state, a behaviour and a set of transition rules between states. A specification is composed of three sections: the definition of algebraic data types that represent the domain of object attributes, the definition of classes that group objects with common features, and the definition of relationships among classes. We show two possible styles for defining the behaviour of objects: on the one hand a transition system (state oriented), and on the other hand an algebraic model of process description (constraint oriented). We illustrate the paper with a specification of the dining philosophers problem, a typical example in distributed programming.

    InstanceRank: Bringing order to datasets

    In this paper we present InstanceRank, a ranking algorithm that reflects the relevance of the instances within a dataset. InstanceRank applies a solution similar to that used by PageRank, the web page ranking algorithm of the Google search engine. We also present ISR, an instance selection technique that uses InstanceRank to choose the most representative instances from a learning database. Experiments show that the ISR algorithm, with InstanceRank as its ranking criterion, achieves accuracy similar to that of other instance reduction techniques while noticeably reducing the size of the instance set. Funding: Ministerio de Educación y Ciencia HUM2007-66607-C04-0
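
    The abstract above does not say how the instance graph is built or how ISR picks instances, so the following is only a minimal sketch of a PageRank-style ranking over instances, not the authors' algorithm. The similarity function, the fully connected graph, the damping factor and the "keep the top-ranked instances" rule in select_top are all assumptions made for illustration.

```python
import math


def similarity(a, b):
    """Assumed similarity: inverse of (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.dist(a, b))


def instance_rank(instances, damping=0.85, iterations=50):
    """PageRank-style power iteration over a similarity-weighted graph."""
    n = len(instances)
    if n == 0:
        return []
    # Weighted, fully connected graph without self-loops (an assumption).
    weights = [[similarity(instances[i], instances[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    ranks = [1.0 / n] * n
    for _ in range(iterations):
        new_ranks = []
        for j in range(n):
            # Each instance i passes rank to j in proportion to their similarity.
            incoming = sum(ranks[i] * weights[i][j] / out_sum[i]
                           for i in range(n) if out_sum[i] > 0)
            new_ranks.append((1 - damping) / n + damping * incoming)
        ranks = new_ranks
    return ranks


def select_top(instances, keep):
    """Hypothetical selection rule: keep the `keep` highest-ranked instances."""
    ranks = instance_rank(instances)
    order = sorted(range(len(instances)), key=lambda i: ranks[i], reverse=True)
    return [instances[i] for i in order[:keep]]
```

    With points such as (0, 0), (0.1, 0), (0, 0.1) and (5, 5), the isolated point receives the lowest rank under these assumptions, so select_top would drop it first.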

    On the Reusability of User Interface Declarative Models

    The automatic generation of user interfaces from declarative models significantly reduces development effort. In this paper, we analyze the feasibility of using two well-known techniques, XInclude and Packaging, in the new context of reusing user-interface model specifications. After analyzing the suitability of each technique for UI reuse and implementing both in a real system, we show that both are suitable for use in today’s model-based user interfaces.

    Reusing UI elements with Model-Based User Interface Development

    This paper introduces the potential for reusing UI elements in the context of Model-Based UI Development (MBUID) and provides guidance for future MBUID systems with enhanced reuse capabilities. Our study is based on the development of six inter-related projects with a specific MBUID environment that supports standard reuse techniques such as parametrization and sub-specification, inclusion, and shared repositories. We analyze our experience and discuss the benefits and limitations of each technique supported by our MBUID environment. The system architecture, the structure and composition of UI elements, and the model specification languages have a decisive impact on reusability. In our case, more than 40% of the elements defined in the UI specifications were reused, resulting in a 55% reduction in specification size. Inclusion, parametrization and sub-specification facilitated modularity and internal reuse of UI specifications at development time, whereas the reuse of UI elements between applications benefited greatly from sharing repositories of UI elements at run time. Funding: Ministerio de Ciencia e Innovación DPI2010-19154; Junta de Andalucía TIC-633

    Aproximación léxica basada en recursos para la tarea TWEET-NORM

    This paper proposes a resource-based lexical approach to the TWEET-NORM task. The proposed system has a simple but extensible modular architecture in which each analysis module independently proposes correction candidates for each OOV word. Each of these analysis modules addresses a specific problem and works in a very different way. The resources are the main component of the OOV detection system and also support the validation and filtering of candidates.
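
    The modular architecture described above can be sketched roughly as follows. This is an illustrative sketch only: the toy lexicon, the two analysis modules, the abbreviation table and the rule of taking the first validated candidate are all hypothetical, since the abstract does not specify the actual modules, resources or candidate selection.

```python
# Hypothetical toy resources; the real system's resources are not given.
LEXICON = {"que", "porque", "hola", "bueno", "también"}
ABBREVIATIONS = {"q": "que", "xq": "porque", "tb": "también"}


def repeated_letters(word):
    """Module 1: collapse runs of repeated characters ('holaaa' -> 'hola')."""
    out = []
    for ch in word:
        if not out or out[-1] != ch:
            out.append(ch)
    return ["".join(out)]


def abbreviation_lookup(word):
    """Module 2: expand a known abbreviation, if any."""
    return [ABBREVIATIONS[word]] if word in ABBREVIATIONS else []


# Each module independently proposes candidates; new modules can be appended.
MODULES = [repeated_letters, abbreviation_lookup]


def is_oov(word):
    """Resource-based OOV detection: a word is OOV if the lexicon lacks it."""
    return word.lower() not in LEXICON


def normalize(tokens):
    result = []
    for token in tokens:
        if not is_oov(token):
            result.append(token)
            continue
        candidates = []
        for module in MODULES:
            candidates.extend(module(token.lower()))
        # Resources also validate and filter the proposed candidates.
        valid = [c for c in candidates if c in LEXICON]
        # Assumed selection rule: first valid candidate, else keep the token.
        result.append(valid[0] if valid else token)
    return result
```

    For example, normalize(["holaaa", "xq", "bueno"]) would correct the first two tokens from the toy resources and leave the in-vocabulary word untouched.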

    Dynamic Topic-Related Tweet Retrieval

    Twitter is a social network in which people publish brief, instant, publicly accessible messages. Given its exponential growth and the public and cross-cutting nature of its contents, more and more researchers are using Twitter as a source of data for multiple purposes. In this context, the ability to retrieve the messages (tweets) related to a certain topic becomes critical. In this work, we define the topic-related tweet retrieval task and propose a dynamic, graph-based method to address it. We applied our method to capture a data set of tweets related to the participation of the Spanish team in the Euro 2012 soccer competition, measuring precision and recall against other simple but commonly used approaches. The results demonstrate the effectiveness of our method, which significantly increases coverage of the chosen topic and is able to capture related subtopics that were not known a priori.

    An approach to the use of word embeddings in an opinion classification task

    In this paper we show how a vector-based word representation obtained via word2vec can help to improve the results of a document classifier based on bags of words. Both models allow obtaining numeric representations from texts, but they do it very differently. The bag of words model can represent documents by means of widely dispersed vectors in which the indices are words or groups of words. word2vec generates word level representations building vectors that are much more compact, where indices implicitly contain information about the context of word occurrences. Bags of words are very effective for document classification and in our experiments no representation using only word2vec vectors is able to improve their results. However, this does not mean that the information provided by word2vec is not useful for the classification task. When this information is used in combination with the bags of words, the results are improved, showing its complementarity and its contribution to the task. We have also performed cross-domain experiments in which word2vec has shown much more stable behavior than bag of words models. Funding: Junta de Andalucía P11-TIC-7684 M
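
    The abstract above does not state how the two representations are combined. One simple possibility, shown here purely as an assumption, is to concatenate a sparse bag-of-words count vector with the mean of the word vectors of a document. The vocabulary and the 2-dimensional vectors below are hypothetical stand-ins for a real vocabulary and real word2vec output.

```python
# Hypothetical vocabulary and toy vectors standing in for word2vec output.
VOCAB = ["good", "bad", "movie", "plot"]
EMBEDDINGS = {
    "good": [0.9, 0.1],
    "bad": [-0.8, 0.2],
    "movie": [0.0, 0.7],
    "plot": [0.1, 0.6],
}
DIM = 2


def bow_vector(tokens):
    """Sparse representation: one count per vocabulary word."""
    return [float(tokens.count(w)) for w in VOCAB]


def mean_embedding(tokens):
    """Compact representation: average of the known word vectors."""
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not known:
        return [0.0] * DIM
    return [sum(v[d] for v in known) / len(known) for d in range(DIM)]


def combined_features(tokens):
    """Assumed combination: concatenate both representations."""
    return bow_vector(tokens) + mean_embedding(tokens)
```

    The resulting vector (here 4 count features followed by 2 dense features) could then be passed to any document classifier in place of the bag-of-words vector alone.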

    Obtaining Adaptation of Virtual Courses by Using a Collaborative Tool and Learning Design

    This work describes the Learning Activity Management System, LAMS (Macquarie University, Australia), a collaborative tool developed for designing, managing and delivering online collaborative learning activities. It provides teachers with a highly intuitive visual authoring environment for creating sequences of learning activities. These activities can include a range of individual tasks, small group work and whole class activities based on both content and collaboration. A methodology for applying this tool is then described.