    An analysis of Twitter corpora and the differences between formal and colloquial tweets

    This work reviews recent publications addressing the Twitter translation task and highlights the lack of appropriate corpora that represent the colloquial language used on Twitter. It also discusses the best-known issues in the Twitter genre, namely the use of hashtags and the number of out-of-vocabulary (OOV) words, with a special focus on the differences between formal and colloquial tweets. Peer Reviewed. Postprint (author's final draft).
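
    The two issues named in this abstract, hashtags and OOV words, can be measured per tweet. The following is a minimal sketch under stated assumptions: a fixed vocabulary set and a simple regex tokenizer. The names `tokenize`, `oov_rate`, `hashtag_ratio`, and the sample vocabulary and tweets are illustrative and are not taken from the reviewed work.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Hashtags and @mentions are kept as single tokens (assumption:
    a leading '#' or '@' followed by word characters).
    """
    return re.findall(r"[#@]?\w+", text.lower())


def oov_rate(tokens, vocabulary):
    """Return the fraction of tokens that are not in the vocabulary."""
    if not tokens:
        return 0.0
    unknown = [t for t in tokens if t not in vocabulary]
    return len(unknown) / len(tokens)


def hashtag_ratio(tokens):
    """Return the fraction of tokens that are hashtags."""
    if not tokens:
        return 0.0
    return sum(t.startswith("#") for t in tokens) / len(tokens)


# Hypothetical vocabulary and example tweets, for illustration only.
vocabulary = {"the", "match", "was", "great", "tonight"}
formal = tokenize("The match was great tonight")
colloquial = tokenize("gr8 match 2nite #epic")

print(oov_rate(formal, vocabulary))      # 0.0
print(oov_rate(colloquial, vocabulary))  # 0.75
print(hashtag_ratio(colloquial))         # 0.25
```

    In this sketch a hashtag counts as OOV unless it is listed in the vocabulary; whether to strip the `#` before lookup is a design choice that the abstract does not specify.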

    ALICE: Acquisition of Language through an Interactive Comprehension Environment

    This work integrates several state-of-the-art technologies related to spoken language and natural language processing used in Intelligent Computer Assisted Language Learning (ICALL) systems. We aim to show that the technology has reached a level of maturity that suggests the time may be right to use it for second language learning in high schools. Peer Reviewed. Postprint (published version).

    The use of domain ontologies for improving the adaptability and collaborative ability of a web dialogue system

    Dialogue systems can be used to guide users accessing web services, improving web usability. However, they are expensive to develop and difficult to adapt to different types of web services. The knowledge model of a web service can be seen as the basis for defining the semantics of the information to be exchanged among the components of a dialogue system. This approach facilitates the integration of the different types of knowledge involved in human-machine communication and provides a unified framework that is easier to apply to new web services. Furthermore, representing the web service knowledge according to an ontology can enhance the reasoning capabilities of the underlying system. This article describes the use of domain ontologies in a mixed-initiative web dialogue system to improve both its adaptability and its collaborative ability. Peer Reviewed. Postprint (published version).

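    A multichannel dialogue system for accessing the information and services of public administrations (Un sistema de diálogo multicanal para acceder a la información y servicios de las administraciones públicas)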

    This article presents a dialogue system developed for the European project HOPS. The HOPS project focuses on facilitating access to the information and services of local administrations, using ontologies to represent the application knowledge. This representation makes it possible to manage the interaction with the user in speech or text mode in different languages. The dialogue manager uses the ontologies both to decide the next interaction with the user and to generate, in each language, the grammars, lexicons and messages involved in each interaction. Representing the knowledge involved in communication this way allows the resources developed to be reused in different applications.

    English language learning activity using spoken language and intelligent computer-assisted technologies

    This paper presents work in progress on language technologies applied to secondary school education. The application presented integrates several state-of-the-art technologies related to spoken language and intelligent computer-assisted language learning. We aim to show that the technology has reached a level of maturity that suggests the time may be right to use it for second language learning. To achieve this objective, an activity was designed to be tested at several Spanish high schools. The aim was to carry out a proof of concept in real conditions and to obtain feedback from the students through a questionnaire, as well as from the teachers by means of an interview. The activity was designed in collaboration with some of the teachers at the secondary schools. Peer Reviewed. Postprint (published version).

    The UPC submission to the WMT 2012 shared task on quality estimation

    In this paper, we describe the UPC system that participated in the WMT 2012 shared task on Quality Estimation for Machine Translation. Based on the empirical evidence that fluency-related features have a very high correlation with post-editing effort, we present a set of quality estimation features designed around different kinds of n-gram language models, plus another set of features that model the quality of dependency parses automatically projected from source sentences to translations. We report the results on the shared task dataset, obtained by combining the features that we designed with the baseline features provided by the task organizers. Peer Reviewed. Postprint (published version).
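
    To illustrate what a fluency feature built on an n-gram language model can look like, here is a minimal sketch. It is not the UPC feature set, which the abstract does not detail. It assumes a bigram model with add-one smoothing trained on a toy corpus; `BigramLM`, `fluency_features`, and the training sentences are hypothetical.

```python
import math
from collections import Counter


class BigramLM:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in sentences:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def logprob(self, sentence):
        """Return the natural-log probability of a whitespace-tokenized sentence."""
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigrams[(prev, cur)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def fluency_features(lm, sentence):
    """Return simple language-model features for one translation hypothesis."""
    n = len(sentence.split()) + 1  # predicted tokens, including </s>
    lp = lm.logprob(sentence)
    return {
        "logprob": lp,
        "avg_logprob": lp / n,
        "perplexity": math.exp(-lp / n),
    }


# Toy target-language corpus (illustrative only).
lm = BigramLM(["the cat sat on the mat", "the dog sat on the rug"])
fluent = fluency_features(lm, "the cat sat on the mat")
scrambled = fluency_features(lm, "mat the on sat cat the")
print(fluent["perplexity"] < scrambled["perplexity"])  # True
```

    A lower perplexity suggests a more fluent hypothesis. In a quality estimation setting, such values would typically be fed to a regressor together with other features.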

    A graphical interface for MT evaluation and error analysis

    Error analysis in machine translation is a necessary step in order to investigate the strengths and weaknesses of the MT systems under development and allow fair comparisons among them. This work presents an application that shows how a set of heterogeneous automatic metrics can be used to evaluate a test bed of automatic translations. To do so, we have set up an online graphical interface for the ASIYA toolkit, a rich repository of evaluation measures working at different linguistic levels. The current implementation of the interface shows constituency and dependency trees as well as shallow syntactic and semantic annotations, and word alignments. The intelligent visualization of the linguistic structures used by the metrics, as well as a set of navigational functionalities, may lead towards advanced methods for automatic error analysis. Peer Reviewed. Postprint (published version).

    Guiding the user when searching information on the web

    This paper describes how we approach the problem of guiding the user when accessing informational web services. We developed a mixed-initiative dialogue system that provides access to web services in several languages. To facilitate the adaptation of the system to new informational web services, dialogue management and task management were separated, and general descriptions of the several tasks involved in the communication process were incorporated. Peer Reviewed. Postprint (published version).

    Structure and management of tasks in a dialogue system for accessing web services (Estructura y gestión de tareas en un sistema de diálogo para acceder a servicios web)

    Dialogue systems can be seen as user interfaces for accessing other applications. They have to address the needs of the users as well as the requirements of the applications. This work is concerned with the representation and management of the application tasks. We have studied two types of web services: form-filling (transactional) and information-seeking. We claim that dialogue systems should not use the same strategies for all types of applications. Form-filling applications are usually simple to manage and do not need assistants, except for explanations when the user does not know the meaning of the fields the system asks for. Information-seeking engines need much more complex strategies to access the applications and display results, and hence assistants may guide the user in giving the query constraints. In our proposal, the task models we describe are used to determine when and how to acquire more reliable constraints from the user, how to adapt them in order to obtain more suitable results, and how to present the results most clearly. Peer Reviewed. Postprint (author's final draft).
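
    As a rough illustration of how a task specification can drive a form-filling dialogue, the sketch below picks the next unfilled field and switches from a prompt to an explanation when the user asks what the field means. It is a hypothetical simplification, not the task model described in the paper; `Slot`, `FormTask`, and the example fields are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """One field of a form-filling task (hypothetical structure)."""
    name: str
    prompt: str
    explanation: str


@dataclass
class FormTask:
    """Task specification: ordered slots plus the values collected so far."""
    slots: list
    values: dict = field(default_factory=dict)

    def next_slot(self):
        """Return the first slot without a value, or None when the form is complete."""
        for slot in self.slots:
            if slot.name not in self.values:
                return slot
        return None

    def system_turn(self, user_asked_for_help=False):
        """Choose the next system message from the task specification."""
        slot = self.next_slot()
        if slot is None:
            return "All parameters collected."
        return slot.explanation if user_asked_for_help else slot.prompt


# Illustrative task with two fields.
task = FormTask(slots=[
    Slot("district", "Which district do you live in?",
         "The district is the administrative area of your home address."),
    Slot("date", "Which date do you prefer?",
         "The date is the day on which the service should take place."),
])

print(task.system_turn())                          # prompt for 'district'
print(task.system_turn(user_asked_for_help=True))  # explanation for 'district'
task.values["district"] = "Centre"
print(task.system_turn())                          # prompt for 'date'
```

    An information-seeking task would need richer logic than this, for example relaxing or tightening constraints depending on the number of results, which this sketch does not attempt.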

    Towards natural language interaction (Hacia la interacción en lenguaje natural)

    This document presents the research being carried out in the Natural Language Processing Group (Grupo de Procesamiento de Lenguaje Natural, GPLN) of the Universidad Politécnica de Cataluña (UPC). Specifically, we have organized the presentation of the different lines of work around their application in a virtual assistant. We believe that the use and deployment of such assistants will grow over the next ten years, hence the importance of the state of natural language technologies and, even more, of the new challenges that this type of application poses. Peer Reviewed. Postprint (published version).
