36 research outputs found

    A comparison of services for intent and entity recognition for conversational recommender systems

    Conversational Recommender Systems (CoRSs) are becoming increasingly popular. However, designing and developing a CoRS is a challenging task since it requires multi-disciplinary skills. Even though several third-party services are available for supporting the creation of a CoRS, a comparative study of these platforms for the specific recommendation task is not available yet. In this work, we focus our attention on two crucial steps of the Conversational Recommendation (CoR) process, namely Intent and Entity Recognition. We compared four of the most popular services, both commercial and open source. Furthermore, we proposed two custom-made solutions for Entity Recognition, whose aim is to overcome the limitations of the other services. The results give a clear picture of the strengths and weaknesses of each solution.

    Evaluation of speech recognizers for use in advanced combat helicopter crew station research and development

    The U.S. Army Crew Station Research and Development Facility uses vintage 1984 speech recognizers. An evaluation was performed of newer off-the-shelf speech recognition devices to determine whether the performance and capabilities of newer technology are substantially better than those of the Army's current speech recognizers. The Phonetic Discrimination (PD-100) Test was used to compare recognizer performance in two ambient noise conditions: quiet office and helicopter noise. Test tokens were spoken by males and females in isolated-word and connected-word modes. Better overall recognition accuracy was obtained from the newer recognizers. Recognizer capabilities needed to support the development of human factors design requirements for speech command systems in advanced combat helicopters are listed.

    Sistema móvil de información y guiado (Mobile information and guidance system)

    With the large number of smart mobile devices (smartphones) now in use, users demand new services that automate or simplify their everyday tasks. The objective of this final degree project is therefore to develop an Android application that automates an everyday user task by applying ambient intelligence. Through the application, users can automate their preferences according to the position of the device and/or a period of time. The application supports traditional interaction through the device's virtual keyboard as well as voice recognition. Its secondary functionality includes creating and managing profiles, rules, and locations that hold the user's preferences. Building the application relies on several additional technologies: different programming languages (Java, SQL, and XML) and Android's own APIs (Google Voice Recognizer, Google Places, and Speech To Text), which give the application greater flexibility and efficiency. The database engine integrated into the Android operating system (SQLite) provides storage for the application. As a complement, the work includes a detailed study of the main mobile operating systems, with particular attention to the one chosen for development (Android), and a study of dialogue systems that details their general architecture and describes the type of system implemented in the application. These complementary studies form part of the analysis that preceded development, making it possible to select both the most attractive operating system for the application and the most suitable type of dialogue system. Degree: Grado en Ingeniería Informática

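    Uma perspectiva musicológica sobre a formação da categoria ciberpunk na música para audiovisuais – entre 1982 e 2017 (A musicological perspective on the formation of the cyberpunk category in music for audiovisual media, between 1982 and 2017)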

    The “cyberpunk” musical category is used on digital platforms as a tool to identify works that, according to whoever applies the label, materialize cyberpunk characteristics in sound. I aim to identify what is meant by cyberpunk music, to discover how this category participates in the construction of a collective identity and a feeling of community on the internet, and to propose a musical theory that accounts for its main properties, its participants, and how it is socially situated. My research questions are: What are the main characteristics of cyberpunk music? What are its main uses and mediations? What do consumers designate as cyberpunk music? What is the role of cyberpunk audiovisual soundtracks in the social construction of this category? I argue that the act of listening to and relating to cyberpunk music is imbued with a set of factors that determine whether participants interpret it as cyberpunk. These preconditions rest on an audiovisual literacy that leads to particular meanings, since they result from exposure to the soundtracks of cyberpunk narrative audiovisuals created and disseminated between 1982 and 2017. I collected and reflected on texts that discuss this music to identify the existing interpretive discourses, analyzed the soundtracks of a representative sample of cyberpunk audiovisuals to find musical patterns, and carried out a non-participant digital ethnography on YouTube, library music sites, BandCamp, and cyberpunk fan pages to determine how listeners relate to this music. The analysis shows that the circulation of references, founded largely on transmedial citations of the music of the film Blade Runner (1982, Scott), allows the construction and sedimentation of a musical language of its own. In the cinema, television, videogame, and internet industries, the constituents of this language are conventions that converge and cross different formats, but within cyberpunk culture they acquire emic meanings. Its listeners possess a collective memory of relations between musical signifiers and cyberpunk signifieds, which they apply at the moment of listening and transpose to other musical genres, such as synthwave and vaporwave, to legitimize them as cyberpunk musical genres. Based on these results, I conclude that this category is sustained by acts of engagement that build an emergent musical genre, and the theory proposed in this dissertation shows that the agents of this culture explore a musical and discursive language. They circulate the category between music playback and sales platforms, such as YouTube and BandCamp, to assert a collective identity, capitalize on conventions, negotiate expectations, and build a niche digital market.

    Multimodal Interaction for Enhancing Team Coordination on the Battlefield

    Team coordination is vital to the success of team missions. On the battlefield and in other hazardous environments, mission outcomes are often very unpredictable because of unforeseen circumstances and complications encountered that adversely affect team coordination. In addition, the battlefield is constantly evolving as new technology, such as context-aware systems and unmanned drones, becomes available to assist teams in coordinating team efforts. As a result, we must re-evaluate the dynamics of teams that operate in high-stress, hazardous environments in order to learn how to use technology to enhance team coordination within this new context. In dangerous environments where multi-tasking is critical for the safety and success of the team operation, it is important to know what forms of interaction are most conducive to team tasks. We have explored interaction methods, including various types of user input and data feedback mediums that can assist teams in performing unified tasks on the battlefield. We’ve conducted an ethnographic analysis of Soldiers and researched technologies such as sketch recognition, physiological data classification, augmented reality, and haptics to come up with a set of core principles to be used when designing technological tools for these teams. This dissertation provides support for these principles and addresses outstanding problems of team connectivity, mobility, cognitive load, team awareness, and hands-free interaction in mobile military applications. This research has resulted in the development of a multimodal solution that enhances team coordination by allowing users to synchronize their tasks while keeping an overall awareness of team status and their environment. The set of solutions we’ve developed utilizes optimal interaction techniques implemented and evaluated in related projects; the ultimate goal of this research is to learn how to use technology to provide total situational awareness and team connectivity on the battlefield. This information can be used to aid the research and development of technological solutions for teams that operate in hazardous environments as more advanced resources become available.

    Prototipo de interfaz humano-máquina basado en AIML con capacidad de realizar tareas preprogramadas (Prototype of an AIML-based human-machine interface capable of performing preprogrammed tasks)

    Thesis (Ingeniero en Automatización y Robótica). “Artificial Intelligence is coming up with new and greater innovations every day. But fundamentally, A.I. has always tried to pursue and attain intelligent human traits” (Bishwajeet, 2015). In today's world, technological evolution toward the simple and automatic, and its spread across the different areas of business, is inevitable. Fully functional applications already exist that can give satisfactory responses to users, frequently through chat, yet their response capabilities remain limited. This project presents the research and development of a prototype that extends the capabilities of current human-machine interfaces, which are scarce and limited in functionality. Using tools such as Chatterbot, the AIML language, Python 3, and the Google Speech API, these automatic-response interfaces achieved broader and more effective functionality. To this end, an interface was implemented that receives the user's spoken requests and responds accordingly by chat, by voice, or through an internal application. Tests were also carried out to evaluate its error rate using an objective and a subjective methodology. “The evaluation of the conversation’s naturalness has been split into two parts: an objective part, that measures the correct and incorrect answers, and a subjective part, that focuses on the user's experience” (Quintero, 2015). The evaluation of the interface was positive, with 80% accuracy in speech recognition and 90% accuracy in AIML processing. From these results it was determined that such interfaces are beneficial provided that all necessary measures are taken and the parameters are adjusted correctly; otherwise, processing capabilities can decrease considerably. Keywords: AIML, Chatterbot, Google Speech API, voice recognition, human-machine interface, automatic.

    «Ποιος θέλει να γίνει εκατομμυριούχος;» a la Ελληνικά (“Who Wants to Be a Millionaire?” à la Greek)

    This work describes in detail the techniques used to build a virtual player for the popular TV game “Who Wants to Be a Millionaire?” and is based on the corresponding article [1], in which the virtual player was implemented for the English and Italian versions of the game. This work also attempts to apply the various techniques described in that article. The virtual player for the Greek version of the game was implemented in the Java programming language and is presented in detail. The virtual player must answer a series of multiple-choice questions posed in natural language by selecting the correct answer among four different choices. If it is not sure about an answer, it can use the lifelines or quit the game. The architecture of the virtual player consists of 1) a Question Answering (QA) module, which uses the Google search engine to retrieve the passages of text most relevant to identifying the correct answer to a question, 2) an Answer Scoring (AS) module, which assigns a score to each candidate answer according to different criteria based on the passages of text retrieved by the QA module, and 3) a Decision Making (DM) module, which chooses the strategy for playing the game according to specific rules as well as the scores assigned to the candidate answers. Finally, this work evaluates both the accuracy of the virtual player in answering the questions of the game and its ability to play real games in order to earn money. The experiments were conducted with questions taken from the Greek version of the board game. In general, the average accuracy of the virtual player is significantly better than the performance of the human players. Playing real games involves defining a proper strategy for using the lifelines in order to decide whether to answer a question even under uncertainty or to retire from the game with the money earned so far; in this setting, the virtual player wins on average more money than the average amount earned by human players.
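The three-module architecture described in this abstract ends with a Decision Making step that turns answer scores into a game action. The following is only a minimal sketch of that idea, not the thesis's implementation: the function name `decide`, the `answer_threshold` value, and the rule "answer if confident, otherwise use a lifeline, otherwise retire" are all assumptions made for illustration, and the scores are assumed to be non-negative.

```python
from typing import Dict, Tuple


def decide(scores: Dict[str, float], lifelines_left: int,
           answer_threshold: float = 0.6) -> Tuple[str, str]:
    """Hypothetical decision rule; returns (action, best_candidate)."""
    if not scores:
        raise ValueError("scores must contain at least one candidate answer")

    # Normalize the (assumed non-negative) scores so they sum to 1.
    total = sum(scores.values())
    if total > 0:
        confidence = {a: s / total for a, s in scores.items()}
    else:
        # No evidence at all: treat every candidate as equally likely.
        confidence = {a: 1.0 / len(scores) for a in scores}

    best = max(confidence, key=confidence.get)

    # Assumed strategy: answer when confident, else spend a lifeline,
    # else leave the game with the money earned so far.
    if confidence[best] >= answer_threshold:
        return ("answer", best)
    if lifelines_left > 0:
        return ("lifeline", best)
    return ("retire", best)
```

For example, `decide({"A": 8.0, "B": 1.0, "C": 0.5, "D": 0.5}, lifelines_left=1)` selects the action `"answer"` for candidate `"A"`, because its normalized score of 0.8 exceeds the assumed threshold.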

    Aspects of Coherence for Entity Analysis

    Natural language understanding is an important topic in natural language processing. Given a text, a computer program should, at the very least, be able to understand what the text is about, and ideally also situate it in its extra-textual context and understand what purpose it serves. What exactly it means to understand what a text is about is an open question, but it is generally accepted that, at a minimum, understanding involves being able to answer questions like “Who did what to whom? Where? When? How? And Why?”. Entity analysis, the computational analysis of entities mentioned in a text, aims to support answering the questions “Who?” and “Whom?” by identifying entities mentioned in a text. If the answers to “Where?” and “When?” are specific, named locations and events, entity analysis can also provide these answers. Entity analysis aims to answer these questions by performing entity linking, that is, linking mentions of entities to their corresponding entry in a knowledge base, coreference resolution, that is, identifying all mentions in a text that refer to the same entity, and entity typing, that is, assigning a label such as Person to mentions of entities. In this thesis, we study how different aspects of coherence can be exploited to improve entity analysis. Our main contribution is a method that allows exploiting knowledge-rich, specific aspects of coherence, namely geographic, temporal, and entity type coherence. Geographic coherence expresses the intuition that entities mentioned in a text tend to be geographically close. Similarly, temporal coherence captures the intuition that entities mentioned in a text tend to be close in the temporal dimension. Entity type coherence is based on the observation that in a text about a certain topic, such as sports, the entities mentioned in it tend to have the same or related entity types, such as sports team or athlete. We show how to integrate features modeling these aspects of coherence into entity linking systems and establish their utility in extensive experiments covering different datasets and systems. Since entity linking often requires computationally expensive joint, global optimization, we propose a simple, but effective rule-based approach that enjoys some of the benefits of joint, global approaches, while avoiding some of their drawbacks. To enable convenient error analysis for system developers, we introduce a tool for visual analysis of entity linking system output. Investigating another aspect of coherence, namely the coherence between a predicate and its arguments, we devise a distributed model of selectional preferences and assess its impact on a neural coreference resolution system. Our final contribution examines how multilingual entity typing can be improved by incorporating subword information. We train and make publicly available subword embeddings in 275 languages and show their utility in a multilingual entity typing task.
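The intuition behind geographic coherence, that entities mentioned in one text tend to be geographically close, can be illustrated with a small sketch. This is not the method of the thesis: the helper names `haversine_km` and `most_coherent`, the use of mean great-circle distance as the only feature, and the example coordinates are assumptions chosen purely for illustration.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def most_coherent(candidates: Dict[str, Coord], context: List[Coord]) -> str:
    """Pick the candidate entity closest, on average, to the context entities.

    `context` holds coordinates of entities already linked in the same text
    (a hypothetical input; a real system would combine many more features).
    """
    if not candidates or not context:
        raise ValueError("candidates and context must be non-empty")

    def mean_distance(name: str) -> float:
        return sum(haversine_km(candidates[name], c) for c in context) / len(context)

    return min(candidates, key=mean_distance)


# The ambiguous mention "Paris" in a text that also mentions Lyon and Marseille.
paris_candidates = {
    "Paris, France": (48.8566, 2.3522),
    "Paris, Texas": (33.6609, -95.5555),
}
french_context = [(45.7640, 4.8357), (43.2965, 5.3698)]
chosen = most_coherent(paris_candidates, french_context)
```

Here `chosen` is the French capital, since it lies far closer to the two context cities than the Texan town does. A distance feature like this would be one signal among several in an entity linking system rather than a decision rule on its own.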

    Semi-Supervised Named Entity Recognition: Learning to Recognize 100 Entity Types with Little Supervision

    Named Entity Recognition (NER) aims to extract and to classify rigid designators in text such as proper names, biological species, and temporal expressions. There has been growing interest in this field of research since the early 1990s. In this thesis, we document a trend moving away from handcrafted rules, and towards machine learning approaches. Still, recent machine learning approaches have a problem with annotated data availability, which is a serious shortcoming in building and maintaining large-scale NER systems. In this thesis, we present an NER system built with very little supervision. Human supervision is indeed limited to listing a few examples of each named entity (NE) type. First, we introduce a proof-of-concept semi-supervised system that can recognize four NE types. Then, we expand its capacities by improving key technologies, and we apply the system to an entire hierarchy comprised of 100 NE types. Our work makes the following contributions: the creation of a proof-of-concept semi-supervised NER system; the demonstration of an innovative noise filtering technique for generating NE lists; the validation of a strategy for learning disambiguation rules using automatically identified, unambiguous NEs; and finally, the development of an acronym detection algorithm, thus solving a rare but very difficult problem in alias resolution. We believe semi-supervised learning techniques are about to break new ground in the machine learning community. In this thesis, we show that limited supervision can build complete NER systems. On standard evaluation corpora, we report performances that compare to baseline supervised systems in the task of annotating NEs in texts.