295 research outputs found

    From Commands to Goal-based Dialogs: A Roadmap to Achieve Natural Language Interaction in RoboCup@Home

    On the one hand, speech is a key aspect of people's communication. On the other, it is widely acknowledged that language proficiency is related to intelligence. Therefore, intelligent robots should be able to understand, at least, people's orders within their application domain. These insights are not new in RoboCup@Home, but we lack a long-term plan to evaluate this approach. In this paper we conduct a brief review of the achievements in automated speech recognition and natural language understanding in RoboCup@Home. Furthermore, we discuss the main challenges to tackle in spoken human-robot interaction within the scope of this competition. Finally, we contribute by presenting a pipelined road map to engender research in the area of natural language understanding applied to domestic service robotics. Comment: 12 pages, 2 tables, 1 figure. Accepted and presented (poster) at the RoboCup 2018 Symposium. In press.

    Natural Language Processing for Under-resourced Languages: Developing a Welsh Natural Language Toolkit

    Language technology is becoming increasingly important across a variety of application domains which have become commonplace in large, well-resourced languages. However, there is a danger that small, under-resourced languages are being increasingly pushed to the technological margins. Under-resourced languages face significant challenges in delivering the underlying language resources necessary to support such applications. This paper describes the development of a natural language processing toolkit for an under-resourced language, Cymraeg (Welsh). Rather than creating the Welsh Natural Language Toolkit (WNLT) from scratch, the approach involved adapting and enhancing the language processing functionality provided for other languages within an existing framework and making use of external language resources where available. This paper begins by introducing the GATE NLP framework, which was used as the development platform for the WNLT. It then describes each of the core modules of the WNLT in turn, detailing the extensions and adaptations required for Welsh language processing. An evaluation of the WNLT is then reported. Following this, two demonstration applications are presented. The first is a simple text mining application that analyses wedding announcements. The second describes the development of a Twitter NLP application, which extends the core WNLT pipeline. As a relatively small-scale project, the WNLT makes use of existing external language resources where possible, rather than creating new resources. This approach of adaptation and reuse can provide a practical and achievable route to developing language resources for under-resourced languages.

    Combining automatic speech recognition with semantic natural language processing in schizophrenia

    Natural language processing (NLP) tools are increasingly used to quantify semantic anomalies in schizophrenia. Automatic speech recognition (ASR) technology, if robust enough, could significantly speed up the NLP research process. In this study, we assessed the performance of a state-of-the-art ASR tool and its impact on diagnostic classification accuracy based on an NLP model. We compared ASR to human transcripts quantitatively (Word Error Rate (WER)) and qualitatively by analyzing error type and position. Subsequently, we evaluated the impact of ASR on classification accuracy using semantic similarity measures. Two random forest classifiers were trained with similarity measures derived from automatic and manual transcriptions, and their performance was compared. The ASR tool had a mean WER of 30.4%. Pronouns and words in sentence-final position had the highest WERs. The classification accuracy was 76.7% (sensitivity 70%; specificity 86%) using automated transcriptions and 79.8% (sensitivity 75%; specificity 86%) for manual transcriptions. The difference in performance between the models was not significant. These findings demonstrate that using ASR for semantic analysis is associated with only a small decrease in accuracy in classifying schizophrenia, compared to manual transcripts. Thus, combining ASR technology with semantic NLP models qualifies as a robust and efficient method for diagnosing schizophrenia.
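The WER metric used to compare ASR output against the human reference transcripts can be sketched as a word-level edit distance; the example below is a minimal illustration (the sentences are invented, not study data):

```python
# Minimal sketch of Word Error Rate (WER):
# WER = (substitutions + deletions + insertions) / reference length,
# computed via word-level Levenshtein edit distance.

def wer(reference: str, hypothesis: str) -> float:
    ref = reference.split()
    hyp = hypothesis.split()
    # DP table: d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution + one deletion against a 4-word reference:
print(wer("the patient spoke slowly", "the patients spoke"))  # 0.5
```

A mean WER of 30.4%, as reported above, thus means roughly three in ten reference words were substituted, deleted, or required an insertion.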

    GailBot: An automatic transcription system for Conversation Analysis

    Researchers studying human interaction, such as conversation analysts, psychologists, and linguists, all rely on detailed transcriptions of language use. Ideally, these should include so-called paralinguistic features of talk, such as overlaps, prosody, and intonation, as they convey important information. However, creating conversational transcripts that include these features by hand requires substantial time from trained transcribers. There are currently no Speech to Text (STT) systems that are able to integrate these features in the generated transcript. To reduce the resources needed to create detailed conversation transcripts that include representation of paralinguistic features, we developed a program called GailBot. GailBot combines STT services with plugins to automatically generate first drafts of transcripts that largely follow the transcription standards common in the field of Conversation Analysis. It also enables researchers to add new plugins to transcribe additional features, or to improve the plugins it currently uses. We describe GailBot’s architecture and its use of computational heuristics and machine learning. We also evaluate its output in relation to transcripts produced by both human transcribers and comparable automated transcription systems. We argue that despite its limitations, GailBot represents a substantial improvement over existing dialogue transcription software.
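The STT-plus-plugins design the abstract describes can be sketched generically; the names below are illustrative, not GailBot's actual API. An STT backend yields word-level timings, and each plugin annotates one paralinguistic feature:

```python
# Illustrative sketch (not GailBot's API) of a plugin pipeline over
# STT output: each plugin reads word timings and adds annotations.
from dataclasses import dataclass, field

@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float

@dataclass
class Transcript:
    words: list
    annotations: list = field(default_factory=list)

def overlap_plugin(t: Transcript) -> None:
    """Flag adjacent words whose time spans overlap -- a crude
    stand-in for Conversation Analysis overlap marking."""
    for a, b in zip(t.words, t.words[1:]):
        if b.start < a.end:
            t.annotations.append(("overlap", a.text, b.text))

def run_pipeline(t: Transcript, plugins: list) -> Transcript:
    # Plugins run in order; each may read earlier annotations.
    for plugin in plugins:
        plugin(t)
    return t

t = Transcript(words=[Word("yeah", 0.0, 0.6), Word("so", 0.4, 0.8)])
run_pipeline(t, [overlap_plugin])
print(t.annotations)  # [('overlap', 'yeah', 'so')]
```

The design choice this illustrates is why plugins are attractive here: new features (prosody, pauses, intonation) can be added or improved without touching the STT backend.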

    Five sources of bias in natural language processing

    Recently, there has been an increased interest in demographically grounded bias in natural language processing (NLP) applications. Much of the recent work has focused on describing bias and providing an overview of bias in a larger context. Here, we provide a simple, actionable summary of this recent work. We outline five sources where bias can occur in NLP systems: (1) the data, (2) the annotation process, (3) the input representations, (4) the models, and finally (5) the research design (or how we conceptualize our research). We explore each of the bias sources in detail in this article, including examples and links to related work, as well as potential countermeasures.

    FrameNet annotation for multimodal corpora: devising a methodology for the semantic representation of text-image interactions in audiovisual productions

    Multimodal analyses have been growing in importance within several approaches to Cognitive Linguistics and applied fields such as Natural Language Understanding. Nonetheless, fine-grained semantic representations of multimodal objects are still lacking, especially in terms of integrating areas such as Natural Language Processing and Computer Vision, which are key for the implementation of multimodality in Computational Linguistics. In this dissertation, we propose a methodology for extending FrameNet annotation to the multimodal domain, since FrameNet can provide fine-grained semantic representations, particularly with a database enriched by Qualia and other interframal and intraframal relations, as is the case of FrameNet Brasil. To make FrameNet Brasil able to conduct multimodal analysis, we outlined the hypothesis that, similarly to the way in which words in a sentence evoke frames and organize their elements in the syntactic locality accompanying them, visual elements in video shots may also evoke frames and organize their elements on the screen, or work complementarily with the frame evocation patterns of the sentences narrated simultaneously to their appearance on screen, providing different profiling and perspective options for meaning construction. The corpus annotated for testing the hypothesis is composed of episodes of a Brazilian TV travel series critically acclaimed as an exemplar of good practices in audiovisual composition. The TV genre chosen also configures a novel experimental setting for research on integrated image and text comprehension, since, in this corpus, text is not a direct description of the image sequence but correlates with it indirectly in a myriad of ways. The dissertation also reports on an eye-tracker experiment conducted to validate the approach proposed for a text-oriented annotation.
The experiment demonstrated that it is not possible to determine that text impacts gaze directly, and was taken as reinforcement for the approach of valorizing the combination of modes. Last, we present the Frame2 dataset, the product of the annotation task carried out for the corpus following both the methodology and guidelines proposed. The results achieved demonstrate that, at least for this TV genre but possibly also for others, a fine-grained semantic annotation tackling the diverse correlations that take place in a multimodal setting provides a new perspective in multimodal comprehension modeling. Moreover, multimodal annotation also enriches the development of FrameNets, to the extent that correlations found between modalities can attest the modeling choices made by those building frame-based resources. Funding: CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior.

    Topic modelling of Finnish Internet discussion forums as a tool for trend identification and marketing applications

    The increasing availability of public discussion text data on the Internet motivates the study of methods to identify current themes and trends. Being able to extract and summarize relevant information from public data in real time gives rise to competitive advantage and applications in the marketing actions of a company. This thesis presents a method of topic modelling and trend identification to extract information from Finnish Internet discussion forums. The development of text analytics, and especially topic modelling techniques, is reviewed, and suitable methods are identified from the literature. The Latent Dirichlet Allocation topic model and the Dynamic Topic Model are applied to find underlying topics in the Internet discussion forum data. The collection of discussion data via web scraping and the text preprocessing methods are presented. Trends are identified with a method derived from outlier detection. Real-world events, such as the news about the Finnish army vegetarian meal day and the Helsinki summit of presidents Trump and Putin, were identified in an unsupervised manner. Applications for marketing are considered, e.g. automatic search engine advert keyword generation and website content recommendation. Future prospects for further improving the developed topical trend identification method are proposed. This includes the use of more complex topic models, an extensive framework for tuning trend identification parameters, and studying the use of more domain-specific text data sources such as blogs, social media feeds or customer feedback.