1,166 research outputs found

    A study of the use of natural language processing for conversational agents

    Get PDF
    Language is a mark of humanity and conscience, with the conversation (or dialogue) as one of the most fundamental manners of communication that we learn as children. Therefore one way to make a computer more attractive for interaction with users is through the use of natural language. Among the systems with some degree of language capabilities developed, the Eliza chatterbot is probably the first with a focus on dialogue. In order to make the interaction more interesting and useful to the user there are other approaches besides chatterbots, like conversational agents. These agents generally have, to some degree, properties like: a body (with cognitive states, including beliefs, desires and intentions or objectives); an interactive incorporation in the real or virtual world (including perception of events, communication, ability to manipulate the world and communicate with others); and behavior similar to a human (including affective abilities). This type of agents has been called by several terms, including animated agents or embedded conversational agents (ECA). A dialogue system has six basic components. (1) The speech recognition component is responsible for translating the user’s speech into text. (2) The Natural Language Understanding component produces a semantic representation suitable for dialogues, usually using grammars and ontologies. (3) The Task Manager chooses the concepts to be expressed to the user. (4) The Natural Language Generation component defines how to express these concepts in words. (5) The dialog manager controls the structure of the dialogue. (6) The synthesizer is responsible for translating the agents answer into speech. However, there is no consensus about the necessary resources for developing conversational agents and the difficulties involved (especially in resource-poor languages). This work focuses on the influence of natural language components (dialogue understander and manager) and analyses, in particular the use of parsing systems as part of developing conversational agents with more flexible language capabilities. This work analyses what kind of parsing resources contributes to conversational agents and discusses how to develop them targeting Portuguese, which is a resource-poor language. To do so we analyze approaches to the understanding of natural language, and identify parsing approaches that offer good performance, based on which we develop a prototype to evaluate the impact of using a parser in a conversational agent.linguagem é uma marca da humanidade e da consciência, sendo a conversação (ou diálogo) uma das maneiras de comunicacão mais fundamentais que aprendemos quando crianças. Por isso uma forma de fazer um computador mais atrativo para interação com usuários é usando linguagem natural. Dos sistemas com algum grau de capacidade de linguagem desenvolvidos, o chatterbot Eliza é, provavelmente, o primeiro sistema com foco em diálogo. Com o objetivo de tornar a interação mais interessante e útil para o usuário há outras aplicações alem de chatterbots, como agentes conversacionais. Estes agentes geralmente possuem, em algum grau, propriedades como: corpo (com estados cognitivos, incluindo crenças, desejos e intenções ou objetivos); incorporação interativa no mundo real ou virtual (incluindo percepções de eventos, comunicação, habilidade de manipular o mundo e comunicar com outros agentes); e comportamento similar ao humano (incluindo habilidades afetivas). Este tipo de agente tem sido chamado de diversos nomes como agentes animados ou agentes conversacionais incorporados. Um sistema de diálogo possui seis componentes básicos. (1) O componente de reconhecimento de fala que é responsável por traduzir a fala do usuário em texto. (2) O componente de entendimento de linguagem natural que produz uma representação semântica adequada para diálogos, normalmente utilizando gramáticas e ontologias. (3) O gerenciador de tarefa que escolhe os conceitos a serem expressos ao usuário. (4) O componente de geração de linguagem natural que define como expressar estes conceitos em palavras. (5) O gerenciador de diálogo controla a estrutura do diálogo. (6) O sintetizador de voz é responsável por traduzir a resposta do agente em fala. No entanto, não há consenso sobre os recursos necessários para desenvolver agentes conversacionais e a dificuldade envolvida nisso (especialmente em línguas com poucos recursos disponíveis). Este trabalho foca na influência dos componentes de linguagem natural (entendimento e gerência de diálogo) e analisa em especial o uso de sistemas de análise sintática (parser) como parte do desenvolvimento de agentes conversacionais com habilidades de linguagem mais flexível. Este trabalho analisa quais os recursos do analisador sintático contribuem para agentes conversacionais e aborda como os desenvolver, tendo como língua alvo o português (uma língua com poucos recursos disponíveis). Para isto, analisamos as abordagens de entendimento de linguagem natural e identificamos as abordagens de análise sintática que oferecem um bom desempenho. Baseados nesta análise, desenvolvemos um protótipo para avaliar o impacto do uso de analisador sintático em um agente conversacional

    Computational linguistics in the Netherlands 1996 : papers from the 7th CLIN meeting, November 15, 1996, Eindhoven

    Get PDF

    Apperceptive patterning: Artefaction, extensional beliefs and cognitive scaffolding

    Get PDF
    In “Psychopower and Ordinary Madness” my ambition, as it relates to Bernard Stiegler’s recent literature, was twofold: 1) critiquing Stiegler’s work on exosomatization and artefactual posthumanism—or, more specifically, nonhumanism—to problematize approaches to media archaeology that rely upon technical exteriorization; 2) challenging how Stiegler engages with Giuseppe Longo and Francis Bailly’s conception of negative entropy. These efforts were directed by a prevalent techno-cultural qualifier: the rise of Synthetic Intelligence (including neural nets, deep learning, predictive processing and Bayesian models of cognition). This paper continues this project but first directs a critical analytic lens at the Derridean practice of the ontologization of grammatization from which Stiegler emerges while also distinguishing how metalanguages operate in relation to object-oriented environmental interaction by way of inferentialism. Stalking continental (Kapp, Simondon, Leroi-Gourhan, etc.) and analytic traditions (e.g., Carnap, Chalmers, Clark, Sutton, Novaes, etc.), we move from artefacts to AI and Predictive Processing so as to link theories related to technicity with philosophy of mind. Simultaneously drawing forth Robert Brandom’s conceptualization of the roles that commitments play in retrospectively reconstructing the social experiences that lead to our endorsement(s) of norms, we compliment this account with Reza Negarestani’s deprivatized account of intelligence while analyzing the equipollent role between language and media (both digital and analog)

    Proceedings of the Seventh International Conference Formal Approaches to South Slavic and Balkan languages

    Get PDF
    Proceedings of the Seventh International Conference Formal Approaches to South Slavic and Balkan Languages publishes 17 papers that were presented at the conference organised in Dubrovnik, Croatia, 4-6 Octobre 2010

    Computational linguistics in the Netherlands 1996 : papers from the 7th CLIN meeting, November 15, 1996, Eindhoven

    Get PDF

    Proceedings

    Get PDF
    Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 268 pages. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891

    Semantic consistency in text generation

    Get PDF
    Automatic input-grounded text generation tasks process input texts and generate human-understandable natural language text for the processed information. The development of neural sequence-to-sequence (seq2seq) models, which are usually trained in an end-to-end fashion, pushed the frontier of the performance on text generation tasks expeditiously. However, they are claimed to be defective in semantic consistency w.r.t. their corresponding input texts. Also, not only the models are to blame. The corpora themselves always include examples whose output is semantically inconsistent to its input. Any model that is agnostic to such data divergence issues will be prone to semantic inconsistency. Meanwhile, the most widely-used overlap-based evaluation metrics comparing the generated texts to their corresponding references do not evaluate the input-output semantic consistency explicitly, which makes this problem hard to detect. In this thesis, we focus on studying semantic consistency in three automatic text generation scenarios: Data-to-text Generation, Single Document Abstractive Summarization, and Chit-chat Dialogue Generation, by seeking for the answers to the following research questions: (1) how to define input-output semantic consistency in different text generation tasks? (2) how to quantitatively evaluate the input-output semantic consistency? (3) how to achieve better semantic consistency in individual tasks? We systematically define the semantic inconsistency phenomena in these three tasks as omission, intrinsic hallucination, and extrinsic hallucination. For Data-to-text Generation, we jointly learn a sentence planner that tightly controls which part of input source gets generated in what sequence, with a neural seq2seq text generator, to decrease all three types of semantic inconsistency in model-generated texts. The evaluation results confirm that the texts generated by our model contain much less omissions while maintaining low level of extrinsic hallucinations without sacrificing fluency compared to seq2seq models. For Single Document Abstractive Summarization, we reduce the level of extrinsic hallucinations in training data by automatically introducing assisting articles to each document-summary instance to provide the supplemental world-knowledge that is present in the summary but missing from the doc ument. With the help of a novel metric, we show that seq2seq models trained with as sisting articles demonstrate less extrinsic hallucinations than the ones trained without them. For Chit-chat Dialogue Generation, by filtering out the omitted and hallucinated examples from training set using a newly introduced evaluation metric, and encoding it into the neural seq2seq response generation models as a control factor, we diminish the level of omissions and extrinsic hallucinations in the generated dialogue responses

    A Computational Theory of the Use-Mention Distinction in Natural Language

    Get PDF
    To understand the language we use, we sometimes must turn language on itself, and we do this through an understanding of the use-mention distinction. In particular, we are able to recognize mentioned language: that is, tokens (e.g., words, phrases, sentences, letters, symbols, sounds) produced to draw attention to linguistic properties that they possess. Evidence suggests that humans frequently employ the use-mention distinction, and we would be severely handicapped without it; mentioned language frequently occurs for the introduction of new words, attribution of statements, explanation of meaning, and assignment of names. Moreover, just as we benefit from mutual recognition of the use-mention distinction, the potential exists for us to benefit from language technologies that recognize it as well. With a better understanding of the use-mention distinction, applications can be built to extract valuable information from mentioned language, leading to better language learning materials, precise dictionary building tools, and highly adaptive computer dialogue systems. This dissertation presents the first computational study of how the use-mention distinction occurs in natural language, with a focus on occurrences of mentioned language. Three specific contributions are made. The first is a framework for identifying and analyzing instances of mentioned language, in an effort to reconcile elements of previous theoretical work for practical use. Definitions for mentioned language, metalanguage, and quotation have been formulated, and a procedural rubric has been constructed for labeling instances of mentioned language. The second is a sequence of three labeled corpora of mentioned language, containing delineated instances of the phenomenon. The corpora illustrate the variety of mentioned language, and they enable analysis of how the phenomenon relates to sentence structure. Using these corpora, inter-annotator agreement studies have quantified the concurrence of human readers in labeling the phenomenon. The third contribution is a method for identifying common forms of mentioned language in text, using patterns in metalanguage and sentence structure. Although the full breadth of the phenomenon is likely to elude computational tools for the foreseeable future, some specific, common rules for detecting and delineating mentioned language have been shown to perform well
    corecore