582 research outputs found

    A study of the use of natural language processing for conversational agents

    Language is a mark of humanity and consciousness, and conversation (or dialogue) is one of the most fundamental forms of communication that we learn as children. One way to make a computer more attractive for interaction with users is therefore through the use of natural language. Among the systems developed with some degree of language capability, the Eliza chatterbot is probably the first with a focus on dialogue. To make the interaction more interesting and useful to the user, there are other approaches besides chatterbots, such as conversational agents. These agents generally have, to some degree, properties such as: a body (with cognitive states, including beliefs, desires and intentions or objectives); an interactive embodiment in the real or virtual world (including perception of events, communication, and the ability to manipulate the world and communicate with other agents); and human-like behavior (including affective abilities). This type of agent has gone by several names, including animated agents and embodied conversational agents (ECA). A dialogue system has six basic components. (1) The speech recognition component translates the user's speech into text. (2) The Natural Language Understanding component produces a semantic representation suitable for dialogues, usually using grammars and ontologies. (3) The Task Manager chooses the concepts to be expressed to the user. (4) The Natural Language Generation component defines how to express these concepts in words. (5) The Dialogue Manager controls the structure of the dialogue. (6) The synthesizer translates the agent's answer into speech. However, there is no consensus about the resources needed to develop conversational agents or about the difficulties involved (especially in resource-poor languages). This work focuses on the influence of the natural language components (dialogue understanding and dialogue management) and analyses, in particular, the use of parsing systems as part of developing conversational agents with more flexible language capabilities. It analyses which kinds of parsing resources contribute to conversational agents and discusses how to develop them for Portuguese, which is a resource-poor language. To do so, we analyze approaches to natural language understanding and identify parsing approaches that offer good performance. Based on this analysis, we develop a prototype to evaluate the impact of using a parser in a conversational agent.
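    The six components listed in the abstract above can be pictured as a simple pipeline. The following is a minimal, text-only sketch and not the prototype described in the thesis: every function name, the keyword-based understanding step, the canned phrases, and the order in which the components are wired together are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    """Hypothetical dialogue-manager memory: the intents seen so far."""
    history: list = field(default_factory=list)


def recognize_speech(audio: str) -> str:
    # (1) Speech recognition stub: the "audio" is assumed to be text already.
    return audio.strip().lower()


def understand(text: str) -> dict:
    # (2) NLU stub: keyword spotting stands in for grammars and ontologies.
    words = text.replace("?", "").replace("!", "").split()
    if "hello" in words or "hi" in words:
        return {"intent": "greet"}
    if "time" in words:
        return {"intent": "ask_time"}
    return {"intent": "unknown"}


def manage_dialogue(frame: dict, state: DialogueState) -> dict:
    # (5) Dialogue manager: records the turn and flags repeated intents.
    repeated = frame["intent"] in state.history
    state.history.append(frame["intent"])
    return {**frame, "repeated": repeated}


def choose_concepts(frame: dict) -> list:
    # (3) Task manager: picks the concepts to express to the user.
    table = {
        "greet": ["greeting"],
        "ask_time": ["time_unavailable"],
        "unknown": ["clarification_request"],
    }
    concepts = list(table[frame["intent"]])
    if frame["repeated"]:
        concepts.insert(0, "acknowledge_repeat")
    return concepts


def generate(concepts: list) -> str:
    # (4) NLG: maps each concept to a canned phrase.
    phrases = {
        "greeting": "Hello.",
        "time_unavailable": "I cannot tell the time.",
        "clarification_request": "Could you rephrase that?",
        "acknowledge_repeat": "Once again:",
    }
    return " ".join(phrases[c] for c in concepts)


def synthesize(text: str) -> str:
    # (6) Synthesizer stub: returns a tagged string instead of audio.
    return f"<speech>{text}</speech>"


def respond(audio: str, state: DialogueState) -> str:
    """Run one user turn through all six components."""
    text = recognize_speech(audio)
    frame = manage_dialogue(understand(text), state)
    return synthesize(generate(choose_concepts(frame)))
```

    In this sketch a parser would replace the keyword spotting inside `understand`, which is the component whose impact the abstract sets out to evaluate.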

    Gapping as Constituent Coordination

    A number of coordinate constructions in natural languages conjoin sequences which do not appear to correspond to syntactic constituents in the traditional sense. One striking instance of the phenomenon is afforded by the gapping construction of English, of which the following sentence is a simple example: (1) Harry eats beans, and Fred, potatoes. Since all theories agree that coordination must in fact be an operation upon constituents, most of them have dealt with the apparent paradox presented by such constructions by supposing that such sequences as the right conjunct in the above example, "Fred, potatoes", should be treated in the grammar as traditional constituents, of type S, but with pieces missing or deleted.

    Da linguística gerativa à gramática categorial: sujeitos lexicais em infinitivos controlados [From generative linguistics to categorial grammar: lexical subjects in controlled infinitives]

    Advisors: Marcelo Esteban Coniglio, Sonia Maria Lazzarini Cyrino. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Filosofia e Ciências Humanas. Abstract: The present thesis lies at the interface of logic and linguistics; its object of study is control sentences with overt pronouns in Romance languages (European and Brazilian Portuguese, Italian and Spanish). This is a topic that has received considerably more attention from linguists, especially in recent years, than from logicians. Perhaps for this reason, much remains to be understood about these linguistic structures and their underlying logical properties. This thesis seeks to fill the gaps in the literature, or at least to take steps in this direction, by addressing a number of issues that have so far been under-explored. To this end we put forward two key questions, one linguistic and the other logical. These are, respectively: What is the syntactic status of the surface pronoun? And: What are the available mechanisms to reuse semantic resources in a contraction-free logical grammar? Accordingly, the thesis is divided into two parts: generative linguistics and categorial grammar. Part I starts by reviewing the recent discussion within the generative literature on infinitive clauses with overt subjects, paying detailed attention to the main accounts in the field. Part II does the same on the logical grammar front, addressing in particular the issues of control and of anaphoric pronouns. Ultimately, the leading accounts from both camps will be found wanting. The closing chapters of Part I and Part II therefore put forward alternative candidates that we contend are more successful than their predecessors. More specifically, in Part I we offer a linguistic account along the lines of Landau's T/Agr theory of control. In Part II we present two alternative categorial accounts: one based on Combinatory Categorial Grammar, the other on Type-Logical Grammar. Each of these accounts offers an improved, more fine-grained perspective on control infinitives featuring overt pronominal subjects. Finally, we include an Appendix in which our type-logical proposal is implemented in a categorial parser/theorem-prover. Degree: Doctorate in Philosophy (Doutora em Filosofia). Grants: 2013/08115-1, 2015/09699-2. Funding: FAPESP, CAPE

    Grammars and Processors

    The paper discusses the role of grammars in sentence processing and explores some consequences of the Strong Competence Hypothesis of Bresnan and Kaplan for combinatory theories of grammar.

    Derivation and structure in categorial grammar


    CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania

    The Computational Linguistics Feedback Forum (CLiFF) is a group of students and faculty who gather once a week to discuss the members' current research. As the word feedback suggests, the group's purpose is the sharing of ideas. The group also promotes interdisciplinary contacts between researchers who share an interest in Cognitive Science. There is no single theme describing the research in Natural Language Processing at Penn. There is work done in CCG, Tree adjoining grammars, intonation, statistical methods, plan inference, instruction understanding, incremental interpretation, language acquisition, syntactic parsing, causal reasoning, free word order languages, ... and many other areas. With this in mind, rather than trying to summarize the varied work currently underway here at Penn, we suggest reading the following abstracts to see how the students and faculty themselves describe their work. Their abstracts illustrate the diversity of interests among the researchers, explain the areas of common interest, and describe some very interesting work in Cognitive Science. This report is a collection of abstracts from both faculty and graduate students in Computer Science, Psychology and Linguistics. We pride ourselves on the close working relations between these groups, as we believe that the communication among the different departments and the ongoing inter-departmental research not only improves the quality of our work, but makes much of that work possible.

    Structure and Intonation


    Distributed Representations for Compositional Semantics

    The mathematical representation of semantics is a key issue for Natural Language Processing (NLP). A lot of research has been devoted to finding ways of representing the semantics of individual words in vector spaces. Distributional approaches (meaning distributed representations that exploit co-occurrence statistics of large corpora) have proved popular and successful across a number of tasks. However, natural language usually comes in structures beyond the word level, with meaning arising not only from the individual words but also from the structure they are contained in at the phrasal or sentential level. Modelling the compositional process by which the meaning of an utterance arises from the meaning of its parts is an equally fundamental task of NLP. This dissertation explores methods for learning distributed semantic representations and models for composing these into representations for larger linguistic units. Our underlying hypothesis is that neural models are a suitable vehicle for learning semantically rich representations and that such representations in turn are suitable vehicles for solving important tasks in natural language processing. The contribution of this thesis is a thorough evaluation of our hypothesis, as part of which we introduce several new approaches to representation learning and compositional semantics, as well as multiple state-of-the-art models which apply distributed semantic representations to various tasks in NLP. Comment: DPhil Thesis, University of Oxford, Submitted and accepted in 201
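    The difference between word-level vectors and structure-aware composition, as described in the abstract above, can be illustrated with a small sketch. This is not one of the models from the dissertation: the vector values, the fixed weights, and the `tanh` combination are all invented for illustration. It contrasts an order-insensitive sum with a recursive composition over a binary tree.

```python
import math

# Toy 3-dimensional word vectors; the values are invented for illustration.
VECTORS = {
    "dog": [1.0, 0.0, 0.5],
    "bites": [0.0, 1.0, 0.0],
    "man": [0.5, 0.0, 1.0],
}


def compose_additive(words):
    """Bag-of-words composition: element-wise sum, so word order is lost."""
    dims = len(next(iter(VECTORS.values())))
    total = [0.0] * dims
    for word in words:
        total = [t + v for t, v in zip(total, VECTORS[word])]
    return total


def compose_tree(tree, left_weight=0.7, right_weight=0.3):
    """Recursive composition over a binary tree given as nested tuples.

    Each node is tanh(left_weight * left + right_weight * right), so the
    result depends on both word order and bracketing. The weights are
    fixed here (an assumption); a neural model would learn them.
    """
    if isinstance(tree, str):
        return VECTORS[tree]
    left, right = tree
    lv = compose_tree(left, left_weight, right_weight)
    rv = compose_tree(right, left_weight, right_weight)
    return [math.tanh(left_weight * a + right_weight * b) for a, b in zip(lv, rv)]
```

    With these toy vectors, `compose_additive(["dog", "bites", "man"])` and `compose_additive(["man", "bites", "dog"])` give the same vector, whereas `compose_tree(("dog", ("bites", "man")))` and `compose_tree(("man", ("bites", "dog")))` differ.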

    CCGbank: User's Manual


    Students' language in computer-assisted tutoring of mathematical proofs

    Truth and proof are central to mathematics. Proving (or disproving) seemingly simple statements often turns out to be one of the hardest mathematical tasks. Yet doing proofs is rarely taught in the classroom. Studies on cognitive difficulties in learning to do proofs have shown that pupils and students often do not understand or cannot apply basic formal reasoning techniques and do not know how to use formal mathematical language. At a far more fundamental level, they also do not understand what it means to prove a statement, or do not see the purpose of proof at all. Since insight into the importance of proof and of doing proofs cannot be learnt other than by practice, learning support through individualised tutoring is in demand. This volume presents part of an interdisciplinary project, set at the intersection of pedagogical science, artificial intelligence, and (computational) linguistics, which investigated issues involved in providing computer-based tutoring of mathematical proofs through dialogue in natural language. The ultimate goal in this context, addressing the above-mentioned need for learning support, is to build intelligent automated tutoring systems for mathematical proofs. The research presented here focuses on the language that students use while interacting with such a system: its linguistic properties and computational modelling. Contributions are made at three levels: first, language phenomena found in students' input to a (simulated) proof tutoring system are analysed and the variety of students' verbalisations is quantitatively assessed; second, a general computational processing strategy for informal mathematical language and methods of modelling prominent language phenomena are proposed; and third, the prospects for natural language as an input modality for proof tutoring systems are evaluated based on collected corpora.