78 research outputs found

    Query types in the meeting domain: assessing the role of argumentative structure in answering questions on meeting discussion records

    Get PDF
    We define a new task of question answering on meeting records and assess its difficulty in terms of types of information and retrieval techniques required. The importance of this task is revealed by the increasingly growing interest in the design of sophisticated interfaces for accessing meeting records such as meeting browsers. We ground our work on the empirical analysis of elicited user queries. We assess what is the type of information sought by the users and perform a user query classification along several semantic dimensions of the meeting content. We found that queries about argumentative processes and outcomes represent the majority among the elicited queries (about 60%). We also assess the difficulty in answering the queries and focus on the requirements of a prospective QA system to successfully deal with them. Our results suggest that standard Information Retrieval and Question Answering alone can only account for less than 20% of the queries and need to be completed with additional type of information and inference

    Social talk capabilities for dialogue systems

    Get PDF
    Small talk capabilities are an important but very challenging extension to dialogue systems. Small talk (or social talk) refers to a kind of conversation, which does not focus on the exchange of information, but on the negotiation of social roles and situations. The goal of this thesis is to provide knowledge, processes and structures that can be used by dialogue systems to satisfactorily participate in social conversations. For this purpose the thesis presents research in the areas of natural-language understanding, dialogue management and error handling. Nine new models of social talk based on a data analysis of small talk conversations are described. The functionally-motivated and content-abstract models can be used for small talk conversations on various topics. The basic elements of the models consist of dialogue acts for social talk newly developed on basis of social science theory. The thesis also presents some conversation strategies for the treatment of so-called out-of-domain (OoD) utterances that can be used to avoid errors in the input understanding of dialogue systems. Additionally, the thesis describes a new extension to dialogue management that flexibly manages interwoven dialogue threads. The small talk models as well as the strategies for handling OoD utterances are encoded as computational dialogue threads

    Social talk capabilities for dialogue systems

    Get PDF
    Small talk capabilities are an important but very challenging extension to dialogue systems. Small talk (or “social talk”) refers to a kind of conversation, which does not focus on the exchange of information, but on the negotiation of social roles and situations. The goal of this thesis is to provide knowledge, processes and structures that can be used by dialogue systems to satisfactorily participate in social conversations. For this purpose the thesis presents research in the areas of natural-language understanding, dialogue management and error handling. Nine new models of social talk based on a data analysis of small talk conversations are described. The functionally-motivated and content-abstract models can be used for small talk conversations on various topics. The basic elements of the models consist of dialogue acts for social talk newly developed on basis of social science theory. The thesis also presents some conversation strategies for the treatment of so-called “out-of-domain” (OoD) utterances that can be used to avoid errors in the input understanding of dialogue systems. Additionally, the thesis describes a new extension to dialogue management that flexibly manages interwoven dialogue threads. The small talk models as well as the strategies for handling OoD utterances are encoded as computational dialogue threads

    Anaphora resolution for Arabic machine translation :a case study of nafs

    Get PDF
    PhD ThesisIn the age of the internet, email, and social media there is an increasing need for processing online information, for example, to support education and business. This has led to the rapid development of natural language processing technologies such as computational linguistics, information retrieval, and data mining. As a branch of computational linguistics, anaphora resolution has attracted much interest. This is reflected in the large number of papers on the topic published in journals such as Computational Linguistics. Mitkov (2002) and Ji et al. (2005) have argued that the overall quality of anaphora resolution systems remains low, despite practical advances in the area, and that major challenges include dealing with real-world knowledge and accurate parsing. This thesis investigates the following research question: can an algorithm be found for the resolution of the anaphor nafs in Arabic text which is accurate to at least 90%, scales linearly with text size, and requires a minimum of knowledge resources? A resolution algorithm intended to satisfy these criteria is proposed. Testing on a corpus of contemporary Arabic shows that it does indeed satisfy the criteria.Egyptian Government

    Quantifying mutual-understanding in dialogue

    Get PDF
    PhDThere are two components of communication that provide a natural index of mutual-understanding in dialogue. The first is Repair; the ways in which people detect and deal with problems with understanding. The second is Ellipsis/Anaphora; the use of expressions that depend directly on the accessibility of the local context for their interpretation. This thesis explores the use of these two phenomena in systematic comparative analyses of human-human dialogue under different task and media conditions. In order to do this it is necessary to a) develop reliable, valid protocols for coding the different Repair and Ellipsis/Anaphora phenomena b) establish their baseline patterns of distribution in conversation and c) model their basic statistical inter-relationships and their predictive value. Two new protocols for coding Repair and Ellipsis/Anaphora phenomena are presented and applied to two dialogue corpora, one of ordinary 'everyday' conversations and one of task-oriented dialogues. These data illustrate that there are significant differences in how understanding is created and negotiated across conditions. Repair is shown to be a ubiquitous feature in all dialogue. The goals of the speaker directly affect the type of Repair used. Giving instructions leads to a higher rate of self-editing; following instructions increases corrections and requests for clarification. Medium and familiarity also influence Repair; when eye contact is not possible there are a greater number of repeats and clarifications. Anaphora are used less frequently in task-oriented dialogue whereas types of Ellipsis increase. The use of Elliptical phrases that check, confirm or acknowledge is higher when there is no eye contact. Familiar pairs use more elliptical expressions, especially endophora and elliptical questions. Following instructions leads to greater use of elliptical (non-sentential) phrases. Medium, task and social norms all have a measureable effect on the components of dialogue that underpin mutual-understanding

    Sentence-level sentiment tagging across different domains and genres

    Get PDF
    The demand for information about sentiment expressed in texts has stimulated a growing interest into automatic sentiment analysis in Natural Language Processing (NLP). This dissertation is motivated by an unmet need for high-performance domain-independent sentiment taggers and by pressing theoretical questions in NLP, where the exploration of limitations of specific approaches, as well as synergies between them, remain practically unaddressed. This study focuses on sentiment tagging at the sentence level and covers four genres: news, blogs, movie reviews, and product reviews. It draws comparisons between sentiment annotation at different linguistic levels (words, sentences, and texts) and highlights the key differences between supervised machine learning methods that rely on annotated corpora (corpus-based, CBA) and lexicon-based approaches (LBA) to sentiment tagging. Exploring the performance of supervised corpus-based approach to sentiment tagging, this study highlights the strong domain-dependence of the CBA. I present the development of LBA approaches based on general lexicons, such as WordNet, as a potential solution to the domain portability problem. A system for sentiment marker extraction from WordNet's relations and glosses is developed and used to acquire lists for a lexicon-based system for sentiment annotation at the sentence and text levels. It demonstrates that LBA's performance across domains is more stable than that of CBA. Finally, the study proposes an integration of LBA and CBA in an ensemble of classifiers using a precision-based voting technique that allows the ensemble system to incorporate the best features of both CBA and LBA. This combined approach outperforms both base learners and provides a promising solution to the domain-adaptation problem. The study contributes to NLP (1) by developing algorithms for automatic acquisition of sentiment-laden words from dictionary definitions; (2) by conducting a systematic study of approaches to sentiment classification and of factors affecting their performance; (3) by refining the lexicon-based approach by introducing valence shifter handling and parse tree information; and (4) by development of the combined, CBA/LBA approach that brings together the strengths of the two approaches and allows domain-adaptation with limited amounts of labeled training data
    • 

    corecore