
    Topic Independent Identification of Agreement and Disagreement in Social Media Dialogue

    Research on the structure of dialogue has been hampered for years because large dialogue corpora have not been available. This has impacted the dialogue research community's ability to develop better theories, as well as good off-the-shelf tools for dialogue processing. Happily, an increasing amount of information and opinion exchange occurs in natural dialogue in online forums, where people share their opinions about a vast range of topics. In particular, we are interested in rejection in dialogue, also called disagreement and denial, where the size of available dialogue corpora, for the first time, offers an opportunity to empirically test theoretical accounts of the expression and inference of rejection in dialogue. In this paper, we test whether topic-independent features motivated by theoretical predictions can be used to recognize rejection in online forums in a topic-independent way. Our results show that our theoretically motivated features achieve 66% accuracy, an absolute improvement of 6% over a unigram baseline. Comment: @inproceedings{Misra2013TopicII, title={Topic Independent Identification of Agreement and Disagreement in Social Media Dialogue}, author={Amita Misra and Marilyn A. Walker}, booktitle={SIGDIAL Conference}, year={2013}}

    Annotated Corpus for Citation Context Analysis

    In this paper, we present a corpus composed of 85 scientific articles annotated with 2092 citations analyzed using context analysis. We obtained high inter-annotator agreement, which supports the reliability and reproducibility of the annotation performed independently by three coders. We applied this corpus to classify citations according to qualitative criteria, using a medium-granularity categorization scheme enriched with annotated keywords and labels to obtain high granularity. The annotation schema handles three dimensions: PURPOSE, POLARITY, and ASPECTS. Citation purpose defines the function classification: use, critique, comparison, and background, with more specific classes established using keywords: Based on, Supply; Useful; Contrast; Acknowledge, Corroboration, Debate; Weakness and Hedges. Citation aspects complement the citation characterization: concept, method, data, tool, task, among others. Polarity has three levels: Positive, Negative, and Neutral. We developed the schema and annotated the corpus focusing on applications for citation influence assessment, but we suggest that applications such as summary generation and information retrieval could also use this annotated corpus because the scheme is organized in clearly defined general dimensions.
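The abstract above reports high inter-annotator agreement among three coders but does not name the statistic used. As a minimal sketch, assuming Fleiss' kappa (a common choice for more than two coders) and using made-up polarity labels, agreement could be computed as follows. The function name `fleiss_kappa` and the sample data are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of coders.

    ratings: list of per-item label lists, e.g. [["Positive", "Positive", "Neutral"], ...]
    categories: all labels the coders could assign
    """
    n_items = len(ratings)
    n_coders = len(ratings[0])
    # counts[i][c] = number of coders who assigned category c to item i
    counts = [Counter(item) for item in ratings]
    # Observed agreement: proportion of agreeing coder pairs, averaged over items.
    p_bar = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_coders)
        / (n_coders * (n_coders - 1))
        for c in counts
    ) / n_items
    # Chance agreement: sum of squared overall category proportions.
    p_e = sum(
        (sum(c[cat] for c in counts) / (n_items * n_coders)) ** 2
        for cat in categories
    )
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical polarity annotations from three coders for four citations.
POLARITY = ["Positive", "Negative", "Neutral"]
sample = [
    ["Positive", "Positive", "Positive"],
    ["Negative", "Negative", "Negative"],
    ["Neutral", "Neutral", "Positive"],
    ["Neutral", "Neutral", "Neutral"],
]
print(round(fleiss_kappa(sample, POLARITY), 3))
```

The same function applies unchanged to the PURPOSE and ASPECTS dimensions by passing their label sets as `categories`. It is undefined when every coder assigns the same single category to every item, because chance agreement is then 1.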

    Tension Analysis in Survivor Interviews: A Computational Approach

    Tension is an emotional experience that can occur in different contexts. This phenomenon can originate from a conflict of interest or uneasiness during an interview. In some contexts, such experiences are associated with negative emotions such as fear or distress. People tend to adopt different hedging strategies in such situations to avoid criticism or evade questions. In this thesis, we analyze several survivor interview transcripts to determine the characteristics that play crucial roles in tension situations. We discuss key components of tension experiences and propose a natural language processing model that can effectively combine these components to identify tension points in text-based oral history interviews. We validate the efficacy of our model and its components with experiments on some standard datasets. The model provides a framework that can be used in future research on tension phenomena in oral history interviews.

    Classification of non-heat generating outdoor objects in thermal scenes for autonomous robots

    We have designed and implemented a physics-based adaptive Bayesian pattern classification model that uses a passive thermal infrared imaging system to automatically characterize non-heat generating objects in unstructured outdoor environments for mobile robots. In the context of this research, non-heat generating objects are defined as objects that are not a source for their own emission of thermal energy, and so exclude people, animals, vehicles, etc. The resulting classification model complements an autonomous bot's situational awareness by providing the ability to classify smaller structures commonly found in the immediate operational environment. Since GPS depends on the availability of satellites and onboard terrain maps which are often unable to include enough detail for smaller structures found in an operational environment, bots will require the ability to make decisions such as "go through the hedges" or "go around the brick wall." A thermal infrared imaging modality mounted on a small mobile bot is a favorable choice for receiving enough detailed information to automatically interpret objects at close ranges while unobtrusively traveling alongside pedestrians. The classification of indoor objects and heat generating objects in thermal scenes is a solved problem. A missing and essential piece in the literature has been research involving the automatic characterization of non-heat generating objects in outdoor environments using a thermal infrared imaging modality for mobile bots. Seeking to classify non-heat generating objects in outdoor environments using a thermal infrared imaging system is a complex problem due to the variation of radiance emitted from the objects as a result of the diurnal cycle of solar energy. The model that we present will allow bots to see beyond vision to autonomously assess the physical nature of the surrounding structures for making decisions without the need for an interpretation by humans.
    Our approach is an application of Bayesian statistical pattern classification where learning involves labeled classes of data (supervised classification), assumes no formal structure regarding the density of the data in the classes (nonparametric density estimation), and makes direct use of prior knowledge regarding an object class's existence in a bot's immediate area of operation when making decisions regarding class assignments for unknown objects. We have used a mobile bot to systematically capture thermal infrared imagery for two categories of non-heat generating objects (extended and compact) in several different geographic locations. The extended objects consist of objects that extend beyond the thermal camera's field of view, such as brick walls, hedges, picket fences, and wood walls. The compact objects consist of objects that are within the thermal camera's field of view, such as steel poles and trees. We used these large representative data sets to explore the behavior of thermal-physical features generated from the signals emitted by the classes of objects and design our Adaptive Bayesian Classification Model. We demonstrate that our novel classification model not only displays exceptional performance in characterizing non-heat generating outdoor objects in thermal scenes but it also outperforms the traditional KNN and Parzen classifiers.
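The abstract above describes a supervised Bayesian classifier that estimates class densities nonparametrically and weights them by prior knowledge of which object classes are present. The following is a minimal sketch of that general idea using a Gaussian Parzen-window density estimate. It is not the authors' Adaptive Bayesian Classification Model; the function names, the two-dimensional feature vectors, the bandwidth, and the priors are all illustrative assumptions.

```python
import math


def parzen_density(x, samples, bandwidth):
    """Gaussian Parzen-window estimate of p(x | class) from labeled samples."""
    d = len(x)
    norm = (bandwidth * math.sqrt(2 * math.pi)) ** d
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / (len(samples) * norm)


def classify(x, training, priors, bandwidth=1.0):
    """Return (best_label, posteriors) using Bayes' rule: prior * class density."""
    scores = {
        label: priors[label] * parzen_density(x, samples, bandwidth)
        for label, samples in training.items()
    }
    evidence = sum(scores.values())
    if evidence == 0.0:
        # All densities underflowed to zero; fall back to the priors alone.
        posteriors = dict(priors)
    else:
        posteriors = {label: s / evidence for label, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors


# Hypothetical two-dimensional thermal features for two object classes.
training = {
    "hedge": [(1.0, 2.0), (1.2, 1.8), (0.9, 2.2)],
    "brick wall": [(3.0, 0.5), (3.2, 0.7), (2.9, 0.4)],
}
# Assumed prior belief about which classes exist in the area of operation.
priors = {"hedge": 0.5, "brick wall": 0.5}

label, posteriors = classify((1.1, 2.0), training, priors, bandwidth=0.5)
print(label, posteriors)
```

Changing `priors` shifts the decision for feature vectors that lie between the classes, which is the role the abstract assigns to prior knowledge about a class's existence in the immediate area.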

    Recognizing speculative language in research texts

    This thesis studies the use of sequential supervised learning methods on two tasks related to the detection of hedging in scientific articles: hedge cue identification and hedge cue scope detection. Both tasks are addressed using a learning methodology that proposes an iterative, error-based approach to improve classification performance, suggesting the incorporation of expert knowledge into the learning process through the use of knowledge rules. Results are promising: for the first task, we improved baseline results by 2.5 points in terms of F-score by incorporating cue co-occurrence information, while for scope detection, the incorporation of syntax information and rules for syntax scope pruning allowed us to improve classification performance from an F-score of 0.712 to a final value of 0.835. Compared with state-of-the-art methods, the results are very competitive, suggesting that the approach of improving classifiers based only on the errors committed on a held-out corpus could be successfully used in other, similar tasks. Additionally, this thesis presents a class schema for representing sentence analysis in a single structure, including the results of different linguistic analyses. This allows us to better manage the iterative process of classifier improvement, where different attribute sets for learning are used in each iteration. We also propose to store attributes in a relational model, instead of the traditional text-based structures, to facilitate learning data analysis and manipulation.
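The results above are reported as F-scores. As a minimal sketch, assuming the balanced F1 measure and exact matching of predicted hedge cue positions against gold positions (the abstract does not specify the matching criterion), the metric could be computed as follows. The `(sentence_id, token_index)` representation and the sample data are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Precision, recall, and F1 over sets of exactly matching items."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical hedge cue positions as (sentence_id, token_index) pairs.
gold_cues = {(0, 3), (0, 7), (1, 2), (2, 5)}
predicted_cues = {(0, 3), (1, 2), (1, 4)}

precision, recall, f1 = precision_recall_f1(gold_cues, predicted_cues)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In an iterative, error-based workflow like the one the abstract describes, the items in `gold_cues - predicted_cues` and `predicted_cues - gold_cues` on a held-out corpus would be the errors inspected before each revision of the attribute set.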