
    Multimodal person recognition for human-vehicle interaction

    Next-generation vehicles will undoubtedly feature biometric person recognition as part of an effort to improve the driving experience. Today's technology prevents such systems from operating satisfactorily under adverse conditions. A proposed framework for achieving person recognition successfully combines different biometric modalities, as borne out in two case studies.

    Deep Dialog Act Recognition using Multiple Token, Segment, and Context Information Representations

    Dialog act (DA) recognition is a task that has been widely explored over the years. Recently, most approaches to the task have explored different DNN architectures to combine the representations of the words in a segment and generate a segment representation that provides cues for intention. In this study, we explore means to generate more informative segment representations, not only by exploring different network architectures, but also by considering different token representations at the word, character, and functional levels. At the word level, in addition to the commonly used uncontextualized embeddings, we explore the use of contextualized representations, which provide information concerning word sense and segment structure. Character-level tokenization is important to capture intention-related morphological aspects that cannot be captured at the word level. Finally, the functional level provides an abstraction from words, which shifts the focus to the structure of the segment. We also explore approaches to enrich the segment representation with context information from the history of the dialog, both in terms of the classifications of the surrounding segments and the turn-taking history. This kind of information has already been proven important for the disambiguation of DAs in previous studies. Nevertheless, we are able to capture additional information by considering a summary of the dialog history and a wider turn-taking context. By combining the best approaches at each step, we achieve results that surpass the previous state-of-the-art on generic DA recognition on both SwDA and MRDA, two of the most widely explored corpora for the task. Furthermore, by considering both past and future context, simulating an annotation scenario, our approach achieves a performance similar to that of a human annotator on SwDA and surpasses it on MRDA. Comment: 38 pages, 7 figures, 9 tables, submitted to JAI

    Writer Identification Using Inexpensive Signal Processing Techniques

    We propose to use novel and classical audio and text signal-processing and other techniques for "inexpensive" fast writer identification tasks of scanned hand-written documents "visually". The "inexpensive" refers to the efficiency of the identification process in terms of CPU cycles while preserving decent accuracy for preliminary identification. This is a comparative study of multiple algorithm combinations in a pattern recognition pipeline implemented in Java around an open-source Modular Audio Recognition Framework (MARF) that can do a lot more beyond audio. We present our preliminary experimental findings in such an identification task. We simulate "visual" identification by "looking" at the hand-written document as a whole rather than trying to extract fine-grained features out of it prior to classification. Comment: 9 pages; 1 figure; presented at CISSE'09 at http://conference.cisse2009.org/proceedings.aspx ; includes the application source code; based on MARF described in arXiv:0905.123

    Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech

    We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialogue acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialogue act sequence. The dialogue model is based on treating the discourse structure of a conversation as a hidden Markov model and the individual dialogue acts as observations emanating from the model states. Constraints on the likely sequence of dialogue acts are modeled via a dialogue act n-gram. The statistical dialogue grammar is combined with word n-grams, decision trees, and neural networks modeling the idiosyncratic lexical and prosodic manifestations of each dialogue act. We develop a probabilistic integration of speech recognition with dialogue modeling, to improve both speech recognition and dialogue act classification accuracy. Models are trained and evaluated using a large hand-labeled database of 1,155 conversations from the Switchboard corpus of spontaneous human-to-human telephone speech. We achieved good dialogue act labeling accuracy (65% based on errorful, automatically recognized words and prosody, and 71% based on word transcripts, compared to a chance baseline accuracy of 35% and human accuracy of 84%) and a small reduction in word recognition error. Comment: 35 pages, 5 figures. Changes in copy editing (note title spelling changed

    Computational Sociolinguistics: A Survey

    Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction, and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges. Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201