17 research outputs found

    Towards Understanding Egyptian Arabic Dialogues

    Labelling a user's utterances to capture the intent behind them, known as dialogue act (DA) classification, is a key component of the language understanding layer in automatic dialogue systems. In this paper, we propose a novel machine learning (ML) approach to labelling user utterances in Egyptian spontaneous dialogues and instant messages that does not rely on any special lexicons, cues, or rules. Because no Egyptian dialect dialogue corpus was available, the system is evaluated on a multi-genre corpus of 4725 utterances from three domains, collected and annotated manually from Egyptian call centers. The system achieves an F1 score of 70.36% across all domains. Comment: arXiv admin note: substantial text overlap with arXiv:1505.0308

    A Finite State and Data-Oriented Method for Grapheme to Phoneme Conversion

    A finite-state method, based on leftmost longest-match replacement, is presented for segmenting words into graphemes and for converting graphemes into phonemes. A small set of hand-crafted conversion rules for Dutch achieves a phoneme accuracy of over 93%. The accuracy of the system is further improved by using transformation-based learning. The best system (using a large set of rule templates and a `lazy' variant of Brill's algorithm), trained on only 40K words, reaches a phoneme accuracy of 99%. Comment: 8 pages
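As a rough illustration of the leftmost longest-match idea mentioned in this abstract, the sketch below segments a word into graphemes by scanning left to right and taking the longest known grapheme at each position. The `GRAPHEMES` inventory and the `segment` function are hypothetical stand-ins, not the paper's Dutch rule set, and the paper implements the technique with finite-state transducers rather than a Python loop.

```python
# Hypothetical multi-letter grapheme inventory (an assumption for illustration only).
GRAPHEMES = {"sch", "ch", "ng", "oe", "ie", "ij", "aa", "ee", "oo", "uu"}
MAX_LEN = max(len(g) for g in GRAPHEMES)


def segment(word):
    """Split a word into graphemes using leftmost longest-match."""
    out = []
    i = 0
    while i < len(word):
        # Try the longest candidate first; a single letter always matches.
        for n in range(min(MAX_LEN, len(word) - i), 0, -1):
            chunk = word[i:i + n]
            if n == 1 or chunk in GRAPHEMES:
                out.append(chunk)
                i += n
                break
    return out


print(segment("school"))
print(segment("koning"))
```

Trying the longest candidate first is what keeps "sch" from being split into "s" + "ch"; a grapheme-to-phoneme step would then map each segment to a phoneme.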

    Automatic extraction of rules for sentence boundary disambiguation

    Transformation-based learning (TBL) is the most important machine learning theory aimed at automatically extracting rules from already tagged corpora. However, applying this theory to a particular application without taking into account the features that characterize that application may cause problems with both the training time and the accuracy of the extracted rules. In this paper we present a variation of the basic idea of TBL and apply it to the extraction of sentence boundary disambiguation rules in real-world text, a prerequisite for the vast majority of natural language processing applications. We show that our approach achieves considerably higher accuracy and, moreover, requires minimal training time in comparison to traditional TBL.
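For orientation, here is a minimal sketch of the generic, Brill-style TBL loop applied to sentence boundary disambiguation. It is not the variation proposed in the paper, whose details the abstract does not give. The toy `SAMPLES`, the `GOLD` labels, and the `CANDIDATES` rule templates are all invented for illustration.

```python
# Each sample describes one period in running text: (token before it, token after it).
SAMPLES = [("Dr", "Smith"), ("home", "The"), ("etc", "and"), ("late", "We"), ("Mr", "Jones")]
# Gold labels: True if the period really ends a sentence.
GOLD = [False, True, False, True, False]

# Hypothetical candidate rules: (description, condition, new label).
CANDIDATES = [
    ("previous token is a known title -> not a boundary",
     lambda prev, nxt: prev in {"Dr", "Mr"}, False),
    ("next token is lowercase -> not a boundary",
     lambda prev, nxt: nxt[:1].islower(), False),
    ("next token is capitalised -> boundary",
     lambda prev, nxt: nxt[:1].isupper(), True),
]


def apply_rule(rule, samples, labels):
    """Relabel every sample whose context satisfies the rule's condition."""
    _, cond, new = rule
    return [new if cond(*s) else lab for s, lab in zip(samples, labels)]


def errors(labels, gold):
    return sum(l != g for l, g in zip(labels, gold))


def learn(samples, gold, candidates):
    """Greedily pick the rule with the largest error reduction until none helps."""
    labels = [True] * len(samples)  # baseline: every period ends a sentence
    learned = []
    while True:
        base = errors(labels, gold)
        scored = [(base - errors(apply_rule(r, samples, labels), gold), r)
                  for r in candidates]
        gain, best = max(scored, key=lambda t: t[0])
        if gain <= 0:
            break
        labels = apply_rule(best, samples, labels)
        learned.append(best[0])
    return learned, labels


rules, final_labels = learn(SAMPLES, GOLD, CANDIDATES)
print(rules)
```

Every iteration rescoring every candidate against the whole corpus is the training cost the abstract refers to.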

    Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech

    We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialogue acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialogue act sequence. The dialogue model is based on treating the discourse structure of a conversation as a hidden Markov model and the individual dialogue acts as observations emanating from the model states. Constraints on the likely sequence of dialogue acts are modeled via a dialogue act n-gram. The statistical dialogue grammar is combined with word n-grams, decision trees, and neural networks modeling the idiosyncratic lexical and prosodic manifestations of each dialogue act. We develop a probabilistic integration of speech recognition with dialogue modeling, to improve both speech recognition and dialogue act classification accuracy. Models are trained and evaluated using a large hand-labeled database of 1,155 conversations from the Switchboard corpus of spontaneous human-to-human telephone speech. We achieved good dialogue act labeling accuracy (65% based on errorful, automatically recognized words and prosody, and 71% based on word transcripts, compared to a chance baseline accuracy of 35% and human accuracy of 84%) and a small reduction in word recognition error. Comment: 35 pages, 5 figures. Changes in copy editing (note title spelling changed)
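The combination of a dialogue act n-gram with per-utterance evidence can be sketched as Viterbi decoding over a hidden Markov model. The code below is a simplified sketch that assumes a bigram over only two acts; all probabilities are invented toy values, not figures from the paper, and the per-utterance likelihoods stand in for whatever lexical or prosodic model supplies them.

```python
import math


def viterbi(obs_loglik, log_trans, log_init):
    """Return the most likely dialogue act sequence.

    obs_loglik: list of dicts, act -> log P(utterance evidence | act)
    log_trans:  log_trans[prev][cur] = log P(cur | prev), the dialogue act bigram
    log_init:   act -> log P(act at conversation start)
    """
    acts = list(log_init)
    best = {a: log_init[a] + obs_loglik[0][a] for a in acts}
    back = []
    for ll in obs_loglik[1:]:
        prev_best, best, ptr = best, {}, {}
        for cur in acts:
            p = max(acts, key=lambda a: prev_best[a] + log_trans[a][cur])
            best[cur] = prev_best[p] + log_trans[p][cur] + ll[cur]
            ptr[cur] = p
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(acts, key=lambda a: best[a])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def logs(d):
    return {k: math.log(v) for k, v in d.items()}


# Toy model (hypothetical numbers): a Question is usually followed by a Statement.
INIT = logs({"Question": 0.5, "Statement": 0.5})
TRANS = {"Question": logs({"Question": 0.1, "Statement": 0.9}),
         "Statement": logs({"Question": 0.5, "Statement": 0.5})}
# The second utterance looks slightly more like a Question when taken in isolation.
OBS = [logs({"Question": 0.9, "Statement": 0.1}),
       logs({"Question": 0.6, "Statement": 0.4})]

print(viterbi(OBS, TRANS, INIT))
```

In this toy case the bigram overrides the local evidence for the second utterance, which is the kind of discourse-coherence effect the abstract describes.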

    Dialogue Act Recognition Approaches

    This paper deals with automatic dialogue act (DA) recognition. Dialogue acts are sentence-level units that represent states of a dialogue, such as questions, statements, and hesitations. Knowledge of how dialogue acts are realized in a discourse or dialogue is part of the speech understanding and dialogue analysis process. It is of great importance for many applications, including dialogue systems, speech recognition, and automatic machine translation. The main goal of this paper is to survey existing work on DA recognition and to discuss the respective advantages and drawbacks of each approach. A major concern in the DA recognition domain is that, although a few DA annotation schemes now seem to be emerging as standards, DA tag-sets usually have to be adapted to the specifics of a given application, which prevents the deployment of standardized DA databases and evaluation procedures. This review focuses on the various kinds of information that can be used to recognize DAs, such as prosodic and lexical cues, and on the types of models proposed so far to capture this information. Combining these information sources now tends to be a prerequisite for recognizing DAs.

    Noise Robust Dialogue Act Recognition for Task-oriented Dialogues

    Thesis (Master's), Seoul National University Graduate School, Department of Electrical and Computer Engineering, August 2015. Lee Sang-goo.
    In the development of spoken dialogue systems and of e-mail and thread summarization systems, the dialogue act classifier plays an important role: these systems classify the dialogue acts of utterances, e-mails, and posts and pass the result to downstream tasks, so classification performance strongly affects overall system performance. Dialogue act classification is the problem of assigning a dialogue act to each utterance in a conversation. A main challenge in developing robust dialogue systems is dealing with noisy input caused by imperfect output from the Automatic Speech Recognition (ASR) module, so dialogue act recognition must map noisy user utterances to dialogue acts. To cope with noisy utterances, this thesis describes a noise-robust generative model of task-oriented conversation, in which two people talk with a specific goal. The model captures both the speaker and the dialogue act associated with each utterance, under the assumption that a speaker intends to perform some act and addresses the other party using vocabulary appropriate to that intent. The proposed model is based on the Markov model, modified to reflect this assumption. In the experiments, the classification results are compared with those of the simple Markov model and of the state-of-the-art SVM-HMM. On the task-oriented HCRC map task, live-chat, and SACTI-1 corpora, the proposed model is a better conversation model than the simple Markov model and gives classification results competitive with SVM-HMM. In particular, results on the SACTI-1 corpus, which simulates ASR errors, show that the proposed model is more robust to noisy user utterances than SVM-HMM.
    Contents: 1. Introduction (1.1 Research background; 1.2 Research content and scope; 1.3 Thesis organization). 2. Problem definition (2.1 Components of a dialogue; 2.2 Definition of the dialogue act classification problem; 2.3 Characteristics of dialogues and difficulties in solving the problem). 3. Related work (3.1 Supervised-learning-based dialogue act classification; 3.2 Work modeling dependencies between dialogue acts; 3.3 Limitations of existing work). 4. Markov-model-based dialogue act classification (4.1 Background: 4.1.1 Language models; 4.1.2 Markov models and hidden Markov models; 4.2 A dialogue act classification model adapted from the input-output Markov model). 5. Performance evaluation (5.1 Dialogue corpora; 5.2 Baseline models and development environment; 5.3 Evaluation metrics; 5.4 Experimental results and analysis: 5.4.1 Classification performance; 5.4.2 Robustness to ASR noise; 5.4.3 Scalability). 6. Conclusion and future work (6.1 Conclusion; 6.2 Future work). References. Abstract.