17 research outputs found
Towards Understanding Egyptian Arabic Dialogues
Labelling a user's utterances to understand their intents, known as Dialogue Act (DA) classification, is a key component of the language understanding layer in automatic dialogue systems. In this paper, we propose a novel Machine Learning (ML) approach to labelling users' utterances in Egyptian spontaneous dialogues and instant messages, without relying on any special lexicons, cues, or rules. Due to the lack of an Egyptian-dialect dialogue corpus, the system is evaluated on a multi-genre corpus of 4725 utterances spanning three domains, collected and annotated manually from Egyptian call centres. The system achieves an F1 score of 70.36% over all domains.
Comment: arXiv admin note: substantial text overlap with arXiv:1505.0308
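The lexicon-free idea above can be sketched as a plain bag-of-words classifier. The tiny English toy corpus, the DA labels, and the Naive Bayes choice below are illustrative assumptions, not the paper's actual Egyptian Arabic data or model:

```python
# Minimal sketch of lexicon-free dialogue-act classification: a multinomial
# Naive Bayes classifier trained on raw bag-of-words features only, with no
# special lexicons, cues, or hand-written rules.
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (utterance, dialogue_act) pairs. Returns model params."""
    word_counts = defaultdict(Counter)   # DA -> word frequencies
    da_counts = Counter()                # DA -> number of utterances
    vocab = set()
    for text, da in samples:
        tokens = text.lower().split()
        word_counts[da].update(tokens)
        da_counts[da] += 1
        vocab.update(tokens)
    return word_counts, da_counts, vocab

def classify(model, utterance):
    word_counts, da_counts, vocab = model
    total = sum(da_counts.values())
    best_da, best_score = None, float("-inf")
    for da in da_counts:
        # log prior + add-one-smoothed log likelihood of each token
        score = math.log(da_counts[da] / total)
        denom = sum(word_counts[da].values()) + len(vocab)
        for tok in utterance.lower().split():
            score += math.log((word_counts[da][tok] + 1) / denom)
        if score > best_score:
            best_da, best_score = da, score
    return best_da

corpus = [
    ("what time is it", "Question"),
    ("where is the office", "Question"),
    ("the office opens at nine", "Statement"),
    ("it is nine now", "Statement"),
]
model = train(corpus)
print(classify(model, "where is it"))   # → Question
```

Because the features are just surface tokens, the same sketch applies unchanged to any language or dialect for which annotated utterances exist.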
A Finite State and Data-Oriented Method for Grapheme to Phoneme Conversion
A finite-state method, based on leftmost longest-match replacement, is
presented for segmenting words into graphemes, and for converting graphemes
into phonemes. A small set of hand-crafted conversion rules for Dutch achieves
a phoneme accuracy of over 93%. The accuracy of the system is further improved
by using transformation-based learning. The phoneme accuracy of the best system
(using a large set of rule templates and a `lazy' variant of Brill's algorithm),
trained on only 40K words, reaches 99%.
Comment: 8 pages
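The leftmost longest-match replacement idea can be sketched as follows; the rule table is a toy stand-in for the paper's hand-crafted Dutch rules, not real Dutch coverage:

```python
# Sketch of grapheme segmentation by leftmost longest-match replacement,
# followed by grapheme-to-phoneme mapping. Scanning left to right and always
# consuming the longest matching grapheme makes the segmentation deterministic.
RULES = {          # grapheme -> phoneme (illustrative toy rules)
    "sch": "sx",
    "oe": "u",
    "ij": "EI",
    "s": "s",
    "o": "O",
    "l": "l",
    "n": "n",
}

def segment(word):
    """Scan left to right, always consuming the longest matching grapheme."""
    graphemes, i = [], 0
    longest = max(len(g) for g in RULES)
    while i < len(word):
        for size in range(longest, 0, -1):          # try longest match first
            chunk = word[i:i + size]
            if chunk in RULES:
                graphemes.append(chunk)
                i += size
                break
        else:
            raise ValueError(f"no rule matches at {word[i:]!r}")
    return graphemes

def to_phonemes(word):
    return "".join(RULES[g] for g in segment(word))

print(segment("schoen"))      # → ['sch', 'oe', 'n']
print(to_phonemes("schoen"))  # → sxun
```

In the paper's setup a transformation-based learner would then patch the residual errors of such rules rather than replace them.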
Automatic extraction of rules for sentence boundary disambiguation
Transformation-based learning (TBL) is an important machine learning method for the automatic extraction of rules from already-tagged corpora. However, applying it to a given task without taking into account the features that characterize that task may cause problems for both the training time cost and the accuracy of the extracted rules. In this paper we present a variation of the basic TBL idea and apply it to the extraction of sentence boundary disambiguation rules for real-world text, a prerequisite for the vast majority of natural language processing applications. We show that our approach achieves considerably higher accuracy and, moreover, requires minimal training time in comparison with traditional TBL.
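The rule-application side of this idea can be sketched as below: a naive baseline marks every period as a boundary, and transformations then retract the errors. The two rules and the abbreviation list are hand-written stand-ins for rules a TBL learner would extract automatically from a tagged corpus:

```python
# Sketch of TBL-style sentence boundary disambiguation: start from a crude
# baseline, then apply an ordered list of transformations that remove
# incorrectly proposed boundaries.
ABBREVIATIONS = {"dr", "mr", "e.g", "etc"}   # toy list, an assumption

def baseline(tokens):
    # candidate boundary after every token that ends with '.'
    return [i for i, t in enumerate(tokens) if t.endswith(".")]

def apply_rules(tokens, boundaries):
    kept = []
    for i in boundaries:
        word = tokens[i].rstrip(".").lower()
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if word in ABBREVIATIONS:       # rule 1: abbreviation -> no boundary
            continue
        if nxt[:1].islower():           # rule 2: lowercase follows -> no boundary
            continue
        kept.append(i)
    return kept

tokens = "Dr. Smith arrived . He met Mr. Jones .".split()
print(apply_rules(tokens, baseline(tokens)))   # → [3, 8]
```

TBL proper would score many candidate transformations against the training corpus and greedily keep the one that fixes the most errors, repeating until no rule helps.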
Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialogue acts in
conversational speech, i.e., speech-act-like units such as Statement, Question,
Backchannel, Agreement, Disagreement, and Apology. Our model detects and
predicts dialogue acts based on lexical, collocational, and prosodic cues, as
well as on the discourse coherence of the dialogue act sequence. The dialogue
model is based on treating the discourse structure of a conversation as a
hidden Markov model and the individual dialogue acts as observations emanating
from the model states. Constraints on the likely sequence of dialogue acts are
modeled via a dialogue act n-gram. The statistical dialogue grammar is combined
with word n-grams, decision trees, and neural networks modeling the
idiosyncratic lexical and prosodic manifestations of each dialogue act. We
develop a probabilistic integration of speech recognition with dialogue
modeling, to improve both speech recognition and dialogue act classification
accuracy. Models are trained and evaluated using a large hand-labeled database
of 1,155 conversations from the Switchboard corpus of spontaneous
human-to-human telephone speech. We achieved good dialogue act labeling
accuracy (65% based on errorful, automatically recognized words and prosody,
and 71% based on word transcripts, compared to a chance baseline accuracy of
35% and human accuracy of 84%) and a small reduction in word recognition error.
Comment: 35 pages, 5 figures. Changes in copy editing (note title spelling changed)
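The HMM view described above can be sketched with a tiny Viterbi decoder. All states, probabilities, and the single-cue emissions below are invented for illustration and are far simpler than the paper's word n-gram, decision-tree, and neural-network likelihood models:

```python
# Sketch of the paper's core idea: treat the dialogue act (DA) sequence as the
# hidden states of an HMM, with a DA bigram as the transition model and per-DA
# observation likelihoods as emissions, then decode with Viterbi.
import math

STATES = ["Statement", "Question", "Backchannel"]
START = {"Statement": 0.5, "Question": 0.3, "Backchannel": 0.2}
TRANS = {   # DA bigram: P(next DA | current DA)
    "Statement":   {"Statement": 0.4, "Question": 0.3, "Backchannel": 0.3},
    "Question":    {"Statement": 0.6, "Question": 0.2, "Backchannel": 0.2},
    "Backchannel": {"Statement": 0.5, "Question": 0.4, "Backchannel": 0.1},
}
EMIT = {    # P(utterance cue | DA): a crude stand-in for the lexical and
            # prosodic likelihood models of the paper
    "Statement":   {"it_is": 0.7, "what": 0.1, "uh_huh": 0.2},
    "Question":    {"it_is": 0.1, "what": 0.8, "uh_huh": 0.1},
    "Backchannel": {"it_is": 0.1, "what": 0.1, "uh_huh": 0.8},
}

def viterbi(observations):
    # each trellis cell holds (best log score, best DA path ending here)
    trellis = [{s: (math.log(START[s] * EMIT[s][observations[0]]), [s])
                for s in STATES}]
    for obs in observations[1:]:
        layer = {}
        for s in STATES:
            best_prev = max(STATES,
                            key=lambda p: trellis[-1][p][0]
                                          + math.log(TRANS[p][s]))
            score, path = trellis[-1][best_prev]
            layer[s] = (score + math.log(TRANS[best_prev][s] * EMIT[s][obs]),
                        path + [s])
        trellis.append(layer)
    return max(trellis[-1].values())[1]

print(viterbi(["what", "it_is", "uh_huh"]))
# → ['Question', 'Statement', 'Backchannel']
```

The DA n-gram constraint shows up in how the decoder favors a Statement after a Question even when the emission evidence alone is ambiguous.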
Dialogue Act Recognition Approaches
This paper deals with automatic dialogue act (DA) recognition. Dialogue acts are sentence-level units that represent states of a dialogue, such as questions, statements, hesitations, etc. Knowledge of dialogue act realizations in a discourse or dialogue is part of the speech understanding and dialogue analysis process, and is of great importance for many applications: dialogue systems, speech recognition, automatic machine translation, etc. The main goal of this paper is to survey existing work on DA recognition and to discuss its respective advantages and drawbacks. A major concern in the DA recognition domain is that, although a few DA annotation schemes now seem to be emerging as standards, these DA tag-sets usually have to be adapted to the specificities of a given application, which prevents the deployment of standardized DA databases and evaluation procedures. The focus of this review is on the various kinds of information that can be used to recognize DAs, such as prosodic and lexical cues, and on the types of models proposed so far to capture this information. Combining these information sources now appears to be a prerequisite for recognizing DAs.
Noise Robust Dialogue Act Recognition for Task-oriented Dialogues
Thesis (Master's) -- Seoul National University Graduate School, Dept. of Electrical and Computer Engineering, August 2015.
In the development of spoken dialogue systems, e-mail summarization systems, and thread summarization systems, the dialogue act classifier plays an important role, because these systems depend on the performance of classifying the dialogue acts of utterances, e-mails, and posts. Dialogue act classification is the well-known problem of assigning a dialogue act to each utterance in a conversation.
One of the main challenges in the development of robust dialogue systems is dealing with noisy input caused by imperfect results from the Automatic Speech Recognition (ASR) module; for dialogue act recognition, the challenge is the mapping from noisy user utterances to dialogue acts. In this paper, to cope with noisy utterances, we describe a noise-robust generative model of task-oriented conversation that captures both the speaker information and the dialogue act associated with each utterance, under the assumption that a speaker talks about something using vocabulary appropriate to the aim of getting someone to do something. The proposed model is based on a Markov model, modified to reflect this assumption.
In the experiments, we evaluate the classification results by comparing them to a simple Markov model and to state-of-the-art SVM-HMM results. The proposed model is a better conversation model than the simple Markov model and shows competitive classification results in comparison with SVM-HMM on the task-oriented HCRC Map Task corpus, a live-chat corpus, and the SACTI-1 corpus. Results on the SACTI-1 corpus, which simulates ASR errors, show in particular that the proposed model is robust against noisy user utterances.
Contents (translated from Korean):
1. Introduction
1.1 Background of the research
1.2 Scope and contents
1.3 Organization of the thesis
2. Problem definition
2.1 Components of a dialogue
2.2 Definition of the dialogue act classification problem
2.3 Characteristics of dialogue and difficulties of the problem
3. Related work
3.1 Supervised-learning approaches to dialogue act classification
3.2 Work modeling dependencies between dialogue acts
3.3 Limitations of existing work
4. Markov-model-based dialogue act classification
4.1 Background
4.1.1 Language models
4.1.2 Markov models and hidden Markov models
4.2 A dialogue act classification model based on a modified input-output Markov model
5. Evaluation
5.1 Dialogue corpora
5.2 Comparison models and development environment
5.3 Evaluation metrics
5.4 Experimental results and analysis
5.4.1 Classification performance
5.4.2 Robustness to ASR noise
5.4.3 Extensibility
6. Conclusion and future work
6.1 Conclusion
6.2 Future work
References
ABSTRACT
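The generative assumption of this thesis (a speaker pursuing a goal draws words from a vocabulary suited to the current dialogue act) can be sketched as below. The acts, counts, and smoothing choice are illustrative assumptions, not the thesis's actual model:

```python
# Minimal sketch of noise-robust generative DA scoring: a candidate dialogue
# act is scored by a Markov transition probability times smoothed per-(speaker,
# act) word likelihoods. Add-one smoothing keeps the score finite when ASR
# noise injects unseen words, which is the source of the robustness.
import math
from collections import Counter

TRANS = {"instruct": {"instruct": 0.4, "acknowledge": 0.6},
         "acknowledge": {"instruct": 0.7, "acknowledge": 0.3}}
WORDS = {  # (speaker, DA) -> word counts from a hypothetical training set
    ("giver", "instruct"): Counter({"go": 4, "left": 3, "then": 2}),
    ("follower", "acknowledge"): Counter({"okay": 5, "right": 2}),
}
VOCAB = {w for c in WORDS.values() for w in c}

def score(prev_da, da, speaker, tokens):
    counts = WORDS.get((speaker, da), Counter())
    denom = sum(counts.values()) + len(VOCAB) + 1
    s = math.log(TRANS[prev_da][da])
    for tok in tokens:  # unseen (possibly misrecognized) words still score
        s += math.log((counts[tok] + 1) / denom)
    return s

# "gow" simulates an ASR error for "go"; the smoothed model degrades gracefully.
noisy = ["gow", "left"]
best = max(TRANS["acknowledge"],
           key=lambda da: score("acknowledge", da, "giver", noisy))
print(best)   # → instruct
```

Even with one token corrupted, the surviving word "left" and the speaker identity are enough evidence for the model to prefer the instruct act.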