5 research outputs found

    The Effect of Answer Patterns for Supervised Named Entity Recognition in Thai

    Conditional random fields with dynamic potentials for Chinese named entity recognition.

    Wu, Yiu Kei. Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. Includes bibliographical references (p. 69-75). Abstracts in English and Chinese.

    Chapter 1 --- Introduction --- p.1
    Chapter 1.1 --- Chinese NER Problem --- p.1
    Chapter 1.2 --- Contribution of Our Proposed Framework --- p.3
    Chapter 2 --- Related Work --- p.6
    Chapter 2.1 --- Hidden Markov Models --- p.7
    Chapter 2.2 --- Maximum Entropy Models --- p.8
    Chapter 2.3 --- Conditional Random Fields --- p.10
    Chapter 3 --- Our Proposed Model --- p.14
    Chapter 3.1 --- Background --- p.14
    Chapter 3.1.1 --- Problem Formulation --- p.14
    Chapter 3.1.2 --- Conditional Random Fields --- p.16
    Chapter 3.1.3 --- Semi-Markov Conditional Random Fields --- p.26
    Chapter 3.2 --- The Formulation of Our Proposed Model --- p.28
    Chapter 3.2.1 --- The Main Principle --- p.28
    Chapter 3.2.2 --- The Detailed Formulation --- p.36
    Chapter 3.2.3 --- Adapting Features from Original CRF to CRFDP --- p.51
    Chapter 4 --- Experiments --- p.54
    Chapter 4.1 --- Datasets --- p.55
    Chapter 4.2 --- Features --- p.57
    Chapter 4.3 --- Evaluation Metrics --- p.61
    Chapter 4.4 --- Results and Discussion --- p.63
    Chapter 5 --- Conclusions and Future Work --- p.67
    Bibliography --- p.69
    A --- p.76
    B --- p.78
    C --- p.8

    Sequence labeling to detect stuttering events in read speech

    Stuttering is a speech disorder that, if treated during childhood, may be prevented from persisting into adolescence. A clinician must first determine the severity of stuttering, assessing a child during a conversational or reading task and recording each instance of disfluency, either in real time or after transcribing the recorded session and analysing the transcript. The current study evaluates the ability of two machine learning approaches, namely conditional random fields (CRF) and bidirectional long short-term memory (BLSTM), to detect stuttering events in transcriptions of stuttering speech. The two approaches are compared for their performance both on ideal hand-transcribed data and on the output of automatic speech recognition (ASR). We also study the effect of data augmentation on performance. A corpus of 35 speakers’ read speech (13K words) was supplemented with a corpus of 63 speakers’ spontaneous speech (11K words) and an artificially generated corpus (50K words). Experimental results show that, without feature engineering, BLSTM classifiers outperform CRF classifiers by 33.6%. However, adding features to support the CRF classifier yields performance improvements of 45% and 18% over the CRF baseline and BLSTM results, respectively. Moreover, adding more data to train the CRF and BLSTM classifiers consistently improves the results.
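As a hedged illustration of the task framing (the label inventory and tokens here are hypothetical, not the paper's actual scheme or data), detecting stuttering events in a transcript can be cast as sequence labeling with BIO tags, where `B-DIS` opens a disfluent span, `I-DIS` continues it, and `O` marks fluent tokens:

```python
# Toy sketch: recovering disfluency spans from a BIO tag sequence.
# The tag set (B-DIS / I-DIS / O) is an assumed example inventory.

def spans_from_bio(tags):
    """Return (start, end) token index pairs for each disfluent span."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B-DIS":
            if start is not None:          # close a span cut short by a new B-DIS
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:                  # span running to the end of the utterance
        spans.append((start, len(tags)))
    return spans

tokens = ["I", "w-", "w-", "want", "to", "to", "go"]
tags   = ["O", "B-DIS", "I-DIS", "O", "B-DIS", "I-DIS", "O"]
print(spans_from_bio(tags))  # [(1, 3), (4, 6)]
```

A CRF or BLSTM tagger would predict the tag sequence; this decoding step is the same either way.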

    A Formal Model of Ambiguity and its Applications in Machine Translation

    Systems that process natural language must cope with and resolve ambiguity. In this dissertation, a model of language processing is advocated in which multiple inputs and multiple analyses of inputs are considered concurrently and a single analysis is only a last resort. Compared to conventional models, this approach can be understood as replacing single-element inputs and outputs with weighted sets of inputs and outputs. Although processing components must deal with sets (rather than individual elements), constraints are imposed on the elements of these sets, and the representations from existing models may be reused. However, to deal efficiently with large (or infinite) sets, compact representations of sets that share structure between elements, such as weighted finite-state transducers and synchronous context-free grammars, are necessary. These representations and algorithms for manipulating them are discussed in depth. To establish the effectiveness and tractability of the proposed processing model, it is applied to several problems in machine translation. Starting with spoken language translation, it is shown that translating a set of transcription hypotheses yields better translations than a baseline in which a single (1-best) transcription hypothesis is selected and then translated, independent of the translation model formalism used. More subtle forms of ambiguity that arise even in text-only translation (such as decisions conventionally made during system development about how to preprocess text) are then discussed, and it is shown that the ambiguity-preserving paradigm can be employed in these cases as well, again leading to improved translation quality.
A model for supervised learning is also introduced that learns from training data in which a set (rather than a single element) of correct labels is provided for each training instance. This model is used to learn compound word segmentation, which serves as a preprocessing step in machine translation.
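As a hedged sketch of the set-supervision idea (the scoring and labels below are illustrative assumptions, not the dissertation's formulation): rather than maximizing the probability of one gold label, training can maximize the total probability mass a softmax model assigns to the whole set of acceptable labels.

```python
import math

def set_log_likelihood(scores, correct_set):
    """Log of the probability mass assigned to a *set* of acceptable
    labels under a softmax over real-valued per-label scores."""
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    log_mass = math.log(sum(math.exp(scores[y]) for y in correct_set))
    return log_mass - log_z

# Toy example: three candidate segmentations of one compound word,
# two of which are acceptable references for this instance.
scores = {"seg_a": 2.0, "seg_b": 1.5, "seg_c": -1.0}
single = set_log_likelihood(scores, {"seg_a"})
both = set_log_likelihood(scores, {"seg_a", "seg_b"})
assert both > single  # crediting any acceptable label raises the objective
```

The gradient of this objective spreads credit across all labels in the set, so the model is never penalized for preferring one acceptable segmentation over another.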