686 research outputs found
Robust Dependency Parsing of Spontaneous Japanese Speech and Its Evaluation
Spontaneously spoken Japanese includes a lot of grammatically\ud
ill-formed linguistic phenomena such as fillers,\ud
hesitations, inversions, and so on, which do not appear in\ud
written language. This paper proposes a method of robust\ud
dependency parsing using a large-scale spoken language\ud
corpus, and evaluates the availability and robustness\ud
of the method using spontaneously spoken dialogue\ud
sentences. By utilizing stochastic information about the\ud
appearance of ill-formed phenomena, the method can robustly\ud
parse spoken Japanese including fillers, inversions,\ud
or dependencies over utterance units. As a result of an experiment,\ud
the parsing accuracy provided 87.0%, and we\ud
confirmed that it is effective to utilize the location information\ud
of a bunsetsu, and the distance information between\ud
bunsetsus as stochastic information
音声翻訳における文解析技法について
本文データは平成22年度国立国会図書館の学位論文(博士)のデジタル化実施により作成された画像ファイルを基にpdf変換したものである京都大学0048新制・論文博士博士(工学)乙第8652号論工博第2893号新制||工||968(附属図書館)UT51-94-R411(主査)教授 長尾 真, 教授 堂下 修司, 教授 池田 克夫学位規則第4条第2項該当Doctor of EngineeringKyoto UniversityDFA
An investigation of grammar design in natural-language speech-recognition.
With the growing interest and demand for human-machine interaction, much work concerning speech-recognition has been carried out over the past three decades. Although a variety of approaches have been proposed to address speech-recognition issues, such as stochastic (statistical) techniques, grammar-based techniques, techniques integrated with linguistic features, and other approaches, recognition accuracy and robustness remain among the major problems that need to be addressed. At the state of the art, most commercial speech products are constructed using grammar-based speech-recognition technology. In this thesis, we investigate a number of features involved in grammar design in natural-language speech-recognition technology. We hypothesize that: with the same domain, a semantic grammar, which directly encodes some semantic constraints into the recognition grammar, achieves better accuracy, but less robustness; a syntactic grammar defines a language with a larger size, thereby it has better robustness, but less accuracy; a word-sequence grammar, which includes neither semantics nor syntax, defines the largest language, therefore, is the most robust, but has very poor recognition accuracy. In this Master\u27s thesis, we claim that proper grammar design can achieve the appropriate compromise between recognition accuracy and robustness. The thesis has been proven by experiments using the IBM Voice-Server SDK, which consists of a VoiceXML browser, IBM ViaVoice Speech Recognition and Text-To-Speech (TTS) engines, sample applications, and other tools for developing and testing VoiceXML applications. The experimental grammars are written in the Java Speech Grammar Format (JSGF), and the testing applications are written in VoiceXML. The tentative experimental results suggest that grammar design is a good area for further study. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2003 .S555. Source: Masters Abstracts International, Volume: 43-01, page: 0244. Adviser: Richard A. Frost. Thesis (M.Sc.)--University of Windsor (Canada), 2004
An investigation of the electrolytic plasma oxidation process for corrosion protection of pure magnesium and magnesium alloy AM50.
In this study, silicate and phosphate EPO coatings were produced on pure magnesium using an AC power source. It was found that the silicate coatings possess good wear resistance, while the phosphate coatings provide better corrosion protection. A Design of Experiment (DOE) technique, the Taguchi method, was used to systematically investigate the effect of the EPO process parameters on the corrosion protection properties of a coated magnesium alloy AM50 using a DC power. The experimental design consisted of four factors (treatment time, current density, and KOH and NaAlO2 concentrations), with three levels of each factor. Potentiodynamic polarization measurements were conducted to determine the corrosion resistance of the coated samples. The optimized processing parameters are 12 minutes, 12 mA/cm2 current density, 0.9 g/l KOH, 15.0 g/l NaAlO2. The results of the percentage contribution of each factor determined by the analysis of variance (ANOVA) imply that the KOH concentration is the most significant factor affecting the corrosion resistance of the coatings, while treatment time is a major factor affecting the thickness of the coatings. (Abstract shortened by UMI.)Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .M323. Source: Masters Abstracts International, Volume: 44-03, page: 1479. Thesis (M.A.Sc.)--University of Windsor (Canada), 2005
Getting Past the Language Gap: Innovations in Machine Translation
In this chapter, we will be reviewing state of the art machine translation systems, and will discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods had been introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, the advances in the related field of computational linguistics, making off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge though many advanced research groups are currently pursuing ways to meet this challenge head-on. The next generation of MT will consist of a collection of hybrid systems. It also augurs well for the mobile environment, as we look forward to more advanced and improved technologies that enable the working of Speech-To-Speech machine translation on hand-held devices, i.e. speech recognition and speech synthesis. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT
Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialogue acts in
conversational speech, i.e., speech-act-like units such as Statement, Question,
Backchannel, Agreement, Disagreement, and Apology. Our model detects and
predicts dialogue acts based on lexical, collocational, and prosodic cues, as
well as on the discourse coherence of the dialogue act sequence. The dialogue
model is based on treating the discourse structure of a conversation as a
hidden Markov model and the individual dialogue acts as observations emanating
from the model states. Constraints on the likely sequence of dialogue acts are
modeled via a dialogue act n-gram. The statistical dialogue grammar is combined
with word n-grams, decision trees, and neural networks modeling the
idiosyncratic lexical and prosodic manifestations of each dialogue act. We
develop a probabilistic integration of speech recognition with dialogue
modeling, to improve both speech recognition and dialogue act classification
accuracy. Models are trained and evaluated using a large hand-labeled database
of 1,155 conversations from the Switchboard corpus of spontaneous
human-to-human telephone speech. We achieved good dialogue act labeling
accuracy (65% based on errorful, automatically recognized words and prosody,
and 71% based on word transcripts, compared to a chance baseline accuracy of
35% and human accuracy of 84%) and a small reduction in word recognition error.Comment: 35 pages, 5 figures. Changes in copy editing (note title spelling
changed
Individual and Domain Adaptation in Sentence Planning for Dialogue
One of the biggest challenges in the development and deployment of spoken
dialogue systems is the design of the spoken language generation module. This
challenge arises from the need for the generator to adapt to many features of
the dialogue domain, user population, and dialogue context. A promising
approach is trainable generation, which uses general-purpose linguistic
knowledge that is automatically adapted to the features of interest, such as
the application domain, individual user, or user group. In this paper we
present and evaluate a trainable sentence planner for providing restaurant
information in the MATCH dialogue system. We show that trainable sentence
planning can produce complex information presentations whose quality is
comparable to the output of a template-based generator tuned to this domain. We
also show that our method easily supports adapting the sentence planner to
individuals, and that the individualized sentence planners generally perform
better than models trained and tested on a population of individuals. Previous
work has documented and utilized individual preferences for content selection,
but to our knowledge, these results provide the first demonstration of
individual preferences for sentence planning operations, affecting the content
order, discourse structure and sentence structure of system responses. Finally,
we evaluate the contribution of different feature sets, and show that, in our
application, n-gram features often do as well as features based on higher-level
linguistic representations
Research in the Language, Information and Computation Laboratory of the University of Pennsylvania
This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania.
It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as: Combinatorial Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition.
Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it’s easier than ever to do so: this document is accessible on the “information superhighway”. Just call up http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html
In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authors’ abstracts in the web version of this report.
The abstracts describe the researchers’ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn
Computational Approaches to the Syntax–Prosody Interface: Using Prosody to Improve Parsing
Prosody has strong ties with syntax, since prosody can be used to resolve some syntactic ambiguities. Syntactic ambiguities have been shown to negatively impact automatic syntactic parsing, hence there is reason to believe that prosodic information can help improve parsing. This dissertation considers a number of approaches that aim to computationally examine the relationship between prosody and syntax of natural languages, while also addressing the role of syntactic phrase length, with the ultimate goal of using prosody to improve parsing.
Chapter 2 examines the effect of syntactic phrase length on prosody in double center embedded sentences in French. Data collected in a previous study were reanalyzed using native speaker judgment and automatic methods (forced alignment). Results demonstrate similar prosodic splitting behavior as in English in contradiction to the original study’s findings.
Chapter 3 presents a number of studies examining whether syntactic ambiguity can yield different prosodic patterns, allowing humans and/or computers to resolve the ambiguity. In an experimental study, humans disambiguated sentences with prepositional phrase- (PP)-attachment ambiguity with 49% accuracy presented as text, and 63% presented as audio. Machine learning on the same data yielded an accuracy of 63-73%. A corpus study on the Switchboard corpus used both prosodic breaks and phrase lengths to predict the attachment, with an accuracy of 63.5% for PP-attachment sentences, and 71.2% for relative clause attachment.
Chapter 4 aims to identify aspects of syntax that relate to prosody and use these in combination with prosodic cues to improve parsing. The aspects identified (dependency configurations) are based on dependency structure, reflecting the relative head location of two consecutive words, and are used as syntactic features in an ensemble system based on Recurrent Neural Networks, to score parse hypotheses and select the most likely parse for a given sentence. Using syntactic features alone, the system achieved an improvement of 1.1% absolute in Unlabelled Attachment Score (UAS) on the test set, above the best parser in the ensemble, while using syntactic features combined with prosodic features (pauses and normalized duration) led to a further improvement of 0.4% absolute.
The results achieved demonstrate the relationship between syntax, syntactic phrase length, and prosody, and indicate the ability and future potential of prosody to resolve ambiguity and improve parsing
- …