Automatic correction of grammatical errors in non-native English text
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 99-107). Learning a foreign language requires much practice outside of the classroom. Computer-assisted language learning systems can help fill this need, and one desirable capability of such systems is the automatic correction of grammatical errors in texts written by non-native speakers. This dissertation concerns the correction of non-native grammatical errors in English text, and the closely related task of generating test items for language learning, using a combination of statistical and linguistic methods. We show that syntactic analysis enables extraction of more salient features. We address issues concerning robustness in feature extraction from non-native texts, and also design a framework for simultaneous correction of multiple error types. Our proposed methods are applied to some of the most common usage errors, including prepositions, verb forms, and articles. The methods are evaluated on sentences with synthetic and real errors, and in both restricted and open domains. A secondary theme of this dissertation is that of user customization. We perform a detailed analysis of a non-native corpus, illustrating the utility of an error model based on the mother tongue. We study the benefits of adjusting the correction models based on the quality of the input text, and also present novel methods to generate high-quality multiple-choice items that are tailored to the interests of the user. By John Sie Yuen Lee. Ph.D.
Programming With Jack (Fourth Edition)
This manual describes the implementation of Jack™, with emphasis on how to extend and modify it. The principal purpose of this manual is to describe which functions in the Jack libraries are available for writing new features for Jack. The manual also gives an overview of how Jack works, for those interested in modifying its current behavior. This manual assumes that you already know how to use Jack and are familiar with its basic terminology.
Study of a Multivariate Technique for the Search of Single Top-Quark Production with sqrt(s) = 8 TeV in the CMS Experiment at CERN
This work presents a study of the search for single top-quark production in the CMS experiment at CERN, focusing on the s-channel process with the muon decay mode as the final-state topology, using a multivariate technique based on the Boosted Decision Trees (BDT) algorithm. The study is based on collision data collected at 8 TeV with the CMS detector, corresponding to an integrated luminosity of 19.3 fb^(-1).
The multivariate technique is used together with an optimization procedure to identify which variables are appropriate for separating signal and background events. The BDT output is obtained by optimizing the choice of input variables, iterating in a feedback loop that is sensitive to the global correlation coefficients of the variables. The optimized BDT discriminant is then compared with the analysis performed without any optimization of the choice of inputs.
It is found that the BDT output shows no significant change in separating power as the most globally correlated variables are iteratively removed. Reducing the variable list in this way can therefore be advantageous, since it advances our understanding of the physical meaning of the output classifier. This study is a first step toward optimizing BDT analyses of single top-quark production; in the next step, these results will be used to fit the data, accounting for systematic uncertainties, and to extract the cross-section using the BDT discriminant obtained so far.
Advanced techniques for personalized, interactive question answering
Using a computer to answer questions has been a human dream since the beginning of
the digital era. A first step towards the achievement of such an ambitious goal is to deal
with natural language to enable the computer to understand what its user asks.
The discipline that studies the connection between natural language and the representation
of its meaning via computational models is computational linguistics. According
to such discipline, Question Answering can be defined as the task that, given a question
formulated in natural language, aims at finding one or more concise answers in the form
of sentences or phrases.
Question Answering can be interpreted as a sub-discipline of information retrieval
with the added challenge of applying sophisticated techniques to identify the complex
syntactic and semantic relationships present in text. Although it is widely accepted that
Question Answering represents a step beyond standard information retrieval, allowing a
more sophisticated and satisfactory response to the user's information needs, it still shares
a series of unsolved issues with the latter.
First, in most state-of-the-art Question Answering systems, the results are created
independently of the questioner's characteristics, goals and needs. This is a serious limitation
in several cases: for instance, a primary school child and a History student may
need different answers to the question: When did the Middle Ages begin?
Moreover, users often issue queries not as standalone but in the context of a wider
information need, for instance when researching a specific topic. Although it has recently been proposed that providing Question Answering systems with dialogue interfaces
would encourage and accommodate the submission of multiple related questions
and handle the user's requests for clarification, interactive Question Answering is still at
its early stages.
Furthermore, an issue which still remains open in current Question Answering is
that of efficiently answering complex questions, such as those invoking definitions and
descriptions (e.g. What is a metaphor?). Indeed, it is difficult to design criteria to assess
the correctness of answers to such complex questions.
These are the central research problems addressed by this thesis, and are solved as
follows.
An in-depth study on complex Question Answering led to the development of classifiers
for complex answers. These exploit a variety of lexical, syntactic and shallow
semantic features to perform textual classification using tree-kernel functions for Support
Vector Machines.
The issue of personalization is solved by the integration of a User Modelling component
within the Question Answering model. The User Model is able to filter and
re-rank results based on the user's reading level and interests.
The issue of interactivity is approached by the development of a dialogue model and a
dialogue manager suitable for open-domain interactive Question Answering. The utility
of such a model is corroborated by the integration of an interactive interface to allow reference
resolution and follow-up conversation into the core Question Answering system and
by its evaluation.
Finally, the models of personalized and interactive Question Answering are integrated
in a comprehensive framework forming a unified model for future Question Answering
research.
Perceptual models in speech quality assessment and coding
The ever-increasing demand for good communications/toll
quality speech has created a renewed interest in the
perceptual impact of rate compression. Two general areas are
investigated in this work, namely speech quality assessment
and speech coding.
In the field of speech quality assessment, a model is
developed which simulates the processing stages of the
peripheral auditory system. At the output of the model a
"running" auditory spectrum is obtained. This represents
the auditory (spectral) equivalent of any acoustic sound such
as speech. Auditory spectra from coded speech segments serve
as inputs to a second model. This model simulates the
information centre in the brain which performs the speech
quality assessment. [Continues.]
Incentives and Two-Sided Matching - Engineering Coordination Mechanisms for Social Clouds
The Social Cloud framework leverages existing relationships between members of a social network for the exchange of resources. This thesis focuses on the design of coordination mechanisms to address two challenges in this scenario. In the first part, user participation incentives are studied. In the second part, heuristics for two-sided matching-based resource allocation are designed and evaluated.
Heavy Quarkonium Physics
This report is the result of the collaboration and research effort of the
Quarkonium Working Group over the last three years. It provides a comprehensive
overview of the state of the art in heavy-quarkonium theory and experiment,
covering quarkonium spectroscopy, decay, and production, the determination of
QCD parameters from quarkonium observables, quarkonia in media, and the effects
on quarkonia of physics beyond the Standard Model. An introduction to common
theoretical and experimental tools is included. Future opportunities for
research in quarkonium physics are also discussed. Comment: xviii + 487 pages, 260 figures. The full text is also available at
the Quarkonium Working Group web page: http://www.qwg.to.infn.i