17 research outputs found

    Automatic correction of grammatical errors in non-native English text

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 99-107). Learning a foreign language requires much practice outside of the classroom. Computer-assisted language learning systems can help fill this need, and one desirable capability of such systems is the automatic correction of grammatical errors in texts written by non-native speakers. This dissertation concerns the correction of non-native grammatical errors in English text, and the closely related task of generating test items for language learning, using a combination of statistical and linguistic methods. We show that syntactic analysis enables extraction of more salient features. We address issues concerning robustness in feature extraction from non-native texts, and also design a framework for simultaneous correction of multiple error types. Our proposed methods are applied to some of the most common usage errors, including prepositions, verb forms, and articles. The methods are evaluated on sentences with synthetic and real errors, and in both restricted and open domains. A secondary theme of this dissertation is that of user customization. We perform a detailed analysis of a non-native corpus, illustrating the utility of an error model based on the mother tongue. We study the benefits of adjusting the correction models based on the quality of the input text, and also present novel methods to generate high-quality multiple-choice items that are tailored to the interests of the user. By John Sie Yuen Lee. Ph.D.

    Programming With Jack (Fourth Edition)

    This manual describes the implementation of Jack™, with emphasis on how to extend and modify it. The principal purpose of this manual is to describe which functions in the Jack libraries are available for writing new features for Jack. The manual also gives an overview of how Jack works, for those interested in modifying its current behavior. This manual assumes that you already know how to use Jack and are familiar with its basic terminology.

    Study of a Multivariate Technique for the Search of Single Top-Quark Production with sqrt(s) = 8 TeV in the CMS Experiment at CERN

    This work presents a study of the search for single top-quark production in the CMS experiment at CERN, focusing on the s-channel process with the muon decay mode as the final-state topology, using a multivariate technique based on the Boosted Decision Trees (BDT) algorithm. The study is based on collision data collected at 8 TeV with the CMS detector, corresponding to a luminosity of 19.3 fb^(-1). The multivariate technique is combined with an optimization procedure to determine which variables are appropriate for separating signal from background events. The BDT output is obtained by optimizing the choice of input variables, iterating in a feedback loop that is globally sensitive to the correlation coefficients of the variables. The optimized BDT discriminant is then compared with the analysis performed without any optimization of the choice of inputs. The BDT output is found to show no significant change in separating power as the most globally correlated variables are iteratively removed. Reducing the variable list in this way can therefore be advantageous, since it improves our understanding of the physical meaning of the output classifier. This study is a first step towards optimizing BDT analyses of single top-quark production; in the next step, these results will be used to fit the data, accounting for the systematic uncertainties, and to extract the cross-section for the BDT discriminant obtained so far.

    Advanced techniques for personalized, interactive question answering

    Using a computer to answer questions has been a human dream since the beginning of the digital era. A first step towards the achievement of such an ambitious goal is to deal with natural language to enable the computer to understand what its user asks. The discipline that studies the connection between natural language and the representation of its meaning via computational models is computational linguistics. According to this discipline, Question Answering can be defined as the task that, given a question formulated in natural language, aims at finding one or more concise answers in the form of sentences or phrases. Question Answering can be interpreted as a sub-discipline of information retrieval with the added challenge of applying sophisticated techniques to identify the complex syntactic and semantic relationships present in text. Although it is widely accepted that Question Answering represents a step beyond standard information retrieval, allowing a more sophisticated and satisfactory response to the user's information needs, it still shares a series of unsolved issues with the latter. First, in most state-of-the-art Question Answering systems, the results are created independently of the questioner's characteristics, goals and needs. This is a serious limitation in several cases: for instance, a primary school child and a History student may need different answers to the question: When did the Middle Ages begin? Moreover, users often issue queries not as standalone but in the context of a wider information need, for instance when researching a specific topic.
    Although it has recently been proposed that providing Question Answering systems with dialogue interfaces would encourage and accommodate the submission of multiple related questions and handle the user's requests for clarification, interactive Question Answering is still at its early stages. Furthermore, an issue which still remains open in current Question Answering is that of efficiently answering complex questions, such as those invoking definitions and descriptions (e.g. What is a metaphor?). Indeed, it is difficult to design criteria to assess the correctness of answers to such complex questions. These are the central research problems addressed by this thesis, and are solved as follows. An in-depth study on complex Question Answering led to the development of classifiers for complex answers. These exploit a variety of lexical, syntactic and shallow semantic features to perform textual classification using tree-kernel functions for Support Vector Machines. The issue of personalization is solved by the integration of a User Modelling component within the Question Answering model. The User Model is able to filter and re-rank results based on the user's reading level and interests. The issue of interactivity is approached by the development of a dialogue model and a dialogue manager suitable for open-domain interactive Question Answering. The utility of such a model is corroborated by the integration of an interactive interface to allow reference resolution and follow-up conversation into the core Question Answering system and by its evaluation. Finally, the models of personalized and interactive Question Answering are integrated in a comprehensive framework forming a unified model for future Question Answering research.

    Perceptual models in speech quality assessment and coding

    The ever-increasing demand for good communications/toll quality speech has created a renewed interest in the perceptual impact of rate compression. Two general areas are investigated in this work, namely speech quality assessment and speech coding. In the field of speech quality assessment, a model is developed which simulates the processing stages of the peripheral auditory system. At the output of the model a "running" auditory spectrum is obtained. This represents the auditory (spectral) equivalent of any acoustic sound such as speech. Auditory spectra from coded speech segments serve as inputs to a second model. This model simulates the information centre in the brain which performs the speech quality assessment. [Continues.]

    Incentives and Two-Sided Matching - Engineering Coordination Mechanisms for Social Clouds

    The Social Cloud framework leverages existing relationships between members of a social network for the exchange of resources. This thesis focuses on the design of coordination mechanisms to address two challenges in this scenario. In the first part, user participation incentives are studied. In the second part, heuristics for two-sided matching-based resource allocation are designed and evaluated.

    Heavy Quarkonium Physics

    This report is the result of the collaboration and research effort of the Quarkonium Working Group over the last three years. It provides a comprehensive overview of the state of the art in heavy-quarkonium theory and experiment, covering quarkonium spectroscopy, decay, and production, the determination of QCD parameters from quarkonium observables, quarkonia in media, and the effects on quarkonia of physics beyond the Standard Model. An introduction to common theoretical and experimental tools is included. Future opportunities for research in quarkonium physics are also discussed. Comment: xviii + 487 pages, 260 figures. The full text is also available at the Quarkonium Working Group web page: http://www.qwg.to.infn.i