
    Structured local exponential models for machine translation

    This thesis proposes a synthesis and generalization of local exponential translation models, the subclass of feature-rich translation models which associate probability distributions with individual rewrite rules used by the translation system, such as synchronous context-free rules, or with other individual aspects of translation hypotheses such as word pairs or reordering events. Unlike other authors, we use these estimates to replace the traditional phrase models and lexical scores, rather than in addition to them, thereby demonstrating that the local exponential phrase models can be regarded as a generalization of standard methods not only in theoretical but also in practical terms. We further introduce a form of local translation models that combine features associated with surface forms of rules and features associated with less specific representations -- including those based on lemmas, inflections, and reordering patterns -- such that surface-form estimates are recovered as a special case of the model. Crucially, the proposed approach allows estimation of parameters for the latter type of features from training sets that include multiple source phrases, thereby overcoming an important training set fragmentation problem which hampers previously proposed local translation models. These proposals are experimentally validated. Conditioning all phrase-based probabilities in a hierarchical phrase-based system on source-side contextual information produces significant performance improvements. Extending the contextually-sensitive estimates with features modeling source-side morphology and reordering patterns yields consistent additional improvements, while further experiments show significant improvements obtained from modeling observed and unobserved inflections for a morphologically rich target language.

    Transplant arteriosclerosis: an enigmatic disease due to a misnomer

    Solid organ transplantation across the allogeneic barrier, pioneered by Thomas Starzl, has by now become a common medical procedure. Unfortunately, the number of donor organs lost due to transplant arteriosclerosis (chronic rejection) has remained significant and unchanged for decades. We argue that the designation of transplant arteriosclerosis as chronic rejection, and its classification as a delayed long-lasting reaction of recipient immune effectors against donor alloantigens, have given us the wrong impression that we have identified the necessary cause/pathogenesis of the tissue pathology. However, the treatment options we have in the anti-rejection toolbox, despite their success in treating classical rejection, do not work for transplant arteriosclerosis. Yet the scientific community has continued to conceptualize and approach the pathology within the alloimmunity model. Due to unproductive research from the alloimmunity and rejection perspective, the number of transplanted hearts lost due to this pathology today is almost the same as it was fifty years ago. We believe that this phenomenon falls under the rubric of linguistic relativity, and that the language we chose to name the disease has restricted our cognitive ability to solve the problem. While the initial perception of transplant arteriosclerosis as chronic rejection was logical and scientific, subsequent experience revealed that such perception and approach have been fruitless, and likely are incorrect. Considering our tragic failure to prevent and treat the delayed arterial pathology of donor organs using all available knowledge on alloimmunity and rejection, we must finally disassociate the former from the latter. The only way to start this uncomfortable process is to change the words we are using; particularly, the words we chose to name the disease. We have to step out of the alloimmunity rejection box. Comment: 19 pages, 2 figures

    The Hiero Machine Translation System: Extensions, Evaluation, and Analysis

    Hierarchical organization is a well-known property of language, and yet the notion of hierarchical structure has been largely absent from the best-performing machine translation systems in recent community-wide evaluations. In this paper, we discuss a new hierarchical phrase-based statistical machine translation system (Chiang, 2005), presenting recent extensions to the original proposal, new evaluation results in a community-wide evaluation, and a novel technique for fine-grained comparative analysis of MT systems.

    TDT-2002 Topic Tracking at Maryland: First Experiments with the Lemur Toolkit

    The University of Maryland submitted six topic tracking runs for the 2002 Topic Detection and Tracking evaluation. Two runs were produced using the Lemur language modeling toolkit; the remaining four were produced using a separate system coded in Perl. The Lemur runs outperformed the Perl runs on the required condition because term frequency information was better handled. Two of the Perl runs used native Arabic orthography with two-best translation based on a statistical lexicon, obtaining results similar to those obtained with the Arabic-to-English translations provided with the collection. UMIACS-TR-2003-24 LAMP-TR-09

    Effect of liver transplantation on inflammatory bowel disease in patients with primary sclerosing cholangitis

    This report investigates the influence of liver transplantation and concomitant immunosuppression on the course of progression of inflammatory bowel disease (IBD) and discusses statistical methodology appropriate for such settings. The data on 303 patients who underwent liver transplantation for primary sclerosing cholangitis (PSC) were analyzed using person-time analysis and Cox regression, with the duration of IBD as the time variable and transplantation as a segmented time-dependent covariate, to take into account both posttransplant and pretransplant history of IBD. The need for colectomy and appearance of colorectal cancer were taken as outcome measures. The only significant risk factor in the multivariate model for colectomy was transplantation itself, which increased the risk of colectomy due to intractable disease (Wald statistic; P = .001). None of the variables available for analysis were found to influence the risk of colon cancer significantly. Graphs showing the dependence of the instantaneous risk of cancer on the time from onset of IBD and its independence from the latter in the case of colectomy are presented. The use of a unique statistical methodology described for the first time in this setting led us to the somewhat surprising conclusion that transplantation and concomitant use of immunosuppression accelerate the progression of IBD. At the same time, transplantation does not affect the incidence of colorectal cancer. These results confirm the findings of some recent studies and can potentially shed new light on the disease pathogenesis.

    Making MIRACLEs: Interactive Translingual Search for Cebuano and Hindi

    This article describes the design of MIRACLE, an easily extensible system based on English queries that has previously been used to search French, German, and Spanish documents, and explains how the capabilities of MIRACLE were rapidly extended to accommodate Cebuano and Hindi. Evaluation results for the cross-language search component are presented for both languages, along with results from a brief full-system interactive experiment with Hindi. The article concludes with some observations on directions for further research on interactive cross-language information retrieval.