72 research outputs found

    Oscan Pruffed Again


    Interactive translation prediction versus conventional post-editing in practice: a study with the CasMaCat workbench

    We conducted a field trial in computer-assisted professional translation to compare interactive translation prediction (ITP) against conventional post-editing (PE) of machine translation (MT) output. In contrast to the conventional PE set-up, where an MT system first produces a static translation hypothesis that is then edited by a professional (hence "post-editing"), ITP constantly updates the translation hypothesis in real time in response to user edits. Our study involved nine professional translators and four reviewers working with the web-based CasMaCat workbench. Various new interactive features aiming to assist the post-editor/translator were also tested in this trial. Our results show that even with little training, ITP can be as productive as conventional PE in terms of the total time required to produce the final translation. Moreover, translation editors working with ITP require fewer keystrokes to arrive at the final version of their translation.

    This work was supported by the European Union’s 7th Framework Programme (FP7/2007–2013) under grant agreement No 287576 (CasMaCat).

    Sanchis Trilles, G.; Alabau, V.; Buck, C.; Carl, M.; Casacuberta Nolla, F.; Garcia Martinez, MM.; Germann, U.... (2014). Interactive translation prediction versus conventional post-editing in practice: a study with the CasMaCat workbench. Machine Translation. 28(3-4):217-235. https://doi.org/10.1007/s10590-014-9157-9

    NOX1 loss-of-function genetic variants in patients with inflammatory bowel disease.

    Genetic defects that affect intestinal epithelial barrier function can present with very early-onset inflammatory bowel disease (VEOIBD). Using whole-genome sequencing, a novel hemizygous defect in NOX1, encoding NADPH oxidase 1, was identified in a patient with ulcerative colitis-like VEOIBD. Exome screening of 1,878 pediatric patients identified seven further male inflammatory bowel disease (IBD) patients with rare NOX1 mutations. Loss of function was validated in p.N122H and p.T497A, and to a lesser degree in p.Y470H, p.R287Q, p.I67M, and p.Q293R, as well as in the previously described p.P330S and the common NOX1 SNP variant p.D360N (rs34688635). The missense mutation p.N122H abrogated reactive oxygen species (ROS) production in cell lines, ex vivo colonic explants, and patient-derived colonic organoid cultures. Within colonic crypts, NOX1 constitutively generates a high level of ROS in the crypt lumen. Analysis of 9,513 controls and 11,140 IBD patients of non-Jewish European ancestry did not reveal an association between p.D360N and IBD. Our data suggest that loss-of-function variants in NOX1 do not cause a Mendelian disorder of high penetrance but are a context-specific modifier. Our results imply that variants in NOX1 change brush border ROS within colonic crypts at the interface between the epithelium and luminal microbes.

    Structure, function and diversity of the healthy human microbiome

    Author Posting. © The Authors, 2012. This article is posted here by permission of Nature Publishing Group. The definitive version was published in Nature 486 (2012): 207-214, doi:10.1038/nature11234.

    Studies of the human microbiome have revealed that even healthy individuals differ remarkably in the microbes that occupy habitats such as the gut, skin and vagina. Much of this diversity remains unexplained, although diet, environment, host genetics and early microbial exposure have all been implicated. Accordingly, to characterize the ecology of human-associated microbial communities, the Human Microbiome Project has analysed the largest cohort and set of distinct, clinically relevant body habitats so far. We found the diversity and abundance of each habitat’s signature microbes to vary widely even among healthy subjects, with strong niche specialization both within and among individuals. The project encountered an estimated 81–99% of the genera, enzyme families and community configurations occupied by the healthy Western microbiome. Metagenomic carriage of metabolic pathways was stable among individuals despite variation in community structure, and ethnic/racial background proved to be one of the strongest associations of both pathways and microbes with clinical metadata. These results thus delineate the range of structural and functional configurations normal in the microbial communities of a healthy population, enabling future characterization of the epidemiology, ecology and translational applications of the human microbiome.

    This research was supported in part by National Institutes of Health grants U54HG004969 to B.W.B.; U54HG003273 to R.A.G.; U54HG004973 to R.A.G., S.K.H. and J.F.P.; U54HG003067 to E.S. Lander; U54AI084844 to K.E.N.; N01AI30071 to R.L. Strausberg; U54HG004968 to G.M.W.; U01HG004866 to O.R.W.; U54HG003079 to R.K.W.; R01HG005969 to C.H.; R01HG004872 to R.K.; R01HG004885 to M.P.; R01HG005975 to P.D.S.; R01HG004908 to Y.Y.; R01HG004900 to M.K. Cho and P. Sankar; R01HG005171 to D.E.H.; R01HG004853 to A.L.M.; R01HG004856 to R.R.; R01HG004877 to R.R.S. and R.F.; R01HG005172 to P. Spicer; R01HG004857 to M.P.; R01HG004906 to T.M.S.; R21HG005811 to E.A.V.; M.J.B. was supported by UH2AR057506; G.A.B. was supported by UH2AI083263 and UH3AI083263 (G.A.B., C. N. Cornelissen, L. K. Eaves and J. F. Strauss); S.M.H. was supported by UH3DK083993 (V. B. Young, E. B. Chang, F. Meyer, T. M. S., M. L. Sogin, J. M. Tiedje); K.P.R. was supported by UH2DK083990 (J. V.); J.A.S. and H.H.K. were supported by UH2AR057504 and UH3AR057504 (J.A.S.); DP2OD001500 to K.M.A.; N01HG62088 to the Coriell Institute for Medical Research; U01DE016937 to F.E.D.; S.K.H. was supported by RC1DE0202098 and R01DE021574 (S.K.H. and H. Li); J.I. was supported by R21CA139193 (J.I. and D. S. Michaud); K.P.L. was supported by P30DE020751 (D. J. Smith); Army Research Office grant W911NF-11-1-0473 to C.H.; National Science Foundation grants NSF DBI-1053486 to C.H. and NSF IIS-0812111 to M.P.; The Office of Science of the US Department of Energy under Contract No. DE-AC02-05CH11231 for P.S.C.; LANL Laboratory-Directed Research and Development grant 20100034DR and the US Defense Threat Reduction Agency grants B104153I and B084531I to P.S.C.; Research Foundation - Flanders (FWO) grant to K.F. and J. Raes; R.K. is an HHMI Early Career Scientist; Gordon & Betty Moore Foundation funding and institutional funding from the J. David Gladstone Institutes to K.S.P.; A.M.S. was supported by fellowships provided by the Rackham Graduate School and the NIH Molecular Mechanisms in Microbial Pathogenesis Training Grant T32AI007528; a Crohn’s and Colitis Foundation of Canada Grant in Aid of Research to E.A.V.; 2010 IBM Faculty Award to K.C.W.; analysis of the HMP data was performed using National Energy Research Scientific Computing resources, the BluBioU Computational Resource at Rice University

    A framework for human microbiome research

    A variety of microbial communities and their genes (the microbiome) exist throughout the human body, with fundamental roles in human health and disease. The National Institutes of Health (NIH)-funded Human Microbiome Project Consortium has established a population-scale framework to develop metagenomic protocols, resulting in a broad range of quality-controlled resources and data including standardized methods for creating, processing and interpreting distinct types of high-throughput metagenomic data available to the scientific community. Here we present resources from a population of 242 healthy adults sampled at 15 or 18 body sites up to three times, which have generated 5,177 microbial taxonomic profiles from 16S ribosomal RNA genes and over 3.5 terabases of metagenomic sequence so far. In parallel, approximately 800 reference strains isolated from the human body have been sequenced. Collectively, these data represent the largest resource describing the abundance and variety of the human microbiome, while providing a framework for current and future studies

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Common germline polymorphisms associated with breast cancer-specific survival

    Introduction: Previous studies have identified common germline variants nominally associated with breast cancer survival. These associations have not been widely replicated in further studies. The purpose of this study was to evaluate the association of previously reported SNPs with breast cancer-specific survival using data from a pooled analysis of eight breast cancer survival genome-wide association studies (GWAS) from the Breast Cancer Association Consortium.

    Methods: A literature review was conducted of all previously published associations between common germline variants and three survival outcomes: breast cancer-specific survival, overall survival and disease-free survival. All associations that reached the nominal significance level of P < 0.05 were included. Single nucleotide polymorphisms that had been previously reported as nominally associated with at least one survival outcome were evaluated in the pooled analysis of over 37,000 breast cancer cases for association with breast cancer-specific survival. Previous associations were evaluated using a one-sided test based on the reported direction of effect.

    Results: Fifty-six variants from 45 previous publications were evaluated in the meta-analysis. Fifty-four of these were evaluated in the full set of 37,954 breast cancer cases with 2,900 events, and the two additional variants were evaluated in a reduced sample size of 30,000 samples in order to ensure independence from the previously published studies. Five variants reached nominal significance (P < 0.05) in the pooled GWAS data, compared to 2.8 expected under the null hypothesis. Seven additional variants were associated (P < 0.05) with ER-positive disease.

    Conclusions: Although no variants reached genome-wide significance (P < 5 × 10⁻⁸), these results suggest that there is some evidence of association between candidate common germline variants and breast cancer prognosis. Larger studies from multinational collaborations are necessary to increase the power to detect associations between common variants and prognosis at more stringent significance levels.
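The figure of 2.8 variants expected under the null hypothesis is consistent with 56 tests at a nominal threshold of 0.05 (56 × 0.05 = 2.8). The sketch below reproduces that arithmetic and, as an illustration only, the chance of seeing five or more nominal hits by luck. The function names are hypothetical, and the binomial tail assumes the 56 tests are independent, which the abstract does not claim.

```python
from math import comb


def expected_hits(n_tests: int, alpha: float) -> float:
    """Expected number of tests with P < alpha if every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float) -> float:
    """P(at least k of n_tests reach P < alpha), ASSUMING independent tests."""
    # Sum the binomial probabilities of seeing fewer than k hits, then complement.
    below_k = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - below_k


if __name__ == "__main__":
    # 56 variants evaluated at the nominal 0.05 level, as in the abstract.
    print("expected under the null:", expected_hits(56, 0.05))
    print("P(>= 5 hits by chance):", prob_at_least(5, 56, 0.05))
```

Because nearby variants can be correlated, the tail probability is a rough guide rather than a reanalysis of the study.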

    Exponential growth, high prevalence of SARS-CoV-2, and vaccine effectiveness associated with the Delta variant

    SARS-CoV-2 infections associated with the Delta variant were rising during early summer 2021 in many countries. We assessed RT-PCR swab-positivity in the REal-time Assessment of Community Transmission-1 (REACT-1) study in England. We observed sustained exponential growth with an average doubling time (June-July 2021) of 25 days, driven by complete replacement of the Alpha variant by Delta and by high prevalence at younger, less-vaccinated ages. Unvaccinated people were three times more likely than double-vaccinated people to test positive. However, after adjusting for age and other variables, vaccine effectiveness for double-vaccinated people was estimated at between ~50% and ~60% during this period in England. Increased social mixing in the presence of Delta had the potential to generate sustained growth in infections, even at high levels of vaccination.
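A doubling time maps onto an exponential growth rate through r = ln(2) / T. The sketch below applies that standard identity to the 25-day doubling time quoted in the abstract. The function names are hypothetical, and a constant growth rate is an assumption made for illustration, not a result from the study.

```python
import math


def growth_rate_per_day(doubling_time_days: float) -> float:
    """Exponential growth rate r (per day) implied by a doubling time T: r = ln(2) / T."""
    return math.log(2) / doubling_time_days


def growth_factor(doubling_time_days: float, elapsed_days: float) -> float:
    """Multiplicative change in prevalence after elapsed_days at a constant rate."""
    return math.exp(growth_rate_per_day(doubling_time_days) * elapsed_days)


if __name__ == "__main__":
    r = growth_rate_per_day(25)  # 25-day doubling time from the abstract
    print(f"growth rate: {r:.4f} per day")
    print(f"factor after 50 days: {growth_factor(25, 50):.2f}")
```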

    Changes in symptomatology, reinfection, and transmissibility associated with the SARS-CoV-2 variant B.1.1.7: an ecological study

    Background: The SARS-CoV-2 variant B.1.1.7 was first identified in December, 2020, in England. We aimed to investigate whether increases in the proportion of infections with this variant are associated with differences in symptoms or disease course, reinfection rates, or transmissibility.

    Methods: We did an ecological study to examine the association between the regional proportion of infections with the SARS-CoV-2 B.1.1.7 variant and reported symptoms, disease course, rates of reinfection, and transmissibility. Data on types and duration of symptoms were obtained from longitudinal reports from users of the COVID Symptom Study app who reported a positive test for COVID-19 between Sept 28 and Dec 27, 2020 (during which the prevalence of B.1.1.7 increased most notably in parts of the UK). From this dataset, we also estimated the frequency of possible reinfection, defined as the presence of two reported positive tests separated by more than 90 days with a period of reporting no symptoms for more than 7 days before the second positive test. The proportion of SARS-CoV-2 infections with the B.1.1.7 variant across the UK was estimated with use of genomic data from the COVID-19 Genomics UK Consortium and data from Public Health England on spike-gene target failure (a non-specific indicator of the B.1.1.7 variant) in community cases in England. We used linear regression to examine the association between reported symptoms and proportion of B.1.1.7. We assessed the Spearman correlation between the proportion of B.1.1.7 cases and number of reinfections over time, and between the number of positive tests and reinfections. We estimated incidence for B.1.1.7 and previous variants, and compared the effective reproduction number, Rt, for the two incidence estimates.

    Findings: From Sept 28 to Dec 27, 2020, positive COVID-19 tests were reported by 36 920 COVID Symptom Study app users whose region was known and who reported as healthy on app sign-up. We found no changes in reported symptoms or disease duration associated with B.1.1.7. For the same period, possible reinfections were identified in 249 (0·7% [95% CI 0·6–0·8]) of 36 509 app users who reported a positive swab test before Oct 1, 2020, but there was no evidence that the frequency of reinfections was higher for the B.1.1.7 variant than for pre-existing variants. Reinfection occurrences were more positively correlated with the overall regional rise in cases (Spearman correlation 0·56–0·69 for South East, London, and East of England) than with the regional increase in the proportion of infections with the B.1.1.7 variant (Spearman correlation 0·38–0·56 in the same regions), suggesting B.1.1.7 does not substantially alter the risk of reinfection. We found a multiplicative increase in the Rt of B.1.1.7 by a factor of 1·35 (95% CI 1·02–1·69) relative to pre-existing variants. However, Rt fell below 1 during regional and national lockdowns, even in regions with high proportions of infections with the B.1.1.7 variant.

    Interpretation: The lack of change in symptoms identified in this study indicates that existing testing and surveillance infrastructure do not need to change specifically for the B.1.1.7 variant. In addition, given that there was no apparent increase in the reinfection rate, vaccines are likely to remain effective against the B.1.1.7 variant.

    Funding: Zoe Global, Department of Health (UK), Wellcome Trust, Engineering and Physical Sciences Research Council (UK), National Institute for Health Research (UK), Medical Research Council (UK), Alzheimer's Society.
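The reinfection proportion quoted above (249 of 36 509 users) can be checked with a simple normal-approximation (Wald) interval, sketched below. The abstract does not say which interval method the authors used, so the Wald formula and the function name are assumptions made for illustration.

```python
import math


def wald_interval(events: int, total: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) using the normal-approximation interval.

    ASSUMPTION: a Wald interval with z = 1.96 (about 95% coverage); the
    study's own interval method is not stated in the abstract.
    """
    p = events / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


if __name__ == "__main__":
    # 249 possible reinfections among 36 509 app users, as reported above.
    p, lower, upper = wald_interval(249, 36509)
    print(f"{p:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

Rounded to one decimal place, this approximation agrees with the 0·7% (0·6–0·8) reported in the abstract.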