
    Topic modeling-based domain adaptation for system combination

    This paper describes the system submitted by the Dublin City University domain adaptation team to the system combination task of the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12). We used the results of unsupervised document classification as meta-information for the system combination module. For the Spanish-English data, our strategy achieved 26.33 BLEU points, an absolute improvement of 0.33 BLEU points over the standard confusion-network-based system combination. This was the best score in terms of BLEU among six participants in ML4HMT-12.

    Optimizing expected word error rate via sampling for speech recognition

    State-level minimum Bayes risk (sMBR) training has become the de facto standard for sequence-level training of speech recognition acoustic models. It has an elegant formulation using the expectation semiring, and gives large improvements in word error rate (WER) over models trained solely using cross-entropy (CE) or connectionist temporal classification (CTC). sMBR training optimizes the expected number of frames at which the reference and hypothesized acoustic states differ. It may be preferable to optimize the expected WER, but WER does not interact well with the expectation semiring, and previous approaches based on computing expected WER exactly involve expanding the lattices used during training. In this paper we show how to optimize the expected WER by sampling paths from the lattices used during conventional sMBR training. The gradient of the expected WER is itself an expectation, and so may be approximated using Monte Carlo sampling. We show experimentally that optimizing WER during acoustic model training gives a 5% relative improvement in WER over a well-tuned sMBR baseline on a 2-channel query recognition task (Google Home).
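    The idea that the gradient of an expected error count is itself an expectation can be illustrated with a toy sketch. The code below is not the paper's lattice-based method: it assumes a small, hypothetical set of competing hypotheses whose probabilities are a softmax over per-hypothesis scores, and it compares a sampled gradient estimate with the exact gradient. All function names, hypotheses, and scores are assumptions made for illustration.

```python
import math
import random


def word_errors(ref, hyp):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def exact_gradient(logits, losses):
    """d E[loss] / d logit_k = p_k * (loss_k - E[loss]) for a softmax."""
    p = softmax(logits)
    expected = sum(pi * li for pi, li in zip(p, losses))
    return [pi * (li - expected) for pi, li in zip(p, losses)]


def sampled_gradient(logits, losses, num_samples, rng):
    """Monte Carlo estimate of the same gradient from sampled hypotheses.

    Each sample contributes (loss - baseline) * d log p(sample) / d logits.
    The baseline is the mean loss of the samples; this reduces variance at
    the cost of a small bias that vanishes as num_samples grows.
    """
    p = softmax(logits)
    indices = rng.choices(range(len(p)), weights=p, k=num_samples)
    baseline = sum(losses[i] for i in indices) / num_samples
    grad = [0.0] * len(p)
    for i in indices:
        advantage = losses[i] - baseline
        for k in range(len(p)):
            indicator = 1.0 if k == i else 0.0
            grad[k] += advantage * (indicator - p[k])
    return [g / num_samples for g in grad]


if __name__ == "__main__":
    reference = "turn on the lights".split()
    hypotheses = [
        "turn on the lights".split(),
        "turn on the light".split(),
        "turn off the lights please".split(),
    ]
    logits = [1.0, 0.5, 0.0]  # hypothetical model scores
    losses = [word_errors(reference, h) for h in hypotheses]
    print("exact  :", exact_gradient(logits, losses))
    print("sampled:", sampled_gradient(logits, losses, 20000, random.Random(0)))
```

    With enough samples the two gradients should agree closely. In this toy setting the exact gradient is cheap, so sampling is shown only to make the estimator concrete.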

    Randomized controlled trial of a coordinated care intervention to improve risk factor control after stroke or transient ischemic attack in the safety net: Secondary stroke prevention by Uniting Community and Chronic care model teams Early to End Disparities (SUCCEED).

    Background: Recurrent strokes are preventable through awareness and control of risk factors such as hypertension, and through lifestyle changes such as healthier diets, greater physical activity, and smoking cessation. However, vascular risk factor control is frequently poor among stroke survivors, particularly among socio-economically disadvantaged blacks, Latinos and other people of color. The Chronic Care Model (CCM) is an effective framework for multi-component interventions aimed at improving care processes and outcomes for individuals with chronic disease. In addition, community health workers (CHWs) have played an integral role in reducing health disparities; however, their effectiveness in reducing vascular risk among stroke survivors remains unknown. Our objectives are to develop, test, and assess the economic value of a CCM-based intervention using an Advanced Practice Clinician (APC)-CHW team to improve risk factor control after stroke in an under-resourced, racially/ethnically diverse population.
    Methods/design: In this single-blind randomized controlled trial, 516 adults (≥40 years) with an ischemic stroke, transient ischemic attack or intracerebral hemorrhage within the prior 90 days are being enrolled at five sites within the Los Angeles County safety-net setting and randomized 1:1 to intervention vs usual care. Participants are excluded if they do not speak English, Spanish, Cantonese, Mandarin, or Korean or if they are unable to consent. The intervention includes a minimum of three clinic visits in the healthcare setting, three home visits, and Chronic Disease Self-Management Program group workshops in community venues. The primary outcome is blood pressure (BP) control (systolic BP <130 mmHg) at 1 year. Secondary outcomes include: (1) mean change in systolic BP; (2) control of other vascular risk factors including lipids and hemoglobin A1c; (3) inflammation (C reactive protein [CRP]); (4) medication adherence; (5) lifestyle factors (smoking, diet, and physical activity); (6) estimated relative reduction in risk for recurrent stroke or myocardial infarction (MI); and (7) cost-effectiveness of the intervention versus usual care.
    Discussion: If this multi-component interdisciplinary intervention is shown to be effective in improving risk factor control after stroke, it may serve as a model that can be used internationally to reduce race/ethnic and socioeconomic disparities in stroke in resource-constrained settings.
    Trial registration: ClinicalTrials.gov Identifier NCT01763203

    Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models

    Large-scale clinical data is invaluable for driving many computational scientific advances today. However, understandable concerns regarding patient privacy hinder the open dissemination of such data and give rise to suboptimal, siloed research. De-identification methods attempt to address these concerns but have been shown to be susceptible to adversarial attacks. In this work, we focus on the vast amounts of unstructured natural language data stored in clinical notes and propose to automatically generate synthetic clinical notes that are more amenable to sharing, using generative models trained on real de-identified records. To evaluate the merit of such notes, we measure both their privacy preservation properties and their utility in training clinical NLP models. Experiments using neural language models yield notes whose utility is close to that of the real ones in some clinical NLP tasks, yet leave ample room for future improvements. Comment: Clinical NLP Workshop 201

    MATREX: the DCU MT system for WMT 2010

    This paper describes the DCU machine translation system entered in the evaluation campaign of the Joint Fifth Workshop on Statistical Machine Translation and Metrics in ACL-2010. We describe the modular design of our multi-engine machine translation (MT) system, with particular focus on the components used in this participation. We participated in the English–Spanish and English–Czech translation tasks, for which we used our multi-engine architecture. We also participated in the system combination task, which was carried out using the MBR decoder and the confusion network decoder.

    Modeling the live-pig trade network in Georgia: Implications for disease prevention and control.

    Live-pig trade patterns, drivers and characteristics, particularly in systems where backyard production predominates, remain largely unexplored despite their important contribution to the spread of infectious diseases in the swine industry. A better understanding of pig trade dynamics can inform the implementation of risk-based and more cost-effective prevention and control programs for swine diseases. In this study, a semi-structured questionnaire developed by FAO and administered to 487 farmers was used to collect data on basic characteristics of pig demographics and live-pig trade among villages in the country of Georgia, where very little information is available. Social network analysis and exponential random graph models were used to better understand the structure, contact patterns and main drivers of pig trade in the country. Results indicate relatively infrequent (a total of 599 shipments in one year) and geographically localized (median Euclidean distance between shipments = 6.08 km; IQR = 0-13.88 km) pig movements in the studied regions. The main factors contributing to live-pig trade movements among villages were being from the same region (i.e., local trade), use of a middleman or a live animal market to trade live pigs by at least one farmer in the village, and having a large number of pig farmers in the village. The identified village characteristics and structural network properties could be used to inform the design of more cost-effective surveillance systems in a country whose pig industry was recently devastated by African swine fever epidemics and where backyard production systems are predominant.