Topic modeling-based domain adaptation for system combination
This paper gives the system description of the domain adaptation team of Dublin City University for our participation in the system combination task in the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12). We used the results of unsupervised document classification as meta-information for the system combination module. For the Spanish-English data, our strategy achieved 26.33 BLEU points, an absolute improvement of 0.33 BLEU points over the standard confusion-network-based system combination. This was the best BLEU score among the six participants in ML4HMT-12.
Optimizing expected word error rate via sampling for speech recognition
State-level minimum Bayes risk (sMBR) training has become the de facto
standard for sequence-level training of speech recognition acoustic models. It
has an elegant formulation using the expectation semiring, and gives large
improvements in word error rate (WER) over models trained solely using
cross-entropy (CE) or connectionist temporal classification (CTC). sMBR
training optimizes the expected number of frames at which the reference and
hypothesized acoustic states differ. It may be preferable to optimize the
expected WER, but WER does not interact well with the expectation semiring, and
previous approaches based on computing expected WER exactly involve expanding
the lattices used during training. In this paper we show how to perform
optimization of the expected WER by sampling paths from the lattices used
during conventional sMBR training. The gradient of the expected WER is itself
an expectation, and so may be approximated using Monte Carlo sampling. We show
experimentally that optimizing WER during acoustic model training gives 5%
relative improvement in WER over a well-tuned sMBR baseline on a 2-channel
query recognition task (Google Home).
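The sampling idea in the abstract above (the gradient of an expected loss is itself an expectation, so it can be estimated from sampled paths) can be illustrated on a toy problem. The sketch below is not the paper's implementation: it assumes a "lattice" with only three paths, a softmax over hypothetical path scores `logits`, and hypothetical per-path word-error counts `losses`. It compares the exact gradient of the expected loss with a Monte Carlo (score-function) estimate.

```python
import math
import random


def softmax(logits):
    """Turn path scores into a probability distribution over paths."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def exact_gradient(logits, losses):
    """Exact gradient of sum_k p_k * loss_k with respect to the logits.

    For a softmax distribution this is p_k * (loss_k - expected_loss).
    """
    p = softmax(logits)
    expected = sum(pk * lk for pk, lk in zip(p, losses))
    return [pk * (lk - expected) for pk, lk in zip(p, losses)]


def sampled_gradient(logits, losses, num_samples, rng):
    """Monte Carlo estimate of the same gradient from sampled paths.

    Uses grad log p(y) = onehot(y) - p and subtracts the mean sampled loss
    as a baseline (an assumption of this sketch; it reduces variance at the
    cost of a small bias that vanishes as num_samples grows).
    """
    p = softmax(logits)
    n = len(logits)
    samples = rng.choices(range(n), weights=p, k=num_samples)
    baseline = sum(losses[i] for i in samples) / num_samples
    grad = [0.0] * n
    for i in samples:
        advantage = losses[i] - baseline
        for k in range(n):
            indicator = 1.0 if k == i else 0.0
            grad[k] += advantage * (indicator - p[k])
    return [g / num_samples for g in grad]


if __name__ == "__main__":
    logits = [0.5, 0.0, -0.5]   # hypothetical path scores
    losses = [0.0, 1.0, 2.0]    # hypothetical word errors per path
    rng = random.Random(0)
    print("exact  :", exact_gradient(logits, losses))
    print("sampled:", sampled_gradient(logits, losses, 20000, rng))
```

With enough samples the two gradients should agree closely. In a real system the paths would be sampled from a training lattice and the loss would be the edit distance to the reference transcript.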
Randomized controlled trial of a coordinated care intervention to improve risk factor control after stroke or transient ischemic attack in the safety net: Secondary stroke prevention by Uniting Community and Chronic care model teams Early to End Disparities (SUCCEED).
Background: Recurrent strokes are preventable through awareness and control of risk factors such as hypertension, and through lifestyle changes such as healthier diets, greater physical activity, and smoking cessation. However, vascular risk factor control is frequently poor among stroke survivors, particularly among socio-economically disadvantaged blacks, Latinos and other people of color. The Chronic Care Model (CCM) is an effective framework for multi-component interventions aimed at improving care processes and outcomes for individuals with chronic disease. In addition, community health workers (CHWs) have played an integral role in reducing health disparities; however, their effectiveness in reducing vascular risk among stroke survivors remains unknown. Our objectives are to develop, test, and assess the economic value of a CCM-based intervention using an Advanced Practice Clinician (APC)-CHW team to improve risk factor control after stroke in an under-resourced, racially/ethnically diverse population. Methods/design: In this single-blind randomized controlled trial, 516 adults (≥40 years) with an ischemic stroke, transient ischemic attack or intracerebral hemorrhage within the prior 90 days are being enrolled at five sites within the Los Angeles County safety-net setting and randomized 1:1 to intervention vs usual care. Participants are excluded if they do not speak English, Spanish, Cantonese, Mandarin, or Korean or if they are unable to consent. The intervention includes a minimum of three clinic visits in the healthcare setting, three home visits, and Chronic Disease Self-Management Program group workshops in community venues. The primary outcome is blood pressure (BP) control (systolic BP <130 mmHg) at 1 year.
Secondary outcomes include: (1) mean change in systolic BP; (2) control of other vascular risk factors including lipids and hemoglobin A1c; (3) inflammation (C-reactive protein [CRP]); (4) medication adherence; (5) lifestyle factors (smoking, diet, and physical activity); (6) estimated relative reduction in risk for recurrent stroke or myocardial infarction (MI); and (7) cost-effectiveness of the intervention versus usual care. Discussion: If this multi-component interdisciplinary intervention is shown to be effective in improving risk factor control after stroke, it may serve as a model that can be used internationally to reduce race/ethnic and socioeconomic disparities in stroke in resource-constrained settings. Trial registration: ClinicalTrials.gov Identifier NCT01763203
Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models
Large-scale clinical data is invaluable to driving many computational
scientific advances today. However, understandable concerns regarding patient
privacy hinder the open dissemination of such data and give rise to suboptimal
siloed research. De-identification methods attempt to address these concerns
but were shown to be susceptible to adversarial attacks. In this work, we focus
on the vast amounts of unstructured natural language data stored in clinical
notes and propose to automatically generate synthetic clinical notes that are
more amenable to sharing using generative models trained on real de-identified
records. To evaluate the merit of such notes, we measure both their privacy
preservation properties as well as utility in training clinical NLP models.
Experiments using neural language models yield notes whose utility is close to
that of the real ones in some clinical NLP tasks, yet leave ample room for
future improvements. Comment: Clinical NLP Workshop 201
MATREX: the DCU MT system for WMT 2010
This paper describes the DCU machine translation system in the evaluation campaign of the Joint Fifth Workshop on Statistical Machine Translation and Metrics in ACL-2010. We describe the modular design of our multi-engine machine translation (MT) system with particular focus on the components used in this participation.
We participated in the English–Spanish and English–Czech translation tasks, for which we used our multi-engine
architecture. We also participated in the system combination task, which was carried out using the MBR
decoder and the confusion network decoder.
Modeling the live-pig trade network in Georgia: Implications for disease prevention and control.
Live-pig trade patterns, drivers and characteristics, particularly in systems where backyard production predominates, remain largely unexplored despite their important contribution to the spread of infectious diseases in the swine industry. A better understanding of pig trade dynamics can inform the implementation of risk-based and more cost-effective prevention and control programs for swine diseases. In this study, a semi-structured questionnaire developed by FAO and administered to 487 farmers was used to collect data on basic pig demographics and live-pig trade among villages in the country of Georgia, where very little information is available. Social network analysis and exponential random graph models were used to better understand the structure, contact patterns and main drivers of pig trade in the country. Results indicate relatively infrequent (a total of 599 shipments in one year) and geographically localized (median Euclidean distance between shipments = 6.08 km; IQR = 0-13.88 km) pig movements in the studied regions. The main factors contributing to live-pig trade movements among villages were being from the same region (i.e., local trade), use of a middleman or a live animal market to trade live pigs by at least one farmer in the village, and having a large number of pig farmers in the village. The identified village characteristics and structural network properties could be used to inform the design of more cost-effective surveillance systems in a country whose pig industry was recently devastated by African swine fever epidemics and where backyard production systems are predominant.
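As an illustration of the distance summary reported in the abstract above (median and interquartile range of Euclidean shipment distances), the following is a minimal sketch and not the authors' analysis. It assumes each shipment is a pair of origin and destination village coordinates already projected to kilometres. The coordinates, the function name `shipment_distance_summary`, and the choice of the "inclusive" quantile method are all assumptions made for this example.

```python
import math
import statistics


def shipment_distance_summary(shipments):
    """Return (median, q1, q3) of straight-line shipment distances.

    `shipments` is a list of ((x1, y1), (x2, y2)) pairs in kilometres
    (assumed projected coordinates, so Euclidean distance is meaningful).
    A shipment within the same village has distance 0.
    """
    distances = [math.dist(origin, dest) for origin, dest in shipments]
    median = statistics.median(distances)
    # quantiles(n=4) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(distances, n=4, method="inclusive")
    return median, q1, q3


if __name__ == "__main__":
    # Hypothetical shipments, for illustration only.
    shipments = [
        ((0.0, 0.0), (0.0, 0.0)),   # within-village trade
        ((0.0, 0.0), (3.0, 4.0)),   # 5 km
        ((0.0, 0.0), (6.0, 8.0)),   # 10 km
    ]
    median, q1, q3 = shipment_distance_summary(shipments)
    print(f"median = {median} km; IQR = {q1}-{q3} km")
```

A lower quartile of 0 km, as in the reported IQR, would arise when at least a quarter of shipments stay within a single village.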