
    Neural Machine Translation for English–Kazakh with Morphological Segmentation and Synthetic Data

    This paper presents the systems submitted by the University of Groningen to the English-Kazakh language pair (both translation directions) for the WMT 2019 news translation task. We explore the potential benefits of (i) morphological segmentation (both unsupervised and rule-based), given the agglutinative nature of Kazakh, (ii) data from two additional languages (Turkish and Russian), given the scarcity of English-Kazakh data, and (iii) synthetic data, both for the source and for the target language. Our best submissions ranked second for Kazakh-English and third for English-Kazakh in terms of the BLEU automatic evaluation metric.
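
    As a rough illustration of the unsupervised segmentation step described above, the sketch below uses SentencePiece as a stand-in tool (the paper does not state that this particular library was used); the file names, vocabulary size, and example sentence are assumptions for illustration only.

```python
# Minimal sketch of unsupervised subword/morphological segmentation for Kazakh.
# SentencePiece stands in for the unsupervised segmenter; file names,
# vocabulary size, and the example sentence are illustrative assumptions.
import sentencepiece as spm

# Train an unsupervised segmentation model on monolingual Kazakh text
# (hypothetical file "mono.kk.txt", one sentence per line).
spm.SentencePieceTrainer.train(
    input="mono.kk.txt",
    model_prefix="kk_unsup",   # writes kk_unsup.model / kk_unsup.vocab
    vocab_size=16000,          # illustrative value
    model_type="unigram",      # unsupervised, probabilistic segmentation
)

# Segment a sentence into subword units before feeding it to the NMT system.
sp = spm.SentencePieceProcessor(model_file="kk_unsup.model")
pieces = sp.encode("Мен кітапты оқыдым", out_type=str)
print(pieces)  # e.g. ['▁Мен', '▁кітап', 'ты', '▁оқы', 'дым'] (illustrative output)
```

    In a pipeline like this, the parallel data and any synthetic data would typically be segmented with the same model before training, and the segmentation would be reversed on the decoder output.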

    Comparison of Machine Learning Models Including Preoperative, Intraoperative, and Postoperative Data and Mortality After Cardiac Surgery

    Importance: A variety of perioperative risk factors are associated with postoperative mortality risk. However, the relative contribution of routinely collected intraoperative clinical parameters to short-term and long-term mortality remains understudied. Objective: To examine the performance of multiple machine learning models with data from different perioperative periods to predict 30-day, 1-year, and 5-year mortality and investigate factors that contribute to these predictions. Design, Setting, and Participants: In this prognostic study using prospectively collected data, risk prediction models were developed for short-term and long-term mortality after cardiac surgery. Included participants were adult patients undergoing a first-time valve operation, coronary artery bypass grafting, or a combination of both between 1997 and 2017 in a single center, the University Medical Centre Groningen in the Netherlands. Mortality data were obtained in November 2017. Data analysis took place between February 2020 and August 2021. Exposure: Cardiac surgery. Main Outcomes and Measures: Postoperative mortality rates at 30 days, 1 year, and 5 years were the primary outcomes. The area under the receiver operating characteristic curve (AUROC) was used to assess discrimination. The contribution of all preoperative, intraoperative hemodynamic and temperature, and postoperative factors to mortality was investigated using Shapley additive explanations (SHAP) values. Results: Data from 9415 patients who underwent cardiac surgery (median [IQR] age, 68 [60-74] years; 2554 [27.1%] women) were included. Overall mortality rates at 30 days, 1 year, and 5 years were 268 patients (2.8%), 420 patients (4.5%), and 612 patients (6.5%), respectively. Models including preoperative, intraoperative, and postoperative data achieved AUROC values of 0.82 (95% CI, 0.78-0.86), 0.81 (95% CI, 0.77-0.85), and 0.80 (95% CI, 0.75-0.84) for 30-day, 1-year, and 5-year mortality, respectively. Models including only postoperative data performed similarly (30 days: 0.78 [95% CI, 0.73-0.82]; 1 year: 0.79 [95% CI, 0.74-0.83]; 5 years: 0.77 [95% CI, 0.73-0.82]). However, models based on all perioperative data provided less clinically usable predictions, with lower detection rates; for example, postoperative models identified a high-risk group with a 2.8-fold increase in risk for 5-year mortality (4.1 [95% CI, 3.3-5.1]) vs an increase of 11.3 (95% CI, 6.8-18.7) for the high-risk group identified by the full perioperative model. Postoperative markers associated with metabolic dysfunction and decreased kidney function were the main factors contributing to mortality risk. Conclusions and Relevance: This study found that the addition of continuous intraoperative hemodynamic and temperature data to postoperative data was not associated with improved machine learning-based identification of patients at increased risk of short-term and long-term mortality after cardiac operations.
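
    The abstract above reports two kinds of quantities: AUROC for discrimination and SHAP values for per-feature contributions. The sketch below shows one common way such values are computed; the gradient-boosting model, the feature names, and the synthetic data are placeholders and are not taken from the study's actual pipeline.

```python
# Illustrative pattern only: a mortality classifier, AUROC on held-out data,
# and SHAP values for feature contributions. Model family, features, and data
# are placeholders, not the study's pipeline.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
# Hypothetical perioperative features (stand-ins for the study's variables).
feature_names = ["age", "mean_arterial_pressure", "lactate", "creatinine"]
X = rng.normal(size=(n, len(feature_names)))
# Synthetic mortality labels, loosely tied to the metabolic/kidney stand-ins.
y = (0.8 * X[:, 2] + 0.6 * X[:, 3] + rng.normal(size=n) > 2.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Discrimination: area under the receiver operating characteristic curve.
auroc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"AUROC: {auroc:.2f}")

# Contribution of each feature: mean absolute SHAP value across test patients.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
mean_abs = np.abs(shap_values).mean(axis=0)
for name, value in sorted(zip(feature_names, mean_abs), key=lambda t: -t[1]):
    print(f"{name}: mean |SHAP| = {value:.3f}")
```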

    Deep Learning for Identification of Acute Illness and Facial Cues of Illness

    Background: The inclusion of facial and bodily cues (clinical gestalt) in machine learning (ML) models improves the assessment of patients' health status, as shown in genetic syndromes and acute coronary syndrome. It is unknown if the inclusion of clinical gestalt improves ML-based classification of acutely ill patients. As in previous research in ML analysis of medical images, simulated or augmented data may be used to assess the usability of clinical gestalt. Objective: To assess whether a deep learning algorithm trained on a dataset of simulated and augmented facial photographs reflecting acutely ill patients can distinguish between healthy and LPS-infused, acutely ill individuals. Methods: Photographs from twenty-six volunteers whose facial features were manipulated to resemble a state of acute illness were used to extract features of illness and generate a synthetic dataset of acutely ill photographs, using a neural transfer convolutional neural network (NT-CNN) for data augmentation. Then, four distinct CNNs were trained on different parts of the facial photographs and concatenated into one final, stacked CNN which classified individuals as healthy or acutely ill. Finally, the stacked CNN was validated in an external dataset of volunteers injected with lipopolysaccharide (LPS). Results: In the external validation set, the four individual feature models distinguished acutely ill patients with sensitivities ranging from 10.5% (95% CI, 1.3–33.1%) for the skin model to 89.4% (66.9–98.7%) for the nose model. Specificity ranged from 42.1% (20.3–66.5%) for the nose model to 94.7% (73.9–99.9%) for the skin model. The stacked model combining all four facial features achieved an area under the receiver operating characteristic curve (AUROC) of 0.67 (0.62–0.71) and distinguished acutely ill patients with a sensitivity of 100% (82.35–100.00%) and specificity of 42.11% (20.25–66.50%). Conclusion: A deep learning algorithm trained on a synthetic, augmented dataset of facial photographs distinguished between healthy and simulated acutely ill individuals, demonstrating that synthetically generated data can be used to develop algorithms for health conditions in which large datasets are difficult to obtain. These results support the potential of facial feature analysis algorithms to support the diagnosis of acute illness.
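
    The "stacked CNN" in the methods above concatenates four region-specific networks into a single binary classifier. The sketch below, assuming Keras and small placeholder branch architectures (input sizes, layer widths, and the exact region names are assumptions, not the paper's architecture), shows one common way to wire such a model together.

```python
# Sketch of a stacked classifier that merges four region-specific CNN branches
# into one healthy vs. acutely ill prediction. Branch architectures, input
# sizes, and region names are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import Model, layers

def region_branch(name: str, input_shape=(64, 64, 3)) -> Model:
    """A small CNN mapping one facial-region crop to a feature vector."""
    inp = layers.Input(shape=input_shape, name=f"{name}_input")
    x = layers.Conv2D(16, 3, activation="relu")(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(32, 3, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(32, activation="relu", name=f"{name}_features")(x)
    return Model(inp, x, name=f"{name}_branch")

regions = ["eyes", "nose", "mouth", "skin"]  # region names are assumptions
branches = [region_branch(r) for r in regions]

# Concatenate the four per-region feature vectors and classify.
merged = layers.Concatenate(name="stacked_features")([b.output for b in branches])
out = layers.Dense(1, activation="sigmoid", name="acutely_ill")(merged)
stacked = Model(inputs=[b.input for b in branches], outputs=out, name="stacked_cnn")

stacked.compile(optimizer="adam",
                loss="binary_crossentropy",
                metrics=[tf.keras.metrics.AUC(name="auroc")])
stacked.summary()
# Training would take four aligned crops per photograph, e.g.:
# stacked.fit([eyes_crops, nose_crops, mouth_crops, skin_crops], labels, ...)
```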