21 research outputs found

    Robust Speech Recognition Using Generative Adversarial Networks

    This paper describes a general, scalable, end-to-end framework that uses the generative adversarial network (GAN) objective to enable robust speech recognition. Encoders trained with the proposed approach enjoy improved invariance by learning to map noisy audio to the same embedding space as that of clean audio. Unlike previous methods, the new framework does not rely on the domain expertise or simplifying assumptions that are often needed in signal processing, and directly encourages robustness in a data-driven way. We show the new approach improves simulated far-field speech recognition of vanilla sequence-to-sequence models without specialized front-ends or preprocessing.

    Cold Fusion: Training Seq2Seq Models Together with Language Models

    Sequence-to-sequence (Seq2Seq) models with attention have excelled at tasks that involve generating natural language sentences, such as machine translation, image captioning and speech recognition. Performance has further been improved by leveraging unlabeled data, often in the form of a language model. In this work, we present the Cold Fusion method, which leverages a pre-trained language model during training, and show its effectiveness on the speech recognition task. We show that Seq2Seq models with Cold Fusion are able to better utilize language information, enjoying i) faster convergence and better generalization, and ii) almost complete transfer to a new domain while using less than 10% of the labeled training data.

    ImageNet Large Scale Visual Recognition Challenge

    The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the five years of the challenge, and propose future directions and improvements.
    Comment: 43 pages, 16 figures. v3 includes additional comparisons with PASCAL VOC (per-category comparisons in Table 3, distribution of localization difficulty in Fig 16), a list of queries used for obtaining object detection images (Appendix C), and some additional references.

    Pancreatic surgery outcomes: multicentre prospective snapshot study in 67 countries

    BACKGROUND: Pancreatic surgery remains associated with high morbidity rates. Although postoperative mortality appears to have improved with specialization, the outcomes reported in the literature reflect the activity of highly specialized centres. The aim of this study was to evaluate the outcomes following pancreatic surgery worldwide. METHODS: This was an international, prospective, multicentre, cross-sectional snapshot study of consecutive patients undergoing pancreatic operations worldwide in a 3-month interval in 2021. The primary outcome was postoperative mortality within 90 days of surgery. Multivariable logistic regression was used to explore relationships with Human Development Index (HDI) and other parameters. RESULTS: A total of 4223 patients from 67 countries were analysed. A complication of any severity was detected in 68.7 per cent of patients (2901 of 4223). Major complication rates (Clavien–Dindo grade at least IIIa) were 24, 18, and 27 per cent, and mortality rates were 10, 5, and 5 per cent in low-to-middle-, high-, and very high-HDI countries respectively. The 90-day postoperative mortality rate was 5.4 per cent (229 of 4223) overall, but was significantly higher in the low-to-middle-HDI group (adjusted OR 2.88, 95 per cent c.i. 1.80 to 4.48). The overall failure-to-rescue rate was 21 per cent; however, it was 41 per cent in low-to-middle- compared with 19 per cent in very high-HDI countries. CONCLUSION: Excess mortality in low-to-middle-HDI countries could be attributable to failure to rescue of patients from severe complications. The authors call for a collaborative response from international and regional associations of pancreatic surgeons to address management related to death from postoperative complications to tackle the global disparities in the outcomes of pancreatic surgery (NCT04652271; ISRCTN95140761).
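
    The two overall rates quoted in the abstract above can be recomputed from the reported counts. The following is a minimal sketch for checking that arithmetic only; the helper name `percent` is hypothetical and is not taken from the study.

```python
def percent(part: int, whole: int, digits: int = 1) -> float:
    """Return part/whole as a percentage rounded to `digits` decimals.

    Hypothetical helper for illustration; not part of the study's analysis.
    """
    return round(100 * part / whole, digits)


# Counts as reported in the abstract.
total_patients = 4223
any_complication = 2901
deaths_within_90_days = 229

complication_rate = percent(any_complication, total_patients)
mortality_rate = percent(deaths_within_90_days, total_patients)

print(f"Any complication: {complication_rate} per cent")
print(f"90-day mortality: {mortality_rate} per cent")
```

    Both values agree with the 68.7 and 5.4 per cent stated in the abstract. The adjusted odds ratio cannot be checked this way, since it depends on the multivariable model and patient-level data.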