233 research outputs found

    Training for Speech Recognition on Coprocessors

    Automatic Speech Recognition (ASR) has increased in popularity in recent years. The evolution of processor and storage technologies has enabled more advanced ASR mechanisms, fueling the development of virtual assistants such as Amazon Alexa, Apple Siri, Microsoft Cortana, and Google Home. The interest in such assistants, in turn, has amplified the novel developments in ASR research. However, despite this popularity, there has not been a detailed training efficiency analysis of modern ASR systems. This mainly stems from: the proprietary nature of many modern applications that depend on ASR, like the ones listed above; the relatively expensive co-processor hardware that is used to accelerate ASR by big vendors to enable such applications; and the absence of well-established benchmarks. The goal of this paper is to address the latter two of these challenges. The paper first describes an ASR model, based on a deep neural network inspired by recent work in this domain, and our experiences building it. Then we evaluate this model on three CPU-GPU co-processor platforms that represent different budget categories. Our results demonstrate that utilizing hardware acceleration yields good results even without high-end equipment. While the most expensive platform (10x the price of the least expensive one) converges to the initial accuracy target 10-30% and 60-70% faster than the other two, the differences among the platforms almost disappear at slightly higher accuracy targets. In addition, our results further highlight both the difficulty of evaluating ASR systems due to the complex, long, and resource-intensive nature of the model training in this domain, and the importance of establishing benchmarks for ASR.
    Comment: under submission to pvldb even though used acm style to submit her

    The Potential of the Intel Xeon Phi for Supervised Deep Learning

    Supervised learning of Convolutional Neural Networks (CNNs), also known as supervised Deep Learning, is a computationally demanding process. To find the most suitable parameters of a network for a given application, numerous training sessions are required. Therefore, reducing the training time per session is essential to fully utilize CNNs in practice. While numerous research groups have addressed the training of CNNs using GPUs, so far not much attention has been paid to the Intel Xeon Phi coprocessor. In this paper we investigate empirically and theoretically the potential of the Intel Xeon Phi for supervised learning of CNNs. We design and implement a parallelization scheme named CHAOS that exploits both the thread- and SIMD-parallelism of the coprocessor. Our approach is evaluated on the Intel Xeon Phi 7120P using the MNIST dataset of handwritten digits for various thread counts and CNN architectures. Results show a 103.5x speedup when training our large network for 15 epochs using 244 threads, compared to one thread on the coprocessor. Moreover, we develop a performance model and use it to assess our implementation and answer what-if questions.
    Comment: The 17th IEEE International Conference on High Performance Computing and Communications (HPCC 2015), Aug. 24 - 26, 2015, New York, US
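
    As a rough reading of the reported figure, and assuming the usual definition of parallel efficiency (speedup S divided by the number of threads p, which is not a definition taken from the abstract), the 103.5x speedup on 244 threads corresponds to roughly 42% efficiency per hardware thread:

```latex
E = \frac{S}{p} = \frac{103.5}{244} \approx 0.42
```

    The 244 threads are hardware threads (61 cores with 4 threads each on the 7120P), so per-thread efficiency well below 1 is expected; the abstract itself does not report an efficiency value.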

    Accelerating Pattern Matching in Neuromorphic Text Recognition System Using Intel Xeon Phi Coprocessor

    Neuromorphic computing systems refer to computing architectures inspired by the working mechanism of human brains. The rapidly falling cost and increasing performance of state-of-the-art computing hardware allow large-scale implementation of machine intelligence models with neuromorphic architectures and open the opportunity for new applications. One such platform is the Intel Xeon Phi coprocessor, which delivers over a TeraFLOP of computing power with 61 integrated processing cores. How to efficiently harness such computing power to achieve real-time decision and cognition is one of the key design considerations. This work presents an optimized implementation of the Brain-State-in-a-Box (BSB) neural network model on the Xeon Phi coprocessor for pattern matching in the context of intelligent text recognition of noisy document images. From a scalability standpoint on a High Performance Computing (HPC) platform, we show that efficient workload partitioning and resource management can double the performance of this many-core architecture for neuromorphic applications.

    Recent Application in Biometrics

    In recent years, a number of recognition and authentication systems based on biometric measurements have been proposed. Algorithms and sensors have been developed to acquire and process many different biometric traits. Moreover, biometric technology is being used in novel ways, with potential commercial and practical implications for our daily activities. The key objective of the book is to provide a collection of comprehensive references on some recent theoretical developments as well as novel applications in biometrics. The topics covered in this book reflect both aspects of development well. They include biometric sample quality, privacy-preserving and cancelable biometrics, contactless biometrics, novel and unconventional biometrics, and the technical challenges in implementing the technology in portable devices. The book consists of 15 chapters. It is divided into four sections, namely, biometric applications on mobile platforms, cancelable biometrics, biometric encryption, and other applications. The book was reviewed by editors Dr. Jucheng Yang and Dr. Norman Poh. We deeply appreciate the efforts of our guest editors: Dr. Girija Chetty, Dr. Loris Nanni, Dr. Jianjiang Feng, Dr. Dongsun Park and Dr. Sook Yoon, as well as a number of anonymous reviewers.

    Parallel Online Time Warping for Real-Time Audio-to-Score Alignment in Multi-core Systems

    The Audio-to-Score framework consists of two separate stages: preprocessing and alignment. The alignment is commonly solved through offline Dynamic Time Warping (DTW), which is a method to find the path over the distortion matrix with the minimum cost to determine the relation between the performance and the musical score times. In this work we propose a parallel online DTW solution based on a client-server architecture. The current version of the application has been implemented for multi-core architectures (x86, x64 and ARM), thus covering both powerful systems and mobile devices. Extensive experiments were conducted to validate the software. The experiments also show that our framework achieves a good score alignment within the real-time window by using parallel computing techniques.
    This work has been partially supported by Spanish Ministry of Science and Innovation and FEDER under Projects TEC2012-38142-C04-01, TEC2012-38142-C04-03, TEC2012-38142-C04-04, TEC2015-67387-C4-1-R, TEC2015-67387-C4-3-R, TEC2015-67387-C4-4-R, the European Union FEDER (CAPAP-H5 network TIN2014-53522-REDT), and the Generalitat Valenciana under Grant PROMETEOII/2014/003.
    Alonso-Jordá, P.; Cortina, R.; Rodríguez-Serrano, F.; Vera-Candeas, P.; Alonso-González, M.; Ranilla, J. (2017). Parallel Online Time Warping for Real-Time Audio-to-Score Alignment in Multi-core Systems. The Journal of Supercomputing. 73(1):126-138. https://doi.org/10.1007/s11227-016-1647-5
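
    The offline DTW formulation mentioned in the abstract (a minimum-cost path over a distortion matrix) can be sketched as follows. This is a minimal, sequential illustration and not the parallel online method the paper proposes; the function name `dtw_path` and the three allowed step directions (up, left, diagonal) are assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple


def dtw_path(cost: Sequence[Sequence[float]]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the minimum accumulated cost and one minimum-cost warping path.

    `cost[i][j]` is the distortion between performance frame i and score
    position j (hypothetical inputs; any non-negative local cost works).
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")

    # acc[i][j] holds the cheapest accumulated cost of a path (0, 0) -> (i, j).
    acc = [[inf] * m for _ in range(n)]
    acc[0][0] = cost[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best_prev = min(
                acc[i - 1][j] if i > 0 else inf,                 # step from above
                acc[i][j - 1] if j > 0 else inf,                 # step from the left
                acc[i - 1][j - 1] if i > 0 and j > 0 else inf,   # diagonal step
            )
            acc[i][j] = cost[i][j] + best_prev

    # Backtrack from the last cell to (0, 0), always taking the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return acc[n - 1][m - 1], path
```

    Because this version fills the whole matrix before backtracking, it needs the complete performance up front, which is why the abstract contrasts offline DTW with an online variant for real-time use.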

    Proposal of a health care network based on big data analytics for PDs

    Health care networks for Parkinson's disease (PD) already exist and have been proposed in the literature, but most of them are not able to analyse the vast volume of data generated from medical examinations and collected and organised in a pre-defined manner. In this work, the authors propose a novel health care network based on big data analytics for PD. The main goal of the proposed architecture is to support clinicians in the objective assessment of the typical PD motor issues and alterations. The proposed health care network has the ability to retrieve a vast volume of acquired heterogeneous data from a data warehouse and train an ensemble SVM to classify and rate the motor severity of a PD patient. Once the network is trained, it will be able to analyse the data collected during motor examinations of a PD patient and generate a diagnostic report on the basis of the previously acquired knowledge. Such a diagnostic report represents a tool both to monitor the follow-up of the disease for each patient and to give clinicians robust advice about the severity of the disease.

    A Review on AI Chip Design

    In recent years, artificial intelligence (AI) technologies have been widely used in many business areas. With the attention and investment of scientific researchers and research companies around the world, artificial intelligence technologies have proven their irreplaceable value in traditional speech recognition, image recognition, search/recommendation engines, and other areas. At the same time, however, the computational effort for artificial intelligence technologies is increasing dramatically, posing a huge challenge to the computing power of hardware devices. In this paper, we first describe the direction of AI chip technology development, including the technical shortcomings of existing AI chips. We then present the directions of AI chip development in recent years.