8,495 research outputs found

    Demonstrating the feasibility of standardized application program interfaces that will allow mobile/portable terminals to receive services combining UMTS and DVB-T

    Crucial to the commercial exploitation of any service combining UMTS and DVB-T is the availability of standardized APIs adapted to the hybrid UMTS and DVB-T network and to the technical limitations of mobile/portable terminals. This paper describes work carried out in the European Commission Framework Programme 5 (FP5) project CONFLUENT to demonstrate the feasibility of such Application Program Interfaces (APIs) by enabling the reception of a Multimedia Home Platform (MHP) based application transmitted over DVB-T on five different terminals, with parts of the service running on a mobile phone.

    Blind-Matched Filtering for Speech Enhancement with Distributed Microphones

    Deep Learning for Audio Signal Processing

    Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side by side in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, and more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e., audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, and generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified.
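
The log-mel spectrum mentioned above as a dominant feature representation can be computed from a raw waveform with nothing beyond NumPy. The sketch below is a minimal, numpy-only illustration of the standard recipe (windowed STFT, triangular mel filterbank, log compression); all parameter values (`n_fft=512`, `hop=256`, `n_mels=40`) are illustrative defaults, not values prescribed by the article.

```python
import numpy as np

def log_mel_spectrogram(x, sr=16000, n_fft=512, hop=256, n_mels=40):
    """Numpy-only sketch of a log-mel spectrogram from a raw waveform."""
    # Frame the signal and apply a Hann window.
    n_frames = 1 + (len(x) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Triangular mel filterbank (HTK-style mel scale).
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = mel_to_hz(np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fb[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising slope
        fb[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling slope
    # Apply the filterbank and log-compress (floor avoids log(0)).
    return np.log(power @ fb.T + 1e-10)

# Usage: one second of a 440 Hz tone at 16 kHz.
t = np.arange(16000) / 16000.0
S = log_mel_spectrogram(np.sin(2 * np.pi * 440 * t))
print(S.shape)  # (frames, mel bands)
```

In practice a library such as librosa would be used instead, but the shape of the computation is the same: time-frequency analysis followed by a perceptually motivated frequency warping.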

    Multimodal system for recording individual-level behaviors in songbird groups

    In longitudinal observations of animal groups, the goal is to identify individuals and to reliably detect their interactive behaviors, including their vocalizations. However, reliably extracting individual vocalizations from their mixtures and other environmental sounds remains a serious challenge. Promising approaches are multi-modal systems that make use of animal-borne wireless sensors and that exploit the inherent signal redundancy. In this vein, we designed a modular recording system (BirdPark) that yields synchronized data streams and contains a custom software-defined radio receiver. We record pairs of songbirds with multiple cameras and microphones and record their body vibrations with custom low-power frequency-modulated (FM) radio transmitters. Our custom multi-antenna radio demodulation technique increases the signal-to-noise ratio of the received radio signals by 6 dB and reduces the signal loss rate by a factor of 87, to only 0.03% of the recording time, compared to standard single-antenna demodulation techniques. Nevertheless, neither a single vibration channel nor a single sound channel is sufficient by itself to capture the complete vocal output of an individual, with each sensor modality missing on average about 3.7% of vocalizations. Our work emphasizes the need for high-quality recording systems and for multi-modal analysis of social behavior.
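
The abstract does not describe the authors' demodulation algorithm, but the reported ~6 dB SNR improvement is consistent with the array gain of coherently averaging four antennas whose noise is independent (10·log10(4) ≈ 6 dB). The following is a hedged, generic illustration of that principle, not the BirdPark method: the signal, the SNR estimator, and the perfect phase alignment are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ant, n_samp = 4, 100_000

# Common transmitter tone seen identically by all antennas (a real receiver
# would first have to estimate and align per-antenna phases).
signal = np.exp(1j * 2 * np.pi * 0.01 * np.arange(n_samp))
noise = (rng.standard_normal((n_ant, n_samp))
         + 1j * rng.standard_normal((n_ant, n_samp))) / np.sqrt(2)
rx = signal[None, :] + noise  # per-antenna SNR of 0 dB

def snr_db(x):
    # Estimate SNR by projecting onto the known tone; the residual is noise.
    s_hat = np.vdot(signal, x) / n_samp
    noise_p = np.mean(np.abs(x - s_hat * signal) ** 2)
    return 10 * np.log10(np.abs(s_hat) ** 2 / noise_p)

single = snr_db(rx[0])           # one antenna
combined = snr_db(rx.mean(axis=0))  # coherent average of four antennas
gain = combined - single
print(round(gain, 1))  # close to the theoretical 6 dB array gain
```

Averaging keeps the signal amplitude fixed while the independent noise powers average down by the number of antennas, which is where the factor-of-four (6 dB) gain comes from.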

    Potential of mobile applications in human-centric production and logistics management

    With the increasing market penetration of smart devices (smartphones, smartwatches, and tablets), various mobile applications (apps) have been developed to fulfill tasks in daily life. Recently, efforts have been made to develop apps to support human operators in industrial work. When apps installed on commercial devices are utilized, tasks that were formerly done purely manually or with the help of investment-intensive specific devices can be performed more efficiently and/or at a lower cost and with fewer errors. Despite their advantages, smart devices have limitations because embedded sensors (e.g., accelerometers) and components (e.g., cameras) are usually designed for nonindustrial use. Hence, validation experiments and case studies for industrial applications are needed to ensure the reliability of app usage. In this study, a systematic literature review was employed to identify the state of knowledge about the use of mobile apps in production and logistics management. The results show how apps can support human centricity based on the enabling technologies and components of smart devices. An outlook for future research and applications is provided, including the need for proper validation studies to ensure the diversity and reliability of apps and more research on psychosocial aspects of human-technology interaction.

    Emotions in context: examining pervasive affective sensing systems, applications, and analyses

    Pervasive sensing has opened up new opportunities for measuring our feelings and understanding our behavior by monitoring our affective states while mobile. This review paper surveys pervasive affect sensing by examining three major elements of affective pervasive systems, namely "sensing", "analysis", and "application". Sensing investigates the different sensing modalities used in existing real-time affective applications; Analysis explores different approaches to emotion recognition and visualization based on different types of collected data; and Application investigates the leading areas of affective applications. For each of the three aspects, the paper includes an extensive survey of the literature and outlines some of the challenges and future research opportunities of affective sensing in the context of pervasive computing.

    Evolutionary Speech Recognition

    Automatic speech recognition systems are becoming ever more common and are increasingly deployed in highly variable acoustic conditions and by very different speakers. These systems, generally conceived in a laboratory, must therefore be robust in order to provide optimal performance in real situations. This article explores the possibility of gaining robustness by designing speech recognition systems able to modify themselves in real time in order to adapt to changes in the acoustic environment. As a starting point, the adaptive capacities of living organisms in relation to their environment were considered. Analogues of these mechanisms were then applied to automatic speech recognition systems, with the aim of a system that adapts to changing acoustic conditions and remains effective regardless of its conditions of use.
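
The abstract does not specify which evolutionary mechanism the article uses, but the core idea, a recognizer that keeps mutating its own parameters and retains whatever performs best in the current acoustic environment, can be sketched as a tiny (1+λ) evolution strategy. Everything here is hypothetical: the scalar parameter `theta`, the `fitness` function standing in for recognition accuracy, and the parameter values are all illustrative, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(1)

def fitness(theta, env):
    # Hypothetical score: how well parameter theta suits the current
    # acoustic environment's optimum (higher is better). A real system
    # would score recognition accuracy on recent audio instead.
    return -(theta - env) ** 2

def evolve(env, theta=0.0, sigma=0.5, generations=200, offspring=8):
    """(1 + lambda) evolution strategy: mutate the parameter, keep the best."""
    for _ in range(generations):
        candidates = theta + sigma * rng.standard_normal(offspring)
        # Elitism: the current theta competes against its mutants.
        theta = max(np.append(candidates, theta),
                    key=lambda t: fitness(t, env))
    return theta

# The environment shifts (e.g., quiet office -> noisy street), and the
# system re-adapts online from wherever it currently is:
theta = evolve(env=2.0)                # adapt to the first environment
theta = evolve(env=-1.0, theta=theta)  # re-adapt to the new one
print(round(theta, 2))
```

Because selection is elitist, fitness never decreases within one environment, and re-running the loop after an environment change is exactly the "auto-modification" the abstract describes: no gradient or model of the new environment is needed, only a way to score candidates.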