
    Latency Performance for Real-Time Audio on BeagleBone Black

    In this paper we present a set of tests aimed at evaluating the responsiveness of a BeagleBone Black board in real-time interactive audio applications. The default Angstrom Linux distribution was tested without modifying the underlying kernel. Latency measurements and audio quality were compared across combinations of different audio interfaces and audio synthesis models. Data analysis shows that the board is generally characterised by remarkably high responsiveness: most of the tested configurations are affected by less than 7 ms of latency, and under-run activity proved to be contained when the correct optimisation techniques were applied.
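    The paper's measurement code is not reproduced here, but the kind of round-trip test it describes can be sketched in a few lines of Python. The snippet below plays an impulse while recording through a physical loopback and locates the recorded peak to estimate round-trip latency; the sounddevice library, the 48 kHz rate, and the loopback setup are assumptions of this sketch, not details taken from the paper.

        import numpy as np
        import sounddevice as sd

        FS = 48000  # assumed sample rate (Hz)

        # One-sample impulse padded with a second of silence; the device's
        # output must be physically looped back to its input.
        signal = np.zeros(FS, dtype=np.float32)
        signal[0] = 1.0

        recording = sd.playrec(signal, samplerate=FS, channels=1)
        sd.wait()  # block until playback and capture finish

        # The index of the recorded peak approximates the round-trip latency.
        delay = int(np.argmax(np.abs(recording)))
        print(f"round-trip latency: {1000.0 * delay / FS:.2f} ms")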

    Wireless Audio Interactive Knot

    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2001. Includes bibliographical references (leaves 44-45). The Sound Transformer is a new type of musical instrument. It looks a little like a saxophone, but when you sing or "kazoo" into it, astonishing transforms and mutations come out. What actually happens is that the input sound is sent via an 802.11 wireless link to a net server that transforms the sound and sends it back to the instrument's speaker. In other words, instead of a resonant acoustic body or a local computer synthesizer, this architecture allows sound to be sourced or transformed by an infinite array of online services and channeled through a gesturally expressive handheld. Emerging infrastructures (802.11, Bluetooth, 3G and 4G, etc.) seem to aim at this new class of instrument. But can such an architecture really work? In particular, given the delays incurred by decoupling the sound transformation from the instrument over a wireless network, are interactive music applications feasible? My thesis is that they are. To prove this, I built a platform called WAI-KNOT (for Wireless Audio Interactive Knot) in order to examine the latency issues as well as other design elements, and to test their viability and impact on real music making. The Sound Transformer is a WAI-KNOT application. Adam Douglas Smith, S.M.
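    Whether such a decoupled instrument is playable comes down to the network round trip fitting within the duration of one audio block. A minimal feasibility sketch in Python, assuming a hypothetical UDP transform server (the host, port, and block size below are placeholders, not parts of WAI-KNOT):

        import socket
        import time
        import numpy as np

        SERVER = ("transform.example.net", 9000)  # hypothetical transform service
        BLOCK = 256    # samples per audio block (assumed)
        FS = 44100     # sample rate (Hz)

        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(1.0)

        frame = (0.1 * np.random.randn(BLOCK)).astype(np.float32)  # stand-in for mic input

        t0 = time.perf_counter()
        sock.sendto(frame.tobytes(), SERVER)
        data, _ = sock.recvfrom(65536)  # transformed block returns (raises socket.timeout otherwise)
        rtt_ms = 1000.0 * (time.perf_counter() - t0)

        budget_ms = 1000.0 * BLOCK / FS  # time before the next block is due
        print(f"network RTT {rtt_ms:.1f} ms vs. block budget {budget_ms:.1f} ms")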

    A Modular, Low Latency, A2B-based Architecture for Distributed Multichannel Full-Digital Audio Systems

    Despite the increasing demand for multichannel audio systems, existing solutions are still mainly analog or audio-over-IP based, leading to well-known limitations: bulky wiring, high latency (0.5-2 ms), and expensive devices for protocol stack management. This paper presents a cost-effective, low-latency, full-digital solution that overcomes all of the previously mentioned problems. The proposed architecture is based on the new Automotive Audio Bus (A2B) protocol. It guarantees a deterministic latency of 2 samples, 32 downstream/upstream channels over a single Unshielded Twisted Pair (UTP) cable, and phase-aligned signals. A single A2B chip is required for each node, dramatically reducing the system cost. The developed architecture is composed of a main board and an A2B network. The main board handles up to 64 channels and converts standard protocols usually employed for audio signal delivery, such as AES10, AVB and AES67, into A2B streams and vice versa. The A2B network can include a series of devices, for instance power amplifiers, codecs, DSPs, and transducers. There are many application examples including, but not limited to, transducer arrays (e.g., microphone, loudspeaker, and accelerometer arrays), audio distribution in meeting rooms, Wave Field Synthesis (WFS), Ambisonics immersive audio systems, and Active Noise Control (ANC). A modular and portable WFS system was developed employing the above-described architecture. It is based on eight-channel soundbars that can be daisy-chained in reconfigurable geometries, scaling up to 192 channels.
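    The headline figures are easy to put side by side. A back-of-the-envelope comparison of the quoted two-sample A2B latency with the audio-over-IP range, assuming the common 48 kHz sample rate (the paper states the latency in samples, not milliseconds):

        FS = 48000                 # assumed sample rate (Hz)
        a2b_ms = 2 / FS * 1000     # deterministic 2-sample latency quoted in the paper

        print(f"A2B deterministic latency: {a2b_ms:.3f} ms")  # ~0.042 ms
        print("audio-over-IP latency:      0.5-2 ms")         # range quoted in the paper
        print(f"soundbars for 192 channels: {192 // 8}")      # eight-channel units daisy-chained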

    A Framework to Measure Reliance of Acoustic Latency on Smartphone Status

    Audio latency, defined as the time an audio signal takes to travel from the microphone to an app or from an app to the speakers, significantly influences the performance of many mobile sensing applications, including acoustic-based localization and speech recognition. It is well known within the mobile app development community that audio latencies can be significant (up to hundreds of milliseconds) and vary from smartphone to smartphone and from time to time. Therefore, it is essential to study the causes and effects of audio latency in smartphones. To the best of our knowledge, there exist mobile apps that can measure audio latency but not the corresponding status of the smartphone, such as available RAM, CPU load, battery level, and number of files and folders. In this paper, we are the first to propose a framework that can simultaneously log both the audio latency and the status of smartphones. The proposed framework does not require time synchronization or firmware reprogramming and can run on a standalone device. Since the framework is designed to study latency causality, the smartphone status is deliberately and randomly varied as widely as possible. To evaluate the framework, we present a case study with Android devices. We design and implement a latency app that simultaneously measures the latency and the status of smartphones. The preliminary results show that the latency values have large means (50-150 ms) and variances (4-40 ms). The effect of latency can be considerably reduced by simply subtracting the offset; to achieve improved latency prediction that copes with the variances, an advanced regression model would be preferred.
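    The two-step conclusion (subtract the offset, then model the remaining variance from device status) can be illustrated with ordinary least squares. The log values and feature set below are invented for the sketch, and plain least squares stands in for the unspecified "advanced regression model":

        import numpy as np

        # Hypothetical log: one row per trial -> [latency_ms, free_ram_mb, cpu_load, battery_pct]
        log = np.array([
            [128.4, 512, 0.71, 88],
            [ 97.2, 934, 0.22, 87],
            [142.9, 301, 0.95, 86],
            [104.8, 820, 0.35, 85],
        ])
        latency, status = log[:, 0], log[:, 1:]

        # Step 1: subtracting the mean offset already removes most of the error.
        print("residuals after offset removal:", latency - latency.mean())

        # Step 2: regress the latency on device status to target the variance.
        X = np.hstack([status, np.ones((len(status), 1))])  # append intercept column
        coef, *_ = np.linalg.lstsq(X, latency, rcond=None)
        print("model predictions:", X @ coef)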

    DeepVoCoder: A CNN model for compression and coding of narrow band speech

    This paper proposes a convolutional neural network (CNN)-based encoder model to compress and code speech signals directly from raw input speech. Although the model can synthesize wideband speech by implicit bandwidth extension, narrowband is preferred for IP telephony and telecommunications purposes. The model takes time-domain speech samples as inputs and encodes them using a cascade of convolutional filters in multiple layers, where pooling is applied after some layers to downsample the encoded speech by half. The final bottleneck layer of the CNN encoder provides an abstract and compact representation of the speech signal. In this paper, it is demonstrated that this compact representation is sufficient to reconstruct the original speech signal in high quality using the CNN decoder. This paper also discusses the theoretical background of why and how a CNN may be used for end-to-end speech compression and coding. The complexity, delay, memory requirements, and bit rate versus quality are discussed in the experimental results.
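    The encoder shape described here (a cascade of convolutions, pooling after some layers that halves the temporal resolution, and a compact bottleneck) can be sketched in PyTorch. Channel counts, kernel sizes, and the input length are placeholders; the abstract does not specify the exact architecture:

        import torch
        import torch.nn as nn

        class CNNEncoder(nn.Module):
            def __init__(self):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv1d(1, 32, kernel_size=9, padding=4), nn.ReLU(),
                    nn.MaxPool1d(2),  # halve the temporal resolution
                    nn.Conv1d(32, 64, kernel_size=9, padding=4), nn.ReLU(),
                    nn.MaxPool1d(2),  # halve it again
                    nn.Conv1d(64, 16, kernel_size=9, padding=4),  # bottleneck code
                )

            def forward(self, x):  # x: (batch, 1, samples)
                return self.net(x)

        frame = torch.randn(1, 1, 320)   # 40 ms of 8 kHz narrowband speech
        code = CNNEncoder()(frame)
        print(code.shape)                # torch.Size([1, 16, 80]) -> 4x shorter in time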

    A survey on hardware and software solutions for multimodal wearable assistive devices targeting the visually impaired

    The market penetration of user-centric assistive devices has rapidly increased in the past decades. Growth in computational power, accessibility, and cognitive device capabilities has been accompanied by significant reductions in weight, size, and price, as a result of which mobile and wearable equipment are becoming part of our everyday lives. In this context, a key focus of development has been on rehabilitation engineering and on developing assistive technologies targeting people with various disabilities, including hearing loss, visual impairments, and others. Applications range from simple health monitoring such as sport activity trackers, through medical applications including sensory (e.g. hearing) aids and real-time monitoring of life functions, to task-oriented tools such as navigational devices for the blind. This paper provides an overview of recent trends in software and hardware-based signal processing relevant to the development of wearable assistive solutions.