An Efficient Method to Improve the Audio Quality Using AAC Low Complexity Decoder

Author: D. Kalyanasundaram, K. Santhakumar
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/03/2015

This paper presents a new approach to designing a Digital Audio Broadcast (DAB) audio decoder that improves audio quality. DAB broadcasting systems are used in countries all over the world, most prominently in Europe. DAB+ is the upgraded version of DAB. Because DAB and DAB+ coexist in many countries, receivers must be compatible with both standards. DAB+ is approximately twice as efficient as DAB owing to its adoption of the AAC+ audio codec, and it can provide high-quality audio at bit rates as low as 64 kbit/s. Integrating an MPEG-1 Layer II (MP2) decoder with an Advanced Audio Coding Low Complexity (AAC LC) decoder provides the fundamental audio decoding for DAB and DAB+. The audio frame data generated by the DAB channel decoders are stored in RAM. The bit stream demultiplexer parses the quantized spectrum data in the audio frames. The inverse quantizer performs the inverse quantization computation, and the synthesis filter generates the time-domain Pulse Code Modulation (PCM) samples; the results of these operations are written back to the audio RAM. The existing system uses an HE AAC v2 decoder, which includes SBR and PS technologies. These two technologies improve sound quality in low-bit-rate programs. The proposed scheme uses AAC LC and MP2 decoders to improve sound quality at high bit rates. The simulation is carried out using MATLAB R2011a and Xilinx ISE 9.2i. DOI: 10.17762/ijritcc2321-8169.15039

International Journal on Recent and Innovation Trends in Computing and Communication

Flexible and Low-Complexity Encoding and Decoding of Systematic Polar Codes

Author: Giard Pascal
Gross Warren J.
Sarkis Gabi
Tal Ido
Thibeault Claude
Vardy Alexander
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

In this work, we present hardware and software implementations of flexible polar systematic encoders and decoders. The proposed implementations operate on polar codes of any length less than a maximum and of any rate. We describe the low-complexity, highly parallel, and flexible systematic-encoding algorithm that we use and prove its correctness. Our hardware implementation results show that the overhead of adding code rate and length flexibility is small, and that the impact on operation latency is minor compared to code-specific versions. Finally, the flexible software encoder and decoder implementations are also shown to be able to maintain high throughput and low latency. Comment: Submitted to IEEE Transactions on Communications, 201

arXiv.org e-Print Archive

eScholarship - University of California

Assistive Technology, Accommodations, and the Americans with Disabilities Act

Author: Bailey Nell
Bruyere Susanne M
Publication venue: DigitalCommons@ILR
Publication date: 01/05/2001

This brochure on Assistive Technology, Accommodations, and the Americans with Disabilities Act (ADA) is one of a series on human resources practices and workplace accommodations for persons with disabilities edited by Susanne M. Bruyère, Ph.D., CRC, SPHR, Director, Program on Employment and Disability, School of Industrial and Labor Relations - Extension Division, Cornell University. Cornell University was funded in the early 1990’s by the U.S. Department of Education National Institute on Disability and Rehabilitation Research as a National Materials Development Project on the employment provisions (Title I) of the ADA (Grant #H133D10155). These updates, and the development of new brochures, have been funded by Cornell’s Program on Employment and Disability and the Pacific Disability and Business Technical Assistance Center

DigitalCommons@ILR

eCommons@Cornell

Assistive Technology, Accommodations, and the Americans with Disabilities Act

Author: Bailey Nell
Publication venue: DigitalCommons@ILR
Publication date: 01/01/2011

DigitalCommons@ILR

eCommons@Cornell

Learning and adaptation in brain machine interfaces

Author: Torene Spencer Bradley
Publication date: 09/03/2017

Balancing subject learning and decoder adaptation is central to increasing brain machine interface (BMI) performance. We addressed these complementary aspects in two studies: (1) a learning study, in which mice modulated “beta” band activity to control a 1D auditory cursor, and (2) an adaptive decoding study, in which a simple recurrent artificial neural network (RNN) decoded intended saccade targets of monkeys. In the learning study, three mice successfully increased beta band power following trial initiations, and specifically increased beta burst durations from 157 ms to 182 ms, likely contributing to performance. Though the task did not explicitly require specific movements, all three mice appeared to modulate beta activity via active motor control and had consistent vibrissal motor cortex multiunit activity and local field potential relationships with contralateral whisker pad electromyograms. The increased burst durations may therefore be a direct result of increased motor activity. These findings suggest that only a subset of beta rhythm phenomenology can be volitionally modulated (e.g. the tonic “hold” beta), therefore limiting the possible set of successful beta neuromodulation strategies. In the adaptive decoding study, RNNs decoded delay period activity in oculomotor and working memory regions while monkeys performed a delayed saccade task. Adaptive decoding sessions began with brain-controlled trials using pre-trained RNN models, in contrast to static decoding sessions in which 300-500 initial eye-controlled training trials were performed. Closed loop RNN decoding performance was lower than predicted by offline simulations. More consistent delay period activity and saccade paths across trials were associated with higher decoding performance.
Despite the advantage of consistency, one monkey’s delay period activity patterns changed over the first week of adaptive decoding, and the other monkey’s saccades were more erratic during adaptive decoding than during static decoding sessions. It is possible that the altered session paradigm eliminating eye-controlled training trials led to either frustration or exploratory learning, causing the neural and behavioral changes. Considering neural control and decoder adaptation of BMIs in these studies, future work should improve the “two-learner” subject-decoder system by better modeling the interaction between underlying brain states (and possibly their modulation) and the neural signatures representing desired outcomes

Boston University Institutional Repository (OpenBU)

PB-IEF-01

Author: Bagian Kepegawaian Rektorat
Publication venue: Bagian Kepegawaian dan HKTL

Document Repository

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention

Author: Huang Qiushi
Kılıç Volkan
Ko Tom
Kong Qiuqiang
Li Shengchen
Liu Haohe
Liu Xubo
Mei Xinhao
Plumbley Mark D.
Sun Jianyuan
Tang Lilian H.
Wang Wenwu
Zhang Yu
Publication date: 28/05/2023

Audio captioning aims to generate text descriptions of audio clips. In the real world, many objects produce similar sounds. How to accurately recognize ambiguous sounds is a major challenge for audio captioning. In this work, inspired by inherent human multimodal perception, we propose visually-aware audio captioning, which makes use of visual information to help the description of ambiguous sounding objects. Specifically, we introduce an off-the-shelf visual encoder to extract video features and incorporate the visual features into an audio captioning system. Furthermore, to better exploit complementary audio-visual contexts, we propose an audio-visual attention mechanism that adaptively integrates audio and visual context and removes the redundant information in the latent space. Experimental results on AudioCaps, the largest audio captioning dataset, show that our proposed method achieves state-of-the-art results on machine translation metrics. Comment: INTERSPEECH 202

arXiv.org e-Print Archive

Digital television applications

Author: Peng Chengyuan
Publication venue: Teknillinen korkeakoulu
Publication date: 15/11/2002

Studying development of interactive services for digital television is a leading edge area of work as there is minimal research or precedent to guide their design. Published research is limited and therefore this thesis aims at establishing a set of computing methods using Java and XML technology for future set-top box interactive services. The main issues include middleware architecture, a Java user interface for digital television, content representation and return channel communications. The middleware architecture used was made up of an Application Manager, Application Programming Interface (API), a Java Virtual Machine, etc., which were arranged in a layered model to ensure the interoperability. The application manager was designed to control the lifecycle of Xlets; manage set-top box resources and remote control keys and to adapt the graphical device environment. The architecture of both application manager and Xlet forms the basic framework for running multiple interactive services simultaneously in future set-top box designs. User interface development is more complex for this type of platform (when compared to that for a desktop computer) as many constraints are set on the look and feel (e.g., TV-like and limited buttons). Various aspects of Java user interfaces were studied and my research in this area focused on creating a remote control event model and lightweight drawing components using the Java Abstract Window Toolkit (AWT) and Java Media Framework (JMF) together with Extensible Markup Language (XML). Applications were designed aimed at studying the data structure and efficiency of the XML language to define interactive content. Content parsing was designed as a lightweight software module based around two parsers (i.e., SAX parsing and DOM parsing). The still content (i.e., text, images, and graphics) and dynamic content (i.e., hyperlinked text, animations, and forms) can then be modeled and processed efficiently. 
This thesis also studies interactivity methods using Java APIs via a return channel. Various communication models are also discussed that meet the interactivity requirements for different interactive services. They include URL, Socket, Datagram, and SOAP models which applications can choose to use in order to establish a connection with the service or broadcaster in order to transfer data. This thesis is presented in two parts: The first section gives a general summary of the research and acts as a complement to the second section, which contains a series of related publications.

Aaltodoc Publication Archive

Analysis by synthesis spatial audio coding

Author: Ahmet Kondoz (1384131)
Ikhwana Elfitri (7185203)
Xiyu Shi (1384281)
Publication date: 29/08/2013

This study presents a novel spatial audio coding (SAC) technique, called analysis by synthesis SAC (AbS-SAC), with a capability of minimising signal distortion introduced during the encoding processes. The reverse one-to-two (R-OTT), a module applied in the MPEG Surround to down-mix two channels as a single channel, is first configured as a closed-loop system. This closed-loop module offers a capability to reduce the quantisation errors of the spatial parameters, leading to an improved quality of the synthesised audio signals. Moreover, a sub-optimal AbS optimisation, based on the closed-loop R-OTT module, is proposed. This algorithm addresses a problem of practicality in implementing an optimal AbS optimisation while it is still capable of improving further the quality of the reconstructed audio signals. In terms of algorithm complexity, the proposed sub-optimal algorithm provides scalability. The results of objective and subjective tests are presented. It is shown that significant improvement of the objective performance, when compared to the conventional open-loop approach, is achieved. On the other hand, subjective tests show that the proposed technique achieves higher subjective difference grade scores than the tested advanced audio coding multichannel

Loughborough University Institutional Repository

Surrey Research Insight