246,777 research outputs found
A Review of Audio Features and Statistical Models Exploited for Voice Pattern Design
Audio fingerprinting, also named as audio hashing, has been well-known as a
powerful technique to perform audio identification and synchronization. It
basically involves two major steps: fingerprint (voice pattern) design and
matching search. While the first step concerns the derivation of a robust and
compact audio signature, the second step usually requires knowledge about
database and quick-search algorithms. Though this technique offers a wide range
of real-world applications, to the best of the authors' knowledge, a
comprehensive survey of existing algorithms appeared more than eight years ago.
Thus, in this paper, we present a more up-to-date review and, for emphasizing
on the audio signal processing aspect, we focus our state-of-the-art survey on
the fingerprint design step for which various audio features and their
tractable statistical models are discussed.Comment: http://www.iaria.org/conferences2015/PATTERNS15.html ; Seventh
International Conferences on Pervasive Patterns and Applications (PATTERNS
2015), Mar 2015, Nice, Franc
CMOS Hyperbolic Sine ELIN filters for low/audio frequency biomedical applications
Hyperbolic-Sine (Sinh) filters form a subclass of Externally-Linear-Internally-Non-
Linear (ELIN) systems. They can handle large-signals in a low power environment under half
the capacitor area required by the more popular ELIN Log-domain filters. Their inherent
class-AB nature stems from the odd property of the sinh function at the heart of their
companding operation. Despite this early realisation, the Sinh filtering paradigm has not
attracted the interest it deserves to date probably due to its mathematical and circuit-level
complexity.
This Thesis presents an overview of the CMOS weak inversion Sinh filtering
paradigm and explains how biomedical systems of low- to audio-frequency range could
benefit from it. Its dual scope is to: consolidate the theory behind the synthesis and design of
high order Sinh continuous–time filters and more importantly to confirm their micro-power
consumption and 100+ dB of DR through measured results presented for the first time.
Novel high order Sinh topologies are designed by means of a systematic
mathematical framework introduced. They employ a recently proposed CMOS Sinh
integrator comprising only p-type devices in its translinear loops. The performance of the
high order topologies is evaluated both solely and in comparison with their Log domain
counterparts. A 5th order Sinh Chebyshev low pass filter is compared head-to-head with a
corresponding and also novel Log domain class-AB topology, confirming that Sinh filters
constitute a solution of equally high DR (100+ dB) with half the capacitor area at the expense
of higher complexity and power consumption. The theoretical findings are validated by
means of measured results from an 8th order notch filter for 50/60Hz noise fabricated in a
0.35ÎĽm CMOS technology. Measured results confirm a DR of 102dB, a moderate SNR of
~60dB and 74ÎĽW power consumption from 2V power supply
Audio-visual Rhetoric: Visualizing the Pattern Language of Film
Audio-visual Rhetoric is a knowledge domain for designers in theory and practice that is valid for all communicative actions through media that aim for persuasion. Within this domain, we introduce a framework for media analysis. We developed an Audio-Visual Pattern (AVP) language for film that is visualized within a notation system. This system shows auditory and visual parameters in order to reveal film’s rhetorical structure. We discuss related theories from pattern language and rhetoric and apply the AVP method to analyze 10 commercials.
Keywords:
Pattern Language, Film Analysis, Rhetoric, Emotion, Persuasion, Design Research</p
Audio style transfer
'Style transfer' among images has recently emerged as a very active research
topic, fuelled by the power of convolution neural networks (CNNs), and has
become fast a very popular technology in social media. This paper investigates
the analogous problem in the audio domain: How to transfer the style of a
reference audio signal to a target audio content? We propose a flexible
framework for the task, which uses a sound texture model to extract statistics
characterizing the reference audio style, followed by an optimization-based
audio texture synthesis to modify the target content. In contrast to mainstream
optimization-based visual transfer method, the proposed process is initialized
by the target content instead of random noise and the optimized loss is only
about texture, not structure. These differences proved key for audio style
transfer in our experiments. In order to extract features of interest, we
investigate different architectures, whether pre-trained on other tasks, as
done in image style transfer, or engineered based on the human auditory system.
Experimental results on different types of audio signal confirm the potential
of the proposed approach.Comment: ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech
and Signal Processing (ICASSP), Apr 2018, Calgary, France. IEE
- …