97 research outputs found
Adaptive Multi-Class Audio Classification in Noisy In-Vehicle Environment
With the ever-increasing number and complexity of car-mounted electronic
devices, audio classification is increasingly important to the automotive
industry as a fundamental tool for human-device interaction. Existing
approaches to audio classification, however, fall short because the unique and
dynamic audio characteristics of in-vehicle environments are not appropriately
taken into account. In this paper, we develop an audio classification system
that classifies an audio stream into music, speech, speech+music, and noise,
adapting to driving environments including highway, local road, crowded city,
and stopped vehicle. More than 420 minutes of audio data, covering various
genres of music as well as speech, speech+music, and noise, were collected from
diverse driving environments. The results demonstrate that, in our experimental
settings, the proposed approach improves the average classification accuracy by
up to 166% for speech and 64% for speech+music, respectively, compared with a
non-adaptive approach.
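The abstract does not reproduce the paper's pipeline; as a rough illustration of the environment-adaptive idea, the following sketch trains one classifier per driving environment and selects the matching model at inference time. The mean-MFCC features (via librosa) and the SVM classifier are assumptions for illustration, not the authors' design.

# Sketch: environment-adaptive audio classification (illustrative only).
# Assumes one pre-trained classifier per driving environment; the feature
# choice (mean MFCCs) and model type are assumptions, not the paper's design.
import numpy as np
import librosa
from sklearn.svm import SVC

CLASSES = ["music", "speech", "speech+music", "noise"]
ENVIRONMENTS = ["highway", "local_road", "crowded_city", "stopped"]

def mfcc_features(path, sr=16000):
    """Mean MFCC vector for one audio clip."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return mfcc.mean(axis=1)

def train_adaptive(clips_by_env):
    """clips_by_env: {env: [(wav_path, class_label), ...]} -> per-env models."""
    models = {}
    for env, clips in clips_by_env.items():
        X = np.stack([mfcc_features(p) for p, _ in clips])
        y = [label for _, label in clips]
        models[env] = SVC().fit(X, y)  # one model per environment
    return models

def classify(models, env, wav_path):
    """Pick the model that matches the current driving environment."""
    return models[env].predict(mfcc_features(wav_path)[None, :])[0]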
Exploitation of Memetics for Melodic Sequences Generation
Music, or in a narrower sense, the melodic contours of aesthetically arranged pitches and their respective durations, has attracted our cognition from the beginning and now shapes the way we think within the complex life of culture. An evolutionary school of thought offers a perspective on the musical diversity of folk songs in the Indonesian archipelago by hypothesizing memes that align throughout the data sets. Considering the memeplexes constructed from the Zipf-Mandelbrot law in melodic sequences, together with some mathematical characteristics of songs (e.g., gyration and spiraling effect), we construct evolutionary steps, i.e., a genetic algorithm, as a tool for generating melodic sequences: an alternative computational method for modelling the cognitive processes that create songs. While building a melodic-contour generator, we also offer an enriched view of the limitless landscape of creativity and innovation, guided by particular inspirations, in the creation of works of art in general.
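The abstract names a genetic algorithm whose design draws on the Zipf-Mandelbrot law but gives no implementation details; the sketch below shows one minimal way to combine the two, scoring candidate melodies by how closely their pitch rank-frequency distribution follows a Zipf-Mandelbrot curve. The pitch alphabet, the parameters q and s, and all GA settings are illustrative assumptions.

# Sketch: genetic algorithm whose fitness rewards melodies whose pitch
# rank-frequency distribution follows a Zipf-Mandelbrot curve.
# Alphabet, parameters q and s, and GA settings are illustrative assumptions.
import random
from collections import Counter

PITCHES = list(range(60, 73))  # MIDI C4..C5, an assumed pitch alphabet
LENGTH, POP, GENS, Q, S = 32, 60, 200, 2.0, 1.0

def zipf_mandelbrot(rank, q=Q, s=S):
    return 1.0 / (rank + q) ** s

def fitness(melody):
    """Negative distance between observed and Zipf-Mandelbrot rank frequencies."""
    counts = sorted(Counter(melody).values(), reverse=True)
    total = sum(counts)
    target = [zipf_mandelbrot(r) for r in range(1, len(counts) + 1)]
    tsum = sum(target)
    return -sum(abs(c / total - t / tsum) for c, t in zip(counts, target))

def crossover(a, b):
    cut = random.randrange(1, LENGTH)
    return a[:cut] + b[cut:]

def mutate(m, rate=0.05):
    return [random.choice(PITCHES) if random.random() < rate else p for p in m]

pop = [[random.choice(PITCHES) for _ in range(LENGTH)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    elite = pop[: POP // 4]  # truncation selection
    pop = elite + [mutate(crossover(random.choice(elite), random.choice(elite)))
                   for _ in range(POP - len(elite))]

print(max(pop, key=fitness))  # a melodic sequence as MIDI pitch numbers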
Audio Features Affected by Music Expressiveness
Within a Music Information Retrieval perspective, the goal of the study
presented here is to investigate the impact on sound features of the musician's
affective intention, namely when trying to intentionally convey emotional
contents via expressiveness. A preliminary experiment has been performed
involving tuba players. The recordings have been analysed by extracting a
variety of features, which have been subsequently evaluated by combining both
classic and machine learning statistical techniques. Results are reported and
discussed.
Comment: Submitted to the ACM SIGIR Conference on Research and Development in
Information Retrieval (SIGIR 2016), Pisa, Italy, July 17-21, 2016
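The abstract does not list the extracted features or tools; as a rough sketch of this kind of analysis, the following uses librosa descriptors (MFCCs, spectral centroid, RMS energy, zero-crossing rate) and a scikit-learn classifier as assumed stand-ins for the study's actual feature set and statistical techniques.

# Sketch: extract audio features from expressive performances and classify the
# intended emotion. librosa/scikit-learn are assumed stand-ins for the study's
# actual toolchain; the feature set and labels are illustrative.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def expressive_features(path):
    y, sr = librosa.load(path)
    return np.concatenate([
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1),
        librosa.feature.spectral_centroid(y=y, sr=sr).mean(axis=1),
        librosa.feature.rms(y=y).mean(axis=1),              # dynamics
        librosa.feature.zero_crossing_rate(y).mean(axis=1),
    ])

# recordings: [(wav_path, intended_emotion), ...] from the tuba players
def evaluate(recordings):
    X = np.stack([expressive_features(p) for p, _ in recordings])
    y = [label for _, label in recordings]
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    return cross_val_score(clf, X, y, cv=5)  # per-fold accuracy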
Harmonic Nature of Maddalam: A Study
The sound samples of different strokes of the maddalam are analysed using the MIR Toolbox. The frequency spectrum and the attack and decay parameters are studied, and the reasons for the harmonic nature of the maddalam are identified.
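MIRtoolbox itself is a MATLAB package; purely as an illustration of the quantities the study examines (spectrum, attack, decay), here is a rough numpy/scipy sketch of how they could be measured from a stroke recording. The file name, the envelope method, and the 10%-of-peak thresholds are assumptions.

# Sketch: spectrum and attack/decay measurement for one percussion stroke.
# A rough numpy/scipy stand-in for the MATLAB MIRtoolbox analysis; the
# envelope smoothing and thresholds below are illustrative assumptions.
import numpy as np
from scipy.io import wavfile
from scipy.ndimage import maximum_filter1d

sr, x = wavfile.read("maddalam_stroke.wav")  # hypothetical mono recording
x = x.astype(float) / np.max(np.abs(x))

# Frequency spectrum: peaks near integer multiples of f0 indicate harmonicity.
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
f0 = freqs[np.argmax(spectrum[1:]) + 1]  # dominant partial, skipping DC

# Amplitude envelope via a sliding maximum of |x| (10 ms window).
env = maximum_filter1d(np.abs(x), size=max(1, sr // 100))
peak = np.argmax(env)

# Attack time: from the first crossing of 10% of peak amplitude up to the peak.
rise = np.flatnonzero(env[:peak + 1] >= 0.1 * env[peak])
attack_s = (peak - rise[0]) / sr if rise.size else 0.0

# Decay time: from the peak back down to 10% of peak amplitude.
fall = np.flatnonzero(env[peak:] <= 0.1 * env[peak])
decay_s = fall[0] / sr if fall.size else (len(x) - peak) / sr

print(f"f0 ~ {f0:.1f} Hz, attack ~ {attack_s:.3f} s, decay ~ {decay_s:.3f} s")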
Enabling Embodied Analogies in Intelligent Music Systems
The present methodology is aimed at cross-modal machine learning and uses
multidisciplinary tools and methods drawn from a broad range of areas and
disciplines, including music, systematic musicology, dance, motion capture,
human-computer interaction, computational linguistics and audio signal
processing. Main tasks include: (1) adapting wisdom-of-the-crowd approaches to
embodiment in music and dance performance to create a dataset of music and
music lyrics that covers a variety of emotions, (2) applying
audio/language-informed machine learning techniques to that dataset to identify
automatically the emotional content of the music and the lyrics, and (3)
integrating motion capture data, recorded with a Vicon system, of dancers
performing to that music.
Comment: 4 pages
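Task (2), identifying the emotional content of lyrics with language-informed machine learning, might look roughly like the sketch below; the TF-IDF plus logistic-regression pipeline and the emotion labels are illustrative assumptions, not the project's stated method.

# Sketch: emotion classification of song lyrics (task 2 in the abstract).
# TF-IDF features + logistic regression are assumed stand-ins; the emotion
# labels would come from the crowd-annotated dataset described in task 1.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# lyrics: list of strings; emotions: crowd-sourced labels, e.g. "joy", "sadness"
def train_lyrics_emotion_model(lyrics, emotions):
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),  # word + bigram features
        LogisticRegression(max_iter=1000),
    )
    return model.fit(lyrics, emotions)

# Usage: model.predict(["some new lyrics ..."]) -> predicted emotion label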
Developing a comprehensive framework for multimodal feature extraction
Feature extraction is a critical component of many applied data science
workflows. In recent years, rapid advances in artificial intelligence and
machine learning have led to an explosion of feature extraction tools and
services that allow data scientists to cheaply and effectively annotate their
data along a vast array of dimensions---ranging from detecting faces in images
to analyzing the sentiment expressed in coherent text. Unfortunately, the
proliferation of powerful feature extraction services has been mirrored by a
corresponding expansion in the number of distinct interfaces to feature
extraction services. In a world where nearly every new service has its own API,
documentation, and/or client library, data scientists who need to combine
diverse features obtained from multiple sources are often forced to write and
maintain ever more elaborate feature extraction pipelines. To address this
challenge, we introduce a new open-source framework for comprehensive
multimodal feature extraction. Pliers is an open-source Python package that
supports standardized annotation of diverse data types (video, images, audio,
and text), and is expressly designed with both ease-of-use and extensibility in
mind.
Users can apply a wide range of pre-existing feature extraction tools to their
data in just a few lines of Python code, and can also easily add their own
custom extractors by writing modular classes. A graph-based API enables rapid
development of complex feature extraction pipelines that output results in a
single, standardized format. We describe the package's architecture, detail its
major advantages over previous feature extraction toolboxes, and use a sample
application to a large functional MRI dataset to illustrate how pliers can
significantly reduce the time and effort required to construct sophisticated
feature extraction workflows while increasing code clarity and maintainability.
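The abstract describes pliers' graph-based API only at a high level, so rather than guess at exact class names, the toy sketch below illustrates the architecture it describes: modular extractor classes behind a uniform interface, composed into a graph whose results share one standardized format. This is not pliers' actual API; see the package documentation for real usage.

# Toy sketch of the architecture the abstract describes: modular extractor
# classes composed into a graph, all emitting one standardized result format.
# NOT pliers' actual API; consult the package docs for real usage.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Result:
    """Single standardized output row, regardless of modality or backend."""
    stim: str
    feature: str
    value: Any

class Extractor:
    """Wrap any third-party feature extraction service behind one interface."""
    def __init__(self, feature: str, fn: Callable[[str], Any]):
        self.feature, self.fn = feature, fn

    def transform(self, stim: str) -> Result:
        return Result(stim, self.feature, self.fn(stim))

class ExtractionGraph:
    """Fan each stimulus out to many extractors; merge into one result table."""
    def __init__(self, extractors):
        self.extractors = extractors

    def transform(self, stims):
        return [e.transform(s) for s in stims for e in self.extractors]

# Usage: hypothetical text features behind the uniform interface.
graph = ExtractionGraph([
    Extractor("word_count", lambda text: len(text.split())),
    Extractor("n_chars", len),
])
for row in graph.transform(["a short text", "another stimulus"]):
    print(row)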
- …