Multi-Label Classifier Chains for Bird Sound
Bird sound data collected with unattended microphones for automatic surveys,
or mobile devices for citizen science, typically contain multiple
simultaneously vocalizing birds of different species. However, few works have
considered the multi-label structure in birdsong. We propose to use an ensemble
of classifier chains combined with a histogram-of-segments representation for
multi-label classification of birdsong. The proposed method is compared with
binary relevance and three multi-instance multi-label learning (MIML)
algorithms from prior work (which focus more on structure in the sound, and
less on structure in the label sets). Experiments are conducted on two
real-world birdsong datasets, and show that the proposed method usually
outperforms binary relevance (using the same features and base-classifier), and
is better in some cases and worse in others compared to the MIML algorithms.
Comment: 6 pages, 1 figure; submitted to the ICML 2013 workshop on bioacoustics. Note: this is a minor revision: the blind submission format has been replaced with one that shows author names, and a few corrections have been made.
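The ensemble-of-classifier-chains idea above can be sketched with scikit-learn's `ClassifierChain`: each chain in the ensemble uses a different random label order, and the chains' predicted probabilities are averaged and thresholded. This is a minimal illustration, not the authors' implementation; the synthetic data merely stands in for the histogram-of-segments features.

```python
# Sketch: ensemble of classifier chains for multi-label classification.
# Synthetic multi-label data stands in for birdsong features.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain

X, Y = make_multilabel_classification(n_samples=300, n_features=20,
                                      n_classes=5, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

# Each chain feeds earlier labels in its (random) order as extra features
# to later binary classifiers, capturing label dependencies.
chains = [ClassifierChain(LogisticRegression(max_iter=1000),
                          order="random", random_state=i)
          for i in range(10)]
for chain in chains:
    chain.fit(X_tr, Y_tr)

# Average the chains' probabilities, then threshold at 0.5.
Y_prob = np.mean([c.predict_proba(X_te) for c in chains], axis=0)
Y_pred = (Y_prob >= 0.5).astype(int)
```

Binary relevance corresponds to dropping the chaining step, i.e. training one independent binary classifier per label on the same features.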
Parallels in the sequential organization of birdsong and human speech.
Human speech possesses a rich hierarchical structure that allows for meaning to be altered by words spaced far apart in time. Conversely, the sequential structure of nonhuman communication is thought to follow non-hierarchical Markovian dynamics operating over only short distances. Here, we show that human speech and birdsong share a similar sequential structure indicative of both hierarchical and Markovian organization. We analyze the sequential dynamics of song from multiple songbird species and speech from multiple languages by modeling the information content of signals as a function of the sequential distance between vocal elements. Across short sequence-distances, an exponential decay dominates the information in speech and birdsong, consistent with underlying Markovian processes. At longer sequence-distances, the decay in information follows a power law, consistent with underlying hierarchical processes. Thus, the sequential organization of acoustic elements in two learned vocal communication signals (speech and birdsong) shows functionally equivalent dynamics, governed by similar processes.
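The core measurement above, information content as a function of sequence distance, can be sketched by estimating the mutual information between symbols separated by a lag d. The toy sequence below is generated by a first-order Markov chain (standing in for song syllables), so its information decay should be roughly exponential; the hierarchical signals in the paper instead show a power-law tail at long distances.

```python
# Sketch: mutual information between sequence elements at distance d.
# The 3-symbol Markov chain below is a hypothetical stand-in for syllables.
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.7, 0.2],
              [0.2, 0.1, 0.7]])
seq = [0]
for _ in range(20000):
    seq.append(rng.choice(3, p=P[seq[-1]]))
seq = np.array(seq)

def mutual_information(seq, d):
    """Plug-in MI estimate (bits) between symbols d steps apart."""
    pairs = Counter(zip(seq[:-d], seq[d:]))
    n = sum(pairs.values())
    px, py = Counter(seq[:-d]), Counter(seq[d:])
    mi = 0.0
    for (x, y), c in pairs.items():
        pxy = c / n
        mi += pxy * np.log2(pxy * n * n / (px[x] * py[y]))
    return mi

decay = [mutual_information(seq, d) for d in range(1, 11)]
# For this Markovian toy sequence, decay[d] falls off with distance d.
```

Fitting an exponential versus a power law to such a decay curve (e.g. with `scipy.optimize.curve_fit`) is then the model-comparison step the paper describes.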
Behavioral and neural selectivity for acoustic signatures of vocalizations
Vocal communication relies on the ability of listeners to identify, process, and respond to vocal sounds produced by others in complex environments. In order to accurately recognize these signals, animals' auditory systems must robustly represent acoustic features that distinguish vocal sounds from other environmental sounds. In this dissertation, I describe experiments combining acoustic, behavioral, and neurophysiological approaches to identify behaviorally relevant vocalization features and understand how they are represented in the brain. First, I show that vocal responses to communication sounds in songbirds depend on the presence of specific spectral signatures of vocalizations. Second, I identify an anatomically localized neural population in the auditory cortex that shows selective responses for behaviorally relevant sounds. Third, I show that these neurons' spectral selectivity is robust to acoustic context, indicating that they could function as spectral signature detectors in a variety of listening conditions. Last, I deconstruct neural selectivity for behaviorally relevant sounds and show that it is driven by a sensitivity to deep fluctuations in power along the sound frequency spectrum. Together, these results show that the processing of behaviorally relevant spectral features engages a specialized neural population in the auditory cortex, and elucidate an acoustic driver of vocalization selectivity.
Multi-instance multi-label learning : algorithms and applications to bird bioacoustics
We consider the problem of supervised classification of bird species from audio recordings in a real-world acoustic monitoring scenario (i.e. audio data is collected in the field with an omnidirectional microphone, without human supervision). Obtaining better data about bird activity can assist conservation efforts, and improve our understanding of their interactions with the environment and other organisms. However, traditional observation methods are labor-intensive. Most prior work on machine learning for bird song is not applicable to real-world acoustic monitoring, because it assumes recordings contain only a single species of bird, while recordings typically contain multiple simultaneously vocalizing birds. We propose to use the multi-instance multi-label (MIML) framework in machine learning for the species classification problem, where the dataset is viewed as a collection of bags of instances paired with sets of labels. Furthermore, we formalize MIML instance annotation, where the goal is to predict instance labels while learning only from bag label sets. We develop the first MIML representation for audio, and several new algorithms for MIML instance annotation based on support vector machines or classifier chains. The proposed methods classify either the set of species present in a recording, or individual calls, while learning only from recordings paired with a set of species. This form of training data requires less human effort to obtain than individually labeled calls. These methods are successfully applied to audio collected in the field which included multiple simultaneously vocalizing species. The proposed algorithms for MIML classification are general, and are also applied to object recognition in images.
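The MIML data layout described above, a bag of per-segment feature vectors paired with a set of species labels, can be illustrated as follows. This is a simplified baseline (max-pool each bag into one vector, then binary relevance), not the thesis's algorithms, and every name and number here is synthetic.

```python
# Sketch: MIML-style data (bags of instances + label sets) with a
# pooling baseline. All data below is synthetic and hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)
n_species, n_features = 4, 8

def make_bag():
    """One 'recording': a few segments, each produced by one species."""
    species = rng.choice(n_species, size=rng.integers(1, 4), replace=False)
    segments = np.vstack([rng.normal(loc=s, size=n_features)
                          for s in species])
    labels = np.zeros(n_species, int)
    labels[species] = 1          # bag-level label SET, not per-segment
    return segments, labels

bags = [make_bag() for _ in range(200)]

# Max-pool each bag into a fixed-length, order-invariant summary vector.
X = np.array([segs.max(axis=0) for segs, _ in bags])
Y = np.array([labels for _, labels in bags])

# Binary relevance over the pooled features: one classifier per species.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
pred = clf.predict(X)
```

The point of the MIML formulation is that `labels` is attached to the whole bag: no segment-level annotation is required, which is what makes the training data cheap to obtain.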
New tools for old questions: studying vocal communication in the Zebra Finch (Taeniopygia guttata).
Adult zebra finches (Taeniopygia guttata) have a crystallised song and several types of calls. However, the exact number of call types and their functions are not completely understood. The pattern of calls may be associated with a specific context and with the kind of relationship between two interacting birds.
The aim of this project is to test the correlation between an experimentally controlled context and the pattern of calls and songs elicited.
The acoustic signals produced by the same pairs of zebra finches exposed to three different conditions were recorded. Each pair of birds was first kept in a small sound box; then two couples were placed together in a larger aviary, and finally nest material was added. Each bird was equipped with a miniaturised microphone tied to its back, in order to ascertain the identity of the bird emitting each sound. Video recording was used to correlate the birds' behaviour with their vocalizations. The males were implanted with an electrode suitable for local field potential (LFP) recording, placed in the Nucleus Robustus of the Arcopallium (RA). This nucleus is involved in the modulation of the learned features of songs and calls and in perceptual processing.
Quantitative analysis of the temporal association between individual calls reveals that calls are used in bidirectional communication: precise patterns of association between calls are established within the pair. The type of relationship existing between two birds, for instance "members of a couple" or "dominance hierarchy between males", and the environmental context, for example "being in a favourable breeding condition", are likely to be described by patterns of temporal association of call combinations.
It was possible to describe the change in RA activity during song and call production through analysis of the LFP signal. Moreover, the LFP showed a repeatable signal after several days, demonstrating the suitability of this device for studying long-term processes such as song learning.
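The temporal-association analysis mentioned above can be sketched as a cross-correlogram of the two birds' call times: count how often bird B calls within short lags of bird A's calls, and look for a peak at a short positive lag (B answering A). All timings below are synthetic stand-ins, not data from the project.

```python
# Sketch: cross-correlogram of two birds' call times.
# All call times below are synthetic and hypothetical.
import numpy as np

rng = np.random.default_rng(1)
calls_a = np.sort(rng.uniform(0, 600, size=200))   # bird A call times (s)

# Bird B answers ~70% of A's calls about 0.2 s later, plus calls on its own.
mask = rng.random(calls_a.size) < 0.7
answers = calls_a[mask] + rng.normal(0.2, 0.05, size=mask.sum())
spontaneous = rng.uniform(0, 600, size=60)
calls_b = np.sort(np.concatenate([answers, spontaneous]))

# All pairwise lags from A calls to B calls, histogrammed in a ±1 s window.
lags = (calls_b[None, :] - calls_a[:, None]).ravel()
lags = lags[np.abs(lags) <= 1.0]
hist, edges = np.histogram(lags, bins=40, range=(-1, 1))
peak_lag = 0.5 * (edges[np.argmax(hist)] + edges[np.argmax(hist) + 1])
# peak_lag should sit near the +0.2 s answer latency built into the data.
```

A flat correlogram would indicate no temporal coupling, while asymmetric peaks at positive and negative lags distinguish who tends to answer whom.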
Acoustic sequences in non-human animals: a tutorial review and prospectus.
Animal acoustic communication often takes the form of complex sequences, made up of multiple distinct acoustic units. Apart from the well-known example of birdsong, other animals such as insects, amphibians, and mammals (including bats, rodents, primates, and cetaceans) also generate complex acoustic sequences. Occasionally, such as with birdsong, the adaptive role of these sequences seems clear (e.g. mate attraction and territorial defence). More often however, researchers have only begun to characterise - let alone understand - the significance and meaning of acoustic sequences. Hypotheses abound, but there is little agreement as to how sequences should be defined and analysed. Our review aims to outline suitable methods for testing these hypotheses, and to describe the major limitations to our current and near-future knowledge on questions of acoustic sequences. This review and prospectus is the result of a collaborative effort between 43 scientists from the fields of animal behaviour, ecology and evolution, signal processing, machine learning, quantitative linguistics, and information theory, who gathered for a 2013 workshop entitled, 'Analysing vocal sequences in animals'. Our goal is to present not just a review of the state of the art, but to propose a methodological framework that summarises what we suggest are the best practices for research in this field, across taxa and across disciplines. We also provide a tutorial-style introduction to some of the most promising algorithmic approaches for analysing sequences. We divide our review into three sections: identifying the distinct units of an acoustic sequence, describing the different ways that information can be contained within a sequence, and analysing the structure of that sequence. Each of these sections is further subdivided to address the key questions and approaches in that area. 
We propose a uniform, systematic, and comprehensive approach to studying sequences, with the goal of clarifying research terms used in different fields, and facilitating collaboration and comparative studies. Allowing greater interdisciplinary collaboration will facilitate the investigation of many important questions in the evolution of communication and sociality.
This review was developed at an investigative workshop, "Analyzing Animal Vocal Communication Sequences", that took place on October 21–23, 2013 in Knoxville, Tennessee, sponsored by the National Institute for Mathematical and Biological Synthesis (NIMBioS). NIMBioS is an institute sponsored by the National Science Foundation, the U.S. Department of Homeland Security, and the U.S. Department of Agriculture through NSF Awards #EF-0832858 and #DBI-1300426, with additional support from The University of Tennessee, Knoxville. In addition to the authors, Vincent Janik participated in the workshop. D.T.B.'s research is currently supported by NSF DEB-1119660. M.A.B.'s research is currently supported by NSF IOS-0842759 and NIH R01DC009582. M.A.R.'s research is supported by ONR N0001411IP20086 and NOPP (ONR/BOEM) N00014-11-1-0697. S.L.DeR.'s research is supported by the U.S. Office of Naval Research. R.F.-i-C.'s research was supported by the grant BASMATI (TIN2011-27479-C04-03) from the Spanish Ministry of Science and Innovation. E.C.G.'s research is currently supported by a National Research Council postdoctoral fellowship. E.E.V.'s research is supported by CONACYT, Mexico, award number I010/214/2012.
This is the accepted manuscript. The final version is available at http://dx.doi.org/10.1111/brv.1216
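One of the basic sequence-structure analyses such reviews cover is estimating a first-order Markov transition matrix from a string of acoustic units. A minimal sketch, using a hypothetical toy syllable sequence rather than real annotations:

```python
# Sketch: first-order transition matrix P(next unit | current unit)
# from a toy annotated syllable sequence (labels are hypothetical).
import numpy as np

syllables = list("ABABCABABCCABABABC")
units = sorted(set(syllables))
idx = {u: i for i, u in enumerate(units)}

counts = np.zeros((len(units), len(units)))
for a, b in zip(syllables, syllables[1:]):
    counts[idx[a], idx[b]] += 1

# Row-normalise counts into conditional transition probabilities.
P = counts / counts.sum(axis=1, keepdims=True)
```

Comparing such a fitted Markov model against higher-order or hierarchical models (e.g. by held-out likelihood) is one way to test the competing hypotheses about sequence structure that the review discusses.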