
    Multimedia signal processing for behavioral quantification in neuroscience

    While there have been great advances in quantification of the genotype of organisms, including full genomes for many species, the quantification of phenotype is at a comparatively primitive stage. Part of the reason is technical difficulty: the phenotype covers a wide range of characteristics, from static morphological features to dynamic behavior. The latter poses challenges that fall in the area of multimedia signal processing (MMSP). Automated analysis of video and audio recordings of animal and human behavior is a growing area of research, ranging from the behavioral phenotyping of genetically modified mice or Drosophila to the study of song learning in birds and speech acquisition in human infants. This paper reviews recent advances and identifies key problems for a range of behavior experiments that use audio and video recording. This research area offers both research challenges and an application domain for advanced multimedia signal processing. A number of MMSP tools now exist that are directly relevant to behavioral quantification, such as speech recognition, video analysis and, more recently, wired and wireless sensor networks for surveillance. The research challenge is to adapt these tools, and to develop the new ones required, for studying human and animal behavior in a high-throughput manner while minimizing human intervention. In contrast with consumer applications, in the research arena there is less of a penalty for computational complexity, so algorithmic quality can be maximized by using the larger computational resources available to the biomedical researcher.

    Multiple Instance Learning: A Survey of Problem Characteristics and Applications

    Multiple instance learning (MIL) is a form of weakly supervised learning where training instances are arranged in sets, called bags, and a label is provided for the entire bag. This formulation is gaining interest because it naturally fits various problems and allows weakly labeled data to be leveraged. Consequently, it has been used in diverse application fields such as computer vision and document classification. However, learning from bags raises important challenges that are unique to MIL. This paper provides a comprehensive survey of the characteristics which define and differentiate the types of MIL problems. Until now, these problem characteristics have not been formally identified and described. As a result, the variations in performance of MIL algorithms from one data set to another are difficult to explain. In this paper, MIL problem characteristics are grouped into four broad categories: the composition of the bags, the types of data distribution, the ambiguity of instance labels, and the task to be performed. Methods specialized to address each category are reviewed. Then, the extent to which these characteristics manifest themselves in key MIL application areas is described. Finally, experiments are conducted to compare the performance of 16 state-of-the-art MIL methods on selected problem characteristics. This paper provides insight into how the problem characteristics affect MIL algorithms, recommendations for future benchmarking and promising avenues for research.
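
    The bag formulation described in this abstract can be illustrated with a minimal sketch. The names `Bag` and `bag_label_from_instances` are hypothetical, and the "positive if at least one instance is positive" rule is the commonly used standard MIL assumption, which the abstract itself does not state; the survey covers other settings as well.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Bag:
    """A set of instances with a single label for the whole set (hypothetical type)."""
    instances: List[Sequence[float]]  # one feature vector per instance
    label: int                        # bag-level label; instance labels are not observed


def bag_label_from_instances(instance_labels: Sequence[int]) -> int:
    """Derive a bag label from hidden instance labels.

    Assumption (not from the abstract): the standard MIL rule, where a bag
    is positive if and only if at least one of its instances is positive.
    """
    return int(any(instance_labels))


# A bag of three 2-D instances whose hidden instance labels are [0, 0, 1].
example = Bag(
    instances=[[0.1, 0.2], [0.0, 0.3], [0.9, 0.8]],
    label=bag_label_from_instances([0, 0, 1]),
)
```

    Only `example.label` would be available to a learner at training time, which is what makes the supervision weak.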

    Hidden Markov Models

    Hidden Markov Models (HMMs), although known for decades, have become widely used in recent years and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environmental protection and engineering. I hope that readers will find this book useful and helpful for their own research.

    Seeing sound: a new way to illustrate auditory objects and their neural correlates

    This thesis develops a new method for time-frequency signal processing and examines the relevance of the new representation in studies of neural coding in songbirds. The method groups together associated regions of the time-frequency plane into objects defined by time-frequency contours. By combining information about structurally stable contour shapes over multiple time-scales and angles, a signal decomposition is produced that distributes resolution adaptively. As a result, distinct signal components are represented in their own most parsimonious forms. Next, through neural recordings in singing birds, it was found that activity in song premotor cortex is significantly correlated with the objects defined by this new representation of sound. In this process, an automated way of finding sub-syllable acoustic transitions in birdsongs was first developed, and then increased spiking probability was found at the boundaries of these acoustic transitions. Finally, a new approach to studying auditory cortical sequence processing more generally is proposed. In this approach, songbirds were trained to discriminate Morse-code-like sequences of clicks, and the neural correlates of this behavior were examined in primary and secondary auditory cortex. It was found that a distinct transformation of auditory responses to the sequences of clicks occurs as information is transferred from primary to secondary auditory areas. Neurons in secondary auditory areas respond asynchronously and selectively, in a manner that depends on the temporal context of the click. This transformation from a temporal to a spatial representation of sound provides a possible basis for the songbird's natural ability to discriminate complex temporal sequences.

    A Comparison Study to Identify Birds Species Based on Bird Song Signals

    Bird species recognition using unsupervised modeling of individual vocalization elements

    Comparing call-based versus subunit-based methods for categorizing Norwegian killer whale, Orcinus orca, vocalizations

    Author Posting. © The Author(s), 2010. This is the author's version of the work. It is posted here by permission of Elsevier B.V. for personal use, not for redistribution. The definitive version was published in Animal Behaviour 81 (2011): 377-386, doi:10.1016/j.anbehav.2010.09.020. Students of animal communication face significant challenges when deciding how to categorise calls into subunits, calls, and call series. Here, we use algorithms designed to parse human speech to test different approaches for categorising calls of killer whales. Killer whale vocalisations have traditionally been categorised by humans into discrete call types. These calls often contain internal spectral shifts, periods of silence, and synchronously produced low and high frequency components, suggesting that they may be composed of subunits. We describe and compare three different approaches for modelling Norwegian killer whale calls. The first method considered the whole call as the basic unit of analysis. Inspired by human speech processing techniques, the second and third methods represented the calls in terms of subunits. Subunits may provide a more parsimonious approach to modelling the vocal stream since (1) there were fewer subunits than call types; (2) nearly 75% of all call types shared at least one subunit. We show that contour traces from stereotyped Norwegian killer whale calls yielded similar automatic classification performance using either whole calls or subunits. We also demonstrate that subunits derived from Norwegian stereotyped calls were detected in some Norwegian variable (non-stereotyped) calls as well as the stereotyped calls of other killer whale populations. Further work is required to test whether killer whales use subunits to generate and categorize their vocal repertoire. The undergraduate students were supported by the Massachusetts Institute of Technology Undergraduate Research Opportunities Program office and the Ocean Life Institute (OLI) at the Woods Hole Oceanographic Institution (WHOI). Field work was financed by the OLI, National Geographic Society and WWF Sweden. A. D. Shapiro was funded by a National Defense Science and Engineering Graduate Fellowship and the WHOI Academic Programs Office.