Multimedia signal processing for behavioral quantification in neuroscience
While there have been great advances in the quantification of the genotype of organisms, including full genomes for many species, the quantification of phenotype is at a comparatively primitive stage. Part of the reason is technical difficulty: the phenotype covers a wide range of characteristics, from static morphological features to dynamic behavior. The latter poses challenges that fall within the area of multimedia signal processing (MMSP). Automated analysis of video and audio recordings of animal and human behavior is a growing area of research, ranging from the behavioral phenotyping of genetically modified mice or Drosophila to the study of song learning in birds and speech acquisition in human infants. This paper reviews recent advances and identifies key problems for a range of behavior experiments that use audio and video recording. This research area offers both research challenges and an application domain for advanced multimedia signal processing. A number of existing MMSP tools are directly relevant to behavioral quantification, such as speech recognition, video analysis and, more recently, wired and wireless sensor networks for surveillance. The research challenge is to adapt these tools, and to develop new ones, for studying human and animal behavior in a high-throughput manner while minimizing human intervention. In contrast with consumer applications, the research arena imposes less of a penalty for computational complexity, so algorithmic quality can be maximized by using the larger computational resources available to the biomedical researcher.
Multiple Instance Learning: A Survey of Problem Characteristics and Applications
Multiple instance learning (MIL) is a form of weakly supervised learning
where training instances are arranged in sets, called bags, and a label is
provided for the entire bag. This formulation is gaining interest because it
naturally fits various problems and makes it possible to leverage weakly labeled data.
Consequently, it has been used in diverse application fields such as computer
vision and document classification. However, learning from bags raises
important challenges that are unique to MIL. This paper provides a
comprehensive survey of the characteristics which define and differentiate the
types of MIL problems. Until now, these problem characteristics have not been
formally identified and described. As a result, the variations in performance
of MIL algorithms from one data set to another are difficult to explain. In
this paper, MIL problem characteristics are grouped into four broad categories:
the composition of the bags, the types of data distribution, the ambiguity of
instance labels, and the task to be performed. Methods specialized to address
each category are reviewed. Then, the extent to which these characteristics
manifest themselves in key MIL application areas is described. Finally,
experiments are conducted to compare the performance of 16 state-of-the-art MIL
methods on selected problem characteristics. This paper provides insight into how
the problem characteristics affect MIL algorithms, recommendations for future
benchmarking, and promising avenues for research.
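The bag-level labeling described in this abstract can be illustrated with a minimal sketch. It assumes the standard MIL assumption (a bag is positive if at least one of its instances is positive), which the abstract itself does not state. The names `bag_label` and `toy_scorer`, the threshold, and the data are hypothetical and chosen only for illustration; this is not the survey's method.

```python
from typing import Callable, List, Sequence

Instance = Sequence[float]   # one feature vector
Bag = List[Instance]         # a set of instances sharing a single label


def bag_label(bag: Bag,
              instance_scorer: Callable[[Instance], float],
              threshold: float = 0.5) -> int:
    """Label a bag under the standard MIL assumption (an assumption here):
    the bag is positive (1) if any instance scores above the threshold,
    otherwise negative (0)."""
    return int(any(instance_scorer(x) > threshold for x in bag))


def toy_scorer(x: Instance) -> float:
    """Hypothetical instance scorer: uses the first feature as the score."""
    return x[0]


# Two toy bags: the first contains one high-scoring instance, the second none.
bags = [
    [(0.1, 0.2), (0.9, 0.3)],
    [(0.2, 0.1), (0.3, 0.4)],
]
labels = [bag_label(b, toy_scorer) for b in bags]
print(labels)
```

Only the bag labels would be available at training time in this setting; the instance scores are exactly what a MIL method has to infer.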
Hidden Markov Models
Hidden Markov Models (HMMs), although known for decades, are now widely used and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environmental protection and engineering. I hope that readers will find this book useful for their own research.
Seeing sound: a new way to illustrate auditory objects and their neural correlates
This thesis develops a new method for time-frequency signal processing and examines the relevance of the new representation in studies of neural coding in songbirds. The method groups together associated regions of the time-frequency plane into objects defined by time-frequency contours. By combining information about structurally stable contour shapes over multiple time-scales and angles, a signal decomposition is produced that distributes resolution adaptively. As a result, distinct signal components are represented in their own most parsimonious forms.
Next, through neural recordings in singing birds, it was found that activity in song premotor cortex is significantly correlated with the objects defined by this new representation of sound. In this process, an automated way of finding sub-syllable acoustic transitions in birdsongs was first developed, and then increased spiking probability was found at the boundaries of these acoustic transitions.
Finally, a new approach to study auditory cortical sequence processing more generally is proposed. In this approach, songbirds were trained to discriminate Morse-code-like sequences of clicks, and the neural correlates of this behavior were examined in primary and secondary auditory cortex. It was found that a distinct transformation of auditory responses to the sequences of clicks exists as information is transferred from primary to secondary auditory areas. Neurons in secondary auditory areas respond asynchronously and selectively -- in a manner that depends on the temporal context of the click. This transformation from a temporal to a spatial representation of sound provides a possible basis for the songbird's natural ability to discriminate complex temporal sequences.
Comparing call-based versus subunit-based methods for categorizing Norwegian killer whale, Orcinus orca, vocalizations
Author Posting. © The Author(s), 2010. This is the author's version of the work. It is posted here by permission of Elsevier B.V. for personal use, not for redistribution. The definitive version was published in Animal Behaviour 81 (2011): 377-386, doi:10.1016/j.anbehav.2010.09.020.
Students of animal communication face significant challenges when deciding how to
categorise calls into subunits, calls, and call series. Here, we use algorithms designed to parse
human speech to test different approaches for categorising calls of killer whales. Killer whale
vocalisations have traditionally been categorised by humans into discrete call types. These calls
often contain internal spectral shifts, periods of silence, and synchronously produced low and
high frequency components, suggesting that they may be composed of subunits. We describe
and compare three different approaches for modelling Norwegian killer whale calls. The first
method considered the whole call as the basic unit of analysis. Inspired by human speech
processing techniques, the second and third methods represented the calls in terms of subunits.
Subunits may provide a more parsimonious approach to modelling the vocal stream since (1)
there were fewer subunits than call types; (2) nearly 75% of all call types shared at least one
subunit. We show that contour traces from stereotyped Norwegian killer whale calls yielded
similar automatic classification performance using either whole calls or subunits. We also
demonstrate that subunits derived from Norwegian stereotyped calls were detected in some
Norwegian variable (non-stereotyped) calls as well as the stereotyped calls of other killer whale
populations. Further work is required to test whether killer whales use subunits to generate and
categorize their vocal repertoire.
The undergraduate students were
supported by the Massachusetts Institute of Technology Undergraduate Research Opportunities
Program office and the Ocean Life Institute (OLI) at the Woods Hole Oceanographic Institution
(WHOI). Field work was financed by the OLI, National Geographic Society and WWF Sweden.
A. D. Shapiro was funded by a National Defense Science and Engineering Graduate Fellowship
and the WHOI Academic Programs Office.
- …