747 research outputs found

    Automatic Emotion Recognition from Mandarin Speech

    Get PDF

    A Structured Model of Video Reproduces Primary Visual Cortical Organisation

    Get PDF
    The visual system must learn to infer the presence of objects and features in the world from the images it encounters, and as such it must, either implicitly or explicitly, model the way these elements interact to create the image. Do the response properties of cells in the mammalian visual system reflect this constraint? To address this question, we constructed a probabilistic model in which the identity and attributes of simple visual elements were represented explicitly and learnt the parameters of this model from unparsed, natural video sequences. After learning, the behaviour and grouping of variables in the probabilistic model corresponded closely to functional and anatomical properties of simple and complex cells in the primary visual cortex (V1). In particular, feature identity variables were activated in a way that resembled the activity of complex cells, while feature attribute variables responded much like simple cells. Furthermore, the grouping of the attributes within the model closely parallelled the reported anatomical grouping of simple cells in cat V1. Thus, this generative model makes explicit an interpretation of complex and simple cells as elements in the segmentation of a visual scene into basic independent features, along with a parametrisation of their moment-by-moment appearances. We speculate that such a segmentation may form the initial stage of a hierarchical system that progressively separates the identity and appearance of more articulated visual elements, culminating in view-invariant object recognition

    Combining timbric and rhythmic features for semantic music tagging

    Get PDF
    In this thesis we propose a novel approach to semantic music tagging. The project uses a modified Hidden Markov Model to semantically link two acoustic features. We make the assumption that acoustically similar songs have similar tags. We model our known collection as a graph where the states represent the songs and the model's probabilities are related\nto the timbric and rhythmic similarity. Tags are inferred from songs in acoustically meaningful paths, all starting from the query song

    Novel Methods for Forensic Multimedia Data Analysis: Part I

    Get PDF
    The increased usage of digital media in daily life has resulted in the demand for novel multimedia data analysis techniques that can help to use these data for forensic purposes. Processing of such data for police investigation and as evidence in a court of law, such that data interpretation is reliable, trustworthy, and efficient in terms of human time and other resources required, will help greatly to speed up investigation and make investigation more effective. If such data are to be used as evidence in a court of law, techniques that can confirm origin and integrity are necessary. In this chapter, we are proposing a new concept for new multimedia processing techniques for varied multimedia sources. We describe the background and motivation for our work. The overall system architecture is explained. We present the data to be used. After a review of the state of the art of related work of the multimedia data we consider in this work, we describe the method and techniques we are developing that go beyond the state of the art. The work will be continued in a Chapter Part II of this topic
    • …
    corecore