7 research outputs found

    FPGA-Based Multimodal Embedded Sensor System Integrating Low- and Mid-Level Vision

    Get PDF
    Motion estimation is a low-level vision task that is especially relevant due to its wide range of applications in the real world. Many of the best motion estimation algorithms include some of the features that are found in mammalians, which would demand huge computational resources and therefore are not usually available in real-time. In this paper we present a novel bioinspired sensor based on the synergy between optical flow and orthogonal variant moments. The bioinspired sensor has been designed for Very Large Scale Integration (VLSI) using properties of the mammalian cortical motion pathway. This sensor combines low-level primitives (optical flow and image moments) in order to produce a mid-level vision abstraction layer. The results are described trough experiments showing the validity of the proposed system and an analysis of the computational resources and performance of the applied algorithms

    Definición y extracción de características en programas audiovisuales para el reconocimiento automático de temática

    Get PDF
    En este proyecto estudiamos las principales características de distintos tipos de vídeos retransmitidos por televisión y diseñamos el comienzo de una aplicación que se encargue de analizar estas características para clasificar los distintos programas en función de su temática

    Context-based multimedia semantics modelling and representation

    Get PDF
    The evolution of the World Wide Web, increase in processing power, and more network bandwidth have contributed to the proliferation of digital multimedia data. Since multimedia data has become a critical resource in many organisations, there is an increasing need to gain efficient access to data, in order to share, extract knowledge, and ultimately use the knowledge to inform business decisions. Existing methods for multimedia semantic understanding are limited to the computable low-level features; which raises the question of how to identify and represent the high-level semantic knowledge in multimedia resources.In order to bridge the semantic gap between multimedia low-level features and high-level human perception, this thesis seeks to identify the possible contextual dimensions in multimedia resources to help in semantic understanding and organisation. This thesis investigates the use of contextual knowledge to organise and represent the semantics of multimedia data aimed at efficient and effective multimedia content-based semantic retrieval.A mixed methods research approach incorporating both Design Science Research and Formal Methods for investigation and evaluation was adopted. A critical review of current approaches for multimedia semantic retrieval was undertaken and various shortcomings identified. The objectives for a solution were defined which led to the design, development, and formalisation of a context-based model for multimedia semantic understanding and organisation. The model relies on the identification of different contextual dimensions in multimedia resources to aggregate meaning and facilitate semantic representation, knowledge sharing and reuse. A prototype system for multimedia annotation, CONMAN was built to demonstrate aspects of the model and validate the research hypothesis, H₁.Towards providing richer and clearer semantic representation of multimedia content, the original contributions of this thesis to Information Science include: (a) a novel framework and formalised model for organising and representing the semantics of heterogeneous visual data; and (b) a novel S-Space model that is aimed at visual information semantic organisation and discovery, and forms the foundations for automatic video semantic understanding

    Statistical Motion Information Extraction and Representation for Semantic Video Analysis

    No full text
    corecore