
    Irish Machine Vision and Image Processing Conference Proceedings 2017


    A deep-learning based feature hybrid framework for spatiotemporal saliency detection inside videos

    Although research on saliency detection and visual attention has been active in recent years, most existing work focuses on still images rather than video-based saliency. In this paper, a deep-learning based hybrid spatiotemporal saliency feature extraction framework is proposed for saliency detection in video footage. The deep learning model is used to extract high-level features from raw video data, which are then integrated with other high-level features. The deep learning network is found to be more effective at extracting hidden features than conventional handcrafted methods. This work demonstrates the effectiveness of using hybrid high-level features for saliency detection in video. Rather than operating on a single static image, the proposed deep learning model takes several consecutive frames as input, so both spatial and temporal characteristics are considered when computing saliency maps. The efficacy of the proposed hybrid feature framework is evaluated on five databases of complex scenes with human-gaze annotations. Experimental results show that the proposed model outperforms five other state-of-the-art video saliency detection approaches. In addition, the proposed framework proves useful for other video-content-based applications such as video highlight detection; as a result, a large movie-clip dataset with labeled video highlights is generated.
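
    The sketch below illustrates the general idea described in the abstract: several consecutive frames are fed to a convolutional network, the learned deep features are fused with a vector of other precomputed high-level features, and a per-pixel saliency map is produced. It is a minimal illustration assuming PyTorch; all layer sizes, module names, and the fusion scheme are illustrative assumptions, not the authors' architecture.

    # Minimal sketch (PyTorch assumed) of a spatiotemporal saliency model:
    # consecutive frames in, deep features fused with extra high-level
    # features, per-pixel saliency map out. Sizes are illustrative only.
    import torch
    import torch.nn as nn

    class SpatioTemporalSaliencyNet(nn.Module):
        def __init__(self, num_frames=5, num_extra_features=64):
            super().__init__()
            # Consecutive RGB frames are stacked along the channel axis.
            self.encoder = nn.Sequential(
                nn.Conv2d(3 * num_frames, 32, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=3, padding=1),
                nn.ReLU(),
            )
            # Project the precomputed high-level features so they can be
            # broadcast spatially and concatenated with the deep features.
            self.extra_proj = nn.Linear(num_extra_features, 64)
            self.decoder = nn.Sequential(
                nn.Conv2d(128, 32, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv2d(32, 1, kernel_size=1),
                nn.Sigmoid(),  # saliency probability per pixel
            )

        def forward(self, frames, extra_features):
            # frames: (B, num_frames*3, H, W); extra_features: (B, num_extra_features)
            deep = self.encoder(frames)
            b, _, h, w = deep.shape
            extra = self.extra_proj(extra_features).view(b, 64, 1, 1).expand(b, 64, h, w)
            return self.decoder(torch.cat([deep, extra], dim=1))

    # Example: 5 consecutive 128x128 frames plus a 64-d feature vector per clip.
    model = SpatioTemporalSaliencyNet()
    saliency = model(torch.randn(2, 15, 128, 128), torch.randn(2, 64))
    print(saliency.shape)  # torch.Size([2, 1, 128, 128])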

    Deep convolutional neural networks for statistical downscaling of climate change projections

    Regional climate projections are in high demand among socioeconomic sectors for elaborating their plans for adaptation to and mitigation of climate change. Nevertheless, state-of-the-art Global Climate Models (GCMs) have very coarse spatial resolutions, which limits their use in most practical applications and impact studies. One way to increase this limited spatial resolution is to establish empirical/statistical functions that link the local variable of interest (e.g. temperature and/or precipitation at a given site) with a set of large-scale atmospheric variables (e.g. geopotential and/or winds at different vertical levels), which are typically well reproduced by GCMs. In this context, this thesis explores the suitability of deep learning, and in particular modern Convolutional Neural Networks (CNNs), as statistical downscaling techniques to produce regional climate change projections over Europe. To achieve this goal, the capacity of CNNs to reproduce the local variability of precipitation and temperature fields under present climate conditions is first assessed by comparing their performance with that of a set of traditional benchmark statistical methods. Subsequently, their suitability for producing plausible future (up to 2100) high-resolution scenarios is put to the test by comparing their projected signals of change with those given by a set of state-of-the-art GCMs from CMIP5 and Regional Climate Models (RCMs) from the flagship EURO-CORDEX initiative. In addition, a variety of interpretability analyses are carried out to gain confidence and knowledge about the use of CNNs for climate applications, which until now have typically been discarded for being considered "black boxes".
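
    The sketch below illustrates the statistical downscaling setup described above: coarse large-scale predictor fields (e.g. geopotential and winds at several vertical levels) are mapped by a CNN to temperature or precipitation at a set of local sites. It is a minimal illustration assuming PyTorch; the number of predictors, grid size, and layer configuration are assumptions, not the configuration used in the thesis.

    # Minimal sketch (PyTorch assumed) of CNN-based statistical downscaling:
    # coarse predictor fields in, values at local grid points out.
    import torch
    import torch.nn as nn

    class DownscalingCNN(nn.Module):
        def __init__(self, num_predictors=20, coarse_h=16, coarse_w=32, num_local_points=1000):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(num_predictors, 50, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv2d(50, 25, kernel_size=3, padding=1),
                nn.ReLU(),
            )
            # Dense layer mapping the learned spatial features to every local site.
            self.head = nn.Linear(25 * coarse_h * coarse_w, num_local_points)

        def forward(self, predictors):
            # predictors: (B, num_predictors, coarse_h, coarse_w), standardized fields.
            x = self.features(predictors).flatten(1)
            return self.head(x)  # (B, num_local_points), e.g. daily temperature

    # Example: a batch of 8 days over a 16x32 coarse grid with 20 predictor fields.
    model = DownscalingCNN()
    local_temp = model(torch.randn(8, 20, 16, 32))
    print(local_temp.shape)  # torch.Size([8, 1000])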

    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently answer these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models through sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling.

    Advances in Monocular Exemplar-based Human Body Pose Analysis: Modeling, Detection and Tracking

    This thesis contributes to the analysis of human body pose from image sequences acquired with a single camera. This topic has a wide range of potential applications in video surveillance, video games and biomedical applications. Exemplar-based techniques have been successful; however, their accuracy depends on the similarity of the camera viewpoint and the scene properties between the training and test images. Given a training dataset captured with a reduced number of fixed cameras parallel to the ground, three possible scenarios with increasing levels of difficulty have been identified and analysed: 1) a static camera parallel to the ground, 2) a fixed surveillance camera with a considerably different viewing angle, and 3) a video sequence captured with a moving camera, or simply a single static image.