42 research outputs found

    A deep evaluator for image retargeting quality by geometrical and contextual interaction

    When an image is displayed on devices of different sizes and aspect ratios, it is compressed or stretched, which strongly degrades perceived quality. A variety of image retargeting methods have been proposed to address this problem, but how to evaluate the results of different retargeting methods remains a critical issue. Subjective evaluation cannot be applied at scale in practical application systems, so we frame the problem as one of accurate objective quality evaluation. Most current image retargeting quality assessment algorithms use simple regression as the final step to obtain the evaluation result, which does not correspond to how perception is simulated in the human visual system (HVS). In this paper, a deep quality evaluator for image retargeting based on a segmented stacked autoencoder (SAE) is proposed. With the help of regularization, the designed deep learning framework avoids overfitting. The main contribution of this framework is to simulate the perception of retargeted images in the HVS: it trains two separate SAE models, one based on geometric shape and one on content matching, and then combines the scores obtained from the two models with a weighting scheme. Experimental results on three well-known databases show that our method outperforms traditional methods in evaluating different image retargeting results.
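The fusion step described in the abstract, combining the scores of the two branch models with a weighting scheme, can be sketched as follows. The weight value and the stand-in scores are illustrative assumptions, not the paper's trained models:

```python
# Sketch: fuse two branch scores with a fixed weighting scheme.
# In the paper, the scores would come from the two trained SAE
# models (geometric shape and content matching); here they are
# stand-in numbers for illustration.

def fuse_scores(geo_score, content_score, w_geo=0.5):
    """Weighted combination of the two branch scores (assumed scheme)."""
    if not 0.0 <= w_geo <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return w_geo * geo_score + (1.0 - w_geo) * content_score

print(fuse_scores(0.8, 0.6, w_geo=0.4))  # 0.4*0.8 + 0.6*0.6 = 0.68
```

In practice the weight would be tuned on a benchmark with subjective scores rather than fixed by hand.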

    Ranking-preserving cross-source learning for image retargeting quality assessment

    Image retargeting techniques adjust images to different sizes and have attracted much attention recently. Objective quality assessment (OQA) of image retargeting results is often desired in order to automatically select the best result. Existing OQA methods train a model on benchmarks (e.g., RetargetMe) in which subjective scores evaluated by users are provided. Observing that it is challenging even for human subjects to give consistent scores to retargeting results of different source images (diff-source-results), in this paper we propose a learning-based OQA method that trains a General Regression Neural Network (GRNN) model on relative scores, which preserve the ranking, of retargeting results of the same source image (same-source-results). In particular, we develop a novel training scheme with provable convergence that learns a common base scalar for same-source-results. With this source-specific offset, our computed scores not only preserve the ranking of subjective scores for same-source-results but also provide a reference for comparing diff-source-results. We train and evaluate our GRNN model using human preference data collected in RetargetMe. We further introduce a subjective benchmark to evaluate the generalizability of different OQA methods. Experimental results demonstrate that our method outperforms ten representative OQA methods in ranking prediction and generalizes better to different datasets.
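The source-specific offset idea, a common base scalar per source image that makes same-source relative scores comparable across sources, can be illustrated with a simple least-squares sketch. The paper's actual training scheme (with provable convergence, inside GRNN training) is more involved; the data and names here are hypothetical:

```python
# Sketch: estimate one base offset per source image so that
# (offset[src] + relative_score) approximates an absolute score.
# Adding a constant per source leaves the within-source ranking
# untouched, which is the property the paper relies on.

def fit_offsets(relative, reference):
    """relative: {source: [relative scores]},
    reference: {source: [absolute scores]}.
    The least-squares offset per source is the mean residual."""
    offsets = {}
    for src, rel in relative.items():
        ref = reference[src]
        offsets[src] = sum(a - r for a, r in zip(ref, rel)) / len(rel)
    return offsets

rel = {"img_a": [0.0, 0.3], "img_b": [0.0, 0.5]}
ref = {"img_a": [2.0, 2.3], "img_b": [3.1, 3.6]}
print(fit_offsets(rel, ref))
```

With the offsets applied, scores of results from different source images land on a common scale and can be compared directly.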

    Learning to rank retargeted images

    Image retargeting techniques that adjust images to different sizes have attracted much attention recently. Objective quality assessment (OQA) of image retargeting results is often desired in order to automatically select the best result. Existing OQA methods output an absolute score for each retargeted image and use these scores to compare different results. Observing that it is challenging even for human subjects to give consistent scores to retargeting results of different source images, in this paper we propose a learning-based OQA method that predicts the ranking of a set of retargeted images sharing the same source image. We show that this more manageable task yields predictions more consistent with human preference and is sufficient for most application scenarios. To compute the ranking, we propose a simple yet efficient machine learning framework that uses a General Regression Neural Network (GRNN) to model a combination of seven elaborate OQA metrics. We then propose a simple scheme to transform the relative scores output by the GRNN into a global ranking. We train our GRNN model using human preference data collected in the elaborate RetargetMe benchmark and evaluate our method against the subjective study in RetargetMe. Moreover, we introduce a further subjective benchmark to evaluate the generalizability of different OQA methods. Experimental results demonstrate that our method outperforms eight representative OQA methods in ranking prediction and generalizes better to different datasets.
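A GRNN is essentially Nadaraya-Watson kernel regression: the prediction is a Gaussian-weighted average of the training targets. The sketch below shows that core computation, plus one plausible reading of the final step, turning relative scores into a global ranking by sorting. The seven OQA metric values would form the input vector; all data here is made up for illustration:

```python
import math

# Minimal GRNN (Nadaraya-Watson kernel regression) sketch: predict
# a score as a Gaussian-weighted average of training targets.
def grnn_predict(x, train_x, train_y, sigma=1.0):
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi))
                        / (2 * sigma ** 2)) for xi in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

# One simple way to turn relative scores into a global ranking:
# sort image indices by descending score.
def to_ranking(scores):
    return sorted(range(len(scores)), key=lambda i: -scores[i])

train_x = [[0.1, 0.2], [0.9, 0.8]]   # toy 2-D metric vectors
train_y = [0.0, 1.0]                 # toy preference targets
print(grnn_predict([0.85, 0.75], train_x, train_y))
print(to_ranking([0.2, 0.9, 0.5]))   # [1, 2, 0]
```

The smoothing parameter sigma controls how far training examples influence a query; in a real GRNN it is the single hyperparameter to tune.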

    Motion-based video retargeting with optimized crop-and-warp


    Overcoming the reality gap: imitation learning and reinforcement learning algorithms for bipedal robot locomotion problems

    This thesis introduces a comprehensive robot training framework that utilizes artificial learning techniques to optimize robot performance in complex tasks. Motivated by recent impressive achievements in machine learning, particularly in games and virtual scenarios, the project aims to explore the potential of these techniques for improving robot capabilities beyond traditional human programming, despite the limitations imposed by the reality gap. The case study selected for this investigation is bipedal locomotion, as it allows for elucidating the key challenges and advantages of using artificial learning methods for robot learning. The thesis identifies four primary challenges in this context: the variability of results obtained from artificial learning algorithms, the high cost and risk associated with conducting experiments on real robots, the reality gap between simulation and real-world behavior, and the need to adapt human motion patterns to robotic systems. The proposed approach consists of three main modules that address these challenges: Non-linear Control Approaches, Imitation Learning, and Reinforcement Learning. The Non-linear Control module establishes a foundation by modeling robots and employing well-established control techniques.
    The Imitation Learning module generates initial policies from reference motion capture data or preliminary policy results to create feasible, human-like gait patterns. The Reinforcement Learning module complements the process by iteratively improving parametric policies, primarily in simulation but with real-world performance as the ultimate goal. The thesis emphasizes the modularity of the approach, allowing individual modules to be deployed separately or in combination to determine the most effective strategy for different robot training scenarios. By combining established control techniques, imitation learning, and reinforcement learning, the framework seeks to unlock the potential for robots to achieve optimized performance in complex tasks, contributing to the advancement of artificial intelligence in robotics, not only in virtual systems but in real ones.

    Combining content analysis with usage analysis to better understand visual contents

    This thesis focuses on the problem of understanding visual contents, which can be images, videos or 3D contents. Understanding means that we aim at inferring semantic information about the visual content. The goal of our work is to study methods that combine two types of approaches: 1) automatic content analysis and 2) analysis of how humans interact with the content (in other words, usage analysis). We start by reviewing the state of the art in both the Computer Vision and Multimedia communities. Twenty years ago, the main approach aimed at a fully automatic understanding of images. This approach has today given way to different forms of human intervention, whether through the constitution of annotated datasets, through solving problems interactively (e.g. detection or segmentation), or through the implicit collection of information gathered from content usage. These different types of human intervention are at the heart of modern research questions: how to motivate human contributors? How to design interactive scenarios that will generate interactions that contribute to content understanding? How to check or ensure the quality of human contributions? How to aggregate human contributions? How to fuse inputs obtained from usage analysis with traditional outputs of content analysis? Our literature review addresses these questions and allows us to position the contributions of this thesis. In our first set of contributions we revisit the detection of important (or salient) regions through implicit feedback from users that either consume or produce visual contents. In 2D, we develop several interactive video interfaces (e.g. zoomable video) in order to coordinate content analysis and usage analysis. We also generalize these results to 3D by introducing a new detector of salient regions that builds upon simultaneous video recordings of the same public artistic performance (dance show, chant, etc.) by multiple users.
    The second contribution of our work aims at a semantic understanding of still images. With this goal in mind, we use data gathered through a game, Ask'nSeek, that we created. Elementary interactions (such as clicks) together with textual input from players are, as before, combined with automatic analysis of the images. In particular, we show the usefulness of interactions that help reveal spatial relations between different objects in a scene. After studying the problem of detecting objects in a scene, we also address the more ambitious problem of segmentation.

    Systems design study of the Pioneer Venus spacecraft. Volume 1. Technical analyses and tradeoffs, sections 1-4 (part 1 of 4)

    The results of the Pioneer Venus studies conducted from 2 October 1972 through 30 June 1973 are reported. Many missions were considered, involving two launch vehicles (Thor/Delta and Atlas/Centaur) and different launch opportunities and spacecraft configurations to meet varying science requirements, all at minimum cost. The sequence of events is described and the specific studies conducted are summarized. The effects of the science payload on mission and spacecraft design are discussed, along with the mission analyses.

    PyrSat - Prevention and response to wild fires with an intelligent Earth observation CubeSat

    Forest fires are a pervasive and serious problem. Besides loss of life and extensive environmental damage, fires also cause substantial economic losses, not to mention property damage, injuries, displacement, and hardship for the affected citizens. This project proposes a low-cost, intelligent, hyperspectral 3U CubeSat for the production of fire risk and burnt area maps. It applies machine learning algorithms to autonomously process images and obtain final data products on board the satellite for direct transmission to users on the ground. Used in combination with other services such as EFFIS or AFIS, the system could considerably reduce the extent and consequences of forest fires.
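The abstract does not specify which algorithms run on board. As one illustration of the kind of burnt-area processing such a hyperspectral payload enables, the sketch below computes the Normalized Burn Ratio (NBR), a standard index derived from near-infrared (NIR) and shortwave-infrared (SWIR) reflectances; it is named here as a classical example, not as the project's actual method:

```python
# Sketch: Normalized Burn Ratio (NBR), a standard burnt-area index.
# NBR = (NIR - SWIR) / (NIR + SWIR); healthy vegetation reflects
# strongly in NIR, while burnt surfaces reflect more in SWIR, so
# low or negative NBR values indicate burnt areas.

def nbr(nir, swir):
    """Compute NBR from NIR and SWIR reflectances (0..1)."""
    return (nir - swir) / (nir + swir)

print(nbr(0.2, 0.6))  # burnt surface: strongly negative
print(nbr(0.6, 0.2))  # healthy vegetation: positive
```

On board, an index map like this would be thresholded or fed to a classifier before only the compact final product is downlinked.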