473 research outputs found

    Toward sparse and geometry adapted video approximations

    Get PDF
    Video signals are sequences of natural images, where images are often modeled as piecewise-smooth signals. Hence, video can be seen as a 3D piecewise-smooth signal made of piecewise-smooth regions that move through time. Based on the piecewise-smooth model and on related theoretical work on rate-distortion performance of wavelet and oracle based coding schemes, one can better analyze the appropriate coding strategies that adaptive video codecs need to implement in order to be efficient. Efficient video representations for coding purposes require the use of adaptive signal decompositions able to capture appropriately the structure and redundancy appearing in video signals. Adaptivity needs to be such that it allows for proper modeling of signals in order to represent these with the lowest possible coding cost. Video is a very structured signal with high geometric content. This includes temporal geometry (normally represented by motion information) as well as spatial geometry. Clearly, most of past and present strategies used to represent video signals do not exploit properly its spatial geometry. Similarly to the case of images, a very interesting approach seems to be the decomposition of video using large over-complete libraries of basis functions able to represent salient geometric features of the signal. In the framework of video, these features should model 2D geometric video components as well as their temporal evolution, forming spatio-temporal 3D geometric primitives. Through this PhD dissertation, different aspects on the use of adaptivity in video representation are studied looking toward exploiting both aspects of video: its piecewise nature and the geometry. The first part of this work studies the use of localized temporal adaptivity in subband video coding. This is done considering two transformation schemes used for video coding: 3D wavelet representations and motion compensated temporal filtering. A theoretical R-D analysis as well as empirical results demonstrate how temporal adaptivity improves coding performance of moving edges in 3D transform (without motion compensation) based video coding. Adaptivity allows, at the same time, to equally exploit redundancy in non-moving video areas. The analogy between motion compensated video and 1D piecewise-smooth signals is studied as well. This motivates the introduction of local length adaptivity within frame-adaptive motion compensated lifted wavelet decompositions. This allows an optimal rate-distortion performance when video motion trajectories are shorter than the transformation "Group Of Pictures", or when efficient motion compensation can not be ensured. After studying temporal adaptivity, the second part of this thesis is dedicated to understand the fundamentals of how can temporal and spatial geometry be jointly exploited. This work builds on some previous results that considered the representation of spatial geometry in video (but not temporal, i.e, without motion). In order to obtain flexible and efficient (sparse) signal representations, using redundant dictionaries, the use of highly non-linear decomposition algorithms, like Matching Pursuit, is required. General signal representation using these techniques is still quite unexplored. For this reason, previous to the study of video representation, some aspects of non-linear decomposition algorithms and the efficient decomposition of images using Matching Pursuits and a geometric dictionary are investigated. A part of this investigation concerns the study on the influence of using a priori models within approximation non-linear algorithms. Dictionaries with a high internal coherence have some problems to obtain optimally sparse signal representations when used with Matching Pursuits. It is proved, theoretically and empirically, that inserting in this algorithm a priori models allows to improve the capacity to obtain sparse signal approximations, mainly when coherent dictionaries are used. Another point discussed in this preliminary study, on the use of Matching Pursuits, concerns the approach used in this work for the decompositions of video frames and images. The technique proposed in this thesis improves a previous work, where authors had to recur to sub-optimal Matching Pursuit strategies (using Genetic Algorithms), given the size of the functions library. In this work the use of full search strategies is made possible, at the same time that approximation efficiency is significantly improved and computational complexity is reduced. Finally, a priori based Matching Pursuit geometric decompositions are investigated for geometric video representations. Regularity constraints are taken into account to recover the temporal evolution of spatial geometric signal components. The results obtained for coding and multi-modal (audio-visual) signal analysis, clarify many unknowns and show to be promising, encouraging to prosecute research on the subject

    Multimedia tools for the creation of online learning materials : a critique

    Get PDF

    Artificial Intelligence in the Creative Industries: A Review

    Full text link
    This paper reviews the current state of the art in Artificial Intelligence (AI) technologies and applications in the context of the creative industries. A brief background of AI, and specifically Machine Learning (ML) algorithms, is provided including Convolutional Neural Network (CNNs), Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement Learning (DRL). We categorise creative applications into five groups related to how AI technologies are used: i) content creation, ii) information analysis, iii) content enhancement and post production workflows, iv) information extraction and enhancement, and v) data compression. We critically examine the successes and limitations of this rapidly advancing technology in each of these areas. We further differentiate between the use of AI as a creative tool and its potential as a creator in its own right. We foresee that, in the near future, machine learning-based AI will be adopted widely as a tool or collaborative assistant for creativity. In contrast, we observe that the successes of machine learning in domains with fewer constraints, where AI is the `creator', remain modest. The potential of AI (or its developers) to win awards for its original creations in competition with human creatives is also limited, based on contemporary technologies. We therefore conclude that, in the context of creative industries, maximum benefit from AI will be derived where its focus is human centric -- where it is designed to augment, rather than replace, human creativity

    CWI Self-evaluation 1999-2004

    Get PDF

    Activity recognition in naturalistic environments using body-worn sensors

    Get PDF
    Phd ThesisThe research presented in this thesis investigates how deep learning and feature learning can address challenges that arise for activity recognition systems in naturalistic, ecologically valid surroundings such as the private home. One of the main aims of ubiquitous computing is the development of automated recognition systems for human activities and behaviour that are sufficiently robust to be deployed in realistic, in-the-wild environments. In most cases, the targeted application scenarios are people’s daily lives, where systems have to abide by practical usability and privacy constraints. We discuss how these constraints impact data collection and analysis and demonstrate how common approaches to the analysis of movement data effectively limit the practical use of activity recognition systems in every-day surroundings. In light of these issues we develop a novel approach to the representation and modelling of movement data based on a data-driven methodology that has applications in activity recognition, behaviour imaging, and skill assessment in ubiquitous computing. A number of case studies illustrate the suitability of the proposed methods and outline how study design can be adapted to maximise the benefit of these techniques, which show promising performance for clinical applications in particular.SiDE research hu

    Mobile interface to sensor networks

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2008.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (p. 145-150).This thesis presents the Ubicorder. The Ubicorder is a location and orientation aware sensor network browsing and interactive visualization system together with in-situ event denition and identification (termed EDITY). The mobile browser allows real-time viewing of sensor network data, and enables the user to easily establish correlation between physically observed phenomena and their sensor signature. Based on the observed correlations, events are triggered every time a given set of sensor-value conditions are satised. Event rules can then be recursively combined to detect complex physical phenomena. Through a first-use user study, we evaluate the system for its usefulness and usability.by Manas Mittal.S.M

    Conference on Intelligent Robotics in Field, Factory, Service, and Space (CIRFFSS 1994), volume 1

    Get PDF
    The AIAA/NASA Conference on Intelligent Robotics in Field, Factory, Service, and Space (CIRFFSS '94) was originally proposed because of the strong belief that America's problems of global economic competitiveness and job creation and preservation can partly be solved by the use of intelligent robotics, which are also required for human space exploration missions. Individual sessions addressed nuclear industry, agile manufacturing, security/building monitoring, on-orbit applications, vision and sensing technologies, situated control and low-level control, robotic systems architecture, environmental restoration and waste management, robotic remanufacturing, and healthcare applications
    • …
    corecore