2,825 research outputs found

    Algorithms & implementation of advanced video coding standards

    Get PDF
    Advanced video coding standards have become widely deployed coding techniques used in numerous products, such as broadcast, video conference, mobile television and blu-ray disc, etc. New compression techniques are gradually included in video coding standards so that a 50% compression rate reduction is achievable every five years. However, the trend also has brought many problems, such as, dramatically increased computational complexity, co-existing multiple standards and gradually increased development time. To solve the above problems, this thesis intends to investigate efficient algorithms for the latest video coding standard, H.264/AVC. Two aspects of H.264/AVC standard are inspected in this thesis: (1) Speeding up intra4x4 prediction with parallel architecture. (2) Applying an efficient rate control algorithm based on deviation measure to intra frame. Another aim of this thesis is to work on low-complexity algorithms for MPEG-2 to H.264/AVC transcoder. Three main mapping algorithms and a computational complexity reduction algorithm are focused by this thesis: motion vector mapping, block mapping, field-frame mapping and efficient modes ranking algorithms. Finally, a new video coding framework methodology to reduce development time is examined. This thesis explores the implementation of MPEG-4 simple profile with the RVC framework. A key technique of automatically generating variable length decoder table is solved in this thesis. Moreover, another important video coding standard, DV/DVCPRO, is further modeled by RVC framework. Consequently, besides the available MPEG-4 simple profile and China audio/video standard, a new member is therefore added into the RVC framework family. A part of the research work presented in this thesis is targeted algorithms and implementation of video coding standards. In the wide topic, three main problems are investigated. The results show that the methodologies presented in this thesis are efficient and encourage

    Resource Management in Distributed Camera Systems

    Get PDF
    The aim of this work is to investigate different methods to solve the problem of allocating the correct amount of resources (network bandwidth and storage space) to video camera systems. Here we explore the intersection between two research areas: automatic control and game theory. Camera systems are a good example of the emergence of the Internet of Things (IoT) and its impact on our daily lives and the environment. We aim to improve today’s systems, shift from resources over-provisioning to allocate dynamically resources where they are needed the most. We optimize the storage and bandwidth allocation of camera systems to limit the impact on the environment as well as provide the best visual quality attainable with the resource limitations. This thesis is written as a collection of papers. It begins by introducing the problem with today’s camera systems, and continues with background information about resource allocation, automatic control and game theory. The third chapter de- scribes the models of the considered systems, their limitations and challenges. It then continues by providing more background on the automatic control and game theory techniques used in the proposed solutions. Finally, the proposed solutions are provided in five papers.Paper I proposes an approach to estimate the amount of data needed by surveillance cameras given camera and scenario parameters. This model is used for calculating the quasi Worst-Case Transmission Times of videos over a network. Papers II and III apply control concepts to camera network storage and bandwidth assignment. They provide simple, yet elegant solutions to the allocation of these resources in distributed camera systems. Paper IV com- bines pricing theory with control techniques to force the video quality of cam- era systems to converge to a common value based solely on the compression parameter of the provided videos. Paper V uses the VCG auction mechanism to solve the storage space allocation problem in competitive camera systems. It allows for a better system-wide visual quality than a simple split allocation given the limited system knowledge, trust and resource constraints

    Cubic-panorama image dataset analysis for storage and transmission

    Full text link

    Object Tracking

    Get PDF
    Object tracking consists in estimation of trajectory of moving objects in the sequence of images. Automation of the computer object tracking is a difficult task. Dynamics of multiple parameters changes representing features and motion of the objects, and temporary partial or full occlusion of the tracked objects have to be considered. This monograph presents the development of object tracking algorithms, methods and systems. Both, state of the art of object tracking methods and also the new trends in research are described in this book. Fourteen chapters are split into two sections. Section 1 presents new theoretical ideas whereas Section 2 presents real-life applications. Despite the variety of topics contained in this monograph it constitutes a consisted knowledge in the field of computer object tracking. The intention of editor was to follow up the very quick progress in the developing of methods as well as extension of the application

    Robust density modelling using the student's t-distribution for human action recognition

    Full text link
    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model since it is highly sensitive to outliers. The Gaussian distribution is also often used as base component of graphical models for recognising human actions in the videos (hidden Markov model and others) and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show how experiments over two well-known datasets (Weizmann, MuHAVi) reported a remarkable improvement in classification accuracy. © 2011 IEEE

    Adapting Computer Vision Models To Limitations On Input Dimensionality And Model Complexity

    Get PDF
    When considering instances of distributed systems where visual sensors communicate with remote predictive models, data traffic is limited to the capacity of communication channels, and hardware limits the processing of collected data prior to transmission. We study novel methods of adapting visual inference to limitations on complexity and data availability at test time, wherever the aforementioned limitations exist. Our contributions detailed in this thesis consider both task-specific and task-generic approaches to reducing the data requirement for inference, and evaluate our proposed methods on a wide range of computer vision tasks. This thesis makes four distinct contributions: (i) We investigate multi-class action classification via two-stream convolutional neural networks that directly ingest information extracted from compressed video bitstreams. We show that selective access to macroblock motion vector information provides a good low-dimensional approximation of the underlying optical flow in visual sequences. (ii) We devise a bitstream cropping method by which AVC/H.264 and H.265 bitstreams are reduced to the minimum amount of necessary elements for optical flow extraction, while maintaining compliance with codec standards. We additionally study the effect of codec rate-quality control on the sparsity and noise incurred on optical flow derived from resulting bitstreams, and do so for multiple coding standards. (iii) We demonstrate degrees of variability in the amount of data required for action classification, and leverage this to reduce the dimensionality of input volumes by inferring the required temporal extent for accurate classification prior to processing via learnable machines. (iv) We extend the Mixtures-of-Experts (MoE) paradigm to adapt the data cost of inference for any set of constituent experts. We postulate that the minimum acceptable data cost of inference varies for different input space partitions, and consider mixtures where each expert is designed to meet a different set of constraints on input dimensionality. To take advantage of the flexibility of such mixtures in processing different input representations and modalities, we train biased gating functions such that experts requiring less information to make their inferences are favoured to others. We finally note that, our proposed data utility optimization solutions include a learnable component which considers specified priorities on the amount of information to be used prior to inference, and can be realized for any combination of tasks, modalities, and constraints on available data

    Slice-Level Trading of Quality and Performance in Decoding H.264 Video: Slice-basiertes Abwägen zwischen Qualität und Leistung beim Dekodieren von H.264-Video

    Get PDF
    When a demanding video decoding task requires more CPU resources then available, playback degrades ungracefully today: The decoder skips frames selected arbitrarily or by simple heuristics, which is noticed by the viewer as jerky motion in the good case or as images completely breaking up in the bad case. The latter can happen due to missing reference frames. This thesis provides a way to schedule individual decoding tasks based on a cost for performance trade. Therefore, I will present a way to preprocess a video, generating estimates for the cost in terms of execution time and the performance in terms of perceived visual quality. The granularity of the scheduling decision is a single slice, which leads to a much more fine-grained approach than dealing with entire frames. Together with an actual scheduler implementation that uses the generated estimates, this work allows for higher perceived quality video playback in case of CPU overload.Wenn eine anspruchsvolle Video-Dekodierung mehr Prozessor-Ressourcen benötigt, als verfügbar sind, dann verschlechtert sich die Abspielqualität mit aktuellen Methoden drastisch: Willkürlich oder mit einfachen Heuristiken ausgewählten Bilder werden nicht dekodiert. Diese Auslassung nimmt der Betrachter im günstigsten Fall nur als ruckelnde Bewegung wahr, im ungünstigen Fall jedoch als komplettes Zusammenbrechen nachfolgender Bilder durch Folgefehler im Dekodierprozess. Meine Arbeit ermöglicht es, einzelne Teilaufgaben des Dekodierprozesses anhand einer Kosten-Nutzen-Analyse einzuplanen. Dafür ermittle ich die Kosten im Sinne von Rechenzeitbedarf und den Nutzen im Sinne von visueller Qualität für einzelne Slices eines H.264 Videos. Zusammen mit einer Implementierung eines Schedulers, der diese Werte nutzt, erlaubt meine Arbeit höhere vom Betrachter wahrgenommene Videoqualität bei knapper Prozessorzeit

    Character Recognition

    Get PDF
    Character recognition is one of the pattern recognition technologies that are most widely used in practical applications. This book presents recent advances that are relevant to character recognition, from technical topics such as image processing, feature extraction or classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field

    Enhancing Mobile Capacity through Generic and Efficient Resource Sharing

    Get PDF
    Mobile computing devices are becoming indispensable in every aspect of human life, but diverse hardware limits make current mobile devices far from ideal for satisfying the performance requirements of modern mobile applications and being used anytime, anywhere. Mobile Cloud Computing (MCC) could be a viable solution to bypass these limits which enhances the mobile capacity through cooperative resource sharing, but is challenging due to the heterogeneity of mobile devices in both hardware and software aspects. Traditional schemes either restrict to share a specific type of hardware resource within individual applications, which requires tremendous reprogramming efforts; or disregard the runtime execution pattern and transmit too much unnecessary data, resulting in bandwidth and energy waste.To address the aforementioned challenges, we present three novel designs of resource sharing frameworks which utilize the various system resources from a remote or personal cloud to enhance the mobile capacity in a generic and efficient manner. First, we propose a novel method-level offloading methodology to run the mobile computational workload on the remote cloud CPU. Minimized data transmission is achieved during such offloading by identifying and selectively migrating the memory contexts which are necessary to the method execution. Second, we present a systematic framework to maximize the mobile performance of graphics rendering with the remote cloud GPU, during which the redundant pixels across consecutive frames are reused to reduce the transmitted frame data. Last, we propose to exploit the unified mobile OS services and generically interconnect heterogeneous mobile devices towards a personal mobile cloud, which complement and flexibly share mobile peripherals (e.g., sensors, camera) with each other

    Heart Rate Estimation During Physical Exercise Using Wrist-Type Ppg Sensors

    Get PDF
    Accurate heart rate monitoring during intense physical exercise is a challenging problem due to the high levels of motion artifacts (MA) in photoplethysmography (PPG) sensors. PPG is a non-invasive optical sensor that is being used in wearable devices to measure blood flow changes using the property of light reflection and absorption, allowing the extraction of vital signals such as the heart rate (HR). However, the sensor is susceptible to MA which increases during physical activity. This occurs since the frequency range of movement and HR overlaps, difficulting correct HR estimation. For this reason, MA removal has remained an active topic under research. Several approaches have been developed in the recent past and among these, a Kalman filter (KF) based approach showed promising results for an accurate estimation and tracking using PPG sensors. However, this previous tracker was demonstrated for a particular dataset, with manually tuned parameters. Moreover, such trackers do not account for the correct method for fusing data. Such a custom approach might not perform accurately in practical scenarios, where the amount of MA and the heart rate variability (HRV) depend on numerous, unpredictable factors. Thus, an approach to automatically tune the KF based on the Expectation-Maximization (EM) algorithm, with a measurement fusion approach is developed. The applicability of such a method is demonstrated using an open-source PPG database, as well as a developed synthetic generation tool that models PPG and accelerometer (ACC) signals during predetermined physical activities
    corecore