78 research outputs found

    Semi-automatic video object segmentation for multimedia applications

    Get PDF
    A semi-automatic video object segmentation tool is presented for segmenting both still pictures and image sequences. The approach comprises both automatic segmentation algorithms and manual user interaction. The still image segmentation component is comprised of a conventional spatial segmentation algorithm (Recursive Shortest Spanning Tree (RSST)), a hierarchical segmentation representation method (Binary Partition Tree (BPT)), and user interaction. An initial segmentation partition of homogeneous regions is created using RSST. The BPT technique is then used to merge these regions and hierarchically represent the segmentation in a binary tree. The semantic objects are then manually built by selectively clicking on image regions. A video object-tracking component enables image sequence segmentation, and this subsystem is based on motion estimation, spatial segmentation, object projection, region classification, and user interaction. The motion between the previous frame and the current frame is estimated, and the previous object is then projected onto the current partition. A region classification technique is used to determine which regions in the current partition belong to the projected object. User interaction is allowed for object re-initialisation when the segmentation results become inaccurate. The combination of all these components enables offline video sequence segmentation. The results presented on standard test sequences illustrate the potential use of this system for object-based coding and representation of multimedia

    Stochastic Performance Throttling for Multicore Architectures under Spatial and Temporal Dependencies

    Get PDF

    Low bit-rate image sequence coding

    Get PDF

    Adaptive video delivery using semantics

    Get PDF
    The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedbacks between an object partition and a region partition are used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. The metadata-based representation of object's shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. At last, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications

    A reduced reference video quality assessment method for provision as a service over SDN/NFV-enabled networks

    Get PDF
    139 p.The proliferation of multimedia applications and services has generarted a noteworthy upsurge in network traffic regarding video content and has created the need for trustworthy service quality assessment methods. Currently, predominent position among the technological trends in telecommunication networkds are Network Function Virtualization (NFV), Software Defined Networking (SDN) and 5G mobile networks equipped with small cells. Additionally Video Quality Assessment (VQA) methods are a very useful tool for both content providers and network operators, to understand of how users perceive quality and this study the feasibility of potential services and adapt the network available resources to satisfy the user requirements

    A reduced reference video quality assessment method for provision as a service over SDN/NFV-enabled networks

    Get PDF
    139 p.The proliferation of multimedia applications and services has generarted a noteworthy upsurge in network traffic regarding video content and has created the need for trustworthy service quality assessment methods. Currently, predominent position among the technological trends in telecommunication networkds are Network Function Virtualization (NFV), Software Defined Networking (SDN) and 5G mobile networks equipped with small cells. Additionally Video Quality Assessment (VQA) methods are a very useful tool for both content providers and network operators, to understand of how users perceive quality and this study the feasibility of potential services and adapt the network available resources to satisfy the user requirements

    Layer-based coding, smoothing, and scheduling of low-bit-rate video for teleconferencing over tactical ATM networks

    Get PDF
    This work investigates issues related to distribution of low bit rate video within the context of a teleconferencing application deployed over a tactical ATM network. The main objective is to develop mechanisms that support transmission of low bit rate video streams as a series of scalable layers that progressively improve quality. The hierarchical nature of the layered video stream is actively exploited along the transmission path from the sender to the recipients to facilitate transmission. A new layered coder design tailored to video teleconferencing in the tactical environment is proposed. Macroblocks selected due to scene motion are layered via subband decomposition using the fast Haar transform. A generalized layering scheme groups the subbands to form an arbitrary number of layers. As a layering scheme suitable for low motion video is unsuitable for static slides, the coder adapts the layering scheme to the video content. A suboptimal rate control mechanism that reduces the kappa dimensional rate distortion problem resulting from the use of multiple quantizers tailored to each layer to a 1 dimensional problem by creating a single rate distortion curve for the coder in terms of a suboptimal set of kappa dimensional quantizer vectors is investigated. Rate control is thus simplified into a table lookup of a codebook containing the suboptimal quantizer vectors. The rate controller is ideal for real time video and limits fluctuations in the bit stream with no corresponding visible fluctuations in perceptual quality. A traffic smoother prior to network entry is developed to increase queuing and scheduler efficiency. Three levels of smoothing are studied: frame, layer, and cell interarrival. Frame level smoothing occurs via rate control at the application. Interleaving and cell interarrival smoothing are accomplished using a leaky bucket mechanism inserted prior to the adaptation layer or within the adaptation layerhttp://www.archive.org/details/layerbasedcoding00parkLieutenant Commander, United States NavyApproved for public release; distribution is unlimited

    Dynamic Resource Management of Network-on-Chip Platforms for Multi-stream Video Processing

    Get PDF
    This thesis considers resource management in the context of parallel multiple video stream decoding, on multicore/many-core platforms. Such platforms have tens or hundreds of on-chip processing elements which are connected via a Network-on-Chip (NoC). Inefficient task allocation configurations can negatively affect the communication cost and resource contention in the platform, leading to predictability and performance issues. Efficient resource management for large-scale complex workloads is considered a challenging research problem; especially when applications such as video streaming and decoding have dynamic and unpredictable workload characteristics. For these type of applications, runtime heuristic-based task mapping techniques are required. As the application and platform size increase, decentralised resource management techniques are more desirable to overcome the reliability and performance bottlenecks in centralised management. In this work, several heuristic-based runtime resource management techniques, targeting real-time video decoding workloads are proposed. Firstly, two admission control approaches are proposed; one fully deterministic and highly predictable; the other is heuristic-based, which balances predictability and performance. Secondly, a pair of runtime task mapping schemes are presented, which make use of limited known application properties, communication cost and blocking-aware heuristics. Combined with the proposed deterministic admission controller, these techniques can provide strict timing guarantees for hard real-time streams whilst improving resource usage. The third contribution in this thesis is a distributed, bio-inspired, low-overhead, task re-allocation technique, which is used to further improve the timeliness and workload distribution of admitted soft real-time streams. Finally, this thesis explores parallelisation and resource management issues, surrounding soft real-time video streams that have been encoded using complex encoding tools and modern codecs such as High Efficiency Video Coding (HEVC). Properties of real streams and decoding trace data are analysed, to statistically model and generate synthetic HEVC video decoding workloads. These workloads are shown to have complex and varying task dependency structures and resource requirements. To address these challenges, two novel runtime task clustering and mapping techniques for Tile-parallel HEVC decoding are proposed. These strategies consider the workload communication to computation ratio and stream-specific characteristics to balance predictability improvement and communication energy reduction. Lastly, several task to memory controller port assignment schemes are explored to alleviate performance bottlenecks, resulting from memory traffic contention

    Energy efficient enabling technologies for semantic video processing on mobile devices

    Get PDF
    Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object based paradigm has many undeniable benefits, numerous technical challenges remain before the applications becomes pervasive, particularly on computational constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery powered mobile computing devices, the additional algorithmic complexity of semantic object based processing compared to conventional video processing is highly undesirable both from a real-time operation and battery life perspective. This thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to be offloaded from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object based shape encoding, a novel energy efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourable with the relevant prior art
    corecore