58 research outputs found

    No-reference image and video quality assessment: a classification and review of recent approaches

    Adaptive video delivery using semantics

    The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector segments moving objects from the background. This process is robust to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedback between an object partition and a region partition is used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. The metadata-based representation of the objects' shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.
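    A relevance-weighted PSNR of the kind the SPSNR abstract describes could be sketched as follows. This is not the dissertation's definition, which the abstract does not give. The function name `weighted_psnr`, the per-pixel `weights` map (larger values for regions assumed to matter more to the observer) and the normalization by the sum of the weights are all assumptions made for illustration.

```python
import math


def weighted_psnr(reference, adapted, weights, peak=255.0):
    """PSNR where each pixel's squared error is scaled by a relevance weight.

    All three inputs are equally sized 2-D lists of numbers. The weighting
    scheme is an illustrative assumption, not the SPSNR from the abstract.
    """
    weighted_error = 0.0
    weight_sum = 0.0
    for ref_row, ada_row, w_row in zip(reference, adapted, weights):
        for r, a, w in zip(ref_row, ada_row, w_row):
            weighted_error += w * (r - a) ** 2
            weight_sum += w
    if weight_sum == 0:
        raise ValueError("weights must not all be zero")
    wmse = weighted_error / weight_sum  # weighted mean squared error
    if wmse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / wmse)


# Hypothetical 2x2 frame with a single distorted pixel.
reference = [[100, 100], [100, 100]]
adapted = [[100, 110], [100, 100]]
uniform = weighted_psnr(reference, adapted, [[1, 1], [1, 1]])
# Marking the distorted pixel as more relevant lowers the score.
foreground = weighted_psnr(reference, adapted, [[1, 3], [1, 1]])
print(uniform > foreground)
```

    With uniform weights the function reduces to the ordinary PSNR, so the weight map is the only place where observer relevance enters.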

    Multimedia

    The now ubiquitous and effortless digital data capture and processing capabilities offered by most devices have led to an unprecedented penetration of multimedia content in our everyday life. As the volume and usage of digitised content increase rapidly, making the most of this phenomenon requires constant re-evaluation and adaptation of multimedia methodologies in order to meet the relentlessly changing requirements from both the user and system perspectives. Advances in Multimedia provides readers with an overview of the ever-growing field of multimedia by bringing together research studies and surveys from different subfields that highlight these aspects. The main topics covered in this book include multimedia management in peer-to-peer structures and wireless networks, security characteristics in multimedia, semantic gap bridging for multimedia content, and novel multimedia applications.

    Dimensionality reduction and sparse representations in computer vision

    The proliferation of camera-equipped devices, such as netbooks, smartphones and game stations, has led to a significant increase in the production of visual content. This visual information could be used for understanding the environment and offering a natural interface between the users and their surroundings. However, the massive amounts of data and the high computational cost associated with them encumber the transfer of sophisticated vision algorithms to real-life systems, especially ones that exhibit resource limitations such as restrictions in available memory, processing power and bandwidth. One approach for tackling these issues is to generate compact and descriptive representations of image data by exploiting inherent redundancies. We propose the investigation of dimensionality reduction and sparse representations in order to accomplish this task. In dimensionality reduction, the aim is to reduce the dimensions of the space where image data reside in order to allow resource-constrained systems to handle them and, ideally, provide a more insightful description. This goal is achieved by exploiting the inherent redundancies that many classes of images, such as faces under different illumination conditions and objects from different viewpoints, exhibit. We explore the description of natural images by low-dimensional non-linear models called image manifolds and investigate the performance of computer vision tasks such as recognition and classification using these low-dimensional models. In addition to dimensionality reduction, we study a novel approach in representing images as a sparse linear combination of dictionary examples. We investigate how sparse image representations can be used for a variety of tasks including low-level image modeling and higher-level semantic information extraction. Using tools from dimensionality reduction and sparse representation, we propose the application of these methods in three hierarchical image layers, namely low-level features, mid-level structures and high-level attributes. Low-level features are image descriptors that can be extracted directly from the raw image pixels and include pixel intensities, histograms, and gradients. In the first part of this work, we explore how various techniques in dimensionality reduction, ranging from traditional image compression to the recently proposed Random Projections method, affect the performance of computer vision algorithms such as face detection and face recognition. In addition, we discuss a method that is able to increase the spatial resolution of a single image, without using any training examples, according to the sparse representations framework. In the second part, we explore mid-level structures, including image manifolds and sparse models, which are produced by abstracting information from low-level features and offer compact modeling of high-dimensional data. We propose novel techniques for generating more descriptive image representations and investigate their application in face recognition and object tracking. In the third part of this work, we propose the investigation of a novel framework for representing the semantic contents of images. This framework employs high-level semantic attributes that aim to bridge the gap between the visual information of an image and its textual description by utilizing low-level features and mid-level structures. This innovative paradigm offers revolutionary possibilities, including recognizing the category of an object from purely textual information without providing any explicit visual example.
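    The Random Projections method mentioned in this abstract can be illustrated with a minimal sketch, assuming a dense Gaussian projection matrix; the abstract does not say which variant the thesis uses. The names `random_projection_matrix` and `project`, the fixed seed and the toy dimensions are hypothetical.

```python
import math
import random


def random_projection_matrix(in_dim, out_dim, seed=0):
    """Build an out_dim x in_dim matrix of scaled Gaussian entries.

    Scaling by 1/sqrt(out_dim) keeps squared lengths unchanged in expectation.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(matrix, vector):
    """Map a high-dimensional vector to the lower-dimensional space."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Hypothetical example: a flattened 8x8 image patch reduced to 16 values.
patch = [float(i % 7) for i in range(64)]
matrix = random_projection_matrix(in_dim=64, out_dim=16)
compact = project(matrix, patch)
print(len(compact))
```

    The projection is data-independent, so no training images are needed to build the matrix; how well distances between images are preserved depends on the output dimension chosen.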

    Primitives and design of the intelligent pixel multimedia communicator

    Communication systems are an ever more essential component of our modern global society. Mobile communications systems are still in a state of rapid advancement and growth. Technology is constantly evolving at a rapid pace in ever more diverse areas, and the emerging mobile multimedia-based communication systems offer new challenges for both current and future technologies. To realise the full potential of mobile multimedia communication systems, there is a need to explore new options to solve some of the fundamental problems facing the technology. In particular, the complexity of such a system, within an infrastructure framework that is inherently limited by its power sources and has very restricted transmission bandwidth, demands new methodologies and approaches.

    Cross-layer Optimization for Video Delivery over Wireless Networks

    As video streaming is becoming the most popular application of the mobile Internet, the design and optimization of video communications over wireless networks is attracting increasing attention from both academia and industry. The main challenges are to enhance quality of service support and to dynamically adapt the transmitted video streams to the network conditions. Cross-layer methods, i.e., the exchange of information among different layers of the system, are one of the key concepts to be exploited to achieve these goals. In this thesis we propose novel cross-layer optimization frameworks for scalable video coding (SVC) delivery and for HTTP adaptive streaming (HAS) applications over the downlink and the uplink of Long Term Evolution (LTE) wireless networks. They jointly address optimized content-aware rate adaptation and radio resource allocation (RRA) with the aim of maximizing the sum of the achievable rates while minimizing the quality difference among multiple videos. For multi-user SVC delivery over downlink wireless systems, where IP/TV is the most representative application, we decompose the optimization problem and propose a novel iterative local approximation algorithm to derive the optimal solution, also presenting optimal algorithms to solve the two resulting sub-problems. For multiple SVC delivery over uplink wireless systems, where health-care services are the most attractive and challenging application, we propose joint video adaptation and aggregation performed directly at the application layer of the transmitting equipment, which exploits the guaranteed bit-rate (GBR) provided by the proposed low-complexity sub-optimal RRA solutions. Finally, we propose a quality-fair adaptive streaming solution to deliver fair video quality to HAS clients in an LTE cell by adaptively selecting the prescribed GBR of each user according to the video content in addition to the channel condition. Extensive numerical evaluations show significant enhancements of the proposed strategies with respect to other state-of-the-art frameworks.