338 research outputs found

    Dynamic bandwidth allocation in ATM networks

    Includes bibliographical references. This thesis investigates bandwidth allocation methodologies for transporting emerging bursty traffic types in ATM networks. Existing ATM traffic management solutions are not readily able to handle the congestion that inevitably results from the bursty traffic of these emerging services. This research addresses bandwidth allocation for bursty traffic by proposing and exploring the concept of dynamic bandwidth allocation and comparing it with traditional static bandwidth allocation schemes.

    Highly efficient low-level feature extraction for video representation and retrieval.

    PhD. Witnessing the omnipresence of digital video media, the research community has raised the question of its meaningful use and management. Stored in immense multimedia databases, digital videos need to be retrieved and structured in an intelligent way, relying on the content and the rich semantics involved. Current Content Based Video Indexing and Retrieval systems face the problem of the semantic gap between the simplicity of the available visual features and the richness of user semantics. This work focuses on the issues of efficiency and scalability in video indexing and retrieval to facilitate a video representation model capable of semantic annotation. A highly efficient algorithm for temporal analysis and key-frame extraction is developed. It is based on the prediction information extracted directly from the compressed domain features and on robust scalable analysis in the temporal domain. Furthermore, a hierarchical quantisation of the colour features in the descriptor space is presented. Derived from the extracted set of low-level features, a video representation model that enables semantic annotation and contextual genre classification is designed. Results demonstrate the efficiency and robustness of the temporal analysis algorithm, which runs in real time while maintaining high precision and recall in the detection task. Adaptive key-frame extraction and summarisation achieve a good overview of the visual content, while the colour quantisation algorithm efficiently creates a hierarchical set of descriptors. Finally, the video representation model, supported by the genre classification algorithm, achieves excellent results in an automatic annotation system by linking the video clips with a limited lexicon of related keywords.

    Coding local and global binary visual features extracted from video sequences

    Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks, while being characterized by significantly lower computational complexity and memory requirements. When dealing with large collections, a more compact representation based on global features is often preferred, which can be obtained from local features by means of, e.g., the Bag-of-Visual-Word (BoVW) model. Several applications, including for example visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that aim at reducing the required bit budget, while attaining a target level of efficiency. In this paper we investigate a coding scheme tailored to both local and global binary features, which aims at exploiting both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC) paradigm. That is, visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast with the traditional approach, in which visual content is acquired at a node, compressed and then sent to a central unit for further processing, according to the Compress-Then-Analyze (CTA) paradigm. In this paper we experimentally compare ATC and CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: homography estimation and content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with CTA, especially in bandwidth-limited scenarios. Comment: submitted to IEEE Transactions on Image Processing.
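    The idea of exploiting temporal redundancy between binary descriptors can be illustrated with a minimal sketch. It is an assumption made for illustration, not the coding scheme proposed in the paper: a descriptor is stored as a Python integer, and the inter-frame residual is the bitwise XOR with the matching descriptor from the previous frame. The function names (`hamming`, `inter_frame_residual`, `reconstruct`) are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def inter_frame_residual(prev: int, curr: int) -> int:
    """XOR residual: bits set only where the descriptor changed between frames."""
    return prev ^ curr


def reconstruct(prev: int, residual: int) -> int:
    """Invert the XOR residual to recover the current descriptor."""
    return prev ^ residual


# Illustrative 8-bit descriptors (real binary descriptors are much longer).
prev_desc = 0b10110010
curr_desc = 0b10110110

residual = inter_frame_residual(prev_desc, curr_desc)
changed_bits = hamming(prev_desc, curr_desc)
recovered = reconstruct(prev_desc, residual)
```

    When consecutive frames are similar, the residual contains few set bits, so an entropy coder could in principle spend fewer bits on it than on the raw descriptor; the entropy coding step itself is not shown here.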

    Video indexing and summarization using motion activity

    In this dissertation, video-indexing techniques using low-level motion activity characteristics and their application to video summarization are presented. The MPEG-7 motion activity feature is defined as the subjective level of activity or motion in a video segment. First, a novel psychophysical and analytical framework for automatic measurement of motion activity in compliance with its subjective perception is developed. A psychophysically sound subjective ground truth for motion activity and a test set of video clips are constructed for this purpose. A number of known and novel low-level descriptors based on compressed-domain motion vectors are then described. It is shown that these descriptors successfully estimate the subjective level of motion activity of video clips. Furthermore, the individual strengths and limitations of the proposed descriptors are determined using a novel pairwise comparison framework. It is verified that the intensity of motion activity descriptor of the MPEG-7 standard is one of the best performers, while a novel descriptor proposed in this dissertation performs comparably or better. A new descriptor for the spatial distribution of motion activity in a scene is proposed. This descriptor is supplementary to the intensity of motion activity descriptor. The new descriptor is shown to have comparable query retrieval performance to the current spatial distribution of motion activity descriptor of the MPEG-7 standard. The insights obtained from the motion activity investigation are applied to video summarization. A novel approach to summarizing and skimming through video using motion activity is presented. The approach is based on allocation of playback time to video segments proportional to the motion activity of the segments. Low activity segments are played faster than high activity segments in such a way that a constant level of activity is maintained throughout the video. Since motion activity is a low-complexity descriptor, the proposed summarization techniques are extremely fast. The summarization techniques are successfully used on surveillance video. The proposed techniques can also be used as a preprocessing stage for more complex summarization and content analysis techniques, thus providing significant cost gains.
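    The playback-time allocation described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: each segment has a known duration and a scalar activity score, and its share of the summary is taken to be proportional to duration times activity. The function name `allocate_playback` and the sample values are hypothetical.

```python
def allocate_playback(durations, activities, summary_length):
    """Split summary_length across segments in proportion to
    duration * activity (an assumed weighting, for illustration only)."""
    if len(durations) != len(activities):
        raise ValueError("durations and activities must have the same length")
    weights = [d * a for d, a in zip(durations, activities)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weighted activity must be positive")
    return [summary_length * w / total for w in weights]


# Two 10-second segments; the second has three times the motion activity.
durations = [10.0, 10.0]
activities = [1.0, 3.0]

playback = allocate_playback(durations, activities, summary_length=8.0)

# Speed-up factor per segment: original duration divided by playback time.
speedups = [d / p for d, p in zip(durations, playback)]
```

    In this example the low-activity segment receives 2 seconds of playback (5x speed) and the high-activity segment receives 6 seconds, so the quieter material is skimmed faster, as the abstract describes.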

    Content Based Image Retrieval (CBIR) in Remote Clinical Diagnosis and Healthcare

    Content-Based Image Retrieval (CBIR) locates, retrieves and displays images similar to one given as a query, using a set of features. It demands accessible data in medical archives and from medical equipment, to infer meaning after some processing. A problem similar in some sense to the target image can aid clinicians. CBIR complements text-based retrieval and improves evidence-based diagnosis, administration, teaching, and research in healthcare. It facilitates visual/automatic diagnosis and decision-making in real-time remote consultation/screening, store-and-forward tests, home care assistance and overall patient surveillance. Metrics help compare visual data and improve diagnosis. Specially designed architectures can benefit from the application scenario. CBIR use calls for file storage standardization, querying procedures, efficient image transmission, realistic databases, global availability, access simplicity, and Internet-based structures. This chapter recommends important and complex aspects required to handle visual content in healthcare. Comment: 28 pages, 6 figures, Book Chapter from "Encyclopedia of E-Health and Telemedicine".

    MediaSync: Handbook on Multimedia Synchronization

    This book provides an approachable overview of the most recent advances in the field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is getting renewed attention to overcome remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance and the multiple disciplines it involves, the availability of a reference book on mediasync becomes necessary. This book fills the gap in this context. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space, from different perspectives. MediaSync: Handbook on Multimedia Synchronization is a companion for scholars and practitioners who want to acquire strong knowledge about this research area and to approach the challenges behind ensuring the best mediated experiences by providing adequate synchronization between the media elements that constitute these experiences.

    Frame registration for motion compensation in imaging photoplethysmography

    © 2018 by the authors. Licensee MDPI, Basel, Switzerland. Imaging photoplethysmography (iPPG) is an emerging technology used to assess microcirculation and cardiovascular signs by collecting backscattered light from illuminated tissue using optical imaging sensors. An engineering approach is used to evaluate whether a silicone cast of a human palm might be effectively utilized to predict the results of image registration schemes for motion compensation prior to their application on live human tissue. This allows us to establish a performance baseline for each of the algorithms and to isolate performance and noise fluctuations due to the induced motion from the temporally changing physiological signs. A multi-stage evaluation model is developed to qualitatively assess the influence of the region of interest (ROI), system resolution and distance, reference frame selection, and signal normalization on extracted iPPG waveforms from live tissue. We conclude that the application of image registration is able to deliver up to 75% improvement in signal-to-noise ratio (SNR), from 4.75 to 8.34, over an uncompensated iPPG signal by employing an intensity-based algorithm with a moving reference frame.
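    The quoted improvement can be checked with simple arithmetic. The sketch below assumes the percentage is the relative change between the two SNR values given in the abstract (4.75 and 8.34); the function name `relative_improvement` is hypothetical.

```python
def relative_improvement(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    if before == 0:
        raise ValueError("baseline value must be non-zero")
    return (after - before) / before * 100.0


# SNR values quoted in the abstract: uncompensated vs. registered signal.
snr_uncompensated = 4.75
snr_registered = 8.34

improvement_pct = relative_improvement(snr_uncompensated, snr_registered)
```

    Under this assumption the result is roughly 75.6%, consistent with the "up to 75%" figure reported.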

    Digital photo album management techniques: from one dimension to multi-dimension.

    Lu Yang. Thesis submitted in: November 2004. Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 96-103). Abstracts in English and Chinese.
    Contents:
    Abstract
    Acknowledgement
    Chapter 1: Introduction
        1.1 Motivation
        1.2 Our Contributions
        1.3 Thesis Outline
    Chapter 2: Background Study
        2.1 MPEG-7 Introduction
        2.2 Image Analysis in CBIR Systems
            2.2.1 Color Information
            2.2.2 Color Layout
            2.2.3 Texture Information
            2.2.4 Shape Information
            2.2.5 CBIR Systems
        2.3 Image Processing in JPEG Frequency Domain
        2.4 Photo Album Clustering
    Chapter 3: Feature Extraction and Similarity Analysis
        3.1 Feature Set in Frequency Domain
            3.1.1 JPEG Frequency Data
            3.1.2 Our Feature Set
        3.2 Digital Photo Similarity Analysis
            3.2.1 Energy Histogram
            3.2.2 Photo Distance
    Chapter 4: 1-Dimensional Photo Album Management Techniques
        4.1 Photo Album Sorting
        4.2 Photo Album Clustering
        4.3 Photo Album Compression
            4.3.1 Variable IBP frames
            4.3.2 Adaptive Search Window
            4.3.3 Compression Flow
        4.4 Experiments and Performance Evaluations
    Chapter 5: High Dimensional Photo Clustering
        5.1 Traditional Clustering Techniques
            5.1.1 Hierarchical Clustering
            5.1.2 Traditional K-means
        5.2 Multidimensional Scaling
            5.2.1 Introduction
            5.2.2 Classical Scaling
        5.3 Our Interactive MDS-based Clustering
            5.3.1 Principal Coordinates from MDS
            5.3.2 Clustering Scheme
            5.3.3 Layout Scheme
        5.4 Experiments and Results
    Chapter 6: Conclusions
    Bibliography

    Adaptive video delivery using semantics

    The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedback between an object partition and a region partition is used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. The metadata-based representation of objects' shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.