2,449 research outputs found

    Temporal and Spatial Alignment of Multimedia Signals

    With the increasing availability of cameras and other mobile devices, digital images and videos are becoming ubiquitous. Research efforts have been made to develop technologies that utilize multiple pieces of multimedia information simultaneously. This dissertation focuses on the temporal and spatial alignment of multimedia signals, a fundamental problem that must be solved to enable applications that deal with multiple pieces of multimedia data. The first part of the dissertation addresses the synchronization of multimedia signals. We propose a new modality for audio and video synchronization based on the electric network frequency (ENF) signal naturally embedded in multimedia recordings. Synchronization of audio and video is achieved by aligning the ENF signals. The proposed method departs significantly from existing work on the audio/video synchronization problem and has strong potential to address previously intractable scenarios. Estimation of the ENF signal from video is a challenging task. To address the insufficient sampling rate of video, we propose to exploit the rolling shutter mechanism commonly adopted in CMOS camera sensors. Several techniques are designed to alleviate the distortions that motion and brightness changes in videos cause in ENF estimation. We also address several challenges that are unique to the synchronization of digitized analog audio recordings. Speed offset often occurs in digitized analog audio recordings due to inconsistency in the tape's rolling speed. We show that the ENF signal captured by the original analog audio recording can be retained in the digitized version. The ENF signal is treated approximately as a single-tone signal and used as a reference to detect and correct speed offsets automatically. A complete multimedia application system often needs to jointly consider both temporal synchronization and spatial alignment. The last part of the dissertation examines the quality assessment of local image features for efficient and robust spatial alignment. We propose a scheme to evaluate the quality of SIFT features in terms of their robustness and discriminability. A quality score is assigned to every SIFT feature based on its contrast value, scale, and descriptor, using a quality metric kernel that is obtained in a one-time training phase. Feature selection is performed by retaining features with high quality scores. The proposed approach is also applicable to other local image features, such as the Speeded Up Robust Features (SURF).
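
    The idea of synchronizing two recordings by aligning their ENF signals can be illustrated with a minimal sketch. It assumes both ENF traces have already been estimated at the same rate (for example one value per second) and searches for the lag that minimizes the mean squared difference over the overlapping samples. The abstract does not state which matching criterion the dissertation uses, so the function name `estimate_lag`, the `max_lag` parameter, and the squared-error criterion are all assumptions made for illustration.

```python
import numpy as np

def estimate_lag(enf_ref, enf_other, max_lag):
    """Return the lag d (in ENF samples) for which enf_other[n] best
    matches enf_ref[n + d], judged by mean squared difference.

    Hypothetical helper: the matching criterion is an assumption,
    not the method described in the dissertation.
    """
    ref = np.asarray(enf_ref, dtype=float)
    other = np.asarray(enf_other, dtype=float)
    best_lag, best_err = 0, np.inf
    for lag in range(-max_lag, max_lag + 1):
        # Indices n for which both other[n] and ref[n + lag] exist.
        start = max(0, -lag)
        stop = min(len(other), len(ref) - lag)
        if stop - start < 2:
            continue  # too little overlap to compare
        diff = other[start:stop] - ref[start + lag:stop + lag]
        err = np.mean(diff ** 2)
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag

def lag_to_seconds(lag, enf_rate_hz):
    """Convert a lag in ENF samples to a time offset in seconds."""
    return lag / enf_rate_hz
```

    A positive result means the second recording started later on the shared power-grid timeline than the first; dividing the lag by the ENF estimation rate gives the offset in seconds.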

    Personal area technologies for internetworked services


    Evaluation of IEEE 802.1 Time Sensitive Networking Performance for Microgrid and Smart Grid Power System Applications

    The proliferation of distributed energy resources and the importance of smart energy management have led to increased interest in microgrids. A microgrid is an area of the grid that can be disconnected and operated independently from the main grid when required and can meet some or all of its own energy needs with distributed energy resources and battery storage. This allows the microgrid area to continue operating even when the main grid is unavailable. In addition, a microgrid can often utilize waste heat from energy generation to drive thermal loads, further improving energy utilization. This leads to increased reliability and overall efficiency in the microgrid area. As microgrids (and by extension the smart grid) become more widespread, new methods of communication and control are required to aid in the management of many different distributed entities. One communication architecture that may prove useful is the set of IEEE 802.1 Time Sensitive Networking (TSN) standards. These standards specify improvements and new capabilities for LAN-based communication networks, addressing the limitations that previously made such networks unsuitable for widespread deployment in a power system setting. They include specifications for low latency guarantees, clock synchronization, data frame redundancy, and centralized system administration. These capabilities were previously available only in proprietary or application-specific solutions. However, they will now be available as part of the Ethernet standard, enabling backwards compatibility with existing network architecture and support for future advances. Two of the featured standards, IEEE 802.1AS (governing time synchronization) and IEEE 802.1Qbv (governing time-aware traffic shaping), will be tested and evaluated for their potential utility in power systems and microgrid applications. These tests will measure the latency achievable using TSN over a network at various levels of congestion and compare these results with the UDP and TCP protocols. In addition, the ability to use synchronized clocks to generate waveforms for microgrid inverter synchronization will be explored.
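
    To make the link between clock synchronization and inverter waveforms concrete, the following sketch shows how a node with a synchronized clock could compute a sinusoidal reference from the shared time, and how a residual clock offset translates into a phase error. The function names, the 60 Hz default, and the idea of sampling the reference directly from the timestamp are assumptions for illustration; the abstract does not describe how the waveforms are generated.

```python
import math

def reference_sample(t_seconds, freq_hz=60.0, amplitude=1.0):
    """Sinusoidal reference value at shared (synchronized) time t.

    Hypothetical helper: every node evaluating this with the same
    time base produces the same waveform.
    """
    return amplitude * math.sin(2.0 * math.pi * freq_hz * t_seconds)

def phase_error_deg(clock_offset_seconds, freq_hz=60.0):
    """Phase error in degrees caused by a clock offset between two nodes."""
    return 360.0 * freq_hz * clock_offset_seconds
```

    Under these assumptions, a clock offset of one microsecond at 60 Hz corresponds to a phase error of 0.0216 degrees, since one full cycle of 360 degrees lasts 1/60 of a second.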

    User-Oriented QoS in Packet Video Delivery

    We focus on packet video delivery, with an emphasis on the quality of service perceived by the end user. A video signal passes through several subsystems, such as the source coder, the network, and the decoder. Each of these can impair the information, either through data loss or by introducing delay. We describe how each subsystem can be tuned to optimize the quality of the delivered signal for a given available bit rate in the network. Assessing end-user quality is not trivial. We present recent research results that rely on a model of the human visual system.

    Recent Advances in Signal Processing

    Signal processing is a critical component of most new technological inventions and of challenges in a wide variety of science and engineering applications. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily at students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be grouped into five areas depending on the application at hand. These five categories address, in order, image processing, speech processing, communication systems, time-series analysis, and educational packages. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.

    Learning Generalizable Visual Patterns Without Human Supervision

    Owing to the existence of large labeled datasets, Deep Convolutional Neural Networks have ushered in a renaissance in computer vision. However, almost all of the visual data we generate daily - several human lives’ worth of it - remains unlabeled and thus out of reach of today’s dominant supervised learning paradigm. This thesis focuses on techniques that steer deep models towards learning generalizable visual patterns without human supervision. Our primary tool in this endeavor is the design of Self-Supervised Learning tasks, i.e., pretext tasks for which labels do not involve human labor. Besides enabling learning from large amounts of unlabeled data, we demonstrate how self-supervision can capture relevant patterns that supervised learning largely misses. For example, we design learning tasks that learn deep representations capturing shape from images, motion from video, and 3D pose features from multi-view data. Notably, the design of these tasks follows a common principle: the recognition of data transformations. The strong performance of the learned representations on downstream vision tasks such as classification, segmentation, action recognition, or pose estimation validates this pretext-task design. This thesis also explores the use of Generative Adversarial Networks (GANs) for unsupervised representation learning. Besides leveraging generative adversarial learning to define image transformations for self-supervised learning tasks, we also address training instabilities of GANs through the use of noise. While unsupervised techniques can significantly reduce the burden of supervision, in the end, we still rely on some annotated examples to fine-tune learned representations towards a target task. To improve learning from scarce or noisy labels, we describe a supervised learning algorithm with improved generalization in these challenging settings.

    High Quality Multimedia Streaming Up Sampler for Android Platform MobWS

    In the modern era, the internet is the fastest means of digital transport, and mobile devices are increasingly used to access digitized data, multimedia, sports, videos, TV shows, websites, and similar content from any place at any time. People can also share live videos from one mobile device to another. However, existing methods are limited by resources: bandwidth, for example, is shared among different clients, which degrades video streaming. Many new mobile devices with high-end hardware that supports high resolution are on the market from Apple, Sony, Micromax, Google, and others, but low-resolution multimedia streams do not match the capabilities of these devices. This can introduce visual distortion and artefacts. Thus, a method is proposed to provide high-quality video streaming and an optimized Mobile Web Service (MobWS) that is easier to use on mobile devices. The investigated approach enables the hosting of web pages with live videos on Android smartphones and bridges the resolution gap between the end user's mobile device and the multimedia stream. The up-sampling system is designed to evaluate high-quality multimedia streaming onto mobile phones, that is, real-time video broadcasting and synchronization to the client device at high resolution, with less computation time than previous approaches.

    Perceptual Quality Metric as a Performance Tool for ATM Adaptation of MPEG-2 Based Multimedia Applications

    In this paper we study the perceptual impact of data loss on MPEG-2 video coded streams transmitted over an ATM network. This impact is measured using a perceptual quality metric based on a spatio-temporal model of the human visual system. Video streams have been transmitted on top of both new network and ATM adaptation layers which provide a robust transmission by applying per-cell sequence numbering combined with a selective Forward Error Correction (FEC) mechanism. We compare their performance against a transmission over AAL5. Results show that the proposed AAL behaves better in terms of both network performance and perceived quality of the MPEG-2 decoded sequenc
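
    The combination of per-cell sequence numbering with forward error correction can be sketched as follows. Sequence numbers let the receiver detect which cell in a group is missing, and a redundancy cell lets it rebuild that cell without retransmission. The abstract does not specify the FEC code or how cells are selected for protection, so this sketch substitutes a single XOR parity cell per group and protects every cell; the function names and grouping are hypothetical.

```python
from functools import reduce

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length payloads."""
    return bytes(x ^ y for x, y in zip(a, b))

def protect(payloads):
    """Number each cell payload and build one XOR parity cell for the group.

    Assumption: simple XOR parity stands in for the paper's FEC scheme.
    """
    cells = [(seq, payload) for seq, payload in enumerate(payloads)]
    parity = reduce(xor_bytes, payloads)
    return cells, parity

def recover(received, parity, group_size):
    """Rebuild the payload group from the cells that arrived.

    A gap in the sequence numbers identifies the lost cell, which is
    recomputed as the XOR of the parity cell and all received payloads.
    """
    by_seq = dict(received)
    missing = [seq for seq in range(group_size) if seq not in by_seq]
    if len(missing) > 1:
        raise ValueError("one parity cell cannot repair more than one loss")
    if missing:
        by_seq[missing[0]] = reduce(xor_bytes, by_seq.values(), parity)
    return [by_seq[seq] for seq in range(group_size)]
```

    With one parity cell per group, exactly one lost cell per group can be repaired; protecting against longer loss bursts would need a stronger code or interleaving, which this sketch does not attempt.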