End-to-End Learning of Representations for Asynchronous Event-Based Data
Event cameras are vision sensors that record asynchronous streams of
per-pixel brightness changes, referred to as "events". They have appealing
advantages over frame-based cameras for computer vision, including high
temporal resolution, high dynamic range, and no motion blur. Due to the sparse,
non-uniform spatiotemporal layout of the event signal, pattern recognition
algorithms typically aggregate events into a grid-based representation and
subsequently process it by a standard vision pipeline, e.g., Convolutional
Neural Network (CNN). In this work, we introduce a general framework to convert
event streams into grid-based representations through a sequence of
differentiable operations. Our framework comes with two main advantages: (i)
allows learning the input event representation together with the task dedicated
network in an end to end manner, and (ii) lays out a taxonomy that unifies the
majority of extant event representations in the literature and identifies novel
ones. Empirically, we show that our approach to learning the event
representation end-to-end yields an improvement of approximately 12% on optical
flow estimation and object recognition over state-of-the-art methods.
Comment: To appear at ICCV 201
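One fixed member of the family of grid representations the paper unifies is the event voxel grid, which accumulates events into temporal bins. The sketch below is illustrative only, assuming events arrive as NumPy arrays of pixel coordinates, timestamps and polarities; the bilinear temporal weighting shown is one common hand-crafted choice, not the learned kernels the paper trains end-to-end.

```python
import numpy as np

def events_to_voxel_grid(x, y, t, p, H, W, B):
    """Accumulate events (x, y, t, polarity) into a B-bin voxel grid
    with linear temporal weighting. One fixed instance of the
    grid-based representations discussed above; illustrative only."""
    grid = np.zeros((B, H, W), dtype=np.float32)
    # Normalize timestamps to the range [0, B-1].
    tn = (t - t.min()) / max(t.max() - t.min(), 1e-9) * (B - 1)
    lo = np.floor(tn).astype(int)
    w_hi = tn - lo                       # fractional distance to next bin
    pol = p.astype(np.float32)           # polarity in {-1, +1}
    # Split each event's contribution between its two nearest bins.
    np.add.at(grid, (lo, y, x), pol * (1.0 - w_hi))
    hi = np.clip(lo + 1, 0, B - 1)
    np.add.at(grid, (hi, y, x), pol * w_hi)
    return grid
```

The resulting (B, H, W) tensor can be fed directly to a standard CNN, which is exactly the hand-off the abstract describes.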
Error resilient video communications using high level M-QAM. Modelling and simulation of a comparative analysis of a dual-priority M-QAM transmission system for H.264/AVC video applications over band-limited and error-prone channels.
An experimental investigation of an M-level (M = 16, 64 and 256) Quadrature Amplitude Modulation (QAM) transmission system suitable for video transmission is presented. The communication system is based on layered video coding and unequal error protection to make the video bitstream robust to channel errors. An implementation is described in which H.264 video is protected unequally by partitioning the compressed data into two layers of different visual importance. The partition scheme is based on separating each group of pictures (GoP) into the intra-coded frame (I-frame) and the predictive-coded frames (P-frames). This scheme is then applied to split the H.264-coded video bitstream and is suitable for Constant Bit Rate (CBR) transmission. Unequal error protection is based on uniform and non-uniform M-QAM constellations in conjunction with different scenarios for splitting the transmitted symbol to protect the more important part of the video data; different constellation arrangements are proposed and evaluated to increase the capacity of the high-priority layer. The performance of the transmission system is evaluated under Additive White Gaussian Noise (AWGN) and Rayleigh fading conditions.
Simulation results showed that in noisy channels the decoded video can be improved by assigning a larger portion of the video data to the enhancement layer in conjunction with non-uniform constellation arrangements; in better channel conditions the quality of the received video can be improved by assigning more bits to the high-priority channel and using uniform constellations. Adapting to these varying conditions makes video transmission more successful over error-prone channels. Further techniques were developed to combat various channel impairments by considering channel coding methods suitable for layered video coding applications. It is shown that a combination of non-uniform M-QAM and forward error correction (FEC) yields better performance. Additionally, antenna diversity techniques are examined and introduced to the transmission system; these can offer a significant improvement in the quality of service of mobile video communication systems in environments that can be modelled by a Rayleigh fading channel.
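The unequal error protection idea can be illustrated with a hypothetical non-uniform 16-QAM mapper: the two most significant bits (the high-priority layer) select the quadrant at spacing d1, while the remaining two bits (the low-priority layer) select the point within it at spacing d2. Making d1 > d2 gives the quadrant bits a larger minimum distance; d1 == d2 recovers a uniform constellation. This is an illustrative sketch, not the exact constellation arrangements evaluated in the thesis.

```python
import numpy as np

def qam16_nonuniform(bits, d1=2.0, d2=1.0):
    """Map rows of 4 bits to non-uniform 16-QAM symbols.
    bits[:, 0:2] choose the quadrant (high-priority layer),
    bits[:, 2:4] the point inside it (low-priority layer)."""
    b = 1 - 2 * bits                 # map {0, 1} -> {+1, -1}
    I = d1 * b[:, 0] + d2 * b[:, 2]  # in-phase component
    Q = d1 * b[:, 1] + d2 * b[:, 3]  # quadrature component
    return I + 1j * Q
```

With the default d1 = 2, d2 = 1 the four points of each quadrant cluster around its corner, so a noise burst is more likely to corrupt the low-priority bits than the quadrant decision.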
GST: GPU-decodable supercompressed textures
Modern GPUs supporting compressed textures allow interactive application developers to save scarce GPU resources such as VRAM and bandwidth. Compressed textures use fixed compression ratios whose lossy representations are of significantly poorer quality than traditional image compression formats such as JPEG. We present a new method in the class of supercompressed textures that provides an additional layer of compression to already compressed textures. Our texture representation is designed for endpoint compressed formats such as DXT and PVRTC and for decoding on commodity GPUs. We apply our algorithm to commonly used formats by separating their representation into two parts that are processed independently and then entropy encoded. Our method preserves the CPU-GPU bandwidth during the decoding phase and exploits the parallelism of GPUs to provide up to 3X faster decode compared to prior texture supercompression algorithms. Along with the gains in decoding speed, our method maintains both the compression size and quality of current state-of-the-art texture representations.
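The two-part split described above can be sketched for DXT1, whose 8-byte blocks carry two 16-bit RGB565 endpoints followed by sixteen 2-bit palette indices. In this sketch zlib stands in for the GPU-decodable entropy coder, so it only illustrates the stream separation, not the paper's actual codec.

```python
import zlib

def supercompress_dxt1(blocks):
    """Split DXT1 blocks into an endpoint stream and an index
    stream, then entropy-code each separately. Illustrative
    stand-in (zlib) for the paper's GPU-decodable entropy coder."""
    endpoints = bytearray()
    indices = bytearray()
    for blk in blocks:           # each DXT1 block is 8 bytes:
        endpoints += blk[:4]     # two 16-bit RGB565 endpoints
        indices += blk[4:]       # sixteen 2-bit palette indices
    return zlib.compress(bytes(endpoints)), zlib.compress(bytes(indices))
```

Grouping like-with-like this way helps the entropy coder, since endpoints and indices have very different statistics.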
REGION-BASED ADAPTIVE DISTRIBUTED VIDEO CODING CODEC
The recently developed Distributed Video Coding (DVC) paradigm is typically suitable for applications where conventional video coding is not feasible because of its inherently high-complexity encoding. Examples include video surveillance using wireless/wired video sensor networks and applications using mobile cameras. With DVC, the complexity is shifted from the encoder to the decoder.
The practical application of DVC is referred to as Wyner-Ziv (WZ) video coding, where an estimate of the original frame, called "side information", is generated using motion compensation at the decoder. Compression is achieved by sending only the extra information needed to correct this estimate. An error-correcting code is used under the assumption that the estimate is a noisy version of the original frame, and the rate needed is a certain amount of parity bits. The side information is assumed to have become available at the decoder through a virtual channel. Due to the limitations of the compensation method, the predicted frame, or side information, is expected to have varying degrees of accuracy. These limitations stem from location-specific, non-stationary estimation noise. To mitigate this, conventional video coders, like MPEG, make use of frame partitioning to allocate an optimum coder to each partition and hence achieve better rate-distortion performance. The same approach, however, has not been used in DVC as it increases the encoder complexity.
This work proposes partitioning the considered frame into many coding units (regions), where each unit is encoded differently. This partitioning is, however, done at the decoder while generating the side information, and the region map is sent to the encoder at very little rate penalty. The partitioning allows allocation of appropriate DVC coding parameters (virtual channel, rate, and quantizer) to each region. The resulting region map is compressed with a quadtree algorithm and communicated to the encoder via the feedback channel. Rate control in DVC is performed by channel coding techniques (turbo codes, LDPC, etc.). The performance of the channel code depends heavily on the accuracy of the virtual channel model that models the estimation error for each region. In this work, a turbo code has been used and an adaptive WZ DVC codec is designed both in the transform domain and in the pixel domain. Transform-domain WZ video coding (TDWZ) has distinctly superior performance compared to normal pixel-domain Wyner-Ziv (PDWZ) coding, since it exploits the spatial redundancy during encoding. The performance evaluations show that the proposed system is superior to existing distributed video coding solutions. Although the proposed system requires extra bits representing the region map to be transmitted, the rate gain is still noticeable, and it outperforms state-of-the-art frame-based DVC by 0.6-1.9 dB.
The feedback channel (FC) has the role of adapting the bit rate to the changing statistics between the side information and the frame to be encoded. In the unidirectional scenario, the encoder must perform the rate control. To correctly estimate the rate, the encoder must calculate typical side information. However, the rate cannot be exactly calculated at the encoder; it can only be estimated. This work therefore also proposes a feedback-free region-based adaptive DVC solution in the pixel domain, based on a machine learning approach to estimate the side information. Although the performance evaluations show a rate penalty, it is acceptable considering the simplicity of the proposed algorithm.
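The quadtree compression of the region map can be sketched as a recursive encoder: a uniform block is emitted as '1' plus its label, a mixed block as '0' followed by its four quadrants. This assumes a square, power-of-two map and is an illustrative reconstruction, not the thesis's exact bitstream.

```python
def quadtree_encode(m, x=0, y=0, size=None):
    """Encode a square region map (list of lists of small integer
    labels) as a quadtree bitstring: '1' + label for a uniform
    block, '0' followed by the four quadrants (TL, TR, BL, BR).
    Assumes a power-of-two side length; illustrative sketch only."""
    if size is None:
        size = len(m)
    vals = {m[y + j][x + i] for j in range(size) for i in range(size)}
    if len(vals) == 1:                  # uniform block: leaf node
        return "1" + str(vals.pop())
    h = size // 2                       # mixed block: recurse on quadrants
    return "0" + "".join(
        quadtree_encode(m, x + dx, y + dy, h)
        for dy in (0, h) for dx in (0, h))
```

Because region maps are dominated by large uniform areas, the quadtree collapses them into single leaves, which is what keeps the map's rate penalty small.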
Automatic DVB signal analyser
The problem of monitoring digital television broadcasts across Europe for the development of robust and reliable receivers is increasingly significant, and from it arises the need to automate the process of analysing and monitoring these signals. This project presents the software development of an application intended to solve part of this problem. The application analyses, manages and captures digital television signals. This document introduces the central subject, digital television, and the information carried by television signals, specifically as defined by the "Digital Video Broadcasting" standard. The document then focuses on explaining and describing the functionality the application needs to cover, and introduces and explains each stage of a software development process. Finally, it summarises the advantages of creating this program for the automation of digital signal analysis based on an optimisation of resources.
DeepLight: Robust and unobtrusive real-time screen-camera communication for real-world displays
National Research Foundation (NRF) Singapore under NRF Investigatorship grant
Improved Encoding for Compressed Textures
For the past few decades, graphics hardware has supported mapping a two-dimensional image, or texture, onto a three-dimensional surface to add detail during rendering. The complexity of modern applications using interactive graphics hardware has created an explosion in the amount of data needed to represent these images. In order to alleviate the amount of memory required to store and transmit textures, graphics hardware manufacturers have introduced hardware decompression units into the texturing pipeline. Textures may now be stored compressed in memory and decoded at run-time in order to access the pixel data. To encode images for use with these hardware features, many compression algorithms are run offline as a preprocessing step, often the most time-consuming step in the asset preparation pipeline. This research presents several techniques to quickly serve compressed texture data. With the goal of interactive compression rates while maintaining compression quality, three algorithms are presented in the class of endpoint compression formats. The first uses intensity dilation to estimate compression parameters for low-frequency signal-modulated compressed textures and offers up to a 3X improvement in compression speed. The second, FasTC, shows that by estimating the final compression parameters, partition-based formats can choose an approximate partitioning and offer orders-of-magnitude faster encoding speed. The third, SegTC, shows additional improvement over selecting a partitioning by using a global segmentation to find the boundaries between image features. This segmentation offers an additional 2X improvement over FasTC while maintaining similar compressed quality. Also presented is a case study in using texture compression to benefit two-dimensional concave path rendering: compressing the pixel coverage textures used for compositing yields both an increase in rendering speed and a decrease in storage overhead.
Additionally, an algorithm is presented that uses a single layer of indirection to adaptively select the compressed block size for each texture, giving a 2X increase in compression ratio for textures of mixed detail. Finally, a texture storage representation that is decoded at runtime on the GPU is presented. The decoded texture is still compressed for graphics hardware but uses 2X fewer bytes for storage and network bandwidth.
Doctor of Philosophy
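The endpoint-format structure these algorithms target can be sketched for a single grayscale 4x4 block: pick two endpoint values and quantize each pixel to a 2-bit index along the line between them. This simplified example ignores color channels and the details of any specific hardware format (DXT, PVRTC), and uses min/max endpoints rather than the optimized fits the dissertation develops.

```python
import numpy as np

def encode_block(px):
    """Fit a 4x4 grayscale block to a toy endpoint format: min/max
    endpoints plus a 2-bit index per pixel. Simplified sketch of
    the structure shared by endpoint compression formats."""
    lo, hi = px.min(), px.max()
    if hi == lo:                        # flat block: any index works
        return lo, hi, np.zeros_like(px, dtype=np.uint8)
    idx = np.round(3.0 * (px - lo) / (hi - lo)).astype(np.uint8)
    return lo, hi, idx

def decode_block(lo, hi, idx):
    """Reconstruct pixels by interpolating between the endpoints."""
    return lo + (hi - lo) * idx.astype(np.float64) / 3.0
```

Finding good endpoints per block is exactly the search that makes offline encoders slow, which is the cost the three algorithms above attack.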
New Frameworks for Secure Image Communication in the Internet of Things (IoT)
The continuous expansion of technology, broadband connectivity and the wide range of new devices in the IoT causes serious concerns regarding privacy and security. In addition, a key challenge in the IoT is the storage and management of massive data streams. For example, there is a constant demand for images of acceptable size and the highest possible quality to meet the rapidly increasing number of multimedia applications. Given the fast evolution of the IoT, this dissertation contributes to resolving concerns related to the security and compression functions of image communications in the Internet of Things (IoT). It proposes frameworks for a secure digital camera (SDC) in the IoT. The objectives are twofold. On the one hand, the proposed framework architecture offers a double layer of protection, encryption and watermarking, addressing issues of security, privacy, and digital rights management (DRM) by applying a hardware architecture of the state-of-the-art image compression technique Better Portable Graphics (BPG), which achieves a high compression ratio with small size. On the other hand, the proposed secure BPG (SBPG) framework is integrated with the digital camera. The SBPG framework integrated with the SDC is thus suitable for high-performance imaging in the IoT, such as Intelligent Traffic Surveillance (ITS) and telemedicine. Because power consumption has become a major concern in any portable application, a low-power design of SBPG is proposed to achieve an energy-efficient SBPG design. As the visual quality of the watermarked and compressed images improves with larger values of PSNR, the results show that the proposed SBPG substantially increases the quality of the watermarked compressed images. Higher PSNR values also show how robust the algorithm is to different types of attack.
From the results obtained for the energy-efficient SBPG design, it can be observed that the power consumption is substantially reduced, by up to 19%.
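Since the evaluation reports quality as PSNR, the metric itself is worth stating. A standard definition for 8-bit images (this is the textbook formula, not code from the dissertation):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image
    and a distorted (e.g. watermarked and compressed) image."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Higher values mean the watermarked, compressed output is closer to the original, which is the sense in which the abstract uses the term.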
Forensic Video Analytic Software
Law enforcement officials depend heavily on Forensic Video Analytic (FVA) software in their evidence extraction process. However, present-day FVA software is complex, time-consuming, equipment-dependent and expensive, and developing countries struggle to gain access to it. The term forensic pertains to the application of scientific methods to the investigation of crime through post-processing, whereas surveillance is the close monitoring of real-time feeds.
The principal objective of this Final Year Project was to develop an efficient and effective FVA software that addresses these shortcomings, guided by a stringent and systematic review of scholarly research papers, online databases and legal documentation. The scope spans multiple-object detection, multiple-object tracking, anomaly detection, activity recognition, tampering detection, general and specific image enhancement, and video synopsis.
Methods employed include many machine learning techniques, GPU acceleration, and efficient integrated architecture development for both real-time and post-processing use. CNNs, GMMs, multithreading and OpenCV C++ coding were used. The proposed methodology would substantially speed up the FVA process, especially through the novel video synopsis research arena. This project has resulted in three research outcomes: Moving Object Based Collision-Free Video Synopsis, a Forensic and Surveillance Analytic Tool Architecture, and Tampering Detection for Inter-Frame Forgery.
The results include forensic and surveillance panel outcomes with emphasis on video synopsis and the Sri Lankan context. Principal conclusions include the optimisation and efficient integration of algorithms to overcome limitations in processing power and memory, and the compromise between real-time performance and accuracy.
Comment: The Forensic Video Analytic Software demo video is available at https://www.youtube.com/watch?v=vsZlYKQxSk
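The GMM background modelling used for moving-object detection (OpenCV's MOG2 family in the project) can be illustrated with a simplified single-Gaussian per-pixel model. This NumPy sketch is a stand-in for the real mixture model, not the project's implementation; the learning rate and threshold are assumed values.

```python
import numpy as np

def update_background(frame, mean, var, lr=0.05, k=2.5):
    """Single-Gaussian per-pixel background model: a pixel is
    foreground when it lies more than k standard deviations from
    the running mean. Simplified stand-in for a MOG2-style GMM."""
    frame = frame.astype(np.float64)
    dist = np.abs(frame - mean)
    fg = dist > k * np.sqrt(var)            # pixels far from the model
    mean = (1 - lr) * mean + lr * frame     # exponential running mean
    var = (1 - lr) * var + lr * (frame - mean) ** 2
    return fg, mean, np.maximum(var, 1e-6)  # keep variance positive
```

Feeding consecutive frames through this update yields per-frame foreground masks, the raw material for the tracking and video synopsis stages described above.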