
Algorithms for compression of high dynamic range images and video

Author: Vladimir Dolzhenko (7169792)
Publication date: 01/01/2015

Recent advances in sensor and display technologies have brought about High Dynamic Range (HDR) imaging capability. Modern multiple-exposure HDR sensors can achieve a dynamic range of 100-120 dB, and LED and OLED display devices have contrast ratios of 10^5:1 to 10^6:1. Despite these advances, image/video compression algorithms and associated hardware are still based on Standard Dynamic Range (SDR) technology, i.e. they operate within an effective dynamic range of up to 70 dB for 8-bit gamma-corrected images. Further, the existing infrastructure for content distribution is also designed for SDR, which creates interoperability problems with true HDR capture and display equipment. Current solutions to this problem include tone mapping the HDR content to fit SDR. However, this approach leads to image quality problems when strong dynamic range compression is applied. Even though some HDR-only solutions have been proposed in the literature, they are not interoperable with current SDR infrastructure and are thus typically used in closed systems. Given the above observations, a research gap was identified in the need for efficient algorithms for the compression of still images and video that are capable of storing the full dynamic range and colour gamut of HDR images and are at the same time backward compatible with existing SDR infrastructure. To improve the usability of SDR content, it is vital that any such algorithms accommodate different tone mapping operators, including those that are spatially non-uniform. In the course of the research presented in this thesis, a novel two-layer CODEC architecture is introduced for both HDR image and video coding. Further, a universal and computationally efficient approximation of the tone mapping operator is developed and presented.
It is shown that the use of perceptually uniform colourspaces for internal representation of pixel data improves the compression efficiency of the algorithms. Further, the proposed novel approaches to the compression of metadata for the tone mapping operator are shown to improve compression performance for low-bitrate video content. Multiple compression algorithms are designed, implemented and compared, and quality-complexity trade-offs are identified. Finally, practical aspects of implementing the developed algorithms are explored by automating the design space exploration flow and integrating the high-level systems design framework with domain-specific tools for synthesis and simulation of multiprocessor systems. Directions for further work are also presented.
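As a worked illustration of the figures quoted in this abstract, the sketch below converts a contrast ratio into decibels. It assumes the amplitude convention of 20*log10(ratio), which the abstract does not state explicitly, and the function name is hypothetical.

```python
import math


def contrast_ratio_to_db(ratio: float) -> float:
    """Convert a contrast ratio (e.g. 1e5 for 10^5:1) to decibels.

    Assumption: the 20*log10 convention is used; the abstract does not
    say which convention the thesis adopts.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    for exponent in (5, 6):
        db = contrast_ratio_to_db(10.0 ** exponent)
        print(f"10^{exponent}:1 is about {db:.0f} dB")
```

Under this assumed convention, contrast ratios of 10^5:1 and 10^6:1 correspond to roughly 100 dB and 120 dB, the same range the abstract quotes for multiple-exposure HDR sensors.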

Loughborough University Institutional Repository

3D multiple description coding for error resilience over wireless networks

Author: Umar Abubakar
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2011

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Mobile communications have gained growing interest from both customers and service providers over the last one to two decades. Visual information is used in many application domains such as remote health care, video on demand, broadcasting, video surveillance etc. In order to enhance the visual effects of digital video content, depth perception needs to be provided along with the actual visual content. 3D video has earned significant interest from the research community in recent years, due to the tremendous impact it leaves on viewers and its enhancement of the user’s quality of experience (QoE). In the near future, 3D video is likely to be used in most video applications, as it offers a greater sense of immersion and perceptual experience. When 3D video is compressed and transmitted over error-prone channels, the associated packet loss leads to visual quality degradation. When a picture is lost or corrupted so severely that the concealment result is not acceptable, the receiver typically pauses video playback and waits for the next INTRA picture to resume decoding. Error propagation caused by employing predictive coding may degrade the video quality severely. There are several ways to mitigate the effects of such transmission errors. One widely used technique in international video coding standards is error resilience. The motivation behind this research work is that existing schemes for 2D colour video compression such as MPEG, JPEG and H.263 cannot be applied to 3D video content. 3D video signals contain depth as well as colour information and are bandwidth demanding, as they require the transmission of multiple high-bandwidth 3D video streams. On the other hand, the capacity of wireless channels is limited and wireless links are prone to various types of errors caused by noise, interference, fading, handoff, error bursts and network congestion.
Given the maximum bit-rate budget to represent the 3D scene, the bit rate should be allocated between texture and depth information so that rendering distortion/losses are minimised. To mitigate the effect of these errors on the perceptual 3D video quality, error-resilient video coding needs to be investigated further to offer better quality of experience (QoE) to end users. This research work aims at enhancing the error resilience capability of compressed 3D video, when transmitted over mobile channels, using Multiple Description Coding (MDC) in order to improve the user’s quality of experience (QoE). Furthermore, this thesis examines the sensitivity of the human visual system (HVS) when employed to view 3D video scenes. The approach used in this study is to use subjective testing in order to rate people’s perception of 3D video under error-free and error-prone conditions through the use of a carefully designed bespoke questionnaire. Petroleum Technology Development Fund (PTDF

Brunel University Research Archive

The role of automaticity and attention in neural processes underlying empathy for happiness, sadness, and anxiety.

Author: Lieberman Matthew D, Morelli Sylvia A
Publication venue: eScholarship, University of California
Publication date: 01/01/2013

Although many studies have examined the neural basis of empathy, relatively little is known about how empathic processes are affected by different attentional conditions. Thus, we examined whether instructions to empathize might amplify responses in empathy-related regions and whether cognitive load would diminish the involvement of these regions. Thirty-two participants completed a functional magnetic resonance imaging session assessing empathic responses to individuals experiencing happy, sad, and anxious events. Stimuli were presented under three conditions: watching naturally, actively empathizing, and under cognitive load. Across analyses, we found evidence for a core set of neural regions that support empathic processes (dorsomedial prefrontal cortex, DMPFC; medial prefrontal cortex, MPFC; temporoparietal junction, TPJ; amygdala; ventral anterior insula, AI; and septal area, SA). Two key regions, the ventral AI and SA, were consistently active across all attentional conditions, suggesting that they are automatically engaged during empathy. In addition, watching vs. empathizing with targets was not markedly different and instead led to similar subjective and neural responses to others' emotional experiences. In contrast, cognitive load reduced the subjective experience of empathy and diminished neural responses in several regions related to empathy and social cognition (DMPFC, MPFC, TPJ, and amygdala). The results reveal how attention impacts empathic processes and provide insight into how empathy may unfold in everyday interactions.

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

eScholarship - University of California

Ubiquitous Scalable Graphics: An End-to-End Framework using Wavelets

Author: Wu Fan
Publication venue: Digital WPI
Publication date: 19/11/2008

Advances in ubiquitous displays and wireless communications have fueled the emergence of exciting mobile graphics applications including 3D virtual product catalogs, 3D maps, security monitoring systems and mobile games. The current trend of using cameras to capture geometry, material reflectance and other graphics elements means that very high resolution inputs are available to render extremely photorealistic scenes. However, captured graphics content can be many gigabytes in size, and must be simplified before it can be used on small mobile devices, which have limited resources such as memory, screen size and battery energy. Scaling and converting graphics content to a suitable rendering format involves running several software tools, and selecting the best resolution for the target mobile device is often done by trial and error, all of which takes time. Wireless errors can also affect transmitted content, and aggressive compression is needed for low-bandwidth wireless networks. Most rendering algorithms are currently optimized for visual realism and speed, but are not resource or energy efficient on mobile devices. This dissertation focuses on improving rendering performance by reducing the impact of these problems with UbiWave, an end-to-end framework that uses wavelets to enable real-time mobile access to high resolution graphics. The framework tackles simplification, transmission, and resource-efficient rendering of wavelet-based graphics content on mobile devices by utilizing 1) a Perceptual Error Metric (PoI) for automatically computing the best resolution of graphics content for a given mobile display to eliminate guesswork and save resources, 2) Unequal Error Protection (UEP) to improve the resilience to wireless errors, 3) an Energy-efficient Adaptive Real-time Rendering (EARR) heuristic to balance energy consumption, rendering speed and image quality and 4) an Energy-efficient Streaming Technique.
The results facilitate a new class of mobile graphics applications which can gracefully adapt the lowest acceptable rendering resolution to the wireless network conditions and to the availability of resources and battery energy on the mobile device.

DigitalCommons@WPI

Metrics for Stereoscopic Image Compression

Author: GORLEY PAUL,WARD
Publication date: 01/01/2012

Metrics are investigated for automatically predicting the compression settings for stereoscopic images that minimize file size while still maintaining an acceptable level of image quality. This research evaluates whether symmetric or asymmetric compression produces a better quality of stereoscopic image. Initially, how Peak Signal to Noise Ratio (PSNR) measures the quality of varyingly compressed stereoscopic image pairs was investigated. Two trials with human subjects, following the ITU-R BT.500-11 Double Stimulus Continuous Quality Scale (DSCQS), were undertaken to measure the quality of symmetric and asymmetric stereoscopic image compression. Computational models of the Human Visual System (HVS) were then investigated and a new stereoscopic image quality metric designed and implemented. The metric point-matches regions of high spatial frequency between the left and right views of the stereo pair and accounts for HVS sensitivity to contrast and luminance changes in these regions. The PSNR results show that symmetric, as opposed to asymmetric, stereo image compression produces significantly better results. The human factors trial suggested that, in general, symmetric compression of stereoscopic images should be used. The new metric, Stereo Band Limited Contrast, has been demonstrated as a better predictor of human image quality preference than PSNR and can be used to predict a perceptual threshold level for stereoscopic image compression. The threshold is the maximum compression that can be applied without the perceived image quality being altered. Overall, it is concluded that symmetric, as opposed to asymmetric, stereo image encoding should be used for stereoscopic image compression. As PSNR measures of image quality are correctly criticized for correlating poorly with perceived visual quality, the new HVS-based metric was developed. This metric produces a useful threshold to provide a practical starting point to decide the level of compression to use.
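For readers unfamiliar with the baseline measure named in this abstract, the following is a minimal sketch of PSNR for a stereo pair. It is not the thesis's Stereo Band Limited Contrast metric. It assumes 8-bit pixels stored as flat sequences, and averaging the two per-view PSNR values is an illustrative assumption; all function names are hypothetical.

```python
import math
from typing import Sequence


def mean_squared_error(reference: Sequence[float], distorted: Sequence[float]) -> float:
    """Mean squared pixel difference between two equally sized images."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("images must be non-empty and the same size")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)


def psnr(reference: Sequence[float], distorted: Sequence[float],
         max_value: float = 255.0) -> float:
    """PSNR in dB: 10 * log10(max_value^2 / MSE); infinite for identical images."""
    mse = mean_squared_error(reference, distorted)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def stereo_psnr(left_ref, left_dist, right_ref, right_dist) -> float:
    """Average of the per-view PSNR values (an assumption made for illustration)."""
    return (psnr(left_ref, left_dist) + psnr(right_ref, right_dist)) / 2.0


if __name__ == "__main__":
    left_ref, right_ref = [100, 100, 100, 100], [50, 60, 70, 80]
    # Asymmetric case: only the left view is distorted.
    print(psnr(left_ref, [110, 100, 100, 100]))
    print(stereo_psnr(left_ref, [110, 100, 100, 100], right_ref, right_ref))
```

Note that an undistorted view has infinite PSNR, so the simple average becomes infinite in a fully asymmetric case; a practical comparison would need a different pooling rule.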

Durham e-Theses

OpenGrey Repository

Data mining in large audio collections of dolphin signals

Author: Kohlsdorf Daniel
Publication venue: Georgia Institute of Technology
Publication date: 21/09/2015

The study of dolphin cognition involves intensive research of animal vocalizations recorded in the field. In this dissertation I address the automated analysis of audible dolphin communication. I propose a system called the signal imager that automatically discovers patterns in dolphin signals. These patterns are invariant to frequency shifts and time warping transformations. The discovery algorithm is based on feature learning and unsupervised time series segmentation using hidden Markov models. Researchers can inspect the patterns visually and interactively run comparative statistics between the distribution of dolphin signals in different behavioral contexts. The required statistics for the comparison describe dolphin communication as a combination of the following models: a bag-of-words model, an n-gram model and an algorithm to learn a set of regular expressions. Furthermore, the system can use the patterns to automatically tag dolphin signals with behavior annotations. My results indicate that the signal imager provides meaningful patterns to the marine biologist and that the comparative statistics are aligned with the biologists’ domain knowledge. Ph.D.

Scholarly Materials And Research @ Georgia Tech

End-to-end 3D video communication over heterogeneous networks

Author: Mohib Hamdullah
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2014

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Three-dimensional technology, more commonly referred to as 3D technology, has revolutionised many fields including entertainment, medicine, and communications, to name a few. In addition to 3D films, games, and sports channels, 3D perception has made tele-medicine a reality. Consumer electronics manufacturers predict that, by the year 2015, 30% of all HD panels at home will be 3D enabled. Stereoscopic cameras, a comparatively mature technology compared to other 3D systems, are now being used by ordinary citizens to produce 3D content and share it at the click of a button, just as they do with 2D content via sites like YouTube. But technical challenges still exist, including with autostereoscopic multiview displays. Because of its increased amount of data, 3D content requires many complex considerations when it is transmitted or stored, including how to represent it and which compression format is best. Any decision must be taken in the light of the available bandwidth or storage capacity, quality and user expectations. Free viewpoint navigation also remains partly unsolved. The most pressing issue getting in the way of widespread uptake of consumer 3D systems is the ability to deliver 3D content to heterogeneous consumer displays over heterogeneous networks. Optimising 3D video communication solutions must consider the entire pipeline, from optimisation at the video source through transmission optimisation to the end display. Multi-view offers the most compelling solution for 3D videos, with motion parallax and freedom from wearing headgear for 3D video perception. Optimising multi-view video for delivery and display could increase the demand for true 3D in the consumer market.
This thesis focuses on end-to-end quality optimisation in 3D video communication/transmission, offering solutions for optimisation at the compression, transmission, and decoder levels. Brunel University - Isambard Research Scholarshi

Brunel University Research Archive

Investigation of the effects of image compression on the geometric quality of digital photogrammetric imagery

Author: Kwabena-Forkuo Eric
Publication venue: Faculty of Engineering and the Built Environment
Publication date: 12/09/2023

We are living in a decade in which the use of digital images is becoming increasingly important. Photographs are now converted into digital form, and direct acquisition of digital images is becoming increasingly important as sensors and associated electronics improve. Unlike images in analogue form, digital representation of images allows visual information to be easily manipulated in useful ways. One practical problem of digital image representation is that it requires a very large number of bits, and hence one encounters a fairly large volume of data in a digital production environment if images are stored uncompressed on disk. With the rapid advances in sensor technology and digital electronics, the number of bits grows larger in softcopy photogrammetry, remote sensing and multimedia GIS. As a result, it is desirable to find efficient representations for digital images in order to reduce the memory required for storage, improve the data access rate from storage devices, and reduce the time required for transfer across communication channels. The component of digital image processing that deals with this problem is called image compression. Image compression is a necessity for the utilisation of large digital images in softcopy photogrammetry, remote sensing, and multimedia GIS. Numerous image compression standards exist today with the common goal of reducing the number of bits needed to store images and facilitating the interchange of compressed image data between various devices and applications. The JPEG image compression standard is one alternative for carrying out the image compression task. This standard was formed under the auspices of ISO and CCITT for the purpose of developing an international standard for the compression and decompression of continuous-tone, still-frame, monochrome and colour images.
The JPEG standard algorithm falls into three general categories: the baseline sequential process that provides a simple and efficient algorithm for most image coding applications, the extended DCT-based process that allows the baseline system to satisfy a broader range of applications, and an independent lossless process for applications demanding that type of compression. This thesis experimentally investigates the geometric degradations resulting from lossy JPEG compression of photogrammetric imagery at various quality factors. The effects and the suitability of JPEG lossy image compression on industrial photogrammetric imagery are investigated. Examples are drawn from the extraction of targets in close-range photogrammetric imagery. In the experiments, JPEG was used to compress and decompress a set of test images. The algorithm has been tested on digital images containing various levels of entropy (a measure of the information content of an image) with different image capture capabilities. Residual data was obtained by taking the pixel-by-pixel difference between the original data and the reconstructed data. The root mean square (rms) error of the residual was used as the quality measure to judge the images produced by the JPEG (DCT-based) image compression technique. Two techniques, TIFF (LZW) compression and JPEG (DCT-based) compression, are compared with respect to the compression ratios achieved. JPEG (DCT-based) yields better compression ratios, and it seems to be a good choice for image compression. Further in the investigation, it was found that, for grey-scale images, the best compression ratios were obtained when quality factors between 60 and 90 were used (i.e., at a compression ratio of 1:10 to 1:20). At these quality factors the reconstructed data has virtually no degradation in visual and geometric quality for the application at hand.
Recently, many fast and efficient image file formats have also been developed to store, organise and display images in an efficient way. Almost every image file format incorporates some kind of compression method to manage data within commonplace networks and storage devices. The current major file formats used in softcopy photogrammetry, remote sensing and multimedia GIS were also investigated. It was also found that the choice of a particular image file format for a given application generally involves several interdependent considerations, including quality, flexibility, computation, storage, or transmission. The suitability of a file format for a given purpose is best determined by knowing its original purpose. Some of these are widely used (e.g., TIFF, JPEG) and serve as exchange formats. Others are adapted to the needs of particular applications or particular operating systems.
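The residual-based quality measure described in this abstract can be sketched as follows. This is a minimal illustration only: it assumes grey-scale images held as nested lists of pixel values, and the function names and the byte-count form of the compression ratio are hypothetical choices, not taken from the thesis.

```python
import math
from typing import List

Image = List[List[float]]


def residual(original: Image, reconstructed: Image) -> Image:
    """Pixel-by-pixel difference between the original and reconstructed data."""
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of rows")
    diff = []
    for row_o, row_r in zip(original, reconstructed):
        if len(row_o) != len(row_r):
            raise ValueError("rows must have the same length")
        diff.append([o - r for o, r in zip(row_o, row_r)])
    return diff


def rms_error(res: Image) -> float:
    """Root mean square of all residual values."""
    values = [v for row in res for v in row]
    if not values:
        raise ValueError("residual is empty")
    return math.sqrt(sum(v * v for v in values) / len(values))


def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Uncompressed size divided by compressed size (10.0 corresponds to 1:10)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]
    reconstructed = [[12, 20], [30, 36]]
    print(rms_error(residual(original, reconstructed)))
    print(compression_ratio(1_000_000, 100_000))
```

In an experiment like the one described, the reconstructed image would come from compressing and decompressing the original at a given quality factor, with the rms error then tracked against the achieved compression ratio.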

Cape Town University OpenUCT

Object Detection and Recognition Using YOLO: Detect and Recognize URL(s) in an Image Scene

Author: Ajala John
Publication venue: The Repository at St. Cloud State
Publication date: 01/05/2021

The world in the 21st century is ever evolving towards automation. This upsurge seemingly has no decline in the foreseeable future. Image recognition is at the forefront of this charge, which seeks to revolutionize the way of living of the average man. If robotics can be likened to the creation of a body for computers to live in, then image processing is the development of the part of its brain which deals with identification and recognition of images. To accomplish this task, we developed an object detection algorithm using YOLO, an acronym for “You Only Look Once”. Our algorithm was trained on fifty thousand images, evaluated on ten thousand images, and employed a 21 x 21 grid. We also programmed a text generator which randomly creates texts and URLs in an image. Information about the location of the URLs in the image is also recorded and later passed to the YOLO algorithm for training. At the end of this project, we observed a significant difference in the accuracy of URL detection when using an OCR software or our YOLO algorithm. However, our algorithm would be best used to specify the region of interest before converting to texts, which greatly improves accuracy when combined with OCR software.
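To make the role of the 21 x 21 grid concrete, the sketch below shows one common way a recorded URL bounding box could be turned into a grid-cell training target. It follows the usual YOLO convention of assigning a box to the cell containing its centre; whether the author's pipeline encodes targets exactly this way is an assumption, and the names `GridTarget` and `box_to_grid_target` are hypothetical.

```python
from typing import NamedTuple

GRID_SIZE = 21  # grid dimension quoted in the abstract


class GridTarget(NamedTuple):
    row: int         # grid row containing the box centre
    col: int         # grid column containing the box centre
    x_offset: float  # centre position inside the cell, 0..1
    y_offset: float
    width: float     # box size as a fraction of the image
    height: float


def box_to_grid_target(x_min: float, y_min: float, x_max: float, y_max: float,
                       image_width: int, image_height: int,
                       grid_size: int = GRID_SIZE) -> GridTarget:
    """Map a pixel bounding box to the grid cell responsible for it."""
    if not (0 <= x_min < x_max <= image_width and 0 <= y_min < y_max <= image_height):
        raise ValueError("box must lie inside the image and have positive size")
    # Centre expressed in grid units (multiply before dividing to limit rounding).
    gx = (x_min + x_max) / 2.0 * grid_size / image_width
    gy = (y_min + y_max) / 2.0 * grid_size / image_height
    # Clamp so a centre on the far edge still falls in the last cell.
    col = min(int(gx), grid_size - 1)
    row = min(int(gy), grid_size - 1)
    return GridTarget(row, col, gx - col, gy - row,
                      (x_max - x_min) / image_width,
                      (y_max - y_min) / image_height)


if __name__ == "__main__":
    # A hypothetical URL box in a 420 x 420 synthetic image.
    print(box_to_grid_target(105, 205, 145, 225, 420, 420))
```

A synthetic text generator like the one described could emit one such target per rendered URL, since it already knows where each URL was drawn.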

St. Cloud State University