
    Multi-Modal Enhancement Techniques for Visibility Improvement of Digital Images

    This dissertation investigates image enhancement techniques for improving the visibility of 8-bit color digital images, based on spatial-domain, wavelet-transform-domain, and multiple-image-fusion approaches. In the spatial-domain category, two enhancement algorithms are developed to deal with problems associated with images captured from scenes with high dynamic range. The first technique is based on an illuminance-reflectance (I-R) model of the scene irradiance. Dynamic range compression of the input image is achieved by a nonlinear transformation of the estimated illuminance based on a windowed inverse sigmoid transfer function. A single-scale, neighborhood-dependent contrast enhancement process is proposed to enhance the high-frequency components of the illuminance, which compensates for the contrast degradation of the mid-tone frequency components caused by dynamic range compression. The intensity image obtained by integrating the enhanced illuminance with the extracted reflectance is then converted to an RGB color image through linear color restoration using the color components of the original image. The second technique, named AINDANE, is a two-step approach comprising adaptive luminance enhancement and adaptive contrast enhancement. An image-dependent nonlinear transfer function is designed for dynamic range compression, and a multiscale, image-dependent neighborhood approach is developed for contrast enhancement. Real-time processing of video streams is realized with the I-R model-based technique owing to its high processing speed, while AINDANE produces higher-quality enhanced images owing to its multiscale contrast enhancement. Compared with conventional techniques, both algorithms exhibit balanced luminance and contrast enhancement, greater robustness, and better color consistency.
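    The I-R pipeline described above can be sketched roughly as follows. The box-blur illuminance estimator, the plain logistic sigmoid, and all parameter values below are illustrative stand-ins for the dissertation's actual windowed inverse sigmoid and neighborhood-dependent processing, not its real implementation:

```python
import numpy as np

def box_blur(img, k=15):
    """Simple k x k mean filter, used here as a stand-in low-pass
    illuminance estimator (the dissertation's estimator may differ)."""
    pad = k // 2
    p = np.pad(img, pad, mode='edge')
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def enhance_ir(intensity, alpha=6.0):
    """Illuminance-reflectance enhancement sketch for a [0, 1] intensity
    image. The sigmoid below stands in for the paper's windowed inverse
    sigmoid transfer function; alpha is an illustrative parameter."""
    eps = 1e-6
    illuminance = box_blur(intensity)              # low-pass estimate
    reflectance = intensity / (illuminance + eps)  # Retinex-style split
    # Sigmoid-shaped dynamic range compression of the illuminance.
    compressed = 1.0 / (1.0 + np.exp(-alpha * (illuminance - 0.5)))
    compressed = (compressed - compressed.min()) / (np.ptp(compressed) + eps)
    # Recombine compressed illuminance with the extracted reflectance.
    return np.clip(compressed * reflectance, 0.0, 1.0)
```

    For a color input, the same idea is applied to the intensity channel only, after which the result is mapped back to RGB using the original chromatic ratios, as the abstract's linear color restoration step describes.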
In the transform-domain approach, wavelet-transform-based image denoising and contrast enhancement algorithms are developed. Denoising is treated as a maximum a posteriori (MAP) estimation problem; a bivariate probability density function model is introduced to exploit the inter-level dependency among the wavelet coefficients. In addition, an approximate solution to the MAP estimation problem is proposed to avoid the complex iterative computations otherwise needed to find a numerical solution. This relatively low-complexity image denoising algorithm, implemented with the dual-tree complex wavelet transform (DT-CWT), produces high-quality denoised images.
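The closed-form bivariate shrinkage rule of Sendur and Selesnick is one well-known non-iterative MAP solution built on exactly this kind of parent-child coefficient model; it is shown here as a plausible sketch of the approach, though the dissertation's exact model and approximation may differ:

```python
import numpy as np

def bivariate_shrink(w, w_parent, sigma_n, sigma):
    """MAP shrinkage using a bivariate parent-child coefficient model.

    Sendur-Selesnick closed-form rule: a coefficient w is shrunk toward
    zero by an amount that depends on the joint magnitude of w and its
    parent w_parent at the next coarser scale. sigma_n is the noise
    standard deviation, sigma the (local) signal standard deviation.
    """
    mag = np.sqrt(w ** 2 + w_parent ** 2)
    # Soft threshold on the joint magnitude; small joint magnitudes
    # (likely pure noise) are set exactly to zero.
    factor = np.maximum(mag - np.sqrt(3.0) * sigma_n ** 2 / sigma, 0.0)
    return w * factor / np.maximum(mag, 1e-12)
```

In a full denoiser, this rule would be applied subband by subband to the DT-CWT coefficients, with sigma estimated locally, before inverting the transform.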

    Perceptual lossless medical image coding

    A novel perceptually lossless coder is presented for the compression of medical images. Built on the JPEG 2000 coding framework, the heart of the proposed coder is a visual pruning function, embedded with an advanced human vision model, that identifies and removes visually insignificant or irrelevant information. The proposed coder offers the advantages of simplicity and modularity with bit-stream compliance. Current results show superior compression ratio gains over those of its information-lossless counterparts, without any visible distortion. In addition, a case study involving 31 medical experts showed that no perceivable difference of statistical significance exists between the original images and the images compressed by the proposed coder.

    Livrable D3.3 of the PERSEE project : 2D coding tools

    Deliverable D3.3 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170); specifically, it corresponds to deliverable D3.3 of the project. Its title: 2D coding tools.

    The effects of emotionally salient unimodal and multimodal stimuli on low-level visual perception

    Sensory information can both impair and enhance low-level visual feature processing, and this modulation can depend significantly on whether the information is presented in the visual sensory modality. Emotionally significant visual and auditory stimuli can have opposing effects on attention: while task-irrelevant, emotionally salient visual stimuli often impair task attention, task-irrelevant, emotionally salient auditory stimuli have been shown to enhance aspects of attention. To date, no study has directly compared how emotionally salient information presented to different sensory modalities affects low-level vision. Using Gabor patches of differing contrasts to measure the threshold of visual perception, we hypothesized that emotionally salient visual stimuli would impair low-level vision, while emotionally salient auditory stimuli would enhance it. We found that sensory modulation may be dependent on matched sensory domain presentation, as visual emotional stimuli impaired low-level vision, but emotional auditory stimuli did not affect low-level vision.
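    For readers unfamiliar with the stimuli, a Gabor patch with an adjustable contrast parameter can be generated as below; the size, wavelength, and envelope values are illustrative choices, not those used in the study:

```python
import numpy as np

def gabor_patch(size=64, wavelength=8.0, sigma=10.0, contrast=0.5):
    """Generate a Gabor patch: a sinusoidal grating windowed by a
    Gaussian envelope, on a mid-grey (0.5) background.

    contrast in [0, 1] scales the grating's amplitude; lowering it is
    how threshold-of-perception experiments make the patch harder to
    see. Parameters here are illustrative only.
    """
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half]
    grating = np.cos(2.0 * np.pi * x / wavelength)   # vertical grating
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return 0.5 + 0.5 * contrast * grating * envelope
```

    A staircase procedure would then vary `contrast` from trial to trial to estimate each observer's detection threshold.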

    Edge-Enhancement DenseNet for X-ray Fluoroscopy Image Denoising in Cardiac Electrophysiology Procedures

    PURPOSE: Reducing X-ray dose increases safety in cardiac electrophysiology procedures but also increases image noise and artifacts, which may affect the discernibility of devices and anatomical cues. Previous denoising methods based on convolutional neural networks (CNNs) have shown improvements in the quality of low-dose X-ray fluoroscopy images but may compromise clinically important details required by cardiologists. METHODS: To obtain denoised X-ray fluoroscopy images whilst preserving details, we propose a novel deep-learning-based denoising framework, edge-enhancement DenseNet (EEDN), in which an attention-aware edge-enhancement module is designed to increase edge sharpness. In this framework, a CNN-based denoiser is first used to generate an initial denoising result. Contours representing edge information are then extracted using an attention block and a group of interacting ultra-dense blocks for edge feature representation. Finally, the initial denoising result and the enhanced edges are combined to generate the final X-ray image. The proposed denoising framework was tested on a total of 3262 clinical images taken from 100 low-dose X-ray sequences acquired from 20 patients. Performance was assessed by pairwise voting from five cardiologists as well as by quantitative indicators. Furthermore, we evaluated the technique's effect on catheter detection using 416 images containing coronary sinus catheters, in order to examine its influence as a pre-processing tool. RESULTS: The average signal-to-noise ratio of X-ray images denoised with EEDN was 24.5, which was 2.2 times higher than that of the original images. The accuracy of catheter detection from EEDN-denoised sequences showed no significant difference compared with their original counterparts. Moreover, EEDN received the highest average votes in our clinician assessment when compared to our existing technique and the original images.
CONCLUSION: The proposed deep-learning-based framework shows promising capability for denoising interventional X-ray fluoroscopy images. The catheter detection results show that the network, when used as a pre-processing step, does not affect the performance of such an algorithm. The extensive qualitative and quantitative evaluations suggest that the network may help reduce radiation dose when applied in real time in the catheter laboratory.
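The denoise-then-re-inject-edges structure of EEDN can be illustrated schematically. The real framework uses a CNN denoiser and attention-based ultra-dense blocks; here a mean filter and a gradient edge map stand in purely to make the two-stage composition concrete:

```python
import numpy as np

def denoise_with_edge_enhancement(img, edge_weight=0.3):
    """Schematic two-stage pipeline in the spirit of EEDN (not the
    actual network): (1) denoise, (2) extract an edge map, (3) blend
    the edges back into the denoised result to restore sharpness.

    img is a 2-D array in [0, 1]; edge_weight is an illustrative
    blending parameter.
    """
    h, w = img.shape
    pad = np.pad(img, 1, mode='edge')
    # Stage 1 stand-in denoiser: 3x3 mean filter.
    denoised = sum(pad[dy:dy + h, dx:dx + w]
                   for dy in range(3) for dx in range(3)) / 9.0
    # Stage 2 stand-in edge extractor: finite-difference gradient magnitude.
    gy, gx = np.gradient(denoised)
    edges = np.sqrt(gx ** 2 + gy ** 2)
    # Stage 3: combine denoised image and weighted edges.
    return np.clip(denoised + edge_weight * edges, 0.0, 1.0)
```

The design point the sketch preserves is that edge information is computed and added back explicitly, rather than hoping the denoiser alone retains clinically important contours.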

    Adaptive video delivery using semantics

    The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedback between an object partition and a region partition is used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message.
The metadata-based representation of objects' shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer, in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.
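A region-weighted PSNR of the kind the SPSNR describes can be sketched as follows; the exact weighting scheme used in the dissertation may differ, so treat this as the general shape of such a metric rather than its definition:

```python
import numpy as np

def spsnr(reference, distorted, weights, peak=255.0):
    """Semantic-PSNR sketch: a PSNR in which each pixel's squared error
    is weighted by a semantic relevance map (e.g. high weight on
    segmented foreground objects, low weight on background).

    weights is a non-negative array the same shape as the images; the
    weighting scheme here is an illustrative assumption.
    """
    err = (reference.astype(float) - distorted.astype(float)) ** 2
    # Weighted mean squared error, normalized by the total weight.
    wmse = np.sum(weights * err) / np.sum(weights)
    return 10.0 * np.log10(peak ** 2 / max(wmse, 1e-12))
```

With such a metric, the same absolute error costs more when it falls inside a semantically important region, which is exactly what lets the adaptation strategy trade background fidelity for object fidelity.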

    Assessing neurodegeneration of the retina and brain with ultra-widefield retinal imaging

    The eye is embryologically, physiologically and anatomically linked to the brain. Emerging evidence suggests that neurodegenerative diseases, such as Alzheimer's disease (AD), manifest in the retina. Retinal imaging is a quick, non-invasive method to view the retina and its microvasculature. Features such as blood vessel calibre, tortuosity and the complexity of the vascular structure (measured through fractal analysis) are thought to reflect microvascular health and have been found to be associated with clinical signs of hypertension, diabetes, cardiovascular disease and cognitive decline. Small deposits of acellular debris called drusen in the peripheral retina have also been linked with AD, where histological studies show they can contain amyloid beta, a hallmark of AD. Age-related macular degeneration (AMD) is a neurodegenerative disorder of the retina and a leading cause of irreversible vision loss in the ageing population. An increasing number and size of drusen is characteristic of AMD disease progression. Ultra-widefield (UWF) retinal imaging with a scanning laser ophthalmoscope captures up to 80% of the retina in a single acquisition, allowing a larger area of the retina, particularly the periphery, to be assessed for signs of neurodegeneration than is possible with a conventional fundus camera. Quantification of changes to the microvasculature and drusen load could be used to derive early biomarkers of diseases that have vascular and neurodegenerative components, such as AD and other forms of dementia. Manually grading drusen in UWF images is a difficult, subjective and time-consuming process because the area imaged is large (around 700 mm²) and drusen appear as small spots. An automated system for detecting drusen in UWF images was therefore developed and evaluated, achieving AUC 0.55-0.59, 0.78-0.82 and 0.82-0.85 in the central, perimacular and peripheral zones, respectively.
Measurements of the retinal vasculature appearing in UWF images of cognitively healthy (CH) individuals and patients diagnosed with mild cognitive impairment (MCI) and AD were obtained using a previously established pipeline. Following data cleaning, vascular measures were compared using multivariate generalised estimating equations (GEE), which account for the correlation between the eyes of each individual, with correction for confounders (e.g. age). The vascular measures were repeated for a subset of images and analysed using GEE to assess the repeatability of the results. When comparing AD with CH, the analysis showed a statistically significant difference between measurements of arterioles in the inferonasal quadrant, but fractal analysis produced inconsistent results due to differences in the area sampled in which the fractal dimension was calculated. When looking at drusen load, there was a higher abundance of drusen in the inferonasal region of the peripheral retina in the CH and AD groups compared to the MCI group. Using GEE analysis, there was evidence of a significant difference in drusen count when comparing MCI to CH (p = 0.02) and MCI to AD (p = 0.03), but no evidence of a difference when comparing AD to CH. However, given the low sensitivity of the system (partly the result of only moderate agreement between human observers), a large proportion of drusen will go undetected, leading to an underestimation of the true amount of drusen present in an image. Overcoming this limitation will involve training the system on larger datasets with annotations from additional observers to create a more consistent reference standard. Further validation could then be performed to determine whether these promising pilot results persist, leading to candidate retinal biomarkers of AD.

    Study and Implementation of Watermarking Algorithms

    Watermarking is the process of embedding data, called a watermark, into a multimedia object such that the watermark can later be detected or extracted to make an assertion about the object. The object may be audio, an image, or video. A copy of a digital image is identical to the original; this has, in many instances, led to the use of digital content with malicious intent. One way to protect multimedia data against illegal recording and retransmission is to embed a signal, called a digital signature, copyright label, or watermark, that authenticates the owner of the data. Data hiding, the embedding of secondary data in digital media, has made considerable progress in recent years and attracted attention from both academia and industry. Techniques have been proposed for a variety of applications, including ownership protection, authentication, and access control. Imperceptibility, robustness against moderate processing such as compression, and the ability to hide many bits are the basic but rat..
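    To make the embed/extract idea concrete, the simplest spatial-domain scheme, least-significant-bit (LSB) embedding, is sketched below. It is fragile under even mild processing and is not one of the robust techniques the text alludes to; it only illustrates the general watermarking round trip:

```python
import numpy as np

def embed_lsb(cover, bits):
    """Embed watermark bits into the least significant bit of the first
    len(bits) pixels of an 8-bit cover image (illustrative only; an
    LSB watermark does not survive compression or filtering)."""
    out = cover.copy().ravel()
    for i, b in enumerate(bits):
        # Clear the LSB, then set it to the watermark bit.
        out[i] = (out[i] & 0xFE) | (b & 1)
    return out.reshape(cover.shape)

def extract_lsb(marked, n_bits):
    """Recover the first n_bits watermark bits from a marked image."""
    return [int(p) & 1 for p in marked.ravel()[:n_bits]]
```

Because each pixel changes by at most one grey level, the mark is imperceptible, which demonstrates the imperceptibility requirement; robustness, the other requirement the abstract names, is what the transform-domain schemes studied in the thesis are for.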

    Advancements and Breakthroughs in Ultrasound Imaging

    Ultrasonic imaging is a powerful diagnostic tool available to medical practitioners, engineers, and researchers today. Owing to its relative safety and non-invasive nature, ultrasonic imaging has become one of the most rapidly advancing technologies. These rapid advances are directly related to parallel advancements in electronics, computing, and transducer technology, together with sophisticated signal processing techniques. This book focuses on state-of-the-art developments in ultrasonic imaging applications and the underlying technologies, presented by leading practitioners and researchers from many parts of the world.

    Computational processing and analysis of ear images

    Master's thesis. Biomedical Engineering. Faculdade de Engenharia, Universidade do Porto. 201