13 research outputs found

    Make the Most Out of Your Net: Alternating Between Canonical and Hard Datasets for Improved Image Demosaicing

    Full text link
    Image demosaicing is an important step in the image processing pipeline for digital cameras, and it is one of the many tasks within the field of image restoration. A well-known characteristic of natural images is that most patches are smooth, while high-content patches such as textures or repetitive patterns are much rarer, which results in a long-tailed distribution. This distribution can create an inductive bias when training machine learning algorithms for image restoration tasks, and for image demosaicing in particular. There have been many different approaches to addressing this challenge, such as utilizing specific losses or designing special network architectures. What makes our work unique is that it tackles the problem from a training-protocol perspective. Our proposed training regime consists of two key steps. The first step is a data-mining stage in which sub-categories are created and then refined through an elimination process so that only the most helpful sub-categories are retained. The second step is a cyclic training process in which the neural network alternates between training on the mined sub-categories and on the original dataset. We have conducted various experiments to demonstrate the effectiveness of our training method for the image demosaicing task. Our results show that this method outperforms standard training across a range of architecture sizes and types, including CNNs and Transformers. Moreover, we achieve state-of-the-art results with a significantly smaller neural network than previous state-of-the-art methods.
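
    As a rough illustration of this two-step protocol, here is a minimal Python sketch; `score`, `train_epoch`, and the ranking criterion are hypothetical stand-ins for illustration only, not the authors' implementation (the paper's exact mining and elimination rules are not reproduced here).

    from typing import Callable, List, Sequence

    def refine_subcategories(score: Callable[[object], float],
                             subcategories: Sequence,
                             keep_top_k: int) -> List:
        # Elimination step (assumed criterion): rank mined sub-categories by
        # how badly the current model restores them and keep only the top
        # ones, i.e. the sub-categories most likely to help training.
        ranked = sorted(subcategories, key=score, reverse=True)
        return list(ranked[:keep_top_k])

    def cyclic_training(train_epoch: Callable[[object], None],
                        full_dataset: object,
                        mined: List,
                        cycles: int) -> None:
        # Cyclic step: alternate between the hard, rare-content subsets and
        # the canonical dataset so the long tail is not drowned out.
        for _ in range(cycles):
            for sub in mined:
                train_epoch(sub)
            train_epoch(full_dataset)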

    Target-oriented Domain Adaptation for Infrared Image Super-Resolution

    Full text link
    Recent efforts have explored leveraging visible light images to enrich texture details in infrared (IR) super-resolution. However, this direct adaptation approach often becomes a double-edged sword, as it improves texture at the cost of introducing noise and blurring artifacts. To address these challenges, we propose the Target-oriented Domain Adaptation SRGAN (DASRGAN), an innovative framework specifically engineered for robust IR super-resolution model adaptation. DASRGAN operates on the synergy of two key components: 1) Texture-Oriented Adaptation (TOA) to refine texture details meticulously, and 2) Noise-Oriented Adaptation (NOA), dedicated to minimizing noise transfer. Specifically, TOA uniquely integrates a specialized discriminator, incorporating a prior extraction branch, and employs a Sobel-guided adversarial loss to align texture distributions effectively. Concurrently, NOA utilizes a noise adversarial loss to distinctly separate the generative and Gaussian noise pattern distributions during adversarial training. Our extensive experiments confirm DASRGAN's superiority. Comparative analyses against leading methods across multiple benchmarks and upsampling factors reveal that DASRGAN sets new state-of-the-art performance standards. Code is available at \url{https://github.com/yongsongH/DASRGAN}.
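
    To make the Sobel-guided texture idea concrete, here is a minimal PyTorch sketch of extracting Sobel edge maps and comparing them; the function names, shapes, and L1 comparison are illustrative assumptions, not DASRGAN's actual loss or discriminator wiring.

    import torch
    import torch.nn.functional as F

    # Sobel kernels for single-channel tensors of shape (N, 1, H, W).
    _KX = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3)
    _KY = _KX.transpose(2, 3)

    def sobel_magnitude(img: torch.Tensor) -> torch.Tensor:
        # Gradient magnitude of the image; eps keeps the sqrt differentiable.
        gx = F.conv2d(img, _KX.to(img), padding=1)
        gy = F.conv2d(img, _KY.to(img), padding=1)
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

    def sobel_guided_loss(sr: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
        # Align the edge (texture) maps of the super-resolved and reference
        # images; in DASRGAN this signal feeds an adversarial objective.
        return F.l1_loss(sobel_magnitude(sr), sobel_magnitude(ref))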

    Honest Score Client Selection Scheme: Preventing Federated Learning Label Flipping Attacks in Non-IID Scenarios

    Full text link
    Federated Learning (FL) is a promising technology that enables multiple actors to build a joint model without sharing their raw data. Its distributed nature makes FL vulnerable to various poisoning attacks, including model poisoning attacks and data poisoning attacks. Today, many Byzantine-resilient FL methods have been introduced to mitigate model poisoning attacks, while their effectiveness in defending against data poisoning attacks remains unclear. In this paper, we focus on the most representative data poisoning attack, the "label flipping attack", and evaluate its effectiveness against existing FL methods. The results show that the existing FL methods perform similarly in independent and identically distributed (IID) settings but fail to maintain model robustness in non-IID settings. To mitigate the weaknesses of existing FL methods in non-IID scenarios, we introduce the Honest Score Client Selection (HSCS) scheme and the corresponding HSCSFL framework. In HSCSFL, the server collects a clean dataset for evaluation. In each iteration, the server collects the gradients from the clients and then performs HSCS to select aggregation candidates. The server first evaluates the performance of the global model on each class and generates a corresponding risk vector indicating which classes could potentially be under attack. Similarly, the server evaluates each client's model and records its per-class performance as an accuracy vector. The dot product of each client's accuracy vector with the global risk vector gives the client's honest score; only the clients with the top p\% honest scores are included in the following aggregation. Finally, the server aggregates the gradients and uses the outcome to update the global model. Comprehensive experimental results show that HSCSFL effectively enhances FL robustness and defends against the "label flipping attack".
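
    A small numpy sketch of this selection rule, under assumed shapes and an assumed risk-vector definition (risk = 1 - per-class accuracy of the global model); the paper's exact construction is not reproduced here.

    import numpy as np

    def honest_score_selection(global_class_acc: np.ndarray,
                               client_class_acc: np.ndarray,
                               top_p: float) -> np.ndarray:
        # Risk vector (assumed definition): classes the global model
        # currently handles poorly are the ones most plausibly under attack.
        risk = 1.0 - global_class_acc                # (num_classes,)
        scores = client_class_acc @ risk             # honest score per client
        k = max(1, int(round(top_p * len(scores))))
        return np.argsort(scores)[::-1][:k]          # clients kept for aggregation

    # Toy example: 5 clients, 3 classes; class 2 looks under attack, so
    # clients that stay accurate on it receive higher honest scores.
    g = np.array([0.90, 0.85, 0.40])
    c = np.random.default_rng(0).uniform(0.3, 0.95, size=(5, 3))
    print(honest_score_selection(g, c, top_p=0.6))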

    Vesta: A Digital Health Analytics Platform for a Smart Home in a Box

    Get PDF
    This paper presents Vesta, a digital health platform composed of a smart home in a box for data collection and a machine-learning-based analytic system for deriving health indicators using activity recognition, sleep analysis, and indoor localization. The system has been deployed in the homes of 40 patients undergoing a heart valve intervention in the United Kingdom (UK) as part of the EurValve project, measuring patients' health and well-being before and after their operation. In this work, a cohort of 20 patients is analyzed, and 2 patients are examined in detail as example case studies. A quantitative evaluation of the platform is provided using patient-collected data, together with a comparison against the standardized Patient Reported Outcome Measures (PROMs) commonly used in hospitals and a custom survey. It is shown how the ubiquitous in-home Vesta platform can increase clinical confidence in self-reported patient feedback. Demonstrating its suitability for digital health studies, Vesta provides deeper insight into the health, well-being, and recovery of patients within their homes.

    Fully Bayesian Inference for Finite and Infinite Discrete Exponential Mixture Models

    Get PDF
    Count data often appear in natural language processing and computer vision applications. For example, when clustering images or textual documents, each image or text can be described by a histogram of visual words or text words. In real applications, these frequency vectors are often high-dimensional and sparse. In this case, hierarchical Bayesian modeling frameworks are able to model the 'burstiness' of repeated word occurrences. Moreover, approximating these models by exponential-family distributions helps to improve computational efficiency, especially when facing high-dimensional count data and large data sets. However, classical deterministic approaches such as expectation-maximization (EM) do not achieve good results in complex real-life applications. This thesis explores the use of fully Bayesian inference for the finite discrete exponential-family mixture models Multinomial Generalized Dirichlet (EMGD), Multinomial Beta-Liouville (EMBL), Multinomial Scaled Dirichlet (EMSD), and Multinomial Shifted Scaled Dirichlet (EMSSD). Finite mixtures have already shown superior performance in clustering real data sets with the EM approach. The approaches proposed in this thesis are based on the Monte Carlo simulation technique of Gibbs sampling combined with a Metropolis-Hastings step, and we utilize exponential-family conjugate priors to construct the required posteriors, relying on Bayesian theory. Furthermore, we also present infinite models based on Dirichlet processes, which result in clustering algorithms that do not require the number of mixture components to be specified in advance. The performance of our Bayesian approaches is tested on challenging real-world applications concerning text sentiment analysis, fake news detection, and human face gender recognition.
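
    As a rough illustration of this kind of inference, here is a generic Metropolis-within-Gibbs skeleton for a finite multinomial mixture in numpy/scipy; `log_prior`, the proposal concentration `conc`, and all shapes are placeholder assumptions, and the thesis's specific EMGD/EMBL/EMSD/EMSSD posteriors are not reproduced.

    import numpy as np
    from scipy.stats import dirichlet

    rng = np.random.default_rng(1)

    def gibbs_mh_mixture(X, K, iters, log_prior, conc=50.0):
        # X: (n, d) matrix of counts. log_prior(theta_k) stands in for the
        # log-density of whichever prior (GD, BL, SD, SSD, ...) is used.
        n, d = X.shape
        theta = rng.dirichlet(np.ones(d), size=K)    # component parameters
        pi = np.full(K, 1.0 / K)                     # mixing weights
        for _ in range(iters):
            # Gibbs step: sample allocations z | pi, theta.
            logp = np.log(pi) + X @ np.log(theta).T
            logp -= logp.max(axis=1, keepdims=True)
            p = np.exp(logp)
            p /= p.sum(axis=1, keepdims=True)
            z = (p.cumsum(axis=1) > rng.uniform(size=(n, 1))).argmax(axis=1)
            # Gibbs step: sample pi | z from its Dirichlet full conditional.
            pi = rng.dirichlet(1.0 + np.bincount(z, minlength=K))
            # MH step: update each theta_k with a Dirichlet proposal centred
            # at the current value, including the Hastings correction.
            for k in range(K):
                Xk = X[z == k]
                post = lambda t: log_prior(t) + (Xk @ np.log(t)).sum()
                a_fwd = conc * theta[k] + 1e-3
                prop = rng.dirichlet(a_fwd)
                a_bwd = conc * prop + 1e-3
                log_acc = (post(prop) - post(theta[k])
                           + dirichlet.logpdf(theta[k], a_bwd)
                           - dirichlet.logpdf(prop, a_fwd))
                if np.log(rng.uniform()) < log_acc:
                    theta[k] = prop
        return pi, theta, z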

    Super-resolution assessment and detection

    Get PDF
    Super Resolution (SR) techniques are powerful digital manipulation tools that have significantly impacted various industries due to their ability to enhance the resolution of lower-quality images and videos. Yet, adapting SR models to the real world poses numerous challenges, which blind SR models aim to overcome by emulating complex real-world degradations. In this thesis, we investigate these SR techniques, with a particular focus on comparing the performance of blind models to their non-blind counterparts under various conditions. Despite recent progress, the proliferation of SR techniques raises concerns about their potential misuse. These methods can easily manipulate real digital content and create misrepresentations, which highlights the need for robust SR detection mechanisms. In our study, we analyze the limitations of current SR detection techniques and propose a new detection system that exhibits higher performance in distinguishing real from upscaled videos. Moreover, we conduct several experiments to gain insights into the strengths and weaknesses of the detection models, providing a better understanding of their behavior and limitations. In particular, we target 4K videos, which are rapidly becoming the standard resolution in fields such as streaming services, gaming, and content creation. As part of our research, we have created and utilized a unique dataset in 4K resolution, specifically designed to facilitate the investigation of SR techniques and their detection.

    Entropy in Image Analysis II

    Get PDF
    Image analysis is a fundamental task for any application where extracting information from images is required. The analysis requires highly sophisticated numerical and analytical methods, particularly for applications in medicine, security, and other fields where the results of the processing consist of data of vital importance. This fact is evident from all the articles composing the Special Issue "Entropy in Image Analysis II", in which the authors used widely tested methods to verify their results. In reading the present volume, the reader will appreciate the richness of the methods and applications, in particular for medical imaging and image security, and a remarkable cross-fertilization among the proposed research areas.

    Deep Learning in Medical Image Analysis

    Get PDF
    The accelerating power of deep learning in diagnosing diseases will empower physicians and speed up decision making in clinical environments. Applications of modern medical instruments and the digitalization of medical care have generated enormous amounts of medical images in recent years. In this big-data arena, new deep learning methods and computational models for efficient data processing, analysis, and modeling of the generated data are crucially important for clinical applications and for understanding the underlying biological processes. This book presents and highlights novel algorithms, architectures, techniques, and applications of deep learning for medical image analysis.

    An investigation of the use of gradients in imaging, including best approximation and the Structural Similarity image quality measure

    Get PDF
    The L^2-based mean squared error (MSE) and its variations continue to be the most widely employed metrics in image processing. This is most probably due to the fact that (1) the MSE is simple to compute and (2) it possesses a number of convenient mathematical properties, including differentiability and convexity. It is well known, however, that these L^2-based measures perform poorly in terms of measuring the visual quality of images. Their failure is partially due to the fact that the L^2 metric does not capture spatial relationships between pixels. This was a motivation for the introduction of the so-called Structural Similarity (SSIM) image quality measure [1] which, along with its variations, continues to be one of the most effective measures of visual quality. The SSIM index measures the similarity between two images by combining three components of the human visual system: luminance, contrast, and structure. It is our belief that the structure term, which measures the correlation between images, is the most important component of the SSIM.

    A considerable portion of this thesis focuses on adapting the L^2 distance for image processing applications. Our first approach involves inserting an intensity-dependent weight function into the integral so that it conforms to a generalized Weber model of perception. We solve the associated best approximation problem and discuss examples in both one and two dimensions. Motivated by the success of the SSIM, we also solve the Weberized best approximation problem with an added regularization term involving the correlation.

    Another approach we take towards adapting the MSE for image processing involves introducing gradient information into the metric. Specifically, we study the traditional L^2 best approximation problem with an added regularization term involving the L^2 distance between gradients. We use orthonormal functions to solve the best approximation problem in both the continuous and discrete settings. In both cases, we prove that the Fourier coefficients remain optimal provided certain assumptions on the orthonormal basis hold. Our final best approximation problem involves maximizing the correlation between gradients. We obtain the relevant stationarity conditions and show that an infinity of solutions exists. A unique solution can be obtained using two assumptions adapted from [2]. We demonstrate that this problem is equivalent to maximizing the entire SSIM function between gradients. During this work, we prove that the discrete derivatives of the DCT and DFT basis functions form an orthogonal set, a result which, to the best of our knowledge, has not appeared in the literature.

    Our study of gradients is not limited to best approximation problems. A second major focus of this thesis concerns the development of gradient-based image quality measures, based on the idea that the human visual system may also be sensitive to distortions in the magnitudes and/or directions of variations in greyscale or colour intensities, in other words, their gradients. Indeed, as we show in a simple but persuasive example, using the L^2 distance between image gradients already yields a significant improvement over the MSE. One naturally wonders whether a measure of the correlation between image gradients could yield even better results, possibly "better" than the SSIM itself. (We will define what we mean by "better" in this thesis.) For this reason, we pursue many possible forms of a "gradient-based SSIM".
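
    For reference, the standard form of the SSIM index from [1], whose components and stability constants C_i are discussed here:

    \[
    \mathrm{SSIM}(x,y) = [l(x,y)]^{\alpha}\,[c(x,y)]^{\beta}\,[s(x,y)]^{\gamma},
    \]
    where
    \[
    l(x,y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \qquad
    c(x,y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \qquad
    s(x,y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3},
    \]
    and, with the common choices $\alpha = \beta = \gamma = 1$ and $C_3 = C_2/2$,
    \[
    \mathrm{SSIM}(x,y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}.
    \]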
First, however, we must address the question of how to define the correlation between the gradient vectors of two images. We formulate and compare many novel gradient similarity measures. Among these, we justify our selection of a preferred measure which, although simple-minded, we show to be highly correlated with the "rigorous" canonical correlation method. We then present many attempts at incorporating our gradient similarity measure into the SSIM. We finally arrive at a novel gradient-based SSIM, our so-called "gradSSIM1", which we argue does, in fact, improve the SSIM. The novelty of our approach lies in its use of SSIM-dependent exponents, which allow us to seamlessly blend our measure of gradient correlation and the traditional SSIM. To compare two image quality measures, e.g., the SSIM and our "gradSSIM1", we require the LIVE image database [3]. This database contains numerous distorted images, each of which is associated with a single score indicating visual quality. We suggest that these scores be considered as the independent variable, an understanding that does not appear to have been adopted elsewhere in the literature. This work also necessitated a detailed analysis of the SSIM, including the roles of its individual components and the effect of varying its stability constants. It appears that such an analysis has not been performed elsewhere in the literature.
    References:
    [1] Z. Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Transactions on Image Processing, 13(4):600-612, 2004.
    [2] P. Bendevis and E.R. Vrscay. Structural Similarity-Based Approximation over Orthogonal Bases: Investigating the Use of Individual Component Functions S_k(x,y). In Aurelio Campilho and Mohamed Kamel, editors, Image Analysis and Recognition - 11th International Conference, ICIAR 2014, Vilamoura, Portugal, October 22-24, 2014, Proceedings, Part I, volume 8814 of Lecture Notes in Computer Science, pages 55-64, 2014.
    [3] H.R. Sheikh, M.F. Sabir, and A.C. Bovik. A Statistical Evaluation of Recent Full Reference Image Quality Assessment Algorithms. IEEE Transactions on Image Processing, 15(11):3440-3451, November 2006.