17 research outputs found

    CLIP Guided Image-perceptive Prompt Learning for Image Enhancement

    Image enhancement is a significant research area in computer vision and image processing. In recent years, many learning-based methods for image enhancement have been developed, among which the look-up table (LUT) has proven to be an effective tool. In this paper, we explore the potential of Contrastive Language-Image Pre-Training (CLIP) guided prompt learning and propose a simple structure called CLIP-LUT for image enhancement. We found that the prior knowledge of CLIP can effectively discern the quality of degraded images, which can provide reliable guidance. Specifically, we first learn image-perceptive prompts that distinguish between original and target images using the CLIP model; meanwhile, we introduce a very simple enhancement network that incorporates a simple baseline to predict the weights of three different LUTs. The obtained prompts are used to steer the enhancement network, acting like a loss function, and improve the performance of the model. We demonstrate that by simply combining a straightforward method with CLIP, we can obtain satisfactory results. Comment: A trial work to the image enhancement.
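
    The idea of predicting weights for three LUTs can be illustrated with a minimal sketch. It assumes 1-D per-channel LUTs with linear interpolation and hand-picked weights; the function names (apply_lut, blend_luts, enhance) and the toy LUTs are hypothetical and are not taken from the paper, whose LUT layout and weight predictor may differ.

```python
from typing import List, Sequence


def apply_lut(lut: Sequence[float], value: float) -> float:
    """Map an intensity in [0, 1] through a 1-D LUT with linear interpolation."""
    value = min(max(value, 0.0), 1.0)
    pos = value * (len(lut) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(lut) - 1)
    frac = pos - lo
    return lut[lo] * (1.0 - frac) + lut[hi] * frac


def blend_luts(luts: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Fuse several LUTs into one as a weighted sum of their entries."""
    size = len(luts[0])
    return [sum(w * lut[i] for w, lut in zip(weights, luts)) for i in range(size)]


def enhance(pixels: Sequence[float], luts, weights) -> List[float]:
    """Apply the fused LUT to every pixel intensity."""
    fused = blend_luts(luts, weights)
    return [apply_lut(fused, p) for p in pixels]


# Three toy basis LUTs with 5 entries each (assumed, for illustration only).
N = 5
identity = [i / (N - 1) for i in range(N)]
brighten = [x ** 0.5 for x in identity]
darken = [x ** 2 for x in identity]

# In a learned system these weights would come from a small network
# that looks at the input image; here they are fixed by hand.
weights = [0.5, 0.5, 0.0]
out = enhance([0.0, 0.25, 1.0], [identity, brighten, darken], weights)
print(out)
```

    Because the fused LUT is a plain weighted sum, the per-image cost is one blend plus one lookup per pixel, which is why a very small weight predictor can be sufficient in this kind of design.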

    A Large-scale Film Style Dataset for Learning Multi-frequency Driven Film Enhancement

    Film, a classic image style, is culturally significant to the whole photographic industry since it marks the birth of photography. However, film photography is time-consuming and expensive, necessitating a more efficient method for collecting film-style photographs. The numerous datasets that have emerged in the field of image enhancement so far are not film-specific. To facilitate film-based image stylization research, we construct FilmSet, a large-scale and high-quality film style dataset. Our dataset includes three different film types and more than 5000 in-the-wild high-resolution images. Inspired by the features of FilmSet images, we propose a novel framework called FilmNet, based on the Laplacian pyramid, for stylizing images across frequency bands and achieving film style outcomes. Experiments reveal that the performance of our model is superior to that of state-of-the-art techniques. Our dataset and code will be made publicly available.

    High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net

    Shadows often occur when we capture documents with casual equipment, which degrades the visual quality and readability of the digital copies. Unlike algorithms for natural shadow removal, algorithms for document shadow removal need to preserve the details of fonts and figures in high-resolution input. Previous works ignore this problem and remove the shadows via approximate attention and small datasets, which might not work in real-world situations. We handle high-resolution document shadow removal directly via a larger-scale real-world dataset and a carefully designed frequency-aware network. As for the dataset, we acquire over 7k pairs of high-resolution (2462 x 3699) real-world document images with various samples under different lighting circumstances, which is 10 times larger than existing datasets. As for the design of the network, we decouple the high-resolution images in the frequency domain, where the low-frequency details and high-frequency boundaries can be effectively learned via the carefully designed network structure. Powered by our network and dataset, the proposed method clearly shows better performance than previous methods in terms of visual quality and numerical results. The code, models, and dataset are available at: https://github.com/CXH-Research/DocShadow-SD7K Comment: Accepted by International Conference on Computer Vision 2023 (ICCV 2023)

    ShaDocFormer: A Shadow-attentive Threshold Detector with Cascaded Fusion Refiner for document shadow removal

    Document shadow is a common issue that arises when capturing documents using mobile devices, and it significantly impacts readability. Current methods encounter various challenges, including inaccurate detection of shadow masks and inaccurate estimation of illumination. In this paper, we propose ShaDocFormer, a Transformer-based architecture that integrates traditional methodologies and deep learning techniques to tackle the problem of document shadow removal. The ShaDocFormer architecture comprises two components: the Shadow-attentive Threshold Detector (STD) and the Cascaded Fusion Refiner (CFR). The STD module employs a traditional thresholding technique and leverages the attention mechanism of the Transformer to gather global information, thereby enabling precise detection of shadow masks. The cascaded and aggregative structure of the CFR module facilitates a coarse-to-fine restoration process for the entire image. As a result, ShaDocFormer excels in accurately detecting and capturing variations in both shadow and illumination, thereby enabling effective removal of shadows. Extensive experiments demonstrate that ShaDocFormer outperforms current state-of-the-art methods in both qualitative and quantitative measurements.

    UWFormer: Underwater Image Enhancement via a Semi-Supervised Multi-Scale Transformer

    Underwater images often exhibit poor quality, imbalanced coloration, and low contrast due to the complex interaction of light, water, and objects. Despite the significant contributions of previous underwater enhancement techniques, several problems demand further improvement: (i) Current deep learning methodologies depend on Convolutional Neural Networks (CNNs) that lack multi-scale enhancement and have limited global perception fields. (ii) The scarcity of paired real-world underwater datasets poses a considerable challenge, and the use of synthetic image pairs risks overfitting. To address these issues, this paper presents a Multi-scale Transformer-based Network called UWFormer for enhancing images at multiple frequencies via semi-supervised learning, in which we propose a Nonlinear Frequency-aware Attention mechanism and a Multi-Scale Fusion Feed-forward Network for low-frequency enhancement. Additionally, we introduce a specialized underwater semi-supervised training strategy, proposing a Subaqueous Perceptual Loss function to generate reliable pseudo labels. Experiments using full-reference and non-reference underwater benchmarks demonstrate that our method outperforms state-of-the-art methods in terms of both quantitative metrics and visual quality.

    DocDeshadower: Frequency-aware Transformer for Document Shadow Removal

    The presence of shadows significantly impacts the visual quality of scanned documents. However, the existing traditional techniques and deep learning methods used for shadow removal have several limitations. These methods either rely heavily on heuristics, resulting in suboptimal performance, or require large datasets to learn shadow-related features. In this study, we propose DocDeshadower, a multi-frequency Transformer-based model built on the Laplacian pyramid. DocDeshadower is designed to remove shadows at different frequencies in a coarse-to-fine manner. To achieve this, we decompose the shadow image into different frequency bands using the Laplacian pyramid. In addition, we introduce two novel components to this model: the Attention-Aggregation Network and the Gated Multi-scale Fusion Transformer. The Attention-Aggregation Network is designed to remove shadows in the low-frequency part of the image, whereas the Gated Multi-scale Fusion Transformer refines the entire image at a global scale with its large perceptive field. Our extensive experiments demonstrate that DocDeshadower outperforms the current state-of-the-art methods in both qualitative and quantitative terms.
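
    The Laplacian pyramid decomposition mentioned above can be sketched on a 1-D signal. This is an illustrative simplification under stated assumptions: pair averaging and sample repetition stand in for the Gaussian filtering normally used on images, the signal length must be divisible by 2 at every level, and the function names are hypothetical rather than taken from the paper.

```python
def downsample(signal):
    """Halve the length by averaging neighbouring pairs (a crude low-pass)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal) - 1, 2)]


def upsample(signal):
    """Double the length by repeating every sample once."""
    return [v for v in signal for _ in range(2)]


def build_pyramid(signal, levels):
    """Return [high-frequency band 0, band 1, ..., low-frequency residual]."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        low = downsample(current)
        # The band keeps what the coarser level cannot represent.
        bands.append([c - u for c, u in zip(current, upsample(low))])
        current = low
    bands.append(current)
    return bands


def reconstruct(bands):
    """Coarse-to-fine: start from the residual and add back each band."""
    current = bands[-1]
    for band in reversed(bands[:-1]):
        current = [u + b for u, b in zip(upsample(current), band)]
    return current


signal = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 0.0, 4.0]
bands = build_pyramid(signal, levels=2)
restored = reconstruct(bands)
print(bands[-1], restored)
```

    In a coarse-to-fine design, a model would first correct the short low-frequency residual, where broad illumination changes such as shadows mostly live, and then refine the result as the high-frequency bands are added back.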

    Hardware inspired neural network for efficient time-resolved biomedical imaging

    Convolutional neural networks (CNNs) have shown exceptional performance for fluorescence lifetime imaging (FLIM). However, redundant parameters and complicated topologies make it challenging to implement such networks on embedded hardware to achieve real-time processing. We report a lightweight, quantized neural architecture that can offer fast FLIM imaging. The forward-propagation is significantly simplified by replacing matrix multiplications in each convolution layer with additions and by quantizing data to a low bit-width. We first used synthetic 3-D lifetime data with given lifetime ranges and photon counts to ensure that correct average lifetimes can be obtained. Afterwards, human prostatic cancer cells incubated with gold nanoprobes were utilized to validate the feasibility of the network for real-world data. The quantized network yielded a 37.8% compression ratio without performance degradation. Clinical relevance - This neural network can be applied to diagnose cancer early based on fluorescence lifetime in a non-invasive way. This approach brings high accuracy and accelerates diagnostic processes for clinicians who are not experts in biomedical signal processing.

    Fast analysis of time‐domain fluorescence lifetime imaging via extreme learning machine

    We present a fast and accurate analytical method for fluorescence lifetime imaging microscopy (FLIM), using the extreme learning machine (ELM). We used extensive metrics to evaluate ELM and existing algorithms. First, we compared these algorithms using synthetic datasets. The results indicate that ELM can obtain higher fidelity, even in low-photon conditions. Afterwards, we used ELM to retrieve lifetime components from human prostate cancer cells loaded with gold nanosensors, showing that ELM also outperforms the iterative fitting and non-fitting algorithms. Compared with a computationally efficient neural network, ELM achieves comparable accuracy with less training and inference time. As there is no back-propagation process for ELM during the training phase, the training speed is much higher than that of existing neural network approaches. The proposed strategy is promising for edge computing with online training.
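
    The reason an ELM trains without back-propagation can be seen in a small sketch: the hidden-layer weights are drawn at random and frozen, so only the linear output weights are fitted, here by ridge-regularized least squares. This is a toy under stated assumptions (scalar input, tanh hidden units, a small ridge term, a hand-written linear solver, and a synthetic target y = x^2); it is not the FLIM pipeline from the paper, and all names are hypothetical.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


class ELM:
    def __init__(self, n_hidden, seed=0, ridge=1e-6):
        rng = random.Random(seed)
        # Random hidden weights and biases: drawn once, never trained.
        self.w = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
        self.b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        self.ridge = ridge
        self.beta = None

    def _hidden(self, x):
        # Hidden activations plus a constant 1.0 acting as an output bias.
        return [math.tanh(w * x + b) for w, b in zip(self.w, self.b)] + [1.0]

    def fit(self, xs, ys):
        h = [self._hidden(x) for x in xs]
        k = len(h[0])
        # Normal equations: (H^T H + ridge * I) beta = H^T y
        gram = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                 for j in range(k)] for i in range(k)]
        rhs = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(k)]
        self.beta = solve_linear(gram, rhs)

    def predict(self, x):
        return sum(bi * hi for bi, hi in zip(self.beta, self._hidden(x)))


xs = [-1.0 + 2.0 * i / 49 for i in range(50)]
ys = [x * x for x in xs]
model = ELM(n_hidden=20)
model.fit(xs, ys)
mse = sum((model.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(mse)
```

    Training reduces to one linear solve, which is the property that makes this family of models attractive for fast or on-device fitting.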

    The impact of metabolic overweight/obesity phenotypes on unplanned readmission risk in patients with COPD: a retrospective cohort study

    Background: There is an inconsistent association between overweight/obesity and chronic obstructive pulmonary disease (COPD). Considering that different metabolic characteristics exist among individuals in the same body mass index (BMI) category, the classification of overweight/obesity based on metabolic status may facilitate the risk assessment of COPD. Our study aimed to explore the relationship between metabolic overweight/obesity phenotypes and unplanned readmission in patients with COPD. Methods: We conducted a retrospective cohort study using the Nationwide Readmissions Database (NRD). According to metabolic overweight/obesity phenotypes, patients were classified into four groups: metabolically healthy non-overweight/obesity (MHNO), metabolically unhealthy non-overweight/obesity (MUNO), metabolically healthy with overweight/obesity (MHO), and metabolically unhealthy with overweight/obesity (MUO). The primary outcome was unplanned readmission to hospital within 30 days of discharge from index hospitalization. Secondary outcomes included in-hospital mortality, length of stay (LOS), and total charges of readmission within 30 days. Results: Among 1,445,890 patients admitted with COPD, 167,156 individuals had an unplanned readmission within 30 days. Patients with the phenotype MUNO [hazard ratio (HR), 1.049; 95% CI, 1.038–1.061; p < 0.001] and MUO (HR, 1.061; 95% CI, 1.045–1.077; p < 0.001) had a higher readmission risk compared with patients with MHNO. In elders (≥65 yr), however, MHO also had a higher readmission risk (HR, 1.032; 95% CI, 1.002–1.063; p = 0.039). In addition, the readmission risk of COPD patients with hyperglycemia or hypertension increased regardless of overweight/obesity (p < 0.001). Conclusion: In patients with COPD, overweight/obesity alone had little effect on unplanned readmission, whereas metabolic abnormalities regardless of overweight/obesity were associated with an increased risk of unplanned readmission. Among the metabolic abnormalities, particular attention should be paid to hyperglycemia and hypertension. In elders (≥65 yr), however, overweight/obesity and metabolic abnormalities independently exacerbated the adverse outcomes.

    CEEMDAN-IPSO-LSTM: A Novel Model for Short-Term Passenger Flow Prediction in Urban Rail Transit Systems

    Urban rail transit (URT) is a key mode of public transport that serves the greatest user demand. Short-term passenger flow prediction aims to improve management effectiveness and avoid wasting public transport resources. When anticipating passenger flow for URT, it is difficult to handle the nonlinearity, correlation, and periodicity of the data series in a single model. To predict the short-period passenger flow of URT more accurately, this paper offers a combined short-term passenger flow prediction model based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and a long short-term memory neural network (LSTM). The hyperparameters of the LSTM were determined using improved particle swarm optimization (IPSO). First, the CEEMDAN-IPSO-LSTM model performed the CEEMDAN decomposition of passenger flow data and obtained uncoupled intrinsic mode functions and a residual sequence after removing noisy data. Second, we built a CEEMDAN-IPSO-LSTM passenger flow prediction model for each decomposed component and extracted prediction values. Third, the experimental results showed that, compared with the single LSTM model, the CEEMDAN-IPSO-LSTM model reduced SD, RMSE, MAE, and MAPE by 40 persons/35 persons, 44 persons/35 persons, 37 persons/31 persons, and 46.89%/35.1%, and increased R and R² by 2.32%/3.63% and 2.19%/1.67%, respectively. This model can reduce the risks to public health security due to excessive crowding of passengers (especially in the period of COVID-19), reduce the negative impact on the environment through the optimization of traffic flows, and help develop low-carbon transportation.
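
    For readers unfamiliar with the error measures quoted above, the following sketch computes RMSE, MAE, MAPE, and R² using their common textbook definitions. The exact definitions used in the paper are not given in the abstract, so these formulas (in particular R² as 1 minus the ratio of residual to total sum of squares) are assumptions, and the passenger counts are made-up toy values.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the data (persons)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the same unit as the data (persons)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy passenger counts per time slot (illustrative values only).
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(rmse(actual, predicted), mae(actual, predicted))
print(mape(actual, predicted), r_squared(actual, predicted))
```

    RMSE and MAE are reported in persons, which matches the units in the abstract, while MAPE and R² are unit-free and therefore easier to compare across stations with different traffic volumes.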