247 research outputs found

    Revisiting Single Image Reflection Removal In the Wild

    This research addresses single-image reflection removal (SIRR) in real-world conditions from two angles: the collection pipeline for real reflection pairs and the perception of real reflection locations. We devise an advanced reflection collection pipeline that adapts to a wide range of real-world reflection scenarios and reduces the cost of collecting large-scale aligned reflection pairs. In the process, we develop a large-scale, high-quality reflection dataset named Reflection Removal in the Wild (RRW). RRW contains over 14,950 high-resolution real-world reflection pairs, making it forty-five times larger than its predecessors. Regarding the perception of reflection locations, we observe that numerous virtual reflection objects visible in reflection images are not present in the corresponding ground-truth images. This observation, drawn from the aligned pairs, leads us to the Maximum Reflection Filter (MaxRF), which can accurately and explicitly characterize reflection locations from pairs of images. Building on this, we design a reflection location-aware cascaded framework tailored to SIRR. With these techniques, our solution outperforms current leading methods across multiple real-world benchmarks. Code and datasets will be publicly available.

    Contrastive Masked Autoencoders are Stronger Vision Learners

    Masked image modeling (MIM) has achieved promising results on various vision tasks. However, the limited discriminability of the learned representations shows that there is still considerable room for building a stronger vision learner. Towards this goal, we propose Contrastive Masked Autoencoders (CMAE), a new self-supervised pre-training method for learning more comprehensive and capable vision representations. By carefully unifying contrastive learning (CL) and masked image modeling (MIM) through novel designs, CMAE leverages their respective advantages and learns representations with both strong instance discriminability and local perceptibility. Specifically, CMAE consists of two branches: the online branch is an asymmetric encoder-decoder, and the momentum branch is a momentum-updated encoder. During training, the online encoder reconstructs original images from latent representations of masked images to learn holistic features. The momentum encoder, fed with the full images, enhances feature discriminability via contrastive learning with its online counterpart. To make CL compatible with MIM, CMAE introduces two new components: pixel shifting for generating plausible positive views, and a feature decoder for complementing the features of contrastive pairs. Thanks to these designs, CMAE improves representation quality and transfer performance over its MIM counterpart. CMAE achieves state-of-the-art performance on highly competitive benchmarks for image classification, semantic segmentation and object detection. Notably, CMAE-Base achieves 85.3% top-1 accuracy on ImageNet and 52.5% mIoU on ADE20k, surpassing the previous best results by 0.7% and 1.8% respectively. The source code is publicly accessible at https://github.com/ZhichengHuang/CMAE. Comment: Accepted by TPAM
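
A "momentum-updated encoder" usually means that the momentum branch tracks the online encoder as an exponential moving average rather than being trained by gradients. The following is a minimal sketch of that update, not the CMAE implementation: plain Python lists stand in for parameter tensors, and the function name `momentum_update` and the coefficient value are assumptions, since the abstract states neither.

```python
def momentum_update(momentum_params, online_params, m=0.99):
    """Return EMA-updated parameters: m * momentum + (1 - m) * online.

    `m` is a hypothetical momentum coefficient; the abstract gives no value.
    """
    if len(momentum_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    return [m * pm + (1.0 - m) * po
            for pm, po in zip(momentum_params, online_params)]


# Toy example: the momentum branch moves a small step towards the online branch.
online = [1.0, 2.0]
target = [0.0, 0.0]
target = momentum_update(target, online, m=0.9)
```

With `m` close to 1 the momentum encoder changes slowly, which is the usual reason for using it as a stable target in contrastive learning.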

    Quality-Optimized and Secure End-to-End Authentication for Media Delivery


    Application of the Kaiser score by MRI in patients with breast lesions by ultrasound and mammography

    PURPOSE: This study aimed to verify whether the Kaiser score can improve diagnostic performance in breast magnetic resonance imaging (MRI) for suspicious lesions and avoid further invasive diagnostic approaches. METHODS: This retrospective study enrolled 97 patients who underwent breast MRI before breast biopsy or surgery. Two radiologists independently evaluated all MRI images using the Kaiser score. Neither radiologist knew the final histopathological diagnosis. The diagnostic ability of the Kaiser score was assessed by receiver operating characteristic (ROC) analysis and measured by the area under the ROC curve (AUC). The Youden index was used to define the optimal cutoff value. Kaiser score categories were dichotomized into positive (score > 4) and negative (score ≤ 4). Cohen's kappa coefficient was used to analyze inter-rater agreement. RESULTS: Histopathology revealed 56 malignant and 41 benign lesions. The AUC for all lesions evaluated by the Kaiser score was 0.992 (95% CI: 0.981-1.0) and 0.958 (95% CI: 0.920-0.996) for the 2 radiologists, respectively. Inter-rater agreement of the dichotomized Kaiser score was excellent (κ=0.894, P < .001). A total of 20 lesions (33.8%) previously categorized as BI-RADS 4 were reduced to BI-RADS 2/3 (19 benign lesions and 1 malignant lesion). CONCLUSION: The Kaiser score is a valuable auxiliary diagnostic tool for improving the diagnostic ability of radiologists with varying experience in breast MRI. In some cases, applying the Kaiser score could avoid unnecessary breast biopsies.
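
The Youden index used above to choose the cutoff is J = sensitivity + specificity - 1, maximized over candidate thresholds. The following is a minimal sketch of that selection, assuming a case counts as positive when its score is strictly greater than the cutoff (matching the "> 4" rule in the abstract). The function name and the scores and labels are hypothetical illustration data, not the study's data.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when score > cutoff.
    labels: 1 = malignant, 0 = benign.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # keep the lowest cutoff among ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical scores and histopathology labels, for illustration only.
scores = [2, 3, 4, 4, 5, 6, 8, 9, 11, 3]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
cutoff, j = youden_cutoff(scores, labels)
```

On this toy data the two classes separate perfectly, so the search returns a cutoff of 4 with J = 1.0. Real data would give J below 1.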

    The Impact of Whole Brain Global Functional Connectivity Density Following MECT in Major Depression: A Follow-Up Study

    This study explored the alteration of global functional connectivity density (gFCD) in depressive patients after modified electroconvulsive therapy (MECT) and analyzed the relationship between gFCD and clinical outcome. Thirty-seven subjects were evaluated based on the diagnostic criteria of the International Classification of Diseases-10 (ICD-10), consisting of a depressive group (24 patients after follow-ups) and a healthy control group of 13 individuals. All participants were assessed with the Hamilton Depression Scale (HAMD) and underwent resting-state functional magnetic resonance imaging scans. Before MECT, gFCD was significantly higher in the posterior-middle insula, the supra-marginal gyrus and the dorsal medial prefrontal cortex (dmPFC) in patients than in healthy controls. After MECT, gFCD increased significantly in the perigenual anterior cingulate cortex (pgACC), the bilateral orbitofrontal cortex and the left supra-marginal gyrus, and decreased notably in the posterior insula. In the depressive group, gFCD in the pgACC and the right orbitofrontal cortex before MECT correlated positively with HAMD scores with treatment. Given these changes in gFCD in depressive patients after MECT, the aforementioned brain regions may serve as indicators of the effect of MECT.

    Second primary malignancies in cervical cancer and endometrial cancer survivors: a population-based analysis

    Background: We evaluated the relative attribution and interactions of treatment and patient-related risk factors for second primary malignancies (SPMs) in cervical and endometrial cancer survivors. Methods: Data on stage I–III cervical and endometrial cancer survivors from the Surveillance, Epidemiology, and End Results (SEER) registry between January 1988 and December 2015 were analyzed. The standardized incidence ratio (SIR), excess absolute risk (EAR), and corresponding 95% confidence interval (95% CI) values were calculated. Analyses were stratified by proxies of human papillomavirus (HPV), smoking, hormone, and radiotherapy (RT) status. Additive and multiplicative interactions were assessed. Results: Cervical cancer survivors had a higher risk of developing potentially HPV-related and smoking-related SPMs, especially in the RT group (SIR_HPV = 3.7, 95% CI: 2.9–4.6; SIR_smoking = 3.2, 95% CI: 2.8–3.6). Second vaginal cancer had the highest SIR (23.8, 95% CI: 14.9–36.0). There were strong synergistic interactions between RT and the proxy of smoking (P_interaction < 0.001), accounting for 36% of potentially smoking-related SPMs in cervical cancer survivors. Conclusions: RT, HPV, and smoking promote SPMs in cervical cancer to different extents. The SPM burden in cervical cancer survivors could be mostly attributed to smoking and RT and their interactions.
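
The SIR is the ratio of observed to expected cases, and the EAR is the excess number of cases per unit of person-time at risk. The following is a minimal sketch of both quantities. It uses Byar's approximation to the Poisson confidence interval, which is an assumption here: the abstract does not say how the authors computed their intervals. The function names and all counts are hypothetical.

```python
import math


def sir_with_ci(observed, expected, z=1.96):
    """Return (SIR, lower, upper) using Byar's approximation to the Poisson CI."""
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must be positive")
    lower = observed * (1 - 1 / (9 * observed)
                        - z / (3 * math.sqrt(observed))) ** 3
    upper = (observed + 1) * (1 - 1 / (9 * (observed + 1))
                              + z / (3 * math.sqrt(observed + 1))) ** 3
    return observed / expected, lower / expected, upper / expected


def excess_absolute_risk(observed, expected, person_years, per=10_000):
    """Return excess cases per `per` person-years at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return (observed - expected) / person_years * per


# Hypothetical counts, for illustration only (not SEER data).
sir, lo, hi = sir_with_ci(observed=30, expected=10.0)
ear = excess_absolute_risk(observed=30, expected=10.0, person_years=50_000)
```

An SIR whose interval excludes 1, as in this toy case, indicates that the observed number of second cancers differs from what general-population rates would predict.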

    Revealing influencing factors on global waste distribution via deep-learning based dumpsite detection from satellite imagery

    With the advancement of global civilisation, monitoring and managing dumpsites have become essential parts of environmental governance in various countries. Dumpsite locations are difficult for local government agencies and environmental groups to obtain in a timely manner. According to the World Bank, governments must spend substantial labour and money to locate illegal dumpsites before they can manage them. Here we show that applying novel deep convolutional networks to high-resolution satellite images can provide an effective, efficient, and low-cost method to detect dumpsites. In sampled areas of 28 cities around the world, our model detects nearly 1000 dumpsites that appeared around 2021. This approach reduces the investigation time by more than 96.8% compared with the manual method. With this methodology, it is now possible to analyse the relationship between dumpsites and various social attributes on a global scale, temporally and spatially.

    Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs

    Large Language Models (LLMs) are now trained with extended context lengths to foster more creative applications. However, long-context training poses great challenges given the constraint of GPU memory. It not only leads to substantial activation memory consumption during training, but also incurs considerable memory fragmentation. To facilitate long-context training, existing frameworks have adopted strategies such as recomputation and various forms of parallelism. Nevertheless, these techniques rely on redundant computation or extensive communication, resulting in low Model FLOPS Utilization (MFU). In this paper, we propose MEMO, a novel LLM training framework designed for fine-grained activation memory management. Given the quadratic scaling of computation and linear scaling of memory with sequence length when using FlashAttention, we offload memory-consuming activations to CPU memory after each layer's forward pass and fetch them during the backward pass. To maximize the swapping of activations without hindering computation, and to avoid exhausting limited CPU memory, we implement a token-wise activation recomputation and swapping mechanism. Furthermore, we tackle the memory fragmentation issue by employing a bi-level Mixed Integer Programming (MIP) approach, optimizing the reuse of memory across transformer layers. Empirical results demonstrate that MEMO achieves on average 2.42x and 2.26x the MFU of Megatron-LM and DeepSpeed, respectively. This improvement is attributed to MEMO's ability to minimize memory fragmentation, reduce recomputation and intensive communication, and avoid the delays caused by memory reorganization due to fragmentation. By leveraging fine-grained activation memory management, MEMO facilitates efficient training of a 7B LLM with a sequence length of 1 million on just 8 A800 GPUs, achieving an MFU of 52.30%.
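
MFU, the metric reported above, is the ratio of the model FLOPs actually delivered per second to the aggregate peak FLOPs of the hardware. The following is a minimal sketch of that ratio only. The function name, the per-step FLOP count, the step time and the per-GPU peak are all hypothetical placeholders, not measurements from the paper.

```python
def model_flops_utilization(model_flops_per_step, step_time_s,
                            num_gpus, peak_flops_per_gpu):
    """Return MFU = (model FLOPs per second) / (total peak FLOPs per second)."""
    if step_time_s <= 0 or num_gpus <= 0 or peak_flops_per_gpu <= 0:
        raise ValueError("step time, GPU count and peak FLOPs must be positive")
    achieved_flops_per_s = model_flops_per_step / step_time_s
    return achieved_flops_per_s / (num_gpus * peak_flops_per_gpu)


# Hypothetical numbers chosen only to illustrate the arithmetic.
mfu = model_flops_utilization(
    model_flops_per_step=1.0e18,   # assumed FLOPs needed by one training step
    step_time_s=800.0,             # assumed wall-clock time of that step
    num_gpus=8,
    peak_flops_per_gpu=3.12e14,    # assumed per-GPU peak; check the vendor spec
)
```

Because only the FLOPs the model needs are counted, extra work such as recomputation lengthens the step time without raising the numerator, which lowers MFU.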