Revisiting Single Image Reflection Removal In the Wild
This research focuses on the issue of single-image reflection removal (SIRR)
in real-world conditions, examining it from two angles: the collection pipeline
of real reflection pairs and the perception of real reflection locations. We
devise an advanced reflection collection pipeline that is highly adaptable to a
wide range of real-world reflection scenarios and incurs reduced costs in
collecting large-scale aligned reflection pairs. In the process, we develop a
large-scale, high-quality reflection dataset named Reflection Removal in the
Wild (RRW). RRW contains over 14,950 high-resolution real-world reflection
pairs, a dataset forty-five times larger than its predecessors. Regarding
perception of reflection locations, we identify that numerous virtual
reflection objects visible in reflection images are not present in the
corresponding ground-truth images. This observation, drawn from the aligned
pairs, leads us to conceive the Maximum Reflection Filter (MaxRF). The MaxRF
can accurately and explicitly characterize reflection locations from pairs of
images. Building upon this, we design a reflection location-aware cascaded
framework, specifically tailored for SIRR. Powered by these innovative
techniques, our solution outperforms current leading
methods across multiple real-world benchmarks. Code and datasets will be
publicly available
Contrastive Masked Autoencoders are Stronger Vision Learners
Masked image modeling (MIM) has achieved promising results on various vision
tasks. However, the limited discriminability of the learned representations
shows that there is still considerable room for building a stronger vision learner.
Towards this goal, we propose Contrastive Masked Autoencoders (CMAE), a new
self-supervised pre-training method for learning more comprehensive and capable
vision representations. By carefully unifying contrastive learning (CL) and
masked image modeling (MIM) through novel designs, CMAE leverages their respective
advantages and learns representations with both strong instance
discriminability and local perceptibility. Specifically, CMAE consists of two
branches where the online branch is an asymmetric encoder-decoder and the
momentum branch is a momentum updated encoder. During training, the online
encoder reconstructs original images from latent representations of masked
images to learn holistic features. The momentum encoder, fed with the full
images, enhances the feature discriminability via contrastive learning with its
online counterpart. To make CL compatible with MIM, CMAE introduces two new
components, i.e., pixel shifting for generating plausible positive views and a
feature decoder for complementing features of contrastive pairs. Thanks to
these novel designs, CMAE effectively improves the representation quality and
transfer performance over its MIM counterpart. CMAE achieves
state-of-the-art performance on highly competitive benchmarks of image
classification, semantic segmentation and object detection. Notably, CMAE-Base
achieves top-1 accuracy on ImageNet and mIoU on ADE20k,
surpassing previous best results by and respectively. The
source code is publicly accessible at
https://github.com/ZhichengHuang/CMAE. Comment: Accepted by TPAM
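The "momentum updated encoder" in the CMAE abstract above is conventionally implemented as an exponential moving average (EMA) of the online encoder's weights. The following is a minimal sketch under that assumption; the function name, the plain-list parameter representation, and the decay value are illustrative and are not taken from the paper or its repository.

```python
def momentum_update(momentum_params, online_params, m=0.99):
    """Return EMA-updated parameters: p_m <- m * p_m + (1 - m) * p_o.

    `m` is a hypothetical decay coefficient chosen for illustration.
    """
    if len(momentum_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    return [m * pm + (1.0 - m) * po
            for pm, po in zip(momentum_params, online_params)]


# Toy example with two scalar "weights" per encoder.
updated = momentum_update([1.0, 2.0], [0.0, 4.0], m=0.9)
```

With a decay close to 1 the momentum encoder changes slowly, which is the usual reason for using it as a stable target in contrastive learning.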
Application of the Kaiser score by MRI in patients with breast lesions by ultrasound and mammography
Purpose: This study aimed to verify whether the Kaiser score can improve the diagnostic performance of breast magnetic resonance imaging (MRI) for suspicious lesions and avoid further invasive diagnostic approaches.
Methods: This retrospective study enrolled 97 patients who underwent breast MRI before breast biopsy or surgery. Two radiologists independently evaluated all MRI images using the Kaiser score; neither knew the final histopathological diagnosis. The diagnostic ability of the Kaiser score was assessed by receiver operating characteristic (ROC) analysis and measured by the area under the ROC curve (AUC). The Youden index was used to define the optimal cutoff value. Kaiser scores were dichotomized into positive (score > 4) and negative (score ≤ 4). Cohen’s kappa coefficient was used to analyze inter-rater agreement.
Results: Histopathology revealed 56 malignant and 41 benign lesions. The AUC for all lesions evaluated by the Kaiser score was 0.992 (95% CI: 0.981-1.0) and 0.958 (95% CI: 0.920-0.996) for the 2 radiologists, respectively. Inter-rater agreement of the dichotomized Kaiser score was excellent (κ=0.894, P < .001). A total of 20 lesions (33.8%) previously categorized as BI-RADS 4 were reduced to BI-RADS 2/3 (19 benign lesions and 1 malignant lesion).
Conclusion: The Kaiser score is a valuable auxiliary diagnostic tool for improving the diagnostic ability of radiologists with varying levels of experience in breast MRI. In some cases, applying the Kaiser score could avoid unnecessary breast biopsies
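The Youden index and Cohen's kappa named in the methods above can be illustrated with a short sketch. All scores and labels below are invented toy data, not values from the study, and the function names are hypothetical. A lesion is called positive when its score exceeds the cutoff, matching the "score > 4" convention in the abstract.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when score > cutoff; labels are 1 (malignant)
    or 0 (benign).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s > c and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s <= c and y == 0)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


def cohen_kappa(a, b):
    """Cohen's kappa for two raters' categorical calls on the same cases."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in set(a) | set(b))
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): scores from one reader and histopathology labels.
toy_scores = [2, 3, 4, 4, 5, 6, 8, 9, 11]
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(toy_scores, toy_labels)

# Toy dichotomized calls (1 = positive) from two hypothetical readers.
reader_1 = [1, 1, 1, 0, 0, 0, 1, 0]
reader_2 = [1, 1, 0, 0, 0, 0, 1, 0]
kappa = cohen_kappa(reader_1, reader_2)
```

The kappa function divides by zero when chance agreement is exactly 1 (both raters always give the same single category); a production version would need to handle that case.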
The Impact of Whole Brain Global Functional Connectivity Density Following MECT in Major Depression: A Follow-Up Study
This study explored the alteration of global functional connectivity density (gFCD) in depressive patients after modified electroconvulsive therapy (MECT) and analyzed the relationship between gFCD and clinical outcome. Thirty-seven subjects were evaluated based on the diagnostic criteria of the International Classification of Diseases-10 (ICD-10), comprising a depressive group (24 patients after follow-up) and a healthy control group of 13 individuals. All participants were assessed with the Hamilton Depression Scale (HAMD) and underwent resting-state functional magnetic resonance imaging scans. Before MECT, gFCD was significantly increased in the posterior-middle insula, the supramarginal gyrus and the dorsal medial prefrontal cortex (dmPFC) compared with healthy controls. After MECT, gFCD increased significantly in the perigenual anterior cingulate cortex (pgACC), the bilateral orbitofrontal cortex and the left supramarginal gyrus, and decreased notably in the posterior insula. In the depressive group, gFCD in the pgACC and the right orbitofrontal cortex before MECT correlated positively with HAMD scores with treatment. Given these gFCD changes in depressive patients after MECT, the aforementioned brain regions may serve as indicators of MECT effect
Second primary malignancies in cervical cancer and endometrial cancer survivors: a population-based analysis
Background: We evaluated the relative attribution and interactions of treatment and patient-related risk factors for second primary malignancies (SPMs) in cervical and endometrial cancer survivors.
Methods: Stage I–III cervical and endometrial cancer survivors’ data from the Surveillance, Epidemiology, and End Results (SEER) registry between January 1988 and December 2015 were analyzed. The standardized incidence ratio (SIR), excess absolute risk (EAR), and corresponding 95% confidence interval (95% CI) values were calculated. Analyses were classified based on proxies of human papillomavirus (HPV), smoking, hormone, and radiotherapy (RT) status. Additive and multiplicative interactions were assessed.
Results: Cervical cancer survivors had a higher risk for developing potentially HPV and smoking-related SPMs, especially in the RT group (SIR_HPV = 3.7, 95% CI: 2.9–4.6; SIR_smoking = 3.2, 95% CI: 2.8–3.6). Second vaginal cancer patients had the highest SIR (23.8, 95% CI: 14.9–36.0). There were strong synergistic interactions between RT and the proxy of smoking (P_interaction < 0.001), accounting for 36% of potentially smoking-related SPMs in cervical cancer survivors.
Conclusions: RT, HPV, and smoking promote SPMs in cervical cancer to different extents. The SPM burden in cervical cancer survivors could be mostly attributed to smoking and RT and their interactions
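The SIR and EAR measures used in the methods above follow standard definitions: SIR is observed cases divided by expected cases, and EAR is the excess of observed over expected cases per unit of person-time. The sketch below assumes EAR is expressed per 10,000 person-years and uses Byar's approximation for the 95% CI; the abstract does not state which interval method or scaling the authors used, and the counts are invented for illustration.

```python
import math


def sir_with_ci(observed, expected, z=1.96):
    """Return (SIR, lower, upper) using Byar's approximation to the Poisson CI."""
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must be positive")
    sir = observed / expected
    lower = observed * (1 - 1 / (9 * observed)
                        - z / (3 * math.sqrt(observed))) ** 3 / expected
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3 / expected
    return sir, lower, upper


def ear_per_10000(observed, expected, person_years):
    """Excess absolute risk per 10,000 person-years (assumed scaling)."""
    return (observed - expected) / person_years * 10000


# Hypothetical counts: 30 observed SPMs, 10 expected, 50,000 person-years.
sir, sir_lower, sir_upper = sir_with_ci(30, 10)
ear = ear_per_10000(30, 10, 50000)
```

An SIR above 1 with a lower confidence bound above 1 indicates more second cancers than expected in the general population, which is how the reported values of 3.7 and 3.2 should be read.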
Revealing influencing factors on global waste distribution via deep-learning based dumpsite detection from satellite imagery
With the advancement of global civilisation, monitoring and managing dumpsites have become essential parts of environmental governance in various countries. Dumpsite locations are difficult for local government agencies and environmental groups to obtain in a timely manner. According to the World Bank, governments must spend substantial labour and money to locate illegal dumpsites before they can manage them. Here we show that applying novel deep convolutional networks to high-resolution satellite images can provide an effective, efficient, and low-cost method to detect dumpsites. In sampled areas of 28 cities around the world, our model detects nearly 1000 dumpsites that appeared around 2021. This approach reduces the investigation time by more than 96.8% compared with the manual method. With this novel and powerful methodology, it is now possible to analyse the relationship between dumpsites and various social attributes on a global scale, temporally and spatially.
Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs
Nowadays, Large Language Models (LLMs) have been trained using extended
context lengths to foster more creative applications. However, long context
training poses great challenges considering the constraint of GPU memory. It
not only leads to substantial activation memory consumption during training,
but also incurs considerable memory fragmentation. To facilitate long context
training, existing frameworks have adopted strategies such as recomputation and
various forms of parallelism. Nevertheless, these techniques rely on redundant
computation or extensive communication, resulting in low Model FLOPS
Utilization (MFU). In this paper, we propose MEMO, a novel LLM training
framework designed for fine-grained activation memory management. Given the
quadratic scaling of computation and linear scaling of memory with sequence
lengths when using FlashAttention, we offload memory-consuming activations to
CPU memory after each layer's forward pass and fetch them during the backward
pass. To maximize the swapping of activations without hindering computation,
and to avoid exhausting limited CPU memory, we implement a token-wise
activation recomputation and swapping mechanism. Furthermore, we tackle the
memory fragmentation issue by employing a bi-level Mixed Integer Programming
(MIP) approach, optimizing the reuse of memory across transformer layers.
Empirical results demonstrate that MEMO achieves an average of 2.42x and 2.26x
MFU compared to Megatron-LM and DeepSpeed, respectively. This improvement is
attributed to MEMO's ability to minimize memory fragmentation, reduce
recomputation and intensive communication, and circumvent the delays associated
with the memory reorganization process due to fragmentation. By leveraging
fine-grained activation memory management, MEMO facilitates efficient training
of a 7B LLM with 1 million sequence length on just 8 A800 GPUs, achieving an MFU
of 52.30%
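Model FLOPS Utilization, the metric reported above, is the ratio of the model FLOPs actually executed per second to the theoretical peak of the hardware. The sketch below shows only that ratio; the per-step FLOP count, step time, and the 312 TFLOPS per-GPU peak are assumptions for illustration and are not taken from the paper.

```python
def model_flops_utilization(model_flops_per_step, step_seconds,
                            num_gpus, peak_flops_per_gpu):
    """MFU = achieved model FLOP/s divided by aggregate peak FLOP/s."""
    if step_seconds <= 0 or num_gpus <= 0 or peak_flops_per_gpu <= 0:
        raise ValueError("step time, GPU count and peak FLOPS must be positive")
    achieved = model_flops_per_step / step_seconds
    return achieved / (num_gpus * peak_flops_per_gpu)


# Hypothetical run: 2.496e16 model FLOPs per step, 20 s per step, 8 GPUs,
# and an assumed peak of 312e12 FLOP/s per GPU.
mfu = model_flops_utilization(2.496e16, 20.0, 8, 312e12)
```

Because recomputed activations do not count toward model FLOPs, reducing recomputation raises MFU even when the hardware is equally busy, which is consistent with the explanation given in the abstract.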
