151 research outputs found
Generative-Discriminative Complementary Learning
The majority of state-of-the-art deep learning methods are discriminative
approaches, which model the conditional distribution of labels given input
features. The success of such approaches depends heavily on high-quality
labeled instances, which are not easy to obtain, especially as the number of
candidate classes increases. In this paper, we study the complementary learning
problem. Unlike ordinary labels, complementary labels are easy to obtain
because an annotator only needs to provide a yes/no answer to a randomly chosen
candidate class for each instance. We propose a generative-discriminative
complementary learning method that estimates the ordinary labels by modeling
both the conditional (discriminative) and instance (generative) distributions.
Our method, which we call Complementary Conditional GAN (CCGAN), improves the
accuracy of predicting ordinary labels and can generate high-quality instances
in spite of weak supervision. In addition to the extensive empirical studies,
we also theoretically show that our model can recover the true conditional
distribution from the complementarily labeled data.
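The key property behind complementary learning can be sketched numerically. Under the common uniform assumption (any of the other K-1 classes is equally likely to be named as the complementary label), the complementary-label distribution is a fixed linear transform of the ordinary one, so a classifier can be fit from "not class c" answers alone. The code below is an illustrative sketch of that relation, not the CCGAN method itself; the function names and toy numbers are hypothetical.

```python
import numpy as np

# Illustrative sketch (not the paper's CCGAN): under a uniform assumption,
# the complementary-label distribution relates to the ordinary one by
#   P(ybar = c | x) = (1 / (K - 1)) * sum_{c' != c} P(y = c' | x)
#                   = (1 - P(y = c | x)) / (K - 1).
# Minimizing the negative log-likelihood of the observed complementary label
# under this transform trains the ordinary classifier indirectly.

def complementary_loss(probs, comp_label):
    """NLL of the *complementary* label; `probs` are ordinary P(y | x)."""
    k = probs.shape[-1]
    comp_probs = (1.0 - probs) / (k - 1)   # transformed P(ybar = c | x)
    return -np.log(comp_probs[comp_label] + 1e-12)

# Toy check: a model already confident that y != 2 assigns high
# P(ybar = 2 | x), so the annotation "not class 2" incurs a small loss.
probs = np.array([0.45, 0.45, 0.02, 0.08])        # ordinary P(y | x), K = 4
loss_consistent = complementary_loss(probs, 2)    # label says "not class 2"
loss_inconsistent = complementary_loss(probs, 0)  # label says "not class 0"
assert loss_consistent < loss_inconsistent
```

This is why a yes/no answer on a single randomly chosen class is enough weak supervision: each answer constrains the full conditional distribution through the transform above.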
MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices
The deployment of large-scale text-to-image diffusion models on mobile
devices is impeded by their substantial model size and slow inference speed. In
this paper, we propose MobileDiffusion, a highly efficient
text-to-image diffusion model obtained through extensive optimizations in both
architecture and sampling techniques. We conduct a comprehensive examination of
model architecture design to reduce redundancy, enhance computational
efficiency, and minimize the model's parameter count, while preserving image
generation quality. Additionally, we employ distillation and diffusion-GAN
finetuning techniques on MobileDiffusion to achieve 8-step and 1-step inference
respectively. Empirical studies, conducted both quantitatively and
qualitatively, demonstrate the effectiveness of our proposed techniques.
MobileDiffusion achieves a remarkable sub-second inference speed for
generating an image on mobile devices, establishing a new state
of the art.
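Since each sampling step of a diffusion model is roughly one UNet forward pass, cutting the step count via distillation reduces latency almost linearly. The back-of-envelope sketch below uses hypothetical per-step timings (not measured numbers from the paper) to show why going from a typical many-step sampler to 8 or 1 steps is what makes sub-second mobile inference plausible.

```python
# Back-of-envelope latency sketch with ASSUMED numbers: per_step_ms is a
# hypothetical on-device UNet forward time, not a figure from the paper.
per_step_ms = 80.0

def inference_ms(steps, per_step_ms):
    """Total sampling latency, assuming cost is dominated by UNet passes."""
    return steps * per_step_ms

assert inference_ms(50, per_step_ms) == 4000.0  # typical sampler: ~4 s
assert inference_ms(8, per_step_ms) == 640.0    # distilled 8-step: sub-second
assert inference_ms(1, per_step_ms) == 80.0     # 1-step diffusion-GAN finetune
```

The architectural optimizations shrink the per-step cost, and distillation shrinks the step count; the sub-second result needs both factors.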
Learning to screen Glaucoma like the ophthalmologists
The GAMMA Challenge is organized to encourage AI models to screen for
glaucoma from a combination of a 2D fundus image and a 3D optical coherence
tomography volume, as ophthalmologists do.
Effective Drusen Localization for Early AMD Screening using Sparse Multiple Instance Learning
Age-related Macular Degeneration (AMD) is one of the leading causes of blindness. Automatic screening of AMD has attracted much research effort in recent years because it benefits both patients and ophthalmologists. Drusen are an important clinical indicator of AMD in its early stage, so accurately detecting and localizing drusen is important for AMD detection and grading. In this paper, we propose an effective approach to localize drusen in fundus images. This approach trains a drusen classifier from a weakly labeled dataset, i.e., one where only the existence of drusen is known but not their exact locations or boundaries, by employing Multiple Instance Learning (MIL). Specifically, given the sparsity of drusen in fundus images, we employ sparse Multiple Instance Learning to obtain better performance than classical MIL. Experiments on 350 fundus images, 96 of which have AMD, demonstrate that on the task of AMD detection, multiple instance learning, in both its classical and sparse versions, achieves performance comparable to a fully supervised SVM. On the task of drusen localization, sparse MIL outperforms classical MIL significantly.
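The MIL setup in this abstract can be sketched concretely: each fundus image is a "bag" of patch "instances", the image-level label (drusen present or not) is the only supervision, and per-patch scores recovered by the trained classifier give localization. The sketch below is a generic max-pooling MIL illustration with hypothetical function names and toy scores, not the paper's sparse-MIL formulation, which additionally penalizes the number of positive instances to exploit drusen sparsity.

```python
import numpy as np

# Generic max-pooling MIL sketch (hypothetical, not the paper's exact model):
# a bag (image) is positive iff at least one instance (patch) is positive.
# Only bag labels exist at training time; instance scores yield localization.

def bag_score(instance_scores):
    """Bag-level drusen score: the most suspicious patch decides the image."""
    return np.max(instance_scores)

def localize(instance_scores, threshold=0.5):
    """Indices of patches flagged as drusen. Sparse MIL would additionally
    encourage this set to be small, matching how sparse drusen are."""
    return np.flatnonzero(instance_scores >= threshold)

patch_scores = np.array([0.05, 0.10, 0.92, 0.07, 0.61])  # toy classifier output
assert bag_score(patch_scores) > 0.5           # image-level AMD alarm fires
assert list(localize(patch_scores)) == [2, 4]  # patches 2 and 4 localized
```

The appeal of this weakly supervised route is that annotators only mark whether an image contains drusen, yet the instance scores still provide the per-patch localization that grading needs.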
- …