151 research outputs found

    Generative-Discriminative Complementary Learning

    The majority of state-of-the-art deep learning methods are discriminative approaches, which model the conditional distribution of labels given input features. The success of such approaches heavily depends on high-quality labeled instances, which are not easy to obtain, especially as the number of candidate classes increases. In this paper, we study the complementary learning problem. Unlike ordinary labels, complementary labels are easy to obtain because an annotator only needs to provide a yes/no answer to a randomly chosen candidate class for each instance. We propose a generative-discriminative complementary learning method that estimates the ordinary labels by modeling both the conditional (discriminative) and instance (generative) distributions. Our method, which we call Complementary Conditional GAN (CCGAN), improves the accuracy of predicting ordinary labels and can generate high-quality instances despite the weak supervision. In addition to extensive empirical studies, we theoretically show that our model can retrieve the true conditional distribution from the complementarily-labeled data.
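    The abstract contains no code, but the core idea of learning from complementary labels can be illustrated with the standard forward-correction loss. The sketch below is a minimal PyTorch illustration under the common uniform assumption (each complementary label is drawn uniformly from the K-1 classes the instance does not belong to); it is not the authors' CCGAN, and complementary_loss is a hypothetical helper name.

    # Minimal sketch (not the authors' CCGAN): forward-correction loss for
    # complementary labels under a uniform transition assumption.
    import torch
    import torch.nn.functional as F

    def complementary_loss(logits, comp_labels):
        """Cross-entropy on complementary labels via a uniform transition matrix.

        logits:      (batch, K) raw scores from any discriminative model
        comp_labels: (batch,)   index of a class each instance does NOT belong to
        """
        k = logits.shape[1]
        p = F.softmax(logits, dim=1)                      # p(y | x)
        # Uniform transition: P(y_bar = j | y = i) = 1/(K-1) for j != i
        transition = (torch.ones(k, k) - torch.eye(k)) / (k - 1)
        p_bar = p @ transition                            # p(y_bar | x)
        return F.nll_loss(torch.log(p_bar + 1e-12), comp_labels)

    Minimizing this loss on complementarily-labeled batches trains an ordinary-label classifier, which is the discriminative half of the generative-discriminative idea described above.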

    MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices

    The deployment of large-scale text-to-image diffusion models on mobile devices is impeded by their substantial model size and slow inference speed. In this paper, we propose MobileDiffusion, a highly efficient text-to-image diffusion model obtained through extensive optimizations in both architecture and sampling techniques. We conduct a comprehensive examination of model architecture design to reduce redundancy, enhance computational efficiency, and minimize the model's parameter count, while preserving image generation quality. Additionally, we employ distillation and diffusion-GAN finetuning techniques on MobileDiffusion to achieve 8-step and 1-step inference, respectively. Empirical studies, conducted both quantitatively and qualitatively, demonstrate the effectiveness of our proposed techniques. MobileDiffusion achieves a remarkable sub-second inference speed for generating a 512×512 image on mobile devices, establishing a new state of the art.
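    To make the 8-step inference claim concrete, the sketch below shows a generic deterministic DDIM-style sampling loop run with eight evenly spaced steps. It is only an assumption-laden illustration of few-step sampling, not MobileDiffusion's actual distilled sampler; denoiser, text_emb, and the alphas_bar schedule are placeholders.

    # Minimal sketch (not MobileDiffusion's pipeline): deterministic DDIM-style
    # sampling with only 8 steps, as enabled by a distilled denoiser.
    import torch

    @torch.no_grad()
    def few_step_sample(denoiser, text_emb, alphas_bar, num_steps=8, shape=(1, 4, 64, 64)):
        """Generate a latent with `num_steps` evenly spaced DDIM updates."""
        timesteps = torch.linspace(len(alphas_bar) - 1, 0, num_steps).long()
        x = torch.randn(shape)                                   # start from pure noise
        for i, t in enumerate(timesteps):
            a_t = alphas_bar[t]
            a_prev = alphas_bar[timesteps[i + 1]] if i + 1 < num_steps else torch.tensor(1.0)
            eps = denoiser(x, t, text_emb)                       # predicted noise
            x0 = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()       # predicted clean latent
            x = a_prev.sqrt() * x0 + (1 - a_prev).sqrt() * eps   # deterministic DDIM step
        return x

    Cutting the loop from hundreds of steps down to eight (or one, with diffusion-GAN finetuning) is what makes sub-second on-device generation plausible.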

    Learning to screen Glaucoma like the ophthalmologists

    The GAMMA Challenge is organized to encourage AI models to screen for glaucoma from a combination of a 2D fundus image and a 3D optical coherence tomography volume, as ophthalmologists do.
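    As a rough illustration of how the two modalities mentioned above might be combined, the sketch below shows a simple two-branch PyTorch model that encodes the 2D fundus image and the 3D OCT volume separately and fuses the pooled features for a binary glaucoma decision. This is an assumed baseline design, not an architecture from the challenge.

    # Minimal sketch (assumed baseline, not a GAMMA entry): late fusion of a
    # 2D fundus branch and a 3D OCT branch for glaucoma screening.
    import torch
    import torch.nn as nn

    class FundusOCTFusion(nn.Module):
        def __init__(self, num_classes=2):
            super().__init__()
            self.fundus_branch = nn.Sequential(          # 2D CNN for the fundus image
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
            self.oct_branch = nn.Sequential(             # 3D CNN for the OCT volume
                nn.Conv3d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool3d(1), nn.Flatten())
            self.head = nn.Linear(64, num_classes)       # classifier on fused features

        def forward(self, fundus, oct_volume):
            f = self.fundus_branch(fundus)               # (batch, 32)
            o = self.oct_branch(oct_volume)              # (batch, 32)
            return self.head(torch.cat([f, o], dim=1))   # (batch, num_classes)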

    Effective Drusen Localization for Early AMD Screening using Sparse Multiple Instance Learning

    Age-related Macular Degeneration (AMD) is one of the leading causes of blindness. Automatic screening of AMD has attracted much research effort in recent years because it brings benefits to both patients and ophthalmologists. Drusen are an important clinical indicator of AMD in its early stage, so accurately detecting and localizing drusen is important for AMD detection and grading. In this paper, we propose an effective approach to localize drusen in fundus images. The approach trains a drusen classifier from a weakly labeled dataset, i.e., one where only the existence of drusen is known but not the exact locations or boundaries, by employing Multiple Instance Learning (MIL). Specifically, given the sparsity of drusen in fundus images, we employ sparse Multiple Instance Learning to obtain better performance than classical MIL. Experiments on 350 fundus images, 96 of which have AMD, demonstrate that on the task of AMD detection, multiple instance learning, in both its classical and sparse versions, achieves performance comparable to a fully supervised SVM. On the task of drusen localization, sparse MIL outperforms MIL significantly.
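    The sketch below illustrates one plausible reading of the sparse MIL setup: each fundus image is a bag of patch features, the bag-level drusen score is the maximum patch score, and an L1-style penalty on patch scores encodes the prior that drusen occupy few patches. The class name, feature dimension, and penalty weight are assumptions, not the paper's exact formulation.

    # Minimal sketch (assumed formulation, not the paper's): max-pooled MIL
    # with an L1-style sparsity prior on per-patch drusen scores.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SparseMIL(nn.Module):
        def __init__(self, feat_dim=128, sparsity_weight=0.01):
            super().__init__()
            self.scorer = nn.Linear(feat_dim, 1)          # per-patch drusen score
            self.sparsity_weight = sparsity_weight

        def forward(self, bags, bag_labels):
            """bags: (batch, num_patches, feat_dim); bag_labels: (batch,) in {0, 1}."""
            patch_scores = torch.sigmoid(self.scorer(bags)).squeeze(-1)  # (batch, patches)
            bag_scores = patch_scores.max(dim=1).values                  # bag = max over patches
            bce = F.binary_cross_entropy(bag_scores, bag_labels.float())
            sparsity = patch_scores.mean()                               # L1-style sparsity prior
            return bce + self.sparsity_weight * sparsity, patch_scores   # scores localize drusen

    At test time the per-patch scores act as a localization map: high-scoring patches indicate likely drusen even though training used only image-level labels.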
    • …