16 research outputs found

    DigGAN: Discriminator gradIent Gap Regularization for GAN Training with Limited Data

    Generative adversarial nets (GANs) have been remarkably successful at learning to sample from distributions specified by a given dataset, particularly if the given dataset is reasonably large compared to its dimensionality. However, given limited data, classical GANs have struggled, and strategies like output-regularization, data-augmentation, use of pre-trained models and pruning have been shown to lead to improvements. Notably, these strategies 1) are often constrained to particular settings, e.g., availability of a pretrained GAN; or 2) increase training time, e.g., when using pruning. In contrast, we propose a Discriminator gradIent Gap regularized GAN (DigGAN) formulation which can be added to any existing GAN. DigGAN augments existing GANs by encouraging a narrower gap between the norm of the gradient of a discriminator's prediction w.r.t. real images and w.r.t. the generated samples. We observe this formulation to avoid bad attractors within the GAN loss landscape, and we find DigGAN to significantly improve the results of GAN training when limited data is available. Code is available at https://github.com/AilsaF/DigGAN.
    Comment: Accepted to NeurIPS 202
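
The gradient-gap idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation (see their repository for that). It assumes the penalty is the squared difference between the batch-averaged gradient norms, estimates gradients with finite differences instead of automatic differentiation, and uses a hypothetical `toy_discriminator`; the names `grad_norm` and `gradient_gap_penalty` are also invented for illustration.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def grad_norm(disc: Callable[[Vector], float], x: Vector, eps: float = 1e-5) -> float:
    """L2 norm of d disc(x) / d x, estimated with central finite differences."""
    total = 0.0
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        partial = (disc(hi) - disc(lo)) / (2.0 * eps)
        total += partial * partial
    return math.sqrt(total)


def gradient_gap_penalty(
    disc: Callable[[Vector], float],
    real_batch: Sequence[Vector],
    fake_batch: Sequence[Vector],
) -> float:
    """Squared gap between mean gradient norms on real and generated samples.

    Assumption: the squared-difference form is chosen for illustration only.
    """
    real_norm = sum(grad_norm(disc, x) for x in real_batch) / len(real_batch)
    fake_norm = sum(grad_norm(disc, x) for x in fake_batch) / len(fake_batch)
    return (real_norm - fake_norm) ** 2


def toy_discriminator(x: Vector) -> float:
    """Hypothetical 2-D discriminator: tanh of a fixed linear projection."""
    weights = (0.5, -1.0)
    return math.tanh(sum(w * xi for w, xi in zip(weights, x)))


if __name__ == "__main__":
    real = [(0.0, 0.0), (0.1, -0.1)]
    fake = [(2.0, -2.0), (1.5, -1.8)]
    print(gradient_gap_penalty(toy_discriminator, real, fake))
```

In a real training loop a term like this would presumably be scaled by a weight and added to the discriminator loss, with gradients taken by the framework's autodiff rather than by finite differences.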

    AutoFocusFormer: Image Segmentation off the Grid

    Real world images often have highly imbalanced content density. Some areas are very uniform, e.g., large patches of blue sky, while other areas are scattered with many small objects. Yet, the commonly used successive grid downsampling strategy in convolutional deep networks treats all areas equally. Hence, small objects are represented in very few spatial locations, leading to worse results in tasks such as segmentation. Intuitively, retaining more pixels representing small objects during downsampling helps to preserve important information. To achieve this, we propose AutoFocusFormer (AFF), a local-attention transformer image recognition backbone, which performs adaptive downsampling by learning to retain the most important pixels for the task. Since adaptive downsampling generates a set of pixels irregularly distributed on the image plane, we abandon the classic grid structure. Instead, we develop a novel point-based local attention block, facilitated by a balanced clustering module and a learnable neighborhood merging module, which yields representations for our point-based versions of state-of-the-art segmentation heads. Experiments show that our AutoFocusFormer (AFF) improves significantly over baseline models of similar sizes.
    Comment: CVPR 202
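
As a rough illustration of the adaptive downsampling described in the abstract above, the sketch below keeps the highest-scoring points instead of sampling on a fixed grid. It is only a toy under stated assumptions: the importance scores are supplied by the caller (in the paper they are learned), selection is a plain top-k, and the function name `adaptive_downsample` and its `keep_ratio` parameter are hypothetical. The clustering and neighborhood-merging modules are not modeled.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def adaptive_downsample(
    points: Sequence[Point],
    scores: Sequence[float],
    keep_ratio: float = 0.25,
) -> List[Point]:
    """Keep the top-scoring fraction of points, preserving their input order."""
    if len(points) != len(scores):
        raise ValueError("points and scores must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(len(points) * keep_ratio))
    # Rank indices by importance, highest first, then restore spatial order.
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [points[i] for i in kept]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    importance = [0.1, 0.9, 0.2, 0.8]
    print(adaptive_downsample(pts, importance, keep_ratio=0.5))
```

The surviving points are irregularly placed, which is why the abstract describes moving from grid-based operations to point-based local attention.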

    Pseudo-Generalized Dynamic View Synthesis from a Video

    Rendering scenes observed in a monocular video from novel viewpoints is a challenging problem. For static scenes the community has studied both scene-specific optimization techniques, which optimize on every test scene, and generalized techniques, which only run a deep net forward pass on a test scene. In contrast, for dynamic scenes, scene-specific optimization techniques exist, but, to the best of our knowledge, there is currently no generalized method for dynamic novel view synthesis from a given monocular video. To answer whether generalized dynamic novel view synthesis from monocular videos is possible today, we establish an analysis framework based on existing techniques and work toward the generalized approach. We find a pseudo-generalized process without scene-specific appearance optimization is possible, but geometrically and temporally consistent depth estimates are needed. Despite no scene-specific appearance optimization, the pseudo-generalized approach improves upon some scene-specific methods.
    Comment: ICLR 2024; Originally titled as "Is Generalized Dynamic Novel View Synthesis from Monocular Videos Possible Today?"; Project page: https://xiaoming-zhao.github.io/projects/pgdv

    Function and flexibility of object exploration in kea and New Caledonian crows

    Data collection with the New Caledonian crows was funded by an International Seedcorn Award from the University of York to M.L.L. This study was supported by a Rutherford Discovery Fellowship (A.H.T.). Our data are deposited at: http://dx.doi.org/10.5061/dryad.dq04j [48].
    A range of non-human animals frequently manipulate and explore objects in their environment, which may enable them to learn about physical properties and potentially form more abstract concepts of properties such as weight and rigidity. Whether animals can apply the information learned during their exploration to solve novel problems, however, and whether they actually change their exploratory behavior to seek functional information about objects have not been fully explored. We allowed kea (Nestor notabilis) and New Caledonian crows (Corvus moneduloides) to explore sets of novel objects both before and after encountering a task in which some of the objects could function as tools. Following this, subjects were given test trials in which they could choose among the objects they had explored to solve a tool-use task. Several individuals from both species performed above chance on these test trials, and only did so after exploring the objects, compared with a control experiment with no prior exploration phase. These results suggest that selection of functional tools may be guided by information acquired during exploration. Neither kea nor crows changed the duration or quality of their exploration after learning that the objects had a functional relevance, suggesting that birds do not adjust their behavior to explicitly seek this information.
    Publisher PDF. Peer reviewed.

    Enabling real-time multi-messenger astrophysics discoveries with deep learning

    Multi-messenger astrophysics is a fast-growing, interdisciplinary field that combines data, which vary in volume and speed of data processing, from many different instruments that probe the Universe using different cosmic messengers: electromagnetic waves, cosmic rays, gravitational waves and neutrinos. In this Expert Recommendation, we review the key challenges of real-time observations of gravitational wave sources and their electromagnetic and astroparticle counterparts, and make a number of recommendations to maximize their potential for scientific discovery. These recommendations refer to the design of scalable and computationally efficient machine learning algorithms; the cyber-infrastructure to numerically simulate astrophysical sources, and to process and interpret multi-messenger astrophysics data; the management of gravitational wave detections to trigger real-time alerts for electromagnetic and astroparticle follow-ups; a vision to harness future developments of machine learning and cyber-infrastructure resources to cope with the big-data requirements; and the need to build a community of experts to realize the goals of multi-messenger astrophysics.
