DigGAN: Discriminator gradIent Gap Regularization for GAN Training with Limited Data
Generative adversarial nets (GANs) have been remarkably successful at
learning to sample from distributions specified by a given dataset,
particularly if the given dataset is reasonably large compared to its
dimensionality. However, given limited data, classical GANs have struggled, and
strategies like output-regularization, data-augmentation, use of pre-trained
models and pruning have been shown to lead to improvements. Notably, the
applicability of these strategies is 1) often constrained to particular
settings, e.g., availability of a pretrained GAN; or 2) increases training
time, e.g., when using pruning. In contrast, we propose a Discriminator
gradIent Gap regularized GAN (DigGAN) formulation which can be added to any
existing GAN. DigGAN augments existing GANs by encouraging a narrowing of the gap
between the norm of the gradient of a discriminator's prediction w.r.t.\ real
images and w.r.t.\ the generated samples. We observe this formulation to avoid
bad attractors within the GAN loss landscape, and we find DigGAN to
significantly improve the results of GAN training when limited data is
available. Code is available at \url{https://github.com/AilsaF/DigGAN}.
Comment: Accepted to NeurIPS 202
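The gradient-gap penalty described above can be illustrated with a minimal NumPy sketch. This is our own toy example, not the paper's implementation: it uses a hypothetical linear-logit discriminator so the input gradients have a closed form, whereas the actual method differentiates a deep discriminator via autograd.

```python
import numpy as np

def disc(x, w):
    # Toy discriminator with a linear logit: D(x) = sigmoid(w . x).
    # (Illustrative stand-in for a deep discriminator network.)
    return 1.0 / (1.0 + np.exp(-(x @ w)))

def disc_grad_norms(x, w):
    # Analytic gradient of D w.r.t. each input sample:
    # dD/dx = D(1 - D) * w, so ||dD/dx|| = D(1 - D) * ||w||.
    d = disc(x, w)
    return d * (1.0 - d) * np.linalg.norm(w)

def dig_gap_penalty(x_real, x_fake, w):
    # Squared gap between the mean discriminator-gradient norms on
    # real vs. generated batches; added on top of the usual GAN loss.
    gap = disc_grad_norms(x_real, w).mean() - disc_grad_norms(x_fake, w).mean()
    return gap ** 2
```

The penalty is zero exactly when the two mean gradient norms match, which is the condition the regularizer pushes training toward.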
AutoFocusFormer: Image Segmentation off the Grid
Real world images often have highly imbalanced content density. Some areas
are very uniform, e.g., large patches of blue sky, while other areas are
scattered with many small objects. Yet, the commonly used successive grid
downsampling strategy in convolutional deep networks treats all areas equally.
Hence, small objects are represented in very few spatial locations, leading to
worse results in tasks such as segmentation. Intuitively, retaining more pixels
representing small objects during downsampling helps to preserve important
information. To achieve this, we propose AutoFocusFormer (AFF), a
local-attention transformer image recognition backbone, which performs adaptive
downsampling by learning to retain the most important pixels for the task.
Since adaptive downsampling generates a set of pixels irregularly distributed
on the image plane, we abandon the classic grid structure. Instead, we develop
a novel point-based local attention block, facilitated by a balanced clustering
module and a learnable neighborhood merging module, which yields
representations for our point-based versions of state-of-the-art segmentation
heads. Experiments show that our AutoFocusFormer (AFF) improves significantly
over baseline models of similar sizes.
Comment: CVPR 202
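The core idea of adaptive downsampling, keeping the tokens most important to the task rather than pooling on a uniform grid, can be sketched as a top-k selection over per-token importance scores. This is a simplified illustration under our own assumptions: the function name `adaptive_downsample`, the `keep_ratio` parameter, and the use of precomputed scores are hypothetical stand-ins for AFF's learned merging and clustering modules.

```python
import numpy as np

def adaptive_downsample(feats, scores, keep_ratio=0.25):
    # feats:  (N, C) per-token features; no grid structure is assumed.
    # scores: (N,) importance score per token (learned in the real model).
    # Retains the top keep_ratio fraction of tokens by score, so regions
    # dense with small objects can keep more tokens than uniform regions.
    n_keep = max(1, int(round(len(scores) * keep_ratio)))
    keep = np.sort(np.argsort(scores)[-n_keep:])  # indices of kept tokens
    return feats[keep], keep
```

Because the surviving tokens are irregularly placed, downstream blocks must operate on point sets rather than grids, which is what motivates the point-based local attention described in the abstract.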
Pseudo-Generalized Dynamic View Synthesis from a Video
Rendering scenes observed in a monocular video from novel viewpoints is a
challenging problem. For static scenes the community has studied both
scene-specific optimization techniques, which optimize on every test scene, and
generalized techniques, which only run a deep net forward pass on a test scene.
In contrast, for dynamic scenes, scene-specific optimization techniques exist,
but, to our best knowledge, there is currently no generalized method for
dynamic novel view synthesis from a given monocular video. To answer whether
generalized dynamic novel view synthesis from monocular videos is possible
today, we establish an analysis framework based on existing techniques and work
toward the generalized approach. We find a pseudo-generalized process without
scene-specific appearance optimization is possible, but geometrically and
temporally consistent depth estimates are needed. Despite no scene-specific
appearance optimization, the pseudo-generalized approach improves upon some
scene-specific methods.
Comment: ICLR 2024; Originally titled as "Is Generalized Dynamic Novel View
Synthesis from Monocular Videos Possible Today?"; Project page:
https://xiaoming-zhao.github.io/projects/pgdv
Function and flexibility of object exploration in kea and New Caledonian crows
Data collection with the New Caledonian crows was funded by an International Seedcorn Award from the University of York to M.L.L. This study was supported by a Rutherford Discovery Fellowship (A.H.T.). Our data are deposited at: http://dx.doi.org/10.5061/dryad.dq04j [48].
A range of non-human animals frequently manipulate and explore objects in their environment, which may enable them to learn about physical properties and potentially form more abstract concepts of properties such as weight and rigidity. Whether animals can apply the information learned during their exploration to solve novel problems, however, and whether they actually change their exploratory behavior to seek functional information about objects have not been fully explored. We allowed kea (Nestor notabilis) and New Caledonian crows (Corvus moneduloides) to explore sets of novel objects both before and after encountering a task in which some of the objects could function as tools. Following this, subjects were given test trials in which they could choose among the objects they had explored to solve a tool-use task. Several individuals from both species performed above chance on these test trials, and only did so after exploring the objects, compared with a control experiment with no prior exploration phase. These results suggest that selection of functional tools may be guided by information acquired during exploration. Neither kea nor crows changed the duration or quality of their exploration after learning that the objects had a functional relevance, suggesting that birds do not adjust their behavior to explicitly seek this information.
Enabling real-time multi-messenger astrophysics discoveries with deep learning
Multi-messenger astrophysics is a fast-growing, interdisciplinary field that combines data, which vary in volume and speed of data processing, from many different instruments that probe the Universe using different cosmic messengers: electromagnetic waves, cosmic rays, gravitational waves and neutrinos. In this Expert Recommendation, we review the key challenges of real-time observations of gravitational wave sources and their electromagnetic and astroparticle counterparts, and make a number of recommendations to maximize their potential for scientific discovery. These recommendations refer to the design of scalable and computationally efficient machine learning algorithms; the cyber-infrastructure to numerically simulate astrophysical sources, and to process and interpret multi-messenger astrophysics data; the management of gravitational wave detections to trigger real-time alerts for electromagnetic and astroparticle follow-ups; a vision to harness future developments of machine learning and cyber-infrastructure resources to cope with the big-data requirements; and the need to build a community of experts to realize the goals of multi-messenger astrophysics.