296 research outputs found

    Learning Dense Correspondences between Photos and Sketches

    Humans effortlessly grasp the connection between sketches and real-world objects, even when these sketches are far from realistic. Moreover, human sketch understanding goes beyond categorization -- critically, it also entails understanding how individual elements within a sketch correspond to parts of the physical world it represents. What are the computational ingredients needed to support this ability? Towards answering this question, we make two contributions: first, we introduce a new sketch-photo correspondence benchmark, PSC6k, containing 150K annotations of 6250 sketch-photo pairs across 125 object categories, augmenting the existing Sketchy dataset with fine-grained correspondence metadata. Second, we propose a self-supervised method for learning dense correspondences between sketch-photo pairs, building upon recent advances in correspondence learning for pairs of photos. Our model uses a spatial transformer network to estimate the warp flow between latent representations of a sketch and photo extracted by a contrastive learning-based ConvNet backbone. We found that this approach outperformed several strong baselines and produced predictions that were quantitatively consistent with other warp-based methods. However, our benchmark also revealed systematic differences between predictions of the suite of models we tested and those of humans. Taken together, our work suggests a promising path towards developing artificial systems that achieve more human-like understanding of visual images at different levels of abstraction. Project page: https://photo-sketch-correspondence.github.io
    Comment: Accepted to ICML 2023.

    Visual scoping operations for physical assembly

    Planning is hard. The use of subgoals can make planning more tractable, but selecting these subgoals is computationally costly. What algorithms might enable us to reap the benefits of planning using subgoals while minimizing the computational overhead of selecting them? We propose visual scoping, a strategy that interleaves planning and acting by alternately defining a spatial region as the next subgoal and selecting actions to achieve it. We evaluated our visual scoping algorithm on a variety of physical assembly problems against two baselines: planning all subgoals in advance and planning without subgoals. We found that visual scoping achieves comparable task performance to the subgoal planner while requiring only a fraction of the total computational cost. Together, these results contribute to our understanding of how humans might make efficient use of cognitive resources to solve complex planning problems

    Drawing as a versatile cognitive tool

    Drawing is a cognitive tool that makes the invisible contents of mental life visible. Humans use this tool to produce a remarkable variety of pictures, from realistic portraits to schematic diagrams. Despite this variety and the prevalence of drawn images, the psychological mechanisms that enable drawings to be so versatile have yet to be fully explored. In this Review, we synthesize contemporary work in multiple areas of psychology, computer science and neuroscience that examines the cognitive processes involved in drawing production and comprehension. This body of findings suggests that the balance of contributions from perception, memory and social inference during drawing production varies depending on the situation, resulting in some drawings that are more realistic and other drawings that are more abstract. We also consider the use of drawings as a research tool for investigating various aspects of cognition, as well as the role that drawing has in facilitating learning and communication. Taken together, information about how drawings are used in different contexts illuminates the central role of visually grounded abstractions in human thought and behaviour

    Learning to communicate about shared procedural abstractions

    Many real-world tasks require agents to coordinate their behavior to achieve shared goals. Successful collaboration requires not only adopting the same communicative conventions, but also grounding these conventions in the same task-appropriate conceptual abstractions. We investigate how humans use natural language to collaboratively solve physical assembly problems more effectively over time. Human participants were paired up in an online environment to reconstruct scenes containing two block towers. One participant could see the target towers, and sent assembly instructions for the other participant to reconstruct. Participants provided increasingly concise instructions across repeated attempts on each pair of towers, using higher-level referring expressions that captured each scene's hierarchical structure. To explain these findings, we extend recent probabilistic models of ad-hoc convention formation with an explicit perceptual learning mechanism. These results shed light on the inductive biases that enable intelligent agents to coordinate upon shared procedural abstractions

    Impaired Autophagic Clearance with a Gain-of-Function Variant of the Lysosomal Cl−/H+ Exchanger ClC-7

    ClC-7 is a ubiquitously expressed voltage-gated Cl−/H+ exchanger that critically contributes to lysosomal ion homeostasis. Together with its β-subunit Ostm1, ClC-7 localizes to lysosomes and to the ruffled border of osteoclasts, where it supports the acidification of the resorption lacuna. Loss of ClC-7 or Ostm1 leads to osteopetrosis accompanied by accumulation of storage material in lysosomes and neurodegeneration. Interestingly, not all osteopetrosis-causing CLCN7 mutations from patients are associated with a loss of ion transport. Some rather result in an acceleration of voltage-dependent ClC-7 activation. Recently, a gain-of-function variant, ClC-7Y715C, that yields larger ion currents upon heterologous expression, was identified in two patients with neurodegeneration, organomegaly and albinism. However, neither the patients nor a mouse model that carried the equivalent mutation developed osteopetrosis, although expression of ClC-7Y715C induced the formation of enlarged intracellular vacuoles. Here, we investigated how, in transfected cells with mutant ClC-7, the substitution of this tyrosine impinged on the morphology and function of lysosomes. Combinations of the tyrosine mutation with mutations that either uncouple Cl− from H+ counter-transport or strongly diminish overall ion currents were used to show that increased ClC-7 Cl−/H+ exchange activity is required for the formation of enlarged vacuoles by membrane fusion. Degradation of endocytosed material was reduced in these compartments and resulted in an accumulation of lysosomal storage material. In cells expressing the ClC-7 gain-of-function mutant, autophagic clearance was largely impaired, resulting in a build-up of autophagic material

    Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties

    General physical scene understanding requires more than simply localizing and recognizing objects -- it requires knowledge that objects can have different latent properties (e.g., mass or elasticity), and that those properties affect the outcome of physical events. While there has been great progress in physical and video prediction models in recent years, benchmarks to test their performance typically do not require an understanding that objects have individual physical properties, or at best test only those properties that are directly observable (e.g., size or color). This work proposes a novel dataset and benchmark, termed Physion++, that rigorously evaluates visual physical prediction in artificial systems under circumstances where those predictions rely on accurate estimates of the latent physical properties of objects in the scene. Specifically, we test scenarios where accurate prediction relies on estimates of properties such as mass, friction, elasticity, and deformability, and where the values of those properties can only be inferred by observing how objects move and interact with other objects or fluids. We evaluate the performance of a number of state-of-the-art prediction models that span a variety of levels of learning vs. built-in knowledge, and compare that performance to a set of human predictions. We find that models that have been trained using standard regimes and datasets do not spontaneously learn to make inferences about latent properties, but also that models that encode objectness and physical states tend to make better predictions. However, there is still a huge gap between all models and human performance, and all models' predictions correlate poorly with those made by humans, suggesting that no state-of-the-art model is learning to make physical predictions in a human-like way. Project page: https://dingmyu.github.io/physion_v2

    Next-Generation Comprehensive Data-Driven Models of Solar Eruptive Events

    Solar flares and coronal mass ejections are interrelated phenomena that together are known as solar eruptive events. These are the main drivers of space weather and understanding their origins is a primary goal of Heliophysics. In this white paper, we advocate for the allocation of sufficient resources to bring together experts in observations and modeling to construct and test next generation data-driven models of solar eruptive events. We identify the key components necessary for constructing comprehensive end-to-end models including global scale 3D MHD resolving magnetic field evolution and reconnection, small scale simulations of particle acceleration in reconnection exhausts, kinetic scale transport of flare-accelerated particles into the lower solar atmosphere, and the radiative and hydrodynamic responses of the solar atmosphere to flare heating. Using this modeling framework, long-standing questions regarding how solar eruptive events release energy, accelerate particles, and heat plasma can be explored. To address open questions in solar flare physics, we recommend that NASA and NSF provide sufficient research and analysis funds to bring together a large body of researchers and numerical tools to tackle the end-to-end modeling framework that we outline. Current dedicated theory and modeling funding programs are relatively small scale and infrequent; funding agencies must recognize that modern space physics demands the use of both observations and modeling to make rapid progress.
    Comment: White paper submitted to the Decadal Survey for Solar and Space Physics (Heliophysics) 2024-2033; 9 pages, 4 figures.

    The Rapidly Flaring Afterglow of the Very Bright and Energetic GRB 070125

    We report on multi-wavelength observations, ranging from the X-ray to radio wave bands, of the IPN-localized gamma-ray burst GRB 070125. Spectroscopic observations reveal the presence of absorption lines due to O I, Si II, and C IV, implying a likely redshift of z = 1.547. The well-sampled light curves, in particular from 0.5 to 4 days after the burst, suggest a jet break at 3.7 days, corresponding to a jet opening angle of ~7.0 degrees, and implying an intrinsic GRB energy in the 1 - 10,000 keV band of around E = (6.3 - 6.9) x 10^(51) erg (based on the fluences measured by the gamma-ray detectors of the IPN network). GRB 070125 is among the brightest afterglows observed to date. The spectral energy distribution implies a host extinction of Av < 0.9 mag. Two rebrightening episodes are observed, one with excellent time coverage, showing an increase in flux of 56% in ~8000 seconds. The evolution of the afterglow light curve is achromatic at all times. Late-time observations of the afterglow do not show evidence for emission from an underlying host galaxy or supernova. Any host galaxy would be subluminous, consistent with current GRB host-galaxy samples. Evidence for strong Mg II absorption features is not found, which is perhaps surprising in view of the relatively high redshift of this burst and the high likelihood for such features along GRB-selected lines of sight.
    Comment: 50 pages, 9 figures, 5 tables. Accepted to the Astrophysical Journal.