
    Learning Video Object Segmentation with Visual Memory

    This paper addresses the task of segmenting moving objects in unconstrained videos. We introduce a novel two-stream neural network with an explicit memory module to achieve this. The two streams of the network encode spatial and temporal features in a video sequence, respectively, while the memory module captures the evolution of objects over time. The module builds a "visual memory" of the video, i.e., a joint representation of all the video frames, realized with convolutional gated recurrent units learned from a small number of training video sequences; the convolutional structure allows spatial information to propagate over time. Given a video frame as input, our approach assigns each pixel an object or background label based on the learned spatio-temporal features as well as the "visual memory" specific to the video, acquired automatically without any manually annotated frames. We evaluate our method extensively on two benchmarks, the DAVIS and Freiburg-Berkeley motion segmentation datasets, and show state-of-the-art results. For example, our approach outperforms the top method on the DAVIS dataset by nearly 6%. We also provide an extensive ablative analysis to investigate the influence of each component in the proposed framework.
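    As a concrete illustration of the memory module described above, here is a minimal PyTorch sketch of a convolutional GRU cell; it is not the authors' implementation, and the channel counts, names, and the frame loop are illustrative assumptions.

```python
# Minimal convolutional GRU cell, a sketch of the kind of memory module
# described above (illustrative; not the authors' implementation).
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        pad = k // 2
        # Gates are computed with convolutions, so spatial structure is kept
        # and information can propagate spatially as the memory is updated.
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=pad)
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=pad)

    def forward(self, x, h):
        # x: (B, in_ch, H, W) features for the current frame
        # h: (B, hid_ch, H, W) visual memory carried over previous frames
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde  # updated visual memory

# Usage: one step per frame, threading the hidden state through the video.
cell = ConvGRUCell(in_ch=64, hid_ch=64)
h = torch.zeros(1, 64, 32, 32)
for frame_feat in torch.randn(5, 1, 64, 32, 32):  # 5 frames of features
    h = cell(frame_feat, h)
```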

    ImageJ2: ImageJ for the next generation of scientific image data

    ImageJ is an image analysis program extensively used in the biological sciences and beyond. Due to its ease of use, recordable macro language, and extensible plugin architecture, ImageJ enjoys contributions from non-programmers, amateur programmers, and professional developers alike. Enabling such a diversity of contributors has resulted in a large community that spans the biological and physical sciences. However, a rapidly growing user base, diverging plugin suites, and technical limitations have revealed a clear need for a concerted software engineering effort to support emerging imaging paradigms and to ensure the software's ability to handle the requirements of modern science. Due to these new and emerging challenges in scientific imaging, ImageJ is at a critical development crossroads. We present ImageJ2, a total redesign of ImageJ offering a host of new functionality. It separates concerns, fully decoupling the data model from the user interface. It emphasizes integration with external applications to maximize interoperability. Its robust new plugin framework allows everything from image formats to scripting languages to visualization to be extended by the community. The redesigned data model supports arbitrarily large, N-dimensional datasets, which are increasingly common in modern image acquisition. Despite the scope of these changes, backwards compatibility is maintained, so this new functionality can be seamlessly integrated with the classic ImageJ interface, allowing users and developers to migrate to the new methods at their own pace. ImageJ2 provides a framework engineered for flexibility, intended to support these requirements as well as accommodate future needs.
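    For a sense of the interoperability the redesign targets, here is a hedged sketch of opening an N-dimensional dataset through an ImageJ2 gateway from Python via the separate pyimagej project; the file name is hypothetical, and the snippet assumes a working pyimagej installation rather than documenting ImageJ2's own internals.

```python
# Sketch of driving ImageJ2 from Python through the separate pyimagej
# project (illustrative; 'cells.tif' is a hypothetical file).
import imagej

ij = imagej.init()                   # start an ImageJ2 gateway
dataset = ij.io().open('cells.tif')  # open an N-dimensional dataset
print(dataset.numDimensions())       # ImgLib2-backed data is N-dimensional
img = ij.py.from_java(dataset)       # convert to a NumPy-compatible array
print(img.shape)                     # work on the same data from Python
```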

    Extending Cognitive Architectures with Spatial and Visual Imagery Mechanisms

    This research presents a computational synthesis of cognition with spatial and visual imagery processing by extending a symbolic cognitive architecture (Soar) with mechanisms to support reasoning with quantitative spatial and visual depictive representations. Inspired by psychological and neurological evidence of mental imagery, our primary goals are to achieve new functional capability and computational efficiency in a task-independent manner. We describe how our theory and the corresponding architecture derive from behavioral, biological, functional, and computational constraints and demonstrate results from three different domains. Our evaluation reveals that in tasks where reasoning includes many spatial or visual properties, the combination of amodal and perceptual representations provides an agent with additional functional capability and improves its problem-solving quality. We also show that specialized processing units specific to a perceptual representation but independent of task knowledge are likely to be necessary in order to realize computational efficiency in a general manner. The research is significant because past research in cognitive architectures primarily views amodal, symbolic representations as sufficient for knowledge representation and thought. We expand those ideas with the notion that perceptual-based representations participate directly in thinking rather than serving simply as a source of sensory information. The new capabilities of the resulting architecture, which includes Soar and its Spatial-Visual Imagery (SVI) component, emerge from its ability to amalgamate symbolic and perceptual representations and use them to inform reasoning. Soar’s symbolic memories and processes provide the building blocks necessary for high-level control in the pursuit of goals, learning, and the encoding of amodal, symbolic knowledge for abstract reasoning. SVI encompasses the quantitative spatial and visual depictive representations and processes specialized for efficient construction and extraction of spatial and visual properties.
    PhD thesis, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/60876/1/slathrop_1.pd
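    The claim that perceptual representations participate directly in reasoning can be made concrete with a toy sketch (all names are hypothetical; this is not Soar or SVI code): symbolic relations are extracted on demand from a quantitative spatial scene instead of being stored exhaustively as amodal facts.

```python
# Toy illustration of amodal symbols backed by a quantitative spatial
# representation (hypothetical names; not Soar/SVI code).
scene = {                  # quantitative spatial representation
    'cup':  (0.2, 1.0),    # (x, y) positions in metres
    'book': (0.9, 1.0),
    'lamp': (0.9, 2.1),
}

def left_of(a, b):
    # Extract a symbolic relation by inspecting the spatial layout,
    # rather than enumerating every pairwise relation symbolically.
    return scene[a][0] < scene[b][0]

# Symbolic memory holds only the facts extracted so far; the spatial
# representation answers further relational queries as they arise.
symbolic_facts = set()
if left_of('cup', 'book'):
    symbolic_facts.add(('left-of', 'cup', 'book'))
print(symbolic_facts)
```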

    Disorder and interference: localization phenomena

    The specific problem we address in these lectures is the problem of transport and localization in disordered systems, when interference is present, as is characteristic for waves, with a focus on realizations with ultracold atoms.
    Comment: Notes of a lecture delivered at the Les Houches School of Physics on "Ultracold gases and quantum information", 2009, Singapore. v3: corrected mistakes, improved script for numerics. Chapter 9 in "Les Houches 2009 - Session XCI: Ultracold Gases and Quantum Information", edited by C. Miniatura et al. (Oxford University Press, 2011).
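    The localization phenomena treated in these lectures can be reproduced with a few lines of numerics. The sketch below (not the notes' own script; the system size and disorder strength are arbitrary choices) diagonalizes a 1D Anderson tight-binding Hamiltonian and uses the inverse participation ratio to distinguish extended from localized eigenstates.

```python
# Minimal 1D Anderson-model sketch (not the notes' script): diagonalize a
# tight-binding Hamiltonian with random on-site disorder and measure how
# localized the eigenstates are via the inverse participation ratio (IPR).
import numpy as np

rng = np.random.default_rng(0)
N, W = 400, 3.0                                  # sites, disorder strength
H = np.diag(rng.uniform(-W / 2, W / 2, N))       # random on-site energies
H += np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)  # hopping

energies, states = np.linalg.eigh(H)
ipr = np.sum(np.abs(states) ** 4, axis=0)        # ~1/N extended, ~1 localized
print(f"mean IPR at W={W}: {ipr.mean():.3f}")    # grows with disorder W
```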

    Activity Analysis: Finding Explanations for Sets of Events

    Automatic activity recognition is the computational process of analysing visual input and reasoning about detections to understand the performed events. In all but the simplest scenarios, an activity involves multiple interleaved events, some related and others independent. The activity in a car park or at a playground would typically include many events. This research assumes that the possible events and any constraints between them can be defined for the given scene. Analysing the activity should thus recognise a complete and consistent set of events; this is referred to as a global explanation of the activity. By seeking a global explanation that satisfies the activity’s constraints, infeasible interpretations can be avoided, and ambiguous observations may be resolved. An activity’s events and any natural constraints are defined using a grammar formalism. Attribute Multiset Grammars (AMG) are chosen because they allow defining hierarchies, as well as attribute rules and constraints. When used for recognition, detectors are employed to gather a set of detections. Parsing the set of detections with the AMG provides a global explanation. To find the best parse tree given a set of detections, a Bayesian network models the probability distribution over the space of possible parse trees. Heuristic and exhaustive search techniques are proposed to find the maximum a posteriori global explanation. The framework is tested on two activities: the activity at a bicycle rack, and around a building entrance. The first case study involves people locking bicycles onto a bicycle rack and picking them up later. The best global explanation for all detections gathered during the day resolves local ambiguities from occlusion or clutter. Intensive testing on five full days showed that global analysis achieves higher recognition rates. The second case study tracks people and any objects they are carrying as they enter and exit a building entrance. A complete sequence of the person entering and exiting multiple times is recovered by the global explanation.
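    The idea of a global explanation can be illustrated with a toy MAP search (hypothetical detections, labels, and scores; not the paper's AMG parser or Bayesian network): each detection admits several interpretations, and the best complete, constraint-satisfying joint assignment is found by searching over whole event sets rather than scoring detections independently.

```python
# Toy exhaustive search for a MAP "global explanation" (hypothetical
# detections and scores; not the paper's AMG/Bayesian-network parser).
from itertools import product
from math import prod

detections = [
    {'drop-off': 0.7, 'clutter': 0.3},   # morning detection at the rack
    {'pick-up': 0.6, 'clutter': 0.4},    # evening detection at the rack
]

def consistent(events):
    # Global constraint: a pick-up needs a preceding drop-off.
    if 'pick-up' in events:
        return 'drop-off' in events[:events.index('pick-up')]
    return True

# Enumerate complete interpretations, keep the consistent ones, and
# take the one maximizing the joint score of its event labels.
candidates = [list(e) for e in product(*(d.keys() for d in detections))
              if consistent(list(e))]
best = max(candidates,
           key=lambda e: prod(detections[i][ev] for i, ev in enumerate(e)))
print(best)   # ['drop-off', 'pick-up'] wins with joint score 0.42
```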

    Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds

    In this paper we address the problems of modeling the acoustic space generated by a full-spectrum sound source and of using the learned model for the localization and separation of multiple sources that simultaneously emit sparse-spectrum sounds. We lay theoretical and methodological grounds in order to introduce the binaural manifold paradigm. We perform an in-depth study of the latent low-dimensional structure of the high-dimensional interaural spectral data, based on a corpus recorded with a human-like audiomotor robot head. A non-linear dimensionality reduction technique is used to show that these data lie on a two-dimensional (2D) smooth manifold parameterized by the motor states of the listener, or equivalently, the sound source directions. We propose a probabilistic piecewise affine mapping model (PPAM) specifically designed to deal with high-dimensional data exhibiting an intrinsic piecewise linear structure. We derive a closed-form expectation-maximization (EM) procedure for estimating the model parameters, followed by Bayes inversion for obtaining the full posterior density function of a sound source direction. We extend this solution to deal with missing data and redundancy in real-world spectrograms, and hence for 2D localization of natural sound sources such as speech. We further generalize the model to the challenging case of multiple sound sources and we propose a variational EM framework. The associated algorithm, referred to as variational EM for source separation and localization (VESSL), yields a Bayesian estimation of the 2D locations and time-frequency masks of all the sources. Comparisons of the proposed approach with several existing methods reveal that the combination of acoustic-space learning with Bayesian inference enables our method to outperform state-of-the-art methods.
    Comment: 19 pages, 9 figures, 3 tables
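    The inversion step at the heart of this approach can be sketched with a simpler stand-in (synthetic 1-D data and a plain Gaussian mixture instead of PPAM; not the authors' code): learn a joint density over (direction, spectral feature) pairs, then condition on an observed feature to recover a posterior over direction.

```python
# Sketch of the learn-then-invert idea behind PPAM-style mapping
# (illustrative, synthetic data; not the authors' implementation).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, (2000, 1))                         # source direction (1-D toy)
y = np.sin(3 * x) + 0.05 * rng.standard_normal(x.shape)   # "spectral" feature

# Fit a mixture on the joint (direction, feature) space.
gmm = GaussianMixture(n_components=8, covariance_type='full')
gmm.fit(np.hstack([x, y]))

def posterior_mean_direction(y_obs):
    # Gaussian conditioning within each component, weighted by the
    # responsibilities p(k | y_obs): E[x | y] = sum_k p(k|y) * m_k(y).
    means, covs, w = gmm.means_, gmm.covariances_, gmm.weights_
    lik = np.array([
        w[k] * np.exp(-0.5 * (y_obs - means[k, 1]) ** 2 / covs[k, 1, 1])
        / np.sqrt(covs[k, 1, 1])
        for k in range(len(w))
    ])
    resp = lik / lik.sum()
    cond = [means[k, 0] + covs[k, 0, 1] / covs[k, 1, 1] * (y_obs - means[k, 1])
            for k in range(len(w))]
    return float(np.dot(resp, cond))

print(posterior_mean_direction(0.5))   # posterior mean of the direction
```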

    INFN What Next: Ultra-relativistic Heavy-Ion Collisions

    This document was prepared by the community that is active in Italy, within INFN (Istituto Nazionale di Fisica Nucleare), in the field of ultra-relativistic heavy-ion collisions. The experimental study of the phase diagram of strongly-interacting matter and of the Quark-Gluon Plasma (QGP) deconfined state will proceed, in the next 10-15 years, along two directions: the high-energy regime at RHIC and at the LHC, and the low-energy regime at FAIR, NICA, SPS and RHIC. The Italian community is strongly involved in the present and future programme of the ALICE experiment, the upgrade of which will open, in the 2020s, a new phase of high-precision characterisation of the QGP properties at the LHC. As a complement to this main activity, there is a growing interest in a possible future experiment at the SPS, which would target the search for the onset of deconfinement using dimuon measurements. On a longer timescale, the community looks with interest at the ongoing studies and discussions on a possible fixed-target programme using the LHC ion beams and on the Future Circular Collider.
    Comment: 99 pages, 56 figures