Learning Video Object Segmentation with Visual Memory
This paper addresses the task of segmenting moving objects in unconstrained
videos. We introduce a novel two-stream neural network with an explicit memory
module to achieve this. The two streams of the network encode spatial and
temporal features in a video sequence respectively, while the memory module
captures the evolution of objects over time. The module to build a "visual
memory" in video, i.e., a joint representation of all the video frames, is
realized with a convolutional recurrent unit learned from a small number of
training video sequences. Given a video frame as input, our approach assigns
each pixel an object or background label based on the learned spatio-temporal
features as well as the "visual memory" specific to the video, acquired
automatically without any manually-annotated frames. The visual memory is
implemented with convolutional gated recurrent units, which allow spatial
information to be propagated over time. We evaluate our method extensively on two
benchmarks, DAVIS and Freiburg-Berkeley motion segmentation datasets, and show
state-of-the-art results. For example, our approach outperforms the top method
on the DAVIS dataset by nearly 6%. We also provide an extensive ablative
analysis to investigate the influence of each component in the proposed
framework.
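As a rough illustration of the recurrent unit described above, here is a minimal numpy sketch of a single convolutional GRU step. This is not the paper's implementation: the weight names, kernel size (3x3), and channel counts are our own assumptions, and the convolution is implemented as naive "same" cross-correlation for clarity rather than speed.

```python
import numpy as np

def conv2d(x, w):
    # naive "same" cross-correlation of a (C_in, H, W) map
    # with a (C_out, C_in, 3, 3) weight tensor
    c_out, c_in, kh, kw = w.shape
    _, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, H, W))
    for o in range(c_out):
        for i in range(c_in):
            for di in range(kh):
                for dj in range(kw):
                    out[o] += w[o, i, di, dj] * xp[i, di:di + H, dj:dj + W]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convgru_step(h, x, p):
    # standard GRU equations, with convolutions replacing matrix products
    z = sigmoid(conv2d(x, p["wz"]) + conv2d(h, p["uz"]))        # update gate
    r = sigmoid(conv2d(x, p["wr"]) + conv2d(h, p["ur"]))        # reset gate
    h_tilde = np.tanh(conv2d(x, p["wc"]) + conv2d(r * h, p["uc"]))
    return (1.0 - z) * h + z * h_tilde                          # new memory

# tiny demo: 3 input channels, 4 hidden channels, an 8x8 feature map
rng = np.random.default_rng(0)
c_in, c_h = 3, 4
params = {
    "wz": 0.1 * rng.standard_normal((c_h, c_in, 3, 3)),
    "uz": 0.1 * rng.standard_normal((c_h, c_h, 3, 3)),
    "wr": 0.1 * rng.standard_normal((c_h, c_in, 3, 3)),
    "ur": 0.1 * rng.standard_normal((c_h, c_h, 3, 3)),
    "wc": 0.1 * rng.standard_normal((c_h, c_in, 3, 3)),
    "uc": 0.1 * rng.standard_normal((c_h, c_h, 3, 3)),
}
h = np.zeros((c_h, 8, 8))
h = convgru_step(h, rng.standard_normal((c_in, 8, 8)), params)
```

Because the gating is applied per pixel while the convolutions mix a 3x3 neighbourhood, repeated steps let information spread spatially across the feature map as the video is processed, which is the property the abstract highlights.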
ImageJ2: ImageJ for the next generation of scientific image data
ImageJ is an image analysis program extensively used in the biological
sciences and beyond. Due to its ease of use, recordable macro language, and
extensible plug-in architecture, ImageJ enjoys contributions from
non-programmers, amateur programmers, and professional developers alike.
Enabling such a diversity of contributors has resulted in a large community
that spans the biological and physical sciences. However, a rapidly growing
user base, diverging plugin suites, and technical limitations have revealed a
clear need for a concerted software engineering effort to support emerging
imaging paradigms, to ensure the software's ability to handle the requirements
of modern science. Due to these new and emerging challenges in scientific
imaging, ImageJ is at a critical development crossroads.
We present ImageJ2, a total redesign of ImageJ offering a host of new
functionality. It separates concerns, fully decoupling the data model from the
user interface. It emphasizes integration with external applications to
maximize interoperability. Its robust new plugin framework allows everything
from image formats, to scripting languages, to visualization to be extended by
the community. The redesigned data model supports arbitrarily large,
N-dimensional datasets, which are increasingly common in modern image
acquisition. Despite the scope of these changes, backwards compatibility is
maintained such that this new functionality can be seamlessly integrated with
the classic ImageJ interface, allowing users and developers to migrate to these
new methods at their own pace. ImageJ2 provides a framework engineered for
flexibility, intended to support these requirements as well as accommodate
future needs.
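The extensibility idea described above can be loosely illustrated with a registry pattern. The names below are hypothetical and this is not the actual ImageJ2/SciJava API; it is only a minimal sketch of how decoupling the core from concrete readers lets the community add, say, new image formats without touching the core.

```python
# Hypothetical sketch of an extensible reader registry (not ImageJ2 code).

class FormatReader:
    """Base contract: a reader declares its extensions and a read() method."""
    extensions = ()

    def read(self, path):
        raise NotImplementedError

_READERS = []

def register(reader_cls):
    # decorator: plugins self-register, so the core never lists them explicitly
    _READERS.append(reader_cls())
    return reader_cls

@register
class TextReader(FormatReader):
    extensions = (".txt",)

    def read(self, path):
        # stand-in for real decoding logic
        return {"pixels": None, "source": path}

def open_image(path):
    # the core only dispatches; any registered reader can handle the file
    for reader in _READERS:
        if path.endswith(reader.extensions):
            return reader.read(path)
    raise ValueError("no reader for " + path)
```

A new format is then supported by defining and registering one more `FormatReader` subclass, mirroring how a plugin framework lets formats, scripting languages, and visualizations be contributed independently of the core.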
Extending Cognitive Architectures with Spatial and Visual Imagery Mechanisms
This research presents a computational synthesis of cognition with spatial and visual imagery processing by extending a symbolic cognitive architecture (Soar) with mechanisms to support reasoning with quantitative spatial and visual depictive representations. Inspired by psychological and neurological evidence of mental imagery, our primary goals are to achieve new functional capability and computational efficiency in a task-independent manner. We describe how our theory and the corresponding architecture derive from behavioral, biological, functional, and computational constraints and demonstrate results from three different domains. Our evaluation reveals that in tasks where reasoning includes many spatial or visual properties, the combination of amodal and perceptual representations provides an agent with additional functional capability and improves its problem-solving quality. We also show that specialized processing units specific to a perceptual representation but independent of task knowledge are likely to be necessary in order to realize computational efficiency in a general manner.
The research is significant because past research in cognitive architectures has primarily viewed amodal, symbolic representations as sufficient for knowledge representation and thought. We expand those ideas with the notion that perceptual-based representations participate directly in thinking rather than serving simply as a source of sensory information. The new capabilities of the resulting architecture, which includes Soar and its Spatial-Visual Imagery (SVI) component, emerge from its ability to amalgamate symbolic and perceptual representations and use them to inform reasoning. Soar’s symbolic memories and processes provide the building blocks necessary for high-level control in the pursuit of goals, learning, and the encoding of amodal, symbolic knowledge for abstract reasoning. SVI encompasses the quantitative spatial and visual depictive representations and processes specialized for efficient construction and extraction of spatial and visual properties.
PhD thesis, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/60876/1/slathrop_1.pd
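The amodal-versus-depictive distinction discussed above can be illustrated with a toy example (this is not Soar/SVI code, and the scenario is invented): a depictive occupancy grid answers a spatial query such as line-of-sight by directly inspecting the representation, where a purely symbolic encoding would need to enumerate predicates over every intervening location.

```python
import numpy as np

def line_of_sight(grid, a, b, steps=100):
    # depictive query: sample points on the segment a -> b and report
    # blocked if any sampled cell (other than the endpoints) is occupied
    (r0, c0), (r1, c1) = a, b
    for t in np.linspace(0.0, 1.0, steps):
        r = int(round(r0 + t * (r1 - r0)))
        c = int(round(c0 + t * (c1 - c0)))
        if (r, c) not in (a, b) and grid[r, c]:
            return False
    return True

# a 10x10 "visual buffer" with one obstacle in the middle
grid = np.zeros((10, 10), dtype=bool)
grid[5, 5] = True
```

Here the geometry of the scene is carried by the grid itself, so the query cost is independent of how many symbolic facts would be needed to describe the same scene, which is the kind of task-independent efficiency argument the abstract makes.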
Disorder and interference: localization phenomena
The specific problem we address in these lectures is the problem of transport
and localization in disordered systems when interference is present, as is
characteristic of waves, with a focus on realizations with ultracold atoms.
Comment: Notes of a lecture delivered at the Les Houches School of Physics on "Ultracold gases and quantum information" 2009 in Singapore. v3: corrected mistakes, improved script for numerics. Chapter 9 in "Les Houches 2009 - Session XCI: Ultracold Gases and Quantum Information", edited by C. Miniatura et al. (Oxford University Press, 2011).
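A standard numerical illustration of the phenomenon these notes treat (our own sketch, not the notes' script) is the 1D Anderson tight-binding model: random on-site energies plus nearest-neighbour hopping. Diagonalizing the Hamiltonian and measuring the inverse participation ratio (IPR) of a mid-band eigenstate shows localization setting in as disorder grows.

```python
import numpy as np

def ipr_of_midband_state(n, w, seed=0):
    # 1D Anderson model: H_ii uniform in [-w/2, w/2], hopping amplitude -1
    rng = np.random.default_rng(seed)
    h = np.diag(rng.uniform(-w / 2, w / 2, n))
    h += np.diag(-np.ones(n - 1), 1) + np.diag(-np.ones(n - 1), -1)
    vals, vecs = np.linalg.eigh(h)
    psi = vecs[:, np.argmin(np.abs(vals))]   # eigenstate closest to band centre
    # inverse participation ratio: ~1/n for an extended state,
    # approaching 1 for a state localized on a single site
    return np.sum(psi ** 4)

ipr_weak = ipr_of_midband_state(200, 0.1)    # weak disorder: extended
ipr_strong = ipr_of_midband_state(200, 5.0)  # strong disorder: localized
```

For weak disorder the IPR is of order 1/n, while for strong disorder it approaches a constant set by the localization length, which is the basic signature of Anderson localization by interference.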
Activity Analysis: Finding Explanations for Sets of Events
Automatic activity recognition is the computational process of analysing visual input and reasoning about detections to understand the performed events. In all but the simplest scenarios, an activity involves multiple interleaved events, some related and others independent. The activity in a car park or at a playground would typically include many events. This research assumes the possible events and any constraints between the events can be defined for the given scene. Analysing the activity should thus recognise a complete and consistent set of events; this is referred to as a global explanation of the activity. By seeking a global explanation that satisfies the activity’s constraints, infeasible interpretations can be avoided, and ambiguous observations may be resolved.
An activity’s events and any natural constraints are defined using a grammar formalism. Attribute Multiset Grammars (AMG) are chosen because they allow defining hierarchies, as well as attribute rules and constraints. When used for recognition, detectors are employed to gather a set of detections. Parsing the set of detections by the AMG provides a global explanation. To find the best parse tree given a set of detections, a Bayesian network models the probability distribution over the space of possible parse trees. Heuristic and exhaustive search techniques are proposed to find the maximum a posteriori global explanation.
The framework is tested on two activities: the activity in a bicycle rack, and around a building entrance. The first case study involves people locking bicycles onto a bicycle rack and picking them up later. The best global explanation for all detections gathered during the day resolves local ambiguities arising from occlusion or clutter. Intensive testing on five full days showed that global analysis achieves higher recognition rates. The second case study tracks people and any objects they carry as they enter and exit a building entrance. A complete sequence of a person entering and exiting multiple times is recovered by the global explanation.
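The idea of a constrained global explanation can be sketched in miniature. The following is not the thesis code: event names, scores, and the constraint are invented, and the exhaustive subset search here stands in for the heuristic and exhaustive parse-tree searches the abstract describes. It picks the highest-scoring set of events that satisfies the constraints, mirroring the maximum a posteriori global explanation.

```python
from itertools import chain, combinations

def best_explanation(events, scores, consistent):
    # events: candidate event ids
    # scores: id -> log-score for including that event
    # consistent: predicate over a set of events (the grammar's constraints)
    all_subsets = chain.from_iterable(
        combinations(events, k) for k in range(len(events) + 1))
    best, best_score = set(), float("-inf")
    for subset in all_subsets:
        chosen = set(subset)
        if not consistent(chosen):
            continue  # infeasible interpretations are never considered
        s = sum(scores[e] for e in chosen)
        if s > best_score:
            best, best_score = chosen, s
    return best

# toy bicycle-rack scenario: a pick-up event only makes sense after a drop-off
scores = {"drop_A": 1.0, "pick_A": 2.0, "drop_B": -0.5}

def consistent(chosen):
    return "pick_A" not in chosen or "drop_A" in chosen
```

Note how the constraint resolves ambiguity globally: "pick_A" alone scores well but is rejected, so the best explanation must also commit to "drop_A", while the weakly supported "drop_B" is dropped.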
Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds
In this paper we address the problems of modeling the acoustic space
generated by a full-spectrum sound source and of using the learned model for
the localization and separation of multiple sources that simultaneously emit
sparse-spectrum sounds. We lay theoretical and methodological grounds in order
to introduce the binaural manifold paradigm. We perform an in-depth study of
the latent low-dimensional structure of the high-dimensional interaural
spectral data, based on a corpus recorded with a human-like audiomotor robot
head. A non-linear dimensionality reduction technique is used to show that
these data lie on a two-dimensional (2D) smooth manifold parameterized by the
motor states of the listener, or equivalently, the sound source directions. We
propose a probabilistic piecewise affine mapping model (PPAM) specifically
designed to deal with high-dimensional data exhibiting an intrinsic piecewise
linear structure. We derive a closed-form expectation-maximization (EM)
procedure for estimating the model parameters, followed by Bayes inversion for
obtaining the full posterior density function of a sound source direction. We
extend this solution to deal with missing data and redundancy in real world
spectrograms, and hence for 2D localization of natural sound sources such as
speech. We further generalize the model to the challenging case of multiple
sound sources and we propose a variational EM framework. The associated
algorithm, referred to as variational EM for source separation and localization
(VESSL) yields a Bayesian estimation of the 2D locations and time-frequency
masks of all the sources. Comparisons of the proposed approach with several
existing methods reveal that the combination of acoustic-space learning with
Bayesian inference enables our method to outperform state-of-the-art methods.
Comment: 19 pages, 9 figures, 3 tables.
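The piecewise affine mapping idea can be illustrated with a heavily simplified EM procedure (our own sketch, not the paper's PPAM: scalar input and output, equal component weights, shared noise variance, and a crude block initialization instead of the paper's closed-form derivation).

```python
import numpy as np

def em_affine_mixture(x, y, k=2, iters=30):
    # fit y ~ a_j * x + b_j, j = 1..k, by alternating E and M steps
    n = len(x)
    # crude init: assign samples to components in contiguous blocks
    r = np.zeros((k, n))
    for j in range(k):
        r[j, j * n // k:(j + 1) * n // k] = 1.0
    a, b, var = np.zeros(k), np.zeros(k), 1.0
    for _ in range(iters):
        # M-step: weighted least squares per affine component
        for j in range(k):
            w = r[j] + 1e-12
            sw, sx, sy = w.sum(), (w * x).sum(), (w * y).sum()
            sxx, sxy = (w * x * x).sum(), (w * x * y).sum()
            a[j] = (sw * sxy - sx * sy) / (sw * sxx - sx ** 2)
            b[j] = (sy - a[j] * sx) / sw
        res = y[None, :] - (a[:, None] * x[None, :] + b[:, None])
        var = max((r * res ** 2).sum() / n, 1e-9)
        # E-step: responsibilities under equal-weight Gaussian components
        logp = -0.5 * res ** 2 / var
        logp -= logp.max(axis=0)
        r = np.exp(logp)
        r /= r.sum(axis=0)
    return a, b

# two noiseless affine pieces: y = 2x on [0, 1] and y = 5 - x on [2, 3]
x = np.concatenate([np.linspace(0, 1, 20), np.linspace(2, 3, 20)])
y = np.concatenate([2 * np.linspace(0, 1, 20), 5 - np.linspace(2, 3, 20)])
a, b = em_affine_mixture(x, y, k=2)
```

Each component recovers one affine piece, which is the essence of modeling high-dimensional data with an intrinsic piecewise linear structure; the paper's full model additionally handles vector-valued spectra, missing data, and Bayes inversion to recover source directions.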
INFN What Next: Ultra-relativistic Heavy-Ion Collisions
This document was prepared by the community that is active in Italy, within
INFN (Istituto Nazionale di Fisica Nucleare), in the field of
ultra-relativistic heavy-ion collisions. The experimental study of the phase
diagram of strongly-interacting matter and of the Quark-Gluon Plasma (QGP)
deconfined state will proceed, in the next 10-15 years, along two directions:
the high-energy regime at RHIC and at the LHC, and the low-energy regime at
FAIR, NICA, SPS and RHIC. The Italian community is strongly involved in the
present and future programme of the ALICE experiment, the upgrade of which will
open, in the 2020s, a new phase of high-precision characterisation of the QGP
properties at the LHC. As a complement of this main activity, there is a
growing interest in a possible future experiment at the SPS, which would target
the search for the onset of deconfinement using dimuon measurements. On a
longer timescale, the community looks with interest at the ongoing studies and
discussions on a possible fixed-target programme using the LHC ion beams and on
the Future Circular Collider.
Comment: 99 pages, 56 figures.