Activity Recognition based on a Magnitude-Orientation Stream Network
The temporal component of videos provides an important clue for activity
recognition, as a number of activities can be reliably recognized based on the
motion information. In view of that, this work proposes a novel temporal stream
for two-stream convolutional networks based on images computed from the optical
flow magnitude and orientation, named Magnitude-Orientation Stream (MOS), to
learn the motion in a better and richer manner. Our method applies simple
nonlinear transformations on the vertical and horizontal components of the
optical flow to generate input images for the temporal stream. Experimental
results, carried out on two well-known datasets (HMDB51 and UCF101), demonstrate
that using our proposed temporal stream as input to existing neural network
architectures can improve their performance for activity recognition. Results
demonstrate that our temporal stream provides complementary information able to
improve the classical two-stream methods, indicating the suitability of our
approach to be used as a temporal video representation. Comment: 8 pages, SIBGRAPI 201
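As a rough illustration of the idea in the abstract above, the sketch below derives magnitude and orientation images from the horizontal and vertical optical flow components. The abstract does not state the exact transformations or scaling, so the use of `hypot`/`atan2` and the helper names `magnitude_orientation` and `to_uint8` are assumptions made for this example, not the authors' implementation.

```python
import math

def magnitude_orientation(u, v):
    """Return (magnitude, orientation) images for flow components u and v.

    u, v: equal-sized 2-D lists holding horizontal and vertical flow values.
    Orientation is in radians, in the range returned by math.atan2.
    """
    mag = [[math.hypot(ux, vx) for ux, vx in zip(ur, vr)]
           for ur, vr in zip(u, v)]
    ori = [[math.atan2(vx, ux) for ux, vx in zip(ur, vr)]
           for ur, vr in zip(u, v)]
    return mag, ori

def to_uint8(img, lo, hi):
    """Linearly rescale values from [lo, hi] to integers in [0, 255], clipping."""
    scale = 255.0 / (hi - lo)
    return [[int(round(min(max((x - lo) * scale, 0.0), 255.0))) for x in row]
            for row in img]

# Hypothetical 1x2 flow field, for illustration only.
u = [[3.0, 0.0]]
v = [[4.0, 1.0]]
mag, ori = magnitude_orientation(u, v)
ori_img = to_uint8(ori, -math.pi, math.pi)  # image-like input for a temporal stream
```

Rescaling to 8-bit values is one plausible way to turn the two maps into network input images; other normalisations would work equally well for the purpose of this sketch.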
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. Subsequently, the corresponding
tools have been widely adopted by several scientific communities such as
neuroscience, bioinformatics, or computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts. Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics
and Vision
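To make "representing data with linear combinations of a few dictionary elements" concrete, here is a minimal sketch of greedy matching pursuit, one basic sparse coding procedure. It is shown only as an illustration and is not claimed to be the method the monograph focuses on; the function names and the unit-norm assumption on the atoms are choices made for this example.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(x, dictionary, n_nonzero):
    """Greedy sparse code: x is approximated by sum_j coef[j] * dictionary[j].

    dictionary: list of atoms (lists of floats), assumed to have unit L2 norm.
    n_nonzero: maximum number of greedy selection steps.
    Returns (coef, residual).
    """
    residual = list(x)
    coef = [0.0] * len(dictionary)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with what is still unexplained.
        corr = [dot(residual, atom) for atom in dictionary]
        j = max(range(len(dictionary)), key=lambda k: abs(corr[k]))
        if corr[j] == 0.0:
            break
        coef[j] += corr[j]
        residual = [r - corr[j] * a for r, a in zip(residual, dictionary[j])]
    return coef, residual

# Hypothetical dictionary (standard basis) and signal, for illustration only.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coef, residual = matching_pursuit([2.0, 0.0, -3.0], atoms, n_nonzero=2)
```

In dictionary learning, the atoms themselves would also be updated from data rather than fixed in advance as they are here.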
Digital Color Imaging
This paper surveys current technology and research in the area of digital
color imaging. In order to establish the background and lay down terminology,
fundamental concepts of color perception and measurement are first presented
using vector-space notation and terminology. Present-day color recording and
reproduction systems are reviewed along with the common mathematical models
used for representing these devices. Algorithms for processing color images for
display and communication are surveyed, and a forecast of research trends is
attempted. An extensive bibliography is provided.
Enabling geometry-based 3-D tele-immersion with fast mesh compression and linear rateless coding
3-D tele-immersion (3DTI) enables participants in remote locations to share an activity in real time. It offers users interactive and immersive experiences, but it challenges current media-streaming solutions. Work in the past has mainly focused on the efficient delivery of image-based 3-D videos and on realistic rendering and reconstruction of geometry-based 3-D objects. The contribution of this paper is a real-time streaming component for 3DTI with dynamic reconstructed geometry. This component includes both a novel fast compression method and a rateless packet protection scheme specifically designed for the requirements imposed by real-time transmission of live-reconstructed mesh geometry. Tests on a large dataset show an encoding speed-up of up to ten times at comparable compression ratio and quality when compared with the high-end MPEG-4 SC3DMC mesh encoders. The implemented rateless code ensures complete packet loss protection of the triangle mesh object and a delivery delay within interactive bounds. Contrary to most linear fountain codes, the designed codec enables real-time progressive decoding, allowing partial decoding each time a packet is received. This approach is compared with transmission over TCP at packet loss rates and latencies typical of managed WAN and MAN networks, and heavily outperforms it in terms of end-to-end delay. The streaming component has been integrated into a larger 3DTI environment that includes state-of-the-art 3-D reconstruction and rendering modules. This resulted in a prototype that can capture, compress, transmit, and render triangle mesh geometry in real time under realistic internet conditions, as shown in experiments. Compared with alternative methods, lower interactive end-to-end delay and frame rates over three times higher are achieved.
On systematic approaches for interpreted information transfer of inspection data from bridge models to structural analysis
In conjunction with improved methods of monitoring damage and degradation processes, interest in the reliability assessment of reinforced concrete bridges has been increasing in recent years. Automated image-based inspections of the structural surface provide valuable data from which to extract quantitative information about deterioration, such as crack patterns. However, the knowledge gain results from processing this information in a structural context, i.e. relating the damage artifacts to building components. In this way, the transfer to structural analysis is enabled. This approach sets two further requirements: availability of structural bridge information and standardized storage for interoperability with subsequent analysis tools. Since the large datasets involved can only be processed efficiently in an automated manner, this work targets the implementation of the complete workflow from damage and building data to structural analysis. First, domain concepts are derived from the back-end tasks: structural analysis, damage modeling, and life-cycle assessment. The common interoperability format, the Industry Foundation Classes (IFC), and processes in these domains are further assessed. The need for user-controlled interpretation steps is identified, and the developed prototype thus allows interaction at subsequent model stages. The latter has the advantage that interpretation steps can be individually separated into either a structural analysis model or a damage information model, or a combination of both. This approach to damage information processing from the perspective of structural analysis is then validated in different case studies.
Sparse Coding on Symmetric Positive Definite Manifolds using Bregman Divergences
This paper introduces sparse coding and dictionary learning for Symmetric
Positive Definite (SPD) matrices, which are often used in machine learning,
computer vision and related areas. Unlike traditional sparse coding schemes
that work in vector spaces, in this paper we discuss how SPD matrices can be
described by sparse combination of dictionary atoms, where the atoms are also
SPD matrices. We propose to seek sparse coding by embedding the space of SPD
matrices into Hilbert spaces through two types of Bregman matrix divergences.
This not only leads to an efficient way of performing sparse coding, but also
an online and iterative scheme for dictionary learning. We apply the proposed
methods to several computer vision tasks where images are represented by region
covariance matrices. Our proposed algorithms outperform state-of-the-art
methods on a wide range of classification tasks, including face recognition,
action recognition, material classification and texture categorization.
Random Forests and Networks Analysis
D. Wilson~\cite{[Wi]} in the 1990s described a simple and efficient
algorithm based on loop-erased random walks to sample uniform spanning trees
and more generally weighted trees or forests spanning a given graph. This
algorithm provides a powerful tool in analyzing structures on networks and
along this line of thinking, in recent works~\cite{AG1,AG2,ACGM1,ACGM2} we
focused on applications of spanning rooted forests on finite graphs. The
resulting main conclusions are reviewed in this paper by collecting related
theorems, algorithms, heuristics and numerical experiments. A first
foundational part on determinantal structures and efficient sampling procedures
is followed by four main applications: 1) a random-walk-based notion of
well-distributed points in a graph; 2) how to describe metastable dynamics in
finite settings by means of Markov intertwining dualities; 3) coarse graining
schemes for networks and associated processes; 4) wavelet-like pyramidal
algorithms for graph signals. Comment: Survey paper
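The loop-erased random walk procedure mentioned in the abstract above can be sketched for the simplest case: an unweighted, connected, undirected graph with a single fixed root. The weighted and rooted-forest variants that the survey discusses are not covered here, and the function name and adjacency-list format are assumptions made for this example.

```python
import random

def wilson_spanning_tree(adj, root, rng=random):
    """Sample a uniform spanning tree using loop-erased random walks.

    adj: dict mapping each vertex to a list of its neighbours
         (graph assumed connected and undirected).
    Returns parent, where parent[v] is the next vertex on the tree path
    from each non-root vertex v toward root.
    """
    in_tree = {root}
    parent = {}
    for start in adj:
        # Walk from start until the current tree is hit. Overwriting nxt[v]
        # on every revisit keeps only the last exit from v, which is
        # exactly what erases the loops.
        nxt = {}
        v = start
        while v not in in_tree:
            nxt[v] = rng.choice(adj[v])
            v = nxt[v]
        # Retrace the loop-erased path and attach it to the tree.
        v = start
        while v not in in_tree:
            in_tree.add(v)
            parent[v] = nxt[v]
            v = nxt[v]
    return parent

# Hypothetical 4-cycle graph, for illustration only.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
tree = wilson_spanning_tree(adj, root=0, rng=random.Random(1))
```

Passing a seeded `random.Random` instance makes a run reproducible; the returned dictionary always holds one parent entry per non-root vertex.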