11,601 research outputs found
STV-based Video Feature Processing for Action Recognition
In comparison to still image-based processes, video features can provide rich and intuitive information about dynamic events occurred over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progresses have been made in the last decade on image processing and seen its successful applications in face matching and object recognition, video-based event detection still remains one of the most difficult challenges in computer vision research due to its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and the often ambiguous analytical methods. In this paper, a Spatio-Temporal Volume (STV) and region intersection (RI) based 3D shape-matching method has been proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stemmed from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. This paper also reported the investigation into techniques for efficient STV data filtering to reduce the amount of voxels (volumetric-pixels) that need to be processed in each operational cycle in the implemented system. The encouraging features and improvements on the operational performance registered in the experiments have been discussed at the end
ICNet for Real-Time Semantic Segmentation on High-Resolution Images
We focus on the challenging task of real-time semantic segmentation in this
paper. It finds many practical applications and yet is with fundamental
difficulty of reducing a large portion of computation for pixel-wise label
inference. We propose an image cascade network (ICNet) that incorporates
multi-resolution branches under proper label guidance to address this
challenge. We provide in-depth analysis of our framework and introduce the
cascade feature fusion unit to quickly achieve high-quality segmentation. Our
system yields real-time inference on a single GPU card with decent quality
results evaluated on challenging datasets like Cityscapes, CamVid and
COCO-Stuff.Comment: ECCV 201
Learning an Interactive Segmentation System
Many successful applications of computer vision to image or video
manipulation are interactive by nature. However, parameters of such systems are
often trained neglecting the user. Traditionally, interactive systems have been
treated in the same manner as their fully automatic counterparts. Their
performance is evaluated by computing the accuracy of their solutions under
some fixed set of user interactions. This paper proposes a new evaluation and
learning method which brings the user in the loop. It is based on the use of an
active robot user - a simulated model of a human user. We show how this
approach can be used to evaluate and learn parameters of state-of-the-art
interactive segmentation systems. We also show how simulated user models can be
integrated into the popular max-margin method for parameter learning and
propose an algorithm to solve the resulting optimisation problem.Comment: 11 pages, 7 figures, 4 table
Unwind: Interactive Fish Straightening
The ScanAllFish project is a large-scale effort to scan all the world's
33,100 known species of fishes. It has already generated thousands of
volumetric CT scans of fish species which are available on open access
platforms such as the Open Science Framework. To achieve a scanning rate
required for a project of this magnitude, many specimens are grouped together
into a single tube and scanned all at once. The resulting data contain many
fish which are often bent and twisted to fit into the scanner. Our system,
Unwind, is a novel interactive visualization and processing tool which
extracts, unbends, and untwists volumetric images of fish with minimal user
interaction. Our approach enables scientists to interactively unwarp these
volumes to remove the undesired torque and bending using a piecewise-linear
skeleton extracted by averaging isosurfaces of a harmonic function connecting
the head and tail of each fish. The result is a volumetric dataset of a
individual, straight fish in a canonical pose defined by the marine biologist
expert user. We have developed Unwind in collaboration with a team of marine
biologists: Our system has been deployed in their labs, and is presently being
used for dataset construction, biomechanical analysis, and the generation of
figures for scientific publication
- …