279 research outputs found
Robust 3D People Tracking and Positioning System in a Semi-Overlapped Multi-Camera Environment
People positioning and tracking in 3D indoor environments are challenging tasks due to background clutter and occlusions. Current works are focused on solving people occlusions in low-cluttered backgrounds, but fail in high-cluttered scenarios, specially when foreground objects occlude people. In this paper, a novel 3D people positioning and tracking system is presented, which shows itself robust to both possible occlusion sources: static scene objects and other people. The system holds on a set of multiple cameras with partially overlapped fields of view. Moving regions are segmented independently in each camera stream by means of a new background modeling strategy based on Gabor filters. People detection is carried out on these segmentations through a template-based correlation strategy. Detected people are tracked independently in each camera view by means of a graph-based matching strategy, which estimates the best correspondences between consecutive people segmentations. Finally, 3D tracking and positioning of people is achieved by geometrical consistency analysis over the tracked 2D candidates, using head position (instead of object centroids) to increase robustness to foreground occlusions
Part decomposition of 3D surfaces
This dissertation describes a general algorithm that automatically decomposes realworld scenes and objects into visual parts. The input to the algorithm is a 3 D triangle mesh that approximates the surfaces of a scene or object. This geometric mesh completely specifies the shape of interest. The output of the algorithm is a set of boundary contours that dissect the mesh into parts where these parts agree with human perception. In this algorithm, shape alone defines the location of a bom1dary contour for a part. The algorithm leverages a human vision theory known as the minima rule that states that human visual perception tends to decompose shapes into parts along lines of negative curvature minima. Specifically, the minima rule governs the location of part boundaries, and as a result the algorithm is known as the Minima Rule Algorithm. Previous computer vision methods have attempted to implement this rule but have used pseudo measures of surface curvature. Thus, these prior methods are not true implementations of the rule. The Minima Rule Algorithm is a three step process that consists of curvature estimation, mesh segmentation, and quality evaluation. These steps have led to three novel algorithms known as Normal Vector Voting, Fast Marching Watersheds, and Part Saliency Metric, respectively. For each algorithm, this dissertation presents both the supporting theory and experimental results. The results demonstrate the effectiveness of the algorithm using both synthetic and real data and include comparisons with previous methods from the research literature. Finally, the dissertation concludes with a summary of the contributions to the state of the art
Two Decades of Colorization and Decolorization for Images and Videos
Colorization is a computer-aided process, which aims to give color to a gray
image or video. It can be used to enhance black-and-white images, including
black-and-white photos, old-fashioned films, and scientific imaging results. On
the contrary, decolorization is to convert a color image or video into a
grayscale one. A grayscale image or video refers to an image or video with only
brightness information without color information. It is the basis of some
downstream image processing applications such as pattern recognition, image
segmentation, and image enhancement. Different from image decolorization, video
decolorization should not only consider the image contrast preservation in each
video frame, but also respect the temporal and spatial consistency between
video frames. Researchers were devoted to develop decolorization methods by
balancing spatial-temporal consistency and algorithm efficiency. With the
prevalance of the digital cameras and mobile phones, image and video
colorization and decolorization have been paid more and more attention by
researchers. This paper gives an overview of the progress of image and video
colorization and decolorization methods in the last two decades.Comment: 12 pages, 19 figure
SVCNet: Scribble-based Video Colorization Network with Temporal Aggregation
In this paper, we propose a scribble-based video colorization network with
temporal aggregation called SVCNet. It can colorize monochrome videos based on
different user-given color scribbles. It addresses three common issues in the
scribble-based video colorization area: colorization vividness, temporal
consistency, and color bleeding. To improve the colorization quality and
strengthen the temporal consistency, we adopt two sequential sub-networks in
SVCNet for precise colorization and temporal smoothing, respectively. The first
stage includes a pyramid feature encoder to incorporate color scribbles with a
grayscale frame, and a semantic feature encoder to extract semantics. The
second stage finetunes the output from the first stage by aggregating the
information of neighboring colorized frames (as short-range connections) and
the first colorized frame (as a long-range connection). To alleviate the color
bleeding artifacts, we learn video colorization and segmentation
simultaneously. Furthermore, we set the majority of operations on a fixed small
image resolution and use a Super-resolution Module at the tail of SVCNet to
recover original sizes. It allows the SVCNet to fit different image resolutions
at the inference. Finally, we evaluate the proposed SVCNet on DAVIS and Videvo
benchmarks. The experimental results demonstrate that SVCNet produces both
higher-quality and more temporally consistent videos than other well-known
video colorization approaches. The codes and models can be found at
https://github.com/zhaoyuzhi/SVCNet.Comment: accepted by IEEE Transactions on Image Processing (TIP
Visual routines and attention
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998.Includes bibliographical references (leaves 90-93).by Satyajit Rao.Ph.D
Learning Regional Attraction for Line Segment Detection
This paper presents regional attraction of line segment maps, and hereby
poses the problem of line segment detection (LSD) as a problem of region
coloring. Given a line segment map, the proposed regional attraction first
establishes the relationship between line segments and regions in the image
lattice. Based on this, the line segment map is equivalently transformed to an
attraction field map (AFM), which can be remapped to a set of line segments
without loss of information. Accordingly, we develop an end-to-end framework to
learn attraction field maps for raw input images, followed by a squeeze module
to detect line segments. Apart from existing works, the proposed detector
properly handles the local ambiguity and does not rely on the accurate
identification of edge pixels. Comprehensive experiments on the Wireframe
dataset and the YorkUrban dataset demonstrate the superiority of our method. In
particular, we achieve an F-measure of 0.831 on the Wireframe dataset,
advancing the state-of-the-art performance by 10.3 percent.Comment: Accepted to IEEE TPAMI. arXiv admin note: text overlap with
arXiv:1812.0212
- âŠ