55 research outputs found
Collective classification for labeling of places and objects in 2D and 3D range data
In this paper, we present an algorithm to identify types of places and objects from 2D and 3D laser range data obtained in indoor environments. Our approach combines a collective classification method based on associative Markov networks with an instance-based feature extraction using nearest neighbor. Additionally, we show how to select the best features needed to represent the objects and places, reducing the time needed for the learning and inference steps while maintaining high classification rates. Experimental results on real data demonstrate the effectiveness of our approach in indoor environments.
Structured learning of sum-of-submodular higher order energy functions
Submodular functions can be exactly minimized in polynomial time, and the
special case that graph cuts solve with max flow \cite{KZ:PAMI04} has had
significant impact in computer vision
\cite{BVZ:PAMI01,Kwatra:SIGGRAPH03,Rother:GrabCut04}. In this paper we address
the important class of sum-of-submodular (SoS) functions
\cite{Arora:ECCV12,Kolmogorov:DAM12}, which can be efficiently minimized via a
variant of max flow called submodular flow \cite{Edmonds:ADM77}. SoS functions
can naturally express higher order priors involving, e.g., local image patches;
however, it is difficult to fully exploit their expressive power because they
have so many parameters. Rather than trying to formulate existing higher order
priors as an SoS function, we take a discriminative learning approach,
effectively searching the space of SoS functions for a higher order prior that
performs well on our training set. We adopt a structural SVM approach
\cite{Joachims/etal/09a,Tsochantaridis/etal/04} and formulate the training
problem in terms of quadratic programming; as a result we can efficiently
search the space of SoS priors via an extended cutting-plane algorithm. We also
show how the state-of-the-art max flow method for vision problems
\cite{Goldberg:ESA11} can be modified to efficiently solve the submodular flow
problem. Experimental comparisons are made against the OpenCV implementation of
the GrabCut interactive segmentation technique \cite{Rother:GrabCut04}, which
uses hand-tuned parameters instead of machine learning. On a standard dataset
\cite{Gulshan:CVPR10} our method learns higher order priors with hundreds of
parameter values, and produces significantly better segmentations. While our
focus is on binary labeling problems, we show that our techniques can be
naturally generalized to handle more than two labels.
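For reference, the two properties this abstract relies on can be stated compactly. The notation below ($V$, $f$, $\mathcal{C}$, $f_C$, $\theta$) is chosen here for illustration and is not taken from the paper. A set function $f : 2^V \to \mathbb{R}$ is submodular when

\[
f(S) + f(T) \;\ge\; f(S \cup T) + f(S \cap T) \qquad \text{for all } S, T \subseteq V .
\]

A sum-of-submodular function decomposes over a collection $\mathcal{C}$ of subsets (cliques) of $V$, with each term submodular on its own clique:

\[
f(S) \;=\; \sum_{C \in \mathcal{C}} f_C(S \cap C), \qquad \text{each } f_C \text{ submodular on } 2^{C} .
\]

When every clique has two elements and the labels are binary, the condition on a pairwise term $\theta$ reduces to $\theta(0,0) + \theta(1,1) \le \theta(0,1) + \theta(1,0)$, which is the case that graph cuts solve with max flow.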
Multi-scale conditional random fields for over-segmented irregular 3D point clouds classification
©2008 IEEE. In this paper, we propose using multi-scale Conditional Random Fields to classify 3D outdoor terrestrial laser-scanned data. We improve on Lim and Suter's methods by introducing regional edge potentials in addition to the local edge and node potentials in the multi-scale Conditional Random Fields; the improved recognition rate requires only a relatively small increase in computation time. In the model, the raw data points are over-segmented into an improved mid-level representation, "super-voxels". Local and regional features are then extracted from the super-voxels, and the parameters are learnt by the multi-scale Conditional Random Fields. With our proposed model, classification accuracy improves by 5% to 10% compared to labeling with Conditional Random Fields in (Lim and Suter, 2007). Because super-voxels are labeled instead of individual points, the overall computation time is lower than that of previous 3D data labeling approaches. Ee Hui Lim, David Sute
Sensor fusion for semantic segmentation of urban scenes
Semantic understanding of environments is an important problem in robotics in general and intelligent autonomous systems in particular. In this paper, we propose a semantic segmentation algorithm which effectively fuses information from images and 3D point clouds. The proposed method incorporates information from multiple scales in an intuitive and effective manner. A late-fusion architecture is proposed to maximally leverage the training data in each modality. Finally, a pairwise Conditional Random Field (CRF) is used as a post-processing step to enforce spatial consistency in the structured prediction. The proposed algorithm is evaluated on the publicly available KITTI dataset [1] [2], augmented with additional pixel and point-wise semantic labels for building, sky, road, vegetation, sidewalk, car, pedestrian, cyclist, sign/pole, and fence regions. A per-pixel accuracy of 89.3% and an average class accuracy of 65.4% are achieved, well above the current state of the art [3].
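The gap between the two reported figures follows from how each metric weights the classes. As a sketch, assuming the standard definitions (the abstract does not spell them out) and using notation introduced here only for illustration: let $n_{ij}$ be the number of pixels of true class $i$ predicted as class $j$, over $K$ classes. Then

\[
\text{per-pixel accuracy} = \frac{\sum_{i=1}^{K} n_{ii}}{\sum_{i=1}^{K}\sum_{j=1}^{K} n_{ij}},
\qquad
\text{average class accuracy} = \frac{1}{K}\sum_{i=1}^{K} \frac{n_{ii}}{\sum_{j=1}^{K} n_{ij}} .
\]

Per-pixel accuracy is dominated by the most frequent classes, while average class accuracy gives every class equal weight regardless of how many pixels it covers, so on imbalanced data it is typically the lower of the two.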
- …