6 research outputs found

    Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation

    In this paper, we propose an alternative method for estimating the room layouts of cluttered indoor scenes. The method benefits from two novel techniques. The first is semantic transfer (ST), which is: (1) a formulation that integrates the relationship between scene clutter and room layout into convolutional neural networks; (2) an architecture that can be trained end-to-end; (3) a practical strategy for initializing the weights of very deep networks under an unbalanced training data distribution. ST allows us to extract highly robust features under various circumstances. To address the computational redundancy hidden in these features, we develop a principled and efficient inference scheme named physics inspired optimization (PIO). PIO's basic idea is to formulate some phenomena observed in ST features as mechanics concepts. Evaluations on the public datasets LSUN and Hedau show that the proposed method is more accurate than state-of-the-art methods. Comment: To appear in CVPR 2017. Project Page: https://sites.google.com/view/st-pio

    Supervised geodesic propagation for semantic label transfer

    In this paper we propose a novel semantic label transfer method using supervised geodesic propagation (SGP). We use supervised learning to guide both the seed selection and the label propagation. Given an input image, we first retrieve a set of similar images from annotated databases. A Joint Boost model is learned on this similar image set, and the recognition proposal map of the input image is then inferred by the learned model. The initial distance map is defined by the proposal map: the higher the probability, the smaller the distance. In each iteration of the geodesic propagation, the seed is selected as the undetermined superpixel with the smallest distance. We learn a classifier that indicates whether to propagate labels between two neighboring superpixels. The training samples for this indicator are annotated neighboring pairs from the similar image set. The geodesic distances of the seed's neighbors are updated according to a combination of texture features, boundary features, and the indicator value. Experiments on three datasets show that our method outperforms traditional learning-based methods and the previous label transfer method for semantic segmentation.
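The propagation loop described in the abstract above can be sketched as a Dijkstra-style expansion over a superpixel graph. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `geodesic_propagation`, the toy four-superpixel chain, and the `edge_cost` callback (a stand-in for the learned combination of texture, boundary, and indicator terms) are all hypothetical.

```python
import heapq


def geodesic_propagation(init_dist, neighbors, edge_cost):
    """Label superpixels by repeatedly fixing the closest undetermined one.

    init_dist: dict node -> dict label -> initial distance
               (assumed to come from a proposal map: high probability, small distance)
    neighbors: dict node -> list of adjacent nodes
    edge_cost: callable (u, v) -> non-negative propagation cost (hypothetical stand-in)
    Returns a dict node -> label.
    """
    heap = [
        (dist, node, label)
        for node, per_label in init_dist.items()
        for label, dist in per_label.items()
    ]
    heapq.heapify(heap)
    labels = {}
    while heap:
        dist, node, label = heapq.heappop(heap)
        if node in labels:
            continue  # already determined by a shorter path
        labels[node] = label  # the seed: smallest distance among undetermined nodes
        for nb in neighbors[node]:
            if nb not in labels:
                # Offer the seed's label to each undetermined neighbor.
                heapq.heappush(heap, (dist + edge_cost(node, nb), nb, label))
    return labels


# Toy chain 0 - 1 - 2 - 3 with an expensive edge (a strong boundary) between 1 and 2.
init_dist = {
    0: {"sky": 0.1, "road": 0.9},
    1: {"sky": 0.5, "road": 0.5},
    2: {"sky": 0.5, "road": 0.5},
    3: {"sky": 0.9, "road": 0.1},
}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
costs = {(0, 1): 0.1, (1, 2): 5.0, (2, 3): 0.1}


def edge_cost(u, v):
    return costs[(min(u, v), max(u, v))]


labels = geodesic_propagation(init_dist, neighbors, edge_cost)
```

In this toy run the confident end superpixels act as seeds, and the high cost on the middle edge keeps each label from crossing the boundary, so the two ambiguous superpixels inherit the label of their nearer seed.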

    Supervised label transfer for semantic segmentation of street scenes

    In this paper, we propose a robust supervised label transfer method for the semantic segmentation of street scenes. Given an input image of a street scene, we first find multiple image sets in an annotated training database, each of which covers all semantic categories in the input image. We then establish dense correspondences between the input image and each retrieved image set using a proposed KNN-MRF matching scheme. A correspondence classification step follows, which reduces the number of semantically incorrect correspondences using classification models trained for the different categories. From the correspondences classified as semantically correct, we infer confidence values for each superpixel belonging to the different semantic categories, and integrate them with a spatial smoothness constraint in a Markov random field to segment the input image. Experiments on three datasets show that our method outperforms traditional learning-based methods and the previous nonparametric label transfer method for the semantic segmentation of street scenes. © 2010 Springer-Verlag

    Segmentation of building facade images acquired from a terrestrial viewpoint

    Facade analysis (detection, understanding and reconstruction) from street-level imagery is currently a very active research topic in photogrammetry and computer vision because of its many industrial applications. This thesis presents advances in the generic segmentation of large volumes of such images, each containing one or more facade areas (whole or truncated). This kind of data is characterized by very rich architectural complexity and by problems related to lighting and to the acquisition viewpoint. The genericity of the processing is an important issue, and the main constraint is to introduce as few priors as possible. We base our approaches on the alignment and repetitiveness of the main facade structures. We propose a hierarchical partitioning of the image contours and a detection of grids of repetitive structures using marked point processes. In the results, the facade is separated from its neighboring facades and from its environment (street, sky). In addition, some elements such as windows, balconies or the wall background are extracted coherently without being recognized. The parameters are set in a single pass and apply to all architectural styles encountered. The problem lies upstream of many topics, such as facade separation, increasing the level of detail of 3D urban models generated from aerial or satellite imagery, compression, and indexing based on geometric primitives (grouping of structures and the spacing between them). PARIS-EST-Université (770839901) / Sudoc, France

    Context-driven Object Detection and Segmentation with Auxiliary Information

    One fundamental problem in computer vision and robotics is to localize objects of interest in an image. The task can be formulated either as an object detection problem, if the objects are described by a set of pose parameters, or as an object segmentation problem, if we recover the object boundary precisely. A key issue in object detection and segmentation is exploiting spatial context, as local evidence is often insufficient to determine object pose in the presence of heavy occlusions or large variations in object appearance. This thesis addresses the object detection and segmentation problem in such adverse conditions with auxiliary depth data provided by RGBD cameras. We focus on four main issues in context-aware object detection and segmentation: 1) what are the effective context representations? 2) how can we work with limited and imperfect depth data? 3) how can we design depth-aware features and integrate depth cues into conventional visual inference tasks? 4) how can we make use of unlabeled data to relax the labeling requirements for training data? We discuss three object detection and segmentation scenarios based on varying amounts of available auxiliary information. In the first case, depth data are available for model training but not for testing. We propose a structured Hough voting method for detecting heavily occluded objects in indoor environments, in which we extend the Hough hypothesis space to include both the object's location and its visibility pattern. We design a new score function that accumulates votes for object detection and occlusion prediction. In addition, we explore the correlation between objects and their environment, building a depth-encoded object-context model based on RGBD data. In the second case, we address the problem of localizing glass objects with noisy and incomplete depth data. Our method integrates the intensity and depth information from a single viewpoint, and builds a Markov Random Field that predicts glass boundary and region jointly. In addition, we propose a nonparametric, data-driven label transfer scheme for local glass boundary estimation. A weighted voting scheme based on a joint feature manifold is adopted to integrate depth and appearance cues, and we learn a distance metric on the depth-encoded feature manifold. In the third case, we make use of unlabeled data to relax the annotation requirements for object detection and segmentation, and propose a novel data-dependent margin distribution learning criterion for boosting, which utilizes the intrinsic geometric structure of datasets. One key aspect of this method is that it can seamlessly incorporate unlabeled data by including a graph Laplacian regularizer. We demonstrate the performance of our models and compare them with baseline methods on several real-world object detection and segmentation tasks, including indoor object detection, glass object segmentation, and foreground segmentation in video.
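To illustrate the graph Laplacian regularizer mentioned at the end of the abstract above, the following sketch computes the standard smoothness penalty f^T L f, with L = D - W, for a vector of prediction scores on a similarity graph. It shows only the generic regularizer, not the thesis's boosting criterion; the function name `laplacian_penalty` and the toy weights are assumptions made for illustration.

```python
def laplacian_penalty(weights, scores):
    """Return f^T L f for a symmetric weight matrix W, where L = D - W.

    Uses the identity f^T L f = sum over pairs i < j of W[i][j] * (f[i] - f[j])**2,
    so labels are not needed: only the graph and the prediction scores.
    """
    n = len(scores)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += weights[i][j] * (scores[i] - scores[j]) ** 2
    return total


# Toy similarity graph over three samples; samples 1 and 2 are strongly connected.
weights = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 2.0],
    [0.0, 2.0, 0.0],
]

# Splitting the strongly connected pair costs more than splitting the weak edge.
penalty_split_strong = laplacian_penalty(weights, [1.0, 1.0, -1.0])
penalty_split_weak = laplacian_penalty(weights, [1.0, -1.0, -1.0])
```

Because the penalty depends only on the graph and the scores, unlabeled samples can contribute to it in the same way as labeled ones, which is the sense in which such a term lets a learner use unlabeled data.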