321 research outputs found

    Clothing Co-Parsing by Joint Image Segmentation and Labeling

    This paper develops an integrated system for clothing co-parsing, which jointly parses a set of clothing images (unsegmented but annotated with tags) into semantic configurations. We propose a data-driven framework consisting of two phases of inference. The first phase, referred to as "image co-segmentation", iteratively extracts consistent regions from the images and jointly refines the regions over all images using the exemplar-SVM (E-SVM) technique [23]. In the second phase (i.e., "region co-labeling"), we construct a multi-image graphical model that takes the segmented regions as vertices and incorporates several contexts of clothing configuration (e.g., item location and mutual interactions). The joint label assignment can be solved using the efficient Graph Cuts algorithm. In addition to evaluating our framework on the Fashionista dataset [30], we construct a dataset called CCP, consisting of 2098 high-resolution street fashion photos, to demonstrate the performance of our system. We achieve 90.29% / 88.23% segmentation accuracy and 65.52% / 63.89% recognition rate on the Fashionista and CCP datasets, respectively, which are superior to those of state-of-the-art methods. Comment: 8 pages, 5 figures, CVPR 201

    Semantic 3D Occupancy Mapping through Efficient High Order CRFs

    Semantic 3D mapping can be used for many applications such as robot navigation and virtual interaction. In recent years, there has been great progress in semantic segmentation and geometric 3D mapping. However, it is still challenging to combine these two tasks for accurate and large-scale semantic mapping from images. In this paper, we propose an incremental and (near) real-time semantic mapping system. A 3D scrolling occupancy grid map is built to represent the world; it is efficient in both memory and computation, and its size stays bounded in large-scale environments. We use the CNN segmentation as a prior prediction and further optimize the 3D grid labels through a novel CRF model. Superpixels are used to enforce smoothness and to form robust P^N high-order potentials. An efficient mean-field inference is developed for the graph optimization. We evaluate our system on the KITTI dataset and improve the segmentation accuracy by 10% over existing systems. Comment: IROS 201
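
    The mean-field step mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's model: it assumes a simple pairwise Potts smoothness term instead of the superpixel-based high-order potential, and the function name, the neighbour lists, and the toy unary values are all hypothetical. The unary probabilities stand in for a CNN prior over grid cells.

```python
import math

def mean_field_potts(unary, neighbors, pairwise_weight=1.0, iterations=10):
    """Approximate CRF marginals with parallel mean-field updates.

    unary[i][l]  : prior probability of label l at cell i (assumed CNN output)
    neighbors[i] : indices of the cells adjacent to cell i
    """
    num_labels = len(unary[0])
    q = [row[:] for row in unary]  # initialise beliefs from the prior
    for _ in range(iterations):
        new_q = []
        for i, row in enumerate(unary):
            scores = []
            for l in range(num_labels):
                # Potts term: reward agreeing with the neighbours' current beliefs.
                agreement = sum(q[j][l] for j in neighbors[i])
                scores.append(math.log(row[l] + 1e-12) + pairwise_weight * agreement)
            # Numerically stable softmax to renormalise the belief.
            m = max(scores)
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            new_q.append([e / z for e in exps])
        q = new_q
    return q

# Toy chain of five cells with two labels; the middle cell has a noisy prior.
unary = [[0.9, 0.1], [0.9, 0.1], [0.4, 0.6], [0.9, 0.1], [0.9, 0.1]]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
beliefs = mean_field_potts(unary, neighbors)
labels = [max(range(2), key=lambda l: b[l]) for b in beliefs]
print(labels)
```

    In this toy example the middle cell's prior favours label 1, but the smoothness term pulls it toward its neighbours' label, which is the kind of correction a CRF is meant to provide over per-cell predictions.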

    Semantic Image Segmentation Using Region Bank

    Semantic image segmentation assigns a predefined class label to each pixel. This paper proposes a unified framework that uses a region bank to solve this task. Images are hierarchically segmented, yielding region banks. Local features and high-level descriptors are extracted from each region in the banks. Discriminative classifiers are learned from the histograms of feature descriptors computed on the training region bank (TRB). Optimally merging the predicted regions of the query region bank (QRB) produces the semantic labeling. This paper details each algorithmic module used in our system; however, any algorithm that fits the corresponding module can be plugged into the proposed framework. Experiments on the challenging Microsoft Research Cambridge (MSRC 21) dataset show that the proposed approach achieves state-of-the-art performance.

    Image segmentation for automated taxiing of unmanned aircraft

    This paper details a method of detecting collision risks for unmanned aircraft during taxiing. Using images captured from an on-board camera, semantic segmentation can be used to identify surface types and detect potential collisions. A review of classifier-led segmentation concludes that texture feature descriptors lack the pixel-level accuracy required for collision avoidance. Instead, segmentation prior to classification is suggested as a better method for accurate region border extraction. This is achieved through an initial over-segmentation using the established SLIC superpixel technique, followed by further untrained clustering using the DBSCAN algorithm. Known classes are used to train a classifier by constructing a texton dictionary and models of the texton content typical of each class. The paper demonstrates the application of this system to real-world images and shows good automated segment identification. Remaining issues are identified, and contextual information is suggested as a way of resolving them in future work.
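
    The untrained clustering stage described in this abstract can be sketched as follows. This is a minimal, from-scratch DBSCAN applied to per-superpixel feature vectors, not the authors' implementation: the feature values, `eps`, and `min_pts` are made-up assumptions, and a real pipeline would first compute superpixels (e.g. with SLIC) and derive the features from them.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def region_query(i):
        # All points within eps of point i (including i itself).
        return [j for j, p in enumerate(points) if math.dist(points[i], p) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = region_query(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            nearby = region_query(j)
            if len(nearby) >= min_pts:
                queue.extend(nearby)  # core point: keep growing the cluster
        cluster_id += 1
    return labels

# Hypothetical mean (intensity, texture) features for seven superpixels.
features = [(0.10, 0.10), (0.12, 0.11), (0.11, 0.13),
            (0.80, 0.82), (0.82, 0.80), (0.81, 0.79),
            (0.50, 0.10)]
segments = dbscan(features, eps=0.05, min_pts=2)
print(segments)
```

    Superpixels with similar features are merged into larger segments without any training data, and an isolated superpixel is left as noise (label -1) rather than being forced into a segment.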