Matterport3D: Learning from RGB-D Data in Indoor Environments
Access to large, diverse RGB-D datasets is critical for training RGB-D scene
understanding algorithms. However, existing datasets still cover only a limited
number of views or a restricted scale of spaces. In this paper, we introduce
Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views
from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided
with surface reconstructions, camera poses, and 2D and 3D semantic
segmentations. The precise global alignment and comprehensive, diverse
panoramic set of views over entire buildings enable a variety of supervised and
self-supervised computer vision tasks, including keypoint matching, view
overlap prediction, normal prediction from color, semantic segmentation, and
region classification.
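Tasks like keypoint matching and normal prediction on RGB-D data rest on relating depth pixels to 3D geometry through the camera model. A minimal sketch (not from the paper; the pinhole intrinsics and toy depth map are illustrative) of back-projecting a depth image into a camera-frame point cloud:

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters) into camera-frame 3D points.

    fx, fy, cx, cy are pinhole intrinsics (illustrative values below);
    zero-depth pixels are treated as invalid and dropped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]

# Toy 2x2 depth map at 1 m, with one invalid (zero-depth) pixel.
depth = np.array([[1.0, 1.0], [1.0, 0.0]])
pts = depth_to_points(depth, fx=500.0, fy=500.0, cx=1.0, cy=1.0)
print(pts.shape)  # (3, 3): three valid pixels, each an (x, y, z) point
```

With globally aligned poses, point clouds like this from different panoramic views can be brought into a common frame, which is what makes self-supervised tasks such as view-overlap prediction possible.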
Automatic Structural Scene Digitalization
In this paper, we present an automatic system for the analysis and labeling
of structural scenes, floor plan drawings in Computer-aided Design (CAD)
format. The proposed system applies a fusion strategy to detect and recognize
various components of CAD floor plans, such as walls, doors, windows and other
ambiguous assets. Technically, a general rule-based filter parsing method is
first adopted to extract effective information from the original floor plan.
Then, an image-processing based recovery method is employed to correct
information extracted in the first step. Our proposed method is fully automatic
and real-time. The analysis system provides high accuracy and has also been
evaluated on a public website that, on average, receives more than ten
thousand effective uses per day and reaches a relatively high satisfaction
rate.
Comment: paper submitted to PLoS One
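A rule-based filter over CAD entities can be sketched as pattern matching on layer names. The rules below are hypothetical stand-ins, not the paper's actual filter set; real drawings vary widely in naming conventions:

```python
import re

# Hypothetical layer-name rules; real CAD floor plans use varied
# conventions, so these patterns are illustrative only.
RULES = [
    (re.compile(r"wall", re.I), "wall"),
    (re.compile(r"door", re.I), "door"),
    (re.compile(r"win(dow)?", re.I), "window"),
]

def classify_entity(layer_name):
    """Return the first matching component label, else 'ambiguous'."""
    for pattern, label in RULES:
        if pattern.search(layer_name):
            return label
    return "ambiguous"

print(classify_entity("A-WALL-EXT"))  # wall
print(classify_entity("M-DOOR-01"))   # door
print(classify_entity("FURNITURE"))   # ambiguous
```

Entities that fall through to "ambiguous" are the cases the paper's second, image-processing-based recovery step would need to resolve.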
Fine-Grained Product Class Recognition for Assisted Shopping
Assistive solutions for a better shopping experience can improve the quality
of life of people, in particular also of visually impaired shoppers. We present
a system that visually recognizes the fine-grained product classes of items on
a shopping list, in shelves images taken with a smartphone in a grocery store.
Our system consists of three components: (a) We automatically recognize useful
text on product packaging, e.g., product name and brand, and build a mapping of
words to product classes based on the large-scale GroceryProducts dataset. When
the user populates the shopping list, we automatically infer the product class
of each entered word. (b) We perform fine-grained product class recognition
when the user is facing a shelf. We discover discriminative patches on product
packaging to differentiate between visually similar product classes and to
increase the robustness against continuous changes in product design. (c) We
continuously improve the recognition accuracy through active learning. Our
experiments show the robustness of the proposed method against cross-domain
challenges, and the scalability to an increasing number of products with
minimal re-training.
Comment: Accepted at ICCV Workshop on Assistive Computer Vision and Robotics
(ICCV-ACVR) 201
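Component (a), mapping packaging words to product classes and inferring a class for each shopping-list entry, can be sketched as a word-vote scheme. The catalog below is a toy stand-in for annotations derived from a dataset such as GroceryProducts, and the voting rule is an assumption, not the paper's exact method:

```python
from collections import Counter, defaultdict

def build_word_map(labeled_products):
    """Map each packaging word to the product classes it appears in.

    labeled_products: iterable of (class_name, list_of_words), standing
    in for word annotations extracted from product packaging.
    """
    word_to_classes = defaultdict(Counter)
    for cls, words in labeled_products:
        for w in words:
            word_to_classes[w.lower()][cls] += 1
    return word_to_classes

def infer_class(word_to_classes, entry):
    """Pick the class most often associated with the words of a list entry."""
    votes = Counter()
    for w in entry.lower().split():
        votes.update(word_to_classes.get(w, Counter()))
    return votes.most_common(1)[0][0] if votes else None

catalog = [
    ("coffee/ground", ["arabica", "ground", "coffee"]),
    ("tea/green", ["green", "tea", "leaves"]),
]
m = build_word_map(catalog)
print(infer_class(m, "green tea"))  # tea/green
```

The inferred class then tells the shelf-recognition component (b) which fine-grained classes to discriminate between when the user faces a shelf.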