Data Imputation through the Identification of Local Anomalies
We introduce a comprehensive statistical framework, in a model-free
setting, for the complete treatment of localized data corruptions due to severe
noise sources, e.g., an occluder in the case of a visual recording. Within this
framework, we propose i) a novel algorithm to efficiently separate, i.e.,
detect and localize, possible corruptions from a given suspicious data instance
and ii) a Maximum A Posteriori (MAP) estimator to impute the corrupted data. As
a generalization of the Euclidean distance, we also propose a novel distance
measure, based on the ranked deviations among the data attributes, which is
empirically shown to be superior in separating the corruptions. Our algorithm
first splits the suspicious instance into parts through a binary partitioning
tree in the space of data attributes and iteratively tests those parts to
detect local anomalies using the nominal statistics extracted from an
uncorrupted (clean) reference data set. Once each part is labeled as anomalous
vs. normal, the corresponding binary patterns over this tree that characterize
corruptions are identified, and the affected attributes are imputed. Under a
certain conditional independence structure assumed for the binary patterns, we
analytically show that the false alarm rate of the introduced algorithm in
detecting the corruptions is independent of the data and can be directly set
without any parameter tuning. The proposed framework is tested on several
well-known machine learning data sets with synthetically generated corruptions
and is experimentally shown to produce remarkable improvements in
classification performance, with strong corruption separation capabilities. Our
experiments also indicate that the proposed algorithms outperform typical
approaches and are robust to varying training-phase conditions.
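The abstract does not spell out the ranked-deviation distance, so the sketch below shows one plausible variant for illustration only: sort the absolute per-attribute deviations in descending order and sum the k largest. The function name and the top-k summation are assumptions, not the paper's definition.

```python
import numpy as np

def ranked_deviation_distance(x, y, k=None):
    """Hypothetical rank-based distance: sum the k largest absolute
    per-attribute deviations. With k equal to the dimension this reduces
    to the L1 distance; the paper's exact weighting may differ."""
    dev = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    ranked = np.sort(dev)[::-1]  # largest deviations first
    if k is None:
        k = ranked.size
    return float(ranked[:k].sum())
```

Focusing on the largest deviations is what makes such a measure sensitive to localized corruptions: a few severely corrupted attributes dominate the distance even when most attributes agree.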
Structured Indoor Modeling
In this dissertation, we propose data-driven approaches to reconstruct 3D models of indoor scenes that are represented in a structured way (e.g., a wall is represented by a planar surface, and two rooms are connected via the wall). The structured representation of models is more application-ready than dense representations (e.g., a point cloud), but poses additional challenges for reconstruction, since extracting structures requires high-level understanding of geometry. To address this challenging problem, we explore two common structural regularities of indoor scenes: 1) most indoor structures consist of planar surfaces (planarity), and 2) structural surfaces (e.g., walls and floor) can be represented by a 2D floorplan as a top-down view projection (orthogonality). With breakthroughs in data capturing techniques, we develop automated systems to tackle structured modeling problems, namely piece-wise planar reconstruction and floorplan reconstruction, by learning shape priors (i.e., planarity and orthogonality) from data. With structured representations and production-level quality, the reconstructed models have an immediate impact on many industrial applications.
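The planarity prior above is commonly exploited by fitting planes to measured point sets. The following is a generic least-squares plane fit via SVD, an illustrative sketch rather than the dissertation's reconstruction pipeline:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane fit: returns (centroid, unit normal).
    Generic illustration of a planarity prior, not the paper's method."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The right singular vector for the smallest singular value of the
    # centered points is the direction of least variance, i.e. the normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    return centroid, normal / np.linalg.norm(normal)
```

A piece-wise planar reconstruction would typically alternate between assigning points to candidate planes and refitting each plane with a routine like this.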
SynTable: A Synthetic Data Generation Pipeline for Unseen Object Amodal Instance Segmentation of Cluttered Tabletop Scenes
In this work, we present SynTable, a unified and flexible Python-based
dataset generator built using NVIDIA's Isaac Sim Replicator Composer for
generating high-quality synthetic datasets for unseen object amodal instance
segmentation of cluttered tabletop scenes. Our dataset generation tool can
render a complex 3D scene containing object meshes, materials, textures,
lighting, and backgrounds. Metadata, such as modal and amodal instance
segmentation masks, occlusion masks, depth maps, bounding boxes, and material
properties, can be generated to automatically annotate the scene according to
the users' requirements. Our tool eliminates the need for manual labeling in
the dataset generation process while ensuring the quality and accuracy of the
dataset. In this work, we discuss our design goals, framework architecture, and
the performance of our tool. We demonstrate the use of a sample dataset
generated using SynTable by ray tracing for training a state-of-the-art model,
UOAIS-Net. The results show significantly improved performance in Sim-to-Real
transfer when evaluated on the OSD-Amodal dataset. We offer this tool as an
open-source, easy-to-use, photorealistic dataset generator for advancing
research in deep learning and synthetic data generation.
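The relation between the modal, amodal, and occlusion masks mentioned above can be illustrated with a small sketch. The function names and the occlusion-rate definition here are assumptions for illustration, not SynTable's actual annotation API:

```python
import numpy as np

def occlusion_mask(amodal, modal):
    """Occluded region = amodal mask (full object extent) minus
    modal mask (visible pixels). Illustrative only."""
    amodal = np.asarray(amodal, dtype=bool)
    modal = np.asarray(modal, dtype=bool)
    return amodal & ~modal

def occlusion_rate(amodal, modal):
    """Fraction of the object's amodal extent that is occluded."""
    amodal = np.asarray(amodal, dtype=bool)
    occ = occlusion_mask(amodal, modal)
    return occ.sum() / max(amodal.sum(), 1)
```

Per-object occlusion statistics like this are what make amodal datasets useful for evaluating how well a model recovers the hidden parts of cluttered objects.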