1,740 research outputs found
Geometric Multi-Model Fitting by Deep Reinforcement Learning
This paper deals with geometric multi-model fitting from noisy,
unstructured point set data (e.g., laser scanned point clouds). We formulate
the multi-model fitting problem as a sequential decision making process. We then
use a deep reinforcement learning algorithm to learn the optimal decisions
towards the best fitting result. In this paper, we have compared our method
against the state of the art on simulated data. The results demonstrated that
our approach significantly reduced the number of fitting iterations.
Low-rank Based Algorithms for Rectification, Repetition Detection and De-noising in Urban Images
In this thesis, we aim to solve the problems of automatic image rectification and repeated pattern detection in 2D urban images, using novel low-rank based techniques. Repeated patterns (such as windows, tiles, balconies and doors) are prominent and significant features in urban scenes.
Detection of these periodic structures is useful in many applications such as photorealistic 3D reconstruction, 2D-to-3D alignment, facade parsing, city modeling, classification, navigation, visualization in 3D map environments, shape completion, cinematography and 3D games. However, both image rectification and repeated pattern detection are challenging due to scene occlusions, varying illumination, pose variation and sensor noise. Therefore, detection of these repeated patterns becomes very important for city scene analysis.
Given a 2D image of an urban scene, we first automatically rectify the facade image and extract facade textures. Based on the rectified facade texture, we develop novel algorithms that extract repeated patterns using Kronecker product based modeling, which rests on a solid theoretical foundation. We have tested our algorithms on a large set of images, which includes building facades from Paris, Hong Kong and New York.
Cylinders extraction in non-oriented point clouds as a clustering problem
Finding geometric primitives in 3D point clouds is a fundamental task in many engineering applications such as robotics, autonomous vehicles and automated industrial inspection. Among all solid shapes, cylinders are frequently found in a variety of scenes, comprising natural or man-made objects. Despite their ubiquitous presence, automated extraction and fitting can become challenging if performed "in the wild", when the number of primitives is unknown or the point cloud is noisy and not oriented. In this paper we pose the problem of extracting multiple cylinders in a scene by means of a Game-Theoretic inlier selection process exploiting the geometrical relations between pairs of axis candidates. First, we formulate the similarity between two possible cylinders considering the rigid motion aligning the two axes to the same line. This motion is represented with a unit dual quaternion so that the distance between two cylinders is induced by the length of the shortest geodesic path in SE(3). Then, a Game-Theoretic process exploits this similarity function to extract sets of primitives maximizing their inner mutual consensus. The outcome of the evolutionary process is a probability distribution over the sets of candidates (i.e., axes), which in turn is used to directly estimate the final cylinder parameters. An extensive experimental section shows that the proposed algorithm offers high resilience to noise, since the process inherently discards inconsistent data. Compared to other methods, it does not need point normals and does not require fine-tuning of multiple parameters.
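As a rough illustration of how an evolutionary process can turn pairwise similarities into a probability distribution over candidates, the sketch below runs discrete-time replicator dynamics on a small payoff matrix. The abstract does not say which dynamics the authors use, so replicator dynamics is an assumption here, and the function name `replicator_dynamics` and the similarity values are made up for the example (they are not derived from dual-quaternion distances).

```python
def replicator_dynamics(payoff, iterations=200, tol=1e-12):
    """Discrete-time replicator dynamics started from the uniform distribution.

    payoff[i][j] is a non-negative similarity between candidates i and j
    (hypothetical values; zero on the diagonal).
    """
    n = len(payoff)
    x = [1.0 / n] * n
    for _ in range(iterations):
        # Expected payoff of each candidate against the current population.
        fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
        average = sum(x[i] * fitness[i] for i in range(n))
        if average <= 0.0:
            break  # no mutual support left to redistribute
        updated = [x[i] * fitness[i] / average for i in range(n)]
        change = max(abs(a - b) for a, b in zip(updated, x))
        x = updated
        if change < tol:
            break
    return x


# Candidates 0-2 agree with each other; candidate 3 is inconsistent with all.
similarity = [
    [0.0, 0.9, 0.9, 0.1],
    [0.9, 0.0, 0.9, 0.1],
    [0.9, 0.9, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
weights = replicator_dynamics(similarity)
```

In this toy case the weight of the inconsistent candidate shrinks towards zero while the three mutually consistent candidates share the remaining mass, which mirrors the idea that inconsistent data is discarded by the process itself.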
Transparent Privacy is Principled Privacy
Differential privacy revolutionizes the way we think about statistical
disclosure limitation. Among the benefits it brings to the table, one is
particularly profound and impactful. Under this formal approach to privacy, the
mechanism with which data is privatized can be spelled out in full
transparency, without sacrificing the privacy guarantee. Curators of
open-source demographic and scientific data are in a position to offer privacy
without obscurity. This paper supplies a technical treatment of the pitfalls of
obscure privacy, and establishes transparent privacy as a prerequisite to
drawing correct statistical inference. It advocates conceiving transparent
privacy as a dynamic component that can improve data quality from the total
survey error perspective, and discusses the limited statistical usability of
mere procedural transparency which may arise when dealing with mandated
invariants. Transparent privacy is the only viable path towards principled
inference from privatized data releases. Its arrival marks great progress
towards improved reproducibility, accountability and public trust. Comment: 2 figures
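A minimal sketch of what a fully published privacy mechanism can look like, assuming the standard Laplace mechanism for a counting query (the abstract does not name a specific mechanism). The helper names `laplace_noise` and `privatize_count`, the count of 1000, and the privacy budget of 0.5 are illustrative only. Because the noise scale is public, an analyst knows the exact variance the mechanism adds and can account for it in inference.

```python
import random
import statistics


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Release a count with Laplace noise of scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    return true_count + laplace_noise(scale, rng)


rng = random.Random(0)       # fixed seed so the sketch is repeatable
true_count = 1000            # hypothetical confidential count
epsilon = 0.5                # hypothetical privacy budget

# Simulate many independent releases to check the published noise model.
releases = [privatize_count(true_count, epsilon, rng=rng) for _ in range(20000)]

scale = 1.0 / epsilon
known_noise_variance = 2.0 * scale ** 2   # variance of Laplace(0, scale)
empirical_variance = statistics.variance(releases)
```

The empirical variance of the simulated releases should land close to `known_noise_variance`, which is the quantity a transparent mechanism lets the analyst plug into downstream uncertainty estimates.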
Representation Learning for Shape Decomposition, By Shape Decomposition
The ability to parse 3D objects into their constituent parts is essential for humans to understand and interact with the surrounding world. Imparting this skill to machines is important for various computer graphics, computer vision, and robotics tasks. Machines endowed with this skill can better interact with their surroundings and perform shape editing, texturing, recomposing, tracking, and animation. In this thesis, we ask two questions. First, how can machines decompose 3D shapes into their fundamental parts? Second, does the ability to decompose a 3D shape into these parts help learn useful 3D shape representations?
In this thesis, we focus on parsing the shape into compact representations, such as parametric surface patches and Constructive Solid Geometry (CSG) primitives, which are also widely used representations in 3D modeling in computer graphics. Inspired by the advances in neural networks for 3D shape processing, we develop neural network approaches to tackle shape decomposition. First, we present CSGNet, a network architecture to parse shapes into CSG programs, which is trained using a combination of supervised and reinforcement learning. Second, we present ParSeNet, a network architecture to decompose a shape into parametric surface patches (B-Spline) and geometric primitives (plane, cone, cylinder and sphere), trained on a large set of CAD models using supervised learning.
The training of deep neural network architectures for 3D recognition and generation tasks requires large amounts of labeled data. We explore ways to alleviate this problem by relying on shape decomposition methods to guide the learning process. Towards that end, we first study the use of freely available, albeit inconsistent, metadata from shape repositories to learn 3D shape features. Later we show that learning to decompose a 3D shape into geometric primitives also helps in learning shape representations useful for semantic segmentation tasks. Finally, since most 3D shapes encountered in real life are textured and consist of several fine-grained semantic parts, we propose a method to learn fine-grained representations for textured 3D shapes in a self-supervised manner by incorporating 3D geometric priors.
GazeDirector: Fully articulated eye gaze redirection in video
We present GazeDirector, a new approach for eye gaze redirection that uses model-fitting. Our method first tracks the eyes by fitting a multi-part eye region model to video frames using analysis-by-synthesis, thereby recovering eye region shape, texture, pose, and gaze simultaneously. It then redirects gaze by 1) warping the eyelids from the original image using a model-derived flow field, and 2) rendering and compositing synthesized 3D eyeballs onto the output image in a photorealistic manner. GazeDirector allows us to change where people are looking without person-specific training data, and with full articulation, i.e. we can precisely specify new gaze directions in 3D. Quantitatively, we evaluate both model-fitting and gaze synthesis, with experiments for gaze estimation and redirection on the Columbia gaze dataset. Qualitatively, we compare GazeDirector against recent work on gaze redirection, showing better results especially for large redirection angles. Finally, we demonstrate gaze redirection on YouTube videos by introducing new 3D gaze targets and by manipulating visual behavior.
Computer Vision Techniques for Transcatheter Intervention
Minimally invasive transcatheter technologies have demonstrated substantial promise for the diagnosis and treatment of cardiovascular diseases. For example, TAVI is an alternative to AVR for the treatment of severe aortic stenosis and TAFA is widely used for the treatment and cure of atrial fibrillation. In addition, catheter-based IVUS and OCT imaging of coronary arteries provides important information about the coronary lumen, wall and plaque characteristics. Qualitative and quantitative analysis of these cross-sectional image data will be beneficial for the evaluation and treatment of coronary artery diseases such as atherosclerosis. In all phases (preoperative, intraoperative, and postoperative) of the transcatheter intervention procedure, computer vision techniques (e.g., image segmentation, motion tracking) have been widely applied to accomplish tasks like annulus measurement, valve selection, catheter placement control, and vessel centerline extraction. This provides beneficial guidance for clinicians in surgical planning, disease diagnosis, and treatment assessment. In this paper, we present a systematic review of these state-of-the-art methods. We aim to give a comprehensive overview for researchers in the area of computer vision on the subject of transcatheter intervention. Research in medical computing is multi-disciplinary by nature, and hence it is important to understand the application domain, clinical background, and imaging modality so that methods and quantitative measurements derived from analyzing the imaging data are appropriate and meaningful. We thus provide an overview of background information on transcatheter intervention procedures, as well as a review of the computer vision techniques and methodologies applied in this area.