Self-Supervised Pre-training for 3D Point Clouds via View-Specific Point-to-Image Translation

Author: Hou Junhui
Zhang Qijian
Publication date: 28/07/2023

The past few years have witnessed the great success and prevalence of self-supervised representation learning within the language and 2D vision communities. However, such advancements have not been fully migrated to the field of 3D point cloud learning. Different from existing pre-training paradigms designed for deep point cloud feature extractors that fall into the scope of generative modeling or contrastive learning, this paper proposes a translative pre-training framework, namely PointVST, driven by a novel self-supervised pretext task of cross-modal translation from 3D point clouds to their corresponding diverse forms of 2D rendered images. More specifically, we begin with deducing view-conditioned point-wise embeddings through the insertion of the viewpoint indicator, and then adaptively aggregate a view-specific global codeword, which can be further fed into subsequent 2D convolutional translation heads for image generation. Extensive experimental evaluations on various downstream task scenarios demonstrate that our PointVST shows consistent and prominent performance superiority over current state-of-the-art approaches as well as satisfactory domain transfer capability. Our code will be publicly available at https://github.com/keeganhk/PointVST

arXiv.org e-Print Archive

Geographic Information Science (GIScience) and Geospatial Approaches for the Analysis of Historical Visual Sources and Cartographic Material

Publication venue: 'MDPI AG'
Publication date: 21/06/2022

This book focuses on the use of GIScience in conjunction with historical visual sources to resolve past scenarios. The themes, knowledge gained and methodologies conducted might be of interest to a variety of scholars from the social science and humanities disciplines

Directory of Open Access Books (DOAB)

Planning Framework for Robotic Pizza Dough Stretching with a Rolling Pin

Author: Kim Jung-Tae
Lippiello Vincenzo
Ruggiero Fabio
Siciliano Bruno
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

Stretching a pizza dough with a rolling pin is a nonprehensile manipulation. Since the object is deformable, force closure cannot be established, and the manipulation is carried out in a nonprehensile way. The framework of this pizza dough stretching application that is explained in this chapter consists of four sub-procedures: (i) recognition of the pizza dough on a plate, (ii) planning the necessary steps to shape the pizza dough to the desired form, (iii) path generation for a rolling pin to execute the output of the pizza dough planner, and (iv) inverse kinematics for the bi-manual robot to grasp and control the rolling pin properly. Using the deformable object model described in Chap. 3, each sub-procedure of the proposed framework is explained sequentially

Archivio della ricerca - Università degli studi di Napoli Federico II

Synthesizing and Editing Photo-realistic Visual Objects

Author: Turmukhambetov D
Publication venue: UCL (University College London)
Publication date: 28/12/2016

In this thesis we investigate novel methods of synthesizing new images of a deformable visual object using a collection of images of the object. We investigate both parametric and non-parametric methods as well as a combination of the two methods for the problem of image synthesis. Our main focus is complex visual objects, specifically deformable objects and objects with varying numbers of visible parts. We first introduce a sketch-driven image synthesis system, which allows the user to draw ellipses and outlines in order to sketch a rough shape of animals as a constraint to the synthesized image. This system interactively provides feedback in the form of ellipse and contour suggestions to the partial sketch of the user. The user's sketch guides the non-parametric synthesis algorithm that blends patches from two exemplar images in a coarse-to-fine fashion to create a final image. We evaluate the method and synthesized images through two user studies. Instead of non-parametric blending of patches, a parametric model of the appearance is more desirable as its appearance representation is shared between all images of the dataset. Hence, we propose Context-Conditioned Component Analysis (C-CCA), a probabilistic generative parametric model, which describes images with a linear combination of basis functions. The basis functions are evaluated for each pixel using a context vector computed from the local shape information. We evaluate C-CCA qualitatively and quantitatively on inpainting, appearance transfer and reconstruction tasks. Drawing samples of C-CCA generates novel, globally-coherent images, which, unfortunately, lack high-frequency details due to dimensionality reduction and misalignment. We develop a non-parametric model that enhances the samples of C-CCA with locally-coherent, high-frequency details. The non-parametric model efficiently finds patches from the dataset that match the C-CCA sample and blends the patches together.
We analyze the results of the combined method on the datasets of horse and elephant images

UCL Discovery

Random Forests Applied as a Soil Spatial Predictive Model in Arid Utah

Author: Stum Alexander Knell
Publication venue: DigitalCommons@USU
Publication date: 01/01/2010

Initial soil surveys are incomplete for large tracts of public land in the western USA. Digital soil mapping offers a quantitative approach as an alternative to traditional soil mapping. I sought to predict soil classes across an arid to semiarid watershed of western Utah by applying random forests (RF) and using environmental covariates derived from Landsat 7 Enhanced Thematic Mapper Plus (ETM+) and digital elevation models (DEM). Random forests are similar to classification and regression trees (CART). However, RF is doubly random. Many (e.g., 500) weak trees are grown (trained) independently because each tree is trained with a new randomly selected bootstrap sample, and a random subset of variables is used to split each node. To train and validate the RF trees, 561 soil descriptions were made in the field. An additional 111 points were added by case-based reasoning using aerial photo interpretation. As RF makes classification decisions from the mode of many independently grown trees, model uncertainty can be derived. The overall out-of-bag (OOB) error was lower without weighting of classes; weighting increased the overall OOB error and the resulting output did not reflect soil-landscape relationships observed in the field. The final RF model had an OOB error of 55.2% and predicted soils on landforms consistent with soil-landscape relationships. The OOB error for individual classes typically decreased with increasing class size. In addition to the final classification, I determined the second and third most likely classification, model confidence, and the hypothetical extent of individual classes. Pixels that had a high possibility of belonging to multiple soil classes were aggregated using a minimum confidence value based on limiting soil features, which is an effective and objective method of determining membership in soil map unit associations and complexes mapped at the 1:24,000 scale.
Variables derived from both DEM and Landsat 7 ETM+ sources were important for predicting soil classes based on Gini and standard measures of variable importance and OOB errors from groves grown with exclusively DEM- or Landsat-derived data. Random forests was a powerful predictor of soil classes and produced outputs that facilitated further understanding of soil-landscape relationships
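
The "doubly random" training and the vote-based confidence described in this abstract can be illustrated with a minimal Python sketch. It is not the author's code: the function names, the class labels "A", "B", "C", the feature count of 10 and the ten example votes are all hypothetical, and no trees are actually trained. It only shows how a bootstrap sample leaves some points out of bag, how a random subset of variables might be drawn for a split, and how tree votes can be ranked into a most likely, second and third most likely class with a confidence for each.

```python
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement.

    Returns (in_bag, out_of_bag); the out-of-bag points are the ones a tree
    never saw, which is what an OOB error estimate is computed on.
    """
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag


def random_feature_subset(n_features, k, rng):
    """Pick k distinct candidate variables for one node split."""
    return sorted(rng.sample(range(n_features), k))


def rank_classes(tree_votes):
    """Rank classes by share of tree votes, most voted first.

    The first entry is the mode (the forest's prediction); its share can be
    read as a simple confidence value. Ties are broken by label name.
    """
    counts = Counter(tree_votes)
    total = len(tree_votes)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return [(label, count / total) for label, count in ranked]


rng = random.Random(0)
in_bag, oob = bootstrap_indices(561, rng)      # 561 field descriptions, as in the abstract
features = random_feature_subset(10, 3, rng)   # 10 covariates is a made-up number

# Hypothetical votes from ten trees for one pixel.
votes = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
ranking = rank_classes(votes)
print(ranking)
```

A pixel whose top share falls below a chosen minimum confidence could then be treated as belonging to several classes, which is the idea the abstract uses for map unit associations and complexes.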

DigitalCommons@USU

ProQuest OAI Repository

Research Reports: 1997 NASA/ASEE Summer Faculty Fellowship Program

Author: Dowdy J.
Freeman L. M.
Karr G. R.

For the 33rd consecutive year, a NASA/ASEE Summer Faculty Fellowship Program was conducted at the Marshall Space Flight Center (MSFC). The program was conducted by the University of Alabama in Huntsville and MSFC during the period June 2, 1997 through August 8, 1997. Operated under the auspices of the American Society for Engineering Education, the MSFC program was sponsored by the Higher Education Branch, Education Division, NASA Headquarters, Washington, D.C. The basic objectives of the program, which is in its 34th year of operation nationally, are: (1) to further the professional knowledge of qualified engineering and science faculty members; (2) to stimulate an exchange of ideas between participants and NASA; (3) to enrich and refresh the research and teaching activities of the participants' institutions; and (4) to contribute to the research objectives of the NASA centers. The Faculty Fellows spent 10 weeks at MSFC engaged in a research project compatible with their interests and background and worked in collaboration with a NASA/MSFC colleague. This document is a compilation of Fellows' reports on their research during the summer of 1997. The University of Alabama in Huntsville presents the Co-Directors' report on the administrative operations of the program. Further information can be obtained by contacting any of the editors

NASA Technical Reports Server

Two and three dimensional segmentation of multimodal imagery

Author: Vantaram Sreenath Rao
Publication venue: RIT Scholar Works
Publication date: 01/10/2012

The role of segmentation in the realms of image understanding/analysis, computer vision, pattern recognition, remote sensing and medical imaging in recent years has been significantly augmented due to accelerated scientific advances made in the acquisition of image data. This low-level analysis protocol is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for facilitating meaningful segregation of 2-D/3-D image data across multiple modalities (color, remote-sensing and biomedical imaging) into non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data. To this effect, by using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition some initial portion of the input image content. Pixels that contain higher gradient densities are included by the dynamic generation of segments as the algorithm progresses to generate an initial region map. Subsequently, texture modeling is performed and the obtained gradient, texture and intensity information along with the aforementioned initial partition map are used to perform a multivariate refinement procedure, to fuse groups with similar characteristics yielding the final output segmentation. Experimental results obtained in comparison to published/state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery demonstrate the advantages of the proposed method. Furthermore, for the purpose of achieving improved computational efficiency we propose an extension of the aforestated methodology in a multi-resolution framework, demonstrated on color images.
Finally, this research also encompasses a 3-D extension of the aforementioned algorithm demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes
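
The first stage described in this abstract, grouping and labeling pixels without edges, can be sketched in Python under simplifying assumptions. This is not the thesis algorithm: it uses a scalar forward-difference gradient on a single-channel image instead of a vector gradient, a fixed hypothetical `threshold`, and plain 4-connected flood fill. The later stages (dynamic segment generation, texture modeling, multivariate refinement) are not shown.

```python
from collections import deque


def label_smooth_regions(image, threshold):
    """Label 4-connected groups of low-gradient pixels.

    image: list of equal-length rows of numbers (one channel).
    Returns (labels, count), where labels[r][c] is 0 for edge pixels and
    1..count for pixels inside a smooth region.
    """
    rows, cols = len(image), len(image[0])

    def gradient(r, c):
        # Forward differences, clamped at the image border.
        dr = image[min(r + 1, rows - 1)][c] - image[r][c]
        dc = image[r][min(c + 1, cols - 1)] - image[r][c]
        return abs(dr) + abs(dc)

    smooth = [[gradient(r, c) <= threshold for c in range(cols)]
              for r in range(rows)]
    labels = [[0] * cols for _ in range(rows)]
    count = 0

    for r in range(rows):
        for c in range(cols):
            if not smooth[r][c] or labels[r][c]:
                continue
            # Start a new region and flood-fill it breadth-first.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and smooth[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy image: a dark left half and a bright right half.
image = [[0, 0, 0, 10, 10, 10] for _ in range(4)]
labels, count = label_smooth_regions(image, threshold=1)
print(count)
print(labels[0])
```

Pixels left with label 0 are the high-gradient ones; in the framework described above they would be absorbed into segments in later passes rather than left unlabeled.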

RIT Scholar Works

ARCHITECTURE ESTIMATION FROM SPARSE IMAGES USING GRAMMATICAL SHAPE PRIORS FOR CULTURAL HERITAGE

Author: NC DOCKS at The University of North Carolina at Charlotte
Sui Yunfeng
Publication date: 01/01/2011

The estimation and reconstruction of 3D architectural structures is of great interest in computer vision, as well as cultural heritage. This dissertation proposes a novel approach to solve the difficult problem of estimating architectural structures from sparse images and efficiently generating 3D models from estimation results for cultural heritage. This approach takes as input one plan drawing image and a few façade images, and provides as output the volumetric 3D models which represent the structures in the sparse images. Support of this research goal has motivated new investigations in underlying structure estimation problems including detecting structural feature points in 2D images, decomposing plan drawings into semantically meaningful shapes for medieval castles, estimating rectangular and Gothic façades using shape priors, and estimating complete 3D models for architectural structures using a novel volumetric shape grammar. Major outstanding challenges in each of these topic areas are addressed, resulting in contributions to the current state of the art as applied to these difficult problems

The University of North Carolina at Greensboro

Remote Sensing

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

This dual conception of remote sensing brought us to the idea of preparing two different books; in addition to the first book which displays recent advances in remote sensing applications, this book is devoted to new techniques for data processing, sensors and platforms. We do not intend this book to cover all aspects of remote sensing techniques and platforms, since it would be an impossible task for a single volume. Instead, we have collected a number of high-quality, original and representative contributions in those areas

Directory of Open Access Books (DOAB)