Generative Model with Coordinate Metric Learning for Object Recognition Based on 3D Models
Given a large number of real photos for training, convolutional neural networks
perform very well on object recognition tasks. However, collecting such data is
tedious and the available backgrounds are limited, which makes it hard to build
a comprehensive database. In this paper, our generative model, trained on
synthetic images rendered from 3D models, reduces the workload of data
collection and the limitations on capture conditions. Our structure is composed
of two sub-networks: a semantic foreground object reconstruction network based
on Bayesian inference, and a classification network based on a multi-triplet
cost function. The cost function avoids over-fitting on monotone surfaces and
fully utilizes pose information by establishing a sphere-like distribution of
descriptors in each category, which helps recognition on regular photos given
the pose, lighting condition, background and category information of the
rendered images. First, our conjugate structure, called generative model with
metric learning, uses additional foreground object channels generated from
Bayesian rendering as the joint between the two sub-networks. A pose-based
multi-triplet cost function is used for metric learning, which makes it
possible to train a category classifier purely on synthetic data. Second, we
design a coordinate training strategy in which adaptive noise corrupts the
input images, helping the two sub-networks benefit from each other and avoiding
inharmonious parameter tuning caused by their different convergence speeds. Our
structure achieves state-of-the-art accuracy of over 50% on the ShapeNet
database despite the data migration obstacle from synthetic images to real
photos. This pipeline makes it possible to perform recognition on real images
based only on 3D models.
Comment: 14 page
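The abstract does not give the exact form of the multi-triplet cost function. As a rough illustration of the general idea only, the sketch below averages a standard hinge-style triplet loss over several positives and negatives for one anchor descriptor. The function names, the margin value, and the choice of which samples count as positives or negatives (for example, same category with a nearby pose versus a different category) are assumptions, not the authors' formulation.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptors given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin):
    """Standard hinge triplet loss: pull the positive closer than the negative by a margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def multi_triplet_loss(anchor, positives, negatives, margin=1.0):
    """Average the triplet loss over every (positive, negative) pair for one anchor.

    Hypothetical convention: positives share the anchor's category (and a
    nearby pose), negatives come from other categories or distant poses.
    """
    losses = [triplet_loss(anchor, p, n, margin)
              for p in positives for n in negatives]
    return sum(losses) / len(losses)

# Toy 2D descriptors: the far negative satisfies the margin, the near one does not.
loss_far = multi_triplet_loss((0.0, 0.0), [(0.0, 1.0)], [(3.0, 4.0)])
loss_near = multi_triplet_loss((0.0, 0.0), [(0.0, 1.0)], [(1.0, 0.0)])
```

A loss of zero means the negative already lies at least one margin farther from the anchor than the positive; a positive loss indicates a triplet that would still drive the descriptor update.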
Adversarial Semantic Scene Completion from a Single Depth Image
We propose a method to reconstruct, complete and semantically label a 3D
scene from a single input depth image. We improve the accuracy of the regressed
semantic 3D maps by a novel architecture based on adversarial learning. In
particular, we suggest using multiple adversarial loss terms that not only
enforce realistic outputs with respect to the ground truth, but also an
effective embedding of the internal features. This is done by correlating the
latent features of the encoder working on partial 2.5D data with the latent
features extracted from a variational 3D auto-encoder trained to reconstruct
the complete semantic scene. In addition, unlike other approaches
that operate entirely through 3D convolutions, at test time we retain the
original 2.5D structure of the input during downsampling to improve the
effectiveness of the internal representation of our model. We test our approach
on the main benchmark datasets for semantic scene completion to qualitatively
and quantitatively assess the effectiveness of our proposal.
Comment: 2018 International Conference on 3D Vision (3DV
Regional surname affinity: a spatial network approach
OBJECTIVE
We investigate surname affinities among areas of modern‐day China by constructing a spatial network and performing community detection. The study reports a geographical genealogy of the Chinese population that results from population origins, historical migrations, and societal evolutions.
MATERIALS AND METHODS
We acquire data from the census records supplied by China's National Citizen Identity Information System, including the surname and regional information of 1.28 billion registered Chinese citizens. We propose a multilayer minimum spanning tree (MMST) to construct a spatial network based on the matrix of isonymic distances, which is often used to characterize the dissimilarity of surname structure among areas. We use the fast unfolding algorithm to detect network communities.
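The abstract does not spell out how the MMST is built. One plausible reading, sketched below as an assumption, is that each layer is the minimum spanning tree of the graph that remains after the edges of all earlier layers are removed, and the network is the union of the layers. The function names and the toy distance matrix are hypothetical.

```python
from itertools import combinations

def kruskal(n, edges):
    """Minimum spanning forest of nodes 0..n-1; edges are (weight, u, v) tuples."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

def multilayer_mst(dist, layers):
    """Union of up to `layers` edge-disjoint spanning trees (assumed construction)."""
    n = len(dist)
    remaining = [(dist[u][v], u, v) for u, v in combinations(range(n), 2)]
    network = []
    for _ in range(layers):
        tree = kruskal(n, remaining)
        if len(tree) < n - 1:
            break  # the leftover graph is no longer connected
        network.extend(tree)
        used = {(u, v) for _, u, v in tree}
        remaining = [e for e in remaining if (e[1], e[2]) not in used]
    return network

# Toy example: five areas on a line, distance = absolute difference.
positions = [0, 1, 3, 6, 10]
dist = [[abs(a - b) for b in positions] for a in positions]
network = multilayer_mst(dist, 2)
```

Under this construction each layer adds n - 1 edges, so 10 layers over 362 nodes would give 10 × 361 = 3,610 edges.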
RESULTS
We obtain a 10‐layer MMST network of 362 prefecture nodes and 3,610 edges derived from the matrix of the Euclidean distances among these areas. These prefectures are divided into eight groups in the spatial network via community detection. We measure the partition by comparing the inter‐distances and intra‐distances of the communities and obtain meaningful regional ethnicity classification.
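The comparison of inter‐distances and intra‐distances can be sketched as follows. This is a minimal illustration that takes the mean distance within communities and the mean distance between communities; the abstract does not state which statistic was used, so the averaging, the function name, and the toy data are assumptions.

```python
from itertools import combinations

def partition_distances(dist, labels):
    """Return (mean intra-community distance, mean inter-community distance)."""
    intra, inter = [], []
    for i, j in combinations(range(len(labels)), 2):
        # A pair counts as intra when both areas carry the same community label.
        (intra if labels[i] == labels[j] else inter).append(dist[i][j])
    return sum(intra) / len(intra), sum(inter) / len(inter)

# Toy example: two tight pairs of areas that are far from each other.
positions = [0, 1, 10, 11]
dist = [[abs(a - b) for b in positions] for a in positions]
mean_intra, mean_inter = partition_distances(dist, [0, 0, 1, 1])
```

A partition looks meaningful under this measure when the intra value is clearly smaller than the inter value.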
DISCUSSION
The visualization of the resulting communities on the map indicates that the prefectures in the same community are usually geographically adjacent. The formation of this partition is influenced by geographical factors, historic migrations, trade and economic factors, as well as isolation of culture and language. The MMST algorithm proves effective in geo‐genealogy and ethnicity classification because it retains essential information about surname affinity and highlights the geographical consanguinity of the population.
National Natural Science Foundation of China, Grant/Award Numbers: 61773069, 71731002; National Social Science Foundation of China, Grant/Award Number: 14BSH024; Foundation of China of China Scholarships Council, Grant/Award Numbers: 201606045048, 201706040188, 201706040015; DOE, Grant/Award Number: DE-AC07-05Id14517; DTRA, Grant/Award Number: HDTRA1-14-1-0017; NSF, Grant/Award Numbers: CHE-1213217, CMMI-1125290, PHY-1505000
Published versio
New Approach for Unambiguous High-Resolution Wide-Swath SAR Imaging
The high-resolution wide-swath (HRWS) SAR system uses a small antenna to transmit the waveform and multiple antennas in both elevation and azimuth to receive echoes. It has the potential to achieve wide spatial coverage and fine azimuth resolution, but it suffers from elevation pattern loss caused by topographic height and from impaired azimuth resolution caused by nonuniform sampling. A new approach for HRWS SAR imaging based on compressed sensing (CS) is introduced. The range-compressed data from multiple elevation apertures are used to estimate the direction of arrival (DOA) of targets via CS, and adaptive digital beamforming in elevation is performed accordingly, which avoids the pattern loss of the scan-on-receive (SCORE) algorithm when topographic height exists. The effective phase centers of the system are nonuniformly distributed when displaced phase center antenna (DPCA) technology is adopted, which causes Doppler ambiguities under traditional SAR imaging algorithms. Azimuth reconstruction based on CS can resolve this problem by precisely modeling the nonuniform sampling. Validation with simulations and an experiment in an anechoic chamber is presented.
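To see why DPCA sampling is generally nonuniform, the sketch below computes along-track effective phase center positions under the common approximation that each one lies midway between the transmitter and the receiving aperture. It does not implement the CS reconstruction itself. The platform speed, aperture spacing, channel count, and PRF values are hypothetical numbers chosen for easy arithmetic.

```python
def effective_positions(v, prf, n_channels, spacing, n_pulses):
    """Sorted along-track positions of the effective phase centers.

    Assumes the transmitter sits at the first aperture and that the effective
    phase center of channel k is offset by k * spacing / 2.
    """
    step = v / prf  # platform displacement between consecutive pulses
    return sorted(m * step + k * spacing / 2.0
                  for m in range(n_pulses) for k in range(n_channels))

def sample_gaps(positions):
    """Distances between neighbouring azimuth samples."""
    return [b - a for a, b in zip(positions, positions[1:])]

v, d, n = 100.0, 2.0, 4          # hypothetical speed (m/s), spacing (m), channels
uniform_prf = 2.0 * v / (n * d)  # PRF at which the samples interleave evenly
gaps_uniform = sample_gaps(effective_positions(v, uniform_prf, n, d, 3))
gaps_nonuniform = sample_gaps(effective_positions(v, 20.0, n, d, 3))
```

Only at the matched PRF are all gaps equal; at any other PRF the gaps alternate, which is the nonuniform sampling that the CS-based azimuth reconstruction is meant to model.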
ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image
We propose a novel model for 3D semantic completion from a single depth
image, based on a single encoder and three separate generators used to
reconstruct different geometric and semantic representations of the original
and completed scene, all sharing the same latent space. To transfer information
between the geometric and semantic branches of the network, we introduce paths
between them concatenating features at corresponding network layers. Motivated
by the limited amount of training samples from real scenes, an interesting
attribute of our architecture is the capacity to supplement the existing
dataset by generating a new training dataset with high quality, realistic
scenes that even includes occlusion and real noise. We build the new dataset by
sampling the features directly from latent space which generates a pair of
partial volumetric surface and completed volumetric semantic surface. Moreover,
we utilize multiple discriminators to increase the accuracy and realism of the
reconstructions. We demonstrate the benefits of our approach on standard
benchmarks for the two most common completion tasks: semantic 3D scene
completion and 3D object completion.
Comment: Accepted in International Conference on Computer Vision 201