Recognition of 3-D Objects from Multiple 2-D Views by a Self-Organizing Neural Architecture
The recognition of 3-D objects from sequences of their 2-D views is modeled by a neural architecture called VIEWNET, which uses View Information Encoded With NETworks. VIEWNET illustrates how several types of noise and variability in image data can be progressively removed while incomplete image features are restored and invariant features are discovered using an appropriately designed cascade of processing stages. VIEWNET first processes 2-D views of 3-D objects using the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and removes noise from the images. Boundary regularization and completion are achieved by the same mechanisms that suppress image noise. A log-polar transform is taken with respect to the centroid of the resulting figure and then re-centered to achieve 2-D scale and rotation invariance. The invariant images are coarse coded to further reduce noise, reduce foreshortening effects, and increase generalization. These compressed codes are input into a supervised learning system based on the fuzzy ARTMAP algorithm. Recognition categories of 2-D views are learned before evidence from sequences of 2-D view categories is accumulated to improve object recognition. Recognition is studied with noisy and clean images using slow and fast learning. VIEWNET is demonstrated on an MIT Lincoln Laboratory database of 2-D views of jet aircraft with and without additive noise. A recognition rate of 90% is achieved with one 2-D view category and 98.5% with three 2-D view categories. National Science Foundation (IRI 90-24877); Office of Naval Research (N00014-91-J-1309, N00014-91-J-4100, N00014-92-J-0499); Air Force Office of Scientific Research (F9620-92-J-0499, 90-0083
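The log-polar step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the VIEWNET implementation: the function name, the grid sizes `n_rho` and `n_theta`, and the nearest-neighbour sampling are all assumptions made for illustration, and the re-centering stage is not shown. The idea is that, once a figure is resampled around its centroid on a (log-radius, angle) grid, a rotation of the figure becomes a shift along the angle axis and a change of scale becomes a shift along the log-radius axis.

```python
import math

def log_polar_about_centroid(image, n_rho=16, n_theta=32):
    """Resample a 2-D figure (list of rows) onto a (log-radius, angle) grid.

    Hypothetical helper for illustration; grid sizes are arbitrary choices.
    """
    h, w = len(image), len(image[0])
    total = sum(sum(row) for row in image)
    if total <= 0:
        raise ValueError("image must contain a non-empty figure")
    # Intensity-weighted centroid of the figure.
    cy = sum(r * v for r, row in enumerate(image) for v in row) / total
    cx = sum(c * v for row in image for c, v in enumerate(row)) / total
    # Largest radius of interest: distance to the farthest image corner.
    r_max = max(1.0, math.hypot(max(cy, h - 1 - cy), max(cx, w - 1 - cx)))
    out = []
    for i in range(n_rho):
        # Radii are evenly spaced in log(r), from 1 pixel out to r_max.
        frac = i / (n_rho - 1) if n_rho > 1 else 0.0
        radius = math.exp(frac * math.log(r_max))
        ring = []
        for j in range(n_theta):
            angle = 2.0 * math.pi * j / n_theta
            # Nearest-neighbour sampling; points outside the image read as 0.
            y = int(round(cy + radius * math.sin(angle)))
            x = int(round(cx + radius * math.cos(angle)))
            ring.append(image[y][x] if 0 <= y < h and 0 <= x < w else 0.0)
        out.append(ring)
    return out
```

Each row of the result corresponds to one radius and each column to one angle, so a centred disk produces rows that are constant along the angle axis.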
Impact of Ground Truth Annotation Quality on Performance of Semantic Image Segmentation of Traffic Conditions
Preparation of high-quality datasets for urban scene understanding is a
labor-intensive task, especially for datasets designed for autonomous
driving applications. Using the coarse ground truth (GT) annotations of
these datasets without detriment to the accuracy of semantic image
segmentation (measured by the mean intersection over union, mIoU) could
simplify and speed up dataset preparation and model fine-tuning before
practical application. Here, a comparative analysis of the semantic
segmentation accuracy obtained by the PSPNet deep learning architecture is
presented for finely and coarsely annotated images from the Cityscapes
dataset. Two
scenarios were investigated: scenario 1 - the fine GT images for training and
prediction, and scenario 2 - the fine GT images for training and the coarse GT
images for prediction. The results demonstrate that, for the most
important classes, the mean accuracy of semantic image segmentation is
higher for coarse GT annotations than for fine ones, while the standard
deviation is lower. This means that for some applications unimportant
classes can be excluded and the model can be tuned further for specific
classes and regions on the coarse GT dataset without loss of accuracy.
Moreover, this opens the prospect of using deep neural networks to prepare
such coarse GT datasets.
Comment: 10 pages, 6 figures, 2 tables, The Second International Conference on
Computer Science, Engineering and Education Applications (ICCSEEA2019) 26-27
January 2019, Kiev, Ukrain
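The mIoU metric named in the abstract above can be sketched as follows. This is a generic illustration, not the evaluation code used by the authors: the function name, the nested-list label maps, and the `ignore_label=255` default are assumptions made for the example. Per-class IoU is the number of pixels where prediction and ground truth agree on a class, divided by the number of pixels where either assigns that class; mIoU averages over the classes that occur.

```python
def mean_iou(pred, truth, num_classes, ignore_label=255):
    """Return (mIoU, per-class IoU dict) for two label maps of equal shape.

    Hypothetical helper; pixels whose ground truth equals ignore_label
    (an assumed convention) are skipped entirely.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if t == ignore_label:
                continue
            if p == t:
                # Agreement counts toward both intersection and union.
                inter[t] += 1
                union[t] += 1
            else:
                # Disagreement enlarges the union of both classes involved.
                union[t] += 1
                if 0 <= p < num_classes:
                    union[p] += 1
    # Classes absent from both maps are left out of the average.
    ious = {c: inter[c] / union[c] for c in range(num_classes) if union[c] > 0}
    miou = sum(ious.values()) / len(ious) if ious else float("nan")
    return miou, ious
```

For example, with `pred = [[0, 0], [1, 1]]` and `truth = [[0, 1], [1, 1]]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, giving an mIoU of 7/12.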
Data-Driven Segmentation of Post-mortem Iris Images
This paper presents a method for segmenting iris images obtained from
deceased subjects by training a deep convolutional neural network (DCNN)
designed for the purpose of semantic segmentation. Post-mortem iris recognition
has recently emerged as an alternative, or additional, method useful in
forensic analysis. At the same time, it poses many new challenges from the
technological standpoint, one of them being the image segmentation stage, which
has proven difficult to execute reliably with conventional iris recognition
methods. Our approach is based on the SegNet architecture, fine-tuned with
1,300 manually segmented post-mortem iris images taken from the
Warsaw-BioBase-Post-Mortem-Iris v1.0 database. The experiments presented in
this paper show that this data-driven solution is able to learn specific
deformations present in post-mortem samples, which are absent from live
irises, and offers a considerable improvement over the state-of-the-art
conventional segmentation algorithm (OSIRIS): the Intersection over Union (IoU)
metric improved from 73.6% (for OSIRIS) to 83% (for the DCNN-based method
presented in this paper), averaged over multiple subject-disjoint splits of
the data into train and test subsets. This paper offers the first method known
to us for automatic processing of post-mortem iris images. We offer source code
with the trained DCNN that performs end-to-end segmentation of post-mortem iris
images, as described in this paper. Also, we offer binary masks corresponding to manual
segmentation of samples from the Warsaw-BioBase-Post-Mortem-Iris v1.0 database to
facilitate development of alternative methods for post-mortem iris
segmentation
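The subject-disjoint splitting mentioned in the abstract above can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' protocol: the function name, the `test_fraction` value, and the seeding scheme are hypothetical. The point is that whole subjects, rather than individual images, are assigned to the test set, so no subject contributes images to both subsets.

```python
import random

def subject_disjoint_split(sample_subjects, test_fraction=0.2, seed=0):
    """Split sample indices so that no subject appears in both subsets.

    sample_subjects holds one subject identifier per sample.
    Hypothetical helper; test_fraction and seed are illustrative choices.
    """
    subjects = sorted(set(sample_subjects))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    # Hold out whole subjects, at least one, for testing.
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(sample_subjects) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(sample_subjects) if s in test_subjects]
    return train_idx, test_idx
```

Calling the function with several different seeds yields multiple splits, and a metric such as IoU can then be averaged over them.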