ICDAR2003 Page Segmentation Competition
There is a significant need to objectively evaluate layout analysis (page segmentation and region classification) methods. This paper describes the Page Segmentation Competition (modus operandi, dataset and evaluation criteria) held in the context of ICDAR2003 and presents the results of the evaluation of the candidate methods. The main objective of the competition was to evaluate such methods using scanned documents from commonly-occurring publications. The results indicate that although methods seem to be maturing, there is still a considerable need to develop robust methods that deal with everyday documents.
Matterport3D: Learning from RGB-D Data in Indoor Environments
Access to large, diverse RGB-D datasets is critical for training RGB-D scene
understanding algorithms. However, existing datasets still cover only a limited
number of views or a restricted scale of spaces. In this paper, we introduce
Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views
from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided
with surface reconstructions, camera poses, and 2D and 3D semantic
segmentations. The precise global alignment and comprehensive, diverse
panoramic set of views over entire buildings enable a variety of supervised and
self-supervised computer vision tasks, including keypoint matching, view
overlap prediction, normal prediction from color, semantic segmentation, and
region classification.
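The dataset figures quoted in this abstract can be related to one another with a little arithmetic. The sketch below only divides the numbers stated above; the resulting per-panorama and per-scene averages are derived values, not figures given in the abstract, and the variable names are illustrative.

```python
# Figures quoted in the Matterport3D abstract.
panoramic_views = 10_800
rgbd_images = 194_400
scenes = 90

# Derived averages (assumption: images and views are spread evenly,
# which the abstract does not state).
images_per_panorama = rgbd_images / panoramic_views
panoramas_per_scene = panoramic_views / scenes

print(f"RGB-D images per panoramic view: {images_per_panorama:g}")
print(f"Panoramic views per scene (average): {panoramas_per_scene:g}")
```

Both ratios come out as whole numbers, which suggests a fixed number of RGB-D images per panorama, though the abstract itself does not say so.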
A Self-Organizing System for Classifying Complex Images: Natural Textures and Synthetic Aperture Radar
A self-organizing architecture is developed for image region classification. The system consists of a preprocessor that utilizes multi-scale filtering, competition, cooperation, and diffusion to compute a vector of image boundary and surface properties, notably texture and brightness properties. This vector inputs to a system that incrementally learns noisy multidimensional mappings and their probabilities. The architecture is applied to difficult real-world image classification problems, including classification of synthetic aperture radar and natural texture images, and outperforms a recent state-of-the-art system at classifying natural textures. Office of Naval Research (N00014-95-1-0409, N00014-95-1-0657, N00014-91-J-4100); Advanced Research Projects Agency (N00014-92-J-4015); Air Force Office of Scientific Research (F49620-92-J-0225, F49620-92-J-0334); National Science Foundation (IRI-90-00530, IRI-90-24877)
Unsupervised Classification for Tiling Arrays: ChIP-chip and Transcriptome
Tiling arrays make possible a large-scale exploration of the genome thanks to
probes which cover the whole genome at very high density, up to 2,000,000
probes. Biological questions usually addressed are either the expression
difference between two conditions or the detection of transcribed regions. In
this work we propose to consider simultaneously both questions as an
unsupervised classification problem by modeling the joint distribution of the
two conditions. In contrast to previous methods, we account for all available
information on the probes as well as biological knowledge like annotation and
spatial dependence between probes. Since probes are not biologically relevant
units we propose a classification rule for non-connected regions covered by
several probes. Applications to transcriptomic and ChIP-chip data of
Arabidopsis thaliana obtained with a NimbleGen tiling array highlight the
importance of precise modeling and of the region classification.
ARTEX: A Self-Organizing Architecture for Classifying Image Regions
A self-organizing architecture is developed for image region classification. The system consists of a preprocessor that utilizes multi-scale filtering, competition, cooperation, and diffusion to compute a vector of image boundary and surface properties, notably texture and brightness properties. This vector inputs to a system that incrementally learns noisy multidimensional mappings and their probabilities. The architecture is applied to difficult real-world image classification problems, including classification of synthetic aperture radar and natural textural images, and outperforms a recent state-of-the-art system at classifying natural textures. Office of Naval Research (N00014-95-1-0409, N00014-95-1-0657, N00014-91-J-4100, N00014-95-1-0409); Advanced Research Projects Agency (N00014-92-J-4015); Air Force Office of Scientific Research (F49620-92-J-4015, F49620-92-J-0334); National Science Foundation (IRI-90-00530, IRI-90-24877).
Ground Truth for Layout Analysis Performance Evaluation
Over the past two decades a significant number of layout analysis (page segmentation and region classification) approaches have been proposed in the literature. Each approach has been devised for and/or evaluated using (usually small) application-specific datasets. While the need for objective performance evaluation of layout analysis algorithms is evident, there does not exist a suitable dataset with ground truth that reflects the realities of everyday documents (widely varying layouts, complex entities, colour, noise etc.). The most significant impediment is the creation of accurate and flexible (in representation) ground truth, a task that is costly and must be carefully designed. This paper discusses the issues related to the design, representation and creation of ground truth in the context of a realistic dataset developed by the authors. The effectiveness of the ground truth discussed in this paper has been successfully shown in its use for two international page segmentation competitions (ICDAR2003 and ICDAR2005).
Deep Learning Body Region Classification of MRI and CT examinations
Standardized body region labelling of individual images provides data that
can improve human and computer use of medical images. A CNN-based classifier
was developed to identify body regions in CT and MRI. 17 CT (18 MRI) body
regions covering the entire human body were defined for the classification
task. Three retrospective databases were built for the AI model training,
validation, and testing, with a balanced distribution of studies per body
region. The test databases originated from a different healthcare network.
Accuracy, recall and precision of the classifier were evaluated for patient age,
patient gender, institution, scanner manufacturer, contrast, slice thickness,
MRI sequence, and CT kernel. The data included a retrospective cohort of 2,934
anonymized CT cases (training: 1,804 studies, validation: 602 studies, test:
528 studies) and 3,185 anonymized MRI cases (training: 1,911 studies,
validation: 636 studies, test: 638 studies). 27 institutions from primary care
hospitals, community hospitals and imaging centers contributed to the test
datasets. The data included cases of all genders in equal proportions and
subjects aged from a few months to over 90 years. An image-level prediction
accuracy of 91.9% (90.2 - 92.1) for CT, and 94.2% (92.0 - 95.6) for MRI was
achieved. The classification results were robust across all body regions and
confounding factors. Due to limited data, performance results for subjects
under 10 years old could not be reliably evaluated. We show that deep learning
models can classify CT and MRI images by body region including lower and upper
extremities with high accuracy. Comment: 21 pages, 2 figures, 4 tables
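As a quick consistency check on the cohort sizes quoted in this abstract, the sketch below sums the training, validation and test counts per modality and compares them with the reported totals. The split fractions it prints are derived here and are not stated in the abstract; the dictionary and function names are illustrative.

```python
# Study counts quoted in the abstract, per modality and split.
splits = {
    "CT": {"training": 1804, "validation": 602, "test": 528},
    "MRI": {"training": 1911, "validation": 636, "test": 638},
}
reported_totals = {"CT": 2934, "MRI": 3185}


def split_fractions(parts):
    """Return each split's share of the modality's total study count."""
    total = sum(parts.values())
    return {name: count / total for name, count in parts.items()}


for modality, parts in splits.items():
    total = sum(parts.values())
    status = "matches" if total == reported_totals[modality] else "differs from"
    print(f"{modality}: {total} studies ({status} the reported total)")
    for name, fraction in split_fractions(parts).items():
        print(f"  {name}: {fraction:.1%}")
```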
Context-Aware Zero-Shot Recognition
We present a novel problem setting in zero-shot learning: zero-shot object
recognition and detection in context. Contrary to traditional zero-shot
learning methods, which simply infer unseen categories by transferring
knowledge from the objects belonging to semantically similar seen categories,
we aim to understand the identity of the novel objects in an image surrounded
by the known objects using the inter-object relation prior. Specifically, we
leverage the visual context and the geometric relationships between all pairs
of objects in a single image, and capture the information useful to infer
unseen categories. We integrate our context-aware zero-shot learning framework
into the traditional zero-shot learning techniques seamlessly using a
Conditional Random Field (CRF). The proposed algorithm is evaluated on both
zero-shot region classification and zero-shot detection tasks. The results on
the Visual Genome (VG) dataset show that our model significantly boosts
performance with the additional visual context compared to traditional methods.