A Stochastic Modeling Approach to Region- and Edge-Based Image Segmentation
The purpose of image segmentation is to isolate objects in a scene from the background. This is a very important step in any computer vision system, since various tasks, such as shape analysis and object recognition, require accurate image segmentation. Image segmentation can also produce tremendous data reduction. Edge-based and region-based segmentation have been examined, and two new algorithms based on recent results in random field theory have been developed. The edge-based segmentation algorithm uses the pixel gray-level intensity information to locate object boundaries in two stages: edge enhancement, followed by edge linking. Edge enhancement is accomplished by maximum energy filters used in one-dimensional bandlimited signal analysis. The issue of optimum filter spatial support is analyzed for ideal edge models. Edge linking is performed by quantitative sequential search using the Stack algorithm. Two probabilistic search metrics are introduced, and their optimality is proven and demonstrated on test as well as real scenes. Compared to other methods, this algorithm is shown to locate object boundaries more accurately. Region-based segmentation was modeled as a MAP estimation problem in which the actual (unknown) objects were estimated from the observed (known) image by a recursive classification algorithm. The observed image was modeled by an autoregressive (AR) model whose parameters were estimated locally, and a Gibbs-Markov random field (GMRF) model was used to model the unknown scene. A computational study was conducted on images with various types of texture. The issues of parameter estimation, neighborhood selection, and model order were examined. It is concluded that the MAP approach to region segmentation generally works well on images with a large content of microtextures that can be properly modeled by both AR and GMRF models. On these texture images, second-order AR and GMRF models were shown to be adequate.
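To make the MAP formulation in the abstract above more concrete, here is a minimal sketch of region labeling as energy minimization. It is not the recursive algorithm of that work: a Gaussian likelihood with fixed class means stands in for the locally estimated AR model, a Potts-style penalty on disagreeing 4-neighbours stands in for the Gibbs-Markov prior, and iterated conditional modes (ICM) is used as one simple local optimizer. The names `icm_segment`, `means`, `sigma`, and `beta` are illustrative assumptions.

```python
import math


def icm_segment(image, means, sigma=1.0, beta=1.0, sweeps=5):
    """Approximate MAP labeling of a 2-D list of gray levels with ICM.

    Assumptions (not from the abstract): Gaussian class likelihoods with the
    given means and a shared sigma, and a Potts prior with weight beta.
    """
    rows, cols = len(image), len(image[0])
    # Start from the maximum-likelihood label of each pixel (nearest mean).
    labels = [[min(range(len(means)), key=lambda k: (p - means[k]) ** 2)
               for p in row] for row in image]
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                best_k, best_energy = labels[r][c], math.inf
                for k, mu in enumerate(means):
                    # Data term: negative Gaussian log-likelihood (up to a constant).
                    data = (image[r][c] - mu) ** 2 / (2 * sigma ** 2)
                    # Prior term: count 4-neighbours carrying a different label.
                    disagree = 0
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] != k:
                            disagree += 1
                    energy = data + beta * disagree
                    if energy < best_energy:
                        best_k, best_energy = k, energy
                if best_k != labels[r][c]:
                    labels[r][c] = best_k
                    changed = True
        if not changed:
            break  # No label changed in a full sweep: a local optimum was reached.
    return labels
```

With `beta=0` the prior is switched off and the result is plain per-pixel classification; raising `beta` increasingly favours spatially coherent regions, which is the role the random field prior plays in the MAP formulation.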
Investigation on advanced image search techniques
Content-based image search for retrieval of images based on the similarity in their visual contents, such as color, texture, and shape, to a query image is an active research area due to its broad applications. Color, for example, provides powerful information for image search and classification. This dissertation investigates advanced image search techniques and presents new color descriptors for image search and classification and robust image enhancement and segmentation methods for iris recognition.
First, several new color descriptors have been developed for color image search. Specifically, a new oRGB-SIFT descriptor, which integrates the oRGB color space and the Scale-Invariant Feature Transform (SIFT), is proposed for image search and classification. The oRGB-SIFT descriptor is further integrated with other color SIFT features to produce the novel Color SIFT Fusion (CSF), the Color Grayscale SIFT Fusion (CGSF), and the CGSF+PHOG descriptors for image category search with applications to biometrics. Image classification is implemented using a novel EFM-KNN classifier, which combines the Enhanced Fisher Model (EFM) and the K Nearest Neighbor (KNN) decision rule. Experimental results on four large-scale, grand challenge datasets have shown that the proposed oRGB-SIFT descriptor improves recognition performance over other color SIFT descriptors, and that the CSF, the CGSF, and the CGSF+PHOG descriptors perform better than the other color SIFT descriptors. The fusion of both the Color SIFT descriptors (CSF) and the Color Grayscale SIFT descriptor (CGSF) shows significant improvement in classification performance, which indicates that the various color SIFT descriptors and the grayscale SIFT descriptor are not redundant for image search.
Second, four novel color Local Binary Pattern (LBP) descriptors are presented for scene image and image texture classification. Specifically, the oRGB-LBP descriptor is derived in the oRGB color space. The other three color LBP descriptors, namely, the Color LBP Fusion (CLF), the Color Grayscale LBP Fusion (CGLF), and the CGLF+PHOG descriptors, are obtained by integrating the oRGB-LBP descriptor with some additional image features. Experimental results on three large scale, grand challenge datasets have shown that the proposed descriptors can improve scene image and image texture classification performance.
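For readers unfamiliar with Local Binary Patterns, the following is a minimal sketch of the basic 8-neighbour LBP operator on a single gray-level channel. It does not reproduce the oRGB-LBP or fusion descriptors described above; applying it to each channel of a color space and concatenating the histograms would be one plausible reading, but that is an assumption. The function names and the clockwise bit ordering are illustrative choices.

```python
def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    # Neighbours visited clockwise, starting from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the center sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels of a 2-D list."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

The 256-bin histogram, rather than the per-pixel codes, is what is typically used as the texture feature, because it discards pixel positions and keeps only the distribution of local patterns.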
Finally, a new iris recognition method based on a robust iris segmentation approach is presented for improving iris recognition performance. The proposed robust iris segmentation approach applies power-law transformations for more accurate detection of the pupil region, which significantly reduces the candidate limbic boundary search space and thereby increases detection accuracy and efficiency. As the limbic circle, which has a center within a close range of the pupil center, is selectively detected, the eyelid detection approach leads to improved iris recognition performance. Experiments using the Iris Challenge Evaluation (ICE) database show the effectiveness of the proposed method.
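The power-law transformation mentioned above is the standard gamma mapping s = c * r^gamma applied to normalized gray levels. The sketch below shows that mapping only; the exponent used for pupil detection is not given in the abstract, so `gamma` and `max_level` are illustrative parameters.

```python
def power_law(image, gamma, max_level=255):
    """Apply s = max_level * (r / max_level) ** gamma to every gray level."""
    return [[round(max_level * (p / max_level) ** gamma) for p in row]
            for row in image]
```

A `gamma` above 1 darkens mid-tones and a `gamma` below 1 brightens them, while the end points 0 and `max_level` are left unchanged. Darkening mid-tones is one way such a transform could make a dark pupil stand out more, though whether the work above uses it that way is not stated.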
Perceptual-based textures for scene labeling: a bottom-up and a top-down approach
Due to the semantic gap, the automatic interpretation of digital images is a very challenging task. Both segmentation and classification are intricate because of the high variation in the data. Therefore, the application of appropriate features is of utmost importance. This paper presents biologically inspired texture features for material classification and for interpreting outdoor scenery images. Experiments show that the presented texture features obtain the best classification results for material recognition compared to other well-known texture features, with an average classification rate of 93.0%. For scene analysis, both a bottom-up and a top-down strategy are employed to bridge the semantic gap. First, images are segmented into regions based on the perceptual texture, and next, a semantic label is calculated for these regions. Since this emerging interpretation is still error-prone, domain knowledge is incorporated to achieve a more accurate description of the depicted scene. By applying both strategies, 91.9% of the pixels from outdoor scenery images obtained a correct label.
Hybrid image representation methods for automatic image annotation: a survey
In most automatic image annotation systems, images are represented with low-level features using either global methods or local methods. In global methods, the entire image is used as a unit. Local methods divide images either into blocks, where fixed-size sub-image blocks are adopted as sub-units, or into regions, where segmented regions are used as sub-units. In contrast to typical automatic image annotation methods that use either global or local features exclusively, several recent methods incorporate both kinds of information, on the premise that combining the two levels of features is beneficial in annotating images. In this paper, we provide a survey of automatic image annotation techniques from one aspect, feature extraction, and, in order to complement existing surveys in the literature, we focus on the emerging hybrid methods that combine both global and local features for image representation.
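As a rough illustration of what a hybrid representation can look like, the sketch below concatenates one global gray-level histogram with the histograms of fixed-size blocks. It is not taken from any method covered by the survey; the gray-level histogram feature, the block size, and the function names are all assumptions chosen for brevity.

```python
def gray_histogram(pixels, bins=4, max_level=256):
    """Normalized histogram of integer gray levels in [0, max_level)."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // max_level] += 1
    return [h / len(pixels) for h in hist]


def hybrid_descriptor(image, block=2, bins=4):
    """Concatenate a global histogram with per-block (local) histograms."""
    rows, cols = len(image), len(image[0])
    # Global part: the entire image is treated as one unit.
    feature = gray_histogram([p for row in image for p in row], bins)
    # Local part: non-overlapping fixed-size blocks as sub-units.
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            patch = [image[r][c]
                     for r in range(r0, r0 + block)
                     for c in range(c0, c0 + block)]
            feature.extend(gray_histogram(patch, bins))
    return feature
```

The global part summarizes overall appearance, while the block histograms retain coarse spatial layout; a region-based variant would replace the fixed blocks with segmented regions.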
Fused Text Segmentation Networks for Multi-oriented Scene Text Detection
In this paper, we introduce a novel end-to-end framework for multi-oriented
scene text detection from an instance-aware semantic segmentation perspective.
We present Fused Text Segmentation Networks, which combine multi-level features
during feature extraction, as text instances may rely on finer feature
expression compared to general objects. The framework detects and segments text
instances jointly and simultaneously, leveraging the merits of both the semantic
segmentation task and the region-proposal-based object detection task. Without
involving any extra pipelines, our approach surpasses the current state of the
art on multi-oriented scene text detection benchmarks: ICDAR2015 Incidental
Scene Text and MSRA-TD500, reaching an Hmean of 84.1% and 82.0%, respectively. Moreover,
we report a baseline on Total-Text, which contains curved text, suggesting the
effectiveness of the proposed approach. Comment: Accepted by ICPR201
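The Hmean figure quoted above is, in the usual text detection evaluation convention, the harmonic mean of precision and recall (the F-measure). A minimal sketch of that formula follows; the function name is illustrative, and how detections are matched to ground truth to obtain precision and recall is benchmark-specific and not shown.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller of its two arguments, a detector cannot reach a high Hmean by trading away recall for precision or the reverse.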
Efficient Scene Text Localization and Recognition with Local Character Refinement
An unconstrained end-to-end text localization and recognition method is
presented. The method detects initial text hypotheses in a single pass by an
efficient region-based method and subsequently refines the text hypotheses
using a more robust local text model, which deviates from the common assumption
of region-based methods that all characters are detected as connected
components.
Additionally, a novel feature based on character stroke area estimation is
introduced. The feature is efficiently computed from a region distance map, is
invariant to scaling and rotation, and allows text regions to be detected
efficiently regardless of what portion of the text they capture.
The method runs in real time and achieves state-of-the-art text localization
and recognition results on the ICDAR 2013 Robust Reading dataset.