9,342 research outputs found
Rotation-invariant features for multi-oriented text detection in natural images.
Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling and
exploration, as well as scene analysis and synthesis, through reviewing the
literature and relating the existing works with both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing.Comment: 10 pages, 19 figure
A Discriminative Representation of Convolutional Features for Indoor Scene Recognition
Indoor scene recognition is a multi-faceted and challenging problem due to
the diverse intra-class variations and the confusing inter-class similarities.
This paper presents a novel approach which exploits rich mid-level
convolutional features to categorize indoor scenes. Traditionally used
convolutional features preserve the global spatial structure, which is a
desirable property for general object recognition. However, we argue that this
structuredness is not much helpful when we have large variations in scene
layouts, e.g., in indoor scenes. We propose to transform the structured
convolutional activations to another highly discriminative feature space. The
representation in the transformed space not only incorporates the
discriminative aspects of the target dataset, but it also encodes the features
in terms of the general object categories that are present in indoor scenes. To
this end, we introduce a new large-scale dataset of 1300 object categories
which are commonly present in indoor scenes. Our proposed approach achieves a
significant performance boost over previous state of the art approaches on five
major scene classification datasets
A Framework for Symmetric Part Detection in Cluttered Scenes
The role of symmetry in computer vision has waxed and waned in importance
during the evolution of the field from its earliest days. At first figuring
prominently in support of bottom-up indexing, it fell out of favor as shape
gave way to appearance and recognition gave way to detection. With a strong
prior in the form of a target object, the role of the weaker priors offered by
perceptual grouping was greatly diminished. However, as the field returns to
the problem of recognition from a large database, the bottom-up recovery of the
parts that make up the objects in a cluttered scene is critical for their
recognition. The medial axis community has long exploited the ubiquitous
regularity of symmetry as a basis for the decomposition of a closed contour
into medial parts. However, today's recognition systems are faced with
cluttered scenes, and the assumption that a closed contour exists, i.e. that
figure-ground segmentation has been solved, renders much of the medial axis
community's work inapplicable. In this article, we review a computational
framework, previously reported in Lee et al. (2013), Levinshtein et al. (2009,
2013), that bridges the representation power of the medial axis and the need to
recover and group an object's parts in a cluttered scene. Our framework is
rooted in the idea that a maximally inscribed disc, the building block of a
medial axis, can be modeled as a compact superpixel in the image. We evaluate
the method on images of cluttered scenes.Comment: 10 pages, 8 figure
Designing a fruit identification algorithm in orchard conditions to develop robots using video processing and majority voting based on hybrid artificial neural network
The first step in identifying fruits on trees is to develop garden robots for different purposes
such as fruit harvesting and spatial specific spraying. Due to the natural conditions of the fruit
orchards and the unevenness of the various objects throughout it, usage of the controlled conditions
is very difficult. As a result, these operations should be performed in natural conditions, both
in light and in the background. Due to the dependency of other garden robot operations on the
fruit identification stage, this step must be performed precisely. Therefore, the purpose of this
paper was to design an identification algorithm in orchard conditions using a combination of video
processing and majority voting based on different hybrid artificial neural networks. The different
steps of designing this algorithm were: (1) Recording video of different plum orchards at different
light intensities; (2) converting the videos produced into its frames; (3) extracting different color
properties from pixels; (4) selecting effective properties from color extraction properties using
hybrid artificial neural network-harmony search (ANN-HS); and (5) classification using majority
voting based on three classifiers of artificial neural network-bees algorithm (ANN-BA), artificial
neural network-biogeography-based optimization (ANN-BBO), and artificial neural network-firefly
algorithm (ANN-FA). Most effective features selected by the hybrid ANN-HS consisted of the third
channel in hue saturation lightness (HSL) color space, the second channel in lightness chroma hue
(LCH) color space, the first channel in L*a*b* color space, and the first channel in hue saturation
intensity (HSI). The results showed that the accuracy of the majority voting method in the best execution
and in 500 executions was 98.01% and 97.20%, respectively. Based on different performance evaluation
criteria of the classifiers, it was found that the majority voting method had a higher performance.European Union (EU) under Erasmus+ project entitled
“Fostering Internationalization in Agricultural Engineering in Iran and Russia” [FARmER] with grant
number 585596-EPP-1-2017-1-DE-EPPKA2-CBHE-JPinfo:eu-repo/semantics/publishedVersio
- …