9,341 research outputs found

    Rotation-invariant features for multi-oriented text detection in natural images.

    Get PDF
    Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes

    Data-Driven Shape Analysis and Processing

    Full text link
    Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, through reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing.Comment: 10 pages, 19 figure

    A Discriminative Representation of Convolutional Features for Indoor Scene Recognition

    Full text link
    Indoor scene recognition is a multi-faceted and challenging problem due to the diverse intra-class variations and the confusing inter-class similarities. This paper presents a novel approach which exploits rich mid-level convolutional features to categorize indoor scenes. Traditionally used convolutional features preserve the global spatial structure, which is a desirable property for general object recognition. However, we argue that this structuredness is not much helpful when we have large variations in scene layouts, e.g., in indoor scenes. We propose to transform the structured convolutional activations to another highly discriminative feature space. The representation in the transformed space not only incorporates the discriminative aspects of the target dataset, but it also encodes the features in terms of the general object categories that are present in indoor scenes. To this end, we introduce a new large-scale dataset of 1300 object categories which are commonly present in indoor scenes. Our proposed approach achieves a significant performance boost over previous state of the art approaches on five major scene classification datasets

    A Framework for Symmetric Part Detection in Cluttered Scenes

    Full text link
    The role of symmetry in computer vision has waxed and waned in importance during the evolution of the field from its earliest days. At first figuring prominently in support of bottom-up indexing, it fell out of favor as shape gave way to appearance and recognition gave way to detection. With a strong prior in the form of a target object, the role of the weaker priors offered by perceptual grouping was greatly diminished. However, as the field returns to the problem of recognition from a large database, the bottom-up recovery of the parts that make up the objects in a cluttered scene is critical for their recognition. The medial axis community has long exploited the ubiquitous regularity of symmetry as a basis for the decomposition of a closed contour into medial parts. However, today's recognition systems are faced with cluttered scenes, and the assumption that a closed contour exists, i.e. that figure-ground segmentation has been solved, renders much of the medial axis community's work inapplicable. In this article, we review a computational framework, previously reported in Lee et al. (2013), Levinshtein et al. (2009, 2013), that bridges the representation power of the medial axis and the need to recover and group an object's parts in a cluttered scene. Our framework is rooted in the idea that a maximally inscribed disc, the building block of a medial axis, can be modeled as a compact superpixel in the image. We evaluate the method on images of cluttered scenes.Comment: 10 pages, 8 figure

    Designing a fruit identification algorithm in orchard conditions to develop robots using video processing and majority voting based on hybrid artificial neural network

    Get PDF
    The first step in identifying fruits on trees is to develop garden robots for different purposes such as fruit harvesting and spatial specific spraying. Due to the natural conditions of the fruit orchards and the unevenness of the various objects throughout it, usage of the controlled conditions is very difficult. As a result, these operations should be performed in natural conditions, both in light and in the background. Due to the dependency of other garden robot operations on the fruit identification stage, this step must be performed precisely. Therefore, the purpose of this paper was to design an identification algorithm in orchard conditions using a combination of video processing and majority voting based on different hybrid artificial neural networks. The different steps of designing this algorithm were: (1) Recording video of different plum orchards at different light intensities; (2) converting the videos produced into its frames; (3) extracting different color properties from pixels; (4) selecting effective properties from color extraction properties using hybrid artificial neural network-harmony search (ANN-HS); and (5) classification using majority voting based on three classifiers of artificial neural network-bees algorithm (ANN-BA), artificial neural network-biogeography-based optimization (ANN-BBO), and artificial neural network-firefly algorithm (ANN-FA). Most effective features selected by the hybrid ANN-HS consisted of the third channel in hue saturation lightness (HSL) color space, the second channel in lightness chroma hue (LCH) color space, the first channel in L*a*b* color space, and the first channel in hue saturation intensity (HSI). The results showed that the accuracy of the majority voting method in the best execution and in 500 executions was 98.01% and 97.20%, respectively. Based on different performance evaluation criteria of the classifiers, it was found that the majority voting method had a higher performance.European Union (EU) under Erasmus+ project entitled “Fostering Internationalization in Agricultural Engineering in Iran and Russia” [FARmER] with grant number 585596-EPP-1-2017-1-DE-EPPKA2-CBHE-JPinfo:eu-repo/semantics/publishedVersio
    corecore