Advanced miniature processing hardware for ATR applications

Author: Chao Tien-Hsin
Daud Taher
Thakoor Anikumar
Publication date: 04/03/2003

A Hybrid Optoelectronic Neural Object Recognition System (HONORS) is disclosed, comprising two major building blocks: (1) an advanced grayscale optical correlator (OC) and (2) a massively parallel three-dimensional neural processor. The optical correlator, with its inherent advantages in parallel processing and shift invariance, is used for target of interest (TOI) detection and segmentation. The three-dimensional neural processor, with its robust neural learning capability, is used for target classification and identification. The hybrid optoelectronic neural object recognition system, with its powerful combination of optical processing and neural networks, enables real-time, large-frame automatic target recognition (ATR).
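The detection stage described in this abstract relies on correlation, whose peak moves with the target (shift invariance). The following is a minimal software sketch of that idea only, assuming a plain digital cross-correlation over small grayscale grids; it does not model the optical hardware, and the helper names (`correlate2d`, `locate_peak`, `place`) are hypothetical.

```python
def correlate2d(scene, template):
    """Slide the template over the scene and return the correlation surface."""
    scene_h, scene_w = len(scene), len(scene[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    surface = []
    for r in range(scene_h - tmpl_h + 1):
        row = []
        for c in range(scene_w - tmpl_w + 1):
            # Sum of element-wise products at this offset.
            score = sum(
                scene[r + i][c + j] * template[i][j]
                for i in range(tmpl_h)
                for j in range(tmpl_w)
            )
            row.append(score)
        surface.append(row)
    return surface


def locate_peak(surface):
    """Return the (row, col) offset with the strongest correlation response."""
    _, best_r, best_c = max(
        (value, r, c)
        for r, row in enumerate(surface)
        for c, value in enumerate(row)
    )
    return best_r, best_c


def place(template, height, width, top, left):
    """Build an empty (all-zero) scene containing one copy of the template."""
    scene = [[0] * width for _ in range(height)]
    for i, row in enumerate(template):
        for j, value in enumerate(row):
            scene[top + i][left + j] = value
    return scene


# Usage: the peak follows the target wherever it is placed in the frame.
target = [[1, 2], [3, 4]]
frame = place(target, 6, 6, 3, 1)
peak = locate_peak(correlate2d(frame, target))
```

On a cluttered scene a normalized correlation would usually be preferred, since a bright background region can outscore the true target under this unnormalized measure.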

NASA Technical Reports Server

Toward a perceptual object recognition system

Author: Awad Dounia
Publication venue: 'Universitat Autonoma de Barcelona'
Publication date: 01/01/2015

[1] demonstrated that humans are easily able to recognize an object in less than 0.5 seconds. Unfortunately, object recognition remains one of the most challenging problems in computer vision. Many algorithms based on local approaches have been proposed in recent decades. Local approaches can be divided into four phases: region selection, region appearance description, image representation and classification [2]. Although these systems have demonstrated excellent performance, some weaknesses remain. The first limitation is in the region selection phase. Many existing techniques extract a large number of points/regions of interest. For instance, dense grids contain tens of thousands of points per image while interest point detectors often extract thousands of points. Furthermore, some studies have demonstrated that these techniques were not designed to detect the most pertinent regions for object recognition. There is only a weak correlation between the distribution of extracted points and eye fixations [3]. The second limitation mentioned in the literature concerns the region appearance description phase. The techniques used in this phase typically describe image regions using high-dimensional vectors [4]. For example, SIFT, the most popular descriptor for object recognition, produces a 128-dimensional vector per region [5]. The main objective of this thesis is to propose a pipeline for an object recognition algorithm based on human perception which addresses the object recognition system complexity: query run time and memory allocation. In this context, we propose a filter based on a visual attention system [6] to address the problems of extracting a large number of points of interest using existing region selection techniques. We chose to use bottom-up visual attention systems that encode attentional fixations in a topographic map, known as a saliency map.
This map serves as the basis for generating a mask to select salient points according to human interest, from the points extracted by a region selection technique [7]. Furthermore, we addressed the problem of high dimensionality of descriptors in the region appearance description phase. We proposed a new hybrid descriptor representing the spatial frequency of some perceptual features extracted by a visual attention system (color, texture, intensity) [8]. This descriptor consists of a concatenation of energy measures computed at the output of a filter bank [9], at each level of the multi-resolution pyramid of perceptual features. This descriptor has the advantage of being lower dimensional than traditional descriptors. The test of our filtering approach, using the Perreira da Silva system [10] as a filter on VOC2005, demonstrated that we can maintain approximately the same performance of an object recognition system by selecting only 40% of extracted points (using Harris-Laplace [11] and Laplacian [12]), while obtaining a substantial reduction in complexity (40% reduction in query run time). Furthermore, evaluating our descriptor with an object recognition system using Harris-Laplace and Laplacian interest point detectors on the VOC2007 database showed a slight decrease in performance (5% reduction of average precision) compared to the original system based on the SIFT descriptor, but with a 50% reduction in complexity. In addition, we evaluated our descriptor using a visual attention system as the region selection technique on VOC2005. The experiment showed a slight decrease in performance (3% reduction in precision), but a drastically reduced complexity of the system (with 5% reduction in query run-time and 70% in complexity). In this thesis, we proposed two approaches to manage the problems of complexity in object recognition systems.
In the future, it would be interesting to address the problems of the last two phases of an object recognition system, image representation and classification, by introducing perceptually plausible concepts such as deep learning techniques.
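The filtering step in this abstract (a saliency map turned into a mask that keeps only the most salient detected points) can be sketched as follows. This is an illustration under stated assumptions, not the thesis pipeline: the saliency map is a plain 2D list of scores, points are `(row, col)` tuples, and the threshold is chosen so that roughly `keep_fraction` of the points survive. All function names are hypothetical.

```python
def threshold_for_fraction(points, saliency_map, keep_fraction=0.4):
    """Pick the saliency value above which about keep_fraction of points lie."""
    values = sorted((saliency_map[r][c] for r, c in points), reverse=True)
    n_keep = max(1, int(keep_fraction * len(values)))
    return values[n_keep - 1]


def saliency_mask(saliency_map, threshold):
    """Binary mask: True where the saliency score reaches the threshold."""
    return [[value >= threshold for value in row] for row in saliency_map]


def select_salient_points(points, mask):
    """Keep only the detected points that fall inside the mask."""
    return [(r, c) for r, c in points if mask[r][c]]


# Usage with a toy 3x4 saliency map and five detected points.
saliency = [
    [0.1, 0.9, 0.2, 0.0],
    [0.3, 0.8, 0.7, 0.1],
    [0.0, 0.4, 0.6, 0.5],
]
detected = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 3)]
mask = saliency_mask(saliency, threshold_for_fraction(detected, saliency, 0.4))
kept = select_salient_points(detected, mask)
```

Ties at the threshold can keep slightly more than the requested fraction; a rank-based cut would give an exact count instead.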

Crossref

Directory of Open Access Journals

Revistes Catalanes amb Accés Obert

Electronic Letters on Computer Vision and Image Analysis (ELCVIA - Universitat Autònoma de Barcelona)

Diposit Digital de Documents de la UAB

Machine learning paradigms for modeling spatial and temporal information in multimedia data mining

Author: Amira A
Bouchaffra D
Chen CS
Zhu C
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2010

Multimedia data mining and knowledge discovery is a fast-emerging interdisciplinary applied research area. There is tremendous potential for effective use of multimedia data mining (MDM) through intelligent analysis. Diverse application areas are increasingly relying on multimedia understanding systems. Advances in multimedia understanding are related directly to advances in signal processing, computer vision, machine learning, pattern recognition, multimedia databases, and smart sensors. The main mission of this special issue is to identify state-of-the-art machine learning paradigms that are particularly powerful and effective for modeling and combining temporal and spatial media cues such as audio, visual, and face information and for accomplishing tasks of multimedia data mining and knowledge discovery. These models should be able to bridge the gap between low-level audiovisual features, which require signal processing, and high-level semantics. A number of papers have been submitted to the special issue in the areas of imaging, artificial intelligence, and pattern recognition, and five contributions have been selected covering state-of-the-art algorithms and advanced related topics. The first contribution, by D. Xiang et al., “Evaluation of data quality and drought monitoring capability of FY-3A MERSI data”, describes some basic parameters and major technical indicators of the FY-3A, and evaluates data quality and drought monitoring capability of the Medium-Resolution Imager (MERSI) onboard the FY-3A. The second contribution, by A. Belatreche et al., “Computing with biologically inspired neural oscillators: application to color image segmentation”, investigates the computing capabilities and potential applications of neural oscillators, a biologically inspired neural model, to gray scale and color image segmentation, an important task in image understanding and object recognition.
The major contribution of this paper is the ability to use neural oscillators as a learning scheme for solving real-world engineering problems. The third paper, by A. Dargazany et al., entitled “Multibandwidth Kernel-based object tracking”, explores new methods for object tracking using the mean shift (MS). A bandwidth-handling MS technique is deployed in which the tracker reaches the global mode of the density function without requiring a specific starting point. It has been shown via experiments that the Gradual Multibandwidth Mean Shift tracking algorithm can converge faster than the conventional kernel-based object tracking (known as the mean shift). The fourth contribution, by S. Alzu’bi et al., entitled “3D medical volume segmentation using hybrid multi-resolution statistical approaches”, studies new 3D volume segmentation using multiresolution statistical approaches based on discrete wavelet transform and hidden Markov models. This system commonly reduced the percentage error achieved using the traditional 2D segmentation techniques by several percent. Furthermore, a contribution by G. Cabanes et al. entitled “Unsupervised topographic learning for spatiotemporal data mining” proposes a new unsupervised algorithm, suitable for the analysis of noisy spatiotemporal Radio Frequency Identification (RFID) data. The new unsupervised algorithm described in this article is an efficient data mining tool for behavioral studies based on RFID technology. It has the ability to discover and compare stable patterns in an RFID signal, and is appropriate for continuous learning. Finally, we would like to thank all those who helped to make this special issue possible, especially the authors and the reviewers of the articles. Our thanks also go to the Hindawi staff and the journal manager for bringing about the issue and giving us the opportunity to edit it.
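The mean-shift idea mentioned for the third paper (using several bandwidths so the search does not depend on a good starting point) can be illustrated in one dimension. The sketch below is not the Gradual Multibandwidth Mean Shift algorithm from that paper; it assumes a flat kernel, scalar samples, and a caller-supplied coarse-to-fine bandwidth schedule, and both function names are hypothetical.

```python
def mean_shift_mode(samples, start, bandwidth, tol=1e-9, max_iter=100):
    """Flat-kernel mean shift: move to the mean of the samples within reach."""
    x = start
    for _ in range(max_iter):
        neighbours = [s for s in samples if abs(s - x) <= bandwidth]
        if not neighbours:
            break  # nothing within reach, stay where we are
        shifted = sum(neighbours) / len(neighbours)
        if abs(shifted - x) < tol:
            return shifted  # converged to a mode at this bandwidth
        x = shifted
    return x


def gradual_mean_shift(samples, start, bandwidths):
    """Run mean shift repeatedly, feeding each result into a finer bandwidth."""
    x = start
    for bandwidth in bandwidths:
        x = mean_shift_mode(samples, x, bandwidth)
    return x


# Usage: a small cluster near 0.2 and a denser cluster near 10.
samples = [0.0, 0.2, 0.4, 9.6, 9.8, 10.0, 10.2, 10.4]
local_mode = mean_shift_mode(samples, start=0.0, bandwidth=1.0)
coarse_to_fine = gradual_mean_shift(samples, start=0.0, bandwidths=[12.0, 4.0, 1.0])
```

With a single small bandwidth the search started at 0 stays in the minor cluster, whereas the coarse-to-fine schedule ends in the denser one; the schedule values here are chosen by hand for this toy data.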

Crossref

Directory of Open Access Journals

Brunel University Research Archive

A Personalized and Scalable Machine Learning-Based File Management System

Author: Dhiraj Sati
Veena Bansal
Publication venue: 'University North'
Publication date: 01/01/2022

In this work, we present a hybrid image and document filing system that we have built. When a user wants to store a file in the system, it is processed to generate tags using an appropriate open-source machine learning system. Presently, we use OpenCV and Tesseract OCR for tagging files. OpenCV recognizes objects in the images and Tesseract recognizes text in the image. An image file is processed for object recognition using OpenCV as well as for text/caption recognition using Tesseract, and the results are used for tagging the file. All other files are processed using Tesseract only for generating tags. The user can also enter their own tags. A database system has been built that stores tags and the image path. Every file is stored with its owner identification and is time-stamped. The system has a client-server architecture and can be used for storing and retrieving a large number of files. This is a highly scalable system.
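The storage side described in this abstract (tags, file path, owner identification, timestamp) can be sketched with Python's standard `sqlite3` module. The schema, table names, and function names below are assumptions made for illustration and are not taken from the paper; the tags are passed in as plain strings, standing in for whatever the OpenCV and Tesseract steps would produce.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Create the (hypothetical) schema: one row per file, one row per tag."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "id INTEGER PRIMARY KEY, path TEXT NOT NULL, "
        "owner TEXT NOT NULL, stored_at TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        "file_id INTEGER NOT NULL REFERENCES files(id), "
        "tag TEXT NOT NULL, UNIQUE(file_id, tag))"
    )
    return conn


def store_file(conn, path, owner, auto_tags, user_tags=()):
    """Record a file with its owner, a UTC timestamp, and its merged tags."""
    stored_at = datetime.now(timezone.utc).isoformat()
    cursor = conn.execute(
        "INSERT INTO files (path, owner, stored_at) VALUES (?, ?, ?)",
        (path, owner, stored_at),
    )
    file_id = cursor.lastrowid
    # Normalize tags so that lookups are case-insensitive.
    merged = {t.strip().lower() for t in list(auto_tags) + list(user_tags) if t.strip()}
    conn.executemany(
        "INSERT OR IGNORE INTO tags (file_id, tag) VALUES (?, ?)",
        [(file_id, tag) for tag in sorted(merged)],
    )
    conn.commit()
    return file_id


def find_by_tag(conn, tag, owner):
    """Return the paths of the owner's files that carry the given tag."""
    rows = conn.execute(
        "SELECT f.path FROM files f JOIN tags t ON t.file_id = f.id "
        "WHERE t.tag = ? AND f.owner = ? ORDER BY f.path",
        (tag.strip().lower(), owner),
    )
    return [row[0] for row in rows]


# Usage: machine-generated tags plus one user-entered tag.
store = open_store()
store_file(store, "img/cat.jpg", "alice", ["Cat", "sofa"], ["pets"])
store_file(store, "docs/invoice.pdf", "alice", ["invoice"])
store_file(store, "img/dog.jpg", "bob", ["pets"])
```

Keeping tags in their own table lets one file carry any number of tags and keeps tag lookups to a single indexed join.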

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

A Novel Hybrid CNN-AIS Visual Pattern Recognition Engine

Author: A Borji
A Krizhevsky
E Hart
F Lauer
G Dudek
LN Castro De
N Pinto
S Knerr
T Serre
Y LeCun
Y Shao
Publication date: 11/03/2015

Machine learning methods are used today for most recognition problems. Convolutional Neural Networks (CNN) have time and again proved successful for many image processing tasks, primarily because of their architecture. In this paper we propose to apply CNN to small data sets, such as personal albums or similar settings where the size of the training dataset is a limitation, within the framework of a proposed hybrid CNN-AIS model. We use Artificial Immune System (AIS) principles to enhance the small training data set. A layer of Clonal Selection is added to the local filtering and max pooling of the CNN architecture. The proposed architecture is evaluated using the standard MNIST dataset by limiting the data size, and also with a small personal data sample belonging to two different classes. Experimental results show that the proposed hybrid CNN-AIS based recognition engine works well when the size of training data is limited.

arXiv.org e-Print Archive

Crossref

A Self-Organizing Neural System for Learning to Recognize Textured Scenes

Author: Grossberg Stephen
Williamson James
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/01/1997

A self-organizing ARTEX model is developed to categorize and classify textured image regions. ARTEX specializes the FACADE model of how the visual cortex sees, and the ART model of how temporal and prefrontal cortices interact with the hippocampal system to learn visual recognition categories and their names. FACADE processing generates a vector of boundary and surface properties, notably texture and brightness properties, by utilizing multi-scale filtering, competition, and diffusive filling-in. Its context-sensitive local measures of textured scenes can be used to recognize scenic properties that gradually change across space, as well as abrupt texture boundaries. ART incrementally learns recognition categories that classify FACADE output vectors, class names of these categories, and their probabilities. Top-down expectations within ART encode learned prototypes that pay attention to expected visual features. When novel visual information creates a poor match with the best existing category prototype, a memory search selects a new category with which to classify the novel data. ARTEX is compared with psychophysical data, and is benchmarked on classification of natural textures and synthetic aperture radar images. It outperforms state-of-the-art systems that use rule-based, backpropagation, and K-nearest neighbor classifiers. Defense Advanced Research Projects Agency; Office of Naval Research (N00014-95-1-0409, N00014-95-1-0657)
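The match-and-search behaviour described in this abstract (accept the best category if its prototype matches well enough, otherwise search on and finally open a new category) can be sketched with a simplified fuzzy-ART-style learner. This is not ARTEX and omits FACADE preprocessing, class names, and probabilities; the class name, the fast-learning update, and the parameter defaults are assumptions for illustration. Inputs are assumed to be non-zero vectors with components in [0, 1].

```python
def fuzzy_and(a, b):
    """Component-wise minimum, the fuzzy intersection of two vectors."""
    return [min(x, y) for x, y in zip(a, b)]


def norm1(v):
    """City-block norm for non-negative vectors."""
    return sum(v)


class FuzzyArtSketch:
    """Simplified fuzzy-ART-style category learner (illustrative only)."""

    def __init__(self, vigilance=0.75, alpha=0.001):
        self.vigilance = vigilance  # minimum acceptable match with a prototype
        self.alpha = alpha          # small constant in the choice function
        self.prototypes = []

    def _choice(self, pattern, prototype):
        return norm1(fuzzy_and(pattern, prototype)) / (self.alpha + norm1(prototype))

    def learn(self, pattern):
        """Return the index of the category that ends up coding the pattern."""
        # Memory search: try categories from strongest to weakest choice value.
        order = sorted(
            range(len(self.prototypes)),
            key=lambda j: self._choice(pattern, self.prototypes[j]),
            reverse=True,
        )
        for j in order:
            overlap = fuzzy_and(pattern, self.prototypes[j])
            if norm1(overlap) / norm1(pattern) >= self.vigilance:
                self.prototypes[j] = overlap  # fast learning: shrink prototype
                return j
        # No prototype matched well enough: commit a new category.
        self.prototypes.append(list(pattern))
        return len(self.prototypes) - 1


# Usage: two similar patterns share a category, a dissimilar one gets its own.
art = FuzzyArtSketch(vigilance=0.75)
labels = [art.learn(p) for p in ([1, 1, 0, 0], [1, 0.9, 0, 0], [0, 0, 1, 1])]
```

Raising the vigilance parameter makes the learner create more, narrower categories; lowering it merges more inputs into each prototype.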

Boston University Institutional Repository (OpenBU)