458,461 research outputs found

    Development of Collaborative Research Initiatives to Advance the Aerospace Sciences-via the Communications, Electronics, Information Systems Focus Group

    Get PDF
    The primary goal of the Adaptive Vision Laboratory Research project was to develop advanced computer vision systems for automatic target recognition. The approach used in this effort combined several machine learning paradigms including evolutionary learning algorithms, neural networks, and adaptive clustering techniques to develop the E-MOR.PH system. This system is capable of generating pattern recognition systems to solve a wide variety of complex recognition tasks. A series of simulation experiments were conducted using E-MORPH to solve problems in OCR, military target recognition, industrial inspection, and medical image analysis. The bulk of the funds provided through this grant were used to purchase computer hardware and software to support these computationally intensive simulations. The payoff from this effort is the reduced need for human involvement in the design and implementation of recognition systems. We have shown that the techniques used in E-MORPH are generic and readily transition to other problem domains. Specifically, E-MORPH is multi-phase evolutionary leaming system that evolves cooperative sets of features detectors and combines their response using an adaptive classifier to form a complete pattern recognition system. The system can operate on binary or grayscale images. In our most recent experiments, we used multi-resolution images that are formed by applying a Gabor wavelet transform to a set of grayscale input images. To begin the leaming process, candidate chips are extracted from the multi-resolution images to form a training set and a test set. A population of detector sets is randomly initialized to start the evolutionary process. Using a combination of evolutionary programming and genetic algorithms, the feature detectors are enhanced to solve a recognition problem. The design of E-MORPH and recognition results for a complex problem in medical image analysis are described at the end of this report. The specific task involves the identification of vertebrae in x-ray images of human spinal columns. This problem is extremely challenging because the individual vertebra exhibit variation in shape, scale, orientation, and contrast. E-MORPH generated several accurate recognition systems to solve this task. This dual use of this ATR technology clearly demonstrates the flexibility and power of our approach

    Vehicle-Rear: A New Dataset to Explore Feature Fusion for Vehicle Identification Using Convolutional Neural Networks

    Full text link
    This work addresses the problem of vehicle identification through non-overlapping cameras. As our main contribution, we introduce a novel dataset for vehicle identification, called Vehicle-Rear, that contains more than three hours of high-resolution videos, with accurate information about the make, model, color and year of nearly 3,000 vehicles, in addition to the position and identification of their license plates. To explore our dataset we design a two-stream CNN that simultaneously uses two of the most distinctive and persistent features available: the vehicle's appearance and its license plate. This is an attempt to tackle a major problem: false alarms caused by vehicles with similar designs or by very close license plate identifiers. In the first network stream, shape similarities are identified by a Siamese CNN that uses a pair of low-resolution vehicle patches recorded by two different cameras. In the second stream, we use a CNN for OCR to extract textual information, confidence scores, and string similarities from a pair of high-resolution license plate patches. Then, features from both streams are merged by a sequence of fully connected layers for decision. In our experiments, we compared the two-stream network against several well-known CNN architectures using single or multiple vehicle features. The architectures, trained models, and dataset are publicly available at https://github.com/icarofua/vehicle-rear

    STV-based Video Feature Processing for Action Recognition

    Get PDF
    In comparison to still image-based processes, video features can provide rich and intuitive information about dynamic events occurred over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progresses have been made in the last decade on image processing and seen its successful applications in face matching and object recognition, video-based event detection still remains one of the most difficult challenges in computer vision research due to its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and the often ambiguous analytical methods. In this paper, a Spatio-Temporal Volume (STV) and region intersection (RI) based 3D shape-matching method has been proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stemmed from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. This paper also reported the investigation into techniques for efficient STV data filtering to reduce the amount of voxels (volumetric-pixels) that need to be processed in each operational cycle in the implemented system. The encouraging features and improvements on the operational performance registered in the experiments have been discussed at the end

    Tactile Mapping and Localization from High-Resolution Tactile Imprints

    Full text link
    This work studies the problem of shape reconstruction and object localization using a vision-based tactile sensor, GelSlim. The main contributions are the recovery of local shapes from contact, an approach to reconstruct the tactile shape of objects from tactile imprints, and an accurate method for object localization of previously reconstructed objects. The algorithms can be applied to a large variety of 3D objects and provide accurate tactile feedback for in-hand manipulation. Results show that by exploiting the dense tactile information we can reconstruct the shape of objects with high accuracy and do on-line object identification and localization, opening the door to reactive manipulation guided by tactile sensing. We provide videos and supplemental information in the project's website http://web.mit.edu/mcube/research/tactile_localization.html.Comment: ICRA 2019, 7 pages, 7 figures. Website: http://web.mit.edu/mcube/research/tactile_localization.html Video: https://youtu.be/uMkspjmDbq

    Pigment Melanin: Pattern for Iris Recognition

    Full text link
    Recognition of iris based on Visible Light (VL) imaging is a difficult problem because of the light reflection from the cornea. Nonetheless, pigment melanin provides a rich feature source in VL, unavailable in Near-Infrared (NIR) imaging. This is due to biological spectroscopy of eumelanin, a chemical not stimulated in NIR. In this case, a plausible solution to observe such patterns may be provided by an adaptive procedure using a variational technique on the image histogram. To describe the patterns, a shape analysis method is used to derive feature-code for each subject. An important question is how much the melanin patterns, extracted from VL, are independent of iris texture in NIR. With this question in mind, the present investigation proposes fusion of features extracted from NIR and VL to boost the recognition performance. We have collected our own database (UTIRIS) consisting of both NIR and VL images of 158 eyes of 79 individuals. This investigation demonstrates that the proposed algorithm is highly sensitive to the patterns of cromophores and improves the iris recognition rate.Comment: To be Published on Special Issue on Biometrics, IEEE Transaction on Instruments and Measurements, Volume 59, Issue number 4, April 201

    Wireless Interference Identification with Convolutional Neural Networks

    Full text link
    The steadily growing use of license-free frequency bands requires reliable coexistence management for deterministic medium utilization. For interference mitigation, proper wireless interference identification (WII) is essential. In this work we propose the first WII approach based upon deep convolutional neural networks (CNNs). The CNN naively learns its features through self-optimization during an extensive data-driven GPU-based training process. We propose a CNN example which is based upon sensing snapshots with a limited duration of 12.8 {\mu}s and an acquisition bandwidth of 10 MHz. The CNN differs between 15 classes. They represent packet transmissions of IEEE 802.11 b/g, IEEE 802.15.4 and IEEE 802.15.1 with overlapping frequency channels within the 2.4 GHz ISM band. We show that the CNN outperforms state-of-the-art WII approaches and has a classification accuracy greater than 95% for signal-to-noise ratio of at least -5 dB

    Towards Automatic Speech Identification from Vocal Tract Shape Dynamics in Real-time MRI

    Full text link
    Vocal tract configurations play a vital role in generating distinguishable speech sounds, by modulating the airflow and creating different resonant cavities in speech production. They contain abundant information that can be utilized to better understand the underlying speech production mechanism. As a step towards automatic mapping of vocal tract shape geometry to acoustics, this paper employs effective video action recognition techniques, like Long-term Recurrent Convolutional Networks (LRCN) models, to identify different vowel-consonant-vowel (VCV) sequences from dynamic shaping of the vocal tract. Such a model typically combines a CNN based deep hierarchical visual feature extractor with Recurrent Networks, that ideally makes the network spatio-temporally deep enough to learn the sequential dynamics of a short video clip for video classification tasks. We use a database consisting of 2D real-time MRI of vocal tract shaping during VCV utterances by 17 speakers. The comparative performances of this class of algorithms under various parameter settings and for various classification tasks are discussed. Interestingly, the results show a marked difference in the model performance in the context of speech classification with respect to generic sequence or video classification tasks.Comment: To appear in the INTERSPEECH 2018 Proceeding

    Digital Image Access & Retrieval

    Get PDF
    The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio
    • …
    corecore