Automatic target recognition in sonar imagery using a cascade of boosted classifiers

Author: Sawas Jamil
Publication venue: Engineering and Physical Sciences
Publication date: 01/05/2015

This thesis is concerned with the problem of automating the interpretation of data representing the underwater environment retrieved from sensors. This is an important task which potentially allows underwater robots to become completely autonomous, keeping humans out of harm’s way and reducing the operational time and cost of many underwater applications. Typical applications include unexploded ordnance clearance, ship/plane wreck hunting (e.g. Malaysia Airlines flight MH370), and oilfield inspection (e.g. Deepwater Horizon disaster). Two attributes of the processing are crucial if automated interpretation is to be successful. First, computational efficiency is required to allow real-time analysis to be performed on board robots with limited resources. Second, detection accuracy comparable to that of human experts is required in order to replace them. Approaches in the open literature do not appear capable of meeting these requirements, so meeting them is the objective of this thesis. This thesis proposes a novel approach capable of recognizing targets in sonar data extremely rapidly with a low number of false alarms. The approach was originally developed for face detection in video, and it is applied to sonar data here for the first time. Aside from the application, the main contribution of this thesis is therefore the way this approach is extended to reduce its training time and improve its detection accuracy. Results obtained on large sets of real sonar data on a variety of challenging terrains are presented to show the discriminative power of the proposed approach. In real field trials, the proposed approach was capable of processing sonar data in real time on board underwater robots. In direct comparison with human experts, the proposed approach offers a 40% reduction in the number of false alarms.
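
The title names a cascade of boosted classifiers, and the general idea of a cascade can be sketched as follows: cheap stages run first and reject most windows early, so only promising windows reach the later stages. This is a minimal illustration, not the thesis's implementation. The stage functions (`mean_intensity`, `contrast`), the thresholds, and the flat-list window representation are all hypothetical; real stages would be boosted ensembles trained on sonar data.

```python
from typing import Callable, List, Sequence, Tuple

# A stage pairs a scoring function with a rejection threshold.
# Both are hypothetical placeholders for trained boosted classifiers.
Stage = Tuple[Callable[[Sequence[float]], float], float]


def mean_intensity(window: Sequence[float]) -> float:
    """Hypothetical cheap first-stage score: average pixel intensity."""
    return sum(window) / len(window)


def contrast(window: Sequence[float]) -> float:
    """Hypothetical second-stage score: intensity range within the window."""
    return max(window) - min(window)


# Assumed thresholds, chosen only for illustration.
STAGES: List[Stage] = [(mean_intensity, 0.2), (contrast, 0.5)]


def cascade_accepts(window: Sequence[float], stages: List[Stage]) -> bool:
    """Return True only if the window passes every stage in order."""
    for score, threshold in stages:
        if score(window) < threshold:
            return False  # early rejection: later stages are never evaluated
    return True


def detect(windows: List[Sequence[float]], stages: List[Stage] = STAGES) -> List[int]:
    """Return the indices of windows that survive the whole cascade."""
    return [i for i, w in enumerate(windows) if cascade_accepts(w, stages)]
```

Because most windows in a scene contain no target, early rejection is what keeps the average cost per window low, which is the property that matters for on-board processing.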

Forest structure from terrestrial laser scanning – in support of remote sensing calibration/validation and operational inventory

Author: Kelbe David
Publication venue: RIT Scholar Works
Publication date: 16/07/2015

Forests are an important part of the natural ecosystem, providing resources such as timber and fuel, performing services such as energy exchange and carbon storage, and presenting risks, such as fire damage and invasive species impacts. Improved characterization of forest structural attributes is desirable, as it could improve our understanding and management of these natural resources. However, the traditional, systematic collection of forest information, dubbed “forest inventory”, is time-consuming, expensive, and coarse when compared to novel 3-D measurement technologies. Remote sensing estimates, on the other hand, provide synoptic coverage, but often fail to capture the fine-scale structural variation of the forest environment. Terrestrial laser scanning (TLS) has demonstrated a potential to address these limitations, but its operational use has remained limited because its performance characteristics are unsatisfactory relative to the budgetary constraints of many end users. To address this gap, my dissertation advanced affordable mobile laser scanning capabilities for operational forest structure assessment. We developed geometric reconstruction of forest structure from rapid-scan, low-resolution point cloud data, providing for automatic extraction of standard forest inventory metrics. To augment these results over larger areas, we designed a view-invariant feature descriptor to enable marker-free registration of TLS data pairs, without knowledge of the initial sensor pose. Finally, a graph-theory framework was integrated to perform multi-view registration between a network of disconnected scans, which provided improved assessment of forest inventory variables.
This work addresses a major limitation related to the inability of TLS to assess forest structure at an operational scale, and may facilitate improved understanding of the phenomenology of airborne sensing systems, by providing fine-scale reference data with which to interpret the active or passive electromagnetic radiation interactions with forest structure. Outputs are being utilized to provide antecedent science data for NASA’s HyspIRI mission and to support the National Ecological Observatory Network’s (NEON) long-term environmental monitoring initiatives.

Symbolic and Visual Retrieval of Mathematical Notation using Formula Graph Symbol Pair Matching and Structural Alignment

Author: Davila Castellanos Kenny
Publication venue: RIT Scholar Works
Publication date: 01/07/2017

Large data collections containing millions of math formulae in different formats are available on-line. Retrieving math expressions from these collections is challenging. We propose a framework for retrieval of mathematical notation using symbol pairs extracted from visual and semantic representations of mathematical expressions on the symbolic domain for retrieval of text documents. We further adapt our model for retrieval of mathematical notation in images and lecture videos. Graph-based representations are used in each modality to describe math formulas. For symbolic formula retrieval, where the structure is known, we use symbol layout trees and operator trees. For image-based formula retrieval, since the structure is unknown, we use a more general Line of Sight graph representation. Paths in these graphs define symbol-pair tuples that are used as the entries for our inverted index of mathematical notation. Our retrieval framework uses a three-stage approach: a fast selection of candidates in the first stage, a more detailed matching algorithm with similarity metric computation in the second stage, and finally, when relevance assessments are available, an optional third stage that uses linear regression to estimate relevance from multiple similarity scores for final re-ranking. Our model has been evaluated using large collections of documents, and preliminary results are presented for videos and cross-modal search. The proposed framework can be adapted for other domains like chemistry or technical diagrams where two visually similar elements from a collection are usually related to each other.
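
The inverted index and the fast first stage described above can be illustrated with a small sketch. It assumes that symbol-pair tuples have already been extracted from each formula graph, and it represents each tuple as a hypothetical `(symbol, symbol, relation)` triple. The function names and the ranking by number of shared pairs are assumptions made for illustration, not the author's implementation; the detailed matching and re-ranking stages are not shown.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical tuple layout: (first symbol, second symbol, spatial relation).
SymbolPair = Tuple[str, str, str]


def build_index(formulas: Dict[str, Iterable[SymbolPair]]) -> Dict[SymbolPair, Set[str]]:
    """Map each symbol pair to the set of formula ids that contain it."""
    index: Dict[SymbolPair, Set[str]] = defaultdict(set)
    for formula_id, pairs in formulas.items():
        for pair in pairs:
            index[pair].add(formula_id)
    return index


def select_candidates(
    index: Dict[SymbolPair, Set[str]],
    query_pairs: Iterable[SymbolPair],
    top_k: int = 10,
) -> List[Tuple[str, int]]:
    """First-stage selection: rank formulas by how many query pairs they share."""
    hits: Counter = Counter()
    for pair in set(query_pairs):
        for formula_id in index.get(pair, ()):
            hits[formula_id] += 1
    return hits.most_common(top_k)


# Toy collection with made-up formulas and relations.
FORMULAS: Dict[str, List[SymbolPair]] = {
    "f1": [("x", "2", "above-right"), ("x", "+", "right")],
    "f2": [("x", "+", "right"), ("+", "y", "right")],
    "f3": [("a", "b", "right")],
}
INDEX = build_index(FORMULAS)
QUERY: List[SymbolPair] = [("x", "2", "above-right"), ("x", "+", "right")]
CANDIDATES = select_candidates(INDEX, QUERY)
```

Only formulas that share at least one pair with the query are touched, which is what makes this stage cheap enough to run before a more expensive structural comparison.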

3rd SC@RUG 2006 proceedings: Student Colloquium 2005-2006

Publication venue: Rijksuniversiteit Groningen. Universiteitsbibliotheek
Publication date: 01/01/2006

Dissertations of the University of Groningen

Query-Driven Global Graph Attention Model for Visual Parsing: Recognizing Handwritten and Typeset Math Formulas

Author: Mahdavi Mahshad
Publication venue: RIT Scholar Works
Publication date: 07/08/2020

We present a new visual parsing method based on standard Convolutional Neural Networks (CNNs) for handwritten and typeset mathematical formulas. The Query-Driven Global Graph Attention (QD-GGA) parser employs multi-task learning, using a single feature representation for locating, classifying, and relating symbols. QD-GGA parses formulas by first constructing a Line-Of-Sight (LOS) graph over the input primitives (e.g., handwritten strokes or connected components in images). Second, class distributions for LOS nodes and edges are obtained using query-specific feature filters (i.e., attention) in a single feed-forward pass. This allows end-to-end structure learning using a joint loss over primitive node and edge class distributions. Finally, a Maximum Spanning Tree (MST) is extracted from the weighted graph using Edmonds' Arborescence Algorithm. The model may be run recurrently over the input graph, updating attention to focus on symbols detected in the previous iteration. QD-GGA does not require additional grammar rules; the language model is learned from the sets of symbols and relationships in the training set and the statistics over them. We benchmark our system against both handwritten and typeset state-of-the-art math recognition systems. Our preliminary results show that this is a promising new approach for visual parsing of math formulas. Using recurrent execution, symbol detection is near perfect for both handwritten and typeset formulas: we obtain a symbol f-measure of over 99.4% for both the CROHME (handwritten) and INFTYMCCDB-2 (typeset formula image) datasets. Our method is also much faster in both training and execution than state-of-the-art RNN-based formula parsers. The unlabeled structure detection of QD-GGA is competitive with encoder-decoder models, but QD-GGA symbol and relationship classification is weaker. We believe this may be addressed through increased use of spatial features and global context.