
    Spherical similarity explorer for comparative case analysis

    Comparative Case Analysis (CCA) is an important tool for criminal investigation and crime theory extraction. It analyzes the commonalities and differences between a collection of crime reports in order to understand crime patterns and identify abnormal cases. A major challenge of CCA is data processing and exploration: traditional manual approaches can no longer cope with the increasing volume and complexity of the data. In this paper we introduce a novel visual analytics system, Spherical Similarity Explorer (SSE), that automates the data processing and provides interactive visualizations to support data exploration. We illustrate the use of the system with use cases involving real-world application data and evaluate the system with criminal intelligence analysts.

    Advances in dissimilarity-based data visualisation

    Gisbrecht A. Advances in dissimilarity-based data visualisation. Bielefeld: Universitätsbibliothek Bielefeld; 2015.

    Automated Impact Crater Detection and Characterization Using Digital Elevation Data

    Impact craters are used as subjects for the remote study of a wide variety of surface and subsurface processes throughout the solar system. Their populations and shape characteristics are collected, often manually, and analysed by a large community of planetary scientists. This research investigates the application of automated methods for both the detection and characterization of impact craters on the Moon and Mars, using machine learning techniques and digital elevation data collected by orbital spacecraft. We begin by assessing the effect of lunar terrain type variation on automated crater detection results. Next, we develop a novel automated crater degradation classification system for Martian complex craters using polynomial profile approximation. This work identifies that surface age estimates and crater statistics acquired through automatic crater detection are influenced by terrain type, with unique detection error responses. Additionally, we demonstrate an objective system that can be used to automate the classification of crater degradation states, and identify some potential areas of improvement for such a system.
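The polynomial profile approximation behind such a degradation classifier can be sketched in a few lines: fit a low-order polynomial to a crater's radial elevation profile and use the coefficients as shape features. This is an illustrative reconstruction only; the degree-3 fit, the radius normalisation, and every name below are assumptions, not the thesis's actual parameterisation.

```python
import numpy as np

def profile_features(radii, elevations, degree=3):
    # Fit a low-order polynomial to a radial elevation profile and
    # return its coefficients (highest power first, as np.polyfit does)
    # as shape features. Radius is normalised to [0, 1] so coefficients
    # are comparable across craters of different diameters.
    r = (radii - radii.min()) / (radii.max() - radii.min())
    return np.polyfit(r, elevations, degree)

# Synthetic profiles: a fresh, bowl-shaped crater versus a degraded,
# infilled one with the same rim radius.
r = np.linspace(0.0, 1.0, 50)
fresh = r**2 - 1.0             # deep bowl: depth 1 at the centre
degraded = 0.2 * (r**2 - 1.0)  # shallow bowl: depth 0.2

f_fresh = profile_features(r, fresh)
f_degraded = profile_features(r, degraded)
# The quadratic coefficient tracks bowl depth, so it separates
# the two degradation states.
```

In a full system one would extract many such profiles per crater and feed the coefficient vectors to a classifier; the least-squares fit here stands in for whatever approximation scheme the thesis actually uses.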

    Manifold Learning Approaches to Compressing Latent Spaces of Unsupervised Feature Hierarchies

    Field robots encounter dynamic unstructured environments containing a vast array of unique objects. In order to make sense of the world in which they are placed, they collect large quantities of unlabelled data with a variety of sensors. Producing robust and reliable applications depends entirely on the ability of the robot to understand the unlabelled data it obtains. Deep Learning techniques have had a high level of success in learning powerful unsupervised representations for a variety of discriminative and generative models. Applying these techniques to problems encountered in field robotics remains a challenging endeavour. Modern Deep Learning methods are typically trained with a substantial labelled dataset, while datasets produced in a field robotics context contain limited labelled training data. The primary motivation for this thesis stems from the problem of applying large scale Deep Learning models to field robotics datasets that are label poor. While the lack of labelled ground truth data drives the desire for unsupervised methods, the need for improving the model scaling is driven by two factors: performance and computational requirements. When utilising unsupervised layer outputs as representations for classification, the classification performance increases with layer size. Scaling up models with multiple large layers of features is problematic, as the size of each subsequent hidden layer scales with the size of the previous layer. This quadratic scaling, and the associated time required to train such networks, has prevented adoption of large Deep Learning models beyond cluster computing. The contributions in this thesis are developed from the observation that parameters or filter elements learnt in Deep Learning systems are typically highly structured, and contain related elements. Firstly, the structure of unsupervised filters is utilised to construct a mapping from the high dimensional filter space to a low dimensional manifold.
This creates a significantly smaller representation for subsequent feature learning. This mapping, and its effect on the resulting encodings, highlights the need for the ability to learn highly overcomplete sets of convolutional features. Driven by this need, the unsupervised pretraining of Deep Convolutional Networks is developed to include a number of modern training and regularisation methods. These pretrained models are then used to provide initialisations for supervised convolutional models trained on low quantities of labelled data. By utilising pretraining, a significant increase in classification performance on a number of publicly available datasets is achieved. In order to apply these techniques to outdoor 3D Laser Illuminated Detection And Ranging data, we develop a set of resampling techniques to provide uniform input to Deep Learning models. The features learnt in these systems outperform the high effort hand engineered features developed specifically for 3D data. The representation of a given signal is then reinterpreted as a combination of modes that exist on the learnt low dimensional filter manifold. From this, we develop an encoding technique that allows the high dimensional layer output to be represented as a combination of low dimensional components. This allows the growth of subsequent layers to depend only on the intrinsic dimensionality of the filter manifold and not the number of elements contained in the previous layer. Finally, the resulting unsupervised convolutional model, the encoding frameworks and the embedding methodology are used to produce a new unsupervised learning strategy that is able to encode images in terms of overcomplete filter spaces, without producing an explosion in the size of the intermediate parameter spaces.
This model produces classification results on par with state of the art models, yet requires significantly fewer computational resources and is suitable for use in the constrained computation environment of a field robot.
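The filter-manifold encoding described above can be made concrete with a minimal linear sketch: treat each learnt filter as a point in filter space, recover a low-dimensional subspace with PCA, and represent every filter by a handful of coordinates on that subspace. PCA is a stand-in here; the thesis learns a nonlinear manifold, and the sizes below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# 256 synthetic "filters" of dimension 121 (11x11), constructed to lie
# near a 5-dimensional subspace plus a little noise, mimicking the
# observation that learnt filters are highly structured and related.
basis = rng.normal(size=(5, 121))
weights = rng.normal(size=(256, 5))
filters = weights @ basis + 0.01 * rng.normal(size=(256, 121))

# PCA via SVD of the centred filter matrix.
mean = filters.mean(axis=0)
_, _, vt = np.linalg.svd(filters - mean, full_matrices=False)
k = 5
components = vt[:k]                      # basis of the filter "manifold"
codes = (filters - mean) @ components.T  # 5 coordinates per filter

# Reconstruction shows almost nothing is lost, so a subsequent layer
# can consume the 5-number codes instead of the 121-number filters.
reconstructed = codes @ components + mean
max_err = np.abs(reconstructed - filters).max()
```

The payoff is exactly the scaling argument in the text: downstream growth depends on the intrinsic dimensionality k, not on the raw size of the previous layer.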

    Data exploration with learning metrics

    A crucial problem in exploratory analysis of data is that it is difficult for computational methods to focus on interesting aspects of data. Traditional methods of unsupervised learning cannot differentiate between interesting and noninteresting variation, and hence may model, visualize, or cluster parts of data that are not interesting to the analyst. This wastes the computational power of the methods and may mislead the analyst. In this thesis, a principle called "learning metrics" is used to develop visualization and clustering methods that automatically focus on the interesting aspects, based on auxiliary labels supplied with the data samples. The principle yields non-Euclidean (Riemannian) metrics that are data-driven, widely applicable, versatile, invariant to many transformations, and in part invariant to noise. Learning metric methods are introduced for five tasks: nonlinear visualization by Self-Organizing Maps and Multidimensional Scaling, linear projection, and clustering of discrete data and multinomial distributions. The resulting methods either explicitly estimate distances in the Riemannian metric, or optimize a tailored cost function which is implicitly related to such a metric. The methods have rigorous theoretical relationships to information geometry and probabilistic modeling, and are empirically shown to yield good practical results in exploratory and information retrieval tasks.
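The local form of such a metric can be stated concretely. In the learning-metrics literature the Riemannian metric is generated by the Fisher information of the conditional distribution p(c | x) of the auxiliary labels c given the data x (the notation here is mine, but the form is standard):

```latex
d_L^2(x,\, x + \mathrm{d}x) = \mathrm{d}x^{\top} J(x)\, \mathrm{d}x,
\qquad
J(x) = \mathbb{E}_{p(c \mid x)}\!\left[
  \nabla_x \log p(c \mid x)\,
  \big(\nabla_x \log p(c \mid x)\big)^{\!\top}
\right]
```

Directions along which p(c | x) does not change contribute nothing to the distance, which is precisely how the metric discounts variation that is uninteresting with respect to the labels.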

    The use of airborne laser scanning to develop a pixel-based stratification for a verified carbon offset project

    Background: The voluntary carbon market is a new and growing market that is increasingly important to consider in managing forestland. Monitoring, reporting, and verifying carbon stocks and fluxes at a project level is the single largest direct cost of a forest carbon offset project. There are now many methods for estimating forest stocks with high accuracy that use both Airborne Laser Scanning (ALS) and high-resolution optical remote sensing data. However, many of these methods are not appropriate for use under existing carbon offset standards and most have not been field tested.
    Results: This paper presents a pixel-based forest stratification method that uses both ALS and optical remote sensing data to optimally partition the variability across an ~10,000 ha forest ownership in Mendocino County, CA, USA. This new stratification approach improved the accuracy of the forest inventory, reduced the cost of field-based inventory, and provides a powerful tool for future management planning. This approach also details a method of determining the optimum pixel size to best partition a forest.
    Conclusions: The use of ALS and optical remote sensing data can help reduce the cost of field inventory and can help to locate areas that need the most intensive inventory effort. This pixel-based stratification method may provide a cost-effective approach to reducing inventory costs over larger areas when the remote sensing data acquisition costs can be kept low on a per-acre basis.
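One simple way to realise a pixel-based stratification of this kind is to cluster per-pixel ALS metrics into strata; the sketch below does so with a small k-means on synthetic canopy height and cover values. This is an illustrative stand-in, not the paper's method: the metrics, the number of strata, and the clustering routine are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic per-pixel ALS metrics for 1000 pixels: a young, open stand
# (low canopy, moderate cover) followed by a mature, closed stand.
height = np.concatenate([rng.normal(8, 1, 500), rng.normal(25, 2, 500)])
cover = np.concatenate([rng.normal(0.4, 0.05, 500), rng.normal(0.9, 0.02, 500)])
# Scale each metric so neither dominates the distance.
X = np.column_stack([height / height.std(), cover / cover.std()])

def kmeans(X, k, iters=20, seed=0):
    # Minimal Lloyd's algorithm; an empty cluster keeps its old centre.
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        centres = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
    return labels

# Two strata: field plots are then allocated per stratum, which is what
# reduces inventory variance relative to simple random sampling.
strata = kmeans(X, k=2)
```

Choosing the pixel size would sit upstream of this step: the paper searches for the resolution at which such strata best partition the stand-level variability.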

    Mapping plant diversity and composition across North Carolina Piedmont forest landscapes using LiDAR-hyperspectral remote sensing

    Forest modification, from local stress to global change, has given rise to efforts to model, map, and monitor critical properties of forest communities like structure, composition, and diversity. Predictive models based on data from spatially-nested field plots and LiDAR-hyperspectral remote sensing systems are one particularly effective means towards the otherwise prohibitively resource-intensive task of consistently characterizing forest community dynamics at landscape scales. However, to date, most predictive models fail to account for actual (rather than idealized) species and community distributions, are unsuccessful in predicting understory components in structurally and taxonomically heterogeneous forests, and may suffer from diminished predictive accuracy due to incongruity in scale and precision between field plot samples, remotely-sensed data, and target biota of varying size and density. This three-part study addresses these and other concerns in the modeling and mapping of emergent properties of forest communities by shifting the scope of prediction from the individual or taxon to the whole stand or community. It is, after all, at the stand scale where emergent properties like functional processes, biodiversity, and habitat aggregate and manifest. In the first study, I explore the relationship between forest structure (a proxy for successional demographics and resource competition) and tree species diversity in the North Carolina Piedmont, highlighting the empirical basis and potential for utilizing forest structure from LiDAR in predictive models of tree species diversity. I then extend these conclusions to map landscape pattern in multi-scale vascular plant diversity as well as turnover in community-continua at varying compositional resolutions in a North Carolina Piedmont landscape using remotely-sensed LiDAR-hyperspectral estimates of topography, canopy structure, and foliar biochemistry. 
Recognizing that the distinction between correlation and causation mirrors that between knowledge and understanding, all three studies distinguish between prediction of pattern and inference of process. Thus, in addition to advancing mapping methodologies relevant to a range of forest ecosystem management and monitoring applications, all three studies are noteworthy for assessing the ecological relationship between environmental predictors and emergent landscape patterns in plant composition and diversity in North Carolina Piedmont forests.