3,747 research outputs found

    Automated Classification of Airborne Laser Scanning Point Clouds

    Making sense of the physical world has always been at the core of mapping. Until recently, this has always depended on the human eye. Using airborne lasers, it has become possible to quickly "see" more of the world in many more dimensions. The resulting enormous point clouds serve as data sources for applications far beyond the original mapping purposes, ranging from flood protection and forestry to threat mitigation. Processing these large quantities of data requires novel methods. In this contribution, we develop models to automatically classify ground cover and soil types. Using the logic of machine learning, we critically review the advantages of supervised and unsupervised methods. Focusing on decision trees, we improve accuracy by including beam vector components and using a genetic algorithm. We find that our approach delivers consistently high-quality classifications, surpassing classical methods.

    Predicting Car Availability in Free Floating Car Sharing Systems: Leveraging Machine Learning in Challenging Contexts

    Free-Floating Car Sharing (FFCS) services are currently available in tens of cities and countries spread all over the world. Depending on citizens' habits, service policies, and road conditions, car usage profiles are rather variable and often hardly predictable. Even within the same city, different usage trends emerge in different districts and in various time slots and weekdays. Therefore, modeling car availability in FFCS systems is particularly challenging. For these reasons, the research community has started to investigate the applicability of Machine Learning models to analyze FFCS usage data. This paper addresses the problem of predicting the level of availability of the FFCS service in the short term. Specifically, it investigates the applicability of Machine Learning models to forecast the number of available cars within a restricted urban area. It seeks the spatial and temporal contexts in which nonlinear ML models, trained on past usage data, are necessary to accurately predict car availability. Leveraging ML has proven to be particularly effective in highly dynamic urban contexts, where FFCS service usage is likely to change suddenly and unexpectedly. To tailor predictive models to the real FFCS data, we also study the influence of the ML algorithm, the prediction horizon, and the characteristics of the neighborhood of the target area. The empirical outcomes allow us to provide system managers with practical guidelines to set up and tune ML models. Daraio, Elena; Cagliero, Luca; Chiusano, Silvia; Garza, Paolo; Giordano, Danilo

    Singular Continuation: Generating Piece-wise Linear Approximations to Pareto Sets via Global Analysis

    We propose a strategy for approximating Pareto optimal sets based on the global analysis framework proposed by Smale (Dynamical systems, New York, 1973, pp. 531-544). The method highlights and exploits the underlying manifold structure of the Pareto sets, approximating Pareto optima by means of simplicial complexes. The method distinguishes the hierarchy between singular set, Pareto critical set and stable Pareto critical set, and can handle the problem of superposition of local Pareto fronts, occurring in the general nonconvex case. Furthermore, a quadratic convergence result in a suitable set-wise sense is proven and tested in a number of numerical examples. Comment: 29 pages, 12 figures

    Unsupervised Algorithms for Microarray Sample Stratification

    The amount of data made available by microarrays gives researchers the opportunity to delve into the complexity of biological systems. However, the noisy and extremely high-dimensional nature of this kind of data poses significant challenges. Microarrays allow for the parallel measurement of thousands of molecular objects spanning different layers of interactions. In order to be able to discover hidden patterns, the most disparate analytical techniques have been proposed. Here, we describe the basic methodologies to approach the analysis of microarray datasets that focus on the task of (sub)group discovery. Peer reviewed

    Detection and Generalization of Spatio-temporal Trajectories for Motion Imagery

    In today's world of vast information availability, users often confront large, unorganized amounts of data with limited tools for managing them. Motion imagery (MI) datasets have become an increasingly popular means of exposing and disseminating information. Commonly, moving objects are of primary interest in modeling such datasets. Users may require different levels of detail, mainly for visualization and further processing, according to the application at hand. In this thesis we exploit the geometric attributes of objects for dataset summarization by using a series of image processing and neural network tools. In order to form data summaries we select representative time instances through the segmentation of an object's spatio-temporal trajectory lines. High movement variation instances are selected through a new hybrid self-organizing map (SOM) technique to describe a single spatio-temporal trajectory. Multiple objects move in diverse yet classifiable patterns. In order to group corresponding trajectories we utilize an abstraction mechanism that investigates a vague moving relevance between the data in space and time. Thus, we introduce the spatio-temporal neighborhood unit as a variable generalization surface. By altering the unit's dimensions, scaled generalization is accomplished. Common complications in tracking applications, including occlusion, noise, information gaps and unconnected segments of data sequences, are addressed through the hybrid-SOM analysis. Nevertheless, entangled data sequences, in which it is unknown which data entry belongs to which trajectory, are frequently encountered. A multidimensional classification technique that combines geometric and backpropagation neural network implementation is used to distinguish between trajectory data. Furthermore, modeling and summarization of two-dimensional phenomena evolving in time brings forward the novel concept of spatio-temporal helixes as compact event representations. The phenomena models are composed of SOM movement nodes (spines) and cardinality shape-change descriptors (prongs). While we focus on the analysis of MI datasets, the framework can be generalized to function with other types of spatio-temporal datasets. Multiple scale generalization is allowed in a dynamic significance-based scale rather than a constant one. The constructed summaries are not just a visualization product; they also support further processing for metadata creation, indexing, and querying. Experimentation, comparisons and error estimations for each technique support the analyses discussed.

    Inference of Biogeographical Ancestry Under Resource Constraints

    We study the problem of predicting human biogeographical ancestry using genomic data. While continental-level ancestry prediction is relatively simple using genomic information, distinguishing between individuals from closely associated sub-populations (e.g., from the same continent) is still a difficult challenge. In particular, we focus on the case where the analysis is constrained to using single nucleotide polymorphisms (SNPs) from just one chromosome. We thus propose methods to construct ancestry-informative SNP panels by analyzing variants from a single chromosome, and evaluate the performance of such panels for both continental-level and sub-continental-level ancestry prediction. Efficient selection of ancestry-informative SNPs is the key to successful ancestry prediction. The removal of redundant and noisy SNP features is essential prior to applying a learning algorithm. Here we propose two distinct methods of SNP selection: one is correlation-based SNP selection, which uses a correlation metric to evaluate the usefulness of SNP features, while the other is random subspace projection based SNP selection, which uses the learning algorithm itself to evaluate the worth of the SNP features. The correlation-based SNP selection approach can construct a small panel of useful SNPs for both continental-level classification and binary classification of sub-populations. Unlike the correlation-based selection, random subspace projection based selection can construct an efficient panel of SNP markers to address the difficult task of multinomial classification with multiple closely related sub-populations. We include results that demonstrate the performance of both methods, including comparison with other recently published related methods.
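
As a rough illustration of the correlation-based selection idea in the abstract above, the following minimal sketch ranks SNP columns by the absolute Pearson correlation between genotype and a binary population label and keeps the top k. The abstract does not say which correlation metric the authors use, so Pearson correlation, the 0/1/2 allele-count encoding, the function names `pearson` and `select_snps`, and the toy data are all assumptions made for illustration. Redundancy removal between selected SNPs is not modeled here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant SNP carries no information about the label
    return cov / sqrt(var_x * var_y)


def select_snps(genotypes, labels, k):
    """Return indices of the k SNPs most correlated (in absolute value) with the labels.

    genotypes: list of samples, each a list of allele counts (assumed 0, 1 or 2).
    labels:    list of 0/1 population labels, one per sample (hypothetical encoding).
    """
    n_snps = len(genotypes[0])
    scored = []
    for j in range(n_snps):
        column = [sample[j] for sample in genotypes]
        scored.append((abs(pearson(column, labels)), j))
    # Highest score first; ties broken by SNP index for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [j for _, j in scored[:k]]


# Toy data (invented): 4 samples, 3 SNPs, two populations.
genotypes = [
    [0, 2, 1],
    [0, 1, 1],
    [2, 0, 1],
    [2, 1, 1],
]
labels = [0, 0, 1, 1]
panel = select_snps(genotypes, labels, k=2)
```

In this toy case the first SNP separates the two populations perfectly, the second is partially informative, and the third is constant, so the panel keeps the first two. A filter of this kind scores each SNP independently of any classifier, in contrast to the random subspace projection approach, which the abstract describes as using the learning algorithm itself to evaluate SNP features.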

    Shrunken Locally Linear Embedding for Passive Microwave Retrieval of Precipitation

    This paper introduces a new Bayesian approach to the inverse problem of passive microwave rainfall retrieval. The proposed methodology relies on a regularization technique and makes use of two joint dictionaries of coincidental rainfall profiles and their corresponding upwelling spectral radiative fluxes. A sequential detection-estimation strategy is adopted, which assumes that similar rainfall intensity values and their spectral radiances live close to some sufficiently smooth manifolds with analogous local geometry. The detection step employs a nearest neighborhood classification rule, while the estimation scheme is equipped with a constrained shrinkage estimator to ensure stability of retrieval and some physical consistency. The algorithm is examined using coincidental observations of the active precipitation radar (PR) and passive microwave imager (TMI) on board the Tropical Rainfall Measuring Mission (TRMM) satellite. We present promising results of instantaneous rainfall retrieval for some tropical storms and mesoscale convective systems over ocean, land, and coastal zones. We provide evidence that the algorithm is capable of properly capturing different storm morphologies, including high-intensity rain cells and trailing light rainfall, especially over land and coastal areas. The algorithm is also validated at an annual scale for calendar year 2013 against the standard (version 7) radar (2A25) and radiometer (2A12) rainfall products of the TRMM satellite.