
    Towards hierarchical blackboard mapping on a whiskered robot

    The paradigm case for robotic mapping assumes large quantities of sensory information which allow the use of relatively weak priors. In contrast, the present study considers the mapping problem for a mobile robot, CrunchBot, where only sparse, local tactile information from whisker sensors is available. To compensate for such weak likelihood information, we make use of low-level signal processing and strong hierarchical object priors. Hierarchical models were popular in classical blackboard systems but are here applied in a Bayesian setting as a mapping algorithm. The hierarchical models require reports of whisker distance to contact and of surface orientation at contact, and we demonstrate that this information can be retrieved by classifiers from strain data collected by CrunchBot's physical whiskers. We then provide a demonstration in simulation of how this information can be used to build maps (but not yet full SLAM) in a zero-odometry-noise environment containing walls and table-like hierarchical objects. © 2012 Elsevier B.V. All rights reserved.

    Rotating machine prognostics using system-level models

    Prognostics of rotating machines is crucial for reliable and safe operation and for maximizing usage time. Many reliability studies focus on component-level prognostics. However, in many cases, the desired information is the residual life of the system, rather than the lifetimes of its constituent components. This review paper focuses on system-level prognostic techniques that can be applied to rotating machinery. These approaches use multi-dimensional condition monitoring data collected from different parts of the system of interest to predict the remaining useful life at the system level. The working principles, merits and drawbacks, and fields of application of these techniques are summarized.

    Morphometric Otolith Analysis

    Fish otoliths have long played an important role in sustainable fisheries management. Stock assessment models currently used rely on species-specific age profiles obtained from the seasonal patterns of growth marks that otoliths exhibit. We compare methods widely used in fisheries science (elliptical Fourier) with an industry-standardised encoding method (MPEG7 - Curvature-Scale-Space) and with a recent addition to shape modelling techniques (time-series shapelets) to determine which performs best. An investigation is carried out into transform methods that retain size information, and into whether the boundary encoding method is impacted by otolith age, performing tests over three 2-class otolith datasets across six discrete and concurrent age groups. The impact of segmentation methods is assessed to determine whether automated or expert-segmented methods of boundary extraction are more advantageous, and whether constructed classifiers can be used at different institutions. Tests show that neither time-series shapelets nor Curvature-Scale-Space methods offer any real advantage over Fourier transform methods given mixed-age datasets. However, we show that size indices are most indicative of fisheries stock in younger single-age datasets, with shape holding more discriminatory potential in older samples. Whilst commonly used Fourier transform methods generally return the best results, we show that classification of otolith boundaries is impacted by the method of boundary segmentation. Hand-traced boundaries produce classifiers that are more robust to test data segmentation methods and are more suited to distributed classifiers. Additionally, we present a proof-of-concept study showing that high-energy synchrotron scans are a new, non-invasive method of modelling internal otolith structure, allowing comparison of slices along near-infinite numbers of virtual complex planes.

    Time Series classification through transformation and ensembles

    The problem of time series classification (TSC), where we consider any real-valued ordered data a time series, offers a specific challenge. Unlike traditional classification problems, the ordering of attributes is often crucial for identifying discriminatory features between classes. TSC problems arise across a diverse range of domains, and this variety has meant that no single approach outperforms all others. The general consensus is that the benchmark for TSC is nearest neighbour (NN) classifiers using Euclidean distance or Dynamic Time Warping (DTW). Though conceptually simple, many have reported that NN classifiers are very difficult to beat, and new work is often compared to NN classifiers. The majority of approaches have focused on classification in the time domain, typically proposing alternative elastic similarity measures for NN classification. Other work has investigated more specialised approaches, such as building support vector machines on variable intervals and creating tree-based ensembles with summary measures. We wish to answer a specific research question: given a new TSC problem without any prior, specialised knowledge, what is the best way to approach the problem? Our thesis is that the best methodology is to first transform data into alternative representations where discriminatory features are more easily detected, and then build ensemble classifiers on each representation. In support of our thesis, we propose an elastic ensemble classifier that we believe is the first ever to significantly outperform DTW on the widely used UCR datasets. Next, we propose the shapelet-transform, a new data transformation that allows complex classifiers to be coupled with shapelets, which outperforms the original algorithm and is competitive with DTW. Finally, we combine these two works with heterogeneous ensembles built on autocorrelation and spectral-transformed data to propose a collective of transformation-based ensembles (COTE). The results of COTE are, we believe, the best ever published on the UCR datasets.
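
The benchmark named in this abstract, a 1-nearest-neighbour classifier with Dynamic Time Warping, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `dtw_distance` and `nn_dtw_predict` are hypothetical, the pointwise cost is assumed to be squared difference, and no warping-window constraint is applied.

```python
import math

def dtw_distance(a, b):
    """Unconstrained DTW between two numeric sequences (assumed squared pointwise cost)."""
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # cost of aligning two empty prefixes
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible warping moves.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])

def nn_dtw_predict(train, query):
    """Return the label of the training series closest to `query` under DTW.

    `train` is a list of (series, label) pairs; this layout is an assumption.
    """
    nearest = min(train, key=lambda pair: dtw_distance(pair[0], query))
    return nearest[1]

train = [([0, 1, 2, 1, 0, 0], "bump"), ([2, 2, 2, 2, 2, 2], "flat")]
print(nn_dtw_predict(train, [0, 0, 1, 2, 1, 0]))
```

The query is a time-shifted copy of the "bump" series, so warping aligns the two exactly, whereas a pointwise Euclidean comparison would report a non-zero distance.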

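    Agrupamiento, predicción y clasificación ordinal para series temporales utilizando técnicas de machine learning: aplicaciones [Clustering, prediction and ordinal classification for time series using machine learning techniques: applications]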

    In recent years, a growing number of fields have improved their standard processes by using machine learning (ML) techniques. The main reason is that the vast amount of data generated by these processes is difficult for humans to process. Therefore, the development of automatic methods to process and extract relevant information from these data is of great necessity, given that these approaches could lead to an increase in the economic benefit of enterprises or to a reduction in the workload of some current jobs. Concretely, in this Thesis, ML approaches are applied to problems concerning time series data. A time series is a special kind of data in which data points are collected chronologically. Time series are present in a wide variety of fields, such as atmospheric events or engineering applications. Moreover, depending on the main objective to be satisfied, different tasks are applied to time series in the literature. This Thesis focuses mainly on some of them: clustering, classification, prediction and, in general, analysis. Generally, the amount of data to be processed is huge, creating the need for methods able to reduce the dimensionality of time series without decreasing the amount of information. In this sense, applying time series segmentation procedures that divide the time series into different subsequences is a good option, given that each segment defines a specific behaviour. Once the different segments are obtained, the use of statistical features to characterise them is an excellent way to maximise the information of the time series while considerably reducing their dimensionality. In the case of time series clustering, the objective is to find groups of similar time series with the idea of discovering interesting patterns in time series datasets. In this Thesis, we have developed a novel time series clustering technique. 
The aim of this proposal is twofold: to reduce the dimensionality as much as possible and to develop a time series clustering approach able to outperform current state-of-the-art techniques. In this sense, for the first objective, the time series are segmented in order to divide them and identify different behaviours. Then, these segments are projected into a vector of statistical features aiming to reduce the dimensionality of the time series. Once this preprocessing step is done, the clustering of the time series is carried out, with a significantly lower computational load. This novel approach has been tested on all the time series datasets available in the University of East Anglia and University of California Riverside (UEA/UCR) time series classification (TSC) repository. Regarding time series classification, two main paths can be differentiated. The first is nominal TSC, which is a well-known field involving a wide variety of proposals and transformations applied to time series. Concretely, one of the most popular transformations is the shapelet transform (ST), which has been widely used in this field. The original method extracts shapelets from the original time series and uses them for classification purposes. Nevertheless, the full enumeration of all possible shapelets is very time consuming. Therefore, in this Thesis, we have developed a hybrid method that starts with the best shapelets extracted by using the original approach with a time constraint and then tunes these shapelets by using a convolutional neural network (CNN) model. The second is time series ordinal classification (TSOC), a previously unexplored field that begins with this Thesis. Here, we have adapted the original ST to the ordinal classification (OC) paradigm by proposing several shapelet quality measures that take advantage of the ordinal information of the time series. This methodology leads to better results than the state-of-the-art TSC techniques for those ordinal time series datasets. 
All these proposals have been tested on all the time series datasets available in the UEA/UCR TSC repository. Time series prediction is based on estimating the next value or values of the time series by considering the previous ones. In this Thesis, several different approaches have been considered depending on the problem to be solved. Firstly, the prediction of low-visibility events produced by fog conditions is carried out by means of hybrid autoregressive models (ARs) combining fixed-size and dynamic windows, which adapt themselves to the dynamics of the time series. Secondly, the prediction of convective cloud formation (which is a highly imbalanced problem given that the number of convective cloud events is much lower than that of non-convective situations) is performed in two completely different ways: 1) tackling the problem as a multi-objective classification task by the use of multi-objective evolutionary artificial neural networks (MOEANNs), in which the two conflicting objectives are accuracy of the minority class and the global accuracy, and 2) tackling the problem from the OC point of view, in which, in order to reduce the imbalance degree, an oversampling approach is proposed along with the use of OC techniques. Thirdly, the prediction of solar radiation is carried out by means of evolutionary artificial neural networks (EANNs) with different combinations of basis functions in the hidden and output layers. Finally, the last challenging problem is the prediction of energy flux from waves and tides. For this, a multitask EANN has been proposed aiming to predict the energy flux at several prediction time horizons (from 6h to 48h). All these proposals and techniques have been corroborated and discussed according to physical and atmospheric models. The work developed in this Thesis is supported by 11 JCR-indexed papers in international journals (7 Q1, 3 Q2, 1 Q3), 11 papers in international conferences, and 4 papers in national conferences.
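
The dimensionality-reduction idea described in this abstract (segment each series, then replace every segment with a few statistical features) might look roughly like the sketch below. The abstract does not specify the segmentation algorithm or the feature set, so equal-width segments and the mean/variance pair are assumptions made purely for illustration, and `segment_features` is a hypothetical name.

```python
from statistics import mean, pvariance

def segment_features(series, n_segments):
    """Map a series to [mean_1, var_1, ..., mean_k, var_k] over k equal-width segments.

    Equal-width cuts and the (mean, population variance) pair are assumptions;
    the thesis may use a different segmentation and richer features.
    """
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    # Integer arithmetic keeps the boundaries strictly increasing.
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    features = []
    for lo, hi in zip(bounds, bounds[1:]):
        segment = series[lo:hi]
        features.extend([mean(segment), pvariance(segment)])
    return features

series = [1, 1, 1, 1, 5, 5, 5, 5]
print(segment_features(series, 2))
```

An 8-point series becomes a 4-value vector here; any standard clustering method can then run on the shorter vectors instead of the raw series.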

    Simulation Analytics for Deeper Comparisons

    Output analysis for stochastic simulation has traditionally focused on obtaining statistical summaries of time-averaged and replication-averaged performance measures. Although providing a useful overview of expected long-run results, this focus ignores the finer behaviour and dynamic interactions that characterise a stochastic system, motivating an opening for simulation analytics. Data analysis efforts directed towards the detailed event logs of simulation sample paths can extend the analytical toolkit of simulation beyond static summaries of long-run behaviour. This thesis contributes novel methodologies to the field of simulation analytics. Through a careful mining of sample path data and application of appropriate machine learning techniques, we unlock new opportunities for understanding and improving the performance of stochastic systems. Our first area of focus is on the real-time prediction of dynamic performance measures, and we demonstrate a k-nearest neighbours model on the multivariate state of a simulation. In conjunction with this, metric learning is employed to refine a system-specific distance measure that operates between simulation states. The involvement of metric learning is found not only to enhance prediction accuracy, but also to offer insight into the driving factors behind a system’s stochastic performance. Our main contribution within this approach is the adaptation of a metric learning formulation to accommodate the type of data that is typical of simulation sample paths. Secondly, we explore the continuous-time trajectories of simulation variables. Shapelets are found to identify the patterns that characterise and distinguish the trajectories of competing systems. Tailoring to the structure of discrete-event sample paths, we probe a deeper understanding and comparison of the dynamic behaviours of stochastic simulation

    Mining time-series data using discriminative subsequences

    Time-series data is abundant, and must be analysed to extract usable knowledge. Local-shape-based methods offer improved performance for many problems, and a comprehensible method of understanding both data and models. For time-series classification, we transform the data into a local-shape space using a shapelet transform. A shapelet is a time-series subsequence that is discriminative of the class of the original series. We use a heterogeneous ensemble classifier on the transformed data. The accuracy of our method is significantly better than the time-series classification benchmark (1-nearest-neighbour with dynamic time-warping distance), and significantly better than the previous best shapelet-based classifiers. We use two methods to increase interpretability: First, we cluster the shapelets using a novel, parameterless clustering method based on Minimum Description Length, reducing dimensionality and removing duplicate shapelets. Second, we transform the shapelet data into binary data reflecting the presence or absence of particular shapelets, a representation that is straightforward to interpret and understand. We supplement the ensemble classifier with partial classification. We generate rule sets on the binary-shapelet data, improving performance on certain classes, and revealing the relationship between the shapelets and the class label. To aid interpretability, we use a novel algorithm, BruteSuppression, that can substantially reduce the size of a rule set without negatively affecting performance, leading to a more compact, comprehensible model. Finally, we propose three novel algorithms for unsupervised mining of approximately repeated patterns in time-series data, testing their performance in terms of speed and accuracy on synthetic data, and on a real-world electricity-consumption device-disambiguation problem. We show that individual devices can be found automatically and in an unsupervised manner using a local-shape-based approach.
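
As a rough illustration of the shapelet transform and the binary presence/absence representation mentioned in this abstract, the sketch below maps each series to its distances from a set of given shapelets and then thresholds those distances. It assumes the shapelets have already been discovered, omits the subsequence normalisation that shapelet methods commonly apply, and uses a hand-picked threshold; the function names are hypothetical and the thesis's own procedure may differ.

```python
import math

def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any same-length window."""
    width = len(shapelet)
    best = math.inf
    for start in range(len(series) - width + 1):
        sq = sum((series[start + k] - shapelet[k]) ** 2 for k in range(width))
        best = min(best, sq)
    return math.sqrt(best)

def shapelet_transform(dataset, shapelets):
    """One row per series, one column per shapelet, each cell a distance."""
    return [[shapelet_distance(s, sh) for sh in shapelets] for s in dataset]

def binarise(features, threshold):
    """1 if the shapelet is considered present (distance <= threshold), else 0.

    The single global threshold is an assumption made for this example.
    """
    return [[1 if d <= threshold else 0 for d in row] for row in features]

dataset = [[0, 0, 1, 2, 1, 0], [0, 0, 0, 0, 0, 0]]
shapelets = [[1, 2, 1]]
features = shapelet_transform(dataset, shapelets)
print(binarise(features, threshold=0.5))
```

The first series contains the shapelet exactly and the second does not, so the binary rows differ; any standard classifier or rule learner could then be trained on either representation.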

    Time Series Mining: Shapelet Discovery, Ensembling, and Applications

    Time series are a prominent class of temporal data sequences that are equally spaced in time, chronologically ordered, and high-dimensional. Time series classification is an important branch of time series mining. Existing time series classifiers operate either on raw data in the time domain or in an alternate data space such as the shapelet or frequency domain. Combining time series classifiers is another powerful technique used to improve classification accuracy. It has been demonstrated that different classifiers can be expert in predicting different subsets of classes. The challenge lies in learning the expertise of the different base learners. In addition, the high dimensionality of time series data makes it difficult to visualize their distribution. In this thesis, we developed new time series ensembling methods to improve predictive performance, and investigated the interpretability of classifiers by leveraging the power of deep learning models and adjusting them to provide visual shapelets as a by-product of the classification task. Finally, we show applications to the problem of predicting solar energetic particle events.

    How to Discover Knowledge for Improving Availability in the Manufacturing Domain?

    This paper presents a specific process model for Knowledge Discovery in Databases (KDD) projects aiming at availability improvement in manufacturing. For this purpose, Overall Equipment Efficiency (OEE) is analyzed and used, since it is an approved approach to monitor and improve the degree of availability in manufacturing. To define the specific process model, we use the generic CRISP-DM reference model and conduct a mapping for availability improvement. We demonstrate the applicability of our model in the context of a specific KDD project in a large enterprise in the manufacturing industry.