541 research outputs found

    Dynamic Metric Learning from Pairwise Comparisons

    Recent work in distance metric learning has focused on learning transformations of data that best align with specified pairwise similarity and dissimilarity constraints, often supplied by a human observer. The learned transformations lead to improved retrieval, classification, and clustering algorithms thanks to the better-adapted distance or similarity measures. Here, we address the problem of learning these transformations when the underlying constraint generation process is nonstationary. This nonstationarity can be due to changes in either the ground-truth clustering used to generate constraints or the feature subspaces in which the class structure is apparent. We propose Online Convex Ensemble StrongLy Adaptive Dynamic Learning (OCELAD), a general adaptive, online approach for learning and tracking optimal metrics as they change over time, one that is highly robust to a variety of nonstationary behaviors in the changing metric. We apply the OCELAD framework to an ensemble of online learners. Specifically, we create a retro-initialized composite objective mirror descent (COMID) ensemble (RICE) consisting of a set of parallel COMID learners with different learning rates, demonstrate RICE-OCELAD on both real and synthetic data sets, and show significant performance improvements relative to previously proposed batch and online distance metric learning algorithms.
    Comment: to appear at Allerton 2016. arXiv admin note: substantial text overlap with arXiv:1603.0367
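
    To make the ensemble idea concrete, here is a minimal Python sketch of online Mahalanobis metric learning with parallel learners running at different learning rates, combined by their discounted past loss. It is not the authors' RICE-OCELAD implementation; the hinge-style pairwise loss, the PSD projection, the discount factor, and names such as `OnlineMetricLearner` are illustrative assumptions.

```python
import numpy as np

class OnlineMetricLearner:
    """One online learner: a Mahalanobis matrix M updated by projected
    gradient steps on a hinge-style loss over pairwise constraints."""

    def __init__(self, dim, lr):
        self.M = np.eye(dim)
        self.lr = lr

    def update(self, x, y, similar, margin=1.0):
        d = x - y
        dist = d @ self.M @ d                          # squared Mahalanobis distance
        sign = 1.0 if similar else -1.0                # similar pairs should fall below the margin
        loss = max(0.0, sign * (dist - margin))
        if loss > 0:
            self.M -= self.lr * sign * np.outer(d, d)  # gradient step
            w, V = np.linalg.eigh(self.M)              # project back onto the PSD cone
            self.M = (V * np.clip(w, 0.0, None)) @ V.T
        return loss


class MetricEnsemble:
    """Parallel learners with different learning rates; the combined metric
    weights each learner by its exponentially discounted past loss, so the
    best-adapted learner dominates as the true metric drifts."""

    def __init__(self, dim, lrs, discount=0.9):
        self.learners = [OnlineMetricLearner(dim, lr) for lr in lrs]
        self.scores = np.zeros(len(lrs))
        self.discount = discount

    def update(self, x, y, similar):
        losses = np.array([l.update(x, y, similar) for l in self.learners])
        self.scores = self.discount * self.scores + losses

    def metric(self):
        w = np.exp(-self.scores)
        return sum(wi * l.M for wi, l in zip(w / w.sum(), self.learners))


ensemble = MetricEnsemble(dim=5, lrs=[0.001, 0.01, 0.1, 1.0])
```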

    Knowledge-based fault detection using time-frequency analysis

    This work studies a fault detection method which analyzes sensor data for changes in their characteristics to detect the occurrence of faults in a dynamic system. The test system considered in this research is a Boeing 747 aircraft, and the faults considered are actuator faults. The method is an alternative to conventional fault detection methods: it does not rely on analytical mathematical models but instead acquires knowledge about the system through experiments. In this work, we test the concept that the energy distribution of the sensor signals in the time-frequency domain can serve as a fault indicator, using a representation with finer resolution than the windowed Fourier transform. Verification of the proposed methodology is carried out in two parts. The first set of experiments treats the entire data record as a single window; results show that the method classifies more than 85% of the indicators as correct detections. The second set of experiments verifies the method for online fault detection, where the mean detection delay is observed to be less than 8 seconds. We also developed a simple graphical user interface to run the online fault detection.
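
    As a rough illustration of this kind of energy-based indicator, the sketch below computes band energy from a short-time Fourier transform and flags the first window whose energy departs from a fault-free baseline. The frequency band, threshold factor, and baseline are assumptions for illustration, not the thesis' actual indicators or thresholds.

```python
import numpy as np
from scipy.signal import stft

def band_energy(signal, fs, band, nperseg=256):
    """Energy of the signal inside a frequency band, per STFT time window."""
    f, t, Z = stft(signal, fs=fs, nperseg=nperseg)
    mask = (f >= band[0]) & (f <= band[1])
    return t, np.sum(np.abs(Z[mask, :]) ** 2, axis=0)

def detect_fault(signal, fs, band, baseline, factor=3.0):
    """Return the time of the first window whose band energy exceeds a
    multiple of the nominal (fault-free) baseline energy, or None."""
    t, e = band_energy(signal, fs, band)
    hits = np.where(e > factor * baseline)[0]
    return t[hits[0]] if hits.size else None

# Hypothetical usage: 100 Hz actuator sensor data, watching the 0.5-5 Hz band.
# t_fault = detect_fault(sensor_data, fs=100.0, band=(0.5, 5.0), baseline=nominal_energy)
```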

    Non-cooperative identification of civil aircraft using a generalised mutual subspace method

    Subspace-based methods are effective at classifying sets of feature vectors by modelling each set as a subspace. However, their application to non-cooperative target identification of flying aircraft is rarely seen in the literature. In these methods, setting the subspace dimensionality is always an issue. Here, it is demonstrated that a modified mutual subspace method, which uses soft weights to set the importance of each subspace basis vector, is a promising classifier for identifying sets of range profiles coming from real in-flight targets, with no need to set the subspace dimensionality in advance. The assembly of a recognition database is also a challenging task; in this study, the database comprises predicted range profiles obtained from electromagnetic simulations. Even though the predicted and actual profiles differ, the high recognition rates achieved reveal that the algorithm might be a good candidate for application in an operational target recognition system.
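
    The following sketch shows one plausible reading of a soft-weighted subspace similarity: each set of range profiles is reduced to an orthonormal basis via the SVD, each basis vector receives a weight proportional to the energy it captures, and similarity is a weighted sum of squared cosines of the principal angles between subspaces. The energy floor, the weighting scheme, and the function names are assumptions, not the paper's exact generalised mutual subspace method.

```python
import numpy as np

def weighted_basis(profiles, energy_floor=1e-3):
    """Orthonormal basis of a set of range profiles (rows = profiles), with
    soft weights proportional to the energy each basis vector captures,
    instead of a hard cut at a fixed subspace dimensionality."""
    U, s, _ = np.linalg.svd(profiles.T, full_matrices=False)
    w = s**2 / np.sum(s**2)
    keep = w > energy_floor
    return U[:, keep], w[keep]

def subspace_similarity(Ua, wa, Ub, wb):
    """Soft-weighted sum of squared cosines of the principal angles between
    two subspaces (larger means more similar)."""
    C = (Ua * np.sqrt(wa)).T @ (Ub * np.sqrt(wb))
    return float(np.sum(C**2))

def identify(query_profiles, database):
    """database maps aircraft class -> (basis, weights) built from simulated
    (predicted) range profiles; the query comes from measured profiles."""
    Uq, wq = weighted_basis(query_profiles)
    return max(database, key=lambda c: subspace_similarity(Uq, wq, *database[c]))
```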

    Model Selection in an Information Economy: Choosing What to Learn

    As online markets for the exchange of goods and services become more common, the study of markets composed at least in part of autonomous agents has taken on increasing importance. In contrast to traditional complete-information economic scenarios, agents operating in an electronic marketplace often do so under considerable uncertainty. In order to reduce their uncertainty, these agents must learn about the world around them. When an agent producer is engaged in a learning task in which data collection is costly, such as learning the preferences of a consumer population, it faces a classic decision problem: when to explore and when to exploit. If the agent has a limited number of chances to experiment, it must explicitly weigh the cost of learning (in terms of foregone profit) against the value of the information acquired. Information goods add a further dimension to this problem: because of their flexibility, they can be bundled and priced according to a number of different price schedules. An optimizing producer should consider the profit each price schedule can extract as well as the difficulty of learning that schedule. In this paper, we demonstrate the tradeoff between complexity and profitability for a number of common price schedules. We begin with a one-shot decision as to which schedule to learn. Schedules with moderate complexity are preferred in the short and medium term, as they are learned quickly yet extract a significant fraction of the available profit. We then turn to the repeated version of this one-shot decision and show that moderate-complexity schedules, in particular the two-part tariff, perform well when the producer must adapt to nonstationarity in the consumer population. When a producer can dynamically change schedules as it learns, it can use an explicit decision-theoretic formulation to greedily select the schedule that appears to yield the greatest profit in the next period. By explicitly considering both the learnability of and the profit extracted by different price schedules, a producer can extract more profit as it learns than if it naively chose models that are accurate once learned.
    Keywords: online learning; information economics; model selection; direct search
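
    A toy version of greedy, adapt-as-you-learn schedule selection might look like the sketch below: keep a discounted running estimate of the profit each schedule yields, pick the best-looking one each period, and explore occasionally so the estimates stay current when the consumer population drifts. The schedule names, discount factor, and exploration rate are illustrative assumptions, and this simple bandit-style rule stands in for, rather than reproduces, the paper's decision-theoretic formulation.

```python
import random

class GreedyScheduleSelector:
    """Each period, choose the price schedule whose discounted average profit
    looks best; a small exploration rate keeps estimates fresh under
    nonstationary consumer populations."""

    def __init__(self, schedules, discount=0.95, explore=0.05):
        self.schedules = list(schedules)        # e.g. ["flat_fee", "linear", "two_part_tariff"]
        self.value = {s: 0.0 for s in schedules}
        self.count = {s: 0.0 for s in schedules}
        self.discount = discount
        self.explore = explore

    def choose(self):
        if random.random() < self.explore:
            return random.choice(self.schedules)
        return max(self.schedules, key=self.value.get)

    def observe(self, schedule, profit):
        # discounted sample count + running mean gives an exponentially
        # discounted average of the profit observed under each schedule
        self.count[schedule] = self.discount * self.count[schedule] + 1.0
        self.value[schedule] += (profit - self.value[schedule]) / self.count[schedule]
```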

    COMPOSE: Compacted object sample extraction, a framework for semi-supervised learning in nonstationary environments

    An increasing number of real-world applications are associated with streaming data drawn from drifting and nonstationary distributions. These applications demand new algorithms that can learn and adapt to such changes, also known as concept drift. Properly characterizing such data with existing approaches typically requires a substantial amount of labeled instances, which may be difficult, expensive, or even impractical to obtain. In this thesis, compacted object sample extraction (COMPOSE) is introduced: a computational geometry-based framework for learning from nonstationary streaming data in which labels are unavailable (or presented only very sporadically) after initialization. The feasibility and performance of the algorithm are evaluated on several synthetic and real-world data sets, which present various scenarios of initially labeled streaming environments. On carefully designed synthetic data sets, we also compare the performance of COMPOSE against the optimal Bayes classifier as well as the arbitrary subpopulation tracker algorithm, which addresses a similar environment referred to as extreme verification latency. Furthermore, using the real-world National Oceanic and Atmospheric Administration weather data set, we demonstrate that COMPOSE is competitive even with a well-established, fully supervised nonstationary learning algorithm that receives labeled data in every batch.
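
    A heavily simplified sketch of a COMPOSE-style loop is shown below: each unlabeled batch is labeled semi-supervised using the current core samples, then each predicted class is compacted to a core that is carried forward as the only labels for the next batch. The distance-to-centroid compaction stands in for COMPOSE's computational-geometry core extraction, and LabelPropagation is just one choice of semi-supervised learner; both, along with the keep fraction and function names, are assumptions.

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

def extract_core(X, keep_fraction=0.7):
    """Keep the points closest to the class centroid -- a simple stand-in
    for the geometric compaction used by COMPOSE."""
    d = np.linalg.norm(X - X.mean(axis=0), axis=1)
    return X[np.argsort(d)[: max(1, int(keep_fraction * len(X)))]]

def compose_like_stream(initial_X, initial_y, batches, keep_fraction=0.7):
    """Yield predicted labels for each unlabeled batch of a drifting stream,
    using only the initial labels and the extracted cores thereafter."""
    core_X, core_y = initial_X, initial_y
    for X_new in batches:
        X = np.vstack([core_X, X_new])
        y = np.concatenate([core_y, np.full(len(X_new), -1)])   # -1 = unlabeled
        model = LabelPropagation().fit(X, y)
        y_new = model.predict(X_new)
        yield y_new
        # compact each predicted class to its core before the next batch
        cores = [(extract_core(X_new[y_new == c], keep_fraction), c)
                 for c in np.unique(y_new)]
        core_X = np.vstack([cx for cx, _ in cores])
        core_y = np.concatenate([np.full(len(cx), c) for cx, c in cores])
```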

    Climate Change Projection and Time-varying Multi-dimensional Risk Analysis

    In recent decades, population growth and global warming driven by anthropogenic greenhouse gas emissions have changed the composition of the atmosphere, intensifying extreme climate phenomena and increasing the overall frequency of extreme events. These extreme events have caused human suffering and devastating damage in recent record-breaking warm years. To mitigate the adverse consequences of global warming, the best strategy is to project the future probabilistic behavior of extreme climate phenomena under a changing environment.

    The first contribution of this research is to improve the predictive power of regression-based statistical downscaling so as to accurately project the future behavior of extreme climate phenomena. First, a supervised dimensionality reduction algorithm is proposed for statistical downscaling, deriving a low-dimensional manifold that represents the climate change signal encoded in high-dimensional atmospheric variables. Such an algorithm is novel in climate change studies, as past literature has focused on deriving low-dimensional principal components from large-scale atmospheric predictors without taking the target hydro-climate variables into account. The new algorithm, called Supervised Principal Component Analysis (Supervised PCA), outperforms the existing state-of-the-art dimensionality reduction algorithms: it improves the statistical downscaling model by deriving subspaces that have maximum dependence on the target hydro-climate variables. A kernel version of Supervised PCA is also introduced for nonlinear dimensionality reduction, capturing the complex, nonlinear relationships between the hydro-climate response variable and the atmospheric predictors. To address the biases arising from differences between observed and simulated large-scale atmospheric predictors, and to represent anomalies in the low-frequency variability of teleconnections in General Circulation Models (GCMs), a Multivariate Recursive Nesting Bias Correction (MRNBC) is added to the regression-based statistical downscaling; the method uses multiple variables at multiple locations to simultaneously correct temporal and spatial biases in cross-dependent predictors. To reduce a further source of uncertainty, arising from the complexity and nonlinearity of the empirical relationships used in statistical downscaling, a Bayesian machine-learning algorithm is adopted and shown to be superior. Addressing these sources of uncertainty improves the predictive power of the statistical downscaling and, in turn, the projection of global-warming impacts on the probabilistic behavior of hydro-climate variables using future multi-model GCM ensembles under climate change forcing scenarios. The results of two design-of-experiments studies also reveal that the proposed statistical downscaling is credible and adjustable to the changes that arise under non-stationary climate conditions. It is further demonstrated that, under anthropogenic climate change, the nature and the risk of extreme climate phenomena change over time, and that extreme climate processes are inherently multi-dimensional, with dimensions that are highly dependent.
    Accordingly, to strengthen the reliability of infrastructure design and the management of water systems in a changing climate, it is crucial to update the risk concept to an adaptive, multi-dimensional, time-varying one that integrates the anomalies of dynamic, anthropogenically forced environments. The main contribution of this research is to develop a new generation of multivariate, time-varying risk concept for an adaptive design framework under the non-stationary conditions arising from climate change. A Bayesian dynamic conditional copula model is developed to describe the time-varying dependence structure between mixed continuous and discrete marginals of extreme multi-dimensional climate phenomena. The framework can integrate anomalies of extreme multi-dimensional events under non-stationary conditions, and it generates samples from the full conditional marginals and the joint distribution with a Markov Chain Monte Carlo (MCMC) method in a fully likelihood-based Bayesian inference. The framework also introduces a fully Bayesian, time-varying Joint Return Period (JRP) concept to quantify how the nature and the risk of extreme multi-dimensional events change over time under the impact of climate change. The proposed generalized time-dependent risk framework can be applied to any stochastic multi-dimensional climate system under the influence of a changing environment.
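
    As a sketch of the kind of supervised dimensionality reduction described in the first contribution, the function below implements a generic HSIC-style Supervised PCA: it keeps the directions of the atmospheric predictors that maximize (linear-kernel) dependence with the hydro-climate target, rather than raw variance as in ordinary PCA. This is a textbook-style formulation under my own assumptions, not the thesis' code; a kernel variant would replace the predictor matrix with a centred kernel matrix.

```python
import numpy as np

def supervised_pca(X, y, n_components=2):
    """Supervised PCA in the HSIC style: project the predictors onto the
    directions that maximize dependence with the target.

    X: (n_samples, n_features) large-scale atmospheric predictors
    y: (n_samples,) or (n_samples, n_targets) hydro-climate target(s)
    Returns the low-dimensional scores and the projection directions.
    """
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    Y = np.asarray(y, dtype=float).reshape(n, -1)
    L = Y @ Y.T                                   # linear kernel on the target
    Q = X.T @ H @ L @ H @ X                       # dependence-weighted scatter
    w, V = np.linalg.eigh(Q)
    U = V[:, np.argsort(w)[::-1][:n_components]]  # top eigenvectors
    return X @ U, U
```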