3,444 research outputs found

    Detecting adversarial manipulation using inductive Venn-ABERS predictors

    Get PDF
    Inductive Venn-ABERS predictors (IVAPs) are probabilistic predictors with the theoretical guarantee that their predictions are perfectly calibrated. In this paper, we propose to exploit this calibration property for the detection of adversarial examples in binary classification tasks. By rejecting predictions when the uncertainty of the IVAP is too high, we obtain an algorithm that is both accurate on the original test set and resistant to adversarial examples. This robustness is observed both for adversarial examples crafted against the underlying model and for adversarial examples generated by taking the IVAP into account. The method appears to offer competitive robustness compared to the state of the art in adversarial defense, yet it is computationally much more tractable.
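    The rejection rule described above can be sketched directly from the definition of an IVAP. The snippet below is a minimal, illustrative re-implementation (not the authors' code), assuming the underlying binary scorer's calibration-set scores and labels are available; the rejection threshold max_width is a hypothetical parameter.

```python
# Minimal IVAP sketch: for a test score, fit isotonic regression twice on the
# calibration scores, once with the test point labelled 0 and once labelled 1,
# giving a probability interval [p0, p1]; reject if the interval is too wide.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def ivap_interval(cal_scores, cal_labels, test_score):
    probs = []
    for hypothetical_label in (0, 1):
        s = np.append(cal_scores, test_score)
        y = np.append(cal_labels, hypothetical_label)
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(s, y)
        probs.append(iso.predict([test_score])[0])
    return min(probs), max(probs)

def predict_or_reject(cal_scores, cal_labels, test_score, max_width=0.2):
    p0, p1 = ivap_interval(cal_scores, cal_labels, test_score)
    if p1 - p0 > max_width:       # wide interval: treat as a possible adversarial input
        return None               # reject
    p = p1 / (1.0 - p0 + p1)      # one common rule for merging the interval into a point probability
    return int(p >= 0.5)
```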

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Get PDF
    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability.
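    As a purely generic illustration of how some of these challenges are handled in practice (not a method from the review), the sketch below builds a scikit-learn pipeline that touches three of the five issues on synthetic data standing in for a concatenated multi-omics matrix: missing data (imputation), the curse of dimensionality (projection to a lower-dimensional space), and class imbalance (class-weighted loss).

```python
# Illustrative sketch only, not a method from the review.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: many features, few samples, roughly 10% positive class.
X, y = make_classification(n_samples=200, n_features=2000, n_informative=20,
                           weights=[0.9, 0.1], random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan           # inject missing measurements

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),         # missing data
    ("scale", StandardScaler()),
    ("reduce", PCA(n_components=50)),                     # curse of dimensionality
    ("clf", LogisticRegression(class_weight="balanced",   # class imbalance
                               max_iter=1000)),
])

print("CV ROC-AUC:", cross_val_score(pipeline, X, y, cv=5, scoring="roc_auc").mean())
```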

    Tree Edit Distance Learning via Adaptive Symbol Embeddings

    Full text link
    Metric learning aims to improve classification accuracy by learning a distance measure which brings data points from the same class closer together and pushes data points from different classes further apart. Recent research has demonstrated that metric learning approaches can also be applied to trees, such as molecular structures, abstract syntax trees of computer programs, or syntax trees of natural language, by learning the cost function of an edit distance, i.e. the costs of replacing, deleting, or inserting nodes in a tree. However, learning such costs directly may yield an edit distance which violates metric axioms, is challenging to interpret, and may not generalize well. In this contribution, we propose a novel metric learning approach for trees which we call embedding edit distance learning (BEDL) and which learns an edit distance indirectly by embedding the tree nodes as vectors, such that the Euclidean distance between those vectors supports class discrimination. We learn such embeddings by reducing the distance to prototypical trees from the same class and increasing the distance to prototypical trees from different classes. In our experiments, we show that BEDL improves upon the state of the art in metric learning for trees on six benchmark data sets, ranging from computer science over biomedical data to a natural-language processing data set containing over 300,000 nodes. Comment: Paper at the International Conference on Machine Learning (2018), 2018-07-10 to 2018-07-15 in Stockholm, Sweden.
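    The core idea, edit costs derived from learned node embeddings, can be sketched as follows. For brevity the dynamic program below operates on label sequences rather than trees, and the toy embeddings are made up; the paper applies the same cost parameterization inside a tree edit distance and learns the embeddings from data.

```python
# Sketch of the cost parameterization only (not the BEDL learning algorithm):
# replacement cost is the Euclidean distance between two label embeddings,
# and deletion/insertion cost is the norm of the deleted/inserted embedding.
import numpy as np

def embedding_edit_distance(seq_a, seq_b, emb):
    """emb: dict mapping a label to a 1-D numpy embedding vector."""
    cost_del = lambda x: np.linalg.norm(emb[x])
    cost_rep = lambda x, y: np.linalg.norm(emb[x] - emb[y])
    n, m = len(seq_a), len(seq_b)
    D = np.zeros((n + 1, m + 1))
    for i in range(1, n + 1):
        D[i, 0] = D[i - 1, 0] + cost_del(seq_a[i - 1])
    for j in range(1, m + 1):
        D[0, j] = D[0, j - 1] + cost_del(seq_b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = min(
                D[i - 1, j] + cost_del(seq_a[i - 1]),                     # delete
                D[i, j - 1] + cost_del(seq_b[j - 1]),                     # insert
                D[i - 1, j - 1] + cost_rep(seq_a[i - 1], seq_b[j - 1]),   # replace
            )
    return D[n, m]

# Toy two-dimensional embeddings: "b" and "c" are close, so replacing one
# with the other is cheap compared to replacing either with "a".
emb = {"a": np.array([0.0, 1.0]), "b": np.array([1.0, 0.0]), "c": np.array([0.9, 0.1])}
print(embedding_edit_distance("abc", "acc", emb))
```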

    State estimation for coupled uncertain stochastic networks with missing measurements and time-varying delays: The discrete-time case

    Get PDF
    This paper is concerned with the problem of state estimation for a class of discrete-time coupled uncertain stochastic complex networks with missing measurements and time-varying delay. The parameter uncertainties are assumed to be norm-bounded and enter into both the network state and the network output. The stochastic Brownian motions affect not only the coupling term of the network but also the overall network dynamics. The nonlinear terms that satisfy the usual Lipschitz conditions exist in both the state and measurement equations. Through available output measurements described by a binary switching sequence that obeys a conditional probability distribution, we aim to design a state estimator to estimate the network states such that, for all admissible parameter uncertainties and time-varying delays, the dynamics of the estimation error is guaranteed to be globally exponentially stable in the mean square. By employing the Lyapunov functional method combined with the stochastic analysis approach, several delay-dependent criteria are established that ensure the existence of the desired estimator gains, and then the explicit expression of such estimator gains is characterized in terms of the solution to certain linear matrix inequalities (LMIs). Two numerical examples are exploited to illustrate the effectiveness of the proposed estimator design schemes.
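    The LMIs in the paper are specific to its delayed, uncertain network model, but the way such criteria are checked numerically can be illustrated with a much simpler, hypothetical example: a standard discrete-time Lyapunov LMI feasibility problem solved with CVXPY. The system matrix A below is a placeholder, not taken from the paper.

```python
# Not the paper's LMIs: a minimal sketch of checking the discrete-time
# Lyapunov-type LMI  P > 0,  A^T P A - P < 0  as a feasibility problem.
# The paper's delay-dependent criteria have the same structure but with
# additional blocks for delays, uncertainty bounds, and estimator gains.
import numpy as np
import cvxpy as cp

A = np.array([[0.5, 0.1],
              [0.0, 0.8]])            # placeholder system matrix
n = A.shape[0]

P = cp.Variable((n, n), symmetric=True)
eps = 1e-6                            # small margin to enforce strict inequalities
constraints = [P >> eps * np.eye(n),
               A.T @ P @ A - P << -eps * np.eye(n)]
prob = cp.Problem(cp.Minimize(0), constraints)
prob.solve(solver=cp.SCS)

print("LMI feasible (exponentially stable):", prob.status == cp.OPTIMAL)
```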

    An Overview of the Use of Neural Networks for Data Mining Tasks

    Get PDF
    In recent years, the area of data mining has experienced a considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as active research, in developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, and telecommunications. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.
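    As a generic, self-contained illustration of the supervised usage discussed here (not one of the paper's case studies), the sketch below trains a small feed-forward neural network as a classifier on a bundled scikit-learn dataset; the layer sizes are arbitrary choices.

```python
# Generic illustration: a feed-forward NN used as a supervised data mining
# tool for classification, with feature scaling in a single pipeline.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(32, 16),
                                  max_iter=500, random_state=0))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```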

    Study and Observation of the Variations of Accuracies for Handwritten Digits Recognition with Various Hidden Layers and Epochs using Neural Network Algorithm

    Full text link
    In recent years, Artificial Neural Networks (ANN) have been applied to a wide range of fields including business, medicine, engineering, etc. The most popular areas where ANN is employed nowadays are pattern and sequence recognition, novelty detection, character recognition, regression analysis, speech recognition, image compression, stock market prediction, electronic noses, security, loan applications, data processing, robotics, and control. The benefits associated with these broad applications have led to the increasing popularity of ANN in the 21st century. ANN confers many benefits, such as organic learning, nonlinear data processing, fault tolerance, and self-repair, compared to other conventional approaches. The primary objective of this paper is to analyze the influence of the hidden layers of a neural network on the overall performance of the network. To demonstrate this influence, we applied neural networks with different numbers of hidden layers to the MNIST dataset. A further goal is to observe the variation in accuracy of the ANN for different numbers of hidden layers and epochs, and to compare and contrast the results. Comment: To be published in the 4th IEEE International Conference on Electrical Engineering and Information & Communication Technology (iCEEiCT 2018).
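    An experiment of the kind described in this abstract can be sketched as below: fully connected networks are trained on MNIST while the number of hidden layers and the number of epochs are varied, and test accuracy is recorded for each setting. The layer width and the grid of settings are illustrative assumptions, not the paper's exact configuration.

```python
# Vary the number of hidden layers and training epochs on MNIST and report
# test accuracy for each combination (illustrative settings only).
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

def build_model(num_hidden_layers, units=64):
    layers = [tf.keras.layers.Dense(units, activation="relu")
              for _ in range(num_hidden_layers)]
    layers.append(tf.keras.layers.Dense(10, activation="softmax"))
    model = tf.keras.Sequential(layers)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

for num_hidden_layers in (1, 2, 3):
    for epochs in (5, 10):
        model = build_model(num_hidden_layers)
        model.fit(x_train, y_train, epochs=epochs, batch_size=128, verbose=0)
        _, acc = model.evaluate(x_test, y_test, verbose=0)
        print(f"hidden layers={num_hidden_layers}  epochs={epochs:2d}  test accuracy={acc:.4f}")
```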

    A review of domain adaptation without target labels

    Full text link
    Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches into what we refer to as sample-based, feature-based, and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting, and representing features such that a source classifier performs well on the target domain, while inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research. Comment: 20 pages, 5 figures.
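    One sample-based method from this taxonomy, importance weighting, can be sketched as follows: a probabilistic domain discriminator estimates how target-like each source example is, and the source classifier is trained with those estimates as sample weights, without using any target labels. X_src, y_src and X_tgt are placeholder names for source features, source labels, and unlabelled target features.

```python
# Minimal importance-weighting sketch (one sample-based adaptation method).
import numpy as np
from sklearn.linear_model import LogisticRegression

def importance_weights(X_src, X_tgt):
    """Estimate p_target(x) / p_source(x) via a probabilistic domain classifier."""
    X = np.vstack([X_src, X_tgt])
    d = np.concatenate([np.zeros(len(X_src)), np.ones(len(X_tgt))])  # 0 = source, 1 = target
    disc = LogisticRegression(max_iter=1000).fit(X, d)
    p_tgt = disc.predict_proba(X_src)[:, 1]
    return p_tgt / np.clip(1.0 - p_tgt, 1e-6, None)   # odds approximate the density ratio

def fit_weighted_source_classifier(X_src, y_src, X_tgt):
    w = importance_weights(X_src, X_tgt)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_src, y_src, sample_weight=w)             # no target labels needed
    return clf
```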