96 research outputs found

    A Survey on Negative Transfer

    Full text link
    Transfer learning (TL) tries to utilize data or knowledge from one or more source domains to facilitate learning in a target domain. It is particularly useful when the target domain has few or no labeled data, due to annotation expense, privacy concerns, etc. Unfortunately, the effectiveness of TL is not always guaranteed. Negative transfer (NT), i.e., when the source domain data/knowledge causes reduced learning performance in the target domain, has been a long-standing and challenging problem in TL. Various approaches to handle NT have been proposed in the literature. However, this field lacks a systematic survey on the formalization of NT, the factors leading to it, and the algorithms that handle it. This paper proposes to fill this gap. First, the definition of negative transfer is considered and a taxonomy of its factors is discussed. Then, nearly fifty representative approaches for handling NT are categorized and reviewed from four perspectives: secure transfer, domain similarity estimation, distant transfer, and negative transfer mitigation. NT in related fields, e.g., multi-task learning, lifelong learning, and adversarial attacks, is also discussed.
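    The survey itself is conceptual, but the definition of negative transfer above suggests a simple empirical check. The sketch below is not from the paper; the pooling baseline, the choice of logistic regression, and the data splits are illustrative assumptions. It compares a model trained on pooled source-plus-target data against a target-only baseline; a negative gap is a sign of negative transfer.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def transfer_gain(Xs, ys, Xt_train, yt_train, Xt_test, yt_test):
    """Accuracy of (source + target) training minus target-only training.

    A negative return value suggests negative transfer under this naive
    pooling strategy (a hypothetical baseline, not the survey's method).
    """
    target_only = LogisticRegression(max_iter=1000).fit(Xt_train, yt_train)
    combined = LogisticRegression(max_iter=1000).fit(
        np.vstack([Xs, Xt_train]), np.concatenate([ys, yt_train]))
    gain = (accuracy_score(yt_test, combined.predict(Xt_test))
            - accuracy_score(yt_test, target_only.predict(Xt_test)))
    return gain
```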

    A discrete contextual stochastic model for the off-line recognition of handwritten Chinese characters

    Get PDF
    We study a discrete contextual stochastic (CS) model for complex and variant patterns like handwritten Chinese characters. Three fundamental problems of using CS models for character recognition are discussed, and several practical techniques for solving these problems are investigated. A formulation for discriminative training of CS model parameters is also introduced and its practical usage investigated. To illustrate the characteristics of the various algorithms, comparative experiments are performed on a recognition task with a vocabulary consisting of 50 pairs of highly similar handwritten Chinese characters. The experimental results confirm the effectiveness of the discriminative training for improving recognition performance.

    Visualizing and Predicting the Effects of Rheumatoid Arthritis on Hands

    Get PDF
    This dissertation was inspired by the difficult decisions patients with chronic diseases have to make about treatment options in light of uncertainty. We look at rheumatoid arthritis (RA), a chronic, autoimmune disease that primarily affects the synovial joints of the hands and causes pain and deformities. In this work, we focus on several parts of a computer-based decision tool that patients can interact with using gestures, ask questions about the disease, and visualize possible futures. We propose a hand gesture based interaction method that is easily set up in a doctor's office and can be trained using a custom set of gestures that are least painful. Our system is versatile and can be used for operations ranging from simple selections to navigating a 3D world. We propose a point distribution model (PDM) that is capable of modeling hand deformities that occur due to RA and a generalized fitting method for use on radiographs of hands. Using our shape model, we show novel visualization of disease progression. Using expertly staged radiographs, we propose a novel distance metric learning and embedding technique that can be used to automatically stage an unlabeled radiograph. Given a large set of expertly labeled radiographs, our data-driven approach can be used to extract different modes of deformation specific to a disease.
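    As one way to make the point distribution model concrete, the following minimal sketch fits a PDM by applying PCA to flattened hand-landmark coordinates and synthesizes new shapes from mode weights, which is the basic mechanism behind visualizing progressive deformation. It is an assumption-laden illustration, not the dissertation's implementation: landmark alignment is assumed to have been done upstream and the array shapes are hypothetical.

```python
import numpy as np

def fit_pdm(landmarks, n_modes=5):
    """Fit a simple point distribution model to pre-aligned hand landmarks.

    landmarks: array of shape (n_shapes, n_points, 2)
    Returns the mean shape, the first n_modes deformation modes, and their variances.
    """
    n_shapes = landmarks.shape[0]
    X = landmarks.reshape(n_shapes, -1)           # flatten (x, y) coordinates
    mean_shape = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean_shape, full_matrices=False)
    modes = Vt[:n_modes]                          # principal deformation modes
    variances = (S[:n_modes] ** 2) / (n_shapes - 1)
    return mean_shape, modes, variances

def synthesize(mean_shape, modes, b):
    """Generate a shape from mode weights b (e.g. to visualize progression)."""
    return (mean_shape + b @ modes).reshape(-1, 2)
```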

    Machine Learning

    Get PDF
    Machine learning can be defined in various ways; broadly, it is a scientific domain concerned with the design and development of theoretical and implementation tools that allow building systems with some human-like intelligent behavior. More specifically, machine learning addresses the ability of such systems to improve automatically through experience.

    Automated framework for robust content-based verification of print-scan degraded text documents

    Get PDF
    Fraudulent documents frequently cause severe financial damages and impose security breaches on civil and government organizations. The rapid advances in technology and the widespread availability of personal computers have not reduced the use of printed documents. While digital documents can be verified by many robust and secure methods such as digital signatures and digital watermarks, verification of printed documents still relies on manual inspection of embedded physical security mechanisms. The objective of this thesis is to propose an efficient automated framework for robust content-based verification of printed documents. The principal issue is to achieve robustness with respect to the degradations and increased levels of noise that occur from multiple cycles of printing and scanning. It is shown that classic OCR systems fail under such conditions; moreover, OCR systems typically rely heavily on the use of high level linguistic structures to improve recognition rates. However, inferring knowledge about the contents of the document image from a priori statistics is contrary to the nature of document verification. Instead, a system is proposed that utilizes specific knowledge of the document to perform highly accurate content verification based on a print-scan degradation model and character shape recognition. Such specific knowledge of the document is a reasonable choice for the verification domain since the document contents are already known in order to verify them. The system analyses digital multi-font PDF documents to generate a descriptive summary of the document, referred to as the "Document Description Map" (DDM). The DDM is later used for verifying the content of printed and scanned copies of the original documents. The system utilizes 2-D Discrete Cosine Transform based features and an adaptive hierarchical classifier trained with synthetic data generated by a print-scan degradation model. The system is tested with varying degrees of print-scan channel corruption on a variety of documents, with corruption produced by repetitive printing and scanning of the test documents. Results show the approach achieves excellent accuracy and robustness despite the high level of noise.
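    To illustrate the kind of feature the verification system relies on, here is a minimal sketch of extracting low-frequency 2-D DCT coefficients from a normalized character image. It is only an assumed illustration: the thesis's exact block size, normalization, and hierarchical classifier are not reproduced here.

```python
import numpy as np
from scipy.fftpack import dct

def dct_features(char_img, k=8):
    """Low-frequency 2-D DCT coefficients of a normalized character image.

    char_img: 2-D array (grayscale character bitmap, assumed at least k x k).
    Returns a flat feature vector of the k x k low-frequency block.
    """
    img = char_img.astype(float)
    # Separable 2-D DCT: apply the 1-D transform along rows, then columns.
    coeffs = dct(dct(img, axis=0, norm='ortho'), axis=1, norm='ortho')
    return coeffs[:k, :k].ravel()
```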

    Detecting Anomalies in VoIP traffic using Principal Components Analysis

    Get PDF
    The idea of using a method based on Principal Components Analysis to detect anomalies in network traffic was first introduced by A. Lakhina, M. Crovella and C. Diot in an article published in 2004 called "Diagnosing Network-Wide Traffic Anomalies" [1]. They proposed a general method to diagnose traffic anomalies, using PCA to effectively separate the high-dimensional space occupied by a set of network traffic measurements into disjoint subspaces corresponding to normal and anomalous network conditions. This algorithm was tested in subsequent works that considered different characteristics of IP traffic over a network (such as byte counts, packet counts, IP-flow counts, etc.) [2]. The proposal of using entropy as a summarization tool inside the algorithm led to significant advances in terms of the possibility of analyzing massive data sources [3]; but this type of AD method still lacked the ability to recognize the users responsible for the detected anomalies. This last step was obtained using random aggregations of the IP flows, by means of sketches [4], leading to better performance in the detection of anomalies and to the possibility of identifying the responsible IP flows. This version of the algorithm has been implemented by C. Callegari and L. Gazzarini, at Università di Pisa, in an AD software, described in [5], for analyzing IP traffic traces and detecting anomalies in them. Our work consisted in adapting this software (designed to work with IP traffic traces) to use it with VoIP Call Data Records, in order to test its applicability as an Anomaly Detection system for voice traffic. We then used our modified version of the software to scan a real VoIP traffic trace, obtained from a telephone operator, in order to analyze the software's performance in a real environment. We applied two different types of analysis to the same traffic trace in order to understand the software's features and limits, as well as its applicability to AD problems. As we discovered that the software's performance is heavily dependent on the input parameters used in the analysis, we concluded with several tests performed using artificially created anomalies, in order to understand the relationships between each input parameter's value and the software's ability to detect different types of anomalies. In the end, the different analyses performed led us to some considerations on the possibility of applying this PCA-based software as an Anomaly Detector in VoIP environments.
    To the best of our knowledge this is the first time a technique based on Principal Components Analysis has been used to detect anomalous users in VoIP traffic. In more detail, our contribution consisted in:
    • Creating a version of an AD software based on PCA that can be used on VoIP traffic traces
    • Testing the software's performance on a real traffic trace obtained from a telephone operator
    • From the first tests, identifying the parameter values that yield results useful for detecting anomalous users in a VoIP environment
    • Observing the types of users detected by the software on this trace and classifying them according to their behavior over the whole duration of the trace
    • Analyzing how the choice of parameters impacts the type of detections obtained and determining the best choices for detecting each type of anomalous user
    • Proposing a new kind of application of the software that avoids the biggest limitation of the first type of analysis (namely, the impossibility of detecting more than one anomalous user per time bin)
    • Testing the software's performance with this new type of analysis, also observing how this different application affects the results' dependence on the input parameters
    • Comparing the software's ability to detect anomalous users with another AD software that works on the same type of trace (VoIP SEAL)
    • Modifying the real trace to obtain a version cleaned of all detectable anomalies, into which artificial anomalies can be added
    • Testing the software's performance in detecting different types of artificial anomalies
    • Analyzing in more detail the software's sensitivity to the input parameters when used for detecting artificially created anomalies
    • Comparing the results and observations obtained from these different types of analysis to derive a global assessment of the strengths and weaknesses of an Anomaly Detector based on Principal Components Analysis when applied to a VoIP trace
    The structure of our work is the following:
    1. We start by reviewing PCA theory and describing the structure of the algorithm used in our software, its features, and the type of data it needs in order to be used as an Anomaly Detection system for VoIP traffic.
    2. Then, after briefly describing the trace used to test the software, we introduce the first type of analysis performed, the single round analysis, pointing out the results obtained and their dependence on the parameter values.
    3. Next, we focus on a different type of analysis, the multiple round analysis, introduced to test the software's performance without its biggest limitation (the impossibility of detecting more than one user per time bin); we describe the results obtained, compare them with those of the single round analysis, check their dependence on the parameters, and compare the performance with that of another AD software (VoIP SEAL) on the same trace.
    4. We then consider the results and observations obtained by testing the software with artificial anomalies added to a "cleaned" version of the original trace (from which all anomalous users detectable with our software were removed), comparing the software's performance in detecting different types of anomalies and analyzing in detail their dependence on the parameter values.
    5. Finally, we describe our conclusions, derived from all the observations obtained with the different types of analysis, about the applicability of PCA-based software as an Anomaly Detector in a VoIP environment.
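    As a rough illustration of the underlying detector, the sketch below implements the Lakhina-style PCA subspace split: per-time-bin measurements are projected onto the "normal" subspace spanned by the top principal components, and bins whose residual energy (squared prediction error) is large are flagged. The number of normal components and the threshold stand in for the input parameters whose influence the thesis studies; the feature construction (e.g. entropies of CDR fields) and the percentile threshold are assumptions, not the thesis's exact pipeline.

```python
import numpy as np

def pca_subspace_spe(X, n_normal=4):
    """Squared prediction error of each time bin under a PCA subspace split.

    X: (time_bins, features) matrix, e.g. per-bin summaries of VoIP CDR fields
       (the exact features are an assumption for this sketch).
    """
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_normal].T                 # basis of the "normal" subspace
    residual = Xc - Xc @ P @ P.T        # part not explained by normal traffic
    return (residual ** 2).sum(axis=1)  # large SPE -> candidate anomalous bin

# Example thresholding (a hypothetical choice, not the thesis's rule):
# spe = pca_subspace_spe(X)
# anomalous_bins = np.where(spe > np.percentile(spe, 99))[0]
```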

    Learning from Very Few Samples: A Survey

    Full text link
    Few sample learning (FSL) is significant and challenging in the field of machine learning. The capability of learning and generalizing from very few samples is a noticeable demarcation separating artificial intelligence and human intelligence, since humans can readily establish their cognition of novelty from just a single or a handful of examples, whereas machine learning algorithms typically entail hundreds or thousands of supervised samples to guarantee generalization ability. Despite a long history dating back to the early 2000s and widespread attention in recent years with booming deep learning technologies, few surveys or reviews of FSL are available. In this context, we extensively review 300+ FSL papers spanning from the 2000s to 2019 and provide a timely and comprehensive survey of FSL. In this survey, we review the evolution history as well as the current progress of FSL, categorize FSL approaches into generative model based and discriminative model based kinds in principle, and emphasize particularly the meta learning based FSL approaches. We also summarize several recently emerging extensional topics of FSL and review the latest advances on these topics. Furthermore, we highlight the important FSL applications covering many research hotspots in computer vision, natural language processing, audio and speech, reinforcement learning and robotics, data analysis, etc. Finally, we conclude the survey with a discussion on promising trends in the hope of providing guidance and insights to follow-up research.
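    For a concrete flavor of learning from a handful of examples, the sketch below implements a nearest-class-mean (prototype-style) classifier over pre-computed embeddings, in the spirit of the metric/meta-learning family the survey emphasizes. It is an illustrative baseline, not a method proposed by the survey; the embedding function producing the feature vectors is assumed to exist upstream.

```python
import numpy as np

def prototype_classify(support_x, support_y, query_x):
    """Nearest-class-mean classification from a handful of labeled samples.

    support_x: (n_support, d) embedded support examples (few per class)
    support_y: (n_support,) integer class labels
    query_x:   (n_query, d) embedded query examples
    """
    classes = np.unique(support_y)
    # One prototype per class: the mean of its (few) support embeddings.
    prototypes = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from each query to each prototype.
    dists = ((query_x[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return classes[dists.argmin(axis=1)]
```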

    Doctor of Philosophy

    Get PDF
    Machine learning is the science of building predictive models from data that automatically improve based on past experience. To learn these models, traditional learning algorithms require labeled data. They also require that the entire dataset fit in the memory of a single machine. Labeled data are available or can be acquired for small and moderately sized datasets, but curating large datasets can be prohibitively expensive. Similarly, massive datasets are usually too large to fit into the memory of a single machine. An alternative is to distribute the dataset over multiple machines. Distributed learning, however, poses new challenges, as most existing machine learning techniques are inherently sequential. Additionally, these distributed approaches have to be designed keeping in mind various resource limitations of real-world settings, prime among them being inter-machine communication. With the advent of big datasets, machine learning algorithms are facing new challenges. Their design is no longer limited to minimizing some loss function but, additionally, needs to consider other resources that are critical when learning at scale. In this thesis, we explore different models and measures for learning under a resource budget. What budgetary constraints are posed by modern datasets? Can we reuse or combine existing machine learning paradigms to address these challenges at scale? How do the cost metrics change when we shift to distributed models for learning? These are some of the questions investigated in this thesis. The answers to these questions hold the key to addressing some of the challenges faced when learning on massive datasets. In the first part of this thesis, we present three different budgeted scenarios that deal with scarcity of labeled data and limited computational resources. The goal is to leverage transfer of information from related domains to learn under budgetary constraints. Our proposed techniques comprise semisupervised transfer, online transfer and active transfer. In the second part of this thesis, we study distributed learning with limited communication. We present initial sampling-based results, as well as propose communication protocols for learning distributed linear classifiers.
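    As a minimal illustration of the communication-budget theme in the second part, the sketch below trains a linear classifier independently on each machine's partition and averages the resulting weights in a single communication round. This one-shot parameter averaging is a standard baseline shown here for illustration only, not the protocol proposed in the dissertation; the partition format and binary-label setup are assumptions.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def train_with_parameter_averaging(partitions):
    """Communication-light baseline: fit a linear classifier on each machine's
    partition, then average the weights once (one communication round).

    partitions: iterable of (X_i, y_i) local datasets with binary labels.
    Returns an averaged weight vector w and bias b; predict with sign(X @ w + b).
    """
    coefs, intercepts = [], []
    for X_i, y_i in partitions:
        clf = SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3).fit(X_i, y_i)
        coefs.append(clf.coef_.ravel())
        intercepts.append(clf.intercept_[0])
    w = np.mean(coefs, axis=0)   # averaged weight vector
    b = np.mean(intercepts)      # averaged bias
    return w, b
```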

    Robust multivariate methods in Chemometrics

    Full text link
    This chapter presents an introduction to robust statistics with applications of a chemometric nature. Following a description of the basic ideas and concepts behind robust statistics, including how robust estimators can be conceived, the chapter builds up to the construction (and use) of robust alternatives to some methods for multivariate analysis frequently used in chemometrics, such as principal component analysis and partial least squares. The chapter then provides an insight into how these robust methods can be used or extended to classification. To conclude, the issue of validation of the results is addressed: it is shown how uncertainty statements associated with robust estimates can be obtained. Comment: This article is an update of: P. Filzmoser, S. Serneels, R. Maronna, P.J. Van Espen, 3.24 - Robust Multivariate Methods in Chemometrics, in Comprehensive Chemometrics, 1st Edition, edited by Steven D. Brown, Romà Tauler, Beata Walczak, Elsevier, 2009, https://doi.org/10.1016/B978-044452701-1.00113-
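    One simple way to obtain a robust alternative to classical PCA, in the spirit described above, is to diagonalize a robust covariance estimate instead of the sample covariance. The sketch below uses the Minimum Covariance Determinant estimator for this; it is one illustrative variant under that assumption, and the chapter's own robust PCA constructions may differ.

```python
import numpy as np
from sklearn.covariance import MinCovDet

def robust_pca(X, n_components=2):
    """Robust principal components via the Minimum Covariance Determinant.

    Outlying samples get little weight in the MCD covariance estimate, so the
    resulting loadings are far less influenced by them than classical PCA.
    """
    mcd = MinCovDet().fit(X)
    eigvals, eigvecs = np.linalg.eigh(mcd.covariance_)
    order = np.argsort(eigvals)[::-1]             # sort by decreasing variance
    loadings = eigvecs[:, order[:n_components]]
    scores = (X - mcd.location_) @ loadings       # center at the robust location
    return scores, loadings, eigvals[order[:n_components]]
```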