Search CORE

1,559 research outputs found

Restricted Minimum Error Entropy Criterion for Robust Classification

Author: Chen Badong
Koike Yasuharu
Li Yuanhao
Yoshimura Natsue
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 09/07/2020
Field of study

The minimum error entropy (MEE) criterion has been verified as a powerful approach for non-Gaussian signal processing and robust machine learning. However, the implementation of MEE on robust classification is rather a vacancy in the literature. The original MEE only focuses on minimizing the Renyi's quadratic entropy of the error probability distribution function (PDF), which could cause failure in noisy classification tasks. To this end, we analyze the optimal error distribution in the presence of outliers for those classifiers with continuous errors, and introduce a simple codebook to restrict MEE so that it drives the error PDF towards the desired case. Half-quadratic based optimization and convergence analysis of the new learning criterion, called restricted MEE (RMEE), are provided. Experimental results with logistic regression and extreme learning machine are presented to verify the desirable robustness of RMEE

arXiv.org e-Print Archive

Robust and Adversarial Data Mining

Author: Wang Fei
Publication venue: Faculty of Engineering and Information Technologies, School of Information Technologies
Publication date: 01/01/2015
Field of study

In the domain of data mining and machine learning, researchers have made significant contributions in developing algorithms handling clustering and classification problems. We develop algorithms under assumptions that are not met by previous works. (i) In adversarial learning, which is the study of machine learning techniques deployed in non-benign environments. We design an algorithm to show how a classifier should be designed to be robust against sparse adversarial attacks. Our main insight is that sparse feature attacks are best defended by designing classifiers which use L1 regularizers. (ii) The different properties between L1 (Lasso) and L2 (Tikhonov or Ridge) regularization has been studied extensively. However, given a data set, principle to follow in terms of choosing the suitable regularizer is yet to be developed. We use mathematical properties of the two regularization methods followed by detailed experimentation to understand their impact based on four characteristics. (iii) The identification of anomalies is an inherent component of knowledge discovery. In lots of cases, the number of features of a data set can be traced to a much smaller set of features. We claim that algorithms applied in a latent space are more robust. This can lead to more accurate results, and potentially provide a natural medium to explain and describe outliers. (iv) We also apply data mining techniques on health care industry. In a lot cases, health insurance companies cover unnecessary costs carried out by healthcare providers. The potential adversarial behaviours of surgeon physicians are addressed. We describe a specific con- text of private healthcare in Australia and describe our social network based approach (applied to health insurance claims) to understand the nature of collaboration among doctors treating hospital inpatients and explore the impact of collaboration on cost and quality of care. (v) We further develop models that predict the behaviours of orthopaedic surgeons in regard to surgery type and use of prosthetic device. An important feature of these models is that they can not only predict the behaviours of surgeons but also provide explanation for the predictions

Sydney eScholarship

Advanced machine learning algorithms for discrete datasets

Author: Mansha Sameen
Publication venue: 'University of Queensland Library'
Publication date: 04/08/2020
Field of study

University of Queensland eSpace

Data-driven design of intelligent wireless networks: an overview and tutorial

Author: De Poorter Eli
Deschrijver Dirk
Fortuna Carolina
Kulin Merima
Moerman Ingrid
Publication venue: 'MDPI AG'
Publication date: 01/01/2016
Field of study

Data science or "data-driven research" is a research approach that uses real-life data to gain insight about the behavior of systems. It enables the analysis of small, simple as well as large and more complex systems in order to assess whether they function according to the intended design and as seen in simulation. Data science approaches have been successfully applied to analyze networked interactions in several research areas such as large-scale social networks, advanced business and healthcare processes. Wireless networks can exhibit unpredictable interactions between algorithms from multiple protocol layers, interactions between multiple devices, and hardware specific influences. These interactions can lead to a difference between real-world functioning and design time functioning. Data science methods can help to detect the actual behavior and possibly help to correct it. Data science is increasingly used in wireless research. To support data-driven research in wireless networks, this paper illustrates the step-by-step methodology that has to be applied to extract knowledge from raw data traces. To this end, the paper (i) clarifies when, why and how to use data science in wireless network research; (ii) provides a generic framework for applying data science in wireless networks; (iii) gives an overview of existing research papers that utilized data science approaches in wireless networks; (iv) illustrates the overall knowledge discovery process through an extensive example in which device types are identified based on their traffic patterns; (v) provides the reader the necessary datasets and scripts to go through the tutorial steps themselves

Multidisciplinary Digital Publishing Institute

Ghent University Academic Bibliography

Directory of Open Access Journals

PubMed Central

Information Theory and Machine Learning

Author
Publication venue: 'MDPI AG'
Publication date: 25/10/2022
Field of study

The recent successes of machine learning, especially regarding systems based on deep neural networks, have encouraged further research activities and raised a new set of challenges in understanding and designing complex machine learning algorithms. New applications require learning algorithms to be distributed, have transferable learning results, use computation resources efficiently, convergence quickly on online settings, have performance guarantees, satisfy fairness or privacy constraints, incorporate domain knowledge on model structures, etc. A new wave of developments in statistical learning theory and information theory has set out to address these challenges. This Special Issue, "Machine Learning and Information Theory", aims to collect recent results in this direction reflecting a diverse spectrum of visions and efforts to extend conventional theories and develop analysis tools for these complex machine learning systems

Directory of Open Access Books (DOAB)