The effect of missing values using genetic programming on evolvable diagnosis
Medical databases usually contain missing values due to the policy of
reducing stress and harm to the patient. In practice, missing values have been a
problem mainly because of the need to evaluate mathematical equations obtained
by genetic programming. The solution to this problem is to use fill-in methods to
estimate the missing values. This paper analyses three fill-in methods: (1) attribute
means, (2) conditional means, and (3) random number generation. The methods
are evaluated using sensitivity, specificity, and entropy to explain the exchange in
knowledge of the results. The results are illustrated based on the breast cancer
database. Conditional means produced the best fill-in experimental results.
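The three fill-in methods and two of the evaluation measures named in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes that "conditional means" conditions on the class label and that random fill draws uniformly between the observed minimum and maximum. The `MISSING` marker and all function names are hypothetical.

```python
import random
from statistics import mean

MISSING = None  # hypothetical marker for a missing entry


def observed(rows, col):
    """Return the non-missing values of column `col`."""
    return [r[col] for r in rows if r[col] is not MISSING]


def fill_attribute_mean(rows, col):
    """(1) Attribute mean: replace missing entries with the column mean."""
    fill = mean(observed(rows, col))
    return [[fill if (i == col and v is MISSING) else v
             for i, v in enumerate(r)] for r in rows]


def fill_conditional_mean(rows, labels, col):
    """(2) Conditional mean (assumed: per class label).

    Raises KeyError if a class has no observed value in `col`.
    """
    by_class = {}
    for r, y in zip(rows, labels):
        if r[col] is not MISSING:
            by_class.setdefault(y, []).append(r[col])
    class_mean = {y: mean(vals) for y, vals in by_class.items()}
    return [[class_mean[y] if (i == col and v is MISSING) else v
             for i, v in enumerate(r)] for r, y in zip(rows, labels)]


def fill_random(rows, col, rng):
    """(3) Random fill (assumed: uniform over the observed range)."""
    vals = observed(rows, col)
    lo, hi = min(vals), max(vals)
    return [[rng.uniform(lo, hi) if (i == col and v is MISSING) else v
             for i, v in enumerate(r)] for r in rows]


def sensitivity(y_true, y_pred):
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    """True negative rate: TN / (TN + FP)."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)


# Toy data (made up for illustration): one attribute, binary class labels.
rows = [[1.0], [3.0], [MISSING], [10.0], [MISSING]]
labels = [0, 0, 0, 1, 1]
filled = fill_conditional_mean(rows, labels, col=0)
```

On the toy data, the conditional fill uses a different value for each class, whereas the attribute-mean fill uses a single value for every missing entry.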
A survey of cost-sensitive decision tree induction algorithms
The past decade has seen significant interest in the problem of inducing decision trees that take account of the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, and utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy and a historical timeline of how the field has developed, and should provide a useful reference point for future research in this field.
Rough set theory applied to pattern recognition of partial discharge in noise affected cable data
This paper presents an effective, Rough Set (RS) based, pattern recognition method for rejecting interference signals and recognising Partial Discharge (PD) signals from different sources. Firstly, RS theory is presented in terms of Information System, Lower and Upper Approximation, Signal Discretisation, Attribute Reduction and a flowchart of the RS based pattern recognition method. Secondly, PD testing of five types of artificial defect in ethylene-propylene rubber (EPR) cable is carried out, and data pre-processing and feature extraction are employed to separate PD and interference signals. Thirdly, the RS based PD signal recognition method is applied to 4000 samples and is proven to have 99% accuracy. Fourthly, the RS based PD recognition method is applied to signals from five different sources, and an accuracy of more than 93% is attained when a combination of signal discretisation and attribute reduction methods is applied. Finally, Back-propagation Neural Network (BPNN) and Support Vector Machine (SVM) methods are studied and compared with the developed method. The proposed RS method is proven to have higher accuracy than SVM and BPNN and can be applied for on-line PD monitoring of cable systems after training with valid sample data.
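The lower and upper approximations mentioned in the abstract above follow the standard rough set definitions, which can be sketched as below. This is only an illustration of those textbook definitions, not the paper's recognition method. The signal names, the discretised attributes `amp` and `phase`, and the function name are all hypothetical.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper) approximations of `target`.

    objects:    dict mapping object name -> dict of attribute -> value
    attributes: attributes used to decide which objects are indiscernible
    target:     set of object names belonging to the concept of interest
    """
    # Group objects that share the same values on the chosen attributes.
    classes = defaultdict(set)
    for name, desc in objects.items():
        classes[tuple(desc[a] for a in attributes)].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # block lies entirely inside the concept
            lower |= block
        if block & target:    # block overlaps the concept
            upper |= block
    return lower, upper


# Toy information system (made up): discretised features of four signals.
signals = {
    "s1": {"amp": "high", "phase": "pos"},
    "s2": {"amp": "high", "phase": "pos"},
    "s3": {"amp": "low", "phase": "neg"},
    "s4": {"amp": "low", "phase": "pos"},
}
pd_signals = {"s1", "s3"}  # assumed ground truth for the example
lower, upper = approximations(signals, ["amp", "phase"], pd_signals)
```

Here `s1` and `s2` are indiscernible on the chosen attributes, so `s1` cannot be placed in the lower approximation even though it is a PD signal; both land in the boundary region (upper minus lower).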
Ethical Adversaries: Towards Mitigating Unfairness with Adversarial Machine Learning
Machine learning is being integrated into a growing number of critical
systems with far-reaching impacts on society. Unexpected behaviour and unfair
decision processes are coming under increasing scrutiny due to this widespread
use and its theoretical considerations. Individuals, as well as organisations,
notice, test, and criticize unfair results to hold model designers and
deployers accountable. We offer a framework that assists these groups in
mitigating unfair representations stemming from the training datasets. Our
framework relies on two inter-operating adversaries to improve fairness. First,
a model is trained with the goal of preventing the guessing of protected
attributes' values while limiting utility losses. This first step optimizes the
model's parameters for fairness. Second, the framework leverages evasion
attacks from adversarial machine learning to generate new examples that will be
misclassified. These new examples are then used to retrain and improve the
model in the first step. These two steps are iteratively applied until a
significant improvement in fairness is obtained. We evaluated our framework on
well-studied datasets in the fairness literature -- including COMPAS -- where
it can surpass other approaches concerning demographic parity, equality of
opportunity and also the model's utility. We also illustrate our findings on
the subtle difficulties when mitigating unfairness and highlight how our
framework can assist model designers.
Comment: 15 pages, 3 figures, 1 table
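Two of the fairness criteria named in the abstract above, demographic parity and equality of opportunity, are commonly measured as gaps between groups. The sketch below uses those common definitions for binary predictions and is not taken from the paper. The function names and toy data are hypothetical.

```python
def demographic_parity_gap(y_pred, group):
    """Largest difference in positive-prediction rate between groups."""
    rates = []
    for g in set(group):
        preds = [p for p, gg in zip(y_pred, group) if gg == g]
        rates.append(sum(preds) / len(preds))
    return max(rates) - min(rates)


def equal_opportunity_gap(y_true, y_pred, group):
    """Largest difference in true positive rate between groups.

    Assumes every group contains at least one true positive label.
    """
    tprs = []
    for g in set(group):
        preds_on_pos = [p for t, p, gg in zip(y_true, y_pred, group)
                        if gg == g and t == 1]
        tprs.append(sum(preds_on_pos) / len(preds_on_pos))
    return max(tprs) - min(tprs)


# Toy data (made up): binary labels and predictions for groups "a" and "b".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = ["a"] * 4 + ["b"] * 4

dp_gap = demographic_parity_gap(y_pred, group)
eo_gap = equal_opportunity_gap(y_true, y_pred, group)
```

A gap of zero means the groups are treated identically under that criterion; a mitigation loop such as the one the abstract describes would aim to shrink these gaps while keeping utility losses small.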