31,449 research outputs found
ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning
To relieve the pain of manually selecting machine learning algorithms and
tuning hyperparameters, automated machine learning (AutoML) methods have been
developed to automatically search for good models. Due to the huge model search
space, it is impossible to try all models. Users tend to distrust automatic
results and increase the search budget as much as they can, thereby undermining
the efficiency of AutoML. To address these issues, we design and implement
ATMSeer, an interactive visualization tool that supports users in refining the
search space of AutoML and analyzing the results. To guide the design of
ATMSeer, we derive a workflow of using AutoML based on interviews with machine
learning experts. A multi-granularity visualization is proposed to enable users
to monitor the AutoML process, analyze the searched models, and refine the
search space in real time. We demonstrate the utility and usability of ATMSeer
through two case studies, expert interviews, and a user study with 13 end
users.Comment: Published in the ACM Conference on Human Factors in Computing Systems
(CHI), 2019, Glasgow, Scotland U
Online Tool Condition Monitoring Based on Parsimonious Ensemble+
Accurate diagnosis of tool wear in metal turning process remains an open
challenge for both scientists and industrial practitioners because of
inhomogeneities in workpiece material, nonstationary machining settings to suit
production requirements, and nonlinear relations between measured variables and
tool wear. Common methodologies for tool condition monitoring still rely on
batch approaches which cannot cope with a fast sampling rate of metal cutting
process. Furthermore they require a retraining process to be completed from
scratch when dealing with a new set of machining parameters. This paper
presents an online tool condition monitoring approach based on Parsimonious
Ensemble+, pENsemble+. The unique feature of pENsemble+ lies in its highly
flexible principle where both ensemble structure and base-classifier structure
can automatically grow and shrink on the fly based on the characteristics of
data streams. Moreover, the online feature selection scenario is integrated to
actively sample relevant input attributes. The paper presents advancement of a
newly developed ensemble learning algorithm, pENsemble+, where online active
learning scenario is incorporated to reduce operator labelling effort. The
ensemble merging scenario is proposed which allows reduction of ensemble
complexity while retaining its diversity. Experimental studies utilising
real-world manufacturing data streams and comparisons with well known
algorithms were carried out. Furthermore, the efficacy of pENsemble was
examined using benchmark concept drift data streams. It has been found that
pENsemble+ incurs low structural complexity and results in a significant
reduction of operator labelling effort.Comment: this paper has been published by IEEE Transactions on Cybernetic
Recommended from our members
Integrative machine learning approach for multi-class SCOP protein fold classification
Classification and prediction of protein structure has been a central research theme in structural bioinformatics. Due to the imbalanced distribution of proteins over multi SCOP classification, most discriminative machine learning suffers the well-known ‘False Positives ’ problem when learning over these types of problems. We have devised eKISS, an ensemble machine learning specifically designed to increase the coverage of positive examples when learning under multiclass imbalanced data sets. We have applied eKISS to classify 25 SCOP folds and show that our learning system improved over classical learning methods
Recommended from our members
Multi-class protein fold classification using a new ensemble machine learning approach.
Protein structure classification represents an important process in understanding the associations
between sequence and structure as well as possible functional and evolutionary relationships.
Recent structural genomics initiatives and other high-throughput experiments have populated the
biological databases at a rapid pace. The amount of structural data has made traditional methods
such as manual inspection of the protein structure become impossible. Machine learning has been
widely applied to bioinformatics and has gained a lot of success in this research area. This work
proposes a novel ensemble machine learning method that improves the coverage of the classifiers
under the multi-class imbalanced sample sets by integrating knowledge induced from different base
classifiers, and we illustrate this idea in classifying multi-class SCOP protein fold data. We have
compared our approach with PART and show that our method improves the sensitivity of the
classifier in protein fold classification. Furthermore, we have extended this method to learning over
multiple data types, preserving the independence of their corresponding data sources, and show
that our new approach performs at least as well as the traditional technique over a single joined
data source. These experimental results are encouraging, and can be applied to other bioinformatics
problems similarly characterised by multi-class imbalanced data sets held in multiple data
sources
ReCon: Revealing and Controlling PII Leaks in Mobile Network Traffic
It is well known that apps running on mobile devices extensively track and
leak users' personally identifiable information (PII); however, these users
have little visibility into PII leaked through the network traffic generated by
their devices, and have poor control over how, when and where that traffic is
sent and handled by third parties. In this paper, we present the design,
implementation, and evaluation of ReCon: a cross-platform system that reveals
PII leaks and gives users control over them without requiring any special
privileges or custom OSes. ReCon leverages machine learning to reveal potential
PII leaks by inspecting network traffic, and provides a visualization tool to
empower users with the ability to control these leaks via blocking or
substitution of PII. We evaluate ReCon's effectiveness with measurements from
controlled experiments using leaks from the 100 most popular iOS, Android, and
Windows Phone apps, and via an IRB-approved user study with 92 participants. We
show that ReCon is accurate, efficient, and identifies a wider range of PII
than previous approaches.Comment: Please use MobiSys version when referencing this work:
http://dl.acm.org/citation.cfm?id=2906392. 18 pages, recon.meddle.mob
- …