31,449 research outputs found

ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning

Author: Jin Zhihua
Liu Dongyu
Ming Yao
Qu Huamin
Shen Qiaomu
Smith Micah J.
Veeramachaneni Kalyan
Wang Qianwen
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/02/2019

To relieve the pain of manually selecting machine learning algorithms and tuning hyperparameters, automated machine learning (AutoML) methods have been developed to automatically search for good models. Due to the huge model search space, it is impossible to try all models. Users tend to distrust automatic results and increase the search budget as much as they can, thereby undermining the efficiency of AutoML. To address these issues, we design and implement ATMSeer, an interactive visualization tool that supports users in refining the search space of AutoML and analyzing the results. To guide the design of ATMSeer, we derive a workflow of using AutoML based on interviews with machine learning experts. A multi-granularity visualization is proposed to enable users to monitor the AutoML process, analyze the searched models, and refine the search space in real time. We demonstrate the utility and usability of ATMSeer through two case studies, expert interviews, and a user study with 13 end users. Comment: Published in the ACM Conference on Human Factors in Computing Systems (CHI), 2019, Glasgow, Scotland U

arXiv.org e-Print Archive

Crossref

Classifying emotions in Stack Overflow and JIRA using a multi-label approach

Author: BESSIS NIKOLAOS
CABRERA DIEGO LUIS ADRIAN
KORKONTZELOS YANNIS
Publication venue: 'Elsevier BV'
Publication date: 11/05/2020

Edge Hill University Research Information Repository

Online Tool Condition Monitoring Based on Parsimonious Ensemble+

Author: Dimla Eric
Lughofer Edwin
Pedrycz Witold
Pratama Mahardhika
Tjahjowidowo Tegoeh
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/12/2019

Accurate diagnosis of tool wear in the metal turning process remains an open challenge for both scientists and industrial practitioners because of inhomogeneities in workpiece material, nonstationary machining settings to suit production requirements, and nonlinear relations between measured variables and tool wear. Common methodologies for tool condition monitoring still rely on batch approaches, which cannot cope with the fast sampling rate of the metal cutting process. Furthermore, they require a retraining process to be completed from scratch when dealing with a new set of machining parameters. This paper presents an online tool condition monitoring approach based on Parsimonious Ensemble+, pENsemble+. The unique feature of pENsemble+ lies in its highly flexible principle, where both the ensemble structure and the base-classifier structure can automatically grow and shrink on the fly based on the characteristics of data streams. Moreover, an online feature selection scenario is integrated to actively sample relevant input attributes. The paper presents an advancement of a newly developed ensemble learning algorithm, pENsemble+, in which an online active learning scenario is incorporated to reduce operator labelling effort. An ensemble merging scenario is proposed, which allows reduction of ensemble complexity while retaining its diversity. Experimental studies utilising real-world manufacturing data streams and comparisons with well-known algorithms were carried out. Furthermore, the efficacy of pENsemble was examined using benchmark concept drift data streams. It has been found that pENsemble+ incurs low structural complexity and results in a significant reduction of operator labelling effort. Comment: this paper has been published by IEEE Transactions on Cybernetic

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Integrative machine learning approach for multi-class SCOP protein fold classification

Author: Deville Y
Gilbert D
Tan A C
Publication venue: GCB
Publication date: 01/01/2003

Classification and prediction of protein structure has been a central research theme in structural bioinformatics. Due to the imbalanced distribution of proteins across the multi-class SCOP classification, most discriminative machine learning methods suffer from the well-known 'False Positives' problem when learning over these types of problems. We have devised eKISS, an ensemble machine learning method specifically designed to increase the coverage of positive examples when learning from multi-class imbalanced data sets. We have applied eKISS to classify 25 SCOP folds and show that our learning system improves on classical learning methods.

Brunel University Research Archive

Multi-class protein fold classification using a new ensemble machine learning approach.

Author: Deville Y
Gilbert D
Tan A
Publication venue: GIW
Publication date: 01/01/2003

Protein structure classification represents an important process in understanding the associations between sequence and structure as well as possible functional and evolutionary relationships. Recent structural genomics initiatives and other high-throughput experiments have populated the biological databases at a rapid pace. The amount of structural data has made traditional methods, such as manual inspection of the protein structure, impossible. Machine learning has been widely applied to bioinformatics and has had considerable success in this research area. This work proposes a novel ensemble machine learning method that improves the coverage of the classifiers under multi-class imbalanced sample sets by integrating knowledge induced from different base classifiers, and we illustrate this idea by classifying multi-class SCOP protein fold data. We have compared our approach with PART and show that our method improves the sensitivity of the classifier in protein fold classification. Furthermore, we have extended this method to learning over multiple data types, preserving the independence of their corresponding data sources, and show that our new approach performs at least as well as the traditional technique over a single joined data source. These experimental results are encouraging, and the method can be applied to other bioinformatics problems similarly characterised by multi-class imbalanced data sets held in multiple data sources.

Brunel University Research Archive

ReCon: Revealing and Controlling PII Leaks in Mobile Network Traffic

Author: Egele M.
Enck W.
Jagabathula S.
Kim J.
Rao A.
Roesner F.
Street The Wall
Yan L. K.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/06/2016

It is well known that apps running on mobile devices extensively track and leak users' personally identifiable information (PII); however, these users have little visibility into PII leaked through the network traffic generated by their devices, and have poor control over how, when and where that traffic is sent and handled by third parties. In this paper, we present the design, implementation, and evaluation of ReCon: a cross-platform system that reveals PII leaks and gives users control over them without requiring any special privileges or custom OSes. ReCon leverages machine learning to reveal potential PII leaks by inspecting network traffic, and provides a visualization tool to empower users with the ability to control these leaks via blocking or substitution of PII. We evaluate ReCon's effectiveness with measurements from controlled experiments using leaks from the 100 most popular iOS, Android, and Windows Phone apps, and via an IRB-approved user study with 92 participants. We show that ReCon is accurate, efficient, and identifies a wider range of PII than previous approaches. Comment: Please use MobiSys version when referencing this work: http://dl.acm.org/citation.cfm?id=2906392. 18 pages, recon.meddle.mob

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server