42,467 research outputs found

Multi-objective variable subset selection using heterogeneous surrogate modeling and sequential design

Author: Couckuyt Ivo
Deschrijver Dirk
Dhaene Tom
van der Herten Joachim
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

Crossref

Ghent University Academic Bibliography

Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms

Author: Hoos Holger H.
Hutter Frank
Leyton-Brown Kevin
Thornton Chris
Publication date: 01/01/2012

Many different machine learning algorithms exist; taking into account each algorithm's hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that addresses these issues in isolation. We show that this problem can be addressed by a fully automated approach, leveraging recent innovations in Bayesian optimization. Specifically, we consider a wide range of feature selection techniques (combining 3 search and 8 evaluator methods) and all classification approaches implemented in WEKA, spanning 2 ensemble methods, 10 meta-methods, 27 base classifiers, and hyperparameter settings for each classifier. On each of 21 popular datasets from the UCI repository, the KDD Cup 09, variants of the MNIST dataset and CIFAR-10, we show classification performance often much better than using standard selection/hyperparameter optimization methods. We hope that our approach will help non-expert users to more effectively identify machine learning algorithms and hyperparameter settings appropriate to their applications, and hence to achieve improved performance. Comment: 9 pages, 3 figure

arXiv.org e-Print Archive

CiteSeerX

Analysis of group evolution prediction in complex networks

Author: Bródka Piotr
Kazienko Przemysław
Koziarski Michał
Saganowski Stanisław
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2019

In a world in which acceptance by and identification with social communities are highly desired, the ability to predict the evolution of groups over time appears to be a vital but very complex research problem. Therefore, we propose a new, adaptable, generic and multi-stage method for Group Evolution Prediction (GEP) in complex networks that facilitates reasoning about the future states of recently discovered groups. The precise GEP modularity enabled us to carry out extensive and versatile empirical studies on many real-world complex/social networks to analyze the impact of numerous setups and parameters such as time window type and size, group detection method, evolution chain length, prediction models, etc. Additionally, many new predictive features reflecting the group state at a given time have been identified and tested. Some other research problems, such as enriching learning evolution chains with external data, have been analyzed as well.

arXiv.org e-Print Archive

Directory of Open Access Journals

Towards Efficient Data Valuation Based on the Shapley Value

Author: Dao David
Gurel Nezihe Merve
Hubis Frances Ann
Hynes Nick
Jia Ruoxi
Li Bo
Song Dawn
Spanos Costas
Wang Boxin
Zhang Ce
Publication date: 16/08/2020

"How much is my data worth?" is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining prospective compensation when data breaches happen. In this paper, we study the problem of data valuation by utilizing the Shapley value, a popular notion of value which originated in cooperative game theory. The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value. However, the Shapley value often requires exponential time to compute. To meet this challenge, we propose a repertoire of efficient algorithms for approximating the Shapley value. We also demonstrate the value of each training instance for various benchmark datasets.

arXiv.org e-Print Archive
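
The abstract above notes that the Shapley value often takes exponential time to compute exactly and is therefore approximated. As a rough illustration only, the sketch below compares exact enumeration over all orderings with plain Monte Carlo permutation sampling, a generic textbook estimator that is not claimed to be one of the paper's algorithms. The function names and the three-player `toy_utility` are hypothetical and chosen purely for the example.

```python
import itertools
import math
import random


def _marginals(order, utility, values):
    """Add each player's marginal contribution along one ordering to `values`."""
    coalition = set()
    prev = utility(frozenset(coalition))
    for p in order:
        coalition.add(p)
        cur = utility(frozenset(coalition))
        values[p] += cur - prev
        prev = cur


def exact_shapley(players, utility):
    """Exact Shapley values: average marginal contribution over all n! orderings."""
    values = {p: 0.0 for p in players}
    for order in itertools.permutations(players):
        _marginals(order, utility, values)
    n_fact = math.factorial(len(players))
    return {p: v / n_fact for p, v in values.items()}


def monte_carlo_shapley(players, utility, num_samples, seed=0):
    """Approximate Shapley values from `num_samples` random orderings."""
    rng = random.Random(seed)
    values = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(num_samples):
        rng.shuffle(order)
        _marginals(order, utility, values)
    return {p: v / num_samples for p, v in values.items()}


def toy_utility(coalition):
    """Hypothetical utility: additive weights plus a bonus when 'a' and 'b' co-occur."""
    weights = {"a": 1.0, "b": 2.0, "c": 3.0}
    bonus = 2.0 if {"a", "b"} <= coalition else 0.0
    return sum(weights[p] for p in coalition) + bonus


players = ["a", "b", "c"]
exact = exact_shapley(players, toy_utility)
approx = monte_carlo_shapley(players, toy_utility, num_samples=2000)
```

In this toy game the bonus is split evenly between "a" and "b", so the exact values are 2, 3 and 3. The sampled estimate costs `num_samples` orderings rather than n!, which is the trade-off the abstract alludes to.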

How to Find More Supernovae with Less Work: Object Classification Techniques for Difference Imaging

Author: B. A. Weaver
Becker A. C.
C. Aragon
D. Wong
Fisher R. A.
Freund Y.
R. C. Thomas
R. Romano
S. Bailey
Zahn C. T.
Publication venue: 'University of Chicago Press'
Publication date: 02/05/2007

We present the results of applying new object classification techniques to difference images in the context of the Nearby Supernova Factory supernova search. Most current supernova searches subtract reference images from new images, identify objects in these difference images, and apply simple threshold cuts on parameters such as statistical significance, shape, and motion to reject objects such as cosmic rays, asteroids, and subtraction artifacts. Although most static objects subtract cleanly, even a very low false positive detection rate can lead to hundreds of non-supernova candidates which must be vetted by human inspection before triggering additional followup. In comparison to simple threshold cuts, more sophisticated methods such as Boosted Decision Trees, Random Forests, and Support Vector Machines provide dramatically better object discrimination. At the Nearby Supernova Factory, we reduced the number of non-supernova candidates by a factor of 10 while increasing our supernova identification efficiency. Methods such as these will be crucial for maintaining a reasonable false positive rate in the automated transient alert pipelines of upcoming projects such as PanSTARRS and LSST. Comment: 25 pages; 6 figures; submitted to Ap

arXiv.org e-Print Archive

Crossref

UNT Digital Library