
288 research outputs found

Efficient AUC Optimization for Information Ranking Applications

Author: C Cortes
C Manning
CJ Burges
Q Wu
T Calders
T Fawcett
T Qin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Adequate evaluation of an information retrieval system to estimate future performance is a crucial task. Area under the ROC curve (AUC) is widely used to evaluate the generalization of a retrieval system. However, the objective function optimized in many retrieval systems is the error rate and not the AUC value. This paper provides an efficient and effective non-linear approach to optimize AUC using additive regression trees, with a special emphasis on the use of multi-class AUC (MAUC) because multiple relevance levels are widely used in many ranking applications. Compared to a conventional linear approach, the performance of the non-linear approach is comparable on binary-relevance benchmark datasets and is better on multi-relevance benchmark datasets. Comment: 12 pages.
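As an illustration of the quantity this abstract refers to, binary AUC can be read as the fraction of (relevant, non-relevant) pairs that a scoring function orders correctly. The sketch below computes only that standard pairwise definition; it is not the paper's optimization method (additive regression trees) and does not cover multi-class AUC. The function name `pairwise_auc` and the toy scores are assumptions made for the example.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """Binary AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair. Hypothetical helper for
    illustration; labels are assumed to be 1 (relevant) or 0 (not relevant).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    total = 0.0
    for p, n in product(pos, neg):
        if p > n:
            total += 1.0
        elif p == n:
            total += 0.5
    return total / (len(pos) * len(neg))


# Toy data (assumed): two relevant and two non-relevant documents.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
auc = pairwise_auc(scores, labels)  # 3 of the 4 pairs are ordered correctly
```

The quadratic loop is kept for readability; a sort-based computation gives the same value in O(n log n).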

arXiv.org e-Print Archive

Crossref

On Making Good Games - Using Player Virtue Ethics and Gameplay Design Patterns to Identify Generally Desirable Gameplay Features

Author: A. M. Frisch
G. Georgakopoulos
N. Nilsson
R. Fagin
R. Fagin
T. Calders
Publication date: 01/01/2001

This paper uses a framework of player virtues to perform a theoretical exploration of what is required to make a game good. The choice of player virtues is based upon the view that games can be seen as implements, and that these are good if they support an intended use, and the intended use of games is to support people to be good players. A collection of gameplay design patterns, identified through their relation to the virtues, is presented to provide specific starting points for considering design options for this type of good game. 24 patterns are identified supporting the virtues, including RISK/REWARD, DYNAMIC ALLIANCES, GAME MASTERS, and PLAYER DECIDED RESULTS, as are 7 countering three or more virtues, including ANALYSIS PARALYSIS, EARLY ELIMINATION, and GRINDING. The paper concludes by identifying limitations of the approach as well as by showing how it can be applied using other views of what are preferable features in games.

RISE – Research Institutes of Sweden

Chalmers Research

Institutional Repository Universiteit Antwerpen

Digitala Vetenskapliga Arkivet - Academic Archive On-line

DI-fusion

Swedish Institute of Computer Science Publications Database

Learning what matters - Sampling interesting patterns

Author: M Bhuiyan
M Boley
M Leeuwen van
M Leeuwen van
S Chakraborty
S Shalev-Shwartz
T Calders
V Dzyuba
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

In the field of exploratory data mining, local structure in data can be described by patterns and discovered by mining algorithms. Although many solutions have been proposed to address the redundancy problems in pattern mining, most of them either provide succinct pattern sets or take the interests of the user into account, but not both. Consequently, the analyst has to invest substantial effort in identifying those patterns that are relevant to her specific interests and goals. To address this problem, we propose a novel approach that combines pattern sampling with interactive data mining. In particular, we introduce the LetSIP algorithm, which builds upon recent advances in 1) weighted sampling in SAT and 2) learning to rank in interactive pattern mining. Specifically, it exploits user feedback to directly learn the parameters of the sampling distribution that represents the user's interests. We compare the performance of the proposed algorithm to the state-of-the-art in interactive pattern mining by emulating the interests of a user. The resulting system allows efficient and interleaved learning and sampling, and thus user-specific anytime data exploration. Finally, LetSIP demonstrates favourable trade-offs concerning both quality-diversity and exploitation-exploration when compared to existing methods. Comment: PAKDD 2017, extended version.

Lirias

arXiv.org e-Print Archive

Crossref

Leiden University Scholary Publications

An automatic critical care urine meter

Author: A. Bykowski
A. Bykowski
A. Giacometti
B. Ganter
B. Jeudy
B. Liu
G. Piatetsky-Shapiro
H. Mannila
H. Mannila
J. Besson
J. Galambos
J.-F. Boulicaut
J.-F. Boulicaut
J.-F. Boulicaut
J.-F. Boulicaut
L. Raedt De
L. Raedt De
M. Kryszkiewicz
N. Pasquier
R. Agrawal
T. Calders
T. Calders
T. Calders
T. Imielinski
Y. Bastide
Y. Bastide
Publication venue: 'MDPI AG'
Publication date: 01/01/2004

Nowadays patients admitted to critical care units have most of their physiological parameters measured automatically by sophisticated commercial monitoring devices. More often than not, these devices supervise whether the values of the parameters they measure lie within a pre-established range, and issue warnings of deviations from this range by triggering alarms. The automation of measuring and supervising tasks not only relieves the healthcare staff of a considerable workload but also avoids human errors in these repetitive and monotonous tasks. Arguably, the most relevant physiological parameter that is still measured and supervised manually by critical care unit staff is urine output (UO). In this paper we present a patent-pending device that provides continuous and accurate measurements of a patient’s UO. The device uses capacitive sensors to take continuous measurements of the height of the column of liquid accumulated in two chambers that make up a plastic container. The first chamber, which receives the urine, has a small volume. Once it has been filled, it overflows into a second, larger chamber. The first chamber provides accurate UO measures of patients whose UO has to be closely supervised, while the second one avoids the need for frequent interventions by the nursing staff to empty the container.

Institutional Repository Universiteit Antwerpen

DI-fusion

HAL: Hyper Article en Ligne

Hal-Diderot

Archivo Digital UPM (Univ. Politécnica de Madrid)

Flexible constrained sampling with guarantees for pattern mining

Author: A Giacometti
A Zimmermann
C Bucilă
CP Gomes
F Bonchi
Luc De Raedt
M Berlingerio
M Boley
MA Hasan
Matthijs van Leeuwen
S Ermon
S Nijssen
T Calders
T Guns
T Guns
Vladimir Dzyuba
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Pattern sampling has been proposed as a potential solution to the infamous pattern explosion. Instead of enumerating all patterns that satisfy the constraints, individual patterns are sampled proportional to a given quality measure. Several sampling algorithms have been proposed, but each of them has its limitations when it comes to 1) flexibility in terms of quality measures and constraints that can be used, and/or 2) guarantees with respect to sampling accuracy. We therefore present Flexics, the first flexible pattern sampler that supports a broad class of quality measures and constraints, while providing strong guarantees regarding sampling accuracy. To achieve this, we leverage the perspective on pattern mining as a constraint satisfaction problem and build upon the latest advances in sampling solutions in SAT as well as existing pattern mining algorithms. Furthermore, the proposed algorithm is applicable to a variety of pattern languages, which allows us to introduce and tackle the novel task of sampling sets of patterns. We introduce and empirically evaluate two variants of Flexics: 1) a generic variant that addresses the well-known itemset sampling task and the novel pattern set sampling task as well as a wide range of expressive constraints within these tasks, and 2) a specialized variant that exploits existing frequent itemset techniques to achieve substantial speed-ups. Experiments show that Flexics is both accurate and efficient, making it a useful tool for pattern-based data exploration. Comment: Accepted for publication in Data Mining & Knowledge Discovery journal (ECML/PKDD 2017 journal track).
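To make "sampled proportional to a given quality measure" concrete, the sketch below draws patterns with probability proportional to their quality by enumerating a small candidate list first. This is the naive baseline that the abstract contrasts against, not the Flexics algorithm, which is described as avoiding full enumeration by building on SAT sampling. The helper `sample_patterns`, the toy patterns, and the choice of pattern length as the quality measure are all assumptions made for the example.

```python
import random


def sample_patterns(patterns, quality, k, seed=0):
    """Draw k patterns (with replacement), each with probability
    proportional to quality(pattern).

    Hypothetical helper: enumerates every candidate up front, which is
    exactly what a real pattern sampler is designed to avoid.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    weights = [quality(p) for p in patterns]
    return rng.choices(patterns, weights=weights, k=k)


# Toy candidate itemsets (assumed); quality is simply the itemset size,
# so the weights are 1, 2 and 3.
patterns = [frozenset("a"), frozenset("ab"), frozenset("abc")]
samples = sample_patterns(patterns, quality=len, k=1000)
```

With these weights, the largest itemset should appear in roughly half of the draws and the smallest in roughly one sixth.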

Lirias

arXiv.org e-Print Archive

Crossref

Leiden University Scholary Publications

Swepub

Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules

Author: A Freitas
C C Aggarwal P S Y
D Gunopulos R Khardon, H Mannila, S Sal
Georg Gottlob
J-F Boulicaut A Bykowski, C Rigotti: Fr
J-L Guigues V Duquenne:
José Balcázar
R Agrawal T Imielinski, A Swam
R Dechter J Pearl:
R Khardon D Roth
T Calders B Goethals:
T Eiter G Gottlob
Publication venue: 'Logical Methods in Computer Science e.V.'
Publication date: 01/01/2009

Association rules are among the most widely employed data analysis methods in the field of Data Mining. An association rule is a form of partial implication between two sets of binary variables. In the most common approach, association rules are parameterized by a lower bound on their confidence, which is the empirical conditional probability of their consequent given the antecedent, and/or by some other parameter bounds such as "support" or deviation from independence. We study here notions of redundancy among association rules from a fundamental perspective. We see each transaction in a dataset as an interpretation (or model) in the propositional logic sense, and consider existing notions of redundancy, that is, of logical entailment, among association rules, of the form "any dataset in which this first rule holds must obey also that second rule, therefore the second is redundant". We discuss several existing alternative definitions of redundancy between association rules and provide new characterizations and relationships among them. We show that the main alternatives we discuss correspond actually to just two variants, which differ in the treatment of full-confidence implications. For each of these two notions of redundancy, we provide a sound and complete deduction calculus, and we show how to construct complete bases (that is, axiomatizations) of absolutely minimum size in terms of the number of rules. Finally, we explore an approach to redundancy with respect to several association rules, and fully characterize its simplest case of two partial premises. Comment: LMCS accepted paper.
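The two rule parameters named in this abstract can be sketched directly from their definitions: support is the fraction of transactions containing an itemset, and confidence is the empirical conditional probability of the consequent given the antecedent. The sketch covers only these definitions, not the paper's redundancy notions or deduction calculus; the function names and the toy transactions are assumptions made for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Empirical P(consequent | antecedent) over the transactions."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    covered = [t for t in transactions if antecedent <= t]
    if not covered:
        raise ValueError("antecedent does not occur in any transaction")
    return sum(1 for t in covered if both <= t) / len(covered)


# Toy dataset (assumed): each transaction is a set of items.
transactions = [
    frozenset(t)
    for t in (["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"])
]
supp_ab = support(transactions, {"a", "b"})       # 2 of 4 transactions
conf_ab = confidence(transactions, {"a"}, {"b"})  # 2 of the 3 containing "a"
```

A rule a -> b would then be reported only if `conf_ab` meets the chosen lower bound on confidence.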

arXiv.org e-Print Archive

CiteSeerX

Crossref

Episciences.org

Directory of Open Access Journals

Family and identity. Catholic and non-Catholic intermarriage: attitudes to children, identity and sharing household responsibilities

Author: A. Moore
C. Silverstein
D. Pavlov
D.E. Knuth
H. Mannila
J.-F. Boulicaut
R. Agrawal
R. Meo
S. Jaroszewicz
T. Calders
Z. Zheng
Publication venue: David Lovell Publishing
Publication date: 01/01/2005


CiteSeerX

DRO Deakin Research Online

Crossref

Repository TU/e

Pure OAI Repository

Institutional Repository Universiteit Antwerpen

DI-fusion

New perspectives on the ecology of tree structure and tree communities through terrestrial laser scanning

Author: Bartholomeus H
Bentley LP
Calders K
Disney MI
Herold M
Jackson T
Lau A
Malhi Y
Shenkin A
Publication venue: 'The Royal Society'
Publication date: 01/01/2018

Terrestrial laser scanning (TLS) opens up the possibility of describing the three-dimensional structures of trees in natural environments with unprecedented detail and accuracy. It is already being extensively applied to describe how ecosystem biomass and structure vary between sites, but can also facilitate major advances in developing and testing mechanistic theories of tree form and forest structure, thereby enabling us to understand why trees and forests have the biomass and three-dimensional structure they do. Here we focus on the ecological challenges and benefits of understanding tree form, and highlight some advances related to capturing and describing tree shape that are becoming possible with the advent of TLS. We present examples of ongoing work that applies, or could potentially apply, new TLS measurements to better understand the constraints on optimization of tree form. Theories of resource distribution networks, such as metabolic scaling theory, can be tested and further refined. TLS can also provide new approaches to the scaling of woody surface area and crown area, and thereby better quantify the metabolism of trees. Finally, we demonstrate how we can develop a more mechanistic understanding of the effects of avoidance of wind risk on tree form and maximum size. Over the next few years, TLS promises to deliver both major empirical and conceptual advances in the quantitative understanding of trees and tree-dominated ecosystems, leading to advances in understanding the ecology of why trees and ecosystems look and grow the way they do.

Crossref

Ghent University Academic Bibliography

UCL Discovery

Wageningen University & Research Publications

Features of the Formation of the Ethnic Composition of the Peasant Stratum of the Steppe Pobuzhzhia

Author: A. Siebes
H. Mannila
J. Vreeken
J.-F. Boulicaut
M. Leeuwen van
M. Leeuwen van
M. Leeuwen van
N. Pasquier
R. Bathoorn
T. Calders
T.M. Cover
Publication venue: Інститут історії України НАН України
Publication date: 01/01/2009

In this short paper we sketch a brief introduction to our Krimp algorithm. Moreover, we briefly discuss some of the large body of follow-up research. Pointers to the relevant papers are provided in the bibliography.

Наукова електронна бібліотека періодичних видань НАН України (Vernadsky National Library of Ukraine)

Crossref

Utrecht University Repository

Algebraic Comparison of Partial Lists in Bioinformatics

Author: A Gobbi
A Kalousis
A Kossenkov
A Sboner
AC Haury
AL Boulesteix
Arkady B. Khodursky
B Di Camillo
B Efron
B Efron
B Efron
B Schowe
C Cortes
C Cortes
C Furlanello
C Schneider
C Schneider
C Soneson
C Yao
Cesare Furlanello
Consortium The MicroArray Quality Control (MAQC)
D Albanese
D Cai
D Corrada
D Critchlow
D Saari
D Witten
G Guzzetta
G Jurman
G Jurman
G Lance
G Lance
G Smyth
Giuseppe Jurman
GS Cheon
I Guyon
I Jeffery
I Lönnstedt
J Bar-Ilan
J Borda
J Chen
J Ioannidis
J Neter
J Storey
L Ein-Dor
L Kuncheva
L Yu
L Zhang
M Desarkar
M Kauers
M Kauers
M Kendall
M Schimek
M Schimek
M Slawski
M Villarino
M Villarino
O Bousquet
P Baldi
P Diaconis
P Diaconis
P Hall
P Hall
P Krízek
PC Boutros
R Fagin
R Gentleman
R Graham
R Pearson
R Pique-Regi
R Pique-Regi
R Simon
Roberto Visintainer
S Abramov
S Dudoit
S Lin
S Lin
S Mukherjee
S Setlur
S Simićc
S Vanderlooy
Samantha Riccadonna
SK Lau
T Bø
T Calders
V Tusher
Visintainer
W Fury
W Hoeffding
W Shi
X Wang
X Yang
Y Xiao
Y Xiao
Z He
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 08/04/2010

The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling biological phenotype in terms of a classification or regression model. Due to resampling protocols or just within a meta-analysis comparison, instead of one list it is often the case that sets of alternative feature lists (possibly of different lengths) are obtained. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset.

arXiv.org e-Print Archive

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Directory of Open Access Journals

PubMed Central