F-measure Maximization in Multi-Label Classification with Conditionally Independent Label Subsets
We discuss a method to improve the exact F-measure maximization algorithm
called GFM, proposed in (Dembczynski et al. 2011) for multi-label
classification, assuming the label set can be partitioned into
conditionally independent subsets given the input features. If the labels were
all independent, the estimation of only m parameters (m denoting the number
of labels) would suffice to derive Bayes-optimal predictions in O(m^2)
operations. In the general case, m^2 + 1 parameters are required by GFM to
solve the problem in O(m^3) operations. In this work, we show that the number
of parameters can be reduced further to m^2/n, in the best case, assuming the
label set can be partitioned into n conditionally independent subsets. As
this label partition needs to be estimated from the data beforehand, we
first use the procedure proposed in (Gasse et al. 2015) that finds such a
partition and then infer the required parameters locally in each label subset.
The latter are aggregated and serve as input to GFM to form the Bayes-optimal
prediction. We show on a synthetic experiment that the reduction in the number
of parameters brings about significant benefits in terms of performance.
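To make the special case concrete, the following is a minimal sketch of Bayes-optimal F-measure prediction when all labels are independent, the setting where m parameters (the marginal label probabilities) suffice. It relies on the known fact that an optimal prediction then consists of the top-k labels by marginal probability for some k, and estimates each candidate's expected F1 by Monte Carlo; the function names are illustrative, not from the paper, and this is not the GFM algorithm itself.

```python
import numpy as np

def expected_f1(pred, marginals, n_samples=20000, seed=0):
    # Monte Carlo estimate of E[F1] when labels are independent Bernoullis
    # with the given marginals; F1 of (empty truth, empty prediction) is
    # taken to be 1 by convention.
    rng = np.random.default_rng(seed)
    y = rng.random((n_samples, len(marginals))) < marginals  # sampled label vectors
    tp = (y & pred).sum(axis=1)
    denom = y.sum(axis=1) + pred.sum()
    f1 = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 1.0)
    return float(f1.mean())

def best_topk_prediction(marginals):
    # Under label independence, a Bayes-optimal F1 prediction selects the
    # k labels with the highest marginals for some k in {0, ..., m}, so a
    # linear search over m + 1 candidate predictions suffices.
    m = len(marginals)
    order = np.argsort(marginals)[::-1]
    best = np.zeros(m, dtype=bool)
    best_val = expected_f1(best, marginals)      # k = 0: empty prediction
    for k in range(1, m + 1):
        pred = np.zeros(m, dtype=bool)
        pred[order[:k]] = True
        val = expected_f1(pred, marginals)
        if val > best_val:
            best, best_val = pred, val
    return best, best_val
```

For marginals such as (0.9, 0.8, 0.1), the search settles on predicting the two high-probability labels, since adding the third lowers expected F1 more than the occasional true positive gains.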
Conformal Rule-Based Multi-label Classification
We advocate the use of conformal prediction (CP) to enhance rule-based
multi-label classification (MLC). In particular, we highlight the mutual
benefit of CP and rule learning: Rules have the ability to provide natural
(non-)conformity scores, which are required by CP, while CP suggests a way to
calibrate the assessment of candidate rules, thereby supporting better
predictions and more elaborate decision making. We illustrate the potential
usefulness of calibrated conformity scores in a case study on lazy multi-label
rule learning.
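The calibration step the abstract refers to can be sketched as split conformal prediction over per-label nonconformity scores. The function below is an illustrative sketch, not the authors' procedure: it assumes some rule model already supplies a nonconformity score per candidate label, and returns the set of labels whose scores are conformal at level 1 − α.

```python
import numpy as np

def conformal_label_set(cal_scores, test_scores, alpha=0.1):
    # Split conformal prediction with generic nonconformity scores (e.g.
    # scores derived from rule (non-)conformity): cal_scores[i] is the score
    # of calibration example i at its true label, test_scores[c] the score of
    # candidate label c for the new instance.
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))   # finite-sample correction
    q = np.sort(cal_scores)[min(k, n) - 1]    # k-th smallest calibration score
    # Keep every candidate whose nonconformity does not exceed the threshold;
    # under exchangeability the set covers the true label w.p. >= 1 - alpha.
    return [c for c, s in enumerate(test_scores) if s <= q]
```

The same threshold q can equally be read as a calibrated yardstick for judging candidate rules, which is the mutual benefit the abstract highlights.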
ENDER: A Statistical Framework for Boosting Decision Rules
Induction of decision rules plays an important role in machine learning.
The main advantage of decision rules is their simplicity and human-interpretable form.
Moreover, they are capable of modeling complex interactions between attributes. In
this paper, we thoroughly analyze a learning algorithm, called ENDER, which constructs
an ensemble of decision rules. This algorithm is tailored for regression and
binary classification problems. It uses the boosting approach for learning, which can
be treated as generalization of sequential covering. Each new rule is fitted by focusing
on examples which were the hardest to classify correctly by the rules already present
in the ensemble. We consider different loss functions and minimization techniques
often encountered in the boosting framework. The minimization techniques are used
to derive impurity measures which control construction of single decision rules. Properties
of four different impurity measures are analyzed with respect to the trade-off
between misclassification (discrimination) and coverage (completeness) of the rule.
Moreover, we consider regularization consisting of shrinking and sampling.
Finally, we compare the ENDER algorithm with other well-known decision rule
learners such as SLIPPER, LRI and RuleFit.
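The boosting view of rule induction described above can be sketched compactly: each new rule is fitted to the current residuals, i.e. to the examples the ensemble so far handles worst, and its response is shrunk before being added. The code below is a minimal sketch in the spirit of ENDER for squared-error regression with single-condition rules; it is not the published algorithm, and all names are illustrative.

```python
import numpy as np

def fit_rule_ensemble(X, y, n_rules=20, shrinkage=0.5):
    # Boosting as a generalization of sequential covering: every rule is a
    # single "feature <= / > threshold" condition with a constant response,
    # fitted to the residuals of the ensemble built so far.
    base = y.mean()                       # default rule: constant prediction
    pred = np.full(len(y), base)
    rules = []
    for _ in range(n_rules):
        resid = y - pred                  # negative gradient of squared loss
        best = None
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                for op in (np.less_equal, np.greater):
                    cover = op(X[:, j], t)
                    if not cover.any():
                        continue
                    resp = resid[cover].mean()
                    gain = cover.sum() * resp ** 2   # impurity (SSE) reduction
                    if best is None or gain > best[0]:
                        best = (gain, j, t, op, resp)
        _, j, t, op, resp = best
        rules.append((j, t, op, shrinkage * resp))   # shrinking as regularization
        pred += np.where(op(X[:, j], t), shrinkage * resp, 0.0)

    def predict(X_new):
        out = np.full(len(X_new), base)
        for j, t, op, resp in rules:
            out += np.where(op(X_new[:, j], t), resp, 0.0)
        return out
    return predict
```

Swapping the squared-error impurity for another loss-derived impurity measure, or subsampling the rows before each rule search, recovers the loss/regularization variations the paper analyzes.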
Bipartite Ranking through Minimization of Univariate Loss
Minimization of the rank loss or, equivalently, maximization of the AUC in bipartite ranking calls for minimizing the number of disagreements between pairs of instances. Since the complexity of this problem is inherently quadratic in the number of training examples, it is tempting to ask how much is actually lost by minimizing a simple univariate loss function, as done by standard classification methods, as a surrogate. In this paper, we first note that minimization of 0/1 loss is not an option, as it may yield an arbitrarily high rank loss. We show, however, that better results can be achieved by means of a weighted (cost-sensitive) version of 0/1 loss. Yet, the real gain is obtained through margin-based loss functions, for which we are able to derive proper bounds, not only for rank risk but, more importantly, also for rank regret. The paper is completed with an experimental study in which we address specific questions raised by our theoretical analysis.
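The two losses contrasted in the abstract can be written down directly. The sketch below (illustrative names, not the paper's code) computes the rank loss, i.e. the fraction of misordered positive-negative pairs, equal to 1 − AUC, by explicit pair enumeration, which makes the quadratic cost visible, alongside a cost-sensitive 0/1 loss that weights each class error equally, as in the weighted surrogate the paper advocates over plain 0/1 loss.

```python
import numpy as np

def rank_loss(scores, labels):
    # Fraction of (positive, negative) pairs ranked incorrectly; ties count
    # one half. Pair enumeration is O(n_pos * n_neg), hence quadratic overall.
    pos = scores[labels == 1][:, None]
    neg = scores[labels == 0][None, :]
    return float(((pos < neg) + 0.5 * (pos == neg)).mean())

def balanced_01_loss(scores, labels, threshold=0.0):
    # Cost-sensitive 0/1 loss: false negatives and false positives are
    # averaged per class, so each class's errors carry equal total weight.
    y_hat = scores > threshold
    fnr = float(np.mean(y_hat[labels == 1] == 0))   # missed positives
    fpr = float(np.mean(y_hat[labels == 0] == 1))   # false alarms
    return 0.5 * (fnr + fpr)
```

A univariate loss such as the balanced one above is evaluated per instance in linear time, which is exactly why bounding the rank loss (and rank regret) in terms of it is attractive.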