392 research outputs found
Learning nonlinear monotone classifiers using the Choquet Integral
In der jüngeren Vergangenheit hat das Lernen von Vorhersagemodellen, die eine monotone Beziehung zwischen Ein- und Ausgabevariablen garantieren, wachsende Aufmerksamkeit im Bereich des maschinellen Lernens erlangt. Besonders für flexible nichtlineare Modelle stellt die Gewährleistung der Monotonie eine große Herausforderung für die Umsetzung dar. Die vorgelegte Arbeit nutzt das Choquet Integral als mathematische Grundlage für die Entwicklung neuer Modelle für nichtlineare Klassifikationsaufgaben. Neben den bekannten Einsatzgebieten des Choquet-Integrals als flexible Aggregationsfunktion in multi-kriteriellen Entscheidungsverfahren, findet der Formalismus damit Eingang als wichtiges Werkzeug für Modelle des maschinellen Lernens. Neben dem Vorteil, Monotonie und Flexibilität auf elegante Weise mathematisch vereinbar zu machen, bietet das Choquet-Integral Möglichkeiten zur Quantifizierung von Wechselwirkungen zwischen Gruppen von Attributen der Eingabedaten, wodurch interpretierbare Modelle gewonnen werden können. In der Arbeit werden konkrete Methoden für das Lernen mit dem Choquet Integral entwickelt, welche zwei unterschiedliche Ansätze nutzen, die Maximum-Likelihood-Schätzung und die strukturelle Risikominimierung. Während der erste Ansatz zu einer Verallgemeinerung der logistischen Regression führt, wird der zweite mit Hilfe von Support-Vektor-Maschinen realisiert. In beiden Fällen wird das Lernproblem imWesentlichen auf die Parameter-Identifikation von Fuzzy-Maßen für das Choquet Integral zurückgeführt. Die exponentielle Anzahl von Freiheitsgraden zur Modellierung aller Attribut-Teilmengen stellt dabei besondere Herausforderungen im Hinblick auf Laufzeitkomplexität und Generalisierungsleistung. Vor deren Hintergrund werden die beiden Ansätze praktisch bewertet und auch theoretisch analysiert. Zudem werden auch geeignete Verfahren zur Komplexitätsreduktion und Modellregularisierung vorgeschlagen und untersucht. Die experimentellen Ergebnisse sind auch für anspruchsvolle Referenzprobleme im Vergleich mit aktuellen Verfahren sehr gut und heben die Nützlichkeit der Kombination aus Monotonie und Flexibilität des Choquet Integrals in verschiedenen Ansätzen des maschinellen Lernens hervor
Preference Learning
This report documents the program and the outcomes of Dagstuhl Seminar 14101 “Preference Learning”. Preferences have recently received considerable attention in disciplines such as machine learning, knowledge discovery, information retrieval, statistics, social choice theory, multiple criteria decision making, decision under risk and uncertainty, operations research, and others. The motivation for this seminar was to showcase recent progress in these different areas with the goal of working towards a common basis of understanding, which should help to facilitate future synergies
Dominance-based Rough Set Approach, basic ideas and main trends
Dominance-based Rough Approach (DRSA) has been proposed as a machine learning
and knowledge discovery methodology to handle Multiple Criteria Decision Aiding
(MCDA). Due to its capacity of asking the decision maker (DM) for simple
preference information and supplying easily understandable and explainable
recommendations, DRSA gained much interest during the years and it is now one
of the most appreciated MCDA approaches. In fact, it has been applied also
beyond MCDA domain, as a general knowledge discovery and data mining
methodology for the analysis of monotonic (and also non-monotonic) data. In
this contribution, we recall the basic principles and the main concepts of
DRSA, with a general overview of its developments and software. We present also
a historical reconstruction of the genesis of the methodology, with a specific
focus on the contribution of Roman S{\l}owi\'nski.Comment: This research was partially supported by TAILOR, a project funded by
European Union (EU) Horizon 2020 research and innovation programme under GA
No 952215. This submission is a preprint of a book chapter accepted by
Springer, with very few minor differences of just technical natur
Rough set and rule-based multicriteria decision aiding
The aim of multicriteria decision aiding is to give the decision maker a recommendation concerning a set of objects evaluated from multiple points of view called criteria. Since a rational decision maker acts with respect to his/her value system, in order to recommend the most-preferred decision, one must identify decision maker's preferences. In this paper, we focus on preference discovery from data concerning some past decisions of the decision maker. We consider the preference model in the form of a set of "if..., then..." decision rules discovered from the data by inductive learning. To structure the data prior to induction of rules, we use the Dominance-based Rough Set Approach (DRSA). DRSA is a methodology for reasoning about data, which handles ordinal evaluations of objects on considered criteria and monotonic relationships between these evaluations and the decision. We review applications of DRSA to a large variety of multicriteria decision problems
Ordinal regression methods: survey and experimental study
Abstract—Ordinal regression problems are those machine learning problems where the objective is to classify patterns using a
categorical scale which shows a natural order between the labels. Many real-world applications present this labelling structure and
that has increased the number of methods and algorithms developed over the last years in this field. Although ordinal regression can
be faced using standard nominal classification techniques, there are several algorithms which can specifically benefit from the ordering
information. Therefore, this paper is aimed at reviewing the state of the art on these techniques and proposing a taxonomy based on
how the models are constructed to take the order into account. Furthermore, a thorough experimental study is proposed to check if
the use of the order information improves the performance of the models obtained, considering some of the approaches within the
taxonomy. The results confirm that ordering information benefits ordinal models improving their accuracy and the closeness of the
predictions to actual targets in the ordinal scal
Knowledge Discovery and Monotonicity
The monotonicity property is ubiquitous in our lives and it appears in different roles: as domain knowledge, as a requirement, as a property that reduces the complexity of the problem, and so on. It is present in various domains: economics, mathematics, languages, operations research and many others. This thesis is focused on the monotonicity property in knowledge discovery and more specifically in classification, attribute reduction, function decomposition, frequent patterns generation and missing values handling. Four specific problems are addressed within four different methodologies, namely, rough sets theory, monotone decision trees, function decomposition and frequent patterns generation. In the first three parts, the monotonicity is domain knowledge and a requirement for the outcome of the classification process. The three methodologies are extended for dealing with monotone data in order to be able to guarantee that the outcome will also satisfy the monotonicity requirement. In the last part, monotonicity is a property that helps reduce the computation of the process of frequent patterns generation. Here the focus is on two of the best algorithms and their comparison both theoretically and experimentally.
About the Author:
Viara Popova was born in Bourgas, Bulgaria in 1972. She followed her secondary
education at Mathematics High School "Nikola Obreshkov" in Bourgas. In 1996
she finished her higher education at Sofia University, Faculty of Mathematics
and Informatics where she graduated with major in Informatics and specialization
in Information Technologies in Education. She then joined the Department
of Information Technologies,
First as an associated member and from 1997 as an assistant professor.
In 1999 she became a PhD student at Erasmus University Rotterdam, Faculty
of Economics, Department of Computer Science. In 2004 she joined the
Artificial Intelligence Group within the Department of Computer Science, Faculty
of Sciences at Vrije Universiteit Amsterdam as a PostDoc researcher.This thesis is positioned in the area of knowledge discovery with special attention to problems where the property of monotonicity plays an important role. Monotonicity is a ubiquitous property in all areas of life and has therefore been widely studied in mathematics. Monotonicity in knowledge discovery can be treated as available background information that can facilitate and guide the knowledge extraction process. While in some sub-areas methods have already been developed for taking this additional information into account, in most methodologies it has not been extensively studied or even has not been addressed at all. This thesis is a contribution to a change in that direction. In the thesis, four specific problems have been examined from different sub-areas of knowledge discovery: the rough sets methodology, monotone decision trees, function decomposition and frequent patterns discovery. In the first three parts, the monotonicity is domain knowledge and a requirement for the outcome of the classification process. The three methodologies are extended for dealing with monotone data in order to be able to guarantee that the outcome will also satisfy the monotonicity requirement. In the last part, monotonicity is a property that helps reduce the computation of the process of frequent patterns generation. Here the focus is on two of the best algorithms and their comparison both theoretically and experimentally
Recent advances in the theory and practice of logical analysis of data
Logical Analysis of Data (LAD) is a data analysis methodology introduced by Peter L. Hammer in 1986. LAD distinguishes itself from other classification and machine learning methods by the fact that it analyzes a significant subset of combinations of variables to describe the positive or negative nature of an observation and uses combinatorial techniques to extract models defined in terms of patterns. In recent years, the methodology has tremendously advanced through numerous theoretical developments and practical applications. In the present paper, we review the methodology and its recent advances, describe novel applications in engineering, finance, health care, and algorithmic techniques for some stochastic optimization problems, and provide a comparative description of LAD with well-known classification methods
- …