Learning Interpretable Rules for Multi-label Classification

Author: A Gabriel
AA Freitas
AJ Knobbe
B Liu
B Minnaert
D Malerba
E Gibaja
E Loza Mencía
E Montañés
F Charte
F Herrera
F Janssen
F Thabtah
G Bosc
G Tsoumakas
Grigorios Tsoumakas
H Allahyari
J Arunadevi
J Demšar
J Fürnkranz
J Han
J Hipp
J Read
JN Sulzmann
K Dembczyński
L Chekina
L Raedt De
LE Sucar
M Atzmüller
M Beckerle
M Friedman
M Zhang
Miltiadis Allamanis
MR Boutell
P Kralj Novak
PJ Hayes
R Senge
RM Cameron-Jones
Shantanu Godbole
W Duivesteijn
W Waegeman
WW Cohen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2018

Multi-label classification (MLC) is a supervised learning problem in which, contrary to standard multiclass classification, an instance can be associated with several class labels simultaneously. In this chapter, we advocate a rule-based approach to multi-label classification. Rule learning algorithms are often employed when one is not only interested in accurate predictions, but also requires an interpretable theory that can be understood, analyzed, and qualitatively evaluated by domain experts. Ideally, by revealing patterns and regularities contained in the data, a rule-based theory yields new insights in the application domain. Recently, several authors have started to investigate how rule-based models can be used for modeling multi-label data. Discussing this task in detail, we highlight some of the problems that make rule learning considerably more challenging for MLC than for conventional classification. While mainly focusing on our own previous work, we also provide a short overview of related work in this area. Comment: Preprint version. To appear in: Explainable and Interpretable Models in Computer Vision and Machine Learning. The Springer Series on Challenges in Machine Learning. Springer (2018). See http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further information.

arXiv.org e-Print Archive

TUbiblio

Crossref

Optimal model-free prediction from multivariate time series

Author: Donner RV
Kurths J
Runge J
Publication venue: 'American Physical Society (APS)'
Publication date: 18/02/2015

Forecasting a time series from multivariate predictors constitutes a challenging problem, especially using model-free approaches. Most techniques, such as nearest-neighbor prediction, quickly suffer from the curse of dimensionality and overfitting for more than a few predictors, which has limited their application mostly to the univariate case. Therefore, selection strategies are needed that harness the available information as efficiently as possible. Since often the right combination of predictors matters, ideally all subsets of possible predictors should be tested for their predictive power, but the exponentially growing number of combinations makes such an approach computationally prohibitive. Here a prediction scheme that overcomes this strong limitation is introduced utilizing a causal preselection step which drastically reduces the number of possible predictors to the most predictive set of causal drivers, making a globally optimal search scheme tractable. The information-theoretic optimality is derived and practical selection criteria are discussed. As demonstrated for multivariate nonlinear stochastic delay processes, the optimal scheme can even be less computationally expensive than commonly used suboptimal schemes like forward selection. The method suggests a general framework to apply the optimal model-free approach to select variables and subsequently fit a model to further improve a prediction or learn statistical dependencies. The performance of this framework is illustrated on a climatological index of El Niño Southern Oscillation.

Spiral - Imperial College Digital Repository

Optimal model-free prediction from multivariate time series

Author: Donner Reik V.
Kurths Jürgen
Runge Jakob
Publication venue: 'American Physical Society (APS)'
Publication date: 13/05/2015

Forecasting a time series from multivariate predictors constitutes a challenging problem, especially using model-free approaches. Most techniques, such as nearest-neighbor prediction, quickly suffer from the curse of dimensionality and overfitting for more than a few predictors, which has limited their application mostly to the univariate case. Therefore, selection strategies are needed that harness the available information as efficiently as possible. Since often the right combination of predictors matters, ideally all subsets of possible predictors should be tested for their predictive power, but the exponentially growing number of combinations makes such an approach computationally prohibitive. Here a prediction scheme that overcomes this strong limitation is introduced utilizing a causal pre-selection step which drastically reduces the number of possible predictors to the most predictive set of causal drivers, making a globally optimal search scheme tractable. The information-theoretic optimality is derived and practical selection criteria are discussed. As demonstrated for multivariate nonlinear stochastic delay processes, the optimal scheme can even be less computationally expensive than commonly used sub-optimal schemes like forward selection. The method suggests a general framework to apply the optimal model-free approach to select variables and subsequently fit a model to further improve a prediction or learn statistical dependencies. The performance of this framework is illustrated on a climatological index of El Niño Southern Oscillation. Comment: 14 pages, 9 figures

arXiv.org e-Print Archive

Aberdeen University Research

Crossref

Efficient Discovery of Ontology Functional Dependencies

Author: Baskaran Sridevi
Chiang Fei
Keller Alexander
Lukasz Golab
Szlichta Jaroslaw
Publication date: 23/05/2017

Poor data quality has become a pervasive issue due to the increasing complexity and size of modern datasets. Constraint-based data cleaning techniques rely on integrity constraints as a benchmark to identify and correct errors. Data values that do not satisfy the given set of constraints are flagged as dirty, and data updates are made to re-align the data and the constraints. However, many errors often require user input to resolve due to domain expertise defining specific terminology and relationships. For example, in pharmaceuticals, 'Advil' is-a brand name for 'ibuprofen' that can be captured in a pharmaceutical ontology. While functional dependencies (FDs) have traditionally been used in existing data cleaning solutions to model syntactic equivalence, they are not able to model broader relationships (e.g., is-a) defined by an ontology. In this paper, we take a first step towards extending the set of data quality constraints used in data cleaning by defining and discovering Ontology Functional Dependencies (OFDs). We lay out theoretical and practical foundations for OFDs, including a set of sound and complete axioms, and a linear inference procedure. We then develop effective algorithms for discovering OFDs, and a set of optimizations that efficiently prune the search space. Our experimental evaluation using real data shows the scalability and accuracy of our algorithms. Comment: 12 pages

arXiv.org e-Print Archive

Crossref

Knowledge data discovery and data mining in a design environment

Author: Duffy Alex
Haffey Mark
Publication date: 01/01/2000

Designers, in the process of satisfying design requirements, generally encounter difficulties in, firstly, understanding the problem and secondly, finding a solution [Cross 1998]. Often the process of understanding the problem and developing a feasible solution are developed simultaneously by proposing a solution to gauge the extent to which the solution satisfies the specific requirements. Support for future design activities has long been recognised to exist in the form of past design cases; however, the varying degrees of similarity and dissimilarity found between previous and current design requirements and solutions has restrained the effectiveness of utilising past design solutions. The knowledge embedded within past designs provides a source of experience with the potential to be utilised in future developments provided that the ability to structure and manipulate that knowledge can be made a reality. Providing the ability to manipulate past design knowledge allows the ranging viewpoints experienced by a designer, during a design process, to be reflected and supported. Data Mining systems are gaining acceptance in several domains but to date remain largely unrecognised in terms of the potential to support design activities. It is the focus of this paper to introduce the functionality possessed within the realm of Data Mining tools, and to evaluate the level of support that may be achieved in manipulating and utilising experiential knowledge to satisfy designers' ranging perspectives throughout a product's development.

University of Strathclyde Institutional Repository

Approaching new migration through Elias' 'established' and 'outsiders' lens

Author: Petintseva Olga
Publication venue: Michigan Publishing
Publication date: 01/01/2015

When considering social positions and features that become distinguishing for migrants’ positioning, scholars quite often rely on empirical descriptions, based on discrete and supposedly clearly definable factors. Whereas elements such as legal position, citizenship, etc. are of huge relevance in numerous contexts, in other domains relying on such delineations while studying discriminatory processes oversimplifies the picture. In this paper, a conceptual issue regarding the understandings of the positions of migrants (particularly recent migrations to Western Europe) is raised. After a discussion of the definitions of ‘new migrations’, a broad heuristic device for thinking about ‘new’ migrants’ positioning will be outlined. This framework – inspired by Elias’ and Scotson’s ‘The Established and the Outsiders’ (1994 [1965]) – can be adapted in different manners by academics addressing topics related to definition, marginalization, and discriminatory processes. The central point is that although various characteristics (e.g. ethnicity, legal position) can be assigned importance in human figurations, the relationships of othering, inequality, and domination need to be seen in the light of the configuration of social relationships and power imbalances.

Ghent University Academic Bibliography

Understanding the changing nature of sports organisations in transforming societies

Author: Girginov V
Sandanski I
Publication venue: 'Elsevier BV'
Publication date: 01/05/2008

The paper examined the process of changing in three Bulgarian national sport organisations (NSO) in swimming, weightlifting and field hockey, as the country is undergoing fundamental political, economic and social transformations from state socialism (1945-1989) to democratisation (1990-present). Drawing on the contextualist approach to organisational change (Pettigrew, 1985) the study was concerned with understanding long-term processes in their context. Analysed were NSOs’ conceptual orientation, structures, resources, capabilities and outcomes. Changing was unveiled through the interplay between three levels of analysis - wider political and economic, sport sector, and organisation-specific. The history of changing unfolded over a 25-year period and followed three stages of crisis of governability (1980-1989), crisis displacement (1989-1997) and identity search (1998-2004). Changing was determined by tensions generated in the previous socialist sport system, the new forces in the NSOs’ context, and by managers’ interpretation of events, and was a discovery process. The three NSOs followed different change patterns of shrinking, insulation and expansion. Two key reasons were responsible for those differences - the institutionalisation of the broader political and sport sector contexts, and NSOs’ choice to pursue narrow elitism (specialism) or the broader aims of sports development (generalism). The contextualist approach allowed us to appreciate the historical, contextual and processual nature of changing and to discuss the role of managers and various forces in shaping its course and outcomes.

Brunel University Research Archive