Interpretable multiclass classification by MDL-based rule lists

Authors: Hugo M. Proença, Matthijs van Leeuwen
Publication venue: 'Elsevier BV'
Publication date: 31/10/2019

Interpretable classifiers have recently witnessed an increase in attention from the data mining community because they are inherently easier to understand and explain than their more complex counterparts. Examples of interpretable classification models include decision trees, rule sets, and rule lists. Learning such models often involves optimizing hyperparameters, which typically requires substantial amounts of data and may result in relatively large models. In this paper, we consider the problem of learning compact yet accurate probabilistic rule lists for multiclass classification. Specifically, we propose a novel formalization based on probabilistic rule lists and the minimum description length (MDL) principle. This results in virtually parameter-free model selection that naturally trades off model complexity against goodness of fit, so that overfitting and the need for hyperparameter tuning are effectively avoided. Finally, we introduce the Classy algorithm, which greedily finds rule lists according to the proposed criterion. We empirically demonstrate that Classy selects small probabilistic rule lists that outperform state-of-the-art classifiers when it comes to the combination of predictive performance and interpretability. We show that Classy is insensitive to its only parameter, i.e., the candidate set, and that compression on the training set correlates with classification performance, validating our MDL-based selection criterion.
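The trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the Classy algorithm or the paper's encoding: the function names (`data_bits`, `total_bits`, `greedy_rule_list`), the Laplace-smoothed label code, and the flat per-rule model cost are all assumptions chosen for brevity. The sketch only shows the general idea of greedily adding a rule when it shortens a two-part description length (model bits plus data bits).

```python
import math
from collections import Counter


def data_bits(labels, classes):
    """Bits to encode labels with a Laplace-smoothed class distribution (assumed code)."""
    counts = Counter(labels)
    n, k = len(labels), len(classes)
    return -sum(c * math.log2((c + 1) / (n + k)) for c in counts.values())


def total_bits(rule_list, rows, labels, classes, n_candidates):
    """Two-part length: bits for the rule list plus bits for the labels given it."""
    # Assumed model cost: each rule is an index into the candidate set.
    model = len(rule_list) * math.log2(n_candidates + 1)
    remaining = list(range(len(rows)))
    data = 0.0
    for rule in rule_list:
        # A rule list is ordered: each row is handled by the first rule that fires.
        covered = [i for i in remaining if rule(rows[i])]
        remaining = [i for i in remaining if not rule(rows[i])]
        data += data_bits([labels[i] for i in covered], classes)
    # Rows matched by no rule fall through to the default rule.
    data += data_bits([labels[i] for i in remaining], classes)
    return model + data


def greedy_rule_list(candidates, rows, labels):
    """Append the candidate that shortens the total length most; stop when none does."""
    classes = sorted(set(labels))
    chosen = []
    best = total_bits(chosen, rows, labels, classes, len(candidates))
    while True:
        scored = [
            (total_bits(chosen + [r], rows, labels, classes, len(candidates)), i)
            for i, r in enumerate(candidates)
            if r not in chosen
        ]
        if not scored:
            break
        bits, idx = min(scored)
        if bits >= best:
            break
        chosen.append(candidates[idx])
        best = bits
    return chosen, best
```

In this sketch the only input besides the data is the candidate set, and the stopping point falls out of the length comparison rather than a tuned threshold, which mirrors the parameter-free selection the abstract describes.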

arXiv.org e-Print Archive

Leiden University Scholarly Publications

Social action on social media

Author: Carl Miller
Publication venue: Nesta

People try to help others in a wide number of ways. Taken together this is social action: the heart of civil society, and the foundation of a healthy one. However, some social action is hard to spot. It may be unregistered, be carried out with little or no income, or have little formal governance. This paper examines a new way of detecting and measuring social action, especially that which takes place below the radar. It uses a new methodology developed by CASM that uses social media to spot, collect and measure social action that is normally carried out below the radar. It uses natural language processing algorithms to analyse and sort large quantities of Tweets related to two key events: the flooding of 2014, and the launch of the Step up to Serve Campaign. This paper finds: Disasters, accidents and catastrophes are likely to create explosions of Tweets too large to read manually. Some people will use Twitter to either offer or ask for help. This will often be specific to the disaster, spontaneous, and by people operating outside of any organization or charity. Twitter is a significant new forum which people will use in response to events to try to help each other. And it recommends: 'An Ebay for social action on social media': Connecting social action supply with demand: When social action information is found, it could be centralized onto a real-time online platform, information exchange or brokerage hub, clearly related to a specific event and segmented either being offered

Analysis and Policy Observatory (APO)