10,104 research outputs found

Recommended from our members

A statistical learning method to fast generalised rule induction directly from raw measurements

Author: Gaber M. M.
Le Thien
Stahl Frederic
Wrench C.
Publication date: 01/01/2017

Induction of descriptive models is one of the most important technologies in data mining. The expressiveness of descriptive models is of paramount importance in applications that examine the causality of relationships between variables. Most of the work on descriptive models has concentrated on less expressive approaches such as clustering algorithms or rule-based approaches that are limited to a particular type of data, such as association rule mining for binary data. However, in many applications it is important to understand the structure of the produced model for further human evaluation. In this research we present a novel generalised rule induction method that allows the induction of descriptive and expressive rules directly from both categorical and numerical features.

Central Archive at the University of Reading

Recommended from our members

Towards expressive modular rule induction for numerical attributes

Author: Almutairi Manal
Bramer Max
Jennings Mathew
Le Thien
Stahl Frederic
Publication venue: Springer International Publishing
Publication date: 01/01/2016

The Prism family is an alternative set of predictive data mining algorithms to the more established decision tree algorithms. Prism classifiers are more expressive and user friendly than decision trees, achieve similar accuracy, and even outperform decision trees in some cases. This is especially the case where there is noise and clashes in the training data. However, Prism algorithms still tend to overfit on noisy data; this has led to the development of pruning methods which have allowed the Prism algorithms to generalise better over the dataset. The work presented in this paper aims to address the problem of overfitting at the rule induction stage for numerical attributes by proposing a new numerical rule term structure based on the Gauss Probability Density Distribution. This new rule term structure is not only expected to lead to a more robust classifier, but also lowers the computational requirements as it needs to induce fewer rule terms.

Central Archive at the University of Reading

Crossref

Potential changes in disease patterns and pharmaceutical use in response to climate change

Author: Davidson I
Depledge MH
Fleming LE
Redshaw CH
Stahl-Timmins WM
Publication venue: Informa UK Limited
Publication date: 02/01/2018

This is the final version of the article. Available from Taylor & Francis via the DOI in this record.
As climate change alters environmental conditions, the incidence and global patterns of human diseases are changing. These modifications to disease profiles and the effects upon human pharmaceutical usage are discussed. Climate-related environmental changes are associated with a rise in the incidence of chronic diseases already prevalent in the Northern Hemisphere, for example, cardiovascular disease and mental illness, leading to greater use of associated heavily used Western medications. Sufferers of respiratory diseases may exhibit exacerbated symptoms due to altered environmental conditions (e.g., pollen). Respiratory, water-borne, and food-borne toxicants and infections, including those that are vector borne, may become more common in Western countries, central and eastern Asia, and across North America. As new disease threats emerge, substantially higher pharmaceutical use appears inevitable, especially of pharmaceuticals not commonly employed at present (e.g., antiprotozoals). The use of medications for the treatment of general symptoms (e.g., analgesics) will also rise. These developments need to be viewed in the context of other major environmental changes (e.g., industrial chemical pollution, biodiversity loss, reduced water and food security) as well as marked shifts in human demographics, including aging of the population. To identify, prevent, mitigate, and adapt to potential threats, one needs to be aware of the major factors underlying changes in the use of pharmaceuticals and their subsequent release, deliberately or unintentionally, into the environment.
This review explores the likely consequences of climate change upon the use of medical pharmaceuticals in the Northern Hemisphere.
The European Centre for Environment and Human Health (part of the University of Exeter Medical School) is partly financed by the European Regional Development Fund Programme 2007 to 2013 and European Social Fund Convergence Programme for Cornwall and the Isles of Scilly.

Open Research Exeter

Recommended from our members

A method of rule induction for predicting and describing future alarms in a telecommunication network

Author: Di Fatta Giuseppe
Karthikeyan Vidhyalakshmi
Le Thien
Nauck Detlef
Stahl Frederic
Wrench Chris
Publication venue: Springer International Publishing
Publication date: 01/01/2016

In order to gain insights into events and issues that may cause alarms in parts of IP networks, intelligent methods that capture and express causal relationships are needed. Methods that are predictive and descriptive are rare and those that do predict are often limited to using a single feature from a vast data set. This paper follows the progression of a Rule Induction Algorithm that produces rules with strong causal links that are both descriptive and predict events ahead of time. The algorithm is based on an information theoretic approach to extract rules comprising a conjunction of network events that are significant prior to network alarms. An empirical evaluation of the algorithm is provided.

Central Archive at the University of Reading

Recommended from our members

Computationally efficient rule-based classification for continuous streaming data

Author: Di Fatta Giuseppe
Gaber Mohamed Medhat
Gomes João Bártolo
Le Thien
Stahl Frederic
Publication venue: Springer International Publishing
Publication date: 01/01/2014

Advances in hardware and software technologies allow streaming data to be captured. The area of Data Stream Mining (DSM) is concerned with the analysis of these vast amounts of data as they are generated in real-time. Data stream classification is one of the most important DSM techniques, allowing previously unseen data instances to be classified. Unlike traditional classifiers for static data, data stream classifiers need to adapt to concept changes (concept drift) in the stream in real-time in order to reflect the most recent concept in the data as accurately as possible. A recent addition to the data stream classifier toolbox is eRules, which induces and updates a set of expressive rules that can easily be interpreted by humans. However, like most rule-based data stream classifiers, eRules exhibits poor computational performance when confronted with continuous attributes. In this work, we propose an approach to deal with continuous data effectively and accurately in rule-based classifiers by using the Gaussian distribution as a heuristic for building rule terms on continuous attributes. We show on the example of eRules that incorporating our method for continuous attributes indeed speeds up the real-time rule induction process while maintaining a similar level of accuracy compared with the original eRules classifier. We termed this new version of eRules with our approach G-eRules.

Central Archive at the University of Reading

Crossref

Sticky prices in the euro area: a summary of new micro evidence

Author: Dhyne Emmanuel
Hoeberichts Marco
Kwapil Claudia
Le Bihan Hervé
Lünnemann Patrick
Martins Fernando
Sabbatini Roberto
Stahl Harald
Vermeulen Philip
Vilmunen Jouko
Álvarez Luis J.

This paper presents original evidence on price setting in the euro area at the individual level. We use micro data on consumer (CPI) and producer (PPI) prices, as well as survey information. Our main findings are: (i) prices in the euro area are sticky and more so than in the US; (ii) there is evidence of heterogeneity and of asymmetries in price setting behaviour; (iii) downward price rigidity is only slightly more marked than upward price rigidity and (iv) implicit or explicit contracts and coordination failure theories are important, whereas menu or information costs are judged much less relevant by firms. Keywords: price setting, price stickiness, consumer prices, producer prices, survey data

Research Papers in Economics

Sticky Prices in The Euro Area: a Summary of New Micro Evidence

Author: C. Kwapil
Emmanuel Dhyne
Fernando Martins
H. Stahl
Hervé Le Bihan
Jouko Vilmunen
Luis J. Álvarez
Marco Hoeberichts
Patrick Lünnemann
Philip Vermeulen
R. Sabbatini

Research Papers in Economics

Sticky prices in the euro area: a summary of new micro evidence

Author: Dhyne Emmanuel
Hoeberichts Marco M.
Kwapil Claudia
Le Bihan Hervé
Lünnemann Patrick
Martins Fernando
Sabbatini Roberto
Stahl Harald
Vermeulen Philip
Vilmunen Jouko
Álvarez Luis J.

This paper presents original evidence on price setting in the euro area at the individual level. We use micro data on consumer (CPI) and producer (PPI) prices, as well as survey information. Our main findings are: (i) prices in the euro area are sticky and more so than in the US; (ii) there is evidence of heterogeneity and of asymmetries in price setting behaviour; (iii) downward price rigidity is only slightly more marked than upward price rigidity and (iv) implicit or explicit contracts and coordination failure theories are important, whereas menu or information costs are judged much less relevant by firms. JEL Classification: C25, D40, E31. Keywords: consumer prices, price setting, price stickiness, producer prices, survey data

Research Papers in Economics

Hubble Space Telescope Survey of Interstellar ^12CO/^13CO in the Solar Neighborhood

Author: Boogert A. C. A.
Clegg R. E. S.
D. L. Lambert
Eidelsberg M.
Henkel C.
Kahane C.
Le Coupanec P.
Lequeux J.
Liszt H. S.
Lucas R.
M. Rogers
Palla F.
R. Gredel
S. R. Federman
Stahl O.
Warin S.
Y. Sheffer
Publication venue: University of Chicago Press
Publication date: 01/01/2007

We examine 20 diffuse and translucent Galactic sight lines and extract the column densities of the ^12CO and ^13CO isotopologues from their ultraviolet A--X absorption bands detected in archival Space Telescope Imaging Spectrograph data with λ/Δλ ≥ 46,000. Five more targets with Goddard High-Resolution Spectrograph data are added to the sample that more than doubles the number of sight lines with published Hubble Space Telescope observations of ^13CO. Most sight lines have 12-to-13 isotopic ratios that are not significantly different from the local value of 70 for ^12C/^13C, which is based on mm-wave observations of rotational lines in emission from CO and H_2CO inside dense molecular clouds, as well as on results from optical measurements of CH^+. Five of the 25 sight lines are found to be fractionated toward lower 12-to-13 values, while three sight lines in the sample are fractionated toward higher ratios, signaling the predominance of either isotopic charge exchange or selective photodissociation, respectively. There are no obvious trends of the ^12CO-to-^13CO ratio with physical conditions such as gas temperature or density, yet ^12CO/^13CO does vary in a complicated manner with the column density of either CO isotopologue, owing to varying levels of competition between isotopic charge exchange and selective photodissociation in the fractionation of CO. Finally, rotational temperatures of H_2 show that all sight lines with detected amounts of ^13CO pass through gas that is on average colder by 20 K than the gas without ^13CO. This colder gas is also sampled by CN and C_2 molecules, the latter indicating gas kinetic temperatures of only 28 K, enough to facilitate an efficient charge exchange reaction that lowers the value of ^12CO/^13CO. Comment: 1-column emulateapj, 23 pages, 9 figures

arXiv.org e-Print Archive

Crossref

A Frequent Pattern Conjunction Heuristic for Rule Generation in Data Streams

Author: Badii Atta
Gaber Mohamed Medhat
Le Thien
Stahl Frederic
Publication venue: MDPI AG
Publication date: 01/01/2021

This paper introduces a new and expressive algorithm for inducing descriptive rule-sets from streaming data in real-time in order to describe frequent patterns explicitly encoded in the stream. Data Stream Mining (DSM) is concerned with the automatic analysis of data streams in real-time. Rapid flows of data challenge the state-of-the-art processing and communication infrastructure, hence the motivation for research and innovation into real-time algorithms that analyse data streams on-the-fly and can automatically adapt to concept drifts. To date, DSM techniques have largely focused on predictive data mining applications that aim to forecast the value of a particular target feature of unseen data instances, answering questions such as whether a credit card transaction is fraudulent or not. A real-time, expressive and descriptive Data Mining technique for streaming data has not been previously established as part of the DSM toolkit. This has motivated the work reported in this paper, which has resulted in developing and validating a Generalised Rule Induction (GRI) tool, thus producing expressive rules as explanations that can be easily understood by human analysts. The expressiveness of decision models in data streams serves the objectives of transparency, underpinning the vision of 'explainable AI', and yet is an area of research that has attracted less attention despite being of high practical importance. The algorithm introduced and described in this paper is termed Fast Generalised Rule Induction (FGRI). FGRI is able to induce descriptive rules incrementally for raw data from both categorical and numerical features. FGRI is able to adapt rule-sets to changes of the pattern encoded in the data stream (concept drift) on the fly as new data arrives and can thus be applied continuously in real-time. The paper also provides a theoretical, qualitative and empirical evaluation of FGRI.

Multidisciplinary Digital Publishing Institute

Central Archive at the University of Reading

Birmingham City University Open Access Repository

Directory of Open Access Journals

BCU Open Access