10,104 research outputs found
Recommended from our members
A statistical learning method to fast generalised rule induction directly from raw measurements
Induction of descriptive models is one of the most important technologies in data mining. The expressiveness of descriptive models is of paramount importance in applications that examine the causality of relationships between variables. Most of the work on descriptive models has concentrated on less expressive approaches such as clustering algorithms, or on rule-based approaches that are limited to a particular type of data, such as association rule mining for binary data. However, in many applications it is important to understand the structure of the produced model for further human evaluation. In this research we present a novel generalised rule induction method that allows the induction of descriptive and expressive rules directly from both categorical and numerical features.
Towards expressive modular rule induction for numerical attributes
The Prism family is an alternative set of predictive data mining algorithms to the more established decision tree algorithms. Prism classifiers are more expressive and user friendly than decision trees and achieve similar accuracy, even outperforming decision trees in some cases. This is especially the case where there are noise and clashes in the training data. However, Prism algorithms still tend to overfit on noisy data; this has led to the development of pruning methods that allow the Prism algorithms to generalise better over the dataset. The work presented in this paper aims to address the problem of overfitting at the rule induction stage for numerical attributes by proposing a new numerical rule term structure based on the Gaussian probability density distribution. This new rule term structure is not only expected to lead to a more robust classifier, but also lowers the computational requirements, as it needs to induce fewer rule terms.
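The abstract does not spell out the new term structure, so the following is only a rough sketch of the general idea under stated assumptions: a numerical rule term is taken to be a single interval centred on the class mean, `mean - width * std <= x <= mean + width * std`, in place of several separate threshold terms. The function names `gaussian_rule_term` and `covers`, and the `width` parameter, are hypothetical and not taken from the paper.

```python
from statistics import mean, stdev


def gaussian_rule_term(values, labels, target_class, width=1.0):
    """Return (lower, upper) bounds for one numerical rule term.

    Assumption: the term is an interval of `width` standard deviations
    around the mean of the attribute values seen for `target_class`.
    """
    class_values = [v for v, y in zip(values, labels) if y == target_class]
    mu = mean(class_values)
    sigma = stdev(class_values)  # sample standard deviation, needs >= 2 values
    return mu - width * sigma, mu + width * sigma


def covers(term, x):
    """Check whether attribute value x satisfies the interval term."""
    lower, upper = term
    return lower <= x <= upper


values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
term_a = gaussian_rule_term(values, labels, "a")
print(term_a, covers(term_a, 2.5), covers(term_a, 10.0))
```

One interval term of this kind bounds the attribute on both sides at once, which is one plausible reading of why fewer rule terms would need to be induced.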
Potential changes in disease patterns and pharmaceutical use in response to climate change
This is the final version of the article. Available from Taylor & Francis via the DOI in this record. As climate change alters environmental conditions, the incidence and global patterns of human diseases are changing. These modifications to disease profiles and the effects upon human pharmaceutical usage are discussed. Climate-related environmental changes are associated with a rise in the incidence of chronic diseases already prevalent in the Northern Hemisphere, for example, cardiovascular disease and mental illness, leading to greater use of associated heavily used Western medications. Sufferers of respiratory diseases may exhibit exacerbated symptoms due to altered environmental conditions (e.g., pollen). Respiratory, water-borne, and food-borne toxicants and infections, including those that are vector borne, may become more common in Western countries, central and eastern Asia, and across North America. As new disease threats emerge, substantially higher pharmaceutical use appears inevitable, especially of pharmaceuticals not commonly employed at present (e.g., antiprotozoals). The use of medications for the treatment of general symptoms (e.g., analgesics) will also rise. These developments need to be viewed in the context of other major environmental changes (e.g., industrial chemical pollution, biodiversity loss, reduced water and food security) as well as marked shifts in human demographics, including aging of the population. To identify, prevent, mitigate, and adapt to potential threats, one needs to be aware of the major factors underlying changes in the use of pharmaceuticals and their subsequent release, deliberately or unintentionally, into the environment.
This review explores the likely consequences of climate change upon the use of medical pharmaceuticals in the Northern Hemisphere. The European Centre for Environment and Human Health (part of the University of Exeter Medical School) is partly financed by the European Regional Development Fund Programme 2007 to 2013 and European Social Fund Convergence Programme for Cornwall and the Isles of Scilly.
A method of rule induction for predicting and describing future alarms in a telecommunication network
In order to gain insights into events and issues that may cause alarms in parts of IP networks, intelligent methods that capture and express causal relationships are needed. Methods that are both predictive and descriptive are rare, and those that do predict are often limited to using a single feature from a vast data set. This paper follows the progression of a Rule Induction Algorithm that produces rules with strong causal links that are both descriptive and predict events ahead of time. The algorithm is based on an information theoretic approach to extract rules comprising a conjunction of network events that are significant prior to network alarms. An empirical evaluation of the algorithm is provided.
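The abstract does not name the information theoretic measure it uses. As an illustration only, the sketch below scores a rule "IF antecedent events THEN alarm" with the J-measure of Smyth and Goodman, a widely used information theoretic rule quality measure; its use here is an assumption, and the function and parameter names are hypothetical.

```python
import math


def j_measure(p_antecedent, p_alarm, p_alarm_given_antecedent):
    """J-measure of the rule IF antecedent THEN alarm, in bits.

    p_antecedent: probability that the antecedent events occur
    p_alarm: prior probability of the alarm, strictly between 0 and 1
    p_alarm_given_antecedent: probability of the alarm when the antecedent holds
    """
    def part(posterior, prior):
        # By convention 0 * log(0) is treated as 0.
        return 0.0 if posterior == 0.0 else posterior * math.log2(posterior / prior)

    cross_entropy = (part(p_alarm_given_antecedent, p_alarm)
                     + part(1.0 - p_alarm_given_antecedent, 1.0 - p_alarm))
    return p_antecedent * cross_entropy


# An antecedent seen half the time that always precedes an alarm
# whose prior probability is 0.5:
print(j_measure(0.5, 0.5, 1.0))
```

The measure is zero when the antecedent tells us nothing about the alarm and grows when the antecedent is both frequent and informative, which is the trade-off a rule ranking of this kind relies on.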
Computationally efficient rule-based classification for continuous streaming data
Advances in hardware and software technologies allow streaming data to be captured. The area of Data Stream Mining (DSM) is concerned with the analysis of these vast amounts of data as they are generated in real-time. Data stream classification is one of the most important DSM techniques, allowing previously unseen data instances to be classified. Unlike traditional classifiers for static data, data stream classifiers need to adapt to concept changes (concept drift) in the stream in real-time in order to reflect the most recent concept in the data as accurately as possible. A recent addition to the data stream classifier toolbox is eRules, which induces and updates a set of expressive rules that can easily be interpreted by humans. However, like most rule-based data stream classifiers, eRules exhibits poor computational performance when confronted with continuous attributes. In this work, we propose an approach to deal with continuous data effectively and accurately in rule-based classifiers by using the Gaussian distribution as a heuristic for building rule terms on continuous attributes. We show on the example of eRules that incorporating our method for continuous attributes indeed speeds up the real-time rule induction process while maintaining a similar level of accuracy compared with the original eRules classifier. We termed this new version of eRules with our approach G-eRules.
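To illustrate why a Gaussian heuristic can be cheap on a stream, the sketch below keeps a running mean and standard deviation of one continuous attribute using Welford's online algorithm, so each new instance costs constant time and no past values are stored. This is a generic technique and not necessarily how G-eRules is implemented; the class name `RunningGaussian` and the `interval` helper are hypothetical.

```python
import math


class RunningGaussian:
    """Incremental mean and standard deviation (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new attribute value into the statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        """Sample standard deviation; 0.0 until two values have been seen."""
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0

    def interval(self, width=1.0):
        """Assumed rule-term bounds: mean +/- width standard deviations."""
        return self.mean - width * self.std, self.mean + width * self.std


stats = RunningGaussian()
for value in [1.0, 2.0, 3.0]:
    stats.update(value)
print(stats.mean, stats.std, stats.interval())
```

Handling concept drift would need more than this, for example statistics kept over a sliding window, which the sketch deliberately leaves out.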
Sticky prices in the euro area: a summary of new micro evidence
This paper presents original evidence on price setting in the euro area at the individual level. We use micro data on consumer (CPI) and producer (PPI) prices, as well as survey information. Our main findings are: (i) prices in the euro area are sticky and more so than in the US; (ii) there is evidence of heterogeneity and of asymmetries in price setting behaviour; (iii) downward price rigidity is only slightly more marked than upward price rigidity; and (iv) implicit or explicit contracts and coordination failure theories are important, whereas menu or information costs are judged much less relevant by firms. Keywords: price setting, price stickiness, consumer prices, producer prices, survey data
Sticky prices in the euro area: a summary of new micro evidence
This paper presents original evidence on price setting in the euro area at the individual level. We use micro data on consumer (CPI) and producer (PPI) prices, as well as survey information. Our main findings are: (i) prices in the euro area are sticky and more so than in the US; (ii) there is evidence of heterogeneity and of asymmetries in price setting behaviour; (iii) downward price rigidity is only slightly more marked than upward price rigidity; and (iv) implicit or explicit contracts and coordination failure theories are important, whereas menu or information costs are judged much less relevant by firms. JEL Classification: C25, D40, E31. Keywords: consumer prices, price setting, price stickiness, producer prices, survey data
Hubble Space Telescope Survey of Interstellar ^12CO/^13CO in the Solar Neighborhood
We examine 20 diffuse and translucent Galactic sight lines and extract the
column densities of the ^12CO and ^13CO isotopologues from their ultraviolet
A--X absorption bands detected in archival Space Telescope Imaging Spectrograph
data with λ/Δλ ≥ 46,000. Five more targets with Goddard
High-Resolution Spectrograph data are added to the sample that more than
doubles the number of sight lines with published Hubble Space Telescope
observations of ^13CO. Most sight lines have 12-to-13 isotopic ratios that are
not significantly different from the local value of 70 for ^12C/^13C, which is
based on mm-wave observations of rotational lines in emission from CO and H_2CO
inside dense molecular clouds, as well as on results from optical measurements
of CH^+. Five of the 25 sight lines are found to be fractionated toward lower
12-to-13 values, while three sight lines in the sample are fractionated toward
higher ratios, signaling the predominance of either isotopic charge exchange or
selective photodissociation, respectively. There are no obvious trends of the
^12CO-to-^13CO ratio with physical conditions such as gas temperature or
density, yet ^12CO/^13CO does vary in a complicated manner with the column
density of either CO isotopologue, owing to varying levels of competition
between isotopic charge exchange and selective photodissociation in the
fractionation of CO. Finally, rotational temperatures of H_2 show that all
sight lines with detected amounts of ^13CO pass through gas that is on average
colder by 20 K than the gas without ^13CO. This colder gas is also sampled by
CN and C_2 molecules, the latter indicating gas kinetic temperatures of only 28
K, enough to facilitate an efficient charge exchange reaction that lowers the
value of ^12CO/^13CO. Comment: 1-column emulateapj, 23 pages, 9 figures
A Frequent Pattern Conjunction Heuristic for Rule Generation in Data Streams
This paper introduces a new and expressive algorithm for inducing descriptive rule-sets from streaming data in real-time in order to describe frequent patterns explicitly encoded in the stream. Data Stream Mining (DSM) is concerned with the automatic analysis of data streams in real-time. Rapid flows of data challenge the state-of-the-art processing and communication infrastructure, hence the motivation for research and innovation into real-time algorithms that analyse data streams on-the-fly and can automatically adapt to concept drifts. To date, DSM techniques have largely focused on predictive data mining applications that aim to forecast the value of a particular target feature of unseen data instances, answering questions such as whether a credit card transaction is fraudulent or not. A real-time, expressive and descriptive Data Mining technique for streaming data has not been previously established as part of the DSM toolkit. This has motivated the work reported in this paper, which has resulted in developing and validating a Generalised Rule Induction (GRI) tool, thus producing expressive rules as explanations that can be easily understood by human analysts. The expressiveness of decision models in data streams serves the objectives of transparency, underpinning the vision of 'explainable AI', and yet is an area of research that has attracted less attention despite being of high practical importance. The algorithm introduced and described in this paper is termed Fast Generalised Rule Induction (FGRI). FGRI is able to induce descriptive rules incrementally for raw data from both categorical and numerical features. FGRI is able to adapt rule-sets to changes of the pattern encoded in the data stream (concept drift) on the fly as new data arrives and can thus be applied continuously in real-time. The paper also provides a theoretical, qualitative and empirical evaluation of FGRI.