A survey of induction algorithms for machine learning

Author: Porter Bruce W.
Publication venue: eScholarship, University of California
Publication date: 01/01/1983

Central to all systems for machine learning from examples is an induction algorithm. The purpose of the algorithm is to generalize from a finite set of training examples a description consistent with the examples seen, and, hopefully, with the potentially infinite set of examples not seen. This paper surveys four machine learning induction algorithms. The knowledge representation schemes and a PDL description of algorithm control are emphasized. System characteristics that are peculiar to a domain of application are de-emphasized. Finally, a comparative summary of the learning algorithms is presented.

eScholarship - University of California

Representation Independent Analytics Over Structured Data

Author: Chodpathumwan Yodsawalai
Fern Alan
Picado Jose
Sun Yizhou
Termehchy Arash
Publication date: 08/09/2014

Database analytics algorithms leverage quantifiable structural properties of the data to predict interesting concepts and relationships. The same information, however, can be represented using many different structures, and the structural properties observed over particular representations do not necessarily hold for alternative structures. Thus, there is no guarantee that current database analytics algorithms will still provide the correct insights, no matter what structures are chosen to organize the database. Because these algorithms tend to be highly effective over some choices of structure, such as that of the databases used to validate them, but not so effective with others, database analytics has largely remained the province of experts who can find the desired forms for these algorithms. We argue that in order to make database analytics usable, we should use or develop algorithms that are effective over a wide range of choices of structural organizations. We introduce the notion of representation independence, study its fundamental properties for a wide range of data analytics algorithms, and empirically analyze the amount of representation independence of some popular database analytics algorithms. Our results indicate that most algorithms are not generally representation independent, and they identify the characteristics of more representation-independent heuristics under certain representational shifts.

arXiv.org e-Print Archive

CiteSeerX

Recursive Program Optimization Through Inductive Synthesis Proof Transformation

Author: Bundy Alan
Madden P.
Smaill A.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/1999

The research described in this paper involved developing transformation techniques which increase the efficiency of the original program, the source, by transforming its synthesis proof into one, the target, which yields a computationally more efficient algorithm. We describe a working proof transformation system which, by exploiting the duality between mathematical induction and recursion, employs the novel strategy of optimizing recursive programs by transforming inductive proofs. We compare and contrast this approach with the more traditional approaches to program transformation, and highlight the benefits of proof transformation with regard to search, correctness, automatability and generality.

CiteSeerX

Edinburgh Research Explorer

MPG.PuRe

On the role of pre and post-processing in environmental data mining

Author: Athanasiadis Ioannis
Comas Joaquim
Gibert Karina
Holmes Geoffrey
Izquierdo Joaquin
Sanchez-Marre Miquel
Publication venue: International Environmental Modelling and Software Society
Publication date: 01/01/2008

The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data tend to contain noise, uncertainty, errors, redundancies, or even irrelevant information. The more complex the reality to be analyzed, the higher the risk of obtaining low-quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions based on KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details about how this can be achieved and discusses the role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems.

Research Commons@Waikato

The use of data-mining for the automatic formation of tactics

Author: Bundy A.
Duncan H.
Levine J.
Pollet M.
Storkey A.
Publication date: 01/07/2004

This paper discusses the use of data-mining for the automatic formation of tactics. It was presented at the Workshop on Computer-Supported Mathematical Theory Development held at IJCAR in 2004. The aim of this project is to evaluate the applicability of data-mining techniques to the automatic formation of tactics from large corpora of proofs. We data-mine information from large proof corpora to find commonly occurring patterns. These patterns are then evolved into tactics using genetic programming techniques.

University of Strathclyde Institutional Repository

Stable states of perturbed Markov chains

Author: Betz Volker
Roux Stephane Le
Publication date: 12/02/2016

Given an infinitesimal perturbation of a discrete-time finite Markov chain, we seek the states that are stable despite the perturbation, i.e. the states whose weights in the stationary distributions can be bounded away from $0$ as the noise fades away. Chemists, economists, and computer scientists have been studying irreducible perturbations built with exponential maps. Under these assumptions, Young proved the existence of and computed the stable states in cubic time. We fully drop these assumptions, generalize Young's technique, and show that stability is decidable as long as $f \in O(g)$ is. Furthermore, if the perturbation maps (and their multiplications) satisfy $f \in O(g)$ or $g \in O(f)$, we prove the existence of and compute the stable states and the metastable dynamics at all time scales where some states vanish. Conversely, if the big-$O$ assumption does not hold, we build a perturbation with these maps and no stable state. Our algorithm also runs in cubic time despite the general assumptions and the additional work. Proving the correctness of the algorithm relies on new or rephrased results in Markov chain theory, and on algebraic abstractions thereof.
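
The notion of a stable state can be illustrated numerically with a small sketch. This is not the algorithm from the paper: the two-state chain, its transition rates (ε and ε²), and the helper names `stationary` and `perturbed` are assumptions chosen for illustration only. The code computes the exact stationary distribution for shrinking ε and shows that one state keeps a weight bounded away from 0 while the other does not.

```python
from fractions import Fraction

def stationary(P):
    """Exact stationary distribution of an irreducible row-stochastic matrix P."""
    n = len(P)
    # Rows encode (P^T - I) pi = 0; the last row is replaced by sum(pi) = 1.
    A = [[Fraction(P[j][i]) - (1 if i == j else 0) for j in range(n)]
         for i in range(n)]
    A[-1] = [Fraction(1)] * n
    b = [Fraction(0)] * (n - 1) + [Fraction(1)]
    # Gauss-Jordan elimination with a search for a nonzero pivot.
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return [b[i] / A[i][i] for i in range(n)]

def perturbed(eps):
    """Hypothetical perturbation: state 0 leaks at rate eps, state 1 at rate eps**2."""
    return [[1 - eps, eps],
            [eps ** 2, 1 - eps ** 2]]

if __name__ == "__main__":
    for k in (1, 2, 3):
        eps = Fraction(1, 10 ** k)
        pi = stationary(perturbed(eps))
        print(f"eps={float(eps):g}  pi={[float(x) for x in pi]}")
```

In this toy chain the stationary weights are ε/(1+ε) and 1/(1+ε), so state 1 is stable in the sense of the abstract and state 0 is not. Exact rational arithmetic is used because floating-point iteration converges very slowly when ε is tiny.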

arXiv.org e-Print Archive

DI-fusion

Improving the Interpretability of Classification Rules Discovered by an Ant Colony Algorithm: Extended Results

Author: Freitas Alex A.
Otero Fernando E.B.
Publication venue: MIT Press - Journals
Publication date: 01/09/2016

The vast majority of Ant Colony Optimization (ACO) algorithms for inducing classification rules use an ACO-based procedure to create rules in a one-at-a-time fashion. An improved search strategy has been proposed in the cAnt-MinerPB algorithm, where an ACO-based procedure is used to create a complete list of rules (ordered rules), i.e., the ACO search is guided by the quality of a list of rules instead of an individual rule. In this paper we propose an extension of the cAnt-MinerPB algorithm to discover a set of rules (unordered rules). The main motivations for this work are to improve the interpretation of individual rules by discovering a set of rules and to evaluate the impact on the predictive accuracy of the algorithm. We also propose a new measure to evaluate the interpretability of the discovered rules, to mitigate the fact that the commonly used model size measure ignores how the rules are used to make a class prediction. Comparisons with state-of-the-art rule induction algorithms, support vector machines and the cAnt-MinerPB producing ordered rules are also presented.
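
The difference between ordered and unordered rules at prediction time can be sketched as follows. This is an illustration only and not the cAnt-MinerPB procedure: the rule format, the toy weather attributes, the helper names, and the majority-vote scheme for unordered rules are all assumptions, since the abstract does not state how conflicts between rules are resolved.

```python
from collections import Counter

# A rule is a pair (conditions, predicted_class); conditions maps attribute -> value.

def covers(rule, example):
    """True when every condition of the rule matches the example."""
    conditions, _ = rule
    return all(example.get(attr) == value for attr, value in conditions.items())

def predict_ordered(rule_list, example, default):
    """Ordered list: the first covering rule decides, so each rule's meaning
    depends on all the rules that precede it."""
    for rule in rule_list:
        if covers(rule, example):
            return rule[1]
    return default

def predict_unordered(rule_set, example, default):
    """Unordered set (assumed scheme): every covering rule casts one vote."""
    votes = Counter(rule[1] for rule in rule_set if covers(rule, example))
    if not votes:
        return default
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    rules = [
        ({"outlook": "sunny"}, "no"),
        ({"humidity": "normal"}, "yes"),
        ({"wind": "weak"}, "yes"),
    ]
    example = {"outlook": "sunny", "humidity": "normal", "wind": "weak"}
    print(predict_ordered(rules, example, default="yes"))    # first match wins
    print(predict_unordered(rules, example, default="yes"))  # covering rules vote
```

With the same three rules, the ordered reading returns the class of the first covering rule, while the unordered reading lets each rule be interpreted on its own, which is the interpretability argument the abstract makes.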

Kent Academic Repository