Attribute scoring based on performance of a learning algorithm on samples of attribute space

Weiss, Gregor

slides

oai:generic.eprints.org:1319/core382

Authors: Gregor Weiss
Publication date: 17 March 2011
Abstract

In machine learning and knowledge discovery in databases, attributes (features) play a central role, so it is reasonable to question their quality and importance for a given problem. Because this is a difficult problem in general, the thesis focuses on developing a new method for estimating attribute importance. The method samples the attribute space, evaluates the performance of a machine learning algorithm on those samples, and infers the importance of individual attributes from the resulting scores. More specifically, different combinations of attributes are first chosen, and a smaller data set containing only those attributes is prepared for each combination; a testing procedure with sampling then estimates the performance of an arbitrarily chosen learning algorithm on each such data set. The performance estimates are statistically processed for each attribute according to the attribute's presence in the sampled combinations, and combined by a given formula into a final score for each attribute. To determine how well different variants of the new method work, an appropriate experimental methodology and many diverse data sets were prepared. Some of the successful methods were also tested in more detail to reinforce the conclusion that certain variants of the new method are statistically significantly better than conventional, widely used methods for this problem, although an improved version of the best of those conventional methods still appears to be better. The thesis concludes with a discussion of the results and various ideas for further work, improvements and applications of the method.
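
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the sampling scheme, the learning algorithm, or the scoring formula, so everything concrete here is an assumption. The sketch draws random fixed-size attribute subsets, uses a 1-nearest-neighbour classifier as the "arbitrarily chosen" learner, estimates accuracy with simple k-fold cross-validation, and scores each attribute as the mean accuracy of subsets that contain it minus the mean accuracy of subsets that do not. The names `make_toy_data`, `one_nn_accuracy`, `cv_performance` and `score_attributes` are hypothetical.

```python
import random
from statistics import mean


def make_toy_data(n=100, seed=1):
    """Toy data (assumption): attribute 0 carries the class, 1-3 are noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [y + rng.gauss(0, 0.2), rng.random(), rng.random(), rng.random()]
        data.append((x, y))
    return data


def one_nn_accuracy(train, test, attrs):
    """Accuracy of a 1-nearest-neighbour classifier restricted to `attrs`."""
    correct = 0
    for x, y in test:
        nearest = min(
            train, key=lambda r: sum((r[0][a] - x[a]) ** 2 for a in attrs)
        )
        correct += nearest[1] == y
    return correct / len(test)


def cv_performance(data, attrs, folds=5):
    """Mean accuracy over `folds` interleaved cross-validation folds."""
    scores = []
    for k in range(folds):
        test = data[k::folds]
        train = [r for i, r in enumerate(data) if i % folds != k]
        scores.append(one_nn_accuracy(train, test, attrs))
    return mean(scores)


def score_attributes(data, n_attrs, n_samples=60, subset_size=2, seed=0):
    """Score = mean performance with the attribute minus mean without it.

    This difference of means is an assumed stand-in for the thesis's
    unspecified "given formula".
    """
    rng = random.Random(seed)
    present = {a: [] for a in range(n_attrs)}
    absent = {a: [] for a in range(n_attrs)}
    for _ in range(n_samples):
        attrs = rng.sample(range(n_attrs), subset_size)  # one sample of attribute space
        perf = cv_performance(data, attrs)
        for a in range(n_attrs):
            (present if a in attrs else absent)[a].append(perf)
    return [
        mean(present[a]) - mean(absent[a])
        if present[a] and absent[a] else float("nan")
        for a in range(n_attrs)
    ]


if __name__ == "__main__":
    for a, s in enumerate(score_attributes(make_toy_data(), n_attrs=4)):
        print(f"attribute {a}: score {s:+.3f}")
```

On this toy data the informative attribute should receive the highest score, because subsets that include it let the learner separate the classes while subsets without it perform near chance. Other choices of learner, subset sampling, or aggregation formula would slot into the same structure.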

Last time updated on 12/07/2013

This paper was published in ePrints.FRI.
