
    Fitness Function Comparison for GA-based Feature Construction

    Abstract. When primitive data representation yields attribute interactions, learning requires feature construction. MFE2/GA, a GA-based feature construction method, has been shown to learn more accurately than other methods when several complex attribute interactions exist. A new fitness function, based on the principle of Minimum Description Length (MDL), is proposed and implemented as part of the MFE3/GA system. Since the individuals of the GA population are collections of new features constructed to change the representation of the data, an MDL-based fitness considers not only the part of the data left unexplained by the constructed features (errors), but also the complexity of the constructed features as a new representation (theory). An empirical study shows the advantage of the new fitness function over other fitness functions not based on MDL, and both are compared to the performance baselines provided by relevant systems.
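The two-part cost described in this abstract (theory plus errors) can be illustrated with a minimal sketch. It is not the MFE3/GA fitness function, whose exact encoding is not given here. Everything in it is an assumption made for illustration: attributes are binary, each constructed feature is a conjunction of attribute indices, theory cost is charged per literal, and errors are the minority-class instances within each group of rows that share the same constructed-feature values. The names `project`, `theory_bits`, `count_errors`, `error_bits`, and `mdl_fitness` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product
from math import comb, log2


def project(row, features):
    """Evaluate each feature (a tuple of attribute indices, AND-ed together) on one row."""
    return tuple(int(all(row[i] for i in feat)) for feat in features)


def theory_bits(features, n_attributes):
    """Assumed theory cost: each literal names one of n_attributes."""
    n_literals = sum(len(feat) for feat in features)
    return n_literals * log2(n_attributes)


def count_errors(rows, labels, features):
    """Count instances that the constructed features leave unexplained."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[project(row, features)][label] += 1
    # Within each group, every instance outside the majority class is an error.
    return sum(sum(c.values()) - max(c.values()) for c in groups.values())


def error_bits(rows, labels, features):
    """Assumed error cost: state how many errors there are, then which rows they are."""
    n = len(rows)
    e = count_errors(rows, labels, features)
    return log2(n + 1) + log2(comb(n, e))


def mdl_fitness(rows, labels, features):
    """Total description length in bits; lower is better under this sketch."""
    n_attributes = len(rows[0])
    return theory_bits(features, n_attributes) + error_bits(rows, labels, features)


# Toy data: three binary attributes; the class is attribute 0 AND attribute 1.
ROWS = list(product([0, 1], repeat=3))
LABELS = [r[0] & r[1] for r in ROWS]

GOOD = [(0, 1)]  # captures the interaction exactly
BAD = [(2,)]     # cheaper to describe, but uses an irrelevant attribute

if __name__ == "__main__":
    print("good:", mdl_fitness(ROWS, LABELS, GOOD))
    print("bad: ", mdl_fitness(ROWS, LABELS, BAD))
```

On this toy data the feature set that captures the interaction costs more theory bits than the single irrelevant literal, but it leaves no errors, so its total description length is lower. A GA minimising this quantity would therefore prefer it.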

    Design Methodology–Feature evaluation and selection General Terms: Algorithms, Design

    Primitive data representation of real-world data facilitates attribute interactions, which make information opaque to the learner. Feature construction (FC) aims to encapsulate these interactions into new features and outline them to the learner. When a GA is applied to perform FC, the goal is to generate features that facilitate more accurate learning. The GA's fitness function should therefore estimate the quality of the constructed features. We propose a new fitness function based on Minimum Description Length (MDL). This fitness is incorporated in MFE2/GA [4] to improve its accuracy. The new system is compared with other systems based on entropy or error-rate fitness. There are three common forms of evaluating features: i) an MDL-based fitness function measures the inconsistency and complexity of the constructed features based on the MDL principle