
    A Hierarchical Approach to Multimodal Classification

    Abstract. Data models that are induced in classifier construction often consist of multiple parts, each of which explains part of the data. Classification methods for such models are called multimodal classification methods. The model parts may overlap or have insufficient coverage. How can the problems of overlap and insufficient coverage best be handled? In this paper we propose a hierarchical, or layered, approach to this problem. Rather than seeking a single model, we consider a series of models under gradually relaxed conditions, which form a hierarchical structure. To demonstrate the effectiveness of this approach we implemented it in two classifiers that construct multi-part models: one based on the so-called lattice machine and the other based on rough set rule induction. This leads to hierarchical versions of the classifiers. The classification performance of these two hierarchical classifiers is compared with C4.5, Support Vector Machine (SVM), rule-based classifiers (with the optimisation of rule shortening) implemented in the Rough Set Exploration System (RSES), and a method combining k-NN with rough set rule induction (RIONA in RSES). The results of the experiments show that this hierarchical approach leads to improved multimodal classifiers.
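The layered idea the abstract describes can be illustrated with a minimal sketch, assuming the common fallback scheme: each layer is a model built under progressively relaxed conditions, and an instance is passed down the hierarchy until some layer covers it. The layer functions below are hypothetical stand-ins, not the paper's lattice-machine or rough-set models.

```python
def classify_layered(instance, layers, default=None):
    """Try each model layer in turn, from the strictest conditions to the
    most relaxed, and return the first layer's prediction that covers
    the instance. A layer signals 'not covered' by returning None."""
    for model in layers:
        label = model(instance)
        if label is not None:
            return label
    return default

# Hypothetical two-layer example: an exact rule match, then a relaxed rule.
exact = lambda x: "pos" if x == (1, 1) else None       # covers only (1, 1)
relaxed = lambda x: "pos" if x[0] == 1 else "neg"      # covers everything

print(classify_layered((1, 1), [exact, relaxed]))  # "pos" via the exact layer
print(classify_layered((0, 0), [exact, relaxed]))  # "neg" via the relaxed layer
```

Relaxing conditions layer by layer is what lets the overall classifier keep the precision of the strict models where they apply, while the looser models fill the coverage gaps.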

    Hyperrelations in Version Space

    A version space is the set of all hypotheses consistent with a given set of training examples, delimited by the specific boundary and the general boundary. In existing studies [5, 6, 8] a hypothesis is a conjunction of attribute-value pairs, which is shown to have limited expressive power [9]. In a more expressive hypothesis space, e.g., disjunctions of conjunctions of attribute-value pairs, a general version space becomes uninteresting unless some restriction (inductive bias) is imposed [9]. In this paper we investigate the version space in a hypothesis space where a hypothesis is a hyperrelation, which is in effect a disjunction of conjunctions of disjunctions of attribute-value pairs. Such a hypothesis space is more expressive than both the conjunctions of attribute-value pairs and the disjunctions of conjunctions of attribute-value pairs. However, given a dataset, we focus our attention only on those hypotheses which are consistent with the given data and are maximal in the sense that the elements in a hypothesis cannot be merged further. Such a hypothesis is called an E-set for the given data, and the set of all E-sets is the version space, which is delimited by the least E-set (specific boundary) and the greatest E-set (general boundary). Based on this version space we propose three classification rules for use in different situations. The first two are based on E-sets, and the third is based on "degraded" E-sets called weak hypotheses, where the maximality constraint is relaxed. We present an algorithm to calculate E-sets, though it is computationally expensive in the worst case. We also present an efficient algorithm to calculate weak hypotheses. The third rule is evaluated using public datasets, and the results compare well with the C5.0 decision tree classifier.
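The hyperrelation structure described above can be sketched concretely. In this minimal, assumed encoding (not the paper's own implementation), a hyperrelation is a disjunction (list) of hypertuples, each hypertuple is a conjunction over attributes, and each attribute position holds a disjunction (set) of allowed values; classification by a hypothesis then reduces to a coverage test. The attributes and values below are hypothetical.

```python
def covers(hyperrelation, instance):
    """True if some hypertuple in the hyperrelation matches the instance:
    the instance's value must lie in the allowed value-set at every
    attribute position of that hypertuple."""
    return any(
        all(value in allowed for value, allowed in zip(instance, hypertuple))
        for hypertuple in hyperrelation
    )

# Hypothetical two-attribute hyperrelation over (colour, size):
H = [
    ({"red", "blue"}, {"small"}),      # (red or blue) and small
    ({"green"}, {"small", "large"}),   # green and (small or large)
]

print(covers(H, ("blue", "small")))  # True: matches the first hypertuple
print(covers(H, ("red", "large")))   # False: no hypertuple matches
```

A classification rule of the kind the abstract mentions would compare such coverage against per-class hypotheses; the expressive gain over plain conjunctions is visible in the first hypertuple, which captures "red or blue" within a single conjunct.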