152,949 research outputs found

    A Hybrid Rough Sets K-Means Vector Quantization Model For Neural Networks Based Arabic Speech Recognition

    Speech is a natural, convenient and rapid means of human communication. The ability to respond to spoken language is of special importance in computer applications in which the user cannot use his/her limbs properly, and may be useful in office automation systems. It can help in developing control systems for many applications, such as telephone assistance systems. Rough sets theory represents a mathematical approach to vagueness and uncertainty. Data analysis, data reduction, approximate classification, machine learning, and discovery of patterns in data are functions performed by a rough sets analysis. It was one of the first non-statistical methodologies of data analysis. It extends classical set theory by incorporating into the set model the notion of classification as an indiscernibility relation. In previous work, the application of the rough sets approach to speech recognition was limited to the pattern matching stage, that is, to using training speech patterns to generate classification rules that can later be used to classify input word patterns. In this thesis, the rough sets approach was used in the preprocessing stages, namely in the vector quantization operation, in which feature vectors are quantized or classified to a finite set of codebook classes. Classification rules were generated from the training feature vector set, and a modified form of the standard voter classification algorithm, which uses the rules generated by rough sets, was applied. A vector quantization model that incorporates rough sets attribute reduction and rule generation with a modified version of the K-means clustering algorithm was developed, implemented and tested as part of a speech recognition framework, in which the Learning Vector Quantization (LVQ) neural network model was used in the pattern matching stage. In addition to the Arabic speech data used in the original experiments, for both speaker-dependent and speaker-independent tests, further verification experiments were conducted using the TI20 speech data. The rough sets vector quantization model proved its usefulness in the speech recognition framework; however, it can be extended to other applications that involve large amounts of data, such as speaker verification.
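The vector quantization step mentioned in this abstract can be illustrated with a minimal sketch of plain K-means codebook training. This is not the thesis's hybrid model: the rough sets rule generation and the modified K-means are not reproduced, and the function names, the evenly spaced initialisation, and the fixed iteration count are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector, codebook):
    """Return the index of the codeword nearest to the vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def kmeans_codebook(vectors, k, iterations=20):
    """Train a k-entry codebook with standard K-means (hypothetical helper).

    Initialisation takes k evenly spaced training vectors, an assumption
    chosen only to keep the sketch deterministic.
    """
    codebook = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[quantize(v, codebook)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook
```

Once trained, each incoming feature vector is replaced by the index returned by `quantize`, which is the "finite set of codebook classes" the abstract refers to.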

    Semantics-Preserving Dimensionality Reduction: Rough and Fuzzy-Rough-Based Approaches

    Semantics-preserving dimensionality reduction refers to the problem of selecting those input features that are most predictive of a given outcome; a problem encountered in many areas such as machine learning, pattern recognition, and signal processing. This has found successful application in tasks that involve data sets containing huge numbers of features (in the order of tens of thousands), which would otherwise be impossible to process further. Recent examples include text processing and Web content classification. One of the many successful applications of rough set theory has been to this feature selection area. This paper reviews those techniques that preserve the underlying semantics of the data, using crisp and fuzzy rough set-based methodologies. Several approaches to feature selection based on rough set theory are experimentally compared. Additionally, a new area in feature selection, feature grouping, is highlighted and a rough set-based feature grouping technique is detailed. Index Terms: dimensionality reduction, feature selection, feature transformation, rough selection, fuzzy-rough selection.
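As a rough illustration of crisp rough set feature selection, the sketch below computes the dependency degree (the fraction of objects in the positive region) and grows a feature subset greedily until it matches the dependency of the full feature set. It is a generic hill-climbing sketch and not necessarily any of the methods the paper compares; the function names and the toy data layout (rows as tuples of discrete values, a parallel list of labels) are assumptions.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices that agree on every attribute in attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Fraction of rows lying in blocks that carry a single decision label."""
    positive = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows, labels):
    """Add the attribute that raises dependency most until the full-set value is reached."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        best = max((a for a in all_attrs if a not in chosen),
                   key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    return chosen
```

Because the selected attributes are original columns rather than transformed combinations, their meaning is preserved, which is the semantics-preserving property the abstract emphasises. A greedy search of this kind is not guaranteed to return a minimal subset.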

    The application of rough set and Kohonen network to feature selection for object extraction

    Selecting a set of features that is optimal for a given task is a problem that plays an important role in a wide variety of contexts, including pattern recognition, image understanding and machine learning. The paper describes an application of the rough sets method to feature selection and reduction in texture image recognition. The proposed methods include continuous data discretization based on a Kohonen neural network and maximum covariance, and rough set algorithms for feature selection and reduction. Experiments on tree extraction from aerial images show that the methods presented in this paper are practical and effective.

    Exploring the Boundary Region of Tolerance Rough Sets for Feature Selection

    Of all of the challenges which face the effective application of computational intelligence technologies for pattern recognition, dataset dimensionality is undoubtedly one of the primary impediments. In order for pattern classifiers to be efficient, a dimensionality reduction stage is usually performed prior to classification. Much use has been made of Rough Set Theory for this purpose as it is completely data-driven and no other information is required; most other methods require some additional knowledge. However, traditional rough set-based methods in the literature are restricted to the requirement that all data must be discrete. It is therefore not possible to consider real-valued or noisy data. This is usually addressed by employing a discretisation method, which can result in information loss. This paper proposes a new approach based on the tolerance rough set model, which has the ability to deal with real-valued data whilst simultaneously retaining dataset semantics. More significantly, this paper describes the underlying mechanism for this new approach to utilise the information contained within the boundary region or region of uncertainty. The use of this information can result in the discovery of more compact feature subsets and improved classification accuracy. These results are supported by an experimental evaluation which compares the proposed approach with a number of existing feature selection techniques. Key words: feature selection, attribute reduction, rough sets, classification
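To make the tolerance idea concrete, the following sketch builds tolerance classes over real-valued data and derives the lower approximation, upper approximation and boundary region of a concept. The per-attribute similarity (one minus the range-normalised absolute difference), the requirement that every attribute meets the threshold `tau`, and all function names are assumptions for illustration; the paper's own similarity measure and its use of the boundary region are not reproduced here.

```python
def tolerance_classes(values, tau):
    """For each object, return the set of object indices similar to it.

    values: list of real-valued feature tuples.
    tau: similarity threshold in [0, 1] (assumed to apply to every attribute).
    """
    n_attrs = len(values[0])
    spans = []
    for a in range(n_attrs):
        column = [v[a] for v in values]
        spans.append((max(column) - min(column)) or 1.0)  # avoid division by zero

    def similar(x, y):
        return all(1 - abs(x[a] - y[a]) / spans[a] >= tau for a in range(n_attrs))

    return [{j for j, y in enumerate(values) if similar(x, y)} for x in values]


def approximations(classes, concept):
    """Return (lower, upper, boundary) of a concept given tolerance classes."""
    lower = {i for i, c in enumerate(classes) if c <= concept}
    upper = {i for i, c in enumerate(classes) if c & concept}
    return lower, upper, upper - lower
```

Unlike the equivalence classes of crisp rough sets, tolerance classes may overlap, so no discretisation step is needed. Objects in the boundary are those whose tolerance class only partly falls inside the concept.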

    Fuzzy-Rough Data Reduction with Ant Colony Optimization

    Feature selection refers to the problem of selecting those input features that are most predictive of a given outcome; a problem encountered in many areas such as machine learning, pattern recognition and signal processing. In particular, solutions to this problem have found successful application in tasks that involve datasets containing huge numbers of features (in the order of tens of thousands), which would otherwise be impossible to process further. Recent examples include text processing and web content classification. Rough set theory has been used as such a dataset pre-processor with much success, but current methods are inadequate at finding minimal reductions, the smallest sets of features possible. To alleviate this difficulty, a feature selection technique that employs a hybrid variant of rough sets, fuzzy-rough sets, has been developed recently and has been shown to be effective. However, this method is still not able to find the optimal subsets regularly. This paper proposes a new feature selection mechanism based on Ant Colony Optimization in an attempt to combat this. The method is then applied to the problem of finding optimal feature subsets in the fuzzy-rough data reduction process. The present work is applied to complex systems monitoring and experimentally compared with the original fuzzy-rough method, an entropy-based feature selector, and a transformation-based reduction method, PCA. Comparisons with the use of a support vector classifier are also included.

    Fuzzy-Rough Sets Assisted Attribute Selection

    Attribute selection (AS) refers to the problem of selecting those input attributes or features that are most predictive of a given outcome; a problem encountered in many areas such as machine learning, pattern recognition and signal processing. Unlike other dimensionality reduction methods, attribute selectors preserve the original meaning of the attributes after reduction. This has found application in tasks that involve datasets containing huge numbers of attributes (in the order of tens of thousands) which, for some learning algorithms, might be impossible to process further. Recent examples include text processing and web content classification. AS techniques have also been applied to small and medium-sized datasets in order to locate the most informative attributes for later use. One of the many successful applications of rough set theory has been to this area. The rough set ideology of using only the supplied data and no other information has many benefits in AS, where most other methods require supplementary knowledge. However, the main limitation of rough set-based attribute selection in the literature is the restrictive requirement that all data is discrete. In classical rough set theory, it is not possible to consider real-valued or noisy data. This paper investigates a novel approach based on fuzzy-rough sets, fuzzy rough feature selection (FRFS), that addresses these problems and retains dataset semantics. FRFS is applied to two challenging domains where a feature reducing step is important; namely, web content classification and complex systems monitoring. The utility of this approach is demonstrated and is compared empirically with several dimensionality reducers. In the experimental studies, FRFS is shown to equal or improve classification accuracy when compared to the results from unreduced data. Classifiers that use a lower dimensional set of attributes which are retained by fuzzy-rough reduction outperform those that employ more attributes returned by the existing crisp rough reduction method. In addition, it is shown that FRFS is more powerful than the other AS techniques in the comparative study.

    Rough set theory applied to pattern recognition of partial discharge in noise affected cable data

    This paper presents an effective, Rough Set (RS) based, pattern recognition method for rejecting interference signals and recognising Partial Discharge (PD) signals from different sources. Firstly, RS theory is presented in terms of Information System, Lower and Upper Approximation, Signal Discretisation, Attribute Reduction and a flowchart of the RS based pattern recognition method. Secondly, PD testing of five types of artificial defect in ethylene-propylene rubber (EPR) cable is carried out and data pre-processing and feature extraction are employed to separate PD and interference signals. Thirdly, the RS based PD signal recognition method is applied to 4000 samples and is proven to have 99% accuracy. Fourthly, the RS based PD recognition method is applied to signals from five different sources and an accuracy of more than 93% is attained when a combination of signal discretisation and attribute reduction methods is applied. Finally, Back-propagation Neural Network (BPNN) and Support Vector Machine (SVM) methods are studied and compared with the developed method. The proposed RS method is proven to have higher accuracy than SVM and BPNN and can be applied for on-line PD monitoring of cable systems after training with valid sample data.

    Class Association Rules Mining based Rough Set Method

    This paper investigates the mining of class association rules with a rough set approach. In data mining, an association occurs between two sets of elements when one element set occurs together with another. A class association rule set (CARs) is a subset of association rules with classes specified as their consequences. We present an efficient algorithm, inspired by the Apriori algorithm, for mining the finest class rule set, where the support and confidence are computed based on the elementary set of the lower approximation included in the property of rough set theory. Our proposed approach has been shown to be very effective, and the rough set approach for class association discovery is much simpler than the classic association method. Comment: 10 pages, 2 figures
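For readers unfamiliar with the two measures named in this abstract, here is a minimal sketch of support and confidence for a single class association rule of the form "antecedent items imply class". It uses the conventional frequency-based definitions and does not reproduce the paper's computation over elementary sets of the lower approximation; the function name and the transaction layout (a pair of item set and class label) are assumptions.

```python
def rule_stats(transactions, antecedent, target_class):
    """Return (support, confidence) of the rule antecedent -> target_class.

    transactions: list of (item_set, class_label) pairs.
    """
    antecedent = frozenset(antecedent)
    # Class labels of every transaction that contains all antecedent items.
    covered = [label for items, label in transactions if antecedent <= items]
    hits = sum(1 for label in covered if label == target_class)
    support = hits / len(transactions)
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence
```

An Apriori-style miner would call a function like this on candidate antecedents of increasing size, keeping only those whose support and confidence clear user-chosen thresholds.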