14,695 research outputs found

    Data granulation by the principles of uncertainty

    Research in granular modeling has produced a variety of mathematical models, such as intervals, (higher-order) fuzzy sets, rough sets, and shadowed sets, all of which are suitable for characterizing so-called information granules. Modeling the uncertainty of the input data is recognized as a crucial aspect of information granulation. Moreover, uncertainty is a well-studied concept in many mathematical settings, such as probability theory, fuzzy set theory, and possibility theory. This suggests that an appropriate quantification of the uncertainty expressed by the information granule model could be used to define an invariant property, to be exploited in practical situations of information granulation. From this perspective, a procedure of information granulation is effective if the uncertainty conveyed by the synthesized information granule is in a monotonically increasing relation with the uncertainty of the input data. In this paper, we present a data granulation framework that elaborates on the principles of uncertainty introduced by Klir. Because uncertainty is a mesoscopic descriptor of systems and data, it is possible to apply such principles regardless of the input data type and the specific mathematical setting adopted for the information granules. The proposed framework is conceived (i) to offer a guideline for the synthesis of information granules and (ii) to provide a groundwork for comparing and quantitatively judging different data granulation procedures. To provide a suitable case study, we introduce a new data granulation technique based on the minimum sum of distances, which is designed to generate type-2 fuzzy sets. We analyze the procedure by performing different experiments on two distinct data types: feature vectors and labeled graphs. Results show that the uncertainty of the input data is suitably conveyed by the generated type-2 fuzzy set models.
    Comment: 16 pages, 9 figures, 52 references

    Grooming Detection using Fuzzy-Rough Feature Selection and Text Classification

    Online child grooming detection has recently attracted intensive research interest from both the machine learning community and the digital forensics community due to its great social impact. Existing data-driven approaches usually face two challenges: a lack of training data and the uncertainty of classes in terms of the classification or decision boundary. This paper proposes a grooming detection approach that aims to address such uncertainty, based on a data set derived from a publicly available profiling data set. In particular, the approach first applies a conventional text feature extraction approach to identify the most significant words in the data set. It then applies a fuzzy-rough feature selection approach to reduce the high dimensionality of the selected words for fast processing, while at the same time addressing the uncertainty of class boundaries. The experimental results demonstrate the efficiency and efficacy

    Fuzzy Supernova Templates I: Classification

    Modern supernova (SN) surveys are now uncovering stellar explosions at rates that far surpass what the world's spectroscopic resources can handle. In order to make full use of these SN datasets, it is necessary to use analysis methods that depend only on the survey photometry. This paper presents two methods for utilizing a set of SN light curve templates to classify SN objects. In the first case we present an updated version of the Bayesian Adaptive Template Matching program (BATM). To address some shortcomings of that strictly Bayesian approach, we introduce a method for Supernova Ontology with Fuzzy Templates (SOFT), which utilizes Fuzzy Set Theory for the definition and combination of SN light curve models. For well-sampled light curves with a modest signal-to-noise ratio (S/N>10), the SOFT method can correctly separate thermonuclear (Type Ia) SNe from core collapse SNe with 98% accuracy. In addition, the SOFT method has the potential to classify supernovae into sub-types, providing photometric identification of very rare or peculiar explosions. The accuracy and precision of the SOFT method are verified using Monte Carlo simulations as well as real SN light curves from the Sloan Digital Sky Survey and the SuperNova Legacy Survey. In a subsequent paper the SOFT method is extended to address the problem of parameter estimation, providing estimates of redshift, distance, and host galaxy extinction without any spectroscopy.
    Comment: 26 pages, 12 figures. Accepted to Ap

    Fuzzy-Rough Nearest Neighbour Classification and Prediction

    Nearest neighbour (NN) approaches are inspired by the way humans make decisions, comparing a test object to previously encountered samples. In this paper, we propose an NN algorithm that uses the lower and upper approximations from fuzzy-rough set theory in order to classify test objects, or predict their decision value. It is shown experimentally that our method outperforms other NN approaches (classical, fuzzy and fuzzy-rough ones) and that it is competitive with leading classification and prediction methods. Moreover, we show that the robustness of our methods against noise can be enhanced effectively by invoking the approximations of the Vaguely Quantified Rough Set (VQRS) model, which emulates the linguistic quantifiers “some” and “most” from natural language.

    Accurate and reliable segmentation of the optic disc in digital fundus images

    We describe a complete pipeline for the detection and accurate automatic segmentation of the optic disc in digital fundus images. This procedure provides separation of vascular information and accurate inpainting of vessel-removed images, symmetry-based optic disc localization, and fitting of incrementally complex contour models at increasing resolutions using information related to inpainted images and vessel masks. Validation experiments, performed on a large dataset of images of healthy and pathological eyes, annotated by experts and partially graded with a quality label, demonstrate the good performance of the proposed approach. The method is able to detect the optic disc and trace its contours better than the other systems presented in the literature and tested on the same data. The average error in the obtained contour masks is reasonably close to the interoperator errors and suitable for practical applications. The optic disc segmentation pipeline is currently integrated in a complete software suite for the semiautomatic quantification of retinal vessel properties from fundus camera images (VAMPIRE).

    An Efficient Classification Model using Fuzzy Rough Set Theory and Random Weight Neural Network

    In the area of fuzzy rough set theory (FRST), researchers have shown much interest in handling high-dimensional data. Rough set theory (RST) is one of the important tools used to pre-process data and helps to obtain a better predictive model, but in RST the process of discretization may lose useful information. Fuzzy rough set theory, by contrast, works well with real-valued data. In this paper, an efficient technique based on FRST is presented to pre-process large-scale data sets and increase the efficacy of the predictive model. A fuzzy rough set-based feature selection (FRSFS) technique is combined with a random weight neural network (RWNN) classifier to obtain better generalization ability. Results on different datasets show that the proposed technique performs well and provides better speed and accuracy than combining FRSFS with other machine learning classifiers (i.e., KNN, Naive Bayes, SVM, decision tree and backpropagation neural network).

    Incremental Perspective for Feature Selection Based on Fuzzy Rough Sets
