GBG++: A Fast and Stable Granular Ball Generation Method for Classification
Granular ball computing (GBC), as an efficient, robust, and scalable learning
method, has become a popular research topic of granular computing. GBC includes
two stages: granular ball generation (GBG) and multi-granularity learning based
on the granular ball (GB). However, the stability and efficiency of existing
GBG methods need to be further improved due to their strong dependence on
k-means or k-division. In addition, GB-based classifiers only unilaterally
consider the GB's geometric characteristics to construct classification rules,
but the GB's quality is ignored. Therefore, in this paper, based on the
attention mechanism, a fast and stable GBG (GBG++) method is proposed first.
Specifically, the proposed GBG++ method only needs to calculate the distances
from the data-driven center to the undivided samples when splitting each GB
instead of randomly selecting the center and calculating the distances between
it and all samples. Moreover, an outlier detection method is introduced to
identify local outliers. Consequently, the GBG++ method can significantly
improve effectiveness, robustness, and efficiency while being absolutely
stable. Second, considering the influence of the sample size within the GB on
the GB's quality, based on the GBG++ method, an improved GB-based k-nearest
neighbors algorithm (GBkNN++) is presented, which can reduce
misclassification at the class boundary. Finally, the experimental results
indicate that the proposed method outperforms several existing GB-based
classifiers and classical machine learning classifiers on public benchmark
datasets.
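The splitting step summarized in this abstract can be sketched in a few lines. The sketch below is illustrative only (function names, the majority-class centre, and the mean/max-radius heuristic are assumptions, not the authors' implementation): the centre of each granular ball is chosen deterministically from the data rather than at random, and distances are computed in a single pass from that centre to the undivided samples.

```python
import numpy as np

def split_ball(X, y):
    """Split a granular ball (X, y) around a data-driven centre.

    Illustrative sketch: instead of a randomly selected centre (as in
    k-means), the centre is taken deterministically as the mean of the
    majority-class samples, and only distances from that centre to the
    undivided samples are computed.
    """
    labels, counts = np.unique(y, return_counts=True)
    majority = labels[np.argmax(counts)]
    center = X[y == majority].mean(axis=0)       # data-driven centre
    dist = np.linalg.norm(X - center, axis=1)    # one pass over samples
    radius = dist[y == majority].max()           # radius heuristic (assumed)
    inside = dist <= radius                      # samples kept in this ball
    return center, radius, inside

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [3.0, 3.0]])
y = np.array([0, 0, 0, 1])
center, radius, inside = split_ball(X, y)
```

Because the centre is a deterministic function of the data, repeated runs produce identical balls, which is the stability property the abstract emphasizes.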
Automatic generation of fuzzy classification rules using granulation-based adaptive clustering
A central problem of fuzzy modelling is the generation of fuzzy rules that fit the data to the highest possible extent. In this study, we present a method for the automatic generation of fuzzy rules from data. The main advantage of the proposed method is its ability to perform data clustering without the need to predefine any parameters, including the number of clusters. The method creates data clusters at different levels of granulation and selects the best clustering results based on a set of measures; it involves merging clusters into new clusters that have a coarser granulation. To evaluate its performance, three different datasets are used to compare the proposed method to other classifiers: an SVM classifier, an FCM fuzzy classifier, and a subtractive clustering fuzzy classifier. Results show that the proposed method achieves better classification results than the other classifiers for all the datasets used.
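The level-selection idea described above, producing clusterings at several granulation levels and keeping the best one by some measure, can be sketched as follows. This is a minimal sketch under stated assumptions: the granulation levels are generated by agglomerative merging, and the selection measure is the silhouette score; the paper's actual merging rule and measures may differ.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score

def best_granulation(X, max_clusters=6):
    """Cluster X at several granulation levels and keep the best level.

    Levels run from fine (many clusters) to coarse (few clusters);
    the silhouette score stands in for the paper's selection measures.
    """
    best = None
    for k in range(2, max_clusters + 1):
        labels = AgglomerativeClustering(n_clusters=k).fit_predict(X)
        score = silhouette_score(X, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best  # (score, n_clusters, labels)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.2, (20, 2)), rng.normal(3, 0.2, (20, 2))])
score, k, labels = best_granulation(X)
```

On two well-separated blobs, the coarse two-cluster level wins, which is the behaviour the method relies on: no cluster count is fixed in advance.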
Granular computing approach for the design of medical data classification systems
Granular computing is a computation theory that imitates human thinking and reasoning by dealing with information at different levels of abstraction/precision. Adopting a granular computing approach in the design of data classification systems improves their performance in dealing with data uncertainty and facilitates handling large volumes of data. In this paper, a new approach for the design of medical data classification systems is proposed. The approach makes use of data granulation in training the classifier: training data is granulated at different levels, and data from each level is used for constructing the classification system. To evaluate performance, a classification system based on a neural network is implemented. Four medical datasets are used to compare the proposed approach to other classifiers: a neural network classifier, an ANFIS classifier, and an SVM classifier. Results show that the proposed approach improves the classification performance of the neural network classifier and produces better accuracy and area under the curve than the other classifiers for most of the datasets used.
X-ray Astronomical Point Sources Recognition Using Granular Binary-tree SVM
The study on point sources in astronomical images is of special importance,
since most energetic celestial objects in the Universe exhibit a point-like
appearance. An approach to recognize the point sources (PS) in the X-ray
astronomical images using our newly designed granular binary-tree support
vector machine (GBT-SVM) classifier is proposed. First, all potential point
sources are located by peak detection on the image. The image and spectral
features of these potential point sources are then extracted. Finally, a
classifier to recognize the true point sources is built using the extracted
features. Experiments and applications of our approach on real X-ray
astronomical images are demonstrated. Comparisons between our approach and
other SVM-based classifiers are also carried out by evaluating the precision
and recall rates, which show that our approach performs better, achieving a
higher accuracy of around 89%. Comment: Accepted by ICSP201
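The first stage of the pipeline in this abstract, locating candidate point sources by peak detection on the image, can be sketched with a simple local-maximum filter. This is an illustrative stand-in only; the paper's detector, its feature extraction, and the GBT-SVM stage are not shown, and the threshold is an assumed parameter.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def detect_peaks(image, threshold):
    """Locate candidate point sources as local maxima above a threshold.

    A pixel is a candidate if it equals the maximum of its 3x3
    neighbourhood and exceeds the intensity threshold.
    """
    local_max = maximum_filter(image, size=3) == image
    return np.argwhere(local_max & (image > threshold))

img = np.zeros((16, 16))
img[4, 5] = 10.0     # a bright, point-like source
img[12, 9] = 7.0     # a fainter one
peaks = detect_peaks(img, threshold=5.0)
```

Each detected peak would then feed the image/spectral feature extraction and the classifier described above.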
Face Alignment Using Boosting and Evolutionary Search
In this paper, we present a face alignment approach using granular features, boosting, and an evolutionary search algorithm. Active Appearance Models (AAM) integrate a shape-texture-combined morphable face model into an efficient fitting strategy, while Boosting Appearance Models (BAM) treat the face alignment problem as a process of maximizing the response from a boosting classifier. Inspired by AAM and BAM, we present a framework which implements improved boosting classifiers based on more discriminative features and exhaustive search strategies. Specifically, we utilize granular features to replace the conventional rectangular Haar-like features, gaining discriminability, computational efficiency, and a larger search space. At the same time, we adopt an evolutionary search process to address the difficulty of searching in the large feature space. Finally, we test our approach on a series of challenging datasets to show its accuracy and efficiency on versatile face images.
Probabilistic identification of cerebellar cortical neurones across species.
Despite our fine-grain anatomical knowledge of the cerebellar cortex, electrophysiological studies of circuit information processing over the last fifty years have been hampered by the difficulty of reliably assigning signals to identified cell types. We approached this problem by assessing the spontaneous activity signatures of identified cerebellar cortical neurones. A range of statistics describing firing frequency and irregularity were then used, individually and in combination, to build Gaussian Process Classifiers (GPC) leading to a probabilistic classification of each neurone type and the computation of equi-probable decision boundaries between cell classes. Firing frequency statistics were useful for separating Purkinje cells from granular layer units, whilst firing irregularity measures proved most useful for distinguishing cells within granular layer cell classes. Considered as single statistics, we achieved classification accuracies of 72.5% and 92.7% for granular layer and molecular layer units respectively. Combining statistics to form twin-variate GPC models substantially improved classification accuracies with the combination of mean spike frequency and log-interval entropy offering classification accuracies of 92.7% and 99.2% for our molecular and granular layer models, respectively. A cross-species comparison was performed, using data drawn from anaesthetised mice and decerebrate cats, where our models offered 80% and 100% classification accuracy. We then used our models to assess non-identified data from awake monkeys and rabbits in order to highlight subsets of neurones with the greatest degree of similarity to identified cell classes. In this way, our GPC-based approach for tentatively identifying neurones from their spontaneous activity signatures, in the absence of an established ground-truth, nonetheless affords the experimenter a statistically robust means of grouping cells with properties matching known cell classes. 
Our approach may therefore have broad application to a variety of future cerebellar cortical investigations, particularly in awake animals, where opportunities for definitive cell identification are limited.
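The twin-variate GPC models described above can be sketched with an off-the-shelf Gaussian process classifier on two firing statistics. Everything below is illustrative: the feature values are invented for the sketch (not the paper's data), and the two-class setup, the kernel, and the query point are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# Invented feature table: mean spike frequency (Hz) and log-interval
# entropy for two hypothetical cell classes (0 = Purkinje-like,
# 1 = granular-layer-like); values are for illustration only.
rng = np.random.default_rng(1)
class0 = np.column_stack([rng.normal(40, 5, 30), rng.normal(2.0, 0.2, 30)])
class1 = np.column_stack([rng.normal(5, 2, 30), rng.normal(3.0, 0.2, 30)])
X = np.vstack([class0, class1])
y = np.array([0] * 30 + [1] * 30)

# Twin-variate GPC: two statistics in, probabilistic class membership out.
gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0)).fit(X, y)
proba = gpc.predict_proba([[38.0, 2.1]])
```

The probabilistic output is the point: rather than a hard label, each unit gets a class-membership probability, from which equi-probable decision boundaries between cell classes can be drawn.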
Classifying sequences by the optimized dissimilarity space embedding approach: a case study on the solubility analysis of the E. coli proteome
We evaluate a version of the recently-proposed classification system named
Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space
of sequences of generic objects. The ODSE system has been originally presented
as a classification system for patterns represented as labeled graphs. However,
since ODSE is founded on the dissimilarity space representation of the input
data, the classifier can be easily adapted to any input domain where it is
possible to define a meaningful dissimilarity measure. Here we demonstrate the
effectiveness of the ODSE classifier for sequences by considering an
application dealing with the recognition of the solubility degree of the
Escherichia coli proteome. Solubility, or analogously aggregation propensity,
is an important property of protein molecules, which is intimately related to
the mechanisms underlying the chemico-physical process of folding. Each protein
of our dataset is initially associated with a solubility degree and it is
represented as a sequence of symbols, denoting the 20 amino acid residues. The
computational results obtained here, which we stress have been achieved
with no context-dependent tuning of the ODSE system, confirm the validity and
generality of the ODSE-based approach for structured data classification. Comment: 10 pages, 49 references
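The dissimilarity space representation at the core of ODSE can be sketched as follows: each sequence is embedded as its vector of distances to a set of prototype sequences, after which any vector-space classifier applies. The sketch is a minimal illustration, not the ODSE system itself; the toy sequences and fixed prototypes are assumptions (in ODSE, prototype selection is part of what is optimized), and plain edit distance stands in for whatever dissimilarity measure is chosen.

```python
import numpy as np
from sklearn.svm import LinearSVC

def levenshtein(a, b):
    """Plain edit distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def dissimilarity_embedding(seqs, prototypes):
    """Represent each sequence by its distances to the prototype set."""
    return np.array([[levenshtein(s, p) for p in prototypes] for s in seqs])

# Toy symbol sequences with labels (invented for the sketch).
seqs = ["AAAG", "AAAC", "GGGT", "GGGA"]
labels = [0, 0, 1, 1]
prototypes = ["AAAA", "GGGG"]   # fixed here; optimized in ODSE
X = dissimilarity_embedding(seqs, prototypes)
clf = LinearSVC().fit(X, labels)
pred = clf.predict(dissimilarity_embedding(["AAAT"], prototypes))
```

This is why the system adapts to any input domain: only the dissimilarity measure has to change, not the classifier.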
Learning to Predict with Highly Granular Temporal Data: Estimating individual behavioral profiles with smart meter data
Big spatio-temporal datasets, available through both open and administrative
data sources, offer significant potential for social science research. The
magnitude of the data allows for increased resolution and analysis at
individual level. While there are recent advances in forecasting techniques for
highly granular temporal data, little attention is given to segmenting the time
series and finding homogeneous patterns. In this paper, it is proposed to
estimate behavioral profiles of individuals' activities over time using
Gaussian Process-based models. In particular, the aim is to investigate how
individuals or groups may be clustered according to the model parameters. Such
a Bayesian non-parametric method is then tested by looking at the
predictability of the segments using a combination of models to fit different
parts of the temporal profiles. Model validity is then tested on a set of
holdout data. The dataset consists of half hourly energy consumption records
from smart meters from more than 100,000 households in the UK and covers the
period from 2015 to 2016. The methodological approach developed in the paper
may be easily applied to datasets of similar structure and granularity, for
example social media data, and may lead to improved accuracy in the prediction
of social dynamics and behavior.
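The Gaussian Process modelling step described above can be sketched on synthetic half-hourly data. The sketch is illustrative only: the consumption series is invented, and the periodic-kernel choice and the idea of treating fitted hyperparameters as a household's behavioural profile are assumptions about one plausible realization of the approach, not the paper's exact models.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ExpSineSquared, WhiteKernel

# Synthetic half-hourly consumption for one household over two days
# (48 readings per day), with a daily cycle plus noise.
t = np.arange(96).reshape(-1, 1)
rng = np.random.default_rng(0)
load = 0.5 + 0.3 * np.sin(2 * np.pi * t.ravel() / 48) + rng.normal(0, 0.02, 96)

# A periodic kernel (period = 48 half-hours = one day) captures the
# daily pattern; the white-noise term absorbs measurement noise.
kernel = ExpSineSquared(length_scale=10.0, periodicity=48.0) + WhiteKernel(1e-4)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(t, load)

# The fitted kernel hyperparameters act as the household's behavioural
# profile; households could then be clustered on these parameters.
profile = gp.kernel_.get_params()
next_day = gp.predict(np.arange(96, 144).reshape(-1, 1))
```

Clustering households on such fitted parameters, rather than on raw readings, is what makes the segmentation scale to a dataset of this size.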