Characterization and Robust Classification of EEG Signal from Image RSVP Events with Independent Time-Frequency Features
This paper considers the problem of automatically characterizing and detecting target images in a rapid serial visual presentation (RSVP) task based on EEG data. A novel method is proposed that identifies single-trial event-related potentials (ERPs) in the time-frequency domain, and a robust classifier with feature clustering is developed to better utilize the correlated ERP features. The method is applied to EEG recordings of an RSVP experiment with multiple sessions and subjects.
The results show that target image events are mainly characterized by three distinct patterns in the time-frequency domain: a theta-band (4.3 Hz) power boosting 300–700 ms after the target image onset, an alpha-band (12 Hz) power boosting 500–1000 ms after the stimulus onset, and a delta-band (2 Hz) power boosting after 500 ms. The most discriminant time-frequency features all take the form of power boosting and are relatively consistent across sessions and subjects.
Since the original discriminant time-frequency features are highly correlated, we constructed uncorrelated features using hierarchical clustering for better classification of target and non-target images. With feature clustering, performance (area under the ROC curve) improved from 0.85 to 0.89 on within-session tests and from 0.76 to 0.84 on cross-subject tests. The constructed uncorrelated features were more robust than the original discriminant features and corresponded to a number of local regions on the time-frequency plane.
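The feature-clustering step described above can be sketched with a small toy example. This is an illustrative numpy/scipy sketch under invented data, thresholds, and dimensions, not the authors' pipeline; real input would be single-trial time-frequency power features.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Toy stand-in for discriminant time-frequency features: 3 latent
# sources, each observed through 4 highly correlated noisy copies.
latent = rng.normal(size=(200, 3))
X = np.hstack([latent[:, [i]] + 0.1 * rng.normal(size=(200, 4))
               for i in range(3)])                 # 12 correlated features

# Group features by correlation distance (1 - |r|) with hierarchical
# clustering, mirroring the idea of grouping correlated ERP features.
dist = 1.0 - np.abs(np.corrcoef(X.T))
Z = linkage(dist[np.triu_indices_from(dist, k=1)], method="average")
groups = fcluster(Z, t=0.5, criterion="distance")

# One constructed (decorrelated) feature per group: the group mean.
X_new = np.column_stack([X[:, groups == g].mean(axis=1)
                         for g in np.unique(groups)])
print(X_new.shape)                                 # (200, 3)
```

The cut threshold `t=0.5` and the use of a simple group mean as the constructed feature are arbitrary choices for the sketch; any within-cluster summary would illustrate the same decorrelation idea.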
Modularity-Guided Graph Topology Optimization And Self-Boosting Clustering
Existing modularity-based community detection methods attempt to find community memberships that maximize modularity in a fixed graph topology. In this work, we propose to optimize the graph topology itself through the modularity maximization process. We introduce a modularity-guided graph optimization approach that learns a sparse, high-modularity graph from algorithmically generated clustering results by iteratively pruning edges between two distant clusters. To the best of our knowledge, this is the first attempt to use modularity to guide graph topology learning. Extensive experiments on various real-world data sets show that our method outperforms state-of-the-art graph construction methods by a large margin. Our experiments also show that as modularity increases, the accuracy of graph-based clustering algorithms increases with it, providing numerical evidence for the validity of modularity theory on real-world data sets. From a clustering perspective, our method can also be seen as a self-boosting clustering method.
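A minimal, self-contained illustration of the pruning idea follows. The toy graph, the hand-set cluster labels, and the single pruning pass are invented simplifications; the paper's actual procedure iterates pruning with re-clustering.

```python
import numpy as np

def modularity(A, labels):
    """Newman modularity of an undirected graph with adjacency matrix A."""
    m = A.sum() / 2.0                      # number of edges
    k = A.sum(axis=1)                      # node degrees
    Q = 0.0
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        Q += A[np.ix_(idx, idx)].sum() / (2 * m) \
             - (k[idx].sum() / (2 * m)) ** 2
    return Q

# Two planted communities joined by a couple of inter-community edges.
A = np.zeros((8, 8))
for i, j in [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),   # community 0
             (4, 5), (4, 6), (5, 6), (5, 7), (6, 7),   # community 1
             (0, 4), (3, 7)]:                          # inter-community
    A[i, j] = A[j, i] = 1
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Modularity-guided pruning, simplified to one pass: drop every edge
# whose endpoints fall in different clusters ("distant" clusters).
q_before = modularity(A, labels)
A_pruned = A.copy()
for i, j in zip(*np.nonzero(np.triu(A, k=1))):
    if labels[i] != labels[j]:
        A_pruned[i, j] = A_pruned[j, i] = 0
q_after = modularity(A_pruned, labels)
print(q_before, q_after)    # modularity rises from 1/3 to 1/2 here
```

On this toy graph, removing the two inter-community edges raises modularity, which is the signal the paper uses to guide topology learning.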
Interpretable Sequence Clustering
Categorical sequence clustering plays a crucial role in various fields, but
the lack of interpretability in cluster assignments poses significant
challenges. Sequences inherently lack explicit features, and existing sequence
clustering algorithms heavily rely on complex representations, making it
difficult to explain their results. To address this issue, we propose a method
called Interpretable Sequence Clustering Tree (ISCT), which combines sequential
patterns with a concise and interpretable tree structure. ISCT leverages k-1
patterns to generate k leaf nodes, corresponding to k clusters, which provides
an intuitive explanation of how each cluster is formed. More precisely, ISCT
first projects sequences into random subspaces and then utilizes the k-means
algorithm to obtain high-quality initial cluster assignments. Subsequently, it
constructs a pattern-based decision tree using a boosting-based construction
strategy in which sequences are re-projected and re-clustered at each node
before mining the top-1 discriminative splitting pattern. Experimental results
on 14 real-world data sets demonstrate that our proposed method provides an
interpretable tree structure while delivering fast and accurate cluster
assignments.
Comment: 11 pages, 6 figures
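One node-level step of the tree construction might look like the sketch below. The toy sequences, the bigram-only pattern space, and the coverage-weighted purity score are invented simplifications; the paper mines general sequential patterns with a boosting-based strategy.

```python
# Toy categorical sequences with two evident groups: one built around
# the motif "ab", the other around "cd".
seqs = [list("xabx"), list("abyy"), list("zab"),
        list("cdzz"), list("ycdy"), list("xcd")]
labels = [0, 0, 0, 1, 1, 1]   # initial cluster assignment (e.g. from k-means)

def contains(seq, pat):
    """True if pat occurs as a contiguous subsequence of seq."""
    n = len(pat)
    return any(tuple(seq[i:i + n]) == pat for i in range(len(seq) - n + 1))

# Candidate patterns: all bigrams occurring in the data (a stand-in
# for the mined sequential patterns).
cands = sorted({tuple(s[i:i + 2]) for s in seqs for i in range(len(s) - 1)})

def score(pat):
    """Discriminative score: purity of the covered set, weighted by coverage."""
    hit = [l for s, l in zip(seqs, labels) if contains(s, pat)]
    if not hit or len(hit) == len(seqs):
        return 0.0
    return max(hit.count(0), hit.count(1)) / len(hit) * len(hit) / len(seqs)

best = max(cands, key=score)           # top-1 discriminative split pattern
left = [s for s in seqs if contains(s, best)]
print(best, len(left))
```

Here the selected pattern sends the three "ab" sequences to one child and the rest to the other; ISCT would then re-project and re-cluster within each child before mining the next pattern.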
Clustering of variables for enhanced interpretability of predictive models
A new strategy is proposed for building easy to interpret predictive models
in the context of a high-dimensional dataset, with a large number of highly
correlated explanatory variables. The strategy is based on a first step of
variables clustering using the CLustering of Variables around Latent Variables
(CLV) method. The exploration of the hierarchical clustering dendrogram is
undertaken in order to sequentially select the explanatory variables in a
group-wise fashion. For model setting implementation, the dendrogram is used as
the base-learner in an L2-boosting procedure. The proposed approach, named lmCLV, is illustrated on a simulated toy example in which the clusters and predictive equation are known, and on a real case study dealing
with the authentication of orange juices based on 1H-NMR spectroscopic
analysis. In both illustrative examples, this procedure was shown to have
similar predictive efficiency to other methods, with additional
interpretability capacity. It is available in the R package ClustVarLV.
Comment: 24 pages, 7 figures
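The two-stage idea, CLV-style clustering of variables followed by L2-boosting over cluster components, can be sketched as below. The data, cut threshold, learning rate, and the use of a plain cluster mean in place of the CLV latent variable are all invented simplifications of the lmCLV procedure.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
n, reps = 100, 5

# Two latent variables, each observed through 5 correlated proxies:
# a toy version of "many highly correlated explanatory variables".
t1, t2 = rng.normal(size=(2, n))
X = np.hstack([t1[:, None] + 0.2 * rng.normal(size=(n, reps)),
               t2[:, None] + 0.2 * rng.normal(size=(n, reps))])
y = 2.0 * t1 - 1.0 * t2 + 0.1 * rng.normal(size=n)

# Stage 1 (CLV-like): cluster variables by correlation distance and
# summarize each cluster by its mean component.
d = 1.0 - np.abs(np.corrcoef(X.T))
grp = fcluster(linkage(d[np.triu_indices_from(d, k=1)], "average"),
               t=0.5, criterion="distance")
comps = np.column_stack([X[:, grp == g].mean(axis=1)
                         for g in np.unique(grp)])

# Stage 2: L2-boosting with a componentwise linear base learner that
# selects one cluster component per iteration (group-wise selection).
coef = np.zeros(comps.shape[1])
resid = y - y.mean()
for _ in range(50):
    b = np.array([c @ resid / (c @ c) for c in comps.T])
    # pick the component giving the largest residual-sum-of-squares drop
    j = int(np.argmax([abs(bj) * np.linalg.norm(c)
                       for bj, c in zip(b, comps.T)]))
    coef[j] += 0.1 * b[j]              # shrunken update (nu = 0.1)
    resid = y - y.mean() - comps @ coef
print(np.round(coef, 2))
```

With this toy design the two cluster components recover (approximately) the latent coefficients 2 and -1, and the selected components name whole groups of variables, which is the interpretability gain the abstract describes.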
CUSBoost: Cluster-based Under-sampling with Boosting for Imbalanced Classification
Class imbalance classification is a challenging research problem in data
mining and machine learning, as most of the real-life datasets are often
imbalanced in nature. Existing learning algorithms maximise the classification
accuracy by correctly classifying the majority class, but misclassify the
minority class. However, in real-life applications the minority class instances typically represent the concept of greater interest. Recently, several techniques based on sampling methods
(under-sampling of the majority class and over-sampling the minority class),
cost-sensitive learning methods, and ensemble learning have been used in the
literature for classifying imbalanced datasets. In this paper, we introduce a
new clustering-based under-sampling approach with boosting (AdaBoost)
algorithm, called CUSBoost, for effective imbalanced classification. The
proposed algorithm provides an alternative to RUSBoost (random under-sampling
with AdaBoost) and SMOTEBoost (synthetic minority over-sampling with AdaBoost)
algorithms. We evaluated the performance of the CUSBoost algorithm against state-of-the-art ensemble learning methods, namely AdaBoost, RUSBoost, and SMOTEBoost, on 13 imbalanced binary and multi-class datasets with various imbalance ratios. The experimental results show that CUSBoost is a promising and effective approach for dealing with highly imbalanced datasets.
Comment: CSITSS-201
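A compact sketch of the two CUSBoost ingredients, using scikit-learn on invented toy data. The cluster count, sample sizes, and hyper-parameters are arbitrary choices for illustration, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)

# Imbalanced toy data: 500 majority (class 0) vs 50 minority (class 1).
X_maj = rng.normal(0.0, 1.0, size=(500, 2))
X_min = rng.normal(2.5, 1.0, size=(50, 2))

# Step 1 (cluster-based under-sampling): cluster the majority class and
# draw the same number of points from every cluster, so the retained
# subset preserves the majority class's internal structure (unlike
# purely random under-sampling as in RUSBoost).
k, per_cluster = 5, 10
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_maj)
keep = np.concatenate([
    rng.choice(np.where(km.labels_ == c)[0], size=per_cluster, replace=False)
    for c in range(k)
])
X_bal = np.vstack([X_maj[keep], X_min])
y_bal = np.array([0] * len(keep) + [1] * len(X_min))

# Step 2: boost (AdaBoost, decision stumps by default) on the
# cluster-balanced sample.
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_bal, y_bal)
print(clf.score(X_bal, y_bal))
```

The balanced set here has 50 majority and 50 minority points; in practice the per-cluster quota would be tuned to the desired imbalance ratio.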