19 research outputs found

    Automatic detection of gender on the blogs

    In this paper, we aim to determine the gender of bloggers using only the texts they have written. For that purpose, we propose a number of features based on specific words, which are grouped into classes. For each blog, a score is computed from these features, and this score determines the gender of its author. The evaluation was carried out on a corpus of 681,288 blogs (140 million words) tagged as written by men or women, which serves as the reference collection in our work. The results show a gender detection rate of over 82% with respect to the reference collection.
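    The scoring idea in this abstract can be illustrated with a minimal sketch. The abstract does not list the actual word classes, weights, or decision rule, so everything named here (`WORD_CLASSES`, the placeholder words, the signed weights, the zero threshold) is a hypothetical stand-in, not the authors' method.

```python
import re
from collections import Counter

# Hypothetical word classes: name -> (set of words, signed weight).
# Positive weights push the score towards label_pos, negative towards label_neg.
# The placeholder words carry no meaning; real classes would come from the paper.
WORD_CLASSES = {
    "class_1": ({"alpha", "beta"}, 1.0),
    "class_2": ({"gamma", "delta"}, -1.0),
}


def blog_score(text, classes=WORD_CLASSES):
    """Return the weighted count of class words, normalised by text length."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = 0.0
    for words, weight in classes.values():
        hits = sum(counts[w] for w in words)
        total += weight * hits
    return total / len(tokens)


def predict(text, label_pos="women", label_neg="men", threshold=0.0):
    """Assumed decision rule: compare the score with a fixed threshold."""
    return label_pos if blog_score(text) > threshold else label_neg
```

    For example, `blog_score("alpha alpha gamma x")` counts two hits in `class_1` and one in `class_2` over four tokens. A score exactly at the threshold falls to `label_neg`, which is an arbitrary tie-breaking choice in this sketch.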

    Domain adaptation for EEG-based, cross-subject epileptic seizure prediction

    The ability to predict the occurrence of an epileptic seizure is a safeguard against patient injury and health complications. However, a major challenge in seizure prediction arises from the significant variability observed in patient data. Common patient-specific approaches, which model each patient independently, often perform poorly on other patients because of this variability. The aim of this study is to propose deep learning models that can handle this variability and generalize across patients. To this end, the study introduces novel multiple-subject and cross-subject prediction models. Multiple-subject modeling broadens the scope of patient-specific modeling to account for the data from a dedicated ensemble of patients, thereby providing a useful, though relatively modest, level of generalization. The basic neural network architecture of this model is then adapted to cross-subject prediction, thereby providing a broader, more realistic context of application. To improve performance and generalization ability, cross-subject modeling is enhanced by domain adaptation. Experimental evaluation on the publicly available CHB-MIT and SIENA datasets shows that our multiple-subject model achieves better performance than existing works. However, the cross-subject model faces challenges when applied to different patients. Finally, an investigation of three domain adaptation methods shows that the model accuracy is notably improved, by 10.30% and 7.4% for the CHB-MIT and SIENA datasets, respectively.

    EEG oscillatory power and complexity for epileptic seizure detection

    Monitoring patients at risk of epileptic seizure is critical for optimal treatment and for the ensuing reduction of seizure risk and complications. In general, seizure detection is done manually in hospitals and involves time-consuming visual inspection and interpretation of electroencephalography (EEG) recordings by experts. The purpose of this study is to investigate the pertinence of band-limited spectral power and signal complexity for discriminating between seizure and seizure-free EEG brain activity. The signal complexity and spectral power are evaluated in five frequency intervals, namely the delta, theta, alpha, beta, and gamma bands, and used as the EEG signal feature representation. Classification of seizure and seizure-free data was performed with widely used, high-performing classifiers. Substantial comparative performance evaluation experiments were performed on a large EEG data record of 341 patients from the Temple University Hospital EEG seizure database. Based on statistically validated criteria, the results show the efficiency of band-limited spectral power and signal complexity when used with random forest and gradient-boosting decision tree classifiers (an area under the curve (AUC) of 95%, and 91% for both F-measure and accuracy). These results support the use of these automatic classification schemes to help the practicing neurologist interpret EEG records more accurately and without tedious visual inspection.
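    As a rough illustration of the band-limited spectral power feature mentioned in this abstract, the sketch below sums a plain periodogram over the five bands. The band edges in `BANDS`, the function name `band_powers`, and the periodogram scaling are assumptions made for the example; the abstract gives neither the exact band limits nor the estimator, and the signal complexity feature is not covered here. The direct DFT keeps the sketch dependency-free; a numerical library's FFT would normally be used on real recordings.

```python
import cmath
import math

# Assumed band edges in Hz; conventional values, not taken from the paper.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}


def band_powers(signal, fs, bands=BANDS):
    """Sum periodogram values per band for one EEG channel segment.

    signal: sequence of samples, fs: sampling rate in Hz.
    """
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]  # remove the DC offset
    powers = dict.fromkeys(bands, 0.0)
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        # Direct DFT coefficient at frequency bin k (O(n) per bin).
        coeff = sum(
            x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        p = abs(coeff) ** 2 / n  # unnormalised periodogram value
        for name, (lo, hi) in bands.items():
            if lo <= freq < hi:
                powers[name] += p
    return powers


# Usage: a synthetic 10 Hz sine sampled at 128 Hz for 2 seconds.
fs = 128
demo = [math.sin(2 * math.pi * 10 * t / fs) for t in range(2 * fs)]
demo_powers = band_powers(demo, fs)
```

    In this synthetic case nearly all of the power falls in the assumed alpha band, since 10 Hz lies between 8 and 13 Hz. The resulting per-band values could then be fed, one vector per segment, to a classifier such as a random forest.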

    Is-ClusterMPP: clustering algorithm through point processes and influence space towards high-dimensional data

    Clustering via Marked Point Processes and Influence Space, Is-ClusterMPP, is a new unsupervised clustering algorithm based on adaptive MCMC sampling of a marked point process of interacting balls. The chosen Gibbs energy cost function makes use of k-influence space information. The algorithm detects clusters of different shapes, sizes, and unbalanced local densities. It also aims to deal with high-dimensional and scalable datasets. By using the k-influence space, Is-ClusterMPP solves the problem of local heterogeneity in densities and prevents the global density from affecting the detection of unbalanced classes. This concept also reduces the number of input values. The curse of dimensionality is handled by a local subspace clustering principle embedded in a weighted similarity metric. The balls constitute a configuration sampled from the marked point process. Owing to the choice of the energy, they tend to cover neighboring data, which are then considered to belong to the same cluster. The energy balances different goals. (1) The data-driven objective function is defined according to the k-influence space, so that data in a high-density region are favored to be covered by a ball. (2) An interaction part in the energy prevents the balls from fully overlapping and favors connected groups of balls. The algorithm's Markov dynamics converge towards configurations sampled from the MPP model. The algorithm has been applied to real benchmarks consisting of gene expression data of various sizes. Different experiments have been carried out to compare Is-ClusterMPP against the most well-known clustering algorithms, and the results support its efficiency.