Search CORE

1,542 research outputs found

Data mining based cyber-attack detection

Author: Tianfield Huaglory
Publication venue
Publication date: 31/05/2017
Field of study

MACOC: a medoid-based ACO clustering algorithm

Author: A.P. Dempster
C. Blum
D. Martens
E. Hruschka
F. Otero
F. Otero
F. Wilcoxon
F.O. França de
M. Borrotti
O.M. Jafar
R. Parpinelli
S. Schaeffer
X. Zhang
Y. Kao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014
Field of study

The application of ACO-based algorithms in data mining is growing over the last few years and several supervised and unsupervised learning algorithms have been developed using this bio-inspired approach. Most recent works concerning unsupervised learning have been focused on clustering, showing great potential of ACO-based techniques. This work presents an ACO-based clustering algorithm inspired by the ACO Clustering (ACOC) algorithm. The proposed approach restructures ACOC from a centroid-based technique to a medoid-based technique, where the properties of the search space are not necessarily known. Instead, it only relies on the information about the distances amongst data. The new algorithm, called MACOC, has been compared against well-known algorithms (K-means and Partition Around Medoids) and with ACOC. The experiments measure the accuracy of the algorithm for both synthetic datasets and real-world datasets extracted from the UCI Machine Learning Repository

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Kent Academic Repository

Biblos-e Archivo

Maximum Margin Clustering for State Decomposition of Metastable Systems

Author: Allwein
Becker
Berglund
Biancalani
Bowman
Boyd
Chema
Chodera
Chodera
Chodera
Crammer
Daura
Deuflhard
Deuflhard
Elmer
Genova
Glättli
Groningen
Hao Wu
Hastie
Horn
Jain
Keller
Kellogg
Kloeden
Kwak
McGibbon
Mehrmann
Noé
Noé
Noé
Noé
Noé
Nüske
Prinz
Pryor
Pérez-Hernández
Rahimi
Sarich
Schwantes
Shalev-Shwartz
Shao
Sorin
Swope
Vapnik
Wu
Xu
Yao
Zhang
Publication venue
Publication date: 31/12/2014
Field of study

When studying a metastable dynamical system, a prime concern is how to decompose the phase space into a set of metastable states. Unfortunately, the metastable state decomposition based on simulation or experimental data is still a challenge. The most popular and simplest approach is geometric clustering which is developed based on the classical clustering technique. However, the prerequisites of this approach are: (1) data are obtained from simulations or experiments which are in global equilibrium and (2) the coordinate system is appropriately selected. Recently, the kinetic clustering approach based on phase space discretization and transition probability estimation has drawn much attention due to its applicability to more general cases, but the choice of discretization policy is a difficult task. In this paper, a new decomposition method designated as maximum margin metastable clustering is proposed, which converts the problem of metastable state decomposition to a semi-supervised learning problem so that the large margin technique can be utilized to search for the optimal decomposition without phase space discretization. Moreover, several simulation examples are given to illustrate the effectiveness of the proposed method

arXiv.org e-Print Archive

Crossref

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

A survey of outlier detection methodologies

Author: Austin J.
Hodge V.J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004
Field of study

Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can identify errors and remove their contaminating effect on the data set and as such to purify the data for processing. The original outlier detection methods were arbitrary but now, principled and systematic techniques are used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review

CiteSeerX

Crossref

White Rose Research Online

Deep Metric Learning via Facility Location

Author: Jegelka Stefanie
Murphy Kevin
Rathod Vivek
Song Hyun Oh
Publication venue
Publication date: 11/04/2017
Field of study

Learning the representation and the similarity metric in an end-to-end fashion with deep networks have demonstrated outstanding results for clustering and retrieval. However, these recent approaches still suffer from the performance degradation stemming from the local metric training procedure which is unaware of the global structure of the embedding space. We propose a global metric learning scheme for optimizing the deep metric embedding with the learnable clustering function and the clustering metric (NMI) in a novel structured prediction framework. Our experiments on CUB200-2011, Cars196, and Stanford online products datasets show state of the art performance both on the clustering and retrieval tasks measured in the NMI and Recall@K evaluation metrics.Comment: Submission accepted at CVPR 201

arXiv.org e-Print Archive

Crossref

Text stream mining for Massive Open Online Courses:review and perspectives

Author: Cocea Mihaela
Gaber Mohamad Medhat
Shatnawi Safwan
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2014
Field of study

Portsmouth University Research Portal (Pure)