2,182 research outputs found

Maximum Margin Clustering for State Decomposition of Metastable Systems

Author: Allwein, Becker, Berglund, Biancalani, Bowman, Boyd, Chema, Chodera, Crammer, Daura, Deuflhard, Elmer, Genova, Glättli, Groningen, Hao Wu, Hastie, Horn, Jain, Keller, Kellogg, Kloeden, Kwak, McGibbon, Mehrmann, Noé, Nüske, Prinz, Pryor, Pérez-Hernández, Rahimi, Sarich, Schwantes, Shalev-Shwartz, Shao, Sorin, Swope, Vapnik, Wu, Xu, Yao, Zhang
Publication date: 31/12/2014

When studying a metastable dynamical system, a prime concern is how to decompose the phase space into a set of metastable states. Unfortunately, decomposing metastable states from simulation or experimental data remains a challenge. The most popular and simplest approach is geometric clustering, which builds on classical clustering techniques. However, this approach has two prerequisites: (1) the data are obtained from simulations or experiments in global equilibrium, and (2) the coordinate system is appropriately selected. Recently, kinetic clustering, which is based on phase space discretization and transition probability estimation, has drawn much attention because it applies to more general cases, but choosing the discretization policy is difficult. In this paper, a new decomposition method, designated maximum margin metastable clustering, is proposed. It converts the problem of metastable state decomposition into a semi-supervised learning problem, so that the large margin technique can be used to search for the optimal decomposition without phase space discretization. Several simulation examples illustrate the effectiveness of the proposed method.
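The kinetic clustering ingredient mentioned in the abstract, transition probability estimation on a discretized phase space, can be sketched as follows. This is only an illustration of that baseline step, not the maximum margin method the paper proposes. The function name `transition_matrix`, the toy trajectory, and the lag of one step are assumptions made for the example.

```python
def transition_matrix(dtraj, n_states, lag=1):
    """Estimate a row-stochastic transition matrix from a discrete trajectory.

    dtraj is a list of state indices (the discretized trajectory);
    counts are taken between frames that are `lag` steps apart.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observed transitions are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical two-state trajectory that rarely switches state.
dtraj = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
T = transition_matrix(dtraj, n_states=2)
print(T)
```

Large diagonal entries in the estimated matrix indicate states that the system rarely leaves, which is the property a metastable decomposition looks for.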

arXiv.org e-Print Archive

Crossref

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

Malware Classification based on Call Graph Clustering

Author: Joris Kinable, Orestis Kostakis
Publication date: 25/08/2010

Each day, anti-virus companies receive tens of thousands of samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. Representing malware samples as call graphs makes it possible to abstract certain variations away and to detect structural similarities between samples. The ability to cluster similar samples together will make more generic detection techniques possible, thereby targeting the commonalities of the samples within a cluster. To compare call graphs with one another, we compute pairwise graph similarity scores via graph matchings which approximately minimize the graph edit distance. Next, to facilitate the discovery of similar malware samples, we employ several clustering algorithms, including k-medoids and DBSCAN. Clustering experiments are conducted on a collection of real malware samples, and the results are evaluated against manual classifications provided by human malware analysts. Experiments show that it is indeed possible to accurately detect malware families via call graph clustering. We anticipate that in the future, call graphs can be used to analyse the emergence of new malware families, and ultimately to automate implementation of generic detection schemes.

Comment: This research has been supported by TEKES - the Finnish Funding Agency for Technology and Innovation as part of its ICT SHOK Future Internet research programme, grant 40212/0
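The pipeline described in the abstract, pairwise distances between call graphs followed by density-based clustering, can be sketched roughly as below. Note the substitution: instead of the approximate graph edit distance used in the paper, this sketch uses a Jaccard distance over edge sets as a much simpler stand-in. The sample names, function names, and the `eps` and `min_pts` values are all hypothetical.

```python
from collections import deque


def edge_jaccard_distance(g1, g2):
    """Distance in [0, 1] between two call graphs given as sets of
    (caller, callee) edges. A stand-in for graph edit distance."""
    union = g1 | g2
    if not union:
        return 0.0
    return 1.0 - len(g1 & g2) / len(union)


def dbscan(dist, eps, min_pts):
    """Plain DBSCAN over a precomputed distance matrix.

    Returns one label per sample; -1 marks noise. A point counts
    itself among its neighbours.
    """
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(n) if dist[j][k] <= eps]
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical samples: two families of near-identical graphs plus one outlier.
samples = {
    "a1": {("main", "decrypt"), ("decrypt", "connect"), ("connect", "send")},
    "a2": {("main", "decrypt"), ("decrypt", "connect"), ("connect", "send"),
           ("main", "sleep")},
    "b1": {("start", "scan"), ("scan", "copy"), ("copy", "exec")},
    "b2": {("start", "scan"), ("scan", "copy"), ("copy", "exec"),
           ("exec", "exit")},
    "c1": {("init", "log")},
}
names = list(samples)
dist = [[edge_jaccard_distance(samples[a], samples[b]) for b in names]
        for a in names]
labels = dbscan(dist, eps=0.3, min_pts=2)
print(dict(zip(names, labels)))
```

DBSCAN is convenient here because it needs only a distance matrix and does not require the number of families in advance; the unrelated sample is reported as noise rather than forced into a cluster.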

arXiv.org e-Print Archive

CiteSeerX

Repository TU/e

Finding groups in data: Cluster analysis with ants

Author: Berger, Bonabeau, Brito, Brucker, Chu, Deneubourg, Dorigo, Dubes, Ester, Franks, Ganti, Gibson, Guha, Halkidi, Handl, Hansen, Jain, Karypis, Kaufman, Kennedy, Lee, Lumer, MacQueen, Ng, Oprisan, Rijsbergen, Urszula Boryczka, Welch, Zait
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

We present in this paper a modification of Lumer and Faieta’s algorithm for data clustering. This approach mimics the clustering behavior observed in real ant colonies. The algorithm automatically discovers clusters in numerical data without prior knowledge of the possible number of clusters. In this paper we focus on ant-based clustering algorithms, a particular kind of swarm intelligence system, and on how the final clustering is affected by the dissimilarity metric used during classification: Euclidean, Cosine, and Gower measures. Clustering with swarm-based algorithms is emerging as an alternative to more conventional clustering methods, such as k-means. Among the many bio-inspired techniques, ant clustering algorithms have received special attention, especially because they still require much investigation to improve performance, stability and other key features that would make such algorithms mature tools for data mining. As a case study, this paper focuses on the behavior of clustering procedures in those new approaches. The proposed algorithm and its modifications are evaluated on a number of well-known benchmark datasets. Empirical results show that ant-based clustering algorithms perform well when compared to other techniques.
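The three dissimilarity measures named in the abstract can be sketched for purely numerical data as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the Gower measure is shown only in its numeric form (mean range-normalised absolute difference), and returning 1.0 for a zero vector in the cosine case is a convention chosen for the example.

```python
import math


def euclidean(x, y):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_dissimilarity(x, y):
    """One minus the cosine of the angle between x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 1.0  # assumed convention for zero vectors
    return 1.0 - dot / (norm_x * norm_y)


def feature_ranges(data):
    """Per-feature range (max - min) over the whole data set."""
    return [max(col) - min(col) for col in zip(*data)]


def gower_numeric(x, y, ranges):
    """Gower dissimilarity for numeric features: the mean of the
    absolute differences, each scaled by its feature range."""
    terms = [abs(a - b) / r if r > 0 else 0.0
             for a, b, r in zip(x, y, ranges)]
    return sum(terms) / len(terms)


# Hypothetical data with features on very different scales.
data = [(0, 10), (2, 20), (4, 30)]
ranges = feature_ranges(data)
print(euclidean(data[0], data[1]))
print(cosine_dissimilarity(data[0], data[1]))
print(gower_numeric(data[0], data[1], ranges))
```

The choice matters because Euclidean distance is dominated by the feature with the largest scale, cosine dissimilarity ignores vector length, and the Gower measure normalises each feature by its range.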

Crossref

Bournemouth University Research Online

Consensus clustering and functional interpretation of gene-expression data

Author: Kellam P., Liu X., Martin Nigel, Orengo C.A., Swift S., Tucker A., Vinciotti V.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004

Microarray analysis using clustering algorithms can suffer from a lack of inter-method consistency in assigning related gene-expression profiles to clusters. Obtaining a consensus set of clusters from a number of clustering methods should improve confidence in gene-expression analysis. Here we introduce consensus clustering, which provides such an advantage. When coupled with a statistically based gene functional analysis, our method allowed the identification of novel genes regulated by NFκB and the unfolded protein response in certain B-cell lymphomas.
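The idea of combining several clusterings into a consensus can be sketched with a pairwise agreement count. This is a simplified illustration under stated assumptions, not the procedure from the paper: the function names, the toy labelings, and the rule of grouping items through connected components above an agreement threshold are all choices made for the example.

```python
def agreement_matrix(labelings):
    """Count, for every pair of items, how many clustering methods
    placed the two items in the same cluster."""
    n = len(labelings[0])
    agree = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    agree[i][j] += 1
    return agree


def consensus_groups(agree, min_agreement):
    """Group items linked by pairs that at least `min_agreement`
    methods agree on (connected components of the thresholded graph)."""
    n = len(agree)
    seen = set()
    groups = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        group = []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if j not in seen and agree[i][j] >= min_agreement:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(group))
    return groups


# Hypothetical output of three clustering methods on five genes.
# Cluster ids are arbitrary per method; only co-membership matters.
labelings = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
agree = agreement_matrix(labelings)
strict = consensus_groups(agree, min_agreement=3)
relaxed = consensus_groups(agree, min_agreement=2)
print(strict)
print(relaxed)
```

Raising the threshold keeps only the groupings that every method supports, while lowering it trades confidence for larger clusters.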

Springer - Publisher Connector

UCL Discovery

PubMed Central

Birkbeck Institutional Research Online

Spiral - Imperial College Digital Repository

Brunel University Research Archive