Lockout-Tagout Ransomware: A Detection Method for Ransomware using Fuzzy Hashing and Clustering
Ransomware attacks are a prevalent cybersecurity threat to every user and enterprise today. This is attributed to their polymorphic behaviour and to the proliferation of seemingly inexhaustible versions produced by the same ransomware family or threat actor. A given ransomware family or threat actor repeatedly utilises nearly the same style or codebase to create a vast number of ransomware versions. Therefore, it is essential for users and enterprises to stay well informed about this threat landscape and adopt proactive prevention strategies to minimise its spread and effects. This requires a technique that detects ransomware samples by determining their similarity and link to a known ransomware family or threat actor. Therefore, this paper presents a detection method for ransomware that combines a similarity-preserving hashing method called fuzzy hashing with a clustering method. This detection method is applied to collected WannaCry/WannaCryptor ransomware samples using a range of fuzzy hashing and clustering methods. The clustering results of the various clustering methods are evaluated using internal evaluation indexes to determine the accuracy and consistency of their clustering results, and thus the effective combination of fuzzy hashing and clustering method for the particular ransomware corpus. The proposed detection method is a static analysis method, which requires fewer computational overheads and performs rapid comparative analysis with respect to other static analysis methods.
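The pipeline described in this abstract (score pairwise similarity, then group similar samples) can be illustrated with a minimal sketch. This is not the paper's implementation: `difflib.SequenceMatcher` stands in for a real fuzzy-hash comparison, the threshold of 60 is an arbitrary assumption, the clustering is plain threshold-based single linkage, and the sample strings are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> int:
    """Stand-in for a fuzzy-hash comparison: 0 (unrelated) to 100 (identical)."""
    return round(100 * SequenceMatcher(None, a, b, autojunk=False).ratio())


def cluster_by_similarity(samples: dict, threshold: int = 60) -> list:
    """Single-linkage clustering: link two samples whose score reaches the threshold."""
    parent = {name: name for name in samples}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(samples, 2):
        if similarity(samples[a], samples[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in samples:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical strings standing in for the contents of three samples.
samples = {
    "variant_a": "MZ encrypt_files(); drop_note('pay'); spread_smb();",
    "variant_b": "MZ encrypt_files(); drop_note('pay now'); spread_smb();",
    "unrelated": "PK zip archive with holiday photos",
}
clusters = cluster_by_similarity(samples)
```

In practice the `similarity` function would be replaced by the comparison routine of an actual fuzzy hashing tool, and the threshold would need tuning on the corpus at hand.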
Semi-supervised cross-entropy clustering with information bottleneck constraint
In this paper, we propose a semi-supervised clustering method, CEC-IB, that
models data with a set of Gaussian distributions and that retrieves clusters
based on a partial labeling provided by the user (partition-level side
information). By combining the ideas from cross-entropy clustering (CEC) with
those from the information bottleneck method (IB), our method trades between
three conflicting goals: the accuracy with which the data set is modeled, the
simplicity of the model, and the consistency of the clustering with side
information. Experiments demonstrate that CEC-IB has a performance comparable
to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but
is faster, more robust to noisy labels, automatically determines the optimal
number of clusters, and performs well when not all classes are present in the
side information. Moreover, in contrast to other semi-supervised models, it can
be successfully applied in discovering natural subgroups if the partition-level
side information is derived from the top levels of a hierarchical clustering.
Overlapping modularity at the critical point of k-clique percolation
One of the most remarkable social phenomena is the formation of communities
in social networks corresponding to families, friendship circles, work teams,
etc. Since people usually belong to several different communities at the same
time, the induced overlaps result in an extremely complicated web of the
communities themselves. Thus, uncovering the intricate community structure of
social networks is a non-trivial task with great potential for practical
applications, and it has gained notable interest in recent years. The Clique
Percolation Method (CPM) is one of the earliest overlapping community finding
methods, which was already used in the analysis of several different social
networks. In this approach the communities correspond to k-clique percolation
clusters, and the general heuristic for setting the parameters of the method is
to tune the system just below the critical point of k-clique percolation.
However, this rule is based on simple physical principles and its validity has
never been subjected to quantitative analysis. Here we examine the quality of the
partitioning in the vicinity of the critical point using recently introduced
overlapping modularity measures. According to our results on real social and
other networks, the overlapping modularities show a maximum close to the
critical point, justifying the original criteria for the optimal parameter
settings.
Comment: 20 pages, 6 figures
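To make the notion of a k-clique percolation cluster concrete, here is a minimal brute-force sketch, suitable only for small graphs. Two k-cliques are treated as adjacent when they share k-1 nodes, and each community is the union of a connected set of adjacent k-cliques. The function name and the example graph are illustrative assumptions; the sketch does not cover the parameter tuning or the overlapping modularity measures studied in the paper.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Return overlapping communities as unions of adjacent k-cliques."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate every k-clique by brute force (exponential; small graphs only).
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: merge two cliques that share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return list(communities.values())


# Two triangles sharing an edge, plus a third triangle attached at node 4 only.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
communities = k_clique_communities(edges, k=3)
```

With k = 3, node 4 ends up in both communities, which is the kind of overlap the method is designed to capture.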
Soft ranking in clustering
Due to the diffusion of large-dimensional data sets (e.g., in DNA microarray or document organization and retrieval applications), there is a growing interest in clustering methods based on a proximity matrix. These have the advantage of being based on a data structure whose size only depends on cardinality, not dimensionality. In this paper, we propose a clustering technique based on fuzzy ranks. The use of ranks helps to overcome several issues of large-dimensional data sets, whereas the fuzzy formulation is useful in encoding the information contained in the smallest entries of the proximity matrix. Comparative experiments are presented, using several standard hierarchical clustering techniques as a reference.
A survey of kernel and spectral methods for clustering
Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM and neural gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof of the fact that these two paradigms have the same objective is reported, since it has been proven that these two seemingly different approaches have the same mathematical foundation. Besides, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm. (C) 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
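As a small illustration of what a "kernel version" of K-means relies on, the sketch below computes the squared distance between a point and a cluster mean in feature space using kernel evaluations only, which is the standard kernel-trick expansion. The function names, the choice of an RBF kernel, and the `gamma` value are assumptions made for the example and are not taken from the survey.

```python
import math


def linear_kernel(x, y):
    """Plain dot product; with this kernel the result is ordinary Euclidean geometry."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; gamma is an illustrative default."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def feature_space_sq_distance(x, cluster, kernel=rbf_kernel):
    """Squared distance from phi(x) to the mean of phi(y) over y in cluster.

    Expands ||phi(x) - m||^2 as K(x,x) - 2*mean_y K(x,y) + mean_{y,z} K(y,z),
    so the feature map phi is never computed explicitly.
    """
    n = len(cluster)
    self_term = kernel(x, x)
    cross_term = sum(kernel(x, y) for y in cluster) / n
    within_term = sum(kernel(y, z) for y in cluster for z in cluster) / (n * n)
    return self_term - 2 * cross_term + within_term


# With the linear kernel this equals the squared Euclidean distance
# from (3, 4) to the centroid (1, 0) of the cluster.
d_linear = feature_space_sq_distance((3, 4), [(0, 0), (2, 0)], kernel=linear_kernel)
```

A kernel K-means loop would assign each point to the cluster minimising this quantity and repeat until the assignments stop changing.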
Application of k Means Clustering algorithm for prediction of Students Academic Performance
The ability to monitor the progress of students' academic performance is a
critical issue for the academic community of higher learning. We describe a
system that analyzes students' results using cluster analysis and standard
statistical algorithms to arrange their score data according to the level of
their performance. In this paper, we also implement the k-means
clustering algorithm for analyzing students' result data. The model was combined
with the deterministic model to analyze the students' results at a private
institution in Nigeria, which provides a good benchmark for monitoring the
progression of students' academic performance in higher institutions for the
purpose of effective decision-making by academic planners.
Comment: IEEE format, International Journal of Computer Science and
Information Security, IJCSIS January 2010, ISSN 1947 5500,
http://sites.google.com/site/ijcsis
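The following is a minimal sketch of k-means (Lloyd's algorithm) applied to one-dimensional score data, to illustrate the kind of grouping the abstract describes. The scores are made up, the evenly spaced starting centroids are an assumption chosen to keep the run reproducible, and nothing here reflects the paper's actual data or its deterministic model.

```python
def kmeans_1d(values, k, max_iter=100):
    """Cluster numbers into k groups; returns (centroids, clusters)."""
    lo, hi = min(values), max(values)
    # Assumption: start with centroids spread evenly across the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical percentage scores for eleven students.
scores = [35, 40, 42, 55, 58, 60, 62, 85, 88, 90, 92]
centroids, groups = kmeans_1d(scores, k=3)
```

Each resulting group can then be read as a performance band, with the centroid giving the band's average score.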