27,017 research outputs found

    Lockout-Tagout Ransomware:A Detection Method for Ransomware using Fuzzy Hashing and Clustering

    Get PDF
    Ransomware attacks are a prevalent cybersecurity threat to every user and enterprise today. This is attributed to their polymorphic behaviour and dispersion of inexhaustible versions due to the same ransomware family or threat actor. A certain ransomware family or threat actor repeatedly utilises nearly the same style or codebase to create a vast number of ransomware versions. Therefore, it is essential for users and enterprises to keep well-informed about this threat landscape and adopt proactive prevention strategies to minimise its spread and affects. This requires a technique to detect ransomware samples to determine the similarity and link with the known ransomware family or threat actor. Therefore, this paper presents a detection method for ransomware by employing a combination of a similarity preserving hashing method called fuzzy hashing and a clustering method. This detection method is applied on the collected WannaCry/WannaCryptor ransomware samples utilising a range of fuzzy hashing and clustering methods. The clustering results of various clustering methods are evaluated through the use of the internal evaluation indexes to determine the accuracy and consistency of their clustering results, thus the effective combination of fuzzy hashing and clustering method as applied to the particular ransomware corpus. The proposed detection method is a static analysis method, which requires fewer computational overheads and performs rapid comparative analysis with respect to other static analysis methods

    Semi-supervised cross-entropy clustering with information bottleneck constraint

    Full text link
    In this paper, we propose a semi-supervised clustering method, CEC-IB, that models data with a set of Gaussian distributions and that retrieves clusters based on a partial labeling provided by the user (partition-level side information). By combining the ideas from cross-entropy clustering (CEC) with those from the information bottleneck method (IB), our method trades between three conflicting goals: the accuracy with which the data set is modeled, the simplicity of the model, and the consistency of the clustering with side information. Experiments demonstrate that CEC-IB has a performance comparable to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but is faster, more robust to noisy labels, automatically determines the optimal number of clusters, and performs well when not all classes are present in the side information. Moreover, in contrast to other semi-supervised models, it can be successfully applied in discovering natural subgroups if the partition-level side information is derived from the top levels of a hierarchical clustering

    Overlapping modularity at the critical point of k-clique percolation

    Get PDF
    One of the most remarkable social phenomena is the formation of communities in social networks corresponding to families, friendship circles, work teams, etc. Since people usually belong to several different communities at the same time, the induced overlaps result in an extremely complicated web of the communities themselves. Thus, uncovering the intricate community structure of social networks is a non-trivial task with great potential for practical applications, gaining a notable interest in the recent years. The Clique Percolation Method (CPM) is one of the earliest overlapping community finding methods, which was already used in the analysis of several different social networks. In this approach the communities correspond to k-clique percolation clusters, and the general heuristic for setting the parameters of the method is to tune the system just below the critical point of k-clique percolation. However, this rule is based on simple physical principles and its validity was never subject to quantitative analysis. Here we examine the quality of the partitioning in the vicinity of the critical point using recently introduced overlapping modularity measures. According to our results on real social- and other networks, the overlapping modularities show a maximum close to the critical point, justifying the original criteria for the optimal parameter settings.Comment: 20 pages, 6 figure

    Soft ranking in clustering

    Get PDF
    Due to the diffusion of large-dimensional data sets (e.g., in DNA microarray or document organization and retrieval applications), there is a growing interest in clustering methods based on a proximity matrix. These have the advantage of being based on a data structure whose size only depends on cardinality, not dimensionality. In this paper, we propose a clustering technique based on fuzzy ranks. The use of ranks helps to overcome several issues of large-dimensional data sets, whereas the fuzzy formulation is useful in encoding the information contained in the smallest entries of the proximity matrix. Comparative experiments are presented, using several standard hierarchical clustering techniques as a reference

    A survey of kernel and spectral methods for clustering

    Get PDF
    Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel version of many classical clustering algorithms, e.g., K-means, SOM and neural gas. Spectral clustering arise from concepts in spectral graph theory and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof of the fact that these two paradigms have the same objective is reported since it has been proven that these two seemingly different approaches have the same mathematical foundation. Besides, fuzzy kernel clustering methods are presented as extensions of kernel K-means clustering algorithm. (C) 2007 Pattem Recognition Society. Published by Elsevier Ltd. All rights reserved

    Application of k Means Clustering algorithm for prediction of Students Academic Performance

    Get PDF
    The ability to monitor the progress of students academic performance is a critical issue to the academic community of higher learning. A system for analyzing students results based on cluster analysis and uses standard statistical algorithms to arrange their scores data according to the level of their performance is described. In this paper, we also implemented k mean clustering algorithm for analyzing students result data. The model was combined with the deterministic model to analyze the students results of a private Institution in Nigeria which is a good benchmark to monitor the progression of academic performance of students in higher Institution for the purpose of making an effective decision by the academic planners.Comment: IEEE format, International Journal of Computer Science and Information Security, IJCSIS January 2010, ISSN 1947 5500, http://sites.google.com/site/ijcsis
    • 

    corecore