2,672 research outputs found

    New data induced metric for density based clustering

    Full text link

    An analytics-based heuristic decomposition of a bilevel multiple-follower cutting stock problem

    Get PDF
    This paper presents a new class of multiple-follower bilevel problems and a heuristic approach to solving them. In this new class of problems, the followers may be nonlinear, do not share constraints or variables, and are at most weakly constrained. This allows the leader variables to be partitioned among the followers. We show that current approaches for solving multiple-follower problems are unsuitable for our new class of problems and instead we propose a novel analytics-based heuristic decomposition approach. This approach uses Monte Carlo simulation and k-medoids clustering to reduce the bilevel problem to a single level, which can then be solved using integer programming techniques. The examples presented show that our approach produces better solutions and scales up better than the other approaches in the literature. Furthermore, for large problems, we combine our approach with the use of self-organising maps in place of k-medoids clustering, which significantly reduces the clustering times. Finally, we apply our approach to a real-life cutting stock problem. Here a forest harvesting problem is reformulated as a multiple-follower bilevel problem and solved using our approachThis publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/228

    Density based fuzzy c-means clustering of non-convex patterns

    Full text link
    We propose a new technique to perform unsupervised data classification (clustering) based on density induced metric and non-smooth optimization. Our goal is to automatically recognize multidimensional clusters of non-convex shape. We present a modification of the fuzzy c-means algorithm, which uses the data induced metric, defined with the help of Delaunay triangulation. We detail computation of the distances in such a metric using graph algorithms. To find optimal positions of cluster prototypes we employ the discrete gradient method of non-smooth optimization. The new clustering method is capable to identify non-convex overlapped d-dimensional clusters.<br /

    AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments

    Get PDF
    This report considers the application of Articial Intelligence (AI) techniques to the problem of misuse detection and misuse localisation within telecommunications environments. A broad survey of techniques is provided, that covers inter alia rule based systems, model-based systems, case based reasoning, pattern matching, clustering and feature extraction, articial neural networks, genetic algorithms, arti cial immune systems, agent based systems, data mining and a variety of hybrid approaches. The report then considers the central issue of event correlation, that is at the heart of many misuse detection and localisation systems. The notion of being able to infer misuse by the correlation of individual temporally distributed events within a multiple data stream environment is explored, and a range of techniques, covering model based approaches, `programmed' AI and machine learning paradigms. It is found that, in general, correlation is best achieved via rule based approaches, but that these suffer from a number of drawbacks, such as the difculty of developing and maintaining an appropriate knowledge base, and the lack of ability to generalise from known misuses to new unseen misuses. Two distinct approaches are evident. One attempts to encode knowledge of known misuses, typically within rules, and use this to screen events. This approach cannot generally detect misuses for which it has not been programmed, i.e. it is prone to issuing false negatives. The other attempts to `learn' the features of event patterns that constitute normal behaviour, and, by observing patterns that do not match expected behaviour, detect when a misuse has occurred. This approach is prone to issuing false positives, i.e. inferring misuse from innocent patterns of behaviour that the system was not trained to recognise. Contemporary approaches are seen to favour hybridisation, often combining detection or localisation mechanisms for both abnormal and normal behaviour, the former to capture known cases of misuse, the latter to capture unknown cases. In some systems, these mechanisms even work together to update each other to increase detection rates and lower false positive rates. It is concluded that hybridisation offers the most promising future direction, but that a rule or state based component is likely to remain, being the most natural approach to the correlation of complex events. The challenge, then, is to mitigate the weaknesses of canonical programmed systems such that learning, generalisation and adaptation are more readily facilitated

    A simulated annealing-based maximum-margin clustering algorithm

    Get PDF
    © 2018 Wiley Periodicals, Inc. Maximum-margin clustering is an extension of the support vector machine (SVM) to clustering. It partitions a set of unlabeled data into multiple groups by finding hyperplanes with the largest margins. Although existing algorithms have shown promising results, there is no guarantee of convergence of these algorithms to global solutions due to the nonconvexity of the optimization problem. In this paper, we propose a simulated annealing-based algorithm that is able to mitigate the issue of local minima in the maximum-margin clustering problem. The novelty of our algorithm is twofold, ie, (i) it comprises a comprehensive cluster modification scheme based on simulated annealing, and (ii) it introduces a new approach based on the combination of k-means++ and SVM at each step of the annealing process. More precisely, k-means++ is initially applied to extract subsets of the data points. Then, an unsupervised SVM is applied to improve the clustering results. Experimental results on various benchmark data sets (of up to over a million points) give evidence that the proposed algorithm is more effective at solving the clustering problem than a number of popular clustering algorithms
    corecore