32 research outputs found

    Learning Feature Weights for Density-Based Clustering

    K-Means is the most popular and widely used clustering algorithm, but it cannot recover non-spherical clusters in data sets. DBSCAN is arguably the most popular algorithm for recovering clusters of arbitrary shape, which makes addressing the weaknesses of this density-based algorithm of great interest. One concern is that DBSCAN requires two parameters and cannot recover clusters of widely varying density. The problem at the heart of this thesis is that, during clustering, DBSCAN uses all available features and treats them equally regardless of their degree of relevance in the data set, which can harm the results. This thesis addresses these problems by laying the foundation of feature-weighted density-based clustering. Specifically, the thesis introduces DBSCANR, a density-based clustering algorithm using reverse nearest neighbours that requires fewer parameters than DBSCAN to recover clusters. DBSCANR is based on the insight that, in real-world data sets, the densities of the arbitrary-shape clusters to be recovered are very different from each other. The thesis extends DBSCANR to weighted DBSCANR (W-DBSCANR), which uses feature weighting to assign different levels of relevance to the features in a data set. It then extends W-DBSCANR with the Minkowski metric, so that the weights can be interpreted as feature re-scaling factors; the resulting algorithm is named MW-DBSCANR. Experiments on both artificial and real-world data sets demonstrate the superiority of our method over DBSCAN-type algorithms. These weighted algorithms considerably reduce the impact of irrelevant features while recovering arbitrary-shape clusters of different densities in high-dimensional data sets.
Within this context, this thesis incorporates a popular algorithm, feature selection using feature similarity (FSFS), into both W-DBSCANR and MW-DBSCANR to address the problem of feature selection. This unsupervised feature selection algorithm uses feature clustering and feature similarity to reduce the number of features in a data set. With a similar aim, and exploiting the concept of feature similarity, the thesis introduces density-based feature selection using feature similarity (DBFSFS), a method that takes density-based cluster structure into account when reducing the number of features in a data set. This thesis then applies the developed method to real-world high-dimensional gene expression data sets. DBFSFS improves cluster recovery by substantially reducing the number of features in high-dimensional, low-sample-size data sets.
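
The idea that feature weights act as re-scaling factors under the Minkowski metric can be illustrated with a minimal sketch. The form below, in which each weight is raised to the same power p as the coordinate difference, is an assumption borrowed from Minkowski-weighted clustering in general; the abstract does not give the thesis's exact definition, and the function name `weighted_minkowski` is hypothetical.

```python
def weighted_minkowski(x, y, weights, p=2.0):
    """Minkowski distance with per-feature weights (illustrative form).

    Assumed definition: d(x, y) = (sum_v (w_v * |x_v - y_v|) ** p) ** (1 / p).
    Because each weight is raised to the power p together with the
    difference, this equals the ordinary Minkowski distance computed
    after multiplying every feature v by w_v.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    total = sum((w * abs(a - b)) ** p for a, b, w in zip(x, y, weights))
    return total ** (1.0 / p)


# Equal weights give the plain Euclidean distance for p = 2.
print(weighted_minkowski([0, 0], [3, 4], [1, 1]))  # 5.0
# A zero weight removes the second (assumed irrelevant) feature entirely.
print(weighted_minkowski([0, 0], [3, 4], [1, 0]))  # 3.0
```

A distance of this kind could be passed to any density-based algorithm that accepts a custom metric; how the weights themselves are learned is not described in the abstract and is left out here.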

    Computer-based tools for supporting forest management. The experience and the expertise world-wide

    Report of Cost Action FP 0804 Forest Management Decision Support Systems (FORSYS). "Computer-based tools for supporting forest management. The experience and the expertise world-wide" answers a call from both the research and the professional communities for a synthesis of current knowledge about the use of computerized tools in forest management planning. According to the aims of the Forest Management Decision Support Systems (FORSYS) action (http://fp0804.emu.ee/), this synthesis is a critical success factor for developing a comprehensive quality reference for forest management decision support systems. The emphasis of the book is on identifying and assessing the support provided by computerized tools to enhance forest management planning in real-world contexts. The book thus identifies the management planning problems that prevail world-wide in order to discuss the architecture and the components of the tools used to address them. Of importance is the report of architecture approaches, models and methods, knowledge management and participatory planning techniques used to address specific management planning problems. We think that this synthesis may provide effective support to research and outreach activities that focus on the development of forest management decision support systems. It may further help forest managers define the requirements for a tool that best meets their needs. The first chapter of the book provides an introduction to the use of decision support systems in the forest sector and lays out the FORSYS framework for reporting the experience and expertise acquired in each country. Emphasis is on the FORSYS ontology, which facilitates the sharing of experiences needed to characterize and evaluate the use of computerized tools when addressing forest management planning problems. The twenty-six country reports share a structure designed to underline a problem-centric focus.
Specifically, they all start with the identification of the management planning problems that are prevalent in the country and then move on to the characterization and assessment of the computerized tools used to address them. The reports were led by researchers with background and expertise in areas that range from ecological modeling to forest modeling, management planning, and information and communication technology development. They benefited from the input provided by forest practitioners and by organizations that are responsible for developing and implementing forest management plans. A conclusions chapter highlights the success of bringing together such a wide range of disciplines and perspectives. This book benefited from voluntary contributions by 94 authors and from the involvement of several forest stakeholders from twenty-six countries in Europe, North and South America, Africa and Asia over a three-year period. We, the chair of FORSYS and the editorial committee of the publication, acknowledge and thank all authors, editors, stakeholders and FORSYS actors involved in this project for their valuable contributions.

    Algorithms for nonuniform networks

    In this thesis, observations on structural properties of natural networks are taken as a starting point for developing efficient algorithms for natural instances of different graph problems. The key areas discussed are sampling, clustering, routing, and pattern mining for large, nonuniform graphs. The results include observations on structural effects together with algorithms that aim to reveal structural properties or exploit their presence in solving an interesting graph problem. Traditionally, networks were modeled with uniform random graphs, assuming that each vertex was equally important and each edge equally likely to be present. Within the last decade, the approach has changed drastically due to numerous observations on structural complexity in natural networks, many of which proved the uniform model to be inadequate for some contexts. This quickly led to various models and measures that aim to characterize topological properties of different kinds of real-world networks beyond the uniform networks. The goal of this thesis is to utilize such observations in algorithm design, in addition to empowering the process of network analysis. Knowing that a graph exhibits certain characteristics allows for more efficient storage, processing, analysis, and feature extraction. Our emphasis is on local methods that avoid resorting to information about the graph structure that is not relevant to the answer sought. For example, when seeking the cluster of a single vertex, we compute it without using any global knowledge of the graph, iteratively examining the vicinity of the seed vertex. Similarly, we propose methods for sampling and spanning-tree construction according to certain criteria on the outcome without requiring knowledge of the graph as a whole.
Our motivation for concentrating on local methods is two-fold: one driving factor is the ever-increasing size of real-world problems, but an equally important fact is the nonuniformity present in many natural graph instances; properties that hold for the entire graph are often lost when only a small subgraph is examined.
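
The notion of a local method that grows a cluster from a seed vertex using only neighbourhood queries can be sketched as follows. This is a generic greedy expansion written for illustration, not the algorithm of the thesis: the quality measure (the fraction of edge endpoints that stay inside the cluster), the stopping rule, and the names `local_cluster`, `neighbors` and `max_size` are all assumptions.

```python
def local_cluster(seed, neighbors, max_size=50):
    """Greedily grow a cluster around `seed`.

    The graph is only accessed through `neighbors(v)`, which returns the
    vertices adjacent to v, so no global knowledge of the graph is needed.
    """
    cluster = {seed}

    def quality(members):
        # Fraction of edge endpoints of the members that lead back inside.
        internal = boundary = 0
        for v in members:
            for u in neighbors(v):
                if u in members:
                    internal += 1
                else:
                    boundary += 1
        total = internal + boundary
        return internal / total if total else 0.0

    best = quality(cluster)
    while len(cluster) < max_size:
        # Only vertices adjacent to the current cluster are examined.
        frontier = {u for v in cluster for u in neighbors(v)} - cluster
        candidates = [(quality(cluster | {u}), u) for u in frontier]
        if not candidates:
            break
        score, vertex = max(candidates, key=lambda c: c[0])
        if score <= best:
            break  # no neighbouring vertex improves the cluster
        cluster.add(vertex)
        best = score
    return cluster


# Toy graph: two triangles (a, b, c) and (d, e, f) joined by the edge c-d.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
print(sorted(local_cluster("a", graph.__getitem__)))  # ['a', 'b', 'c']
```

Starting from `a`, the expansion stops at the first triangle because adding `d` would lower the quality score, and the second triangle is never visited beyond its bridge vertex.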

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    Towards a unified method to synthesising scenarios and solvers in combinatorial optimisation via graph-based approaches

    Hyper-heuristics is a collection of search methods for selecting, combining and generating heuristics used to solve combinatorial optimisation problems. The primary objective of hyper-heuristics research is to develop more generally applicable search procedures that can be easily applied to a wide variety of problems. However, current hyper-heuristic architectures assume the existence of a domain barrier that does not allow low-level heuristics or operators to be applied outside the problem domain they were designed for. Additionally, the representation used to encode solvers differs from the one used to encode solutions. This means that the internal components of a hyper-heuristic cannot be optimised by the system itself. In this thesis we address these issues by using graph reformulations of selected problems and searching the space of operators with Grammatical Evolution techniques to evolve new perturbative and constructive heuristics. The low-level heuristics (representing graph transformations) are evolved using a single grammar which is capable of adapting to multiple domains. We test our generators of heuristics on instances of the Travelling Salesman Problem, Knapsack Problem and Load Balancing Problem and show that the best evolved heuristics can compete with human-written heuristics and representations designed for each problem domain. Further, we propose a conceptual framework for the production and combination of graph structures. We show how these concepts can be used to describe and provide a representation for problems in combinatorics and the inner mechanics of hyper-heuristic systems. The final contribution is a new benchmark that can generate problem instances for multiple problem domains and can be used for the assessment of multi-domain problem solvers.

    A Tabu Search Approach to Optimal Structuring Element Extraction for MST-Based Shapes Description

    In this paper, we propose a novel method for extracting the optimal structure element for MST-based shape description. Specifically, we use tabu search to solve the optimal structure element extraction problem for MST-based shape description. To the best of our knowledge, there is very little work on how to explore tabu search in computer vision. Our tabu search (TS) has a number of advantages: (1) TS avoids entrapment in local minima and continues the search to give a near-optimal final solution; (2) TS is very general and conceptually much simpler than either SA or GA; (3) TS is very easy to implement and the entire procedure occupies only a few lines of code; (4) TS is a flexible framework of a variety of strategies originating from artificial intelligence and is therefore open to further improvement. Keywords: Shape description, Mathematical morphology, Shape matching, Optimal structure element, Tabu search, Model-based vision. This work was partially supported by the Chinese National Sc..
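
The claim that tabu search fits in a few lines and can escape local minima can be illustrated with a generic sketch. It is not the paper's procedure: the structure-element objective is not given in the abstract, so a toy one-dimensional cost landscape stands in for it, and the names `tabu_search`, `tabu_size`, `COSTS`, `neighbors` and `cost` are hypothetical.

```python
from collections import deque


def tabu_search(start, neighbors, cost, iterations=100, tabu_size=5):
    """Generic tabu search: always move to the best non-tabu neighbour."""
    current = best = start
    tabu = deque([start], maxlen=tabu_size)  # short-term memory of visited states
    for _ in range(iterations):
        candidates = [n for n in neighbors(current) if n not in tabu]
        if not candidates:
            break
        # The chosen move may be worse than the current state; accepting it
        # is what lets the search climb out of a local minimum.
        current = min(candidates, key=cost)
        tabu.append(current)
        if cost(current) < cost(best):
            best = current
    return best


# Toy landscape: index 1 is a local minimum, index 6 the global minimum.
COSTS = [5, 3, 4, 6, 7, 2, 0, 1, 8, 9, 9]


def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(COSTS)]


def cost(i):
    return COSTS[i]


print(tabu_search(0, neighbors, cost, iterations=20))  # 6
```

A plain descent from index 0 would stop at index 1; the tabu list forbids stepping back, so the search continues through the worse states 2, 3 and 4 and reaches the global minimum at index 6.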

    Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference
