
    The Vulnerability Cube: A Multi-Dimensional Framework for Assessing Relative Vulnerability

    The diversity and abundance of information available for vulnerability assessments can present a challenge to decision-makers. Here we propose a framework to aggregate and present socioeconomic and environmental data in a visual vulnerability assessment that will help prioritize management options for communities vulnerable to environmental change. Socioeconomic and environmental data are aggregated into distinct categorical indices across three dimensions and arranged in a cube, so that individual communities can be plotted in a three-dimensional space to assess the type and relative magnitude of their vulnerabilities based on their position in the cube. We present an example assessment using a subset of the USEPA National Estuary Program (NEP) estuaries: coastal communities vulnerable to the effects of environmental change on ecosystem health and water quality. Using three categorical indices created from a pool of publicly available data (socioeconomic index, land use index, estuary condition index), the estuaries were ranked based on their normalized averaged scores and then plotted along the three axes to form a vulnerability cube. The position of each community within the three-dimensional space communicates the types of vulnerability endemic to each estuary and allows estuaries with like vulnerabilities to be clustered into typologies. The typologies highlight vulnerability descriptions that may be helpful in creating targeted management strategies. The data used to create the categorical indices are flexible depending on the goals of the decision-makers, as different data should be chosen based on availability or importance to the system. Therefore, the analysis can be tailored to specific types of communities, allowing a data-rich process to inform decision-making.
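
    A minimal sketch of the cube construction described above, assuming each community already has its indicator data averaged into three raw index scores (socioeconomic, land use, estuary condition). The estuary names and scores below are hypothetical, and the octant-based typology is one simple way to cluster like vulnerabilities, not the paper's exact method.

        def normalize(values):
            """Min-max normalize raw index scores to [0, 1]."""
            lo, hi = min(values), max(values)
            return [(v - lo) / (hi - lo) if hi > lo else 0.5 for v in values]

        def cube_positions(communities):
            """communities: name -> (socio, land_use, condition) averaged raw scores.
            Returns name -> (x, y, z) coordinates in the vulnerability cube."""
            names = list(communities)
            axes = zip(*(communities[n] for n in names))        # regroup scores by axis
            x, y, z = (normalize(list(axis)) for axis in axes)  # normalize each axis
            return {n: (x[i], y[i], z[i]) for i, n in enumerate(names)}

        def typology(position, cut=0.5):
            """Assign a community to one of the eight cube octants."""
            return tuple('high' if c >= cut else 'low' for c in position)

        estuaries = {                       # hypothetical averaged index scores
            "Estuary A": (0.7, 0.2, 0.9),
            "Estuary B": (0.3, 0.8, 0.4),
            "Estuary C": (0.5, 0.5, 0.1),
        }
        for name, pos in cube_positions(estuaries).items():
            print(name, pos, typology(pos))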

    Association rule mining activity: A general formalism

    Mining association rules from a large collection of databases is based on two main tasks. One is the generation of large itemsets; the other is finding associations between the discovered large itemsets. Existing formalisms for association rules are based on a single transaction database, which is not sufficient to describe association rules in a multiple-database environment. In this paper, we give a general characterization of association rules and a framework for knowledge-based mining of multiple databases for association rules.
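
    The two tasks compose in the classic way: once the large itemsets and their supports are known, rules are enumerated from each itemset's proper subsets. A minimal single-database sketch of that second task (the supports below are hypothetical counts):

        from itertools import combinations

        def rules_from_itemsets(supports, min_conf):
            """supports: frozenset(itemset) -> support count for each large itemset.
            Yields (antecedent, consequent, confidence) for rules meeting min_conf."""
            for itemset, supp in supports.items():
                if len(itemset) < 2:
                    continue
                for r in range(1, len(itemset)):
                    for ante in map(frozenset, combinations(sorted(itemset), r)):
                        if ante not in supports:
                            continue
                        conf = supp / supports[ante]   # conf(A -> B) = supp(A u B) / supp(A)
                        if conf >= min_conf:
                            yield set(ante), set(itemset - ante), conf

        supports = {frozenset('a'): 6, frozenset('b'): 5, frozenset('ab'): 4}
        for ante, cons, conf in rules_from_itemsets(supports, min_conf=0.7):
            print(ante, '->', cons, round(conf, 2))    # {'b'} -> {'a'} 0.8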

    Knowledge-based association rule mining using AND–OR taxonomies

    We introduce a knowledge-based approach to mining generalized association rules that is sound and interactive. The proposed mining is sound because our scheme uses knowledge to mine only those concepts that are of interest to the user. It is interactive because we provide a user-controllable parameter with whose help the user can mine interactively. For this, we use a taxonomy based on functionality and a restricted way of generalizing the items. We call such a taxonomy an AND–OR taxonomy and the corresponding generalization AND–OR generalization. We claim that this type of generalization is more meaningful since it is based on a semantic grouping of concepts. We use this knowledge to naturally exploit the mining of interesting negative association rules. We define the interestingness of association rules based on the level of the concepts in the taxonomy. We give an efficient algorithm based on the AND–OR taxonomy which not only derives generalized association rules but also accesses the database only once.
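
    A simplified, hypothetical illustration of the restricted, user-guided generalization idea: items are lifted to an ancestor concept only when the user has marked that concept as interesting, so the mining is confined to concepts the user cares about. The taxonomy here is a flat child-to-parent map; the paper's AND–OR node structure is not modeled.

        taxonomy = {                     # child -> parent concept (hypothetical)
            'cola': 'soft-drink', 'lemonade': 'soft-drink',
            'chips': 'snack', 'peanuts': 'snack',
        }
        interesting = {'soft-drink'}     # user-controlled concepts to generalize to

        def generalize(transaction):
            """Replace an item by its parent concept only if that concept is interesting."""
            return {taxonomy[item] if taxonomy.get(item) in interesting else item
                    for item in transaction}

        print(generalize({'cola', 'chips'}))   # {'soft-drink', 'chips'}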

    Tree structure for efficient data mining using rough sets

    In data mining, an important goal is to generate an abstraction of the data. Such an abstraction helps in reducing the space and search-time requirements of the overall decision-making process. Further, it is important that the abstraction be generated from the data with a small number of disk scans. We propose a novel data structure, the pattern count tree (PC-tree), that can be built by scanning the database only once. The PC-tree is a minimal-size complete representation of the data, and it can be used to represent dynamic databases with the help of knowledge that is either static or changing. We show that further compactness can be achieved by constructing the PC-tree on segmented patterns. We exploit the flexibility offered by rough sets to realize a rough PC-tree and use it for efficient and effective rough classification. To be consistent with the sizes of the branches of the PC-tree, we use upper and lower approximations of feature sets in a manner different from conventional rough set theory. We conducted experiments using the proposed classification scheme on a large-scale handwritten digit data set and use the results to establish the efficacy of the proposed approach.
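
    A minimal sketch of a pattern count tree built in a single scan, as one reading of the abstract: transactions are inserted as sorted item sequences so shared prefixes share nodes, and every node carries the count of transactions passing through it. The segmentation and rough-set machinery are not modeled here.

        class PCNode:
            def __init__(self):
                self.count = 0
                self.children = {}            # item -> PCNode

        def build_pc_tree(transactions):
            root = PCNode()
            for txn in transactions:          # the single scan of the database
                node = root
                for item in sorted(txn):      # canonical order maximizes prefix sharing
                    node = node.children.setdefault(item, PCNode())
                    node.count += 1
            return root

        tree = build_pc_tree([{'a', 'b'}, {'a', 'b', 'c'}, {'a', 'c'}])
        print(tree.children['a'].count)                   # 3
        print(tree.children['a'].children['b'].count)     # 2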

    Scalable, Distributed and Dynamic Mining of Association Rules

    We propose a novel pattern tree called the Pattern Count tree (PC-tree), which is a complete and compact representation of the database. We show that constructing this tree and then generating all large itemsets requires a single database scan, whereas current algorithms need at least two database scans. The completeness property of the PC-tree with respect to the database makes it amenable to mining association rules in the context of changing data and knowledge, which we call dynamic mining. Algorithms based on the PC-tree are scalable because the PC-tree is compact. We propose a partitioned distributed architecture and an efficient distributed association rule mining algorithm based on the PC-tree structure.
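
    One way the partitioned architecture can work, sketched under the assumption that global support is the sum of local supports: each site builds a PC-tree over its own partition (reusing PCNode and build_pc_tree from the sketch above), and a coordinator sums per-tree counts for each candidate itemset.

        def support(tree, itemset):
            """Support of itemset in one PC-tree: sum the count at every node for the
            itemset's last item whose root path covers all earlier items (sorted order)."""
            def walk(node, remaining):
                if not remaining:
                    return node.count
                total = 0
                for item, child in node.children.items():
                    if item == remaining[0]:
                        total += walk(child, remaining[1:])
                    elif item < remaining[0]:  # deeper nodes may still match
                        total += walk(child, remaining)
                return total
            return walk(tree, sorted(itemset))

        partitions = [[{'a', 'b'}, {'a', 'c'}], [{'a', 'b', 'c'}, {'b', 'c'}]]
        local_trees = [build_pc_tree(p) for p in partitions]      # one tree per site
        print(sum(support(t, {'a', 'c'}) for t in local_trees))   # global support: 2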

    Efficient clustering of large data sets

    Clustering is an activity of finding abstractions from data, and these abstractions can be used for decision making [1]. In this paper, we select the cluster representatives as prototypes for efficient classification [3]. A variety of clustering algorithms has been reported in the literature. However, clustering algorithms that perform multiple scans of large databases (of size in terabytes) residing on disk demand prohibitive computational times. As a consequence, there is growing interest in designing clustering algorithms that scan the database only once. Algorithms like BIRCH [2], Leader [5], and the single-pass k-means algorithm [4] belong to this category.
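
    A minimal sketch of the single-scan Leader algorithm [5] mentioned above: each point joins the first leader within a distance threshold, otherwise it becomes a new leader, so the leaders serve as cluster representatives (prototypes). The points and threshold are illustrative.

        import math

        def leader_cluster(points, threshold):
            """One scan: assign each point to the first leader closer than threshold,
            or make the point a new leader (cluster representative)."""
            leaders = {}                              # leader point -> member points
            for p in points:
                for lead in leaders:
                    if math.dist(p, lead) <= threshold:
                        leaders[lead].append(p)
                        break
                else:                                 # no leader close enough
                    leaders[p] = [p]
            return leaders

        clusters = leader_cluster([(0, 0), (0.5, 0), (5, 5), (5.2, 4.9)], threshold=1.0)
        print({lead: len(members) for lead, members in clusters.items()})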