    Identification of Biodiversity and Other Forest Attributes for Sustainable Forest Management: Siberian Forest Case Study

    This paper attempts to identify characteristics of biodiversity and other (forest) ecosystem conditions that are considered essential for describing ecosystem functioning and developing sustainable forest management practices in the Siberian forests. This is accomplished through an analysis of net primary production of phytomass (NPP), which acts as a proxy for ecosystem functioning. Rough Sets (RS) analysis is applied to study the Siberian ecoregions, classified into compact and cohesive NPP performance classes. Through a heuristic procedure, a reduced set of attributes is generated for the NPP classification problem. In order to interpret relationships between various forest characteristics, so-called "interesting rules" are generated on the basis of the reduced problem description. These "interesting rules" provide the means to draw conclusions, in the form of knowledge statements, about the functioning of the Siberian forests.
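    The attribute reduction at the core of this approach is compact enough to sketch. The snippet below is a minimal illustration, not the authors' implementation: it partitions a toy decision table into indiscernibility classes and greedily drops attributes whose removal keeps every class consistent with the NPP decision, which is the essence of a rough-set reduct. The ecoregion table, attribute names and values are all hypothetical.

```python
from itertools import groupby

def partition(rows, attrs):
    """Group row indices into indiscernibility classes over the given attributes."""
    key = lambda i: tuple(rows[i][a] for a in attrs)
    idx = sorted(range(len(rows)), key=key)
    return [frozenset(g) for _, g in groupby(idx, key=key)]

def consistent(rows, attrs, decision):
    """True if every indiscernibility class maps to a single decision value."""
    return all(len({rows[i][decision] for i in block}) == 1
               for block in partition(rows, attrs))

def greedy_reduct(rows, attrs, decision):
    """Drop attributes one by one while consistency with the decision is preserved."""
    reduct = list(attrs)
    for a in attrs:
        trial = [x for x in reduct if x != a]
        if trial and consistent(rows, trial, decision):
            reduct = trial
    return reduct

# Hypothetical ecoregion table: condition attributes and an NPP performance class.
rows = [
    {"climate": "cold", "soil": "podzol", "cover": "larch", "npp": "low"},
    {"climate": "cold", "soil": "podzol", "cover": "pine",  "npp": "low"},
    {"climate": "mild", "soil": "peat",   "cover": "pine",  "npp": "high"},
    {"climate": "mild", "soil": "podzol", "cover": "larch", "npp": "high"},
]
print(greedy_reduct(rows, ["climate", "soil", "cover"], "npp"))  # -> ['climate']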

    Reducing the Memory Size of a Fuzzy Case-Based Reasoning System Applying Rough Set Techniques

    Early work on case-based reasoning (CBR) reported in the literature shows the importance of soft computing techniques applied to the different stages of the classical four-step CBR life cycle. This correspondence proposes a reduction technique based on rough sets theory capable of minimizing the case memory by analyzing the contribution of each case feature. Inspired by the application of the minimum description length principle, the method uses the granularity of the original data to compute the relevance of each attribute. The rough feature weighting and selection method is applied as a preprocessing step prior to the generation of a fuzzy rule system, which is employed in the revision phase of the proposed CBR system. Experiments using real oceanographic data show that the rough sets reduction method maintains the accuracy of the employed fuzzy rules, while reducing the computational effort needed to generate them and increasing their explanatory strength.
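    One common way to score attribute relevance with rough sets is the dependency degree: the fraction of cases whose indiscernibility class determines the decision, with a feature weighted by how much dependency is lost when it is dropped. The sketch below illustrates that idea only; it is not the paper's method, and the oceanographic cases and attribute names are hypothetical.

```python
from collections import defaultdict

def blocks(rows, attrs):
    """Indiscernibility classes: rows identical on `attrs` fall into one block."""
    out = defaultdict(list)
    for i, r in enumerate(rows):
        out[tuple(r[a] for a in attrs)].append(i)
    return out.values()

def dependency(rows, attrs, decision):
    """Fraction of rows whose block determines the decision (positive region)."""
    pos = sum(len(b) for b in blocks(rows, attrs)
              if len({rows[i][decision] for i in b}) == 1)
    return pos / len(rows)

def feature_weights(rows, attrs, decision):
    """Rough weight of a feature: dependency lost when the feature is dropped."""
    full = dependency(rows, attrs, decision)
    return {a: full - dependency(rows, [x for x in attrs if x != a], decision)
            for a in attrs}

# Hypothetical oceanographic cases with a discretized target.
cases = [
    {"temp": "warm", "salinity": "high", "depth": "shallow", "bloom": "yes"},
    {"temp": "warm", "salinity": "low",  "depth": "shallow", "bloom": "yes"},
    {"temp": "cold", "salinity": "high", "depth": "deep",    "bloom": "no"},
    {"temp": "cold", "salinity": "low",  "depth": "shallow", "bloom": "no"},
]
print(feature_weights(cases, ["temp", "salinity", "depth"], "bloom"))
```

    Features whose weight comes out as zero contribute nothing to discernibility and are candidates for removal from the case memory.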

    Uncertainty Management of Intelligent Feature Selection in Wireless Sensor Networks

    Wireless sensor networks (WSN) are envisioned to revolutionize the paradigm of monitoring complex real-world systems at a very high resolution. However, the deployment of a large number of unattended sensor nodes in hostile environments, frequent changes of environment dynamics, and severe resource constraints pose uncertainties and limit the potential use of WSN in complex real-world applications. Although uncertainty management in Artificial Intelligence (AI) is well developed and well investigated, its implications in wireless sensor environments are inadequately addressed. This dissertation addresses uncertainty management issues of spatio-temporal patterns generated from sensor data. It provides a framework for characterizing spatio-temporal patterns in WSN. Using rough set theory and temporal reasoning, a novel formalism has been developed to characterize and quantify the uncertainties in predicting spatio-temporal patterns from sensor data. This research also uncovers the trade-off among the uncertainty measures, which can be used to develop a multi-objective optimization model for real-time decision making in sensor data aggregation and sampling.
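    One standard rough-set uncertainty measure of the kind such a formalism can build on is the accuracy of approximation: the ratio of the lower to the upper approximation of a target set of sensor readings, with the boundary region containing the objects whose membership is undecidable. The snippet below is a generic sketch of that measure, not the dissertation's formalism; the partition and pattern are hypothetical.

```python
def approximations(blocks, target):
    """Rough lower/upper approximations of `target` w.r.t. a partition `blocks`."""
    lower, upper = set(), set()
    for b in blocks:
        if b <= target:       # block entirely inside the pattern
            lower |= b
        if b & target:        # block overlapping the pattern
            upper |= b
    return lower, upper

# Hypothetical partition of sensor nodes by an indiscernibility relation
# (e.g. same region and observation hour).
blocks = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})]
pattern = {1, 2, 4}           # nodes exhibiting the spatio-temporal pattern

lower, upper = approximations(blocks, pattern)
accuracy = len(lower) / len(upper)   # in [0, 1]; 1 means no uncertainty
boundary = upper - lower             # membership undecidable from the data
print(lower, upper, round(accuracy, 2), boundary)
```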

    Dominant Attribute and Multiple Scanning Approaches for Discretization of Numerical Attributes

    Rapid development of high-throughput technologies and database management systems has made it possible to produce and store large amounts of data. However, making sense of big data and discovering knowledge from it remains a compounding challenge. Generally, data mining techniques search for information in datasets and express the gained knowledge in the form of trends, regularities, patterns or rules. Rules are frequently identified automatically by a technique called rule induction, one of the most important techniques in data mining and machine learning, which was developed primarily to handle symbolic data. However, real-life data often contain numerical attributes, so in order to fully utilize the power of rule induction techniques, an essential preprocessing step called discretization, which converts numeric data into symbolic data, is employed in data mining. Here we present two entropy-based discretization techniques, known as the dominant attribute approach and the multiple scanning approach. Both approaches were implemented as explicit algorithms in the Java programming language, and experiments were conducted by applying each algorithm separately to seventeen well-known numerical data sets. The resulting discretized data sets were used for rule induction by the LEM2 (Learning from Examples Module 2) algorithm. For the multiple scanning approach, experiments on each dataset were repeated with incremental scans until the interval counts stabilized. Preliminary results from this study indicate that the multiple scanning approach performs better than the dominant attribute approach in terms of producing comparatively smaller and simpler rule sets.
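    The shared core of both approaches is entropy-based cut-point selection: candidate cuts are midpoints between consecutive distinct sorted values, and the cut minimizing the size-weighted entropy of the two resulting sides is chosen. The sketch below shows that single step only, not either full algorithm; the numeric attribute and labels are hypothetical.

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label multiset."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    """Return (cut, weighted_entropy) for the best single cut on one attribute."""
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [l for _, l in pairs]
    best = None
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue                      # no cut between equal values
        cut = (xs[i] + xs[i - 1]) / 2
        h = (i * entropy(ys[:i]) + (len(ys) - i) * entropy(ys[i:])) / len(ys)
        if best is None or h < best[1]:
            best = (cut, h)
    return best

# Hypothetical numeric attribute with symbolic decisions.
values = [1.2, 1.5, 2.9, 3.1, 3.4, 5.0]
labels = ["a", "a", "a", "b", "b", "b"]
print(best_cut(values, labels))  # cut at 3.0 separates the classes cleanly
```

    The two approaches differ in how this step is iterated: per attribute recursively in the dominant attribute approach, versus repeated whole-table scans in the multiple scanning approach.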

    Internet-based solutions to support distributed manufacturing

    With globalisation and constant changes in the marketplace, enterprises are adapting themselves to face new challenges. Strategic corporate alliances that share knowledge, expertise and resources therefore represent an advantage in an increasingly competitive world. This has led to the integration of companies, customers, suppliers and partners using networked environments. This thesis presents three novel solutions in the tooling area, developed for Seco Tools Ltd, UK. These approaches implement a proposed distributed computing architecture that uses Internet technologies to assist geographically dispersed tooling engineers in process planning tasks. The systems are summarised as follows. TTS is a Web-based system to support engineers and technical staff in the task of providing technical advice to clients. Seco sales engineers access the system from remote machining sites and submit/retrieve/update the required tooling data located in databases at the company headquarters. The communication platform used for this system provides an effective mechanism to share information nationwide. The system implements efficient methods, such as data relaxation techniques, confidence scores and importance levels of attributes, to help the user find the closest solutions when specific requirements are not fully matched in the database. Cluster-F has been developed to assist engineers and clients in the assessment of cutting parameters for the tooling process. In this approach the Internet acts as a vehicle to transport the data between users and the database. Cluster-F is a knowledge discovery (KD) approach that makes use of clustering and fuzzy set techniques. The novel proposal in this system is the implementation of fuzzy set concepts to obtain the proximity matrix that guides the classification of the data; hierarchical clustering methods are then applied to link the closest objects. A general KD methodology applying rough set concepts is also proposed in this research, covering data redundancy, identification of relevant attributes, detection of data inconsistency, and generation of knowledge rules. R-sets, the third proposed solution, has been developed using this KD methodology. This system evaluates the variables of the tooling database to analyse known and unknown relationships in the data generated after the execution of technical trials. The aim is to discover cause-effect patterns from selected attributes contained in the database. A fourth system, DBManager, was also developed to administer the system's user accounts, sales engineers' accounts and the monitoring of tool trial data. It supports the implementation of the proposed distributed architecture and the maintenance of user accounts and access restrictions for the systems running under this architecture.
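    Of the KD methodology's steps, inconsistency detection is the most mechanical to illustrate: two trial records that agree on every condition attribute but report different outcomes form a rough-set inconsistency. The sketch below shows that check in isolation; it is not the thesis's R-sets implementation, and the tool-trial records and attribute names are hypothetical.

```python
from collections import defaultdict

def inconsistent_groups(rows, attrs, outcome):
    """Return groups of records that agree on all condition attributes
    but disagree on the outcome -- the rough-set notion of inconsistency."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[a] for a in attrs)].append(r)
    return [g for g in groups.values()
            if len({r[outcome] for r in g}) > 1]

# Hypothetical tool-trial records: cutting conditions vs. observed tool life.
trials = [
    {"speed": "high", "feed": "0.2", "coating": "TiN",   "life": "short"},
    {"speed": "high", "feed": "0.2", "coating": "TiN",   "life": "long"},
    {"speed": "low",  "feed": "0.1", "coating": "TiAlN", "life": "long"},
]
for group in inconsistent_groups(trials, ["speed", "feed", "coating"], "life"):
    print("conflicting trials:", group)
```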

    Combining rough and fuzzy sets for feature selection


    Hypergraph Partitioning in the Cloud

    The thesis investigates the partitioning and load balancing problem, which has many applications in High Performance Computing (HPC). The application to be partitioned is described with a graph or hypergraph. The latter is of greater interest as hypergraphs, compared to graphs, have a more general structure and can be used to model more complex relationships between groups of objects, such as non-symmetric dependencies. Optimal graph and hypergraph partitioning is known to be NP-Hard, but good polynomial-time heuristic algorithms have been proposed. In this thesis, we propose two multi-level hypergraph partitioning algorithms. The algorithms are based on rough set clustering techniques. The first algorithm, which is a serial algorithm, obtains high quality partitionings and improves the partitioning cut by up to 71% compared to the state-of-the-art serial hypergraph partitioning algorithms. Furthermore, the capacity of serial algorithms is limited due to the rapid growth of problem sizes of distributed applications. Consequently, we also propose a parallel hypergraph partitioning algorithm. Considering the generality of the hypergraph model, designing a parallel algorithm is difficult, and the available parallel hypergraph algorithms offer less scalability compared to their graph counterparts. The issue is twofold: the parallel algorithm and the complexity of the hypergraph structure. Our parallel algorithm provides a trade-off between global and local vertex clustering decisions. By employing novel techniques and approaches, our algorithm achieves better scalability than the state-of-the-art parallel hypergraph partitioner in the Zoltan tool on a set of benchmarks, especially ones with irregular structure. Furthermore, recent advances in cloud computing and the services it provides have led to a trend of moving HPC and large scale distributed applications into the cloud. Despite its advantages, some aspects of the cloud, such as limited network resources, present a challenge to running communication-intensive applications and make them non-scalable in the cloud. While hypergraph partitioning is proposed as a solution for decreasing the communication overhead within parallel distributed applications, it can also offer advantages for running these applications in the cloud. The partitioning is usually done as a pre-processing step before running the parallel application. As parallel hypergraph partitioning itself is a communication-intensive operation, running it in the cloud is hard and suffers from poor scalability. The thesis also investigates the scalability of parallel hypergraph partitioning algorithms in the cloud, the challenges they present, and proposes solutions to improve the cost/performance ratio for running the partitioning problem in the cloud. Our algorithms are implemented as a new hypergraph partitioning package within Zoltan, an open-source, Linux-based toolkit for parallel partitioning, load balancing and data management developed at Sandia National Labs. The algorithms are known as the FEHG and PFEHG algorithms.
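    A reported improvement in "partitioning cut" is typically measured with the connectivity-1 (lambda-1) metric, in which each hyperedge costs one less than the number of parts it spans. The snippet below sketches that standard metric only, not the FEHG algorithm itself; the hypergraph, its edges and the partition assignment are hypothetical.

```python
def cut_lambda_minus_one(hyperedges, part):
    """Connectivity-1 metric: each hyperedge costs (#parts it spans) - 1."""
    cost = 0
    for edge in hyperedges:
        spanned = {part[v] for v in edge}   # distinct parts touched by the edge
        cost += len(spanned) - 1
    return cost

# Hypothetical hypergraph on vertices 0..5 with a 2-way partition.
hyperedges = [{0, 1, 2}, {2, 3}, {3, 4, 5}]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(cut_lambda_minus_one(hyperedges, part))  # only {2, 3} spans both parts -> 1
```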

    Acta Cybernetica: Volume 24, Number 1.
