
    Incremental Perspective for Feature Selection Based on Fuzzy Rough Sets


    Active Sample Selection Based Incremental Algorithm for Attribute Reduction with Rough Sets

    Attribute reduction with rough sets is an effective technique for obtaining a compact and informative attribute set from a given dataset. However, traditional algorithms make no explicit provision for dynamic datasets in which data arrive as successive samples. Incremental algorithms for attribute reduction with rough sets have recently been introduced to handle dynamic datasets with large numbers of samples, but they have high time and space complexity. To address this issue, this paper presents a novel incremental algorithm for attribute reduction with rough sets, based on an active sample selection process and an insight into the attribute reduction process. The algorithm first uses the active sample selection process to decide whether each incoming sample is useful with respect to the current dataset: a useless sample is discarded, while a useful sample is selected to update the reduct. When a useful sample arrives, the attribute reduction process then guides which attributes to add to and/or delete from the current reduct. These two processes constitute the theoretical framework of the algorithm. Experiments show that the proposed algorithm is efficient in both time and space.
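    The core of the approach can be illustrated with a short sketch. The code below is a minimal, simplified reading of the two processes, assuming a decision-table representation; the usefulness criterion and the reduct-repair rule shown here are illustrative stand-ins for the paper's actual constructions.

```python
# A minimal sketch of active sample selection for incremental attribute
# reduction. The usefulness criterion and the repair rule are illustrative
# assumptions, not the paper's exact constructions.

def project(sample, attrs):
    # Restrict a sample (an attribute -> value dict) to a subset of attributes.
    return tuple(sample[a] for a in attrs)

def is_useful(sample, decision, reduct, table):
    # A sample is useful if its reduct-projection is unseen, or if it
    # collides with a stored sample carrying a different decision
    # (the current reduct can then no longer discern the two).
    key = project(sample, reduct)
    matches = [d for s, d in table if project(s, reduct) == key]
    return not matches or any(d != decision for d in matches)

def consistent(attrs, table):
    # True if no two stored samples agree on attrs but differ in decision.
    seen = {}
    for s, d in table:
        k = project(s, attrs)
        if seen.setdefault(k, d) != d:
            return False
    return True

def update_reduct(reduct, all_attrs, table):
    # Naive repair: grow the reduct until the table is consistent again,
    # then drop any attribute whose removal preserves consistency.
    for a in all_attrs:
        if consistent(reduct, table):
            break
        if a not in reduct:
            reduct.append(a)
    for a in list(reduct):
        trial = [x for x in reduct if x != a]
        if trial and consistent(trial, table):
            reduct.remove(a)
    return reduct

def process_stream(stream, reduct, all_attrs):
    table = []  # only the selected (sample, decision) pairs are stored
    for sample, decision in stream:
        if is_useful(sample, decision, reduct, table):
            table.append((sample, decision))   # keep the useful sample
            reduct = update_reduct(reduct, all_attrs, table)
        # a useless sample is simply discarded
    return reduct, table
```

    Under this reading, the reduct is only repaired when a sample actually breaks its discernibility, which is where the time and space savings over recomputing from scratch come from.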

    Hypergraph Partitioning in the Cloud

    The thesis investigates the partitioning and load balancing problem, which has many applications in High Performance Computing (HPC). The application to be partitioned is described by a graph or a hypergraph. The latter is of greater interest because hypergraphs have a more general structure than graphs and can model more complex relationships between groups of objects, such as non-symmetric dependencies. Optimal graph and hypergraph partitioning is known to be NP-hard, but good polynomial-time heuristic algorithms have been proposed. This thesis proposes two multi-level hypergraph partitioning algorithms based on rough set clustering techniques.

    The first algorithm is serial; it obtains high-quality partitionings and improves the partitioning cut by up to 71% compared to state-of-the-art serial hypergraph partitioning algorithms. Because the capacity of serial algorithms is limited by the rapid growth of the problem sizes of distributed applications, we also propose a parallel hypergraph partitioning algorithm. Given the generality of the hypergraph model, designing a parallel algorithm is difficult, and the available parallel hypergraph partitioners offer less scalability than their graph counterparts; the difficulty is twofold, lying in the parallel algorithm itself and in the complexity of the hypergraph structure. Our parallel algorithm provides a trade-off between global and local vertex clustering decisions. By employing novel techniques and approaches, it achieves better scalability than the state-of-the-art parallel hypergraph partitioner in the Zoltan tool on a set of benchmarks, especially those with irregular structure.

    Furthermore, recent advances in cloud computing and the services it provides have led to a trend of moving HPC and large-scale distributed applications into the cloud. Despite its advantages, some aspects of the cloud, such as limited network resources, present a challenge to running communication-intensive applications and make them non-scalable in the cloud. While hypergraph partitioning is proposed as a solution for decreasing the communication overhead within parallel distributed applications, it can also offer advantages for running these applications in the cloud. The partitioning is usually done as a pre-processing step before running the parallel application; as parallel hypergraph partitioning is itself a communication-intensive operation, running it in the cloud is hard and suffers from poor scalability. The thesis therefore also investigates the scalability of parallel hypergraph partitioning algorithms in the cloud and the challenges this presents, and proposes solutions to improve the cost/performance ratio of running the partitioning problem in the cloud.

    Our algorithms are implemented as a new hypergraph partitioning package within Zoltan, an open-source, Linux-based toolkit for parallel partitioning, load balancing, and data management developed at Sandia National Laboratories. The algorithms are known as FEHG (serial) and PFEHG (parallel).
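    To make the optimisation objective concrete, the sketch below computes the (connectivity − 1) cut metric that multi-level hypergraph partitioners, including those in Zoltan, commonly minimise; the data representation is an assumption for illustration and does not reflect the FEHG/PFEHG internals.

```python
# A minimal sketch of the (connectivity - 1) cut metric; the hypergraph
# representation is an illustrative assumption, not the thesis's data
# structures.

def connectivity_cut(hyperedges, part):
    # hyperedges: list of vertex lists; part: dict vertex -> block id.
    # Each hyperedge contributes (number of blocks it spans) - 1, so an
    # uncut edge costs 0 and an edge spanning k blocks costs k - 1.
    return sum(len({part[v] for v in e}) - 1 for e in hyperedges)

# Example: a 4-vertex hypergraph split into two blocks.
edges = [[0, 1, 2], [2, 3], [0, 3]]
part = {0: 0, 1: 0, 2: 0, 3: 1}
print(connectivity_cut(edges, part))   # -> 2 (edges [2,3] and [0,3] are cut)
```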

    An Intelligent Decision Support System for Business IT Security Strategy

    Cyber threat intelligence (CTI) is an emerging approach to improving the cyber security of business IT environments. A CTI feed carries information about an affected business IT context. CTI sharing tools are available to subscribers, and CTI feeds are increasingly available. If another business IT context is similar to a CTI feed's context, the threat described in the feed might also occur in that business IT context, so businesses can take proactive defensive actions if relevant CTI is identified. However, one challenge is how to develop an effective strategy for connecting CTI to business IT contexts: businesses still make insufficient use of CTI because not all of them have sufficient knowledge from domain experts. Moreover, business IT contexts vary over time; when the contextual state changes, previously relevant CTI may no longer be appropriate or applicable. A second challenge is therefore how a connection strategy can adapt to such contextual changes.

    To fill this gap, this Ph.D. project proposes a dynamic strategy for connecting CTI to business IT contexts and instantiates it as a dynamic connection rule assembly system. The system can identify relevant CTI for a business IT context and can modify its internal configurations and structures to adapt to business IT contextual changes. This thesis introduces the system development phases from design to delivery; the contributions to knowledge are as follows.

    A hybrid representation of the dynamic connection strategy is proposed to generalise and interpret the problem domain and the system development, using selected computational intelligence models and software development models. In the computational intelligence models, a CTI feed context and a business IT context are generalised to the same type, the context object. The grey number model is selected to represent the attribute values of context objects, fuzzy sets are used to represent the context objects, and linguistic densities of the attribute values of context objects are reasoned over. To assemble applicable connection knowledge, the system constructs a set of connection objects based on the context objects and uses rough set operations to extract the applicable connection objects that contain the connection knowledge. Furthermore, to adapt to contextual changes, a rough set based incremental updating approach with multiple operations is developed to incrementally update the approximations. A set of propositions describes how the system changes based on its previous states and internal structures, and their complexities and efficiencies are analysed.

    In the software development models, selected unified modelling language (UML) models represent the system in the design phase: an activity diagram represents the business process of the system, a use case diagram represents the human interactions with the system, and a class diagram represents the internal components and the relationships between them. Using this representation, developers can rapidly develop a prototype of the system.

    An application of the system is developed using mainstream software development techniques. A RESTful software architecture is used to communicate the business IT contextual information and the CTI analysis results between the server and the clients. A script-based method is deployed on the clients to collect the contextual information, and the observer pattern together with a timer is used to design and implement the monitor-trigger mechanism.

    In summary, the representation generalises real-world cases in the problem domain and interprets the system data. A specific business can initialise an instance of the representation as a specific system based on its IT context and CTI feeds, and the knowledge assembled by the system can be used to identify relevant CTI feeds. From the relevant CTI data, the system locates and retrieves useful information that can inform security decisions and sends it to the client users. When the system needs to modify itself to adapt to business IT contextual changes, it can invoke the corresponding incremental updating functions and avoid a time-consuming re-computation. With this updating strategy, the application provides its client-side users with timely support and useful information that can inform security decisions using CTI.
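    As an illustration of the rough set operations that the connection-knowledge assembly relies on, the sketch below computes lower and upper approximations over indiscernibility classes and shows why a single incoming object only requires revisiting its own class; the names and the data model are hypothetical, not the system's actual design.

```python
from collections import defaultdict

# Hypothetical data model: each object is an attribute -> value dict, and
# `target` is the set of object ids already known to be applicable.

def approximations(objects, attrs, target):
    # Partition the objects into indiscernibility classes, then collect the
    # classes fully inside the target (lower approximation: certainly
    # applicable) and the classes touching it (upper: possibly applicable).
    blocks = defaultdict(set)
    for oid, obj in objects.items():
        blocks[tuple(obj[a] for a in attrs)].add(oid)
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= target:
            lower |= block
        if block & target:
            upper |= block
    return blocks, lower, upper

def add_object(blocks, lower, upper, target, oid, obj, attrs, in_target):
    # Incremental update on arrival of one object: only the class the new
    # object falls into has to be re-examined, so a full re-computation of
    # the approximations is avoided.
    key = tuple(obj[a] for a in attrs)
    blocks[key].add(oid)
    if in_target:
        target.add(oid)
    block = blocks[key]
    if block <= target:
        lower |= block
    else:
        lower -= block       # the class lost its certainty, if it had any
    if block & target:
        upper |= block
```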

    A Rough Sets-based Agent Trust Management Framework


    A Matrix Algorithm for Updating the Approximation Sets of Variable Precision Multi-Granulation Rough Sets

    With the arrival of the information explosion era, the enormous size of datasets and the increasing complexity of their structure have become problems that cannot be ignored in approximation computation, and dynamic computation is an effective way to address them. This paper improves existing methods for dynamically updating the approximation sets of classical multi-granulation rough sets and proposes a vector-matrix method for computing and updating the approximation sets of variable precision multi-granulation rough sets. First, a static vector-matrix algorithm for computing the approximation sets of variable precision multi-granulation rough sets is proposed. Second, the search region used when updating the approximation sets is reconsidered and narrowed based on the properties of variable precision multi-granulation rough sets, which effectively improves the time efficiency of the updating algorithm. Third, based on the new search region and the static algorithm, a new vector-matrix algorithm for updating the approximation sets of variable precision multi-granulation rough sets is proposed. Finally, experiments verify the effectiveness of the proposed algorithms.
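    A minimal NumPy sketch of the vector-matrix idea follows: the inclusion degree of each object's class in the target set is obtained with one matrix-vector product, and a variable precision threshold β turns it into a lower approximation. The optimistic combination across granulations shown here is an assumption for illustration; the paper's narrowed search region for updating is not reproduced.

```python
# A minimal sketch of the vector-matrix computation of a variable precision
# lower approximation; the optimistic multi-granulation combination is an
# illustrative assumption, not the paper's exact algorithm.
import numpy as np

def vp_lower(R, x, beta):
    # R: n x n 0/1 relation matrix (R[i, j] = 1 iff objects i and j are
    # indiscernible under one granulation); x: 0/1 vector for the target
    # set X. The inclusion degree of each object's class in X falls out
    # of a single matrix-vector product.
    degree = (R @ x) / R.sum(axis=1)
    return degree >= beta              # boolean membership vector

def optimistic_mg_lower(relations, x, beta):
    # Optimistic multi-granulation: an object enters the lower
    # approximation if it qualifies under at least one granulation.
    votes = np.stack([vp_lower(R, x, beta) for R in relations])
    return votes.any(axis=0)

# Tiny example: two granulations over 4 objects, X = {0, 1}.
R1 = np.array([[1,1,0,0],[1,1,0,0],[0,0,1,1],[0,0,1,1]])
R2 = np.array([[1,0,0,0],[0,1,1,0],[0,1,1,0],[0,0,0,1]])
x  = np.array([1,1,0,0])
print(optimistic_mg_lower([R1, R2], x, beta=1.0))  # [ True  True False False]
```

    When a sample is added or removed, only the affected rows of R and the corresponding entries of the degree vector change, which is what makes an incremental matrix update cheaper than recomputation; narrowing the region of rows to revisit is the paper's contribution.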