619 research outputs found

    Incremental Perspective for Feature Selection Based on Fuzzy Rough Sets

    A scalable approach to fuzzy rough nearest neighbour classification with ordered weighted averaging operators

    Fuzzy rough sets have been successfully applied in classification tasks, in particular in combination with OWA operators. There has been a lot of research into adapting algorithms for use with Big Data through parallelisation, but no concrete strategy exists for designing a Big Data classifier based on fuzzy rough sets. Existing Big Data approaches use fuzzy rough sets for feature and prototype selection, and have often not involved very large datasets. We fill this gap by presenting the first Big Data extension of an algorithm that uses fuzzy rough sets directly to classify test instances: a distributed implementation of FRNN-OWA in Apache Spark. Through a series of systematic tests involving generated datasets, we demonstrate that it can achieve a speedup effectively equal to the number of computing cores used, meaning that it can scale to arbitrarily large datasets.
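
As a rough illustration of what classifying a test instance directly with fuzzy rough sets and OWA operators can look like, the following single-machine Python sketch scores each class by the mean of an OWA-softened lower and upper approximation. It is not the paper's Spark implementation. The similarity measure (one minus the mean absolute difference on features assumed to be scaled to [0, 1]), the linearly decreasing weights, the default k, and all function names are assumptions made for this example.

```python
def similarity(x, y):
    """Assumed similarity: 1 minus the mean absolute difference (features in [0, 1])."""
    return 1.0 - sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def additive_weights(k):
    """Linearly decreasing OWA weights that sum to 1 (an assumed weighting scheme)."""
    total = k * (k + 1) / 2
    return [(k - i) / total for i in range(k)]


def owa_max(values, k):
    """Soft maximum: weighted mean of the k largest values, largest weighted most."""
    top = sorted(values, reverse=True)[:k]
    return sum(w * v for w, v in zip(additive_weights(len(top)), top))


def owa_min(values, k):
    """Soft minimum: weighted mean of the k smallest values, smallest weighted most."""
    low = sorted(values)[:k]
    return sum(w * v for w, v in zip(additive_weights(len(low)), low))


def frnn_owa_predict(train_X, train_y, query, k=3):
    """Return the class with the highest mean of lower and upper approximation."""
    scores = {}
    for label in sorted(set(train_y)):
        # Upper approximation: how similar the query is to members of the class.
        same = [similarity(x, query) for x, y in zip(train_X, train_y) if y == label]
        # Lower approximation: how dissimilar the query is to all other instances.
        other = [1.0 - similarity(x, query) for x, y in zip(train_X, train_y) if y != label]
        upper = owa_max(same, k)
        lower = owa_min(other, k) if other else 1.0
        scores[label] = (lower + upper) / 2
    return max(scores, key=scores.get)


train_X = [[0.1, 0.1], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
train_y = ["a", "a", "b", "b"]
print(frnn_owa_predict(train_X, train_y, [0.15, 0.2]))
```

Each query only needs the similarities to the training instances, which is the kind of work that can be split across workers; how the paper actually partitions it in Spark is not shown here.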

    An Intelligent Decision Support System for Business IT Security Strategy

    Cyber threat intelligence (CTI) is an emerging approach to improve cyber security of business IT environment. It has information of an affected business IT context. CTI sharing tools are available for subscribers, and CTI feeds are increasingly available. If another business IT context is similar to a CTI feed context, the threat described in the CTI feed might also take place in the business IT context. Businesses can take proactive defensive actions if relevant CTI is identified. However, a challenge is how to develop an effective connection strategy for CTI onto business IT contexts. Businesses are still insufficiently using CTI because not all of them have sufficient knowledge from domain experts. Moreover, business IT contexts vary over time. When the business IT contextual states have changed, the relevant CTI might be no longer appropriate and applicable. Another challenge is how a connection strategy has the ability to adapt to the business IT contextual changes. To fill the gap, in this Ph.D project, a dynamic connection strategy for CTI onto business IT contexts is proposed and the strategy is instantiated to be a dynamic connection rule assembly system. The system can identify relevant CTI for a business IT context and can modify its internal configurations and structures to adapt to the business IT contextual changes. This thesis introduces the system development phases from design to delivery, and the contributions to knowledge are explained as follows. A hybrid representation of the dynamic connection strategy is proposed to generalise and interpret the problem domain and the system development. The representation uses selected computational intelligence models and software development models. In terms of the computational intelligence models, a CTI feed context and a business IT context are generalised to be the same type, i.e., context object. Grey number model is selected to represent the attribute values of context objects. 
Fuzzy sets are used to represent the context objects, and linguistic densities of the attribute values of context objects are reasoned. To assemble applicable connection knowledge, the system constructs a set of connection objects based on the context objects and uses rough set operations to extract applicable connection objects that contain the connection knowledge. Furthermore, to adapt to contextual changes, a rough set based incremental updating approach with multiple operations is developed to incrementally update the approximations. A set of propositions are proposed to describe how the system changes based on the previous states and internal structures of the system, and their complexities and efficiencies are analysed. In terms of the software development models, some unified modelling language (UML) models are selected to represent the system in design phase. Activity diagram is used to represent the business process of the system. Use case diagram is used to represent the human interactions with the system. Class diagram is used to represent the internal components and relationships between them. Using the representation, developers can develop a prototype of the system rapidly. Using the representation, an application of the system is developed using mainstream software development techniques. RESTful software architecture is used for the communication of the business IT contextual information and the analysis results using CTI between the server and the clients. A script based method is deployed in the clients to collect the contextual information. Observer pattern and a timer are used for the design and development of the monitor-trigger mechanism. In summary, the representation generalises real-world cases in the problem domain and interprets the system data. 
A specific business can initialise an instance of the representation to be a specific system based on its IT context and CTI feeds, and the knowledge assembled by the system can be used to identify relevant CTI feeds. From the relevant CTI data, the system locates and retrieves the useful information that can inform security decisions and then sends it to the client users. When the system needs to modify itself to adapt to the business IT contextual changes, the system can invoke the corresponding incremental updating functions and avoid a time-consuming re-computation. With this updating strategy, the application can provide its users in the client side with timely support and useful information that can inform security decisions using CTI.

    Active Sample Selection Based Incremental Algorithm for Attribute Reduction with Rough Sets

    Attribute reduction with rough sets is an effective technique for obtaining a compact and informative attribute set from a given dataset. However, traditional algorithms have no explicit provision for handling dynamic datasets where data present themselves in successive samples. Incremental algorithms for attribute reduction with rough sets have been recently introduced to handle dynamic datasets with large samples, though they have high complexity in time and space. To address the time/space complexity issue of the algorithms, this paper presents a novel incremental algorithm for attribute reduction with rough sets based on the adoption of an active sample selection process and an insight into the attribute reduction process. This algorithm first decides, through the active sample selection process, whether each incoming sample is useful with respect to the current dataset. A useless sample is discarded, while a useful sample is selected to update a reduct. On the arrival of a useful sample, the attribute reduction process then guides how to add and/or delete attributes in the current reduct. The two processes thus constitute the theoretical framework of our algorithm. Experiments show that the proposed algorithm is efficient in time and space.
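
The select-then-update loop described in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the usefulness test used here (a sample is useful only if, once added, the current reduct no longer yields the same positive region as the full attribute set) and the simple add-then-delete update are hypothetical stand-ins, as are all function names and the toy data.

```python
from collections import defaultdict


def positive_region(rows, attrs):
    """Indices of rows whose equivalence class under attrs has a single decision."""
    classes = defaultdict(list)
    for i, (values, _decision) in enumerate(rows):
        classes[tuple(values[a] for a in attrs)].append(i)
    pos = set()
    for members in classes.values():
        if len({rows[i][1] for i in members}) == 1:
            pos.update(members)
    return pos


def is_useful(rows, reduct, all_attrs, sample):
    """Hypothetical test: useful if the reduct stops preserving the positive region."""
    candidate = rows + [sample]
    return positive_region(candidate, reduct) != positive_region(candidate, all_attrs)


def process_sample(rows, reduct, all_attrs, sample):
    """Discard a useless sample; otherwise store it and repair the reduct."""
    if not is_useful(rows, reduct, all_attrs, sample):
        return rows, list(reduct)
    rows = rows + [sample]
    target = positive_region(rows, all_attrs)
    reduct = list(reduct)
    # Add attributes until the positive region of the full attribute set is restored.
    for a in all_attrs:
        if positive_region(rows, reduct) == target:
            break
        if a not in reduct:
            reduct.append(a)
    # Delete attributes that have become redundant.
    for a in list(reduct):
        trial = [b for b in reduct if b != a]
        if positive_region(rows, trial) == target:
            reduct = trial
    return rows, reduct


# Toy decision table: (condition attribute values, decision).
rows = [((0, 0, 0), "n"), ((1, 0, 1), "y"), ((0, 1, 0), "n"), ((1, 1, 1), "y")]
all_attrs = [0, 1, 2]

rows_a, reduct_a = process_sample(rows, [0], all_attrs, ((0, 0, 0), "n"))
rows_b, reduct_b = process_sample(rows, [0], all_attrs, ((0, 1, 1), "y"))
print(len(rows_a), reduct_a)
print(len(rows_b), reduct_b)
```

In the first call the incoming sample agrees with what the reduct already captures, so it is discarded. In the second it conflicts with existing samples on attribute 0, so it is kept and the reduct is rebuilt around an attribute that separates the decisions.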

    EEG-Based Biometric Authentication Modelling Using Incremental Fuzzy-Rough Nearest Neighbour Technique

    This paper proposes an Incremental Fuzzy-Rough Nearest Neighbour (IncFRNN) technique for biometric authentication modelling using features extracted from visual evoked potentials. Only a small training set is needed for model initialisation. The embedded heuristic update method adjusts the knowledge granules incrementally to maintain all representative electroencephalogram (EEG) signal patterns and eliminate those rarely used. It reshapes the personalized knowledge granules through insertion and deletion of a test object, based on similarity measures. A predefined window size can be used to reduce the overall processing time. The proposed algorithm was verified with test data from 37 healthy subjects. Signal pre-processing steps (segmentation, filtering and artefact rejection) were carried out to improve the data quality before model building. The experimental paradigm was designed with three different conditions to evaluate the authentication performance of the IncFRNN technique against the benchmark incremental K-Nearest Neighbour (KNN) technique. The performance was measured in terms of accuracy, area under the Receiver Operating Characteristic (ROC) curve (AUC) and Cohen's Kappa coefficient. The proposed IncFRNN technique is shown to be statistically better than the KNN technique in the controlled window size environment. Future work will focus on the use of dynamic data features to improve the robustness of the proposed model.

    Selecting Informative Features with Fuzzy-Rough Sets and its Application for Complex Systems Monitoring

    One of the main obstacles facing current intelligent pattern recognition applications is that of dataset dimensionality. To enable these systems to be effective, a redundancy-removing step is usually carried out beforehand. Rough Set Theory (RST) has been used as such a dataset pre-processor with much success; however, it is reliant upon a crisp dataset, and important information may be lost as a result of quantization of the underlying numerical features. This paper proposes a feature selection technique that employs a hybrid variant of rough sets, fuzzy-rough sets, to avoid this information loss. The current work retains dataset semantics, allowing for the creation of clear, readable fuzzy models. Experimental results from applying the present work to complex systems monitoring show that fuzzy-rough selection is more powerful than conventional entropy-based, PCA-based and random-based methods. Key words: feature selection; feature dependency; fuzzy-rough sets; reduct search; rule induction; systems monitoring.