
    A Distance Based Incremental Filter-Wrapper Algorithm for Finding Reduct in Incomplete Decision Tables

    The tolerance rough set model is an effective tool for attribute reduction in incomplete decision tables. In recent years, several incremental algorithms have been proposed to find reducts of dynamic incomplete decision tables in order to reduce computation time. However, they are classical filter algorithms, in which the classification accuracy of the decision table is computed only after the reduct has been obtained. As a result, the reducts obtained by these algorithms are not optimal in terms of reduct cardinality and classification accuracy. In this paper, we propose the incremental filter-wrapper algorithm IDS_IFW_AO to find one reduct of an incomplete decision table in the case of adding multiple objects. Experimental results on some sample datasets show that the proposed filter-wrapper algorithm IDS_IFW_AO is more effective than the filter algorithm IARM-I [17] in terms of classification accuracy and reduct cardinality
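    The filter-wrapper idea the abstract contrasts with pure filter methods can be sketched as follows. This is a generic illustration, not the paper's IDS_IFW_AO: the `significance` measure (a stand-in for a distance-based dependency measure from the tolerance rough set model), the k-NN classifier and the prefix-selection rule are all placeholder assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def filter_wrapper_reduct(X, y, significance):
    """Greedy filter phase followed by a wrapper pass over candidate prefixes.

    `significance(attr_subset) -> float` is assumed to be a distance-based
    dependency measure supplied by the caller (placeholder assumption).
    """
    reduct, remaining = [], list(range(X.shape[1]))
    # Filter phase: greedily add the attribute with the largest significance gain.
    while remaining:
        best = max(remaining, key=lambda a: significance(reduct + [a]))
        if significance(reduct + [best]) <= significance(reduct):
            break  # no remaining attribute improves the filter criterion
        reduct.append(best)
        remaining.remove(best)
    if not reduct:
        return reduct
    # Wrapper phase: score each candidate prefix by cross-validated accuracy
    # and keep the shortest prefix achieving the best accuracy.
    def accuracy(attrs):
        clf = KNeighborsClassifier(n_neighbors=3)
        return cross_val_score(clf, X[:, attrs], y, cv=5).mean()
    best_k = max(range(1, len(reduct) + 1),
                 key=lambda k: (accuracy(reduct[:k]), -k))
    return reduct[:best_k]
```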

    Internet-based solutions to support distributed manufacturing

    With globalisation and constant changes in the marketplace, enterprises are adapting themselves to face new challenges. Strategic corporate alliances to share knowledge, expertise and resources therefore represent an advantage in an increasingly competitive world. This has led to the integration of companies, customers, suppliers and partners using networked environments. This thesis presents three novel solutions in the tooling area, developed for Seco Tools Ltd, UK. These approaches implement a proposed distributed computing architecture that uses Internet technologies to assist geographically dispersed tooling engineers in process planning tasks. The systems are summarised as follows.

    TTS is a Web-based system to support engineers and technical staff in the task of providing technical advice to clients. Seco sales engineers access the system from remote machining sites and submit, retrieve and update the required tooling data located in databases at the company headquarters. The communication platform used for this system provides an effective mechanism to share information nationwide. The system implements efficient methods, such as data relaxation techniques, confidence scores and importance levels of attributes, to help the user find the closest solutions when specific requirements are not fully matched in the database.

    Cluster-F has been developed to assist engineers and clients in the assessment of cutting parameters for the tooling process. In this approach the Internet acts as a vehicle to transport data between users and the database. Cluster-F is a knowledge discovery (KD) approach that makes use of clustering and fuzzy set techniques. The novel proposal in this system is the use of fuzzy set concepts to obtain the proximity matrix that guides the classification of the data; hierarchical clustering methods are then applied to link the closest objects (see the sketch after this abstract).

    A general KD methodology applying rough set concepts is proposed in this research, covering data redundancy, identification of relevant attributes, detection of data inconsistency, and generation of knowledge rules. R-sets, the third proposed solution, has been developed using this KD methodology. The system evaluates the variables of the tooling database to analyse known and unknown relationships in the data generated after the execution of technical trials, with the aim of discovering cause-effect patterns from selected attributes contained in the database.

    A fourth system, DBManager, was also developed to administer the system's user accounts, the sales engineers' accounts and the tool trial data monitoring process. It supports the implementation of the proposed distributed architecture and the maintenance of user accounts for the access restrictions of the systems running under this architecture
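    The Cluster-F pipeline of deriving a proximity matrix from fuzzy memberships and then applying hierarchical clustering can be sketched as follows. This is illustrative only, not Seco's implementation: the Gaussian membership function, the `spread` parameter and the average-linkage choice are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def fuzzy_proximity(X, spread=1.0):
    """Pairwise similarity from a Gaussian fuzzy membership of the distance."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.exp(-(d / spread) ** 2)  # 1 = identical, -> 0 = far apart

def cluster_records(X, n_clusters=3):
    sim = fuzzy_proximity(X)
    dist = 1.0 - sim                 # turn the proximity matrix into distances
    np.fill_diagonal(dist, 0.0)
    # Average-linkage hierarchical clustering links the closest objects first.
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```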

    Advances in Data Mining Knowledge Discovery and Applications

    Advances in Data Mining Knowledge Discovery and Applications aims to help data miners, researchers, scholars, and PhD students who wish to apply data mining techniques. The primary contribution of this book is its coverage of frontier fields and implementations of knowledge discovery and data mining. Although some themes may seem to be repeated, the same approaches and techniques can serve us in different fields and areas of expertise. The book presents knowledge discovery and data mining applications in two sections. As is well known, data mining covers areas of statistics, machine learning, data management and databases, pattern recognition, artificial intelligence, and other fields; most of these areas are covered here through different data mining applications. The eighteen chapters are organised in two parts: Knowledge Discovery and Data Mining Applications

    LearnFCA: A Fuzzy FCA and Probability Based Approach for Learning and Classification

    Formal concept analysis (FCA) is a mathematical theory, based on lattice and order theory, used for data analysis and knowledge representation. Over the past several years, many of its extensions have been proposed and applied in several domains including data mining, machine learning, knowledge management, the semantic web, software development, chemistry, medicine, data analytics, biology and ontology engineering. This thesis reviews the state of the art of the theory of Formal Concept Analysis and its various extensions that have been developed and well studied in the past several years. We discuss their historical roots and reproduce the original definitions and derivations with illustrative examples. Further, we provide a literature review of its applications and the various approaches adopted by researchers in the areas of data analysis and knowledge management, with emphasis on data learning and classification problems. We propose LearnFCA, a novel approach based on FuzzyFCA and probability theory for learning and classification problems. LearnFCA uses an enhanced version of FuzzyLattice, developed to store class labels and probability vectors, and can be used to classify instances with encoded and unlabelled features. We evaluate LearnFCA on encodings from three datasets - MNIST, Omniglot and cancer images - with interesting results and varying degrees of success. Adviser: Dr Jitender Deogun
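    The core idea described above - lattice nodes carrying fuzzy intents plus class-probability vectors, with unlabelled instances assigned to the best-matching node - can be sketched as below. The Łukasiewicz-style matching rule and the `FuzzyNode` structure are assumptions for illustration, not the thesis' exact definitions.

```python
import numpy as np

class FuzzyNode:
    """A lattice node: a fuzzy intent plus a class-probability vector."""
    def __init__(self, intent, class_probs):
        self.intent = np.asarray(intent)            # attribute memberships in [0, 1]
        self.class_probs = np.asarray(class_probs)  # P(class | concept)

def membership(instance, node):
    # Degree to which the instance satisfies the node's intent, using the
    # Lukasiewicz implication min(1, 1 - a + b) aggregated by min (assumption).
    return np.minimum(1.0, 1.0 - node.intent + instance).min()

def classify(instance, lattice):
    # Label of the node the instance belongs to most strongly.
    best = max(lattice, key=lambda n: membership(instance, n))
    return int(best.class_probs.argmax())

# Toy usage: two concepts over three fuzzy attributes, one unlabelled instance.
lattice = [FuzzyNode([0.9, 0.1, 0.0], [0.8, 0.2]),
           FuzzyNode([0.1, 0.8, 0.7], [0.1, 0.9])]
print(classify(np.array([0.85, 0.2, 0.1]), lattice))  # -> 0
```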

    Front Matter - Soft Computing for Data Mining Applications

    Efficient tools and algorithms for knowledge discovery in large data sets have been devised in recent years. These methods exploit the capability of computers to search huge amounts of data in a fast and effective manner. However, the data to be analysed are imprecise and afflicted with uncertainty. In the case of heterogeneous data sources such as text, audio and video, the data may moreover be ambiguous and partly conflicting. Besides, patterns and relationships of interest are usually vague and approximate. Thus, in order to make the information mining process more robust, or, so to speak, human-like, methods for searching and learning require tolerance towards imprecision, uncertainty and exceptions. They must therefore have approximate reasoning capabilities and be capable of handling partial truth. Properties of this kind are typical of soft computing. Soft computing techniques like Genetic

    Advances in Knowledge Discovery and Data Mining, Part II

    19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II

    Enhancing Exploratory Analysis across Multiple Levels of Detail of Spatiotemporal Events

    Crimes, forest fires, accidents, infectious diseases and human interactions with mobile devices (e.g., tweets) are being logged as spatiotemporal events. For each event, its spatial location, time and related attributes are known at high levels of detail (LoDs). The LoD of analysis plays a crucial role in the user's perception of phenomena: from one LoD to another, some patterns can be perceived more easily, or different patterns may be detected, so phenomena must be modelled at different LoDs because there is no single, exclusive LoD at which to study them. Granular computing emerged as a paradigm of knowledge representation and processing in which granules are the basic ingredients of information. Granules can be arranged in a hierarchy-like structure, allowing the same phenomenon to be perceived at different LoDs. This PhD thesis introduces a formal Theory of Granularities (ToG) so that granules can be defined over any domain and reasoned over. This approach is more general than the related literature, whose formalisms appear as particular cases of the proposed ToG. Based on this theory we propose a granular computing approach to model spatiotemporal phenomena at multiple LoDs, which we call a granularities-based model. This approach stands out from the related literature because it models a phenomenon through statements rather than just using granules to model abstract real-world entities. Furthermore, it formalises the concept of LoD and follows an automated approach to generalise a phenomenon from one LoD to a coarser one.

    Present-day practices work at a single, user-driven LoD, despite the fact that identifying suitable LoDs is a key issue for users. This PhD thesis therefore presents a framework for SUmmarizIng spatioTemporal Events (SUITE) across multiple LoDs. The SUITE framework makes no assumptions about the phenomenon or the analytical task. A Visual Analytics approach implementing the SUITE framework is presented, which allows users to inspect a phenomenon across multiple LoDs simultaneously, thus helping them understand at which LoDs the phenomenon is perceived differently or at which LoDs patterns emerge
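    As a minimal illustration of generalising events from one LoD to a coarser one - not the thesis' formal ToG operators; `coarsen`, the event keys and the level map are invented for this sketch - consider re-keying event counts through a mapping from fine granules to coarse ones:

```python
from collections import Counter

def coarsen(events, level_map):
    """Generalise event counts one LoD up by re-keying them through level_map."""
    summary = Counter()
    for key, count in events.items():
        summary[level_map[key]] += count  # fine granule -> coarse granule
    return summary

# Usage: hourly event counts generalised to daily counts (hypothetical data).
hourly = Counter({("2015-05-19", 13): 4, ("2015-05-19", 21): 2})
to_day = {k: k[0] for k in hourly}        # (day, hour) -> day
daily = coarsen(hourly, to_day)           # Counter({'2015-05-19': 6})
```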