17 research outputs found

    On Parallelization of the NIS-apriori Algorithm for Data Mining

    We have been developing the getRNIA software tool for data mining under uncertain information. The getRNIA software tool is powered by the NIS-Apriori algorithm, which is a variation of the well-known Apriori algorithm. This paper considers the parallelization of the NIS-Apriori algorithm, and implements a part of this algorithm in the Apache Spark environment. In particular, we apply the implemented software to two data sets, the Mammographic data set and the Mushroom data set, in order to show the characteristics of the parallelization. Although the parallelization was not very effective for the Mammographic data set, it was much more effective for the Mushroom data set. 19th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, September 7-9, 2015, Singapore

    NIS-Apriori Algorithm with a Target Descriptor for Handling Rules Supported by Minor Instances

    For each implication τ: Condition_part ⇒ Decision_part defined in table data sets, we regard τ as a rule if τ satisfies appropriate constraints, i.e., support(τ) ≥ α and accuracy(τ) ≥ β for two threshold values α and β (0 < α, β ≤ 1). If τ is a rule for a relatively high α, we say τ is supported by major instances. On the other hand, if τ is a rule only for a lower α, we say τ is supported by minor instances. This paper focuses on rules supported by minor instances, and clarifies some problems. Then, the NIS-Apriori algorithm, which was proposed for handling rules supported by major instances from tables with information incompleteness, is extended to the NIS-Apriori algorithm with a target descriptor. The effectiveness of the new algorithm is examined by some experiments. The Seventh International Symposium on Integrated Uncertainty in Knowledge Modelling and Decision Making (IUKM 2019), 27-29 March, 2019, Nara, Japan
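
The rule constraint above can be illustrated with a minimal sketch. It assumes the usual definitions, which the abstract does not spell out: support(τ) is the fraction of all rows matching both parts, and accuracy(τ) is the fraction of rows matching the condition part that also match the decision part. The function names and the toy table are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def support_and_accuracy(table: List[Row], condition: Row, decision: Row) -> Tuple[float, float]:
    """Return (support, accuracy) of the implication condition => decision.

    Assumed definitions (standard, not quoted from the paper):
      support  = |rows matching condition and decision| / |all rows|
      accuracy = |rows matching condition and decision| / |rows matching condition|
    """
    matched = [r for r in table if all(r[a] == v for a, v in condition.items())]
    both = [r for r in matched if all(r[a] == v for a, v in decision.items())]
    support = len(both) / len(table) if table else 0.0
    accuracy = len(both) / len(matched) if matched else 0.0
    return support, accuracy


def is_rule(table: List[Row], condition: Row, decision: Row, alpha: float, beta: float) -> bool:
    """tau is a rule if support(tau) >= alpha and accuracy(tau) >= beta."""
    support, accuracy = support_and_accuracy(table, condition, decision)
    return support >= alpha and accuracy >= beta


# Hypothetical toy table for illustration only.
TABLE = [
    {"color": "red", "size": "s", "class": "a"},
    {"color": "red", "size": "m", "class": "a"},
    {"color": "red", "size": "s", "class": "b"},
    {"color": "blue", "size": "s", "class": "b"},
]
```

With this toy table, the implication color=red ⇒ class=a passes for α = 0.5 but fails once α is raised above 0.5, which is the sense in which a lower α admits rules supported by fewer instances.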

    NIS-Apriori-based rule generation with three-way decisions and its application system in SQL

    In this study, non-deterministic information systems-Apriori-based (NIS-Apriori-based) rule generation from table data sets with incomplete information, its SQL implementation, and the unique characteristics of the new framework are presented. Additionally, a few unsolved research topics are proposed based on the framework. We follow the framework of NISs and propose certain rules and possible rules based on possible world semantics. Although each rule τ depends on a large number of possible tables, we prove that each rule τ is determined by examining only two τ-dependent possible tables. The NIS-Apriori algorithm is an adjusted Apriori algorithm that can handle such tables. Furthermore, it is logically sound and complete with regard to the rules. Subsequently, the implementation of the NIS-Apriori algorithm in SQL is described, and a few new topics induced by NIS-Apriori-based rule generation are confirmed. One of the topics considered is the possibility of estimating missing values via the obtained certain rules. The proposed methodology and the environment yielded by NIS-Apriori-based rule generation in SQL are useful for table data analysis with three-way decisions.
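
The possible world semantics mentioned above can be sketched by brute force. This is a naive enumeration for illustration only, not the NIS-Apriori algorithm, whose point is precisely that two τ-dependent possible tables suffice. The sketch assumes that a certain rule satisfies the support and accuracy thresholds in every possible table and that a possible rule satisfies them in at least one; the thresholds, function names, and toy data are hypothetical.

```python
from itertools import product
from typing import Dict, Iterator, List, Set

NisRow = Dict[str, Set[str]]   # each cell holds a set of candidate values
Row = Dict[str, str]


def possible_tables(nis: List[NisRow]) -> Iterator[List[Row]]:
    """Yield every deterministic table obtained by picking one value per cell."""
    attrs = list(nis[0].keys()) if nis else []
    cells = [sorted(row[a]) for row in nis for a in attrs]
    for choice in product(*cells):
        values = iter(choice)
        yield [{a: next(values) for a in attrs} for _ in nis]


def satisfies(table: List[Row], cond: Row, dec: Row, alpha: float, beta: float) -> bool:
    """Check support >= alpha and accuracy >= beta in one deterministic table."""
    matched = [r for r in table if all(r[a] == v for a, v in cond.items())]
    both = [r for r in matched if all(r[a] == v for a, v in dec.items())]
    if not matched:
        return False
    return len(both) / len(table) >= alpha and len(both) / len(matched) >= beta


def classify(nis: List[NisRow], cond: Row, dec: Row, alpha: float, beta: float) -> str:
    """Three-way outcome: 'certain', 'possible', or 'not a rule' (assumed labels)."""
    results = [satisfies(t, cond, dec, alpha, beta) for t in possible_tables(nis)]
    if all(results):
        return "certain"
    if any(results):
        return "possible"
    return "not a rule"


# Hypothetical toy NIS: the third row's temperature is uncertain.
NIS = [
    {"temp": {"high"}, "flu": {"yes"}},
    {"temp": {"high"}, "flu": {"yes"}},
    {"temp": {"high", "normal"}, "flu": {"no"}},
]
```

The number of possible tables grows exponentially with the number of uncertain cells, so this enumeration is only practical for very small tables.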

    On NIS-Apriori Based Data Mining in SQL

    We have proposed a framework of Rough Non-deterministic Information Analysis (RNIA) for tables with non-deterministic information, and applied RNIA to analyzing tables with uncertainty. We have also developed the RNIA software tool in Prolog and getRNIA in Python. In addition to these two tools, we now consider an RNIA software tool in SQL for handling large data sets. This paper reports the current state of the prototype named NIS-Apriori in SQL, which will provide a more convenient environment for data analysis. International Joint Conference on Rough Sets (IJCRS 2016), October 7-11, 2016, Santiago, Chile

    A Proposal of a Privacy-preserving Questionnaire by Non-deterministic Information and Its Analysis

    We focus on a questionnaire consisting of three-choice or multiple-choice questions, and propose a privacy-preserving questionnaire based on non-deterministic information. Each respondent usually answers with one choice from the multiple choices, and each answer is stored as a tuple in a table data set. The organizer of the questionnaire analyzes the table data set, and obtains rules and tendencies. If this table data set contains personal information, the organizer needs to employ analytical procedures with privacy-preserving functionality. In this paper, we propose a new framework in which each respondent intentionally answers with non-deterministic information instead of deterministic information. For example, a respondent answers 'either A, B, or C' instead of the actual choice A, and thereby intentionally dilutes the choice. This concept may be similar to k-anonymity. Non-deterministic information will be desirable for preserving each respondent's information. We follow the framework of Rough Non-deterministic Information Analysis (RNIA), and apply RNIA to the privacy-preserving questionnaire based on non-deterministic information. In current data mining algorithms, tuples with non-deterministic information may be removed by the data cleaning process. However, RNIA can handle such tuples as well as tuples with deterministic information. By using RNIA, we can consider new types of privacy-preserving questionnaires. 2016 IEEE International Conference on Big Data, December 5-8, 2016, Washington DC, USA

    A Proposal of Machine Learning by Rule Generation from Tables with Non-deterministic Information and Its Prototype System

    A logical framework for Machine Learning by Rule Generation (MLRG) from tables with non-deterministic information is proposed, and its prototype system in SQL is implemented. In MLRG, the certain rules defined in Rough Non-deterministic Information Analysis (RNIA) are obtained first, and each uncertain attribute value is then estimated so as to yield as many certain rules as possible, because the certain rules give us the most reliable information. This strategy is similar to maximum likelihood estimation in statistics. By repeating this process, a standard table and the rules in that table are learned (or estimated) from a given table with non-deterministic information. Even though it will be hard to know the actual unknown values, MLRG will give a plausible estimate. International Joint Conference on Rough Sets (IJCRS 2017), 3-7 July, 2017, Olsztyn, Poland

    An adjusted Apriori algorithm to itemsets defined by tables and an improved rule generator with three-way decisions

    The NIS-Apriori algorithm, which is extended from the Apriori algorithm, was proposed for rule generation from non-deterministic information systems and implemented in SQL. The realized system handles the concepts of certainty, possibility, and three-way decisions. This paper focuses on a characteristic of table data sets, namely that there is usually a fixed decision attribute. Therefore, it is enough for us to handle itemsets with one decision attribute, and we can see that one frequent itemset defines one implication. We make use of these characteristics and eliminate unnecessary itemsets to improve execution performance. Some experiments with the software tool implemented in Python confirm the improved performance. International Joint Conference on Rough Sets, IJCRS 2020, June 29 - July 3, 2020, Havana, Cuba (changed to an online event due to the spread of COVID-19)

    Families of the Granules for Association Rules and Their Properties

    We employed the granule (or the equivalence class) defined by a descriptor in tables, and investigated rough set-based rule generation. In this paper, we consider new granules defined by an implication, and propose a family of granules defined by an implication in a table with exact data. Each family consists of four granules, and we show that three criterion values, support, accuracy, and coverage, can easily be obtained by using the four granules. Then, we extend this framework to tables with non-deterministic data. In this case, each family consists of nine granules, and the minimum and maximum values of the three criteria are also obtained by using the nine granules. We prove that there is a table in which support and accuracy both attain their minimum, whereas in general there is no table in which support, accuracy, and coverage all attain their minimum. Finally, we consider the application of these properties to Apriori-based rule generation from uncertain data. These properties will make Apriori-based rule generation more effective. 10th International Conference, RSKT 2015, Held as Part of the International Joint Conference on Rough Sets, IJCRS 2015, November 20-23, 2015, Tianjin, China
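
For a table with exact data, the relation between four granules and the three criteria can be sketched as follows. The sketch assumes that the four granules for an implication P ⇒ Q partition the rows by whether they match P and whether they match Q, and it uses the standard definitions of support, accuracy, and coverage; the abstract states neither explicitly, so both are assumptions, and all names are hypothetical.

```python
from typing import Dict, List, Set

Row = Dict[str, str]


def four_granules(table: List[Row], cond: Row, dec: Row) -> Dict[str, Set[int]]:
    """Partition row indices into (P,Q), (P,not Q), (not P,Q), (not P,not Q)."""
    granules: Dict[str, Set[int]] = {"PQ": set(), "PnQ": set(), "nPQ": set(), "nPnQ": set()}
    for i, row in enumerate(table):
        p = all(row[a] == v for a, v in cond.items())
        q = all(row[a] == v for a, v in dec.items())
        key = ("P" if p else "nP") + ("Q" if q else "nQ")
        granules[key].add(i)
    return granules


def criteria(granules: Dict[str, Set[int]]) -> Dict[str, float]:
    """Derive support, accuracy, and coverage from the granule sizes alone."""
    pq = len(granules["PQ"])
    p_total = pq + len(granules["PnQ"])    # rows matching P
    q_total = pq + len(granules["nPQ"])    # rows matching Q
    n = sum(len(g) for g in granules.values())
    return {
        "support": pq / n if n else 0.0,
        "accuracy": pq / p_total if p_total else 0.0,
        "coverage": pq / q_total if q_total else 0.0,
    }


# Hypothetical toy table for illustration only.
TABLE = [
    {"color": "red", "class": "a"},
    {"color": "red", "class": "a"},
    {"color": "red", "class": "b"},
    {"color": "blue", "class": "b"},
]
```

Once the granules are built, all three criteria follow from set sizes without rescanning the table.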

    Rough Set-Based Information Dilution by Non-deterministic Information

    We have investigated rough set-based concepts for a given Non-deterministic Information System (NIS). In this paper, we consider intentionally generating a NIS from a Deterministic Information System (DIS). A NIS Φ is seen as a diluted DIS ϕ, and we can hide the actual values in ϕ by using Φ. We call this way of hiding Information Dilution by non-deterministic information. This paper considers information dilution and its application to hiding the actual values in a table. 14th International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing, RSFDGrC 2013, October 11-14, 2013, Halifax, NS, Canada
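
One simple way to picture information dilution is to replace each actual value with a set that contains it together with some other values from the attribute's domain. The sketch below does this with random decoys; the dilution scheme, the parameter `k`, and the function name are assumptions made for illustration, and the paper may dilute values differently.

```python
import random
from typing import Dict, FrozenSet, List, Optional, Sequence

Row = Dict[str, str]
NisRow = Dict[str, FrozenSet[str]]


def dilute(dis: List[Row], domains: Dict[str, Sequence[str]], k: int,
           seed: Optional[int] = None) -> List[NisRow]:
    """Turn a DIS into a NIS by hiding each value among up to k candidates.

    Every cell becomes a set that still contains the actual value, so the
    original DIS remains one of the possible tables of the resulting NIS.
    """
    rng = random.Random(seed)
    nis: List[NisRow] = []
    for row in dis:
        diluted: NisRow = {}
        for attr, actual in row.items():
            others = [v for v in domains[attr] if v != actual]
            decoys = rng.sample(others, min(max(k - 1, 0), len(others)))
            diluted[attr] = frozenset([actual, *decoys])
        nis.append(diluted)
    return nis


# Hypothetical questionnaire-style data for illustration only.
DOMAINS = {"answer": ["A", "B", "C"], "age": ["20s", "30s", "40s"]}
DIS = [
    {"answer": "A", "age": "20s"},
    {"answer": "C", "age": "40s"},
]
```

A larger `k` hides the actual value among more candidates at the cost of more uncertainty for later analysis.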

    Granules for Association Rules and Decision Support in the getRNIA System

    This paper proposes granules for association rules in Deterministic Information Systems (DISs) and Non-deterministic Information Systems (NISs). Granules for an association rule are defined for every implication, and give us a new methodology for knowledge discovery and decision support. We regard decision support based on a table under the condition P as fixing the decision Q by using the most appropriate association rule P ⇒ Q. We recently implemented the getRNIA system, which is powered by granules for association rules. This paper describes how the getRNIA system deals with decision support under uncertainty, and shows some experimental results.