642 research outputs found

    An Algorithm for Generating Non-Redundant Sequential Rules for Medical Time Series Data

    In this paper, an algorithm for generating non-redundant sequential rules for medical time series data is designed. This study continues my previous study, "An Algorithm for Mining Closed Weighted Sequential Patterns with Flexing Time Interval for Medical Time Series Data" [25]. In that work, the weight of each sequence was calculated from the time intervals between its itemsets. Candidate sequences were then generated with flexible time intervals, and frequent sequential patterns were computed with the aid of the proposed support measure. Next, the frequent sequential patterns were subjected to a closure check, which retained only the closed sequential patterns with flexible time intervals. Finally, the methodology was shown to produce the necessary sequential patterns: the closed sequential patterns it constructed were 23.2% fewer than the sequential patterns. In this study, sequential rules are generated from the closed sequential patterns based on the confidence value of each rule. The resulting closed sequential rules are then subjected to a non-redundancy check, which produces the final set of non-redundant weighted closed sequential rules with flexible time intervals. This study produces non-redundant sequential rules that are 172.37% fewer than the sequential rules.
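
The confidence-based rule generation step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: it assumes plain (unweighted) support counts and ignores the flexible time intervals, and the names `rules_from_pattern`, `support`, and `min_conf` are hypothetical.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[str, ...]
Rule = Tuple[Pattern, Pattern, float]  # (antecedent, consequent, confidence)


def rules_from_pattern(
    pattern: Pattern, support: Dict[Pattern, int], min_conf: float
) -> List[Rule]:
    """Split a sequential pattern into prefix => suffix rules.

    Assumption: confidence(prefix => suffix) = sup(pattern) / sup(prefix),
    with plain support counts standing in for the paper's weighted measure.
    """
    rules: List[Rule] = []
    for split in range(1, len(pattern)):
        antecedent, consequent = pattern[:split], pattern[split:]
        if antecedent not in support:
            continue  # the support of this prefix is unknown, so skip it
        confidence = support[pattern] / support[antecedent]
        if confidence >= min_conf:
            rules.append((antecedent, consequent, confidence))
    return rules


# Hypothetical support counts for a toy pattern set.
support = {("a",): 4, ("a", "b"): 3, ("a", "b", "c"): 2}
rules = rules_from_pattern(("a", "b", "c"), support, min_conf=0.6)
```

With these toy counts, only the rule with antecedent `("a", "b")` passes the 0.6 threshold, because the prefix `("a",)` yields a confidence of 2/4.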

    Non-redundant sequential association rule mining based on closed sequential patterns

    In many applications, e.g., bioinformatics, web access traces, and system utilisation logs, the data is naturally in the form of sequences. There is great interest in analysing sequential data and finding the inherent characteristics or relationships within it. Sequential association rule mining is one of the methods used to analyse this data. Because conventional sequential association rule mining very often generates a huge number of association rules, many of which are redundant, it is desirable to eliminate the unnecessary rules. Because of the complexity and temporally ordered nature of sequential data, current research on sequential association rule mining is limited. Although several sequential association rule prediction models using either sequence constraints or temporal constraints have been proposed, none of them considers the redundancy problem in rule mining. The main contribution of this research is a non-redundant association rule mining method based on closed frequent sequences and minimal sequential generators. We also give a definition of non-redundant sequential rules, which are sequential rules with minimal antecedents but maximal consequents. A new algorithm called CSGM (closed sequential and generator mining) for generating closed sequences and minimal sequential generators is also introduced. A further experiment compares the performance of generating non-redundant sequential rules with that of generating full sequential rules, and the performance of CSGM is evaluated against other closed sequential pattern mining and generator mining algorithms. We also use the generated non-redundant sequential rules for query expansion in order to improve recommendations for infrequently purchased products.
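
The definition of non-redundant rules given above (minimal antecedents, maximal consequents) can be illustrated with a small filter. This is a sketch under an assumption not stated in the abstract: a rule is treated as redundant when another rule with the same support and confidence has an antecedent that is a subsequence of its antecedent and a consequent that contains its consequent as a subsequence. It is not the CSGM algorithm, and `Rule`, `is_subsequence`, and `non_redundant` are hypothetical names.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Rule(NamedTuple):
    antecedent: Tuple[str, ...]
    consequent: Tuple[str, ...]
    support: int
    confidence: float


def is_subsequence(small: Sequence[str], big: Sequence[str]) -> bool:
    """True if `small` can be obtained from `big` by deleting items."""
    remaining = iter(big)
    # `item in remaining` advances the iterator, so order is respected.
    return all(item in remaining for item in small)


def non_redundant(rules: List[Rule]) -> List[Rule]:
    """Keep rules not dominated by a rule with a shorter-or-equal antecedent
    and a longer-or-equal consequent at equal support and confidence
    (the equality condition is an assumption of this sketch)."""
    kept = []
    for r in rules:
        dominated = any(
            o != r
            and o.support == r.support
            and o.confidence == r.confidence
            and is_subsequence(o.antecedent, r.antecedent)
            and is_subsequence(r.consequent, o.consequent)
            for o in rules
        )
        if not dominated:
            kept.append(r)
    return kept


# Hypothetical rule set: r2 and r3 carry no information beyond r1.
r1 = Rule(("a",), ("b", "c"), 2, 0.5)
r2 = Rule(("a", "b"), ("c",), 2, 0.5)
r3 = Rule(("a",), ("b",), 2, 0.5)
r4 = Rule(("a",), ("c",), 3, 0.75)
result = non_redundant([r1, r2, r3, r4])
```

Here `r4` survives because its support and confidence differ from those of `r1`, so under this sketch's criterion it is not implied by `r1`.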

    Mining Partially-Ordered Sequential Rules Common to Multiple Sequences

    © 2015 IEEE. Sequential rule mining is an important data mining problem with multiple applications. An important limitation of algorithms for mining sequential rules common to multiple sequences is that rules are very specific and therefore many similar rules may represent the same situation. This can cause three major problems: (1) similar rules can be rated quite differently, (2) rules may not be found because they are individually considered uninteresting, and (3) rules that are too specific are less likely to be used for making predictions. To address these issues, we explore the idea of mining "partially-ordered sequential rules" (POSR), a more general form of sequential rules such that items in the antecedent and the consequent of each rule are unordered. To mine POSR, we propose the RuleGrowth algorithm, which is efficient and easily extendable. In particular, we present an extension (TRuleGrowth) that accepts a sliding-window constraint to find rules occurring within a maximum amount of time. A performance study with four real-life datasets shows that RuleGrowth and TRuleGrowth have excellent performance and scalability compared to baseline algorithms and that the number of rules discovered can be several orders of magnitude smaller when the sliding-window constraint is applied. Furthermore, we also report results from a real application showing that POSR can provide a much higher prediction accuracy than regular sequential rules for sequence prediction.
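
To make the notion of a partially-ordered sequential rule concrete, the following sketch checks whether a rule X => Y holds in a sequence of itemsets and computes its support and confidence over a small database. It is a brute-force illustration, not RuleGrowth or TRuleGrowth. It assumes a rule occurs when every item of X appears before every item of Y, with the items inside X and inside Y in any order, and it models the sliding window as a maximum number of consecutive itemsets. The names `occurs` and `rule_stats` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Seq = List[Set[str]]  # a sequence is an ordered list of itemsets


def occurs(seq: Seq, x: Set[str], y: Set[str], window: Optional[int] = None) -> bool:
    """True if all items of x appear (in any order) strictly before all
    items of y, inside some block of at most `window` consecutive itemsets."""
    size = len(seq) if window is None else window
    for start in range(len(seq)):
        block = seq[start:start + size]
        seen: Set[str] = set()
        for k in range(len(block) - 1):
            seen |= block[k]
            if x <= seen:
                # Earliest point where x is complete: the remainder is largest.
                rest = set().union(*block[k + 1:])
                if y <= rest:
                    return True
                break
    return False


def rule_stats(
    db: List[Seq], x: Set[str], y: Set[str], window: Optional[int] = None
) -> Tuple[float, float]:
    """Return (support, confidence), with confidence taken relative to the
    sequences that contain every item of x (an assumption of this sketch)."""
    with_rule = sum(occurs(s, x, y, window) for s in db)
    with_x = sum(x <= set().union(*s) for s in db)
    support = with_rule / len(db)
    confidence = with_rule / with_x if with_x else 0.0
    return support, confidence


# Hypothetical toy database of four sequences.
db = [
    [{"a"}, {"b"}, {"c"}],
    [{"b"}, {"a"}, {"c"}],
    [{"c"}, {"a", "b"}],
    [{"a"}, {"d"}, {"d"}, {"c"}],
]
unconstrained = rule_stats(db, {"a", "b"}, {"c"})
windowed = rule_stats(db, {"a"}, {"c"}, window=2)
```

The first two sequences both support the rule {a, b} => {c} even though `a` and `b` appear in opposite orders, which is the generalisation the abstract describes. Adding the window of two itemsets leaves only the second sequence supporting {a} => {c}.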

    Searching for patterns in political event sequences: Experiments with the KEDs database

    This paper presents an empirical study on the possibility of discovering interesting event sequences and sequential rules in a large database of international political events. A data mining algorithm first presented by Mannila and Toivonen (1996), which is able to search for generalized episodes in such event databases, has been implemented and extended. Experiments conducted with this algorithm on the Kansas Event Data System (KEDS) database, an event data set covering interactions between countries in the Persian Gulf region, are described. Some qualitative and quantitative results are reported, and experiences with strategies for reducing the problem complexity and focusing the search on interesting subsets of events are described.

    Feature Extraction Using the Class Sequential Rules Method on Product Reviews (Ekstraksi Fitur Menggunakan Metode Class Sequential Rules pada Product Review)

    Currently, e-commerce sites provide a dedicated page for consumers who want to post reviews of the products sold there (product reviews). This can help prospective buyers decide whether the product they want to buy is good or not. In addition, product reviews help sellers obtain feedback from consumers. Prospective buyers and sellers can read product reviews one by one and group them into positive and negative opinions. The problem, however, is that the number of product reviews grows every day, and consumers give positive and negative responses to several features of a product at the same time (free format). This can make it difficult for sellers and prospective buyers to see which features of a product are viewed positively or negatively. Therefore, a system is needed that can automatically extract the features of a product and classify the opinions about those features. The Class Sequential Rules (CSR) method can be implemented for feature extraction, and the Opinion Lexicon method can be implemented for opinion classification. This is demonstrated by the set of features produced by feature extraction and the set of feature-polarity pairs produced by opinion classification. The highest f-score for feature extraction using CSR on free-format reviews is 51.26%, while the highest f-score for opinion classification using Opinion Lexicon on free-format reviews is 35.65%. Keywords: feature extraction, opinion classification, opinion mining, product review, Class Sequential Rule, Opinion Lexicon

    Judgment aggregation by quota rules

    It is known that majority voting among several individuals on logically interconnected propositions may generate irrational collective judgments. We generalize majority voting by considering quota rules, which accept each proposition if and only if the number of individuals accepting it exceeds some (proposition-specific) threshold. After characterizing quota rules, we prove necessary and sufficient conditions under which their outcomes satisfy various rationality conditions. We also consider sequential quota rules, which adjudicate propositions sequentially, letting earlier judgments constrain later ones. While ensuring rationality, sequential rules may be path-dependent. We characterize path-independence and prove its equivalence to strategy-proofness under mild conditions. Our results generalize earlier (im)possibility theorems. Keywords: judgment aggregation, quota rules, collective rationality, path-dependence, strategy-proofness, formal logic
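
The quota rule described in this abstract, and the irrational outcome that majority voting can produce, can be shown in a few lines. This is a minimal sketch that follows the abstract's wording: a proposition is accepted when the number of individuals accepting it strictly exceeds its threshold. The function name `quota_rule` and the three-person profile are hypothetical.

```python
from typing import Dict, List

Judgment = Dict[str, bool]  # proposition name -> accepted?


def quota_rule(profile: List[Judgment], thresholds: Dict[str, float]) -> Judgment:
    """Accept each proposition iff the number of individuals accepting it
    exceeds that proposition's threshold."""
    return {
        prop: sum(individual[prop] for individual in profile) > threshold
        for prop, threshold in thresholds.items()
    }


# Three individuals, each holding a logically consistent judgment set
# over p, q, and their conjunction.
profile = [
    {"p": True, "q": True, "p_and_q": True},
    {"p": True, "q": False, "p_and_q": False},
    {"p": False, "q": True, "p_and_q": False},
]

# Majority voting is the quota rule with threshold n/2 on every proposition.
majority = quota_rule(profile, {"p": 1.5, "q": 1.5, "p_and_q": 1.5})

# Raising the thresholds on p and q so that all three must agree
# restores consistency in this particular profile.
strict = quota_rule(profile, {"p": 2, "q": 2, "p_and_q": 1.5})
```

Under majority voting the group accepts `p` and `q` but rejects their conjunction, which is the kind of irrational collective judgment the abstract refers to, even though every individual judgment set is consistent.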

    Learning to Identify Ambiguous and Misleading News Headlines

    Accuracy is one of the basic principles of journalism. However, it is increasingly hard to manage due to the diversity of news media. Some editors of online news tend to use catchy headlines which trick readers into clicking. These headlines are either ambiguous or misleading, degrading the reading experience of the audience. Thus, identifying inaccurate news headlines is a task worth studying. Previous work names these headlines "clickbaits" and mainly focuses on the features extracted from the headlines, which limits the performance since the consistency between headlines and news bodies is underappreciated. In this paper, we clearly redefine the problem and identify ambiguous and misleading headlines separately. We utilize class sequential rules to exploit structure information when detecting ambiguous headlines. For the identification of misleading headlines, we extract features based on the congruence between headlines and bodies. To make use of the large unlabeled data set, we apply a co-training method and gain an increase in performance. The experiment results show the effectiveness of our methods. Then we use our classifiers to detect inaccurate headlines crawled from different sources and conduct a data analysis. Comment: Accepted by IJCAI 201