Reductions for Frequency-Based Data Mining Problems

Author: Miettinen Pauli
Neumann Stefan
Publication date: 01/01/2017

Studying the computational complexity of problems is one of the - if not the - fundamental questions in computer science. Yet, surprisingly little is known about the computational complexity of many central problems in data mining. In this paper we study frequency-based problems and propose a new type of reduction that allows us to compare the complexities of the maximal frequent pattern mining problems in different domains (e.g. graphs or sequences). Our results extend those of Kimelfeld and Kolaitis [ACM TODS, 2014] to a broader range of data mining problems. Our results show that, by allowing constraints in the pattern space, the complexities of many maximal frequent pattern mining problems collapse. These problems include maximal frequent subgraphs in labelled graphs, maximal frequent itemsets, and maximal frequent subsequences with no repetitions. In addition to theoretical interest, our results might yield more efficient algorithms for the studied problems. Comment: This is an extended version of a paper of the same title to appear in the Proceedings of the 17th IEEE International Conference on Data Mining (ICDM'17)
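As a concrete illustration of one of the problems named in this abstract, the following is a minimal brute-force sketch of maximal frequent itemset mining. It is not the reduction from the paper; the function names, the toy transactions, and the `min_support` threshold are assumptions chosen for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets in at least min_support transactions."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            # Support = number of transactions that contain every item of cand.
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                result[cand] = support
                found = True
        if not found:
            # Anti-monotonicity: if no itemset of this size is frequent,
            # no larger itemset can be frequent either.
            break
    return result


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy data: four transactions over the items a, b, c, d.
transactions = [frozenset("abc"), frozenset("abd"), frozenset("ab"), frozenset("cd")]
maximal = maximal_frequent_itemsets(transactions, min_support=2)
```

The enumeration is exponential in the number of items, so this sketch only serves to make the definition of "maximal frequent" concrete on small inputs.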

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Using Answer Set Programming for pattern mining

Author: Guyet Thomas
Moinard Yves
Quiniou René
Publication date: 11/06/2014

Serial pattern mining consists in extracting the frequent sequential patterns from a unique sequence of itemsets. This paper explores the ability of a declarative language, such as Answer Set Programming (ASP), to solve this issue efficiently. We propose several ASP implementations of the frequent sequential pattern mining task: a non-incremental and an incremental resolution. The results show that the incremental resolution is more efficient than the non-incremental one, but both ASP programs are less efficient than dedicated algorithms. Nonetheless, this approach can be seen as a first step toward a generic framework for sequential pattern mining with constraints. Comment: Intelligence Artificielle Fondamentale (2014)

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Co-prescription patterns of cardiovascular preventive treatments: A cross-sectional study in the Aragon Workers' Health Study (Spain)

Author: Aguilar-Palacio I.
Feja C.
González J.
Lallana M.
Malo S.
Moreno-Franco B.
Rabanaque M.
Publication venue: 'BMJ'
Publication date: 01/01/2019

Objectives: To identify combinations of cardiovascular disease (CVD) preventive treatments, both with one another and with other drugs, and to determine their prevalence in a cohort of Spanish workers. Design: Cross-sectional study. Setting: Aragon Workers' Health Study (AWHS) cohort in Spain. Participants: 5577 workers belonging to the AWHS cohort. From these subjects, we selected those who had at least three prescriptions of the same therapeutic subgroup in 2014 (n=4605). Primary and secondary outcome measures: Drug consumption was obtained from the Aragon Pharmaceutical Consumption Registry (Farmasalud). Prevalence analyses were conducted to assess treatment utilisation. Frequent item set mining techniques were applied to identify drug co-prescription patterns. All results were stratified by sex and age. Results: 42.3% of men and 18.8% of women in the cohort received at least three prescriptions of a CVD preventive treatment in 2014. The most prescribed CVD treatments were antihypertensives (men: 28.2%, women: 9.2%). The most frequent association observed among CVD preventive treatments was agents acting on the renin-angiotensin system and lipid-lowering drugs (5.1% of treated subjects). Co-prescription increased with age, especially after 50 years old, both in frequency and number of associations, and was higher in men. Regarding the association between CVD preventive treatments and other drugs, the most frequent pattern observed was lipid-lowering drugs and drugs used for acid related disorders (4.2% of treated subjects). Conclusions: A substantial number of co-prescription patterns involve CVD preventive treatments. These patterns increase with age and are more frequent in men. Mining techniques are a useful tool to identify pharmacological patterns that are not evident in individual clinical practice, in order to improve drug prescription appropriateness.
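The frequent item set mining step mentioned in this abstract can be sketched, for pairs of drug groups only, as follows. This is an assumed, simplified illustration and not the study's actual pipeline: the function name `co_prescription_prevalence`, the `min_share` threshold, and the four toy subjects are all hypothetical and do not come from the AWHS data.

```python
from collections import Counter
from itertools import combinations


def co_prescription_prevalence(subject_drugs, min_share):
    """Return {drug-group pair: share of subjects} for pairs reaching min_share."""
    n = len(subject_drugs)
    pair_counts = Counter()
    for drugs in subject_drugs.values():
        # Sorting gives each unordered pair one canonical tuple representation.
        for pair in combinations(sorted(drugs), 2):
            pair_counts[pair] += 1
    return {
        pair: count / n
        for pair, count in pair_counts.items()
        if count / n >= min_share
    }


# Hypothetical toy data: subject id -> set of prescribed therapeutic groups.
subject_drugs = {
    "s1": {"antihypertensive", "lipid-lowering"},
    "s2": {"antihypertensive", "lipid-lowering", "acid-disorder"},
    "s3": {"lipid-lowering", "acid-disorder"},
    "s4": {"antihypertensive"},
}
patterns = co_prescription_prevalence(subject_drugs, min_share=0.5)
```

Stratifying by sex and age, as the study does, would amount to calling the same function on each subgroup of subjects separately.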

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Universidad de Zaragoza

An efficient parallel method for mining frequent closed sequential patterns

Author: Huynh Bao
Snášel Václav
Vo Bay
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Mining frequent closed sequential patterns (FCSPs) has attracted a great deal of research attention because it is an important task in sequence mining. Recently, many studies have focused on mining frequent closed sequential patterns because such patterns have proved to be more efficient and compact than frequent sequential patterns. Information can be fully extracted from frequent closed sequential patterns. In this paper, we propose an efficient parallel approach called parallel dynamic bit vector frequent closed sequential patterns (pDBV-FCSP) using multi-core processor architecture for mining FCSPs from large databases. The pDBV-FCSP divides the search space to reduce the required storage space and performs closure checking of prefix sequences early to reduce execution time for mining frequent closed sequential patterns. This approach overcomes the problems of parallel mining such as overhead of communication, synchronization, and data replication. It also solves the load-balancing issues between processors with a dynamic mechanism that redistributes the work when some processes are out of work, minimizing idle CPU time.

DSpace at VSB Technical University of Ostrava

Constraint-based Sequential Pattern Mining with Decision Diagrams

Author: Cire Andre A.
Hosseininasab Amin
van Hoeve Willem-Jan
Publication date: 14/11/2018

Constrained sequential pattern mining aims at identifying frequent patterns on a sequential database of items while observing constraints defined over the item attributes. We introduce novel techniques for constraint-based sequential pattern mining that rely on a multi-valued decision diagram representation of the database. Specifically, our representation can accommodate multiple item attributes and various constraint types, including a number of non-monotone constraints. To evaluate the applicability of our approach, we develop an MDD-based prefix-projection algorithm and compare its performance against a typical generate-and-check variant, as well as a state-of-the-art constraint-based sequential pattern mining algorithm. Results show that our approach is competitive with or superior to these other methods in terms of scalability and efficiency. Comment: AAAI201
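For orientation, here is a minimal sketch of plain prefix projection combined with a generate-and-check constraint test, i.e. the kind of baseline this abstract compares against. It does not use decision diagrams and is not the authors' algorithm; the function name `prefix_span`, the `constraint` callback, and the toy database are assumptions made for illustration.

```python
from collections import defaultdict


def prefix_span(database, min_support, constraint=lambda pattern: True):
    """Mine frequent subsequences by prefix projection.

    A pattern is reported only if constraint(pattern) is true. The check is
    applied after generation and never prunes the search, so it is also safe
    for non-monotone constraints.
    """
    results = []

    def mine(prefix, projected):
        # projected holds (sequence index, start position) pairs.
        occurrences = defaultdict(dict)  # item -> {sequence index: first position}
        for seq_id, start in projected:
            seq = database[seq_id]
            for pos in range(start, len(seq)):
                occurrences[seq[pos]].setdefault(seq_id, pos)
        for item in sorted(occurrences):
            hits = occurrences[item]
            if len(hits) < min_support:
                continue  # infrequent extensions cannot lead to frequent patterns
            pattern = prefix + [item]
            if constraint(pattern):
                results.append((tuple(pattern), len(hits)))
            # Project each supporting sequence past the matched item.
            mine(pattern, [(sid, pos + 1) for sid, pos in hits.items()])

    mine([], [(i, 0) for i in range(len(database))])
    return results


# Hypothetical toy database of item sequences.
database = ["abc", "acb", "ab", "bc"]
all_patterns = prefix_span(database, min_support=3)
long_patterns = prefix_span(database, min_support=3, constraint=lambda p: len(p) >= 2)
```

Because the constraint is checked only after a pattern is generated, every frequent pattern is still enumerated; pushing constraints into the search, as the abstract describes, is what avoids that cost.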

arXiv.org e-Print Archive

University of Toronto Research Repository

Association for the Advancement of Artificial Intelligence: AAAI Publications