
89 research outputs found

Mining High Utility Itemsets with Regular Occurrence

Author: Amphawan Komate
Jitpattanakul Anuchit
Lenca Philippe
Surarerks Athasit
Publication venue: LPPM ITBis Lembah Dempo
Publication date: 31/08/2016

High utility itemset mining (HUIM) plays an important role in the data mining community and in a wide range of applications. For example, in retail business it is used for finding sets of sold products that give high profit, low cost, etc. These itemsets can help improve marketing strategies, design promotions and advertisements, etc. However, since HUIM only considers the utility values of items and itemsets, it may not be sufficient for observing the product-buying behavior of customers, such as information related to "regular purchases of sets of products having a high profit margin". To address this issue, the occurrence behavior of itemsets (in terms of regularity) was investigated simultaneously with their utility values. Then, the problem of mining high utility itemsets with regular occurrence (MHUIR), which finds sets of co-occurring items with high utility values and regular occurrence in a database, was considered. An efficient single-pass algorithm, called MHUIRA, was introduced. A new modified utility-list structure, called NUL, was designed to efficiently maintain utility values and occurrence information and to increase the efficiency of computing the utility of itemsets. Experimental studies on real and synthetic datasets and complexity analyses are provided to show the efficiency of MHUIRA combined with NUL in terms of time and space usage for mining interesting itemsets based on regularity and utility constraints.
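The two measures combined in this abstract can be illustrated with a minimal sketch. It is not the MHUIRA algorithm or the NUL structure, which the abstract does not specify. It assumes that the utility of an itemset is the sum of quantity times unit profit over the transactions that contain it, and that regularity is the largest gap (in transaction ids) between consecutive occurrences, counting the start and end of the database. The function name, the toy database, and the profit table are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Transaction = Dict[str, int]  # item -> purchased quantity (assumed layout)


def utility_and_regularity(
    db: List[Transaction],
    profit: Dict[str, int],
    itemset: Iterable[str],
) -> Tuple[int, int]:
    """Return (total utility, largest gap between occurrences) of an itemset.

    Transaction ids are taken to be 1..len(db); both definitions are
    assumptions made for illustration, not taken from the paper.
    """
    items = list(itemset)
    total_utility = 0
    last_tid = 0   # position of the previous occurrence (0 = start of database)
    max_gap = 0
    for tid, tx in enumerate(db, start=1):
        if all(item in tx for item in items):
            # Utility in this transaction: quantity * unit profit per item.
            total_utility += sum(tx[item] * profit[item] for item in items)
            max_gap = max(max_gap, tid - last_tid)
            last_tid = tid
    # Also count the gap from the last occurrence to the end of the database.
    max_gap = max(max_gap, len(db) - last_tid)
    return total_utility, max_gap


# Hypothetical toy data.
db = [
    {"a": 1, "b": 2},
    {"a": 2},
    {"a": 1, "b": 1, "c": 3},
    {"c": 1},
    {"a": 1, "b": 1},
]
profit = {"a": 5, "b": 2, "c": 1}

print(utility_and_regularity(db, profit, {"a", "b"}))  # (23, 2)
```

Under these assumptions, an itemset would be reported when its utility meets a minimum utility threshold and its largest gap stays within a maximum regularity threshold.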

Journal of ICT Research and Applications

Directory of Open Access Journals

HAL-Université de Bretagne Occidentale

ITB Journal

Mining top-k regular episodes from sensor streams

Author: Amphawan Komate
Lenca Philippe
Soulas Julie
Publication venue: Elsevier BV
Publication date: 01/01/2015

The monitoring of human activities plays an important role in health-care applications and for the data mining community. Existing approaches focus on recognizing activities occurring in sensor data streams. However, regular behaviors have not been studied. Thus, we introduce a new approach, TKRES, to discover the top-k most regular episodes from sensor streams. The top-k approach allows us to control the size of the output, thus preventing the supervisor from being overwhelmed when analyzing results. TKRES is based on the use of a simple top-k list and a k-tree structure for maintaining the top-k episodes and their occurrence information. We also investigate and report the performance of TKRES on two real-life smart home datasets.

Elsevier - Publisher Connector

Crossref

HAL-Université de Bretagne Occidentale

Improving Risk Predictions by Preprocessing Imbalanced Credit Data

Author: B. Tian
C. Bunkhumpornpat
C. Phua
D.L. Wilson
G.E.A.P.A. Batista
I. Brown
J. Demšar
J. Laurikkala
K. Kennedy
L.C. Thomas
N. Japkowicz
N.M. Kiefer
N.V. Chawla
P.E. Hart
S.J. Yen
V. Vinciotti
Y.M. Huang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Imbalanced credit data sets refer to databases in which the class of defaulters is heavily under-represented in comparison to the class of non-defaulters. This is a very common situation in real-life credit scoring applications, but it has so far received little attention. This paper investigates whether data resampling can be used to improve the performance of learners built from imbalanced credit data sets, and whether the effectiveness of resampling is related to the type of classifier. Experimental results demonstrate that learning with the resampled sets consistently outperforms the use of the original imbalanced credit data, independently of the classifier used.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

Document Clustering with Bursty Information

Author: Chaoji Vineet
Hoonlor Apirak
Szymanski Bolesław K.
Zaki Mohamed J.
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 30/01/2013

Nowadays, almost all text corpora, such as blogs, emails and RSS feeds, are collections of text streams. The traditional vector space model (VSM), or bag-of-words representation, cannot capture the temporal aspect of these text streams. So far, only a few bursty features have been proposed to create text representations with temporal modeling for text streams. We propose bursty feature representations that perform better than VSM on various text mining tasks, such as document retrieval, topic modeling and text categorization. For text clustering, we propose a novel framework to generate a bursty distance measure. We evaluated it on the UPGMA, Star and K-Medoids clustering algorithms. The bursty distance measure not only performed equally well on various text collections, but was also able to cluster news articles related to specific events much better than other models.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

On the Selection of Meaningful Association Rules

Author: Rangsipan Marukatat
Publication venue: IntechOpen
Publication date: 01/01/2009

IntechOpen

Detecting Attacks Against Deep Reinforcement Learning for Autonomous Driving

Author: AMBRA DEMONTIS
ANGELO SOTGIU
BATTISTA BIGGIO
CHENGFANG FANG
HSIAO-YING LIN
LUCA DEMETRIO
MAURA PINTOR
Publication date: 01/01/2023

With the advent of deep reinforcement learning, we witness the spread of novel autonomous driving agents that learn how to drive safely among humans. However, skilled attackers might steer the decision-making process of these agents through minimal perturbations applied to the readings of their hardware sensors. These perturbations force the behavior of the victim agent to change unexpectedly, increasing the likelihood of crashes by inhibiting its braking capability or coercing it into constantly changing lanes. To counter these phenomena, we propose a detector that can be mounted on autonomous driving cars to spot the presence of ongoing attacks. The detector first profiles the agent's behavior without attacks by looking at the representation learned during training. Once deployed, the detector discards all the decisions that deviate from the regular driving pattern. We empirically highlight the detection capabilities of our work by testing it against unseen attacks deployed with increasing strength.

Archivio istituzionale della ricerca - Università di Cagliari

C-Rex: A Comprehensive System for Recommending In-Text Citations with Explanations

Author: Cartus Isabela Bragaglia
Celis Sebastian
Duma Maria
Färber Michael
Zinecker Vinzenz
Publication venue: Association for Computing Machinery
Publication date: 01/01/2021

Finding suitable citations for scientific publications can be challenging and time-consuming. To this end, context-aware citation recommendation approaches that recommend publications as candidates for in-text citations have been developed. In this paper, we present C-Rex, a web-based demonstration system available at http://c-rex.org for context-aware citation recommendation based on the Neural Citation Network [5] and millions of publications from the Microsoft Academic Graph. Our system is one of the first online context-aware citation recommendation systems and the first to incorporate not only a deep learning recommendation approach, but also explanation components to help users better understand why papers were recommended. In our offline evaluation, our model performs similarly to the one presented in the original paper and can serve as a basic framework for further implementations. In our online evaluation, we found that the explanations of recommendations increased users’ satisfaction.

KITopen