LC an effective classification based association rule mining algorithm

Author: Mahmood Qazafi

Classification using association rules is a research field in data mining that primarily applies association rule discovery techniques to classification benchmarks. Many research studies have confirmed that classification using association tends to generate more predictive classification systems than traditional classification techniques such as probabilistic, statistical and decision tree approaches. In this thesis, we introduce a novel data mining algorithm based on classification using association called “Looking at the Class” (LC), which can be used for mining a range of classification data sets. Known algorithms in the classification using association approach, such as the Classification based on Association rule (CBA) system and the Classification based on Predictive Association (CPAR) system, merge disjoint items in the rule learning step without considering class label similarity. The proposed algorithm, by contrast, merges only items with identical class labels. This avoids many unnecessary item combinations during the rule learning step and consequently yields large savings in computational time and memory. Furthermore, the LC algorithm uses a novel prediction procedure that employs multiple rules, rather than a single rule, to make the prediction decision. The proposed algorithm has been evaluated thoroughly on real world security data sets collected using an automated tool developed at Huddersfield University. The security application considered in this thesis is categorizing websites as legitimate or fake based on their features, which is a typical binary classification problem. Experiments have also been conducted on a number of UCI data sets, and the evaluation measures are classification accuracy, memory usage, and others.
The results show that the LC algorithm outperformed traditional classification algorithms such as C4.5, PART and Naïve Bayes, as well as known classification based association algorithms like CBA, with respect to classification accuracy, memory usage, and execution time on most of the data sets we consider.

University of Huddersfield Repository

Web usage mining for click fraud detection

Author: Neves André Pacheco Pereira
Publication date: 01/01/2010

Internship carried out at AuditMark and supervised by Eng. Pedro Fortuna. Integrated master's thesis. Informatics and Computing Engineering. Faculty of Engineering. University of Porto. 201

Repositório Aberto da Universidade do Porto

Data stream mining: from theory to applications and from stationary to mobile

Author: Gaber M.
Gama J.
Krishnaswamy S.
Publication date: 22/03/2010

Portsmouth University Research Portal (Pure)

Proceedings of the ECMLPKDD 2015 Doctoral Consortium

Author: Hollmén Jaakko (editor)
Papapetrou Panagiotis (editor)
Publication venue: Aalto-yliopisto
Publication date: 01/01/2015

The ECMLPKDD 2015 Doctoral Consortium was organized for the second time as part of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD), held in Porto during September 7-11, 2015. The objective of the doctoral consortium is to provide an environment for students to exchange their ideas and experiences with peers in an interactive atmosphere and to get constructive feedback from senior researchers in machine learning, data mining, and related areas. These proceedings collect and document all the contributions of the ECMLPKDD 2015 Doctoral Consortium.

Aaltodoc Publication Archive

SHELDON Smart habitat for the elderly.

Author: Burnard Micheal
Isaacson Michal
Kaner Jake
Lameski Petre
Maestre Rafael
Maresova Petra
Melero Francisco
Taveter Kuldar
Tomesone Signe
Publication venue: Cetem Technology
Publication date: 12/06/2019

An insightful document on active and assisted living from different perspectives: furniture and habitat, ICT solutions, and healthcare.

Bucks New University: Bucks Knowledge Archive

Using contextual information to understand searching and browsing behavior

Author: Kiseleva Y.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2015

There is a great imbalance between the richness of information on the web and the succinctness and poverty of web users' search requests, which makes their queries only a partial description of the underlying complex information needs. Finding ways to better leverage contextual information and make search context-aware holds the promise of dramatically improving the search experience of users. We conducted a series of studies to discover, model and utilize contextual information in order to understand and improve users' searching and browsing behavior on the web. Our results capture important aspects of context under the realistic conditions of different online search services, aiming to ensure that our scientific insights and solutions transfer to the operational settings of real world applications.

Repository TU/e

Crossref

Pure OAI Repository

Mining Butterflies in Streaming Graphs

Author: Sheshbolouki Aida
Publication venue: University of Waterloo
Publication date: 15/05/2023

This thesis introduces two main-memory systems, sGrapp and sGradd, for performing the fundamental analytic tasks of biclique counting and concept drift detection over a streaming graph. A data-driven heuristic is used to architect the systems. To this end, the growth patterns of bipartite streaming graphs are first mined and the emergence principles of streaming motifs are discovered. Next, the discovered principles are (a) explained by a graph generator called sGrow; and (b) utilized to establish the requirements for efficient, effective, explainable, and interpretable management and processing of streams. sGrow is used to benchmark stream analytics, particularly in the case of concept drift detection. sGrow displays robust realization of streaming growth patterns independent of initial conditions, scale and temporal characteristics, and model configurations. Extensive evaluations confirm the simultaneous effectiveness and efficiency of sGrapp and sGradd. sGrapp achieves a mean absolute percentage error of up to 0.05/0.14 for the cumulative butterfly count in streaming graphs with uniform/non-uniform temporal distribution and a processing throughput of 1.5 million data records per second. The throughput and estimation error of sGrapp are 160x higher and 0.02x lower than baselines. sGradd demonstrates improving performance over time, achieves zero false detection rates when there is no drift and when drift is already detected, and detects sequential drifts within zero to a few seconds of their occurrence, regardless of drift intervals.
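For readers unfamiliar with the term, a butterfly is a (2,2)-biclique: two vertices on one side of a bipartite graph that are both connected to the same two vertices on the other side. The sketch below is not sGrapp, which estimates the count over a stream. It is only a minimal exact counter for a small static edge list, and the function name `count_butterflies` and the sample edges are hypothetical illustrations.

```python
from collections import defaultdict
from itertools import combinations


def count_butterflies(edges):
    """Exact butterfly ((2,2)-biclique) count for a static bipartite edge list.

    Illustrative only: this is a brute-force static count, not the
    streaming approximation described in the abstract.
    """
    # Map each left-side vertex to the set of right-side vertices it touches.
    neighbors = defaultdict(set)
    for left, right in edges:
        neighbors[left].add(right)

    total = 0
    # Two left vertices that share k right neighbors form k*(k-1)/2 butterflies.
    for u, v in combinations(neighbors, 2):
        shared = len(neighbors[u] & neighbors[v])
        total += shared * (shared - 1) // 2
    return total


# Hypothetical sample: u1 and u2 both connect to a and b, giving one butterfly.
edges = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u3", "b")]
print(count_butterflies(edges))
```

The pairwise loop is quadratic in the number of left-side vertices, which is part of why approximate streaming methods are of interest at scale.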

University of Waterloo's Institutional Repository