1,300 research outputs found

CASP-DM: Context Aware Standard Process for Data Mining

Author: Contreras-Ochando Lidia
Ferri Cèsar
Flach Peter
Hernández-Orallo José
Kull Meelis
Lachiche Nicolas
Martínez-Plumed Fernando
Ramírez-Quintana María José
Publication date: 19/09/2017

We propose an extension of the Cross Industry Standard Process for Data Mining (CRISP-DM) that addresses specific challenges of machine learning and data mining in handling context and model reuse. This new general context-aware process model is mapped to the CRISP-DM reference model, proposing some new or enhanced outputs.

arXiv.org e-Print Archive

Explore Bristol Research

Working Paper 08-00 - The NIME Model - Specification and Estimation of the Demand Equations of the Household Sector

Author: Eric Meyermans
Patrick Van Brusselen

Research Papers in Economics

Mining geo-referenced databases: a way to improve decision-making

Author: Amaral Luís
Santos Maribel Yasmina
Publication venue: Idea Group Publishing
Publication date: 01/01/2005

Knowledge discovery in databases is a process that aims at the discovery of associations within data sets. The analysis of geo-referenced data demands a particular approach in this process. This chapter presents a new approach to the process of knowledge discovery, in which qualitative geographic identifiers give the positional aspects of geographic data. Those identifiers are manipulated using qualitative reasoning principles, which allows for the inference of new spatial relations required for the data mining step of the knowledge discovery process. The efficacy and usefulness of the implemented system, PADRÃO, have been tested with a bank dataset. The results support the claim that traditional knowledge discovery systems, developed for relational databases and lacking semantic knowledge linked to spatial data, can be used for knowledge discovery in geo-referenced databases, since some of this semantic knowledge and the principles of qualitative spatial reasoning are available as spatial domain knowledge.

Universidade do Minho: RepositoriUM

Combining data mining and text mining for detection of early stage dementia: the SAMS framework

Author: Asfiandy Dommy
Ballard Clive
Bull Christopher Neil
Burns Alistair
Couth Samuel
Gledson Ann
Keane John
Leroi Iracema
Mellor Joseph
Rayson Paul Edward
Sawyer Peter Harvey
Stringer Gemma
Sutcliffe Alistair Gordon Simpson
Zeng Xiao-Jun
Publication venue: European Language Resources Association (ELRA)
Publication date: 23/05/2016

In this paper, we describe the open-source SAMS framework, whose novelty lies in bringing together both data collection (keystrokes, mouse movements, application pathways) and text collection (email, documents, diaries) and analysis methodologies. The aim of SAMS is to provide a non-invasive method for large-scale collection, secure storage, retrieval and analysis of an individual’s computer usage for the detection of cognitive decline, and to infer whether this decline is consistent with the early stages of dementia. The framework will allow evaluation and study by medical professionals in which data and textual features can be linked to deficits in cognitive domains that are characteristic of dementia. Having described requirements gathering and ethical concerns in previous papers, here we focus on the implementation of the data and text collection components.

Lancaster E-Prints

Finding and tracking multi-density clusters in an online dynamic data stream

Author: Fahy Conor
Yang Shengxiang
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 05/05/2019

Change is one of the biggest challenges in dynamic stream mining. From a data-mining perspective, adapting to and tracking change is desirable in order to understand how and why change has occurred. Clustering, a form of unsupervised learning, can be used to identify the underlying patterns in a stream. Density-based clustering identifies clusters as areas of high density separated by areas of low density. This paper proposes a Multi-Density Stream Clustering (MDSC) algorithm to address two problems: the multi-density problem and the problem of discovering and tracking changes in a dynamic stream. MDSC consists of two on-line components: discovered, labelled clusters and an outlier buffer. Incoming points are assigned to a live cluster or passed to the outlier buffer. New clusters are discovered in the buffer using an ant-inspired swarm intelligence approach. Each newly discovered cluster is uniquely labelled and added to the set of live clusters. Processed data is subject to an ageing function and will disappear when it is no longer relevant. MDSC is shown to compare favourably with state-of-the-art peer stream-clustering algorithms on a range of real and synthetic data streams. Experimental results suggest that MDSC can discover qualitatively useful patterns while being scalable and robust to noise.
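
The assign-or-buffer step and the ageing step described in this abstract can be illustrated with a minimal Python sketch. It is not the MDSC algorithm itself: the fixed `radius`, the centre-based cluster representation, the `max_age` cut-off, and the function names `assign_point` and `age_buffer` are all hypothetical simplifications, and the ant-inspired discovery of new clusters in the buffer is left out.

```python
import math

def assign_point(point, t, clusters, buffer, radius=1.0):
    """Return the label of the nearest live cluster within `radius`.

    If no cluster is close enough, the point is stored in the outlier
    buffer with its arrival time `t` and None is returned.
    (Hypothetical rule: the abstract does not give the assignment test.)
    """
    best_label, best_dist = None, radius
    for label, centre in clusters.items():
        d = math.dist(point, centre)
        if d <= best_dist:
            best_label, best_dist = label, d
    if best_label is None:
        buffer.append((t, point))
    return best_label

def age_buffer(buffer, now, max_age=5):
    """Keep only buffered points that arrived within the last `max_age` steps."""
    return [(t, p) for t, p in buffer if now - t <= max_age]

# Two live clusters, represented here only by their centres (an assumption).
clusters = {"A": (0.0, 0.0), "B": (5.0, 5.0)}
buffer = []
stream = [(0.2, 0.1), (4.8, 5.1), (10.0, 10.0)]
labels = [assign_point(p, t, clusters, buffer) for t, p in enumerate(stream)]
buffer_after = age_buffer(buffer, now=10)
```

In this toy run the first two points join clusters A and B, the third goes to the buffer, and the buffered point has aged out by time step 10.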

De Montfort University Open Research Archive

Multi-Behavior Recommendation with Cascading Graph Convolution Networks

Author: Cheng Zhiyong
Gao Zan
Han Sai
Liu Fan
Peng Yuxin
Zhu Lei
Publication venue: Association for Computing Machinery (ACM)
Publication date: 28/03/2023

Multi-behavior recommendation, which exploits auxiliary behaviors (e.g., click and cart) to help predict users' potential interactions on the target behavior (e.g., buy), is regarded as an effective way to alleviate the data sparsity or cold-start issues in recommendation. Multiple behaviors often occur in a certain order in real-world applications (e.g., click>cart>buy). In a behavior chain, a later behavior usually exhibits a stronger signal of user preference than an earlier one does. Most existing multi-behavior models fail to capture such dependencies in a behavior chain for embedding learning. In this work, we propose a novel multi-behavior recommendation model with cascading graph convolution networks (named MB-CGCN). In MB-CGCN, the embeddings learned from one behavior are used as the input features for the next behavior's embedding learning after a feature transformation operation. In this way, our model explicitly utilizes the behavior dependencies in embedding learning. Experiments on two benchmark datasets demonstrate the effectiveness of our model in exploiting multi-behavior data. It outperforms the best baseline by 33.7% and 35.9% on average over the two datasets in terms of Recall@10 and NDCG@10, respectively. Comment: Accepted by WWW 202
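
The cascading idea in this abstract (embeddings from one behavior become the input to the next after a transformation) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MB-CGCN implementation: `propagate` is a stand-in mean-aggregation step rather than the paper's graph convolution, `transform` is a hypothetical scalar scaling rather than a learned feature transformation, and the tiny graphs are made up.

```python
def propagate(emb, edges):
    """One mean-aggregation step over an undirected graph (each node includes itself)."""
    neighbours = {n: [n] for n in emb}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    dim = len(next(iter(emb.values())))
    return {
        n: [sum(emb[m][k] for m in ns) / len(ns) for k in range(dim)]
        for n, ns in neighbours.items()
    }

def transform(emb, scale):
    """Stand-in for the feature transformation: a hypothetical scalar scaling."""
    return {n: [scale * x for x in vec] for n, vec in emb.items()}

def cascade(initial, behaviour_edges, scale=1.0):
    """Process behaviours in chain order, feeding each output into the next."""
    current, per_behaviour = initial, []
    for edges in behaviour_edges:
        learned = propagate(current, edges)
        per_behaviour.append(learned)
        current = transform(learned, scale)  # input features for the next behaviour
    return per_behaviour

# One user, two items; edges are user-item interactions per behaviour (made-up data).
initial = {"u1": [1.0, 0.0], "i1": [0.0, 1.0], "i2": [0.0, 0.0]}
chain = [[("u1", "i1"), ("u1", "i2")], [("u1", "i1")]]  # click graph, then buy graph
click_emb, buy_emb = cascade(initial, chain)
```

Here `buy_emb` is computed from the click-stage embeddings rather than from `initial`, which is the dependency the abstract describes.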

arXiv.org e-Print Archive

The User Rights Database: Measuring the Impact of Copyright Balance

Author: Flynn Sean
Palmedo Michael
Publication venue: Digital Commons @ American University Washington College of Law
Publication date: 01/01/2019

International and domestic copyright law reform around the world is increasingly focused on how copyright user rights should be expanded to promote maximum creativity and access to knowledge in the digital age. These efforts are guided by a relatively rich theoretical literature. However, few empirical studies explore the social and economic impact of expanding user rights in the digital era. One reason for this gap has been the absence of a tool measuring the key independent variable – changes in copyright user rights over time and between countries. We developed such a tool, which we call the “User Rights Database.” This paper describes the methodology used to create the Database and the results of empirical tests using it. We find that all of the countries in our study are trending toward more open copyright user rights over time, but the wealthy countries in our sample are about thirty years ahead of developing countries on this measure. We find evidence of benefits that more open copyright user rights generate, including the development of high technology industries and scholarly publication. We do not find evidence that opening user rights causes harm to the revenue of copyright-intensive industries like publishing and entertainment.

Digital Commons @ American University Washington College of Law

The Impact of Copyright Exceptions for Researchers on Scholarly Output

Author: Palmedo Michael
Publication venue: Digital Commons @ American University Washington College of Law
Publication date: 20/12/2017

High prices restrict access to academic journals and books that scholars rely upon to author new research. One possible solution is the expansion of copyright exceptions allowing unauthorized access to copyrighted works for researchers. I test the link between copyright exceptions for health and science researchers and their publishing output at the country-subject level. I find that scientists residing in countries that implement more robust research exceptions publish more papers and books in subsequent years. This relationship between copyright exceptions and publishing is stronger in lower-income countries, and stronger where there is stricter copyright protection of existing works.

Digital Commons @ American University Washington College of Law