The Bases of Association Rules of High Confidence

Author: Adaricheva Kira
Cabot-Miller Justin
Nation J. B.
Segal Oren
Sharafudinov Anuar
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 05/08/2018

We develop a new approach for distributed computing of the association rules of high confidence in a binary table. It is derived from the D-basis algorithm in K. Adaricheva and J.B. Nation (TCS 2017), which is performed on multiple sub-tables of a table, each obtained by removing several rows at a time. The resulting set of rules is then aggregated using the same approach by which the D-basis is retrieved from a larger set of implications. This allows us to obtain a basis of association rules of high confidence, which can be used for ranking all attributes of the table with respect to a given fixed attribute using the relevance parameter introduced in K. Adaricheva et al. (Proceedings of ICFCA-2015). This paper focuses on the technical implementation of the new algorithm. Test results are reported for transaction data and medical data. Comment: Presented at DTMN, Sydney, Australia, July 28, 201
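As a rough illustration of the terms used in this abstract, the sketch below computes the confidence of a rule in a small binary table and shows how removing a row can turn a high-confidence rule into an exact implication on the sub-table. The table, the helper names `support` and `confidence`, and the choice of row to remove are all hypothetical; this is not the D-basis algorithm or the aggregation step from the paper.

```python
def support(rows, attrs):
    """Count the rows in which every attribute in `attrs` equals 1."""
    return sum(1 for row in rows if all(row[a] for a in attrs))


def confidence(rows, premise, conclusion):
    """conf(premise -> conclusion) = supp(premise and conclusion) / supp(premise)."""
    base = support(rows, premise)
    if base == 0:
        return None  # no row satisfies the premise, so confidence is undefined
    return support(rows, list(premise) + [conclusion]) / base


# Hypothetical binary table: 5 rows (objects) over attributes 0, 1, 2.
table = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
]

# On the full table the rule {0, 1} -> 2 holds in 2 of the 3 matching rows.
full = confidence(table, [0, 1], 2)

# Dropping the single counterexample (row index 2) gives a sub-table
# on which the same rule is an exact implication.
sub_table = [row for i, row in enumerate(table) if i != 2]
exact = confidence(sub_table, [0, 1], 2)

print(full, exact)
```

The point of the example is only that an implication valid on a sub-table with a few rows removed corresponds to a rule of high, but not necessarily full, confidence on the original table.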

arXiv.org e-Print Archive

Crossref

What attracts vehicle consumers’ buying: A Saaty scale-based VIKOR (SSC-VIKOR) approach from after-sales textual perspective?

Author: He Yandong
Lim Ming
Pratap Saurabh
Zhou Fuli
Publication venue: 'Emerald'
Publication date: 01/01/2019

Purpose: The booming development of e-commerce has encouraged vehicle consumers to post individual reviews on online forums. The purpose of this paper is to probe vehicle consumers' consumption behavior and to make recommendations for potential consumers from the viewpoint of textual comments. Design/methodology/approach: A big data analytics-based approach is designed to discover vehicle consumer consumption behavior from an online perspective. To reduce the subjectivity of expert-based approaches, a parallel Naïve Bayes approach is designed to perform the sentiment analysis, and the Saaty scale-based (SSC) scoring rule is employed to obtain the specific sentiment value of each attribute class, contributing to multi-grade sentiment classification. To achieve intelligent recommendation for potential vehicle customers, a novel SSC-VIKOR approach is developed to prioritize vehicle brand candidates from a big data analytics viewpoint. Findings: The big data analytics indicate that “cost-effectiveness” is the factor that vehicle consumers care about most, and the data mining results enable automakers to better understand consumer consumption behavior. Research limitations/implications: The case study illustrates the effectiveness of the integrated method, which contributes to more precise operations management in marketing strategy, quality improvement and intelligent recommendation. Originality/value: Research on consumer consumption behavior is usually based on survey methods, and most previous studies of comment analysis focus on binary analysis. The hybrid SSC-VIKOR approach is developed to fill this gap from the big data perspective.
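For readers unfamiliar with VIKOR, the following is a minimal sketch of the standard VIKOR ranking step, assuming benefit-type criteria and no constant criterion column. It is not the SSC-VIKOR variant from the paper: the brand scores, the equal weights, and the function name `vikor` are hypothetical, and the scores merely stand in for sentiment values on a 1 to 9 scale.

```python
def vikor(matrix, weights, v=0.5):
    """Return VIKOR Q scores, one per alternative (lower is better).

    Assumes every criterion is a benefit criterion (larger is better) and
    that no criterion has identical values for all alternatives.
    """
    n_crit = len(weights)
    best = [max(row[j] for row in matrix) for j in range(n_crit)]
    worst = [min(row[j] for row in matrix) for j in range(n_crit)]

    group_utility, regret = [], []
    for row in matrix:
        terms = [
            weights[j] * (best[j] - row[j]) / (best[j] - worst[j])
            for j in range(n_crit)
        ]
        group_utility.append(sum(terms))  # S_i: weighted total distance to the ideal
        regret.append(max(terms))         # R_i: worst single-criterion distance

    s_min, s_max = min(group_utility), max(group_utility)
    r_min, r_max = min(regret), max(regret)
    return [
        v * (s - s_min) / (s_max - s_min) + (1 - v) * (r - r_min) / (r_max - r_min)
        for s, r in zip(group_utility, regret)
    ]


# Hypothetical scores for three brands on two attribute classes.
brands = ["A", "B", "C"]
scores = vikor([[9, 8], [5, 5], [7, 9]], weights=[0.5, 0.5])
ranking = [b for _, b in sorted(zip(scores, brands))]
print(ranking)
```

The parameter `v` trades off overall group utility against the worst individual shortfall; 0.5 is a common default, used here only as an assumption.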

Coventry University Pure Portal

Enlighten

AUC Optimisation and Collaborative Filtering

Author: Clemencon Stephan
Dhanjal Charanpal
Gaudel Romaric
Publication date: 21/08/2015

In recommendation systems, one is interested in the ranking of the predicted items as opposed to other losses such as the mean squared error. Although a variety of ways to evaluate rankings exist in the literature, here we focus on the Area Under the ROC Curve (AUC) as it is widely used and has a strong theoretical underpinning. In practical recommendation, only items at the top of the ranked list are presented to the users. With this in mind, we propose a class of objective functions over matrix factorisations which primarily represent a smooth surrogate for the real AUC, and in a special case we show how to prioritise the top of the list. The objectives are differentiable and optimised through a carefully designed stochastic gradient-descent-based algorithm which scales linearly with the size of the data. In the special case of square loss we show how to improve computational complexity by leveraging previously computed measures. To understand theoretically the underlying matrix factorisation approaches we study both the consistency of the loss functions with respect to AUC, and generalisation using Rademacher theory. The resulting generalisation analysis gives strong motivation for the optimisation under study. Finally, we provide computational results as to the efficacy of the proposed method using synthetic and real data.
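To make the idea of a smooth surrogate for AUC concrete, here is a small sketch that computes the empirical AUC over (relevant, irrelevant) score pairs and one possible differentiable stand-in for it, a pairwise logistic loss. The logistic choice, the function names, and the example scores are assumptions for illustration; the abstract describes a class of objectives over matrix factorisations and does not say this particular loss is the one used.

```python
import math


def auc(pos_scores, neg_scores):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def logistic_surrogate(pos_scores, neg_scores):
    """Smooth pairwise loss: mean of log(1 + exp(-(p - n))) over all pairs.

    The step function inside AUC is replaced by a differentiable penalty
    that shrinks as the positive item outscores the negative one.
    """
    pairs = [(p, n) for p in pos_scores for n in neg_scores]
    return sum(math.log1p(math.exp(-(p - n))) for p, n in pairs) / len(pairs)


# Hypothetical predicted scores for one user.
relevant = [0.9, 0.4]
irrelevant = [0.5, 0.1]

print(auc(relevant, irrelevant))                 # 3 of 4 pairs are ordered correctly
print(logistic_surrogate(relevant, irrelevant))
```

Raising the score of the misranked relevant item (0.4) lowers the surrogate smoothly, whereas the exact AUC only changes when that item actually overtakes an irrelevant one.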

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Palgol: A High-Level DSL for Vertex-Centric Graph Processing with Remote Data Access

Author: D Yan
HN Gabow
JY Halpern
LG Valiant
M Lesniak
M Xie
S Salihoglu
Y Tian
Publication date: 06/09/2017

Pregel is a popular distributed computing model for dealing with large-scale graphs. However, it can be tricky to implement graph algorithms correctly and efficiently in Pregel's vertex-centric model, especially when the algorithm has multiple computation stages, complicated data dependencies, or even communication over dynamic internal data structures. Some domain-specific languages (DSLs) have been proposed to provide more intuitive ways to implement graph algorithms, but due to the lack of support for remote access (reading or writing attributes of other vertices through references) they cannot handle the above-mentioned dynamic communication, making a class of Pregel algorithms with fast convergence impossible to implement. To address this problem, we design and implement Palgol, a more declarative and powerful DSL which supports remote access. In particular, programmers can use a more declarative syntax called chain access to naturally specify dynamic communication as if directly reading data on arbitrary remote vertices. By analyzing the logic patterns of chain access, we provide a novel algorithm for compiling Palgol programs to efficient Pregel code. We demonstrate the power of Palgol by using it to implement several practical Pregel algorithms, and the evaluation result shows that the efficiency of Palgol is comparable with that of hand-written code. Comment: 12 pages, 10 figures, extended version of APLAS 2017 paper
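The vertex-centric model mentioned in this abstract can be pictured with a single-machine simulation: in each superstep, every vertex reads its incoming messages, updates its own state, and sends messages only to its neighbours. The sketch below labels connected components by propagating the minimum vertex id. It is a hypothetical toy, not Pregel's API and not Palgol syntax, and it does not show remote access or chain access.

```python
def connected_components(vertices, edges):
    """Label each vertex with the smallest vertex id in its component.

    Simulates Pregel-style supersteps: vertices only see messages sent to
    them in the previous superstep and only message direct neighbours.
    """
    neighbours = {v: set() for v in vertices}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    label = {v: v for v in vertices}

    # Superstep 0: every vertex announces its own label to its neighbours.
    inbox = {v: [label[u] for u in neighbours[v]] for v in vertices}

    # Later supersteps run until no messages are in flight.
    while any(inbox.values()):
        outbox = {v: [] for v in vertices}
        for v in vertices:
            if inbox[v] and min(inbox[v]) < label[v]:
                label[v] = min(inbox[v])
                for u in neighbours[v]:
                    outbox[u].append(label[v])
            # A vertex whose label did not improve sends nothing
            # (the analogue of voting to halt).
        inbox = outbox
    return label


labels = connected_components([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)])
print(labels)
```

In this simple scheme a label travels one edge per superstep, which is the kind of slow, neighbour-only communication that faster-converging algorithms avoid by reading data on non-adjacent vertices.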

arXiv.org e-Print Archive

Crossref

Efficient Large-scale Approximate Nearest Neighbor Search on the GPU

Author: Lensch Hendrik P. A.
Sorkine-Hornung Alexander
Wang Oliver
Wieschollek Patrick
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

We present a new approach for efficient approximate nearest neighbor (ANN) search in high-dimensional spaces, extending the idea of Product Quantization. We propose a two-level product and vector quantization tree that reduces the number of vector comparisons required during tree traversal. Our approach also includes a novel, highly parallelizable re-ranking method for candidate vectors that efficiently reuses already computed intermediate values. Due to its small memory footprint during traversal, the method lends itself to an efficient, parallel GPU implementation. This Product Quantization Tree (PQT) approach significantly outperforms recent state-of-the-art methods for high-dimensional nearest neighbor queries on standard reference datasets. Ours is the first work that demonstrates GPU performance superior to CPU performance on high-dimensional, large-scale ANN problems in time-critical real-world applications, like loop-closing in videos.
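As background for the Product Quantization idea this abstract builds on, the sketch below encodes a vector as one centroid index per sub-space and estimates a query distance from per-sub-space lookup tables. It shows plain product quantization only, not the two-level PQT, the re-ranking step, or any GPU code. The codebooks are hand-picked rather than learned (in practice they are typically fit with k-means), and all names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vec, m):
    """Split `vec` into m contiguous sub-vectors of equal length."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]


def encode(vec, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    return [
        min(range(len(book)), key=lambda k: sq_dist(sub, book[k]))
        for sub, book in zip(split(vec, len(codebooks)), codebooks)
    ]


def asymmetric_distance(query, code, codebooks):
    """Approximate squared distance from a raw query to an encoded vector."""
    subs = split(query, len(codebooks))
    # One table per sub-space: distance from the query part to each centroid.
    # The tables depend only on the query, so they can be reused for every
    # encoded database vector.
    tables = [[sq_dist(sub, c) for c in book] for sub, book in zip(subs, codebooks)]
    return sum(table[k] for table, k in zip(tables, code))


# Hypothetical codebooks: 2 sub-spaces of dimension 2, 2 centroids each.
codebooks = [
    [[0, 0], [1, 1]],
    [[0, 0], [2, 2]],
]

code = encode([0.9, 1.1, 1.8, 2.1], codebooks)
approx = asymmetric_distance([0, 0, 0, 0], code, codebooks)
print(code, approx)
```

Storing only the small integer code per database vector is what keeps the memory footprint low; the distance estimate is then a sum of table lookups rather than a full vector comparison.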

arXiv.org e-Print Archive

Crossref

MPG.PuRe