A review of associative classification mining

Author: Thabtah Fadi
Publication date: 01/01/2007

Associative classification mining is a promising approach in data mining that uses association rule discovery techniques to construct classification systems, also known as associative classifiers. In the last few years, a number of associative classification algorithms have been proposed, e.g. CPAR, CMAR, MCAR, MMAC and others. These algorithms employ several different rule discovery, rule ranking, rule pruning, rule prediction and rule evaluation methods. This paper focuses on surveying and comparing the state-of-the-art associative classification techniques with regard to the above criteria. Finally, future directions in associative classification, such as incremental learning and mining low-quality data sets, are also highlighted in this paper.
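
The abstract above does not define how rules are scored, so the following is only a minimal sketch, assuming the standard support and confidence measures for a class association rule (antecedent items -> class label). The function name `rule_stats` and the toy rows are hypothetical and do not reproduce the procedure of CPAR, CMAR, MCAR or MMAC.

```python
from typing import FrozenSet, List, Tuple

# One training row: a set of attribute=value items plus a class label.
Row = Tuple[FrozenSet[str], str]


def rule_stats(rows: List[Row], antecedent: FrozenSet[str], label: str) -> Tuple[float, float]:
    """Return (support, confidence) of the rule `antecedent -> label`.

    support    = rows matching antecedent AND label / all rows
    confidence = rows matching antecedent AND label / rows matching antecedent
    """
    # Class labels of every row whose items contain the whole antecedent.
    covered = [cls for items, cls in rows if antecedent <= items]
    hits = sum(1 for cls in covered if cls == label)
    support = hits / len(rows) if rows else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


# Hypothetical toy data, for illustration only.
rows = [
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
    (frozenset({"outlook=sunny", "windy=yes"}), "stay"),
    (frozenset({"outlook=rain", "windy=no"}), "play"),
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
]
support, confidence = rule_stats(rows, frozenset({"outlook=sunny"}), "play")
```

An associative classifier would typically keep only rules whose support and confidence exceed user-chosen thresholds and then rank the survivors; the ranking and pruning criteria differ between the algorithms surveyed.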

CiteSeerX

University of Huddersfield Repository

Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerce

Author: Beel Joeran
Benjamin
Carbonell Jaime
Hidasi Balázs
Jannach Dietmar
Joachims Thorsten
Mikolov Tomas
Noia Tommaso Di
Volkovs Maksims
Publication venue: Association for Computing Machinery (ACM)
Publication date: 09/06/2020

In this paper, we present our work towards comparing on-line and off-line evaluation metrics in the context of small e-commerce recommender systems. Recommending in small e-commerce enterprises is rather challenging due to the lower volume of interactions and low user loyalty, rarely extending beyond a single session. On the other hand, we usually have to deal with lower volumes of objects, which are easier to discover by users through various browsing/searching GUIs. The main goal of this paper is to determine the applicability of off-line evaluation metrics in learning the true usability of recommender systems (evaluated on-line in A/B testing). In total, 800 variants of recommending algorithms were evaluated off-line w.r.t. 18 metrics covering rating-based, ranking-based, novelty and diversity evaluation. The off-line results were afterwards compared with on-line evaluation of 12 selected recommender variants and, based on the results, we tried to learn and utilize an off-line to on-line results prediction model. Off-line results showed a great variance in performance w.r.t. different metrics, with the Pareto front covering 68% of the approaches. Furthermore, we observed that on-line results are considerably affected by the novelty of users. On-line metrics correlate positively with ranking-based metrics (AUC, MRR, nDCG) for novice users, while too high values of diversity and novelty had a negative impact on the on-line results for them. For users with more visited items, however, diversity became more important, while the relevance of ranking-based metrics gradually decreased.

Comment: Submitted to ACM Hypertext 2020 Conference
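
Two of the ranking-based metrics named in the abstract above, MRR and nDCG, can be illustrated with a short sketch. It assumes binary relevance and the common log2 discount; the paper's exact variants are not given in the abstract, and the function names and example items are hypothetical.

```python
import math
from typing import Sequence, Set


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / position of the first relevant item, or 0.0 if none is ranked."""
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance nDCG@k with a 1 / log2(position + 1) discount."""
    dcg = sum(
        1.0 / math.log2(pos + 1)
        for pos, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    # Ideal ranking: all relevant items (at most k) placed first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 1) for pos in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical recommendation list for one user.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d"}
rr = reciprocal_rank(ranked, relevant)
ndcg = ndcg_at_k(ranked, relevant, k=4)
```

MRR is the mean of `reciprocal_rank` over all evaluated users or sessions; nDCG is usually averaged the same way.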

arXiv.org e-Print Archive

Crossref

Surrogate Functions for Maximizing Precision at the Top

Author: Jain Prateek
Kar Purushottam
Narasimhan Harikrishna
Publication date: 26/05/2015

The problem of maximizing precision at the top of a ranked list, often dubbed Precision@k (prec@k), finds relevance in myriad learning applications such as ranking, multi-label classification, and learning with severe label imbalance. However, despite its popularity, there exist significant gaps in our understanding of this problem and its associated performance measure. The most notable of these is the lack of a convex upper bounding surrogate for prec@k. We also lack scalable perceptron and stochastic gradient descent algorithms for optimizing this performance measure. In this paper we make key contributions in these directions. At the heart of our results is a family of truly upper bounding surrogates for prec@k. These surrogates are motivated in a principled manner and enjoy attractive properties such as consistency to prec@k under various natural margin/noise conditions. These surrogates are then used to design a class of novel perceptron algorithms for optimizing prec@k with provable mistake bounds. We also devise scalable stochastic gradient descent style methods for this problem with provable convergence bounds. Our proofs rely on novel uniform convergence bounds which require an in-depth analysis of the structural properties of prec@k and its surrogates. We conclude with experimental results comparing our algorithms with state-of-the-art cutting plane and stochastic gradient algorithms for maximizing prec@k.

Comment: To appear in the proceedings of the 32nd International Conference on Machine Learning (ICML 2015)
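
For readers unfamiliar with the performance measure itself, here is a minimal sketch that only evaluates prec@k for one scored list: the fraction of the k highest-scored items that are relevant. It assumes binary labels; it is not the paper's surrogate, perceptron or stochastic gradient method, and the function name and data are hypothetical.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of the k highest-scored items whose label is 1."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")
    # Indices sorted by descending score; sorted() is stable, so ties keep input order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in order[:k]) / k


# Hypothetical scores and binary relevance labels.
scores = [0.9, 0.8, 0.3, 0.7, 0.1]
labels = [1, 0, 1, 1, 0]
p_at_3 = precision_at_k(scores, labels, k=3)
```

Because the metric depends on the scores only through the sort order, it is piecewise constant in the model parameters, which is one reason surrogate losses are needed to optimize it with gradient-style methods.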

arXiv.org e-Print Archive

CiteSeerX

Information filtering via preferential diffusion

Author: C. N. Ziegler
J. Bennett
K. K. Rachuri
Linyuan Lü
S. M. McNee
Weiping Liu
Publication venue: American Physical Society (APS)
Publication date: 27/02/2011

Recommender systems have shown great potential to address the information overload problem, namely to help users find interesting and relevant objects within a huge information space. Some physical dynamics, including the heat conduction process and mass or energy diffusion on networks, have recently found applications in personalized recommendation. Most previous studies focus overwhelmingly on recommendation accuracy as the only important factor, while overlooking the significance of diversity and novelty, which indeed provide the vitality of the system. In this paper, we propose a recommendation algorithm based on the preferential diffusion process on a user-object bipartite network. Numerical analyses on two benchmark datasets, MovieLens and Netflix, indicate that our method outperforms the state-of-the-art methods. Specifically, it can not only provide more accurate recommendations, but also generate more diverse and novel recommendations by accurately recommending unpopular objects.

Comment: 12 pages, 10 figures, 2 tables

arXiv.org e-Print Archive

Crossref

RERO DOC Digital Library

Sound ranking algorithms for XML search

Author: Apers P.M.G.
Flokstra J.
Hiemstra D.
Klinger S.
Rode H.
Publication venue: University of Otago
Publication date: 01/01/2008

Ranking algorithms for XML should reflect the actual combined content and structure constraints of queries, while at the same time producing equal rankings for queries that are semantically equal. Ranking algorithms that produce different rankings for queries that are semantically equal are easily detected by tests on large databases: We call such algorithms not sound. We report the behavior of different approaches to ranking content-and-structure queries on pairs of queries for which we expect equal ranking results from the query semantics. We show that most of these approaches are not sound. Of the remaining approaches, only 3 adhere to the W3C XQuery Full-Text standard.

KOPS - The Institutional Repository of the University of Konstanz

CiteSeerX

University of Twente Research Information