7 research outputs found

Tuning Word2vec for Large Scale Recommendation Systems

Author: Bodon Ferenc
Mnih Andriy
Rehurek Radim
Shahriari Bobak
Turrin Roberto
Zhao Kui
Publication venue: Association for Computing Machinery (ACM)
Publication date: 24/09/2020

Word2vec is a powerful machine learning tool that emerged from Natural Language Processing (NLP) and is now applied in multiple domains, including recommender systems, forecasting, and network analysis. As Word2vec is often used off the shelf, we address the question of whether the default hyperparameters are suitable for recommender systems. The answer is emphatically no. In this paper, we first elucidate the importance of hyperparameter optimization and show that unconstrained optimization yields an average 221% improvement in hit rate over the default parameters. However, unconstrained optimization leads to hyperparameter settings that are very expensive and not feasible for large scale recommendation tasks. To this end, we demonstrate 138% average improvement in hit rate with a runtime budget-constrained hyperparameter optimization. Furthermore, to make hyperparameter optimization applicable for large scale recommendation problems where the target dataset is too large to search over, we investigate generalizing hyperparameters settings from samples. We show that applying constrained hyperparameter optimization using only a 10% sample of the data still yields a 91% average improvement in hit rate over the default parameters when applied to the full datasets. Finally, we apply hyperparameters learned using our method of constrained optimization on a sample to the Who To Follow recommendation service at Twitter and are able to increase follow rates by 15%.

Comment: 11 pages, 4 figures, Fourteenth ACM Conference on Recommender System
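The evaluation idea in the abstract above (hit rate as the metric, improvement measured relative to the defaults, and a runtime budget that rules out expensive settings) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the trial dictionary keys (`hit_rate`, `runtime_seconds`), and any numbers passed in are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def hit_rate_at_k(recommended: Sequence[Sequence[str]],
                  held_out: Sequence[str],
                  k: int = 10) -> float:
    """Fraction of test cases whose held-out item is in the top-k list."""
    if not held_out:
        return 0.0
    hits = sum(1 for recs, target in zip(recommended, held_out)
               if target in recs[:k])
    return hits / len(held_out)


def relative_improvement(baseline: float, tuned: float) -> float:
    """Percentage improvement of `tuned` over `baseline`."""
    return 100.0 * (tuned - baseline) / baseline


def best_within_budget(trials: List[Dict],
                       budget_seconds: float) -> Optional[Dict]:
    """Best trial by hit rate among those whose runtime fits the budget.

    Each trial is a hypothetical record with keys "params",
    "hit_rate" and "runtime_seconds". Returns None if nothing fits.
    """
    feasible = [t for t in trials if t["runtime_seconds"] <= budget_seconds]
    if not feasible:
        return None
    return max(feasible, key=lambda t: t["hit_rate"])
```

In the sampled setting the abstract describes, the trials would be run on a subset of the data and the winning parameters then reused on the full dataset; that step is outside this sketch.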

arXiv.org e-Print Archive

Crossref

Hatékony algoritmusok = Efficient algorithms

Author: Benczúr András
Bodon Ferenc
Demetrovics János
Fogaras Dániel
Friedl Katalin
Friedman Anna Eszter
Ivanyos Gábor
Lukács András
Marx Dániel
Pintér Márta
Rónyai Lajos
Publication venue: OTKA
Publication date: 01/01/2007

With the partial support of the present grant, we have achieved new results in several fields of computer science, including algebraic and symbolic computation, theoretical computer science, combinatorial optimization, database theory, data mining, and algorithms for the internet. Some of the highlights are:
-- a description of Gröbner bases and related structures attached to finite sets of points, where the point sets have combinatorial significance,
-- a comparison of some models of quantum computation from the perspective of computing power, and the development of new quantum algorithms,
-- publication of the monograph "Database structures" (in Hungarian), which won the Quality Prize of the Akadémiai Kiadó,
-- significant advances in several directions connected to searching the internet: we proposed new, efficient methods for obtaining a personalized ranking of web pages; we proposed algorithms for the automatic and highly reliable detection of spam links on the web; we developed an experimental search engine,
-- development and applications of new algorithms for several data mining tasks; among the applications, the most important is a model of telecommunication customer behaviour, elaborated in a joint project with the renowned group of Albert László Barabási; The New York Times, among others, reported on some of our findings.

Repository of the Academy's Library

Informatikai algoritmusok 3.

Author: Balogh Ádám
Bodon Ferenc
Demetrovics János
Elek István
Fogaras Dániel
Galántai Aurél
Gombos Gergő
Iványi Antal
Kiss Attila
Kósa Balázs
Lukács András
Matuszka Tamás
Miklós István
Pinczel Balázs
Rácz Gábor
Sali Attila
Sidló Csaba
Szirmay-Kalos László
Ulrich Tamm
Publication venue: Hountler Kft.
Publication date: 01/01/2015

The book was produced with the support of the Hungarian Academy of Sciences.

ELTE Digital Institutional Repository (EDIT)

A fast apriori implementation

Author: Ferenc Bodon

The efficiency of frequent itemset mining algorithms is determined mainly by three factors: the way candidates are generated, the data structure that is used, and the implementation details. Most papers focus on the first factor, some describe the underlying data structures, but implementation details are almost always neglected. In this paper we show that the effect of implementation can be more important than the selection of the algorithm. Ideas that seem quite promising may turn out to be ineffective once we descend to the implementation level. We theoretically and experimentally analyze APRIORI, which is the most established algorithm for frequent itemset mining. Several implementations of the algorithm have been put forward in the last decade. Although they are implementations of the very same algorithm, they display large differences in running time and memory requirements. In this paper we describe an implementation of APRIORI that outperforms all implementations known to us. We analyze, theoretically and experimentally, the principal data structure of our solution. This data structure is the main factor in the efficiency of our implementation. Moreover, we present a simple modification of APRIORI that appears to be faster than the original algorithm.
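For readers unfamiliar with APRIORI, the level-wise scheme the abstract refers to (generate candidates from the previous level, prune, then count support) can be sketched as below. This is a plain textbook version using Python sets, written for illustration only; it is not the implementation described in the paper and does not use the paper's data structure, which the abstract does not name. The function name `apriori` and its parameters are assumptions of this sketch.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set


def apriori(transactions: Iterable[Iterable[str]],
            min_support: int) -> Dict[FrozenSet[str], int]:
    """Map every itemset found in >= min_support transactions to its count."""
    baskets: List[Set[str]] = [set(t) for t in transactions]

    # Level 1: count single items.
    counts: Dict[FrozenSet[str], int] = {}
    for basket in baskets:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets, then prune
        # any candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent
                for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Support counting: one pass over the transactions per level.
        counts = {c: 0 for c in candidates}
        for basket in baskets:
            for c in candidates:
                if c <= basket:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result
```

The support-counting loop is deliberately naive (every candidate is tested against every transaction); the abstract's point is precisely that choices at this level dominate running time and memory use in practice.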

CiteSeerX

Automatic discovery of locally frequent itemsets in the presence of highly frequent itemsets

Author: Athanasios K. Tsakalidis
Christos H. Makris
Ferenc Bodon
Ioannis N. Kouris
Publication venue: IOS Press

Crossref