12,438 research outputs found

Statistical Significance of the Netflix Challenge

Author: Feuerverger Andrey
He Yu
Khatri Shashi
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 24/07/2012

Inspired by the legacy of the Netflix contest, we provide an overview of what has been learned, from our own efforts and those of others, concerning the problems of collaborative filtering and recommender systems. The data set consists of about 100 million movie ratings (from 1 to 5 stars) involving some 480 thousand users and some 18 thousand movies; the associated ratings matrix is about 99% sparse. The goal is to predict ratings that users will give to movies; systems which can do this accurately have significant commercial applications, particularly on the world wide web. We discuss, in some detail, approaches to "baseline" modeling, singular value decomposition (SVD), as well as kNN (nearest neighbor) and neural network models; temporal effects, cross-validation issues, ensemble methods and other considerations are discussed as well. We compare existing models in a search for new models, and also discuss the mission-critical issues of penalization and parameter shrinkage which arise when the dimension of a parameter space reaches into the millions. Although much work on such problems has been carried out by the computer science and machine learning communities, our goal here is to address a statistical audience, and to provide a primarily statistical treatment of the lessons that have been learned from this remarkable set of data.

Comment: Published at http://dx.doi.org/10.1214/11-STS368 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org
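As a rough check of the sparsity figure quoted in this abstract, the following minimal sketch computes the fraction of empty cells in the ratings matrix. It assumes the rounded counts given above (100 million ratings, 480 thousand users, 18 thousand movies), which are approximations rather than exact data set sizes; the function name `sparsity` is introduced here for illustration only.

```python
def sparsity(n_ratings: int, n_users: int, n_items: int) -> float:
    """Fraction of empty cells in a users-by-items ratings matrix."""
    return 1.0 - n_ratings / (n_users * n_items)


# Rounded counts taken from the abstract (assumed, not exact).
netflix_sparsity = sparsity(100_000_000, 480_000, 18_000)
print(f"{netflix_sparsity:.1%}")  # prints 98.8%
```

With these rounded inputs only about 1.2% of the user-movie cells hold a rating, which is consistent with the "about 99% sparse" description.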

arXiv.org e-Print Archive

Crossref

Graph-RAT: Combining data sources in music recommendation systems

Author: Bainbridge David
McEnnis Daniel
Publication venue: University of Waikato, Department of Computer Science
Publication date: 28/07/2008

The complexity of music recommendation systems has increased rapidly in recent years, drawing upon different sources of information: content analysis, web-mining, social tagging, etc. Unfortunately, the tools to scientifically evaluate such integrated systems are not readily available; nor are the base algorithms available. This article describes Graph-RAT (Graph-based Relational Analysis Toolkit), an open source toolkit that provides a framework for developing and evaluating novel hybrid systems. While this toolkit is designed for music recommendation, it has applications outside its discipline as well. An experiment, indicative of the sort of procedure that can be configured using the toolkit, is provided to illustrate its usefulness.

Research Commons@Waikato

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Recommender Systems

Author: Chi Ho Yeung
Linyuan Lü
Matúš Medo
Tao Zhou
Yi-Cheng Zhang
Zi-Ke Zhang
Publication venue: 'Elsevier BV'
Publication date: 06/02/2012

The ongoing rapid expansion of the Internet greatly increases the necessity of effective recommender systems for filtering the abundant information. Extensive research on recommender systems is conducted by a broad range of communities including social and computer scientists, physicists, and interdisciplinary researchers. Despite substantial theoretical and practical achievements, unification and comparison of different approaches are lacking, which impedes further advances. In this article, we review recent developments in recommender systems and discuss the major challenges. We compare and evaluate available algorithms and examine their roles in future developments. In addition to algorithms, physical aspects are described to illustrate the macroscopic behavior of recommender systems. Potential impacts and future directions are discussed. We emphasize that recommendation has great scientific depth and combines diverse research fields, which makes it of interest to physicists as well as interdisciplinary researchers.

Comment: 97 pages, 20 figures (to appear in Physics Reports)

arXiv.org e-Print Archive

Crossref

Aston Publications Explorer

RERO DOC Digital Library

Analysis of UK and European NOx and VOC emission scenarios in the Defra model intercomparison exercise

Author: Alison Redington
Andrea Fraser
Charles Chemel
Justin Lingard
Massimo Vieno
Mathew R. Heal
Nutthida Kitwiroon
Ranjeet Sokhi
Richard Derwent
Sally Cooke
Sean Beevers
Xavier Francis
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

Simple emission scenarios have been implemented in eight United Kingdom air quality models with the aim of assessing how these models compared when addressing whether photochemical ozone formation in southern England was NOx- or VOC-sensitive and whether ozone precursor sources in the UK or in the Rest of Europe (RoE) were the most important during July 2006. The suite of models included three Eulerian-grid models (three implementations of one of these models), a Lagrangian atmospheric dispersion model and two moving box air parcel models. The assignments as to NOx- or VOC-sensitive and to UK- versus RoE-dominant turned out to be highly variable and often contradictory between the individual models. However, when the assignments were filtered by model performance on each day, many of the contradictions could be eliminated. Nevertheless, no one model was found to be the 'best' model on all days, indicating that no single air quality model could currently be relied upon to inform policymakers robustly in terms of NOx- versus VOC-sensitivity and UK- versus RoE-dominance on each day. It is important to maintain diversity in model approaches.

Peer reviewed. Final Accepted Version.

Crossref

Edinburgh Research Explorer

King's Research Portal

University of Hertfordshire Research Archive

NERC Open Research Archive