Search CORE

1,587 research outputs found

Deep Learning based Recommender System: A Survey and New Perspectives

Author: Sun Aixin
Tay Yi
Yao Lina
Zhang Shuai
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

With the ever-growing volume of online information, recommender systems have been an effective strategy to overcome such information overload. The utility of recommender systems cannot be overstated, given its widespread adoption in many web applications, along with its potential impact to ameliorate many problems related to over-choice. In recent years, deep learning has garnered considerable interest in many research fields such as computer vision and natural language processing, owing not only to stellar performance but also the attractive property of learning feature representations from scratch. The influence of deep learning is also pervasive, recently demonstrating its effectiveness when applied to information retrieval and recommender systems research. Evidently, the field of deep learning in recommender system is flourishing. This article aims to provide a comprehensive review of recent research efforts on deep learning based recommender systems. More concretely, we provide and devise a taxonomy of deep learning based recommendation models, along with providing a comprehensive summary of the state-of-the-art. Finally, we expand on current trends and provide new perspectives pertaining to this new exciting development of the field.Comment: The paper has been accepted by ACM Computing Surveys. https://doi.acm.org/10.1145/328502

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Session-based Recommendation with Graph Neural Networks

Author: Tan Tieniu
Tang Yuyuan
Wang Liang
Wu Shu
Xie Xing
Zhu Yanqiao
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date: 24/01/2019
Field of study

The problem of session-based recommendation aims to predict user actions based on anonymous sessions. Previous methods model a session as a sequence and estimate user representations besides item representations to make recommendations. Though achieved promising results, they are insufficient to obtain accurate user vectors in sessions and neglect complex transitions of items. To obtain accurate item embedding and take complex transitions of items into account, we propose a novel method, i.e. Session-based Recommendation with Graph Neural Networks, SR-GNN for brevity. In the proposed method, session sequences are modeled as graph-structured data. Based on the session graph, GNN can capture complex transitions of items, which are difficult to be revealed by previous conventional sequential methods. Each session is then represented as the composition of the global preference and the current interest of that session using an attention network. Extensive experiments conducted on two real datasets show that SR-GNN evidently outperforms the state-of-the-art session-based recommendation methods consistently.Comment: 9 pages, 4 figures, accepted by AAAI Conference on Artificial Intelligence (AAAI-19

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Large-Scale User Modeling with Recurrent Neural Networks for Music Discovery on Multiple Time Scales

Author: Agrawal Rohan
Chen Ching-Wei
De Boom Cedric
Demeester Thomas
Dhoedt Bart
Hansen Samantha
Kumar Esh
Yon Romain
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/08/2017
Field of study

The amount of content on online music streaming platforms is immense, and most users only access a tiny fraction of this content. Recommender systems are the application of choice to open up the collection to these users. Collaborative filtering has the disadvantage that it relies on explicit ratings, which are often unavailable, and generally disregards the temporal nature of music consumption. On the other hand, item co-occurrence algorithms, such as the recently introduced word2vec-based recommenders, are typically left without an effective user representation. In this paper, we present a new approach to model users through recurrent neural networks by sequentially processing consumed items, represented by any type of embeddings and other context features. This way we obtain semantically rich user representations, which capture a user's musical taste over time. Our experimental analysis on large-scale user data shows that our model can be used to predict future songs a user will likely listen to, both in the short and long term.Comment: Author pre-print version, 20 pages, 6 figures, 4 table

arXiv.org e-Print Archive

Ghent University Academic Bibliography

Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition

Author: Alsheikh
Daniel Roggen
Dauphin
Dieleman
Francisco Ordóñez
Gers
Japkowicz
Karpathy
LeCun
Ng
Pigou
Sermanet
Publication venue: 'MDPI AG'
Publication date: 01/01/2016
Field of study

Human activity recognition (HAR) tasks have traditionally been solved using engineered features obtained by heuristic processes. Current research suggests that deep convolutional neural networks are suited to automate feature extraction from raw sensor inputs. However, human activities are made of complex sequences of motor movements, and capturing this temporal dynamics is fundamental for successful HAR. Based on the recent success of recurrent neural networks for time series domains, we propose a generic deep framework for activity recognition based on convolutional and LSTM recurrent units, which: (i) is suitable for multimodal wearable sensors; (ii) can perform sensor fusion naturally; (iii) does not require expert knowledge in designing features; and (iv) explicitly models the temporal dynamics of feature activations. We evaluate our framework on two datasets, one of which has been used in a public activity recognition challenge. Our results show that our framework outperforms competing deep non-recurrent networks on the challenge dataset by 4% on average; outperforming some of the previous reported results by up to 9%. Our results show that the framework can be applied to homogeneous sensor modalities, but can also fuse multimodal sensors to improve performance. We characterise key architectural hyperparameters’ influence on performance to provide insights about their optimisation

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

PubMed Central

Sussex Research Online

Long- and Short-term Sequential Recommendation with Enhanced Temporal Self-attention

Author: Fang Zian
Publication venue
Publication date: 30/11/2021
Field of study

Pure OAI Repository

End-to-end Audiovisual Speech Activity Detection with Bimodal Recurrent Neural Models

Author: Busso Carlos
Tao Fei
Publication venue
Publication date: 12/09/2018
Field of study

Speech activity detection (SAD) plays an important role in current speech processing systems, including automatic speech recognition (ASR). SAD is particularly difficult in environments with acoustic noise. A practical solution is to incorporate visual information, increasing the robustness of the SAD approach. An audiovisual system has the advantage of being robust to different speech modes (e.g., whisper speech) or background noise. Recent advances in audiovisual speech processing using deep learning have opened opportunities to capture in a principled way the temporal relationships between acoustic and visual features. This study explores this idea proposing a \emph{bimodal recurrent neural network} (BRNN) framework for SAD. The approach models the temporal dynamic of the sequential audiovisual data, improving the accuracy and robustness of the proposed SAD system. Instead of estimating hand-crafted features, the study investigates an end-to-end training approach, where acoustic and visual features are directly learned from the raw data during training. The experimental evaluation considers a large audiovisual corpus with over 60.8 hours of recordings, collected from 105 speakers. The results demonstrate that the proposed framework leads to absolute improvements up to 1.2% under practical scenarios over a VAD baseline using only audio implemented with deep neural network (DNN). The proposed approach achieves 92.7% F1-score when it is evaluated using the sensors from a portable tablet under noisy acoustic environment, which is only 1.0% lower than the performance obtained under ideal conditions (e.g., clean speech obtained with a high definition camera and a close-talking microphone).Comment: Submitted to Speech Communicatio

arXiv.org e-Print Archive