4,853 research outputs found

Analysing Compression Techniques for In-Memory Collaborative Filtering

Author: Macdonald Craig
Ounis Iadh
Vargas Saul
Publication date: 01/01/2015

Following the recent trend of in-memory data processing, it is common practice to maintain collaborative filtering data in main memory when generating recommendations in academic and industrial recommender systems. In this paper, we study the impact of integer compression techniques for in-memory collaborative filtering data in terms of space and time efficiency. Our results provide relevant observations about when and how to compress collaborative filtering data. First, we observe that, depending on the memory constraints, compression techniques may speed up or slow down state-of-the-art collaborative filtering algorithms. Second, after comparing different compression techniques, we find the Frame of Reference (FOR) technique to be the best option in terms of space and time efficiency under different memory constraints.

Enlighten
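
The Frame of Reference (FOR) technique named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `for_encode` and `for_decode` are hypothetical, and packing the offsets into a single Python integer is an assumption made to keep the example short. The idea shown is that a block of integers (for example, item identifiers) is stored as one reference value plus small fixed-width offsets.

```python
def for_encode(values):
    """Encode a non-empty block of integers as (reference, bit_width, count, packed)."""
    reference = min(values)                      # the "frame of reference"
    offsets = [v - reference for v in values]    # small non-negative deltas
    bit_width = max(1, max(offsets).bit_length())
    packed = 0
    for i, offset in enumerate(offsets):
        # Each offset occupies its own fixed-width slot in the packed integer.
        packed |= offset << (i * bit_width)
    return reference, bit_width, len(values), packed


def for_decode(reference, bit_width, count, packed):
    """Recover the original block from its FOR encoding."""
    mask = (1 << bit_width) - 1
    return [reference + ((packed >> (i * bit_width)) & mask) for i in range(count)]


item_ids = [1000, 1003, 1007, 1012]
encoded = for_encode(item_ids)
decoded = for_decode(*encoded)
```

In this sketch the four identifiers need only 4 bits each after subtracting the reference, instead of a full machine word per value.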

An item/user representation for recommender systems based on bloom filters

Author: Chiky R
Metais E
Meziane F
Pozo M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

This paper focuses on item/user representation in the domain of recommender systems. These systems compute similarities between items (and/or users) to recommend new items to users based on their previous preferences. It is often useful to consider the characteristics (a.k.a. features or attributes) of the items and/or users. This represents items/users by vectors that can be very large, sparse and space-consuming. In this paper, we propose a new accurate method for representing items/users with compact data structures that relies on two concepts: (1) item/user representation is based on bloom filter vectors, and (2) the usage of these filters to compute bitwise AND similarities and bitwise XNOR similarities. This work is motivated by three ideas: (1) detailed vector representations are large and sparse, (2) comparing more features of items/users may achieve better accuracy for item similarities, and (3) similarities are not only in common existing aspects, but also in common missing aspects. We evaluated this approach on the publicly available MovieLens dataset. The results show good performance in comparison with existing approaches such as standard vector representation and Singular Value Decomposition (SVD).

University of Salford Institutional Repository

Crossref

UDORA - University of Derby Online Research Archive
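
The two similarity measures mentioned in the abstract above (bitwise AND and bitwise XNOR over Bloom filter vectors) can be sketched as follows. This is an illustrative assumption rather than the paper's method: the filter size, the number of hash functions, the SHA-256 based hashing, and the helper names are all hypothetical, and the similarities are left as raw bit counts without any normalisation the authors may apply.

```python
import hashlib

NUM_BITS = 64     # filter length (assumed for illustration)
NUM_HASHES = 3    # hash functions per feature (assumed for illustration)


def bloom_filter(features):
    """Build a Bloom filter, stored as a Python int, from a set of feature strings."""
    bits = 0
    for feature in features:
        for seed in range(NUM_HASHES):
            digest = hashlib.sha256(f"{seed}:{feature}".encode()).digest()
            position = int.from_bytes(digest[:8], "big") % NUM_BITS
            bits |= 1 << position
    return bits


def and_similarity(a, b):
    """Count bit positions set in both filters (shared present features)."""
    return bin(a & b).count("1")


def xnor_similarity(a, b):
    """Count bit positions where the filters agree (shared present or shared missing)."""
    mask = (1 << NUM_BITS) - 1
    return bin(~(a ^ b) & mask).count("1")


movie_a = bloom_filter({"action", "sci-fi", "1999"})
movie_b = bloom_filter({"action", "thriller", "1999"})
shared_present = and_similarity(movie_a, movie_b)
shared_overall = xnor_similarity(movie_a, movie_b)
```

The XNOR count also rewards positions that are zero in both filters, which matches the abstract's point that similarity lies in common missing aspects as well as common existing ones.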

Studying the Effect of Data Structures on the Efficiency of Collaborative Filtering Systems

Author: Cormen T. H.
Falley P.
Koren Y.
Ning X.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/06/2016

This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in CERI '16 Proceedings of the 4th Spanish Conference on Information Retrieval, http://dx.doi.org/10.1145/2934732.2934747

Recommender systems are an active research area where the major focus has been on how to improve the quality of generated recommendations, but less attention has been paid to how to do so efficiently. This aspect is increasingly important because the information to be considered by recommender systems is growing exponentially. In this paper we study how different data structures affect the performance of these systems. Our results with two public datasets provide relevant insights regarding the optimal data structures in terms of memory and time usage. Specifically, we show that classical data structures like Binary Search Trees and Red-Black Trees can beat more complex and popular alternatives like Hash Tables.

Crossref

Biblos-e Archivo

A Hybrid Travel Recommender Model Based on Deep Level Autoencoder And Machine Learning Algorithms

Author: Mohamed Basheer K.P
Muneer V.K
Publication venue: ASSOC ADVANCEMENT ZOOLOGY, AZADANAGAR COLONY RUSTAMPUR, GORAKHPUR, INDIA, 273001
Publication date: 20/12/2023

This research investigates the application of autoencoders in processing travelogues written in the Malayalam language on Facebook. The main objective is to harness the capabilities of autoencoders to learn a compressed representation of the input data and employ it to train various machine learning models for enhanced accuracy and efficiency. The major challenge of unavailability of a benchmark dataset in the Malayalam language for the travel domain was overcome by employing NLP techniques on the unstructured, lengthy, imbalanced travelogues, applying some additional filtering methods, and the creation of an exclusive Part of Travel Tagger (POT Tagger) along with lookup dictionaries. As this pioneering work focuses on Malayalam travel reviews posted on social media, the model presents a valuable opportunity for extension to other low-resourced Indian languages. The study follows a two-step approach. Initially, an autoencoder neural network architecture is utilized to encode the travelogues into a lower-dimensional latent space representation. The encoder network adeptly captures crucial features and patterns within the data. The compressed representation obtained from the encoder is then fed into the decoder, which reconstructs the original travelogues. Subsequently, the encoded model is employed to train diverse machine learning models, including logistic regression, decision tree classifier, support vector machine (SVM), random forest classifier (RFC), K-nearest neighbours (KNN), stochastic gradient descent (SGD), and multilayer perceptron (MLP). By utilizing the encoded features as inputs, these models effectively learn from the concise representation of the Malayalam travelogues. Experimental results reveal that the trained machine learning models, using the encoded features, achieve higher accuracy rates compared to conventional approaches. 
This improvement demonstrates the effectiveness of autoencoders in capturing and representing vital characteristics of the Malayalam travelogues on Facebook. By leveraging the capabilities of the autoencoder model, we successfully learned a compressed representation of the input data, attaining a validation accuracy of 95.84%. This finding highlights the potential of autoencoders to enhance the overall accuracy and efficiency of travel recommendation systems for Malayalam users on social media platforms.

Journal Of Advanced Zoology

Parallelizing k-means with hadoop/mahout for big data analytics

Author: Cui Jianbin
Publication venue: Brunel University London
Publication date: 01/01/2015

This thesis was submitted for the degree of Master of Philosophy and awarded by Brunel University London.

The rapid development of Internet and cloud computing technologies has led to explosive generation and processing of huge amounts of data. The ever-increasing data volumes bring great value to societies, but at the same time raise a number of challenges. Data mining techniques have been widely used in decision analysis in financial, medical, management, business and many other fields. However, how to analyse and mine valuable information from massive data has become a crucial problem, as traditional methods can hardly achieve high scalability in data processing. Recently, MapReduce has emerged as a major programming model for big data analytics. Apache Hadoop, which is an open-source implementation of MapReduce, has been widely taken up by the community. Hadoop facilitates the utilization of a large number of inexpensive commodity computers. In addition, Hadoop provides support for dealing with faults, which is especially useful for long-running jobs. Mahout is a new open-source project of Apache, providing a number of machine learning and data mining algorithms based on the Hadoop platform. As a machine learning technique, K-means has been widely used in data analytics through clustering. However, K-means experiences high computational overhead when the size of the data to be analysed is large. This thesis parallelizes K-means using the MapReduce model and implements a parallel K-means with Mahout on the Hadoop platform. The parallel K-means reduces the computation time significantly in comparison with the standard K-means when dealing with a large data set. In addition, this thesis further evaluates the impact of Hadoop parameters on the performance of the Hadoop framework.

Brunel University Research Archive
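
The MapReduce formulation of K-means described in the abstract above can be sketched in plain Python. This is a single-process illustration, not the thesis's Mahout or Hadoop code: the function names `map_phase`, `reduce_phase`, and `kmeans` are hypothetical, and a real job would distribute the map calls across nodes. The map step emits (nearest centroid index, point) pairs and the reduce step averages the points collected under each index.

```python
from collections import defaultdict


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every input point."""
    for point in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(point, centroids[i]))
        yield nearest, point


def reduce_phase(pairs, centroids):
    """Reduce: replace each centroid with the mean of the points assigned to it."""
    groups = defaultdict(list)
    for index, point in pairs:
        groups[index].append(point)
    updated = list(centroids)  # centroids with no assigned points stay unchanged
    for index, members in groups.items():
        updated[index] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return updated


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of map/reduce rounds (one round per K-means iteration)."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
final_centroids = kmeans(points, [(0, 0), (10, 10)])
```

Because each point's assignment depends only on the current centroids, the map step is the part that parallelizes naturally across machines.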

Privacy-preserving recommendation system using federated learning

Author: Basu Rahul
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/2020

Federated Learning is a form of distributed learning which leverages edge devices for training. It aims to preserve privacy by communicating users’ learning parameters and gradient updates to the global server during training while keeping the actual data on the users’ devices. Training on the global server is performed on these parameters instead of on user data directly, while fine-tuning of the model can be done locally on clients’ devices. However, federated learning is not without its shortcomings, and in this thesis we present an overview of the learning paradigm and propose a new federated recommender system framework that utilizes homomorphic encryption. This results in a slight decrease in accuracy metrics but leads to greatly increased user privacy. We also show that performing computations on encrypted gradients barely affects the recommendation performance while ensuring a more secure means of communicating user gradients to and from the global server.

Digital Commons @ New Jersey Institute of Technology (NJIT)
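
The idea of aggregating encrypted gradients, as described in the abstract above, can be illustrated with a toy additively homomorphic scheme. The abstract does not name the encryption scheme, so the use of Paillier here is an assumption, as are the fixed-point scale, the tiny demonstration primes, and every function name. The sketch is not secure and is not the thesis's framework; it only shows that a server can sum client gradients without decrypting the individual values.

```python
import math
import random

SCALE = 1000  # fixed-point scale for float gradients (assumed for illustration)


def keygen(p=1789, q=1861):
    """Toy Paillier key pair; these primes are far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n, value):
    """Client side: encrypt one gradient value under the public key n."""
    m = round(value * SCALE) % n  # negative values wrap around modulo n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    n_sq = n * n
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq


def add_encrypted(n, c1, c2):
    """Server side: multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, ciphertext):
    """Key holder: recover the (possibly negative) fixed-point value."""
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    m = ((x - 1) // n) * mu % n
    if m > n // 2:
        m -= n
    return m / SCALE


public_n, private_key = keygen()
client_gradients = [0.125, -0.5, 0.25]
aggregate = encrypt(public_n, client_gradients[0])
for gradient in client_gradients[1:]:
    aggregate = add_encrypted(public_n, aggregate, encrypt(public_n, gradient))
gradient_sum = decrypt(public_n, private_key, aggregate)
```

Only the holder of the private key sees the summed gradient; the aggregating party works on ciphertexts throughout.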

Towards Integration of Artificial Intelligence into Medical Devices as a Real-Time Recommender System for Personalised Healthcare: State-of-the-Art and Future Prospects

Author: Amin Bilal
Faherty Mary
Feely Conor
Iqbal Talha
Jones Tim
Masud Mehedi
Shahzad Atif
Tierney Michelle
Vazquez Patricia
Publication date: 25/01/2024

In the era of big data, artificial intelligence (AI) algorithms have the potential to revolutionize healthcare by improving patient outcomes and reducing healthcare costs. AI algorithms have frequently been used in health care for predictive modelling, image analysis and drug discovery. Moreover, as a recommender system, these algorithms have shown promising impacts on personalized healthcare provision. A recommender system learns the behaviour of the user and predicts their current preferences (recommends) based on their previous preferences. Implementing AI as a recommender system improves this prediction accuracy and solves cold start and data sparsity problems. However, most of the methods and algorithms are tested in a simulated setting which cannot recapitulate the influencing factors of the real world. This review article systematically reviews prevailing methodologies in recommender systems and discusses the AI algorithms as recommender systems specifically in the field of healthcare. It also provides discussion around the most cutting-edge academic and practical contributions present in the literature, identifies performance evaluation matrices, challenges in the implementation of AI as a recommender system, and acceptance of AI-based recommender systems by clinicians. The findings of this article direct researchers and professionals to comprehend currently developed recommender systems and the future of medical devices integrated with real-time recommender systems for personalized healthcare

University of Birmingham Research Portal

Incremental volume rendering using hierarchical compression

Author: Haley Michael Blake
Publication venue: Department of Computer Science
Publication date: 01/01/1996

Includes bibliographical references.

The research has been based on the thesis that efficient volume rendering of datasets, contained on the Internet, can be achieved on average personal workstations. We present a new algorithm here for efficient incremental rendering of volumetric datasets. The primary goal of this algorithm is to give average workstations the ability to efficiently render volume data received over relatively low bandwidth network links in such a way that rapid user feedback is maintained. Common limitations of workstation rendering of volume data include: large memory overheads, the requirement of expensive rendering hardware, and high speed processing ability. The rendering algorithm presented here overcomes these problems by making use of the efficient Shear-Warp Factorisation method, which does not require specialised graphics hardware. However, the original Shear-Warp algorithm suffers from a high memory overhead and does not provide for incremental rendering, which is required if rapid user feedback is to be maintained. Our algorithm represents the volumetric data using a hierarchical data structure which provides for the incremental classification and rendering of volume data. This exploits the multiscale nature of the octree data structure. The algorithm reduces the memory footprint of the original Shear-Warp Factorisation algorithm by a factor of more than two, while maintaining good rendering performance. These factors make our octree algorithm more suitable for implementation on average desktop workstations for the purposes of interactive exploration of volume models over a network. This dissertation covers the theory and practice of developing the octree based Shear-Warp algorithms, and then presents the results of extensive empirical testing.
The results, using typical volume datasets, demonstrate the ability of the algorithm to achieve high rendering rates for both incremental rendering and standard rendering while reducing the runtime memory requirements

Cape Town University OpenUCT