15 research outputs found
Frequent Pattern Mining Algorithms for Finding Associated Frequent Patterns for Data Streams: A Survey
Pattern recognition is seen as a major challenge within the field of data mining and knowledge discovery. For the
work in this paper, we have analyzed a range of widely used algorithms for finding frequent patterns with the
purpose of discovering how these algorithms can be used to obtain frequent patterns over large transactional
databases. This has been presented in the form of a comparative study of the following algorithms: Apriori
algorithm, Frequent Pattern (FP) Growth algorithm, Rapid Association Rule Mining (RARM), ECLAT algorithm
and Associated Sensor Pattern Mining of Data Stream (ASPMS) frequent pattern mining algorithms. This study
also examines each algorithm's strengths and weaknesses for finding patterns among large item sets in
database systems.
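As a rough illustration of the first algorithm in the comparison above, the following is a minimal Apriori sketch in Python. It is not taken from the survey: the function name `apriori`, the `min_support` parameter (an absolute count), and the toy baskets are all assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets.

    min_support is an absolute count (an assumption made for this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count support with one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy data, not from the paper.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
]
result = apriori(baskets, min_support=2)
```

With these toy baskets every single item and every pair is frequent, while the three-item set appears in only one basket and is dropped. The repeated pass over all transactions at each level is the cost that approaches such as FP-Growth and ECLAT are designed to avoid.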
Novel centroid selection approaches for KMeans-clustering based recommender systems
Recommender systems have the ability to filter unseen information for predicting whether a particular user would prefer
a given item when making a choice. Over the years, this process has been dependent on robust applications of data
mining and machine learning techniques, which are known to have scalability issues when being applied for recommender
systems. In this paper, we propose a k-means clustering-based recommendation algorithm, which addresses the scalability
issues associated with traditional recommender systems. An issue with traditional k-means clustering algorithms is that
they choose the initial k centroids randomly, which leads to inaccurate recommendations and increased cost for offline
training of clusters. The work in this paper highlights how centroid selection in k-means-based recommender systems
can improve performance while also reducing cost. The proposed centroid selection method has the ability to
exploit underlying data correlation structures, which has been proven to exhibit superior accuracy and performance in
comparison to the traditional centroid selection strategies, which choose centroids randomly. The proposed approach
has been validated with an extensive set of experiments based on five different datasets (from movies, books, and music
domains). These experiments show that the proposed approach produces better-quality clusters and converges more quickly
than existing approaches, which in turn improves the accuracy of the recommendations provided.
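The abstract above does not spell out the proposed centroid selection method, so the sketch below is not that method. It shows one well-known non-random alternative, farthest-first (maximin) seeding, purely to illustrate what replacing random initial centroids can look like. The function name `farthest_first_centroids`, the choice of the first seed, and the sample points are assumptions for this example.

```python
import math


def farthest_first_centroids(points, k):
    """Pick k initial centroids deterministically with the maximin heuristic."""
    dim = len(points[0])
    # First seed (an assumption of this sketch): the point closest to the mean.
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    centroids = [min(points, key=lambda p: math.dist(p, mean))]
    while len(centroids) < k:
        # Next seed: the point whose nearest chosen centroid is farthest away.
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


# Hypothetical two-dimensional "user profiles", not data from the paper.
profiles = [(1, 1), (1, 2), (5, 5), (5, 4), (9, 0)]
seeds = farthest_first_centroids(profiles, k=3)
```

Because the seeds are spread across the data and chosen deterministically, repeated offline training runs start from the same well-separated positions, which is the general motivation for moving away from random initialisation.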
Robust, scalable, and practical algorithms for recommender systems
The purpose of recommender systems is to filter information unseen by a user to predict whether a user would like a given item. Making effective recommendations from a domain consisting of millions of ratings is a major research challenge in the application of machine learning and data mining. A number of approaches have been proposed to solve the recommendation problem, where the main motivation is to increase the accuracy of the recommendations while ignoring other design objectives such as scalability, sparsity and imbalanced dataset problems, cold-start problems, and long tail problems. The aim of this thesis is to develop recommendation algorithms that satisfy the aforementioned design objectives, making the recommendation generation techniques applicable to a wider range of practical situations and real-world scenarios. With this in mind, in the first half of the thesis, we propose novel hybrid recommendation algorithms that give accurate results and eliminate some of the known problems with recommender systems. More specifically, we propose a novel switching hybrid recommendation framework that combines Collaborative Filtering (CF) with a content-based filtering algorithm. Our experiments show that the performance of our algorithm is better than (or comparable to) the other hybrid recommendation approaches available in the literature. While reducing the dimensions of the dataset by Singular Value Decomposition (SVD), prior to applying CF, we discover that the SVD-based CF fails to produce reliable recommendations for some datasets. After further investigation, we find out that the SVD-based recommendations depend on the imputation methods used to approximate the missing values in the user-item rating matrix. We propose various missing value imputation methods, which exhibit much superior accuracy and performance compared to the traditional missing value imputation method, item average.
Furthermore, we show how the gray-sheep users problem associated with a recommender system can effectively be solved using the K-means clustering algorithm. After analysing the effect of different centroid selection approaches and distance measures in the K-means clustering algorithm, we demonstrate how the gray-sheep users in a recommender system can be identified by treating them as an outlier problem. We demonstrate that the performance (accuracy and coverage) of the CF-based algorithms suffers in the case of gray-sheep users. We propose a hybrid recommendation algorithm to solve the gray-sheep users problem. In the second half of the thesis, we propose a new class of kernel mapping recommender system methods that we call KMR for solving the recommendation problem. The proposed methods find the multi-linear mapping between two vector spaces based on the structure-learning technique. We propose the user- and item-based versions of the KMR algorithms and offer various ways to combine them. We report results of an extensive evaluation conducted on five different datasets under various recommendation conditions. Our empirical study shows that the proposed algorithms offer state-of-the-art performance and provide robust performance under all conditions. Furthermore, our algorithms are quite flexible, as they can easily incorporate more information (ratings, demographics, features, and contextual information) in the form of kernels; moreover, these kernels can be added or multiplied. We then adapt the KMR algorithm to incorporate new data incrementally. We offer a new heuristic, KMRincr, that can build the model without retraining the whole model from scratch when new data are added to the recommender system, providing significant computation savings. Our final contribution involves adapting the KMR algorithms to build the model on-line.
More specifically, we propose a perceptron-type algorithm, KMR percept, which is a novel, fast, on-line algorithm for building the model that maintains good accuracy and scales well with the data. We provide the temporal analysis of the KMR percept algorithm. The empirical results reveal that the performance of the KMR percept is comparable to the KMR, and furthermore, it overcomes some of the conventional problems with recommender systems.
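The thesis abstract above names item average as the traditional way to fill missing entries of the user-item rating matrix before SVD-based CF. A minimal sketch of that baseline follows. The function name `impute_item_average`, the use of `None` for missing ratings, and the global-mean fallback for items nobody has rated are assumptions of this example, and the thesis's own improved imputation methods are not shown.

```python
def impute_item_average(ratings):
    """Fill None entries of a user-by-item matrix with each item's mean rating."""
    n_items = len(ratings[0])
    observed = [r for row in ratings for r in row if r is not None]
    # Fallback for items with no ratings at all (an assumption of this sketch).
    global_mean = sum(observed) / len(observed)

    item_means = []
    for j in range(n_items):
        column = [row[j] for row in ratings if row[j] is not None]
        item_means.append(sum(column) / len(column) if column else global_mean)

    # Keep observed ratings and replace only the missing ones.
    return [
        [item_means[j] if r is None else r for j, r in enumerate(row)]
        for row in ratings
    ]


# Hypothetical 3 users x 3 items matrix; None marks a missing rating.
ratings = [
    [5, None, 1],
    [4, 2, None],
    [None, 4, 3],
]
filled = impute_item_average(ratings)
```

The filled matrix is dense, so it can be passed to an SVD routine. Every missing entry in a column receives the same value, which gives some intuition for why the abstract reports that the quality of SVD-based recommendations depends on the imputation method.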
Inference of Activities with Unexpected Actions Using Pattern Mining
Recognition of activities in an unobtrusive manner has attracted the attention of context-aware systems, which provide end users with services based on everyday activities that are recognised without infringing the privacy of the end user. Current work has generally focused on applying a range of traditional classification and semantic reasoning based techniques in order to recognise these activities. However, the ability to recognise unexpected actions while the activity is being conducted remains a challenge. In this paper, we present an approach that is able to recognise activities regardless of the order of tasks/actions used to perform the activity. The proposed recognition framework extends an existing activity recognition approach by deploying a frequent pattern mining technique to find patterns among different streams of captured sensor events in order to increase the adaptive learning of the proposed recognition approach.
Recognition Framework for Inferring Activities of Daily Living Based on Pattern Mining
Ambient assisted living applications are very much dependent on robust activity recognition frameworks, which allow these applications to provide services based on the contextual information that has been discovered. Existing frameworks have generally focused on the application of traditional classifiers and semantic reasoning to recognize activities. Nevertheless, being able to recognize unexpected actions remains a challenge. The work in this paper presents an approach that is able to recognize activities that have been conducted in an unordered manner. The recognition framework extends an existing approach that recognizes activities by exploiting the different levels of abstraction within an activity. A frequent pattern mining algorithm has been applied to the recognition framework in order to find patterns within the stream of captured events, which in turn increases the adaptive learning ability of the proposed recognition framework. This paper also presents experimental results that validate the recognition ability of the recognition framework. The motivation of this work is to be able to detect functional decline among elderly people suffering from Alzheimer’s disease by recognizing their daily activities.
A robust regression-based stock exchange forecasting and determination of correlation between stock markets
Knowledge-based decision support systems for financial management are an important part of investment plans. Investors are avoiding traditional investment areas such as banks due to low return on investment. The stock exchange is presently one of the major areas for investment. Various non-linear and complex factors affect the stock exchange, so a robust stock exchange forecasting system remains an important need. Following this line of research, we evaluate the performance of a regression-based model to check its robustness over large datasets. We also evaluate the effect of the top stock exchange markets on each other. We evaluate our proposed model on the top 4 stock exchanges: New York, London, NASDAQ, and Karachi. We also evaluate our model on the top 3 companies: Apple, Microsoft, and Google. A large historical dataset covering 20 years is gathered from Yahoo Finance; data of this size creates a Big Data problem. The performance of our system is evaluated on 1-step, 6-step, and 12-step forecasts. The experiments show that the proposed system produces excellent results. The results are presented in terms of Mean Absolute Error (MAE) and Root Mean Square Error (RMSE).
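For reference, the two error measures named above can be computed as in the following sketch. The function names `mae` and `rmse` and the sample prices are hypothetical and serve only as illustration; none of the numbers come from the paper.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: the average magnitude of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: penalises large errors more heavily than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical closing prices and forecasts, for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 100.0, 101.0, 108.0]

print(mae(actual, predicted))   # 1.5
print(rmse(actual, predicted))  # square root of 3.5, about 1.87
```

RMSE is never smaller than MAE on the same errors, and a wide gap between the two suggests that a few large misses dominate, which matters more as the forecast horizon grows from 1 step to 12 steps.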