
    soMLier: A South African Wine Recommender System

    Though several commercial wine recommender systems exist, they are largely tailored to consumers outside of South Africa (SA) and are consequently of limited use to novice wine consumers in SA. To address this, this research aims to develop a system for South African consumers that yields high-quality wine recommendations, maximises the accuracy of the predicted ratings for those recommendations, and provides insight into why those suggestions were made. To achieve this, a hybrid system, “soMLier” (pronounced “sommelier”), is built in this thesis using two datasets. Firstly, a database containing several attributes of South African wines, such as chemical composition, style, aroma, price and description, was supplied by wine.co.za (a SA wine retailer). Secondly, for each wine in that database, the numeric 5-star ratings and textual reviews made by users worldwide were scraped from Vivino.com to serve as a dataset of user preferences. Together, these are used to develop and compare several systems, the best of which are combined in the final system. Item-based collaborative filtering methods are investigated first, along with model-based techniques (such as matrix factorisation and neural networks), when applied to the user-rating dataset to generate wine recommendations by ranking predicted ratings. These methods are found to excel, respectively, at generating lists of relevant wine recommendations and at producing accurate corresponding rating predictions. Next, the wine-attribute data is used to explore the efficacy of content-based systems. Numeric features (such as price) are compared along with categorical features (such as style) using various distance measures, and the relationships between the textual descriptions of the wines are determined using natural language processing methods. These methods are found to be most appropriate for explaining wine recommendations.
Hence, the final hybrid system uses collaborative filtering to generate recommendations, matrix factorisation to predict user ratings, and content-based techniques to rationalise the wine suggestions made. This thesis contributes the “soMLier” system, which is of specific use to SA wine consumers as it bridges the gap between the technologies used by highly developed existing systems and the SA wine market. Though the final system would benefit from more explicit user data to establish a richer model of user preferences, it can ultimately assist consumers in exploring unfamiliar wines, discovering wines they will likely enjoy, and understanding their preferences for SA wine.
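The collaborative-filtering half of such a pipeline can be sketched as follows. This is a minimal illustration of item-based collaborative filtering on a toy rating matrix, not the thesis's actual implementation; the matrix values, the cosine-similarity choice and the similarity-weighted prediction rule are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical user-item rating matrix (rows: users, cols: wines); 0 = unrated.
R = np.array([
    [5.0, 3.0, 0.0, 4.0],
    [4.0, 0.0, 4.0, 5.0],
    [0.0, 2.0, 5.0, 0.0],
    [3.0, 3.0, 4.0, 4.0],
])

def item_similarity(R):
    """Cosine similarity between item (column) rating vectors."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0          # guard against all-zero columns
    S = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(S, 0.0)         # exclude self-similarity
    return S

def predict(R, S, user):
    """Predict a user's ratings as a similarity-weighted average of
    the ratings that user has already given."""
    rated = R[user] > 0
    weights = S[:, rated]
    num = weights @ R[user, rated]
    den = np.abs(weights).sum(axis=1)
    den[den == 0] = 1.0
    preds = num / den
    preds[rated] = 0.0               # never re-recommend rated wines
    return preds

S = item_similarity(R)
preds = predict(R, S, user=0)
top = np.argsort(preds)[::-1]        # wines ranked for recommendation
```

Ranking the predicted ratings, as above, yields the recommendation list; a matrix-factorisation model could then refine the predicted rating shown for each recommended wine.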

    Semantic Recommender System

    Though content-based recommender systems have been shown to produce better-quality recommendations than collaborative filtering recommenders, the latter are more widely used because the former suffer from complex mathematical calculations and inadequate data-modelling techniques. Using ontologies to model the data allows machines to better understand both items and users’ preferences, and thus not only to suggest better recommendations but also to provide accurate justifications. In this work we present a semantic recommender system that generates recommendations in a novel way from a Recommender Ontology, which provides controlled vocabularies in the context of recommendations and is built on the idea that not all classes and properties are important from an item-similarity point of view. If the domain ontology is annotated with the Recommender Ontology, the semantic recommender is able to generate recommendations; as a result, the proposed system works with any domain data, thanks to Semantic Web standards. In addition to items’ features and users’ profiles, the proposed mathematical model takes into account the user’s context and the temporal context, so that some items, such as an event ticket, are never recommended once the event is over and are given greater prominence before it. The Recommender Ontology also grants business owners a way to boost recommended items according to their needs, which guarantees more diversity and satisfies business requirements. For the experiments, we tested the proposed solution on several domains, including movies, books and music, and with a real business company. We achieved 55% accuracy when testing on a movie domain even though we knew just one feature about the movies. The main limitation we faced was the absence of a content-based domain case that contains the ABox, the TBox, and ratings together.
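One plausible reading of the temporal-context rule described above (an item's score drops to zero once its event is over and grows as the event approaches) can be sketched as a score modifier. The linear ramp, the 30-day horizon and the 2x cap below are hypothetical choices for illustration, not details taken from the work itself.

```python
from datetime import datetime, timedelta

def temporal_boost(base_score, event_time, now, horizon_days=30.0):
    """Scale an item's score by temporal context: zero after the event,
    rising linearly as the event approaches (hypothetical ramp)."""
    remaining = (event_time - now).total_seconds() / 86400.0
    if remaining <= 0:
        return 0.0                      # event over: never recommend
    if remaining >= horizon_days:
        return base_score               # far in the future: no boost
    # Closer events get a larger boost, approaching 2x at the last moment.
    return base_score * (1.0 + (horizon_days - remaining) / horizon_days)

now = datetime(2024, 6, 1)
temporal_boost(0.8, now + timedelta(days=3), now)   # near event: boosted above 0.8
temporal_boost(0.8, now - timedelta(days=1), now)   # past event: 0.0
```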

    A Distributed, Architecture-Centric Approach to Computing Accurate Recommendations from Very Large and Sparse Datasets

    The use of recommender systems is an emerging trend today, when user behavior information is abundant. Many large datasets are available for analysis because many businesses are interested in future user opinions. Sophisticated algorithms that predict such opinions can simplify decision-making, improve customer satisfaction, and increase sales. However, modern datasets contain millions of records, which represent only a small fraction of all possible data. Furthermore, much of the information in such sparse datasets may be irrelevant for making individual recommendations. As a result, there is a demand for a way to make personalized suggestions from large amounts of noisy data. Current recommender systems are usually all-in-one applications that provide one type of recommendation. Their inflexible architectures prevent detailed examination of recommendation accuracy and its causes. We introduce a novel architecture model that supports scalable, distributed suggestions from multiple independent nodes. Our model consists of two components: the input matrix generation algorithm and multiple platform-independent combination algorithms. A dedicated input generation component provides the necessary data for the combination algorithms, reduces its size, and eliminates redundant data processing. Likewise, simple combination algorithms can produce recommendations from the same input, so we can more easily distinguish between the benefits of a particular combination algorithm and the quality of the data it receives. Such a flexible architecture is more conducive to a comprehensive examination of our system. We believe that a user's future opinion may be inferred from a small amount of data, provided that this data is the most relevant. We propose a novel algorithm that generates a better recommender input. Unlike existing approaches, our method sorts the relevant data twice.
Doing this is slower, but the quality of the resulting input is considerably better. Furthermore, the modular nature of our approach may improve its performance, especially in the cloud computing context. We implement and validate our proposed model via mathematical modeling, by appealing to statistical theories, and through extensive experiments, data analysis, and empirical studies. Our empirical study examines the effectiveness of accuracy improvement techniques for collaborative filtering recommender systems. We evaluate our proposed architecture model on the Netflix dataset, a popular (over 130,000 solutions), large (over 100,000,000 records), and extremely sparse (1.1%) collection of movie ratings. The results show that combination algorithm tuning has little effect on recommendation accuracy. However, all algorithms produce better results when supplied with a more relevant input. Our input generation algorithm accounts for a considerable accuracy improvement.
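The two-pass relevance filtering idea, sorting once over users and once over the surviving ratings, might look roughly like the sketch below. The function name, the similarity callback and the cutoff parameters are hypothetical, and the dissertation's actual sorting criteria may differ.

```python
def generate_input(ratings, target_user, sim, k_users=50, k_ratings=200):
    """Hypothetical two-pass relevance filter: first sort users by
    similarity to the target, then sort the surviving ratings by their
    contributing user's similarity, keeping only the top slice.
    `ratings` maps user -> {item: rating}; `sim(u, v)` returns a score."""
    # Pass 1: keep only the k_users most similar users.
    others = [u for u in ratings if u != target_user]
    nearest = sorted(others, key=lambda u: sim(target_user, u),
                     reverse=True)[:k_users]
    # Pass 2: flatten and re-sort the surviving ratings by similarity weight.
    pool = [(sim(target_user, u), u, item, r)
            for u in nearest for item, r in ratings[u].items()]
    pool.sort(key=lambda t: t[0], reverse=True)
    return pool[:k_ratings]

# Toy usage with a hypothetical distance-based similarity.
ratings = {0: {"a": 5}, 1: {"a": 4, "b": 3}, 2: {"b": 2}}
sim = lambda u, v: 1.0 / (1 + abs(u - v))
subset = generate_input(ratings, 0, sim, k_users=2, k_ratings=3)
```

The double sort costs extra time, as the abstract notes, but every record handed to a combination algorithm is then among the most relevant available.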

    Mitigating Fake Digital Media and Quality Assurance

    With the introduction of social media, the internet is filled with an excess of data and content. Users’ feeds are cluttered with fake, malicious, and unnecessary information, polluting their pages and wasting their time. As observed in the 2016 US election, spam accounts posting fake news were able to sway political opinion and misinform the general population. Additionally, with social media becoming one of the biggest advertising markets, there is a rise in the number of fake accounts whose large followings consist mainly of bots. Protecting the public from false information, and protecting businesses that want to operate on social media platforms, should be a priority. By utilizing natural language processing, image recognition, and recommendation systems, powered by AI and machine learning, the goal of our project is to provide the user with content that is verified and tailored to their liking. This report details our plan to mitigate the amount of unnecessary content displayed to the user, and the rationale for our designs. It provides a guide to all the completed work, future iterations, performance results, and the reliability of our model as an efficient solution.