570 research outputs found

    A scalable recommender system : using latent topics and alternating least squares techniques

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. A recommender system is one of the major techniques for handling the information overload problem in Information Retrieval. It improves access and proactively recommends relevant information to each user, based on preferences and objectives. During the implementation and planning phases, designers have to cope with several issues and challenges that need proper attention. This thesis aims to show the issues and challenges in developing high-quality recommender systems. The thesis solves a current research problem in the field of job recommendations using a distributed algorithmic framework built on top of Spark for parallel computation, which allows the algorithm to scale linearly with the growing number of users. The final solution consists of two different recommenders which could be utilised for different purposes. The first method is mainly driven by latent topics among users, while the second technique utilises a latent factor algorithm that directly addresses the preference-confidence paradigm.

    Evaluation of Citation Graph Thematic Dataset Construction and Paper Filtering Methods for Research Literature Recommendation

    One of the main challenges faced by new researchers is immersing themselves in the existing literature relevant to their field of interest. The vastness and continuous growth of knowledge in their field can be overwhelming, making it difficult to identify the most pertinent research papers within their research themes. To address this issue, research paper recommender systems have emerged as valuable tools. These systems allow researchers to find relevant papers based on their specific interests or research themes by analyzing various aspects such as titles, abstracts, and full texts. The quality of the dataset used is crucial for the development, testing, and refinement of these systems to ensure optimal results. Dataset quality directly impacts the accuracy and reliability of a recommender system. In this thesis, I propose a novel approach for constructing datasets using citation graph networks. These networks consist of nodes representing research papers and edges representing citations between them. By leveraging citation graph networks, we gain a more comprehensive understanding of the relationships and influences among different papers compared to traditional methods that rely solely on keyword searches. To evaluate the effectiveness of the citation graph network method, I compared it with the traditional keyword search approach for dataset construction. Additionally, I assessed the effectiveness of three recommender system algorithms: user-based collaborative filtering, combined with PageRank and personalized PageRank algorithms. The experimental findings provide clear evidence that utilizing citation graph network datasets significantly enhances the efficacy of research paper recommender systems. This improvement simplifies the process of finding relevant literature for researchers, potentially accelerating scientific discovery
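As an illustration of the personalized PageRank idea mentioned in this abstract, here is a minimal sketch in Python. It is not the thesis author's implementation: the function name `personalized_pagerank`, the toy citation graph, the seed set, and the damping value 0.85 are all assumptions chosen for the example. Nodes are papers, an edge `(citing, cited)` is a citation, and the random surfer teleports only to the seed papers a researcher already cares about.

```python
def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power iteration for personalized PageRank on a directed citation graph.

    edges: iterable of (citing, cited) pairs (hypothetical toy input).
    seeds: papers of interest; teleportation goes only to these nodes.
    """
    edges = list(edges)
    seeds = set(seeds)
    nodes = sorted({n for edge in edges for n in edge} | seeds)

    # Adjacency list: which papers each paper cites.
    out_links = {n: [] for n in nodes}
    for citing, cited in edges:
        out_links[citing].append(cited)

    # Teleport distribution: uniform over the seed papers, zero elsewhere.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)

    for _ in range(iterations):
        # Every node starts each round with its teleportation share.
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            if out_links[n]:
                # Split this paper's score evenly among the papers it cites.
                share = damping * rank[n] / len(out_links[n])
                for target in out_links[n]:
                    new_rank[target] += share
            else:
                # Dangling paper (cites nothing): return its mass to the seeds.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * teleport[m]
        rank = new_rank
    return rank


# Toy citation graph (assumed data): A cites B, B cites C, C cites A, D cites C.
toy_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "C")]
ranks = personalized_pagerank(toy_edges, seeds={"A"})
ranking = sorted(ranks, key=ranks.get, reverse=True)
```

Papers close to the seed in the citation graph receive higher scores, so `ranking` can serve as a candidate recommendation list. Using a uniform teleport distribution over all nodes instead of the seeds would give ordinary PageRank.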

    Relational clustering models for knowledge discovery and recommender systems

    Cluster analysis is a fundamental research field in Knowledge Discovery and Data Mining (KDD). It aims at partitioning a given dataset into homogeneous clusters so as to reflect the natural hidden data structure. Various heuristic or statistical approaches have been developed for analyzing propositional datasets. Nevertheless, in relational clustering the existence of multi-type relationships greatly degrades the performance of traditional clustering algorithms. This issue motivates us to find more effective algorithms for cluster analysis on relational datasets. In this thesis we comprehensively study the idea of Representative Objects for approximating data distribution and then design a multi-phase clustering framework for analyzing relational datasets with high effectiveness and efficiency. The second task considered in this thesis is to provide better data models for people as well as machines to browse and navigate a dataset. The hierarchical taxonomy is widely used for this purpose. Compared with manually created taxonomies, automatically derived ones are more appealing because of their low creation/maintenance cost and high scalability. Up to now, taxonomy generation techniques have mainly been used to organize document corpora. We investigate the possibility of applying them to relational datasets and then propose some algorithmic improvements. Another non-trivial problem is how to assign suitable labels to the taxonomic nodes so as to credibly summarize the content of each node. To the best of our knowledge, this field has not been investigated sufficiently, and so we attempt to fill the gap by proposing some novel approaches. The final goal of our cluster analysis and taxonomy generation techniques is to improve the scalability of recommender systems that are developed to tackle the problem of information overload. Recent research in recommender systems exploits domain knowledge to improve recommendation quality, which however reduces the scalability of the whole system at the same time. We address this issue by applying the automatically derived taxonomy to preserve the pair-wise similarities between items, and then modeling the user visits by another hierarchical structure. Experimental results show that the computational complexity of the recommendation procedure can be greatly reduced and thus the system scalability improved.

    A novel hybrid recommendation system for library book selection

    The increasing number of books published each year and decreasing budgets have made collection development increasingly difficult in libraries. Although data to help decision making is available in library systems, librarians have little means to utilize it. In addition, modern key technologies, such as machine learning, that generate more value out of data have not yet been utilized to their full extent in the field of libraries. This study set out to discover a way to build a recommendation system that could help librarians who are struggling with the book selection process. This thesis proposed a novel hybrid recommendation system for library book selection. The data used to build the system consisted of book metadata and book circulation data of books located in Joensuu City Library's adult fiction collection. The proposed system was based on both rule-based components and a machine learning model. The user interface for the system was built using web technologies so that the system could be used via a web browser. The proposed recommendation system was evaluated using two different methods: automated tests and focus group methodology. The system achieved an accuracy of 79.79% and an F1 score of 0.86 in automated tests. The uncertainty rate of the system was 27.87%. With these results in automated tests, the proposed system outperformed baseline machine learning models. The main finding from the focus group evaluation was that, while the proposed system was found interesting, librarians thought it would need more features and configurability in order to be usable in real-world scenarios. Results indicate that making good quality recommendations using book metadata is challenging because the data is by nature high-dimensional categorical data. The main implication of the results is that recommendation systems in the domain of library collection development should focus on data pre-processing and feature engineering. Further investigation regarding knowledge representation is suggested.

    DESIGN AND EXPLORATION OF NEW MODELS FOR SECURITY AND PRIVACY-SENSITIVE COLLABORATION SYSTEMS

    Collaboration has been an area of interest in many domains including education, research, healthcare supply chain, Internet of things, and music. It enhances problem solving through expertise sharing, idea sharing, learning and resource sharing, and improved decision making. To address the limitations in the existing literature, this dissertation presents a design science artifact and a conceptual model for collaborative environments. The first artifact is a blockchain-based collaborative information exchange system that utilizes blockchain technology and semi-automated ontology mappings to enable secure and interoperable health information exchange among different health care institutions. The conceptual model proposed in this dissertation explores the factors that influence professionals' continued use of video-conferencing (VC) applications. The conceptual model investigates the role that perceived risks and benefits play in influencing professionals' attitude towards VC apps and consequently their active and automatic use.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-tanks and workshops, this document reports on the state of the art related to multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventory the impact and legal consequences of these technical advances and point out future directions of research.

    Exploiting the conceptual space in hybrid recommender systems: a semantic-based approach

    Unpublished doctoral thesis. Universidad Autónoma de Madrid, Escuela Politécnica Superior, October 200