Content Recommendation Through Linked Data
Nowadays, people can easily obtain a huge amount of information from the Web, but often they have no criteria to discern it. This issue is known as information overload. Recommender systems are software tools to suggest interesting items to users and can help them to deal with a vast amount of information. Linked Data is a set of best practices to publish data on the Web, and it is the basis of the Web of Data, an interconnected global dataspace.
This thesis discusses how to discover information useful for the user from the vast amount of structured data, notably Linked Data, available on the Web. The work addresses this issue by considering three research questions: how to exploit existing relationships between resources published on the Web to provide recommendations to users; how to represent the user and their context to generate better recommendations for the current situation; and how to effectively visualize the recommended resources and their relationships.
To address the first question, the thesis proposes a new algorithm based on Linked Data which exploits existing relationships between resources to recommend related resources. The algorithm was integrated into a framework to deploy and evaluate Linked Data based recommendation algorithms. In fact, a related problem is how to compare them and how to evaluate their performance when applied to a given dataset. The user evaluation showed that our algorithm improves the rate of new recommendations, while maintaining a satisfying prediction accuracy. To represent the user and their context, this thesis presents the Recommender System Context ontology, which is exploited in a new context-aware approach that can be used with existing recommendation algorithms. The evaluation showed that this method can significantly improve the prediction accuracy. As regards the problem of effectively visualizing the recommended resources and their relationships, this thesis proposes a visualization framework for DBpedia (the Linked Data version of Wikipedia) and mobile devices, which is designed to be extended to other datasets.
In summary, this thesis shows how it is possible to exploit structured data available on the Web to recommend useful resources to users. Linked Data were successfully exploited in recommender systems. Various proposed approaches were implemented and applied to use cases of Telecom Italia.
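As a rough illustration of the idea of recommending resources through their existing relationships, the following minimal sketch scores candidates by the number of (predicate, object) links they share with a seed resource. It is not the algorithm proposed in the thesis; the toy triples, the `ex:` resource names, and the `recommend` function are all assumptions made for this example.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object). All names are hypothetical
# and only stand in for resources found in a dataset such as DBpedia.
TRIPLES = [
    ("ex:Inception", "ex:director", "ex:Christopher_Nolan"),
    ("ex:Interstellar", "ex:director", "ex:Christopher_Nolan"),
    ("ex:Inception", "ex:genre", "ex:Science_Fiction"),
    ("ex:Interstellar", "ex:genre", "ex:Science_Fiction"),
    ("ex:Alien", "ex:genre", "ex:Science_Fiction"),
    ("ex:Alien", "ex:director", "ex:Ridley_Scott"),
]


def outgoing_links(triples):
    """Map each subject to the set of (predicate, object) pairs it has."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[subject].add((predicate, obj))
    return index


def recommend(seed, triples, top_n=5):
    """Rank other subjects by how many links they share with the seed."""
    index = outgoing_links(triples)
    seed_links = index.get(seed, set())
    scores = {}
    for candidate, links in index.items():
        if candidate == seed:
            continue
        shared = len(seed_links & links)
        if shared:
            scores[candidate] = shared
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


print(recommend("ex:Inception", TRIPLES))
```

In this toy graph, the resource sharing both a director and a genre with the seed ranks above the one sharing only a genre. A real system would retrieve the links from a SPARQL endpoint and would likely weight predicates differently.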
SemRevRec: a recommender system based on user reviews and linked data
Traditionally, recommender systems exploit user ratings to infer preferences. However, the growing popularity of social platforms has encouraged users to write textual reviews about liked items. These reviews represent a valuable source of non-trivial information that could improve users' decision processes. In this paper we propose a novel recommendation approach based on the semantic annotation of entities mentioned in user reviews and on the knowledge available in the Web of Data. We compared our recommender system with two baseline algorithms and a state-of-the-art Linked Data based approach. Our system provided more diverse recommendations than the other techniques considered, while obtaining a better accuracy than the Linked Data based method.
Soft-prompt tuning to predict lung cancer using primary care free-text Dutch medical notes
We investigate different natural language processing (NLP) approaches based
on contextualised word representations for the problem of early prediction of
lung cancer using free-text patient medical notes of Dutch primary care
physicians. Because lung cancer has a low prevalence in primary care, we also
address the problem of classification under highly imbalanced classes.
Specifically, we use large Transformer-based pretrained language models (PLMs)
and investigate: 1) how soft prompt-tuning, an NLP technique used to
adapt PLMs using small amounts of training data, compares to standard model
fine-tuning; 2) whether simpler static word embedding models (WEMs) can be more
robust compared to PLMs in highly imbalanced settings; and 3) how models fare
when trained on notes from a small number of patients. We find that 1)
soft-prompt tuning is an efficient alternative to standard model fine-tuning;
2) PLMs show better discrimination but worse calibration compared to simpler
static word embedding models as the classification problem becomes more
imbalanced; and 3) results when training models on a small number of patients are
mixed and show no clear differences between PLMs and WEMs. All our code is
available open source at
https://bitbucket.org/aumc-kik/prompt_tuning_cancer_prediction/.
Comment: A short version of this paper has been published at the 21st
International Conference on Artificial Intelligence in Medicine (AIME 2023).
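To make the notion of soft prompt-tuning concrete, here is a minimal sketch of its core mechanism as generally described: a small set of trainable vectors is prepended to the embeddings of the input tokens, while the pretrained model's own parameters stay frozen. This is not the code from the repository above; the sizes, the random initialisation, and the helper names (`make_matrix`, `build_input`, `count_parameters`) are assumptions chosen for illustration.

```python
import random


def make_matrix(rows, cols, seed):
    """Create a rows x cols matrix of small random values."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes; real pretrained language models are far larger.
VOCAB_SIZE, EMBED_DIM, PROMPT_LEN = 1000, 16, 4

# Stands in for the pretrained embedding table, which is kept frozen.
frozen_embeddings = make_matrix(VOCAB_SIZE, EMBED_DIM, seed=0)
# The soft prompt: the only parameters that training would update.
soft_prompt = make_matrix(PROMPT_LEN, EMBED_DIM, seed=1)


def build_input(token_ids):
    """Prepend the soft prompt vectors to the frozen token embeddings."""
    return soft_prompt + [frozen_embeddings[t] for t in token_ids]


def count_parameters(matrix):
    """Count the scalar values stored in a matrix."""
    return sum(len(row) for row in matrix)


sequence = build_input([5, 42, 7])
trainable = count_parameters(soft_prompt)
frozen = count_parameters(frozen_embeddings)
print(len(sequence), trainable, frozen)
```

Even in this toy setting the trainable prompt is a small fraction of the frozen parameters, which is the usual reason the technique is considered efficient when training data is scarce.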
Content Recommendation through Semantic Annotation of User Reviews and Linked Data - An Extended Technical Report
Nowadays, most recommender systems exploit user-provided ratings to infer users' preferences. However, the growing popularity of social and e-commerce websites has encouraged users to also share comments and opinions through textual reviews. In this paper, we introduce a new recommendation approach which exploits the semantic annotation of user reviews to extract useful and non-trivial information about the items to recommend. It also relies on the knowledge freely available in the Web of Data, notably in DBpedia and Wikidata, to discover other resources connected with the annotated entities. We evaluated our approach in three domains, using both DBpedia and Wikidata. The results showed that our solution provides a better ranking than another recommendation method based on the Web of Data, while improving novelty with respect to traditional techniques based on ratings. Additionally, our method achieved a better performance with Wikidata than with DBpedia.
Training researchers with the MOVING platform
The MOVING platform enables its users to improve their information literacy by training them to exploit data and text mining methods in their daily research tasks. In this paper, we show how it can support researchers in various tasks, and we introduce its main features, such as text and video retrieval and processing, advanced visualizations, and the technologies to assist the learning process.
Executing, Comparing, and Reusing Linked Data-Based Recommendation Algorithms With the Allied Framework
Data published on the Web following the Linked Data principles has resulted in a global data space called the Web of Data. These principles make it possible to semantically interlink and connect different resources at the data level, regardless of their structure, authorship, location, etc. The tremendous and continuous growth of the Web of Data also implies that it is now more likely to contain resources that describe real-life concepts. However, discovering and recommending relevant related resources is still an open research area. This chapter studies recommender systems that use Linked Data as a source containing a significant amount of available resources and their relationships useful to produce recommendations. Furthermore, it presents a framework to deploy and execute state-of-the-art algorithms for Linked Data that have been re-implemented so that they can be measured and benchmarked in different application domains without being bound to a single dataset.
MOVING: A User-Centric Platform for Online Literacy Training and Learning
Part of the Progress in IS book series (PROIS). In this paper, we present an overview of the MOVING platform, a user-driven approach that enables young researchers, decision makers, and public administrators to use machine learning and data mining tools to search, organize, and manage large-scale information sources on the web such as scientific publications, videos of research talks, and social media. In order to provide a concise overview of the platform, we focus on its front end, which is the MOVING web application. By presenting the main components of the web application, we illustrate what functionalities and capabilities the platform offers its end-users, rather than delving into the data analysis and machine learning technologies that make these functionalities possible.
EduArc. A FAIR and user-centred infrastructure for learning resources
The project EduArc aims at conceptualising a user-centred infrastructure that allows access to and use of learning and teaching material for higher education, specifically Open Educational Resources (OER), from diverse relevant sources like learning management systems and existing university repositories.