164 research outputs found

    Multi-style explainable matrix factorization techniques for recommender systems.

    Black-box recommender system models are machine learning models that generate personalized recommendations without explaining to the user how the recommendations were generated or giving the user a way to correct wrong assumptions the model has made about them. However, compared to white-box models, which are transparent and scrutable, black-box models are generally more accurate. Recent research has shown that accuracy alone is not sufficient for user satisfaction. One such black-box model is Matrix Factorization, a state-of-the-art recommendation technique that is widely used due to its ability to deal with sparse data sets and to produce accurate recommendations. Recent work has proposed new Matrix Factorization models that are explainable by incorporating explanations derived from semantic knowledge graphs, user neighborhood graphs, or item neighborhood graphs into the model learning process. These Explainable Matrix Factorization (EMF) methods have the benefit of providing explanations without sacrificing accuracy. However, their explanations tend to be limited to only one explanation style. In this dissertation, we propose a framework comprising new machine learning methods to build explainable models that can make recommendations with multiple explanation styles, by hybridizing multiple EMF models and by proposing new EMF models that explain recommendations using tags. The various pre-calculated explainability scores leveraged in our proposed methods have all been validated in prior work that conducted user studies to evaluate users’ satisfaction with each style individually. Unlike most existing work that generates explanations post-hoc, i.e., after the predictions have already been made, our framework is based on calculating explainability scores directly from available data, before the model is learned, and then using them as part of a regularization mechanism to guide the model learning. 
Unlike post-hoc methods, our framework makes it possible to learn machine learning models that take into account the explanation scores, therefore ensuring higher transparency. Our evaluation experiments show that our proposed methods provide accurate recommendations while also providing users with multiple styles of explanations about how data was used to generate each recommendation. Each explanation style also provides additional decision-making information that empowers the user to either trust or scrutinize the recommendations. Although rooted in the hybrid recommendation framework, our proposed methods make a significant step forward in explainable AI and go beyond existing hybrid frameworks, because the proposed hybridization mechanisms make an intentional effort to take into account the individual models’ explanations and not only their output predicted ratings.

    Social software for music

    Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 200

    Ubiquitous Computing

    This book surveys the actively developing field of ubiquitous computing. Originally proposed by Mark D. Weiser, ubiquitous computing enables real-time global sensing, context-aware information retrieval, multi-modal interaction with the user, and enhanced visualization capabilities. In effect, ubiquitous computing environments offer new ways to observe and interact with our habitat at any time and from anywhere. Researchers in this domain are confronted with many foundational, technological, and engineering issues that were not known before. Detailed cross-disciplinary coverage of these issues is needed for further progress and a wider range of applications. This book collects twelve original works by researchers from eleven countries, clustered into four sections: Foundations, Security and Privacy, Integration and Middleware, and Practical Applications.

    Learning with Single View Co-training and Marginalized Dropout

    The generalization properties of most existing machine learning techniques are predicated on the assumptions that 1) a sufficiently large quantity of training data is available; 2) the training and testing data come from some common distribution. Although these assumptions are often met in practice, there are also many scenarios in which training data from the relevant distribution is insufficient. We focus on making use of additional data, which is readily available or can be obtained easily but comes from a different distribution than the testing data, to aid learning. We present five learning scenarios, depending on how the distribution used to sample the additional training data differs from the testing distribution: 1) learning with weak supervision; 2) domain adaptation; 3) learning from multiple domains; 4) learning from corrupted data; 5) learning with partial supervision. We introduce two strategies and instantiate them in five ways to cope with the difference between the training and testing distributions. The first strategy, which gives rise to Pseudo Multi-view Co-training (PMC) and Co-training for Domain Adaptation (CODA), is inspired by the co-training algorithm for multi-view data. PMC generalizes co-training to the more common single-view data and allows us to learn from weakly labeled data retrieved for free from the web. CODA integrates PMC with an additional feature selection component to address the feature incompatibility between domains for domain adaptation. PMC and CODA are evaluated on a variety of real datasets, and both yield record performance. The second strategy, marginalized dropout, leads to marginalized Stacked Denoising Autoencoders (mSDA), Marginalized Corrupted Features (MCF), and FastTag. mSDA diminishes the difference between distributions associated with different domains by learning a new representation through marginalized corruption and reconstruction. 
MCF learns from a known distribution which is created by corrupting a small set of training data, and improves the robustness of learned classifiers by training on "infinitely" many samples drawn from that distribution. FastTag applies marginalized dropout to the output of partially labeled data to recover missing labels for multi-label tasks. These three algorithms not only achieve state-of-the-art performance in various tasks, but also deliver orders-of-magnitude speed-ups at training and testing compared to competing algorithms.

    Data analytics 2016: proceedings of the fifth international conference on data analytics

    Learning Explainable User Sentiment and Preferences for Information Filtering

    In the last decade, online social networks have enabled people to interact in many ways with each other and with content. The digital traces of such actions reveal people's preferences towards online content such as news or products. These traces often result from interactions such as sharing or liking, but also from interactions in natural language. The continuous growth of the amount of content and of digital traces has led to information overload: surrounded by large volumes of information, people are facing difficulties when searching for information relevant to their interests. To improve user experience, information systems must be able to assist users in achieving their search goals, effectively and efficiently. This thesis is concerned with two important challenges that information systems need to address in order to significantly improve search experience and overcome information overload. First, these systems need to model accurately the variety of user traces, and second, they need to meaningfully explain search results and recommendations to users. To address these challenges, this thesis proposes novel methods based on machine learning to model user sentiment and preferences for information filtering systems, which are effective, scalable, and easily interpretable by humans. We focus on two prominent types of user traces in social networks: on the one hand, user comments accompanied by unary preferences such as likes, and on the other hand, user reviews accompanied by numerical preferences such as star ratings. In both cases, we advocate that by better understanding user text through mining its semantics and modeling its structure, we can not only improve information filtering, but also explain predictions to users. 
Within this context, we aim to answer three main research questions, namely: (i) how do item semantics help to predict unary preferences; (ii) how do sentiments of free-form user texts help to predict unary preferences; and (iii) how to model fine-grained numerical preferences from user review texts. Our goal is to model and extract from user text the knowledge required to answer these questions, and to obtain insights on how to design better information filtering systems that are more effective and improve user experience. To answer the first question, we formulate the recommendation problem based on unary preferences as a top-N retrieval task and we define an appropriate dataset and metrics for measuring performance. Then, we propose and evaluate several content-based methods based on semantic similarities under presence or absence of preferences. To answer the second question, we propose a sentiment-aware neighborhood model which integrates the sentiment of user comments with unary preferences, either through fixed or through learned mapping functions. For the latter type, we propose a learning algorithm which adapts the sentiment of user comments to unary preferences at collective or individual levels. To answer the third question, we cast the problem of modeling user attitude toward aspects of items as a weakly supervised problem, and we propose a weighted multiple-instance learning method for solving it. Lastly, we show that the learned saliency weights, apart from being easily interpretable, are useful indicators for review segmentation and summarization.

    Statistical Modeling and Analysis

    The emergence of distributed ledger technologies, such as blockchain, has revolutionized how individuals interact by enabling "trust-less trust" through peer-to-peer networks, cryptography, and consensus algorithms. This technology eliminates intermediaries and provides secure, transparent transaction methods. However, public understanding of this technology, along with "Tokenomics", remains limited, resulting in speculative discourse. The main objective of this thesis is to investigate the fundamental principles of cryptocurrencies (cryptos) and non-fungible tokens (NFTs) and establish a correlation between the technology and its economic impact from statistical and economic perspectives. To achieve this, Chapters 2 and 3 explore the influence of blockchain technology on the economic and functional performance of cryptos using econometric models and clustering techniques. Chapter 3 presents an empirical framework that offers insights to coin creators and investors regarding the interplay between cryptonomics, blockchain functionality, and market dynamics. The economic performance of cryptocurrencies, illustrated with Ethereum as an example, is shown to be affected by the design of their underlying blockchain technology. Chapter 4 examines partial correlations of Bitcoin returns across nine centralized exchanges from a high-frequency dynamic network perspective. The proposed MHAR-CM provides reasonable covariance estimates that account for the unique characteristics of crypto markets. This chapter uncovers spillover risk and counterparty risk among these exchanges. In Chapter 5, a hedonic regression approach is employed to construct the DAI digital art index for the NFT art market. 
Special emphasis is given to mitigating the impact of outliers using one-step robust regression Huberization and a dynamic conditional score model. The DAI index enhances our understanding of this emerging art market and facilitates observation of its macroeconomic trends. This thesis establishes a connection between emerging technologies and the economy through statistical modeling and analysis. By providing empirical evidence, we gain insights into how blockchain technology is transforming our perceptions of money, art, and various industries

    Natural Language Processing using Deep Learning in Social Media

    In recent years, Deep Learning (DL) has revolutionised the potential of automatic systems that handle Natural Language Processing (NLP) tasks. We have witnessed a tremendous advance in the performance of these systems. Nowadays, such systems are embedded ubiquitously, determining the intent of the text we write, the sentiment of our tweets, or our political views, to cite some examples. 
In this thesis, we propose several NLP models for addressing tasks that deal with social media text. Concretely, this work focuses mainly on the Sentiment Analysis and Personality Recognition tasks. Sentiment Analysis, one of the leading problems in NLP, consists of determining the polarity of a text; it is a well-known task for which the number of resources and models proposed is vast. In contrast, Personality Recognition is a breakthrough task that aims to determine users' personality from their writing style; it is more of a niche task, with fewer resources available, but with great potential. Although the principal focus of this work was on the development of Deep Learning models, we have also proposed models based on linguistic resources and classical Machine Learning models. In this more straightforward setup, we explored the nuances of different language devices, such as the impact of emotions on the correct classification of the sentiment expressed in a text. Afterwards, DL models were developed, particularly Convolutional Neural Networks (CNNs), to address the previously described tasks. In the case of Personality Recognition, we explored both approaches, which allowed us to compare the models under the same circumstances. Notably, NLP has evolved dramatically in recent years through the development of public evaluation campaigns, where multiple research teams compare the performance of their approaches under the same conditions. Most of the models presented here were either assessed in an evaluation campaign or used the setup of a previously held one. Recognising the importance of this effort, we curated and developed an evaluation campaign for classifying political tweets. In addition, as we advanced in the development of this work, we decided to study in depth how CNNs are applied to NLP tasks. Two lines of work were explored in this regard. 
Firstly, we proposed a semantic-based padding method for CNNs, which addresses how to represent text more appropriately for solving NLP tasks. Secondly, a theoretical framework was introduced for tackling one of the most frequent criticisms of Deep Learning: the lack of interpretability. This framework seeks to visualise what lexical patterns, if any, the CNN is learning in order to classify a sentence. In summary, the main achievements presented in this thesis are:
    - The organisation of an evaluation campaign for Topic Classification from texts gathered from social media.
    - The proposal of several Machine Learning models tackling the Sentiment Analysis task from social media, together with a study of the impact of linguistic devices such as figurative language on the task.
    - The development of a model for inferring the personality of a developer given the source code that they have written.
    - The study of Personality Recognition tasks from social media following two different approaches, which were proposed and compared: models based on machine learning algorithms and handcrafted features, and models based on CNNs.
    - The introduction of new semantic-based paddings for optimising how the text is represented in CNNs.
    - The definition of a theoretical framework to provide interpretable information about what CNNs learn internally.
    Giménez Fayos, MT. (2021). Natural Language Processing using Deep Learning in Social Media [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/172164

    Radiomics risk modelling using machine learning algorithms for personalised radiation oncology

    One major objective in radiation oncology is the personalisation of cancer treatment. The implementation of this concept requires the identification of biomarkers that precisely predict therapy outcome. Besides molecular characterisation of tumours, a new approach known as radiomics aims to characterise tumours using imaging data. In the context of the presented thesis, radiomics was established at OncoRay to improve the performance of imaging-based risk models. Two software-based frameworks were developed for image feature computation and risk model construction. A novel data-driven approach for the correction of intensity non-uniformity in magnetic resonance imaging data was developed to improve image quality prior to feature computation. Further, different feature selection methods and machine learning algorithms for time-to-event survival data were evaluated to identify suitable algorithms for radiomics risk modelling. Improved model performance was demonstrated using computed tomography data acquired during the course of treatment. Subsequently, tumour sub-volumes were analysed, and it was shown that the tumour rim contains more relevant prognostic information than the corresponding core. The incorporation of such spatial diversity information is a promising way to improve the performance of risk models. Contents: 1. Introduction 2. Theoretical background 2.1. Basic physical principles of image modalities 2.1.1. Computed tomography 2.1.2. Magnetic resonance imaging 2.2. Basic principles of survival analyses 2.2.1. Semi-parametric survival models 2.2.2. Full-parametric survival models 2.3. Radiomics risk modelling 2.3.1. Feature computation framework 2.3.2. Risk modelling framework 2.4. Performance assessments 2.5. Feature selection methods and machine learning algorithms 2.5.1. Feature selection methods 2.5.2. Machine learning algorithms 3. 
A physical correction model for automatic correction of intensity non-uniformity in magnetic resonance imaging 3.1. Intensity non-uniformity correction methods 3.2. Physical correction model 3.2.1. Correction strategy and model definition 3.2.2. Model parameter constraints 3.3. Experiments 3.3.1. Phantom and simulated brain data set 3.3.2. Clinical brain data set 3.3.3. Abdominal data set 3.4. Summary and discussion 4. Comparison of feature selection methods and machine learning algorithms for radiomics time-to-event survival models 4.1. Motivation 4.2. Patient cohort and experimental design 4.2.1. Characteristics of patient cohort 4.2.2. Experimental design 4.3. Results of feature selection methods and machine learning algorithms evaluation 4.4. Summary and discussion 5. Characterisation of tumour phenotype using computed tomography imaging during treatment 5.1. Motivation 5.2. Patient cohort and experimental design 5.2.1. Characteristics of patient cohort 5.2.2. Experimental design 5.3. Results of computed tomography imaging during treatment 5.4. Summary and discussion 6. Tumour phenotype characterisation using tumour sub-volumes 6.1. Motivation 6.2. Patient cohort and experimental design 6.2.1. Characteristics of patient cohorts 6.2.2. Experimental design 6.3. Results of tumour sub-volumes evaluation 6.4. Summary and discussion 7. Summary and further perspectives 8. Zusammenfassun