742 research outputs found
TrustDL: Use of trust-based dictionary learning to facilitate recommendation in social networks
Peer reviewed. Collaborative filtering (CF) is a widely applied method to perform recommendation tasks in a wide range of domains and applications. Dictionary learning (DL) models, which are highly important in CF-based recommender systems (RSs), are well represented by rating matrices. However, these methods alone do not resolve the cold start and data sparsity issues in RSs. We observed a significant improvement in rating results by adding trust information on the social network. For that purpose, we proposed a new dictionary learning technique based on trust information, called TrustDL, where the social network data were employed in the process of recommendation based on structural details on the trusted network. TrustDL sought to integrate the sources of information, including trust statements and ratings, into the recommendation model to mitigate both problems of cold start and data sparsity. It conducted dictionary learning and trust embedding simultaneously to predict unknown rating values. In this paper, the dictionary learning technique was integrated into rating learning, along with the trust consistency regularization term designed to offer a more accurate understanding of the feature representation. Moreover, partially identical trust embedding was developed, where users with similar rating sets could cluster together, and those with similar rating sets could be represented collaboratively. The proposed strategy appears significantly beneficial based on experiments conducted on four frequently used datasets: Epinions, Ciao, FilmTrust, and Flixster.
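As a rough illustration of how trust information can help a cold-start user, the sketch below trains a plain matrix factorization model and adds a trust term that pulls each user's latent factors toward the mean factors of the users they trust. This is a generic social-regularization sketch, not the TrustDL dictionary learning method itself. The function names, hyperparameters, and toy data are all assumptions made for the example.

```python
import random

def train_trust_mf(ratings, trust, n_users, n_items, k=2,
                   lr=0.02, reg=0.02, trust_reg=1.0, epochs=500, seed=0):
    """Matrix factorization with a simple trust pull (illustrative only).

    ratings: list of (user, item, value) triples.
    trust:   dict mapping a user to the list of users they trust.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        # Rating term: SGD on squared error with L2 regularization.
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
        # Trust term: move each user toward the mean factors of trusted users.
        for u, friends in trust.items():
            if not friends:
                continue
            for f in range(k):
                mean_f = sum(P[v][f] for v in friends) / len(friends)
                P[u][f] -= lr * trust_reg * (P[u][f] - mean_f)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating is the dot product of user and item factors."""
    return sum(a * b for a, b in zip(P[u], Q[i]))

# Toy data (hypothetical): users 0 and 1 have ratings; user 2 has none
# but trusts user 0, so user 2 is a cold-start user.
ratings = [(0, 0, 5), (0, 1, 4), (0, 2, 1), (0, 3, 1),
           (1, 0, 4), (1, 1, 5), (1, 2, 1), (1, 3, 2)]
trust = {2: [0]}
P, Q = train_trust_mf(ratings, trust, n_users=3, n_items=4)
```

In this toy setup, user 2 ends up with predictions close to those of user 0, even though user 2 has rated nothing. That is the intuition behind using trust links to ease cold start and sparsity.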
Reputation-based Trust Management in Peer-to-Peer File Sharing Systems
Trust is required in file sharing peer-to-peer (P2P) systems to achieve better cooperation among peers and reduce malicious uploads. In reputation-based P2P systems, reputation is used to build trust among peers based on their past transactions and feedback from other peers. In these systems, reputable peers will usually be selected to upload requested files, significantly decreasing malicious uploads in the system.
This thesis surveys different reputation management systems with a focus on reputation-based P2P systems. We break down a typical reputation system into functional components. We discuss each component and present proposed solutions from the literature. Different reputation-based systems are described and analyzed. Each proposed scheme presents a particular perspective in addressing peers’ reputation.
This thesis also presents a novel trust management framework and associated schemes for partially decentralized file sharing P2P systems. We address trust according to three identified dimensions: Authentic Behavior, Credibility Behavior and Contribution Behavior. Within our trust management framework, we propose several algorithms for reputation management. In particular, we propose algorithms to detect malicious peers that send inauthentic files, and liar peers that send wrong feedback.
Reputable peers need to be motivated to upload authentic files by increasing the benefits received from the system. In addition, free riders need to contribute positively to the system. These peers are consuming resources without uploading to others. To provide the right incentives for peers, we develop a novel service differentiation scheme based on peers’ contribution rather than peers’ reputation. The proposed scheme protects the system against free-riders and malicious peers and reduces the service provided to them.
In this thesis, we also propose a novel recommender framework for partially decentralized file sharing P2P systems. We take advantage of the partial search process used in these systems to explore the relationships between peers. The proposed recommender system does not require any additional effort from the users since implicit rating is used. The recommender system also does not suffer from the problems that affect traditional collaborative filtering schemes, such as cold start, data sparseness, and the popularity effect.
Overall, our unified approach to trust management and recommendations allows for better system health and increased user satisfaction.
User Modeling for Practical Recommender Systems Based on Deep Learning and Graph Analysis
Tohoku University石垣司課
Context-Aware Recommendation Systems in Mobile Environments
Nowadays, the huge amount of information available may easily overwhelm users when they need to make a decision that involves choosing among several options. As a solution to this problem, Recommendation Systems (RS) have emerged to offer relevant items to users. The main goal of these systems is to recommend certain items based on user preferences. Unfortunately, traditional recommendation systems do not consider the user’s context as an important dimension to ensure high-quality recommendations. Motivated by the need to incorporate contextual information during the recommendation process, Context-Aware Recommendation Systems (CARS) have emerged. However, these recent recommendation systems are not designed with mobile users in mind, where the context and the movements of the users and items may be important factors to consider when deciding which items should be recommended. Therefore, context-aware recommendation models should be able to effectively and efficiently exploit the dynamic context of the mobile user in order to offer her/him suitable recommendations and keep them up-to-date.
The research area of this thesis belongs to the fields of context-aware recommendation systems and mobile computing. We focus on the following scientific problem: how could we facilitate the development of context-aware recommendation systems in mobile environments to provide users with relevant recommendations? This work is motivated by the lack of generic and flexible context-aware recommendation frameworks that consider aspects related to mobile users and mobile computing. In order to solve the identified problem, we pursue the following general goal: the design and implementation of a context-aware recommendation framework for mobile computing environments that facilitates the development of context-aware recommendation applications for mobile users.
In the thesis, we contribute to bridging the gap not only between recommendation systems and context-aware computing, but also between CARS and mobile computing.
A Trust Management Framework for Decision Support Systems
In the era of information explosion, it is critical to develop a framework which can extract useful information and help people to make “educated” decisions. In our lives, whether we are aware of it, trust has turned out to be very helpful for us to make decisions. At the same time, cognitive trust, especially in large systems, such as Facebook, Twitter, and so on, needs support from computer systems. Therefore, we need a framework that can effectively, but also intuitively, let people express their trust, and enable the system to automatically and securely summarize the massive amounts of trust information, so that a user of the system can make “educated” decisions, or at least not blind decisions. Inspired by the similarities between human trust and physical measurements, this dissertation proposes a measurement theory based trust management framework. It consists of three phases: trust modeling, trust inference, and decision making. Instead of proposing specific trust inference formulas, this dissertation proposes a fundamental framework which is flexible and can be adapted by many different inference formulas. Validation experiments are done on two data sets: the Epinions.com data set and the Twitter data set. This dissertation also adapts the measurement theory based trust management framework for two decision support applications. In the first application, the real stock market data is used as ground truth for the measurement theory based trust management framework. Basically, the correlation between the sentiment expressed on Twitter and stock market data is measured. Compared with existing works which do not differentiate tweets’ authors, this dissertation analyzes trust among stock investors on Twitter and uses the trust network to differentiate tweets’ authors. 
The results show that by using the measurement theory based trust framework, Twitter sentiment valence is able to reflect abnormal stock returns better than treating all the authors as equally important or weighting them by their number of followers. In the second application, the measurement theory based trust management framework is used to help detect and prevent attacks in cloud computing scenarios. In this application, each single flow is treated as a measurement. The simulation results show that the measurement theory based trust management framework is able to provide guidance for cloud administrators and customers to make decisions, e.g. migrating tasks from suspect nodes to trustworthy nodes, dynamically allocating resources according to trust information, and managing the trade-off between the degree of redundancy and the cost of resources.
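The three author-weighting schemes compared in the first application (equal weight, follower count, trust) can be illustrated with a small weighted average. This is only a sketch of the comparison, not the dissertation's measurement theory based inference. The author names, sentiment values, follower counts, and trust scores are hypothetical.

```python
def weighted_sentiment(tweets, weights=None):
    """Aggregate tweet sentiment as a weighted mean.

    tweets:  list of (author, sentiment) pairs, sentiment in [-1, 1].
    weights: dict mapping author to a non-negative weight;
             None means every author counts equally.
    """
    if weights is None:
        weights = {author: 1.0 for author, _ in tweets}
    total = sum(weights.get(author, 0.0) for author, _ in tweets)
    if total == 0:
        return 0.0
    return sum(weights.get(author, 0.0) * s for author, s in tweets) / total

# Hypothetical inputs: one tweet per author about the same stock.
tweets = [("alice", 1.0), ("bob", -1.0), ("carol", 1.0)]
followers = {"alice": 100, "bob": 10000, "carol": 50}
trust = {"alice": 0.9, "bob": 0.1, "carol": 0.5}

equal = weighted_sentiment(tweets)                   # every author equal
by_followers = weighted_sentiment(tweets, followers) # dominated by bob
by_trust = weighted_sentiment(tweets, trust)         # dominated by alice
```

With these made-up numbers, the follower-weighted signal is negative because one widely followed author dominates, while the trust-weighted signal is positive. The sketch shows only that the choice of weights can change the aggregate signal.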
Topics in Computational Advertising
Computational advertising is an emerging scientific discipline that incorporates tools and ideas from fields such as statistics, computer science, and economics. Although a consequence of the rapid growth of the Internet, computational advertising has since helped transform the online advertising business into a multi-billion dollar industry.
The fundamental goal of computational advertising is to determine the "best" online ad to display to any given user. This "best" ad, however, changes depending upon the specific context that is under consideration. This leads to a variety of different problems, three of which are discussed in this thesis.
Chapter 1 briefly introduces the topics of online advertising and computational advertising. Chapter 2 proposes a numerical method to approximate the pure strategy Nash equilibrium bidding functions in an independent private value first-price sealed-bid auction where bidders draw their types from continuous and atomless distributions, a setting in which solutions cannot generally be analytically derived, despite the fact that they are known to exist and be unique. Chapter 3 proposes a cross-domain recommender system that is a multiple-domain extension of the Bayesian Probabilistic Matrix Factorization model. Chapter 4 discusses some of the tools and challenges of text mining by using the Trayvon Martin shooting incident as a case study in analyzing the lexical content and network connectivity structure of the political blogosphere. Finally, Chapter 5 presents some concluding remarks and briefly discusses other problems in computational advertising.
Dissertation
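For contrast with the general setting addressed in Chapter 2, the symmetric special case does have a well-known closed form. With n bidders whose values are drawn independently from the same distribution F on [0, v_max], the equilibrium bid is b(v) = v - (∫₀ᵛ F(x)^(n-1) dx) / F(v)^(n-1). The sketch below evaluates that formula numerically and checks it against the uniform case, where it reduces to b(v) = (n-1)/n · v. It is a textbook baseline, not the thesis's numerical method, and the function names are assumptions.

```python
def equilibrium_bid(v, n, cdf, steps=10_000):
    """Symmetric first-price equilibrium bid for value v with n bidders.

    Assumes i.i.d. private values with CDF `cdf` on a support starting
    at 0, and cdf(v) > 0. The integral uses the midpoint rule.
    """
    if v <= 0:
        return 0.0
    h = v / steps
    integral = sum(cdf((j + 0.5) * h) ** (n - 1) for j in range(steps)) * h
    return v - integral / cdf(v) ** (n - 1)

def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)

# With 3 bidders and uniform values, a bidder with value 0.8
# should shade the bid to (3 - 1) / 3 * 0.8.
bid = equilibrium_bid(0.8, 3, uniform_cdf)
closed_form = (3 - 1) / 3 * 0.8
```

Once bidders draw from different distributions, the shared closed form no longer applies in general. That is the gap a numerical approximation of the bidding functions is meant to fill.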
Congenial Web Search : A Conceptual Framework for Personalized, Collaborative, and Social Peer-to-Peer Retrieval
Traditional information retrieval methods fail to address the fact that information consumption and production are social activities. Most Web search engines do not consider the social-cultural environment of users' information needs and the collaboration between users. This dissertation addresses a new search paradigm for Web information retrieval denoted as Congenial Web Search. It emphasizes personalization, collaboration, and socialization methods in order to improve effectiveness. The client-server architecture of Web search engines only allows the consumption of information. A peer-to-peer system architecture has been developed in this research to improve information seeking. Each user is involved in an interactive process to produce meta-information. Based on a personalization strategy on each peer, the user is supported to give explicit feedback for relevant documents. His information need is expressed by a query that is stored in a Peer Search Memory. On one hand, query-document associations are incorporated in a personalized ranking method for repeated information needs. The performance is shown in a known-item retrieval setting. On the other hand, explicit feedback of each user is useful to discover collaborative information needs. A new method for a controlled grouping of query terms, links, and users was developed to maintain Virtual Knowledge Communities. The quality of this grouping represents the effectiveness of grouped terms and links. Both strategies, personalization and collaboration, tackle the problem of a missing socialization among searchers. Finally, a concept for integrated information seeking was developed. This incorporates an integrated representation to improve effectiveness of information retrieval and information filtering. An integrated information retrieval process explores a virtual search network of Peer Search Memories in order to accomplish a reputation-based ranking. 
In addition, the community structure is considered by an integrated information filtering process. Both concepts have been evaluated and shown to have a better performance than traditional techniques. The methods presented in this dissertation offer the potential for more transparency and control in Web search.
Exploiting Latent Information in Recommender Systems
This thesis exploits latent information in personalised recommendation, and investigates how this information can be used to improve recommender systems. The investigations span three directions: scalar rating-based collaborative filtering, distributional rating-based collaborative filtering, and distributional rating-based hybrid filtering. In the first investigation, the thesis discovers through data analysis three problems in nearest neighbour collaborative filtering (item irrelevance, preference imbalance, and biased average) and identifies a solution: incorporating “target awareness” in the computation of user similarity and rating deviation. Two new algorithms are subsequently proposed. Quantitative experiments show that the new algorithms, especially the first one, are able to significantly improve the performance under normal situations. They do not, however, excel in cold-start situations owing to their greater demand for data. The second investigation builds upon the experimental analysis of the first investigation, and examines the use of discrete probabilistic distributional modelling throughout the recommendation process. It encompasses four ideas: 1) distributional input rating, which enables the explicit representation of noise patterns in user inputs; 2) distributional voting profile, which enables the preservation of not only shift but also spread and peaks in user’s rating habits; 3) distributional similarity, which enables the untangled and separated similarity computation of the likes and the dislikes; and 4) distributional prediction, which enables the communication of the uncertainty, granularity, and ambivalence in the recommendation results. Quantitative experiments show that this model is able to improve the effectiveness of recommendation compared to the scalar model and other published discrete probabilistic models, especially in terms of binary and list recommendation accuracy.
The third investigation is based on an analysis regarding the relationship between rating, item content, item quality, and “intangibles”, and is enabled by the discrete probabilistic model proposed in the second investigation. Based on the analysis, a fundamentally different hybrid filtering structure is proposed, where the hybridisation strategy is neither linear nor sequential, but of a divide-and-conquer shape backed by probabilistic derivation. Experimental results show that it is able to outperform the standard linear and sequential hybridisation structures.
MapReduce based RDF assisted distributed SVM for high throughput spam filtering
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University.
Electronic mail has become embedded in our everyday lives. Billions of legitimate emails are sent on a daily basis. The widely established underlying infrastructure, its widespread availability as well as its ease of use have all acted as catalysts to such pervasive proliferation. Unfortunately, the same can be said of unsolicited bulk email, or rather spam. Various methods and enabling architectures are available to try to mitigate spam permeation. In this respect, this dissertation complements existing survey work in this area by contributing an extensive literature review of traditional and emerging spam filtering approaches. Techniques, approaches and architectures employed for spam filtering are appraised, critically assessing respective strengths and weaknesses.
Velocity, volume and variety are key characteristics of the spam challenge. MapReduce (M/R) has become increasingly popular as an Internet scale, data intensive processing platform. In the context of machine learning based spam filter training, support vector machine (SVM) based techniques have been proven effective. SVM training is however a computationally intensive process. In this dissertation, an M/R based distributed SVM algorithm for scalable spam filter training, designated MRSMO, is presented. By distributing and processing subsets of the training data across multiple participating computing nodes, the distributed SVM reduces spam filter training time significantly. To mitigate the accuracy degradation introduced by the adopted approach, a Resource Description Framework (RDF) based feedback loop is evaluated. Experimental results demonstrate that this improves the accuracy levels of the distributed SVM beyond the original sequential counterpart.
Effectively exploiting large scale, ‘Cloud’ based, heterogeneous processing capabilities for M/R in what can be considered a non-deterministic environment requires the consideration of a number of perspectives. In this work, gSched, a Hadoop M/R based, heterogeneous aware task to node matching and allocation scheme is designed. Using MRSMO as a baseline, experimental evaluation indicates that gSched improves on the performance of the out-of-the-box Hadoop counterpart in a typical Cloud based infrastructure.
The focal contribution to knowledge is a scalable, heterogeneous infrastructure and machine learning based spam filtering scheme, able to capitalize on collaborative accuracy improvements through RDF based, end user feedback.
Efficient Decision Support Systems
This series is directed to diverse managerial professionals who are leading the transformation of individual domains by using expert information and domain knowledge to drive decision support systems (DSSs). The series offers a broad range of subjects addressed in specific areas such as health care, business management, banking, agriculture, environmental improvement, natural resource and spatial management, aviation administration, and hybrid applications of information technology aimed at interdisciplinary issues. This book series is composed of three volumes: Volume 1 consists of general concepts and methodology of DSSs; Volume 2 consists of applications of DSSs in the biomedical domain; Volume 3 consists of hybrid applications of DSSs in multidisciplinary domains. The book is shaped upon decision support strategies in the new infrastructure that assists the readers in making full use of the creative technology to manipulate input data and to transform information into useful decisions for decision makers.