
    A Survey on Trust and Distrust Propagation for Web Pages

    Search engines are the hub for information retrieval from the web, but because of web spam we may not get the desired information from them. The phrase web spam is used for web pages that are designed to spam web search results by using unacceptable tactics. Web spam pages use different techniques to achieve undeserved rankings on the web. Over the last decades, researchers have been designing techniques to identify web spam pages so that they do not deteriorate the quality of search results. In this paper we present a survey of different web spam techniques with their underlying principles and algorithms. We have surveyed all the major spam detection techniques and provided a brief discussion of the pros and cons of the existing techniques. Finally, we summarize the various observations and underlying principles that are applied in spam detection techniques. Keywords: TrustRank, Anti-TrustRank, Good-Bad Rank, Spam Detection, Demotion
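    The propagation idea surveyed above is easiest to see in TrustRank form: trust is injected at a small whitelist of human-checked pages and pushed along out-links with a damping factor, so pages that accumulate little trust become spam candidates. The following Python sketch illustrates that scheme on a toy link graph; the graph, seed set, and parameters are invented for illustration and are not taken from any of the surveyed papers.

```python
# Minimal sketch of TrustRank-style trust propagation over a toy link graph.
# The graph, seed set, and damping factor are illustrative assumptions.

def trustrank(graph, trusted_seeds, damping=0.85, iterations=50):
    """Propagate trust from a whitelist of seed pages along outgoing links.

    graph: dict mapping page -> list of pages it links to.
    trusted_seeds: pages judged non-spam by human evaluators.
    Returns a dict of trust scores; low scores are candidate spam pages.
    """
    pages = list(graph)
    # Static score vector: trust is injected only at the trusted seeds.
    seed_mass = {p: (1.0 / len(trusted_seeds) if p in trusted_seeds else 0.0)
                 for p in pages}
    trust = dict(seed_mass)
    for _ in range(iterations):
        incoming = {p: 0.0 for p in pages}
        for page, links in graph.items():
            if not links:
                continue
            share = trust[page] / len(links)   # split trust over out-links
            for target in links:
                incoming[target] += share
        # Biased PageRank update: damp propagated trust, re-inject seed trust.
        trust = {p: damping * incoming[p] + (1 - damping) * seed_mass[p]
                 for p in pages}
    return trust

if __name__ == "__main__":
    toy_web = {
        "good_hub": ["good_page", "spam_page"],
        "good_page": ["good_hub"],
        "spam_page": ["spam_farm"],
        "spam_farm": ["spam_page"],
    }
    scores = trustrank(toy_web, trusted_seeds={"good_hub"})
    for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{page}: {score:.3f}")
```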

    Link-based similarity search to fight web spam

    www.ilab.sztaki.hu/websearch We investigate the usability of similarity search in fighting Web spam, based on the assumption that an unknown spam page is more similar to certain known spam pages than to honest pages. In order to be successful, search engine spam never appears in isolation: we observe link farms and alliances built for the sole purpose of search engine ranking manipulation. Their artificial nature and strong internal connectedness have, however, given rise to successful algorithms for identifying search engine spam. One example is trust and distrust propagation, an idea originating in recommender systems and P2P networks, which yields spam classifiers by spreading information along hyperlinks from whitelists and blacklists. While most previous results use PageRank variants for propagation, we form classifiers by investigating the similarity top lists of an unknown page under various measures such as co-citation, companion, nearest neighbors in low-dimensional projections, and SimRank. We test our method on two data sets previously used to measure spam filtering algorithms.
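    As a rough illustration of the co-citation measure mentioned above (not the paper's full pipeline), the sketch below scores two pages as similar when many of the same pages link to both, then checks what fraction of an unknown page's most similar neighbours are blacklisted. The tiny graph and blacklist are invented for the example.

```python
# Hedged sketch of a co-citation similarity top list: two pages are similar
# if many pages link to both, and a page whose most similar known pages are
# mostly blacklisted is itself suspicious. Graph and blacklist are invented.

from collections import defaultdict

def cocitation_similarity(links):
    """links: dict source -> set of targets. Returns sim[(a, b)] = number of common in-linkers."""
    inlinks = defaultdict(set)
    for src, targets in links.items():
        for t in targets:
            inlinks[t].add(src)
    sim = defaultdict(int)
    pages = list(inlinks)
    for i, a in enumerate(pages):
        for b in pages[i + 1:]:
            common = len(inlinks[a] & inlinks[b])
            if common:
                sim[(a, b)] = sim[(b, a)] = common
    return sim

def spam_fraction_in_top_list(page, sim, blacklist, k=3):
    """Fraction of the k most co-cited neighbours of `page` that are known spam."""
    neighbours = sorted(
        ((other, s) for (p, other), s in sim.items() if p == page),
        key=lambda kv: -kv[1])[:k]
    if not neighbours:
        return 0.0
    return sum(1 for other, _ in neighbours if other in blacklist) / len(neighbours)

if __name__ == "__main__":
    links = {
        "farm1": {"spamA", "spamB", "unknown"},
        "farm2": {"spamA", "spamB", "unknown"},
        "editor": {"honest1", "honest2"},
    }
    sim = cocitation_similarity(links)
    print(spam_fraction_in_top_list("unknown", sim, blacklist={"spamA", "spamB"}))
```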

    Methods for demoting and detecting Web spam

    Web spamming has tremendously subverted the ranking mechanism of information retrieval in Web search engines. It maliciously manipulates data sources, either through content or links, with the intention of having a negative impact on Web search results. The reordering of search results by spammers has made it harder and more time-consuming for Web users to retrieve relevant information. In order to improve the quality of Web search engine results, anti-Web spam techniques are developed in this thesis to detect and demote Web spam via trust and distrust and via Web spam classification.

    A comprehensive literature review of existing anti-Web spam techniques, emphasizing the trust and distrust model and the machine learning model, is presented. Furthermore, several experiments are conducted to show the vulnerability of ranking algorithms to Web spam. Two publicly available Web spam datasets are used for the experiments throughout the thesis: WEBSPAM-UK2006 and WEBSPAM-UK2007.

    Two link-based trust and distrust model algorithms are then presented: Trust Propagation Rank and Trust Propagation Spam Mass. Both algorithms semi-automatically detect and demote Web spam based on a limited human experts' evaluation of non-spam and spam pages. In the experiments, Trust Propagation Rank and Trust Propagation Spam Mass achieve up to 10.88% and 43.94% improvement over the benchmark algorithms.

    Thereafter, the weight properties associated with the linkage between two Web hosts are introduced into the task of Web spam detection. In most studies, the weight properties are used in the ranking mechanism; in this research work, they are incorporated into distrust-based algorithms to detect more spam. The experiments show that the weight properties enhance existing distrust-based Web spam detection algorithms by up to 30.26% and 31.30% on the two aforementioned datasets.

    Even though the integration of weight properties shows significant results in detecting Web spam, a distrust seed set propagation algorithm is presented to further enhance Web spam detection. This algorithm propagates the distrust score over a wider range to estimate the probability that other unevaluated Web pages are spam. The experimental results show that the algorithm improves the distrust-based Web spam detection algorithms by up to 19.47% and 25.17% on the two datasets.

    An alternative machine learning classifier, a multilayered perceptron neural network, is proposed in the thesis to further improve the detection rate of Web spam. In the experiments, the detection rate of Web spam using the multilayered perceptron neural network increases by up to 14.02% and 3.53% over the conventional classifier, support vector machines. At the same time, a mechanism to determine the number of hidden neurons for the multilayered perceptron neural network is presented in this thesis to simplify the design of the network structure.
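    To make the distrust side of these algorithms concrete, the sketch below propagates distrust backwards from a blacklisted seed set over a weighted host graph, loosely echoing the "weight properties" idea; it is a simplified illustration, not the thesis's Trust Propagation Rank or Spam Mass algorithms, and the graph, seeds, and parameters are invented.

```python
# Simplified distrust propagation: distrust flows from blacklisted seed hosts
# to the hosts that link to them, with each host-to-host edge weighted (e.g.
# by the number of links between the two hosts). All data are illustrative.

def propagate_distrust(in_links, spam_seeds, damping=0.85, iterations=30):
    """in_links: dict target_host -> {linking_host: edge_weight}."""
    hosts = set(in_links) | {h for nbrs in in_links.values() for h in nbrs}
    seed = {h: (1.0 if h in spam_seeds else 0.0) for h in hosts}
    distrust = dict(seed)
    for _ in range(iterations):
        nxt = {h: (1 - damping) * seed[h] for h in hosts}
        for target, linkers in in_links.items():
            total_w = sum(linkers.values())
            for linker, w in linkers.items():
                # A host pointing at a distrusted target absorbs distrust,
                # in proportion to how heavily it links there.
                nxt[linker] += damping * distrust[target] * (w / total_w)
        distrust = nxt
    return distrust

if __name__ == "__main__":
    in_links = {
        "spam-hub.example": {"spam-farm.example": 40, "victim.example": 1},
        "victim.example": {"blog.example": 2},
    }
    scores = propagate_distrust(in_links, spam_seeds={"spam-hub.example"})
    print(sorted(scores.items(), key=lambda kv: -kv[1]))
```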

    Trust and Credibility in Online Social Networks

    Increasing portions of people's social and communicative activities now take place in the digital world. The growth and popularity of online social networks (OSNs) have tremendously facilitated online interaction and information exchange. As OSNs enable people to communicate more effectively, a large volume of user-generated content (UGC) is produced daily. Because UGC contains valuable information, more people now turn to OSNs for news, opinions, and social networking. Besides users, companies and business owners also benefit from UGC, as they use OSNs as platforms for communicating with customers and for marketing. Hence, UGC has a powerful impact on users' opinions and decisions. However, the openness of OSNs also brings concerns about trust and credibility online. The freedom and ease of publishing information online can lead to UGC of problematic quality. It has been observed that professional spammers are hired to insert deceptive content and promote harmful information in OSNs. This is known as the spamming problem, and it jeopardizes the ecosystems of OSNs. The severity of the spamming problem has attracted the attention of researchers, and many detection approaches have been proposed. However, most existing approaches are based on behavioral patterns; as spammers evolve to evade detection by faking normal behaviors, these approaches may fail. In this dissertation, we present our work on detecting spammers by extracting behavioral patterns that are difficult to manipulate in OSNs. We focus on two scenarios: review spamming and social bots. We first identify that rating deviations and opinion deviations are invariant patterns in review spamming activities, since the goal of review spamming is to insert deceptive reviews. We utilize these two kinds of deviations as clues for trust propagation and propose our detection mechanisms. For social bot detection, we identify that the behavioral patterns among users in a neighborhood are difficult for a social bot to manipulate and propose a neighborhood-based detection scheme. Our work shows that the trustworthiness of a user is reflected in social relations and in the opinions expressed in review content. In addition, the features we extract from the neighborhood are useful for social bot detection.
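    The rating-deviation clue can be illustrated with a few lines of Python: a reviewer whose ratings consistently diverge from the consensus rating of the products they review receives a lower trust score. This is a simplified stand-in for the dissertation's trust propagation mechanism, and the review data and scoring formula are invented.

```python
# Simplified rating-deviation scoring: reviewers far from the per-product
# consensus get lower trust. Data and formula are illustrative assumptions.

from collections import defaultdict

def reviewer_trust(reviews):
    """reviews: list of (reviewer, product, rating on a 1-5 scale)."""
    ratings_by_product = defaultdict(list)
    for _, product, rating in reviews:
        ratings_by_product[product].append(rating)
    product_mean = {p: sum(rs) / len(rs) for p, rs in ratings_by_product.items()}

    deviations = defaultdict(list)
    for reviewer, product, rating in reviews:
        deviations[reviewer].append(abs(rating - product_mean[product]))

    # Map mean absolute deviation (0..4) into a trust score in [0, 1].
    return {r: 1.0 - (sum(ds) / len(ds)) / 4.0 for r, ds in deviations.items()}

if __name__ == "__main__":
    reviews = [
        ("alice", "camera", 4), ("bob", "camera", 5), ("shill", "camera", 1),
        ("alice", "phone", 2), ("shill", "phone", 5),
    ]
    for reviewer, trust in sorted(reviewer_trust(reviews).items(), key=lambda kv: -kv[1]):
        print(f"{reviewer}: trust={trust:.2f}")
```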

    Networks and trust: systems for understanding and supporting internet security

    This dissertation takes a systems-level view of the multitude of existing trust management systems to make sense of when, where, and how (or, in some cases, if) each is best utilized. Trust is a belief by one person that by transacting with another person (or organization) within a specific context, a positive outcome will result. Trust serves as a heuristic that enables us to simplify the dozens of decisions we make each day about whom to transact with. In today's hyperconnected world, in which for many people the bulk of their daily transactions related to business, entertainment, news, and even critical services like healthcare take place online, we tend to rely even more on heuristics like trust to help us simplify complex decisions. Thus, trust plays a critical role in online transactions. For this reason, over the past several decades researchers have developed a plethora of trust metrics and trust management systems for use in online systems. These systems have been most frequently applied to improve recommender systems and reputation systems. They have been designed for and applied to varied online systems including peer-to-peer (P2P) filesharing networks, e-commerce platforms, online social networks, messaging and communication networks, sensor networks, distributed computing networks, and others. However, comparatively little research has examined the effects on individuals, organizations, or society of the presence or absence of trust in online sociotechnical systems.

    Using these existing trust metrics and trust management systems, we design a set of experiments to benchmark the performance of the existing systems, which rely heavily on network analysis methods. Drawing on the experiments' results, we propose a heuristic decision-making framework for selecting a trust management system for use in online systems. In this dissertation we also investigate several related but distinct aspects of trust in online sociotechnical systems. Using network/graph analysis methods, we examine how trust (or the lack of trust) affects the performance of online networks in terms of security and quality of service. We explore the structure and behavior of online networks including Twitter, GitHub, and Reddit through the lens of trust. We find that higher levels of trust within a network are associated with more spread of misinformation (a form of cybersecurity threat, according to the US CISA) on Twitter. We also find that higher levels of trust in open source developer networks on GitHub are associated with more frequent incidences of cybersecurity vulnerabilities.

    Using the experimental and empirical findings described above, we apply the Systems Engineering Process to design and prototype a trust management tool for use on Reddit, which we dub Coni the Trust Moderating Bot. Coni is, to the best of our knowledge, the first trust management tool designed specifically for the Reddit platform. Through our work with Coni, we develop and present a blueprint for constructing a Reddit trust tool that not only measures trust levels but can also use these trust levels to take actions on Reddit to improve the quality of submissions within a community (a subreddit).
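    As a small illustration of the kind of graph-based trust signals such benchmarks compare (not the dissertation's actual experiments), the sketch below computes two simple reputation measures, weighted PageRank and weighted in-degree, over a toy interaction network; it assumes the networkx library, and the network data are invented.

```python
# Illustrative comparison of two simple graph-based trust signals on a toy
# interaction network. Uses networkx; nodes and edge weights are made up.

import networkx as nx

def trust_signals(edges):
    """edges: (truster, trustee, weight) triples from positive interactions."""
    g = nx.DiGraph()
    g.add_weighted_edges_from(edges)
    # Signal 1: weighted PageRank -- a global, flow-based reputation score.
    pagerank = nx.pagerank(g, weight="weight")
    # Signal 2: weighted in-degree share -- a purely local "trust received" score.
    total_in = sum(w for _, w in g.in_degree(weight="weight")) or 1.0
    indeg = {n: w / total_in for n, w in g.in_degree(weight="weight")}
    return pagerank, indeg

if __name__ == "__main__":
    edges = [("a", "b", 3), ("a", "c", 1), ("b", "c", 2), ("d", "c", 1), ("c", "a", 1)]
    pr, indeg = trust_signals(edges)
    for node in sorted(pr, key=pr.get, reverse=True):
        print(f"{node}: pagerank={pr[node]:.3f}, in-degree share={indeg[node]:.3f}")
```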

    Enhancing digital business ecosystem trust and reputation with centrality measures

    A Digital Business Ecosystem (DBE) is a decentralised environment where very small enterprises (VSEs) and small to medium sized enterprises (SMEs) interoperate by establishing collaborations with each other. Collaborations play a major role in the development of DBEs, where it is often difficult to select partners, as they are most likely strangers. Even though trust forms the basis for collaboration decisions, trust and reputation information may not be available for every participant, so recommendations from other participants are necessary to help with the selection process. Given the nature of DBEs, social network centrality measures that can influence power and control in the network need to be considered for DBE trust and reputation. A number of social network centralities that influence reputation in social graphs have been studied in the past. This paper investigates a previously unexploited centrality measure, betweenness centrality, as a metric to be considered for trust and reputation.
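    A minimal sketch of how betweenness centrality could feed into a DBE partner ranking is given below; the collaboration graph, feedback scores, and the 70/30 blend of direct reputation and centrality are illustrative assumptions, and the code assumes the networkx library rather than any implementation from the paper.

```python
# Sketch: blend betweenness centrality in the collaboration graph with a
# direct feedback score to rank candidate partners. All data are invented.

import networkx as nx

def rank_partners(collaborations, feedback, centrality_weight=0.3):
    """collaborations: list of (enterprise_a, enterprise_b) past collaborations.
    feedback: dict enterprise -> average feedback score in [0, 1]."""
    g = nx.Graph(collaborations)
    betweenness = nx.betweenness_centrality(g, normalized=True)
    return {
        e: (1 - centrality_weight) * feedback.get(e, 0.5)
           + centrality_weight * betweenness.get(e, 0.0)
        for e in g.nodes
    }

if __name__ == "__main__":
    collaborations = [("sme1", "sme2"), ("sme2", "sme3"), ("sme2", "vse1"), ("sme3", "vse1")]
    feedback = {"sme1": 0.9, "sme2": 0.7, "sme3": 0.8, "vse1": 0.6}
    scores = rank_partners(collaborations, feedback)
    for enterprise, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{enterprise}: {score:.3f}")
```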

    VoteTrust: Leveraging friend invitation graph to defend against social network Sybils

    Online social networks (OSNs) suffer from the creation of fake accounts that introduce fake product reviews, malware, and spam. Existing defenses focus on using the social graph structure to isolate fakes. However, our work shows that Sybils can befriend a large number of real users, invalidating the assumption behind social-graph-based detection. In this paper, we present VoteTrust, a scalable defense system that further leverages user-level activities. VoteTrust models the friend invitation interactions among users as a directed, signed graph and uses two key mechanisms to detect Sybils over the graph: voting-based Sybil detection to find Sybils that users vote to reject, and Sybil community detection to find other colluding Sybils around the identified Sybils. Through evaluation on the Renren social network, we show that VoteTrust is able to prevent Sybils from generating many unsolicited friend requests. We have also deployed VoteTrust in Renren, and our real-world experience demonstrates that VoteTrust can detect large-scale collusion among Sybils.
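    The voting intuition can be sketched as follows (this is not the paper's actual VoteTrust algorithm): each friend request is treated as a vote, accepted or rejected, weighted by the receiver's own trust, and accounts whose requests are mostly rejected by trusted users are flagged. The request log, trust values, and threshold are invented for the example.

```python
# Hedged sketch of trust-weighted friend-request voting for Sybil flagging.
# Not the VoteTrust algorithm itself; all data below are illustrative.

from collections import defaultdict

def sybil_scores(requests, voter_trust, flag_threshold=0.0):
    """requests: list of (sender, receiver, accepted: bool)."""
    tally = defaultdict(float)
    weight = defaultdict(float)
    for sender, receiver, accepted in requests:
        w = voter_trust.get(receiver, 0.1)   # untrusted voters count less
        tally[sender] += w if accepted else -w
        weight[sender] += w
    # Normalised vote score in [-1, 1]; below the threshold => likely Sybil.
    scores = {s: tally[s] / weight[s] for s in tally}
    flagged = {s for s, v in scores.items() if v < flag_threshold}
    return scores, flagged

if __name__ == "__main__":
    requests = [
        ("sybil1", "alice", False), ("sybil1", "bob", False), ("sybil1", "sybil2", True),
        ("newcomer", "alice", True), ("newcomer", "carol", True),
    ]
    voter_trust = {"alice": 1.0, "bob": 0.9, "carol": 0.8, "sybil2": 0.05}
    scores, flagged = sybil_scores(requests, voter_trust)
    print(scores, flagged)
```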
