154 research outputs found

    Link-based similarity search to fight web spam

    www.ilab.sztaki.hu/websearch We investigate the usability of similarity search in fighting Web spam based on the assumption that an unknown spam page is more similar to certain known spam pages than to honest pages. In order to be successful, search engine spam never appears in isolation: we observe link farms and alliances for the sole purpose of search engine ranking manipulation. The artificial nature and strong internal connectedness, however, gave rise to successful algorithms to identify search engine spam. One example is trust and distrust propagation, an idea originating in recommender systems and P2P networks, that yields spam classifiers by spreading information along hyperlinks from white and blacklists. While most previous results use PageRank variants for propagation, we form classifiers by investigating similarity top lists of an unknown page along various measures such as co-citation, companion, nearest neighbors in low dimensional projections and SimRank. We test our method over two data sets previously used to measure spam filtering algorithms.
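
The co-citation idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy edge list, the labels, and the choice to score a page by the spam fraction of its top-k co-cited neighbors are all assumptions made for illustration.

```python
from collections import defaultdict


def cocitation_scores(edges, target):
    """For every other page, count how many pages link to both it and `target`."""
    out_links = defaultdict(set)
    in_links = defaultdict(set)
    for src, dst in edges:
        out_links[src].add(dst)
        in_links[dst].add(src)

    scores = defaultdict(int)
    for citer in in_links[target]:
        for sibling in out_links[citer]:
            if sibling != target:
                scores[sibling] += 1
    return dict(scores)


def spam_ratio_in_top_list(edges, target, labels, k=3):
    """Fraction of labelled pages in the top-k co-citation list that are spam.

    Returns None when none of the top-k pages carries a label.
    """
    scores = cocitation_scores(edges, target)
    # Highest score first; ties broken by page name so the result is deterministic.
    top = sorted(scores, key=lambda page: (-scores[page], page))[:k]
    labelled = [page for page in top if page in labels]
    if not labelled:
        return None
    return sum(labels[page] == "spam" for page in labelled) / len(labelled)


# Hypothetical toy graph: two link-farm pages and one honest hub all cite page "u".
EDGES = [
    ("farm1", "s1"), ("farm1", "s2"), ("farm1", "u"),
    ("farm2", "s1"), ("farm2", "u"),
    ("hub", "h1"), ("hub", "u"),
]
LABELS = {"s1": "spam", "s2": "spam", "h1": "honest"}

ratio = spam_ratio_in_top_list(EDGES, "u", LABELS, k=3)
```

A ratio close to 1 would suggest that the unknown page sits among known spam. A real classifier would presumably combine several similarity measures rather than rely on this single feature.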

    Methods for demoting and detecting Web spam

    Web spamming has tremendously subverted the ranking mechanism of information retrieval in Web search engines. It maliciously manipulates the data source, through either content or links, with the intention of degrading Web search results. The reordering of search results by spammers has made searching harder and more time-consuming for Web users trying to retrieve relevant information. In order to improve the quality of Web search engine results, anti-Web spam techniques are developed in this thesis to detect and demote Web spam via trust and distrust and Web spam classification. A comprehensive literature review of existing anti-Web spam techniques, emphasizing the trust and distrust model and the machine learning model, is presented. Furthermore, several experiments are conducted to show the vulnerability of ranking algorithms to Web spam. Two publicly available Web spam datasets are used for the experiments throughout the thesis: WEBSPAM-UK2006 and WEBSPAM-UK2007. Two link-based trust and distrust model algorithms are presented subsequently: Trust Propagation Rank and Trust Propagation Spam Mass. Both algorithms semi-automatically detect and demote Web spam based on limited human experts’ evaluation of non-spam and spam pages. In the experiments, Trust Propagation Rank and Trust Propagation Spam Mass achieved up to 10.88% and 43.94% improvement over the benchmark algorithms. Thereafter, weight properties associated with the links between two Web hosts are introduced into the task of Web spam detection. In most studies, the weight properties are involved in the ranking mechanism; in this research work, the weight properties are incorporated into distrust-based algorithms to detect more spam. 
The experiments have shown that the weight properties enhanced existing distrust-based Web spam detection algorithms by up to 30.26% and 31.30% on the two aforementioned datasets. Even though the integration of weight properties has shown significant results in detecting Web spam, a distrust seed set propagation algorithm is presented to further improve Web spam detection. The distrust seed set propagation algorithm propagates the distrust score over a wider range to estimate the probability that other unevaluated Web pages are spam. The experimental results have shown that the algorithm improved the distrust-based Web spam detection algorithms by up to 19.47% and 25.17% on the two datasets. An alternative machine learning classifier, a multilayered perceptron neural network, is proposed in the thesis to further improve the detection rate of Web spam. In the experiments, the detection rate of Web spam using the multilayered perceptron neural network increased by up to 14.02% and 3.53% over the conventional classifier, support vector machines. At the same time, a mechanism to determine the number of hidden neurons for the multilayered perceptron neural network is presented in this thesis to simplify the design of the network structure.
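
The abstract above does not spell out Trust Propagation Rank or Trust Propagation Spam Mass, so the following is only a generic sketch of seed-based trust propagation in the style of a seed-biased PageRank. The function name, the damping value, the iteration count, and the toy host graph are assumptions, not details taken from the thesis.

```python
def propagate_trust(out_links, seeds, damping=0.85, iterations=50):
    """Spread trust from hand-labelled good seeds along hyperlinks.

    out_links maps each host to the list of hosts it links to.
    Trust mass restarts only at the seed hosts (a seed-biased random walk).
    """
    nodes = set(out_links) | {dst for dsts in out_links.values() for dst in dsts}
    seed_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(seed_mass)

    for _ in range(iterations):
        # Every round, a (1 - damping) share of trust is re-injected at the seeds.
        nxt = {n: (1.0 - damping) * seed_mass[n] for n in nodes}
        for src in nodes:
            targets = out_links.get(src, [])
            if not targets:
                continue  # dangling host: its share is simply dropped in this sketch
            share = damping * score[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        score = nxt
    return score


# Hypothetical host graph: a trusted seed links into a chain, while two
# spam hosts only link to each other.
GRAPH = {
    "good": ["a"],
    "a": ["b"],
    "spam": ["spam2"],
    "spam2": ["spam"],
}
trust = propagate_trust(GRAPH, seeds={"good"})
```

Hosts that no trusted seed can reach keep a trust score of zero, which is the signal a demotion step could act on. Distrust propagation works analogously, starting from known spam seeds and typically following links in the reverse direction.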

    Trust and Credibility in Online Social Networks

    An increasing share of people's social and communicative activities now takes place in the digital world. The growth and popularity of online social networks (OSNs) have tremendously facilitated online interaction and information exchange. As OSNs enable people to communicate more effectively, a large volume of user-generated content (UGC) is produced daily. As UGC contains valuable information, more people now turn to OSNs for news, opinions, and social networking. Besides users, companies and business owners also benefit from UGC as they utilize OSNs as platforms for communicating with customers and for marketing activities. Hence, UGC has a powerful impact on users' opinions and decisions. However, the openness of OSNs also brings concerns about trust and credibility online. The freedom and ease of publishing information online could lead to UGC of problematic quality. It has been observed that professional spammers are hired to insert deceptive content and promote harmful information in OSNs. This is known as the spamming problem, which jeopardizes the ecosystems of OSNs. The severity of the spamming problem has attracted the attention of researchers and many detection approaches have been proposed. However, most existing approaches are based on behavioral patterns. As spammers evolve to evade detection by faking normal behaviors, these detection approaches may fail. In this dissertation, we present our work on detecting spammers by extracting behavioral patterns that are difficult to manipulate in OSNs. We focus on two scenarios, review spamming and social bots. We first identify that rating deviations and opinion deviations are invariant patterns in review spamming activities, since the goal of review spamming is to insert deceptive reviews. We utilize the two kinds of deviations as clues for trust propagation and propose our detection mechanisms. 
For social bot detection, we find that the behavioral patterns among users in a neighborhood are difficult for a social bot to manipulate, and we propose a neighborhood-based detection scheme. Our work shows that the trustworthiness of a user can be reflected in social relations and in opinions expressed in the review content. In addition, our proposed features extracted from the neighborhood are useful for social bot detection.
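
The rating-deviation signal mentioned in the abstract above can be made concrete with a small sketch. The dissertation's exact definition is not given here, so this assumes one simple reading: a reviewer's deviation is the mean absolute difference between their ratings and each item's average rating. The function name and the toy reviews are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rating_deviation(reviews):
    """Mean absolute gap between each reviewer's ratings and the item averages.

    reviews is a list of (reviewer, item, rating) tuples. The item average
    includes the reviewer's own rating, a simplification in this sketch.
    """
    ratings_by_item = defaultdict(list)
    for _reviewer, item, rating in reviews:
        ratings_by_item[item].append(rating)
    item_mean = {item: mean(rs) for item, rs in ratings_by_item.items()}

    gaps_by_reviewer = defaultdict(list)
    for reviewer, item, rating in reviews:
        gaps_by_reviewer[reviewer].append(abs(rating - item_mean[item]))
    return {reviewer: mean(gaps) for reviewer, gaps in gaps_by_reviewer.items()}


# Hypothetical data: reviewer "c" consistently rates well below the consensus.
REVIEWS = [
    ("a", "x", 4), ("b", "x", 4), ("c", "x", 1),
    ("a", "y", 5), ("b", "y", 5), ("c", "y", 2),
]
deviation = rating_deviation(REVIEWS)
```

A persistently large deviation is only a clue, not proof of spamming. In the approach described above, such clues feed a trust propagation step rather than serving as a verdict on their own.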

    Graph based Anomaly Detection and Description: A Survey

    Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have recently come into focus. As objects in graphs have long-range correlations, a suite of novel techniques has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms categorized under various settings: unsupervised vs. (semi-)supervised approaches, for static vs. dynamic graphs, for attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the ‘why’, of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.

    The Cooperative Defense Overlay Network: A Collaborative Automated Threat Information Sharing Framework for a Safer Internet

    With the ever-growing proliferation of hardware and software-based computer security exploits and the increasing power and prominence of distributed attacks, network and system administrators are often forced to make a difficult decision: expend tremendous resources on defense from sophisticated and continually evolving attacks from an increasingly dangerous Internet with varying levels of success; or expend fewer resources on defending against common attacks on "low hanging fruit," hoping to avoid the less common but incredibly devastating zero-day worm or botnet attack. Home networks and small organizations are usually forced to choose the latter option and in so doing are left vulnerable to all but the simplest of attacks. While automated tools exist for sharing information about network-based attacks, this sharing is typically limited to administrators of large networks and dedicated security-conscious users, to the exclusion of smaller organizations and novice home users. In this thesis we propose a framework for a cooperative defense overlay network (CODON) in which participants with varying technical abilities and resources can contribute to the security and health of the internet via automated crowdsourcing, rapid information sharing, and the principle of collateral defense

    Britain in psychological distress: the EU referendum and the psychological operations of the two opposing sides

    Diploma thesis, University of Macedonia, Thessaloniki, 2019. There are numerous analyses trying to explain the outcome of the 2016 referendum on Britain’s membership of the EU. The aim of this dissertation is to examine the circumstances under which voters’ attitudes were formed, and ultimately reflected in their choice on polling day. The study focuses particularly on the referendum campaign and the various psychological operations applied to British citizens, shaping their opinion and affecting their final decision. First of all, this contribution attempts to develop the basic aspects of modern psychological operations (PSYOP), which have been known by many other names or terms, including propaganda. The term is used to denote any action which is practiced mainly by psychological methods with the objective of evoking a planned psychological reaction in other people. Various techniques are used, aiming to influence a target audience's value system, belief system, emotions, motives, reasoning, or behaviour. In this context, the first chapter defines the word ‘propaganda’ and presents significant facts about its origins and examples of its usage in history. Subsequently, based on an extensive literature review, it provides a thorough analysis of a wide range of propaganda devices. This includes tactics involving language manipulation, as well as non-verbal techniques, such as opinion polls and statistics. In accordance with the above, the second chapter elaborates on Britain’s EU referendum and attempts to explain the Brexit result. Unlike other academic research, this paper considers the outcome of the referendum within the broader context of a detailed analysis of public attitude towards the EU. This attempt requires examining the circumstances which gave rise to the plebiscite before turning to the issue of how the various strategies that were employed during the referendum campaign influenced the position of the British electorate on polling day. 
The paper gives a concise but rich survey of the development of Euroscepticism in Britain, a phenomenon that provoked considerable debate on the UK’s membership in the EU, and eventually led to the decision to hold a national referendum on the matter. Following that, it devotes a fair number of pages to describing the referendum campaign itself (its personalities, principal themes and arguments) and seeks to identify the particular tactics that were used by the two opposing sides to sway voters. It highlights David Cameron’s failure to secure a substantive deal regarding Britain’s terms of membership with the EU and outlines the key messages of the Remain and Leave campaigns, with the former focusing on the economic and security risks of leaving, and the latter on immigration and sovereignty. Most importantly, the study emphasizes the prevalence of propaganda techniques throughout the referendum campaign, with reference to the insight of some of the key players on both sides. Last but not least, based on a review of the campaigns’ strategies, there is an attempt to determine all those factors that may have contributed to the result of Brexit and marked a historic moment in British history.