
    Improved Distortion and Spam Resistance for PageRank

    For a directed graph G = (V, E), a ranking function, such as PageRank, provides a way of mapping elements of V to non-negative real numbers so that nodes can be ordered. Brin and Page argued that the stationary distribution, R(G), of a random walk on G is an effective ranking function for queries on an idealized web graph. However, R(G) is not defined for all G, and in particular, it is not defined for the real web graph. Thus, they introduced PageRank to approximate R(G) for graphs G with ergodic random walks while being defined on all graphs. PageRank is defined as a random walk on a graph, where with probability (1 - ε) a random out-edge is traversed, and with reset probability ε the random walk instead restarts at a node selected using a reset vector r̂. Originally, r̂ was taken to be uniform on the nodes, and we call this version UPR. In this paper, we introduce graph-theoretic notions of quality for ranking functions, specifically distortion and spam resistance. We show that UPR has high distortion and low spam resistance, and we show how to select an r̂ that yields low distortion and high spam resistance.
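    To make the definition above concrete, here is a minimal sketch of PageRank computed by power iteration with an arbitrary reset vector r̂; it is not the paper's construction, and the function name, adjacency-matrix input, and handling of dangling nodes are illustrative assumptions. Passing a uniform `reset` recovers UPR.

```python
import numpy as np

def pagerank(adj, reset, eps=0.15, iters=100):
    """Power-iteration PageRank with an arbitrary reset vector.

    adj[i, j] = 1 if there is an edge i -> j; `reset` is the reset vector
    r-hat (non-negative, summing to 1). A uniform `reset` gives UPR."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Row-stochastic transition matrix; dangling nodes restart via `reset`.
    P = np.where(out_deg[:, None] > 0,
                 adj / np.maximum(out_deg, 1)[:, None],
                 reset)
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1 - eps) * (rank @ P) + eps * reset
    return rank

# Example: a 3-node graph with a biased (non-uniform) reset vector.
adj = np.array([[0, 1, 1],
                [0, 0, 1],
                [1, 0, 0]], dtype=float)
print(pagerank(adj, reset=np.array([0.6, 0.2, 0.2])))
```

    Choosing a non-uniform `reset`, as the paper studies, redistributes the probability mass injected at each restart and thus changes the resulting ranking.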

    Networks and trust: systems for understanding and supporting internet security

    This dissertation takes a systems-level view of the multitude of existing trust management systems to make sense of when, where, and how (or, in some cases, if) each is best utilized. Trust is a belief by one person that by transacting with another person (or organization) within a specific context, a positive outcome will result. Trust serves as a heuristic that enables us to simplify the dozens of decisions we make each day about whom we will transact with. In today's hyperconnected world, in which for many people the bulk of their daily transactions related to business, entertainment, news, and even critical services like healthcare take place online, we tend to rely even more on heuristics like trust to help us simplify complex decisions. Thus, trust plays a critical role in online transactions. For this reason, over the past several decades researchers have developed a plethora of trust metrics and trust management systems for use in online systems. These systems have been most frequently applied to improve recommender systems and reputation systems. They have been designed for and applied to varied online systems including peer-to-peer (P2P) file-sharing networks, e-commerce platforms, online social networks, messaging and communication networks, sensor networks, distributed computing networks, and others. However, comparatively little research has examined the effects on individuals, organizations, or society of the presence or absence of trust in online sociotechnical systems. Using these existing trust metrics and trust management systems, which rely heavily on network analysis methods, we design a set of experiments to benchmark their performance. Drawing on the experiments' results, we propose a heuristic decision-making framework for selecting a trust management system for use in online systems. In this dissertation we also investigate several related but distinct aspects of trust in online sociotechnical systems. Using network/graph analysis methods, we examine how trust (or lack of trust) affects the performance of online networks in terms of security and quality of service. We explore the structure and behavior of online networks including Twitter, GitHub, and Reddit through the lens of trust. We find that higher levels of trust within a network are associated with greater spread of misinformation (a form of cybersecurity threat, according to the US CISA) on Twitter. We also find that higher levels of trust in open source developer networks on GitHub are associated with more frequent incidences of cybersecurity vulnerabilities. Using our experimental and empirical findings previously described, we apply the Systems Engineering Process to design and prototype a trust management tool for use on Reddit, which we dub Coni the Trust Moderating Bot. Coni is, to the best of our knowledge, the first trust management tool designed specifically for use on the Reddit platform. Through our work with Coni, we develop and present a blueprint for constructing a Reddit trust tool which not only measures trust levels, but can use these trust levels to take actions on Reddit to improve the quality of submissions within the community (a subreddit).
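    For intuition about the kind of network-analysis-based trust metric such systems compute, the following is a minimal, hypothetical sketch in the spirit of EigenTrust-style global trust: local pairwise ratings are normalized and propagated by power iteration so that a node inherits trust from the nodes that trust it. This is a generic illustration of the technique, not the dissertation's benchmark code; all names and parameters are assumptions.

```python
import numpy as np

def global_trust(local_trust, iters=50, alpha=0.1):
    """EigenTrust-style global trust scores from a matrix of local ratings.

    local_trust[i, j] >= 0 is how much node i trusts node j based on direct
    interactions. Rows are normalized so each node's ratings form a
    distribution; repeated propagation aggregates transitive trust
    ("trust the nodes trusted by those you trust")."""
    n = local_trust.shape[0]
    row_sums = local_trust.sum(axis=1, keepdims=True)
    # Nodes with no ratings are treated as trusting everyone equally.
    C = np.where(row_sums > 0, local_trust / np.maximum(row_sums, 1e-12), 1.0 / n)
    pretrust = np.full(n, 1.0 / n)   # pre-trusted distribution (uniform here)
    t = pretrust.copy()              # start from uniform trust
    for _ in range(iters):
        t = (1 - alpha) * (C.T @ t) + alpha * pretrust
    return t

# Example: node 2 is rated highly by both other nodes.
ratings = np.array([[0.0, 1.0, 3.0],
                    [0.0, 0.0, 2.0],
                    [1.0, 1.0, 0.0]])
print(global_trust(ratings))
```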

    Reading the Market

    Americans pay famously close attention to "the market," obsessively watching trends, patterns, and swings and looking for clues in every fluctuation. In Reading the Market, Peter Knight explores the Gilded Age origins and development of this peculiar interest. He tracks the historic shift in market operations from local to national while examining how present-day ideas about the nature of markets are tied to past genres of financial representation. Drawing on the late nineteenth-century explosion of art, literature, and media, which sought to dramatize the workings of the stock market for a wide audience, Knight shows how ordinary Americans became both emotionally and financially invested in the market. He analyzes popular investment manuals, brokers' newsletters, newspaper columns, magazine articles, illustrations, and cartoons. He also introduces readers to fiction featuring financial tricksters, which was characterized by themes of personal trust and insider information. The book reveals how the popular culture of the period shaped the very idea of the market as a self-regulating mechanism by making the impersonal abstractions of high finance personal and concrete. From the rise of ticker-tape technology to the development of conspiracy theories, Reading the Market argues that commentary on the Stock Exchange between 1870 and 1915 changed how Americans understood finance, and explains what our pervasive interest in Wall Street says about us now.

    On Privacy-Enhanced Distributed Analytics in Online Social Networks

    More than half of the world's population benefits from online social network (OSN) services. A considerable part of these services is based on applying analytics to user data to infer their preferences and enrich their experience accordingly. At the same time, user data is monetized by service providers to run their business models. Therefore, providers tend to collect (personal) data about users extensively. However, this data is oftentimes used for various purposes without the informed consent of the users. Providers share this data in different forms with third parties (e.g., data brokers). Moreover, users' sensitive data has repeatedly been subject to unauthorized access by malicious parties. These issues have demonstrated the insufficient commitment of providers to user privacy and, consequently, have raised users' concerns. Despite the emergence of privacy regulations (e.g., GDPR and CCPA), recent studies showed that the collection of personal data and the sharing of sensitive data are still continuously increasing. A number of privacy-friendly OSNs have been proposed to enhance user privacy by reducing the need for central service providers. However, this improvement in privacy protection usually comes at the cost of losing social connectivity and many analytics-based services of the widespread OSNs. This dissertation addresses this issue by first proposing an approach to privacy-friendly OSNs that maintains established social connections. Second, approaches that allow users to collaboratively apply distributed analytics while preserving their privacy are presented. Finally, the dissertation contributes to a better assessment and mitigation of the risks associated with distributed analytics. These three research directions are treated through the following six contributions.

Conceptualizing Hybrid Online Social Networks: We conceptualize a hybrid approach to privacy-friendly OSNs, HOSN, which combines the benefits of centralized OSNs (COSNs) and decentralized OSNs (DOSNs). Users can maintain their social experience in their preferred COSN while being provided with additional means to enhance their privacy. Users can seamlessly post public content or private content that is accessible only by authorized users (friends), beyond the reach of the service providers.

Improving the Trustworthiness of HOSNs: We conceptualize software features to address users' privacy concerns in OSNs. We prototype these features in our HOSN approach and evaluate their impact on the privacy concerns and the trustworthiness of the approach. We also analyze the relationships between four important aspects that influence users' behavior in OSNs: privacy concerns, trust beliefs, risk beliefs, and the willingness to use.

Privacy-Enhanced Association Rule Mining: We present an approach that enables users to efficiently apply privacy-enhanced association rule mining on distributed data. This approach can be employed in DOSNs and HOSNs to generate recommendations. We leverage a privacy-enhanced distributed graph sampling method to reduce the data required for the mining and to lower the communication and computational overhead, and we then apply a distributed frequent itemset mining algorithm in a privacy-friendly manner.

Privacy Enhancements on Federated Learning (FL): We identify several privacy-related issues in the emerging distributed machine learning technique FL, which are mainly due to the centralized nature of this technique. We discuss tackling these issues by applying FL in a hierarchical architecture. The benefits of this approach include a reduction in the centralization of control and the ability to place defense and verification methods more flexibly and efficiently within the hierarchy.

Systematic Analysis of Threats in Federated Learning: We conduct a critical study of the existing attacks in FL to better understand the actual risk of these attacks under real-world scenarios. First, we structure the literature in this field and show the research foci and gaps. Then, we highlight a number of issues in (1) the assumptions commonly made by researchers and (2) the evaluation practices. Finally, we discuss the implications of these issues for the applicability of the proposed attacks and recommend several remedies.

Label Leakage from Gradients: We identify a risk of information leakage when sharing gradients in FL. We demonstrate the severity of this risk by proposing a novel attack that extracts the user annotations that describe the data (i.e., ground-truth labels) from gradients. We show the high effectiveness of the attack under different settings, such as different datasets and model architectures. We also test several defense mechanisms to mitigate this attack and identify the effective ones.
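    To make the label-leakage risk concrete, here is a small, self-contained sketch, not the dissertation's attack, of a well-known observation: for a softmax classifier trained with cross-entropy on a single example, the gradient with respect to the logits equals (probabilities - one-hot label), so its only negative entry reveals the ground-truth label. All names and the toy setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# A single training example for a 5-class softmax classifier.
num_classes = 5
true_label = 3
logits = rng.normal(size=num_classes)          # model outputs before softmax
probs = np.exp(logits) / np.exp(logits).sum()  # softmax probabilities

# Cross-entropy gradient w.r.t. the logits is (probs - one_hot(true_label)).
one_hot = np.zeros(num_classes)
one_hot[true_label] = 1.0
grad_logits = probs - one_hot

# An observer who sees only this gradient can recover the label:
# the entry for the true class is the only negative one.
recovered_label = int(np.argmin(grad_logits))
assert recovered_label == true_label
print("recovered label:", recovered_label)
```

    In federated learning the shared quantities are weight gradients rather than logit gradients, but the same sign structure surfaces in the last layer's weight gradients, which is what makes label recovery from shared updates plausible.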

    Fine-grained, Content-agnostic Network Traffic Analysis for Malicious Activity Detection

    The rapid evolution of malicious activities in network environments necessitates the development of more effective and efficient detection and mitigation techniques. Traditional traffic analysis (TA) approaches have demonstrated limited efficacy and performance in detecting various malicious activities, resulting in a pressing need for more advanced solutions. To fill this gap, this dissertation proposes several new fine-grained network traffic analysis (FGTA) approaches. These approaches focus on (1) detecting previously hard-to-detect malicious activities by deducing fine-grained, detailed application-layer information in a privacy-preserving manner, (2) enhancing usability by providing more explainable results and better adaptability to different network environments, and (3) combining network traffic data with endpoint information to provide users with more comprehensive and accurate protection. We begin by conducting a comprehensive survey of existing FGTA approaches. We then propose CJ-Sniffer, a privacy-aware cryptojacking detection system that efficiently detects cryptojacking traffic. CJ-Sniffer is the first approach to distinguish cryptojacking traffic from user-initiated cryptocurrency mining traffic, a level of fine-grained traffic discrimination that has proven challenging to accomplish with traditional TA methodologies. Next, we introduce BotFlowMon, a learning-based, content-agnostic approach for detecting online social network (OSN) bot traffic, which has posed a significant challenge for detection using traditional TA strategies. BotFlowMon is an FGTA approach that relies only on content-agnostic, flow-level data as input and utilizes novel algorithms and techniques to distinguish social bot traffic from real OSN user traffic. To enhance the usability of FGTA-based attack detection, we propose a learning-based DDoS detection approach that emphasizes both explainability and adaptability, providing network administrators with insightful explanatory information and models that adapt to new network environments. Finally, we present a reinforcement learning-based defense against L7 DDoS attacks that combines network traffic data with endpoint information. The proposed approach actively monitors and analyzes the victim server and applies different strategies under different conditions to protect the server while minimizing collateral damage to legitimate requests. Our evaluation results demonstrate that the proposed approaches achieve high accuracy and efficiency in detecting and mitigating various malicious activities, while maintaining privacy-preserving features, providing explainable and adaptable results, or providing comprehensive application-layer situational awareness. This dissertation significantly advances the fields of FGTA and malicious activity detection. It includes published and unpublished co-authored materials.
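    As a rough illustration of what content-agnostic, flow-level input looks like in practice, the sketch below derives simple per-flow features (duration, packet and byte counts, inter-arrival statistics) without inspecting payloads and feeds them to an off-the-shelf classifier. It is a generic example of the technique, not BotFlowMon or any other system from the dissertation; the feature set, field names, toy data, and model choice are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def flow_features(flow):
    """Content-agnostic features from one flow record (no payload inspection)."""
    times = np.asarray(flow["pkt_times"])   # packet timestamps in seconds
    sizes = np.asarray(flow["pkt_sizes"])   # packet sizes in bytes
    iat = np.diff(times) if len(times) > 1 else np.array([0.0])
    return [
        times[-1] - times[0],   # flow duration
        len(sizes),             # packet count
        sizes.sum(),            # total bytes
        sizes.mean(),           # mean packet size
        iat.mean(),             # mean inter-arrival time
        iat.std(),              # inter-arrival jitter (burstiness proxy)
    ]

# Toy training data: each flow is labeled 1 (bot/malicious) or 0 (benign).
rng = np.random.default_rng(0)
flows, labels = [], []
for _ in range(200):
    n = int(rng.integers(2, 50))
    flows.append({"pkt_times": np.cumsum(rng.exponential(0.05, n)),
                  "pkt_sizes": rng.integers(60, 1500, n)})
    labels.append(int(rng.integers(0, 2)))

X = np.array([flow_features(f) for f in flows])
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels)
print(clf.predict(X[:5]))
```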