An Empirical Study of the I2P Anonymity Network and its Censorship Resistance
Tor and I2P are well-known anonymity networks used by many individuals to
protect their online privacy and anonymity. Tor's centralized directory
services facilitate the understanding of the Tor network, as well as the
measurement and visualization of its structure through the Tor Metrics project.
In contrast, I2P does not rely on centralized directory servers, and thus
obtaining a complete view of the network is challenging. In this work, we
conduct an empirical study of the I2P network, in which we measure properties
including population, churn rate, router type, and the geographic distribution
of I2P peers. We find that there are currently around 32K active I2P peers in
the network on a daily basis. Of these peers, 14K are located behind NAT or
firewalls.
Using the collected network data, we examine the blocking resistance of I2P
against a censor that wants to prevent access to I2P using address-based
blocking techniques. Despite the decentralized characteristics of I2P, we
discover that a censor can block more than 95% of peer IP addresses known by a
stable I2P client by operating only 10 routers in the network. This amounts to
severe network impairment: a blocking rate of more than 70% is enough to cause
significant latency in web browsing activities, while blocking more than 90% of
peer IP addresses can make the network unusable. Finally, we discuss the
security consequences of the network being blocked, and directions for
potential approaches to make I2P more resistant to blocking.
Comment: 14 pages, to appear in the 2018 Internet Measurement Conference
(IMC'18).
In the IP of the Beholder: Strategies for Active IPv6 Topology Discovery
Existing methods for active topology discovery within the IPv6 Internet
largely mirror those of IPv4. In light of the large and sparsely populated
address space, in conjunction with aggressive ICMPv6 rate limiting by routers,
this work develops a different approach to Internet-wide IPv6 topology mapping.
First, we adopt randomized probing techniques in order to distribute probing load,
minimize the effects of rate limiting, and probe at higher rates. Second, we
extensively analyze the efficiency and efficacy of various IPv6 hitlists and
target generation methods when used for topology discovery, and synthesize new
target lists based on our empirical results to provide both breadth (coverage
across networks) and depth (to find potential subnetting). Employing our
probing strategy, we discover more than 1.3M IPv6 router interface addresses
from a single vantage point. Finally, we share our prober implementation,
synthesized target lists, and discovered IPv6 topology results.
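The abstract does not say how the probe order is randomized. As a minimal sketch of the general idea, assuming a plain in-memory shuffle over (target, hop limit) pairs rather than whatever the authors' prober actually does, the following spreads consecutive probes across targets and path depths so that no single router receives a burst. The function name, the seed parameter, and the example addresses (taken from the 2001:db8::/32 documentation prefix) are all hypothetical.

```python
import random


def randomized_probe_order(targets, max_hop_limit, seed=None):
    """Return every (target, hop_limit) pair exactly once, in shuffled order.

    Illustrative only: a plain shuffle stands in for whatever
    randomization the real prober uses.
    """
    pairs = [(t, h) for t in targets for h in range(1, max_hop_limit + 1)]
    rng = random.Random(seed)  # local RNG so the order is reproducible
    rng.shuffle(pairs)
    return pairs


# Hypothetical targets from the IPv6 documentation prefix.
targets = ["2001:db8::1", "2001:db8:1::1", "2001:db8:2::1"]
order = randomized_probe_order(targets, max_hop_limit=4, seed=7)
```

A sequential traceroute would send hop limits 1, 2, 3, ... toward one target back to back, hitting the same near-source routers repeatedly; shuffling the full pair list is one simple way to avoid that pattern.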
On the Origins of Memes by Means of Fringe Web Communities
Internet memes are increasingly used to sway and manipulate public opinion.
This prompts the need to study their propagation, evolution, and influence
across the Web. In this paper, we detect and measure the propagation of memes
across multiple Web communities, using a processing pipeline based on
perceptual hashing and clustering techniques, and a dataset of 160M images from
2.6B posts gathered from Twitter, Reddit, 4chan's Politically Incorrect board
(/pol/), and Gab, over the course of 13 months. We group the images posted on
fringe Web communities (/pol/, Gab, and The_Donald subreddit) into clusters,
annotate them using meme metadata obtained from Know Your Meme, and also map
images from mainstream communities (Twitter and Reddit) to the clusters.
Our analysis provides an assessment of the popularity and diversity of memes
in the context of each community, showing, e.g., that racist memes are
extremely common in fringe Web communities. We also find a substantial number
of politics-related memes on both mainstream and fringe Web communities,
supporting media reports that memes might be used to enhance or harm
politicians. Finally, we use Hawkes processes to model the interplay between
Web communities and quantify their reciprocal influence, finding that /pol/
substantially influences the meme ecosystem with the number of memes it
produces, while The_Donald has a higher success rate in pushing them to other
communities.
Comment: A shorter version of this paper appears in the Proceedings of the 18th
ACM Internet Measurement Conference (IMC 2018). This is the full version.
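The abstract names perceptual hashing but not the specific algorithm. As a minimal sketch, assuming the simple "average hash" variant (swapped in here for illustration, not claimed to be the paper's method), the following shows why such hashes suit near-duplicate image matching: small brightness changes leave the hash unchanged, while a different image yields a distant hash. The tiny 2x2 pixel grids are hypothetical stand-ins for downscaled grayscale images.

```python
def average_hash(pixels):
    """Hash a 2D grid of grayscale values: one bit per pixel,
    set when the pixel is brighter than the grid's mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Hypothetical 2x2 "images" (grayscale values 0-255).
original = [[10, 200], [220, 30]]
brighter = [[20, 210], [230, 40]]   # same picture, uniformly brighter
inverted = [[200, 10], [30, 220]]   # a different picture

near = hamming_distance(average_hash(original), average_hash(brighter))  # 0
far = hamming_distance(average_hash(original), average_hash(inverted))   # 4
```

Clustering can then group images whose pairwise Hamming distance falls below a chosen threshold; the threshold value is a tuning decision not given in the abstract.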
Malware Finances and Operations: a Data-Driven Study of the Value Chain for Infections and Compromised Access
We investigate the criminal market dynamics of infostealer malware and
publish three evidence datasets on malware infections and trade. We justify the
value chain between illicit enterprises using the datasets, compare the prices
and added value, and use the value chain to identify the most effective
countermeasures.
We begin by examining infostealer malware victim logs shared by actors on
hacking forums, and extract victim information and mask sensitive data to
protect privacy. We find access to these same victims for sale at Genesis
Market. This technically sophisticated marketplace provides its own browser to
access victims' online accounts. We collect a second dataset and discover that
91% of prices fall between 1 and 20 US dollars, with a median of 5 US dollars.
Database Market sells access to compromised online accounts. We produce yet
another dataset, finding 91% of prices fall between 1 and 30 US dollars, with a
median of 7 US dollars.
Comment: In The 18th International Conference on Availability, Reliability and
Security (ARES 2023), August 29 to September 1, 2023, Benevento, Italy.
SoC-Cluster as an Edge Server: an Application-driven Measurement Study
Huge electricity consumption is a severe issue for edge data centers. To this
end, we propose a new form of edge server, namely SoC-Cluster, that
orchestrates many low-power mobile system-on-chips (SoCs) through an on-chip
network. For the first time, we have developed a concrete SoC-Cluster server
that consists of 60 Qualcomm Snapdragon 865 SoCs in a 2U rack. Such a server
has been commercialized successfully and deployed at large scale on edge
clouds. The current dominant workload on those deployed SoC-Clusters is cloud
gaming, as mobile SoCs can seamlessly run native mobile games.
The primary goal of this work is to demystify whether SoC-Cluster can
efficiently serve more general-purpose, edge-typical workloads. Therefore, we
built a benchmark suite that leverages state-of-the-art libraries for two
killer edge workloads, i.e., video transcoding and deep learning inference. The
benchmark comprehensively reports the performance, power consumption, and other
application-specific metrics. We then performed a thorough measurement study
and directly compared SoC-Cluster with traditional edge servers (with Intel CPU
and NVIDIA GPU) with respect to physical size, electricity, and billing. The
results reveal the advantages of SoC-Cluster, especially its high energy
efficiency and the ability to proportionally scale energy consumption with
various incoming loads, as well as its limitations. The results also provide
insightful implications and valuable guidance to further improve SoC-Cluster
and land it in broader edge scenarios.
Detecting Phishing Sites Using ChatGPT
The rise of large language models (LLMs) has had a significant impact on
various domains, including natural language processing and artificial
intelligence. While LLMs such as ChatGPT have been extensively researched for
tasks such as code generation and text synthesis, their application in
detecting malicious web content, particularly phishing sites, has been largely
unexplored. To combat the rising tide of automated cyber attacks facilitated by
LLMs, it is imperative to automate the detection of malicious web content,
which requires approaches that leverage the power of LLMs to analyze and
classify phishing sites. In this paper, we propose a novel method that utilizes
ChatGPT to detect phishing sites. Our approach involves leveraging a web
crawler to gather information from websites and generate prompts based on this
collected data. This approach enables us to detect various phishing sites
without the need for fine-tuning machine learning models and identify social
engineering techniques from the context of entire websites and URLs. To
evaluate the performance of our proposed method, we conducted experiments using
a dataset. The experimental results using GPT-4 demonstrated promising
performance, with a precision of 98.3% and a recall of 98.4%. Comparative
analysis between GPT-3.5 and GPT-4 revealed an enhancement in the latter's
capability to reduce false negatives. These findings not only highlight the
potential of LLMs in efficiently identifying phishing sites but also have
significant implications for enhancing cybersecurity measures and protecting
users from the dangers of online fraudulent activities.
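For readers less familiar with the two metrics reported above, the following minimal sketch shows how precision and recall are computed from classification counts. The counts used are hypothetical and chosen for easy arithmetic; they are not taken from the paper's experiments.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from true positives (tp),
    false positives (fp), and false negatives (fn)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical counts: 90 phishing sites correctly flagged,
# 10 legitimate sites wrongly flagged, 30 phishing sites missed.
p, r = precision_recall(tp=90, fp=10, fn=30)  # p = 0.9, r = 0.75
```

Reducing false negatives, as the abstract reports for GPT-4 relative to GPT-3.5, raises recall because fewer phishing sites go undetected.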