Analyzing the Social Structure and Dynamics of E-mail and Spam in Massive Backbone Internet Traffic
E-mail is probably the most popular application on the Internet, with
everyday business and personal communications dependent on it. Spam or
unsolicited e-mail has been estimated to cost businesses significant amounts of
money. However, our understanding of the network-level behavior of legitimate
e-mail traffic and how it differs from spam traffic is limited. In this study,
we have passively captured SMTP packets from a 10 Gbit/s Internet backbone link
to construct a social network of e-mail users based on their exchanged e-mails.
The focus of this paper is on the graph metrics indicating various structural
properties of e-mail networks and how they evolve over time. This study also
looks into the differences in the structural and temporal characteristics of
spam and non-spam networks. Our analysis of the collected data allows us to
show several differences between the behavior of spam and legitimate e-mail
traffic, which can help us to understand the behavior of spammers and give us
the knowledge to statistically model spam traffic on the network-level in order
to complement current spam detection techniques.
Comment: 15 pages, 20 figures, technical report
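As a minimal sketch of the kind of structural metric such an e-mail graph supports (the input format, hashed addresses, and the choice of out-degree are illustrative assumptions, not the paper's data or metrics):

    // Sketch: out-degree distribution of a directed e-mail graph.
    // Assumes one whitespace-separated "sender receiver" pair per input line
    // (e.g., hashed addresses); this input format is an assumption.
    #include <iostream>
    #include <map>
    #include <set>
    #include <string>

    int main() {
        std::map<std::string, std::set<std::string>> out_edges; // sender -> receivers
        std::string from, to;
        while (std::cin >> from >> to)
            out_edges[from].insert(to);              // repeated e-mails collapse to one edge

        std::map<std::size_t, std::size_t> histogram; // out-degree -> number of senders
        for (const auto& [sender, receivers] : out_edges)
            ++histogram[receivers.size()];

        for (const auto& [degree, count] : histogram)
            std::cout << degree << ' ' << count << '\n';
    }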
Herd behaviour in extreme market conditions: The case of the Athens stock exchange
This paper examines herd behaviour in extreme market conditions using data from the Athens Stock Exchange. We test for the presence of herding as suggested by Christie and Huang (1995) and Chang, Cheng, and Khorana (2000). Results based on daily, weekly and monthly data indicate the existence of herd behaviour for the years 1998-2007. Evidence of herd behaviour over daily time intervals is much stronger, revealing the short-term nature of the phenomenon. When the testing period is broken into semi-annual sub-periods, herding is found during the stock market crisis of 1999. Investor behaviour seems to have become more rational since 2002, owing to the regulatory and institutional reforms of the Greek equity market and the intense presence of foreign institutional investors.
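The abstract cites the two herding tests without reproducing them; for reference, and only in their textbook forms since the paper's exact specification may differ, Christie and Huang (1995) regress the cross-sectional standard deviation of returns on extreme-market dummies, while Chang, Cheng, and Khorana (2000) regress the cross-sectional absolute deviation on market returns:
\[
\mathrm{CSSD}_t=\sqrt{\tfrac{1}{N-1}\textstyle\sum_{i=1}^{N}(R_{i,t}-R_{m,t})^2},\qquad
\mathrm{CSSD}_t=\alpha+\beta^{L}D_t^{L}+\beta^{U}D_t^{U}+\varepsilon_t,
\]
\[
\mathrm{CSAD}_t=\tfrac{1}{N}\textstyle\sum_{i=1}^{N}\lvert R_{i,t}-R_{m,t}\rvert,\qquad
\mathrm{CSAD}_t=\alpha+\gamma_1\lvert R_{m,t}\rvert+\gamma_2 R_{m,t}^{2}+\varepsilon_t,
\]
where $D_t^{L}$ and $D_t^{U}$ flag days with market returns in the extreme lower and upper tails. Herding implies negative $\beta^{L}$ and $\beta^{U}$ in the first specification and a significantly negative $\gamma_2$ in the second.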
Efficient Lock-free Binary Search Trees
In this paper we present a novel algorithm for concurrent lock-free internal
binary search trees (BST) and implement a Set abstract data type (ADT) based on
that. We show that in the presented lock-free BST algorithm the amortized step
complexity of each set operation - {\sc Add}, {\sc Remove} and {\sc Contains} -
is $O(H(n)+c)$, where $H(n)$ is the height of the BST with $n$ nodes and $c$
is the contention during the execution. Our algorithm adapts to
contention measures according to read-write load. If the situation is
read-heavy, the operations avoid helping pending concurrent {\sc Remove}
operations during traversal, and, adapt to interval contention. However, for
write-heavy situations we let an operation help pending {\sc Remove}, even
though it is not obstructed, and so adapt to tighter point contention. It uses
single-word compare-and-swap (\texttt{CAS}) operations. We show that our
algorithm has improved disjoint-access-parallelism compared to similar existing
algorithms. We prove that the presented algorithm is linearizable. To the best
of our knowledge this is the first algorithm for any concurrent tree data
structure in which the modify operations are performed with an additive term of
contention measure.
Comment: 15 pages, 3 figures, submitted to POD
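The BST algorithm itself is not reproduced in the abstract. As a hedged illustration of the single-word \texttt{CAS} retry-loop pattern that such lock-free operations are built from, here is a Treiber stack sketch, plainly a different and much simpler data structure than the authors' tree:

    // Illustration of a single-word CAS retry loop (a Treiber stack), not the
    // paper's BST algorithm. Each operation retries until its CAS succeeds,
    // so some operation always makes progress (lock-freedom).
    #include <atomic>
    #include <cstdio>

    struct Node {
        int key;
        Node* next;
    };

    std::atomic<Node*> head{nullptr};

    void push(int key) {
        Node* n = new Node{key, nullptr};
        n->next = head.load(std::memory_order_relaxed);
        // Retry until head swings from the value we read to our new node.
        while (!head.compare_exchange_weak(n->next, n,
                                           std::memory_order_release,
                                           std::memory_order_relaxed)) {
            // compare_exchange_weak reloads n->next on failure; just retry.
        }
    }

    bool pop(int* out) {
        Node* old = head.load(std::memory_order_acquire);
        while (old && !head.compare_exchange_weak(old, old->next,
                                                  std::memory_order_acquire,
                                                  std::memory_order_relaxed)) {
            // 'old' is refreshed with the current head on failure.
        }
        if (!old) return false;
        *out = old->key;
        // NOTE: safe memory reclamation (hazard pointers, epochs) is omitted,
        // so 'old' is intentionally leaked in this sketch.
        return true;
    }

    int main() {
        push(1); push(2);
        int k;
        while (pop(&k)) std::printf("%d\n", k);
    }

Each operation re-reads the shared word and retries its \texttt{CAS} until it succeeds; repeated retries under load are the source of contention terms such as the $c$ in the bound above.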
Self-stabilizing TDMA Algorithms for Wireless Ad-hoc Networks without External Reference
Time division multiple access (TDMA) is a method for sharing communication
media. In wireless communications, TDMA algorithms often divide the radio time
into timeslots of uniform size and then combine them into frames of uniform
size. We consider TDMA algorithms that allocate at least one timeslot in every
frame to every node. Given a maximal node degree, $\delta$,
and no access to external references for collision detection, time or position,
we consider the problem of collision-free self-stabilizing TDMA algorithms that
use constant frame size.
We demonstrate that this problem has no solution when the frame size is
smaller than $\max\{2\delta, \chi_2\}$, where $\chi_2$ is the chromatic number for
distance-$2$ vertex coloring. As a complement to this lower bound, we focus on
proving the existence of collision-free self-stabilizing TDMA algorithms that
use constant frame size. We consider basic settings (no hardware
support for collision detection and no prior clock synchronization), and the
collision of concurrent transmissions from transmitters that are at most two
hops apart. In the context of self-stabilizing systems that have no external
reference, we are the first to study this problem (to the best of our
knowledge), and use simulations to show convergence even with computation time
uncertainties.
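The abstract states the problem and bounds rather than the algorithm; the following is only a toy, centralized simulation of randomized slot re-selection under distance-2 collisions. The topology, frame size and re-pick rule are assumptions for illustration, not the paper's self-stabilizing protocol:

    // Toy simulation (not the paper's algorithm): every node holds a slot in
    // [0, T); in each round, nodes that collide with a node at most two hops
    // away re-pick a slot uniformly at random.
    #include <cstdio>
    #include <random>
    #include <vector>

    int main() {
        // Illustrative line topology 0-1-2-3-4; adjacency is an assumption.
        std::vector<std::vector<int>> adj = {{1}, {0, 2}, {1, 3}, {2, 4}, {3}};
        const int n = static_cast<int>(adj.size());
        const int T = 5; // frame size (number of timeslots), chosen for the example

        // Distance-2 neighborhoods: neighbors plus neighbors of neighbors.
        std::vector<std::vector<int>> near2(n);
        for (int v = 0; v < n; ++v)
            for (int u : adj[v]) {
                near2[v].push_back(u);
                for (int w : adj[u])
                    if (w != v) near2[v].push_back(w);
            }

        std::mt19937 rng(42);
        std::uniform_int_distribution<int> pick(0, T - 1);
        std::vector<int> slot(n);
        for (int v = 0; v < n; ++v) slot[v] = pick(rng);

        for (int round = 0; round < 100; ++round) {
            bool collision = false;
            std::vector<int> next = slot;
            for (int v = 0; v < n; ++v)
                for (int u : near2[v])
                    if (slot[u] == slot[v]) {   // v's transmission collides
                        next[v] = pick(rng);    // re-pick a fresh slot
                        collision = true;
                        break;
                    }
            slot = next;
            if (!collision) {
                std::printf("collision-free after %d rounds:", round);
                for (int v = 0; v < n; ++v) std::printf(" %d", slot[v]);
                std::printf("\n");
                return 0;
            }
        }
        std::printf("did not converge within 100 rounds\n");
    }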
Analyzing the Performance of Lock-Free Data Structures: A Conflict-based Model
This paper considers the modeling and the analysis of the performance of
lock-free concurrent data structures. Lock-free designs employ an optimistic
conflict control mechanism, allowing several processes to access the shared
data object at the same time. They guarantee that at least one concurrent
operation finishes in a finite number of its own steps regardless of the state
of the operations. Our analysis considers such lock-free data structures that
can be represented as linear combinations of fixed size retry loops. Our main
contribution is a new way of modeling and analyzing a general class of
lock-free algorithms, achieving predictions of throughput that are close to
what we observe in practice. We emphasize two kinds of conflicts that shape the
performance: (i) hardware conflicts, due to concurrent calls to atomic
primitives; (ii) logical conflicts, caused by simultaneous operations on the
shared data structure. We show how to deal with these hardware and logical
conflicts separately, and how to combine them, so as to calculate the
throughput of lock-free algorithms. We also propose a common framework that
enables a fair comparison between lock-free implementations by covering the
whole contention domain, together with a better understanding of the
performance impacting factors. This part of our analysis comes with a method
for calculating a good back-off strategy to finely tune the performance of a
lock-free algorithm. Our experimental results, based on a set of widely used
concurrent data structures and on abstract lock-free designs, show that our
analysis closely follows the actual code behavior.
Comment: Short version to appear in DISC'1
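The analytical model itself is not given in the abstract; the sketch below is a hypothetical micro-benchmark of the kind of fixed-size retry loop the analysis targets, with a tunable spin back-off, to make the back-off knob concrete. The thread count, durations and counter workload are assumptions, not the paper's framework:

    // Hypothetical micro-benchmark: throughput of a fixed-size CAS retry loop
    // under contention, with a spin back-off applied after each failed CAS.
    #include <atomic>
    #include <chrono>
    #include <cstdio>
    #include <thread>
    #include <vector>

    std::atomic<long> shared_value{0};

    long run_thread(int backoff_spins, std::chrono::steady_clock::time_point deadline) {
        long completed = 0;
        while (std::chrono::steady_clock::now() < deadline) {
            long seen = shared_value.load();
            // Retry loop of the lock-free operation (here: an increment).
            while (!shared_value.compare_exchange_weak(seen, seen + 1)) {
                for (volatile int i = 0; i < backoff_spins; ++i) {} // back-off
            }
            ++completed;
        }
        return completed;
    }

    int main() {
        const int threads = 8;               // assumption: 8 contending threads
        for (int backoff : {0, 64, 512, 4096}) {
            std::atomic<long> total{0};
            const auto deadline = std::chrono::steady_clock::now()
                                  + std::chrono::milliseconds(200);
            std::vector<std::thread> pool;
            for (int t = 0; t < threads; ++t)
                pool.emplace_back([&, backoff] { total += run_thread(backoff, deadline); });
            for (auto& th : pool) th.join();
            std::printf("backoff=%5d spins -> %ld ops / 200 ms\n", backoff, total.load());
        }
    }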
Testing for persistence in mutual fund performance and the ex post verification problem: Evidence from the Greek market
The present study examines a series of performance measures as
an attempt to resolve the ex post verification problem. These measures are employed to test the performance persistence hypothesis of
domestic equity funds in Greece, during the period 1998-2004. Correctly adjusting for risk factors and documented portfolio strategies
explains a significant part of the reported persistence. The intercept of the augmented Carhart regression is proposed as the most appropriate
performance measure. Using this measure, weak evidence for persistence, only before 2001, is documented. The growth of the fund
industry, the direction of flows to past winners and the integration in the international financial system are suggested to be the reasons for
the absence of performance persistence.
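For reference, the baseline Carhart four-factor regression whose intercept serves as the performance measure is standard; the paper's further augmentation is not spelled out in the abstract, so only the baseline form is shown:
\[
R_{i,t}-R_{f,t}=\alpha_i+\beta_i\,(R_{m,t}-R_{f,t})+s_i\,\mathrm{SMB}_t+h_i\,\mathrm{HML}_t+m_i\,\mathrm{MOM}_t+\varepsilon_{i,t},
\]
where persistence tests ask whether funds ranked by past $\alpha_i$ continue to earn a positive $\alpha_i$ out of sample.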
Market Timing And Selectivity: An Empirical Investigation Into The Features Of Greek Mutual Fund Managers
This paper is an empirical assessment of the performance of mutual fund managers in terms of “market timing” and “selectivity”, within the framework suggested by Treynor and Mazuy (1966) and Henriksson and Merton (1981). The relevant data set is a balanced panel of nineteen Greek managers over a sixty-month period. Empirical evidence does not provide support for correct timing, irrespective of how the returns of the market index are calculated. It is interesting to note that using the Total Performance Index reduces the ability of managers for selectivity. This result holds for both of the models utilized in our study.
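The two cited timing frameworks are standard and, for reference, can be written as follows; the paper's estimation details, including the Total Performance Index adjustment, are not reproduced here:
\[
\text{Treynor--Mazuy:}\quad R_{p,t}-R_{f,t}=\alpha_p+\beta_p\,(R_{m,t}-R_{f,t})+\gamma_p\,(R_{m,t}-R_{f,t})^{2}+\varepsilon_{p,t},
\]
\[
\text{Henriksson--Merton:}\quad R_{p,t}-R_{f,t}=\alpha_p+\beta_p\,(R_{m,t}-R_{f,t})+\gamma_p\,\max\{0,\,R_{f,t}-R_{m,t}\}+\varepsilon_{p,t},
\]
with $\alpha_p$ capturing selectivity and a significantly positive $\gamma_p$ indicating market-timing ability in either specification.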
Configurable Strategies for Work-stealing
Work-stealing systems are typically oblivious to the nature of the tasks they
are scheduling. For instance, they do not know or take into account how long a
task will take to execute or how many subtasks it will spawn. Moreover, the
actual task execution order is typically determined by the underlying task
storage data structure, and cannot be changed. There are thus possibilities for
optimizing task parallel executions by providing information on specific tasks
and their preferred execution order to the scheduling system.
We introduce scheduling strategies to enable applications to dynamically
provide hints to the task-scheduling system on the nature of specific tasks.
Scheduling strategies can be used to independently control both local task
execution order as well as steal order. In contrast to conventional scheduling
policies that are normally global in scope, strategies allow the scheduler to
apply optimizations on individual tasks. This flexibility greatly improves
composability as it allows the scheduler to apply different, specific
scheduling choices for different parts of applications simultaneously. We
present a number of benchmarks that highlight diverse, beneficial effects that
can be achieved with scheduling strategies. Some benchmarks (branch-and-bound,
single-source shortest path) show that prioritization of tasks can reduce the
total amount of work compared to standard work-stealing execution order. For
other benchmarks (triangle strip generation) qualitatively better results can
be achieved in shorter time. Other optimizations, such as dynamic merging of
tasks or stealing of half the work, instead of half the tasks, are also shown
to improve performance. Composability is demonstrated by examples that combine
different strategies, both within the same kernel (prefix sum) as well as when
scheduling multiple kernels (prefix sum and unbalanced tree search).
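The paper's actual interface is not shown in the abstract; the following is a hypothetical sketch of the general idea of per-task hints that control local execution order separately from steal order. All names and the depth heuristic are assumptions, not the authors' API:

    // Hypothetical sketch of per-task scheduling hints: the local worker pops
    // the highest-priority task, while a thief steals the task treated as
    // largest (here approximated by spawn depth). Not the authors' API.
    #include <algorithm>
    #include <cstdio>
    #include <deque>
    #include <functional>

    struct Task {
        int priority;                 // user-provided hint, e.g. a branch-and-bound bound
        int depth;                    // spawn depth; shallower tasks tend to be larger
        std::function<void()> body;
    };

    struct Strategy {
        // Each returns true if task a should go before task b.
        bool (*local_before)(const Task&, const Task&);
        bool (*steal_before)(const Task&, const Task&);
    };

    Task take_local(std::deque<Task>& q, const Strategy& s) {
        auto it = std::max_element(q.begin(), q.end(),
            [&](const Task& a, const Task& b) { return s.local_before(b, a); });
        Task t = *it; q.erase(it); return t;
    }

    Task take_steal(std::deque<Task>& q, const Strategy& s) {
        auto it = std::max_element(q.begin(), q.end(),
            [&](const Task& a, const Task& b) { return s.steal_before(b, a); });
        Task t = *it; q.erase(it); return t;
    }

    int main() {
        Strategy prio_strategy{
            [](const Task& a, const Task& b) { return a.priority > b.priority; },
            [](const Task& a, const Task& b) { return a.depth < b.depth; }};
        std::deque<Task> q{{5, 2, [] { std::puts("A"); }},
                           {9, 3, [] { std::puts("B"); }},
                           {1, 0, [] { std::puts("C"); }}};
        take_local(q, prio_strategy).body();   // runs B (highest priority hint)
        take_steal(q, prio_strategy).body();   // steals C (shallowest, treated as largest)
    }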
Shared-object System Equilibria: Delay and Throughput Analysis
We consider shared-object systems that require their threads to fulfill the
system jobs by first sequentially acquiring the objects needed for the jobs and
then holding on to them until job completion. Such systems are at the core
of a variety of shared-resource allocation and synchronization systems. This
work opens a new perspective to study the expected job delay and throughput
analytically, given the possible set of jobs that may join the system
dynamically.
We identify the system dependencies that cause contention among the threads
as they try to acquire the job objects. We use these observations to define the
shared-object system equilibria. We note that the system is in equilibrium
whenever the rate at which jobs arrive at the system matches the job completion
rate. These equilibria consider not only the job delay but also the job
throughput, as well as the time in which each thread blocks other threads in
order to complete its job. We then further study in detail the thread work
cycles and, by using a graph representation of the problem, we are able to
propose procedures for finding and estimating equilibria, i.e., discovering the
job delay and throughput, as well as the blocking time.
To the best of our knowledge, this is a new perspective that can provide
better analytical tools for the problem, in order to estimate performance
measures similar to those that can be acquired through experimentation on
working systems and simulations, e.g., job delay and throughput in
(distributed) shared-object systems.
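A minimal worked form of the equilibrium notion described above, stated in standard queueing terms rather than as the paper's own derivation: if jobs arrive at rate $\lambda$ and the system completes jobs at throughput $X(N)$ when $N$ jobs are in progress, an equilibrium is a point $N^{\ast}$ with
\[
\lambda = X(N^{\ast}), \qquad\text{and, by Little's law,}\qquad N^{\ast}=\lambda\,W,
\]
so the expected job delay at that equilibrium follows from the throughput curve as $W=N^{\ast}/X(N^{\ast})$.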
The Lock-free $k$-LSM Relaxed Priority Queue
Priority queues are data structures which store keys in an ordered fashion to
allow efficient access to the minimal (maximal) key. Priority queues are
essential for many applications, e.g., Dijkstra's single-source shortest path
algorithm, branch-and-bound algorithms, and prioritized schedulers.
Efficient multiprocessor computing requires implementations of basic data
structures that can be used concurrently and scale to large numbers of threads
and cores. Lock-free data structures promise superior scalability by avoiding
blocking synchronization primitives, but the \emph{delete-min} operation is an
inherent scalability bottleneck in concurrent priority queues. Recent work has
focused on alleviating this obstacle either by batching operations, or by
relaxing the requirements to the \emph{delete-min} operation.
We present a new, lock-free priority queue that relaxes the \emph{delete-min}
operation so that it is allowed to delete \emph{any} of the $\rho$ smallest
keys, where $\rho$ is a runtime configurable parameter. Additionally, the
behavior is identical to a non-relaxed priority queue for items added and
removed by the same thread. The priority queue is built from a logarithmic
number of sorted arrays in a way similar to log-structured merge-trees. We
experimentally compare our priority queue to recent state-of-the-art lock-free
priority queues, both with relaxed and non-relaxed semantics, showing high
performance and good scalability of our approach.
Comment: Short version as ACM PPoPP'15 poster
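The concurrent algorithm is not included in the abstract; the sketch below is a purely sequential toy of the log-structured-merge layout it builds on, with levels of sorted arrays, merge-on-insert, and a delete-min that scans the per-level minima (a relaxed variant could return any of those minima). It is not the authors' lock-free implementation:

    // Sequential toy of an LSM-style priority queue layout (not the paper's
    // lock-free k-LSM): levels[i] is a sorted array; inserting merges levels
    // of equal size; delete_min scans the head (smallest key) of every level.
    #include <algorithm>
    #include <cstdio>
    #include <iterator>
    #include <vector>

    struct LsmPq {
        std::vector<std::vector<int>> levels;   // each level kept sorted ascending

        void insert(int key) {
            std::vector<int> carry{key};
            // Merge with existing levels of the same size, like binary addition.
            while (!levels.empty() && levels.back().size() == carry.size()) {
                std::vector<int> merged;
                std::merge(carry.begin(), carry.end(),
                           levels.back().begin(), levels.back().end(),
                           std::back_inserter(merged));
                levels.pop_back();
                carry = std::move(merged);
            }
            levels.push_back(std::move(carry));
        }

        bool delete_min(int* out) {
            int best = -1;
            for (int i = 0; i < static_cast<int>(levels.size()); ++i)
                if (!levels[i].empty() &&
                    (best < 0 || levels[i].front() < levels[best].front()))
                    best = i;                    // a relaxed variant may pick any head
            if (best < 0) return false;
            *out = levels[best].front();
            levels[best].erase(levels[best].begin());
            if (levels[best].empty()) levels.erase(levels.begin() + best);
            return true;
        }
    };

    int main() {
        LsmPq pq;
        for (int k : {5, 1, 4, 2, 3}) pq.insert(k);
        int m;
        while (pq.delete_min(&m)) std::printf("%d ", m);   // prints 1 2 3 4 5
        std::printf("\n");
    }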