Profiling user activities with minimal traffic traces
Understanding user behavior is essential to personalize and enrich a user's
online experience. While there are significant benefits to be accrued from the
pursuit of personalized services based on a fine-grained behavioral analysis,
care must be taken to address user privacy concerns. In this paper, we consider
the use of web traces with truncated URLs - each URL is trimmed to only contain
the web domain - for this purpose. While such truncation removes the
fine-grained sensitive information, it also strips the data of many features
that are crucial to the profiling of user activity. We show how to overcome
this lack of crucial features and, with high accuracy, filter the URLs that
represent a user activity out of the noisy network traffic trace (including
advertisements, spam, analytics, and web scripts). This
activity profiling with truncated URLs enables the network operators to provide
personalized services while mitigating privacy concerns by storing and sharing
only truncated traffic traces.
In order to offset the accuracy loss due to truncation, our statistical
methodology leverages specialized features extracted from a group of
consecutive URLs that represent a micro user action like web click, chat reply,
etc., which we call bursts. These bursts, in turn, are detected by a novel
algorithm which is based on our observed characteristics of the inter-arrival
time of HTTP records. We present an extensive experimental evaluation on a real
dataset of mobile web traces, consisting of more than 130 million records,
representing the browsing activities of 10,000 users over a period of 30 days.
Our results show that the proposed methodology achieves around 90% accuracy in
segregating URLs representing user activities from non-representative URLs.
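The abstract says bursts are detected from the inter-arrival time of HTTP records but does not give the algorithm. As a rough illustration only, the sketch below groups request timestamps into bursts with a fixed gap threshold. The function name `split_into_bursts` and the 1-second `max_gap` are assumptions made for this example, not details from the paper.

```python
from typing import List


def split_into_bursts(timestamps: List[float], max_gap: float = 1.0) -> List[List[float]]:
    """Group request timestamps (in seconds) into bursts.

    A new burst starts whenever the gap to the previous request exceeds
    max_gap. The threshold is a hypothetical value chosen for illustration;
    the paper derives its detector from observed inter-arrival statistics.
    """
    bursts: List[List[float]] = []
    for t in sorted(timestamps):
        if bursts and t - bursts[-1][-1] <= max_gap:
            # Close enough to the previous request: same micro user action.
            bursts[-1].append(t)
        else:
            # Large gap (or first record): open a new burst.
            bursts.append([t])
    return bursts


if __name__ == "__main__":
    trace = [0.0, 0.2, 0.5, 4.0, 4.1, 9.7]
    for burst in split_into_bursts(trace, max_gap=1.0):
        print(len(burst), burst)
```

Per-burst features (for example, the number of records or the set of domains in a burst) could then be computed on each group; which features the authors actually use is not stated in the abstract.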
A Novel Multiobjective Cell Switch-Off Framework for Cellular Networks
Cell Switch-Off (CSO) is recognized as a promising approach to reduce the
energy consumption in next-generation cellular networks. However, CSO poses
serious challenges not only from the resource allocation perspective but also
from the implementation point of view. Indeed, CSO represents a difficult
optimization problem due to its NP-complete nature. Moreover, there are a
number of important practical limitations in the implementation of CSO schemes,
such as the need for minimizing the real-time complexity and the number of
on-off/off-on transitions and CSO-induced handovers. This article introduces a
novel approach to CSO based on multiobjective optimization that makes use of
the statistical description of the service demand (known by operators). In
addition, downlink and uplink coverage criteria are included and a comparative
analysis between different models to characterize intercell interference is
also presented to shed light on their impact on CSO. The framework
distinguishes itself from other proposals in two ways: 1) the number of
on-off/off-on transitions and handovers is minimized, and 2) the
computationally-heavy part of the algorithm is executed offline, which makes
its implementation feasible. The results show that the proposed scheme achieves
substantial energy savings in small cell deployments where service demand is
not uniformly distributed, without compromising the Quality-of-Service (QoS) or
requiring heavy real-time processing.
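To make the multiobjective idea concrete, the sketch below keeps only the non-dominated (Pareto-optimal) candidates when two objectives are minimized together. It is a generic illustration, not the article's framework: the candidate labels, the energy figures, and the transition counts are invented for the example, and the article's actual objectives and solver are not described in the abstract.

```python
from typing import List, Tuple

# (label, energy consumption, number of on-off/off-on transitions)
# All values below are hypothetical and used only for illustration.
Candidate = Tuple[str, float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in both objectives and better in one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


if __name__ == "__main__":
    candidates = [
        ("A", 100.0, 0),
        ("B", 80.0, 2),
        ("C", 85.0, 3),
        ("D", 60.0, 5),
        ("E", 60.0, 6),
    ]
    for label, energy, transitions in pareto_front(candidates):
        print(label, energy, transitions)
```

An operator would then pick one point on the front according to how it weighs energy savings against switching activity. Because this filtering needs no real-time input, it fits the abstract's remark that the heavy computation runs offline.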
Unsupervised host behavior classification from connection patterns
A novel host behavior classification approach is proposed as a preliminary step toward traffic classification and anomaly detection in network communication. Though many attempts described in the literature were devoted to flow or application classification, these approaches are not always adaptable to the operational constraints of traffic monitoring (expected to work even without packet payload, without bidirectionality, on high-speed networks, or from flow reports only). Instead, the classification proposed here relies on the leading idea that traffic is relevantly analyzed in terms of typical host behaviors: typical connection patterns of both legitimate applications (data sharing, downloading, ...) and anomalous (possibly aggressive) behaviors are obtained by profiling traffic at the host level using unsupervised statistical classification. Classification at the host level is not reducible to flow or application classification, and neither is the contrary: they are different operations which might have complementary roles in network management. The proposed host classification is based on a nine-dimensional feature space evaluating host Internet connectivity, dispersion, and exchanged traffic content. A Minimum Spanning Tree (MST) clustering technique is developed that does not require any supervised learning step to produce a set of statistically established typical host behaviors. Not relying on a priori defined classes of known behaviors enables the procedure to discover new host behaviors that may never have been observed before. This procedure is applied to traffic collected over the entire year 2008 on a transpacific (Japan/USA) link. A cross-validation of this unsupervised classification against a classical port-based inspection and a state-of-the-art method assesses the meaningfulness and relevance of the obtained classes of host behaviors.
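As a minimal sketch of what MST-based clustering can look like, the code below builds a minimum spanning tree over feature vectors with Prim's algorithm and then drops every tree edge longer than a threshold, so that the remaining connected components form the clusters. This is a generic textbook variant, not the authors' procedure: the Euclidean distance, the fixed `cut` threshold, the function names, and the 2-D toy points (the paper uses nine features per host) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def mst_edges(points: List[Point]) -> List[Tuple[float, int, int]]:
    """Prim's algorithm on the complete Euclidean graph.

    Returns the MST as a list of (weight, i, j) edges.
    """
    n = len(points)
    in_tree = [False] * n
    # best[v] = (distance from v to the tree, tree node realizing it)
    best: List[Tuple[float, int]] = [(math.inf, -1)] * n
    if n:
        best[0] = (0.0, -1)
    edges: List[Tuple[float, int, int]] = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v][0]:
                    best[v] = (d, u)
    return edges


def mst_clusters(points: List[Point], cut: float) -> List[int]:
    """Label points by the MST components left after removing edges > cut."""
    parent = list(range(len(points)))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for weight, i, j in mst_edges(points):
        if weight <= cut:  # keep short edges, drop long ones (hypothetical rule)
            parent[find(i)] = find(j)

    roots: dict = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]


if __name__ == "__main__":
    hosts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
    print(mst_clusters(hosts, cut=2.0))
```

No class labels are supplied at any point, which is the property the abstract emphasizes: groups emerge from the geometry of the feature space, so previously unseen behaviors can surface as their own clusters.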