DoWitcher: Effective Worm Detection and Containment in the Internet Core
Enterprise networks are increasingly offloading the responsibility for worm detection and containment to the carrier networks. However, current approaches to the zero-day worm detection problem, such as those based on content similarity of packet payloads, are not scalable to carrier link speeds (OC-48 and upwards). In this paper, we introduce a new system, DoWitcher, which, in contrast to previous approaches, is scalable as well as able to detect the stealthiest worms that employ low propagation rates or polymorphism to evade detection. DoWitcher uses an incremental approach to worm detection: first, it examines the layer-4 traffic features to discern the presence of a worm anomaly; next, it determines a flow-filter mask that can be applied to isolate the suspect worm flows; and finally, it enables full-packet capture of only those flows that match the mask, which are then processed by a longest common subsequence algorithm to extract the worm content signature. Via a proof-of-concept implementation on a commercially available network analyzer processing raw packets from an OC-48 link, we demonstrate the capability of DoWitcher to detect low-rate worms and extract signatures even for polymorphic worms.
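The final stage of the pipeline above extracts a worm content signature with a longest common subsequence computation over captured payloads. A minimal dynamic-programming LCS over byte strings, as a sketch of that step only (DoWitcher's actual implementation must run at OC-48 line rates and is not shown in the abstract):

```python
def lcs(a: bytes, b: bytes) -> bytes:
    """Return one longest common subsequence of two byte strings."""
    m, n = len(a), len(b)
    # dp[i][j] = length of an LCS of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Backtrack through the table to recover the subsequence itself.
    out, i, j = bytearray(), m, n
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return bytes(reversed(out))
```

Applied to two suspect payloads, the returned subsequence is the invariant content that survives in both, which is what makes LCS attractive against polymorphic payloads that reshuffle the surrounding bytes.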
A Probabilistic Approach to Robust Optimal Experiment Design with Chance Constraints
Accurate estimation of parameters is paramount in developing high-fidelity
models for complex dynamical systems. Model-based optimal experiment design
(OED) approaches enable systematic design of dynamic experiments to generate
input-output data sets with high information content for parameter estimation.
Standard OED approaches, however, face two challenges: (i) experiment design
under incomplete system information due to unknown true parameters, which
usually requires many iterations of OED; and (ii) the inability to systematically
account for the inherent uncertainties of complex systems, which can lead to
diminished effectiveness of the designed optimal excitation signal as well as
violation of system constraints. This paper presents a robust OED approach for
nonlinear systems with arbitrarily-shaped time-invariant probabilistic
uncertainties. Polynomial chaos is used for efficient uncertainty propagation.
The distinct feature of the robust OED approach is the inclusion of chance
constraints to ensure constraint satisfaction in a stochastic setting. The
presented approach is demonstrated by optimal experimental design for the
JAK-STAT5 signaling pathway that regulates various cellular processes in a
biological cell.
Comment: Submitted to ADCHEM 201
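In generic form (the symbols below are illustrative, not necessarily the paper's notation), a chance-constrained OED problem maximizes a scalar criterion of the Fisher information matrix while requiring the path constraints to hold with high probability under the parameter distribution:

```latex
\max_{u(\cdot)} \;\; \phi\!\left( F(u, \theta) \right)
\quad \text{s.t.} \quad
\Pr\!\left[\, g\big(x(t), u(t), \theta\big) \le 0 \,\right] \ge 1 - \epsilon,
\qquad \theta \sim p(\theta),
```

where $\phi$ is a design criterion (e.g. D-optimality), $g$ collects the system constraints, and $\epsilon$ is the admissible violation probability; the role of polynomial chaos in the approach is to make propagating $p(\theta)$ through the nonlinear dynamics, and hence evaluating such probabilities, computationally tractable.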
Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments
Data centres that use consumer-grade disks drives and distributed
peer-to-peer systems are unreliable environments to archive data without enough
redundancy. Most redundancy schemes are not completely effective for providing
high availability, durability and integrity in the long-term. We propose alpha
entanglement codes, a mechanism that creates a virtual layer of highly
interconnected storage devices to propagate redundant information across a
large scale storage system. Our motivation is to design flexible and practical
erasure codes with high fault-tolerance to improve data durability and
availability even in catastrophic scenarios. By flexible and practical, we mean
code settings that can be adapted to future requirements and practical
implementations with reasonable trade-offs between security, resource usage and
performance. The codes have three parameters. Alpha increases storage overhead
linearly but increases the possible paths to recover data exponentially. Two
other parameters increase fault-tolerance even further without the need of
additional storage. As a result, an entangled storage system can provide high
availability, durability and offer additional integrity: it is more difficult
to modify data undetectably. We evaluate how several redundancy schemes perform
in unreliable environments and show that alpha entanglement codes are flexible
and practical codes. Remarkably, they excel at code locality, hence, they
reduce repair costs and become less dependent on storage locations with poor
availability. Our solution outperforms Reed-Solomon codes in many disaster
recovery scenarios.
Comment: The publication has 12 pages and 13 figures. This work was partially supported by Swiss National Science Foundation SNSF Doc.Mobility 162014. 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
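The idea of propagating redundant information across interconnected devices can be illustrated with a toy single XOR chain, in which each parity entangles a new data block with the previous parity, so a lost block can be rebuilt from its two neighbouring parities. This is a deliberately minimal sketch of the entanglement principle only; the actual alpha entanglement codes use multiple interleaved chains governed by the three parameters, and all names here are illustrative:

```python
def xor_block(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def entangle_chain(blocks, p0):
    """Single-chain entanglement: parity[i] = data[i] XOR parity[i-1]."""
    parities, prev = [], p0
    for d in blocks:
        prev = xor_block(d, prev)
        parities.append(prev)
    return parities

def recover(p_prev, p_curr):
    """Rebuild a lost data block from its neighbouring parities."""
    return xor_block(p_prev, p_curr)
```

Because each parity depends on the whole prefix of the chain, every additional chain multiplies the number of distinct recovery paths through the parities, which is the intuition behind the exponential growth in recovery paths the abstract attributes to alpha.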
Measuring Similarity in Large-Scale Folksonomies
Social (or folksonomic) tagging has become a very popular way to describe content within Web 2.0 websites. Unlike taxonomies, which impose a hierarchical categorisation on content, folksonomies enable end-users to freely create and choose the categories (in this case, tags) that best describe some content. However, as tags are informally defined, continually changing, and ungoverned, social tagging has often been criticised for lowering, rather than increasing, the efficiency of searching, due to the number of synonyms, homonyms, and polysemous terms, as well as the heterogeneity of users and the noise they introduce. To address this issue, a variety of approaches have been proposed that recommend to users what tags to use, both when labelling and when looking for resources. As we illustrate in this paper, real-world folksonomies are characterised by power-law distributions of tags, over which commonly used similarity metrics, including the Jaccard coefficient and cosine similarity, fail to compute meaningful values. We thus propose a novel metric, specifically developed to capture similarity in large-scale folksonomies, that is based on a mutual reinforcement principle: two tags are deemed similar if they have been associated with similar resources and, vice versa, two resources are deemed similar if they have been labelled with similar tags. We offer an efficient realisation of this similarity metric and assess its quality experimentally by comparing it against cosine similarity on three large-scale datasets, namely Bibsonomy, MovieLens and CiteULike.
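The mutual reinforcement principle can be sketched as a fixed-point iteration over a tag-resource incidence matrix: tag similarity is refreshed from resource similarity and vice versa until the two stabilise. This is a toy NumPy sketch under stated assumptions (the incidence matrix, the max-normalisation, and the iteration count are all illustrative choices, not the paper's exact metric or its efficient realisation):

```python
import numpy as np

# Hypothetical incidence matrix: A[t, r] = 1 if tag t labels resource r.
# Tags 0 and 1 label the same resources; tag 2 labels a disjoint one.
A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]], dtype=float)

def mutual_reinforcement(A, iters=20):
    """Iterate: tags are similar if they label similar resources, and
    resources are similar if they are labelled by similar tags."""
    T, R = A.shape
    tag_sim, res_sim = np.eye(T), np.eye(R)
    for _ in range(iters):
        new_tag = A @ res_sim @ A.T      # propagate resource similarity to tags
        new_res = A.T @ tag_sim @ A      # propagate tag similarity to resources
        new_tag /= new_tag.max() or 1.0  # scale into [0, 1]
        new_res /= new_res.max() or 1.0
        np.fill_diagonal(new_tag, 1.0)   # self-similarity is always 1
        np.fill_diagonal(new_res, 1.0)
        tag_sim, res_sim = new_tag, new_res
    return tag_sim, res_sim
```

On this toy matrix the iteration drives the similarity of the two co-occurring tags to 1 while leaving the unrelated tag at 0, matching the intuition that tags attached to the same resources should be deemed similar.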