
1,037 research outputs found

Polygraph: Automatically generating signatures for polymorphic worms

Author: Karp B
Newsome J
Song D
Publication date: 10/11/2005

It is widely believed that content-signature-based intrusion detection systems (IDSes) are easily evaded by polymorphic worms, which vary their payload on every infection attempt. In this paper, we present Polygraph, a signature generation system that successfully produces signatures that match polymorphic worms. Polygraph generates signatures that consist of multiple disjoint content sub-strings. In doing so, Polygraph leverages our insight that for a real-world exploit to function properly, multiple invariant substrings must often be present in all variants of a payload; these substrings typically correspond to protocol framing, return addresses, and in some cases, poorly obfuscated code. We contribute a definition of the polymorphic signature generation problem; propose classes of signature suited for matching polymorphic worm payloads; and present algorithms for automatic generation of signatures in these classes. Our evaluation of these algorithms on a range of polymorphic worms demonstrates that Polygraph produces signatures for polymorphic worms that exhibit low false negatives and false positives. © 2005 IEEE
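The idea of a signature built from multiple disjoint content substrings can be illustrated with a minimal sketch. This is not Polygraph's implementation: the function name `matches_token_subsequence` and the example tokens are hypothetical, and the sketch only covers matching an ordered list of substrings against a payload, not generating the signature.

```python
def matches_token_subsequence(payload: bytes, tokens: list[bytes]) -> bool:
    """Return True if every token occurs in payload, in order, without overlap.

    Hypothetical helper for illustration; greedy leftmost matching is enough
    to decide whether the ordered tokens fit inside the payload.
    """
    pos = 0
    for token in tokens:
        idx = payload.find(token, pos)
        if idx < 0:
            return False          # a required invariant substring is missing
        pos = idx + len(token)    # the next token must start after this one
    return True


# Hypothetical invariant substrings: protocol framing plus a return address.
signature = [b"GET ", b"HTTP/1.1\r\n", b"\xff\xbf"]
variant = b"GET /x1y2 HTTP/1.1\r\nHost: a\r\n" + b"\x90" * 8 + b"\xff\xbf"
print(matches_token_subsequence(variant, signature))
```

The bytes between the tokens are free to vary from one infection attempt to the next, which is why a single contiguous substring would be a weaker signature here.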

UCL Discovery

Knowledge Discovery in Documents by Extracting Frequent Word Sequences

Author: Ahonen Helena
Publication venue: Graduate School of Library and Information Science. University of Illinois at Urbana-Champaign
Publication date: 01/01/1999

published or submitted for publication

Illinois Digital Environment for Access to Learning and Scholarship Repository

VirtualHome: Simulating Household Activities via Programs

Author: Boben Marko
Fidler Sanja
Li Jiaman
Puig Xavier
Ra Kevin
Torralba Antonio
Wang Tingwu
Publication date: 18/06/2018

In this paper, we are interested in modeling complex activities that occur in a typical household. We propose to use programs, i.e., sequences of atomic actions and interactions, as a high level representation of complex tasks. Programs are interesting because they provide a non-ambiguous representation of a task, and allow agents to execute them. However, nowadays, there is no database providing this type of information. Towards this goal, we first crowd-source programs for a variety of activities that happen in people's homes, via a game-like interface used for teaching kids how to code. Using the collected dataset, we show how we can learn to extract programs directly from natural language descriptions or from videos. We then implement the most common atomic (inter)actions in the Unity3D game engine, and use our programs to "drive" an artificial agent to execute tasks in a simulated household environment. Our VirtualHome simulator allows us to create a large activity video dataset with rich ground-truth, enabling training and testing of video understanding models. We further showcase examples of our agent performing tasks in our VirtualHome based on language descriptions. Comment: CVPR 2018 (Oral)

arXiv.org e-Print Archive

DSpace@MIT

Crossref

Long-Term Visual Object Tracking Benchmark

Author: AW Smeulders
B Babenko
C Vondrick
D Held
H Grabner
H Li
J Zhang
Jack Valmadre
JF Henriques
M Danelljan
M Kristan
M Kumar
M Mueller
P Liang
WL Lu
Y Hua
Y Li
Y Wu
Z Kalal
Publication date: 01/01/2019

We propose a new long video dataset (called Track Long and Prosper - TLP) and benchmark for single object tracking. The dataset consists of 50 HD videos from real world scenarios, encompassing a duration of over 400 minutes (676K frames), making it more than 20-fold larger in average duration per sequence and more than 8-fold larger in terms of total covered duration, as compared to existing generic datasets for visual tracking. The proposed dataset paves a way to suitably assess long term tracking performance and train better deep learning architectures (avoiding/reducing augmentation, which may not reflect real world behaviour). We benchmark the dataset on 17 state of the art trackers and rank them according to tracking accuracy and run time speeds. We further present thorough qualitative and quantitative evaluation highlighting the importance of the long term aspect of tracking. Our most interesting observations are (a) existing short sequence benchmarks fail to bring out the inherent differences in tracking algorithms, which widen while tracking on long sequences, and (b) the accuracy of trackers abruptly drops on challenging long sequences, suggesting the potential need for research efforts in the direction of long-term tracking. Comment: ACCV 2018 (Oral)

arXiv.org e-Print Archive

Crossref

Multivariate Fine-Grained Complexity of Longest Common Subsequence

Author: Bringmann Karl
Künnemann Marvin
Publication venue: Society for Industrial & Applied Mathematics (SIAM)
Publication date: 01/01/2018

We revisit the classic combinatorial pattern matching problem of finding a longest common subsequence (LCS). For strings $x$ and $y$ of length $n$, a textbook algorithm solves LCS in time $O(n^2)$, but although much effort has been spent, no $O(n^{2-\varepsilon})$-time algorithm is known. Recent work indeed shows that such an algorithm would refute the Strong Exponential Time Hypothesis (SETH) [Abboud, Backurs, Vassilevska Williams + Bringmann, Künnemann FOCS'15]. Despite the quadratic-time barrier, for over 40 years an enduring scientific interest continued to produce fast algorithms for LCS and its variations. Particular attention was put into identifying and exploiting input parameters that yield strongly subquadratic time algorithms for special cases of interest, e.g., differential file comparison. This line of research was successfully pursued until 1990, at which time significant improvements came to a halt. In this paper, using the lens of fine-grained complexity, our goal is to (1) justify the lack of further improvements and (2) determine whether some special cases of LCS admit faster algorithms than currently known. To this end, we provide a systematic study of the multivariate complexity of LCS, taking into account all parameters previously discussed in the literature: the input size $n:=\max\{|x|,|y|\}$, the length of the shorter string $m:=\min\{|x|,|y|\}$, the length $L$ of an LCS of $x$ and $y$, the numbers of deletions $\delta := m-L$ and $\Delta := n-L$, the alphabet size, as well as the numbers of matching pairs $M$ and dominant pairs $d$. For any class of instances defined by fixing each parameter individually to a polynomial in terms of the input size, we prove a SETH-based lower bound matching one of three known algorithms. Specifically, we determine the optimal running time for LCS under SETH as $(n+\min\{d, \delta \Delta, \delta m\})^{1\pm o(1)}$. [...] Comment: Presented at SODA'18. Full Version. 66 pages
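For orientation, the textbook quadratic-time algorithm mentioned in the abstract can be sketched as follows. This is the standard dynamic program, not one of the parameterized algorithms the paper studies; the function name `lcs_length` is chosen here for illustration, and the sketch keeps only two rows of the table to limit memory.

```python
def lcs_length(x: str, y: str) -> int:
    """Length L of a longest common subsequence of x and y.

    Standard dynamic program: O(|x| * |y|) time, O(|y|) extra space.
    """
    prev = [0] * (len(y) + 1)          # row for the previous prefix of x
    for cx in x:
        curr = [0]                     # LCS with the empty prefix of y is 0
        for j, cy in enumerate(y, start=1):
            if cx == cy:
                curr.append(prev[j - 1] + 1)            # extend a match
            else:
                curr.append(max(prev[j], curr[j - 1]))  # drop one character
        prev = curr
    return prev[-1]


x, y = "ABCBDAB", "BDCABA"
L = lcs_length(x, y)
n, m = max(len(x), len(y)), min(len(x), len(y))
# Deletion counts as defined in the abstract: delta = m - L, Delta = n - L.
print(L, m - L, n - L)
```

The two deletion counts computed at the end are the parameters $\delta$ and $\Delta$ from the abstract, evaluated on a small example.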

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Growth rates for subclasses of Av(321)

Author: Albert M. H.
Atkinson M. D.
Brignall R.
Ruškuc N.
Smith Rebecca
West J.
Publication date: 09/10/2009

Pattern classes which avoid 321 and other patterns are shown to have the same growth rates as similar (but strictly larger) classes obtained by adding articulation points to any or all of the other patterns. The method of proof is to show that the elements of the latter classes can be represented as bounded merges of elements of the original class, and that the bounded merge construction does not change growth rates.

arXiv.org e-Print Archive

CiteSeerX

Open Research Online (The Open University)

University of St. Andrews - Pure

St Andrews Research Repository

JigsawNet: Shredded Image Reassembly using Convolutional Neural Network and Loop-based Composition

Author: Le Canyu
Li Xin
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 11/09/2018

This paper proposes a novel algorithm to reassemble an arbitrarily shredded image to its original status. Existing reassembly pipelines commonly consist of a local matching stage and a global composition stage. In the local stage, a key challenge in fragment reassembly is to reliably compute and identify correct pairwise matches, for which most existing algorithms use handcrafted features and hence cannot reliably handle complicated puzzles. We build a deep convolutional neural network to detect the compatibility of a pairwise stitching, and use it to prune computed pairwise matches. To improve the network efficiency and accuracy, we transfer the calculation of the CNN to the stitching region and apply a boost training strategy. In the global composition stage, we replace the commonly adopted greedy edge selection strategies with two new loop-closure-based searching algorithms. Extensive experiments show that our algorithm significantly outperforms existing methods on solving various puzzles, especially challenging ones with many fragment pieces.

arXiv.org e-Print Archive

Louisiana State University

Shape Optimization Problems for Metric Graphs

Author: Buttazzo Giuseppe
Ruffini Berardo
Velichkov Bozhidar
Publication venue: EDP Sciences
Publication date: 29/08/2013

We consider the shape optimization problem $\min\big\{{\mathcal E}(\Gamma)\ :\ \Gamma\in{\mathcal A},\ {\mathcal H}^1(\Gamma)=l\ \big\}$, where ${\mathcal H}^1$ is the one-dimensional Hausdorff measure and ${\mathcal A}$ is an admissible class of one-dimensional sets connecting some prescribed set of points ${\mathcal D}=\{D_1,\dots,D_k\}\subset{\mathbb R}^d$. The cost functional ${\mathcal E}(\Gamma)$ is the Dirichlet energy of $\Gamma$ defined through the Sobolev functions on $\Gamma$ vanishing on the points $D_i$. We analyze the existence of a solution in both the families of connected sets and of metric graphs. At the end, several explicit examples are discussed. Comment: 23 pages, 11 figures, ESAIM Control Optim. Calc. Var., (to appear)

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Archivio della Ricerca - Università di Pisa

Numérisation de Documents Anciens Mathématiques

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Hal-Diderot

Comparing Java Programs: Syntactic and Contextual Semantic Differences

Author: Huynh Quoc Hung Le
Publication date: 01/01/2005

This thesis describes the foundation for developing a tool that compares Java programs, or different versions of a program. The tool captures syntactic differences and contextual semantic differences as well. Syntactic differences are “ordinary” changes in the code. This tool works much in the same way as the Unix tool diff, but it is much smarter than diff. This is because it exploits the fact that programs are built differently than ordinary text. The tool diff’s purpose is to compare text, and it will therefore give imprecise or too verbose results. The tool described in this thesis can identify contextual semantic differences because it knows the contexts of methods, meaning that it knows whether methods are directly declared in the class, inherited from implemented interfaces or if methods override the class’ parent’s method. The approach in this thesis for comparing Java programs is to transform the programs into abstract syntax trees. The transformation from source code to abstract syntax trees is done with the help of Strafunski. Strafunski is a software bundle that supports generic programming. The implementation of the tool is done in Haskell. Haskell is a functional programming language. The work of comparing abstract syntax trees can be broken down into the problem of finding the largest common subtree of two abstract syntax trees and, furthermore, the problem of finding the longest common subsequence of two sequences. This thesis describes and presents new algorithms for doing this, and it also describes working Haskell code of the implementation of the tool.

NORA - Norwegian Open Research Archives

On space efficiency of algorithms working on structural decompositions of graphs

Author: Pilipczuk Michał
Wrochna Marcin
Publication date: 01/01/2016

Dynamic programming on path and tree decompositions of graphs is a technique that is ubiquitous in the field of parameterized and exponential-time algorithms. However, one of its drawbacks is that the space usage is exponential in the decomposition's width. Following the work of Allender et al. [Theory of Computing, '14], we investigate whether this space complexity explosion is unavoidable. Using the idea of reparameterization of Cai and Juedes [J. Comput. Syst. Sci., '03], we prove that the question is closely related to a conjecture that the Longest Common Subsequence problem parameterized by the number of input strings does not admit an algorithm that simultaneously uses XP time and FPT space. Moreover, we complete the complexity landscape sketched for pathwidth and treewidth by Allender et al. by considering the parameter tree-depth. We prove that computations on tree-depth decompositions correspond to a model of non-deterministic machines that work in polynomial time and logarithmic space, with access to an auxiliary stack of maximum height equal to the decomposition's depth. Together with the results of Allender et al., this describes a hierarchy of complexity classes for polynomial-time non-deterministic machines with different restrictions on the access to working space, which mirrors the classic relations between treewidth, pathwidth, and tree-depth. Comment: An extended abstract appeared in the proceedings of STACS'16. The new version is augmented with a space-efficient algorithm for Dominating Set using the Chinese remainder theorem

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server