Algorithmic complexity for psychology: A user-friendly implementation of the coding theorem method
Kolmogorov-Chaitin complexity has long been believed to be impossible to
approximate when it comes to short sequences (e.g. of length 5-50). However,
with the newly developed \emph{coding theorem method} the complexity of strings
of length 2-11 can now be numerically estimated. We present the theoretical
basis of algorithmic complexity for short strings (ACSS) and describe an
R-package providing functions based on ACSS that will cover psychologists'
needs and improve upon previous methods in three ways: (1) ACSS is now
available not only for binary strings, but for strings based on up to 9
different symbols, (2) ACSS no longer requires time-consuming computing, and
(3) a new approach based on ACSS gives access to an estimation of the
complexity of strings of any length. Finally, three illustrative examples show
how these tools can be applied to psychology.
Comment: to appear in "Behavioral Research Methods", 14 pages in journal
format, R package at http://cran.r-project.org/web/packages/acss/index.htm
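For intuition, the coding theorem method rests on the relation K(s) ≈ -log2 D(s), where D(s) is the algorithmic probability of s, estimated as the frequency with which s is produced by small halting Turing machines. The Python sketch below is a toy reconstruction of that idea, not the acss package's implementation: the machine size (2 states, 2 symbols), the sampling scheme, the step bound, and the machine count are all illustrative choices.

```python
# Toy coding-theorem-method estimate: sample small random Turing machines,
# collect the outputs of those that halt, and convert output frequencies
# D(s) into complexity estimates K(s) ~ -log2 D(s).
import math
import random
from collections import Counter

HALT = -1
STATES, SYMBOLS = 2, 2

def random_machine():
    """Random transition table: (state, symbol) -> (write, move, next state)."""
    return {
        (st, sy): (
            random.randrange(SYMBOLS),                    # symbol to write
            random.choice((-1, 1)),                       # head move
            random.choice([HALT] + list(range(STATES))),  # next state
        )
        for st in range(STATES)
        for sy in range(SYMBOLS)
    }

def run(table, max_steps=100):
    """Run on a blank tape; return the visited tape as a string if it halts."""
    tape, pos, state = {}, 0, 0
    lo = hi = 0
    for _ in range(max_steps):
        write, move, nxt = table[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        lo, hi = min(lo, pos), max(hi, pos)
        if nxt == HALT:
            return "".join(str(tape.get(i, 0)) for i in range(lo, hi + 1))
        state = nxt
    return None  # treated as non-halting within the step bound

counts = Counter()
for _ in range(50_000):
    out = run(random_machine())
    if out is not None:
        counts[out] += 1

total = sum(counts.values())
for s, c in counts.most_common(8):
    print(f"{s!r}  D={c / total:.4f}  K ~ {-math.log2(c / total):.2f} bits")
```

Frequently produced strings receive low complexity estimates and rare ones high estimates, which is the intuition behind the precomputed ACSS values the package provides.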
On the Feasibility of Malware Authorship Attribution
There are many occasions on which the security community is interested in
discovering the authorship of malware binaries, either for digital forensics
analysis of malware corpora or for thwarting live threats of malware invasion.
Such a discovery of authorship might be possible thanks to stylistic features
inherent in software written by human programmers. Existing studies of
authorship attribution of general-purpose software mainly focus on source code,
drawing on features of programming style and development environment. However,
those features critically depend on the availability of the program source
code, which is usually not the case when dealing with malware binaries. Such
program binaries often do not retain many semantic or stylistic features due to
the compilation process. Therefore, authorship attribution in the domain of
malware binaries based on features and styles that will survive the compilation
process is challenging. This paper surveys the state of the art in this
literature. Further, we analyze the features involved in those techniques. By
using a case study, we identify features that can survive the compilation
process. Finally, we analyze existing works on binary authorship attribution
and study their applicability to real malware binaries.
Comment: FPS 201
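One concrete example of a feature family that tends to survive compilation, widely used in binary authorship work, is the distribution of opcode n-grams over disassembled code. The Python sketch below is generic rather than the method of any paper discussed here; the mnemonic stream is invented for illustration and would in practice come from a disassembler such as objdump or Capstone.

```python
# Opcode n-gram frequencies as a crude stylometric feature vector.
from collections import Counter

def opcode_ngrams(opcodes, n=2):
    """Relative frequencies of consecutive opcode n-grams."""
    grams = Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )
    total = sum(grams.values())
    return {gram: count / total for gram, count in grams.items()}

# Hypothetical mnemonic stream for one disassembled function:
stream = ["push", "mov", "mov", "call", "test", "jz", "mov", "pop", "ret"]
for gram, freq in sorted(opcode_ngrams(stream).items()):
    print(gram, round(freq, 3))
```

Vectors like this can then be fed to an ordinary classifier, although, as the discussion above notes, the compilation process can still distort them.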
Strategies for protecting intellectual property when using CUDA applications on graphics processing units
Recent advances in the massively parallel computational abilities of graphics processing units (GPUs) have increased their use for general purpose computation, as companies look to take advantage of big data processing techniques. This has given rise to the potential for malicious software targeting GPUs, which is of interest to forensic investigators examining the operation of software. The ability to carry out reverse-engineering of software is of great importance within the security and forensics fields, particularly when investigating malicious software or carrying out forensic analysis following a successful security breach. Due to the complexity of the Nvidia CUDA (Compute Unified Device Architecture) framework, it is not clear how best to approach the reverse engineering of a piece of CUDA software. We carry out a review of the different binary output formats which may be encountered from the CUDA compiler, and their implications on reverse engineering. We then demonstrate the process of carrying out disassembly of an example CUDA application, to establish the various techniques available to forensic investigators carrying out black-box disassembly and reverse engineering of CUDA binaries. We show that the Nvidia compiler, using default settings, leaks useful information. Finally, we demonstrate techniques to better protect intellectual property in CUDA algorithm implementations from reverse engineering.
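To make the black-box workflow concrete, the sketch below drives cuobjdump, the inspection tool shipped with the CUDA toolkit, to pull any embedded PTX and SASS out of a target executable; compiler defaults often leave such sections in place. The flag spellings follow common cuobjdump usage but may vary by toolkit version, and the target path is a placeholder.

```python
# Dump embedded PTX and SASS from a CUDA binary via cuobjdump.
import subprocess

def dump(binary_path, flag):
    """Run cuobjdump with a single dump flag and return its output."""
    result = subprocess.run(
        ["cuobjdump", flag, binary_path],
        capture_output=True,
        text=True,
    )
    return result.stdout if result.returncode == 0 else result.stderr

target = "./example_cuda_app"        # hypothetical CUDA executable
ptx = dump(target, "--dump-ptx")     # high-level PTX, if embedded
sass = dump(target, "--dump-sass")   # device machine-code listing
print(ptx[:500])
print(sass[:500])
```

If PTX is present, the recovered listing stays close to the source structure, which illustrates the kind of leakage the protection strategies above aim to reduce.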
Learning with Latent Language
The named concepts and compositional operators present in natural language
provide a rich source of information about the kinds of abstractions humans use
to navigate the world. Can this linguistic background knowledge improve the
generality and efficiency of learned classifiers and control policies? This
paper aims to show that using the space of natural language strings as a
parameter space is an effective way to capture natural task structure. In a
pretraining phase, we learn a language interpretation model that transforms
inputs (e.g. images) into outputs (e.g. labels) given natural language
descriptions. To learn a new concept (e.g. a classifier), we search directly in
the space of descriptions to minimize the interpreter's loss on training
examples. Crucially, our models do not require language data to learn these
concepts: language is used only in pretraining to impose structure on
subsequent learning. Results on image classification, text editing, and
reinforcement learning show that, in all settings, models with a linguistic
parameterization outperform those without.
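The search step can be made concrete with a toy example: given an interpreter that maps a description to a classifier, choose the description whose induced classifier has the lowest loss on a handful of training examples. In the sketch below the "interpreter" is a trivial substring rule standing in for the paper's pretrained model, and the candidate descriptions and examples are invented for illustration.

```python
# Concept learning by searching over natural-language descriptions.
def interpreter(description):
    """Toy interpreter: turn a description into a binary string classifier."""
    token = description.split()[-1].strip("'\"")
    return lambda x: token in x

def loss(classifier, examples):
    """0-1 loss over (input, label) pairs."""
    return sum(classifier(x) != y for x, y in examples) / len(examples)

candidates = ["contains 'a'", "contains 'z'", "contains 'cat'"]
train = [("cattle", True), ("scatter", True), ("dog", False), ("mast", False)]

best = min(candidates, key=lambda d: loss(interpreter(d), train))
print("selected concept:", best)  # -> contains 'cat'
```

In the paper the candidates come from a learned proposal model rather than a fixed list, but the selection criterion, the interpreter's loss on the training examples, is the same.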
Estimating the Algorithmic Complexity of Stock Markets
Randomness and regularities in Finance are usually treated in probabilistic
terms. In this paper, we develop a completely different approach, using a
non-probabilistic framework based on the algorithmic information theory
initially developed by Kolmogorov (1965). We present some elements of this
theory and show why it is particularly relevant to Finance, and potentially to
other sub-fields of Economics as well. We develop a generic method to estimate
the Kolmogorov complexity of numeric series. This approach is based on an
iterative "regularity erasing procedure" implemented to use lossless
compression algorithms on financial data. Examples are provided with both
simulated and real-world financial time series. The contributions of this
article are twofold. The first is methodological: we show that some
structural regularities, invisible to classical statistical tests, can be
detected by this algorithmic method. The second consists of illustrations
using the daily Dow-Jones Index, suggesting that beyond several well-known
regularities, hidden structure in this index may remain to be identified.
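A minimal version of this pipeline can be sketched in a few lines: erase the trend regularity by moving from prices to returns, discretize the returns into a small alphabet, and take the lossless-compressed size as a crude upper bound on Kolmogorov complexity, with a shuffled copy of the symbol stream as a structure-free baseline. The sketch below uses simulated prices and zlib, illustrating the general "regularity erasing" idea rather than the paper's exact procedure.

```python
# Compression-based complexity estimate for a (simulated) price series.
import math
import random
import zlib

random.seed(0)

# Simulated geometric random walk standing in for real market prices.
prices = [100.0]
for _ in range(5000):
    prices.append(prices[-1] * math.exp(random.gauss(0.0, 0.01)))

# Step 1: log returns erase the trend regularity.
returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]

# Step 2: discretize into a 4-symbol alphabet by empirical quartiles.
ordered = sorted(returns)
cuts = [ordered[len(ordered) // 4],
        ordered[len(ordered) // 2],
        ordered[3 * len(ordered) // 4]]
symbols = bytes(sum(r > c for c in cuts) for r in returns)

# Step 3: compressed size approximates complexity; compare to a shuffle.
def compressed_size(data):
    return len(zlib.compress(bytes(data), 9))

shuffled = bytearray(symbols)
random.shuffle(shuffled)
print("original:", compressed_size(symbols), "bytes")
print("shuffled:", compressed_size(shuffled), "bytes")
# Comparable sizes indicate no structure beyond the symbol distribution;
# a clearly smaller original would signal regularities the shuffle destroyed.
```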
Binary superlattice design by controlling DNA-mediated interactions
Most binary superlattices created using DNA functionalization or other
approaches rely on particle size differences to achieve compositional order and
structural diversity. Here we study two-dimensional (2D) assembly of
DNA-functionalized micron-sized particles (DFPs), and employ a strategy that
leverages the tunable disparity in interparticle interactions, and thus
enthalpic driving forces, to open new avenues for design of binary
superlattices that do not rely on the ability to tune particle size (i.e.,
entropic driving forces). Our strategy employs tailored blends of complementary
strands of ssDNA to control interparticle interactions between micron-sized
silica particles in a binary mixture to create compositionally diverse 2D
lattices. We show that the particle arrangement can be further controlled by
changing the stoichiometry of the binary mixture in certain cases. With this
approach, we demonstrate the ability to program the particle assembly into
square, pentagonal, and hexagonal lattices. In addition, different particle
types can be compositionally ordered in square checkerboard and in hexagonal
alternating-string, honeycomb, and Kagome arrangements.
Comment: 4 figures in the main text. 5 figures in the supplementary information