Preserving Both Privacy and Utility in Network Trace Anonymization
As network security monitoring grows more sophisticated, there is an
increasing need for outsourcing such tasks to third-party analysts. However,
organizations are usually reluctant to share their network traces due to
privacy concerns over sensitive information, e.g., network and system
configuration, which may potentially be exploited for attacks. In cases where
data owners are convinced to share their network traces, the data are typically
subjected to certain anonymization techniques, e.g., CryptoPAn, which replaces
real IP addresses with prefix-preserving pseudonyms. However, most such
techniques either are vulnerable to adversaries with prior knowledge about some
network flows in the traces, or require heavy data sanitization or
perturbation, both of which may result in a significant loss of data utility.
In this paper, we aim to preserve both privacy and utility through shifting the
trade-off from between privacy and utility to between privacy and computational
cost. The key idea is for the analysts to generate and analyze multiple
anonymized views of the original network traces; those views are designed to be
sufficiently indistinguishable even to adversaries armed with prior knowledge,
which preserves the privacy, whereas one of the views will yield true analysis
results privately retrieved by the data owner, which preserves the utility. We
present the general approach and instantiate it based on CryptoPAn. We formally
analyze the privacy of our solution and experimentally evaluate it using real
network traces provided by a major ISP. The results show that our approach can
significantly reduce the level of information leakage (e.g., less than 1% of
the information leaked by CryptoPAn) while providing comparable utility.
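The prefix-preserving property that CryptoPAn provides can be sketched with a toy keyed scheme: each address bit is flipped by a pseudorandom function of the key and the original bits preceding it, so two addresses sharing a k-bit prefix map to pseudonyms sharing a k-bit prefix, and a different key yields a different anonymized view. This is a minimal hash-based illustration, not the actual CryptoPAn construction (which uses a block cipher as its pseudorandom function) or the paper's multi-view protocol; the key values and helper names are made up.

```python
import hashlib

def _bit(key: bytes, prefix: str) -> int:
    # Pseudorandom flip decision derived from the key and the prefix seen so far.
    digest = hashlib.sha256(key + prefix.encode()).digest()
    return digest[0] & 1

def anonymize_ip(key: bytes, ip: str) -> str:
    bits = "".join(f"{int(octet):08b}" for octet in ip.split("."))
    out = ""
    for i, b in enumerate(bits):
        # Each output bit depends only on the key and the preceding original
        # bits, so equal input prefixes yield equal pseudonym prefixes.
        out += str(int(b) ^ _bit(key, bits[:i]))
    return ".".join(str(int(out[i:i + 8], 2)) for i in range(0, 32, 8))

# Two addresses sharing a /24 prefix keep a common /24 pseudonym prefix.
a = anonymize_ip(b"view-key-1", "192.168.1.10")
b = anonymize_ip(b"view-key-1", "192.168.1.77")
assert a.rsplit(".", 1)[0] == b.rsplit(".", 1)[0]
```

In the multi-view setting described above, several such views would be generated under different keys, with the data owner privately retrieving the results computed on the genuine one.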
Actionable Visualization of Higher Dimensional Dynamical Processes
Analyzing modern-day information systems that produce humongous multi-dimensional data in the form of logs, traces, or events unfolding over time can be tedious without adequate visualization, which motivates the need for an intelligible visualization. This thesis researched and developed a visualization framework that represents multi-dimensional dynamic and temporal process data in a potentially intelligible and actionable form. A prototype showing four different views, using notional malware data abstracted from Norman Sandbox behavioral traces, was developed; in particular, the B-matrix view represents the DLL files used by the malware to attack a system. This representation is aimed at visualizing large data sets without losing emphasis on the process unfolding over multiple dimensions.
Integrating Multiple Data Views for Improved Malware Analysis
Malicious software (malware) has become a prominent fixture in computing. There have been many methods developed over the years to combat the spread of malware, but these methods have inevitably been met with countermeasures. For instance, signature-based malware detection gave rise to polymorphic viruses. This 'arms race' will undoubtedly continue for the foreseeable future as the incentives to develop novel malware continue to outweigh the costs. In this dissertation, I describe analysis frameworks for three important problems related to malware: classification, clustering, and phylogenetic reconstruction. The important component of my methods is that they all take into account multiple views of malware. Typically, analysis has been performed in either the static domain (e.g. the byte information of the executable) or the dynamic domain (e.g. system call traces). This dissertation develops frameworks that can easily incorporate well-studied views from both domains, as well as any new views that may become popular in the future. The only restriction that must be met is that a positive semidefinite similarity (kernel) matrix must be defined on the view, a restriction that is easily met in practice. While the classification problem can be solved with well-known multiple kernel learning techniques, the clustering and phylogenetic problems required the development of novel machine learning methods, which I present in this dissertation. It is important to note that although these methods were developed in the context of the malware problem, they are applicable to a wide variety of domains.
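The one formal restriction named above is that each view supply a positive semidefinite (PSD) kernel matrix. A minimal sketch of why that restriction composes well: any nonnegative combination of PSD kernels is again PSD, so a static view and a dynamic view can be mixed and handed to any standard kernel method. The feature matrices and weights below are illustrative, not drawn from the dissertation (where the weights would be learned, as in multiple kernel learning).

```python
import numpy as np

def is_psd(K: np.ndarray, tol: float = 1e-8) -> bool:
    # A symmetric matrix is PSD iff all eigenvalues are >= 0 (up to tolerance).
    return bool(np.all(np.linalg.eigvalsh((K + K.T) / 2) >= -tol))

def combine_kernels(kernels, weights):
    # Nonnegative combination sum_i w_i * K_i preserves PSD-ness.
    return sum(w * K for w, K in zip(weights, kernels))

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))   # toy "static view" features (5 samples)
Y = rng.standard_normal((5, 4))   # toy "dynamic view" features (5 samples)
K_static, K_dynamic = X @ X.T, Y @ Y.T   # Gram matrices are PSD by construction

K = combine_kernels([K_static, K_dynamic], [0.6, 0.4])
assert is_psd(K_static) and is_psd(K_dynamic) and is_psd(K)
```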
Memory without content? Radical enactivism and (post)causal theories of memory
Radical enactivism, an increasingly influential approach to cognition in general, has recently been applied to memory in particular, with Hutto and Peeters (New directions in the philosophy of memory, Routledge, New York, 2018) providing the first systematic discussion of the implications of the approach for mainstream philosophical theories of memory. Hutto and Peeters argue that radical enactivism, which entails a conception of memory traces as contentless, is fundamentally at odds with current causal and postcausal theories, which remain committed to a conception of traces as contentful: on their view, if radical enactivism is right, then the relevant theories are wrong. Partisans of the theories in question might respond to Hutto and Peeters' argument in two ways. First, they might challenge radical enactivism itself. Second, they might challenge the conditional claim that, if radical enactivism is right, then their theories are wrong. In this paper, we develop the latter response, arguing that, appearances to the contrary notwithstanding, radical enactivism in fact aligns neatly with an emerging tendency in the philosophy of memory: radical enactivists and causal and postcausal theorists of memory have begun to converge, for distinct but compatible reasons, on a contentless conception of memory traces.
Understanding the Detection of View Fraud in Video Content Portals
While substantial effort has been devoted to understanding fraudulent activity
in traditional online advertising (search and banner), more recent forms such
as video ads have received little attention. Understanding and identifying
fraudulent activity (i.e., fake views) in video ads is complicated for
advertisers, as they rely exclusively on the detection mechanisms deployed by
video hosting portals. In this context, independent tools able to monitor and
audit the fidelity of these systems are missing today, yet needed by both
industry and regulators.
In this paper we present a first set of tools to serve this purpose. Using
our tools, we evaluate the performance of the audit systems of five major
online video portals. Our results reveal that YouTube's detection system
significantly outperforms all the others. Despite this, a systematic evaluation
indicates that it may still be susceptible to simple attacks. Furthermore, we
find that YouTube penalizes its videos' public and monetized view counters
differently, discounting the public counter more aggressively. This means that
views identified as fake and discounted from the public view counter are still
monetized. We speculate that even though YouTube's policy puts considerable
effort into compensating users after an attack is discovered, this practice
places the burden of the risk on the advertisers, who pay to get their ads
displayed.

Comment: To appear in WWW 2016, Montréal, Québec, Canada. Please cite the
conference version of this paper.
Experience-driven formation of parts-based representations in a model of layered visual memory
Growing neuropsychological and neurophysiological evidence suggests that the
visual cortex uses parts-based representations to encode, store and retrieve
relevant objects. In such a scheme, objects are represented as a set of
spatially distributed local features, or parts, arranged in stereotypical
fashion. To encode the local appearance and to represent the relations between
the constituent parts, there has to be an appropriate memory structure formed
by previous experience with visual objects. Here, we propose a model of how a
hierarchical memory structure supporting efficient storage and rapid recall of
parts-based representations can be established by an experience-driven process
of self-organization. The process is based on the collaboration of slow
bidirectional synaptic plasticity and homeostatic unit activity regulation,
both running at the top of fast activity dynamics with winner-take-all
character modulated by an oscillatory rhythm. These neural mechanisms lay down
the basis for cooperation and competition between the distributed units and
their synaptic connections. Choosing human face recognition as a test task, we
show that, under the condition of open-ended, unsupervised incremental
learning, the system is able to form memory traces for individual faces in a
parts-based fashion. On a lower memory layer the synaptic structure is
developed to represent local facial features and their interrelations, while
the identities of different persons are captured explicitly on a higher layer.
An additional property of the resulting representations is the sparseness of
both the activity during the recall and the synaptic patterns comprising the
memory traces.

Comment: 34 pages, 12 figures, 1 table; published in Frontiers in
Computational Neuroscience (Special Issue on Complex Systems Science and
Brain Dynamics),
http://www.frontiersin.org/neuroscience/computationalneuroscience/paper/10.3389/neuro.10/015.2009
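The interplay of the mechanisms named above (fast winner-take-all competition, slow Hebbian plasticity, and homeostatic regulation of unit activity) can be caricatured in a few lines. Everything below is an illustrative toy, not the paper's model: the shapes, learning rates, and block-shaped "part" inputs are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_inputs, n_units = 16, 4
W = rng.random((n_units, n_inputs))   # feedforward weights
thresholds = np.zeros(n_units)        # homeostatic excitability offsets

def step(x, eta=0.1, eta_h=0.01):
    # Fast dynamics: winner-take-all competition on thresholded activations.
    winner = int(np.argmax(W @ x - thresholds))
    # Slow Hebbian plasticity: the winner's weights move toward the input.
    W[winner] += eta * (x - W[winner])
    # Homeostasis: winning raises a unit's threshold relative to losing,
    # pushing all units to participate in representing the input set.
    wins = np.zeros(n_units)
    wins[winner] = 1.0
    thresholds[:] = thresholds + eta_h * (wins - 1.0 / n_units)
    return winner

winners = []
for _ in range(400):
    # Inputs are a few stereotyped local "parts" (block patterns).
    part = int(rng.integers(0, 4))
    x = np.zeros(n_inputs)
    x[part * 4:(part + 1) * 4] = 1.0
    winners.append(step(x))
```

Over repeated presentations, each unit's weight row drifts toward one of the recurring part patterns, a crude analogue of forming a parts-based memory trace.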
Beyond the causal theory? Fifty years after Martin and Deutscher
It is natural to think of remembering in terms of causation: I can recall a recent dinner with a friend because I experienced that dinner. Some fifty years ago, Martin and Deutscher (1966) turned this basic thought into a full-fledged theory of memory, a theory that came to dominate the landscape in the philosophy of memory. Remembering, Martin and Deutscher argue, requires the existence of a specific sort of causal connection between the rememberer's original experience of an event and his later representation of that event: a causal connection sustained by a memory trace. In recent years, it has become apparent that this reference to memory traces may be out of step with memory science. Contemporary proponents of the causal theory are thus confronted with the question: is it possible to develop an empirically adequate version of the theory, or is it time to move beyond it? This chapter traces the recent history of the causal theory, showing how increased awareness of the theory's problems has led to the development of modified versions of the causal theory and ultimately to the emergence of postcausal theories.
Software-Architecture Recovery from Machine Code
In this paper, we present a tool, called Lego, which recovers object-oriented software architecture from stripped binaries. Lego takes a stripped binary as input, and uses information obtained from dynamic analysis to (i) group the functions in the binary into classes, and (ii) identify inheritance and composition relationships between the inferred classes. The information obtained by Lego can be used for reengineering legacy software, and for understanding the architecture of software systems that lack documentation and source code. Our experiments show that the class hierarchies recovered by Lego have a high degree of agreement---measured in terms of precision and recall---with the hierarchy defined in the source code.
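Pairwise precision and recall over "same-class" function pairs is one standard way to score agreement between a recovered grouping and the source-level hierarchy: a true positive is a pair of functions that both groupings place in the same class. The sketch below uses that convention with made-up function names; Lego's exact evaluation protocol may differ in detail.

```python
from itertools import combinations

def same_class_pairs(grouping):
    # All unordered pairs of functions that share a class in this grouping.
    return {frozenset(p) for cls in grouping for p in combinations(sorted(cls), 2)}

def precision_recall(recovered, truth):
    rec, tru = same_class_pairs(recovered), same_class_pairs(truth)
    tp = len(rec & tru)
    precision = tp / len(rec) if rec else 1.0
    recall = tp / len(tru) if tru else 1.0
    return precision, recall

# Hypothetical ground truth from source code vs. a recovered grouping.
truth = [{"ctor_A", "draw_A", "dtor_A"}, {"ctor_B", "draw_B"}]
recovered = [{"ctor_A", "draw_A"}, {"dtor_A", "ctor_B", "draw_B"}]
p, r = precision_recall(recovered, truth)
# Here 2 of the 4 recovered pairs and 2 of the 4 true pairs agree.
```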