653 research outputs found
Reify Your Collection Queries for Modularity and Speed!
Modularity and efficiency are often contradicting requirements, such that
programers have to trade one for the other. We analyze this dilemma in the
context of programs operating on collections. Performance-critical code using
collections need often to be hand-optimized, leading to non-modular, brittle,
and redundant code. In principle, this dilemma could be avoided by automatic
collection-specific optimizations, such as fusion of collection traversals,
usage of indexing, or reordering of filters. Unfortunately, it is not obvious
how to encode such optimizations in terms of ordinary collection APIs, because
the program operating on the collections is not reified and hence cannot be
analyzed.
We propose SQuOpt, the Scala Query Optimizer--a deep embedding of the Scala
collections API that allows such analyses and optimizations to be defined and
executed within Scala, without relying on external tools or compiler
extensions. SQuOpt provides the same "look and feel" (syntax and static typing
guarantees) as the standard collections API. We evaluate SQuOpt by
re-implementing several code analyses of the Findbugs tool using SQuOpt, show
average speedups of 12x with a maximum of 12800x and hence demonstrate that
SQuOpt can reconcile modularity and efficiency in real-world applications.Comment: 20 page
SWIGLAL: Python and Octave interfaces to the LALSuite gravitational-wave data analysis libraries
The LALSuite data analysis libraries, written in C, implement key routines critical to the successful detection of gravitational waves, such as the template waveforms describing the merger of two black holes or two neutron stars. SWIGLAL is a component of LALSuite which provides interfaces for Python and Octave, making LALSuite routines accessible directly from scripts written in those languages. It has enabled modern gravitational-wave data analysis software, used in the first detection of gravitational waves, to be written in Python, thereby benefiting from its ease of development and rich feature set, while still having access to the computational speed and scientific trustworthiness of the routines provided by LALSuite
From Design to Production Control Through the Integration of Engineering Data Management and Workflow Management Systems
At a time when many companies are under pressure to reduce "times-to-market"
the management of product information from the early stages of design through
assembly to manufacture and production has become increasingly important.
Similarly in the construction of high energy physics devices the collection of
(often evolving) engineering data is central to the subsequent physics
analysis. Traditionally in industry design engineers have employed Engineering
Data Management Systems (also called Product Data Management Systems) to
coordinate and control access to documented versions of product designs.
However, these systems provide control only at the collaborative design level
and are seldom used beyond design. Workflow management systems, on the other
hand, are employed in industry to coordinate and support the more complex and
repeatable work processes of the production environment. Commercial workflow
products cannot support the highly dynamic activities found both in the design
stages of product development and in rapidly evolving workflow definitions. The
integration of Product Data Management with Workflow Management can provide
support for product development from initial CAD/CAM collaborative design
through to the support and optimisation of production workflow activities. This
paper investigates this integration and proposes a philosophy for the support
of product data throughout the full development and production lifecycle and
demonstrates its usefulness in the construction of CMS detectors.Comment: 18 pages, 13 figure
An efficient and scalable platform for java source code analysis using overlaid graph representations
© 2013 IEEE. Although source code programs are commonly written as textual information, they enclose syntactic and semantic information that is usually represented as graphs. This information is used for many different purposes, such as static program analysis, advanced code search, coding guideline checking, software metrics computation, and extraction of semantic and syntactic information to create predictive models. Most of the existing systems that provide these kinds of services are designed ad hoc for the particular purpose they are aimed at. For this reason, we created ProgQuery, a platform to allow users to write their own Java program analyses in a declarative fashion, using graph representations. We modify the Java compiler to compute seven syntactic and semantic representations, and store them in a Neo4j graph database. Such representations are overlaid, meaning that syntactic and semantic nodes of the different graphs are interconnected to allow combining different kinds of information in the queries/analyses. We evaluate ProgQuery and compare it to the related systems. Our platform outperforms the other systems in analysis time, and scales better to program sizes and analysis complexity. Moreover, the queries coded show that ProgQuery is more expressive than the other approaches. The additional information stored by ProgQuery increases the database size and associated insertion time, but these increases are significantly lower than the query/analysis performance gains obtained.Spanish Department of Science, Innovation and Universities under Project RTI2018-099235-B-I00
Analysis and Concealment of Malware in an Adversarial Environment
Nowadays, users and devices are rapidly growing, and there is a massive migration of data and infrastructure from physical systems to virtual ones. Moreover, people are always connected and deeply dependent on information and communications. Thanks to the massive growth of Internet of Things applications, this phenomenon also affects everyday objects such as home appliances and vehicles. This extensive interconnection implies a significant rate of potential security threats for systems, devices, and virtual identities. For this reason, malware detection and analysis is one of the most critical security topics. The used detection strategies are well suited to analyze and respond to potential threats, but they are vulnerable and can be bypassed under specific conditions.
In light of this scenario, this thesis highlights the existent detection strategies and how it is possible to deceive them using malicious contents concealment strategies, such as code obfuscation and adversarial attacks. Moreover, the ultimate goal is to explore new viable ways to detect and analyze embedded malware and study the feasibility of generating adversarial attacks. In line with these two goals, in this thesis, I present two research contributions. The first one proposes a new viable way to detect and analyze the malicious contents inside Microsoft Office documents (even when concealed). The second one proposes a study about the feasibility of generating Android malicious applications capable of bypassing a real-world detection system.
Firstly, I present Oblivion, a static and dynamic system for large-scale analysis of Office documents with embedded (and most of the time concealed) malicious contents. Oblivion performs instrumentation of the code and executes the Office documents in a virtualized environment to de-obfuscate and reconstruct their behavior. In particular, Oblivion can systematically extract embedded PowerShell and non-PowerShell attacks and reconstruct the employed obfuscation strategies. This research work aims to provide a scalable system that allows analysts to go beyond simple malware detection by performing a real, in-depth inspection of macros. To evaluate the system, a large-scale analysis of more than 40,000 Office documents has been performed. The attained results show that Oblivion can efficiently de-obfuscate malicious macro-files by revealing a large corpus of PowerShell and non-PowerShell attacks in a short amount of time.
Then, the focus is on presenting an Android adversarial attack framework. This research work aims to understand the feasibility of generating adversarial samples specifically through the injection of Android system API calls only. In particular, the constraints necessary to generate actual adversarial samples are discussed. To evaluate the system, I employ an interpretability technique to assess the impact of specific API calls on the evasion. It is also assessed the vulnerability of the used detection system against mimicry and random noise attacks. Finally, it is proposed a basic implementation to generate concrete and working adversarial samples. The attained results suggest that injecting system API calls could be a viable strategy for attackers to generate concrete adversarial samples.
This thesis aims to improve the security landscape in both the research and industrial world by exploring a hot security topic and proposing two novel research works about embedded malware. The main conclusion of this research experience is that systems and devices can be secured with the most robust security processes. At the same time, it is fundamental to improve user awareness and education in detecting and preventing possible attempts of malicious infections
Analysis and identification of possible automation approaches for embedded systems design flows
Sophisticated and high performance embedded systems are present in an increasing number of application domains. In this context, formal-based design methods have been studied to make the development process robust and scalable. Models of computation (MoC) allows the modeling of an application at a high abstraction level by using a formal base. This enables analysis before the application moves to the implementation phase. Different tools and frameworks supporting MoCs have been developed. Some of them can simulate the models and also verify their functionality and feasibility before the next design steps. In view of this, we present a novel method for analysis and identification of possible automation approaches applicable to embedded systems design flow supported by formal models of computation. A comprehensive case study shows the potential and applicability of our method11212
- …