756 research outputs found
Opaque Service Virtualisation: A Practical Tool for Emulating Endpoint Systems
Large enterprise software systems make many complex interactions with other
services in their environment. Developing and testing for production-like
conditions is therefore a very challenging task. Current approaches include
emulation of dependent services using either explicit modelling or
record-and-replay approaches. Models require deep knowledge of the target
services while record-and-replay is limited in accuracy. Both face
developmental and scaling issues. We present a new technique that improves the
accuracy of record-and-replay approaches, without requiring prior knowledge of
the service protocols. The approach uses Multiple Sequence Alignment to derive
message prototypes from recorded system interactions and a scheme to match
incoming request messages against prototypes to generate response messages. We
use a modified Needleman-Wunsch algorithm for distance calculation during
message matching. Our approach has shown greater than 99% accuracy for four
evaluated enterprise system messaging protocols. The approach has been
successfully integrated into the CA Service Virtualization commercial product
to complement its existing techniques.Comment: In Proceedings of the 38th International Conference on Software
Engineering Companion (pp. 202-211). arXiv admin note: text overlap with
arXiv:1510.0142
Multiple Biolgical Sequence Alignment: Scoring Functions, Algorithms, and Evaluations
Aligning multiple biological sequences such as protein sequences or DNA/RNA sequences is a fundamental task in bioinformatics and sequence analysis. These alignments may contain invaluable information that scientists need to predict the sequences\u27 structures, determine the evolutionary relationships between them, or discover drug-like compounds that can bind to the sequences. Unfortunately, multiple sequence alignment (MSA) is NP-Complete. In addition, the lack of a reliable scoring method makes it very hard to align the sequences reliably and to evaluate the alignment outcomes.
In this dissertation, we have designed a new scoring method for use in multiple sequence alignment. Our scoring method encapsulates stereo-chemical properties of sequence residues and their substitution probabilities into a tree-structure scoring scheme. This new technique provides a reliable scoring scheme with low computational complexity.
In addition to the new scoring scheme, we have designed an overlapping sequence clustering algorithm to use in our new three multiple sequence alignment algorithms. One of our alignment algorithms uses a dynamic weighted guidance tree to perform multiple sequence alignment in progressive fashion. The use of dynamic weighted tree allows errors in the early alignment stages to be corrected in the subsequence stages. Other two algorithms utilize sequence knowledge-bases and sequence consistency to produce biological meaningful sequence alignments. To improve the speed of the multiple sequence alignment, we have developed a parallel algorithm that can be deployed on reconfigurable computer models. Analytically, our parallel algorithm is the fastest progressive multiple sequence alignment algorithm
Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs
Binary code analysis allows analyzing binary code without having access to
the corresponding source code. A binary, after disassembly, is expressed in an
assembly language. This inspires us to approach binary analysis by leveraging
ideas and techniques from Natural Language Processing (NLP), a rich area
focused on processing text of various natural languages. We notice that binary
code analysis and NLP share a lot of analogical topics, such as semantics
extraction, summarization, and classification. This work utilizes these ideas
to address two important code similarity comparison problems. (I) Given a pair
of basic blocks for different instruction set architectures (ISAs), determining
whether their semantics is similar or not; and (II) given a piece of code of
interest, determining if it is contained in another piece of assembly code for
a different ISA. The solutions to these two problems have many applications,
such as cross-architecture vulnerability discovery and code plagiarism
detection. We implement a prototype system INNEREYE and perform a comprehensive
evaluation. A comparison between our approach and existing approaches to
Problem I shows that our system outperforms them in terms of accuracy,
efficiency and scalability. And the case studies utilizing the system
demonstrate that our solution to Problem II is effective. Moreover, this
research showcases how to apply ideas and techniques from NLP to large-scale
binary code analysis.Comment: Accepted by Network and Distributed Systems Security (NDSS) Symposium
201
Automated synthesis of mediators to support component interoperability
Interoperability is a major concern for the software engineering field, given the increasing need to compose components dynamically and seamlessly. This dynamic composition is often hampered by differences in the interfaces and behaviours of independently-developed components. To address these differences without changing the components, mediators that systematically enforce interoperability between functionally-compatible components by mapping their interfaces and coordinating their behaviours are required. Existing approaches to mediator synthesis assume that an interface mapping is provided which specifies the correspondence between the operations and data of the components at hand. In this paper, we present an approach based on ontology reasoning and constraint programming in order to infer mappings between components' interfaces automatically. These mappings guarantee semantic compatibility between the operations and data of the interfaces. Then, we analyse the behaviours of components in order to synthesise, if possible, a mediator that coordinates the computed mappings so as to make the components interact properly. Our approach is formally-grounded to ensure the correctness of the synthesised mediator. We demonstrate the validity of our approach by implementing the MICS (Mediator synthesIs to Connect Components) tool and experimenting it with various real-world case studies
Toward Sensor-Based Random Number Generation for Mobile and IoT Devices
The importance of random number generators (RNGs) to various computing applications is well understood. To ensure a quality level of output, high-entropy sources should be utilized as input. However, the algorithms used have not yet fully evolved to utilize newer technology. Even the Android pseudo RNG (APRNG) merely builds atop the Linux RNG to produce random numbers. This paper presents an exploratory study into methods of generating random numbers on sensor-equipped mobile and Internet of Things devices. We first perform a data collection study across 37 Android devices to determine two things-how much random data is consumed by modern devices, and which sensors are capable of producing sufficiently random data. We use the results of our analysis to create an experimental framework called SensoRNG, which serves as a prototype to test the efficacy of a sensor-based RNG. SensoRNG employs collection of data from on-board sensors and combines them via a lightweight mixing algorithm to produce random numbers. We evaluate SensoRNG with the National Institute of Standards and Technology statistical testing suite and demonstrate that a sensor-based RNG can provide high quality random numbers with only little additional overhead
- …