Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs
Binary code analysis allows analyzing binary code without having access to
the corresponding source code. A binary, after disassembly, is expressed in an
assembly language. This inspires us to approach binary analysis by leveraging
ideas and techniques from Natural Language Processing (NLP), a rich area
focused on processing text of various natural languages. We notice that binary
code analysis and NLP share many analogous topics, such as semantics
extraction, summarization, and classification. This work utilizes these ideas
to address two important code similarity comparison problems. (I) Given a pair
of basic blocks for different instruction set architectures (ISAs), determining
whether their semantics is similar or not; and (II) given a piece of code of
interest, determining if it is contained in another piece of assembly code for
a different ISA. The solutions to these two problems have many applications,
such as cross-architecture vulnerability discovery and code plagiarism
detection. We implement a prototype system INNEREYE and perform a comprehensive
evaluation. A comparison between our approach and existing approaches to
Problem I shows that our system outperforms them in terms of accuracy,
efficiency and scalability. And the case studies utilizing the system
demonstrate that our solution to Problem II is effective. Moreover, this
research showcases how to apply ideas and techniques from NLP to large-scale
binary code analysis.Comment: Accepted by Network and Distributed Systems Security (NDSS) Symposium
201
Structured Review of Code Clone Literature
This report presents the results of a structured review of code clone literature. The aim of the review is to assemble a conceptual model of clone-related concepts that helps us reason about clones. This conceptual model unifies clone concepts from a wide range of literature, so that findings about clones can be compared with each other.
International conference on software engineering and knowledge engineering: Session chair
The Thirtieth International Conference on Software Engineering and Knowledge Engineering (SEKE 2018) will be held at the Hotel Pullman, San Francisco Bay, USA, from July 1 to July 3, 2018. SEKE 2018 will also be dedicated to the memory of Professor Lotfi Zadeh, a great scholar, pioneer and leader in fuzzy set theory and soft computing.
The conference aims at bringing together experts in software engineering and knowledge engineering to discuss relevant results in either software engineering or knowledge engineering or both. Special emphasis will be put on the transference of methods between both domains. The theme this year is soft computing in software engineering & knowledge engineering. Submissions of papers and demos are both welcome.
iRED: A disaggregated P4-AQM fully implemented in programmable data plane hardware
Routers employ queues to temporarily hold packets when the scheduler cannot
immediately process them. Congestion occurs when the arrival rate of packets
exceeds the processing capacity, leading to increased queueing delay. Over
time, Active Queue Management (AQM) strategies have focused on directly
draining packets from queues to alleviate congestion and reduce queuing delay.
On Programmable Data Plane (PDP) hardware, AQMs traditionally reside in the
Egress pipeline due to the availability of queue delay information there. We
argue that this approach wastes the router's resources because the dropped
packet has already consumed the entire pipeline of the device. In this work, we
propose ingress Random Early Detection (iRED), a more efficient approach that
addresses the Egress drop problem. iRED is a disaggregated P4-AQM fully
implemented in programmable data plane hardware and also supports Low Latency,
Low Loss, and Scalable Throughput (L4S) framework, saving device pipeline
resources by dropping packets in the Ingress block. To evaluate iRED, we
conducted three experiments using a Tofino2 programmable switch: i) An in-depth
analysis of state-of-the-art AQMs on PDP hardware, using 12 different network
configurations varying in bandwidth, Round-Trip Time (RTT), and Maximum
Transmission Unit (MTU). The results demonstrate that iRED can significantly
reduce router resource consumption, with up to a 10x reduction in memory usage,
12x fewer processing cycles, and 8x less power consumption for the same traffic
load; ii) A performance evaluation regarding the L4S framework. The results
prove that iRED achieves fairness in bandwidth usage for different types of
traffic (classic and scalable); iii) A comprehensive analysis of the QoS in a
real setup of a Dynamic Adaptive Streaming over HTTP (DASH) technology. iRED
demonstrated up to a 2.34x improvement in FPS and a 4.77x increase in the video
player buffer fill.
Comment: Preprint (TNSM under review)
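As a rough illustration of the Random Early Detection idea that the iRED abstract builds on, the following is a minimal sketch of the classic RED drop decision in Python. It is not the authors' P4 implementation and does not model the Ingress/Egress split or L4S marking. The function names, the thresholds (`min_th`, `max_th`), the maximum drop probability (`max_p`), and the averaging weight are all hypothetical choices made for the example.

```python
import random


def ewma_queue_delay(avg, sample, weight=0.002):
    """Update an exponentially weighted moving average of the queue delay.

    `weight` is an assumed smoothing factor, not a value taken from the abstract.
    """
    return (1.0 - weight) * avg + weight * sample


def red_drop_probability(avg, min_th, max_th, max_p):
    """Classic RED: no drops below min_th, a linear ramp up to max_p between
    the two thresholds, and a forced drop at or above max_th."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)


def should_drop(avg, min_th, max_th, max_p, rng=random.random):
    """Decide probabilistically whether to drop the arriving packet.

    `rng` is injectable so the decision can be made deterministic in tests.
    """
    return rng() < red_drop_probability(avg, min_th, max_th, max_p)
```

In a disaggregated design such as the one the abstract describes, one would presumably measure the delay sample where queue information is available and apply a decision like `should_drop` earlier in the pipeline, but that wiring is an assumption and is not shown here.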
Exploring Automated Code Evaluation Systems and Resources for Code Analysis: A Comprehensive Survey
The automated code evaluation system (AES) is mainly designed to reliably
assess user-submitted code. Due to their extensive range of applications and
the accumulation of valuable resources, AESs are becoming increasingly popular.
Research on the application of AES and their real-world resource exploration
for diverse coding tasks is still lacking. In this study, we conducted a
comprehensive survey on AESs and their resources. This survey explores the
application areas of AESs, available resources, and resource utilization for
coding tasks. AESs are categorized into programming contests, programming
learning and education, recruitment, online compilers, and additional modules,
depending on their application. We explore the available datasets and other
resources of these systems for research, analysis, and coding tasks. Moreover,
we provide an overview of machine learning-driven coding tasks, such as bug
detection, code review, comprehension, refactoring, search, representation, and
repair. These tasks are performed using real-life datasets. In addition, we
briefly discuss the Aizu Online Judge platform as a real example of an AES from
the perspectives of system design (hardware and software), operation
(competition and education), and research. This is due to the scalability of
the AOJ platform (programming education, competitions, and practice), open
internal features (hardware and software), attention from the research
community, open source data (e.g., solution codes and submission documents),
and transparency. We also analyze the overall performance of this system and
the perceived challenges over the years.