2,828 research outputs found

    Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs

    Full text link
    Binary code analysis allows analyzing binary code without having access to the corresponding source code. A binary, after disassembly, is expressed in an assembly language. This inspires us to approach binary analysis by leveraging ideas and techniques from Natural Language Processing (NLP), a rich area focused on processing text of various natural languages. We notice that binary code analysis and NLP share a lot of analogical topics, such as semantics extraction, summarization, and classification. This work utilizes these ideas to address two important code similarity comparison problems. (I) Given a pair of basic blocks for different instruction set architectures (ISAs), determining whether their semantics is similar or not; and (II) given a piece of code of interest, determining if it is contained in another piece of assembly code for a different ISA. The solutions to these two problems have many applications, such as cross-architecture vulnerability discovery and code plagiarism detection. We implement a prototype system INNEREYE and perform a comprehensive evaluation. A comparison between our approach and existing approaches to Problem I shows that our system outperforms them in terms of accuracy, efficiency and scalability. And the case studies utilizing the system demonstrate that our solution to Problem II is effective. Moreover, this research showcases how to apply ideas and techniques from NLP to large-scale binary code analysis.Comment: Accepted by Network and Distributed Systems Security (NDSS) Symposium 201

    Structured Review of Code Clone Literature

    Get PDF
    This report presents the results of a structured review of code clone literature. The aim of the review is to assemble a conceptual model of clone-related concepts which helps us to reason about clones. This conceptual model unifies clone concepts from a wide range of literature, so that findings about clones can be compared with each other

    International conference on software engineering and knowledge engineering: Session chair

    Get PDF
    The Thirtieth International Conference on Software Engineering and Knowledge Engineering (SEKE 2018) will be held at the Hotel Pullman, San Francisco Bay, USA, from July 1 to July 3, 2018. SEKE2018 will also be dedicated in memory of Professor Lofti Zadeh, a great scholar, pioneer and leader in fuzzy sets theory and soft computing. The conference aims at bringing together experts in software engineering and knowledge engineering to discuss on relevant results in either software engineering or knowledge engineering or both. Special emphasis will be put on the transference of methods between both domains. The theme this year is soft computing in software engineering & knowledge engineering. Submission of papers and demos are both welcome

    iRED: A disaggregated P4-AQM fully implemented in programmable data plane hardware

    Full text link
    Routers employ queues to temporarily hold packets when the scheduler cannot immediately process them. Congestion occurs when the arrival rate of packets exceeds the processing capacity, leading to increased queueing delay. Over time, Active Queue Management (AQM) strategies have focused on directly draining packets from queues to alleviate congestion and reduce queuing delay. On Programmable Data Plane (PDP) hardware, AQMs traditionally reside in the Egress pipeline due to the availability of queue delay information there. We argue that this approach wastes the router's resources because the dropped packet has already consumed the entire pipeline of the device. In this work, we propose ingress Random Early Detection (iRED), a more efficient approach that addresses the Egress drop problem. iRED is a disaggregated P4-AQM fully implemented in programmable data plane hardware and also supports Low Latency, Low Loss, and Scalable Throughput (L4S) framework, saving device pipeline resources by dropping packets in the Ingress block. To evaluate iRED, we conducted three experiments using a Tofino2 programmable switch: i) An in-depth analysis of state-of-the-art AQMs on PDP hardware, using 12 different network configurations varying in bandwidth, Round-Trip Time (RTT), and Maximum Transmission Unit (MTU). The results demonstrate that iRED can significantly reduce router resource consumption, with up to a 10x reduction in memory usage, 12x fewer processing cycles, and 8x less power consumption for the same traffic load; ii) A performance evaluation regarding the L4S framework. The results prove that iRED achieves fairness in bandwidth usage for different types of traffic (classic and scalable); iii) A comprehensive analysis of the QoS in a real setup of a DASH) technology. iRED demonstrated up to a 2.34x improvement in FPS and a 4.77x increase in the video player buffer fill.Comment: Preprint (TNSM under review

    Exploring Automated Code Evaluation Systems and Resources for Code Analysis: A Comprehensive Survey

    Full text link
    The automated code evaluation system (AES) is mainly designed to reliably assess user-submitted code. Due to their extensive range of applications and the accumulation of valuable resources, AESs are becoming increasingly popular. Research on the application of AES and their real-world resource exploration for diverse coding tasks is still lacking. In this study, we conducted a comprehensive survey on AESs and their resources. This survey explores the application areas of AESs, available resources, and resource utilization for coding tasks. AESs are categorized into programming contests, programming learning and education, recruitment, online compilers, and additional modules, depending on their application. We explore the available datasets and other resources of these systems for research, analysis, and coding tasks. Moreover, we provide an overview of machine learning-driven coding tasks, such as bug detection, code review, comprehension, refactoring, search, representation, and repair. These tasks are performed using real-life datasets. In addition, we briefly discuss the Aizu Online Judge platform as a real example of an AES from the perspectives of system design (hardware and software), operation (competition and education), and research. This is due to the scalability of the AOJ platform (programming education, competitions, and practice), open internal features (hardware and software), attention from the research community, open source data (e.g., solution codes and submission documents), and transparency. We also analyze the overall performance of this system and the perceived challenges over the years
    • โ€ฆ
    corecore