Search CORE

43,333 research outputs found

Context2Name: A Deep Learning-Based Approach to Infer Natural Variable Names from Usage Contexts

Author: Bavishi Rohan
Pradel Michael
Sen Koushik
Publication venue
Publication date: 01/11/2017
Field of study

Most of the JavaScript code deployed in the wild has been minified, a process in which identifier names are replaced with short, arbitrary and meaningless names. Minified code occupies less space, but also makes the code extremely difficult to manually inspect and understand. This paper presents Context2Name, a deep learningbased technique that partially reverses the effect of minification by predicting natural identifier names for minified names. The core idea is to predict from the usage context of a variable a name that captures the meaning of the variable. The approach combines a lightweight, token-based static analysis with an auto-encoder neural network that summarizes usage contexts and a recurrent neural network that predict natural names for a given usage context. We evaluate Context2Name with a large corpus of real-world JavaScript code and show that it successfully predicts 47.5% of all minified identifiers while taking only 2.9 milliseconds on average to predict a name. A comparison with the state-of-the-art tools JSNice and JSNaughty shows that our approach performs comparably in terms of accuracy while improving in terms of efficiency. Moreover, Context2Name complements the state-of-the-art by predicting 5.3% additional identifiers that are missed by both existing tools

arXiv.org e-Print Archive

TUbiblio

Generalized Points-to Graphs: A New Abstraction of Memory in the Presence of Pointers

Author: Gharat Pritam M.
Khedker Uday P.
Mycroft Alan
Publication venue
Publication date: 28/01/2018
Field of study

Flow- and context-sensitive points-to analysis is difficult to scale; for top-down approaches, the problem centers on repeated analysis of the same procedure; for bottom-up approaches, the abstractions used to represent procedure summaries have not scaled while preserving precision. We propose a novel abstraction called the Generalized Points-to Graph (GPG) which views points-to relations as memory updates and generalizes them using the counts of indirection levels leaving the unknown pointees implicit. This allows us to construct GPGs as compact representations of bottom-up procedure summaries in terms of memory updates and control flow between them. Their compactness is ensured by the following optimizations: strength reduction reduces the indirection levels, redundancy elimination removes redundant memory updates and minimizes control flow (without over-approximating data dependence between memory updates), and call inlining enhances the opportunities of these optimizations. We devise novel operations and data flow analyses for these optimizations. Our quest for scalability of points-to analysis leads to the following insight: The real killer of scalability in program analysis is not the amount of data but the amount of control flow that it may be subjected to in search of precision. The effectiveness of GPGs lies in the fact that they discard as much control flow as possible without losing precision (i.e., by preserving data dependence without over-approximation). This is the reason why the GPGs are very small even for main procedures that contain the effect of the entire program. This allows our implementation to scale to 158kLoC for C programs

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

Generating Predicate Callback Summaries for the Android Framework

Author: Le Wei
Perez Danilo Dominguez
Publication venue
Publication date: 29/03/2017
Field of study

One of the challenges of analyzing, testing and debugging Android apps is that the potential execution orders of callbacks are missing from the apps' source code. However, bugs, vulnerabilities and refactoring transformations have been found to be related to callback sequences. Existing work on control flow analysis of Android apps have mainly focused on analyzing GUI events. GUI events, although being a key part of determining control flow of Android apps, do not offer a complete picture. Our observation is that orthogonal to GUI events, the Android API calls also play an important role in determining the order of callbacks. In the past, such control flow information has been modeled manually. This paper presents a complementary solution of constructing program paths for Android apps. We proposed a specification technique, called Predicate Callback Summary (PCS), that represents the callback control flow information (including callback sequences as well as the conditions under which the callbacks are invoked) in Android API methods and developed static analysis techniques to automatically compute and apply such summaries to construct apps' callback sequences. Our experiments show that by applying PCSs, we are able to construct Android apps' control flow graphs, including inter-callback relations, and also to detect infeasible paths involving multiple callbacks. Such control flow information can help program analysis and testing tools to report more precise results. Our detailed experimental data is available at: http://goo.gl/NBPrKsComment: 11 page

arXiv.org e-Print Archive

Crossref

Summary-based inference of quantitative bounds of live heap objects

Author: Braberman Victor Adrian
Garbervetsky Diego David
Hym Samuel
Yovine Sergio Fabian
Publication venue: 'Elsevier BV'
Publication date: 01/11/2013
Field of study

This article presents a symbolic static analysis for computing parametric upper bounds of the number of simultaneously live objects of sequential Java-like programs. Inferring the peak amount of irreclaimable objects is the cornerstone for analyzing potential heap-memory consumption of stand-alone applications or libraries. The analysis builds method-level summaries quantifying the peak number of live objects and the number of escaping objects. Summaries are built by resorting to summaries of their callees. The usability, scalability and precision of the technique is validated by successfully predicting the object heap usage of a medium-size, real-life application which is significantly larger than other previously reported case-studies.Fil: Braberman, Victor Adrian. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; ArgentinaFil: Garbervetsky, Diego David. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; ArgentinaFil: Hym, Samuel. Universite Lille 3; FranciaFil: Yovine, Sergio Fabian. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentin

CONICET Digital