Taming Strings in Dynamic Languages - An Abstract Interpretation-based Static Analysis Approach
In recent years, dynamic languages such as JavaScript, Python, or PHP have found many fields of application, thanks to their rich features, the agility of deploying software, and the apparent ease of learning them. Strings play a central role in dynamic languages, as they can be implicitly converted to values of other types, used to access object properties, or transformed at run-time into executable code. In particular, the possibility of dynamically generating code through string transformation breaks the typical assumption in static program analysis that the code is an immutable object, that is, static. This happens because the program's essential data structures, such as the control-flow graph and the system of equations associated with the program to analyze, are themselves dynamically mutating objects. In a sentence: "You can't check the code you don't see". For all these reasons, dynamic languages still pose a big challenge for static program analysis, making it drastically hard and imprecise. The goal of this thesis is to tackle the problem of statically analyzing dynamic code by treating the code as any other data structure that can be statically analyzed, and by treating the static analyzer as any other function that can be recursively called. Since, in dynamically generated code, the program code can be encoded as strings and then transformed into executable code, we first define a novel and suitable string abstraction, and the corresponding abstract semantics, able to keep enough information both to analyze string properties in general and to track the possible executable strings that may be converted to code. Such a string abstraction will permit us to distill from a string abstract value the executable program expressed by it, allowing us to recursively call the static analyzer on the synthesized program.
The final result of this thesis is an important first step towards a sound-by-construction abstract interpreter for real-world dynamic string manipulation languages, one that also analyzes string-to-code statements, that is, the code that standard static analysis "can't see".
Formal Semantics for Java-like Languages and Research Opportunities
The objective of this paper is twofold. First, we discuss the state of the art in Java-like semantics, focusing on works that provide a formal specification using operational semantics (big-step or small-step), studying in detail the most cited projects and presenting some derivative works that extend the originals with useful features. We also narrow our survey to works that provide some insight into type-safety proofs, and we compare the most used projects to show which functionalities each of them covers. Second, we focus on the research opportunities in this area, showing some important works that can be applied to the previously presented projects to study features of object-oriented languages, and pointing to some possibilities to explore in future research.
Static Program Analysis for String Manipulation Languages
In recent years, dynamic languages, such as JavaScript or Python, have been
increasingly used in a wide range of fields and applications. Their tricky and
misunderstood behaviors pose a hard challenge for static analysis of these
programming languages. A key aspect of any dynamic language program is the
varied use of strings, since they can be implicitly converted to a value of
another type, transformed by string-to-code primitives, or used to access an
object property. Unfortunately, string analyses for dynamic languages still
lack precision and do not take into account some important string features.
Moreover, string obfuscation is very popular in the context of dynamic language
malicious code, for example, to hide code information inside strings and then
to dynamically transform strings into executable code. In this scenario, more
precise string analyses become a necessity. This paper is placed in the context
of static string analysis by abstract interpretation and proposes a new
semantics for string analysis, taking a first step toward handling the string
features of dynamic languages. Comment: In Proceedings VPT 2019, arXiv:1908.0672