8,623 research outputs found
Empirical Evidence of Large-Scale Diversity in API Usage of Object-Oriented Software
In this paper, we study how object-oriented classes are used across thousands
of software packages. We concentrate on "usage diversity", defined as the
different statically observable combinations of methods called on the same
object. We present empirical evidence that there is a significant usage
diversity for many classes. For instance, we observe in our dataset that Java's
String is used in 2460 different ways. We discuss the reasons for this observed
diversity and its consequences for software engineering knowledge and research.
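The definition of usage diversity above can be illustrated with a small sketch. It is not the paper's tooling (the study concerns Java classes); it is a Python approximation that assumes a variable name inside one function stands for one object, and the names `usage_sets` and `SAMPLE` are hypothetical.

```python
import ast
from collections import defaultdict


def usage_sets(source: str) -> dict[str, set[frozenset[str]]]:
    """Map each receiver name to the distinct sets of methods called on it.

    Assumption: within one function, a variable name denotes one object.
    Calls on chained expressions (e.g. s.strip().lower()) are only counted
    for the innermost named receiver.
    """
    result: dict[str, set[frozenset[str]]] = defaultdict(set)
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        calls: dict[str, set[str]] = defaultdict(set)
        for node in ast.walk(func):
            if (
                isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)
            ):
                calls[node.func.value.id].add(node.func.attr)
        for name, methods in calls.items():
            # One "usage" is the combination of methods seen on that object.
            result[name].add(frozenset(methods))
    return dict(result)


SAMPLE = """
def f(s):
    s.strip()
    s.lower()

def g(s):
    s.lower()
    s.strip()

def h(s):
    s.split()
"""

usages = usage_sets(SAMPLE)
diversity_of_s = len(usages["s"])
```

Here `f` and `g` call the same combination of methods, so they count as one usage, while `h` contributes a second; the diversity of `s` in this sample is the number of distinct combinations.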
DyPyBench: A Benchmark of Executable Python Software
Python has emerged as one of the most popular programming languages,
extensively utilized in domains such as machine learning, data analysis, and
web applications. Python's dynamic nature and extensive usage make it an
attractive candidate for dynamic program analysis. However, unlike for other
popular languages, there currently is no comprehensive benchmark suite of
executable Python projects, which hinders the development of dynamic analyses.
This work addresses this gap by presenting DyPyBench, the first benchmark of
Python projects that is large scale, diverse, ready to run (i.e., with fully
configured and prepared test suites), and ready to analyze (by integrating with
the DynaPyt dynamic analysis framework). The benchmark encompasses 50 popular
open-source projects from various application domains, with a total of 681k
lines of Python code, and 30k test cases. DyPyBench enables various
applications in testing and dynamic analysis, of which we explore three in this
work: (i) Gathering dynamic call graphs and empirically comparing them to
statically computed call graphs, which exposes and quantifies limitations of
existing call graph construction techniques for Python. (ii) Using DyPyBench to
build a training data set for LExecutor, a neural model that learns to predict
values that otherwise would be missing at runtime. (iii) Using dynamically
gathered execution traces to mine API usage specifications, which establishes a
baseline for future work on specification mining for Python. We envision
DyPyBench to provide a basis for other dynamic analyses and for studying the
runtime behavior of Python code.
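Application (i), gathering a dynamic call graph, can be sketched with the standard library alone. This is an illustration of the general idea, not DyPyBench's or DynaPyt's actual mechanism; it assumes `sys.setprofile` as the instrumentation hook, and `trace_call_graph`, `HANDLERS`, `helper`, `dispatch`, and `main` are hypothetical names.

```python
import sys


def trace_call_graph(entry, *args):
    """Run entry(*args) and return the observed (caller, callee) name pairs."""
    edges = set()

    def profiler(frame, event, arg):
        # "call" fires for Python-level function calls; C calls are ignored.
        if event == "call" and frame.f_back is not None:
            edges.add((frame.f_back.f_code.co_name, frame.f_code.co_name))

    sys.setprofile(profiler)
    try:
        entry(*args)
    finally:
        sys.setprofile(None)
    return edges


def helper(x):
    return x + 1


# The callee is looked up by string at runtime, which a purely static
# call graph builder may fail to resolve.
HANDLERS = {"helper": helper}


def dispatch(name, x):
    return HANDLERS[name](x)


def main():
    return dispatch("helper", 1)


edges = trace_call_graph(main)
```

The edge from `dispatch` to `helper` only becomes visible at runtime, which is the kind of difference a comparison between dynamic and static call graphs would surface.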