138 research outputs found
Accuracy of Author Names in Bibliographic Data Sources: An Italian Case Study
We investigate the accuracy of how author names are reported in bibliographic records excerpted from four prominent sources: WoS, Scopus, PubMed, and CrossRef. We take as a case study 44,549 publications stored in the internal database of Sapienza University of Rome, one of the largest universities in Europe. While our results indicate generally good accuracy for all bibliographic data sources considered, we highlight a number of issues that undermine the accuracy for certain classes of author names, including compound names and names with diacritics, which are common features of Italian and other Western languages.
Rethinking Pointer Reasoning in Symbolic Execution
Symbolic execution is a popular program analysis technique that searches for bugs by reasoning over multiple alternative execution states at once. As the number of states to explore may grow exponentially, a symbolic executor may quickly run out of space. For instance, a memory access to a symbolic address may potentially reference the entire address space, leading to a combinatorial explosion of the possible resulting execution states. To cope with this issue, state-of-the-art executors concretize symbolic addresses that span memory intervals larger than some threshold. Unfortunately, this could result in missing interesting execution states, e.g., where a bug arises. In this paper we introduce MemSight, a new approach to symbolic memory that reduces the need for concretization, hence offering the opportunity for broader state explorations and more precise pointer reasoning. Rather than mapping address instances to data as previous tools do, our technique maps symbolic address expressions to data, maintaining the possible alternative states resulting from the memory referenced by a symbolic address in a compact, implicit form. A preliminary experimental investigation on prominent benchmarks from the DARPA Cyber Grand Challenge shows that MemSight enables the exploration of states unreachable by previous techniques.
SymFusion: Hybrid Instrumentation for Concolic Execution
Concolic execution is a dynamic twist of symbolic execution designed
with scalability in mind. Recent concolic executors heavily
rely on program instrumentation to achieve such scalability. The
instrumentation code can be added at compilation time (e.g., using
an LLVM pass), or directly at execution time with the help of a
dynamic binary translator. The former approach results in more efficient
code but requires recompilation. Unfortunately, recompiling
the entire code of a program is not always feasible or practical (e.g.,
in the presence of third-party components). In contrast, the latter
approach does not require recompilation but incurs significantly
higher execution time overhead.
In this paper, we investigate a hybrid instrumentation approach
for concolic execution, called SymFusion. In particular, this hybrid
instrumentation approach allows the user to recompile the core
components of an application, thus minimizing the analysis overhead
on them, while still being able to dynamically instrument the
rest of the application components at execution time. Our experimental
evaluation shows that our design can achieve a nice balance
between efficiency and efficacy on several real-world applications.
Fuzzing Symbolic Expressions
Recent years have witnessed a wide array of results in software testing,
exploring different approaches and methodologies ranging from fuzzers to
symbolic engines, with a full spectrum of instances in between such as concolic
execution and hybrid fuzzing. A key ingredient of many of these tools is
Satisfiability Modulo Theories (SMT) solvers, which are used to reason over
symbolic expressions collected during the analysis. In this paper, we
investigate whether techniques borrowed from the fuzzing domain can be applied
to check whether symbolic formulas are satisfiable in the context of concolic
and hybrid fuzzing engines, providing a viable alternative to classic SMT
solving techniques. We devise a new approximate solver, FUZZY-SAT, and show
that it is both competitive with and complementary to state-of-the-art solvers
such as Z3 with respect to handling queries generated by hybrid fuzzers.
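To make the idea of checking satisfiability by fuzzing more concrete, here is a minimal sketch in Python. It is not the FUZZY-SAT algorithm: it only illustrates the general principle of mutating a seed input until a formula evaluates to true. The names `fuzz_solve` and `example_formula`, the single-byte mutation strategy, and the try budget are all assumptions made for illustration. Such a search is approximate by nature: it can return a satisfying input, but a `None` result does not prove that the formula is unsatisfiable.

```python
import random

def fuzz_solve(formula, seed, max_tries=2000, rng=None):
    """Search for an input satisfying `formula` by mutating `seed`.

    `formula` is a plain Python predicate over a bytes-like object,
    standing in for a symbolic expression (an assumption of this sketch).
    Returns a satisfying input, or None if the budget is exhausted.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    if formula(seed):
        return bytes(seed)
    for _ in range(max_tries):
        candidate = bytearray(seed)
        # Mutation: overwrite one randomly chosen byte with a random value.
        position = rng.randrange(len(candidate))
        candidate[position] = rng.randrange(256)
        if formula(candidate):
            return bytes(candidate)
    return None  # unknown: the formula may still be satisfiable

def example_formula(data):
    # Hypothetical branch condition collected during concolic execution.
    return (data[0] ^ 0x5A) < 16

solution = fuzz_solve(example_formula, b"\x00\x00")
unknown = fuzz_solve(lambda data: False, b"\x00\x00")
```

In this sketch every candidate is derived from the original seed, which mirrors the concolic setting where a concrete input that reaches the branch is already available and only a small change is needed to flip it.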
A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
The present survey has been accepted for publication at ACM Computing
Surveys. If you are considering citing this survey, we would appreciate if you
could use the following BibTeX entry: http://goo.gl/Hf5Fvc
Comment: This is the authors' pre-print copy.
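The core loop described in the survey abstract (explore execution paths without concrete inputs, then ask a solver for an input that triggers each path) can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the `PathRecorder` class, the `check_access` program, and the brute-force search over a small domain, which stands in for a real constraint solver.

```python
class PathRecorder:
    """Records branch conditions over a symbolic input along one path."""

    def __init__(self, decisions):
        self.decisions = list(decisions)  # forced outcomes for the first branches
        self.constraints = []             # (predicate, outcome) pairs seen so far

    def branch(self, predicate):
        index = len(self.constraints)
        # Follow the forced outcome if one exists, otherwise take the True side.
        outcome = self.decisions[index] if index < len(self.decisions) else True
        self.constraints.append((predicate, outcome))
        return outcome

def check_access(path):
    # Hypothetical program under test: the input x is never read directly,
    # it is only constrained through the branch conditions.
    if path.branch(lambda x: x > 100):
        if path.branch(lambda x: x % 7 == 3):
            return "backdoor"
        return "denied"
    return "denied"

def explore(program, domain=range(256)):
    """Enumerate all paths and find one concrete witness input per path."""
    results = []
    worklist = [[]]
    while worklist:
        decisions = worklist.pop()
        path = PathRecorder(decisions)
        outcome = program(path)
        # Fork: schedule the untaken side of every newly met branch.
        for i in range(len(decisions), len(path.constraints)):
            prefix = [taken for _, taken in path.constraints[:i]]
            worklist.append(prefix + [False])
        # Brute-force stand-in for a constraint solver.
        witness = next(
            (x for x in domain
             if all(pred(x) == taken for pred, taken in path.constraints)),
            None,
        )
        if witness is not None:
            results.append((outcome, witness))
    return results

paths = explore(check_access)
```

Here `paths` contains one entry per feasible path, including an input that reaches the `"backdoor"` outcome, which random testing would hit only occasionally. Real engines replace the brute-force search with an SMT solver and must cope with loops, memory, and path explosion.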
On the Shapley value and its application to the Italian VQR research assessment exercise
Research assessment exercises have now become common evaluation tools in a number of countries. These exercises have the goal of guiding merit-based public funds allocation, stimulating improvement of research productivity through competition and assessing the impact of adopted research support policies. One case in point is Italy's most recent research assessment effort, VQR 2011–2014 (Research Quality Evaluation), which, in addition to research institutions, also evaluated university departments, and individuals in some cases (i.e., recently hired research staff and members of PhD committees). However, the way an institution's score was divided, according to VQR rules, between its constituent departments or its staff members does not enjoy many desirable properties well known from coalitional game theory (e.g., budget balance, fairness, marginality). We propose, instead, an alternative score division rule that is based on the notion of Shapley value, a well-known solution concept in coalitional game theory, which enjoys the desirable properties mentioned above. For a significant test case (namely, Sapienza University of Rome, the largest university in Italy), we present a detailed comparison of the scores obtained, for substructures and individuals, by applying the official VQR rules, with those resulting from Shapley value computations. We show that there are significant differences in the resulting scores, making room for improvements in the allocation rules used in research assessment exercises.
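For readers unfamiliar with the Shapley value, the following Python sketch computes it exactly for a toy coalitional game by averaging each player's marginal contribution over all orderings of the players. The three "departments" A, B, C and their coalition scores are invented for illustration and have nothing to do with actual VQR data or rules; the function name `shapley_values` is likewise an assumption of this sketch.

```python
from fractions import Fraction
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {player: Fraction(0) for player in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            enlarged = coalition | {player}
            # Marginal contribution of `player` when joining `coalition`.
            totals[player] += Fraction(value(enlarged)) - Fraction(value(coalition))
            coalition = enlarged
    return {player: total / len(orderings) for player, total in totals.items()}

# Hypothetical scores of every coalition of three departments.
TOY_SCORES = {
    frozenset(): 0,
    frozenset("A"): 10,
    frozenset("B"): 20,
    frozenset("C"): 30,
    frozenset("AB"): 40,
    frozenset("AC"): 50,
    frozenset("BC"): 60,
    frozenset("ABC"): 90,
}

def toy_score(coalition):
    return TOY_SCORES[frozenset(coalition)]

shares = shapley_values(["A", "B", "C"], toy_score)
total_share = sum(shares.values())
```

In this toy game the shares come out as 20, 30 and 40, which add up to the score of the full coalition (90). That is the budget balance property mentioned in the abstract. Enumerating all orderings takes factorial time, so this direct approach only works for a handful of players.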