Understanding Android Obfuscation Techniques: A Large-Scale Investigation in the Wild
In this paper, we seek to better understand Android obfuscation and depict a
holistic view of the usage of obfuscation through a large-scale investigation
in the wild. In particular, we focus on four popular obfuscation approaches:
identifier renaming, string encryption, Java reflection, and packing. To obtain
meaningful statistical results, we designed efficient and lightweight
detection models for each obfuscation technique and applied them to our massive
APK datasets (collected from Google Play, multiple third-party markets, and
malware databases). We have learned several interesting facts from the result.
For example, malware authors use string encryption more frequently, and more
apps on third-party markets than on Google Play are packed. We are also interested
in explaining each finding, so we carry out in-depth code
analysis on a sample of Android apps. We believe our study will help
developers select the most suitable obfuscation approach and, in the meantime,
help researchers improve code analysis systems in the right direction.
Focused Dynamic Slicing for Large Applications using an Abstract Memory-Model
Dynamic slicing techniques compute program dependencies to find all
statements that affect the value of a variable at a program point for a
specific execution. Despite their many potential uses, applicability is limited
by the fact that they typically cannot scale beyond small-sized applications.
We believe that at the heart of this limitation is the use of memory references
to identify data-dependencies. Particularly, working with memory references
hinders distinct treatment of the code-to-be-sliced (e.g., classes the user has
an interest in) from the rest of the code (including libraries and frameworks).
The ability to perform a coarser-grained analysis for the code that is not
under focus may provide performance gains and could become one avenue toward
scalability. In this paper, we propose a novel approach that completely
replaces memory reference registering and processing with a memory analysis
model that works with program symbols (i.e., terms). In fact, this approach
enables the alternative of not instrumenting -- thus, not generating any trace
-- for code that is not part of the code-to-be-sliced. We report on an
implementation of an abstract dynamic slicer for C#, DynAbs, and an
evaluation that shows how large and relevant parts of Roslyn and Powershell --
two of the largest modern C# applications that can be found on GitHub --
can be sliced for their test case assertions in at most a few minutes. We also
show how reducing the code-to-be-sliced focus can bring important speedups with
marginal relative precision loss.
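To make the notion of a dynamic slice concrete, here is a minimal sketch in Python. It is not DynAbs and does not reproduce the paper's abstract memory model: it assumes a hypothetical, already-recorded trace in which every executed statement lists the variable it writes and the variables it reads, and it follows data dependencies only (control dependencies are ignored). The names `Event` and `dynamic_slice` are illustrative assumptions.

```python
from typing import List, NamedTuple, Set, Tuple


class Event(NamedTuple):
    """One executed statement in a (hypothetical) recorded trace."""
    stmt: int               # id of the source statement that executed
    defined: str            # variable written by the statement
    used: Tuple[str, ...]   # variables read by the statement


def dynamic_slice(trace: List[Event], var: str) -> Set[int]:
    """Return ids of statements that affected `var` at the end of `trace`.

    Walks the trace backwards and tracks the set of variables whose
    most recent definition is still being searched for. Data
    dependencies only; control dependencies are out of scope here.
    """
    relevant = {var}
    in_slice: Set[int] = set()
    for ev in reversed(trace):
        if ev.defined in relevant:
            in_slice.add(ev.stmt)
            # This definition is found; now look for what it read.
            relevant.discard(ev.defined)
            relevant.update(ev.used)
    return in_slice


# Trace of a straight-line run; comments show the assumed source.
trace = [
    Event(1, "a", ()),       # a = 1
    Event(2, "b", ()),       # b = 2
    Event(3, "c", ("a",)),   # c = a + 1
    Event(4, "b", ("b",)),   # b = b * 2
    Event(5, "d", ("c",)),   # d = c * 3
]
slice_for_d = dynamic_slice(trace, "d")
print(sorted(slice_for_d))
```

In this run, statements 2 and 4 only touch `b`, so they fall outside the slice for `d`. The sketch identifies dependencies by variable name rather than by memory address, which is only loosely in the spirit of working with program symbols.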
Regression test selection for distributed Java RMI programs by means of formal concept analysis
Software maintenance is the process of modifying an existing system to ensure that it meets current and future requirements. As a result, performing regression testing becomes an essential but time-consuming aspect of any maintenance activity. Regression testing is initiated after a programmer has made changes to a program that may have inadvertently introduced errors. It is a quality control approach to ensure that the newly modified code still complies with its specified requirements and that unmodified code has not been affected by the maintenance activity. In the literature, various types of test selection techniques have been proposed to reduce the effort associated with re-executing the required test cases. However, the majority of these approaches have focused only on sequential programs and provide no or only very limited support for distributed programs or database-driven applications. The thesis presents a lightweight methodology, which applies Formal Concept Analysis to support a regression test selection analysis, in combination with execution trace collection and external data sharing analysis, for distributed Java RMI programs. Two Eclipse plug-ins were developed to automate the regression test selection process and to evaluate our methodology.
Path-based dynamic impact analysis
Successful software systems evolve over their lifetimes through the cumulative changes made by software maintainers. As software evolves, the problems resulting from software change worsen, exacerbated by increased system size and complexity, lack of program understanding, amount of effort required to make changes, and number of personnel involved. Experience shows that software changes made without visibility into their effects can lead to poor effort estimates, delays in release schedules, degraded software design, unreliable software products, increased costs, and premature retirement of the software system. Software change impact analysis, or simply impact analysis, is a software maintenance technique meant to address these problems by assessing the effects of changes made to a software system. While impact analysis is frequently cited as a motivation or a potential application for program analysis and software maintenance research, research specific to the task of impact analysis has languished for more than 10 years. In addition, few researchers have examined the empirical factors underlying common impact analysis techniques or the tradeoffs inherent in known techniques, and none have performed empirical studies comparing impact analysis techniques. In this dissertation we introduce a new impact analysis approach, named PathImpact, that addresses a set of tradeoffs not addressed by any current impact analysis approach. Ours is the first fully dynamic impact analysis approach. PathImpact uses lightweight instrumentation to record program execution at the level of procedure calls and returns, then efficiently builds a compressed representation that can be directly used to estimate change impact. We next extend PathImpact to accommodate system evolution, yielding a technique we call EvolveImpact. EvolveImpact updates the impact representation after a system change, whereas PathImpact requires a complete recomputation.
In addition, we show how our approaches can be extended to a large class of emerging software architectures, including Java component-based systems and large-scale systems. Finally, we discuss the implementation of our approaches, present the first cost models for impact analysis techniques, and report the results of the first empirical studies that compare impact analysis techniques. We also empirically examine the performance of our approaches and the factors affecting the use of our techniques in practice. We found that our approach has linear time and space complexity (in the size of the dynamic information collected) and achieved a mean compression value of 0.955 on the subjects we used in our experiments. Our investigation of program evolution across multiple versions of three of our subject programs showed that, depending on the level of change activity, EvolveImpact can update the impact representation more efficiently than recomputing it in a majority of cases.
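The idea of estimating change impact from recorded calls and returns can be illustrated with a small Python sketch. This is a simplified, uncompressed illustration under stated assumptions, not the dissertation's algorithm: it assumes the impact of a changed procedure is approximated by the procedures still on the call stack when it is first entered (those it will return into) plus every procedure entered afterwards. The trace format and the name `estimate_impact` are hypothetical, and no compressed representation is built.

```python
from typing import List, Set, Tuple

# A trace event is ("call", name) or ("ret", name); format is an assumption.
Event = Tuple[str, str]


def estimate_impact(trace: List[Event], changed: str) -> Set[str]:
    """Estimate which procedures a change to `changed` may affect.

    Includes the changed procedure, the callers on the stack when it is
    first entered, and every procedure entered after that point.
    """
    stack: List[str] = []
    impacted: Set[str] = set()
    seen_change = False
    for kind, proc in trace:
        if kind == "call":
            stack.append(proc)
            if proc == changed and not seen_change:
                seen_change = True
                # Everything on the stack will be returned into later.
                impacted.update(stack)
            elif seen_change:
                impacted.add(proc)
        elif kind == "ret":
            stack.pop()
    return impacted


# main calls A, then B (which calls C), then D.
trace = [
    ("call", "main"),
    ("call", "A"), ("ret", "A"),
    ("call", "B"),
    ("call", "C"), ("ret", "C"),
    ("ret", "B"),
    ("call", "D"), ("ret", "D"),
    ("ret", "main"),
]
impact_of_b = estimate_impact(trace, "B")
print(sorted(impact_of_b))
```

Here `A` completes before `B` is ever entered, so it is excluded from the estimate, while `D` is included because it runs after `B` and could observe its effects.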
Static and Dynamic Analysis in Cryptographic-API Misuse Detection of Mobile Application
With Android devices becoming more advanced and gaining more popularity, the number of cryptographic-API misuses in mobile applications is escalating. Numerous snippets of code in Android are from Stack Overflow and over 90% of them contain several crypto-issues. Various crypto-misuse detectors have emerged, aiming to report vulnerabilities of apps and better secure users’ privacy. These detectors can be broadly classified into two categories based on the analysis strategies employed to catch misuses – static analysis (i.e., by scanning the code base) and dynamic analysis (i.e., by executing the code). However, there is not enough research comparing their underlying differences, making it difficult to explain the pervasiveness of static crypto-detectors in both academia and industry. The lack of studies potentially limits the improvement of crypto-detection efficiency. In this study, a holistic evaluation and comparison of static and dynamic analysis’ underlying mechanisms, robustness, and efficiency is carried out. A systematic empirical experiment is conducted, testing 1003 popular Android applications across 21 categories from Google Play. We find that 93.3% of the apps make at least one mistake using cryptographic APIs, and we closely analyze the top four cryptographic rules reported to be violated most frequently by static crypto-detectors. Instead of merely comparing statistics such as false positives (i.e., false alarms), we focus on examining the crypto rules whose numbers of violations reported by static and dynamic crypto-detectors diverge greatly. In addition, we posit a new taxonomy schema that classifies cryptographic rules based on how they are inspected rather than their attack type or severity level. This schema will be useful to both researchers and practitioners in deciding how to efficiently combine static and dynamic techniques to improve the reliability and accuracy of crypto-detection.
A Graph Coloring Approach to Dynamic Slicing of Object-Oriented Programs
Program slicing is a decomposition technique that produces a subprogram from the parent program relevant to a particular computation. Hence slicing is also regarded as a program transformation technique. A dynamic program slice is an executable part of a program whose behavior is identical, for the same program input, to that of the original program with respect to a variable of interest at some execution position. Dynamic slices are smaller than static slices and can be used efficiently in different software engineering activities such as program testing, debugging, software maintenance, and program comprehension.
In this dissertation, we present our work concerned with the dynamic slicing of object-oriented programs. We have developed a novel algorithm that incorporates a graph coloring technique to compute dynamic slices of object-oriented programs. To achieve this goal efficiently, however, we have contradicted the constraints of traditional graph coloring theory. Moreover, the state restriction of the slicing criterion is taken into consideration, in addition to the dependence analysis. The advantage of our algorithm is that it is more time efficient than the existing algorithms. We have named this algorithm the Contradictory Graph Coloring Algorithm (CGCA).
- …