491 research outputs found
Opinion Mining for Software Development: A Systematic Literature Review
Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies.
SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in
code comments and extracting users’ critics toward mobile apps. Given the large amount of relevant studies available, it can take
considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils
these approaches entail.
We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion
mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in
other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4)
concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques.
The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide
critical insights for the further development of opinion mining techniques in the SE domain
Enhancing Source Code Representations for Deep Learning with Static Analysis
Deep learning techniques applied to program analysis tasks such as code
classification, summarization, and bug detection have seen widespread interest.
Traditional approaches, however, treat programming source code as natural
language text, which may neglect significant structural or semantic details.
Additionally, most current methods of representing source code focus solely on
the code, without considering beneficial additional context. This paper
explores the integration of static analysis and additional context such as bug
reports and design patterns into source code representations for deep learning
models. We use the Abstract Syntax Tree-based Neural Network (ASTNN) method and
augment it with additional context information obtained from bug reports and
design patterns, creating an enriched source code representation that
significantly enhances the performance of common software engineering tasks
such as code classification and code clone detection. Utilizing existing
open-source code data, our approach improves the representation and processing
of source code, thereby improving task performance
User Review-Based Change File Localization for Mobile Applications
In the current mobile app development, novel and emerging DevOps practices
(e.g., Continuous Delivery, Integration, and user feedback analysis) and tools
are becoming more widespread. For instance, the integration of user feedback
(provided in the form of user reviews) in the software release cycle represents
a valuable asset for the maintenance and evolution of mobile apps. To fully
make use of these assets, it is highly desirable for developers to establish
semantic links between the user reviews and the software artefacts to be
changed (e.g., source code and documentation), and thus to localize the
potential files to change for addressing the user feedback. In this paper, we
propose RISING (Review Integration via claSsification, clusterIng, and
linkiNG), an automated approach to support the continuous integration of user
feedback via classification, clustering, and linking of user reviews. RISING
leverages domain-specific constraint information and semi-supervised learning
to group user reviews into multiple fine-grained clusters concerning similar
users' requests. Then, by combining the textual information from both commit
messages and source code, it automatically localizes potential change files to
accommodate the users' requests. Our empirical studies demonstrate that the
proposed approach outperforms the state-of-the-art baseline work in terms of
clustering and localization accuracy, and thus produces more reliable results.Comment: 15 pages, 3 figures, 8 table
Efficiently Manifesting Asynchronous Programming Errors in Android Apps
Android, the #1 mobile app framework, enforces the single-GUI-thread model,
in which a single UI thread manages GUI rendering and event dispatching. Due to
this model, it is vital to avoid blocking the UI thread for responsiveness. One
common practice is to offload long-running tasks into async threads. To achieve
this, Android provides various async programming constructs, and leaves
developers themselves to obey the rules implied by the model. However, as our
study reveals, more than 25% apps violate these rules and introduce
hard-to-detect, fail-stop errors, which we term as aysnc programming errors
(APEs). To this end, this paper introduces APEChecker, a technique to
automatically and efficiently manifest APEs. The key idea is to characterize
APEs as specific fault patterns, and synergistically combine static analysis
and dynamic UI exploration to detect and verify such errors. Among the 40
real-world Android apps, APEChecker unveils and processes 61 APEs, of which 51
are confirmed (83.6% hit rate). Specifically, APEChecker detects 3X more APEs
than the state-of-art testing tools (Monkey, Sapienz and Stoat), and reduces
testing time from half an hour to a few minutes. On a specific type of APEs,
APEChecker confirms 5X more errors than the data race detection tool,
EventRacer, with very few false alarms
Are Formal Contracts a useful Digital Twin of Software Systems?
Digital Twins are a trend topic in the industry today to either manage runtime information or forecast properties of devices and products. The techniques for Digitial Twins are already employed in several disciplines of formal methods, in particular, formal verification, runtime verification and specification inference. In this paper, we connect the Digital Twin concept and existing research areas in the field of formal methods. We sketch how digital twins for software-centric systems can be forged from existing formal methods
A Memory Usage Comparison Between Jitana and Soot
There are several factors that make analyzing Android apps to address dependability and security concerns challenging. These factors include (i) resource efficiency as analysts need to be able to analyze large code-bases to look for issues that can exist in the application code and underlying platform code; (ii) scalability as today’s cybercriminals deploy attacks that may involve many participating apps; and (iii) in many cases, security analysts often rely on dynamic or hybrid analysis techniques to detect and identify the sources of issues.
The underlying principle governing the design of existing program analysis engines is the main cause that prevents them from satisfying these factors. Existing designs operate like compilers, so they only analyze one app at a time using a close-world process that leads to poor efficiency and scalability. Recently, Tsutana et al. introduced Jitana, a Virtual Class-Loader (VCL) based approach to construct program analyses based on the open-world concept. This approach is able to continuously load and analyze code. As such, this approach establishes a new way to make analysis efforts proportional to the code size and provides an infrastructure to construct complex, efficient, and scalable static, dynamic, and hybrid analysis procedures to address emerging dependability and security needs.
In this thesis, we attempt to quantify the performance benefit of Jitana through the lens of memory usage. Memory is a very important system-level resource that if not expended efficiently, can result in long execution time and premature termination of a program. Existing program analysis frameworks are notorious for consuming a large amount of memory during an attempt to analyze a large software project. As such, we design an experiment to compare the
memory usage between Jitana and Soot, a widely used program analysis and optimization framework for Java. Our evaluation consists of using 18 Android apps, with sizes ranging from 0.02 MB to 80.4 MB. Our empirical evaluations reveal that Jitana requires up to 81% less memory than Soot to analyze an app. At the same time, it can also analyze more components including those belonging to the application and those belonging to the Android framework.
Adviser: Witawas Srisa-a
Learning Linear Temporal Properties
We present two novel algorithms for learning formulas in Linear Temporal
Logic (LTL) from examples. The first learning algorithm reduces the learning
task to a series of satisfiability problems in propositional Boolean logic and
produces a smallest LTL formula (in terms of the number of subformulas) that is
consistent with the given data. Our second learning algorithm, on the other
hand, combines the SAT-based learning algorithm with classical algorithms for
learning decision trees. The result is a learning algorithm that scales to
real-world scenarios with hundreds of examples, but can no longer guarantee to
produce minimal consistent LTL formulas. We compare both learning algorithms
and demonstrate their performance on a wide range of synthetic benchmarks.
Additionally, we illustrate their usefulness on the task of understanding
executions of a leader election protocol
- …