User Review-Based Change File Localization for Mobile Applications
In current mobile app development, novel and emerging DevOps practices
(e.g., Continuous Delivery, Integration, and user feedback analysis) and tools
are becoming more widespread. For instance, the integration of user feedback
(provided in the form of user reviews) in the software release cycle represents
a valuable asset for the maintenance and evolution of mobile apps. To fully
make use of these assets, it is highly desirable for developers to establish
semantic links between the user reviews and the software artefacts to be
changed (e.g., source code and documentation), and thus to localize the
potential files to change for addressing the user feedback. In this paper, we
propose RISING (Review Integration via claSsification, clusterIng, and
linkiNG), an automated approach to support the continuous integration of user
feedback via classification, clustering, and linking of user reviews. RISING
leverages domain-specific constraint information and semi-supervised learning
to group user reviews into multiple fine-grained clusters concerning similar
users' requests. Then, by combining the textual information from both commit
messages and source code, it automatically localizes potential change files to
accommodate the users' requests. Our empirical studies demonstrate that the
proposed approach outperforms the state-of-the-art baseline work in terms of
clustering and localization accuracy, and thus produces more reliable results.
Comment: 15 pages, 3 figures, 8 tables
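The linking step described above (matching a cluster of user reviews against the text of candidate files) can be illustrated with a minimal sketch. It assumes a plain bag-of-words cosine similarity, which is a stand-in and not necessarily the similarity measure RISING uses; the names `tokenize`, `cosine`, and `rank_files`, and the sample file names, are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_files(review_cluster: List[str],
               file_texts: Dict[str, str]) -> List[Tuple[str, float]]:
    """Rank files by textual similarity to a cluster of user reviews.

    file_texts maps a file path to the text associated with it, for example
    its source code concatenated with related commit messages.
    """
    query = Counter(tok for review in review_cluster for tok in tokenize(review))
    scores = [(path, cosine(query, Counter(tokenize(text))))
              for path, text in file_texts.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical example data, not taken from the paper.
reviews = ["app crashes on login", "login button crash"]
files = {
    "LoginActivity.java": "login button click handler fix crash on login",
    "MapView.java": "render map tiles zoom",
}
ranked = rank_files(reviews, files)
```

In this toy example the login-related file ranks first because it shares several terms with the review cluster, while the map file shares none.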
Simplifying Deep-Learning-Based Model for Code Search
To accelerate software development, developers frequently search and reuse
existing code snippets from a large-scale codebase, e.g., GitHub. Over the
years, researchers proposed many information retrieval (IR) based models for
code search, which match keywords in the query with code text. But they fail to
bridge the semantic gap between query and code. To address this challenge, Gu
et al. proposed a deep-learning-based model named DeepCS. It jointly embeds
method code and natural language description into a shared vector space, where
methods related to a natural language query are retrieved according to their
vector similarities. However, DeepCS' working process is complicated and
time-consuming. To overcome this issue, we propose a simplified model,
CodeMatcher, that leverages IR techniques but retains many features of
DeepCS. In general, CodeMatcher combines query keywords in their original order,
performs a fuzzy search on the name and body strings of methods, and returns the
best-matched methods, favoring those with a longer sequence of used keywords. We verified its
effectiveness on a large-scale codebase with about 41k repositories.
Experimental results showed the simplified model CodeMatcher outperforms DeepCS
by 97% in terms of MRR (a widely used accuracy measure for code search), and it
is over 66 times faster than DeepCS. Moreover, compared with the
state-of-the-art IR-based model CodeHow, CodeMatcher improves the MRR by
73%. We also observed that fusing the advantages of IR-based and
deep-learning-based models is promising because they complement each other
by nature, and that improving the quality of method naming helps code search, since
method names play an important role in connecting query and code.
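For readers unfamiliar with the MRR measure mentioned above: mean reciprocal rank averages, over all queries, the reciprocal of the rank at which the first relevant result appears. The sketch below shows the standard definition; the function name and the sample ranks are hypothetical and do not come from the paper's evaluation.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_relevant_ranks: Sequence[Optional[int]]) -> float:
    """Compute MRR from the 1-based rank of the first relevant result per query.

    A rank of None means no relevant result was returned for that query,
    which contributes 0 to the sum.
    """
    if not first_relevant_ranks:
        raise ValueError("at least one query is required")
    total = 0.0
    for rank in first_relevant_ranks:
        if rank is None:
            continue
        if rank < 1:
            raise ValueError("ranks are 1-based")
        total += 1.0 / rank
    return total / len(first_relevant_ranks)


# Hypothetical ranks for four queries: hits at positions 1, 2, and 4, plus one miss.
mrr = mean_reciprocal_rank([1, 2, None, 4])  # (1 + 0.5 + 0 + 0.25) / 4 = 0.4375
```

A higher MRR means relevant methods tend to appear closer to the top of the result list.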
Ontological Reengineering for Reuse
This paper presents the concept of Ontological Reengineering as the process of retrieving
and transforming a conceptual model of an existing and implemented ontology into a new, more correct and more complete conceptual model, which is then reimplemented. Three activities have been identified in this process: reverse engineering, restructuring, and forward engineering. The aim of Reverse Engineering is to output a possible conceptual model on the basis of the code in which the ontology is implemented. The goal of Restructuring is to reorganize this initial conceptual model into a new conceptual model, which is built bearing in mind the use of the restructured ontology by the ontology or application that reuses it. Finally, the objective of Forward Engineering is to output a new implementation of the ontology. The paper also discusses how the ontological reengineering process has been applied to the Standard-Units ontology [18], which is included in a Chemical-Elements [12] ontology. These two ontologies will be included in the Monatomic-Ions and Environmental-Pollutants ontologies.
Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs
Binary code analysis allows analyzing binary code without having access to
the corresponding source code. A binary, after disassembly, is expressed in an
assembly language. This inspires us to approach binary analysis by leveraging
ideas and techniques from Natural Language Processing (NLP), a rich area
focused on processing text of various natural languages. We notice that binary
code analysis and NLP share many analogous topics, such as semantics
extraction, summarization, and classification. This work utilizes these ideas
to address two important code similarity comparison problems: (I) given a pair
of basic blocks for different instruction set architectures (ISAs), determining
whether their semantics are similar; and (II) given a piece of code of
interest, determining if it is contained in another piece of assembly code for
a different ISA. The solutions to these two problems have many applications,
such as cross-architecture vulnerability discovery and code plagiarism
detection. We implement a prototype system INNEREYE and perform a comprehensive
evaluation. A comparison between our approach and existing approaches to
Problem I shows that our system outperforms them in terms of accuracy,
efficiency, and scalability. The case studies using the system
demonstrate that our solution to Problem II is effective. Moreover, this
research showcases how to apply ideas and techniques from NLP to large-scale
binary code analysis.
Comment: Accepted by Network and Distributed Systems Security (NDSS) Symposium
201
Enabling High-Level Application Development for the Internet of Things
Application development in the Internet of Things (IoT) is challenging
because it involves dealing with a wide range of related issues, such as a lack of
separation of concerns and a lack of high-level abstractions to address both
the large scale and heterogeneity. Moreover, stakeholders involved in the
application development have to address issues that can be attributed to
different life-cycle phases when developing applications. First, the
application logic has to be analyzed and then separated into a set of
distributed tasks for an underlying network. Then, the tasks have to be
implemented for the specific hardware. Apart from handling these issues, they
have to deal with other aspects of life-cycle such as changes in application
requirements and deployed devices. Several approaches have been proposed in the
closely related fields of wireless sensor networks, ubiquitous and pervasive
computing, and software engineering in general to address the above challenges.
However, existing approaches only cover limited subsets of the above-mentioned
challenges when applied to the IoT. This paper proposes an integrated approach
for addressing the above-mentioned challenges. The main contributions of this
paper are: (1) a development methodology that separates IoT application
development into different concerns and provides a conceptual framework to
develop an application, and (2) a development framework that implements the
development methodology to support actions of stakeholders. The development
framework provides a set of modeling languages to specify each development
concern and abstracts the scale and heterogeneity related complexity. It
integrates code generation, task-mapping, and linking techniques to provide
automation. Code generation supports the application development phase by
producing a programming framework that allows stakeholders to focus on the
application logic, while our mapping and linking techniques together support
the deployment phase by producing device-specific code to result in a
distributed system collaboratively hosted by individual devices. Our evaluation
based on two realistic scenarios shows that the use of our approach improves
the productivity of stakeholders involved in the application development.
`The frozen accident' as an evolutionary adaptation: A rate distortion theory perspective on the dynamics and symmetries of genetic coding mechanisms
We survey some interpretations and related issues concerning the frozen accident hypothesis due to F. Crick and how it can be explained in terms of several natural mechanisms involving error correction codes, spin glasses, symmetry breaking, and the characteristic robustness of genetic networks. The approach to most of these questions involves using elements of Shannon's rate distortion theory incorporating a semantic system which is meaningful for the relevant alphabets and vocabulary implemented in transmission of the genetic code. We apply the fundamental homology between information source uncertainty and the free energy density of a thermodynamical system with respect to transcriptional regulators and the communication channels of sequence/structure in proteins. This leads to the suggestion that the frozen accident may have been a type of evolutionary adaptation.