400 research outputs found
Criteria for software modularization
A central issue in programming practice involves determining the appropriate size and information content of a software module. This study attempted to determine the effectiveness of two widely used criteria for software modularization, strength and size, in reducing fault rate and development cost. Data from 453 FORTRAN modules developed by professional programmers were analyzed. The results indicated that module strength is a good criterion with respect to fault rate, whereas arbitrary module size limitations inhibit programmer productivity. This analysis is a first step toward defining empirically based standards for software modularization
On the Effect of Semantically Enriched Context Models on Software Modularization
Many of the existing approaches for program comprehension rely on the
linguistic information found in source code, such as identifier names and
comments. Semantic clustering is one such technique for modularization of the
system that relies on the informal semantics of the program, encoded in the
vocabulary used in the source code. Treating the source code as a collection of
tokens loses the semantic information embedded within the identifiers. We try
to overcome this problem by introducing context models for source code
identifiers to obtain a semantic kernel, which can be used for both deriving
the topics that run through the system as well as their clustering. In the
first model, we abstract an identifier to its type representation and build on
this notion of context to construct contextual vector representation of the
source code. The second notion of context is defined based on the flow of data
between identifiers to represent a module as a dependency graph where the nodes
correspond to identifiers and the edges represent the data dependencies between
pairs of identifiers. We have applied our approach to 10 medium-sized open
source Java projects, and show that by introducing contexts for identifiers,
the quality of the modularization of the software systems is improved. Both of
the context models give results that are superior to the plain vector
representation of documents. In some cases, the authoritativeness of
decompositions is improved by 67%. Furthermore, a more detailed evaluation of
our approach on JEdit, an open source editor, demonstrates that inferred topics
through performing topic analysis on the contextual representations are more
meaningful compared to the plain representation of the documents. The proposed
approach in introducing a context model for source code identifiers paves the
way for building tools that support developers in program comprehension tasks
such as application and domain concept location, software modularization and
topic analysis
Improved Binary Similarity Measures for Software Modularization
Various binary similarity measures have been employed in clustering approaches to make homogeneous groups of similar entities in the data. These similarity measures are mostly based only on the presence and absence of features. Binary similarity measures have also been explored with different clustering approaches (e.g., agglomerative hierarchical clustering) for software modularization to make the software systems understandable and manageable. Each similarity measure has its own strengths and weaknesses that result in improving and deteriorating the clustering results, respectively. This paper highlights the strengths of some well-known existing binary similarity measures for software modularization. Furthermore, based on these existing similarity measures, this paper introduces the improved new binary similarity measures. Proofs of the correctness with illustration and a series of experiments are presented to evaluate the effectiveness of our new binary similarity measures
Extending ROOT through Modules
The ROOT software framework is foundational for the HEP ecosystem, providing
capabilities such as IO, a C++ interpreter, GUI, and math libraries. It uses
object-oriented concepts and build-time components to layer between them. We
believe additional layering formalisms will benefit ROOT and its users. We
present the modularization strategy for ROOT which aims to formalize the
description of existing source components, making available the dependencies
and other metadata externally from the build system, and allow post-install
additions of functionality in the runtime environment. components can then be
grouped into packages, installable from external repositories to deliver
post-install step of missing packages. This provides a mechanism for the wider
software ecosystem to interact with a minimalistic install. Reducing
intra-component dependencies improves maintainability and code hygiene. We
believe helping maintain the smallest "base install" possible will help
embedding use cases. The modularization effort draws inspiration from the Java,
Python, and Swift ecosystems. Keeping aligned with the modern C++, this
strategy relies on forthcoming features such as C++ modules. We hope
formalizing the component layer will provide simpler ROOT installs, improve
extensibility, and decrease the complexity of embedding in other ecosystemsComment: 8 pages, 2 figures, 1 listing, CHEP 2018 - 23rd International
Conference on Computing in High Energy and Nuclear Physic
Towards A Modular IT-Landscape For Manufacturing Companies: Framework For Holistic Software Modularization
Companies in the manufacturing sector are confronted with an increasingly dynamic environment. Thus, corporate processes and, consequently, the supporting IT landscape must change. This need is not yet fully met in the development of information systems. While best-of-breed approaches are available, monolithic systems that no longer meet the manufacturing industry's requirements are still prevalent in practical use. A modular structure of IT landscapes could combine the advantages of individual and standard information systems and meet the need for adaptability. At present, however, there is no established standard for the modular design of IT landscapes in the field of manufacturing companies' information systems. This paper presents different ways of the modular design of IT landscapes and information systems and analyzes their objects of modularization. For this purpose, a systematic literature research is carried out in the subject area of software and modularization. Starting from the V-model as a reference model, a framework for different levels of modularization was developed by identifying that most scientific approaches carry out modularization at the data structure-based and source code-based levels. Only a few sources address the consideration of modularization at the level of the software environment-based and software function-based level. In particular, no domain-specific application of these levels of modularization, e.g., for manufacturing, was identified
Annotated bibliography of software engineering laboratory literature
An annotated bibliography is presented of technical papers, documents, and memorandums produced by or related to the Software Engineering Laboratory. The bibliography was updated and reorganized substantially since the original version (SEL-82-006, November 1982). All materials were grouped into eight general subject areas for easy reference: (1) The Software Engineering Laboratory; (2) The Software Engineering Laboratory: Software Development Documents; (3) Software Tools; (4) Software Models; (5) Software Measurement; (6) Technology Evaluations; (7) Ada Technology; and (8) Data Collection. Subject and author indexes further classify these documents by specific topic and individual author
Applications of Multi-view Learning Approaches for Software Comprehension
Program comprehension concerns the ability of an individual to make an
understanding of an existing software system to extend or transform it.
Software systems comprise of data that are noisy and missing, which makes
program understanding even more difficult. A software system consists of
various views including the module dependency graph, execution logs,
evolutionary information and the vocabulary used in the source code, that
collectively defines the software system. Each of these views contain unique
and complementary information; together which can more accurately describe the
data. In this paper, we investigate various techniques for combining different
sources of information to improve the performance of a program comprehension
task. We employ state-of-the-art techniques from learning to 1) find a suitable
similarity function for each view, and 2) compare different multi-view learning
techniques to decompose a software system into high-level units and give
component-level recommendations for refactoring of the system, as well as
cross-view source code search. The experiments conducted on 10 relatively large
Java software systems show that by fusing knowledge from different views, we
can guarantee a lower bound on the quality of the modularization and even
improve upon it. We proceed by integrating different sources of information to
give a set of high-level recommendations as to how to refactor the software
system. Furthermore, we demonstrate how learning a joint subspace allows for
performing cross-modal retrieval across views, yielding results that are more
aligned with what the user intends by the query. The multi-view approaches
outlined in this paper can be employed for addressing problems in software
engineering that can be encoded in terms of a learning problem, such as
software bug prediction and feature location
- …