Cooperative Based Software Clustering on Dependency Graphs
The organization of software systems into subsystems is usually based on the
constructs of packages or modules and has a major impact on the maintainability of
the software. However, during software evolution, the organization of the system is
subject to continual modification, which can cause it to drift away from the original
design, often with the effect of reducing its quality.
A number of techniques for evaluating a system's maintainability and for controlling
the effort required to conduct maintenance activities involve software clustering.
Software clustering refers to the partitioning of software system components
into clusters in order to obtain both exterior and interior connectivity between these
components. It helps maintainers enhance the quality of software modularization
and improve its maintainability.
Research in this area has produced numerous algorithms with a variety of
methodologies and parameters. This thesis presents a novel ensemble approach
that synthesizes a new solution from the outcomes of multiple constituent clustering
algorithms. The main principle behind this approach is derived from machine
learning, as applied to document clustering, but it has been modified, both conceptually
and empirically, for use in software clustering. The conceptual modifications
include working with a variable number of clusters produced by the input algorithms
and employing graph structures rather than feature vectors. The empirical
modifications include experiments directed at the selection of the optimal cluster
merging criteria. Case studies based on open source software systems show that
establishing cooperation between leading state-of-the-art algorithms produces better
clustering results than those achieved using any one of the
algorithms alone.
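The ensemble idea described above can be illustrated with a generic co-association scheme. This is a minimal sketch, not the thesis's algorithm: it assumes each input algorithm returns a mapping from module name to cluster label (so the number of clusters may differ per algorithm), and it uses a simple majority threshold as a stand-in for the merging criteria the thesis investigates. The function names, the threshold, and the module names are all hypothetical.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of input partitions that put each pair of modules together."""
    modules = sorted(partitions[0])
    counts = {pair: 0 for pair in combinations(modules, 2)}
    for labels in partitions:
        for a, b in list(counts):
            if labels[a] == labels[b]:
                counts[(a, b)] += 1
    return {pair: n / len(partitions) for pair, n in counts.items()}


def consensus_clusters(partitions, threshold=0.5):
    """Merge modules whose co-association exceeds the (assumed) threshold."""
    parent = {m: m for m in partitions[0]}

    def find(m):
        # Union-find lookup with path halving.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for (a, b), score in co_association(partitions).items():
        if score > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for m in parent:
        groups.setdefault(find(m), set()).add(m)
    return sorted(sorted(g) for g in groups.values())


# Three hypothetical input clusterings with 3, 2 and 4 clusters respectively.
p1 = {"parser": 0, "lexer": 0, "ui": 1, "widgets": 1, "db": 2}
p2 = {"parser": "a", "lexer": "a", "ui": "b", "widgets": "b", "db": "b"}
p3 = {"parser": 0, "lexer": 1, "ui": 2, "widgets": 2, "db": 3}
result = consensus_clusters([p1, p2, p3])
```

Because labels are compared only within a single partition, the inputs need not agree on label names or on the number of clusters. A graph-based variant would replace the pairwise counts with weights on dependency edges.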
Applications of Multi-view Learning Approaches for Software Comprehension
Program comprehension concerns the ability of an individual to form an
understanding of an existing software system in order to extend or transform it.
Software systems comprise data that are noisy and incomplete, which makes
program understanding even more difficult. A software system consists of
various views, including the module dependency graph, execution logs,
evolutionary information and the vocabulary used in the source code, which
collectively define the software system. Each of these views contains unique
and complementary information; together, they can describe the data more
accurately. In this paper, we investigate various techniques for combining different
sources of information to improve the performance of a program comprehension
task. We employ state-of-the-art techniques from learning to 1) find a suitable
similarity function for each view, and 2) compare different multi-view learning
techniques to decompose a software system into high-level units and give
component-level recommendations for refactoring of the system, as well as
cross-view source code search. The experiments conducted on 10 relatively large
Java software systems show that by fusing knowledge from different views, we
can guarantee a lower bound on the quality of the modularization and even
improve upon it. We proceed by integrating different sources of information to
give a set of high-level recommendations as to how to refactor the software
system. Furthermore, we demonstrate how learning a joint subspace allows for
performing cross-modal retrieval across views, yielding results that are more
aligned with what the user intends by the query. The multi-view approaches
outlined in this paper can be employed for addressing problems in software
engineering that can be encoded in terms of a learning problem, such as
software bug prediction and feature location.
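As a rough illustration of fusing knowledge from different views, the sketch below averages one similarity score per view. It is an assumption-laden example, not the paper's method: it uses Jaccard similarity for every view rather than a learned similarity function, combines views by a weighted mean rather than a learned joint subspace, and all class names, view names, and weights are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fused_similarity(views, x, y, weights=None):
    """Weighted mean of the per-view Jaccard similarities between x and y."""
    names = list(views)
    weights = weights or {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    return sum(
        weights[name] * jaccard(views[name][x], views[name][y]) for name in names
    ) / total


# Two hypothetical views of the same three classes.
views = {
    "dependencies": {
        "Parser": {"Lexer", "Ast"},
        "Printer": {"Ast"},
        "Cache": {"Store"},
    },
    "vocabulary": {
        "Parser": {"token", "tree", "parse"},
        "Printer": {"tree", "print"},
        "Cache": {"key", "store"},
    },
}
parser_printer = fused_similarity(views, "Parser", "Printer")
parser_cache = fused_similarity(views, "Parser", "Cache")
```

The resulting pairwise scores could then be passed to any clustering algorithm that accepts a similarity matrix. Passing a `weights` mapping lets one view count for more than another.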
Towards a Reference Architecture with Modular Design for Large-scale Genotyping and Phenotyping Data Analysis: A Case Study with Image Data
With the rapid advancement of computing technologies, various scientific research communities have been making extensive use of cloud-based software tools and applications. Cloud-based applications allow users to access software from a web browser without installing anything in their desktop environment. For example, Galaxy, GenAP, and iPlant Collaborative are popular cloud-based systems for scientific workflow analysis in the domain of plant Genotyping and Phenotyping. These systems are used for conducting research, devising new techniques, and sharing computer-assisted analysis results among collaborators. Researchers need to integrate their new workflows/pipelines, tools, or techniques with the base system over time. Moreover, large-scale data need to be processed within the available timeline for the analysis to be effective. Big Data technologies have recently emerged to facilitate large-scale data processing on commodity hardware. Among the above-mentioned systems, GenAP uses Big Data technologies for specific cases only. The structure of such a cloud-based system is highly variable and complex in nature. During development and maintenance, software architects and developers need to consider properties and challenges that differ substantially from those of traditional business/service-oriented systems. Recent studies report that software engineers and data engineers face challenges in developing analytic tools that support large-scale and heterogeneous data analysis. Unfortunately, software researchers have given little attention to devising a well-defined methodology and frameworks for the flexible design of a cloud system for the Genotyping and Phenotyping domain. More effective design methodologies and frameworks are therefore urgently needed for the development of cloud-based Genotyping and Phenotyping analysis systems that also support large-scale data processing.
In this thesis, we conduct a few studies in order to devise a stable reference architecture and modularity model for software developers and data engineers in the domain of Genotyping and Phenotyping. In the first study, we analyze the architectural changes of existing candidate systems to identify stability issues. Then, we extract architectural patterns from the candidate systems and propose a conceptual reference architectural model. Finally, we present a case study on the modularity of computation-intensive tasks as an extension of data-centric development. We show that the data-centric modularity model is at the core of the flexible development of a Genotyping and Phenotyping analysis system. Our proposed model and case study with thousands of images provide a useful knowledge base for software researchers, developers, and data engineers working on cloud-based Genotyping and Phenotyping analysis system development.
Formal Verification of Industrial Software and Neural Networks
Software is an important part of today's society. As software is increasingly
used in safety-critical areas, we must be able to rely on its correct and
safe execution. Embedded software in particular, for example in
medical devices, cars, or aircraft, must be thoroughly and formally verified.
The software of such embedded systems can be divided into two components:
classical (deterministic) control software, and machine learning methods,
which are applied, for example, to image recognition or collision avoidance.
The goal of this dissertation is to advance the state of the art in the verification of
two main components of modern embedded systems: software written in C/C++
and neural networks. For both components, the verification problem is
formally defined and new verification approaches are presented.
- …