70 research outputs found

    The Impact of Systematic Edits in History Slicing

    Full text link
    While extracting a subset of a commit history, specifying the necessary portion is a time-consuming task for developers. Several commit-based history slicing techniques have been proposed to identify dependencies between commits and to extract a related set of commits using a specific commit as a slicing criterion. However, the resulting subset of commits becomes large if it contains commits for systematic edits whose changes do not depend on each other. We empirically investigated the impact of systematic edits on history slicing. In this study, commits in which systematic edits were detected are split per file so that unnecessary dependencies between commits are eliminated. In several histories of open-source systems, the size of history slices was reduced by 13.3-57.2% on average after splitting the commits for systematic edits.
    Comment: 5 pages, MSR 201
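A minimal sketch of the idea, under a simplified data model that is not the paper's actual implementation: commits are treated as dependent when they touch a common file, and splitting a systematic-edit commit into one commit per file removes spurious dependencies, shrinking the slice.

```python
from collections import defaultdict

def slice_history(commits, criterion):
    """Return the set of commit ids transitively required by `criterion`.

    Commit B is taken to depend on an earlier commit A when they touch a
    common file -- a deliberately coarse notion of dependency used only
    for illustration.
    """
    deps = defaultdict(set)
    for i, (cid, files) in enumerate(commits):
        for pid, pfiles in commits[:i]:
            if files & pfiles:
                deps[cid].add(pid)
    result, stack = set(), [criterion]
    while stack:
        c = stack.pop()
        if c not in result:
            result.add(c)
            stack.extend(deps[c])
    return result

# A "systematic edit" touching a.py and b.py in a single commit (c2)...
tangled = [("c1", {"a.py"}), ("c2", {"a.py", "b.py"}), ("c3", {"b.py"})]
# ...versus the same edit split into one commit per file.
split = [("c1", {"a.py"}), ("c2a", {"a.py"}), ("c2b", {"b.py"}), ("c3", {"b.py"})]

print(slice_history(tangled, "c3"))  # the tangled c2 drags in c1 as well
print(slice_history(split, "c3"))    # only the b.py half of the edit is needed
```

Slicing from `c3` in the tangled history pulls in all three commits; after the split, the slice contains only `c3` and `c2b`, which mirrors the size reductions reported in the abstract.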

    ChangeBeadsThreader: An Interactive Environment for Tailoring Automatically Untangled Changes

    Full text link
    To improve the usability of a revision history, change untangling, which reconstructs the history so that the changes in each commit belong to a single intentional task, is important. Although there are several untangling approaches based on clustering the fine-grained editing operations of source code, they often produce results unsuitable for a developer, and manual tailoring of the results is necessary. In this paper, we propose ChangeBeadsThreader (CBT), an interactive environment for splitting and merging change clusters to support the manual tailoring of untangled changes. CBT provides two features: 1) a two-dimensional space where the fine-grained change history is visualized to help users find the clusters to be merged and 2) an augmented diff view that enables users to confirm the consistency of the changes in a specific cluster to find those to be split. These features allow users to easily tailor automatically untangled changes.
    Comment: 5 pages, SANER 202
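The two tailoring operations the abstract describes can be sketched as list manipulations over clusters of fine-grained edits. The cluster representation here (timestamped `(time, file, description)` tuples) is a made-up stand-in, not CBT's internal model.

```python
def merge_clusters(clusters, i, j):
    """Merge cluster j into cluster i, keeping operations time-ordered."""
    merged = sorted(clusters[i] + clusters[j])
    rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
    return [merged] + rest

def split_cluster(clusters, i, predicate):
    """Split cluster i into the operations matching `predicate` and the rest."""
    hit = [op for op in clusters[i] if predicate(op)]
    miss = [op for op in clusters[i] if not predicate(op)]
    rest = [c for k, c in enumerate(clusters) if k != i]
    return [c for c in (hit, miss) if c] + rest

clusters = [
    [(1, "a.py", "rename foo"), (4, "a.py", "fix bug")],
    [(2, "b.py", "rename foo")],
]
# Merge the two clusters, then split the unrelated bug fix back out,
# leaving one cluster per intentional task.
tailored = merge_clusters(clusters, 0, 1)
final = split_cluster(tailored, 0, lambda op: "rename" in op[2])
print(final)
```

The end state has one cluster for the rename task and one for the bug fix, which is the "one commit, one intentional task" property untangling aims for.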

    Analysis of Human Affect and Bug Patterns to Improve Software Quality and Security

    Get PDF
    The impact of software is ever increasing as more and more systems are software operated. Despite the usefulness of software, many software failures have caused tremendous losses in lives and dollars. Software failures take place because of bugs (i.e., faults) in software systems. These bugs cause programs to malfunction or crash and expose security vulnerabilities exploitable by malicious hackers. Studies confirm that software defects and vulnerabilities appear in source code largely due to the mistakes and errors of developers. Human performance is impacted by the underlying development process and by human affects, such as sentiment and emotion. This thesis examines these human affects of software developers, which have drawn recent interest in the community. For capturing developers' sentimental and emotional states, we have developed several software tools (i.e., SentiStrength-SE, DEVA, and MarValous). These are novel tools facilitating the automatic detection of sentiments and emotions from software engineering textual artifacts. Using such automated tools, developers' sentimental variations are studied with respect to the underlying development tasks (e.g., bug-fixing, bug-introducing), development periods (i.e., days and times), team sizes, and project sizes. We expose opportunities for exploiting developers' sentiments for higher productivity and improved software quality. While developers' sentiments and emotions can be leveraged as a proactive safeguard in identifying and minimizing software bugs, this dissertation also includes in-depth studies of the relationships among various bug patterns, such as software defects, security vulnerabilities, and code smells, to find actionable insights for minimizing software bugs and improving software quality and security. Bug patterns are exposed through mining software repositories and bug databases. These bug patterns are crucial for localizing bugs and security vulnerabilities in a software codebase in order to fix them, predicting portions of software susceptible to failure or exploitation by hackers, devising techniques for automated program repair, and avoiding code constructs and coding idioms that are bug-prone. The software tools produced from this thesis are empirically evaluated using standard measurement metrics (e.g., precision, recall). The findings of all the studies are validated with appropriate tests for statistical significance. Finally, based on our experience and an in-depth analysis of the present state of the art, we expose avenues for further research and development towards a holistic approach for developing improved and secure software systems.
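A toy illustration of lexicon-based sentiment detection over developer text. The word lists and scoring are invented for this sketch; the tools named in the abstract (SentiStrength-SE, DEVA, MarValous) use domain-tuned lexicons and, in some cases, machine learning rather than anything this simple.

```python
# Hypothetical mini-lexicons; real SE-tuned lexicons are far larger and
# handle negation, intensifiers, and domain terms ("kill a process" is
# not angry).
POSITIVE = {"great", "thanks", "works", "fixed"}
NEGATIVE = {"bug", "crash", "fails", "broken"}

def sentiment(text):
    """Classify a commit message or comment by a simple lexicon score."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("thanks this works"))
print(sentiment("crash broken"))
```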

    Dynamically generated multi-modal application interfaces

    Get PDF
    This work introduces a new UIMS (User Interface Management System), which aims to solve numerous problems in the field of user-interface development arising from the hard-coded use of user interface toolkits. The presented solution is a concrete system architecture based on the abstract ARCH model, consisting of an interface abstraction layer, a dialog definition language called GIML (Generalized Interface Markup Language), and pluggable interface rendering modules. These components form an interface toolkit called GITK (Generalized Interface ToolKit). With the aid of GITK, one can build an application without explicitly creating a concrete end-user interface. At runtime, GITK can create these interfaces as needed from the abstract specification and run them. Thereby, GITK equips one application with many interfaces, even kinds of interfaces that did not exist when the application was written. It should be noted that this work concentrates on providing the base infrastructure for adaptive/adaptable systems and does not aim to deliver a complete solution. This work shows that the proposed solution is a fundamental concept needed to create interfaces for everyone, which can be used everywhere and at any time. The text further discusses the impact of such technology for users and on the various aspects of software systems and their development. The targeted main audience of this work is software developers or people with a strong interest in software development.
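The core GITK idea can be sketched as follows: the application states its interface abstractly (a plain dict stands in here for a GIML document), and a pluggable renderer turns that specification into a concrete interface at runtime. The spec shape and renderer are illustrative assumptions, not GITK's actual API; a real renderer would target a GUI, voice, or web toolkit, while this one emits text.

```python
# Abstract interface description -- a stand-in for a GIML document.
spec = {
    "title": "Converter",
    "widgets": [
        {"kind": "input", "label": "Amount"},
        {"kind": "button", "label": "Convert"},
    ],
}

def render_text(spec):
    """One pluggable renderer: turn the abstract spec into a text UI."""
    lines = [f"== {spec['title']} =="]
    for w in spec["widgets"]:
        if w["kind"] == "input":
            lines.append(f"{w['label']}: [ ]")
        else:
            lines.append(f"<{w['label']}>")
    return "\n".join(lines)

print(render_text(spec))
```

Swapping `render_text` for another renderer gives the same application a different kind of interface without touching the spec, which is the separation the toolkit is built around.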

    Connected Information Management

    Get PDF
    Society is currently inundated with more information than ever, making efficient management a necessity. Alas, most current information management suffers from several levels of disconnectedness: applications partition data into segregated islands, small notes do not fit into traditional application categories, navigating the data is different for each kind of data, and data is either available on a certain computer or only online, but rarely both. Connected information management (CoIM) is an approach to information management that avoids these kinds of disconnectedness. The core idea of CoIM is to keep all information in a central repository, with generic means for organization such as tagging. The heterogeneity of data is taken into account by offering specialized editors. The central repository eliminates the islands of application-specific data and is formally grounded by a CoIM model. The foundation for structured data is an RDF repository. The RDF editing meta-model (REMM) enables form-based editing of this data, similar to database applications such as MS Access. Further kinds of data are supported by extending RDF, as follows. Wiki text is stored as RDF and can both contain structured text and be combined with structured data. Files are also supported by the CoIM model and are kept externally. Notes can be quickly captured and annotated with meta-data. Generic means for organization and navigation apply to all kinds of data. Ubiquitous availability of data is ensured via two CoIM implementations, the web application HYENA/Web and the desktop application HYENA/Eclipse. All data can be synchronized between these applications. The applications were used to validate the CoIM ideas.
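A minimal sketch of the central-repository idea: every item, whatever its kind, lives in one store as (subject, predicate, object) triples, so a generic mechanism such as tagging works uniformly across notes, files, and structured data. The `hyena:` and `rdf:` names below are illustrative, not identifiers from the actual system.

```python
class Repository:
    """Toy triple store standing in for the RDF repository at CoIM's core."""

    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def tagged(self, tag):
        """Generic navigation: all subjects carrying a given tag."""
        return {s for (s, p, o) in self.triples
                if p == "hyena:tag" and o == tag}

repo = Repository()
# Heterogeneous items, one store: a note and an external file share a tag.
repo.add("note:1", "hyena:tag", "project-x")
repo.add("file:report.pdf", "hyena:tag", "project-x")
repo.add("note:1", "rdf:type", "hyena:Note")
print(repo.tagged("project-x"))
```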

    Logs and Models in Engineering Complex Embedded Production Software Systems

    Get PDF

    Toward a New Deep Learning-Based Approach for Classifying Source Code Changes by Maintenance Activity

    Get PDF
    Software development holds a wealth of information in the form of the history of changes applied to software during its life cycle. A large part of this history is publicly accessible from version control systems and is the subject of exploration and scientific analysis through mining software repositories (MSR), which aims to improve several aspects stakeholders encounter during software development. In this work, we are interested in determining the types of maintenance activity present in a modification of the source code. Several studies have addressed this subject by exploiting the information provided by a programmer, such as the message describing the changes made and the modified code in the form of added and removed lines. However, most consider that a change includes only one type of maintenance activity, which is not always accurate in practice. Also, in using textual data, these studies limit themselves to the message, which often includes only a description of the modified code and not the reason for the change. Additionally, their approaches limit themselves to studying projects that use the same programming language. Through this study, we respond to these challenges by proposing a classification model by maintenance activities based on deep learning models, which are also responsible for feature extraction, whether from textual information (the message and the issue description) or from the modified code, regardless of its programming language. We also provide a new dataset for this task to address another issue: the scarcity of available datasets. This dataset takes into account the fact that a change can belong to several classes of changes. The architecture of our model is composed of a pre-trained model that generates distributed representations of the textual data, plus a classifier in the form of a neural network whose inputs are the output of the pre-trained model and features related to the modified code. Our approach, whose training is based on transfer learning, has given encouraging results not only on our dataset but also on the datasets of related work. Keywords: maintenance activities, version control systems, mining software repositories, deep learning, transfer learning, distributed representation, classification.
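The multi-label aspect can be sketched as one sigmoid per maintenance activity, so a single change may receive several labels. The label set, keyword scores, and thresholds below are invented for illustration; the thesis uses a pre-trained language model and transfer learning rather than hand-picked keywords.

```python
import math

LABELS = ["corrective", "adaptive", "perfective"]

# Illustrative keyword cues standing in for learned features.
KEYWORDS = {"corrective": {"fix", "bug"},
            "adaptive": {"port", "upgrade"},
            "perfective": {"refactor", "cleanup"}}

def classify(text, threshold=0.5):
    """Multi-label classification: an independent sigmoid per activity,
    so 'fix bug and refactor loop' can be both corrective and perfective."""
    words = set(text.lower().split())
    out = []
    for label in LABELS:
        hits = len(words & KEYWORDS[label])
        score = 1 / (1 + math.exp(-(hits * 4 - 2)))  # toy logit
        if score >= threshold:
            out.append(label)
    return out

print(classify("fix bug and refactor loop"))
print(classify("port to python3 upgrade deps"))
```

Because each label gets its own threshold rather than competing in a softmax, a tangled change naturally lands in several classes, which is the property the new dataset is built to capture.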