244,797 research outputs found

    DeepSoft: A vision for a deep model of software

    Full text link
    Although software analytics has experienced rapid growth as a research area, it has not yet reached its full potential for wide industrial adoption. Most of the existing work in software analytics still relies heavily on costly manual feature engineering processes, and they mainly address the traditional classification problems, as opposed to predicting future events. We present a vision for \emph{DeepSoft}, an \emph{end-to-end} generic framework for modeling software and its development process to predict future risks and recommend interventions. DeepSoft, partly inspired by human memory, is built upon the powerful deep learning-based Long Short Term Memory architecture that is capable of learning long-term temporal dependencies that occur in software evolution. Such deep learned patterns of software can be used to address a range of challenging problems such as code and task recommendation and prediction. DeepSoft provides a new approach for research into modeling of source code, risk prediction and mitigation, developer modeling, and automatically generating code patches from bug reports.Comment: FSE 201

    Identification of Software Bugs by Analyzing Natural Language-Based Requirements Using Optimized Deep Learning Features

    Get PDF
    © 2024 Tech Science Press. All rights reserved. This is an open access article distributed under the Creative Commons Attribution License, to view a copy of the license, see: https://creativecommons.org/licenses/by/4.0/Software project outcomes heavily depend on natural language requirements, often causing diverse interpretations and issues like ambiguities and incomplete or faulty requirements. Researchers are exploring machine learning to predict software bugs, but a more precise and general approach is needed. Accurate bug prediction is crucial for software evolution and user training, prompting an investigation into deep and ensemble learning methods. However, these studies are not generalized and efficient when extended to other datasets. Therefore, this paper proposed a hybrid approach combining multiple techniques to explore their effectiveness on bug identification problems. The methods involved feature selection, which is used to reduce the dimensionality and redundancy of features and select only the relevant ones; transfer learning is used to train and test the model on different datasets to analyze how much of the learning is passed to other datasets, and ensemble method is utilized to explore the increase in performance upon combining multiple classifiers in a model. Four National Aeronautics and Space Administration (NASA) and four Promise datasets are used in the study, showing an increase in the model’s performance by providing better Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values when different classifiers were combined. It reveals that using an amalgam of techniques such as those used in this study, feature selection, transfer learning, and ensemble methods prove helpful in optimizing the software bug prediction models and providing high-performing, useful end mode.Peer reviewe

    A Deep Reinforcement Learning Approach for Adaptive Traffic Routing in Next-gen Networks

    Full text link
    Next-gen networks require significant evolution of management to enable automation and adaptively adjust network configuration based on traffic dynamics. The advent of software-defined networking (SDN) and programmable switches enables flexibility and programmability. However, traditional techniques that decide traffic policies are usually based on hand-crafted programming optimization and heuristic algorithms. These techniques make non-realistic assumptions, e.g., considering static network load and topology, to obtain tractable solutions, which are inadequate for next-gen networks. In this paper, we design and develop a deep reinforcement learning (DRL) approach for adaptive traffic routing. We design a deep graph convolutional neural network (DGCNN) integrated into the DRL framework to learn the traffic behavior from not only the network topology but also link and node attributes. We adopt the Deep Q-Learning technique to train the DGCNN model in the DRL framework without the need for a labeled training dataset, enabling the framework to quickly adapt to traffic dynamics. The model leverages q-value estimates to select the routing path for every traffic flow request, balancing exploration and exploitation. We perform extensive experiments with various traffic patterns and compare the performance of the proposed approach with the Open Shortest Path First (OSPF) protocol. The experimental results show the effectiveness and adaptiveness of the proposed framework by increasing the network throughput by up to 7.8% and reducing the traffic delay by up to 16.1% compared to OSPF.Comment: Accepted for publication in the Proceedings of the IEEE International Conference on Communications (IEEE ICC 2024

    Deep Learning Anti-patterns from Code Metrics History

    Full text link
    Anti-patterns are poor solutions to recurring design problems. Number of empirical studies have highlighted the negative impact of anti-patterns on software maintenance which motivated the development of various detection techniques. Most of these approaches rely on structural metrics of software systems to identify affected components while others exploit historical information by analyzing co-changes occurring between code components. By relying solely on one aspect of software systems (i.e., structural or historical), existing approaches miss some precious information which limits their performances. In this paper, we propose CAME (Convolutional Analysis of code Metrics Evolution), a deep-learning based approach that relies on both structural and historical information to detect anti-patterns. Our approach exploits historical values of structural code metrics mined from version control systems and uses a Convolutional Neural Network classifier to infer the presence of anti-patterns from this information. We experiment our approach for the widely known God Class anti-pattern and evaluate its performances on three software systems. With the results of our study, we show that: (1) using historical values of source code metrics allows to increase the precision; (2) CAME outperforms existing static machine-learning classifiers; and (3) CAME outperforms existing detection tools.Comment: Preprint. Paper accepted for inclusion in the Research Track of the 35th IEEE International Conference on Software Maintenance and Evolution (ICSME 2019), Cleveland, Ohio, US

    Deep Learning In Software Engineering

    Get PDF
    Software evolves and therefore requires an evolving field of Software Engineering. The evolution of software can be seen on an individual project level through the software life cycle, as well as on a collective level, as we study the trends and uses of software in the real world. As the needs and requirements of users change, so must software evolve to reflect those changes. This cycle is never ending and has led to continuous and rapid development of software projects. More importantly, it has put a great responsibility on software engineers, causing them to adopt practices and tools that allow them to increase their efficiency. However, these tools suffer the same fate as software designed for the general population; they need to change in order to reflect the user’s needs. Fortunately, the demand for this evolving software has given software engineers a plethora of data and artifacts to analyze. The challenge arises when attempting to identify and apply patterns learned from the vast amount of data. In this dissertation, we explore and develop techniques to take advantage of the vast amount of software data and to aid developers in software development tasks. Specifically, we exploit the tool of deep learning to automatically learn patterns discovered within previous software data and automatically apply those patterns to present day software development. We first set out to investigate the current impact of deep learning in software engineering by performing a systematic literature review of top tier conferences and journals. This review provides guidelines and common pitfalls for researchers to consider when implementing DL (Deep Learning) approaches in SE (Software Engineering). In addition, the review provides a research road map for areas within SE where DL could be applicable. Our next piece of work developed an approach that simultaneously learned different representations of source code for the task of clone detection. We found that the use of multiple representations, such as Identifiers, ASTs, CFGs and bytecode, can lead to the identification of similar code fragments. Through the use of deep learning strategies, we automatically learned these different representations without the requirement of hand-crafted features. Lastly, we designed a novel approach for automating the generation of assert statements through seq2seq learning, with the goal of increasing the efficiency of software testing. Given the test method and the context of the associated focal method, we automatically generated semantically and syntactically correct assert statements for a given, unseen test method. We exemplify that the techniques presented in this dissertation provide a meaningful advancement to the field of software engineering and the automation of software development tasks. We provide analytical evaluations and empirical evidence that substantiate the impact of our findings and usefulness of our approaches toward the software engineering community

    Grand Challenges of Traceability: The Next Ten Years

    Full text link
    In 2007, the software and systems traceability community met at the first Natural Bridge symposium on the Grand Challenges of Traceability to establish and address research goals for achieving effective, trustworthy, and ubiquitous traceability. Ten years later, in 2017, the community came together to evaluate a decade of progress towards achieving these goals. These proceedings document some of that progress. They include a series of short position papers, representing current work in the community organized across four process axes of traceability practice. The sessions covered topics from Trace Strategizing, Trace Link Creation and Evolution, Trace Link Usage, real-world applications of Traceability, and Traceability Datasets and benchmarks. Two breakout groups focused on the importance of creating and sharing traceability datasets within the research community, and discussed challenges related to the adoption of tracing techniques in industrial practice. Members of the research community are engaged in many active, ongoing, and impactful research projects. Our hope is that ten years from now we will be able to look back at a productive decade of research and claim that we have achieved the overarching Grand Challenge of Traceability, which seeks for traceability to be always present, built into the engineering process, and for it to have "effectively disappeared without a trace". We hope that others will see the potential that traceability has for empowering software and systems engineers to develop higher-quality products at increasing levels of complexity and scale, and that they will join the active community of Software and Systems traceability researchers as we move forward into the next decade of research
    • …
    corecore