
    Software Design Change Artifacts Generation through Software Architectural Change Detection and Categorisation

    Unlike other engineering projects, which are largely built by non-expert workers after being designed by engineers, software is designed, implemented, tested, and inspected entirely by experts. Researchers and practitioners have linked software bugs, security holes, problematic integration of changes, hard-to-understand codebases, unwarranted mental pressure, and similar problems in software development and maintenance to inconsistent and complex design and to the lack of easy ways to understand what is going on in a software system and what to plan next. These challenges are made worse when development teams lack the information and insights needed to make good decisions. Extracting software design documents and other insightful information is therefore essential to reduce the above-mentioned anomalies. Moreover, extracting architectural design artifacts is required to create developer profiles that can be made available to the market in many crucial scenarios. To that end, architectural change detection, categorization, and change description generation are crucial because these are the primary artifacts for tracing other software artifacts. However, it is not feasible for humans to analyze all the changes in a single release to detect change and impact: manual analysis is time-consuming, laborious, costly, and inconsistent. In this thesis, we address these challenges in six studies that automate architectural change information extraction and document generation to assist development and maintenance teams. In particular, (1) we detect architectural changes using lightweight techniques that leverage textual and codebase properties, (2) categorize them from intelligent perspectives, and (3) generate design change documents by exploiting precise contexts of components' relations and change purposes, which were previously unexplored. Our experiments with 4000+ architectural change samples and 200+ design change documents suggest that the proposed approaches are promising in accuracy and scalable enough to be deployed frequently. Our proposed change detection approach can detect up to 100% of the architectural change instances (and is very scalable). Our proposed change classifier achieves an F1 score of 70%, which is promising given the challenges. Finally, our proposed system can produce descriptive design change artifacts with 75% significance. Since most of our studies are foundational, our approaches and prepared datasets can be used as baselines for advancing research in design change information extraction and documentation.
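
    As a rough illustration of lightweight, codebase-driven change detection in this spirit (a hedged sketch, not the thesis's actual technique; the release paths are hypothetical), one can compare the module dependency graphs of two releases and flag added or removed dependency edges as candidate architectural changes:

        import ast
        from pathlib import Path

        def dependency_edges(package_dir: str) -> set[tuple[str, str]]:
            """Collect (module, imported_module) edges from a Python source tree."""
            edges = set()
            for path in Path(package_dir).rglob("*.py"):
                module = path.stem
                tree = ast.parse(path.read_text(encoding="utf-8"))
                for node in ast.walk(tree):
                    if isinstance(node, ast.Import):
                        edges.update((module, alias.name) for alias in node.names)
                    elif isinstance(node, ast.ImportFrom) and node.module:
                        edges.add((module, node.module))
            return edges

        # Hypothetical paths to two releases of the same system.
        old = dependency_edges("release-1.0/src")
        new = dependency_edges("release-1.1/src")
        print("added dependencies:", sorted(new - old))
        print("removed dependencies:", sorted(old - new))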

    ML-based data-entry automation and data anomaly detection to support data quality assurance

    Data plays a central role in modern software systems, which are very often powered by machine learning (ML) and used in critical domains of our daily lives, such as finance, health, and transportation. However, the effectiveness of ML-intensive software applications highly depends on the quality of the data. Data quality is affected by data anomalies; data entry errors are one of the main sources of anomalies. The goal of this thesis is to develop approaches to ensure data quality by preventing data entry errors during the form-filling process and by checking the offline data saved in databases. The main contributions of this thesis are: (1) LAFF, an approach to automatically suggest possible values of categorical fields in data entry forms; (2) LACQUER, an approach to automatically relax the completeness requirement of data entry forms by deciding when a field should be optional based on the filled fields and historical input instances; and (3) LAFF-AD, an approach to automatically detect data anomalies in categorical columns in offline datasets. LAFF and LACQUER focus mainly on preventing data entry errors during the form-filling process. Both approaches can be integrated into data entry applications as efficient and effective strategies to assist the user during the form-filling process. LAFF-AD can be used offline on existing suspicious data to effectively detect anomalies in categorical data. In addition, we performed an extensive evaluation of the three approaches, assessing their effectiveness and efficiency, using real-world datasets.
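
    As a rough illustration of value suggestion for categorical fields (a minimal sketch under assumed column names, not the LAFF algorithm itself), candidate values for a target field can be ranked by how often they co-occur with the fields the user has already filled in historical records:

        import pandas as pd

        def suggest_values(history: pd.DataFrame, filled: dict, target: str, top_k: int = 3):
            """Rank candidate values of `target` by frequency among historical
            records matching the fields the user has already filled."""
            matching = history
            for field, value in filled.items():
                matching = matching[matching[field] == value]
            if matching.empty:  # fall back to the global distribution
                matching = history
            return matching[target].value_counts().head(top_k).index.tolist()

        # Hypothetical form history: suggest a 'department' given the filled 'city' field.
        history = pd.DataFrame({
            "city": ["Esch", "Esch", "Luxembourg", "Esch"],
            "department": ["IT", "IT", "HR", "Finance"],
        })
        print(suggest_values(history, {"city": "Esch"}, "department"))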

    Measuring the impact of COVID-19 on hospital care pathways

    Care pathways in hospitals around the world reported significant disruption during the recent COVID-19 pandemic, but measuring the actual impact is more problematic. Process mining can be useful for hospital management to measure the conformance of real-life care to what might be considered normal operations. In this study, we aim to demonstrate that process mining can be used to investigate process changes associated with complex disruptive events. We studied perturbations to accident and emergency (A&E) and maternity pathways in a UK public hospital during the COVID-19 pandemic. Coincidentally, the hospital had implemented a Command Centre approach for patient-flow management, affording an opportunity to study both the planned improvement and the disruption due to the pandemic. Our study proposes and demonstrates a method for measuring and investigating the impact of such planned and unplanned disruptions affecting hospital care pathways. We found that during the pandemic, both A&E and maternity pathways had measurable reductions in the mean length of stay and a measurable drop in the percentage of pathways conforming to normative models. There were no distinctive patterns in the monthly mean length of stay or conformance throughout the phases of the installation of the hospital's new Command Centre approach. Due to a deficit in the available A&E data, the findings for A&E pathways could not be interpreted.
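
    The length-of-stay measure described above can be reproduced, in outline, directly from an event log. The sketch below (column names are assumptions; conformance checking against a normative model with a process mining library is omitted) computes the mean length of stay per month from case start and end timestamps:

        import pandas as pd

        # Hypothetical event log with one row per event: case_id, activity, timestamp.
        log = pd.read_csv("ae_events.csv", parse_dates=["timestamp"])

        # Length of stay per case = time between the first and the last event.
        bounds = log.groupby("case_id")["timestamp"].agg(["min", "max"])
        bounds["los_hours"] = (bounds["max"] - bounds["min"]).dt.total_seconds() / 3600

        # Mean length of stay per month, keyed by the month in which the case started.
        monthly = bounds.groupby(bounds["min"].dt.to_period("M"))["los_hours"].mean()
        print(monthly)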

    Engineering Blockchain Based Software Systems: Foundations, Survey, and Future Directions

    Many scientific and practical areas have shown increasing interest in reaping the benefits of blockchain technology to empower software systems. However, the unique characteristics and requirements associated with Blockchain Based Software (BBS) systems raise new challenges across the development lifecycle that entail an extensive improvement of conventional software engineering. This article presents a systematic literature review of the state of the art in BBS engineering research from a software engineering perspective. We characterize BBS engineering in terms of theoretical foundations, processes, models, and roles, and discuss a rich repertoire of key development activities, principles, challenges, and techniques. The focus and depth of this survey not only give software engineering practitioners and researchers a consolidated body of knowledge about current BBS development but also provide a starting point for further research in this field.

    OpenCitations Meta

    OpenCitations Meta is a new database that contains bibliographic metadata of scholarly publications involved in citations indexed by the OpenCitations infrastructure. It adheres to Open Science principles and provides data under a CC0 license for maximum reuse. The data can be accessed through a SPARQL endpoint, REST APIs, and dumps. OpenCitations Meta serves three important purposes. Firstly, it enables disambiguation of citations between publications described using different identifiers from various sources. For example, it can link publications identified by DOIs in Crossref and PMIDs in PubMed. Secondly, it assigns new globally persistent identifiers (PIDs), known as OpenCitations Meta Identifiers (OMIDs), to bibliographic resources without existing external persistent identifiers like DOIs. Lastly, by hosting the bibliographic metadata internally, OpenCitations Meta improves the speed of metadata retrieval for citing and cited documents. The database is populated through automated data curation, including deduplication, error correction, and metadata enrichment. The data is stored in RDF format following the OpenCitations Data Model, and changes and provenance information are tracked. OpenCitations Meta currently incorporates data from Crossref, DataCite, and the NIH Open Citation Collection. Among semantic publishing datasets, it is currently the largest in data volume.
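
    Access through the SPARQL endpoint might look roughly like the sketch below; the endpoint URL and the use of dcterms:title for titles are assumptions based on the public OpenCitations documentation, and the query merely lists a few titles for illustration:

        import requests

        # Assumed endpoint URL for OpenCitations Meta; check the official documentation.
        ENDPOINT = "https://opencitations.net/meta/sparql"

        query = """
        PREFIX dcterms: <http://purl.org/dc/terms/>
        SELECT ?resource ?title
        WHERE { ?resource dcterms:title ?title }
        LIMIT 5
        """

        resp = requests.get(ENDPOINT, params={"query": query},
                            headers={"Accept": "application/sparql-results+json"},
                            timeout=30)
        for row in resp.json()["results"]["bindings"]:
            print(row["resource"]["value"], "-", row["title"]["value"])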

    Correlating contexts and NFR conflicts from event logs

    In the design of autonomous systems, it is important to consider the preferences of the interested parties to improve the user experience. These preferences are often associated with the contexts in which each system is likely to operate. The operational behavior of a system must also meet various non-functional requirements (NFRs), which can present different levels of conflict depending on the operational context. This work aims to model correlations between the individual contexts and the consequent conflicts between NFRs. The proposed approach is based on analyzing the system event logs, tracing them back to the leaf elements at the specification level, and providing a contextual explanation of the system's behavior. The traced contexts and NFR conflicts are then mined to produce Context-Context and Context-NFR conflict sequential rules. The proposed Contextual Explainability (ConE) framework uses BERT-based pre-trained language models and sequential rule mining libraries to derive the above correlations. Extensive evaluations are performed to compare existing state-of-the-art approaches, and the best-fit solutions are chosen for integration within the ConE framework. Based on experiments, an accuracy of 80%, a precision of 90%, a recall of 97%, and an F1-score of 88% are recorded for the ConE framework on the sequential rules that were mined.
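
    The Context-Context and Context-NFR conflict rules mentioned above are sequential rules of the form "A is later followed by B". A toy version of mining such rules by counting supports and confidences over traced sequences (plain Python, standing in for the rule mining libraries used by ConE; the traces are invented) could look like this:

        from collections import Counter

        def mine_sequential_rules(sequences, min_support=2, min_confidence=0.6):
            """Mine rules 'A -> B' meaning that A is later followed by B in a sequence."""
            item_count, pair_count = Counter(), Counter()
            for seq in sequences:
                item_count.update(set(seq))
                seen = set()
                for i, a in enumerate(seq):
                    for b in seq[i + 1:]:
                        if (a, b) not in seen:
                            pair_count[(a, b)] += 1
                            seen.add((a, b))
            rules = []
            for (a, b), support in pair_count.items():
                confidence = support / item_count[a]
                if support >= min_support and confidence >= min_confidence:
                    rules.append((a, b, support, round(confidence, 2)))
            return rules

        # Hypothetical traces: contexts followed by observed NFR conflicts.
        traces = [["low_battery", "night_mode", "performance_vs_energy"],
                  ["low_battery", "performance_vs_energy"],
                  ["high_load", "latency_vs_cost"]]
        print(mine_sequential_rules(traces))  # [('low_battery', 'performance_vs_energy', 2, 1.0)]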

    Large Language Models for Software Engineering: A Systematic Literature Review

    Large Language Models (LLMs) have significantly impacted numerous domains, notably including Software Engineering (SE). Nevertheless, a well-rounded understanding of the application, effects, and possible limitations of LLMs within SE is still in its early stages. To bridge this gap, our systematic literature review takes a deep dive into the intersection of LLMs and SE, with a particular focus on understanding how LLMs can be exploited in SE to optimize processes and outcomes. Through a comprehensive review approach, we collect and analyze a total of 229 research papers from 2017 to 2023 to answer four key research questions (RQs). In RQ1, we categorize and provide a comparative analysis of different LLMs that have been employed in SE tasks, laying out their distinctive features and uses. For RQ2, we detail the methods involved in data collection, preprocessing, and application in this realm, shedding light on the critical role of robust, well-curated datasets for successful LLM implementation. RQ3 allows us to examine the specific SE tasks where LLMs have shown remarkable success, illuminating their practical contributions to the field. Finally, RQ4 investigates the strategies employed to optimize and evaluate the performance of LLMs in SE, as well as the common techniques related to prompt optimization. Armed with insights drawn from addressing the aforementioned RQs, we sketch a picture of the current state of the art, pinpointing trends, identifying gaps in existing research, and flagging promising areas for future study.

    Continuous Rationale Management

    Continuous Software Engineering (CSE) is a software life cycle model open to frequent changes in requirements or technology. During CSE, software developers continuously make decisions on the requirements and design of the software or the development process. They establish essential decision knowledge, which they need to document and share so that it supports the evolution and changes of the software. The management of decision knowledge is called rationale management. Rationale management provides an opportunity to support the change process during CSE. However, rationale management is not well integrated into CSE. The overall goal of this dissertation is to provide workflows and tool support for continuous rationale management. The dissertation contributes an interview study with practitioners from industry, which investigates rationale management problems, current practices, and features that practitioners consider beneficial to support continuous rationale management. Problems of rationale management in practice are threefold: First, documenting decision knowledge is intrusive in the development process and an additional effort. Second, the large amount of distributed decision knowledge documentation is difficult to access and use. Third, the documented knowledge can be of low quality, e.g., outdated, which impedes its use. The dissertation contributes a systematic mapping study on recommendation and classification approaches to treat the rationale management problems. The major contribution of this dissertation is a validated approach for continuous rationale management consisting of the ConRat life cycle model extension and the comprehensive ConDec tool support. To reduce intrusiveness and additional effort, ConRat integrates rationale management activities into existing workflows, such as requirements elicitation, development, and meetings. ConDec integrates into standard development tools instead of providing a separate tool. ConDec enables lightweight capturing and use of decision knowledge from various artifacts and reduces the developers' effort through automatic text classification, recommendation, and nudging mechanisms for rationale management. To enable access and use of distributed decision knowledge documentation, ConRat defines a knowledge model of decision knowledge and other artifacts. ConDec instantiates the model as a knowledge graph and offers interactive knowledge views with useful tailoring, e.g., transitive linking. To operationalize high quality, ConRat introduces the rationale backlog, the definition of done for knowledge documentation, and metrics for intra-rationale completeness and decision coverage of requirements and code. ConDec implements these agile concepts for rationale management and a knowledge dashboard. ConDec also supports consistent changes through change impact analysis. The dissertation shows the feasibility, effectiveness, and user acceptance of ConRat and ConDec in six case study projects in an industrial setting. In addition, it comprehensively analyses the rationale documentation created in the projects. The validation indicates that ConRat and ConDec benefit CSE projects. Based on the dissertation, continuous rationale management should become a standard part of CSE, like automated testing or continuous integration.
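
    To make the decision coverage metric concrete (a hedged sketch with an invented toy graph, not the ConDec implementation), the documentation can be modeled as a graph and the metric computed as the proportion of requirements that reach at least one documented decision within a given link distance:

        import networkx as nx

        def decision_coverage(graph: nx.Graph, max_distance: int = 2) -> float:
            """Fraction of requirement nodes linked to at least one decision
            within `max_distance` hops in the knowledge graph."""
            requirements = [n for n, d in graph.nodes(data=True) if d.get("type") == "requirement"]
            covered = 0
            for req in requirements:
                reachable = nx.single_source_shortest_path_length(graph, req, cutoff=max_distance)
                if any(graph.nodes[n].get("type") == "decision" for n in reachable):
                    covered += 1
            return covered / len(requirements) if requirements else 1.0

        # Toy knowledge graph: two requirements, one work item, one documented decision.
        g = nx.Graph()
        g.add_node("REQ-1", type="requirement"); g.add_node("REQ-2", type="requirement")
        g.add_node("TASK-7", type="work_item");  g.add_node("DEC-3", type="decision")
        g.add_edges_from([("REQ-1", "TASK-7"), ("TASK-7", "DEC-3")])
        print(decision_coverage(g))  # 0.5: only REQ-1 reaches a decision within two hops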

    AI-Enhanced Hybrid Decision Management

    The Decision Model and Notation (DMN) modeling language allows the precise specification of business decisions and business rules. DMN is readily understandable by business users involved in decision management. However, as models get complex, the limits of human cognition threaten manual maintainability and comprehensibility. Proper design of the decision logic thus requires comprehensive automated analysis of, for example, all possible cases the decision shall cover, correlations between inputs and outputs, and the importance of inputs for deriving the output. In this paper, the authors explore the mutual benefits of combining human-driven DMN decision modeling with the computational power of Artificial Intelligence for DMN model analysis and improved comprehension. The authors propose a model-driven approach that uses DMN models to generate Machine Learning (ML) training data and show how the trained ML models can inform human decision modelers by superimposing the feature importance within the original DMN models. An evaluation with multiple real DMN models from an insurance company demonstrates the feasibility and utility of the approach.
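
    A rough illustration of the second half of this pipeline (a sketch under assumptions: the decision function below is a hand-coded stand-in for executing a real DMN model, and scikit-learn's impurity-based importances stand in for the feature-importance analysis):

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(0)

        def dmn_decision(age, claim_amount, prior_claims):
            """Stand-in for evaluating a DMN decision table (hypothetical insurance rule)."""
            if claim_amount > 10_000 or prior_claims >= 3:
                return "manual_review"
            return "auto_approve" if age >= 25 else "manual_review"

        # Generate training data by sampling inputs and evaluating the decision model.
        X = np.column_stack([rng.integers(18, 80, 5_000),       # age
                             rng.integers(100, 20_000, 5_000),  # claim_amount
                             rng.integers(0, 6, 5_000)])        # prior_claims
        y = [dmn_decision(*row) for row in X]

        # Train an ML model on the generated data and report feature importances,
        # which can then be superimposed on the original DMN model for comprehension.
        model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
        for name, importance in zip(["age", "claim_amount", "prior_claims"],
                                    model.feature_importances_):
            print(f"{name}: {importance:.2f}")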

    Supporting requirement elicitation and ontology testing in knowledge graph engineering

    Knowledge graphs and ontologies are closely related concepts in the field of knowledge representation. In recent years, knowledge graphs have gained increasing popularity and serve as essential components in many knowledge engineering projects that view them as crucial to their success. The conceptual foundation of a knowledge graph is provided by ontologies. Ontology modeling is an iterative engineering process that consists of steps such as the elicitation and formalization of requirements and the development, testing, refactoring, and release of the ontology. Testing the ontology is a crucial and occasionally overlooked step of the process due to the lack of integrated tools to support it. As a result of this gap in the state of the art, ontology testing is carried out manually, which requires a considerable amount of time and effort from ontology engineers. Tool support is also lacking in the requirement elicitation process. Here, the rise in the adoption and accessibility of knowledge graphs allows for the development and use of automated tools to assist with the elicitation of requirements from such a complementary source of data. Therefore, this doctoral research focuses on developing methods and tools that support the requirement elicitation and testing steps of an ontology engineering process. To support ontology testing, we have developed XDTesting, a web application integrated with the GitHub platform that serves as an ontology testing manager. To support the elicitation and documentation of competency questions, we have defined and implemented RevOnt, a method to extract competency questions from knowledge graphs. Both methods are evaluated through their implementations, and the results are promising.
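
    One simple form of ontology test, sketched below with rdflib (a toy competency-question check on an invented ontology fragment, not the XDTesting workflow), is to formalize a competency question as a SPARQL ASK query and assert that the ontology plus sample data can answer it:

        from rdflib import Graph

        # Toy ontology fragment plus sample data (Turtle), standing in for a real release.
        g = Graph()
        g.parse(data="""
        @prefix ex: <http://example.org/> .
        ex:Alice a ex:Author ;
                 ex:wrote ex:Paper1 .
        ex:Paper1 a ex:Publication .
        """, format="turtle")

        # Competency question "Which authors wrote at least one publication?"
        # encoded as an ASK test that must succeed for the ontology to pass.
        ask = """
        PREFIX ex: <http://example.org/>
        ASK { ?a a ex:Author ; ex:wrote ?p . ?p a ex:Publication . }
        """
        assert g.query(ask).askAnswer, "competency question cannot be answered"
        print("ontology test passed")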