
    NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings

    Current approaches for service composition (assemblies of atomic services) require developers to use: (a) domain-specific semantics to formalize services, which restrict the vocabulary available for their descriptions, and (b) translation mechanisms for service retrieval that convert unstructured user requests into strongly-typed semantic representations. In our work, we argue that the effort of developing service descriptions, request translations, and matching mechanisms could be reduced by using unrestricted natural language, allowing both: (1) end-users to intuitively express their needs using natural language, and (2) service developers to develop services without relying on syntactic/semantic description languages. Although some natural language-based service composition approaches exist, they restrict service retrieval to syntactic/semantic matching. Building on recent developments in machine learning and natural language processing, we motivate the use of Sentence Embeddings, leveraging richer semantic representations of sentences for service description, matching, and retrieval. Experimental results show that service composition development effort may be reduced by more than 44% while keeping high precision/recall when matching high-level user requests with low-level service method invocations.
    Comment: This paper will appear at SCC'19 (IEEE International Conference on Services Computing) on July 1.
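    The abstract does not describe an implementation, but the core retrieval idea, embedding both free-form user requests and plain-English service descriptions and then ranking services by cosine similarity, can be sketched roughly as below. This is a minimal sketch, not the authors' system: the embedding model name, the example services, and the match_request helper are illustrative assumptions.

    ```python
    # Hypothetical sketch of sentence-embedding-based service matching (not the
    # paper's implementation). Assumes the sentence-transformers package; the
    # model name and example services are placeholders.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

    # Unrestricted natural-language descriptions of available service methods.
    services = {
        "weather.get_forecast": "Return the weather forecast for a given city and date.",
        "calendar.add_event": "Add a meeting or appointment to the user's calendar.",
        "email.send_message": "Send an email message to one or more recipients.",
    }

    service_names = list(services)
    service_vecs = model.encode(list(services.values()), normalize_embeddings=True)

    def match_request(request: str, top_k: int = 1):
        """Rank service methods by cosine similarity to a free-form user request."""
        req_vec = model.encode([request], normalize_embeddings=True)[0]
        scores = service_vecs @ req_vec  # cosine similarity (vectors are unit-norm)
        ranked = sorted(zip(service_names, scores), key=lambda p: -p[1])
        return ranked[:top_k]

    print(match_request("Will it rain in Lisbon tomorrow?"))
    ```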

    Identification of microservices from monolithic applications through topic modelling

    Master's dissertation in Informatics Engineering. Microservices emerged as one of the most popular architectural patterns in recent years, given the increased need to scale, grow, and add flexibility to software projects, accompanied by the growth of cloud computing and DevOps. Many software applications are being migrated from a monolithic architecture to a more modular, scalable, and flexible architecture of microservices. This process is slow and, depending on the project's complexity, may take months or even years to complete. This dissertation proposes a new approach to microservice identification that resorts to topic modelling in order to identify services according to domain terms. This approach, in combination with clustering techniques, produces a set of services based on the original software. The proposed methodology is implemented as an open-source tool for the exploration of monolithic architectures and the identification of microservices. An extensive quantitative analysis using state-of-the-art metrics on independence of functionality and modularity of services was conducted on 200 open-source projects collected from GitHub. Cohesion at the message and domain levels showed medians of roughly 0.6. Interfaces per service exhibited a median of 1.5 with a compact interquartile range. Structural and conceptual modularity revealed medians of 0.2 and 0.4, respectively. Further analysis to understand whether the methodology works better for smaller or larger projects revealed overall stability and similar performance across metrics. Our first results are positive, with the overall metric values indicating beneficial identification of services.
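    The dissertation's tool is not reproduced here, but the general idea, extracting topic distributions from the source files of a monolith and then clustering files with similar topic mixtures into candidate services, can be sketched with scikit-learn. The file contents, the number of topics, and the number of clusters below are illustrative assumptions, not the dissertation's actual pipeline.

    ```python
    # Illustrative sketch of topic-modelling-based microservice identification
    # (not the dissertation's tool). Assumes scikit-learn; the documents, number
    # of topics, and number of candidate services are placeholder choices.
    from sklearn.cluster import KMeans
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    # Each "document" stands for the domain terms extracted from one source file.
    source_files = {
        "OrderController.java": "order checkout cart payment invoice total",
        "PaymentService.java": "payment card refund invoice transaction",
        "UserRepository.java": "user account login password profile",
        "ProfileController.java": "user profile avatar settings account",
    }

    names = list(source_files)
    counts = CountVectorizer().fit_transform(source_files.values())

    # Topic model over domain terms: each file gets a distribution over topics.
    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    topic_mix = lda.fit_transform(counts)

    # Cluster files with similar topic mixtures into candidate microservices.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(topic_mix)

    for service_id in sorted(set(labels)):
        members = [n for n, label in zip(names, labels) if label == service_id]
        print(f"candidate service {service_id}: {members}")
    ```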

    Closing the gap between guidance and practice, an investigation of the relevance of design guidance to practitioners using object-oriented technologies

    This thesis investigates whether object-oriented design guidance is relevant in practice, and how this affects the software that is produced. This is achieved by surveying practitioners and studying how constructs such as interfaces and inheritance are used in open-source systems. Surveyed practitioners framed 'good design' in terms of its impact on development and maintenance. Recognition of quality requires practitioner judgement (individually and as a group), and principles are valued over rules. Time constraints heighten sensitivity to the rework cost of poor design decisions. Examination of open-source systems highlights the use of interfaces and inheritance. There is some evidence of 'textbook' use of these structures, and much use is simple. Outliers are widespread, indicating a pragmatic approach. Design is found to reflect the pressures of practice: high-level decisions justify 'designed' structures and architecture, while uncertainty leads to deferred design decisions, simpler structures, repetition, and unconsolidated design. Sub-populations of structures can be identified which may represent common trade-offs. Useful insights are gained into practitioner attitudes to design guidance. Patterns of use and structure are identified which may aid in the assessment and comprehension of object-oriented systems.

    Empirically-Grounded Construction of Bug Prediction and Detection Tools

    There is an increasing demand for high-quality software, as software bugs have an economic impact not only on software projects but also on national economies in general. Software quality is achieved via the main quality assurance activities of testing and code reviewing. However, these activities are expensive, so they need to be carried out efficiently. Auxiliary software quality tools such as bug detection and bug prediction tools help developers focus their testing and reviewing activities on the parts of the software that are more likely to contain bugs. However, these tools are far from being adopted as mainstream development tools. Previous research points to their inability to adapt to the peculiarities of projects and their high rate of false positives as the main obstacles to their adoption. We propose empirically-grounded analysis to improve the adaptability and efficiency of bug detection and prediction tools. For a bug detector to be efficient, it needs to detect bugs that are conspicuous, frequent, and specific to a software project. We empirically show that null-related bugs fulfill these criteria and are worth building detectors for. We analyze the null dereferencing problem and find that its root cause lies in methods that return null. We propose an empirical solution to this problem that depends on the wisdom of the crowd. For each API method, we extract a nullability measure that expresses how often the return value of this method is checked against null in the ecosystem of the API. We use nullability to annotate API methods with nullness annotations and to warn developers about missing and excessive null checks. For a bug predictor to be efficient, it needs to be optimized both as a machine learning model and as a software quality tool. We empirically show how feature selection and hyperparameter optimization improve prediction accuracy. Then we optimize bug prediction to locate the maximum number of bugs in the minimum amount of code by finding the most cost-effective combination of bug prediction configurations, i.e., dependent variables, machine learning model, and response variable. We show that using both source code and change metrics as dependent variables, applying feature selection to them, and then using an optimized Random Forest to predict the number of bugs results in the most cost-effective bug predictor. Throughout this thesis, we show how empirically-grounded analysis helps us build efficient bug prediction and detection tools and adapt them to the characteristics of each software project.
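    The thesis reports the empirical comparison itself; the configuration it identifies as most cost-effective, source-code plus change metrics as inputs, feature selection, and a tuned Random Forest predicting the number of bugs, could be assembled roughly as below with scikit-learn. This is a minimal sketch under assumed names: the metrics_df DataFrame, its column names, and the hyperparameter grid are illustrative, not the thesis's actual experimental setup.

    ```python
    # Rough sketch of the kind of pipeline described above (not the thesis's
    # actual experiments). Assumes scikit-learn and a pandas DataFrame whose
    # columns mix source-code metrics (e.g. loc, wmc) and change metrics
    # (e.g. churn, n_authors), plus a "bugs" column with post-release bug counts.
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.feature_selection import SelectKBest, f_regression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline

    def build_bug_predictor(metrics_df: pd.DataFrame) -> GridSearchCV:
        X = metrics_df.drop(columns=["bugs"])   # source-code + change metrics
        y = metrics_df["bugs"]                  # response: number of bugs

        pipeline = Pipeline([
            ("select", SelectKBest(score_func=f_regression)),  # feature selection
            ("model", RandomForestRegressor(random_state=0)),  # bug-count model
        ])

        # Small illustrative grid; a real study would tune more broadly.
        grid = {
            "select__k": [5, 10, "all"],
            "model__n_estimators": [100, 300],
            "model__max_depth": [None, 10],
        }
        search = GridSearchCV(pipeline, grid, cv=5,
                              scoring="neg_mean_squared_error")
        return search.fit(X, y)
    ```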