118,069 research outputs found

    Searching, Selecting, and Synthesizing Source Code Components

    Get PDF
    As programmers develop software, they instinctively sense that source code exists that could be reused if found --- many programming tasks are common to many software projects across different domains. oftentimes, a programmer will attempt to create new software from this existing source code, such as third-party libraries or code from online repositories. Unfortunately, several major challenges make it difficult to locate the relevant source code and to reuse it. First, there is a fundamental mismatch between the high-level intent reflected in the descriptions of source code, and the low-level implementation details. This mismatch is known as the concept assignment problem , and refers to the frequent case when the keywords from comments or identifiers in code do not match the features implemented in the code. Second, even if relevant source code is found, programmers must invest significant intellectual effort into understanding how to reuse the different functions, classes, or other components present in the source code. These components may be specific to a particular application, and difficult to reuse.;One key source of information that programmers use to understand source code is the set of relationships among the source code components. These relationships are typically structural data, such as function calls or class instantiations. This structural data has been repeatedly suggested as an alternative to textual analysis for search and reuse, however as yet no comprehensive strategy exists for locating relevant and reusable source code. In my research program, I harness this structural data in a unified approach to creating and evolving software from existing components. For locating relevant source code, I present a search engine for finding applications based on the underlying Application Programming Interface (API) calls, and a technique for finding chains of relevant function invocations from repositories of millions of lines of code. Next, for reusing source code, I introduce a system to facilitate building software prototypes from existing packages, and an approach to detecting similar software applications

    Using Modularity Metrics to assist Move Method Refactoring of Large System

    Full text link
    For large software systems, refactoring activities can be a challenging task, since for keeping component complexity under control the overall architecture as well as many details of each component have to be considered. Product metrics are therefore often used to quantify several parameters related to the modularity of a software system. This paper devises an approach for automatically suggesting refactoring opportunities on large software systems. We show that by assessing metrics for all components, move methods refactoring an be suggested in such a way to improve modularity of several components at once, without hindering any other. However, computing metrics for large software systems, comprising thousands of classes or more, can be a time consuming task when performed on a single CPU. For this, we propose a solution that computes metrics by resorting to GPU, hence greatly shortening computation time. Thanks to our approach precise knowledge on several properties of the system can be continuously gathered while the system evolves, hence assisting developers to quickly assess several solutions for reducing modularity issues

    ALOJA: A benchmarking and predictive platform for big data performance analysis

    Get PDF
    The main goals of the ALOJA research project from BSC-MSR, are to explore and automate the characterization of cost-effectivenessof Big Data deployments. The development of the project over its first year, has resulted in a open source benchmarking platform, an online public repository of results with over 42,000 Hadoop job runs, and web-based analytic tools to gather insights about system's cost-performance1. This article describes the evolution of the project's focus and research lines from over a year of continuously benchmarking Hadoop under dif- ferent configuration and deployments options, presents results, and dis cusses the motivation both technical and market-based of such changes. During this time, ALOJA's target has evolved from a previous low-level profiling of Hadoop runtime, passing through extensive benchmarking and evaluation of a large body of results via aggregation, to currently leveraging Predictive Analytics (PA) techniques. Modeling benchmark executions allow us to estimate the results of new or untested configu- rations or hardware set-ups automatically, by learning techniques from past observations saving in benchmarking time and costs.This work is partially supported the BSC-Microsoft Research Centre, the Span- ish Ministry of Education (TIN2012-34557), the MINECO Severo Ochoa Research program (SEV-2011-0067) and the Generalitat de Catalunya (2014-SGR-1051).Peer ReviewedPostprint (author's final draft

    Emissivity: A Program for Atomic Emissivity Calculations

    Full text link
    In this article we report the release of a new program for calculating the emissivity of atomic transitions. The program, which can be obtained with its documentation from our website www.scienceware.net, passed various rigorous tests and was used by the author to generate theoretical data and analyze observational data. It is particularly useful for investigating atomic transition lines in astronomical context as the program is capable of generating a huge amount of theoretical data and comparing it to observational list of lines. A number of atomic transition algorithms and analytical techniques are implemented within the program and can be very useful in various situations. The program can be described as fast and efficient. Moreover, it requires modest computational resources.Comment: 20 pages, 0 figures, 0 table

    Big-Data-Driven Materials Science and its FAIR Data Infrastructure

    Get PDF
    This chapter addresses the forth paradigm of materials research -- big-data driven materials science. Its concepts and state-of-the-art are described, and its challenges and chances are discussed. For furthering the field, Open Data and an all-embracing sharing, an efficient data infrastructure, and the rich ecosystem of computer codes used in the community are of critical importance. For shaping this forth paradigm and contributing to the development or discovery of improved and novel materials, data must be what is now called FAIR -- Findable, Accessible, Interoperable and Re-purposable/Re-usable. This sets the stage for advances of methods from artificial intelligence that operate on large data sets to find trends and patterns that cannot be obtained from individual calculations and not even directly from high-throughput studies. Recent progress is reviewed and demonstrated, and the chapter is concluded by a forward-looking perspective, addressing important not yet solved challenges.Comment: submitted to the Handbook of Materials Modeling (eds. S. Yip and W. Andreoni), Springer 2018/201

    A Case Study in Matching Service Descriptions to Implementations in an Existing System

    Full text link
    A number of companies are trying to migrate large monolithic software systems to Service Oriented Architectures. A common approach to do this is to first identify and describe desired services (i.e., create a model), and then to locate portions of code within the existing system that implement the described services. In this paper we describe a detailed case study we undertook to match a model to an open-source business application. We describe the systematic methodology we used, the results of the exercise, as well as several observations that throw light on the nature of this problem. We also suggest and validate heuristics that are likely to be useful in partially automating the process of matching service descriptions to implementations.Comment: 20 pages, 19 pdf figure
    corecore