    A systematic literature review on source code similarity measurement and clone detection: techniques, applications, and challenges

    Measuring and evaluating source code similarity is a fundamental software engineering activity that supports a broad range of applications, including code recommendation and the detection of duplicate code, plagiarism, malware, and code smells. This paper presents a systematic literature review and meta-analysis of code similarity measurement and evaluation techniques to shed light on the existing approaches and their characteristics in different applications. We initially found over 10,000 articles by querying four digital libraries and ended up with 136 primary studies in the field. The studies were classified according to their methodology, programming languages, datasets, tools, and applications. A deep investigation reveals 80 software tools, working with eight different techniques on five application domains. Nearly 49% of the tools work on Java programs and 37% support C and C++, while many programming languages have no support at all. A noteworthy finding was the existence of 12 datasets related to source code similarity measurement and duplicate code, of which only eight are publicly accessible. The lack of reliable datasets, empirical evaluations, hybrid methods, and attention to multi-paradigm languages are the main challenges in the field. Emerging applications of code similarity measurement concentrate on the development phase in addition to maintenance. (49 pages, 10 figures, 6 tables)
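
    The abstract names eight technique families without detailing them; a minimal sketch of the simplest of these, token-based similarity, may help fix ideas. This is an illustrative baseline, not a tool from the survey; the crude tokenizer and the example snippets are assumptions:

```python
import re

def tokens(source: str) -> set[str]:
    """Crude lexer: extract identifier and number tokens from source code."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+", source))

def jaccard_similarity(a: str, b: str) -> float:
    """Token-set Jaccard similarity, a common baseline for clone detection."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

# A "renamed" clone pair: identical structure, different identifier names.
snippet_a = "int add(int a, int b) { return a + b; }"
snippet_b = "int sum(int x, int y) { return x + y; }"
print(jaccard_similarity(snippet_a, snippet_b))  # 0.25
```

    Real clone detectors typically normalise identifiers or compare token sequences, trees, or graphs rather than bare token sets, which is why a purely set-based measure scores this renamed clone so low.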

    Modular lifelong machine learning

    Deep learning has drastically improved the state of the art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem, and the overall training cost increases further when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021) and, as a result, neglect some knowledge-transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.

    Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to reuse only the subset of modules which are useful for the task at hand. This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.

    First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.

    Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.

    Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improved anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.

    Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
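
    The central mechanism described above, reusing a frozen library of pre-trained modules and searching over combinations of old and new modules for each incoming problem, can be sketched in a few lines. This is an illustrative PyTorch-style reconstruction of the general scheme, not the HOUDINI or PICLE implementation; the module names and the exhaustive enumeration are assumptions:

```python
import torch.nn as nn

# Library of modules accumulated over earlier problems, frozen so that
# reusing them cannot overwrite previously learned knowledge.
library = {
    "conv_gray": nn.Sequential(nn.Conv2d(1, 16, 3), nn.ReLU(), nn.Flatten()),
    "conv_rgb":  nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten()),
}
for module in library.values():
    for p in module.parameters():
        p.requires_grad = False  # freezing prevents catastrophic forgetting

def candidates(n_classes: int):
    """Enumerate networks built from one pre-trained module plus a fresh head."""
    for name, feature in library.items():
        head = nn.LazyLinear(n_classes)  # the only part trained on the new problem
        yield name, nn.Sequential(feature, head)

# A lifelong learner would train and evaluate each candidate on the new
# problem and keep the best; PICLE replaces this exhaustive loop with a
# probabilistic search, which is what allows scaling to long sequences.
for name, net in candidates(n_classes=10):
    print(name, net)
```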

    Life-Cycle Portfolio Choice with Stock Market Loss Framing: Explaining the Empirical Evidence

    We develop a life-cycle model with optimal consumption, portfolio choice, and flexible work hours for households with loss-framing preferences, which give them disutility if they experience losses from stock investments. Structural estimation using U.S. data shows that the model tracks the empirical age-pattern of stock market participants’ financial wealth, stock shares, and work hours remarkably well. Including stock market participation costs in the model also allows us to predict the low stock market participation rates observed in the overall population. Allowing for heterogeneous agents further improves explanatory power and accounts for the observed discrepancy in wealth accumulation between stockholders and non-stockholders.
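
    The abstract does not state the preference specification. For orientation only, a common way to model loss framing, in the spirit of Barberis and Huang (2001) and assumed here rather than taken from the paper, adds a gain-loss term to consumption utility, with losses weighted more heavily than gains:

```latex
% Per-period utility: consumption utility plus a loss-framing term over the
% gain/loss G_t on stock investments relative to a reference point.
% b scales loss framing; lambda > 1 makes losses reduce utility more than
% equal-sized gains increase it.
U_t = u(c_t) + b\, v(G_t), \qquad
v(G_t) =
\begin{cases}
  G_t         & \text{if } G_t \ge 0, \\
  \lambda G_t & \text{if } G_t < 0,
\end{cases}
\qquad \lambda > 1.
```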

    Differential Models, Numerical Simulations and Applications

    This Special Issue includes 12 high-quality articles containing original research findings in the fields of differential and integro-differential models, numerical methods, and efficient algorithms for parameter estimation in inverse problems, with applications to biology, biomedicine, land degradation, traffic flow problems, and manufacturing systems.

    CITIES: Energetic Efficiency, Sustainability; Infrastructures, Energy and the Environment; Mobility and IoT; Governance and Citizenship

    This book collects important contributions on smart cities, created in collaboration with ICSC-CITIES2020, held in San José (Costa Rica) in 2020. The articles cover energetic efficiency and sustainability; infrastructures, energy and the environment; mobility and IoT; and governance and citizenship.

    Modeling, Simulation and Data Processing for Additive Manufacturing

    Additive manufacturing (AM) or, more commonly, 3D printing is one of the fundamental elements of Industry 4.0 and the fourth industrial revolution. It has shown its potential, for example, in the medical, automotive, aerospace, and spare-part sectors. Personal manufacturing, complex and optimized parts, short-series manufacturing, and local on-demand manufacturing are some of its current benefits. Businesses based on AM have experienced double-digit growth in recent years. Accordingly, we have witnessed considerable efforts in developing processes and materials in terms of speed, costs, and availability. These continually open up new applications and business cases that did not previously exist. Most research has focused on material and AM process development, or on efforts to utilize existing materials and processes for industrial applications. However, improving the understanding and simulation of materials and AM processes, and understanding the effect of the different steps in the AM workflow, can increase performance even more. The best way to benefit from AM is to understand all the steps involved, from design and simulation through additive manufacturing and post-processing to the actual application.

    The objective of this Special Issue was to provide a forum for researchers and practitioners to exchange their latest achievements and identify critical issues and challenges for future investigations on “Modeling, Simulation and Data Processing for Additive Manufacturing”. The Special Issue consists of 10 original full-length articles on the topic.

    Recent Advances in Single-Particle Tracking: Experiment and Analysis

    This Special Issue of Entropy, titled “Recent Advances in Single-Particle Tracking: Experiment and Analysis”, contains a collection of 13 papers concerning different aspects of single-particle tracking, a popular experimental technique that has deeply penetrated molecular biology and statistical and chemical physics. Presenting original research, yet written in an accessible style, this collection will be useful both for newcomers to the field and for more experienced researchers looking for a reference. Several papers are written by authorities in the field, and the topics cover aspects of experimental setups, analytical methods of tracking data analysis, machine learning approaches to the data and, finally, some more general issues related to diffusion.
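
    As a concrete instance of the tracking-data analysis the collection covers, the time-averaged mean-squared displacement (MSD) is the standard first statistic computed from a single-particle trajectory; for normal diffusion in d dimensions it grows linearly, MSD(t) = 2dDt, so a linear fit yields the diffusion coefficient D. A minimal NumPy sketch, using a synthetic 2D Brownian trajectory as a stand-in for real tracking data:

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 0.01  # frame interval in seconds (illustrative value)
# Synthetic 2D Brownian trajectory: cumulative sum of Gaussian steps.
track = np.cumsum(rng.normal(0.0, 0.1, size=(1000, 2)), axis=0)

def time_averaged_msd(track: np.ndarray, max_lag: int) -> np.ndarray:
    """Time-averaged MSD for lag times of 1..max_lag frames."""
    return np.array([
        np.mean(np.sum((track[lag:] - track[:-lag]) ** 2, axis=1))
        for lag in range(1, max_lag + 1)
    ])

lags = np.arange(1, 101) * dt
msd = time_averaged_msd(track, max_lag=100)
# MSD(t) = 2*d*D*t with d = 2 dimensions, so D = slope / 4.
D = np.polyfit(lags, msd, 1)[0] / 4
print(f"estimated D = {D:.3f}")  # about 0.1**2 * 2 / (4 * dt) = 0.5 here
```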

    A Late Iron Age farmstead in the Outer Hebrides

    The settlement at Bornais consists of a complex of mounds which protrude from the relatively flat machair plain in the township of Bornais on the island of South Uist. This sandy plain has proved an attractive settlement location from the Beaker period onwards; it appears to have been intensively occupied from the Late Bronze Age to the end of the Norse period. Mound 1 was the original location for settlement in this part of the machair plain; pre-Viking activity of some complexity is present, and it is likely that settlement activity started in the Middle Iron Age, if not earlier. The examination of the Mound 1 deposits provides an important contribution to our understanding of the Iron Age sequence in the Atlantic province. The principal contribution comprises the large quantities of mammal, fish and bird bones, carbonised plant remains and pottery, which can be accurately dated to a fairly precise and narrow period in the 1st millennium AD. These are augmented by a substantial collection of small finds, which includes distinctive bone artefacts. The contextual significance of the site is based on the survival of floor deposits and a burnt-down roof; the floor deposits can be compared with abandonment and adjacent midden deposits, providing contrasting contextual environments that help to clarify depositional processes. The burning down of the house and the excellent preservation of the deposits within it provide an unparalleled opportunity to examine the timber superstructure of the building and the layout of the material used by the inhabitants.

    Bibliographic Control in the Digital Ecosystem

    With the contributions of international experts, the book aims to explore the new boundaries of universal bibliographic control. Bibliographic control is radically changing because the bibliographic universe is radically changing: resources, agents, technologies, standards and practices. Among the main topics addressed are: library cooperation networks; legal deposit; national bibliographies; new tools and standards (IFLA LRM, RDA, BIBFRAME); authority control and new alliances (Wikidata, Wikibase, identifiers); new ways of indexing resources (artificial intelligence); institutional repositories; the new book supply chain; “discoverability” in the IIIF digital ecosystem; the role of thesauri and ontologies in the digital ecosystem; and bibliographic control and search engines.
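
    One of the topics above, authority control via Wikidata identifiers, lends itself to a small illustration: bibliographic records can be reconciled by resolving external authority identifiers through Wikidata. A minimal sketch, assuming network access and omitting error handling; the User-Agent string is a placeholder (Q42 is Douglas Adams, and property P214 holds VIAF identifiers):

```python
import json
import urllib.request

# Fetch the full Wikidata entity document for Q42 (Douglas Adams).
url = "https://www.wikidata.org/wiki/Special:EntityData/Q42.json"
req = urllib.request.Request(url, headers={"User-Agent": "authority-demo/0.1"})
with urllib.request.urlopen(req) as resp:
    entity = json.load(resp)["entities"]["Q42"]

# P214 = VIAF ID; an entity may carry zero or more such claims.
viaf_ids = [
    claim["mainsnak"]["datavalue"]["value"]
    for claim in entity["claims"].get("P214", [])
]
print("VIAF:", viaf_ids)
```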