38,326 research outputs found

    Using a Nearest-Neighbour, BERT-Based Approach for Scalable Clone Detection

    Full text link
    Code clones can detrimentally impact software maintenance and manually detecting them in very large codebases is impractical. Additionally, automated approaches find detection of Type 3 and Type 4 (inexact) clones very challenging. While the most recent artificial deep neural networks (for example BERT-based artificial neural networks) seem to be highly effective in detecting such clones, their pairwise comparison of every code pair in the target system(s) is inefficient and scales poorly on large codebases. We therefore introduce SSCD, a BERT-based clone detection approach that targets high recall of Type 3 and Type 4 clones at scale (in line with our industrial partner's requirements). It does so by computing a representative embedding for each code fragment and finding similar fragments using a nearest neighbour search. SSCD thus avoids the pairwise-comparison bottleneck of other Neural Network approaches while also using parallel, GPU-accelerated search to tackle scalability. This paper details the approach and an empirical assessment towards configuring and evaluating that approach in industrial setting. The configuration analysis suggests that shorter input lengths and text-only based neural network models demonstrate better efficiency in SSCD, while only slightly decreasing effectiveness. The evaluation results suggest that SSCD is more effective than state-of-the-art approaches like SAGA and SourcererCC. It is also highly efficient: in its optimal setting, SSCD effectively locates clones in the entire 320 million LOC BigCloneBench (a standard clone detection benchmark) in just under three hours.Comment: 10 pages, 2 figures, 38th IEEE International Conference on Software Maintenance and Evolutio

    SourcererCC: Scaling Code Clone Detection to Big Code

    Full text link
    Despite a decade of active research, there is a marked lack in clone detectors that scale to very large repositories of source code, in particular for detecting near-miss clones where significant editing activities may take place in the cloned code. We present SourcererCC, a token-based clone detector that targets three clone types, and exploits an index to achieve scalability to large inter-project repositories using a standard workstation. SourcererCC uses an optimized inverted-index to quickly query the potential clones of a given code block. Filtering heuristics based on token ordering are used to significantly reduce the size of the index, the number of code-block comparisons needed to detect the clones, as well as the number of required token-comparisons needed to judge a potential clone. We evaluate the scalability, execution time, recall and precision of SourcererCC, and compare it to four publicly available and state-of-the-art tools. To measure recall, we use two recent benchmarks, (1) a large benchmark of real clones, BigCloneBench, and (2) a Mutation/Injection-based framework of thousands of fine-grained artificial clones. We find SourcererCC has both high recall and precision, and is able to scale to a large inter-project repository (250MLOC) using a standard workstation.Comment: Accepted for publication at ICSE'16 (preprint, unrevised

    VirtFogSim: A parallel toolbox for dynamic energy-delay performance testing and optimization of 5G Mobile-Fog-Cloud virtualized platforms

    Get PDF
    It is expected that the pervasive deployment of multi-tier 5G-supported Mobile-Fog-Cloudtechnological computing platforms will constitute an effective means to support the real-time execution of future Internet applications by resource- and energy-limited mobile devices. Increasing interest in this emerging networking-computing technology demands the optimization and performance evaluation of several parts of the underlying infrastructures. However, field trials are challenging due to their operational costs, and in every case, the obtained results could be difficult to repeat and customize. These emergingMobile-Fog-Cloud ecosystems still lack, indeed, customizable software tools for the performance simulation of their computing-networking building blocks. Motivated by these considerations, in this contribution, we present VirtFogSim. It is aMATLAB-supported software toolbox that allows the dynamic joint optimization and tracking of the energy and delay performance of Mobile-Fog-Cloud systems for the execution of applications described by general Directed Application Graphs (DAGs). In a nutshell, the main peculiar features of the proposed VirtFogSim toolbox are that: (i) it allows the joint dynamic energy-aware optimization of the placement of the application tasks and the allocation of the needed computing-networking resources under hard constraints on acceptable overall execution times, (ii) it allows the repeatable and customizable simulation of the resulting energy-delay performance of the overall system; (iii) it allows the dynamic tracking of the performed resource allocation under time-varying operational environments, as those typically featuring mobile applications; (iv) it is equipped with a user-friendly Graphic User Interface (GUI) that supports a number of graphic formats for data rendering, and (v) itsMATLAB code is optimized for running atop multi-core parallel execution platforms. To check both the actual optimization and scalability capabilities of the VirtFogSim toolbox, a number of experimental setups featuring different use cases and operational environments are simulated, and their performances are compared

    Heterologous screening of hybridomas for the development of broad-specific monoclonal antibodies against deoxynivalenol and its analogues

    Get PDF
    Hapten heterology was introduced into the steps of hybridoma selection for the development of monoclonal antibodies (MAbs) against deoxynivalenol (DON). Firstly, a novel heterologous DON hapten was synthesised and covalently coupled to proteins (i.e. bovine serum albumin (BSA), ovalbumin and horseradish peroxidase) using the linkage of cyanuric chloride (CC). After immunisation, antisera from different DON immunogens were checked for the presence of useful antibodies. Next, both homologous and heterologous enzyme-linked immunosorbent assays were conducted to screen for hybridomas. It was found that heterologous screening could significantly reduce the proportion of false positives and appeared to be an efficient approach for selecting hybridomas of interest. This strategy resulted in two kinds of broad-selective MAbs against DON and its analogues. They were quite distinct from other reported DON-antibodies in their cross-reactivity profiles. A unique MAb 13H1 derived from DON-CC-BSA immunogen could recognise DON and its analogues in the order of HT-2 toxin > 15-acetyl-DON > DON > nivalenol, with IC50 ranging from 1.14 to 7.69 mu g/ml. Another preferable MAb 10H10 generated from DON-BSA immunogen manifested relatively similar affinity to DON, 3-acetyl-DON and 15-acetyl-DON, with IC50 values of 22, 15 and 34 ng/ml, respectively. This is the first broad-specific MAb against DON and its two acetylated forms and thus it can be used for simultaneous detection of the three mycotoxins
    • …
    corecore