
    On accelerating ultra-large-scale mining

    Ultra-large-scale mining has been shown to be useful for a number of software engineering tasks, e.g., mining specifications and defect prediction. We propose a new research direction for accelerating ultra-large-scale mining that goes beyond parallelization. Our key idea is to analyze the interaction pattern between the mining task and the artifact in order to cluster artifacts such that running the mining task on one candidate artifact from each cluster is sufficient to produce results for all other artifacts in the same cluster. Our artifact clustering criteria go beyond syntactic, semantic, and functional similarity to mining-task-specific similarity, where the interaction pattern between the mining task and the artifact is used for clustering. Our preliminary evaluation demonstrates that our technique significantly reduces the overall mining time.
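
    The idea lends itself to a simple pipeline. Below is a minimal sketch in Python, assuming a hashable per-artifact fingerprint of the task/artifact interaction is available; cluster_key, mine, and extrapolate are hypothetical stand-ins, not the authors' API.

        # Sketch: run the mining task once per cluster and extrapolate
        # results to the remaining artifacts in that cluster.
        from collections import defaultdict

        def accelerated_mining(artifacts, cluster_key, mine, extrapolate):
            clusters = defaultdict(list)
            for a in artifacts:                      # artifacts assumed hashable
                clusters[cluster_key(a)].append(a)   # group by interaction pattern
            results = {}
            for members in clusters.values():
                rep = members[0]
                results[rep] = mine(rep)             # one full run per cluster
                for other in members[1:]:
                    # Derive, rather than recompute, the remaining results.
                    results[other] = extrapolate(results[rep], rep, other)
            return results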

    BCFA: Bespoke Control Flow Analysis for CFA at Scale

    Many data-driven software engineering tasks, such as discovering programming patterns and mining API specifications, perform source code analysis over control flow graphs (CFGs) at scale. Analyzing millions of CFGs can be expensive, and the performance of the analysis heavily depends on the underlying CFG traversal strategy. State-of-the-art analysis frameworks use a fixed traversal strategy. We argue that a single traversal strategy does not fit all kinds of analyses and CFGs, and propose bespoke control flow analysis (BCFA). Given a control flow analysis (CFA) and a large number of CFGs, BCFA selects the most efficient traversal strategy for each CFG. BCFA extracts a set of properties of the CFA by analyzing the code of the CFA and combines them with properties of the CFG, such as branching factor and cyclicity, to select the optimal traversal strategy. We have implemented BCFA in Boa and evaluated it using a set of representative static analyses that mainly involve traversing CFGs and two large datasets containing 287 thousand and 162 million CFGs. Our results show that BCFA can speed up the large-scale analyses by 1%-28%. Further, BCFA has low overhead (less than 0.2%) and a low misprediction rate (less than 0.01%).
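
    To make the selection step concrete, here is a hedged sketch of how such a decision procedure might look; the property names, thresholds, and strategy labels are illustrative guesses, not the published BCFA rules.

        # Sketch: choose a CFG traversal strategy from analysis (CFA)
        # properties and cheap structural properties of one CFG.
        def pick_traversal(cfa_props, cfg_props):
            if not cfa_props["needs_fixpoint"]:
                return "any-order-single-pass"    # one pass suffices
            if not cfg_props["is_cyclic"]:
                # Acyclic CFGs converge in one topologically ordered pass.
                return ("post-order" if cfa_props["direction"] == "backward"
                        else "reverse-post-order")
            # Cyclic CFGs with loop-carried facts: iterate to a fixpoint.
            if cfg_props["branching_factor"] > 2:
                return "worklist"
            return "iterative-reverse-post-order"

        print(pick_traversal({"needs_fixpoint": True, "direction": "forward"},
                             {"is_cyclic": False, "branching_factor": 2}))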

    SRF Cavity Fabrication and Materials

    The technological and metallurgical requirements of material for high-gradient superconducting cavities are described. High-purity niobium, as the preferred metal for the fabrication of superconducting accelerating cavities, should meet exact specifications. The content of interstitial impurities such as oxygen, nitrogen, and carbon must be below 10 μg/g. The hydrogen content should be kept below 2 μg/g to prevent degradation of the Q-value under certain cool-down conditions. The material should be free of flaws (foreign material inclusions, or cracks and laminations) that can initiate a thermal breakdown. Defects may be detected by quality control methods such as eddy current scanning and identified by a number of special methods. Conventional and alternative cavity fabrication methods are reviewed. Conventionally, niobium cavities are fabricated from sheet niobium by the formation of half-cells by deep drawing, followed by trim machining and Electron-Beam Welding (EBW). The welding of half-cells is a delicate procedure, requiring intermediate cleaning steps and a careful choice of weld parameters to achieve full penetration of the joints. The equator welds are particularly critical. A challenge for a welded construction is the tight mechanical and electrical tolerances. These can be maintained by a combination of mechanical and radio-frequency measurements on half-cells and by careful tracking of weld shrinkage. The established procedure is suitable for large series production. The main aspects of quality assurance management are mentioned. Another cavity fabrication approach is to slice discs from an ingot and produce cavities by deep drawing and EBW. Accelerating gradients at the level of 35-45 MV/m can be achieved by applying Electropolishing (EP) treatment...
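
    The purity limits quoted above are straightforward to encode as an acceptance check; the snippet below merely illustrates the stated thresholds (the limit values come from the abstract, the helper itself is hypothetical).

        # Sketch: check measured interstitial impurity contents of a
        # niobium sheet against the quoted limits (micrograms per gram).
        MAX_UG_PER_G = {"O": 10.0, "N": 10.0, "C": 10.0, "H": 2.0}

        def sheet_meets_spec(measured_ug_per_g):
            """measured_ug_per_g: e.g. {"O": 4.2, "N": 3.1, "C": 5.0, "H": 1.1}"""
            return all(measured_ug_per_g.get(el, float("inf")) <= limit
                       for el, limit in MAX_UG_PER_G.items())

        print(sheet_meets_spec({"O": 4.2, "N": 3.1, "C": 5.0, "H": 1.1}))  # True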

    Similarity-Aware Spectral Sparsification by Edge Filtering

    In recent years, spectral graph sparsification techniques that can compute ultra-sparse graph proxies have been extensively studied for accelerating various numerical and graph-related applications. Prior nearly-linear-time spectral sparsification methods first extract a low-stretch spanning tree from the original graph to form the backbone of the sparsifier, and then recover a small portion of spectrally critical off-tree edges into the spanning tree to significantly improve the approximation quality. However, it is not clear how many off-tree edges should be recovered to achieve a desired spectral similarity level within the sparsifier. Motivated by recent graph signal processing techniques, this paper proposes a similarity-aware spectral graph sparsification framework that leverages efficient spectral off-tree edge embedding and filtering schemes to construct spectral sparsifiers with a guaranteed spectral similarity (relative condition number) level. An iterative graph densification scheme is introduced to facilitate efficient and effective filtering of off-tree edges for highly ill-conditioned problems. The proposed method has been validated using various kinds of graphs obtained from public-domain sparse matrix collections relevant to VLSI CAD, finite element analysis, as well as social and data networks frequently studied in many machine learning and data mining applications.
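
    A greatly simplified sketch of the tree-plus-off-tree-edges flow, assuming networkx: a minimum spanning tree stands in for a true low-stretch spanning tree, edge stretch (tree-path length between an edge's endpoints) is a crude proxy for spectral criticality, and a fixed edge budget replaces the paper's filtering to a guaranteed condition-number level.

        # Sketch: backbone spanning tree plus recovery of the most
        # "stretched" off-tree edges into the sparsifier.
        import networkx as nx

        def similarity_aware_sparsify(G, budget):
            T = nx.minimum_spanning_tree(G)       # backbone of the sparsifier
            off_tree = [e for e in G.edges() if not T.has_edge(*e)]
            # Long tree detours between an edge's endpoints flag edges whose
            # omission distorts the graph spectrum the most.
            off_tree.sort(key=lambda e: nx.shortest_path_length(T, *e),
                          reverse=True)
            S = T.copy()
            S.add_edges_from(off_tree[:budget])   # recover top-ranked edges
            return S

        S = similarity_aware_sparsify(nx.karate_club_graph(), budget=5)
        print(S.number_of_edges())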

    Collective program analysis

    Encouraged by the success of data-driven software engineering (SE) techniques that have found numerous applications, e.g., in defect prediction and specification inference, the demand for mining and analyzing source code repositories at scale has significantly increased. However, analyzing source code at scale remains expensive to the extent that data-driven solutions to certain SE problems are beyond our reach today. Extant techniques have focused on leveraging distributed computing to solve this problem, but with a concomitant increase in computational resource needs. In this thesis, we propose collective program analysis (CPA), a technique to accelerate ultra-large-scale source code mining without demanding more computational resources, by utilizing the similarity between millions of source code artifacts. First, we describe the general concept of collective program analysis. Given a mining task that needs to be run on thousands of artifacts, artifacts with similar interactions are clustered together, such that the mining task needs to be run on only one candidate from each cluster to produce the mining result; the results for the other candidates in the same cluster are produced by extrapolation. The two technical innovations of collective program analysis are mining-task-specific similarity and the interaction pattern graph. Mining-task-specific similarity captures whether two or more artifacts can be considered similar for a given mining task. An interaction pattern graph represents the interaction between the mining task and the artifact when the mining task is run on the artifact, and is used to determine mining-task-specific similarity between artifacts. Given a mining task and an artifact, producing an interaction pattern graph soundly and efficiently can be very challenging, and we propose a pre-analysis and program compaction technique to achieve this. Given a source code mining task and thousands of input programs on which the mining task needs to be run, our technique first extracts information about which parts of an input program are relevant for the mining task and then removes the irrelevant parts from the input programs, prior to running the mining task on them. Our key technical contributions are a static analysis to extract information about the parts of a program that are relevant for a mining task, and a sound program compaction technique that produces a reduced program on which the mining task yields output similar to that on the original program. Once interaction pattern graphs have been produced for thousands of artifacts, they must be clustered and the mining task results reused between similar artifacts to achieve acceleration. In the final part of this thesis, we fully describe collective program analysis and illustrate mining millions of control flow graphs (CFGs) by clustering similar CFGs.
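
    As a rough illustration of the interaction-pattern idea, the sketch below records which CFG node kinds a mining task actually reacts to and hashes the canonicalized trace, so artifacts with equal fingerprints land in the same cluster; the event vocabulary and helper names are hypothetical, and the real CPA builds a graph rather than a flat trace.

        # Sketch: fingerprint the task/artifact interaction so that equal
        # fingerprints identify artifacts that can share one mining run.
        import hashlib

        def interaction_fingerprint(node_kinds, task_step):
            """node_kinds: CFG node kinds in traversal order, e.g.
            ["entry", "call", "assign", "exit"];
            task_step: maps a node kind to the task's action, or None."""
            events = []
            for kind in node_kinds:
                action = task_step(kind)
                if action is not None:           # irrelevant nodes leave no trace
                    events.append((kind, action))
            canonical = "|".join(f"{kind}:{action}" for kind, action in events)
            return hashlib.sha256(canonical.encode()).hexdigest()

        # Example: a task that only inspects call sites.
        step = lambda kind: "record-callee" if kind == "call" else None
        print(interaction_fingerprint(["entry", "call", "assign", "exit"], step))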

    On Accelerating Source Code Analysis At Massive Scale

    Encouraged by the success of data-driven software engineering (SE) techniques that have found numerous applications, e.g., in defect prediction and specification inference, the demand for mining and analyzing source code repositories at scale has significantly increased. However, analyzing source code at scale remains expensive to the extent that data-driven solutions to certain SE problems are beyond our reach today. Extant techniques have focused on leveraging distributed computing to solve this problem, but with a concomitant increase in computational resource needs. This work proposes a technique that reduces the amount of computation performed by the ultra-large-scale source code mining task. Our key idea is to analyze the mining task to identify and remove the irrelevant portions of the source code, prior to running the mining task. We show a realization of our insight for mining and analyzing massive collections of control flow graphs of source code. Our evaluation using 16 classical control-/data-flow analyses that are typical components of mining tasks and 7 million CFGs shows that our technique can achieve, on average, a 40% reduction in task computation time. Our case studies demonstrate the applicability of our technique to massive-scale source code mining tasks.
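
    A hedged sketch of the reduction step on CFGs, using networkx for brevity; relevant_kinds would come from the pre-analysis of the mining task, and the splice-out rule here is a simplification of a sound compaction, not the system's actual algorithm.

        # Sketch: remove CFG nodes the mining task never inspects, while
        # reconnecting predecessors to successors to preserve control flow.
        import networkx as nx

        def compact_cfg(cfg, relevant_kinds):
            """cfg: DiGraph whose nodes carry a 'kind' attribute."""
            reduced = cfg.copy()
            for n in list(reduced.nodes()):
                if reduced.nodes[n]["kind"] in relevant_kinds:
                    continue
                # Bypass an irrelevant node before deleting it.
                for p in list(reduced.predecessors(n)):
                    for s in list(reduced.successors(n)):
                        if p != s:
                            reduced.add_edge(p, s)
                reduced.remove_node(n)
            return reduced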