
    Numerical model of fracture growth in hydraulic re-fracturing

    Simulation of fracture propagation with the finite element method requires re-meshing to obtain accurate results, which raises the question of how to choose the direction and criterion of mesh modification. In general-purpose CAE packages we deal with a stationary mesh, and the fracture path is usually represented as a chain of elements with degraded properties. The algorithm proposed in this paper uses the ANSYS Mechanical APDL language for stepwise geometry reconstruction and mesh modification in accordance with the current configuration of the growing fracture, and provides a more accurate description of its shape. The fracture propagation process is divided into stages; each stage differs from the previous one in the fracture shape, which is updated by incrementing the crack length in the calculated direction. To check the adequacy of the model, an experiment on fracture propagation in glass specimens with an initial notch under uniaxial compression was performed, and laboratory experiments were carried out to determine the fracture toughness of rocks. The developed numerical model has been used to solve the re-fracturing problem for different stress anisotropy in the oil-bearing rock formation.
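    The core of such a staged scheme is the per-stage geometry update: after each solution, a growth criterion converts the mixed-mode stress intensity factors at the crack tip into a kink angle, and the tip is advanced by a small increment in that direction before remeshing. The Python sketch below illustrates only this geometric step; it uses the maximum tangential stress criterion, which is one common choice and not necessarily the criterion used in the paper, and it is separate from the authors' ANSYS Mechanical APDL implementation.

        import math

        def kink_angle(k1, k2):
            """Maximum tangential stress criterion: local kink angle (radians) from
            the mode-I and mode-II stress intensity factors. One common criterion;
            the paper's own choice may differ."""
            if k2 == 0.0:
                return 0.0
            r = k1 / k2
            return 2.0 * math.atan((r - math.copysign(math.sqrt(r * r + 8.0), k2)) / 4.0)

        def advance_tip(tip, direction, k1, k2, da=1e-3):
            """Advance the crack tip by one increment `da` along the kinked direction;
            in the staged model this is followed by geometry rebuild and remeshing."""
            new_dir = direction + kink_angle(k1, k2)
            x, y = tip
            return (x + da * math.cos(new_dir), y + da * math.sin(new_dir)), new_dir

        # One illustrative stage: a mostly mode-I tip with a small mode-II component.
        tip, direction = (0.0, 0.0), math.radians(90.0)
        tip, direction = advance_tip(tip, direction, k1=1.2e6, k2=0.1e6)
        print(tip, math.degrees(direction))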

    Fine-grained Benchmark Subsetting for System Selection

    System selection aims at finding the best architecture for a set of programs and workloads. It traditionally requires long-running benchmarks. We propose a method to reduce the cost of system selection. We break down benchmarks into elementary fragments of source code, called codelets. Then, we identify two causes of redundancy: first, similar codelets; second, codelets called repeatedly. The key idea is to minimize redundancy inside the benchmark suite to speed it up. For each group of similar codelets, only one representative is kept. For codelets called repeatedly and whose performance does not vary across calls, the number of invocations is reduced. Given an initial benchmark suite, our method produces a set of reduced benchmarks that can be used in place of the original one for system selection. We evaluate our method on the NAS SER benchmarks, producing a reduced benchmark suite that is 30 times faster on average than the original suite, with a maximum of 44 times. The reduced suite predicts the execution time on three target architectures with a median error between 3.9% and 8%.
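    As an illustration of the subsetting idea (not the paper's actual clustering or stability analysis), the sketch below groups codelets whose hypothetical feature vectors are close, keeps one representative per group, and caps the invocation count of codelets assumed to have stable per-call performance.

        import math

        # Hypothetical per-codelet data: a feature vector (e.g. performance counters),
        # the number of calls, and the measured time of one invocation.
        codelets = {
            "lu_loop_12":  {"features": [0.91, 0.10, 0.33], "calls": 400, "time_per_call": 0.002},
            "lu_loop_17":  {"features": [0.90, 0.12, 0.30], "calls": 380, "time_per_call": 0.002},
            "ft_fft_body": {"features": [0.20, 0.75, 0.55], "calls": 64,  "time_per_call": 0.050},
        }

        def cluster(items, threshold=0.1):
            """Greedy clustering: two codelets are 'similar' if their feature
            vectors are closer than `threshold`; one group per representative."""
            groups = []
            for name, c in items.items():
                for g in groups:
                    ref = items[g[0]]["features"]
                    if math.dist(ref, c["features"]) < threshold:
                        g.append(name)
                        break
                else:
                    groups.append([name])
            return groups

        def reduced_suite(items, stable_calls=10):
            """Keep one representative per cluster and cap the invocation count of
            codelets whose per-call performance is assumed stable across calls."""
            suite = []
            for group in cluster(items):
                rep = group[0]
                calls = min(items[rep]["calls"], stable_calls)
                suite.append((rep, calls))
            return suite

        print(reduced_suite(codelets))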

    Diogenes: Lightweight Scalable RSA Modulus Generation with a Dishonest Majority

    In this work, we design and implement the first protocol for RSA modulus construction that can support thousands of parties and offers security against an arbitrary number of corrupted parties. In a nutshell, we design the "best" protocol for this scale that is secure against passive corruption, then amplify it to obtain active security using efficient non-interactive zero-knowledge arguments. Our protocol satisfies a stronger security guarantee where a deviating party can be identified when the protocol aborts (referred to as security with identifiable abort) and allows for "public verifiability". Our passively secure protocol extends the recent work of Chen et al. that, in turn, is based on the blueprint introduced in the original Boneh-Franklin protocol (CRYPTO 1997; J. ACM, 2001). Specifically, we reduce the task of sampling a modulus to secure distributed multiplication, which we implement via an efficient threshold additively homomorphic encryption (AHE) scheme based on the Ring-LWE assumption. This results in a protocol where the amortized per-party communication cost grows logarithmically in the number of parties. In order to keep the parties lightweight, we employ an "untrusted" coordinator that is connected to all parties and performs all public and broadcast operations. We amplify this protocol to obtain active security (with identifiable abort) by attaching zero-knowledge proofs. We instantiate our ZK proof system by composing two different types of ZK proof systems: (1) the Ligero sub-linear zero-knowledge proof system (Ames et al., CCS 2017), and (2) a Sigma-protocol for proving knowledge of a discrete logarithm in groups of unknown order (Shoup, Eurocrypt 2000). We implemented both the passive and the active variants of our protocol and ran experiments using 2 to 4,000 parties. This is the first implementation of any MPC protocol that can scale to more than 1,000 parties. For generating a 2048-bit modulus among 1,000 parties, our passive protocol executed in under 4 minutes and the active variant ran in 22 minutes.
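    For intuition only, the sketch below mirrors the arithmetic skeleton of the Boneh-Franklin blueprint: each party contributes an additive share of a prime candidate, the two candidates are multiplied into N, and the result is kept only if both factors pass a primality test. Everything here runs in the clear with toy bit-lengths, so it has none of the protocol's secrecy, threshold AHE, distributed biprimality testing, or identifiable abort; it only shows which quantities the secure computation operates on.

        from random import randrange

        def share_candidate(n_parties, bits=16):
            """Each party contributes an additive share; the sum plays the role of a
            prime candidate p = p_1 + ... + p_n (toy sizes, no secrecy)."""
            return [randrange(1 << (bits - 1), 1 << bits) for _ in range(n_parties)]

        def is_probable_prime(n, rounds=20):
            """Miller-Rabin, standing in for the protocol's distributed biprimality test."""
            if n < 2 or n % 2 == 0:
                return n == 2
            d, r = n - 1, 0
            while d % 2 == 0:
                d //= 2
                r += 1
            for _ in range(rounds):
                a = randrange(2, n - 1)
                x = pow(a, d, n)
                if x in (1, n - 1):
                    continue
                for _ in range(r - 1):
                    x = pow(x, 2, n)
                    if x == n - 1:
                        break
                else:
                    return False
            return True

        def sample_modulus(n_parties=4):
            """Rejection sampling: combine shares into p and q, multiply, and retry
            until both factors are prime (the real protocol multiplies the shares
            obliviously and tests N without ever revealing p or q)."""
            while True:
                p = sum(share_candidate(n_parties))
                q = sum(share_candidate(n_parties))
                if is_probable_prime(p) and is_probable_prime(q):
                    return p * q, p, q

        N, p, q = sample_modulus()
        print(N, p, q)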

    Milepost GCC: Machine Learning Enabled Self-tuning Compiler

    Tuning compiler optimizations for rapidly evolving hardware makes porting and extending an optimizing compiler for each new platform extremely challenging. Iterative optimization is a popular approach to adapting programs to a new architecture automatically using feedback-directed compilation. However, the large number of evaluations required for each program has prevented iterative compilation from widespread adoption in production compilers. Machine learning has been proposed to tune optimizations across programs systematically, but it is currently limited to a few transformations, requires long training phases, and critically lacks publicly released, stable tools. Our approach is to develop a modular, extensible, self-tuning optimization infrastructure that automatically learns the best optimizations across multiple programs and architectures based on the correlation between program features, run-time behavior and optimizations. In this paper we describe Milepost GCC, the first publicly available open-source machine-learning-based compiler. It consists of an Interactive Compilation Interface (ICI) and plugins to extract program features and exchange optimization data with the cTuning.org open public repository. It automatically adapts the internal optimization heuristic at function-level granularity to improve execution time, code size and compilation time of a new program on a given architecture. Part of the MILEPOST technology, together with the low-level ICI-inspired plugin framework, is now included in mainline GCC. We developed machine learning plugins based on probabilistic and transductive approaches to predict good combinations of optimizations. Our preliminary experimental results show that it is possible to automatically reduce the execution time of individual MiBench programs, some by more than a factor of 2, while also improving compilation time and code size. On average we are able to reduce the execution time of the MiBench benchmark suite by 11% for the ARC reconfigurable processor. We also present a realistic multi-objective optimization scenario for the Berkeley DB library using Milepost GCC, improving execution time by approximately 17% while reducing compilation time.
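    A minimal way to picture the prediction step is a classifier that maps a function's static feature vector to the flag combination that worked best for similar functions seen during training. The sketch below uses a trivial nearest-neighbour model and invented features and flag sets, standing in for Milepost's probabilistic and transductive models.

        import math

        # Hypothetical training data: static feature vectors of functions (e.g. basic
        # block counts, instruction mix ratios) paired with the flag combination that
        # performed best for them during iterative search.
        training = [
            ([0.80, 0.10, 0.45], "-O3 -funroll-loops"),
            ([0.20, 0.70, 0.10], "-O2 -fno-tree-vectorize"),
            ([0.75, 0.15, 0.50], "-O3 -funroll-loops -ffast-math"),
        ]

        def predict_flags(features, k=1):
            """Nearest-neighbour stand-in for Milepost's learned models: return the
            flag set of the most similar previously seen function."""
            ranked = sorted(training, key=lambda t: math.dist(t[0], features))
            return ranked[0][1] if k == 1 else [f for _, f in ranked[:k]]

        # A new function whose features resemble the first training point.
        print(predict_flags([0.78, 0.12, 0.47]))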

    A holistic approach to predict effective compiler optimizations using machine learning

    Effective compiler optimizations can greatly improve application performance. These optimizations are numerous and can be applied in any order. Compilers select them using heuristic-driven solutions that may degrade program performance, so developers resort to a tedious manual search for the best optimizations. The combinatorial search space makes this effort intractable, and one can easily fall into a local minimum and miss the best combination. This thesis develops a holistic approach to improving application performance with compiler optimizations and machine learning. A combination of static loop analysis and statistical learning is used to analyze a large corpus of loops and reveals good potential for compiler optimizations. Milepost GCC, a machine-learning-based compiler, is applied to optimize benchmarks and an industrial database application. It uses function-level static features and classification algorithms to predict a good sequence of optimizations. While Milepost GCC can mispredict the best optimizations, in general it obtains considerable speedups and outperforms state-of-the-art compiler heuristics. The culmination of this thesis is the ULM meta-optimization framework. ULM characterizes applications at different levels with static code features and hardware performance counters and finds the most important combination of program features. By selecting among three classification algorithms and tuning their parameters, ULM builds a sophisticated predictor that outperforms existing solutions. As a result, the ULM framework correctly predicted the best sequence of optimizations in 92% of the studied cases.
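    The meta-optimization idea can be sketched as model selection by cross-validation: several candidate predictors are put in competition on the same training data, and the one with the best held-out accuracy is kept. The toy example below compares two parameterizations of a k-nearest-neighbour classifier on invented data; ULM's actual feature sets, classifiers and tuning procedure are more elaborate.

        import math
        from statistics import mode

        # Toy dataset: (feature vector, best optimization sequence) pairs; in ULM the
        # features mix static code properties and hardware performance counters.
        data = [
            ([0.9, 0.1], "seqA"), ([0.8, 0.2], "seqA"),
            ([0.2, 0.9], "seqB"), ([0.1, 0.8], "seqB"),
            ([0.5, 0.5], "seqA"),
        ]

        def knn(train, x, k):
            """k-nearest-neighbour classifier used as one of the competing models."""
            nearest = sorted(train, key=lambda t: math.dist(t[0], x))[:k]
            return mode(label for _, label in nearest)

        def loo_accuracy(k):
            """Leave-one-out accuracy of the k-NN model on the toy dataset."""
            hits = 0
            for i, (x, y) in enumerate(data):
                train = data[:i] + data[i + 1:]
                hits += knn(train, x, k) == y
            return hits / len(data)

        # Put two parameterizations in competition and keep the better one,
        # mimicking ULM's selection among candidate classifiers.
        best_k = max((1, 3), key=loo_accuracy)
        print(best_k, loo_accuracy(best_k))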

    Is Source-code Isolation Viable for Performance Characterization?

    Source-code isolation finds and extracts the hotspots of an application as independent, isolated fragments of code, called codelets. Codelets can be modified, compiled, run, and measured independently from the original application. Source-code isolation reduces benchmarking cost and allows piece-wise optimization of an application. It is faster than whole-program benchmarking and optimization since the user can concentrate only on the bottlenecks. This paper examines the viability of using isolated codelets in place of the original application for performance characterization and optimization. On the NAS benchmarks, we show that codelets capture 92.3% of the original execution time. We present a set of techniques for keeping codelets as faithful as possible to the original hotspots: 63.6% of the codelets have the same assembly as the original hotspots, and 81.6% have the same run-time performance as the original hotspots.
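    A simplified version of the fidelity question can be phrased as a runtime comparison: a codelet is considered faithful if its standalone run time stays within a small tolerance of the original hotspot's run time (the paper also compares generated assembly, which is not modelled here). The sketch below uses a toy kernel and a hypothetical tolerance.

        import timeit

        def hotspot(a, b):
            """Original in-application loop nest (toy stand-in)."""
            return [x * y + 1.0 for x, y in zip(a, b)]

        def codelet(a, b):
            """The same loop extracted as an isolated codelet with captured inputs."""
            return [x * y + 1.0 for x, y in zip(a, b)]

        def is_faithful(orig_fn, iso_fn, args, tolerance=0.05, repeats=5):
            """Accept the codelet if its standalone run time stays within `tolerance`
            of the original hotspot's run time (a simplified fidelity criterion)."""
            t_orig = min(timeit.repeat(lambda: orig_fn(*args), number=100, repeat=repeats))
            t_iso = min(timeit.repeat(lambda: iso_fn(*args), number=100, repeat=repeats))
            return abs(t_iso - t_orig) / t_orig <= tolerance

        a = [float(i) for i in range(10_000)]
        b = [float(i) for i in range(10_000)]
        print(is_faithful(hotspot, codelet, (a, b)))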

    Evaluating Architecture and Compiler Design through Static Loop Analysis

    Using the MAQAO loop static analyzer, we characterize a corpus of binary loops extracted from common benchmark suites such as SPEC and NAS, and from several industrial applications. For each loop, MAQAO extracts low-level assembly features such as the integer and floating-point vectorization ratio, the number of registers used and of spill-fill instructions, and the number of concurrent memory streams accessed. The distributions of these features over a large representative code corpus can be used to evaluate compilers and architectures and to tune them for the most frequently used assembly patterns. In this paper, we present the MAQAO loop analyzer and a characterization of 4857 binary loops. We evaluate register allocation and vectorization on two compilers and propose a method to tune loop buffer size and the stream prefetcher based on static analysis of benchmarks.
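    The kind of corpus-level summary described above can be sketched as simple aggregation over per-loop feature records: for each feature, compute a distribution or summary statistic across all loops. The example below uses invented records and features loosely modelled on the ones listed in the abstract.

        from collections import Counter
        from statistics import median

        # Hypothetical per-loop records, mimicking the low-level features MAQAO extracts.
        loops = [
            {"vec_ratio": 1.00, "spills": 0, "mem_streams": 2},
            {"vec_ratio": 0.00, "spills": 4, "mem_streams": 7},
            {"vec_ratio": 0.50, "spills": 1, "mem_streams": 3},
            {"vec_ratio": 0.00, "spills": 0, "mem_streams": 1},
        ]

        def corpus_summary(records):
            """Aggregate feature distributions across the corpus; such summaries can
            drive decisions like loop-buffer sizing or prefetcher stream count."""
            return {
                "fully_vectorized": sum(r["vec_ratio"] == 1.0 for r in records) / len(records),
                "median_spills": median(r["spills"] for r in records),
                "stream_histogram": Counter(r["mem_streams"] for r in records),
            }

        print(corpus_summary(loops))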

    Transforming GCC into a research-friendly environment: plugins for optimization tuning and reordering, function cloning and program instrumentation

    Computer scientists are always eager to have a powerful, robust and stable compiler infrastructure. However, until recently, researchers had to either use available and often unstable research compilers, create new ones from scratch, hack open-source non-research compilers, or use source-to-source tools. This often requires duplicating a large amount of functionality available in current production compilers while making the practicality of the obtained research results questionable. The Interactive Compilation Interface (ICI) has been introduced to avoid such time-consuming replication and to transform popular production compilers such as GCC into research toolsets, by providing the ability to access, modify and extend GCC's internal functionality through a compiler-dependent hook and a clear compiler-independent API with external portable plugins, without interrupting the natural evolution of the compiler. In this paper, we describe our recent extensions to GCC and ICI, with preliminary experimental data, to support selection and reordering of optimization passes with a dependency grammar, control of individual transformations and their parameters, generic function cloning and program instrumentation. We are synchronizing these developments, implemented during the Google Summer of Code'09 program, with mainline GCC 4.5 and its native low-level plugin system. These extensions are intended to enable and popularize the use of GCC for realistic research on empirical iterative feedback-directed compilation, statistical collective optimization, run-time adaptation and the development of intelligent self-tuning computing systems, among other important topics. Such a research infrastructure should help researchers prototype and validate their ideas quickly in realistic production environments while keeping their research plugins portable across different releases of the compiler. Moreover, it should allow successful ideas to be moved back to GCC much faster, thus helping to improve, modularize and clean it up. Furthermore, we are porting GCC with ICI extensions for performance/power auto-tuning in data centers and cloud computing systems with heterogeneous architectures, and for continuous whole-system optimization.
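    One of the research directions these plugins enable, iterative feedback-directed compilation, can be sketched as a driver that builds several variants of the same program and keeps the fastest. The example below varies plain command-line flags rather than ICI pass orders, and it assumes a hypothetical self-contained kernel.c and a gcc binary available on the PATH.

        import subprocess
        import time

        CANDIDATE_FLAGS = [
            ["-O2"],
            ["-O3"],
            ["-O3", "-funroll-loops"],
            ["-O3", "-fno-tree-vectorize"],
        ]

        def run_variant(flags, source="kernel.c", exe="./a.out"):
            """Compile one variant and time its execution; with ICI, `flags` could
            instead describe a pass selection or ordering handed to the plugin."""
            subprocess.run(["gcc", *flags, source, "-o", exe], check=True)
            start = time.perf_counter()
            subprocess.run([exe], check=True)
            return time.perf_counter() - start

        def iterative_search():
            """Minimal feedback-directed loop: try each candidate, keep the fastest."""
            timings = {tuple(f): run_variant(f) for f in CANDIDATE_FLAGS}
            return min(timings, key=timings.get)

        if __name__ == "__main__":
            print("best flags:", " ".join(iterative_search()))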