16 research outputs found

    Establishing a base of trust with performance counters for enterprise workloads

    Understanding the performance of large, complex enterprise-class applications is an important, yet nontrivial task. Methods using hardware performance counters, such as profiling through event-based sampling, are often favored over instrumentation for analyzing such large codes, but rarely provide good accuracy at the instruction level. This work evaluates the accuracy of multiple event-based sampling techniques and quantifies the impact of a range of improvements suggested in recent years. The evaluation is performed on instances of three modern CPU architectures, using designated kernels and full applications. We conclude that precisely distributed events considerably improve accuracy, with further improvements possible when using Last Branch Records. We also present practical recommendations for hardware architects, tool developers and performance engineers, aimed at improving the quality of results.

    Software-Based Techniques for Protecting Return Addresses

    Protecting computing systems against cyberattacks should be put high on the agenda. For example, Colonial Pipeline, an American oil pipeline system, suffered a cyberattack that impacted its computerized equipment managing the pipeline, leading to a state of emergency declared by President Joe Biden in May 2021. As reported by Microsoft Security Response Center, attackers are unanimously corrupting the stack and most Control Flow Guard (CFG) improvements will provide little value-add until stack protection loads. Shadow stacks play an important role in protecting backward edges (return addresses on the call stack) to mitigate Return-Oriented Programming (ROP) attacks. Control-Flow Integrity (CFI) techniques often focus on protecting forward edges (indirect calls via function pointers and virtual calls) and assume that backward edges are protected by shadow stacks. However, the cruel reality is that shadow stacks are still not widely deployed due to compatibility, performance or security deficiencies. In this thesis, we propose three novel techniques for protecting return addresses. First, by adding one level of indirection, we introduce BarRA, the first shadow stack mechanism that applies continuous runtime re-randomization to abstract return addresses for protecting their corresponding concrete return addresses (also protected by CFI) for single-threaded programs, thus avoiding expensive pointer tracking. As a nice side-effect, BarRA naturally combines the shadow stack, CFI and runtime re-randomization in the same framework. Second, without reserving any dedicated register, we propose a novel thread-local storage mechanism, STK-TLS, that is both efficient and free of compatibility issues. We also present a new microsecond-level runtime re-randomization technique (without relying on information hiding or MMU), STK-MSR, to mitigate information disclosure attacks and protect the shadow stack with 64-bit entropy. Based on STK-TLS and STK-MSR, we have implemented a novel stack layout (referred to as Bustk) that is highly performant, compatible with existing code, and provides meaningful security for single- and multi-threaded server programs. Third, by fast-moving safe regions in the large 47-bit user space (based on MMU), we design a practical shadow stack, FlashStack, for protecting return addresses in single- and multi-threaded programs (including browsers) running under 64-bit Linux on x86-64. FlashStack introduces a novel lightweight instrumentation mechanism, a continuous shuffling scheme for the shadow stack in user space, and a new dual-prologue approach for a protected function to mitigate the TOCTTOU attacks (constructed by Microsoft's red team), information disclosure attacks, and crash-resistant probing attacks.

    Modelling and computational improvements to the simulation of single vector-boson plus jet processes for the ATLAS experiment

    This paper presents updated Monte Carlo configurations used to model the production of single electroweak vector bosons (W, Z/γ∗) in association with jets in proton-proton collisions for the ATLAS experiment at the Large Hadron Collider. Improvements pertaining to the electroweak input scheme, parton-shower splitting kernels and scale-setting scheme are shown for multi-jet merged configurations accurate to next-to-leading order in the strong and electroweak couplings. The computational resources required for these set-ups are assessed, and approximations are introduced resulting in a factor three reduction of the per-event CPU time without affecting the physics modelling performance. Continuous statistical enhancement techniques are introduced by ATLAS in order to populate low cross-section regions of phase space and are shown to match or exceed the generated effective luminosity. This, together with the lower per-event CPU time, results in a 50% reduction in the required computing resources compared to a legacy set-up previously used by the ATLAS collaboration. The set-ups described in this paper will be used for future ATLAS analyses and lay the foundation for the next generation of Monte Carlo predictions for single vector-boson plus jets production.

    Modelling and computational improvements to the simulation of single vector-boson plus jet processes for the ATLAS experiment

    This paper presents updated Monte Carlo configurations used to model the production of single electroweak vector bosons (W, Z/γ∗) in association with jets in proton-proton collisions for the ATLAS experiment at the Large Hadron Collider. Improvements pertaining to the electroweak input scheme, parton-shower splitting kernels and scale-setting scheme are shown for multi-jet merged configurations accurate to next-to-leading order in the strong and electroweak couplings. The computational resources required for these set-ups are assessed, and approximations are introduced resulting in a factor three reduction of the per-event CPU time without affecting the physics modelling performance. Continuous statistical enhancement techniques are introduced by ATLAS in order to populate low cross-section regions of phase space and are shown to match or exceed the generated effective luminosity. This, together with the lower per-event CPU time, results in a 50% reduction in the required computing resources compared to a legacy set-up previously used by the ATLAS collaboration. The set-ups described in this paper will be used for future ATLAS analyses and lay the foundation for the next generation of Monte Carlo predictions for single vector-boson plus jets production

    Modelling and computational improvements to the simulation of single vector-boson plus jet processes for the ATLAS experiment

    This paper presents updated Monte Carlo configurations used to model the production of single electroweak vector bosons (W, Z/γ∗) in association with jets in proton-proton collisions for the ATLAS experiment at the Large Hadron Collider. Improvements pertaining to the electroweak input scheme, parton-shower splitting kernels and scale-setting scheme are shown for multi-jet merged configurations accurate to next-to-leading order in the strong and electroweak couplings. The computational resources required for these set-ups are assessed, and approximations are introduced resulting in a factor three reduction of the per-event CPU time without affecting the physics modelling performance. Continuous statistical enhancement techniques are introduced by ATLAS in order to populate low cross-section regions of phase space and are shown to match or exceed the generated effective luminosity. This, together with the lower per-event CPU time, results in a 50% reduction in the required computing resources compared to a legacy set-up previously used by the ATLAS collaboration. The set-ups described in this paper will be used for future ATLAS analyses and lay the foundation for the next generation of Monte Carlo predictions for single vector-boson plus jets production

    Modelling and computational improvements to the simulation of single vector-boson plus jet processes for the ATLAS experiment

    This paper presents updated Monte Carlo configurations used to model the production of single electroweak vector bosons (W, Z/gamma*) in association with jets in proton-proton collisions for the ATLAS experiment at the Large Hadron Collider. Improvements pertaining to the electroweak input scheme, parton-shower splitting kernels and scale-setting scheme are shown for multi-jet merged configurations accurate to next-to-leading order in the strong and electroweak couplings. The computational resources required for these set-ups are assessed, and approximations are introduced resulting in a factor three reduction of the per-event CPU time without affecting the physics modelling performance. Continuous statistical enhancement techniques are introduced by ATLAS in order to populate low cross-section regions of phase space and are shown to match or exceed the generated effective luminosity. This, together with the lower per-event CPU time, results in a 50% reduction in the required computing resources compared to a legacy set-up previously used by the ATLAS collaboration. 
The set-ups described in this paper will be used for future ATLAS analyses and lay the foundation for the next generation of Monte Carlo predictions for single vector-boson plus jets production.
    Funding: ANPCyT, Argentina; YerPhI, Armenia; ARC, Australia; BMWFW and FWF, Austria; ANAS, Azerbaijan; SSTC, Belarus; CNPq and FAPESP, Brazil; NSERC, NRC and CFI, Canada; CERN; ANID, Chile; CAS, MOST and NSFC, China; Minciencias, Colombia; MSMT CR, MPO CR and VSC CR, Czech Republic; DNRF and DNSRC, Denmark; IN2P3-CNRS and CEA-DRF/IRFU, France; SRNSFG, Georgia; BMBF, HGF and MPG, Germany; GSRI, Greece; RGC and Hong Kong SAR, China; ISF and Benoziyo Center, Israel; INFN, Italy; MEXT and JSPS, Japan; CNRST, Morocco; NWO, The Netherlands; RCN, Norway; MEiN, Poland; FCT, Portugal; MNE/IFA, Romania; JINR; MES of Russia and NRC KI, Russian Federation; MESTD, Serbia; MSSR, Slovakia; ARRS and MIZŠ, Slovenia; DSI/NRF, South Africa; MICINN, Spain; SRC and Wallenberg Foundation, Sweden; SERI, SNSF and Cantons of Bern and Geneva, Switzerland; MOST, Taiwan; TAEK, Turkey; STFC, U.K.; DOE and NSF, U.S.A.; BCKDF, CANARIE, Compute Canada and CRC, Canada; COST, ERC, ERDF, Horizon 2020 and Marie Skłodowska-Curie Actions, European Union; Investissements d’Avenir Labex, Investissements d’Avenir Idex and ANR, France; DFG and AvH Foundation, Germany; Herakleitos, Thales and Aristeia programmes co-financed by EU-ESF and the Greek NSRF, Greece; BSF-NSF and GIF, Israel; Norwegian Financial Mechanism 2014–2021, Norway; NCN and NAWA, Poland; La Caixa Banking Foundation, CERCA Programme Generalitat de Catalunya and PROMETEO and GenT Programmes Generalitat Valenciana, Spain; Göran Gustafssons Stiftelse, Sweden; The Royal Society and Leverhulme Trust, U.K.

    Modelling and computational improvements to the simulation of single vector-boson plus jet processes for the ATLAS experiment

    This paper presents updated Monte Carlo configurations used to model the production of single electroweak vector bosons (W, Z/γ∗) in association with jets in proton-proton collisions for the ATLAS experiment at the Large Hadron Collider. Improvements pertaining to the electroweak input scheme, parton-shower splitting kernels and scale-setting scheme are shown for multi-jet merged configurations accurate to next-to-leading order in the strong and electroweak couplings. The computational resources required for these set-ups are assessed, and approximations are introduced resulting in a factor three reduction of the per-event CPU time without affecting the physics modelling performance. Continuous statistical enhancement techniques are introduced by ATLAS in order to populate low cross-section regions of phase space and are shown to match or exceed the generated effective luminosity. This, together with the lower per-event CPU time, results in a 50% reduction in the required computing resources compared to a legacy set-up previously used by the ATLAS collaboration. The set-ups described in this paper will be used for future ATLAS analyses and lay the foundation for the next generation of Monte Carlo predictions for single vector-boson plus jets production

    Modelling and computational improvements to the simulation of single vector-boson plus jet processes for the ATLAS experiment

    This paper presents updated Monte Carlo configurations used to model the production of single electroweak vector bosons (W, Z/γ∗) in association with jets in proton-proton collisions for the ATLAS experiment at the Large Hadron Collider. Improvements pertaining to the electroweak input scheme, parton-shower splitting kernels and scale-setting scheme are shown for multi-jet merged configurations accurate to next-to-leading order in the strong and electroweak couplings. The computational resources required for these set-ups are assessed, and approximations are introduced resulting in a factor three reduction of the per-event CPU time without affecting the physics modelling performance. Continuous statistical enhancement techniques are introduced by ATLAS in order to populate low cross-section regions of phase space and are shown to match or exceed the generated effective luminosity. This, together with the lower per-event CPU time, results in a 50% reduction in the required computing resources compared to a legacy set-up previously used by the ATLAS collaboration. The set-ups described in this paper will be used for future ATLAS analyses and lay the foundation for the next generation of Monte Carlo predictions for single vector-boson plus jets production

    Modelling and computational improvements to the simulation of single vector-boson plus jet processes for the ATLAS experiment

    This paper presents updated Monte Carlo configurations used to model the production of single electroweak vector bosons (W, Z/gamma*) in association with jets in proton-proton collisions for the ATLAS experiment at the Large Hadron Collider. Improvements pertaining to the electroweak input scheme, parton-shower splitting kernels and scale-setting scheme are shown for multi-jet merged configurations accurate to next-to-leading order in the strong and electroweak couplings. The computational resources required for these set-ups are assessed, and approximations are introduced resulting in a factor three reduction of the per-event CPU time without affecting the physics modelling performance. Continuous statistical enhancement techniques are introduced by ATLAS in order to populate low cross-section regions of phase space and are shown to match or exceed the generated effective luminosity. This, together with the lower per-event CPU time, results in a 50% reduction in the required computing resources compared to a legacy set-up previously used by the ATLAS collaboration. The set-ups described in this paper will be used for future ATLAS analyses and lay the foundation for the next generation of Monte Carlo predictions for single vector-boson plus jets production

    Modelling and computational improvements to the simulation of single vector-boson plus jet processes for the ATLAS experiment

    This paper presents updated Monte Carlo configurations used to model the production of single electroweak vector bosons (W, Z/gamma*) in association with jets in proton-proton collisions for the ATLAS experiment at the Large Hadron Collider. Improvements pertaining to the electroweak input scheme, parton-shower splitting kernels and scale-setting scheme are shown for multi-jet merged configurations accurate to next-to-leading order in the strong and electroweak couplings. The computational resources required for these set-ups are assessed, and approximations are introduced resulting in a factor three reduction of the per-event CPU time without affecting the physics modelling performance. Continuous statistical enhancement techniques are introduced by ATLAS in order to populate low cross-section regions of phase space and are shown to match or exceed the generated effective luminosity. This, together with the lower per-event CPU time, results in a 50% reduction in the required computing resources compared to a legacy set-up previously used by the ATLAS collaboration. The set-ups described in this paper will be used for future ATLAS analyses and lay the foundation for the next generation of Monte Carlo predictions for single vector-boson plus jets production