
    CAPSULE: Hardware-Assisted Parallel Execution of Component-Based Programs

    Since processor performance scalability will now mostly be achieved through thread-level parallelism, there is a strong incentive to parallelize a broad range of applications, including those with complex control flow and data structures. Yet writing parallel programs is a notoriously difficult task. Beyond raw processor performance, the architect can help by facilitating the task of the programmer, especially by simplifying the model exposed to the programmer. In this article, among the many issues associated with writing parallel programs, we focus on finding the appropriate parallelism granularity and on efficiently mapping tasks with complex control and data flow to threads. We propose to relieve the user and compiler of both tasks by delegating the parallelization decision to the architecture at run time, through a combination of hardware and software support and a tight dialogue between the two. For the software support, we leverage an increasingly popular approach in software engineering, called component-based programming; the component contract assumes tight encapsulation of code and data for easy manipulation. Previous research has shown that it is possible to augment components with the ability to split/spawn, providing a simple and fitting approach for programming parallel applications with complex control and data structures. However, such environments still require the programmer to determine the appropriate granularity of parallelism, and spawning incurs significant overheads due to software run-time system management. We therefore provide an environment with the ability to spawn conditionally depending on available hardware resources, and we delegate spawning decisions and actions to the architecture. This conditional spawning is implemented through frequent hardware resource probing by the program, which in turn enables rapid adaptation to varying workload conditions, data sets and hardware resources. Furthermore, thanks to appropriate combined hardware and compiler support, the probing has no significant overhead on program performance. We demonstrate this approach on an 8-context SMT, several non-trivial algorithms and re-engineered SPEC CINT2000 benchmarks, written using component syntax processed by our toolchain. We achieve speedups ranging from 1.1 to 3.0 on our test suite.
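
    A minimal sketch of the conditional-spawning idea follows. This is not the authors' CAPSULE syntax or toolchain output: the probe is emulated in software with an atomic context counter, and the names (probe_free_context, sum_component) are illustrative assumptions; in CAPSULE the probe is answered by the hardware itself at negligible cost.

        #include <atomic>
        #include <cstddef>
        #include <thread>
        #include <vector>

        // Software stand-in for a hardware probe: tracks how many SMT
        // contexts remain free (8-context SMT, one kept by the caller).
        static std::atomic<int> free_contexts{7};

        static bool probe_free_context() {
            int n = free_contexts.load();
            while (n > 0)
                if (free_contexts.compare_exchange_weak(n, n - 1)) return true;
            return false; // no context available right now
        }

        // A divisible "component": sums a range, splitting only when the
        // probe reports a free hardware context at this very moment.
        static long sum_component(const int* data, std::size_t lo, std::size_t hi) {
            if (hi - lo < 1024) {                    // granularity floor
                long s = 0;
                for (std::size_t i = lo; i < hi; ++i) s += data[i];
                return s;
            }
            std::size_t mid = lo + (hi - lo) / 2;
            if (probe_free_context()) {              // run-time spawn decision
                long right = 0;
                std::thread t([&] { right = sum_component(data, mid, hi); });
                long left = sum_component(data, lo, mid);
                t.join();
                free_contexts.fetch_add(1);          // release the context
                return left + right;
            }
            // Probe failed: recurse sequentially, no spawning overhead.
            return sum_component(data, lo, mid) + sum_component(data, mid, hi);
        }

        int main() {
            std::vector<int> v(1 << 20, 1);
            return sum_component(v.data(), 0, v.size()) == (1 << 20) ? 0 : 1;
        }

    Because the probe is re-evaluated at every division point, the effective granularity adapts to the load: when all contexts are busy, the recursion degenerates into ordinary sequential code.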

    Transdimensional inference of archeomagnetic intensity change

    One of the main goals of archeomagnetism is to document the secular changes of Earth's magnetic field by laboratory analysis of the magnetization carried by archeological artefacts. Typical techniques for creating a time-dependent model assume a prescribed temporal discretisation which, when coupled with sparse data coverage, requires strong regularisation, generally applied over the entire time series, to ensure smoothness. Such techniques make it difficult to characterise uncertainty and frequency content, and to robustly detect rapid changes. Key to proper modelling (and physical understanding) is a method that places a minimum level of regularisation on any fit to the data. Here we apply a transdimensional Bayesian technique based on piecewise linear interpolation to sparse archeointensity datasets, in which the temporal complexity of the model is not set a priori but is self-selected by the data. The method produces two key outputs: (i) a posterior distribution of intensity as a function of time, a useful tool for archeomagnetic dating, whose statistics are smooth but formally unregularised; (ii) by including the data ages among the unknown model parameters, the method also produces posterior age statistics for each individual contributing datum. We test the technique using synthetic datasets and confirm agreement of our method with an integrated likelihood approach. We then apply the method to three archeomagnetic datasets, all reduced to a single location: one temporally well sampled within 700 km of Paris (here referred to as Paris700), one that is temporally sparse centred on Hawaii, and a third (from Lübeck, Germany and Paris700) that has additional ordering constraints on age from stratification. Compared with other methods, our average posterior distributions largely agree; however, our credible intervals appear to better reflect the uncertainty during periods of sparse data coverage. Because each ensemble member of the posterior distribution is piecewise linear, we only fit oscillations when required by the data. As an example, we show that an oscillatory signal, associated with temporally localised intensity maxima reported for a sparse Hawaiian dataset, is not required by the data. However, we do recover the previously reported oscillation of period 260 years for the Paris700 dataset and compute the probability distribution of the period of oscillation. We further demonstrate that such an oscillation is unresolved when age uncertainty is instead accounted for by fixing the ages and artificially inflating the error budget on intensity.
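
    In schematic terms (our notation, not necessarily the paper's), the model is a piecewise-linear curve whose number of vertices k is itself a sampled parameter, typically explored with reversible-jump MCMC:

        % Piecewise-linear intensity between vertices (\tau_j, f_j), j = 1..k
        F(t) = f_j + \frac{f_{j+1} - f_j}{\tau_{j+1} - \tau_j}\,(t - \tau_j),
               \qquad \tau_j \le t < \tau_{j+1}

        % Joint posterior over dimension, vertices and (uncertain) data ages
        p(k, \tau_{1:k}, f_{1:k}, a_{1:N} \mid d_{1:N}) \;\propto\;
            \prod_{i=1}^{N} \mathcal{N}\!\big(d_i \mid F(a_i), \sigma_i^2\big)\,
            p(a_i) \;\times\; p(\tau_{1:k}, f_{1:k} \mid k)\, p(k)

    Birth, death and move proposals change k, so the data alone decide how many linear segments (and hence how much temporal complexity) the ensemble carries; no global smoothing parameter needs to be tuned.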

    Vaccine breakthrough hypoxemic COVID-19 pneumonia in patients with auto-Abs neutralizing type I IFNs

    Life-threatening 'breakthrough' cases of critical COVID-19 are attributed to poor or waning antibody response to the SARS-CoV-2 vaccine in individuals already at risk. Pre-existing autoantibodies (auto-Abs) neutralizing type I IFNs underlie at least 15% of critical COVID-19 pneumonia cases in unvaccinated individuals; however, their contribution to hypoxemic breakthrough cases in vaccinated people remains unknown. Here, we studied a cohort of 48 individuals (aged 20-86 years) who received two doses of an mRNA vaccine and developed a breakthrough infection with hypoxemic COVID-19 pneumonia 2 weeks to 4 months later. Antibody levels to the vaccine, neutralization of the virus, and auto-Abs to type I IFNs were measured in the plasma. Forty-two individuals had no known deficiency of B cell immunity and a normal antibody response to the vaccine. Among them, ten (24%) had auto-Abs neutralizing type I IFNs (aged 43-86 years). Eight of these ten patients had auto-Abs neutralizing both IFN-α2 and IFN-ω, while two neutralized IFN-ω only. No patient neutralized IFN-β. Seven neutralized type I IFNs at 10 ng/mL, and three only at 100 pg/mL. Seven patients neutralized SARS-CoV-2 D614G and the Delta variant (B.1.617.2) efficiently, while one patient neutralized Delta slightly less efficiently. Two of the three patients neutralizing only 100 pg/mL of type I IFNs neutralized both D614G and Delta less efficiently. Despite two mRNA vaccine inoculations and the presence of circulating antibodies capable of neutralizing SARS-CoV-2, auto-Abs neutralizing type I IFNs may underlie a significant proportion of hypoxemic COVID-19 pneumonia cases, highlighting the importance of this particularly vulnerable population.

    AP+SOMT: Agent-Programming Combined with Self-Organized Multi-Threading

    In order to scale up processors beyond ILP, we explore the exploitation of coarser-grain parallelism. We advocate that a slightly different programming approach, called agent programming (AP), can unveil a large amount of parallelism, considerably simplify the task of optimizing compilers, and empower the architecture with the ability to exploit potential parallelism based on available resources. We show that an SMT, augmented with dynamic steering strategies and thread-swapping features, is an appropriate solution for such self-organized architectures; this self-organized SMT is called SOMT. Using a set of specially written agent-like programs corresponding to classic algorithms, we show that AP+SOMT exhibits better performance, stability and scalability for a large array of data sets, and makes compiler optimizations less necessary. Finally, we outline how the approach can be progressively adopted as a combination of a hardware add-on and C language extensions, much like multimedia support in current superscalar processors.
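
    The following sketch illustrates the division of labour the abstract describes, under stated assumptions: the Agent contract (can_split/split/run) and the scheduling policy are ours, standing in for the paper's agent programming model and for decisions SOMT would take in hardware. Each "context" is a plain thread that splits an agent only while other contexts appear starved of work:

        #include <atomic>
        #include <cstddef>
        #include <deque>
        #include <memory>
        #include <mutex>
        #include <thread>
        #include <utility>
        #include <vector>

        // Hypothetical agent contract: encapsulated work that may split itself.
        struct Agent {
            virtual ~Agent() = default;
            virtual bool can_split() const = 0;
            virtual std::pair<std::unique_ptr<Agent>,
                              std::unique_ptr<Agent>> split() = 0;
            virtual void run() = 0;              // sequential fallback
        };

        void somt_like_run(std::unique_ptr<Agent> root, unsigned contexts) {
            std::deque<std::unique_ptr<Agent>> queue;
            std::mutex m;
            std::atomic<std::size_t> live{1};    // agents not yet finished
            queue.push_back(std::move(root));
            auto worker = [&] {
                while (live.load() > 0) {
                    std::unique_ptr<Agent> a;
                    bool starved = false;
                    {
                        std::lock_guard<std::mutex> lk(m);
                        if (!queue.empty()) {
                            a = std::move(queue.front());
                            queue.pop_front();
                            starved = queue.size() < contexts;
                        }
                    }
                    if (!a) { std::this_thread::yield(); continue; }
                    if (starved && a->can_split()) {   // steer work to idle contexts
                        auto halves = a->split();
                        live.fetch_add(1);             // one agent became two
                        std::lock_guard<std::mutex> lk(m);
                        queue.push_back(std::move(halves.first));
                        queue.push_back(std::move(halves.second));
                    } else {
                        a->run();
                        live.fetch_sub(1);
                    }
                }
            };
            std::vector<std::thread> pool;
            for (unsigned i = 0; i < contexts; ++i) pool.emplace_back(worker);
            for (auto& t : pool) t.join();
        }

        // Minimal concrete agent: counts a range, splitting at midpoints.
        struct RangeAgent : Agent {
            std::size_t lo, hi;
            std::atomic<long>* total;
            RangeAgent(std::size_t l, std::size_t h, std::atomic<long>* t)
                : lo(l), hi(h), total(t) {}
            bool can_split() const override { return hi - lo > 1024; }
            std::pair<std::unique_ptr<Agent>,
                      std::unique_ptr<Agent>> split() override {
                std::size_t mid = lo + (hi - lo) / 2;
                return { std::make_unique<RangeAgent>(lo, mid, total),
                         std::make_unique<RangeAgent>(mid, hi, total) };
            }
            void run() override { total->fetch_add(long(hi - lo)); }
        };

        int main() {
            std::atomic<long> total{0};
            somt_like_run(std::make_unique<RangeAgent>(0, 1 << 20, &total), 8);
            return total == (1 << 20) ? 0 : 1;
        }

    In the paper, the equivalent steering and thread-swapping decisions are taken by the SMT hardware rather than by a software queue, which is what makes the organization "self-" rather than compiler-directed.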

    Statistical properties of reversals and chrons in numerical dynamos and implications for the geodynamo

    We analyse a series of very long runs (equivalent to up to 50 Myr) produced by chemically driven dynamos. All runs assume homogeneous boundary conditions and an electrically conducting inner core (except for one run), and differ only in the choice of the Rayleigh number Ra★. Introducing dynamo-based definitions of reversals, chrons and related concepts, such as "failed reversals" and "segments" (bounded by reversals or failed reversals), we investigate the distributions of chron and segment lengths, those of reversal and failed-reversal durations, the way the dipole field behaves through reversals and failed reversals, and the possible links between the axial dipole intensity and chron or segment lengths. We show that chron and segment lengths are very well described in terms of a Poisson process (with no occurrence of superchrons), while the distributions of reversal and failed-reversal durations are better fitted by log-normal distributions. We find that reversal rates generally increase in proportion to Rm - Rmc, Rm being the magnetic Reynolds number and Rmc a critical value. In contrast, reversal and failed-reversal durations appear to be mainly controlled by the core's magnetic diffusion timescale. More generally, we show that much of the reversing behaviour of these dynamos can be understood by examining their signature in a (g1^0, g1^1, h1^1) phase-space plot. This reveals that the run with an insulating inner core is very different: it has only two distinct modes of opposite polarity, which we argue is the reason it displays fewer reversals and failed reversals, has a clear tendency to produce an intensity "overshoot", and shows a systematic pattern in the dipole pole behaviour through reversals and failed reversals. This contrasts with the conducting inner-core runs, which display an additional central unstable mode, the importance of which increases with Rm, and which is responsible for the more complex reversing behaviour of these dynamos. Available paleomagnetic data suggest that the current geodynamo could have such a (small) central mode, which would imply a strong sensitivity of the frequency and complexity of reversals, and of the likelihood of failed reversals, to changes in the geodynamo's driving parameters through geological time.
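
    For concreteness, the "Poisson process" description of the reversal sequence has standard quantitative consequences (textbook results stated in our notation, not formulas taken from the paper):

        % Reversals occurring at constant rate \lambda: chron lengths L are
        % exponentially distributed, with no memory between chrons
        P(L > t) = e^{-\lambda t}, \qquad \mathbb{E}[L] = 1/\lambda

        % Reported scaling of the reversal rate with forcing
        \lambda \;\propto\; \mathrm{Rm} - \mathrm{Rm}_c \qquad (\mathrm{Rm} > \mathrm{Rm}_c)

        % Reversal and failed-reversal durations T: log-normal
        p(T) = \frac{1}{T\sigma\sqrt{2\pi}}
               \exp\!\left(-\frac{(\ln T - \mu)^2}{2\sigma^2}\right)

    The absence of superchrons in the runs is consistent with the memoryless exponential tail: exceptionally long chrons are not forbidden, a Poisson fit at the observed rates simply assigns them very low probability.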

    Validation with code introspection of a virtual platform for sandboxing and security analysis

    Validating the safety and security of software computing systems often involves testing code in simulators of these systems, called virtual platforms. Because security breaches often come from implementation details, such simulators must reach a high level of accuracy. However, validating an instruction set simulator is a heavy development task involving large test campaigns. In this paper, we propose a novel technique to automatically generate and evaluate simulator tests. Using C++ polymorphism, we developed a code introspection software library that enables automatic test generation. By leveraging this automated approach, we were able to develop a self-testing simulator, providing a superior level of validation with minimal development overhead.
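
    A minimal sketch of what such a polymorphism-based self-test could look like (the library's actual API is not given in the abstract; InstrTest, registry() and the random campaign below are our assumptions):

        #include <cstdint>
        #include <cstdio>
        #include <memory>
        #include <random>
        #include <vector>

        // Hypothetical descriptor: each simulated instruction registers one,
        // so the harness can enumerate and exercise them polymorphically.
        struct InstrTest {
            virtual ~InstrTest() = default;
            virtual const char* name() const = 0;
            virtual uint32_t reference(uint32_t a, uint32_t b) const = 0; // golden model
            virtual uint32_t simulate(uint32_t a, uint32_t b) const = 0;  // ISS under test
        };

        static std::vector<std::unique_ptr<InstrTest>>& registry() {
            static std::vector<std::unique_ptr<InstrTest>> r;
            return r;
        }

        // Example entry; a real simulator would hook its decode/execute path here.
        struct AddTest : InstrTest {
            const char* name() const override { return "add"; }
            uint32_t reference(uint32_t a, uint32_t b) const override { return a + b; }
            uint32_t simulate(uint32_t a, uint32_t b) const override { return a + b; }
        };

        int main() {
            registry().push_back(std::make_unique<AddTest>());
            std::mt19937 rng(42);
            std::uniform_int_distribution<uint32_t> dist;
            int failures = 0;
            for (const auto& t : registry())
                for (int i = 0; i < 100000; ++i) {      // random-operand campaign
                    uint32_t a = dist(rng), b = dist(rng);
                    if (t->simulate(a, b) != t->reference(a, b)) {
                        std::printf("FAIL %s a=%u b=%u\n", t->name(),
                                    (unsigned)a, (unsigned)b);
                        ++failures;
                    }
                }
            return failures != 0;
        }

    The point of the polymorphic registry is that adding an instruction to the simulator automatically adds its test: no separate test campaign has to be written by hand.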

    Speeding up dynamic compilation for embedded targets

    Dynamic (Just-In-Time) compilation technologies are expanding rapidly and are already well established in high-performance architectures. Their appearance in embedded systems is much more recent (a decade at most) and currently raises many scalability problems. The aim of this article is to present the studies carried out on the various dynamic compilation technologies, in order to identify promising directions for making them more attractive on embedded targets.