
    Simulation sampling with live-points

    Current simulation-sampling techniques construct accurate model state for each measurement by continuously warming large microarchitectural structures (e.g., caches and the branch predictor) while functionally simulating the billions of instructions between measurements. This approach, called functional warming, is the main performance bottleneck of simulation sampling and requires hours of runtime while the detailed simulation of the sample requires only minutes. Existing simulators can avoid functional simulation by jumping directly to particular instruction stream locations with architectural state checkpoints. To replace functional warming, these checkpoints must additionally provide microarchitectural model state that is accurate and reusable across experiments while meeting tight storage constraints. In this paper, we present a simulation-sampling framework that replaces functional warming with live-points without sacrificing accuracy. A live-point stores the bare minimum of functionally-warmed state for accurate simulation of a limited execution window while placing minimal restrictions on microarchitectural configuration. Live-points can be processed in random rather than program order, allowing simulation results and their statistical confidence to be reported while simulations are in progress. Our framework matches the accuracy of prior simulation-sampling techniques (i.e., ±3% error with 99.7% confidence), while estimating the performance of an 8-way out-of-order superscalar processor running SPEC CPU2000 in 91 seconds per benchmark, on average, using a 12 GB live-point library.
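    The statistical side of reporting results while simulations are still in progress can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `simulate` callback, the live-point list, and the stopping thresholds are assumed placeholders, and the ±3% error at 99.7% confidence target is modelled as a 3-standard-error interval on the running CPI mean.

```python
import random
import statistics


def estimate_cpi(live_points, simulate, target_error=0.03, z=3.0, min_samples=30):
    """Simulate live-points in random order, maintaining a running CPI estimate.

    Stops early once the relative confidence-interval half-width drops below
    target_error at roughly 99.7% confidence (z = 3 standard errors).
    """
    random.shuffle(live_points)            # random rather than program order
    cpis = []
    for point in live_points:
        cpis.append(simulate(point))       # detailed simulation of one execution window
        if len(cpis) < min_samples:
            continue
        mean = statistics.fmean(cpis)
        half_width = z * statistics.stdev(cpis) / len(cpis) ** 0.5
        if half_width / mean <= target_error:
            break                          # confidence target met; report early
    mean = statistics.fmean(cpis)
    half_width = (z * statistics.stdev(cpis) / len(cpis) ** 0.5
                  if len(cpis) > 1 else float("inf"))
    return mean, half_width, len(cpis)
```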

    Feeding back Information on Ineligibility from Sample Surveys to the Frame

    It is usually discovered in the data collection phase of a survey that some units in the sample are ineligible even if the frame information has indicated otherwise. For example, in many business surveys a nonnegligible proportion of the sampled units will have ceased trading since the latest update of the frame. This information may be fed back to the frame and used in subsequent surveys, thereby making forthcoming samples more efficient by avoiding sampling ineligible units. We investigate what effect the process of feeding back information on ineligibility may have on survey estimation, and derive an expression for the bias that can occur as a result of feeding back. The focus is on estimation of the total using the common expansion estimator. We obtain an estimator that is nearly unbiased in the presence of feedback. This estimator relies on consistent estimates of the number of eligible and ineligible units in the population being available.
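    A minimal sketch of the common expansion estimator for a total, together with an illustrative ineligibility-adjusted variant of the kind the abstract alludes to. The function names, the simple-random-sampling setting, and the adjustment (a consistent estimate of the number of eligible units in the population times the eligible-sample mean) are assumptions for illustration, not the paper's exact estimator.

```python
def expansion_estimator(N, sample_y, sample_eligible):
    """Common expansion estimator of a total under simple random sampling:
    (N / n) times the sample sum of y, with ineligible units contributing zero."""
    n = len(sample_y)
    return N / n * sum(y for y, ok in zip(sample_y, sample_eligible) if ok)


def adjusted_estimator(n_eligible_hat, sample_y, sample_eligible):
    """Illustrative adjusted estimator: a consistent estimate of the number of
    eligible units in the population times the mean of y over the eligible
    sampled units (the kind of quantity the abstract says must be available)."""
    eligible_y = [y for y, ok in zip(sample_y, sample_eligible) if ok]
    return n_eligible_hat * sum(eligible_y) / len(eligible_y)


# Hypothetical example: a frame of N = 1000 units, a sample of 5,
# two of which turn out to have ceased trading (ineligible).
y = [120.0, 80.0, 0.0, 150.0, 0.0]
eligible = [True, True, False, True, False]
print(expansion_estimator(1000, y, eligible))   # 200 * 350 = 70000.0
print(adjusted_estimator(580, y, eligible))     # 580 * (350 / 3)
```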

    BarrierPoint: sampled simulation of multi-threaded applications

    Sampling is a well-known technique to speed up architectural simulation of long-running workloads while maintaining accurate performance predictions. A number of sampling techniques have recently been developed that extend well-known single-threaded techniques to allow sampled simulation of multi-threaded applications. Unfortunately, prior work is limited to non-synchronizing applications (e.g., server throughput workloads); requires the functional simulation of the entire application using a detailed cache hierarchy which limits the overall simulation speedup potential; leads to different units of work across different processor architectures which complicates performance analysis; or, requires massive machine resources to achieve reasonable simulation speedups. In this work, we propose BarrierPoint, a sampling methodology to accelerate simulation by leveraging globally synchronizing barriers in multi-threaded applications. BarrierPoint collects microarchitecture-independent code and data signatures to determine the most representative inter-barrier regions, called barrierpoints. BarrierPoint estimates total application execution time (and other performance metrics of interest) through detailed simulation of these barrierpoints only, leading to substantial simulation speedups. Barrierpoints can be simulated in parallel, use fewer simulation resources, and define fixed units of work to be used in performance comparisons across processor architectures. Our evaluation of BarrierPoint using NPB and Parsec benchmarks reports average simulation speedups of 24.7x (and up to 866.6x) with an average simulation error of 0.9% and 2.9% at most. On average, BarrierPoint reduces the number of simulation machine resources needed by 78x.
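    The core selection step, clustering per-region signatures and weighting the chosen representatives to estimate total execution time, can be sketched as follows. The use of k-means over plain feature vectors, the weighting by cluster size, and the `simulate_region` callback are simplifying assumptions for illustration, not the paper's exact methodology.

```python
import numpy as np
from sklearn.cluster import KMeans


def pick_barrierpoints(signatures, n_clusters=8, seed=0):
    """Cluster per-region signature vectors (assumed microarchitecture-independent
    code/data features) and pick, for each cluster, the region closest to the
    centroid as its representative barrierpoint."""
    X = np.asarray(signatures, dtype=float)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
    reps, weights = [], []
    for c in range(n_clusters):
        members = np.flatnonzero(km.labels_ == c)
        dists = np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1)
        reps.append(int(members[np.argmin(dists)]))   # representative region index
        weights.append(len(members))                  # number of regions it stands for
    return reps, weights


def estimate_total_time(reps, weights, simulate_region):
    """Estimate whole-application execution time from detailed simulation of the
    representative inter-barrier regions only, weighted by cluster size."""
    return sum(w * simulate_region(r) for r, w in zip(reps, weights))
```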

    Secretory vesicles are preferentially targeted to areas of low molecular SNARE density

    Intercellular communication is commonly mediated by the regulated fusion, or exocytosis, of vesicles with the cell surface. SNARE (soluble N-ethylmaleimide sensitive factor attachment protein receptor) proteins are the catalytic core of the secretory machinery, driving vesicle and plasma membrane merger. Plasma membrane SNAREs (tSNAREs) are proposed to reside in dense clusters containing many molecules, thus providing a concentrated reservoir to promote membrane fusion. However, biophysical experiments suggest that a small number of SNAREs are sufficient to drive a single fusion event. Here we show, using molecular imaging, that the majority of tSNARE molecules are spatially separated from secretory vesicles. Furthermore, the motilities of the individual tSNAREs are constrained in membrane micro-domains, maintaining a non-random molecular distribution and limiting the maximum number of molecules encountered by secretory vesicles. Together, our results provide a new model for the molecular mechanism of regulated exocytosis and demonstrate the exquisite organization of the plasma membrane at the level of individual molecular machines.