34 research outputs found

    SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling

    Current software-based microarchitecture simulators are many orders of magnitude slower than the hardware they simulate. Hence, most microarchitecture design studies draw their conclusions from drastically truncated benchmark simulations that are often inaccurate and misleading. We present the sampling microarchitecture simulation (SMARTS) framework as an approach to enable fast and accurate performance measurements of full-length benchmarks. SMARTS accelerates simulation by selectively measuring in detail only an appropriate benchmark subset. SMARTS prescribes a statistically sound procedure for configuring a systematic sampling simulation run to achieve a desired quantifiable confidence in estimates. Analysis of 41 of the 45 possible SPEC2K benchmark/input combinations shows that CPI and energy per instruction (EPI) can be estimated to within 3% with 99.7% confidence by measuring fewer than 50 million instructions per benchmark. In practice, inaccuracy in microarchitectural state initialization introduces an additional uncertainty, which we empirically bound to ~2% for the tested benchmarks. Our implementation of SMARTS achieves an actual average error of only 0.64% on CPI and 0.59% on EPI for the tested benchmarks, running with average speedups of 35 and 60 over detailed simulation of 8-way and 16-way out-of-order processors, respectively.
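The confidence claim in the abstract above ("within 3% with 99.7% confidence") can be illustrated with a minimal sketch. It assumes the textbook normal-approximation sample-size rule n >= (z * V / eps)^2, where V is the coefficient of variation of the measured metric, eps is the target relative error, and z = 3 corresponds to roughly 99.7% confidence. The function names and the fixed-stride selection are hypothetical illustrations, not the SMARTS implementation.

```python
import math
import statistics


def required_sample_size(cv, rel_error=0.03, z=3.0):
    """Number of sampling units needed so the sample mean lies within
    rel_error of the true mean at the confidence level implied by z
    (normal approximation; an assumption, not taken from the abstract)."""
    return math.ceil((z * cv / rel_error) ** 2)


def systematic_sample(population_size, n, offset=0):
    """Indices of n sampling units taken at a fixed stride k = population_size // n."""
    k = max(1, population_size // n)
    return list(range(offset, population_size, k))[:n]


def achieved_relative_error(samples, z=3.0):
    """Half-width of the confidence interval, relative to the sample mean."""
    mean = statistics.mean(samples)
    cv = statistics.stdev(samples) / mean
    return z * cv / math.sqrt(len(samples))
```

A run would first estimate V from a pilot sample, call `required_sample_size` to choose n, and then measure the units returned by `systematic_sample` in detail; `achieved_relative_error` checks afterwards whether the target was met.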

    SimFlex: Statistical Sampling of Computer System Simulation

    Timing-accurate full-system multiprocessor simulations can take years because of architecture and application complexity. Statistical sampling makes simulation-based studies feasible by providing ten-thousand-fold reductions in simulation runtime and enabling thousand-way simulation parallelism.

    Interwell coupling effect in Si/SiGe quantum wells grown by ultra high vacuum chemical vapor deposition

    Si/Si0.66Ge0.34 coupled quantum well (CQW) structures with barrier thicknesses of 40, 4 and 2 nm were grown on Si substrates using an ultra high vacuum chemical vapor deposition (UHV-CVD) system. The samples were characterized using high resolution x-ray diffraction (HRXRD), cross-sectional transmission electron microscopy (XTEM) and photoluminescence (PL) spectroscopy. A blue shift in PL peak energy due to interwell coupling was observed in the CQWs following an increase in the Si barrier thickness. The Si/SiGe heterostructure growth process and theoretical band structure model were validated by comparing the energy of the no-phonon peak calculated by the 6 + 2-band k·p method with experimental PL data. Close agreement between theoretical calculations and experimental data was obtained.

    Diagnosis of obstructive coronary artery disease using computed tomography angiography in patients with stable chest pain depending on clinical probability and in clinically important subgroups: Meta-analysis of individual patient data

    Objective To determine whether coronary computed tomography angiography (CTA) should be performed in patients with any clinical probability of coronary artery disease (CAD), and whether the diagnostic performance differs between subgroups of patients. Design Prospectively designed meta-analysis of individual patient data from prospective diagnostic accuracy studies. Data sources Medline, Embase, and Web of Science for published studies. Unpublished studies were identified via direct contact with participating investigators. Eligibility criteria for selecting studies Prospective diagnostic accuracy studies that compared coronary CTA with coronary angiography as the reference standard, using at least a 50% diameter reduction as a cutoff value for obstructive CAD. All patients needed to have a clinical indication for coronary angiography due to suspected CAD, and both tests had to be performed in all patients. Results had to be provided using 2×2 or 3×2 cross tabulations for the comparison of CTA with coronary angiography. Primary outcomes were the positive and negative predictive values of CTA as a function of clinical pretest probability of obstructive CAD, analysed by a generalised linear mixed model; calculations were performed including and excluding non-diagnostic CTA results. The no-treat/treat threshold model was used to determine the range of appropriate pretest probabilities for CTA. The threshold model was based on obtained post-test probabilities of less than 15% in case of negative CTA and above 50% in case of positive CTA. Sex, angina pectoris type, age, and number of computed tomography detector rows were used as clinical variables to analyse the diagnostic performance in relevant subgroups. Results Individual patient data from 5332 patients from 65 prospective diagnostic accuracy studies were retrieved.
For a pretest probability range of 7-67%, the treat threshold of more than 50% and the no-treat threshold of less than 15% post-test probability were obtained using CTA. At a pretest probability of 7%, the positive predictive value of CTA was 50.9% (95% confidence interval 43.3% to 57.7%) and the negative predictive value of CTA was 97.8% (96.4% to 98.7%); corresponding values at a pretest probability of 67% were 82.7% (78.3% to 86.2%) and 85.0% (80.2% to 88.9%), respectively. The overall sensitivity of CTA was 95.2% (92.6% to 96.9%) and the specificity was 79.2% (74.9% to 82.9%). CTA using more than 64 detector rows was associated with a higher empirical sensitivity than CTA using up to 64 rows (93.4% v 86.5%, P=0.002) and specificity (84.4% v 72.6%, P<0.001). The area under the receiver-operating-characteristic curve for CTA was 0.897 (0.889 to 0.906), and the diagnostic performance of CTA was slightly lower in women than in men (area under the curve 0.874 (0.858 to 0.890) v 0.907 (0.897 to 0.916), P<0.001). The diagnostic performance of CTA was slightly lower in patients older than 75 (0.864 (0.834 to 0.894), P=0.018 v all other age groups) and was not significantly influenced by angina pectoris type (typical angina 0.895 (0.873 to 0.917), atypical angina 0.898 (0.884 to 0.913), non-anginal chest pain 0.884 (0.870 to 0.899), other chest discomfort 0.915 (0.897 to 0.934)). Conclusions In a no-treat/treat threshold model, the diagnosis of obstructive CAD using coronary CTA in patients with stable chest pain was most accurate when the clinical pretest probability was between 7% and 67%. Performance of CTA was not influenced by the angina pectoris type and was slightly higher in men and lower in older patients. Systematic review registration PROSPERO CRD42012002780

    Real time stereo vision using exponential step cost aggregation on GPU

    10.1109/ICIP.2009.5413693. Proceedings - International Conference on Image Processing, ICIP, 4281-428

    Highly efficient performance portable tracking of evolving surfaces

    10.1109/IPDPS.2012.36. Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium, IPDPS 2012, 296-30

    High performance stereo vision designed for massively data parallel platforms

    10.1109/TCSVT.2010.2077771. IEEE Transactions on Circuits and Systems for Video Technology, 20(11), 1509-1519, ITCT

    Highly-parallel special-purpose multicore architecture for SystemC/TLM simulations

    Conference of 14th International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, SAMOS 2014; Conference Date: 14 July 2014 through 17 July 2014; Conference Code: 114504. The complexity of SystemC virtual prototyping is continuously increasing. Accelerating RTL/TLM SystemC simulations is essential to control future SoC development cost and time-to-market. In this paper, we present RAVES, a highly-parallel special-purpose multicore architecture that achieves simulation performance more efficiently by parallel execution of light-weight user-level threads on many small cores. We present a design study based on the virtual prototype of RAVES processors running a co-designed custom SystemC kernel. Our evaluation suggests that a 64-core RAVES processor can deliver up to 4.47× more simulation performance than a high-end x86 processor.

    Fast bilateral filtering by adapting block size

    10.1109/ICIP.2010.5651251. Proceedings - International Conference on Image Processing, ICIP, 3281-328