73 research outputs found

    The Implications of Diverse Applications and Scalable Data Sets in Benchmarking Big Data Systems

    Now we live in an era of big data, and big data applications are becoming more and more pervasive. How to benchmark data center computer systems running big data applications (in short, big data systems) is a hot topic. In this paper, we focus on measuring the performance impacts of diverse applications and scalable volumes of data sets on big data systems. For four typical data analysis applications, an important class of big data applications, we find two major results through experiments. First, the data scale has a significant impact on the performance of big data systems, so we must provide scalable volumes of data sets in big data benchmarks. Second, for the four applications, even though all of them use simple algorithms, the performance trends differ with increasing data scales, and hence we must consider not only a variety of data sets but also a variety of applications in benchmarking big data systems. Comment: 16 pages, 3 figures

    BigDataBench: a Big Data Benchmark Suite from Internet Services

    As architecture, systems, and data management communities pay greater attention to innovative big data systems and architectures, the pressure of benchmarking and evaluating these systems rises. Considering the broad use of big data systems, big data benchmarks must include diversity of data and workloads. Most of the state-of-the-art big data benchmarking efforts target evaluating specific types of applications or system software stacks, and hence they are not qualified for serving the purposes mentioned above. This paper presents our joint research efforts on this issue with several industrial partners. Our big data benchmark suite BigDataBench not only covers broad application scenarios, but also includes diverse and representative data sets. BigDataBench is publicly available from http://prof.ict.ac.cn/BigDataBench . Also, we comprehensively characterize 19 big data workloads included in BigDataBench with varying data inputs. On a typical state-of-practice processor, Intel Xeon E5645, we have the following observations: First, in comparison with traditional benchmarks, including PARSEC, HPCC, and SPECCPU, big data applications have very low operation intensity; Second, the volume of data input has a non-negligible impact on micro-architecture characteristics, which may impose challenges for simulation-based big data architecture research; Last but not least, corroborating the observations in CloudSuite and DCBench (which use smaller data inputs), we find that the numbers of L1 instruction cache misses per 1000 instructions of the big data applications are higher than in the traditional benchmarks; also, we find that L3 caches are effective for the big data applications, corroborating the observation in DCBench. Comment: 12 pages, 6 figures, The 20th IEEE International Symposium On High Performance Computer Architecture (HPCA-2014), February 15-19, 2014, Orlando, Florida, US

    AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation

    Despite the considerable progress in automatic abdominal multi-organ segmentation from CT/MRI scans in recent years, a comprehensive evaluation of the models' capabilities is hampered by the lack of a large-scale benchmark from diverse clinical scenarios. Constrained by the high cost of collecting and labeling 3D medical data, most of the deep learning models to date are driven by datasets with a limited number of organs of interest or samples, which still limits the power of modern deep models and makes it difficult to provide a fully comprehensive and fair estimate of various methods. To mitigate the limitations, we present AMOS, a large-scale, diverse, clinical dataset for abdominal organ segmentation. AMOS provides 500 CT and 100 MRI scans collected from multi-center, multi-vendor, multi-modality, multi-phase, multi-disease patients, each with voxel-level annotations of 15 abdominal organs, providing challenging examples and test-bed for studying robust segmentation algorithms under diverse targets and scenarios. We further benchmark several state-of-the-art medical segmentation models to evaluate the status of the existing methods on this new challenging dataset. We have made our datasets, benchmark servers, and baselines publicly available, and hope to inspire future research. Information can be found at https://amos22.grand-challenge.org

    Meta-analysis Followed by Replication Identifies Loci in or near CDKN1B, TET3, CD80, DRAM1, and ARID5B as Associated with Systemic Lupus Erythematosus in Asians

    Systemic lupus erythematosus (SLE) is a prototype autoimmune disease with a strong genetic involvement and ethnic differences. Susceptibility genes identified so far only explain a small portion of the genetic heritability of SLE, suggesting that many more loci are yet to be uncovered for this disease. In this study, we performed a meta-analysis of genome-wide association studies on SLE in Chinese Han populations and followed up the findings by replication in four additional Asian cohorts with a total of 5,365 cases and 10,054 corresponding controls. We identified genetic variants in or near CDKN1B, TET3, CD80, DRAM1, and ARID5B as associated with the disease. These findings point to potential roles of cell-cycle regulation, autophagy, and DNA demethylation in SLE pathogenesis. For the region involving TET3 and that involving CDKN1B, multiple independent SNPs were identified, highlighting a phenomenon that might partially explain the missing heritability of complex diseases

    Genome-wide association meta-analysis in Chinese and European individuals identifies ten new loci associated with systemic lupus erythematosus

    Systemic lupus erythematosus (SLE; OMIM 152700) is a genetically complex autoimmune disease. Genome-wide association studies (GWASs) have identified more than 50 loci as robustly associated with the disease in single ancestries, but genome-wide transancestral studies have not been conducted. We combined three GWAS data sets from Chinese (1,659 cases and 3,398 controls) and European (4,036 cases and 6,959 controls) populations. A meta-analysis of these studies showed that over half of the published SLE genetic associations are present in both populations. A replication study in Chinese (3,043 cases and 5,074 controls) and European (2,643 cases and 9,032 controls) subjects found ten previously unreported SLE loci. Our study provides further evidence that the majority of genetic risk polymorphisms for SLE are contained within the same regions across both populations. Furthermore, a comparison of risk allele frequencies and genetic risk scores suggested that the increased prevalence of SLE in non-Europeans (including Asians) has a genetic basis

    Impact of Surface Hydroxylation on Stability of Silica-Support Metal Nanoparticles: On the Way to Tailor the Catalysts

    Catalysts play a central role in chemical reactions and are at the heart of countless chemical protocols, from laboratory-scale academic research to the chemical industry. Nanocatalysts are catalysts composed of nanoparticles, usually active metal nanoparticles sitting on some type of support. Metal nanoparticles (NPs) are characterized by a very high surface-area-to-volume ratio and a large number of low-coordination sites. These properties make them highly desirable as active components in catalytic reactions and other applications. However, this high number of low-coordination sites also strongly destabilizes the particles and makes them prone to sintering, which leads to the loss of active surface area, reaction activity, and selectivity. Recently, computational simulations from our group developed an amorphous silica model as the support in the platinum-silica catalyst system using a combination of classical molecular modeling and density functional theory (DFT) calculations. In those studies, nanoparticle adhesion energetics and charge transfer were both found to depend on the silica surface hydroxyl density. Since the hydroxylation is easily tunable by pretreatment temperature, this suggests that both electronic charge and catalyst stability can be modified via catalyst calcination. In this work, platinum NPs dispersed on an amorphous silica support were used as model catalysts. Two silica supports with different hydroxyl densities were investigated to explore the impact of surface hydroxylation on the stability and reactivity of the catalysts. Through particle size analysis by X-ray diffraction (XRD) and transmission electron microscopy (TEM) after elevated-temperature treatment, we found that Pt NPs on fully hydroxylized silica are more stable than on dehydroxylized silica, with NPs on the former growing to only around half the size of those on dehydroxylized catalysts at 800 °C. Finally, we analyzed the reactivity of these two catalysts in CO oxidation and found that the ignition temperature of the dehydroxylized catalysts was about 30 °C higher than that of the rehydroxylized catalysts, which correlates well with the improved thermal stability of this catalyst. Overall, our results confirm that the degree of surface hydroxylation of silica has a strong impact on both the stability and reactivity of silica-supported metal nanocatalysts

    Formal verification of mCWQ using extended hoare logic

    Node mobility, as one of the most important features of Wireless Sensor Networks (WSNs), may affect the reliability of communication links in the networks, leading to abnormalities and decreasing the quality of service provided by WSNs. The mCWQ calculus (i.e., CWQ calculus with mobility) was recently proposed to capture the feature of node mobility and increase the communication quality of WSNs. In this paper, we present a proof system for the mCWQ calculus to prove its correctness. Our specifications and verifications are based on Hoare Logic. In order to describe the timing of observable actions, we extend the assertion language with primitives. Both terminating and non-terminating computations can be described in our proof system. We also give some examples to illustrate the application of our proof system