
    The Imperative of Synthetic Biology: A Proposed National Research Initiative

    A 2.5-page report outlining why the United States should launch a strategic national research initiative in synthetic biology.

    D-SPACE4Cloud: A Design Tool for Big Data Applications

    Recent years have seen a steep rise in data generation worldwide, along with the development and widespread adoption of software projects targeting the Big Data paradigm. Many companies now engage in Big Data analytics as part of their core business activities; nonetheless, tools and techniques to support the design of the underlying hardware configuration backing such systems are lacking. This report focuses on cloud-deployed clusters, which represent a cost-effective alternative to on-premises installations. We propose a novel tool implementing a battery of optimization and prediction techniques, integrated so as to efficiently assess several alternative resource configurations and determine the minimum-cost cluster deployment satisfying QoS constraints. Furthermore, an experimental campaign conducted on real systems shows the validity and relevance of the proposed method.
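    As a minimal sketch of the kind of search such a tool automates, the snippet below enumerates candidate cluster configurations and keeps the cheapest one whose predicted execution time meets a QoS deadline. The VM catalogue, prices, and the naive linear-speedup predictor are illustrative assumptions, not the tool's actual models or optimization techniques:

    ```python
    # Hypothetical catalogue: VM type -> (hourly cost in $, relative compute power).
    VM_TYPES = {"m4.large": (0.10, 1.0), "m4.xlarge": (0.20, 2.0)}

    def predicted_runtime(work_hours, vm_power, n_vms):
        """Naive stand-in predictor: perfect parallel speedup."""
        return work_hours / (vm_power * n_vms)

    def cheapest_deployment(work_hours, deadline_hours, max_vms=32):
        """Return the cheapest (hourly_cost, vm_type, n_vms) meeting the deadline."""
        best = None
        for vm_type, (cost, power) in VM_TYPES.items():
            for n in range(1, max_vms + 1):
                if predicted_runtime(work_hours, power, n) <= deadline_hours:
                    candidate = (cost * n, vm_type, n)
                    if best is None or candidate[0] < best[0]:
                        best = candidate
                    break  # adding more VMs of this type only raises the cost
        return best

    print(cheapest_deployment(work_hours=100.0, deadline_hours=4.0))
    # -> (2.5, 'm4.large', 25) under these illustrative numbers
    ```

    In practice the toy predictor would be replaced by the tool's simulation or analytical performance models, and the brute-force enumeration by its optimization techniques.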

    Modeling performance of Hadoop applications: A journey from queueing networks to stochastic well formed nets

    Nowadays, many enterprises commit to the extraction of actionable knowledge from huge datasets as part of their core business activities. Applications span very different domains, such as fraud detection and one-to-one marketing, and encompass business analytics and decision support in both the private and public sectors. In these scenarios, a central place is held by the MapReduce framework, and in particular by its open-source implementation, Apache Hadoop. In such environments, new challenges arise in the area of job performance prediction, driven by the need to provide Service Level Agreement guarantees to the end user and to avoid wasting computational resources. In this paper we provide performance analysis models to estimate MapReduce job execution times in Hadoop clusters governed by the YARN Capacity Scheduler. We propose models of increasing complexity and accuracy, ranging from queueing networks to stochastic well-formed nets, able to estimate job performance under a number of scenarios of interest, including unreliable resources. The accuracy of our models is evaluated on the TPC-DS industry benchmark, running experiments on Amazon EC2 and at the CINECA Italian supercomputing center. The results show that the average accuracy we can achieve is in the range of 9–14%.
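    For a flavor of the simplest end of this model family, here is a textbook exact Mean Value Analysis (MVA) routine for a closed, single-class queueing network, the kind of model the paper starts from before moving to stochastic well-formed nets. The station names and service demands are made-up numbers, not the paper's calibrated Hadoop/YARN parameters:

    ```python
    def mva(demands, n_jobs):
        """Exact MVA for a closed, single-class network of queueing stations.

        demands: per-station service demands D_i = V_i * S_i (seconds);
        n_jobs:  number of concurrent jobs (multiprogramming level), >= 1.
        Returns (system throughput, mean response time)."""
        assert n_jobs >= 1
        queue = [0.0] * len(demands)                             # Q_i(0) = 0
        for k in range(1, n_jobs + 1):
            resp = [d * (1.0 + q) for d, q in zip(demands, queue)]  # R_i(k)
            x = k / sum(resp)                                    # X(k) by Little's law
            queue = [x * r for r in resp]                        # Q_i(k) = X(k) * R_i(k)
        return x, sum(resp)

    # E.g. a map stage, a reduce stage, and a shared disk as three stations
    # (illustrative demands in seconds).
    throughput, resp_time = mva(demands=[12.0, 8.0, 3.0], n_jobs=10)
    print(f"X = {throughput:.3f} jobs/s, R = {resp_time:.1f} s")
    ```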

    On the relations between Markov chain lumpability and reversibility

    In the literature, the notions of lumpability and time reversibility for large Markov chains have been widely used to efficiently study the functional and non-functional properties of computer systems. In this paper we explore the relations among different definitions of lumpability (strong, exact and strict) and the notion of the time-reversed Markov chain. Specifically, we prove that an exact lumping induces a strong lumping on the reversed Markov chain, and that a strict lumping holds for both the forward and the reversed processes. Based on these results, we introduce the class of λρ-reversible Markov chains, which combines the notions of lumping and time reversibility modulo state renaming. We show that the class of autoreversible processes, previously introduced in Marin and Rossi (Proceedings of the IEEE 21st International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, MASCOTS, pp. 151–160, 2013), is strictly contained in the class of λρ-reversible chains.
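    For illustration only, a small numerical sketch of the time-reversal construction the paper builds on: for an ergodic chain with transition matrix P and stationary distribution π, the reversed chain has transitions p^R(i,j) = π(j) p(j,i) / π(i), and the chain is time-reversible exactly when the reversed matrix equals P. The example matrix is arbitrary; the lumpability results themselves are proved in the paper:

    ```python
    import numpy as np

    def stationary(P):
        """Stationary distribution: left eigenvector of P for eigenvalue 1."""
        vals, vecs = np.linalg.eig(P.T)
        pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
        return pi / pi.sum()

    def reversed_chain(P):
        """Reversed chain: P_rev[i][j] = pi[j] * P[j][i] / pi[i]."""
        pi = stationary(P)
        return (pi[None, :] * P.T) / pi[:, None]

    P = np.array([[0.5, 0.5, 0.0],
                  [0.2, 0.3, 0.5],
                  [0.4, 0.0, 0.6]])
    P_rev = reversed_chain(P)
    print(np.allclose(P_rev.sum(axis=1), 1.0))  # True: P_rev is stochastic
    print(np.allclose(P, P_rev))                # False: this chain is not reversible
    ```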

    Computer science


    500 special relationships


    Pale and male


    Academia's failure to retain data scientists

    We applaud the recommendations of V. Stodden et al. for "Enhancing reproducibility for computational methods" (Policy Forum, 9 December 2016, p. 1240). Many of their recommendations could be fulfilled if researchers embraced modern statistical and computational ("data science") practices. However, academia has proven slow to adopt best practices and thus has failed to retain many brilliant quantitative scientists. This is bad for science, as prevalent data analyses are often suboptimal, data-driven discoveries remain underdeveloped, and highly skilled people, who could use their unique expertise to move the most pressing scientific questions forward, are lost to industry.

    Viewpoint: Envisioning the future of computing research
