4,433 research outputs found

    Exploring power behaviors and trade-offs of in-situ data analytics

    Get PDF
    pre-printAs scientific applications target exascale, challenges related to data and energy are becoming dominating concerns. For example, coupled simulation workflows are increasingly adopting in-situ data processing and analysis techniques to address costs and overheads due to data movement and I/O. However it is also critical to understand these overheads and associated trade-offs from an energy perspective. The goal of this paper is exploring data-related energy/performance trade-offs for end-to-end simulation workflows running at scale on current high-end computing systems. Specifically, this paper presents: (1) an analysis of the data-related behaviors of a combustion simulation workflow with an in-situ data analytics pipeline, running on the Titan system at ORNL; (2) a power model based on system power and data exchange patterns, which is empirically validated; and (3) the use of the model to characterize the energy behavior of the workflow and to explore energy/performance tradeoffs on current as well as emerging systems

    Research and Education in Computational Science and Engineering

    Get PDF
    Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers of all persuasions with algorithmic inventions and software systems that transcend disciplines and scales. Carried on a wave of digital technology, CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society; and the CSE community is at the core of this transformation. However, a combination of disruptive developments---including the architectural complexity of extreme-scale computing, the data revolution that engulfs the planet, and the specialization required to follow the applications to new frontiers---is redefining the scope and reach of the CSE endeavor. This report describes the rapid expansion of CSE and the challenges to sustaining its bold advances. The report also presents strategies and directions for CSE research and education for the next decade.Comment: Major revision, to appear in SIAM Revie

    An Automatic Level Set Based Liver Segmentation from MRI Data Sets

    Get PDF
    A fast and accurate liver segmentation method is a challenging work in medical image analysis area. Liver segmentation is an important process for computer-assisted diagnosis, pre-evaluation of liver transplantation and therapy planning of liver tumors. There are several advantages of magnetic resonance imaging such as free form ionizing radiation and good contrast visualization of soft tissue. Also, innovations in recent technology and image acquisition techniques have made magnetic resonance imaging a major tool in modern medicine. However, the use of magnetic resonance images for liver segmentation has been slow when we compare applications with the central nervous systems and musculoskeletal. The reasons are irregular shape, size and position of the liver, contrast agent effects and similarities of the gray values of neighbor organs. Therefore, in this study, we present a fully automatic liver segmentation method by using an approximation of the level set based contour evolution from T2 weighted magnetic resonance data sets. The method avoids solving partial differential equations and applies only integer operations with a two-cycle segmentation algorithm. The efficiency of the proposed approach is achieved by applying the algorithm to all slices with a constant number of iteration and performing the contour evolution without any user defined initial contour. The obtained results are evaluated with four different similarity measures and they show that the automatic segmentation approach gives successful results

    Scalable aggregation predictive analytics: a query-driven machine learning approach

    Get PDF
    We introduce a predictive modeling solution that provides high quality predictive analytics over aggregation queries in Big Data environments. Our predictive methodology is generally applicable in environments in which large-scale data owners may or may not restrict access to their data and allow only aggregation operators like COUNT to be executed over their data. In this context, our methodology is based on historical queries and their answers to accurately predict ad-hoc queries’ answers. We focus on the widely used set-cardinality, i.e., COUNT, aggregation query, as COUNT is a fundamental operator for both internal data system optimizations and for aggregation-oriented data exploration and predictive analytics. We contribute a novel, query-driven Machine Learning (ML) model whose goals are to: (i) learn the query-answer space from past issued queries, (ii) associate the query space with local linear regression & associative function estimators, (iii) define query similarity, and (iv) predict the cardinality of the answer set of unseen incoming queries, referred to the Set Cardinality Prediction (SCP) problem. Our ML model incorporates incremental ML algorithms for ensuring high quality prediction results. The significance of contribution lies in that it (i) is the only query-driven solution applicable over general Big Data environments, which include restricted-access data, (ii) offers incremental learning adjusted for arriving ad-hoc queries, which is well suited for query-driven data exploration, and (iii) offers a performance (in terms of scalability, SCP accuracy, processing time, and memory requirements) that is superior to data-centric approaches. We provide a comprehensive performance evaluation of our model evaluating its sensitivity, scalability and efficiency for quality predictive analytics. In addition, we report on the development and incorporation of our ML model in Spark showing its superior performance compared to the Spark’s COUNT method
    corecore