
    Speculative Approximations for Terascale Analytics

    Model calibration is a major challenge for the plethora of statistical analytics packages increasingly used in Big Data applications. Identifying the optimal model parameters is a time-consuming process that has to be executed from scratch for every dataset/model combination, even by experienced data scientists. We argue that the principal causes are the inability to evaluate multiple parameter configurations simultaneously and the lack of support for quickly identifying sub-optimal configurations. In this paper, we develop two database-inspired techniques for efficient model calibration. Speculative parameter testing applies advanced parallel multi-query processing methods to evaluate several configurations concurrently. The number of configurations is determined adaptively at runtime, while the configurations themselves are drawn from a distribution that is continuously learned through a Bayesian process. Online aggregation is applied to identify sub-optimal configurations early in the processing by incrementally sampling the training dataset and estimating the objective function corresponding to each configuration. We design concurrent online aggregation estimators and define halting conditions that stop the execution accurately and in a timely manner. We apply the proposed techniques to distributed gradient descent optimization -- batch and incremental -- for support vector machine and logistic regression models. We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big Data analytics system -- and evaluate their performance over terascale-size synthetic and real datasets. The results confirm that as many as 32 configurations can be evaluated concurrently almost as fast as one, while sub-optimal configurations are detected accurately in as little as a 1/20th fraction of the time.
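    To make the early-elimination idea concrete, here is a minimal Python sketch, not the GLADE PF-OLA implementation: several configurations are scored on incrementally larger samples of the training data, simple CLT-style confidence intervals stand in for the paper's online aggregation estimators, and a configuration is halted as soon as its lower bound exceeds the best configuration's upper bound. The regularized hinge-loss objective, batch size, and all identifiers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def losses_on_sample(config, sample):
    """Per-example regularized hinge loss of a linear model (stand-in objective)."""
    w, lam = config
    margins = sample[:, -1] * (sample[:, :-1] @ w)
    return np.maximum(0.0, 1.0 - margins) + lam * np.dot(w, w)

def calibrate(configs, data, batch=1000, z=2.58):
    active = {i: [] for i in range(len(configs))}      # loss samples per live config
    for start in range(0, len(data), batch):
        sample = data[start:start + batch]
        bounds = {}
        for i in active:
            active[i].extend(losses_on_sample(configs[i], sample))
            xs = np.asarray(active[i])
            half = z * xs.std() / np.sqrt(len(xs))     # CLT-style interval half-width
            bounds[i] = (xs.mean() - half, xs.mean() + half)
        best_upper = min(hi for _, hi in bounds.values())
        for i, (lo, _) in bounds.items():
            if lo > best_upper:                        # provably sub-optimal: halt it
                del active[i]
        if len(active) == 1:
            break
    return min(active, key=lambda i: float(np.mean(active[i])))

# Tiny synthetic run: four regularization settings compete on one pass over the data.
d = 5
data = rng.normal(size=(20000, d + 1))
data[:, -1] = np.sign(data[:, 0])                      # synthetic labels
configs = [(rng.normal(size=d), lam) for lam in (1e-3, 1e-2, 1e-1, 1.0)]
print("winning configuration:", calibrate(configs, data))
```

    Under this halting rule, clearly inferior settings are dropped after only a few batches, mirroring the reported ability to discard sub-optimal configurations from a small fraction of the data.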

    PACK: Prediction-Based Traffic Redundancy Elimination System & provide secure encryption in Cloud

    In this paper we employ PACK (Predictive ACKs), a traffic redundancy elimination (TRE) system designed for cloud computing customers. TRE is deployed on the cloud to reduce traffic and, with it, the computation and storage costs of the service. The main advantage of PACK is its ability to offload the cloud server's TRE effort to the end clients, thus minimizing the processing costs induced by the TRE algorithm. Unlike previous solutions, PACK does not require the server to continuously track clients or maintain per-client state, which helps preserve cloud elasticity in an environment that combines server and client activity. PACK's TRE mechanism eliminates the transmission of redundant content by letting the client use newly received chunks to identify previously received chunk chains, which in turn serve as reliable predictors of future transmitted chunks. In our proposed work we additionally encrypt the transmitted chunks using the AES algorithm, a symmetric block cipher, in order to secure the data against other parties on the network.
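    The chunk-chain prediction at the heart of PACK can be sketched in a few lines. The fragment below is an illustrative toy, with the chunk size, SHA-1 signature scheme, and AES-GCM mode chosen by us (the abstract specifies AES but not a mode of operation); it shows a receiver-side store that learns signature chains from past traffic, predicts the next chunk after a recognized one, and encrypts any chunk that still has to cross the wire.

```python
import hashlib
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # pip install cryptography

CHUNK = 4096  # illustrative chunk size, not fixed by the paper

def signature(chunk: bytes) -> bytes:
    return hashlib.sha1(chunk).digest()

class ChunkStore:
    """Receiver-side store of chunk-signature chains learned from past traffic."""
    def __init__(self):
        self.next_sig = {}            # sig(chunk_i) -> sig(chunk_{i+1})
        self.chunks = {}              # sig -> raw chunk bytes

    def learn(self, stream: bytes):
        sigs = []
        for off in range(0, len(stream), CHUNK):
            c = stream[off:off + CHUNK]
            s = signature(c)
            self.chunks[s] = c
            sigs.append(s)
        for a, b in zip(sigs, sigs[1:]):
            self.next_sig[a] = b

    def predict(self, received: bytes):
        """Return the predicted next chunk after `received`, or None."""
        s = self.next_sig.get(signature(received))
        return self.chunks.get(s) if s else None

# Non-redundant chunks still cross the wire encrypted (AES-GCM here, our choice).
key = AESGCM.generate_key(bit_length=128)
aes = AESGCM(key)

def send_chunk(chunk: bytes) -> bytes:
    nonce = os.urandom(12)
    return nonce + aes.encrypt(nonce, chunk, None)

store = ChunkStore()
store.learn(b"A" * CHUNK + b"B" * CHUNK + b"C" * CHUNK)
assert store.predict(b"A" * CHUNK) == b"B" * CHUNK   # predicted: server can skip it
wire = send_chunk(b"D" * CHUNK)                      # unpredicted: encrypt and send
```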

    Service Abstractions for Scalable Deep Learning Inference at the Edge

    Deep learning driven intelligent edge has already become a reality, where millions of mobile, wearable, and IoT devices analyze real-time data and transform it into actionable insights on-device. Typical approaches for optimizing deep learning inference focus mostly on accelerating the execution of individual inference tasks, without considering the contextual correlation unique to edge environments and the statistical nature of learning-based computation. Specifically, they treat inference workloads as individual black boxes and apply canonical system optimization techniques, developed over the last few decades, to handle them as yet another type of computation-intensive application. As a result, deep learning inference on edge devices still faces the ever-increasing challenges of customization to edge device heterogeneity, fuzzy computation redundancy between inference tasks, and end-to-end deployment at scale. In this thesis, we propose the first framework that automates and scales the end-to-end process of deploying efficient deep learning inference from the cloud to heterogeneous edge devices. The framework consists of a series of service abstractions that handle DNN model tailoring, model indexing and query, and computation reuse for runtime inference, respectively. Together, these services bridge the gap between deep learning training and inference, eliminate computation redundancy during inference execution, and further lower the barrier for deep learning algorithm and system co-optimization. To build efficient and scalable services, we take a unique algorithmic approach that harnesses the semantic correlation among learning-based computations. Rather than viewing individual tasks as isolated black boxes, we optimize them collectively in a white-box approach, proposing primitives to formulate the semantics of deep learning workloads, algorithms to assess their hidden correlation (in terms of the input data, the neural network models, and the deployment trials), and methods to merge common processing steps to minimize redundancy.
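    As a toy illustration of the computation-reuse idea, assuming nothing about the thesis's actual services or APIs, the sketch below caches stage outputs keyed by a stage identifier and a digest of the stage input, so that two inference pipelines sharing a common backbone compute it only once. The stage names and functions are invented stand-ins.

```python
import hashlib

class ReuseCache:
    """Caches pipeline-stage outputs so shared prefixes run only once."""
    def __init__(self):
        self.cache = {}   # (stage_id, input_digest) -> stage output

    def run(self, stages, x):
        """stages: list of (stage_id, fn); reuses any cached prefix of the pipeline."""
        digest = hashlib.sha256(repr(x).encode()).hexdigest()
        for stage_id, fn in stages:
            key = (stage_id, digest)
            if key not in self.cache:
                self.cache[key] = fn(x)    # cache miss: actually compute the stage
            x = self.cache[key]
            digest = hashlib.sha256(repr(x).encode()).hexdigest()
        return x

# Two tasks share a backbone stage but attach different heads (all stand-ins).
backbone = ("resnet_backbone",  lambda img: [v * 2 for v in img])
head_a   = ("detector_head",    lambda feats: sum(feats))
head_b   = ("classifier_head",  lambda feats: max(feats))

cache = ReuseCache()
img = [1, 2, 3]
det = cache.run([backbone, head_a], img)   # computes the backbone once
cls = cache.run([backbone, head_b], img)   # backbone cache hit: only the head runs
```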

    Focused Proofreading: Efficiently Extracting Connectomes from Segmented EM Images

    Identifying complex neural circuitry from electron microscopic (EM) images may help unlock the mysteries of the brain. However, identifying this circuitry requires time-consuming, manual tracing (proofreading) due to the size and intricacy of these image datasets, thus limiting state-of-the-art analysis to very small brain regions. Potential avenues to improve scalability include automatic image segmentation and crowdsourcing, but current efforts have had limited success. In this paper, we propose a new strategy, focused proofreading, that works with automatic segmentation and aims to limit proofreading to the regions of a dataset that are most impactful to the resulting circuit. We then introduce a novel workflow, which exploits biological information such as synapses, and apply it to a large dataset in the fly optic lobe. With our techniques, we achieve significant tracing speedups of 3-5x without sacrificing the quality of the resulting circuit. Furthermore, our methodology makes the task of proofreading much more accessible and hence potentially enhances the effectiveness of crowdsourcing.
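    The prioritisation idea behind focused proofreading can be caricatured in a few lines. In the sketch below, whose impact score is our own invention and far simpler than the paper's workflow, candidate merge decisions are ranked by how uncertain they are and how many synapses the affected segments carry, and only a small budget of the highest-impact decisions is queued for human proofreaders.

```python
def impact(decision, synapse_count):
    # uncertain decisions touching synapse-rich segments matter most for the circuit
    uncertainty = 1.0 - abs(decision["merge_prob"] - 0.5) * 2.0
    touched = synapse_count.get(decision["seg_a"], 0) + \
              synapse_count.get(decision["seg_b"], 0)
    return uncertainty * touched

def focus_queue(decisions, synapse_count, budget=0.1):
    """Return the top `budget` fraction of decisions, ranked by impact."""
    ranked = sorted(decisions, key=lambda d: impact(d, synapse_count),
                    reverse=True)
    return ranked[: max(1, int(budget * len(ranked)))]

decisions = [
    {"seg_a": 1, "seg_b": 2, "merge_prob": 0.55},   # ambiguous merge
    {"seg_a": 3, "seg_b": 4, "merge_prob": 0.98},   # confident merge
]
synapses = {1: 40, 2: 12, 3: 3, 4: 0}
print(focus_queue(decisions, synapses))             # the ambiguous one is queued first
```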

    Towards resilient EU HPC systems: A blueprint

    This document aims to spearhead a Europe-wide discussion on HPC system resilience and to help the European HPC community define best practices for resilience. We analyse a wide range of state-of-the-art resilience mechanisms and recommend the most effective approaches to employ in large-scale HPC systems. Our guidelines will be useful in the allocation of available resources, as well as in guiding researchers and research funding towards enhancing the resilience approaches with the highest priority and utility. Although our work is focused on the needs of next-generation HPC systems in Europe, the principles and evaluations are applicable globally.

    This work has received funding from the European Union's Horizon 2020 research and innovation programme under the projects ECOSCALE (grant agreement No 671632), EPI (grant agreement No 826647), EuroEXA (grant agreement No 754337), Eurolab4HPC (grant agreement No 800962), EVOLVE (grant agreement No 825061), EXA2PRO (grant agreement No 801015), ExaNest (grant agreement No 671553), ExaNoDe (grant agreement No 671578), EXDCI-2 (grant agreement No 800957), LEGaTO (grant agreement No 780681), MB2020 (grant agreement No 779877), RECIPE (grant agreement No 801137) and SDK4ED (grant agreement No 780572). The work was also supported by the European Commission's Seventh Framework Programme under the project CLERECO (grant agreement No 611404), by the NCSA-Inria-ANL-BSC-JSC-Riken-UTK Joint Laboratory for Extreme Scale Computing -- JLESC (https://jlesc.github.io/), by the OMPI-X project (No ECP-2.3.1.17), and by the Spanish Government through the Severo Ochoa programme (SEV-2015-0493). This work was sponsored in part by the U.S. Department of Energy's Office of Advanced Scientific Computing Research, program managers Robinson Pino and Lucy Nowell. This manuscript has been authored by UT-Battelle, LLC under Contract No DE-AC05-00OR22725 with the U.S. Department of Energy.

    From viability to sustainability: the contribution of the viable systems approach (VSA)

    The current dynamics of business systems require new ways of conceiving the role of individual entities. On this basis, a complex of interactions between the company and its reference context must be activated to guarantee survival. These considerations echo the ideas of Peccei (2013) and King (2013), who recognise in systemic thinking the foundations for a sustainable society. The present study stems from these considerations and aims to contribute to the advancement of the knowledge needed to overcome the challenges in the sustainability field. The methodological approach, albeit heuristic, can be traced back to the positive scientific and constructivist method. The results of the study show the prevalence of qualitative and subjective techniques, accompanied by the so-called inductive method, testifying to the intense interaction between the scholar and the object investigated. With regard to future research, it would be interesting to construct a flexible, scalable and extensible model that can accommodate both a database and an ontology for the theoretical framework.

    Accelerating Parallel Verification via Complementary Property Partitioning and Strategy Exploration

    Industrial hardware verification tasks often require checking a large number of properties within a testbench. Verification tools often exploit parallelism in their solving orchestration to improve scalability, either in portfolio mode, where different solver strategies run concurrently, or in partitioning mode, where disjoint property subsets are verified independently. While most tools focus solely upon reducing end-to-end wall time, reducing overall CPU time is a comparably important goal, influencing power consumption, competition for available machines, and IT costs. Portfolio approaches often degrade into highly redundant work across processes, where similar strategies address properties in nearly identical order. Partitioning should take property affinity into account, atomically verifying high-affinity properties to minimize the redundant work of applying identical strategies to individual properties with nearly identical logic cones. In this paper, we improve multi-property parallel verification with respect to both wall time and CPU time. We extend affinity-based partitioning to guarantee complete utilization of available processes, with provable partition quality. We propose methods to minimize redundant computation and to dynamically optimize work distribution. We deploy our techniques in a sequential redundancy removal framework, using localization to solve non-inductive properties. Our techniques offer a median 2.4× speedup and yield 18.1% more property solves, as demonstrated by extensive experiments.
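    To illustrate what affinity-based partitioning might look like, the sketch below models a property's logic cone as a set of support signals and affinity as Jaccard similarity, then greedily packs properties so that high-affinity ones share a partition while a small load penalty spreads work across processes. This is our simplification; the paper's affinity measure and its partition-quality guarantees are more involved.

```python
def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def partition(cones, n_workers):
    """cones: {property: set(support signals)} -> list of property groups."""
    groups = [[] for _ in range(n_workers)]
    reps = [set() for _ in range(n_workers)]      # union cone per group
    # place largest cones first; score = affinity minus a small load penalty,
    # which nudges zero-affinity properties toward emptier groups
    for prop in sorted(cones, key=lambda p: -len(cones[p])):
        scores = [(jaccard(cones[prop], reps[g]) - 0.01 * len(groups[g]), g)
                  for g in range(n_workers)]
        _, g = max(scores)
        groups[g].append(prop)
        reps[g] |= cones[prop]
    return groups

cones = {"p1": {"a", "b"}, "p2": {"a", "b", "c"},
         "p3": {"x", "y"}, "p4": {"x"}}
print(partition(cones, 2))   # p1/p2 share a partition, p3/p4 the other
```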

    A Design Approach for Soft Errors Protection in Real-Time Systems

    This paper proposes the use of metrics to refine system design for soft-error protection in system-on-chip architectures. Specifically, this research shows the use of metrics in design space exploration to highlight where in the structure of the model, and at what point in its behaviour, protection against soft errors is needed. Because these metrics improve the ability of the system to provide its intended functionality, they are referred to here as reliability metrics. Previous approaches to dealing with soft errors focused on recovery after detection; almost no research has been directed towards preventive measures. Yet in real-time systems, deadlines are performance requirements that absolutely must be met, and a missed deadline constitutes an erroneous action and a possible system failure. This paper therefore focuses on a preventive approach rather than recovery after detection. The intention of this research is to prevent serious loss of system functionality or system failure, though it may not be able to eliminate the impact of soft errors completely.
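    A minimal example of the kind of reliability metric the paper advocates, under our own simplifying assumptions (a Poisson soft-error model, recovery by full re-execution, and invented task parameters): a task is flagged for preventive protection when its soft-error probability is non-negligible and its deadline slack cannot absorb a recovery re-execution.

```python
import math

def p_soft_error(exec_time_ms, ser_per_ms):
    """Probability of at least one soft error during execution (Poisson model)."""
    return 1.0 - math.exp(-ser_per_ms * exec_time_ms)

def needs_protection(task, threshold=1e-6):
    p = p_soft_error(task["wcet"], task["ser"])
    slack = task["deadline"] - task["wcet"]
    cannot_recover = slack < task["wcet"]        # no room to re-execute after detection
    return p > threshold and cannot_recover

# Illustrative task set: WCET and deadline in ms, soft-error rate per ms.
tasks = [
    {"name": "sensor_fusion", "wcet": 4.0, "deadline": 6.0,  "ser": 1e-6},
    {"name": "logger",        "wcet": 1.0, "deadline": 50.0, "ser": 1e-6},
]
for t in tasks:
    print(t["name"], "-> protect preventively:", needs_protection(t))
```

    Here sensor_fusion is flagged because a detected error could not be recovered by re-execution before its deadline, while logger has ample slack and can rely on recovery after detection.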