374 research outputs found

    Designing, Building, and Modeling Maneuverable Applications within Shared Computing Resources

    Get PDF
    Extending the military principle of maneuver into war-fighting domain of cyberspace, academic and military researchers have produced many theoretical and strategic works, though few have focused on researching actual applications and systems that apply this principle. We present our research in designing, building and modeling maneuverable applications in order to gain the system advantages of resource provisioning, application optimization, and cybersecurity improvement. We have coined the phrase “Maneuverable Applications” to be defined as distributed and parallel application that take advantage of the modification, relocation, addition or removal of computing resources, giving the perception of movement. Our work with maneuverable applications has been within shared computing resources, such as the Clemson University Palmetto cluster, where multiple users share access and time to a collection of inter-networked computers and servers. In this dissertation, we describe our implementation and analytic modeling of environments and systems to maneuver computational nodes, network capabilities, and security enhancements for overcoming challenges to a cyberspace platform. Specifically we describe our work to create a system to provision a big data computational resource within academic environments. We also present a computing testbed built to allow researchers to study network optimizations of data centers. We discuss our Petri Net model of an adaptable system, which increases its cybersecurity posture in the face of varying levels of threat from malicious actors. Lastly, we present work and investigation into integrating these technologies into a prototype resource manager for maneuverable applications and validating our model using this implementation

    Distributed simulation optimization and parameter exploration framework for the cloud

    Get PDF
    Simulation models are becoming an increasingly popular tool for the analysis and optimization of complex real systems in different fields. Finding an optimal system design requires performing a large sweep over the parameter space in an organized way. Hence, the model optimization process is extremely demanding from a computational point of view, as it requires careful, time-consuming, complex orchestration of coordinated executions. In this paper, we present the design of SOF (Simulation Optimization and exploration Framework in the cloud), a framework which exploits the computing power of a cloud computational environment in order to carry out effective and efficient simulation optimization strategies. SOF offers several attractive features. Firstly, SOF requires “zero configuration” as it does not require any additional software installed on the remote node; only standard Apache Hadoop and SSH access are sufficient. Secondly, SOF is transparent to the user, since the user is totally unaware that the system operates on a distributed environment. Finally, SOF is highly customizable and programmable, since it enables the running of different simulation optimization scenarios using diverse programming languages – provided that the hosting platform supports them – and different simulation toolkits, as developed by the modeler. The tool has been fully developed and is available on a public repository1 under the terms of the open source Apache License. It has been tested and validated on several private platforms, such as a dedicated cluster of workstations, as well as on public platforms, including the Hortonworks Data Platform and Amazon Web Services Elastic MapReduce solution

    MOLNs: A cloud platform for interactive, reproducible and scalable spatial stochastic computational experiments in systems biology using PyURDME

    Full text link
    Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools, a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments

    Modeling performance of Hadoop applications: A journey from queueing networks to stochastic well formed nets

    Get PDF
    Nowadays, many enterprises commit to the extraction of actionable knowledge from huge datasets as part of their core business activities. Applications belong to very different domains such as fraud detection or one-to-one marketing, and encompass business analytics and support to decision making in both private and public sectors. In these scenarios, a central place is held by the MapReduce framework and in particular its open source implementation, Apache Hadoop. In such environments, new challenges arise in the area of jobs performance prediction, with the needs to provide Service Level Agreement guarantees to the enduser and to avoid waste of computational resources. In this paper we provide performance analysis models to estimate MapReduce job execution times in Hadoop clusters governed by the YARN Capacity Scheduler. We propose models of increasing complexity and accuracy, ranging from queueing networks to stochastic well formed nets, able to estimate job performance under a number of scenarios of interest, including also unreliable resources. The accuracy of our models is evaluated by considering the TPC-DS industry benchmark running experiments on Amazon EC2 and the CINECA Italian supercomputing center. The results have shown that the average accuracy we can achieve is in the range 9–14%

    Orchestrated Platform for Cyber-Physical Systems

    Get PDF
    One of the main driving forces in the era of cyber-physical systems (CPSs) is the introduction of massive sensor networks (or nowadays various Internet of things solutions as well) into manufacturing processes, connected cars, precision agriculture, and so on. Therefore, large amounts of sensor data have to be ingested at the server side in order to generate and make the "twin digital model" or virtual factory of the existing physical processes for (among others) predictive simulation and scheduling purposes usable. In this paper, we focus on our ultimate goal, a novel software container-based approach with cloud agnostic orchestration facilities that enable the system operators in the industry to create and manage scalable, virtual IT platforms on-demand for these two typical major pillars of CPS: (1) server-side (i.e., back-end) framework for sensor networks and (2) configurable simulation tool for predicting the behavior of manufacturing systems. The paper discusses the scalability of the applied discrete-event simulation tool and the layered back-end framework starting from simple virtual machine-level to sophisticated multilevel autoscaling use case scenario. The presented achievements and evaluations leverage on (among others) the synergy of the existing EasySim simulator, our new CQueue software container manager, the continuously developed Octopus cloud orchestrator tool, and the latest version of the evolving MiCADO framework for integrating such tools into a unified platform

    Fluid Petri Nets for the Performance Evaluation of MapReduce Applications

    Get PDF
    Big Data applications allow to successfully analyze large amounts of data not necessarily structured, though at the same time they present new challenges. For example, predicting the performance of frameworks such as Hadoop can be a costly task, hence the necessity to provide models that can be a valuable support for designers and developers. This paper provides a new contribution in studying a novel modeling approach based on fluid Petri nets to predict MapReduce jobs execution time. The experiments we performed at CINECA, the Italian supercomputing center, have shown that the achieved accuracy is within 16% of the actual measurements on average
    corecore