    Instrumentation, performance visualization, and debugging tools for multiprocessors

    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessor architectures. However, without effective means to monitor (and visualize) program execution, debugging and tuning parallel programs become intractably difficult as program complexity increases with the number of processors. Research on performance evaluation tools for multiprocessors is being carried out at ARC. Besides investigating new techniques for instrumenting, monitoring, and presenting the state of parallel program execution in a coherent and user-friendly manner, we are incorporating prototypes of software tools into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Our current tool set, the Ames Instrumentation Systems (AIMS), incorporates features from software systems developed in academia and industry. The execution of FORTRAN programs on the Intel iPSC/860 can be automatically instrumented and monitored, and the performance data collected in this manner can be displayed graphically on workstations supporting X-Windows. In collaboration with scientists from the Numerical Aerodynamic Simulation Systems Division, we have used AIMS to compare various parallel algorithms for computational fluid dynamics (CFD) applications. These comparisons show that performance monitors and debuggers such as AIMS are practical and can illuminate the complex dynamics that occur within parallel programs.
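
    The general technique AIMS applies, inserting event hooks around operations of interest and logging timestamped records for post-mortem visualization, can be illustrated with a minimal sketch. Python is used here purely for illustration (AIMS itself instruments FORTRAN programs on the iPSC/860), and the function and event names are hypothetical:

        import time
        import functools

        TRACE = []  # in a real tool the trace is flushed to files for post-mortem visualization

        def instrument(event_name):
            """Wrap a function so each call emits timestamped entry/exit events."""
            def decorator(fn):
                @functools.wraps(fn)
                def wrapper(*args, **kwargs):
                    TRACE.append((time.perf_counter(), event_name, "enter"))
                    try:
                        return fn(*args, **kwargs)
                    finally:
                        TRACE.append((time.perf_counter(), event_name, "exit"))
                return wrapper
            return decorator

        @instrument("solve_step")
        def solve_step(grid):
            # stand-in for one iteration of a CFD solver kernel
            return [x * 0.5 for x in grid]

        solve_step([1.0, 2.0, 3.0])
        for t, name, kind in TRACE:
            print(f"{t:.6f} {name} {kind}")

    A visualization front-end would then consume such enter/exit records to reconstruct per-processor timelines.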

    Automated data reduction workflows for astronomy

    Data from complex modern astronomical instruments often consist of a large number of different science and calibration files, and their reduction requires a variety of software tools. The execution chain of the tools represents a complex workflow that needs to be tuned and supervised, often by individual researchers who are not necessarily experts in any specific instrument. The efficiency of data reduction can be improved by using automatic workflows to organise data and execute the sequence of data reduction steps. To realize such efficiency gains, we designed a system that allows intuitive representation, execution, and modification of the data reduction workflow, and that has facilities for inspection of and interaction with the data. The European Southern Observatory (ESO) has developed Reflex, an environment to automate data reduction workflows. Reflex is implemented as a package of customized components for the Kepler workflow engine, which provides the graphical user interface to create an executable, flowchart-like representation of the data reduction process. Key features of Reflex are a rule-based data organiser, infrastructure to re-use results, thorough book-keeping, data progeny tracking, interactive user interfaces, and a novel concept for exploiting information created during data organisation in the workflow execution. Reflex thus includes novel concepts to increase the efficiency of astronomical data processing. While Reflex is a specific implementation of astronomical scientific workflows within the Kepler workflow engine, the overall design choices and methods can also be applied to other environments for running automated science workflows. Comment: 12 pages, 7 figures
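
    As an illustration of the rule-based data organiser idea, the sketch below classifies files by header keywords and groups them into datasets ready for reduction. This is a minimal sketch under stated assumptions: the rules, keywords, and categories are invented for illustration, not ESO's actual classification rules, and the real Reflex organiser operates on FITS headers inside the Kepler engine:

        # Minimal sketch of a rule-based data organiser, loosely inspired by
        # Reflex's approach: each rule maps header keywords to a file category.
        # The keywords and categories here are illustrative assumptions.

        RULES = [
            (lambda h: h.get("DPR.TYPE") == "BIAS", "calib.bias"),
            (lambda h: h.get("DPR.TYPE") == "FLAT", "calib.flat"),
            (lambda h: h.get("DPR.CATG") == "SCIENCE", "science"),
        ]

        def classify(header):
            """Return the category of the first matching rule, else 'unknown'."""
            for predicate, category in RULES:
                if predicate(header):
                    return category
            return "unknown"

        # Group a batch of (mocked-up) file headers into datasets by category.
        files = [
            {"DPR.TYPE": "BIAS"},
            {"DPR.TYPE": "FLAT"},
            {"DPR.CATG": "SCIENCE"},
        ]
        datasets = {}
        for header in files:
            datasets.setdefault(classify(header), []).append(header)
        print(datasets)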

    Paper Session II-C - The NASA Intelligent Synthesis Environment Advanced Learning Systems Initiative

    The NASA Intelligent Synthesis Environment (ISE) Program vision is to research, develop, acquire, validate, demonstrate, and implement revolutionary engineering and science tools and processes for the design, development, and execution of NASA’s missions in a collaborative and distributed fashion. The Cultural Infusion Element of the ISE Program intends to help bridge the gap between tools, methodologies, and systems by developing training approaches and learning environments that fully utilize the collaborative capabilities produced by ISE. This effort includes both “pushing” the advanced learning systems into the Agency, enabling program and project managers and employees to use the tools and fostering a continual learning organization within NASA, and “pulling” academia forward to an advanced learning systems capability. Creating effective environments for life-long learning by engineering professionals is essential for the long-term viability of the aerospace industry. This paper presents the overarching objectives of the Advanced Learning Systems initiative and describes the concepts NASA ISE has developed and plans to develop in the future.

    ALOJA: A framework for benchmarking and predictive analytics in Hadoop deployments

    This article presents the ALOJA project and its analytics tools, which leverage machine learning to interpret Big Data benchmark performance data and guide tuning. ALOJA is part of a long-term collaboration between BSC and Microsoft to automate the characterization of cost-effectiveness of Big Data deployments, currently focusing on Hadoop. Hadoop presents a complex run-time environment, where costs and performance depend on a large number of configuration choices. The ALOJA project has created an open, vendor-neutral repository featuring over 40,000 Hadoop job executions and their performance details. The repository is accompanied by a test-bed and tools to deploy and evaluate the cost-effectiveness of different hardware configurations, parameters, and Cloud services. Despite early success within ALOJA, a comprehensive study requires automation of the modeling procedures to allow analysis of large and resource-constrained search spaces. The predictive analytics extension, ALOJA-ML, provides an automated system for knowledge discovery by modeling environments from observed executions. The resulting models can forecast execution behaviors, predicting execution times for new configurations and hardware choices; this also enables model-based anomaly detection and efficient benchmark guidance by prioritizing executions. In addition, the community can benefit from the ALOJA data sets and framework to improve the design and deployment of Big Data applications. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 639595). This work is partially supported by the Ministry of Economy of Spain under contracts TIN2012-34557 and 2014SGR1051.
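
    The modeling idea behind ALOJA-ML, learning execution time as a function of configuration parameters from observed runs and then querying the model for configurations that have not been executed, can be sketched with a generic regressor. The feature names and data below are made up for illustration, and the choice of a random forest is an assumption, not necessarily the algorithm ALOJA-ML uses:

        # Hedged sketch: predicting Hadoop job execution time from configuration
        # features, in the spirit of ALOJA-ML. Features and data are invented.
        from sklearn.ensemble import RandomForestRegressor

        # columns: [number of mappers, IO buffer MB, replication factor, compression on/off]
        X = [
            [4,  64, 2, 0],
            [8,  64, 3, 1],
            [8, 128, 2, 1],
            [16, 128, 3, 0],
        ]
        y = [410.0, 305.0, 290.0, 250.0]  # observed execution times in seconds

        model = RandomForestRegressor(n_estimators=100, random_state=0)
        model.fit(X, y)

        # Forecast an unseen configuration before actually running it, e.g. to
        # prioritize which benchmark executions are worth scheduling.
        candidate = [[16, 64, 2, 1]]
        print(f"predicted runtime: {model.predict(candidate)[0]:.0f} s")

    Trained on tens of thousands of runs from the repository, such a model supports both anomaly detection (flagging runs far from their prediction) and search-space pruning.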

    Design and Implementation of a Tracer Driver: Easy and Efficient Dynamic Analyses of Constraint Logic Programs

    Tracers provide users with useful information about program executions. In this article, we propose a “tracer driver”. From a single tracer, it provides a powerful front-end enabling multiple dynamic analysis tools to be implemented easily, while limiting the overhead of trace generation. The relevant execution events are specified by flexible event patterns, and a large variety of trace data can be given either systematically or “on demand”. The proposed tracer driver has been designed in the context of constraint logic programming; experiments have been made within GNU-Prolog. Execution views provided by existing tools have been easily emulated with negligible overhead. Experimental measures show that the flexibility and power of the described architecture lead to good performance: the tracer driver overhead is inversely proportional to the average time between two traced events. Whereas the principles of the tracer driver are independent of the traced programming language, it is best suited for high-level languages, such as constraint logic programming, where each traced execution event encompasses numerous low-level execution steps. Furthermore, constraint logic programming is especially hard to debug, and current environments do not provide all the useful dynamic analysis tools; they can significantly benefit from our tracer driver, which enables dynamic analyses to be integrated at a very low cost. Comment: To appear in Theory and Practice of Logic Programming (TPLP), Cambridge University Press. 30 pages.
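
    The core mechanism, a single tracer to which analysis front-ends subscribe with event patterns so that trace data is only materialized for matching events, can be sketched as follows. Python stands in for the actual GNU-Prolog implementation, and the event attributes and wake-up example are hypothetical stand-ins for real execution ports:

        # Sketch of a tracer driver: analysis front-ends register event patterns,
        # and the driver only forwards events that match, so the overhead stays
        # proportional to the number of events actually traced.

        class TracerDriver:
            def __init__(self):
                self.subscriptions = []  # pairs of (predicate over events, callback)

            def subscribe(self, pattern, callback):
                self.subscriptions.append((pattern, callback))

            def emit(self, event):
                # The cheap common path: no matching pattern, nothing is built.
                for pattern, callback in self.subscriptions:
                    if pattern(event):
                        callback(event)

        driver = TracerDriver()
        # A hypothetical analysis tool interested only in constraint wake-ups:
        driver.subscribe(lambda e: e["port"] == "wake",
                         lambda e: print("woken:", e["constraint"]))

        driver.emit({"port": "call", "constraint": "X #> Y"})   # filtered out
        driver.emit({"port": "wake", "constraint": "X #> Y"})   # delivered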

    Model Exploration Using OpenMOLE - a workflow engine for large scale distributed design of experiments and parameter tuning

    OpenMOLE is a scientific workflow engine with a strong emphasis on workload distribution. Workflows are designed using a high-level Domain Specific Language (DSL) built on top of Scala. It exposes natural parallelism constructs to easily delegate the workload resulting from a workflow to a wide range of distributed computing environments. In this work, we briefly present the main strengths of OpenMOLE and demonstrate its efficiency at exploring the parameter set of an agent simulation model. We perform a multi-objective optimisation on this model using computationally expensive Genetic Algorithms (GA). OpenMOLE hides the complexity of designing such an experiment thanks to its DSL, and transparently distributes the optimisation process. The example shows how an initialisation of the GA with a population of 200,000 individuals can be evaluated in one hour on the European Grid Infrastructure. Comment: IEEE High Performance Computing and Simulation conference 2015, Jun 2015, Amsterdam, Netherlands.
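
    The pattern OpenMOLE automates, evaluating a model over many sampled parameter sets in parallel before handing the results to an optimiser, can be sketched generically. OpenMOLE’s real DSL is Scala and its distribution targets grids and clusters; the local process pool and the toy objective below are assumptions used only to illustrate the shape of the workflow:

        # Local stand-in for OpenMOLE-style workload delegation: evaluate an
        # objective over a sampled parameter population in parallel.
        # The model function is a toy placeholder, not the paper's simulation.
        from concurrent.futures import ProcessPoolExecutor
        import random

        def model(params):
            a, b = params
            return (a - 1.0) ** 2 + (b + 2.0) ** 2  # objective to minimise

        def main():
            population = [(random.uniform(-5, 5), random.uniform(-5, 5))
                          for _ in range(1000)]
            with ProcessPoolExecutor() as pool:
                scores = list(pool.map(model, population))
            best = min(zip(scores, population))
            print("best score:", best[0], "at params:", best[1])

        if __name__ == "__main__":
            main()

    In OpenMOLE the same pattern is expressed once in the DSL, and the engine decides whether each evaluation runs on a local pool, a cluster, or the grid.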

    SMiT: Local System Administration Across Disparate Environments Utilizing the Cloud

    System administration can be tedious. Most IT departments maintain several (if not several hundred) computers, each of which requires periodic housecleaning: updating software, clearing log files, removing old cache files, and so on. Compounding the problem is the computing environment itself: because of the distributed nature of these computers, system administration time is often consumed by repetitive tasks that should be automated. Although system administration tools exist, they are often centralized, unscalable, unintuitive, or inflexible. To meet the needs of system administrators and IT professionals, we developed the Script Management Tool (SMiT). SMiT is a web-based tool that permits administration of distributed computers from virtually anywhere via a common web browser. SMiT consists of a cloud-based server running on Google App Engine that enables users to intuitively create, manage, and deploy administration scripts. To support local execution of scripts, SMiT provides an execution engine that runs on the organization’s local machines and communicates with the server to fetch scripts, execute them, and deliver results back to the server. Because of its distributed, asynchronous architecture, SMiT is scalable to thousands of machines. SMiT is also extensible to a wide variety of system administration tasks via its plugin architecture.
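
    The asynchronous pull architecture described above, in which local engines poll the cloud server for pending scripts, run them, and post results back, can be sketched as follows. The server URL and endpoint paths are hypothetical; SMiT’s actual App Engine API is not reproduced here:

        # Sketch of a SMiT-style local execution engine: poll a cloud server
        # for pending scripts, execute them locally, and report results back.
        # The URL and endpoint names below are hypothetical.
        import json
        import subprocess
        import time
        import urllib.request

        SERVER = "https://example-smit-server.appspot.com"  # hypothetical

        def fetch_pending():
            with urllib.request.urlopen(f"{SERVER}/scripts/pending") as resp:
                return json.load(resp)  # e.g. [{"id": 1, "body": "echo hello"}]

        def report(script_id, output):
            data = json.dumps({"id": script_id, "output": output}).encode()
            req = urllib.request.Request(f"{SERVER}/scripts/result", data=data,
                                         headers={"Content-Type": "application/json"})
            urllib.request.urlopen(req)

        while True:
            for script in fetch_pending():
                result = subprocess.run(["sh", "-c", script["body"]],
                                        capture_output=True, text=True)
                report(script["id"], result.stdout)
            time.sleep(60)  # poll interval; pull-based polling avoids inbound
                            # connections to the managed hosts and scales out

    Because each engine initiates its own connections, adding a machine only requires installing the engine, not reconfiguring the server.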