17 research outputs found

    The NetLogger Methodology for High Performance Distributed Systems Performance Analysis


    Performance of wireless local area networks in Malaysian institutions

    Although Wireless Local Area Networks (WLANs) are widely used in many settings, little work has been done in studying their impact and benefits. This study provides empirical indicators on the performance of the WLANs implemented in Malaysian Institutions of Higher Learning. Our research adopted Deming’s P-D-C-A Model and modified it to Plan-Implement-Control-Evaluate (P-I-C-E) to establish a performance measurement for a WLAN, the WLAN Performance Index (WPi). The measurement consists of four key performance indicators (KPi), reflecting the performance of the four P-I-C-E dimensions. These performance indicators provide a guide for the institutions to take the necessary corrective and preventive actions in attaining an effective WLAN system. Benchmarking was conducted by comparing three institutions to identify the best WLAN system measures. The WPi is then applied to the three Malaysian public institutions of higher learning (MIPTA) that have implemented such a WLAN system. The study reveals the WPi of each MIPTA measured, indicating the strengths and weaknesses of each institution. We also suggest the corrective actions necessary to achieve an effective and efficient WLAN system. A gap analysis compared the findings for the three MIPTAs against the benchmarked WLAN system.

    Precise Request Tracing and Performance Debugging for Multi-tier Services of Black Boxes

    As more and more multi-tier services are developed from commercial components or heterogeneous middleware without the source code available, both developers and administrators need a precise request tracing tool to help understand and debug performance problems of large concurrent services of black boxes. Previous work fails to resolve this issue in one of two ways: it either accepts the imprecision of probabilistic correlation methods, or relies on knowledge of protocols to isolate requests in pursuit of tracing accuracy. This paper introduces a tool named PreciseTracer to help debug performance problems of multi-tier services of black boxes. Our contributions are two-fold: first, we propose a precise request tracing algorithm for multi-tier services of black boxes, which only uses application-independent knowledge; secondly, we present a component activity graph abstraction to represent causal paths of requests and facilitate end-to-end performance debugging. The low overhead and tolerance of noise make PreciseTracer a promising tracing tool for use on production systems.

    The Java Management Extensions (JMX): Is Your Cluster Ready for Evolution?

    The arrival of commodity hardware configurations with performance rivaling that offered by RISC workstations is resulting in important advances in the state of the art of building and running very large scalable clusters at "mass market" pricing levels. However, cluster middleware layers are still considered static infrastructures that are not ready for evolution. In this paper, we claim that middleware layers based on both agent and Java technologies offer new opportunities to support clusters where services can be dynamically added, removed and reconfigured. To support this claim, we present the Java Management Extensions (JMX), a new Java agent-based technology, and its application to implement two disjoint cluster management middleware services (a remote reboot service and a distributed infrastructure for collecting log events) which share a unique agent-based infrastructure.

    Online Event Correlations Analysis in System Logs of Large-Scale Cluster Systems


    Dynamic multi-resource monitoring for predictive job scheduling.

    Standard job schedulers rely on either the user's estimation, or a few approaches that use performance databases to keep information about job runtimes to predict future runs. Co-scheduling for improved resource utilization, however, requires more detailed information about behavior on multiple resources to make predictions about slowdowns. Thus, information about communication, I/O, and computation at the application level is needed but hard for the user to estimate. Furthermore, dynamic adaptive resource allocation requires information about the different processes on different machine nodes. We present an intelligent monitoring tool, ScoPro, which provides such information. To make monitoring more feasible, ScoPro harnesses dynamic instrumentation techniques, which postpone insertion of instrumentation code until the application is executing. To keep intrusion low, we limit monitoring to short test phases. (Abstract shortened by UMI.) Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .L586. Source: Masters Abstracts International, Volume: 44-03, page: 1407. Thesis (M.Sc.)--University of Windsor (Canada), 2005.

    Performance Analysis Tool for HPC and Big Data Applications on Scientific Clusters

    Big data is prevalent in HPC computing. Many HPC projects rely on complex workflows to analyze terabytes or petabytes of data. These workflows often require running over thousands of CPU cores and performing simultaneous data accesses, data movements, and computation. It is challenging to analyze performance when the workflow data or execution measurements amount to terabytes or petabytes and come from complex workflows running over a large number of nodes with multiple parallel task executions. To help identify performance bottlenecks or debug the performance issues in large-scale scientific applications and scientific clusters, we have developed a performance analysis framework, using state-of-the-art open-source big data processing tools. Our tool can ingest system logs and application performance measurements to extract key performance features, and apply the most sophisticated statistical tools and data mining methods on the performance data. It utilizes an efficient data processing engine to allow users to interactively analyze a large amount of different types of logs and measurements. To illustrate the functionality of the big data analysis framework, we conduct case studies on the workflows from an astronomy project known as the Palomar Transient Factory (PTF) and the job logs from the genome analysis scientific cluster.

    STATMOND: A Peer-To-Peer Status And Performance Monitor For Dynamic Resource Allocation On Parallel Computers

    This thesis presents a decentralized tool, STATMOND, to monitor the status of a peer-to-peer network. STATMOND provides an accurate measurement scheme for parameters such as CPU load and memory utilization on Linux clusters. The services of STATMOND are ubiquitous in that each computer measures and forwards its data over the network and also maintains the data of other nodes in memory. The data are periodically updated, and users on any node can 'see' the status and performance of the network based on these parameters. This thesis describes the problems confronting cluster computing, the necessity of monitoring tools, and how STATMOND can be a step towards better allocation of resources for dynamic computing.

    How Are We Doing? A Self-Assessment of the Quality of Services and Systems at NERSC - (Oct. 1, 1997-Dec. 31, 1998)
