4 research outputs found

    Project Final Report: HPC-Colony II

    This report recounts the HPC Colony II Project, a computer science effort funded by DOE's Advanced Scientific Computing Research office. The project included researchers from ORNL, IBM, and the University of Illinois at Urbana-Champaign. The topic of the effort was adaptive system software for extreme-scale parallel machines. A description of the findings is included.

    A GPU-Oriented Algorithm Design for Secant-Based Dimensionality Reduction

    Dimensionality-reduction techniques are a fundamental tool for extracting useful information from high-dimensional data sets. Because secant sets encode manifold geometry, they are a useful tool for designing meaningful data-reduction algorithms. In one such approach, the goal is to construct a projection that maximally avoids secant directions and hence ensures that distinct data points are not mapped too close together in the reduced space. This type of algorithm is based on a mathematical framework inspired by the constructive proof of Whitney's embedding theorem from differential topology. Computing all (unit) secants for a set of points is by nature computationally expensive, thus opening the door for exploitation of GPU architecture for achieving fast versions of these algorithms. We present a polynomial-time data-reduction algorithm that produces a meaningful low-dimensional representation of a data set by iteratively constructing improved projections within the framework described above. Key to our algorithm design and implementation is the use of GPUs which, among other things, minimizes the computational time required for the calculation of all secant lines. One goal of this report is to share ideas with GPU experts and to discuss a class of mathematical algorithms that may be of interest to the broader GPU community.
    Comment: To appear in the 17th IEEE International Symposium on Parallel and Distributed Computing, Geneva, Switzerland 201
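
    As a rough, hedged illustration of the secant-avoidance idea (not the authors' implementation), the sketch below computes all unit secants of a point cloud with NumPy and iteratively nudges an orthonormal projection so that the most-collapsed secant is better preserved; the function names, step size, and update rule are illustrative assumptions.

        import numpy as np

        def unit_secants(X):
            """All unit secant vectors between distinct rows of X (n x d).
            This O(n^2) pairwise step is the kernel a GPU version would accelerate."""
            i, j = np.triu_indices(X.shape[0], k=1)
            S = X[i] - X[j]
            return S / np.linalg.norm(S, axis=1, keepdims=True)

        def secant_avoiding_projection(X, k, iters=200, step=0.1):
            """Iteratively adjust a d x k orthonormal projection so that the
            shortest projected secant (the worst-preserved pair) is lengthened."""
            S = unit_secants(X)
            P, _ = np.linalg.qr(np.random.randn(X.shape[1], k))  # random orthonormal start
            for _ in range(iters):
                lengths = np.linalg.norm(S @ P, axis=1)
                s = S[np.argmin(lengths)]            # most-collapsed secant direction
                P += step * np.outer(s, s @ P)       # push P toward preserving it
                P, _ = np.linalg.qr(P)               # re-orthonormalize
            return P

        # usage (illustrative): Y = X @ secant_avoiding_projection(X, k=2)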

    Anomaly Detection in Host Signaling Pathways for the Early Prognosis of Acute Infection

    Clinical diagnosis of acute infectious diseases during the early stages of infection is critical to administering the appropriate treatment and improving the disease outcome. We present a data-driven analysis of the human cellular response to respiratory viruses, including influenza, respiratory syncytial virus, and human rhinovirus, and compare it with the response to the bacterial endotoxin lipopolysaccharide (LPS). Using an anomaly detection framework, we identified pathways that clearly distinguish between asymptomatic and symptomatic patients infected with the four different respiratory viruses and that accurately diagnosed patients exposed to a bacterial infection. Connectivity pathway analysis comparing the viral and bacterial diagnostic signatures identified host cellular pathways that were unique to patients exposed to LPS endotoxin, indicating that this type of analysis could be used to identify host biomarkers that differentiate clinical etiologies of acute infection. We applied the Multivariate State Estimation Technique (MSET) to two human influenza (H1N1 and H3N2) gene expression data sets to define host networks perturbed in the asymptomatic phase of infection. Our analysis identified pathways in the respiratory virus diagnostic signature as prognostic biomarkers that triggered prior to clinical presentation of acute symptoms. These early-warning pathways correctly predicted that almost half of the subjects would become symptomatic in less than 40 hours post-infection and that three of the 18 subjects would become symptomatic after only 8 hours. These results provide a proof of concept for the utility of anomaly detection algorithms in classifying host pathway signatures that can identify presymptomatic signatures of acute disease and differentiate between etiologies of infection.

    On a global scale, acute respiratory infections cause a significant proportion of human co-morbidities and account for 4.25 million deaths annually. The development of clinical diagnostic tools to distinguish between acute viral and bacterial respiratory infections is critical to improving patient care and limiting the overuse of antibiotics in the medical community. The identification of prognostic respiratory virus biomarkers provides an early warning system capable of predicting which subjects will become symptomatic, expanding our medical diagnostic capabilities and treatment options for acute infectious diseases. The host response to acute infection may be viewed as a deterministic signaling network responsible for maintaining the health of the host organism. We identify pathway signatures that reflect the very earliest perturbations in the host response to acute infection. These pathways provide a way to monitor the health state of the host, using anomaly detection to quantify and predict health outcomes to pathogens.
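
    As a hedged sketch of the MSET flavour described here (not the authors' actual pipeline), one could store healthy-state expression profiles as columns of a memory matrix, reconstruct each new sample from that memory through a nonlinear similarity operator, and treat large reconstruction residuals as anomalies; the Gaussian kernel, class names, and scoring strategy below are illustrative assumptions.

        import numpy as np

        def similarity(A, B, gamma=1.0):
            """Nonlinear similarity operator between columns of A (d x m) and B (d x n);
            a Gaussian kernel is used here purely for illustration."""
            d2 = ((A[:, :, None] - B[:, None, :]) ** 2).sum(axis=0)
            return np.exp(-gamma * d2)

        class MsetSketch:
            """Toy MSET-style estimator: learn a memory matrix of healthy observations,
            reconstruct new samples from it, and score the residual as an anomaly measure."""
            def __init__(self, gamma=1.0):
                self.gamma = gamma

            def fit(self, D):
                self.D = D                                   # d x m memory matrix (healthy states)
                self.G_inv = np.linalg.pinv(similarity(D, D, self.gamma))
                return self

            def score(self, X):
                W = self.G_inv @ similarity(self.D, X, self.gamma)  # m x n mixing weights
                X_hat = self.D @ W                                  # reconstruction from memory
                return np.linalg.norm(X - X_hat, axis=0)            # residual per sample

        # usage (illustrative): residuals well above those seen on healthy data
        # would flag a perturbed, possibly presymptomatic, pathway state.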

    A proactive fault tolerance framework for high performance computing (HPC) systems in the cloud

    High Performance Computing (HPC) systems have been widely used by scientists and researchers in both industry and university laboratories to solve advanced computation problems. Most advanced computation problems are either data-intensive or computation-intensive and may take hours, days or even weeks to complete. For example, some traditional HPC computations run on 100,000 processors for weeks. Consequently, traditional HPC systems often require huge capital investments, and scientists and researchers sometimes have to wait in long queues to access these shared, expensive systems. Cloud computing, on the other hand, offers new computing paradigms, capacity, and flexible solutions for both business and HPC applications. Some of the computation-intensive applications that are usually executed on traditional HPC systems can now be executed in the cloud, and the cloud pricing model eliminates the need for huge capital investments. However, even for cloud-based HPC systems, fault tolerance remains an issue of growing concern. The large number of virtual machines and electronic components, as well as software complexity and overall system reliability, availability and serviceability (RAS), are factors with which HPC systems in the cloud must contend. The reactive fault tolerance approach of checkpoint/restart, which is commonly used in HPC systems, does not scale well in the cloud due to resource sharing and distributed system networks. Hence, the need for reliable, fault-tolerant HPC systems is even greater in a cloud environment. In this thesis we present a proactive fault tolerance approach for HPC systems in the cloud that reduces wall-clock execution time, as well as dollar cost, in the presence of hardware failure. We have developed a generic fault tolerance algorithm for HPC systems in the cloud, and a cost model for executing computation-intensive applications on HPC systems in the cloud. Our experimental results, obtained from a real cloud execution environment, show that the wall-clock execution time and cost of running computation-intensive applications in the cloud can be considerably reduced compared to the checkpoint and redundancy techniques used in traditional HPC systems.
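
    As a minimal, hypothetical sketch of the proactive idea (not the thesis's actual algorithm or cost model), the code below monitors a per-node failure score, migrates work off nodes predicted to fail before the failure occurs, and prices a run by VM-hours; every name, threshold, and the health model itself are assumptions for illustration only.

        import random

        FAILURE_THRESHOLD = 0.7   # illustrative probability above which we migrate

        def predicted_failure_probability(node):
            """Stand-in for a health-monitoring model (temperature, ECC errors,
            hypervisor logs, ...); here it is just a placeholder score."""
            return node.get("failure_score", random.random())

        def proactive_step(nodes, spare_nodes, tasks):
            """One monitoring cycle: move tasks off nodes predicted to fail,
            instead of recovering them afterwards via checkpoint/restart."""
            for node in list(nodes):
                if predicted_failure_probability(node) >= FAILURE_THRESHOLD and spare_nodes:
                    spare = spare_nodes.pop()
                    for task in tasks:
                        if task["node"] is node:
                            task["node"] = spare   # "migrate" the workload
                    nodes.remove(node)
                    nodes.append(spare)

        def run_cost(wall_clock_hours, price_per_vm_hour, n_vms):
            """Simple illustrative cost model: dollar cost scales with VM-hours, so
            avoiding failure-induced reruns reduces both time and cost."""
            return wall_clock_hours * price_per_vm_hour * n_vms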