Large sparse least squares computations
Orthogonal Givens factorization is a popular method for solving large sparse least squares problems. Row and column permutations of the data matrix are necessary to preserve sparsity and reduce the computational effort during factorization. The computation of a solution is usually divided into a symbolic ordering phase and a numerical factorization and solution phase. Some theoretical results on row ordering are obtained using a graph-theoretic representation. These results provide a basis for a symbolic Givens factorization. Column orderings are also discussed, and an efficient algorithm for the symbolic ordering phase is developed. Sometimes, due to sparsity considerations, it is advantageous to leave some rows out of the factorization and update the solution with them afterwards. A method for updating the solution with additional rows or constraints is extended to rank-deficient problems. Finally, the application of sparse matrix methods to large unbalanced analysis-of-variance problems is discussed. Some of the developed algorithms are programmed and tested.
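As a rough illustration of the row-oriented Givens factorization idea, the following Python sketch triangularizes the augmented matrix [A | b] with 2-by-2 rotations and back-substitutes for the least squares solution. It is a dense toy example that ignores the sparsity-preserving row and column orderings and the symbolic phase discussed above; the function names are illustrative, not taken from the thesis.

import numpy as np

def givens(a, b):
    # Return (c, s) so that [[c, s], [-s, c]] applied to (a, b) yields (r, 0).
    r = np.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r

def givens_least_squares(A, b):
    # Solve min ||Ax - b|| by rotating rows of the augmented matrix [A | b]
    # into upper triangular form one entry at a time, then back-substituting.
    # Dense illustration only; no sparsity-aware ordering is attempted.
    m, n = A.shape
    R = np.hstack([A.astype(float), b.reshape(-1, 1).astype(float)])
    for j in range(n):
        for i in range(j + 1, m):
            if R[i, j] != 0.0:
                c, s = givens(R[j, j], R[i, j])
                rot = np.array([[c, s], [-s, c]])
                R[[j, i], j:] = rot @ R[[j, i], j:]
    # Back substitution on the leading n-by-n upper triangle.
    return np.linalg.solve(np.triu(R[:n, :n]), R[:n, n])

# Example: an overdetermined 4x2 system.
A = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, 0.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
x = givens_least_squares(A, b)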
Uncertainty Analysis of a Heavily Instrumented Building at Different Scales of Simulation
Simulation plays a big role in understanding the behavior of building envelopes. With the increasing availability of computational resources, it is feasible to conduct parametric simulations for applications such as software model calibration, building control optimization, or fault detection and diagnostics. In this paper, we present an uncertainty exploration of a building envelope's thermal conductivity properties for a heavily instrumented residential building involving more than 200 sensors. A total of 156 input parameters were determined to be important by experts and were then varied using a Markov Order process. Depending on the number of simulations in an ensemble, the techniques required to make sense of the information can be very different, and potentially challenging. This paper discusses the strategies one could employ when the number of simulations in an ensemble ranges from a few to tens of thousands. The paper highlights this and the associated computational challenge in the context of ensemble simulations where the chosen sampling process allows one to generate datasets ranging from just a few simulations to an exponentially large, intractable dataset with data in the hundreds of terabytes. Besides the computational and data management challenges, the paper also presents meaningful visualization approaches that are candidates for extreme-scale analysis. The method of analysis almost always depends on the experimental design. While Markov Ordering for sampling is presented explicitly, the paper also touches upon various other experimental design strategies and their resulting analysis methods in the context of scientific simulations. We expect the sampling and ensemble analysis at various scales to help us gain insight into unique issues of building energy modeling, especially at different scales of simulation. We also expect the analytic approaches employed for understanding the thermal properties of building envelopes to be beneficial for software calibration and building design. We demonstrate these in the context of a real-world, heavily instrumented building.
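As a hedged illustration of how such a parametric ensemble might be generated, the sketch below draws a Latin hypercube design with SciPy rather than the paper's Markov Order process; the parameter names and bounds are hypothetical stand-ins for the 156 expert-selected inputs.

import numpy as np
from scipy.stats import qmc

# Hypothetical bounds for a handful of envelope conductivity parameters
# (the paper varies 156 expert-selected inputs; only three are shown here).
names = ["wall_conductivity", "roof_conductivity", "window_u_value"]
lower = np.array([0.03, 0.02, 1.0])
upper = np.array([0.10, 0.08, 3.0])

def build_ensemble(n_runs, seed=0):
    # Draw a space-filling Latin hypercube design over the parameter box
    # and return one dictionary of simulation inputs per ensemble member.
    sampler = qmc.LatinHypercube(d=len(names), seed=seed)
    unit = sampler.random(n=n_runs)          # samples in the unit hypercube
    scaled = qmc.scale(unit, lower, upper)   # map to physical ranges
    return [dict(zip(names, row)) for row in scaled]

# A small ensemble; the same design scales from a few runs to thousands.
ensemble = build_ensemble(n_runs=8)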
A Spatially Correlated Competing Risks Time-to-Event Model for Supercomputer GPU Failure Data
Graphics processing units (GPUs) are widely used in many high-performance computing (HPC) applications such as imaging/video processing and training deep-learning models in artificial intelligence. GPUs installed in HPC systems are often heavily used, and GPU failures occur during HPC system operations. Thus, the reliability of GPUs is of interest for the overall reliability of HPC systems. The Cray XK7 Titan supercomputer was one of the top ten supercomputers in the world. The failure event times of more than 30,000 GPUs in Titan were recorded, and previous data analysis suggested that the failure time of a GPU may be affected by the GPU's connectivity location inside the supercomputer, among other factors. In this paper, we conduct in-depth statistical modeling of GPU failure times to study the effect of location on GPU failures under competing risks with covariates and spatially correlated random effects. In particular, two major failure types of GPUs in Titan are considered. The connectivity locations of cabinets are modeled as spatially correlated random effects, and the positions of GPUs inside each cabinet are treated as covariates. A Bayesian framework is used for statistical inference. We also compare different methods of estimation, such as maximum likelihood, which is implemented via an expectation-maximization algorithm. Our results provide interesting insights into GPU failures in HPC systems.
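To make the competing-risks structure concrete, the following minimal Python sketch fits a two-cause model with exponential cause-specific hazards and covariates by maximum likelihood. It omits the spatially correlated random effects and the Bayesian inference used in the paper, and all names are illustrative assumptions.

import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, t, cause, X):
    # Negative log-likelihood of a two-cause competing-risks model with
    # exponential cause-specific hazards lambda_k(x) = exp(a_k + x @ b_k).
    # cause is 0 for censored, 1 or 2 for the observed failure type.
    p = params.reshape(2, -1)                # one row per failure type
    log_lam = p[:, 0] + X @ p[:, 1:].T       # (n, 2) log-hazards
    lam = np.exp(log_lam)
    total = lam.sum(axis=1)                  # overall hazard per unit
    ll = -total * t                          # survival contribution
    for k in (1, 2):
        obs = cause == k
        ll[obs] += log_lam[obs, k - 1]       # event contribution
    return -ll.sum()

def fit(t, cause, X):
    # Maximum-likelihood fit; the paper's model adds spatially correlated
    # random effects and a Bayesian treatment on top of this structure.
    x0 = np.zeros(2 * (1 + X.shape[1]))
    res = minimize(neg_log_lik, x0, args=(t, cause, X), method="BFGS")
    return res.x.reshape(2, -1)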
Contrasting Climate Ensembles: A Model-based Visualization Approach for Analyzing Extreme Events
The use of increasingly sophisticated means to simulate and observe natural phenomena has led to the production of larger and more complex data. As the size and complexity of this data increase, the task of data analysis becomes more challenging. Determining complex relationships among variables requires new algorithm development. Addressing the challenge of handling large data necessitates that algorithm implementations target high-performance computing platforms. In this work we present a technique that allows a user to study the interactions among multiple variables in the same spatial extents as the underlying data. The technique is implemented in an existing parallel analysis and visualization framework so that it is applicable to the largest datasets. The foundation of our approach is to classify data points via inclusion in, or distance to, multivariate representations of relationships among a subset of the variables of a dataset. We abstract the space in which inclusion is calculated and, through various space transformations, alleviate the necessity to consider variables' scales and distributions when making comparisons. We apply this approach to the problem of highlighting variations in climate model ensembles.
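A minimal serial sketch of the classification idea, assuming z-scoring as the space transformation and Mahalanobis distance as the inclusion/distance criterion; the paper's parallel framework implementation and its particular transformations are not reproduced here, and the function name and threshold are illustrative.

import numpy as np

def classify_by_distance(data, reference, threshold=3.0):
    # Classify each row of `data` by its Mahalanobis distance to a
    # multivariate relationship summarized from the `reference` rows.
    # Both arrays hold the same subset of variables, one column each;
    # z-scoring removes the need to reason about individual scales.
    mu = reference.mean(axis=0)
    sd = reference.std(axis=0)
    ref_z = (reference - mu) / sd
    dat_z = (data - mu) / sd
    cov_inv = np.linalg.pinv(np.cov(ref_z, rowvar=False))
    centered = dat_z - ref_z.mean(axis=0)
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    dist = np.sqrt(d2)
    inside = dist <= threshold               # inclusion label vs. raw distance
    return inside, dist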
VACET: Proposed SciDAC2 Visualization and Analytics Center for Enabling Technologies
This paper accompanies a poster that is being presented at the SciDAC 2006 meeting in Denver, CO. This project focuses on leveraging scientific visualization and analytics software technology as an enabling technology for increasing scientific productivity and insight. Advances in computational technology have resulted in an "information big bang," which in turn has created a significant data understanding challenge. This challenge is widely acknowledged to be one of the primary bottlenecks in contemporary science. The vision for our Center is to respond directly to that challenge by adapting, extending, creating when necessary, and deploying visualization and data understanding technologies for our science stakeholders. Using an organizational model as a Visualization and Analytics Center for Enabling Technologies (VACET), we are well positioned to be responsive to the needs of a diverse set of scientific stakeholders in a coordinated fashion using a range of visualization, mathematics, statistics, computer and computational science, and data management technologies.
Seeing the Unseeable
The SciDAC Visualization and Analytics Center for Enabling Technologies (VACET) is a highly productive effort combining the forces of leading visualization researchers from five different institutions to solve some of the most challenging data understanding problems in modern science. The VACET technology portfolio is diverse, spanning all typical visual data analysis use models and effectively balancing forward-looking research with focused software architecture and engineering, resulting in a production-quality software infrastructure. One of the key elements in VACET's success is a rich set of projects that are collaborations with science stakeholders: these efforts focus on identifying and overcoming obstacles to scientific knowledge discovery in modern, large, and complex scientific datasets.
Occam's Razor and Petascale Visual Data Analysis
One of the central challenges facing visualization research is how to effectively enable knowledge discovery. An effective approach will likely combine application architectures that are capable of running on today's largest platforms to address the challenges posed by large data with visual data analysis techniques that help find, represent, and effectively convey scientifically interesting features and phenomena.