237 research outputs found

    Efficient Intra-Rack Resource Disaggregation for HPC Using Co-Packaged DWDM Photonics

    The diversity of workload requirements and increasing hardware heterogeneity in emerging high performance computing (HPC) systems motivate resource disaggregation. Resource disaggregation allows compute and memory resources to be allocated individually as required to each workload. However, it is unclear how to efficiently realize this capability and cost-effectively meet the stringent bandwidth and latency requirements of HPC applications. To that end, we describe how modern photonics can be co-designed with modern HPC racks to implement flexible intra-rack resource disaggregation and fully meet the bit error rate (BER) and high escape bandwidth of all chip types in modern HPC racks. Our photonic-based disaggregated rack provides an average application speedup of 11% (46% maximum) for 25 CPU and 61% for 24 GPU benchmarks compared to a similar system that instead uses modern electronic switches for disaggregation. Using observed resource usage from a production system, we estimate that an iso-performance intra-rack disaggregated HPC system using photonics would require 4x fewer memory modules and 2x fewer NICs than a non-disaggregated baseline. Comment: 15 pages, 12 figures, 4 tables. Published in IEEE Cluster 202

    A Framework for Uplink Intercell Interference Modeling with Channel-Based Scheduling

    This paper presents a novel framework for modeling the uplink intercell interference (ICI) in a multiuser cellular network. The proposed framework assists in quantifying the impact of various fading channel models and state-of-the-art scheduling schemes on the uplink ICI. Firstly, we derive a semianalytical expression for the distribution of the location of the scheduled user in a given cell considering a wide range of scheduling schemes. Based on this, we derive the distribution and moment generating function (MGF) of the uplink ICI considering a single interfering cell. Consequently, we determine the MGF of the cumulative ICI observed from all interfering cells and derive explicit MGF expressions for three typical fading models. Finally, we utilize the obtained expressions to evaluate important network performance metrics such as the outage probability, ergodic capacity, and average fairness numerically. Monte-Carlo simulation results are provided to demonstrate the efficacy of the derived analytical expressions. Comment: IEEE Transactions on Wireless Communications, 2013. arXiv admin note: substantial text overlap with arXiv:1206.229
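
    The step from the single-cell MGF to the cumulative MGF can be illustrated with a standard identity. This is only a sketch under an assumption the abstract does not state: that the interference terms from different cells are mutually independent. The symbols $I_k$ (ICI from cell $k$), $K$ (number of interfering cells), and $I$ (cumulative ICI) are introduced here for illustration and are not taken from the paper.

```latex
% Assumption: I_1, ..., I_K are mutually independent (not stated in the abstract).
\[
  I = \sum_{k=1}^{K} I_k ,
  \qquad
  M_{I}(t) = \mathbb{E}\!\left[ e^{tI} \right]
           = \prod_{k=1}^{K} \mathbb{E}\!\left[ e^{t I_k} \right]
           = \prod_{k=1}^{K} M_{I_k}(t) .
\]
```

    Under that assumption, a closed-form single-cell MGF for a given fading model yields the cumulative MGF directly, without convolving the individual distributions.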

    Advanced flight computer. Special study

    This report documents a special study to define a 32-bit radiation hardened, SEU tolerant flight computer architecture, and to investigate current or near-term technologies and development efforts that contribute to the Advanced Flight Computer (AFC) design and development. An AFC processing node architecture is defined. Each node may consist of a multi-chip processor as needed. The modular, building block approach uses VLSI technology and packaging methods that demonstrate a feasible AFC module in 1998 that meets the AFC goals. The defined architecture and approach demonstrate a clear low-risk, low-cost path to the 1998 production goal, with intermediate prototypes in 1996.

    Thermal, Elastic and Seismic Signature of High-Resolution Mantle Circulation Models

    A long-standing question in the study of Earth’s deep interior is the origin of seismic mantle heterogeneity. The challenge is to efficiently mine the wealth of information available in complex seismic waveforms and to separate the potential contributions of thermal anomalies and compositional variations. High expectations to gain new insight currently lie within the application of high-performance computing to geophysical problems. Modern supercomputers allow, for example, the simulation of global mantle flow at Earth-like convective vigor or seismic wave propagation through complex three-dimensional structures. The sophisticated computational tools incorporate a variety of physical phenomena and result in synthetic datasets that show a complexity comparable to real observations. However, it is so far not clear how to combine the results from the various disciplines in a consistent manner to obtain a better understanding of deep Earth structure from the expensive large-scale numerical simulations. In particular, it is important to understand how to build conceptual models of Earth’s mantle based on geodynamic considerations that can be quantitatively assessed and used to test specific hypotheses. One specific goal is to generate seismic heterogeneity from dynamic flow calculations that can be used in global wave propagation simulations so that synthetic seismograms can be directly compared to seismic data without the need to perform inversions. In the multi-disciplinary study presented here, a new method is developed to theoretically predict and assess seismic mantle heterogeneity. Forward modeling of global mantle flow is combined with information from mineral physics and seismology. Temperatures inside the mantle are obtained by generating a new class of mantle circulation models at very high numerical resolution. 
The global average grid spacing of ~25 km (around 80 million finite elements) allows flow to be simulated at Rayleigh numbers on the order of 10^9 and a thermal boundary layer about 100 km thick to be resolved. To assess the predicted present day temperature fields, the geodynamic flow calculations are post-processed with published thermodynamically self-consistent models of mantle mineralogy for a pyrolite composition to convert thermal structure into elastic parameters. Quantitative predictions of the magnitudes of seismic velocity and density variations are thereby possible due to the appropriately high numerical resolution necessary to obtain temperature variations that are consistent with the mineralogical conversion. The resulting structures are compared to tomographic models based on a variety of statistical measures taking into account the limited resolving power of the seismic data. In a final step, the geodynamic models are investigated with respect to the influence of strong convective mass transport on the stability of Earth’s rotation axis. This additional and independent analysis provides information on whether strongly bottom heated isochemical mantle circulation can be reconciled with paleomagnetic estimates of true polar wander. One specific question that can be addressed with this approach is the origin of two large regions of strongly reduced seismic velocities in the lowermost mantle. Several seismological observations are interpreted as being caused by compositional variations. However, a large number of recent geodynamical, mineralogical and also seismological studies argue for a strong thermal gradient across the core-mantle boundary that might provide an alternative explanation through the resulting large temperature variations. 
Here, the forward modeling approach is used to test whether the presence of a strong thermal gradient in isochemical whole mantle flow is compatible with a variety of geophysical observations. The results show that the temperature variations deduced from the new high-resolution mantle circulation models are capable of explaining gross statistical features of mantle structure mapped by tomography. The main finding is that models with strong core heating, which also give a surface heat flux consistent with observations, yield realistic depth profiles of root-mean-square (RMS) variations of shear wave velocity. Most importantly, only models with a large core contribution to the mantle energy budget are compatible with the strong negative seismic anomalies in the large low velocity provinces of the lower mantle. Taking into account the effects of limited resolving power of seismic data on the magnitudes of predicted seismic heterogeneity further improves this match to tomographic models. This illustrates that seismic heterogeneity is likely dominated by thermal variations and thus limits the possible role of chemical heterogeneity in the lower mantle. Altogether, the results strengthen the notion of strongly bottom heated isochemical whole mantle flow with a pyrolite composition. Furthermore, these findings give confidence in the consistency of the presented approach and demonstrate the great potential of geophysical large-scale high-performance simulations and their application to seismic data and tomographic models.
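
The RMS depth profile mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the thesis's code: the function name `rms_profile`, the array layout (one row per depth layer, one column per surface point), and the optional area `weights` are all assumptions made for the example.

```python
import numpy as np


def rms_profile(dlnvs, weights=None):
    """Return the RMS of shear-velocity perturbations for each depth layer.

    dlnvs   : array of shape (n_depth, n_points); hypothetical layout in which
              each row holds relative perturbations (dVs/Vs) at one depth.
    weights : optional array of shape (n_points,), e.g. surface-area weights
              for a non-uniform grid; equal weights are assumed if omitted.
    """
    dlnvs = np.asarray(dlnvs, dtype=float)
    if weights is None:
        weights = np.ones(dlnvs.shape[1])
    weights = np.asarray(weights, dtype=float)
    # Weighted mean of the squared perturbation within each layer.
    mean_sq = (dlnvs ** 2 * weights).sum(axis=1) / weights.sum()
    return np.sqrt(mean_sq)


if __name__ == "__main__":
    # Two made-up layers with two points each, for illustration only.
    layers = [[0.01, -0.01], [0.03, -0.04]]
    print(rms_profile(layers))
```

Comparing such a profile from a geodynamic model with the same quantity computed from a tomographic model gives one of the statistical measures the abstract refers to, although the abstract's comparison also accounts for the limited resolving power of the seismic data, which this sketch does not.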

    Doctor of Philosophy

    Scene labeling is the problem of assigning an object label to each pixel of a given image. It is the primary step towards image understanding and unifies object recognition and image segmentation in a single framework. A perfect scene labeling framework detects and densely labels every region and every object that exists in an image. This task is of substantial importance in a wide range of applications in computer vision. Contextual information plays an important role in scene labeling frameworks. A contextual model utilizes the relationships among the objects in a scene to facilitate object detection and image segmentation. Using contextual information in an effective way is one of the main questions that should be answered in any scene labeling framework. In this dissertation, we develop two scene labeling frameworks that rely heavily on contextual information to improve the performance over state-of-the-art methods. The first model, called the multiclass multiscale contextual model (MCMS), uses contextual information from multiple objects and at different scales for learning discriminative models in a supervised setting. The MCMS model incorporates crossobject and interobject information into one probabilistic framework, and thus is able to capture geometrical relationships and dependencies among multiple objects in addition to local information from each single object present in an image. The second model, called the contextual hierarchical model (CHM), learns contextual information in a hierarchy for scene labeling. At each level of the hierarchy, a classifier is trained based on downsampled input images and outputs of previous levels. The CHM then incorporates the resulting multiresolution contextual information into a classifier to segment the input image at original resolution. This training strategy allows for optimization of a joint posterior probability at multiple resolutions through the hierarchy. 
We demonstrate the performance of CHM on different challenging tasks such as outdoor scene labeling and edge detection in natural images and membrane detection in electron microscopy images. We also introduce two novel classification methods. WNS-AdaBoost speeds up the training of AdaBoost by providing a compact representation of a training set. Disjunctive normal random forest (DNRF) is an ensemble method that is able to learn complex decision boundaries and achieves low generalization error by optimizing a single objective function for each weak classifier in the ensemble. Finally, a segmentation framework is introduced that exploits both shape information and regional statistics to segment irregularly shaped intracellular structures such as mitochondria in electron microscopy images.

    A Parallel Processor System for Nuclear Shell-Model Calculations

    This thesis describes the design and implementation of a dedicated parallel processor system for nuclear shell-model calculations. The purpose of these calculations is to determine nuclear energy eigenvalues by the tridiagonalisation of the nuclear Hamiltonian matrix using the Lanczos method. The Theoretical Nuclear Structure group at Glasgow University's Physics Department would normally perform this type of calculation on a high-performance main-frame computer. However, these machines have limitations which restrict the number and scope of the calculations that can be performed. The Shell Model Processor system consists of a Multiple Microprocessor Unit (MMPU) driven by a highly pipelined dedicated front-end processor. The MMPU has a modular, moderately coupled, MIMD architecture based on autonomous processing modules. The elements within the system communicate via three shared buses. The front-end is responsible for determining the position of non-zero elements within the Hamiltonian matrix. Once the position of an element has been found it is passed to one of the free processing modules within the MMPU. The processing module then determines the value of the matrix element and performs the appropriate arithmetic to accumulate the resultant Lanczos vector. Two such processing modules have been developed. The most recently developed module is based on two MC68000 16/32 bit microprocessors. In addition there are two supervisory processor modules, one of which controls the front-end and also assists it in its function. The other module has privileged system capabilities and is responsible for supervising the system as a whole. The system has been successfully tested and performance figures are presented. The future expansion of the system to allow it to perform larger calculations is also discussed.
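
The Lanczos tridiagonalisation named in this abstract can be sketched in software as follows. This is a generic textbook form in NumPy, not the thesis's hardware implementation: the names `lanczos` and `tridiagonal`, the `matvec` callback (standing in for the matrix-vector accumulation that the processing modules perform), and the use of full reorthogonalisation are assumptions made for the example.

```python
import numpy as np


def lanczos(matvec, n, m, seed=0):
    """Run up to m Lanczos steps on a symmetric n x n operator.

    matvec : hypothetical callback returning H @ v for a vector v.
    Returns (alpha, beta): the diagonal and off-diagonal of the tridiagonal
    matrix T, with len(beta) == len(alpha) - 1.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    basis = [v]
    alpha, beta = [], []
    for j in range(m):
        w = matvec(basis[j])
        a = float(np.dot(basis[j], w))
        alpha.append(a)
        # Three-term recurrence: remove components along v_j and v_{j-1}.
        w = w - a * basis[j]
        if j > 0:
            w = w - beta[j - 1] * basis[j - 1]
        # Full reorthogonalisation (an assumption here) for numerical stability.
        for u in basis:
            w = w - np.dot(u, w) * u
        b = float(np.linalg.norm(w))
        if j == m - 1 or b < 1e-12:
            break  # last requested step, or an invariant subspace was found
        beta.append(b)
        basis.append(w / b)
    return np.array(alpha), np.array(beta)


def tridiagonal(alpha, beta):
    """Assemble the symmetric tridiagonal matrix T from alpha and beta."""
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)


if __name__ == "__main__":
    # Made-up symmetric test matrix standing in for a Hamiltonian.
    rng = np.random.default_rng(1)
    a = rng.standard_normal((50, 50))
    h = a + a.T
    alpha, beta = lanczos(lambda x: h @ x, n=50, m=20)
    print("lowest Ritz value:", np.linalg.eigvalsh(tridiagonal(alpha, beta))[0])
    print("lowest eigenvalue:", np.linalg.eigvalsh(h)[0])
```

Only matrix-vector products with the Hamiltonian are needed, which is why the work divides naturally between locating non-zero elements and accumulating their contributions to the next Lanczos vector, as the abstract describes.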

    Post-transcriptional homeostasis and regulation of MCM2–7 in mammalian cells

    The MiniChromosome Maintenance 2-7 (MCM2-7) complex provides essential replicative helicase function. Insufficient MCMs impair the cell cycle and cause genomic instability (GIN), leading to cancer and developmental defects in mice. Remarkably, depletion or mutation of one Mcm can decrease all Mcm levels. Here, we use mice and cells bearing a GIN-causing hypomorphic allele of Mcm4 (Chaos3), in conjunction with disruption alleles of other Mcms, to reveal two new mechanisms that regulate MCM protein levels and pre-RC formation. First, the Mcm4Chaos3 allele, which disrupts MCM4:MCM6 interaction, triggers a Dicer1 and Drosha-dependent ∼40% reduction in Mcm2–7 mRNAs. The decreases in Mcm mRNAs coincide with up-regulation of the miR-34 family of microRNAs, which is known to be Trp53-regulated and target Mcms. Second, MCM3 acts as a negative regulator of the MCM2–7 helicase in vivo by complexing with MCM5 in a manner dependent upon a nuclear-export signal-like domain, blocking the recruitment of MCMs onto chromatin. Therefore, the stoichiometry of MCM components and their localization is controlled post-transcriptionally at both the mRNA and protein levels. Alterations to these pathways cause significant defects in cell growth reflected by disease phenotypes in mice.