Alloy Informatics through Ab Initio Charge Density Profiles: Case Study of Hydrogen Effects in Face-Centered Cubic Crystals
Materials design has traditionally evolved through trial-and-error approaches,
mainly due to the non-local relationship between microstructures and properties
such as strength and toughness. We propose 'alloy informatics' as a
machine-learning-based prototype predictive approach for alloys and compounds, using
electron charge density profiles derived from first-principles calculations. We
demonstrate this framework in the case of hydrogen interstitials in
face-centered cubic crystals, showing that their differential electron charge
density profiles capture crystal properties and defect-crystal interaction
properties. Radial Distribution Functions (RDFs) of defect-induced differential
charge density perturbations highlight the resulting screening effect, and,
together with hydrogen Bader charges, strongly correlate with a large set of
atomic properties of the metal species forming the bulk crystal. We observe the
spontaneous emergence of classes of charge responses while coarse-graining over
crystal compositions. Nudged Elastic Band calculations show that RDFs and charge
features also connect to hydrogen migration energy barriers between
interstitial sites. Unsupervised machine-learning on RDFs supports
classification, unveiling compositional and configurational non-localities in
the similarities of the perturbed densities. Electron charge density
perturbations may be considered as bias-free descriptors for a large variety of
defects
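The abstract does not give the exact definition of its radial distribution function, so the following is only a minimal sketch of one plausible reading: a spherical average of a differential charge density (defective crystal minus reference) around the defect site. The function name `radial_profile`, the cubic grid with equal spacing, and the neglect of periodic images are all assumptions made for illustration.

```python
import numpy as np

def radial_profile(delta_rho, center, spacing, r_max, n_bins):
    """Spherically average a differential charge density around a point.

    delta_rho : 3D array, rho(crystal + defect) - rho(reference) on a grid
    center    : (x, y, z) of the defect, in the same length unit as spacing
    spacing   : grid spacing, assumed equal along all three axes
    r_max     : outer radius of the last shell
    n_bins    : number of radial shells
    """
    nx, ny, nz = delta_rho.shape
    # Cartesian coordinates of every grid point.
    x, y, z = np.meshgrid(
        np.arange(nx) * spacing,
        np.arange(ny) * spacing,
        np.arange(nz) * spacing,
        indexing="ij",
    )
    # Distance of each grid point from the defect.
    # Assumption: periodic images are ignored to keep the sketch short.
    r = np.sqrt((x - center[0]) ** 2 + (y - center[1]) ** 2 + (z - center[2]) ** 2)

    edges = np.linspace(0.0, r_max, n_bins + 1)
    # Sum of delta_rho and number of grid points falling in each shell.
    sums, _ = np.histogram(r, bins=edges, weights=delta_rho)
    counts, _ = np.histogram(r, bins=edges)
    # Mean value per shell; shells without grid points become NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        profile = sums / counts
    shell_centers = 0.5 * (edges[:-1] + edges[1:])
    return shell_centers, profile
```

Profiles computed this way for different host metals could then be stacked into a feature matrix for clustering, which is the kind of unsupervised comparison the abstract describes.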
Generalized empirical Bayesian methods for discovery of differential data in high-throughput biology
Motivation:
High-throughput data are now commonplace in biological research. Rapidly changing technologies and applications mean that novel methods for detecting differential behaviour that account for a ‘large P, small n’ setting are required at an increasing rate. The development of such methods is, in general, being done on an ad hoc basis, which requires further development cycles and leads to a lack of standardization between analyses.
Results:
We present here a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. This approach is based on our baySeq algorithm for identification of differential expression in RNA-seq data based on a negative binomial distribution, and in paired data based on a beta-binomial distribution. Here we show how the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. Comparisons with existing methods developed to address specific problems in high-throughput biological data show that these generic methods can achieve equivalent or better performance. A number of enhancements to the basic algorithm are also presented to increase flexibility and reduce computational costs.
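To make the idea of applying one empirical Bayesian recipe to any parametric distribution concrete, here is a minimal sketch. It is not the baySeq implementation. It assumes a Poisson likelihood (chosen only for brevity), a prior represented by a list of rates that would in practice be sampled from the data, and two competing models: one shared rate for both groups versus a separate rate per group. All function names and the `prior_de` parameter are hypothetical.

```python
import math
import numpy as np

def poisson_loglik(counts, rate):
    """Log-likelihood of integer counts under a Poisson with the given rate."""
    return sum(k * math.log(rate) - rate - math.lgamma(k + 1) for k in counts)

def log_marginal(counts, prior_rates):
    """Log of the likelihood averaged over rates drawn from an empirical prior."""
    logs = np.array([poisson_loglik(counts, lam) for lam in prior_rates])
    m = logs.max()  # subtract the maximum for numerical stability
    return m + math.log(np.mean(np.exp(logs - m)))

def posterior_differential(group_a, group_b, prior_rates, prior_de=0.5):
    """Posterior probability that the two groups have different rates."""
    # Model 0: both groups share a single rate.
    log_null = log_marginal(list(group_a) + list(group_b), prior_rates)
    # Model 1: each group has its own, independently drawn rate.
    log_alt = log_marginal(group_a, prior_rates) + log_marginal(group_b, prior_rates)
    # Bayes' rule over the two models, computed in log space.
    log_num = math.log(prior_de) + log_alt
    log_den = np.logaddexp(log_num, math.log(1.0 - prior_de) + log_null)
    return math.exp(log_num - log_den)
```

Swapping `poisson_loglik` for another log-likelihood (negative binomial, beta-binomial, and so on) leaves the rest of the sketch unchanged, which is the sense in which such an approach is generic.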
Availability and implementation:
The methods are implemented in the R baySeq (v2) package, available on Bioconductor http://www.bioconductor.org/packages/release/bioc/html/baySeq.html.
Supplementary information:
Supplementary data are available at Bioinformatics online. This work was supported by European Research Council Advanced Investigator Grant ERC-2013-AdG 340642 – TRIBE. This is the author accepted manuscript. The final version is available from Oxford University Press via http://dx.doi.org/10.1093/bioinformatics/btv56
Improving CC-NUMA performance using instruction-based prediction
We propose Instruction-based Prediction as a means to optimize directory-based cache coherent NUMA shared-memory. Instruction-based prediction is based on observing the behavior of load and store instructions in relation to coherent events and predicting their future behavior. Although this technique is well established in the uniprocessor world, it has not been widely applied for optimizing transparent shared-memory. Typically, in this environment, prediction is based on datablock access history (address-based prediction) in the form of adaptive cache coherence protocols. The advantage of instruction-based prediction is that it requires few hardware resources in the form of small prediction structures per node to match (or exceed) the performance of address-based prediction. To show the potential of instruction-based prediction w
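The general shape of an instruction-indexed predictor can be sketched as follows. This is an illustrative assumption, not the design evaluated in the paper: a small table of 2-bit saturating counters, indexed by the address of the load or store instruction, that learns whether that instruction tends to be followed by some coherence event. The class name, the table size, and the meaning of "event" are all hypothetical.

```python
class InstructionPredictor:
    """Per-node table of 2-bit saturating counters indexed by instruction address."""

    def __init__(self, entries=256):
        self.entries = entries
        # Counter values 0..3; start at 1, i.e. weakly predicting "no event".
        self.counters = [1] * entries

    def _index(self, pc):
        # Simple modulo hash of the instruction address (an assumption).
        return pc % self.entries

    def predict(self, pc):
        """Return True if the instruction at pc is predicted to cause the event."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, event_occurred):
        """Train the counter with the observed outcome for this instruction."""
        i = self._index(pc)
        if event_occurred:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
```

Because the table is indexed by instruction rather than by data block, its size depends on the number of static loads and stores, not on the amount of shared data, which is consistent with the abstract's claim of small prediction structures per node.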
Identification And Optimization Of Sharing Patterns For Scalable Shared-Memory Multiprocessors
Distributed shared-memory architectures typically employ a directory-based protocol to maintain cache coherence. Identifying sharing patterns in parallel programs and applying specialized optimizations can increase cache-coherence protocol efficiency and yield performance improvements. In this thesis, I propose and study both optimizations to sharing patterns and techniques to identify sharing patterns. The main thrust of the thesis is GLOW, a comprehensive optimization for wide sharing---a sharing pattern that is a serious obstacle to scalability to large numbers of processors. I present GLOW in the form of extensions to the SCI ANSI/IEEE standard. GLOW is implemented in special network switches and incorporates characteristics that are not found together in previous proposals: scalable writes and scalable reads, network locality (by exploiting the abundance of widely-shared data to satisfy requests locally), simplicity, transparency to the base protocol, and network topology indepen..
IPStash: A power-efficient memory architecture for IP-lookup
High-speed routers often use commodity, fully-associative
Kiloprocessor Extensions to SCI
To expand the Scalable Coherent Interface's (SCI) capabilities so it can be used to efficiently handle sharing in systems of hundreds or even thousands of processors, the SCI working group is developing the Kiloprocessor Extensions to SCI. In this paper we describe the proposed GLOW and STEM kiloprocessor extensions to SCI. These two sets of extensions provide SCI with scalable reads and scalable writes to widely-shared data. This kind of datum represents one of the main obstacles to scalability for many cache coherence protocols. The GLOW extensions are intended for systems with complex networks of interconnected SCI rings (e.g., large networks of workstations). GLOW extensions are based on building k-ary sharing trees that map well to the underlying topology. In contrast, STEM is intended for systems where GLOW is not applicable (e.g., topologies based on centralized switches). STEM defines algorithms to build and maintain binary sharing trees. We show that latencies of GLOW reads a..
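Why sharing trees help with wide sharing can be illustrated with a small sketch. It only shows how the depth of a k-ary tree grows with the number of sharers. It does not model how GLOW maps trees onto the network topology or how STEM builds and maintains its binary trees, and the function names and heap-style parent rule are assumptions made for illustration.

```python
def build_kary_tree(sharers, k):
    """Arrange distinct sharer ids into a k-ary tree.

    Returns a dict mapping each node to its list of children. The first
    sharer is the root; node i's parent is node (i - 1) // k, as in a heap.
    """
    tree = {node: [] for node in sharers}
    for i, node in enumerate(sharers):
        if i > 0:
            parent = sharers[(i - 1) // k]
            tree[parent].append(node)
    return tree

def tree_depth(tree, root):
    """Number of edges on the longest path from root to a leaf."""
    children = tree[root]
    if not children:
        return 0
    return 1 + max(tree_depth(tree, child) for child in children)
```

With 1000 sharers, a binary tree built this way is 9 levels deep and a 10-ary tree is 3 levels deep, so a message fanned out through the tree reaches every sharer in a number of steps that grows logarithmically with the sharer count.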