Toward Linearizability Testing for Multi-Word Persistent Synchronization Primitives
Persistent memory makes it possible to recover in-memory data structures following a failure instead of rebuilding them from state saved in slow secondary storage. Implementing such recoverable data structures correctly is challenging as their underlying algorithms must deal with both parallelism and failures, which makes them especially susceptible to programming errors. Traditional proofs of correctness should therefore be combined with other methods, such as model checking or software testing, to minimize the likelihood of uncaught defects. This research focuses specifically on the algorithmic principles of software testing, particularly linearizability analysis, for multi-word persistent synchronization primitives such as conditional swap operations. We describe an efficient decision procedure for linearizability in this context, and discuss its practical applications in detecting previously unknown bugs in implementations of multi-word persistent primitives.
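As a rough illustration of what a linearizability check decides, the sketch below brute-forces every ordering of a small history of multi-word compare-and-swap calls. It is not the efficient decision procedure the abstract refers to, and it ignores crashes and persistence entirely; the `Op` record, the function names, and the dictionary-based memory model are all hypothetical choices made for this example.

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    """One completed call in a concurrent history (hypothetical record layout)."""
    invoke: int    # time the call started
    respond: int   # time the call returned
    words: tuple   # ((address, expected, new), ...)
    result: bool   # success flag the call reported


def apply_mwcas(state, words):
    """Sequential specification: swap all words iff every expected value matches."""
    if all(state[addr] == expected for addr, expected, _ in words):
        new_state = dict(state)
        for addr, _, new in words:
            new_state[addr] = new
        return new_state, True
    return state, False


def respects_real_time(order):
    """An operation that returned before another started must be ordered first."""
    for i, later in enumerate(order):
        for earlier in order[:i]:
            if later.respond < earlier.invoke:
                return False
    return True


def is_linearizable(history, initial):
    """Exhaustive search; exponential in the history length, so small inputs only."""
    for order in permutations(history):
        if not respects_real_time(order):
            continue
        state = initial
        consistent = True
        for op in order:
            state, result = apply_mwcas(state, op.words)
            if result != op.result:
                consistent = False
                break
        if consistent:
            return True
    return False


initial = {0: 0, 1: 0}
a = Op(invoke=0, respond=3, words=((0, 0, 1), (1, 0, 1)), result=True)
b_ok = Op(invoke=1, respond=2, words=((0, 0, 2),), result=False)
b_bad = Op(invoke=1, respond=2, words=((0, 0, 2),), result=True)

print(is_linearizable([a, b_ok], initial))
print(is_linearizable([a, b_bad], initial))
```

Both calls expect address 0 to hold 0, so at most one of two overlapping calls can succeed. A history in which both report success has no valid sequential ordering, and that is the kind of defect such a test is meant to expose.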
A topological approach for protein classification
A protein's function and dynamics are closely related to its sequence and
structure. However, predicting protein function and dynamics from sequence
and structure remains a fundamental challenge in molecular biology.
Protein classification, which is typically done by measuring the
similarity between proteins based on protein sequence or physical
information, serves as a crucial step toward the understanding of protein
function and dynamics. Persistent homology is a new branch of algebraic
topology that has found success in topological data analysis in a
variety of disciplines, including molecular biology. The present work explores
the potential of using persistent homology as an independent tool for protein
classification. To this end, we propose a molecular topological fingerprint
based support vector machine (MTF-SVM) classifier. Specifically, we construct
machine learning feature vectors solely from protein topological fingerprints,
which are topological invariants generated during the filtration process. To
validate the present MTF-SVM approach, we consider four types of problems.
First, we study protein-drug binding by using the M2 channel protein of
influenza A virus. We achieve 96% accuracy in discriminating drug bound and
unbound M2 channels. Additionally, we examine the use of MTF-SVM for the
classification of hemoglobin molecules in their relaxed and taut forms and
obtain about 80% accuracy. The identification of all alpha, all beta, and
alpha-beta protein domains is carried out in our next study using 900 proteins.
We achieve an 85% success rate in this identification. Finally, we apply the
present technique to 55 classification tasks of protein superfamilies over 1357
samples. An average accuracy of 82% is attained. The present study establishes
computational topology as an independent and effective alternative for protein
classification.
GEANT4 : a simulation toolkit
Geant4 is a toolkit for simulating the passage of particles through matter. It includes a complete range of functionality including tracking, geometry, physics models and hits. The physics processes offered cover a comprehensive range, including electromagnetic, hadronic and optical processes, a large set of long-lived particles, materials and elements, over a wide energy range starting, in some cases, from 250 eV and extending in others to the TeV energy range. It has been designed and constructed to expose the physics models utilised, to handle complex geometries, and to enable its easy adaptation for optimal use in different sets of applications. The toolkit is the result of a worldwide collaboration of physicists and software engineers. It has been created exploiting software engineering and object-oriented technology and implemented in the C++ programming language. It has been used in applications in particle physics, nuclear physics, accelerator design, space engineering and medical physics. PACS: 07.05.Tp; 13; 2
A Framework to Synergize Partial Order Reduction with State Interpolation
We address the problem of reasoning about interleavings in safety
verification of concurrent programs. In the literature, there are two prominent
techniques for pruning the search space. First, there are well-investigated
trace-based methods, collectively known as "Partial Order Reduction (POR)",
which operate by weakening the concept of a trace by abstracting the total
order of its transitions into a partial order. Second, there is state-based
interpolation where a collection of formulas can be generalized by taking into
account the property to be verified. Our main contribution is a framework that
synergistically combines POR with state interpolation so that the sum is more
than its parts.
Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening
This work introduces a number of algebraic topology approaches, such as
multicomponent persistent homology, multi-level persistent homology and
electrostatic persistence for the representation, characterization, and
description of small molecules and biomolecular complexes. Multicomponent
persistent homology retains critical chemical and biological information during
the topological simplification of biomolecular geometric complexity.
Multi-level persistent homology enables a tailored topological description of
inter- and/or intra-molecular interactions of interest. Electrostatic
persistence incorporates partial charge information into topological
invariants. These topological methods are paired with Wasserstein distance to
characterize similarities between molecules and are further integrated with a
variety of machine learning algorithms, including k-nearest neighbors, ensemble
of trees, and deep convolutional neural networks, to manifest their descriptive
and predictive powers for chemical and biological problems. Extensive numerical
experiments involving more than 4,000 protein-ligand complexes from the PDBBind
database and nearly 100,000 ligands and decoys in the DUD database are performed
to test, respectively, the scoring power and the virtual screening power of the
proposed topological approaches. It is demonstrated that the present approaches
outperform modern machine learning based methods in protein-ligand binding
affinity predictions and ligand-decoy discrimination.
TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions
Although deep learning approaches have had tremendous success in image, video
and audio processing, computer vision, and speech recognition, their
applications to three-dimensional (3D) biomolecular structural data sets have
been hindered by the entangled geometric complexity and biological complexity.
We introduce topology, i.e., element specific persistent homology (ESPH), to
untangle geometric complexity and biological complexity. ESPH represents 3D
complex geometry by one-dimensional (1D) topological invariants and retains
crucial biological information via a multichannel image representation. It is
able to reveal hidden structure-function relationships in biomolecules. We
further integrate ESPH and convolutional neural networks to construct a
multichannel topological neural network (TopologyNet) for the predictions of
protein-ligand binding affinities and protein stability changes upon mutation.
To overcome the limitations to deep learning arising from small and noisy
training sets, we present a multitask topological convolutional neural network
(MT-TCNN). We demonstrate that the present TopologyNet architectures outperform
other state-of-the-art methods in the predictions of protein-ligand binding
affinities, globular protein mutation impacts, and membrane protein mutation
impacts.
Comment: 20 pages, 8 figures, 5 tables
Statistical topological data analysis using persistence landscapes
We define a new topological summary for data that we call the persistence
landscape. Since this summary lies in a vector space, it is easy to combine
with tools from statistics and machine learning, in contrast to the standard
topological summaries. Viewed as a random variable with values in a Banach
space, this summary obeys a strong law of large numbers and a central limit
theorem. We show how a number of standard statistical tests can be used for
statistical inference using this summary. We also prove that this summary is
stable and that it can be used to provide lower bounds for the bottleneck and
Wasserstein distances.
Comment: 26 pages, final version, to appear in Journal of Machine Learning
Research, includes two additional examples not in the journal version: random
geometric complexes and Erdos-Renyi random clique complexes
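The abstract does not spell out how a persistence landscape is computed, so the following is a minimal sketch assuming the usual tent-function construction: each birth-death pair (b, d) contributes the function max(0, min(t - b, d - t)), and the k-th landscape function at t is the k-th largest of those values. The function names and the sample diagram are illustrative assumptions, not taken from the paper.

```python
def tent(birth, death, t):
    """Height at t of the tent function over one birth-death pair."""
    return max(0.0, min(t - birth, death - t))


def landscape(diagram, k, t):
    """k-th landscape function (k = 1, 2, ...) at t: k-th largest tent value."""
    values = sorted((tent(b, d, t) for b, d in diagram), reverse=True)
    return values[k - 1] if k <= len(values) else 0.0


def mean_landscape(diagrams, k, t):
    """Pointwise average over several diagrams (hypothetical helper)."""
    return sum(landscape(dgm, k, t) for dgm in diagrams) / len(diagrams)


# Hypothetical persistence diagram: two nested intervals.
diagram = [(0.0, 4.0), (1.0, 3.0)]

print(landscape(diagram, 1, 2.0))
print(landscape(diagram, 2, 2.0))
print(mean_landscape([diagram, [(0.0, 2.0)]], 1, 1.0))
```

Because each landscape is an ordinary real-valued function, averaging several of them is a pointwise operation, which is the vector-space property the abstract contrasts with standard topological summaries.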