SPRINT: Ultrafast protein-protein interaction prediction of the entire human interactome
Proteins usually perform their functions by interacting with other proteins.
Predicting which proteins interact is a fundamental problem. Experimental
methods are slow, expensive, and have a high rate of error. Many computational
methods have been proposed among which sequence-based ones are very promising.
However, so far no such method is able to predict effectively the entire human
interactome: they require too much time or memory. We present SPRINT (Scoring
PRotein INTeractions), a new sequence-based algorithm and tool for predicting
protein-protein interactions. We comprehensively compare SPRINT with
state-of-the-art programs on seven of the most reliable human PPI datasets and show
that it is more accurate while running orders of magnitude faster and using
very little memory. SPRINT is the only program that can predict the entire
human interactome. Our goal is to transform the very challenging problem of
predicting the entire human interactome into a routine task. The source code of
SPRINT is freely available from github.com/lucian-ilie/SPRINT/, and the datasets
and predicted PPIs are available from www.csd.uwo.ca/faculty/ilie/SPRINT/.
Efficient Antihydrogen Detection in Antimatter Physics by Deep Learning
Antihydrogen is at the forefront of antimatter research at the CERN
Antiproton Decelerator. Experiments aiming to test the fundamental CPT symmetry
and antigravity effects require the efficient detection of antihydrogen
annihilation events, which is performed using highly granular tracking
detectors installed around an antimatter trap. Improving the efficiency of the
antihydrogen annihilation detection plays a central role in the final
sensitivity of the experiments. We propose deep learning as a novel technique
to analyze antihydrogen annihilation data, and compare its performance with a
traditional track and vertex reconstruction method. We report that the deep
learning approach yields significant improvement, tripling event coverage while
simultaneously improving performance by over 5% in terms of Area Under Curve
(AUC).
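For readers unfamiliar with the metric, the following is a minimal sketch of how AUC can be computed from classifier scores, using its interpretation as the probability that a randomly chosen positive event scores higher than a randomly chosen negative one. The function name `auc` and the toy scores are illustrative assumptions; nothing here comes from the paper's analysis code or data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: classifier outputs, higher means "more likely signal".
    labels: 1 for signal (e.g. annihilation), 0 for background.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two signal and two background events.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = auc(toy_scores, toy_labels)
```

The pairwise loop is quadratic in the number of events; it is meant to show the definition, and a rank-based computation would be the usual choice for large samples.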
Efficient seeding techniques for protein similarity search
We apply the concept of subset seeds proposed in [1] to similarity search in
protein sequences. The main question studied is the design of efficient seed
alphabets to construct seeds with optimal sensitivity/selectivity trade-offs.
We propose several different design methods and use them to construct several
alphabets. We then perform an analysis of seeds built over those alphabets and
compare them with the standard Blastp seeding method [2,3], as well as with the
family of vector seeds proposed in [4]. While the formalism of subset seeds is
less expressive (but less costly to implement) than the accumulative principle
used in Blastp and vector seeds, our seeds show a similar or even better
performance than Blastp on Bernoulli models of proteins compatible with the
common BLOSUM62 matrix.
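To make the idea of a subset seed concrete, here is a minimal sketch under simplifying assumptions: each seed letter is modelled as a grouping of the 20 amino acids, and two residues match at a position when they fall in the same group for that letter. The letters `#`, `@`, `_` and the residue groups below are hypothetical illustrations, not the alphabets designed in the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_letter(groups):
    """Build a residue -> group-index table for one seed letter.

    Residues not listed in any group each get a group of their own,
    so they match only themselves.
    """
    table = {}
    for idx, group in enumerate(groups):
        for aa in group:
            table[aa] = idx
    next_idx = len(groups)
    for aa in AMINO_ACIDS:
        if aa not in table:
            table[aa] = next_idx
            next_idx += 1
    return table


# Hypothetical three-letter seed alphabet (assumption, for illustration only).
LETTERS = {
    "#": make_letter([]),                                        # exact match
    "@": make_letter(["ILMV", "FWY", "KR", "DE", "ST", "NQ"]),   # grouped match
    "_": make_letter([AMINO_ACIDS]),                             # any pair matches
}


def seed_hits(seed, a, b):
    """True if words a and b match under the seed at every position."""
    if len(a) != len(seed) or len(b) != len(seed):
        raise ValueError("words must have the same length as the seed")
    return all(LETTERS[s][x] == LETTERS[s][y] for s, x, y in zip(seed, a, b))


def find_hits(seed, query, target):
    """Naive scan returning all (i, j) where the seed hits query[i:] vs target[j:]."""
    k = len(seed)
    return [
        (i, j)
        for i in range(len(query) - k + 1)
        for j in range(len(target) - k + 1)
        if seed_hits(seed, query[i:i + k], target[j:j + k])
    ]
```

Because each letter here is an equivalence relation on residues, a word can be reduced to a single key per seed and looked up in a hash table, which is one way to read the "less costly to implement" remark; the quadratic scan in `find_hits` is kept only for clarity.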