Springer - Publisher Connector

Predicting specificity in bZIP coiled-coil protein interactions

Author: Fong Jessica H.
Keating Amy E.
Singh Mona
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/01/2016

We present a method for predicting protein-protein interactions mediated by the coiled-coil motif. When tested on interactions between nearly all human and yeast bZIP proteins, our method identifies 70% of strong interactions while 92% of its predictions are correct. Furthermore, cross-validation testing shows that including the bZIP experimental data significantly improves performance. Our method can be used to predict bZIP interactions in other genomes and is a promising approach for predicting coiled-coil interactions more generally.

DSpace@MIT

An Approximate L^p Difference Algorithm for Massive Data Streams

Author: Jessica H. Fong
Martin Strauss
Publication venue: Discrete Mathematics & Theoretical Computer Science
Publication date: 01/01/2001

Several recent papers have shown how to approximate the difference ∑_i|a_i-b_i| or ∑_i|a_i-b_i|^2 between two functions, when the function values a_i and b_i are given in a data stream and their order is chosen by an adversary. These algorithms use little space (much less than would be needed to store the entire stream) and little time to process each item in the stream. They approximate with small relative error. Using different techniques, we show how to approximate the L^p-difference ∑_i|a_i-b_i|^p for any rational-valued p∈(0,2], with comparable efficiency and error. We also show how to approximate ∑_i|a_i-b_i|^p for larger values of p, but with a worse error guarantee. Our results fill in gaps left by recent work by providing an algorithm that is precisely tunable for the application at hand. These results can be used to assess the difference between two chronologically or physically separated massive data sets, making one quick pass over each data set, without buffering the data or requiring the data source to pause. For example, one can use our techniques to judge whether the traffic on two remote network routers is similar without requiring either router to transmit a copy of its traffic. A web search engine could use such algorithms to construct a library of small "sketches," one for each distinct page on the web; one can approximate the extent to which new web pages duplicate old ones by comparing the sketches of the web pages. Such techniques will become increasingly important as the enormous scale, distributional nature, and one-pass processing requirements of data sets become more commonplace.
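To make the idea of comparing small sketches concrete, here is a minimal illustration for the p = 2 case only. It uses the well-known random-sign ("tug-of-war") sketch from the earlier line of work the abstract refers to, not the paper's own technique for general rational p. The class name `L2Sketch`, the helper `sign`, and the use of a cryptographic hash as a stand-in for a limited-independence hash family are all assumptions made for this example.

```python
import hashlib


def sign(seed, row, index):
    """Pseudo-random +1/-1 for (row, index); the hash is an assumed stand-in
    for the limited-independence hash family used in the literature."""
    digest = hashlib.blake2b(f"{seed}:{row}:{index}".encode(), digest_size=1).digest()
    return 1 if digest[0] & 1 else -1


class L2Sketch:
    """Keeps `rows` counters; each is a random-sign linear combination of the stream."""

    def __init__(self, rows=64, seed=0):
        self.rows = rows
        self.seed = seed
        self.counters = [0] * rows

    def update(self, index, value):
        # One pass, constant memory: fold the item (index, value) into every counter.
        for r in range(self.rows):
            self.counters[r] += sign(self.seed, r, index) * value


def estimate_l2_squared_difference(sk_a, sk_b):
    """Estimate sum_i |a_i - b_i|^2 from two sketches built with the same seed."""
    if (sk_a.rows, sk_a.seed) != (sk_b.rows, sk_b.seed):
        raise ValueError("sketches must share rows and seed to be comparable")
    diffs = [x - y for x, y in zip(sk_a.counters, sk_b.counters)]
    return sum(d * d for d in diffs) / len(diffs)


# Two streams with the same items in different order, differing only at index 7.
stream_a = [(1, 5), (7, 4), (2, 1)]
stream_b = [(2, 1), (7, 1), (1, 5)]

sketch_a, sketch_b = L2Sketch(), L2Sketch()
for i, v in stream_a:
    sketch_a.update(i, v)
for i, v in stream_b:
    sketch_b.update(i, v)

estimate = estimate_l2_squared_difference(sketch_a, sketch_b)
```

Because the sketch is linear, item order does not matter, and only the fixed-size counter arrays need to be exchanged between the two sites. In this toy case the streams differ at a single index, so the estimate is exact; in general it is only an approximation whose accuracy improves with the number of rows.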

Better Alternatives to OSPF Routing

Author: Fong Jessica H.
Gilbert Anna C.
Kannan Sampath
Strauss Martin J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2005

The current standard for intra-domain network routing, Open Shortest Path First (OSPF), suffers from a number of problems: the tunable parameters (the weights) are hard to optimize, the chosen paths are not robust under changes in traffic or network state, and some network links are over-used at the expense of others. We present prototypical scenarios that illustrate these problems. Then we propose several variants of a protocol to eliminate or alleviate them and demonstrate the improvements in performance under those scenarios. We also prove that these protocols never perform significantly worse than OSPF and show that for at least a limited class of network topologies, it is possible to find efficiently the optimal weight settings. Some of the problems with OSPF are well known; indeed, there are several routing protocols that perform better than OSPF in routing quality (i.e., in terms of congestion, delay, etc.). OSPF's popularity persists in part because of its efficiency with respect to several resource bounds. In contrast, many competing protocols that provide routing superior to OSPF are computationally prohibitive. Motivated by this consideration, we designed our protocols not only to achieve better routing quality than OSPF, but also to use resources in amounts comparable with OSPF with respect to offline broadcast communication, size of and time to compute routing tables, packet delivery latency, and packet header structure and size.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/41349/1/453_2005_Article_1161.pd
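The over-use problem named in the abstract can be illustrated with a small sketch of weight-based shortest-path routing. It is not one of the paper's scenarios or protocols: the topology, weights, and demands below are hypothetical, and the code routes each demand on a single minimum-weight path (real OSPF deployments may also split traffic across equal-cost paths).

```python
import heapq
from collections import defaultdict


def shortest_path(graph, src, dst):
    """Dijkstra over link weights; returns one minimum-weight path as a node list.
    Assumes dst is reachable from src."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def link_loads(graph, demands):
    """Send each demand along its shortest path and total the traffic per link."""
    loads = defaultdict(int)
    for (src, dst), volume in demands.items():
        path = shortest_path(graph, src, dst)
        for u, v in zip(path, path[1:]):
            loads[(u, v)] += volume
    return dict(loads)


# Hypothetical topology: two routes from C to F, via D (total weight 2) or via E (total weight 4).
GRAPH = {
    "A": {"C": 1},
    "B": {"C": 1},
    "C": {"D": 1, "E": 2},
    "D": {"F": 1},
    "E": {"F": 2},
    "F": {},
}
DEMANDS = {("A", "F"): 10, ("B", "F"): 10}  # hypothetical traffic volumes

LOADS = link_loads(GRAPH, DEMANDS)
```

With these weights both demands take the route through D, so the links C-D and D-F carry all 20 units while the route through E stays idle. Changing the weights moves the traffic wholesale rather than balancing it, which hints at why weight tuning is hard.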

Deep Blue Documents at the University of Michigan

ComSin: database of protein structures in bound (complex) and unbound (single) states in relation to their intrinsic disorder

Author: Anna R. Panchenko
Benjamin A. Shoemaker
Jessica H. Fong
Michail Yu. Lobanov
Oxana V. Galzitskaya
Sergiy O. Garbuzynskiy
Publication venue: Oxford University Press

Most of the proteins in a cell assemble into complexes to carry out their function. In this work, we have created a new database (named ComSin) of protein structures in bound (complex) and unbound (single) states to provide researchers with exhaustive information on structures of the same or homologous proteins in bound and unbound states. From the complete Protein Data Bank (PDB), we selected 24 910 pairs of protein structures in bound and unbound states, and identified regions of intrinsic disorder. For 2448 pairs, the proteins in bound and unbound states are identical, while 7129 pairs have sequence identity of 90% or higher. The server enables one to search for proteins in bound and unbound states with several options, including sequence similarity between the corresponding proteins in bound and unbound states, and validation of interaction interfaces of protein complexes. In addition, through our web server, one can obtain the information needed to study disorder-to-order and order-to-disorder transitions upon complex formation, and analyze structural differences between proteins in bound and unbound states. The database is available at http://antares.protres.ru/comsin/

Inferred Biomolecular Interaction Server—a web server to analyze and predict protein interacting partners and binding sites

Author: Anna R. Panchenko
Aron Marchler-Bauer
Benjamin A. Shoemaker
Dachuan Zhang
Jessica H. Fong
Manoj Tyagi
Ratna R. Thangudu
Stephen H. Bryant
Thomas Madej
Publication venue: Oxford University Press

IBIS is the NCBI Inferred Biomolecular Interaction Server. This server organizes, analyzes and predicts interaction partners and locations of binding sites in proteins. IBIS provides annotations for different types of binding partners (protein, chemical, nucleic acid and peptides), and facilitates the mapping of a comprehensive biomolecular interaction network for a given protein query. IBIS reports interactions observed in experimentally determined structural complexes of a given protein, and at the same time IBIS infers binding sites/interacting partners by inspecting protein complexes formed by homologous proteins. Similar binding sites are clustered together based on their sequence and structure conservation. To emphasize biologically relevant binding sites, several algorithms are used for verification in terms of evolutionary conservation, biological importance of binding partners, size and stability of interfaces, as well as evidence from the published literature. IBIS is updated regularly and is freely accessible via http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.html

Public Library of Science (PLOS)

Whole-Genome Sequencing of a Single Proband Together with Linkage Analysis Identifies a Mendelian Disease Gene

Author: Curtis E. Gumbs
David B. Goldstein
David Valle
Dimitrios Avramopoulos
Dongliang Ge
Elizabeth T. Cirulli
Elizabeth Wohler
Eric L. Stevens
George Thomas
Gregory S. Barsh
Gretchen L. Oswald
Jason P. Smith
Jessica M. Maia
Jonathan Pevsner
Julie E. Hoover-Fong
Kevin V. Shianna
Nara L. M. Sobreira
Publication venue: Public Library of Science
Publication date: 01/01/2010

Although more than 2,400 genes have been shown to contain variants that cause Mendelian disease, there are still several thousand such diseases yet to be molecularly defined. The ability of new whole-genome sequencing technologies to rapidly identify most of the genetic variants in any given genome opens an exciting opportunity to identify these disease genes. Here we sequenced the whole genome of a single patient with the dominant Mendelian disease metachondromatosis (OMIM 156250), and used partial linkage data from her small family to focus our search for the responsible variant. In the proband, we identified an 11 bp deletion in exon 4 of PTPN11, which alters the reading frame, results in premature translation termination, and co-segregates with the phenotype. In a second metachondromatosis family, we confirmed our result by identifying a nonsense mutation in exon 4 of PTPN11 that also co-segregates with the phenotype. Sequencing PTPN11 exon 4 in 469 controls showed no such protein-truncating variants, supporting the pathogenicity of these two mutations. This combination of a new technology and a classical genetic approach provides a powerful strategy to discover the genes responsible for unexplained Mendelian disorders.

CiteSeerX

Infoscience - École polytechnique fédérale de Lausanne

DukeSpace

The Characterization of Twenty Sequenced Human Genomes

We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop-loss variants in genes representing a diverse set of pathways.

Public Library of Science (PLOS)