45 research outputs found
The High Throughput Sequence Annotation Service (HT-SAS) - the shortcut from sequence to true Medline words
Abstract
Background: Advances in high-throughput technologies available to modern biology have created an increasing flood of experimentally determined facts. Ordering, managing and describing these raw results is the first step that allows facts to become knowledge. Currently there are limited ways to annotate such data automatically, especially using information deposited in published literature.
Results: To aid researchers in describing results from high-throughput experiments, we developed HT-SAS, a web service for automatic annotation of proteins using general English words. For each protein, a pool of Medline abstracts connected to homologous proteins is gathered using the UniProt-Medline link. Overrepresented words are detected using a binomial statistics approximation. We tested the approach on a protein test set from SGD to determine its accuracy and usefulness. We also applied the automatic annotation service to improve annotations of proteins from Plasmodium berghei expressed exclusively during the blood stage.
Conclusion: Using HT-SAS, we created new annotations, or enriched already established ones, for over 20% of the proteins from Plasmodium berghei expressed in the blood stage and deposited in PlasmoDB. Our tests show that this approach to information extraction provides highly specific keywords, often even when the number of abstracts is limited. Our service should be useful for manual curators, as a complement to manually curated information sources, and for researchers working with protein datasets, especially from poorly characterized organisms.
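The word-overrepresentation step can be illustrated with a minimal sketch. It assumes that each word is scored by the number of abstracts in a protein's pool that contain it, compared against a background fraction of Medline abstracts containing the same word. The function names, the exact binomial tail (used here in place of the approximation mentioned in the abstract) and the `alpha` threshold are illustrative assumptions, not the HT-SAS implementation.

```python
from math import comb


def binomial_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def overrepresented_words(pool_counts, pool_size, background_freq, alpha=0.01):
    """Return {word: p-value} for words seen more often than the background predicts.

    pool_counts     -- hypothetical mapping: word -> number of pool abstracts containing it
    pool_size       -- number of abstracts in the protein's pool
    background_freq -- hypothetical mapping: word -> fraction of all abstracts containing it
    alpha           -- assumed significance threshold (not taken from the paper)
    """
    hits = {}
    for word, k in pool_counts.items():
        p = background_freq.get(word)
        if p is None:
            continue  # no background estimate available, so skip the word
        p_value = binomial_tail(k, pool_size, p)
        if p_value < alpha:
            hits[word] = p_value
    return hits
```

With a pool of 10 abstracts, a word found in 8 of them against a background fraction of 0.05 is reported, whereas a common word found in all 10 against a background fraction of 0.95 is not.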
Metallurgy study on swords from the Roman period burial ground in Czelin, West Pomeranian Voivodeship
Results of the metallurgy study on three two-edged swords from a cremation
burial ground in Czelin, representing the Pompeii, Lachmirowice-Apa and Vimose-Illerup types, indicate variability in the material used as well as in the technique of their
production, and thus in the quality of the specimens. Two of them were made of
a single piece of metal of low (specimen of the Lachmirowice-Apa type) or
medium quality (specimen of the Pompeii type). A much higher level of craftsmanship is represented by the third sword, of the Vimose-Illerup type, precisely
forged from several pieces of diverse, high-quality material, representing the so-called pattern welding technique
Polemic
A REPLY ON THE QUESTION OF ANTHROPOGENESIS
e-LiSe--an online tool for finding needles in the '(Medline) haystack'.
Using literature databases, one can find not only known and true relations between processes but also less studied, non-obvious associations. The main problem in discovering this type of relevant biological information is 'selection'. The ability to distinguish a true correlation (e.g. between different types of biological processes) from an association that appears statistically significant only by random chance is crucial for any biomedical research, literature mining being no exception. This problem is especially visible when searching for information that has not been studied and described in many publications. Therefore, a novel bio-linguistic statistical method is required, capable of 'selecting' true correlations even when they are low-frequency associations. In this article, we present such a statistical approach, based on the Z-score and implemented in the web-based application 'e-LiSe'.
AVAILABILITY
The software is available at http://miron.ibb.waw.pl/elise
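As a rough illustration of a Z-score based selection, the sketch below ranks words by how far their observed count among the abstracts matching a query deviates from the count expected by chance. It assumes a binomial null model with a known background frequency per word; the function names and inputs are hypothetical and are not taken from e-LiSe.

```python
from math import sqrt


def z_score(k, n, p):
    """Z-score of observing k hits in n abstracts when each hit has probability p."""
    if n <= 0 or not 0 < p < 1:
        raise ValueError("need n > 0 and 0 < p < 1")
    expected = n * p
    std_dev = sqrt(n * p * (1 - p))
    return (k - expected) / std_dev


def rank_by_z(cooccurrence, n_query_abstracts, background_freq):
    """Sort words by Z-score, highest first.

    cooccurrence      -- hypothetical mapping: word -> abstracts matching the query that contain it
    n_query_abstracts -- number of abstracts matching the query
    background_freq   -- hypothetical mapping: word -> fraction of all abstracts containing it
    """
    scores = {
        word: z_score(k, n_query_abstracts, background_freq[word])
        for word, k in cooccurrence.items()
        if word in background_freq  # words without a background estimate are skipped
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because the score is scaled by the standard deviation expected under the null model, a rare word seen a handful of times can outrank a common word seen many times, which is the property that matters for low-frequency associations.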
Why similar protein sequences encode similar three-dimensional structures?
Evolutionarily related proteins have similar sequences. Such similarity is called homology and can be described using substitution matrices such as Blosum 60. Naturally occurring homologous proteins usually have similar stable tertiary structures, and this fact is used in so-called homology modeling. In contrast, the artificial protein designed by the Regan group has a sequence 50% identical to that of the B1 domain of Streptococcal IgG-binding protein and a structure similar to that of the protein Rop. In this study, we asked whether artificial similar protein sequences (pseudohomologs) tend to encode similar protein structures, as proteins existing in nature do. To answer this question, we designed sets of protein sequences (pseudohomologs) homologous to sequences having known three-dimensional structures (template structures), with the same number of identities, the same composition and an equal level of homology, according to the Blosum 60 substitution matrix, as the known natural homolog. We compared the structural features of homologs and pseudohomologs by fitting them to the template structure. The quality of such structures was evaluated by threading potentials. The packing quality was measured using three-dimensional homology models. The packing quality of the models was worse for the 'pseudohomologs' than for real homologs. The native homologs have better threading potentials (indicating better sequence-structure fit) in the native structure than the designed sequences. Therefore, we have shown that threading potentials and proper packing are evolutionarily more strongly conserved than sequence homology measured using the Blosum 60 matrix. Our results indicate that three-dimensional protein structure is evolutionarily more conserved than expected from sequence conservation.
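One simple way to picture a pseudohomolog is sketched below, under strong simplifying assumptions: the template and the natural homolog are given as an ungapped alignment of equal length, residues of the homolog are shuffled only among positions where it differs from the template, and candidates that create new identities are rejected. This preserves the number of identities and the amino acid composition, but it does not enforce the equal Blosum 60 score required in the study, and it is not the authors' design procedure. All names are hypothetical.

```python
import random


def identities(a, b):
    """Number of aligned positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b))


def make_pseudohomolog(template, homolog, rng, max_tries=1000):
    """Shuffle the homolog's mismatched residues, keeping identities and composition."""
    if len(template) != len(homolog):
        raise ValueError("sequences must be aligned and of equal length")
    mismatched = [i for i, (t, h) in enumerate(zip(template, homolog)) if t != h]
    target = identities(template, homolog)
    for _ in range(max_tries):
        pool = [homolog[i] for i in mismatched]
        rng.shuffle(pool)
        candidate = list(homolog)
        for i, residue in zip(mismatched, pool):
            candidate[i] = residue
        candidate = "".join(candidate)
        # Reject shuffles that reproduce the homolog or change the identity count.
        if candidate != homolog and identities(template, candidate) == target:
            return candidate
    raise RuntimeError("no pseudohomolog found within max_tries")
```

A call such as `make_pseudohomolog("ACDEFGHIKL", "ACDEFLKIHG", random.Random(0))` returns a rearranged sequence with the same six identities to the template and the same residue counts as the homolog.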
A kinetic model of the evolution of a protein interaction network
Abstract
Background: Known protein interaction networks have very particular properties. Old proteins tend to have more
interactions than new ones. One of the best statistical representatives of this property is the node degree
distribution (distribution of proteins having a given number of interactions). It has previously been shown that this
distribution is very close to the sum of two distinct exponential components. In this paper, we asked: What are the
possible mechanisms of evolution for such types of networks? To answer this question, we tested a kinetic model
for simplified evolution of a protein interactome. Our proposed model considers the emergence of new genes and
interactions and the loss of old ones. We assumed that there are generally two coexisting classes of proteins.
Proteins constituting the first class are essential only for ecological adaptations and are easily lost when ecological
conditions change. Proteins of the second class are essential for basic life processes and, hence, are always
effectively protected against deletion. All proteins can transit between the above classes in both directions. We also
assumed that the phenomenon of gene duplication is always related to ecological adaptation and that a new copy
of a duplicated gene is not essential. According to this model, all proteins gain new interactions with a rate that
preferentially increases with the number of interactions (the rich get richer). Proteins can also gain interactions
because of duplication. Proteins lose their interactions both with and without the loss of partner genes.
Results: The proposed model reproduces the main properties of protein-protein interaction networks very well. The
connectivity of the oldest part of the interaction network is densest, and the node degree distribution follows the
sum of two shifted power-law functions, which is a theoretical generalization of the previous finding. The above
distribution covers the wide range of values of node degrees very well, much better than a power law or
generalized power law supplemented with an exponential cut-off. The presented model also relates the total
number of interactome links to the total number of interacting proteins. The theoretical results were for the
interactomes of A. thaliana, B. taurus, C. elegans, D. melanogaster, E. coli, H. pylori, H. sapiens, M. musculus, R. norvegicus and S. cerevisiae.
Conclusions: Using these approaches, the kinetic parameters could be estimated. Finally, the model revealed the
evolutionary kinetics of proteome formation, the phenomenon of protein differentiation and the process of gaining
new interactions.
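The 'rich get richer' gain of interactions combined with gene loss can be pictured with a toy simulation. The sketch below is a deliberately reduced illustration and not the kinetic model of the paper: it omits the two protein classes, the transitions between them and gene duplication, and its attachment weights (`degree + 1`) and loss rule are assumptions chosen for simplicity.

```python
import random
from collections import Counter


def simulate(steps, p_loss, rng):
    """Grow a toy interaction network; returns {protein: set of partners}."""
    neighbors = {0: {1}, 1: {0}}
    next_id = 2
    for _ in range(steps):
        # Gain: a new protein links to an existing one chosen with
        # probability proportional to (degree + 1), i.e. preferentially.
        nodes = list(neighbors)
        weights = [len(neighbors[n]) + 1 for n in nodes]
        partner = rng.choices(nodes, weights=weights, k=1)[0]
        neighbors[next_id] = {partner}
        neighbors[partner].add(next_id)
        next_id += 1
        # Loss: with probability p_loss a random protein is deleted
        # together with all of its interactions.
        if len(neighbors) > 2 and rng.random() < p_loss:
            lost = rng.choice(list(neighbors))
            for n in neighbors.pop(lost):
                neighbors[n].discard(lost)
    return neighbors


def degree_distribution(neighbors):
    """Map each node degree to the number of proteins having that degree."""
    return Counter(len(partners) for partners in neighbors.values())
```

Running `degree_distribution(simulate(5000, 0.1, random.Random(1)))` gives a histogram that could be compared, on a log-log plot, with candidate functional forms such as a single power law or a sum of two shifted power-law functions.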