236 research outputs found

Identification of Trace Element-Containing Proteins in Genomic Databases

Author: Fomenko Dmitri E.
Gladyshev Vadim N.
Hatfield Dolph L.
Kryukov Gregory V.
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2004

The development of bioinformatics tools has given researchers the ability to identify full sets of trace element–containing proteins in organisms for which complete genomic sequences are available. Recently, independent bioinformatics methods were used to identify all, or almost all, genes encoding selenocysteine-containing proteins in human, mouse, and Drosophila genomes, characterizing entire selenoproteomes in these organisms. It should also be possible to search for entire sets of other trace element–associated proteins, such as metal-containing proteins, although methods for their identification are still in development.

DigitalCommons@University of Nebraska

STING Report: convenient web-based application for graphic and tabular presentations of protein sequence, structure and function descriptors from the STING database

Author: Baudet Christian
Fileto Renato
Higa Roberto H.
Krauchenco João N.
Kuser Paula R.
Mancini Adauto L.
Montagner Arnaldo J.
Neshich Goran
Palandrani Juliana F.
Pinto Ivan P.
Yamagishi Michel E. B.
Publication venue: Oxford University Press
Publication date: 17/12/2004

STING Report is a versatile web-based application for the extraction and presentation of detailed information about any individual amino acid of a protein structure stored in the STING Database. The extracted information is presented as a series of GIF images and tables containing the values of up to 125 sequence/structure/function descriptors/parameters. The GIF images are generated by the Gold STING modules. The HTML page resulting from a STING Report query can be printed and, most importantly, it can be composed and visualized on a computer platform with an elementary configuration. Using STING Report, a user can generate a collection of customized reports for amino acids of specific interest. Such a collection is an ideal match for the demand for rapid and detailed consultation and documentation of data about structure/function. Including information generated with STING Report in a research report or even a textbook increases the density of its contents. STING Report is freely accessible within the Gold STING Suite at http://www.cbi.cnptia.embrapa.br, http://www.es.embnet.org/SMS/, http://gibk26.bse.kyutech.ac.jp/SMS/ and http://trantor.bioc.columbia.edu/SMS (option: STING Report).

Crossref

PubMed Central

String Matching with Variable Length Gaps

Author: Philip Bille
Inge Li Gørtz
Hjalte Wedel Vildhøj
David Kofoed Wind
Publication date: 01/01/2010

We consider string matching with variable length gaps. Given a string $T$ and a pattern $P$ consisting of strings separated by variable length gaps (arbitrary strings of length in a specified range), the problem is to find all ending positions of substrings in $T$ that match $P$. This problem is a basic primitive in computational biology applications. Let $m$ and $n$ be the lengths of $P$ and $T$, respectively, and let $k$ be the number of strings in $P$. We present a new algorithm achieving time $O(n \log k + m + \alpha)$ and space $O(m + A)$, where $A$ is the sum of the lower bounds of the lengths of the gaps in $P$ and $\alpha$ is the total number of occurrences of the strings in $P$ within $T$. Compared to the previous results this bound essentially achieves the best known time and space complexities simultaneously. Consequently, our algorithm obtains the best known bounds for almost all combinations of $m$, $n$, $k$, $A$, and $\alpha$. Our algorithm is surprisingly simple and straightforward to implement. We also present algorithms for finding and encoding the positions of all strings in $P$ for every match of the pattern. Comment: draft of full version, extended abstract at SPIRE 201
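
To make the problem statement in this abstract concrete, here is a minimal Python sketch of a naive matcher for variable length gaps. It is not the algorithm from the paper and makes no attempt to reach its time or space bounds. The function names `occurrences` and `vlg_match_ends`, the representation of the pattern as a list of strings plus a list of `(lo, hi)` gap ranges, and the convention of reporting exclusive end positions are all assumptions made for illustration.

```python
def occurrences(text, s):
    """Return all start indices of the non-empty string s in text."""
    starts = []
    i = text.find(s)
    while i != -1:
        starts.append(i)
        i = text.find(s, i + 1)  # step by one so overlapping hits are kept
    return starts


def vlg_match_ends(text, parts, gaps):
    """Naive variable-length-gap matching (hypothetical helper).

    parts: the k strings of the pattern, in order.
    gaps:  k - 1 pairs (lo, hi); gap i lies between parts[i] and parts[i + 1]
           and may be any string whose length is in [lo, hi].
    Returns the sorted exclusive end positions of all matches in text.
    """
    if not parts or len(gaps) != len(parts) - 1:
        raise ValueError("need k >= 1 strings and exactly k - 1 gaps")

    # End positions of partial matches covering only the first string.
    ends = {i + len(parts[0]) for i in occurrences(text, parts[0])}

    for part, (lo, hi) in zip(parts[1:], gaps):
        next_ends = set()
        for start in occurrences(text, part):
            # Extend any partial match whose end leaves a legal gap length.
            if any(lo <= start - e <= hi for e in ends):
                next_ends.add(start + len(part))
        ends = next_ends

    return sorted(ends)
```

For example, `vlg_match_ends("abxxcdabxcd", ["ab", "cd"], [(1, 2)])` returns `[6, 11]`: the first match uses a gap of length 2 and the second a gap of length 1. The inner `any(...)` scan is what makes this sketch slow on long inputs.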

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Crossref

Online Research Database In Technology

String Indexing for Patterns with Wildcards

Author: A. Tam
B. Chazelle
D. Harel
D. Tsur
G. Chen
G. Landau
G. Landau
G. Navarro
H.L. Chan
K. Hofmann
L.P. Coelho
M. Lewenstein
M. Maas
M.L. Fredman
P. Bille
P. Bille
P. Clifford
T.-W. Lam
Z. Galil
Publication date: 01/01/2012

We consider the problem of indexing a string $t$ of length $n$ to report the occurrences of a query pattern $p$ containing $m$ characters and $j$ wildcards. Let $occ$ be the number of occurrences of $p$ in $t$, and $\sigma$ the size of the alphabet. We obtain the following results.
- A linear space index with query time $O(m + \sigma^j \log \log n + occ)$. This significantly improves the previously best known linear space index by Lam et al. [ISAAC 2007], which requires query time $\Theta(jn)$ in the worst case.
- An index with query time $O(m + j + occ)$ using space $O(\sigma^{k^2} n \log^k \log n)$, where $k$ is the maximum number of wildcards allowed in the pattern. This is the first non-trivial bound with this query time.
- A time-space trade-off, generalizing the index by Cole et al. [STOC 2004].
We also show that these indexes can be generalized to allow variable length gaps in the pattern. Our results are obtained using a novel combination of well-known and new techniques, which could be of independent interest.
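
As a point of reference for the query this abstract describes, the sketch below reports the occurrences of a pattern with wildcards by scanning the text directly, without building any index. It is only an unindexed baseline that takes time proportional to the text length times the pattern length, and it is not any of the indexes from the paper. The function name `wildcard_occurrences` and the choice of `*` as the wildcard symbol (matching exactly one character) are assumptions made for illustration.

```python
def wildcard_occurrences(t, p, wildcard="*"):
    """Return all start positions in t where pattern p occurs.

    Each wildcard symbol in p matches exactly one arbitrary character;
    every other character of p must match t exactly. This is a plain
    scan over t (hypothetical baseline), not an index.
    """
    length = len(p)
    hits = []
    for i in range(len(t) - length + 1):
        # Compare the window t[i:i+length] against p position by position.
        if all(pc == wildcard or pc == t[i + k] for k, pc in enumerate(p)):
            hits.append(i)
    return hits
```

For example, `wildcard_occurrences("abcabd", "ab*")` returns `[0, 3]`. An index trades preprocessing time and space on `t` for queries that avoid rescanning the whole text each time.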

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology

A Novel Plant Major Intrinsic Protein in Physcomitrella patens

Author: Anne-Sophie Lebrun
François Chaumont
Kristina Nordén
Sofia Gustavsson
Urban Johanson
Publication venue: American Society of Plant Biologists (ASPB)

Crossref

Statistical methods for biological sequence analysis for DNA binding motifs and protein contacts

Author: Roth Christian
Publication venue: University Goettingen Repository
Publication date: 06/09/2021

Over the last decades a revolution in novel measurement techniques has permeated the biological sciences filling the databases with unprecedented amounts of data ranging from genomics, transcriptomics, proteomics and metabolomics to structural and ecological data. In order to extract insights from the vast quantity of data, computational and statistical methods are nowadays crucial tools in the toolbox of every biological researcher. In this thesis I summarize my contributions in two data-rich fields in biological sciences: transcription factor binding to DNA and protein structure prediction from protein sequences with shared evolutionary ancestry. In the first part of my thesis I introduce our work towards a web server for analysing transcription factor binding data with Bayesian Markov Models. In contrast to classical PWM or di-nucleotide models, Bayesian Markov models can capture complex inter-nucleotide dependencies that can arise from shape-readout and alternative binding modes. In addition to giving access to our methods in an easy-to-use, intuitive web-interface, we provide our users with novel tools and visualizations to better evaluate the biological relevance of the inferred binding motifs. We hope that our tools will prove useful for investigating weak and complex transcription factor binding motifs which cannot be predicted accurately with existing tools. The second part discusses a statistical attempt to correct out the phylogenetic bias arising in co-evolution methods applied to the contact prediction problem. Co-evolution methods have revolutionized the protein-structure prediction field more than 10 years ago, and, until very recently, have retained their importance as crucial input features to deep neural networks. 
As the co-evolution information is extracted from evolutionarily related sequences, we investigated whether the phylogenetic bias in the signal can be corrected out in a principled way, using a variation of Felsenstein's tree-pruning algorithm in combination with an independent-pair assumption to derive pairwise amino acid counts that are corrected for the evolutionary history. Unfortunately, the contact prediction derived from our corrected pairwise amino acid counts did not yield a competitive performance.

Georg-August-University Göttingen

bloated tubules (blot) Encodes a Drosophila Member of the Neurotransmitter Transporter Family Required for Organisation of the Apical Cytocortex

Author: Johnson Kevin
Knust Elisabeth
Skaer Helen
Publication venue: Academic Press.
Publication date: 15/08/1999

We have identified a novel member of the vertebrate sodium- and chloride-dependent neurotransmitter symporter family from Drosophila melanogaster. This gene, named bloated tubules (blot), shows significant sequence similarity to a subgroup of vertebrate orphan transporters. blot transcripts are maternally supplied and during embryogenesis exhibit a complex and dynamic pattern in a subset of ectodermally derived epithelia, notably in the Malpighian tubules, and in the nervous system. Animals mutant for this gene are larval lethals, in which the Malpighian tubule cells are distended with an enlarged and disorganised apical surface. Embryos lacking the maternal component of blot expression die during early stages of development. They show an inability to form actin filaments in the apical cortex, resulting in impaired syncytial nuclear divisions, severe defects in the organisation of the cortical cytoskeleton, and a failure to cellularise. For the first time, a neurotransmitter transporter-like protein has been implicated in a function outside the nervous system. The isolation of blot thus provides the basis for an analysis of the relationship between the function of this putative transporter and epithelial morphogenesis.

Elsevier - Publisher Connector

Acoustic sequences in non-human animals: a tutorial review and prospectus.

Animal acoustic communication often takes the form of complex sequences, made up of multiple distinct acoustic units. Apart from the well-known example of birdsong, other animals such as insects, amphibians, and mammals (including bats, rodents, primates, and cetaceans) also generate complex acoustic sequences. Occasionally, such as with birdsong, the adaptive role of these sequences seems clear (e.g. mate attraction and territorial defence). More often however, researchers have only begun to characterise - let alone understand - the significance and meaning of acoustic sequences. Hypotheses abound, but there is little agreement as to how sequences should be defined and analysed. Our review aims to outline suitable methods for testing these hypotheses, and to describe the major limitations to our current and near-future knowledge on questions of acoustic sequences. This review and prospectus is the result of a collaborative effort between 43 scientists from the fields of animal behaviour, ecology and evolution, signal processing, machine learning, quantitative linguistics, and information theory, who gathered for a 2013 workshop entitled, 'Analysing vocal sequences in animals'. Our goal is to present not just a review of the state of the art, but to propose a methodological framework that summarises what we suggest are the best practices for research in this field, across taxa and across disciplines. We also provide a tutorial-style introduction to some of the most promising algorithmic approaches for analysing sequences. We divide our review into three sections: identifying the distinct units of an acoustic sequence, describing the different ways that information can be contained within a sequence, and analysing the structure of that sequence. Each of these sections is further subdivided to address the key questions and approaches in that area. 
We propose a uniform, systematic, and comprehensive approach to studying sequences, with the goal of clarifying research terms used in different fields, and facilitating collaboration and comparative studies. Allowing greater interdisciplinary collaboration will facilitate the investigation of many important questions in the evolution of communication and sociality. This review was developed at an investigative workshop, “Analyzing Animal Vocal Communication Sequences” that took place on October 21–23 2013 in Knoxville, Tennessee, sponsored by the National Institute for Mathematical and Biological Synthesis (NIMBioS). NIMBioS is an Institute sponsored by the National Science Foundation, the U.S. Department of Homeland Security, and the U.S. Department of Agriculture through NSF Awards #EF-0832858 and #DBI-1300426, with additional support from The University of Tennessee, Knoxville. In addition to the authors, Vincent Janik participated in the workshop. D.T.B.’s research is currently supported by NSF DEB-1119660. M.A.B.’s research is currently supported by NSF IOS-0842759 and NIH R01DC009582. M.A.R.’s research is supported by ONR N0001411IP20086 and NOPP (ONR/BOEM) N00014-11-1-0697. S.L.DeR.’s research is supported by the U.S. Office of Naval Research. R.F.-i-C.’s research was supported by the grant BASMATI (TIN2011-27479-C04-03) from the Spanish Ministry of Science and Innovation. E.C.G.’s research is currently supported by a National Research Council postdoctoral fellowship. E.E.V.’s research is supported by CONACYT, Mexico, award number I010/214/2012. This is the accepted manuscript. The final version is available at http://dx.doi.org/10.1111/brv.1216

epublications@Marquette

LJMU Research Online (Liverpool John Moores University)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Southampton (e-Prints Soton)

UPCommons. Portal del coneixement obert de la UPC

PubMed Central

Apollo (Cambridge)

University of St. Andrews - Pure

Deep Blue Documents at the University of Michigan

St Andrews Research Repository

Innovative Algorithms and Evaluation Methods for Biological Motif Finding

Author: Wooyoung Kim
Under the direction of Dr. Yi Pan
Publication venue: ScholarWorks @ Georgia State University
Publication date: 05/05/2012

Biological motifs are defined as overly recurring sub-patterns in biological systems. Sequence motifs and network motifs are examples of biological motifs. Because of their wide range of applications, many algorithms and computational tools have been developed to search for biological motifs efficiently. As a result, there are more computationally derived motifs than experimentally validated motifs, and how to validate the biological significance of the ‘candidate motifs’ becomes an important question. Some sequence motifs are verified by their structural similarities or their functional roles in DNA or protein sequences, and stored in databases. However, the biological role of network motifs has not yet been validated, and currently no databases exist for this purpose. In this thesis, we focus not only on the computational efficiency but also on the biological meanings of the motifs. We provide an efficient way to incorporate biological information into clustering analysis methods. For example, a sparse nonnegative matrix factorization (SNMF) method is used with Chou-Fasman parameters for protein motif finding. Biological network motifs are searched for by various clustering algorithms with Gene Ontology (GO) information. Experimental results show that the algorithms perform better than existing algorithms by producing a larger number of high-quality biological motifs. In addition, we apply biological network motifs to the discovery of essential proteins. Essential proteins are defined as a minimum set of proteins which are vital for development to a fertile adult and for cellular life in an organism. We design a new centrality algorithm with biological network motifs, named MCGO, and score proteins in a protein-protein interaction (PPI) network to find essential proteins. MCGO is also combined with other centrality measures to predict essential proteins using machine learning techniques. 
This thesis makes three contributions to the study of biological motifs: 1) clustering analysis is used efficiently in this work and biological information is easily integrated with the analysis; 2) we focus more on the biological meanings of motifs by adding biological knowledge to the algorithms and by suggesting biologically related evaluation methods; 3) biological network motifs are successfully applied to a practical application, the prediction of essential proteins.

CiteSeerX

ScholarWorks @ Georgia State University

Bioinformatics

Publication venue: IntechOpen
Publication date: 20/04/2021

This book is divided into different research areas relevant to bioinformatics, such as biological networks, next generation sequencing, high performance computing, molecular modeling, structural bioinformatics, and intelligent data analysis. Each book section introduces the basic concepts and then explains their application to problems of great relevance, so both novice and expert readers can benefit from the information and research works presented here.

Directory of Open Access Books (DOAB)