
5,435 research outputs found

Influenza research database: an integrated bioinformatics resource for influenza research and surveillance.

Author: Baumgarth Nicole
Deitrich Jon
García-Sastre Adolfo
Hunt Victoria
Klem Edward
Kumar Sanjeev
Larsen Christopher N
Macken Catherine
Noronha Jyothi
Pickett Brett E
Ramsey Alvin
Scheuermann Richard H
Squires R Burke
Suarez David
Zaremba Sam
Zhang Yun
Zhou Liwei
Publication venue: eScholarship, University of California
Publication date: 01/11/2012

Background: The recent emergence of the 2009 pandemic influenza A/H1N1 virus has highlighted the value of free and open access to influenza virus genome sequence data integrated with information about other important virus characteristics. Design: The Influenza Research Database (IRD, http://www.fludb.org) is a free, open, publicly-accessible resource funded by the U.S. National Institute of Allergy and Infectious Diseases through the Bioinformatics Resource Centers program. IRD provides a comprehensive, integrated database and analysis resource for influenza sequence, surveillance, and research data, including user-friendly interfaces for data retrieval, visualization and comparative genomics analysis, together with personal login-protected 'workbench' spaces for saving data sets and analysis results. IRD integrates genomic, proteomic, immune epitope, and surveillance data from a variety of sources, including public databases, computational algorithms, external research groups, and the scientific literature. Results: To demonstrate the utility of the data and analysis tools available in IRD, two scientific use cases are presented. A comparison of hemagglutinin sequence conservation and epitope coverage information revealed highly conserved protein regions that can be recognized by the human adaptive immune system as possible targets for inducing cross-protective immunity. Phylogenetic and geospatial analysis of sequences from wild bird surveillance samples revealed a possible evolutionary connection between influenza virus from Delaware Bay shorebirds and Alberta ducks. Conclusions: The IRD provides a wealth of integrated data and information about influenza virus to support research of the genetic determinants dictating virus pathogenicity, host range restriction and transmission, and to facilitate development of vaccines, diagnostics, and therapeutics.
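The abstract above mentions comparing sequence conservation across hemagglutinin sequences. As a rough illustration only, and not the IRD's own method, the sketch below scores each column of a toy protein alignment with a normalized Shannon entropy. The function name `column_conservation`, the gap handling, and the toy sequences are all assumptions made for this example.

```python
import math
from collections import Counter


def column_conservation(aligned_seqs):
    """Return one score per alignment column; 1.0 means fully conserved.

    Assumption: sequences are pre-aligned, equal length, and use '-' for gaps.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same aligned length")

    max_entropy = math.log2(20)  # upper bound for 20 standard amino acids
    scores = []
    for col in range(length):
        # Gaps are ignored when counting residues (an illustrative choice).
        residues = [seq[col] for seq in aligned_seqs if seq[col] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical toy alignment, not real hemagglutinin data.
toy_alignment = ["MKTA", "MKSA", "MRTA", "MK-A"]
scores = column_conservation(toy_alignment)
```

Columns with a single residue type score 1.0, and more variable columns score lower. A real analysis would start from a curated multiple sequence alignment rather than hand-typed strings.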

PubMed Central

eScholarship - University of California


Evolutionary relationships among bifidobacteria and their hosts and environments.

Author: Martiny Jennifer BH
Rodriguez Cynthia I
Publication venue: eScholarship, University of California
Publication date: 08/01/2020

BACKGROUND: The assembly of animal microbiomes is influenced by multiple environmental factors and host genetics, although the relative importance of these factors remains unclear. Bifidobacteria (genus Bifidobacterium, phylum Actinobacteria) are common first colonizers of gut microbiomes in humans and inhabit other mammals, social insects, food, and sewage. In humans, the presence of bifidobacteria in the gut has been correlated with health-promoting benefits. Here, we compared the genome sequences of a subset of the over 400 Bifidobacterium strains publicly available to investigate the adaptation of bifidobacteria diversity. We tested 1) whether bifidobacteria show a phylogenetic signal with their isolation sources (hosts and environments) and 2) whether key traits encoded by the bifidobacteria genomes depend on the host or environment from which they were isolated. We analyzed Bifidobacterium genomes available in the PATRIC and NCBI repositories and identified the hosts and/or environment from which they were isolated. A multilocus phylogenetic analysis was conducted to compare the genetic relatedness of the strains harbored by different hosts and environments. Furthermore, we examined differences in genomic traits and genes related to amino acid biosynthesis and degradation of carbohydrates. RESULTS: We found that bifidobacteria diversity appears to have evolved with their hosts as strains isolated from the same host were non-randomly associated with their phylogenetic relatedness. Moreover, bifidobacteria isolated from different sources displayed differences in genomic traits such as genome size and accessory gene composition and in particular traits related to amino acid production and degradation of carbohydrates. In contrast, when analyzing diversity within human-derived bifidobacteria, we observed no phylogenetic signal or differences in specific traits (amino acid biosynthesis genes and CAZymes).
CONCLUSIONS: Overall, our study shows that bifidobacteria diversity is strongly adapted to specific hosts and environments and that several genomic traits were associated with their isolation sources. However, this signal is not observed in human-derived strains alone. Looking into the genomic signatures of bifidobacteria strains in different environments can give insights into how this bacterial group adapts to its environment and what types of traits are important for these adaptations.

eScholarship - University of California

Assembly of an interactive correlation network for the Arabidopsis genome using a novel heuristic clustering algorithm

Author: Ebenhoeh Oliver
Loraine Ann
Mutwil Marek
Persson Staffan
Schütte Moritz
Usadel Björn
Publication venue: 'American Society of Plant Biologists (ASPB)'
Publication date: 01/01/2010

Peer reviewed. Publisher PD

Aberdeen University Research

Crossref

PubMed Central

MPG.PuRe

Domain-mediated interactions for protein subfamily identification

Author: Han S.K.
Kim D.
Kim I.
KIM SANGUK
Kong J.
Lee H.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Within a protein family, proteins with the same domain often exhibit different cellular functions, despite the shared evolutionary history and molecular function of the domain. We hypothesized that domain-mediated interactions (DMIs) may categorize a protein family into subfamilies because the diversified functions of a single domain often depend on interacting partners of domains. Here we systematically identified DMI subfamilies, in which proteins share domains with DMI partners, as well as with various functional and physical interaction networks in individual species. In humans, DMI subfamily members are associated with similar diseases, including cancers, and are frequently co-associated with the same diseases. DMI information relates to the functional and evolutionary subdivisions of human kinases. In yeast, DMI subfamilies contain proteins with similar phenotypic outcomes from specific chemical treatments. Therefore, the systematic investigation here provides insights into the diverse functions of subfamilies derived from a protein family with a link-centric approach and suggests a useful resource for annotating the functions and phenotypic outcomes of proteins.

Pohang University of Science and Technology (포항공과대학교)

ePlant and the 3D Data Display Initiative: Integrative Systems Biology on the World Wide Web

Author: A Garcia Castro
A Marchler-Bauer
A Theocharidis
C Lau
D Fange
D Honys
D Lee
D Merico
D Weigel
David Di Biase
Dinesh Christendat
DJ Watts
G Coruzzi
G Fucile
G Jander
GA Pavlopoulos
Garon La
GD Bader
Geoffrey Fucile
H Ge
H Goda
Hardeep Nahal
I Vastrik
J Behr
J Binkley
J Fisher
J Geisler-Lee
J Kilian
J Kopka
J McDermott
J Paananen
JA Sagotsky
JE Stajich
JL Heazlewood
K Birnbaum
K Katoh
K Nakabayashi
K Takahashi
K Toufighi
Kante Easley
L Li
L Matthews
LA Kelley
Lawrence Kelley
M de Tayrac
M Kanehisa
M Nordborg
M Schmid
M Tomita
MC Suh
N Gehlenborg
N Halabi
N Tsesmetzis
Nicholas J. Provart
O Thimm
P Kahlem
P Mendes
PB Neerincx
PD Karp
R Chenna
R Swanson
RC Gentleman
RK Yadav
S Hunter
S Mostafavi
SF Altschul
Shin-Han Shiu
Shokoufeh Khodabandeh
SI O'Donoghue
SK Card
SM Brady
SM Stephens
U Alon
W Zhong
Y Qin
Y Yang
Yani Chen
Publication venue: Public Library of Science
Publication date: 10/01/2011

Visualization tools for biological data are often limited in their ability to interactively integrate data at multiple scales. These computational tools are also typically limited by two-dimensional displays and programmatic implementations that require separate configurations for each of the user's computing devices and recompilation for functional expansion. Towards overcoming these limitations we have developed “ePlant” (http://bar.utoronto.ca/eplant) – a suite of open-source world wide web-based tools for the visualization of large-scale data sets from the model organism Arabidopsis thaliana. These tools display data spanning multiple biological scales on interactive three-dimensional models. Currently, ePlant consists of the following modules: a sequence conservation explorer that includes homology relationships and single nucleotide polymorphism data, a protein structure model explorer, a molecular interaction network explorer, a gene product subcellular localization explorer, and a gene expression pattern explorer. The ePlant protein structure explorer module represents experimentally determined and theoretical structures covering >70% of the Arabidopsis proteome. The ePlant framework is accessed entirely through a web browser, and is therefore platform-independent. It can be applied to any model organism. To facilitate the development of three-dimensional displays of biological data on the world wide web we have established the “3D Data Display Initiative” (http://3ddi.org).

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Algorithmic Techniques in Gene Expression Processing. From Imputation to Visualization

Author: Tuikkala Johannes
Publication venue: Turku Centre for Computer Science
Publication date: 20/11/2014

The amount of biological data has grown exponentially in recent decades. Modern biotechnologies, such as microarrays and next-generation sequencing, are capable of producing massive amounts of biomedical data in a single experiment. As the amount of data is growing rapidly, there is an urgent need for reliable computational methods for analyzing and visualizing it. This thesis addresses this need by studying how to efficiently and reliably analyze and visualize high-dimensional data, especially that obtained from gene expression microarray experiments. First, we study ways to improve the quality of microarray data by replacing (imputing) the missing data entries with estimated values. Missing value imputation is a method commonly used to make the original incomplete data complete, making it easier to analyze with statistical and computational methods. Our novel approach was to use curated external biological information as a guide for the missing value imputation. Secondly, we studied the effect of missing value imputation on downstream data analysis methods such as clustering. We compared multiple recent imputation algorithms against 8 publicly available microarray data sets. It was observed that missing value imputation is indeed a rational way to improve the quality of biological data. The research revealed differences between the clustering results obtained with different imputation methods. On most data sets, the simple and fast k-NN imputation was good enough, but there were also needs for more advanced imputation methods, such as Bayesian Principal Component Algorithm (BPCA). Finally, we studied the visualization of biological network data. Biological interaction networks are examples of the outcome of multiple biological experiments, such as those using gene microarray techniques.
Such networks are typically very large and highly connected, so there is a need for fast algorithms that produce visually pleasing layouts. A computationally efficient way to produce layouts of large biological interaction networks was developed. The algorithm uses multilevel optimization within the regular force-directed graph layout algorithm.
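The abstract above refers to k-NN imputation of missing microarray values. The following is a minimal sketch of the general idea, not the thesis's implementation. The function name `knn_impute`, the root-mean-square distance over shared observed columns, and the toy expression matrix are assumptions chosen for illustration.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries with the mean of that column in the k nearest rows.

    Assumption: rows are genes, columns are samples, None marks a missing value.
    """
    def distance(a, b):
        # Compare only the columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [row[:] for row in matrix]  # leave the input untouched
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours must have an observed value in column j.
            candidates = sorted(
                (distance(row, other), other[j])
                for r, other in enumerate(matrix)
                if r != i and other[j] is not None
            )
            neighbours = [v for d, v in candidates[:k] if d != math.inf]
            if neighbours:
                imputed[i][j] = sum(neighbours) / len(neighbours)
    return imputed


# Hypothetical toy data: the first gene is missing its third measurement.
expression = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.2],
    [5.0, 6.0, 9.0],
]
filled = knn_impute(expression, k=2)
```

Here the missing entry is filled from the two rows whose observed values are closest to the incomplete row, so the distant fourth row does not contribute. Entries with no usable neighbour are left as None.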

UTUPub

Frustration in Biomolecules

Author: Ferreiro Diego U.
Komives Elizabeth A.
Wolynes Peter G.
Publication venue
Publication date: 03/12/2013

Biomolecules are the prime information processing elements of living matter. Most of these inanimate systems are polymers that compute their structures and dynamics using as input seemingly random character strings of their sequence, following which they coalesce and perform integrated cellular functions. In large computational systems with finite interaction codes, the appearance of conflicting goals is inevitable. Simple conflicting forces can lead to quite complex structures and behaviors, leading to the concept of "frustration" in condensed matter. We present here some basic ideas about frustration in biomolecules and how the frustration concept leads to a better appreciation of many aspects of the architecture of biomolecules, and how structure connects to function. These ideas are simultaneously both seductively simple and perilously subtle to grasp completely. The energy landscape theory of protein folding provides a framework for quantifying frustration in large systems and has been implemented at many levels of description. We first review the notion of frustration from the areas of abstract logic and its uses in simple condensed matter systems. We then discuss how the frustration concept applies specifically to heteropolymers, testing folding landscape theory in computer simulations of protein models and in experimentally accessible systems. Studying the aspects of frustration averaged over many proteins provides ways to infer energy functions useful for reliable structure prediction. We discuss how frustration affects folding, how a large part of the biological functions of proteins are related to subtle local frustration effects and how frustration influences the appearance of metastable states, the nature of binding processes, catalysis and allosteric transitions. We hope to illustrate how frustration is a fundamental concept in relating function to structural biology. Comment: 97 pages, 30 figures

arXiv.org e-Print Archive

CONICET Digital

PubMed Central

DSpace at Rice University

Applications of Evolutionary Bioinformatics in Basic and Biomedical Research

Author: Adebali Ogun
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/12/2015

With the revolutionary progress in sequencing technologies, computational biology emerged as a game-changing field which is applied in understanding molecular events of life for not only complementary but also exploratory purposes. Bioinformatics resources and tools significantly help in data generation, organization and analysis. However, there is still a need for developing new approaches built from a biologist’s point of view. In protein bioinformatics, there are several fundamental problems such as (i) determining protein function; (ii) identifying protein-protein interactions; (iii) predicting the effect of amino acid variants. Here, I present three chapters addressing these problems from an evolutionary perspective. Firstly, I describe a novel search pipeline for protein domain identification. The algorithm chain provides sensitive domain assignments with the highest possible specificity. Secondly, I present a tool enabling large-scale visualization of presences and absences of proteins in hierarchically clustered genomes. This tool visualizes multi-layer information of any kind of genome-linked data with a special focus on domain architectures, enabling identification of coevolving domains/proteins, which can eventually help in identifying functionally interacting proteins. Finally, I propose an approach for distinguishing between benign and damaging missense mutations in a human disease by establishing the precise evolutionary history of the associated gene. This part introduces new criteria on how to determine functional orthologs via phylogenetic analysis. All three parts use comparative genomics and/or sequence analyses. Taken together, this study addresses important problems in protein bioinformatics and as a whole it can be utilized to describe proteins by their domains, coevolving partners and functionally important residues.
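The abstract above describes using presence and absence of proteins across genomes to spot coevolving proteins. One common way to compare such presence/absence profiles is a Jaccard similarity, sketched below. This is an illustrative assumption, not the tool described in the thesis, and the names `jaccard`, `profiles`, `protA` and the genome labels are all hypothetical.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity between two presence/absence profiles.

    Each profile maps a genome name to True (protein present) or False (absent).
    """
    present_a = {genome for genome, present in profile_a.items() if present}
    present_b = {genome for genome, present in profile_b.items() if present}
    union = present_a | present_b
    if not union:
        return 0.0  # neither protein occurs anywhere
    return len(present_a & present_b) / len(union)


# Hypothetical toy profiles across four genomes.
profiles = {
    "protA": {"genome1": True, "genome2": True, "genome3": False, "genome4": True},
    "protB": {"genome1": True, "genome2": True, "genome3": False, "genome4": True},
    "protC": {"genome1": False, "genome2": True, "genome3": True, "genome4": False},
}

sim_ab = jaccard(profiles["protA"], profiles["protB"])  # identical profiles
sim_ac = jaccard(profiles["protA"], profiles["protC"])  # share only genome2
```

Proteins with highly similar profiles are candidates for coevolution, although a real analysis would also need to account for the phylogenetic relatedness of the genomes.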

University of Tennessee, Knoxville: Trace

Quantification of DNA-associated proteins inside eukaryotic cells using single-molecule localization microscopy

Author: Adam T. Watson
Alex Herbert
Antony M. Carr
Betzig
Bähler
Culbertson
David Klenerman
David Lando
Dion
Elf
Ernest Laue
Forsburg
Heilemann
Henteges
Hess
Horrocks
Jem Tucker
Kearsey
Kim
Krichevsky
Kubota
Lando
Manley
Mark A. Osborne
Masai
Matsuyama
Matthieu Palayret
Meister
Michalet
Mortensen
Patterson
Peter Jönsson
Reyes-Lamothe
Rust
Rémi L. Boulineau
Shiomi
Sophie George
Spendier
Steven F. Lee
Stracy
Su'etsugu
Szymborska
Thomas J. Etheridge
Tokunaga
Uphoff
Watson
Yasukazu Daigaku
Publication venue: 'Oxford University Press (OUP)'
Publication date: 08/08/2014

Development of single-molecule localization microscopy techniques has allowed nanometre scale localization accuracy inside cells, permitting the resolution of ultra-fine cell structure and the elucidation of crucial molecular mechanisms. Application of these methodologies to understanding processes underlying DNA replication and repair has been limited to defined in vitro biochemical analysis and prokaryotic cells. In order to expand these techniques to eukaryotic systems, we have further developed a photo-activated localization microscopy-based method to directly visualize DNA-associated proteins in unfixed eukaryotic cells. We demonstrate that motion blurring of fluorescence due to protein diffusivity can be used to selectively image the DNA-bound population of proteins. We designed and tested a simple methodology and show that it can be used to detect changes in DNA binding of a replicative helicase subunit, Mcm4, and the replication sliding clamp, PCNA, between different stages of the cell cycle and between distinct genetic backgrounds

Lund University Publications

Crossref

PubMed Central

Sussex Research Online