203 research outputs found

Systems Metagenomics: Applying Systems Biology Thinking to Human Microbiome Analysis

Author: E Boutet
FP Breitwieser
JT Simpson
M Nei
PJ Turnbaugh
RC Edgar
RD Finn
RD Isokpehi
TK Attwood
Y Sanz
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/08/2018

Crossref

Royal Holloway - Pure

Visualising biological data: a semantic approach to tool and database integration

Author: AL Mitchell
Alice Villéger
C Bru
C Notredame
David Thorne
DJ Parry-Smith
DN Perkins
Douglas B Kell
E Sonnhammer
I Letunic
J McCarthy
J Pérez
J Schultz
James Marsh
JD Thompson
L Devereux
L Pritchard
MY Galperin
N Hulo
NF Noy
NJ Mulder
P McDermott
Philip McDermott
PW Lord
RC Edgar
RD Finn
S Bergamaschi
S Hunter
S Pettifer
Steve Pettifer
Teresa K Attwood
TK Attwood
Publication venue: BioMed Central
Publication date: 01/01/2009

Motivation: In the biological sciences, the need to analyse vast amounts of information has become commonplace. Such large-scale analyses often involve drawing together data from a variety of different databases, held remotely on the internet or locally on in-house servers. Supporting these tasks are ad hoc collections of data-manipulation tools, scripting languages and visualisation software, which are often combined in arcane ways to create cumbersome systems that have been customised for a particular purpose, and are consequently not readily adaptable to other uses. For many day-to-day bioinformatics tasks, the sizes of current databases, and the scale of the analyses necessary, now demand increasing levels of automation; nevertheless, the unique experience and intuition of human researchers is still required to interpret the end results in any meaningful biological way. Putting humans in the loop requires tools to support real-time interaction with these vast and complex data-sets. Numerous tools do exist for this purpose, but many do not have optimal interfaces, most are effectively isolated from other tools and databases owing to incompatible data formats, and many have limited real-time performance when applied to realistically large data-sets: much of the user's cognitive capacity is therefore focused on controlling the software and manipulating esoteric file formats rather than on performing the research. Methods: To confront these issues, harnessing expertise in human-computer interaction (HCI), high-performance rendering and distributed systems, and guided by bioinformaticians and end-user biologists, we are building reusable software components that, together, create a toolkit that is both architecturally sound from a computing point of view, and addresses both user and developer requirements. Key to the system's usability is its direct exploitation of semantics, which, crucially, gives individual components knowledge of their own functionality and allows them to interoperate seamlessly, removing many of the existing barriers and bottlenecks from standard bioinformatics tasks. Results: The toolkit, named Utopia, is freely available from http://utopia.cs.man.ac.uk/.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The University of Manchester - Institutional Repository

Effectively incorporating selected multimedia content into medical publications

Author: A Ziegler
AA Goodman
Alexander Ziegler
Andreas Ziegler
B Ruthensteiner
Christoph Schöbel
Cornelius Faber
Daniel Mietchen
GR Thoma
J Maunsell
J Murienne
JK Tyzack
L Selvam
Markus Sellerer
MHA Schmitz
P Kumar
SM Smith
TK Attwood
Wolfram von Hausen
X Zhang
Y Zhang
Publication venue: BioMed Central
Publication date: 01/01/2011

Until fairly recently, medical publications have been handicapped by being restricted to non-electronic formats, effectively preventing the dissemination of complex audiovisual and three-dimensional data. However, authors and readers could significantly profit from advances in electronic publishing that permit the inclusion of multimedia content directly into an article. For the first time, the de facto gold standard for scientific publishing, the portable document format (PDF), is used here as a platform to embed a video and an audio sequence of patient data into a publication. Fully interactive three-dimensional models of a face and a schematic representation of a human brain are also part of this publication. We discuss the potential of this approach and its impact on the communication of scientific medical data, particularly with regard to electronic and open access publications. Finally, we emphasise how medical teaching can benefit from this new tool and comment on the future of medical publishing.

Crossref

Springer - Publisher Connector

PubMed Central

Designing a course model for distance-based online bioinformatics training in Africa: the H3ABioNet experience

Author: A Via
Ahmed Mansour Alzohairy
Amel Ghouila
B Güzer
B Holmberg
Colleen Saunders
CP Zeki
David P. Judge
Deogratius Ssemwanga
Fatma Z. Guerfali
Francis Ouellette
J Bergmann
Jean-Baka Domelevo Entfellner
Jonathan Kayondo
Kim T. Gurwitz
L Welch
N Kemp
Nicola Mulder
NJ Mulder
O Tastan Bishop
P Pevzner
Pedro L. Fernandes
Rehab Ahmed
Ruben Cloete
Samson P. Salifu
Shaun Aron
Sumir Panji
Suresh Maslamoney
TK Attwood
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2017

Africa is not unique in its need for basic bioinformatics training for individuals from a diverse range of academic backgrounds. However, particular logistical challenges in Africa, most notably access to bioinformatics expertise and internet stability, must be addressed in order to meet this need on the continent. H3ABioNet (www.h3abionet.org), the Pan African Bioinformatics Network for H3Africa, has therefore developed an innovative, free-of-charge "Introduction to Bioinformatics" course, taking these challenges into account as part of its educational efforts to provide on-site training and develop local expertise inside its network. A multiple-delivery-mode learning model was selected for this 3-month course in order to increase access to (mostly) African, expert bioinformatics trainers. The content of the course was developed to include a range of fundamental bioinformatics topics at the introductory level. For the first iteration of the course (2016), classrooms with a total of 364 enrolled participants were hosted at 20 institutions across 10 African countries. To ensure that classroom success did not depend on stable internet, trainers pre-recorded their lectures, and classrooms downloaded and watched these locally during biweekly contact sessions. The trainers were available via video conferencing to take questions during contact sessions, as well as via online "question and discussion" forums outside of contact session time. This learning model, developed for a resource-limited setting, could easily be adapted to other settings.

Access to Research and Communications Annals

Crossref

Directory of Open Access Journals

University of the Western Cape Research Repository

FigShare

The Origin of GPCRs: Identification of Mammalian like Rhodopsin, Adhesion, Glutamate and Frizzled GPCRs in Fungi

Author: A Lafon
Arunkumar Krishnan
C Xue
Chaoyang Xue
DH O'Day
DM Morens
DM Rosenbaum
DW Warnock
EJ Byrnes 3rd
EV Armbrust
F Ronquist
F Silveira
Helgi B. Schiöth
I Ruiz-Trillo
JA Eisen
JH Yu
JM Aury
JR Xu
K Katoh
K Palczewski
KJ Nordstrom
L Eichinger
L Kall
L Li
M Carr
M Medina
M Rask-Andersen
M Srivastava
Markus Sällman Almén
MC Lagerstrom
MS Almen
N Kamesh
PS Klein
R Fredriksson
RD Finn
RD Kulkarni
Robert Fredriksson
S Guindon
SR Eddy
T Brody
TK Attwood
TK Bjarnadottir
TY James
W Li
W Meersseman
Y Wang
Publication venue: Public Library of Science
Publication date: 01/01/2012

G protein-coupled receptors (GPCRs) in humans are classified into the five main families named Glutamate, Rhodopsin, Adhesion, Frizzled and Secretin according to the GRAFS classification. Previous results show that these mammalian GRAFS families are well represented in the Metazoan lineages, but they have not been shown to be present in Fungi. Here, we systematically mined 79 fungal genomes and provide the first evidence that four of the five main mammalian families of GPCRs, namely Rhodopsin, Adhesion, Glutamate and Frizzled, are present in Fungi and found 142 novel sequences between them. Significantly, we provide strong evidence that the Rhodopsin family emerged from the cAMP receptor family in an event close to the split of Opisthokonts and not in Placozoa, as earlier assumed. The Rhodopsin family then expanded greatly in Metazoans while the cAMP receptor family is found in 3 invertebrate species and lost in the vertebrates. We estimate that the Adhesion and Frizzled families evolved before the split of Unikonts from a common ancestor of all major eukaryotic lineages. Also, the study highlights that the fungal Adhesion receptors do not have N-terminal domains whereas the fungal Glutamate receptors have a broad repertoire of mammalian-like N-terminal domains. Further, mining of the close unicellular relatives of the Metazoan lineage, Salpingoeca rosetta and Capsaspora owczarzaki, obtained a rich group of both the Adhesion and Glutamate families, which in particular provided insight to the early emergence of the N-terminal domains of the Adhesion family. We identified 619 Fungi-specific GPCRs across 79 genomes and revealed that Blastocladiomycota and Chytridiomycota phylum have Metazoan-like GPCRs rather than the GPCRs specific for Fungi. Overall, this study provides the first evidence of the presence of four of the five main GRAFS families in Fungi and clarifies the early evolutionary history of the GPCR superfamily.

Public Library of Science (PLOS)

Crossref

Publikationer från Uppsala Universitet

Directory of Open Access Journals

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line

FigShare

Fast index based algorithms and software for matching position specific scoring matrices

Author: A Kel
A Sandelin
B Dorohonceanu
D Weeks
G Castillo
H Gonnet
J Henikoff
J Kärkkäinen
K Quandt
L Goldstein
LR Murphy
M Abouelhoda
M Beckstette
M Gribskov
Michael Beckstette
N de Bruijn
N Hulo
P Embrechts
P Haverty
P Scordis
R Giegerich
R Staden
R Tatusov
Robert Giegerich
Robert Homann
S Kurtz
S Rahmann
S Rajasekaran
Stefan Kurtz
T Kasai
T Li
T Wu
TK Attwood
V Freschi
V Matys
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: In biological sequence analysis, position specific scoring matrices (PSSMs) are widely used to represent sequence motifs in nucleotide as well as amino acid sequences. Searching with PSSMs in complete genomes or large sequence databases is a common, but computationally expensive task. RESULTS: We present a new non-heuristic algorithm, called ESAsearch, to efficiently find matches of PSSMs in large databases. Our approach preprocesses the search space, e.g., a complete genome or a set of protein sequences, and builds an enhanced suffix array that is stored on file. This allows the searching of a database with a PSSM in sublinear expected time. Since ESAsearch benefits from small alphabets, we present a variant operating on sequences recoded according to a reduced alphabet. We also address the problem of non-comparable PSSM-scores by developing a method which allows the efficient computation of a matrix similarity threshold for a PSSM, given an E-value or a p-value. Our method is based on dynamic programming and, in contrast to other methods, it employs lazy evaluation of the dynamic programming matrix. We evaluated algorithm ESAsearch with nucleotide PSSMs and with amino acid PSSMs. Compared to the best previous methods, ESAsearch shows speedups of a factor between 17 and 275 for nucleotide PSSMs, and speedups up to factor 1.8 for amino acid PSSMs. Comparisons with the most widely used programs even show speedups by a factor of at least 3.8. Alphabet reduction yields an additional speedup factor of 2 on amino acid sequences compared to results achieved with the 20 symbol standard alphabet. The lazy evaluation method is also much faster than previous methods, with speedups of a factor between 3 and 330. 
CONCLUSION: Our analysis of ESAsearch reveals sublinear runtime in the expected case, and linear runtime in the worst case for sequences not shorter than |A|^m + m - 1, where m is the length of the PSSM and A a finite alphabet. In practice, ESAsearch shows superior performance over the most widely used programs, especially for DNA sequences. The new algorithm for accurate on-the-fly calculations of thresholds has the potential to replace formerly used approximation approaches. Beyond the algorithmic contributions, we provide a robust, well documented, and easy to use software package, implementing the ideas and algorithms presented in this manuscript.
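
For orientation, the task this abstract addresses (finding all windows of a sequence whose PSSM score reaches a threshold) can be sketched with the naive sliding-window scan below. This is not ESAsearch: it uses no enhanced suffix array and runs in time proportional to sequence length times PSSM length. The toy matrix, the function name scan_pssm and the threshold value are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy PSSM of length 3 over the DNA alphabet:
# one score table per motif position (values invented for illustration).
PSSM: List[Dict[str, int]] = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]


def scan_pssm(
    sequence: str, pssm: List[Dict[str, int]], threshold: int
) -> List[Tuple[int, int]]:
    """Naive scan: score every window of length m and keep those >= threshold.

    Returns (start position, score) pairs. This is the straightforward
    baseline, not the index-based algorithm described in the abstract.
    """
    m = len(pssm)
    hits: List[Tuple[int, int]] = []
    for start in range(len(sequence) - m + 1):
        # Window score is the sum of per-position scores for the observed symbols.
        score = sum(pssm[i][sequence[start + i]] for i in range(m))
        if score >= threshold:
            hits.append((start, score))
    return hits


hits = scan_pssm("TTACGACGT", PSSM, threshold=6)
```

Here every window is rescored from scratch; an index-based approach such as the one the abstract describes aims to avoid that repeated work.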

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Publications at Bielefeld University

Human phenotype ontology annotation and cluster analysis to unravel genetic defects in 707 cases with unexplained bleeding and platelet disorders

Author: Attwood A
Austin S
Bakchoul T
Bariana TK
Crisp-Hihn A
Erber WN
Favier R
Foad N
Freson K
Furie B
Gattens M
Gomez K
Greene D
Jansen SBG
Jolley JD
Kelly AM
Laffan MA
Lambert MP
Lentaigne C
Liesner R
Meacham S
Millar CM
Mumford AD
Nurden AT
Nurden P
Ouwehand WH
Peerlinck K
Perry DJ
Pillois X
Poudel P
Rendon A
Richardson S
Robinson PN
Schulman S
Schulze H
Simeoni I
Stephens JC
Turro E
Van Geet C
Westbury SK
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/03/2015

Spiral - Imperial College Digital Repository

Investigation of G72 (DAOA) expression in the human brain

Author: A Siepel
AD Medhurst
B John
C Burge
CM van Drunen
E Birney
E Hattori
EL Sonnhammer
F Larsen
Fiona Kelly
G Pesole
I Chumakov
Isabel Benzel
J Cheng
Jacqueline de Belleroche
James NC Kew
JD Bendtsen
JL Ashurst
JNC Kew
L Falquet
L Verrall
LM Melnick
M Korostishevsky
M Kvajo
M Rehmsmeier
MC Frith
MJ van Baren
P Rice
Peter R Maycox
R Kapoor
Ramya Viknaraja
SD Detera-Wadleigh
Steven Hirsch
T Bakheet
T Werner
T Wiehe
TA Down
Thirza H Sanderson
TK Attwood
VG Levitsky
WJ Kent
Publication venue: BioMed Central
Publication date: 01/12/2008

Background: Polymorphisms at the G72/G30 locus on chromosome 13q have been associated with schizophrenia or bipolar disorder in more than ten independent studies. Even though the genetic findings are very robust, the physiological role of the predicted G72 protein has thus far not been resolved. Initial reports suggested G72 as an activator of D-amino acid oxidase (DAO), supporting the glutamate dysfunction hypothesis of schizophrenia. However, these findings have subsequently not been reproduced and reports of endogenous human G72 mRNA and protein expression are extremely limited. In order to better understand the function of this putative schizophrenia susceptibility gene, we attempted to demonstrate G72 mRNA and protein expression in relevant human brain regions. Methods: The expression of G72 mRNA was studied by northern blotting and semi-quantitative SYBR-Green and Taqman RT-PCR. Protein expression in human tissue lysates was investigated by western blotting using two custom-made specific anti-G72 peptide antibodies. An in-depth in silico analysis of the G72/G30 locus was performed in order to try and identify motifs or regulatory elements that provide insight to G72 mRNA expression and transcript stability. Results: Despite using highly sensitive techniques, we failed to identify significant levels of G72 mRNA in a variety of human tissues (e.g. adult brain, amygdala, caudate nucleus, fetal brain, spinal cord and testis), human cell lines or schizophrenia/control post mortem BA10 samples. Furthermore, using western blotting in combination with sensitive detection methods, we were also unable to detect G72 protein in a number of human brain regions (including cerebellum and amygdala), spinal cord or testis. A detailed in silico analysis provides several lines of evidence that support the apparent low or absent expression of G72. Conclusion: Our results suggest that native G72 protein is not normally present in the tissues that we analysed in this study. We also conclude that the lack of demonstrable G72 expression in relevant brain regions does not support a role for G72 in modulation of DAO activity and the pathology of schizophrenia via a DAO-mediated mechanism. In silico analysis suggests that G72 is not robustly expressed and that the transcript is potentially labile. Further studies are required to understand the significance of the G72/30 locus to schizophrenia.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Spiral - Imperial College Digital Repository

PhenoFam-gene set enrichment analysis through protein structural information

Author: A Keller
A Subramanian
ASL Cheng
CA Worby
CH Wu
daW Huang
DH Haft
F Al-Shahrour
F Corpet
F Hahne
F Pearl
Frank Buchholz
G Dennis
H Mi
I Letunic
J Gough
JD Storey
JJ Fuster
LN Nguyen
M Ashburner
M Kanehisa
M Teresa Pisabarro
MA Sartor
Maciej Paszkowski-Rogacz
Mikolaj Slabicki
N Hulo
P Khatri
PD Thomas
R Kittler
RD Finn
RJ Hernstein
S Hunter
SY Kim
T Li
T Lima
TJ Hubbard
TK Attwood
U Fuchs
VK Mootha
WJ Gehring
Y Ben-Shaul
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: With the current technological advances in high-throughput biology, the necessity to develop tools that help to analyse the massive amount of data being generated is evident. A powerful method of inspecting large-scale data sets is gene set enrichment analysis (GSEA) and investigation of protein structural features can guide determining the function of individual genes. However, a convenient tool that combines these two features to aid in high-throughput data analysis has not been developed yet. In order to fill this niche, we developed the user-friendly, web-based application, PhenoFam. Results: PhenoFam performs gene set enrichment analysis by employing structural and functional information on families of protein domains as annotation terms. Our tool is designed to analyse complete sets of results from quantitative high-throughput studies (gene expression microarrays, functional RNAi screens, etc.) without prior pre-filtering or hits-selection steps. PhenoFam utilizes Ensembl databases to link a list of user-provided identifiers with protein features from the InterPro database, and assesses whether results associated with individual domains differ significantly from the overall population. To demonstrate the utility of PhenoFam we analysed a genome-wide RNA interference screen and discovered a novel function of plexins containing the cytoplasmic RasGAP domain. Furthermore, a PhenoFam analysis of breast cancer gene expression profiles revealed a link between breast carcinoma and altered expression of PX domain containing proteins. Conclusions: PhenoFam provides a user-friendly, easily accessible web interface to perform GSEA based on high-throughput data sets and structural-functional protein information, and therefore aids in functional annotation of genes.

Qucosa

Crossref

HSSS - Hochschulschriftenserver der SLUB

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Technische Universität Dresden: Qucosa

Quantitative sequence-function relationships in proteins based on gene ontology

Author: A Bairoch
A Bateman
A Conesa
AE Todd
Arthur M Lesk
CA Wilson
CZ Cai
D Devos
Daniel J Blankenberg
E Camon
EL Sonnhammer
J Piatigorsky
JA Gerlt
JA Ranea
JC Whisstock
K Fleming
L Holm
LB Koski
LJ Jensen
M Ashburner
M Shadidy
MA Andrade
MD Ganfornina
N Hulo
Naomi Altman
P Bork
R Karp
RA Laskowski
RC Edgar
S Jones
S Nakayama
SB Needleman
SE Brenner
SF Altschul
SR Eddy
SS Jeong
T Doerks
TF Smith
TK Attwood
Vineet Sangar
X Lu
Publication venue: BioMed Central
Publication date: 01/08/2007

Background: The relationship between divergence of amino-acid sequence and divergence of function among homologous proteins is complex. The assumption that homologs share function – the basis of transfer of annotations in databases – must therefore be regarded with caution. Here, we present a quantitative study of sequence and function divergence, based on the Gene Ontology classification of function. We determined the relationship between sequence divergence and function divergence in 6828 protein families from the PFAM database. Within families there is a broad range of sequence similarity from very closely related proteins – for instance, orthologs in different mammals – to very distantly-related proteins at the limit of reliable recognition of homology. Results: We correlated the divergence in sequences determined from pairwise alignments, and the divergence in function determined by path lengths in the Gene Ontology graph, taking into account the fact that many proteins have multiple functions. Our results show that, among homologous proteins, the proportion of divergent functions decreases dramatically above a threshold of sequence similarity at about 50% residue identity. For proteins with more than 50% residue identity, transfer of annotation between homologs will lead to an erroneous attribution with a totally dissimilar function in fewer than 6% of cases. This means that for very similar proteins (about 50% identical residues) the chance of completely incorrect annotation is low; however, because of the phenomenon of recruitment, it is still non-zero. Conclusion: Our results describe general features of the evolution of protein function, and serve as a guide to the reliability of annotation transfer, based on the closeness of the relationship between a new protein and its nearest annotated relative.
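
The 50% residue-identity threshold mentioned in this abstract refers to a quantity computed from a pairwise alignment. A minimal sketch of one common way to compute it follows. The abstract does not state which definition of identity the authors used, so the choice here (identical residues divided by aligned, non-gap columns) is an assumption, as are the function name percent_identity and the example strings.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent residue identity between two rows of a pairwise alignment.

    Assumption: columns where either row has a gap ('-') are ignored, and
    identity is matches / compared columns. Other definitions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned pair: 4 identical residues out of 5 gap-free columns.
identity = percent_identity("ACDE-G", "ACDFHG")
above_threshold = identity > 50.0
```

Under the abstract's finding, a pair like this one (above 50% identity) would fall in the range where annotation transfer is reported to be wrong in fewer than 6% of cases.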

Crossref

Directory of Open Access Journals

PubMed Central