
Wageningen University & Research Publications

CoPub Mapper: mining MEDLINE based on search term co-publication

Author: Alako Blaise TF
Jelier Rob
Jenster Guido
Polman Jan
Rullmann Ton
van Baal Sjozef
Veldhoven Antoine
Verhoeven Stefan
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: High-throughput microarray analyses result in many differentially expressed genes that are potentially responsible for the biological process of interest. In order to identify biological similarities between genes, publications from MEDLINE were identified in which pairs of gene names and combinations of gene name with specific keywords were co-mentioned. RESULTS: MEDLINE search strings for 15,621 known genes and 3,731 keywords were generated and validated. PubMed IDs were retrieved from MEDLINE and the relative probability of co-occurrence of all gene-gene and gene-keyword pairs was determined. To assess gene clustering according to literature co-publication, 150 genes consisting of 8 sets with known connections (same pathway, same protein complex, same cellular localization, etc.) were run through the program. Receiver operating characteristic (ROC) analyses showed that most gene sets were clustered much better than expected by random chance. To test grouping of genes from real microarray data, 221 differentially expressed genes from a microarray experiment were analyzed with CoPub Mapper, which resulted in several relevant clusters of genes with biological process and disease keywords. In addition, all genes versus keywords were hierarchically clustered to reveal a complete grouping of published genes based on co-occurrence. CONCLUSION: The CoPub Mapper program allows for quick and versatile querying of co-published genes and keywords and can be successfully used to cluster predefined groups of genes and microarray data.
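The abstract does not give the exact scoring formula, so the following is only a minimal sketch of one common way to express a relative co-occurrence probability: the observed co-publication frequency divided by the frequency expected if the two terms were independent. The function name, the use of Python sets of PubMed IDs, and the toy numbers are all assumptions for illustration, not CoPub Mapper's actual implementation.

```python
from typing import Set


def cooccurrence_ratio(ids_a: Set[int], ids_b: Set[int], corpus_size: int) -> float:
    """Observed / expected co-publication rate under independence (assumed score).

    ids_a, ids_b: PubMed IDs of abstracts mentioning term A and term B.
    corpus_size:  total number of abstracts considered.
    """
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    if not ids_a or not ids_b:
        return 0.0  # a term with no hits cannot co-occur with anything
    p_a = len(ids_a) / corpus_size
    p_b = len(ids_b) / corpus_size
    p_ab = len(ids_a & ids_b) / corpus_size
    return p_ab / (p_a * p_b)


# Hypothetical toy data: two terms, four abstracts each, two shared.
gene_hits = {1, 2, 3, 4}
keyword_hits = {3, 4, 5, 6}
ratio = cooccurrence_ratio(gene_hits, keyword_hits, corpus_size=16)
print(ratio)
```

A ratio above 1 means the pair is co-mentioned more often than independence would predict; a matrix of such scores over all gene-keyword pairs is the kind of input a hierarchical clustering step could work from.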

Lirias

Wageningen University & Research Publications

EUR Research Repository

Erasmus University Digital Repository

In-operando optical observations of alkaline fuel cell electrode surfaces during harsh cycling tests

Author: Alako Kolade
Dawson Richard
Hinde Christopher
Parhar Samritha
Patel Anant
Reynolds Christopher
Publication venue: Elsevier BV
Publication date: 17/08/2017

The durability of low-cost fuel cells is one of the last technical challenges to be overcome before the widespread adoption of fuel cells can become a reality. Most research on this topic concentrates on polymer electrolyte membrane or solid oxide fuel cells, with little published regarding the durability of recirculating liquid electrolyte alkaline fuel cells. In this paper we present an investigation into the durability of this fuel cell variant under harsh load cycling, air starvation and fuel starvation conditions. In the study, making use of the high ionic conductivity of the electrolyte, a novel rig design was utilised, which allowed the surfaces of the electrodes to be constantly monitored optically during the experiments. This demonstrated the good physical durability of the anode during the test protocols whilst highlighting the instability of the manganese-cobalt spinel cathode used in this study during the air starvation protocols. The load cycling stability of the alkaline fuel cells used was found to be good, with the standard configuration giving only around a 2.7% voltage degradation at the 100 mA cm−2 operating point over 8000 load cycles.

Lancaster E-Prints

Gene List significance at-a-glance with GeneValorization

Author: Alako
Anne Biton
Bryan Brancotte
Fabien Reyal
François Radvanyi
Grimes
Isabelle Bernard-Pierrot
Krallinger
Paik
Plake
Sarah Cohen-Boulakia
Vellay
Publication venue: Oxford University Press
Publication date: 15/04/2011

Motivation: High-throughput technologies provide fundamental information concerning thousands of genes. Many research laboratories now use one or more of these technologies daily and end up with lists of genes. Assessing the originality of the results obtained includes being aware of the number of publications available concerning individual or multiple genes and accessing information about these publications. Faced with the exponential growth in the number of available publications and in the number of genes involved in a study, this task is becoming particularly difficult to achieve.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Public Library of Science (PLOS)

HAL-Polytechnique

HAL-Rennes 1

Plasmodium falciparum Heterochromatin Protein 1 Marks Genomic Loci Linked to Phenotypic Variation of Exported Virulence Factors

Author: Alako Blaise T. F.
Bartfai Richard
Bozdech Zbynek
Cowman Alan F.
Ehlgen Florian
Flueck Christian
Niederwieser Igor
Ralph Stuart A.
Salcedo-Amaya Adriana M.
Stunnenberg Hendrik G.
Volz Jennifer
Voss Till S.
Publication venue: Public Library of Science
Publication date: 01/01/2009

Epigenetic processes are the main conductors of phenotypic variation in eukaryotes. The malaria parasite Plasmodium falciparum employs antigenic variation of the major surface antigen PfEMP1, encoded by 60 var genes, to evade acquired immune responses. Antigenic variation of PfEMP1 occurs through in situ switches in mono-allelic var gene transcription, which is PfSIR2-dependent and associated with the presence of repressive H3K9me3 marks at silenced loci. Here, we show that P. falciparum heterochromatin protein 1 (PfHP1) binds specifically to H3K9me3 but not to other repressive histone methyl marks. Based on nuclear fractionation and detailed immuno-localization assays, PfHP1 constitutes a major component of heterochromatin in perinuclear chromosome end clusters. High-resolution genome-wide chromatin immuno-precipitation demonstrates the striking association of PfHP1 with virulence gene arrays in subtelomeric and chromosome-internal islands and a high correlation with previously mapped H3K9me3 marks. These include not only var genes, but also the majority of P. falciparum lineage-specific gene families coding for exported proteins involved in host–parasite interactions. In addition, we identified a number of PfHP1-bound genes that were not enriched in H3K9me3, many of which code for proteins expressed during invasion or at different life cycle stages. Interestingly, PfHP1 is absent from centromeric regions, implying important differences in centromere biology between P. falciparum and its human host. Over-expression of PfHP1 results in an enhancement of variegated expression and highlights the presence of well-defined heterochromatic boundaries. In summary, we identify PfHP1 as a major effector of virulence gene silencing and phenotypic variation. Our results are instrumental for our understanding of this widely used survival strategy in unicellular pathogens

LSHTM Research Online

edoc

University of Melbourne Institutional Repository

CoPub update: CoPub 5.0 a text mining system to answer biological questions

Author: Alako
B. Heupers
Chen
Frijters
Ideker
J. de Vlieg
J. Polman
Pico
R. Frijters
R. van Schaik
S. Verhoeven
Sharan
W. Alkema
W. W. M. Fleuren
Publication venue: Oxford University Press
Publication date: 01/01/2011

In this article, we present CoPub 5.0, a publicly available text mining system, which uses Medline abstracts to calculate robust statistics for keyword co-occurrences. CoPub was initially developed for the analysis of microarray data, but we broadened the scope by implementing new technology and new thesauri. In CoPub 5.0, we integrated existing CoPub technology with new features, and provided a new advanced interface, which can be used to answer a variety of biological questions. CoPub 5.0 allows searching for keywords of interest and their relations to curated thesauri and provides highlighting and sorting mechanisms, using its statistics, to retrieve the most important abstracts in which the terms co-occur. It also provides a way to search for indirect relations between genes, drugs, pathways and diseases, following an ABC principle, in which A and C have no direct connection but are connected via shared B intermediates. With CoPub 5.0, it is possible to create, annotate and analyze networks using the layout and highlight options of Cytoscape Web, allowing for literature-based systems biology. Finally, operations of the CoPub 5.0 Web service enable the CoPub technology to be implemented in bioinformatics workflows. CoPub 5.0 can be accessed through the CoPub portal http://www.copub.org
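The ABC principle described above can be illustrated with a small graph lookup. This is a hedged sketch, not CoPub's code: the adjacency-dictionary representation, the function name `indirect_links`, and every node label in the toy network are hypothetical.

```python
from typing import Dict, List, Set


def indirect_links(graph: Dict[str, Set[str]], a: str, c: str) -> List[str]:
    """Return the shared B intermediates of A and C, or [] if A and C are directly linked.

    graph maps each term to the set of terms it co-occurs with (undirected).
    """
    neighbours_a = graph.get(a, set())
    neighbours_c = graph.get(c, set())
    if c in neighbours_a or a in neighbours_c:
        return []  # a direct connection exists, so this is not an ABC case
    return sorted((neighbours_a & neighbours_c) - {a, c})


# Hypothetical co-occurrence network (labels are placeholders, not real findings).
network = {
    "geneA": {"pathway1", "drugX"},
    "diseaseC": {"pathway1", "drugX", "geneZ"},
    "pathway1": {"geneA", "diseaseC"},
    "drugX": {"geneA", "diseaseC"},
    "geneZ": {"diseaseC"},
}
print(indirect_links(network, "geneA", "diseaseC"))
```

In a real system each A-B and B-C edge would carry a co-occurrence statistic, and the intermediates would be ranked by it rather than sorted alphabetically.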

d-Omix: a mixer of generic protein domain analysis tools

Author: Alako
Apic
Bashton
Bashton
Björklund
Bork
D. Wichadakul
Enright
Geer
Han
Hegyi
Hiraguri
Marcotte
Mott
Murzin
Novatchkova
Orengo
Quevillon
S. Ingsriswang
S. Numnark
Thompson
Vogel
Watts
Wuchty
Ye
Publication venue: Oxford University Press
Publication date

Domain combination provides important clues to the roles of protein domains in protein function, interaction and evolution. We have developed a web server, d-Omix (a Mixer of Protein Domain Analysis Tools), intended as a unified platform to analyze, compare and visualize protein data sets in various aspects of protein domain combinations. With InterProScan files for protein sets of interest provided by users, the server incorporates four services for domain analyses. First, it constructs a protein phylogenetic tree based on a distance matrix calculated from protein domain architectures (DAs), allowing comparison with a sequence-based tree. Second, it calculates and visualizes the versatility, abundance and co-presence of protein domains via a domain graph. Third, it compares the similarity of proteins based on DA alignment. Fourth, it builds a putative protein network derived from domain–domain interactions from DOMINE. Users may select a variety of input data files and flexibly choose domain search tools (e.g. hmmpfam, superfamily) for a specific analysis. Results from d-Omix can be interactively explored and exported into various formats such as SVG, JPG, BMP and CSV. Users with only protein sequences can also prepare an InterProScan file using a service provided by the server. The d-Omix web server is freely available at http://www.biotec.or.th/isl/Domix
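The abstract does not say how the distance between two domain architectures is computed. As a rough sketch only, the example below assumes a Jaccard distance over the sets of domains in each protein, which ignores domain order and repeats; d-Omix may well use a different measure. The protein names and domain lists are invented for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def jaccard_distance(da1: Sequence[str], da2: Sequence[str]) -> float:
    """1 - |shared domains| / |all domains| for two domain architectures (assumed metric)."""
    s1, s2 = set(da1), set(da2)
    if not s1 and not s2:
        return 0.0  # two empty architectures are treated as identical
    return 1.0 - len(s1 & s2) / len(s1 | s2)


def distance_matrix(
    architectures: Dict[str, Sequence[str]]
) -> Tuple[List[str], List[List[float]]]:
    """Return protein names (sorted) and the symmetric pairwise distance matrix."""
    names = sorted(architectures)
    matrix = [
        [jaccard_distance(architectures[a], architectures[b]) for b in names]
        for a in names
    ]
    return names, matrix


# Hypothetical proteins with made-up domain architectures.
proteins = {
    "P1": ["SH3", "SH2", "Kinase"],
    "P2": ["SH2", "Kinase"],
    "P3": ["PDZ"],
}
names, matrix = distance_matrix(proteins)
for name, row in zip(names, matrix):
    print(name, [round(d, 3) for d in row])
```

A matrix like this could be passed to any standard distance-based tree builder (for example neighbour joining) to obtain a tree for comparison with a sequence-based one.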

mspecLINE: bridging knowledge of human disease with the proteome

Author: AM Cohen
B Ye
BJ Stapley
BT Alako
C Bennett
CC van der Eijk
DJ Slotta
E Keogh
Eric W Deutsch
EW Deutsch
F Desiere
H Liao
H Liu
HJ Lowe
J Boyle
J Saltz
Jeremy Handcock
John Boyle
M Li
M Li
M Li
MY Brusniak
P Khatri
P Mallick
P Picotti
P Shannon
PA Covitz
R Cilibrasi
R Cilibrasi
R Homayouni
RL Cilibrasi
S Deerwester
V Lange
Y Tsuruoka
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Public proteomics databases such as PeptideAtlas contain peptides and proteins identified in mass spectrometry experiments. However, these databases lack information about human disease for researchers studying disease-related proteins. We have developed mspecLINE, a tool that combines knowledge about human disease in MEDLINE with empirical data about the detectable human proteome in PeptideAtlas. mspecLINE associates diseases with proteins by calculating the semantic distance between annotated terms from a controlled biomedical vocabulary. We used an established semantic distance measure that is based on the co-occurrence of disease and protein terms in the MEDLINE bibliographic database. Results: The mspecLINE web application allows researchers to explore relationships between human diseases and parts of the proteome that are detectable using a mass spectrometer. Given a disease, the tool will display proteins and peptides from PeptideAtlas that may be associated with the disease. It will also display relevant literature from MEDLINE. Furthermore, mspecLINE allows researchers to select proteotypic peptides for specific protein targets in a mass spectrometry assay. Conclusions: Although mspecLINE applies an information retrieval technique to the MEDLINE database, it is distinct from previous MEDLINE query tools in that it combines the knowledge expressed in scientific literature with empirical proteomics data. The tool provides valuable information about candidate protein targets to researchers studying human disease and is freely available on a public web server.
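The abstract does not name the semantic distance measure. One well-known co-occurrence-based measure is the normalized Google distance of Cilibrasi and Vitányi; the sketch below applies that formula to document counts purely as an illustration, on the assumption that a measure of this general shape is meant. The function name and the counts are hypothetical.

```python
import math


def normalized_distance(f_x: int, f_y: int, f_xy: int, n: int) -> float:
    """Normalized Google distance computed from document counts.

    f_x, f_y: number of records annotated with term x / term y.
    f_xy:     number of records annotated with both terms.
    n:        total number of records in the corpus.
    """
    if f_x <= 0 or f_y <= 0:
        raise ValueError("both terms must occur at least once")
    if n <= min(f_x, f_y):
        raise ValueError("n must exceed the smaller term count")
    if f_xy == 0:
        return math.inf  # terms never co-occur: maximally distant
    log_x, log_y, log_xy = math.log(f_x), math.log(f_y), math.log(f_xy)
    return (max(log_x, log_y) - log_xy) / (math.log(n) - min(log_x, log_y))


# Hypothetical counts for a disease term and a protein term.
print(normalized_distance(f_x=100, f_y=100, f_xy=10, n=10_000))
```

Smaller values indicate terms that tend to be annotated on the same records, so ranking proteins by ascending distance to a disease term gives a candidate list of the kind the tool is described as displaying.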

GenCLiP: a software program for clustering gene lists by literature profiling and constructing gene co-occurrence networks related to custom keywords

Author: AA Schaffer
BT Alako
C Plake
C Rodriguez-Penagos
D Chaussabel
D Lee
EG Cerami
G Karakiulakis
H Kim
Hui-Yong Tian
Jin Zhao
K Fundel
Kai-Tai Yao
KJ Bussey
LJ Jensen
M Bundschus
M Suderman
MB Eisen
N Daraselia
P Shannon
R Hammamieh
R Hoffmann
R Rubinstein
RT Tsai
S Li
T Ide
TK Jenssen
VK Gajendran
Yi-Bo Zhou
Z Huang
ZF Hu
Zhen-Fu Hu
Zhong-Xi Huang
Publication venue: BioMed Central
Publication date: 01/07/2008

Background: Biomedical researchers often want to explore pathogenesis and pathways regulated by abnormally expressed genes, such as those identified by microarray analyses. Literature mining is an important way to assist in this task. Many literature mining tools are now available. However, few of them allow the user to make manual adjustments to zero in on what he or she wants to know in particular. Results: We present our software program, GenCLiP (Gene Cluster with Literature Profiles), which is based on the methods presented by Chaussabel and Sher (Genome Biol 2002, 3(10):RESEARCH0055) that search gene lists to identify functional clusters of genes based on up-to-date literature profiling. Four features were added to this previously described method: the ability to 1) manually curate keywords extracted from the literature, 2) search genes and gene co-occurrence networks related to custom keywords, 3) compare analyzed gene results with negative and positive controls generated by GenCLiP, and 4) calculate probabilities that the resulting genes and gene networks are randomly related. In this paper, we show, with a set of genes differentially expressed between keloids and normal controls, how the functions implemented in GenCLiP successfully identified keywords related to the pathogenesis of keloids and unknown gene pathways involved in the pathogenesis of keloids. Conclusion: With regard to the identification of disease-susceptibility genes, GenCLiP allows one to quickly acquire a primary pathogenesis profile and identify pathways involving abnormally expressed genes not previously associated with the disease.

Improving protein function prediction methods with integrated literature data

Author: A Karimpour-Fard
A Vazquez
A Vinayagam
Aaron P Gabow
AK Ramani
B Schwikowski
BTF Alako
C Brun
C von Mering
Debra S Goldberg
E Nabieva
HW Mewes
I Xenarios
J Rual
K Tsuda
L Hunter
L Hunter
L Tanabe
Lawrence E Hunter
M Ashburner
M Aubry
M Chagoyen
M Huynen
M Krallinger
M Krallinger
M Pelligri
M Yetisgen-Yildiz
OG Troyanskaya
P Srinivasan
PM Bowers
R Cilibrasi
R Hoffmann
S Letovsky
S Raychaudhuri
Sonia M Leach
T Schlitt
T Tanabe
TK Jenssen
U Karaoz
William A Baumgartner
Y Ofran
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Determining the function of uncharacterized proteins is a major challenge in the post-genomic era due to the problem's complexity and scale. Identifying a protein's function contributes to an understanding of its role in the involved pathways, its suitability as a drug target, and its potential for protein modifications. Several graph-theoretic approaches predict unidentified functions of proteins by using the functional annotations of better-characterized proteins in protein-protein interaction networks. We systematically consider the use of literature co-occurrence data, introduce a new method for quantifying the reliability of co-occurrence and test how performance differs across species. We also quantify changes in performance as the prediction algorithms annotate with increased specificity. Results: We find that including information on the co-occurrence of proteins within an abstract greatly boosts performance in the Functional Flow graph-theoretic function prediction algorithm in yeast, fly and worm. This increase in performance is not simply due to the presence of additional edges, since supplementing protein-protein interactions with co-occurrence data outperforms supplementing with a comparably sized genetic interaction dataset. Through the combination of protein-protein interactions and co-occurrence data, the neighborhood around unknown proteins is quickly connected to well-characterized nodes which global prediction algorithms can exploit. Our method for quantifying co-occurrence reliability shows superior performance to the other methods, particularly at threshold values around 10%, which yield the best trade-off between coverage and accuracy. In contrast, the traditional way of asserting co-occurrence when at least one abstract mentions both proteins proves to be the worst method for generating co-occurrence data, introducing too many false positives. Annotating the functions with greater specificity is harder, but co-occurrence data still proves beneficial. Conclusion: Co-occurrence data is a valuable supplemental source for graph-theoretic function prediction algorithms. A rapidly growing literature corpus ensures that co-occurrence data is a readily available resource for nearly every studied organism, particularly those with small protein interaction databases. Though arguably biased toward known genes, co-occurrence data provides critical additional links to well-studied regions in the interaction network that graph-theoretic function prediction algorithms can exploit.