Stirling Online Research Repository (RIOXX)

Publications at Bielefeld University

Stirling Online Research Repository

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

Identification of conserved gene clusters in multiple genomes based on synteny and homology

Author: A Alexeyenko
A Bergeron
AK Bansal
Anasua Sarkar
B Snel
D Goldberg
DJ Sherman
EA Housworth
FA Kondrashov
G Consortium
G Didier
Hayssam Soueidan
J Tamames
JH Nadeau
K Vandepoele
KH Wolfe
L Li
L Parida
M Ermolaeva
M Lynch
M Nikolski
Macha Nikolski
ML Seret
MP Beal
Q Yang
R Hoberman
R Jothi
R Overbeek
S Heber
S Kim
S Ohno
T Dandekar
T Schmidt
TJ Vision
W Fitch
WM Fitch
X He
Publication venue: BioMed Central
Publication date: 01/10/2011

Abstract. Background: Uncovering the relationship between conserved chromosomal segments and the functional relatedness of elements within these segments is an important question in computational genomics. We build upon the series of works on gene teams and homology teams. Results: Our primary contribution is a local sliding-window SYNS (SYNtenic teamS) algorithm that refines an existing family structure into orthologous sub-families by analyzing the neighborhoods around the members of a given family with a locally sliding window. The neighborhood analysis is done by computing conserved gene clusters. We evaluate our algorithm on the existing homologous families from the Genolevures database over five genomes of the Hemiascomycete phylum. Conclusions: The result is an efficient algorithm that works on multiple genomes, considers paralogous copies of genes, and is able to uncover orthologous clusters even in distant genomes. The resulting orthologous clusters are comparable to those obtained by manual curation.
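The idea of comparing the neighborhoods around members of one family can be illustrated with a small sketch. This is not the SYNS algorithm from the abstract: it is a simplified stand-in that assumes each genome is given as an ordered list of family labels, and it pairs family members from different genomes when their windows share enough neighboring families. The names `neighborhood`, `syntenic_pairs`, `window` and `min_shared` are hypothetical.

```python
def neighborhood(genome, pos, window):
    """Set of family labels within `window` positions of `pos`, excluding `pos` itself."""
    lo = max(0, pos - window)
    hi = min(len(genome), pos + window + 1)
    return {fam for i, fam in enumerate(genome[lo:hi], start=lo) if i != pos}


def syntenic_pairs(genomes, family, window=2, min_shared=2):
    """Pair members of `family` from different genomes whose neighborhoods
    share at least `min_shared` other families (a toy criterion, assumed here)."""
    members = [
        (name, pos)
        for name, genome in genomes.items()
        for pos, fam in enumerate(genome)
        if fam == family
    ]
    pairs = []
    for idx, (ga, pa) in enumerate(members):
        for gb, pb in members[idx + 1:]:
            if ga == gb:
                continue  # paralogs in the same genome are not paired in this sketch
            shared = neighborhood(genomes[ga], pa, window) & neighborhood(genomes[gb], pb, window)
            shared.discard(family)  # ignore other copies of the family itself
            if len(shared) >= min_shared:
                pairs.append(((ga, pa), (gb, pb), sorted(shared)))
    return pairs


# Toy input: two genomes, each with two copies of family "F".
toy_genomes = {
    "g1": ["A", "B", "F", "C", "D", "F", "X"],
    "g2": ["B", "A", "F", "C", "Y", "Z", "F"],
}
toy_pairs = syntenic_pairs(toy_genomes, "F")
```

In this toy input only the copies at position 2 in both genomes are paired, because their windows share the families A, B and C; the other copies have no comparable context.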

Lund University Publications

Springer - Publisher Connector

Publications at Bielefeld University

A Survey of Matrix Completion Methods for Recommendation Systems

Author: Li Min
Li Yaohang
Liu Quan
Ramlatchan Andy
Wang Jianxin
Yang Mengyun
Publication venue: ODU Digital Commons
Publication date: 01/07/2018

In recent years, recommendation systems have become increasingly popular and have been used in a broad variety of applications. Here, we investigate matrix completion techniques for recommendation systems based on collaborative filtering. The collaborative filtering problem can be viewed as predicting a user's preference for new items. When a rating matrix is constructed with users as rows, items as columns, and entries as ratings, the collaborative filtering problem can be modeled as a matrix completion problem: filling in the unknown elements of the rating matrix. This article presents a comprehensive survey of the matrix completion methods used in recommendation systems. We focus on the mathematical models for matrix completion and the corresponding computational algorithms, as well as their characteristics and potential issues. Several applications other than the traditional user-item association prediction are also discussed.
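As a rough illustration of modeling collaborative filtering as matrix completion, the sketch below fits a low-rank factorization to the observed ratings only and uses it to predict the missing entries. It is one common approach (regularized matrix factorization trained by stochastic gradient descent), chosen here for brevity and not taken from the survey. The function names, hyperparameters and toy ratings are all assumptions.

```python
import random


def factorize(observed, n_users, n_items, rank=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Fit user factors P and item factors Q so that dot(P[u], Q[i]) approximates
    each observed rating. `observed` maps (user, item) -> rating."""
    rnd = random.Random(seed)
    P = [[rnd.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rnd.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i), rating in observed.items():
            err = rating - sum(p * q for p, q in zip(P[u], Q[i]))
            for k in range(rank):
                p, q = P[u][k], Q[i][k]
                # Gradient step on squared error with L2 regularization.
                P[u][k] += lr * (err * q - reg * p)
                Q[i][k] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, user, item):
    """Predicted rating for any (user, item) cell, observed or not."""
    return sum(p * q for p, q in zip(P[user], Q[item]))


def rmse(P, Q, observed):
    """Root-mean-square error over the observed cells."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for (u, i), r in observed.items())
    return (total / len(observed)) ** 0.5


# Toy rating matrix (3 users x 3 items); cells (0, 2) and (2, 0) are unknown.
toy_ratings = {
    (0, 0): 5.0, (0, 1): 4.0,
    (1, 0): 4.0, (1, 1): 3.0, (1, 2): 1.0,
    (2, 1): 2.0, (2, 2): 5.0,
}
P, Q = factorize(toy_ratings, n_users=3, n_items=3)
filled = [[predict(P, Q, u, i) for i in range(3)] for u in range(3)]
```

The completed matrix `filled` contains a prediction for every cell, including the two that were never observed. How trustworthy those predictions are depends on the rank, the regularization and how many entries are known.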

Old Dominion University

Efficient algorithms for gene cluster detection in prokaryotic genomes

Author: Schmidt Thomas
Publication venue: Bielefeld University
Publication date: 01/01/2005

Schmidt T. Efficient algorithms for gene cluster detection in prokaryotic genomes. Bielefeld (Germany): Bielefeld University; 2005. Research in genomics has emerged rapidly in the last few years, and the availability of completely sequenced genomes continuously increases due to the use of semi-automatic sequencing machines. These sequences, mostly prokaryotic ones, are also well annotated, which means that the positions of their genes and parts of their regulatory or metabolic pathways are known. A new task in the field of bioinformatics is now to gain gene or protein information from the comparison of genomes on a higher level. In the approach of "comparative genomics", researchers in bioinformatics attempt to locate groups or clusters of orthologous genes that may have the same function in multiple genomes. This research is often anchored on the simple but biologically verified fact that functionally related proteins are usually coded by genes placed in close genomic neighborhood in different species. From an algorithmic and combinatorial point of view, the first descriptions of the concept of "closely placed genes" were only fragmentary, and sometimes confusing. The given algorithms often lack the necessary grounds to prove their correctness or assess their complexity. Within the first formal models of a conserved genomic neighborhood, genomes are often represented as permutations of their genes, and common intervals, i.e. intervals containing the same set of genes, are interpreted as gene clusters. The major disadvantage of representing genomes as permutations is that paralogous copies of the same gene inside one genome cannot be modelled. Since large genomes in particular contain numerous paralogous genes, this model is insufficient for use on real genomic data.
In this work, we consider a modified model of gene clusters that allows paralogs, simply by representing genomes as sequences rather than permutations of genes. We define common intervals based on this model, and we present a simple algorithm that finds all common intervals of two sequences in Θ(n²) time using Θ(n²) space. Another, more complicated algorithm runs in O(n²) time and uses only linear space. We also show how to extend these algorithms to more than two genomes and present the implementation of the algorithms as well as the visualization of the located clusters in the tool Gecko. Since the creation of the string representation of a set of genomes is a non-trivial task, we also present the data preparation tool GhostFam, which groups all genes from the given set of genomes into their families of homologs. In the evaluation on a set of 20 bacterial genomes, we show that with the presented approach it is possible to correctly locate gene clusters that are known from the literature, and to successfully predict new groups of functionally related genes.
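To make the notion of common intervals on sequences concrete, here is a brute-force sketch. It is not either of the algorithms described in the thesis and does not reach their time or space bounds: it simply enumerates the gene set of every contiguous interval in each sequence and keeps the sets that occur in both. It also omits any maximality condition on interval locations. The function names and the `min_size` parameter are hypothetical.

```python
from collections import defaultdict


def interval_sets(seq):
    """Map each gene set that occurs as a contiguous interval of `seq`
    to the list of (start, end) locations (inclusive) where it occurs."""
    found = defaultdict(list)
    for start in range(len(seq)):
        genes = set()
        for end in range(start, len(seq)):
            genes.add(seq[end])
            found[frozenset(genes)].append((start, end))
    return found


def common_intervals(seq_a, seq_b, min_size=2):
    """Gene sets of at least `min_size` genes that appear as an interval in both
    sequences, with their locations in each. Repeated genes (paralogs) are allowed."""
    sets_a, sets_b = interval_sets(seq_a), interval_sets(seq_b)
    return {
        genes: (sets_a[genes], sets_b[genes])
        for genes in sets_a.keys() & sets_b.keys()
        if len(genes) >= min_size
    }


# Toy genomes as sequences of gene-family labels; "a" occurs twice in the first.
toy_clusters = common_intervals(list("abcad"), list("cbaed"))
```

For these toy sequences the shared gene sets are {a, b}, {b, c} and {a, b, c}. The last one has three locations in the first sequence because of the repeated "a", which is exactly the situation a permutation model cannot represent.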

Silent but Not Static: Accelerated Base-Pair Substitution in Silenced Chromatin of Budding Yeasts

Author: AB Barton
AL Goldstein
C Diaz-Castillo
C Terleth
D Carter
DC Amberg
DE Gottschling
E Kejnovsky
EJ Louis
EV Linardopoulou
F Baudat
FE Pryde
G Liti
Gregory S. Barsh
JA Anderson
Jasper Rine
JD Barry
JD Lieb
JF Hughes
JJ Wyrick
JM Kim
JQ Svejstrup
Leonid Teytelman
LN Rusche
M Kellis
M Livingstone-Zatchej
MA Vega-Palas
Michael B. Eisen
P Cliften
P Rice
PF Cliften
R Chenna
S Henikoff
SF Altschul
SI Grewal
SK Mewborn
TF Smith
U Gueldener
U Nagalakshmi
WH Tham
Y Su
Y Zhu
Z Lippman
Publication venue: Public Library of Science
Publication date: 01/01/2008

Subtelomeric DNA in budding yeasts, like metazoan heterochromatin, is gene poor, repetitive, transiently silenced, and highly dynamic. The rapid evolution of subtelomeric regions is commonly thought to arise from transposon activity and increased recombination between repetitive elements. However, we found evidence of an additional factor in this diversification. We observed a surprising level of nucleotide divergence in transcriptionally silenced regions in inter-species comparisons of Saccharomyces yeasts. Likewise, intra-species analysis of polymorphisms also revealed increased SNP frequencies in both intergenic and synonymous coding positions of silenced DNA. This analysis suggested that silenced DNA in Saccharomyces cerevisiae and closely related species had increased single base-pair substitution that was likely due to the effects of the silencing machinery on DNA replication or repair

CiteSeerX

Public Library of Science (PLOS)

Research Papers in Economics

Resource management and the effects of trade on vulnerable places and people : lessons from six case studies

Author: Larson Donald F.
Nash John

Lessons from six case studies illustrate the complex relationships between international trade, vulnerable ecologies and the poor. The studies, taken from Africa, Asia and Latin America and conducted by local researchers, are set in places where the poor live in close proximity to ecologies that are important to global conservation efforts, and focus on the cascading consequences of trade policy for local livelihoods and environmental services. Collectively, the studies show how under-valued common resources are often poorly protected and consequently subject to shifting economic incentives, including those that arise from trade. The studies provide examples where trade works to accelerate the use of natural resources and to exacerbate unsustainable dependencies by the poor, and other examples where trade has the opposite effect. An important conclusion is that local livelihood and technology choices have important consequences for how environmental resources are used and should be taken into account when designing policies to safeguard fragile ecologies.
Keywords: Environmental Economics & Policies, Economic Theory & Research, Emerging Markets, Labor Policies, Population Policies

A Fresh Insight into Transmission of Schistosomiasis: A Misleading Tale of Biomphalaria in Lake Victoria

Author: Chris Wade
Claire J. Standley
Henk D. F. H. Schallig
J. Russell Stothard
Publication venue: Public Library of Science
Publication date: 01/01/2011

Lake Victoria is a known hot-spot for Schistosoma mansoni, which utilises freshwater snails of the genus Biomphalaria as intermediate hosts. Different species of Biomphalaria are associated with varying parasite compatibility, affecting local transmission. It is thought that two species, B. choanomphala and B. sudanica, inhabit Lake Victoria; despite their biomedical importance, the taxonomy of these species has not been thoroughly examined. This study combined analysis of morphological and molecular variables; the results demonstrated that molecular groupings were not consistent with morphological divisions. Habitat significantly predicted morphotype, suggesting that the different Lake Victorian forms of Biomphalaria are ecophenotypes of one species. The nomenclature should be revised accordingly; the names B. choanomphala choanomphala and B. c. sudanica are proposed. From a public health perspective, these findings can be utilised by policy-makers for better understanding of exposure risk, resulting in more effective and efficient control initiatives.

CiteSeerX

Repository@Nottingham

ComPath: comparative enzyme analysis and annotation in pathway/subsystem contexts

Author: A Andreeva
A Bateman
A Marchler-Bauer
A Osterman
AL Barabási
C Gene Ontology
C The UniProt
CJA Sigrist
CM Zmasek
DA Benson
DH Haft
HM Berman
HW Ma
J Wu
K Choi
Kwangmin Choi
L Pireddu
M Kanehisa
M Madera
N Hulo
P Stothard
PC Babbitt
PD Karp
R Caspi
R Overbeek
RA George
S Kim
SCH Pegg
SF Altschul
Sun Kim
V BATAGELJL
VM Markowitz
W Thompson
WR Pearson
Y Ye
Y Zheng
YI Wolf
Publication venue: BioMed Central
Publication date: 01/03/2008

Abstract. Background: Once a new genome is sequenced, one of the important questions is to determine the presence and absence of biological pathways. Analysis of biological pathways in a genome is a complicated task, since a number of biological entities are involved in pathways and biological pathways in different organisms are not identical. Computational pathway identification and analysis thus involves a number of computational tools and databases and is typically done in comparison with pathways in other organisms. This computational requirement is much beyond the capability of biologists, so information systems for reconstructing, annotating, and analyzing biological pathways are much needed. We introduce a new comparative pathway analysis workbench, ComPath, which integrates various resources and computational tools using an interactive spreadsheet-style web interface for reliable pathway analyses. Results: ComPath allows users to compare biological pathways in multiple genomes using a spreadsheet-style web interface where various sequence-based analyses can be performed either to compare enzymes (e.g. sequence clustering) and pathways (e.g. pathway hole identification), to search a genome for de novo prediction of enzymes, or to annotate a genome in comparison with reference genomes of choice. To fill in pathway holes or make de novo enzyme predictions, multiple computational methods such as FASTA, Whole-HMM, CSR-HMM (a method of our own introduced in this paper), and PDB-domain search are integrated in ComPath. Our experiments show that the FASTA and CSR-HMM search methods generally outperform the Whole-HMM and PDB-domain search methods in terms of sensitivity, but FASTA search performs poorly in terms of specificity, detecting more false positives as the E-value cutoff increases. Overall, the CSR-HMM search method performs best in terms of both sensitivity and specificity.
Gene neighborhood and pathway neighborhood (global network) visualization tools can be used to get context information that is complementary to the conventional KEGG map representation. Conclusion: ComPath is an interactive workbench for pathway reconstruction, annotation, and analysis where experts can perform various sequence, domain, and context analyses using an intuitive and interactive spreadsheet-style interface.

Springer - Publisher Connector