1,098 research outputs found

Distance, dissimilarity index, and network community structure

Author: A. Bairoch
C. von Mering
C.M. Deane
E. Ravasz
H. Zhou
H.W. Mewes
Haijun Zhou
I. Xenarios
L.C. Freeman
M. Girvan
P. Uetz
W.W. Zachary
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2003

We address the question of finding the community structure of a complex network. In an earlier effort [H. Zhou, Phys. Rev. E (2003)], the concept of network random walking is introduced and a distance measure defined. Here we calculate, based on this distance measure, the dissimilarity index between nearest-neighboring vertices of a network and design an algorithm to partition these vertices into communities that are hierarchically organized. Each community is characterized by an upper and a lower dissimilarity threshold. The algorithm is applied to several artificial and real-world networks, and excellent results are obtained. In the case of artificially generated random modular networks, this method outperforms the algorithm based on the concept of edge betweenness centrality. For yeast's protein-protein interaction network, we are able to identify many clusters that have well-defined biological functions. Comment: 10 pages, 7 figures, REVTeX4 format.

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Historical urban growth in Europe (1300–1800)

Author: Bairoch P.
De Vries J.
Dittmar J.
Gibrat R.
Hohenberg P. M.
Russell J.
Zipf G.
Publication venue: 'Wiley'
Publication date: 01/01/2019

This paper analyses the evolution of the European urban system from a long-term perspective (from 1300 to 1800). Using the method recently proposed by Clauset, Shalizi, and Newman, a Pareto-type city size distribution (power law) is rejected from 1300 to 1600. A power law is a plausible model for the city size distribution only in 1700 and 1800, although the log-normal distribution is another plausible alternative model that we cannot reject. Moreover, the random growth of cities is rejected using parametric and non-parametric methods. The results reveal a clear pattern of convergent growth in all the periods.
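As an illustration of the kind of fit the abstract refers to, the sketch below computes the standard continuous maximum-likelihood estimate of a power-law exponent, alpha = 1 + n / sum(ln(x_i / xmin)), with its asymptotic standard error (alpha - 1) / sqrt(n). It is a minimal sketch under stated assumptions, not the paper's code: the function name `powerlaw_alpha_mle`, the fixed `xmin`, and the sample city sizes are all hypothetical. A full Clauset, Shalizi, and Newman analysis also selects `xmin` and runs a goodness-of-fit test, and both steps are omitted here.

```python
import math


def powerlaw_alpha_mle(sizes, xmin):
    """Continuous power-law exponent MLE for observations x >= xmin.

    Returns (alpha, standard_error, n_tail). xmin is taken as given;
    choosing it from the data is outside the scope of this sketch.
    """
    tail = [x for x in sizes if x >= xmin]
    if not tail:
        raise ValueError("no observations at or above xmin")
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("all tail observations equal xmin")
    n = len(tail)
    alpha = 1.0 + n / log_sum
    stderr = (alpha - 1.0) / math.sqrt(n)  # asymptotic standard error
    return alpha, stderr, n


# Hypothetical city populations in thousands (illustrative values only,
# not data from the paper).
cities = [10, 12, 15, 20, 25, 40, 60, 100, 150, 400]
alpha, stderr, n = powerlaw_alpha_mle(cities, xmin=10)
print(f"alpha = {alpha:.2f} +/- {stderr:.2f} from {n} cities")
```

The exponent estimate alone does not show that a power law is a plausible model. That conclusion requires the goodness-of-fit step and a comparison with alternatives such as the log-normal, as the abstract notes.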

Crossref

Repositorio Universidad de Zaragoza

neXtProt: a knowledge platform for human proteins

Author: A. Bairoch
A. Britan
A. Gateau
A. Gleizes
A. Masselot
Altschul
Ashburner
Bairoch
C. Zwahlen
Deutsch
G. Argoud-Puy
Goel
I. Cusin
Kelso
L. Lane
Liebel
O. Evalet
Orchard
P. D. Duek
P. Gaudet
Sewell
Shannon
Simpson
Uhlen
Publication venue: Oxford University Press
Publication date: 01/01/2012

neXtProt (http://www.nextprot.org/) is a new human protein-centric knowledge platform. Developed at the Swiss Institute of Bioinformatics (SIB), it aims to help researchers answer questions relevant to human proteins. To achieve this goal, neXtProt is built on a corpus containing both curated knowledge originating from the UniProtKB/Swiss-Prot knowledgebase and carefully selected and filtered high-throughput data pertinent to human proteins. This article presents an overview of the database and the data integration process. We also lay out the key future directions of neXtProt that we consider the necessary steps to make neXtProt the one-stop-shop for all research projects focusing on human proteins.

Crossref

PubMed Central

Archive ouverte UNIGE

The PROSITE database, its status in 1999

Author: Bairoch A.
Bucher P.
Falquet L.
Hofmann K.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 17/12/2007

The PROSITE database (http://www.expasy.ch/sprot/prosite.html) consists of biologically significant patterns and profiles formulated in such a way that with appropriate computational tools it can help to determine to which known family of protein (if any) a new sequence belongs, or which known domain(s) it contains.

Infoscience - École polytechnique fédérale de Lausanne

An approach to describing and analysing bulk biological annotation quality: a case study using UniProtKB

Author: Bairoch
Baumgartner
Boeckmann
C. S. Gillespie
Curwen
D. Swan
Dolan
Flesch
Gilks
Lord
M. J. Bell
P. Lord
Pal
Ussery
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012

Motivation: Annotations are a key feature of many biological databases, used to convey our knowledge of a sequence to the reader. Ideally, annotations are curated manually; however, manual curation is costly, time consuming and requires expert knowledge and training. Given these issues and the exponential increase of data, many databases implement automated annotation pipelines in an attempt to avoid un-annotated entries. Both manual and automated annotations vary in quality between databases and annotators, making assessment of annotation reliability problematic for users. The community lacks a generic measure for determining annotation quality and correctness, which we look at addressing within this article. Specifically, we investigate word reuse within bulk textual annotations and relate this to Zipf's Principle of Least Effort. We use UniProt Knowledge Base (UniProtKB) as a case study to demonstrate this approach since it allows us to compare annotation change, both over time and between automated and manually curated annotations. Results: By applying power-law distributions to word reuse in annotation, we show clear trends in UniProtKB over time, which are consistent with existing studies of quality on free text English. Further, we show a clear distinction between manual and automated analysis and investigate cohorts of protein records as they mature. These results suggest that this approach holds distinct promise as a mechanism for judging annotation quality. Availability: Source code is available at the authors' website: http://homepages.cs.ncl.ac.uk/m.j.bell1/annotation. Contact: [email protected] Comment: Paper accepted at The European Conference on Computational Biology 2012 (ECCB'12). Subsequently will be published in a special issue of the journal Bioinformatics. Paper consists of 8 pages, made up of 5 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

A survey of orphan enzyme activities

Author: A Bairoch
A Barrett
B Briggs
D Naumoff
DL Wheeler
E Pennisi
J Melnick
JD Peterson
K Tipton
MY Galperin
O Lespinet
P Bork
P Karp
P Romero
PD Karp
Peter D Karp
RJ Roberts
RV Misra
T Cheng
TJ Lee
W Nishii
X Chen
Y Wang
Yannick Pouliot
Publication venue: BioMed Central
Publication date: 01/07/2007

Background: Using computational database searches, we have demonstrated previously that no gene sequences could be found for at least 36% of enzyme activities that have been assigned an Enzyme Commission number. Here we present a follow-up literature-based survey involving a statistically significant sample of such "orphan" activities. The survey was intended to determine whether sequences for these enzyme activities are truly unknown, or whether these sequences are absent from the public sequence databases but can be found in the literature. Results: We demonstrate that for ~80% of sampled orphans, the absence of sequence data is bona fide. Our analyses further substantiate the notion that many of these enzyme activities play biologically important roles. Conclusion: This survey points toward significant scientific cost of having such a large fraction of characterized enzyme activities disconnected from sequence data. It also suggests that a larger effort, beginning with a comprehensive survey of all putative orphan activities, would resolve nearly 300 artifactual orphans and reconnect a wealth of enzyme research with modern genomics. For these reasons, we propose that a systematic effort to identify the cognate genes of orphan enzymes be undertaken.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Economic Backwardness and Social Tension

Author: Acemoglu D.
Bairoch P.
Chenoweth E.
Downey G. J.
Duesenberry J. S.
Edwards L. P.
Gerschenkron A.
Gurr T. R.
Post Office
Post Office Department
Ross N. E.
Standage T.
Veblen T.
Publication venue: 'Wiley'
Publication date: 09/10/2017

We propose that relative economic backwardness contributes to the build-up of social tension and non-violent and violent conflict. We test our hypothesis using data on organized mass movements and armed civil conflict. The findings show that greater economic backwardness is consistently linked to a higher probability of onset of violent and especially non-violent forms of civil unrest. We provide evidence that the relationship is causal in instrumental variables estimations using new instruments, including mailing speeds and telegram charges around 1900. The magnitude of the effect of backwardness on social tension increases in the two-stage least-squares estimations.

Crossref

University of East Anglia digital repository

Complex networks theory for analyzing metabolic networks

Author: A. B. Horne
A. Bairoch
A. Broder
A. L. Barabasi
A. Samal
A. Wagner
B. M. Bakker
C. H. Schilling
C. Wagner
D. J. Watts
E. Ravasz
H. Jeong
H. Kitano
H. Lipson
H. W. Ma
Hong Yu
J. A. Papin
J. C. M. Mombach
J. Gagneur
J. Stelling
Jianhua Luo
Jing Zhao
L. H. Hartwell
M. Arita
M. C. Palumbo
M. Faloutsos
M. Kanehisa
M. Nakao
M. V. Martinov
N. Lemke
P. D. Karp
P. Erdös
P. Holme
R. Albert
R. Guimera
R. Mahadevan
R. Overbeek
R. Schuster
S. Goto
S. Schuster
S. Wuchty
Yixue Li
Z. N. Oltvai
Z. W. Cao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/08/2006

One of the main tasks of post-genomic informatics is to systematically investigate all molecules and their interactions within a living cell so as to understand how these molecules and the interactions between them relate to the function of the organism, while networks are an appropriate abstract description of all kinds of interactions. In the past few years, great progress has been made in developing the theory of complex networks for revealing the organizing principles that govern the formation and evolution of various complex biological, technological and social networks. This paper reviews the accomplishments in constructing genome-based metabolic networks and describes how the theory of complex networks is applied to analyze metabolic networks. Comment: 13 pages, 2 figures.

arXiv.org e-Print Archive

Crossref

Political Regimes and Sovereign Credit Risk in Europe, 1750-1913

Author: A Maddison
Austro-Prussian War
B Frey
Belgian Revolt
C Hill
C Kindleberger
C Meissner
C Reinhart
C Tilly
Carlist St
D Acemoglu
D Hirst
D Sacks
D Stasavage
E Kiser
E White
Encyclopedia Britannica
England
F Velde
France
G Clark
G Tortella
H Jackson
J Bai
J Brewer
J De Vries
J Jones
J L Rosenthal
J L Van Zanden
J Mokyr
J Wooldridge
K Jaggers
K Mitchener
K Willard
L Stone
La Porta
M Bordo
M Clodfelter
M Dincecco
M Flandreau
M Hart
M Obstfeld
Mark Dincecco
Mercurius Maandelijksche Hollandsche
N Beck
N Ferguson
N Sussman
Netherlands
P Bairoch
P Dickson
P Hoffman
P Hohenberg
P Mathias
P Mauro
P O'
Prussia
R Bonney
R Cust
R Price
R Tilly
S Homer
S Quinn
S R Epstein
Spain
T Sargent
Universel Le Moniteur
W Brown
W Fritschy
W Summerhill
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2008

This article uses a new panel data set to perform a statistical analysis of political regimes and sovereign credit risk in Europe from 1750 to 1913. Old Regime polities typically suffered from fiscal fragmentation and absolutist rule. By the start of World War I, however, many such countries had centralized institutions and limited government. Panel regressions indicate that centralized and/or limited regimes were associated with significant improvements in credit risk relative to fragmented and absolutist ones. Structural break tests also reveal close relationships between major turning points in yield series and political transformations.

Crossref

IMT Institutional Repository

Protein folding using contact maps

Author: A. Bairoch
A. M. Gutin
A. Sali
A. T. Brünger
A. V. Finkelstein
C. J. Camacho
C. Micheletti
D. A. Hinds
D. Nabutovsky
E. Domany
E. I. Shakhnovich
F. Seno
H. Frauenfelder
H. Frauenkron
H. Li
K. D. Klimov
K. F. Lau
L. Mirny
M. H. Hao
M. L. Minsky
M. Vendruscolo
P. D. Thomas
R. L. Jernigan
R. Najmanovich
S. Miyazawa
T. Garel
V. S. Pande
Publication venue: 'American Physical Society (APS)'
Publication date: 21/01/1999

We present the development of the idea to use dynamics in the space of contact maps as a computational approach to the protein folding problem. We first introduce two important technical ingredients, the reconstruction of a three-dimensional conformation from a contact map and the Monte Carlo dynamics in contact map space. We then discuss two approximations to the free energy of the contact maps and a method to derive energy parameters based on perceptron learning. Finally we present results, first for predictions based on threading and then for energy minimization of crambin and of a set of 6 immunoglobulins. The main result is that we proved that the two simple approximations we studied for the free energy are not suitable for protein folding. Perspectives are discussed in the last section. Comment: 29 pages, 10 figures.

arXiv.org e-Print Archive

Crossref