Amino Acid Usage Is Asymmetrically Biased in AT- and GC-Rich Microbial Genomes.

Author: A Dufresne
A Garcia-Gonzalez
B Wang
CE McEwan
D Bharanidharan
David W. Ussery
EP Rocha
Eystein Skjerve
H Akaike
H Naya
H Willenbrock
J Bohlin
J Lightfield
JJ Wernegreen
Jon Bohlin
JP McCutcheon
KT Konstantinidis
KU Foerstner
M Woolfit
N Molina
NA Moran
Ola Brynildsrud
ON Reva
PA Lind
PM Sharp
PS Novichkov
R Hershberg
R Mendez
R Raghavan
S Audic
S Mann
SA Marashi
SN Wood
T Banerjee
Tamir Tuller
Tammi Vesth
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

INTRODUCTION: Genomic base composition ranges from less than 25% AT to more than 85% AT in prokaryotes. Since only a small fraction of a prokaryotic genome is non-coding, even a minor change in genomic base composition will induce profound protein changes. We examined how amino acid and codon frequencies were distributed in over 2000 microbial genomes and how these distributions were affected by base compositional changes. In addition, we wanted to know how genome-wide amino acid usage was biased in the different genomes and how changes to base composition and mutations affected this bias. To carry this out, we used a Generalized Additive Mixed-effects Model (GAMM) to explore non-linear associations and strong data dependences in closely related microbes; principal component analysis (PCA) was used to examine genomic amino acid and codon frequencies, while the concept of relative entropy was used to analyze genomic mutation rates. RESULTS: We found that genomic amino acid frequencies carried a stronger phylogenetic signal than codon frequencies, but that this signal was weak compared to that of genomic %AT. Further, in contrast to codon usage bias (CUB), amino acid usage bias (AAUB) was differently distributed in AT- and GC-rich genomes, in the sense that AT-rich genomes did not prefer specific amino acids over others to the same extent as GC-rich genomes. AAUB was also associated with relative entropy; genomes with low AAUB contained more random mutations, as a consequence of relaxed purifying selection, than genomes with higher AAUB. CONCLUSION: Genomic base composition has a substantial effect on both amino acid and codon frequencies in bacterial genomes. While phylogeny influenced amino acid usage more in GC-rich genomes, AT content was driving amino acid usage in AT-rich genomes. We found the GAMM model to be an excellent tool to analyze the genomic data used in this study.
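
The abstract above names relative entropy as one of its measures but does not say how it was computed. As a minimal sketch of the general concept only (not the authors' procedure), the Python below computes the Kullback-Leibler divergence between two frequency distributions. The function names `frequencies` and `relative_entropy`, the toy sequence, and the choice of a uniform reference distribution are all assumptions made for illustration.

```python
import math
from collections import Counter

def frequencies(sequence, alphabet):
    """Relative frequency of each alphabet symbol in the sequence.

    Symbols outside the alphabet are ignored (an assumption of this sketch).
    """
    counts = Counter(ch for ch in sequence if ch in alphabet)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no symbols from the alphabet")
    return {symbol: counts[symbol] / total for symbol in alphabet}

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Terms with p[x] == 0 contribute nothing; q[x] must be positive
    wherever p[x] is positive, otherwise the divergence is undefined.
    """
    total = 0.0
    for symbol, p_x in p.items():
        if p_x == 0:
            continue
        q_x = q.get(symbol, 0.0)
        if q_x <= 0:
            raise ValueError(f"q is zero where p is positive: {symbol!r}")
        total += p_x * math.log2(p_x / q_x)
    return total

# Hypothetical usage: nucleotide frequencies of a toy AT-rich sequence
# compared against a uniform reference distribution.
alphabet = "ACGT"
observed = frequencies("ATATTAAATTGC", alphabet)
uniform = {base: 1 / len(alphabet) for base in alphabet}
divergence = relative_entropy(observed, uniform)
```

A value of zero means the two distributions are identical; larger values mean the observed frequencies depart further from the reference.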

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Online Research Database In Technology

FigShare

Genome Sequence of the Pea Aphid Acyrthosiphon pisum

Author: Abdel-latief M
Aguilar J
Alioto T
Altincicek B
Anselme C
Ashton P
Ashton PD
Atamian H
Barribeau SM
Bell SN
Bickel RD
Brault V
Brisson JA
Burke G
Butts T
Caillaud M
Calevro F
Camara F
Campbell P
Carolan JC
Cass B
Cazzamali G
Chacko J
Chandrabose MN
Chang C-C
Charles H
Chavez D
Chen H-C
Christiaens O
Colella S
Collin O
Consortium IAG
Cortes T
Cottret L
Cree A
Dale RP
Dang PM
Dao MD
Davies TGE
Davis C
Davis GK
de Vos M
Dearden P
Degnan P
Diaz J
Dinh HH
Dombrovsky A
Douglas A
Douglas AE
Duncan E
Duncan EJ
Edwards OR
Ermolaeva O
Evans J
Febvay G
Fenton B
Ferrier D
Field LM
Fitzroy CIJ
Fowler GR
Fukatsu T
Futami R
Gabaldon T
Gabisi RA
Gatehouse JA
Gauthier J-P
Gerardo NM
Ghanim M
Gibbs RA
Gilbert D
Gordon K
Grimmelikhuijzen CJP
Guigo R
Hansen KK
Hauser F
He X-L
Heckel DG
Heddi A
Hedges D
Hilgarth RS
Hines S
Hitchens ME
Hlavina W
Huang W
Huerta-Cepas J
Hume J
Hunnicutt L
Hunter W
Hurwitz B
Huybrechts J
Iga M
Ishikawa A
Jander G
Janssen R
Jaubert-Possamai S
Jhangian SN
Jiang H
Johnson AJ
Jones A
Jones DH
Joshi V
Jr RSD
Jr WRL
Ju H-J
Kaloshian I
Kamins A
Kamphuis LG
Kapustin Y
Kiryutin B
Kitts P
Koga R
Kosarev P
Kovar C
Kudo T
Kushlan PF
Latorre A
Lavenier D
Lee SL
Legeai F
Leonardo T
Lewis LR
Liu R
Liu Y-S
Llorens C
Lopez J
Lozado RJ
Lu H-L
Macdonald S
Maglott D
Marcet-Houben M
Mariotti M
Martinez-Torres D
McCutcheon JP
McGregor A
Miura T
Miyagishima S-Y
Moen C
Monsion B
Moran N
Moran NA
Morgan MB
Morioka M
Moya A
Murphy T
Muzny D
Nakabachi A
Nazareth LV
Nguyen NB
Nicolas J
Nikoh N
Okwuonu GO
Ollivier M
Ortiz-Rivas B
Parker BJ
Patel BM
Pechuan X
Perez-Brocal V
Permal E
Pignatelli M
Price DRG
Pruitt K
Pu L-L
Quesneville H
Rahbe Y
Ramsey J
Reardon KT
Reeck GR
Ren Q
Richards S
Rispe C
Rizk G
Robertson HM
Robichon A
Rozas J
Ruiz SJ
Sagot MF
Santibanez J
Sapojnikov V
Sattelle D
Schneider M
Schwartz J
Seah S
Shigenobu S
Singh K
Smadja C
Smagghe G
Smith J
Solovyev V
Souvorov A
Spragg CJ
Srinivasan D
Stafflinger E
Stanke M
Steffen D
Stern D
Tagu D
Tamames J
Tamarit D
Tamborindeguy C
Thibaud-Nissen F
Thomas G
van der Zee M
van Fleet E
Vattathil S
Veenstra JA
Velarde R
Vellozo A
Vieira FG
Vilcinskas A
Vincent-Monegat C
Walsh TK
Warren JT
Wilkinson TL
Williamson M
Williamson MS
Williamson S
Wilson ACC
Wilson M
Wolstenholme A
Worley KC
Wright RA
Zhang J
Zhang L
Zhou J-J
Publication venue: Public Library of Science
Publication date: 01/01/2010

Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists that can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the two. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems.

Fraunhofer-ePrints

Directory of Open Access Journals

Digitala Vetenskapliga Arkivet - Academic Archive On-line

UPF Digital Repository

White Rose Research Online

Rothamsted Repository

Publikationer från Uppsala Universitet

Ghent University Academic Bibliography

Royal Holloway - Pure

Secretaría de Estado de Cultura

Diposit Digital de la Universitat de Barcelona

University of St. Andrews - Pure

St Andrews Research Repository

Durham Research Online

MURAL - Maynooth University Research Archive Library

Royal Holloway Research Online

HAL Descartes

Wageningen University & Research Publications

University of Miami: Scholarship Miami

Sunderland University Institutional Repository

The University of Manchester - Institutional Repository

Surrey Research Insight

Hal-Diderot

Repository for Publications and Research Data

Lund University Publications

INRIA a CCSD electronic archive server

UCL Discovery

OAKTrust Digital Repository (Texas A&M Univ)

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

eScholarship - University of California

HAL-Rennes 1

Large expert-curated database for benchmarking document similarity detection in biomedical literature search

Author: Aanei CM
Abid MB
Abramowitz MK
Abu-Zaid A
Afnan M
Agarabi C
Ahmad R
Aizat WM
Al-Farha AA
Al-Lawama M
Alanio A
Alaux C
Albiol J
Albrecht DR
Albuquerque LG
Alimba CG
Allardyce J
Almeida GMF
Alonso-Caneiro D
Alper OM
Amer SEDR
Amiya E
Ammerman BA
Amorim RM
An Q
Andersen SU
Aplin JD
Argyropoulos C
Armitage C
Ascher DB
Ashry M
Asmann YW
Assaeed AM
Atack JM
Atanasov AG
Atchison DA
Atkins GJ
Atlas L
Avery SV
Avillach P
Baade PD
Backman L
Badie C
Bae T
Baier D
Baker CI
Bakkach J
Baldi A
Ball E
Bannon R
Bansal A
Bardot O
Barnett AG
Barraud P
Basharat Z
Basner M
Batra J
Baumert P
Bazanova OM
Beale A
Beck CR
Becker D
Beddoe T
Bell ML
Benezeth Y
Bengtsson-Palme J
Berbesque C
Berezikov E
Bergsland N
Berners-Price S
Bernhardt P
Berrevoet F
Berry E
Berthold M
Bessa TB
Beyene TJ
Biedermann PHW
Bijleveld E
Billington C
Birch J
Bittner F
Bitzer M
Blakely RD
Blanck O
Blaskovich MAT
Bleackley M
Blombach F
Blum R
Boehme KA
Boelaert M
Bogdanos D
Bonvin AMJJ
Bosch C
Bosch O
Boudreau SA
Bourgoin T
Bourke E
Bouvard D
Boykin LM
Bradley G
Bradshaw W
Bramoweth AD
Brand T
Braubach O
Braun D
Braun RJ
Brenneisen P
Bridges KM
Brown JAL
Brown P
Browngardt C
Brownlie J
Bruhl A
Bukowy-Bieryllo Z
Bull JA
Burt A
Bush SJ
Butler LM
Byrareddy SN
Byrne HJ
Cabantous S
Cai Y
Calatayud S
Campana LG
Campbell M
Candal E
Cao Z
Cao Z
Cardoso P
Carlson K
Carter D
Cascella M
Casillas S
Castelvetro V
Caswell PT
Catry T
Cavalli G
Cernava T
Cerovsky V
Chacko G
Chagoyen M
Chakraborty S
Chan SS
Chandrasekaran AR
Chatzitheochari S
Chavez-Fumagalli MA
Chen B
Chen C-E
Chen C-S
Chen DF
Chen H
Chen H
Chen J-T
Chen X
Chen Y
Cheng C
Cheng J
Cheng S
Cheung JTK
Chinapaw M
Chinopoulos C
Cho WCS
Chong L
Chowdhury D
Chung H-J
Chwalibog A
Ciresi A
Cobine PA
Cockcroft S
Coelho LP
Colella V
Conesa A
Conway A
Cook PA
Cooper DN
Cooper J
Coqueret O
Corea EM
Cosacak MI
Costa BM
Costa E
Costa VD
Coupland C
Crawford SY
Cruz AD
Cui H
Cui Q
Cuiv PO
Culver DC
Cuypers M
Cyr N
D'Angiulli A
Dahms TES
Dai Z
Daigle F
Dalgleish R
Dalrymple BP
Danchin A
Danielsen HE
Darras S
Daulatabad SV
Davidson SM
Day DA
de Keersmaecker K
de Leeuw F-E
Dean LT
Debrabant B
Degirmenci V
del Tredici AL
Delahay RM
Demaison L
Denzel MS
Deschodt M
Devkota HP
Devriendt K
Dhariwal R
Diao J
Ding J
Dings RPM
Diouf B
Dixon R
Dlamini SV
Dogan Y
Domingues HS
Dong XC
Donner CF
Dono M
Doxey AC
Dressick W
Drevon CA
Duan H
Ducho C
Ducommun B
Dudley KJ
Dufies M
Duijf PHG
Dumaz N
Dwarakanath BS
Ebell MH
Echeverria N
Ecke T
Eckweiler D
Eerola T
Effiong A
Ehret F
Eisenhardt S
Eixarch E
El-Adawy H
El-Esawi MA
Elkum N
Emmrich JV
Engel MS
Engel N
Epp T
Erickson TB
Esfahlani SS
Eskelinen E-L
Eskew EA
Esnakul AK
Eustace AJ
Evangelou E
Fairhead M
Falk S
Fallah M
Falter-Wagner CM
Fan X
Farber DB
Faville MJ
Feghali KA
Fejzo MS
Fernandez-Triana J
Festa F
Feteira A
Feyerabend F
Fierz W
Filipp FV
Fiona .
Flegel WA
Flood-Page P
Florio T
Forano E
Forsayeth J
Fox SA
Franks SJ
Frentiu FD
Friebe M
Frilander MJ
Fu X
Fujita S
Furuta S
Fuss J
Gabrielsen M
Gajda M
Galea I
Galluzzi L
Gani F
Ganpule AP
Gao J
Garcia-Alix A
Gatchell M
Gaullier G
Gedye K
Gelfer Y
Ghelardi E
Gill MR
Gilliham M
Giordano M
Giunta C
Gladue DP
Gleeson PA
Gloyn L
Gnasso A
Goarant C
Gobet A
Goggs R
Gong H
Gonzalez-Prendes R
Goodin A
Goodyear CS
Gora D
Gough MJ
Govender P
Govinden U
Goyal R
Graham EB
Graham KE
Grande-Perez A
Graves PM
Greene G
Greenwald NF
Greidanus H
Greiff V
Grice D
Grimm DG
Groen EJN
Gruber J
Grunau C
Grundle DS
Gruneberg P
Grybos M
Guisado JL
Gumede N
Gumulya Y
Guo Y
Gurevich VV
Gurney-Champion OJ
Gusev O
Gutierrez-Sacristan A
Habes M
Hacker E
Hage SR
Hagen G
Hahn S
Haller DM
Hammerschmidt S
Han H
Han J
Han Q
Han R
Handfield M
Hanson J
Haore G
Hapuarachchi HC
Harder T
Hardingham JE
Harrison P
Hartmann MD
Harvey DJ
Haston S
Heck M
Heers M
Heffler E
Heinrich M
Helantera H
Herbelet S
Hew KF
Higginbottom DB
Higuchi Y
Hilton R
Hiroi N
Hobbs E
Hodzic E
Hoenner X
Hojsgaard D
Hone A
Hongoh Y
Honjo K
Horbar J
Hori H
Hu G
Hu P
Huber HP
Huber M
Hueso LE
Huirne J
Hurt L
Huttner FJ
Idborg H
Ide K
Ikeo K
Ikonomopoulou MP
Ingley E
Jakeman PM
Janga SC
Janzen T
Jayaraman J
Jeltsch A
Jensen A
Jeurissen P
Jia H
Jia H
Jia S
Jiang F
Jiang J
Jiang X
Jibb LA
Jin Y
Jo D
Johnson AM
Johnson DM
Johnston M
Jongen S
Jonscher KR
Jorens PG
Jorgensen JOL
Josse C
Joubert JW
Jung S-H
Junior AM
Jurman G
Kabra D
Kahan T
Kaiser S
Kamagata K
Kamboj SK
Kamiya H
Kane NC
Kang Y-K
Karamanos Y
Karmakar C
Karp NA
Kasian O
Kauppila JH
Kaye LK
Kelly R
Kelly S
Kenna R
Kennedy J
Kersten B
Khalaf RA
Khalid JM
Khan MM
Khatlani T
Khider T
Kijanka GS
Kim Y-M
King SRB
Kinyanjui T
Kish JK
Klempnauer K-H
Kleppe A
Klump H
Kluz T
Knox P
Kobayashi T
Kobold S
Koch K-W
Kohanbash G
Kohls G
Kohonen-Corish MRJ
Koleva-Kolarova RG
Kong X
Konkle-Parker D
Korpela KM
Kostrikis LG
Kraiczy P
Kratz H
Krause G
Krebsbach PH
Kristensen SR
Kristiansson E
Kueberuwa G
Kugler J-M
Kulkarni A
Kumar G
Kumar N
Kumar N
Kumari P
Kunimatsu A
Kurdak H
Kurgan L
Kurniawan NA
Kwon YD
Lachat C
Lacy-Colson J
Lagisz M
Lai HM
Laky B
Lalaouna D
Lammerding J
Lange M
Larrosa M
Laslett AL
Latif A
Lau CL
Lauschke VM
LeClair EE
Lee K-W
Lee M-S
Lee M-Y
Lee S
Li B
Li G
Li J
Li J
Li J
Li Z
Liang D
Liang S
Lidbury BA
Lieb K
Liehr T
Liew AWC
Lim CJ
Lim YY
Lin MZ
Lindsey ML
Line P-D
Liu D
Liu E
Liu F
Liu F
Liu H
Liu H
Liu S
Liu X
Liu Y-P
Lloyd VK
Lo T-W
Locci E
Loft ND
Loidl J
Lopez-Escamez JA
Lopez-Ruiz FJ
Lorenzen J
Lorkowski S
Lovell NH
Lu H
Lu J-J
Lu Q
Lu W
Lu Z
Luengo GS
Lund BA
Lundh L-G
Lussier AA
Luu AM
Lynch I
Lysy PA
Ma C
Ma L
Ma L
Ma L
Ma R
Ma W
Mabb A
Mack HG
Mackey DA
Mahavadi P
Mahdavi SR
Maher P
Maher T
Maibach EW
Maity SN
Malgrange B
Mamoulakis C
Mangoni AA
Manke T
Manstead ASR
Mantalaris A
Marchbank KJ
Marinello F
Marsal J
Marschalek R
Marschall H-U
Martin CS
Martin FL
Martinez-Raga J
Martinez-Salas E
Martis E
Marzocchi U
Mather DE
Mathieu D
Matsui Y
Maza E
McCrum C
McCutcheon JE
McGarrigle CA
Mckay GJ
McMillan B
McMillan N
Meads C
Medina L
Merrick BA
Meseko C
Metzger DW
Meule A
Meunier FA
Michaelis M
Micheau O
Miele AE
Mier P
Mihara H
Min R
Mintz EM
Miotla P
Mitchell KM
Mizukami T
Moal I
Moalic Y
Mohapatra DP
Molari M
Molleman L
Mondal SR
Montagutelli X
Monteiro A
Montes M
Moore MD
Moran JV
Morcillo E
Morozov SY
Mort M
Moss WN
Moultos OA
Moyer R
Mukherjee M
Murai N
Murphy DJ
Murphy SK
Murray SA
Muth T
Naganawa S
Nagler K
Nakayama K
Nammi S
Nandakumar KS
Narayan E
Nasios G
Natoli RM
Navaratnarajah .
Neumann P-A
Ng G
Nguyen F
Nicol C
Nicoletti R
Nie J
Nie Y
Niehof M
Niemeyer F
Nilsen EB
Nilsson H
Nixon B
Nobile CJ
Norris AD
Nwaiwu O
O'Mahony M
O'Toole R
Ogami K
Ohgami RS
Ohlsson S
Ohtomo T
Olatunbosun O
Oldenmenger WH
Olofsson P
Olumayede E
Orme MW
Ortiz A
Oster H
Ostrikov K
Otto S
Ou J
Outeiro TF
Ouyang S
Paganoni S
Page A
Pallebage-Gamarallage M
Palm C
Palma J-A
Pan Z
Panthee S
Paradies Y
Parchi P
Parsons JR
Parsons MH
Parsons N
Pascal P
Paterson R
Paul E
Pearce SP
Pearson JA
Peckham M
Pedemonte N
Peifer M
Pelkonen T
Pelleri MC
Pellizzon MA
Peng Y
Perco P
Pereira JL
Peres MA
Petrelli M
Pheko M
Pichugin A
Pinto CJC
Pinto IM
Pinto KA
Piotrowski M
Piovesan A
Plevris JN
Pluess M
Podolsky IM
Pollesello P
Polz M
Ponti G
Popoola SI
Porcelli P
Portilla M
Portillo MC
Pourret O
Prajapati AS
Pranata R
Prescott J
Prieto D
Prince M
Pritchard AL
Pusch S
Qi D
Qi X
Quinn GP
Quinn TJ
Raghava GPS
Rahimi F
Rahman MS
Raikou VD
Ramula S
Ranft A
Rappsilber J
Reddan T
Rehfeldt F
Reiling JH
Remacle C
Reschke CR
Rezaei M
Rhodes J
Riddick EW
Ritter U
Riva G
Roach NW
Roberts DD
Roberts NJ
Robles G
Rodrigues T
Rodriguez C
Roislien J
Roobol MJ
Ross K
Ross SA
Rotge J-Y
Rowe AD
Rowe JA
Ruepp A
Rust P
Saad S
Sabnis SC
Sack GH
Saggar M
Saito Y
Salama MF
Sallmon H
Santos M
Saudemont A
Sava G
Schrading S
Schramm A
Schreiber M
Schuele B
Schuler S
Schulte LN
Schuon RA
Schymkowitz J
Sczyrba A
Seib KL
Senghore T
Seow E
Sergeant K
Shabalin IG
Shahid S
Shalchyan V
Shen J
Shi H-P
Shimada T
Shin J-S
Shortt C
Siebers R
Sillanpaa E
Silveyra P
Skinner D
Small I
Smeets PAM
Smith SS
So P-W
Solano F
Sonenshine DE
Song H
Song J
Sorzano CO
Southall T
Speakman JR
Srinivasan MV
St Hilaire C
Stabile LP
Staege MS
Stasiak A
Steadman KJ
Stein N
Stella A
Stephens AW
Stevanovic D
Stewart CJ
Stewart DI
Stine K
Storlazzi C
Stoynova NV
Strzalka W
Suarez OM
Subhash S
Sukocheva O
Sultana T
Sumant AV
Summers MJ
Sun G
Sydes M
Tacon P
Tamaian R
Tan A-C
Tan E-C
Tan K-H
Tanaka K
Tang H
Tanino Y
Targett-Adams P
Tayebi M
Tayyem R
Tebbe CC
Telfer EE
Tempel W
Teodorczyk-Injeyan JA
Terrier O
Testoni I
Thijs G
Thorne S
Thrift AG
Tiffon C
Tinnefeld P
Tjahjono DH
Tofani M
Tolle F
Torga G
Toth E
Tressoldi P
Troder SE
Tsapas A
Tsirigotis K
Turak A
Tuttle N
Tzotzos G
Uchendu F
Udo EE
Uhle F
Utsumi T
Uversky VN
Vaidyanathan S
Vaillant M
Valsesia A
Van de Mortel T
Van den Bos W
van Meerten T
van Nieuwerburgh F
van Raaij MJ
van Ruitenbeek J
Vandenbroucke RE
Vanneste S
Veiga FH
Vendrell M
Verloh N
Vesk PA
Vickers P
Victor VM
Villemur R
Villet MH
Vindin H
Viveiros M
Vohl M-C
Voolstra CR
Vorholt JA
Voskarides K
Voutchkova DD
Vuillemin A
Wakelin S
Waldron L
Walsh LJ
Wang AY
Wang F
Wang Y
Watanabe Y
Weigert A
Weinstock C
Wen J-C
Werner GDA
Werten S
Westermair AL
Wham C
White EP
Widera D
Wiener J
Wilharm G
Wilkinson S
Williams R
Willmann R
Wilson C
Wirth B
Wojan TR
Woldesemayat AA
Wolff M
Wong A
Wong BM
Wu T-W
Wuerbel H
Xia W
Xiao X
Xu D
Xu H
Xu J
Xu J
Xu JW
Xue B
Xue Y
Yadollahpour A
Yalcin S
Yamato M
Yan H
Yang E-C
Yang H
Yang L
Yang S
Yang SY
Yang W
Yang Y
Ye Y
Ye Z-Q
Yeung AWK
Yin C-C
Yli-Kauhaluoma J
Yoneyama H
Yu Y
Yuan G-C
Yuh C-H
Zabetakis I
Zaccolo M
Zaucha J
Zeng C
Zeng E
Zevnik B
Zhang C
Zhang C
Zhang J
Zhang L
Zhang L
Zhang X
Zhang Y
Zhang Y
Zhang Z
Zhang Z
Zhang Z-Y
Zhao X
Zhao Y
Zhou K
Zhou M
Zhu S
Ziegler A
Zinke K
Zuberbier T
Publication venue: Oxford University Press
Publication date: 29/10/2019

Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that covers a variety of research fields, such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
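
Okapi Best Matching 25 (BM25), one of the baseline methods named in the abstract above, can be sketched roughly as follows. This is a minimal illustration of the general ranking function, not the consortium's implementation. Whitespace tokenization, the parameter values `k1=1.5` and `b=0.75`, the non-negative IDF variant, the function name `bm25_scores`, and the toy documents are all assumptions of this sketch.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a basic BM25 formula.

    Assumptions of this sketch: lowercase whitespace tokenization and the
    IDF variant log(1 + (N - n + 0.5) / (n + 0.5)), which is never negative.
    """
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Guard against a corpus made up only of empty documents.
    avg_len = (sum(len(tokens) for tokens in tokenized) / n_docs) or 1.0

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = query.lower().split()
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            # Longer-than-average documents are penalized through b.
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores

# Hypothetical usage with toy titles standing in for candidate articles.
docs = [
    "aphid genome assembly",
    "codon usage bias in bacteria",
    "aphid symbiont genome",
]
scores = bm25_scores("aphid genome", docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Documents that share no term with the query score zero, so in practice a candidate list would be cut off at some rank or score threshold.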

UCL Discovery