19 research outputs found

Quantification of codon selection for comparative bacterial genomics

Background: Statistics measuring codon selection seek to compare genes by their sensitivity to selection for translational efficiency, but existing statistics lack a model for testing the significance of differences between genes. Here, we introduce a new statistic for measuring codon selection, the Adaptive Codon Enrichment (ACE). Results: This statistic represents codon usage bias in terms of a probabilistic distribution, quantifying the extent to which preferred codons are over-represented in the gene of interest relative to the mean and variance that would result from stochastic sampling of codons. Expected codon frequencies are derived from the observed codon usage frequencies of a broad set of genes, such that they are likely to reflect nonselective, genome-wide influences on codon usage (e.g. mutational biases). The relative adaptiveness of synonymous codons is deduced from the frequency of codon usage in a pre-selected set of genes relative to the expected frequency. The ACE can predict both transcript abundance during rapid growth and the rate of synonymous substitutions, with accuracy comparable to or greater than existing metrics. We further examine how the composition of reference gene sets affects the accuracy of the statistic, and suggest methods for selecting appropriate reference sets for any genome, including bacteriophages. Finally, we demonstrate that the ACE may naturally be extended to quantify the genome-wide influence of codon selection in a manner that is sensitive to a large fraction of codons in the genome. This reveals substantial variation among genomes, correlated with the tRNA gene number, even among groups of bacteria where previously proposed whole-genome measures show little variation. Conclusions: The statistical framework of the ACE allows rigorous comparison of the level of codon selection acting on genes, both within a genome and between genomes.
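
The abstract describes scoring a gene by how far its count of preferred codons exceeds the mean expected from stochastic sampling, scaled by the corresponding variance. A minimal sketch of that general idea follows. It is not the published ACE formula, which the abstract does not state. The function name `enrichment_z`, the toy two-amino-acid codon table, and the assumption that each codon site is an independent draw from the background frequencies are all illustrative choices.

```python
import math

# Toy synonymous codon families (assumption: a real table covers all amino acids).
FAMILIES = {"Lys": ["AAA", "AAG"], "Phe": ["TTT", "TTC"]}
CODON_TO_AA = {c: aa for aa, codons in FAMILIES.items() for c in codons}


def enrichment_z(gene_codons, preferred, background):
    """Z-score of the preferred-codon count in one gene.

    gene_codons: list of codons in the gene
    preferred:   set of codons assumed to be translationally preferred
    background:  dict codon -> expected frequency within its synonymous
                 family (frequencies in each family sum to 1)
    """
    observed = 0
    mean = 0.0
    var = 0.0
    for codon in gene_codons:
        aa = CODON_TO_AA[codon]
        # Probability that a random draw for this amino acid is a preferred codon.
        p = sum(background[c] for c in FAMILIES[aa] if c in preferred)
        observed += codon in preferred
        mean += p                # expected count under the background model
        var += p * (1.0 - p)     # Bernoulli variance, sites assumed independent
    if var == 0.0:
        raise ValueError("no variance: every site is deterministic under the background")
    return (observed - mean) / math.sqrt(var)


background = {"AAA": 0.5, "AAG": 0.5, "TTT": 0.5, "TTC": 0.5}
preferred = {"AAG", "TTC"}
z_biased = enrichment_z(["AAG", "AAG", "TTC", "TTC"], preferred, background)
z_neutral = enrichment_z(["AAG", "AAA", "TTC", "TTT"], preferred, background)
```

In this toy setup a gene that uses only preferred codons scores above zero and a gene that matches the background scores zero. Expressing the result in standard-deviation units is what makes genes of different lengths comparable under the stated assumptions.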

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

D-Scholarship@Pitt

Translational Selection Is Ubiquitous in Prokaryotes

Author: A Carbone
A Wagner
AM Resch
AV Glyakina
B Lafay
C Kimchi-Sarfaty
C Nadeau
C Rispe
DC Hess
EP Rocha
EV Koonin
F Meier
F Supek
Fran Supek
G Perriere
H Charles
H Grosjean
H Roy
H Suzuki
HS Najafabadi
J Mrazek
J Rozenski
JA Ranea
JC Marioni
JD Selengut
Jelena Repar
JG Lawrence
JH McDonald
JL Bennetzen
JL Parmley
JO McInerney
JR Lobry
JT Herbeck
K Chen
K Mizuguchi
KA Dittmar
KB Zeldovich
Kristian Vlahoviček
L Breiman
L Dethlefsen
LC Seaver
M Ashburner
M dos Reis
M Neuhauser
M Oresic
MG Langille
MK Kruger
N Molina
N Stoletzki
Nancy A. Moran
Nives Škunca
P Lu
PF Agris
PM Sharp
PP Chan
R Hershberg
RD Knight
RJ Grocock
RL Tatusov
RM Weiner
S D'Amico
S Kanaya
S Karlin
SL Chen
T Banerjee
T Fawcett
Tomislav Šmuc
V Daubin
X Xia
Y Ishihama
Publication venue: Public Library of Science
Publication date: 01/01/2010

Codon usage bias in prokaryotic genomes is largely a consequence of background substitution patterns in DNA, but highly expressed genes may show a preference towards codons that enable more efficient and/or accurate translation. We introduce a novel approach based on supervised machine learning that detects effects of translational selection on genes, while controlling for local variation in nucleotide substitution patterns represented as sequence composition of intergenic DNA. A cornerstone of our method is a Random Forest classifier that outperformed previous distance measure-based approaches, such as the codon adaptation index, in the task of discerning the (highly expressed) ribosomal protein genes by their codon frequencies. Unlike previous reports, we show evidence that translational selection in prokaryotes is practically universal: in 460 of 461 examined microbial genomes, we find that a subset of genes shows a higher codon usage similarity to the ribosomal proteins than would be expected from the local sequence composition. These genes constitute a substantial part of the genome (between 5% and 33%, depending on genome size) while also exhibiting higher experimentally measured mRNA abundances and tending toward codons that match tRNA anticodons by canonical base pairing. Certain gene functional categories are generally enriched with, or depleted of, codon-optimized genes, the trends of enrichment/depletion being conserved between Archaea and Bacteria. Prominent exceptions from these trends might indicate genes with alternative physiological roles; we speculate on specific examples related to detoxication of oxygen radicals and ammonia and to possible misannotations of asparaginyl–tRNA synthetases.
Since the presence of codon optimizations on genes is a valid proxy for expression levels in fully sequenced genomes, we provide an example of an “adaptome” by highlighting gene functions with expression levels elevated specifically in thermophilic Bacteria and Archaea.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

100,000 Genomes Pilot on Rare-Disease Diagnosis in Health Care — Preliminary Report

Author: 100 000 Genomes Project Pilot Investigators
Abulhoul L
Arno G
Ashford S
Aurora P
Babcock M
Bale M
Banka S
Baple E
Barnes MR
Beales P
Bentley D
Bitner-Glindzicz M
Black G
Bleda M
Bockenhauer D
Boustred C
Bradley JR
Brennan P
Brittain H
Broomfield A
Browning AC
Buchanan J
Bueser T
Burn J
Burns G
Burrows N
Cacheiro P
Campbell C
Camps C
Caulfield M
Chan G
Chinnery PF
Chisholm J
Chitty L
Cipriani V
Clayton-Smith J
Cleary MA
Compeyrot-Lacassagne S
Compton C
Crichton C
Dattani M
Daugherty LC
Davies J
Davison V
de Burca A
Deshpande C
Devereau A
Dewhurst E
Douglas A
Douzgou S
Ellard S
Ellingford JM
Elliott P
Fassihi H
Flinter F
Floto RA
Footitt E
Foulger RE
Fowler T
Fuller G
Ganesan V
Gibson K
Gorman GS
Grocock RJ
Grunewald S
Gräf S
Gurnell M
Habib B
Haendel M
Halai D
Hall G
Halsall S
Haque E
Haraldsdottir E
Haworth A
Hill S
Horvath R
Houlden H
Humphray SJ
Hunter S
Hyder Z
Ibanez K
Irving M
Izatt L
Jacobsen JO
James R
Josifova D
Kasanicki M
Kasperaviciute D
Koelling N
Lam T
Leigh S
Leong IUS
Lester T
Li E
Malka S
Mapeta R
Martin A
Matchan A
McDonagh EM
McFarland R
McMullan D
Mehta SG
Michaelides M
Mohammed S
Moore AT
Morris HR
Moutsianas L
Mumford AD
Muntoni F
Naresh K
Need A
Newland K
Newman W
Niblock O
Németh AH
O'Connor E
O'Keefe RT
Ouwehand WH
Palles C
Patch C
Patel S
Penkett C
Pilkington C
Polke J
Polychronopoulos D
Pontikos N
Poole K
Quinlivan R
Quinton R
Rahman S
Ratnaike T
Raymond FL
Reese MG
Rehmström K
Rendon A
Robert L
Robinson PN
Rose S
Roy NBA
Ruddy D
Ryten M
Sarkany R
Savage H
Say G
Sayer JA
Schaefer AM
Scott RH
Seaby EG
Sen A
Shaw AC
Simpson MA
Smedley D
Smith KR
Snow C
Spasic-Boskovic O
Stirrups K
Straub V
Sultana R
Tavares ALT
Taylor J
Taylor JC
Taylor RW
Temple IK
Thapar N
Thomas EA
Thomas HB
Tregidgo C
Tucci A
Turnbull C
Turnbull DM
Twiss P
Vandrovcova J
Vestito L
Wagner A
Wallis C
Webster AR
Wedderburn LR
Wei W
Welch J
Welland M
Wielscher M
Wilkie AOM
Williams E
Williams HJ
Wilson G
Wolejko A
Wood NW
Wood S
Woods K
Wordsworth S
Worth A
Wright CF
Yip J
Yu-Wai-Man P
Publication date: 11/11/2021

BACKGROUND: The U.K. 100,000 Genomes Project is in the process of investigating the role of genome sequencing in patients with undiagnosed rare diseases after usual care and the alignment of this research with health care implementation in the U.K. National Health Service. Other parts of this project focus on patients with cancer and infection. METHODS: We conducted a pilot study involving 4660 participants from 2183 families, among whom 161 disorders covering a broad spectrum of rare diseases were present. We collected data on clinical features with the use of Human Phenotype Ontology terms, undertook genome sequencing, applied automated variant prioritization on the basis of applied virtual gene panels and phenotypes, and identified novel pathogenic variants through research analysis. RESULTS: Diagnostic yields varied among family structures and were highest in family trios (both parents and a proband) and families with larger pedigrees. Diagnostic yields were much higher for disorders likely to have a monogenic cause (35%) than for disorders likely to have a complex cause (11%). Diagnostic yields for intellectual disability, hearing disorders, and vision disorders ranged from 40 to 55%. We made genetic diagnoses in 25% of the probands. A total of 14% of the diagnoses were made by means of the combination of research and automated approaches, which was critical for cases in which we found etiologic noncoding, structural, and mitochondrial genome variants and coding variants poorly covered by exome sequencing. Cohortwide burden testing across 57,000 genomes enabled the discovery of three new disease genes and 19 new associations. Of the genetic diagnoses that we made, 25% had immediate ramifications for clinical decision making for the patients or their relatives. CONCLUSIONS: Our pilot study of genome sequencing in a national health care system showed an increase in diagnostic yield across a range of rare diseases. 
(Funded by the National Institute for Health Research and others.)

UCL Discovery

Genetic determinants of risk in pulmonary arterial hypertension: international genome-wide association studies and meta-analysis

Author: Abrea I
Adlard J
Ahmad F
Ahmad S
Ahmed M
Ahmed S
Aitman T
Alachkar H
Allahua M
Allen HL
Allsup D
Almeida-King J
Almeida-Peters S
Aman J
Amouyel P
Ancliff P
Anderson A
Andrews M
Antrobus R
Archer SL
Argula R
Arkon A
Armstrong R
Arno G
Arora A
Ashford S
Astle W
Aston V
Attwood A
Austin ED
Ayesh W
Babbs C
Badesch D
Bakchoul T
Bakshi S
Bariana T
Barnett C
Barve M
Barwell J
Batai K
Beckmann J
Bennett D
Bentley D
Benza R
Bhatt N
Bierzynska A
Bindu J
Birru H
Biss T
Blake R
Bleda M
Boekwig D
Bogaard H
Bogaard HJ
Bourne C
Boyce S
Bradley J
Brady D
Breen G
Brennan P
Brewer C
Broach D
Brod NC
Brody L
Brown M
Browning M
Bruno J
Buchan R
Buckland M
Bueser T
Burger CD
Burns S
Burren O
Calleja P
Carr-White G
Carss K
Casanova N
Casey R
Caskey E
Caulfield M
Cebola I
Chakinala M
Chambers J
Chambers J
Cheng F
Chinnery PF
Christian M
Church C
Coghlan G
Coghlan JG
Colby E
Cole T
Collins J
Collins P
Colombo C
Condliffe R
Cook S
Cook T
Cooper N
Cope M
Cordell S
Correa P
Corris P
Corris PA
Crisp-Hihn A
Curry N
Danesino C
Daniels M
Daugherty L
Davis A
Davis J
Debette S
Deevi SVV
del Junco H
DeMartino J
Dent T
Desai AA
Devereux J
Dewhurst E
Dillon L
Dixon P
Do R
Dotson A
Downes K
Drake J
Drazyk A
Drewe E
Durst L
Dutt T
Edgar D
Edwards K
Egner W
Elliott CG
Elwing J
Erber W
Erwood M
Estiu MC
Evans DG
Evans G
Everington T
Eyries M
Eyries M
Farber H
Favier R
Feliz N
Ferrer J
Fletcher D
Fortin T
Fox J
Franke A
Frantz RP
Franzo M
Frary A
French C
Freson K
Frontini M
Frost A
Gale D
Gall H
Garcia JGN
Geoghegan C
Gerighty T
Germain M
Ghio S
Ghofrani H-A
Gibbs JSR
Gibbs S
Gilmour K
Girerd B
Goddard S
Gomez K
Gordins P
Gosal D
Graf S
Grassi L
Greene D
Greenhalgh L
Greinacher A
Gresele P
Griffiths P
Grigoriadou S
Grocock R
Grozeva D
Gruhlke P
Gygi A
Hackett S
Hadinnapola C
Hague W
Haimel M
Hall M
Hannon K
Hanscombe KB
Hanson H
Harbaum L
Harkness K
Harley J
Harper A
Harrington B
Harris C
Hart D
Hassan A
Hawke L
Hawkes N
Hayman G
He H
Henderson A
Heuerman S
Hill NS
Hirsch R
Hoffmann J
Holy R
Horvath R
Houweling A
Houweling AC
Howard L
Howard LS
Hu F
Hudson G
Hughes J
Huissoon A
Humbert M
Humphray S
Hunter S
Hurles M
Iem T
Igenoza T
Ingledue R
Ivy D
Izatt L
Jackson E
James R
Johnson S
Jolles S
Jolley J
Jurkute N
Kaakinen M
Karnekar R
Karnes JH
Kasanicki M
Kazkaz H
Kazmi R
Kelleher P
Kennedy K
Kiely D
Kiely DG
Kingston N
Kittles R
Klima R
Klinger J
Knight J
Kostadima M
Kovacs G
Koziell A
Kreuzhuber R
Kuijpers T
Kumar A
Kumararatne D
Kurian M
Laffan M
Lahm T
Lalloo F
Lambert M
Larimore D
Laudes M
Lawrie A
Layton M
Lee J
Lemma M
Lentaigne C
Levine A
Lewis K
Light A
Linger R
Longhurst H
Louka E
Lovato E
Lutz K
MacDonald C
Machado RD
Madan B
Maher E
Maimaris J
Malur A
Mangles S
Mapeta R
Marchbank K
Marks J
Marks S
Markus HS
Marschall H-U
Marshall A
Marsolo K
Martin J
Martin LJ
Mathias M
Matthews E
Maxwell H
McAlinden P
McCarthy M
McClain K
Mcgaha T
Meacham S
Mead A
Megy K
Mehta S
Mendibles L
Michaelides M
Millar C
Miller-Reed K
Moledina S
Montani D
Moore T
Morrell N
Morrell NW
Mozere M
Muir K
Mumford A
Nathan SD
Newnham M
Nichols WC
Noordegraaf AV
Norwood T
O'Sullivan J
Obaji S
Okoli S
Olanipekun L
Olschewski A
Olschewski H
Ong KR
Ormiston M
Ormondroyd E
Oudiz RJ
Ouwehand W
Ouwehand WH
Paciotti G
Palmer J
Palmisciano A
Papadia S
Park S-M
Parry D
Paterson J
Pauciulo MW
Peacock A
Peacock AJ
Peden J
Peerlinck K
Peichel G
Penkett C
Pepke-Zaba J
Petersen R
Pisarcik J
Prokopenko I
Pyle A
Rankin S
Rao A
Raymond FL
Rayner-Matthews P
Rees C
Rehman Z
Rendon A
Renton T
Reponen A
Rhodes CJ
Rice A
Richardson S
Richter A
Roads T
Robbins I
Roberts I
Roden DM
Rohwer K
Rosenzweig EB
Ross RM
Ross RVM
Roughley C
Rowley C
Roy N
Sadeghi-Alavijeh O
Saleem M
Samani N
Sanchis-Juan A
Santiago J
Sargur R
Satchell S
Savic S
Saydain G
Scelsi L
Schiltz K
Schilz R
Schulman S
Scovel P
Scully M
Searle C
Seeger W
Sewell C
Seyres D
Shaffer CM
Shapiro S
Sharmardina O
Shtoyerman R
Sibson K
Side L
Simeoni I
Simms RW
Simon M
Simpson M
Singleton D
Sitbon O
Sivapalaratnam S
Skytte A-B
Smith K
Smith KGC
Snape K
Soubrier F
Southgate L
Spears J
Staines S
Staples E
Stark H
Stephens J
Stirrups K
Stock S
Stratoberdha M
Stratton E
Suntharalingam J
Swietlik E
Swietlik EM
Szuberla S
Tait RC
Talks K
Tan R
Tang H
Tavlarides A
Tchourbanov AY
Thaventhiran J
Themistocleous A
Thenappan T
Theuer A
Thomas M
Thomson K
Thrasher A
Thys C
Tischkowitz M
Titterton C
Toh C-H
Tomer D
Torres F
Toshner M
Toshner MR
Traylor M
Treacy C
Treacy CM
Tregouet D-A
Trembath R
Trembath RC
Tuna S
Turek W
Turk E
Turro E
Ulrich A
Urban T
Vale T
Van Geet C
Van Zuydam N
Vang B
Vazquez-Lopez M
Visnaw K
Von Ziegenweidt J
Waisfisz Q
Waisfisz Q
Walker S
Walsworth AK
Walter RE
Warden A
Ware J
Watkins H
Watt C
Webster A
Wei W
Welch S
Wessels J
Westbury S
Westwood J-P
Wharton J
White RJ
Whitehorn D
Whitman R
Whitworth J
Wilkins MR
Williams A
Williamson C
Wilson O
Wilt J
Winslow C
Wong E
Wood N
Wood Y
Woods G
Woodward E
Wort S
Wort SJ
Worth A
Yates K
Yong P
Young T
Yu P
Yu-Wai-Man P
Yung D
Ziemak C
Publication venue: ELSEVIER SCI LTD
Publication date: 01/03/2019

Background: Rare genetic variants cause pulmonary arterial hypertension, but the contribution of common genetic variation to disease risk and natural history is poorly characterised. We tested for genome-wide association for pulmonary arterial hypertension in large international cohorts and assessed the contribution of associated regions to outcomes. Methods: We did two separate genome-wide association studies (GWAS) and a meta-analysis of pulmonary arterial hypertension. These GWAS used data from four international case-control studies across 11744 individuals with European ancestry (including 2085 patients). One GWAS used genotypes from 5895 whole-genome sequences and the other GWAS used genotyping array data from an additional 5849 individuals. Cross-validation of loci reaching genome-wide significance was sought by meta-analysis. Conditional analysis corrected for the most significant variants at each locus was used to resolve signals for multiple associations. We functionally annotated associated variants and tested associations with duration of survival. All-cause mortality was the primary endpoint in survival analyses. Findings: A locus near SOX17 (rs10103692, odds ratio 1·80 [95% CI 1·55–2·08], p=5·13×10⁻¹⁵) and a second locus in HLA-DPA1 and HLA-DPB1 (collectively referred to as HLA-DPA1/DPB1 here; rs2856830, 1·56 [1·42–1·71], p=7·65×10⁻²⁰) within the class II MHC region were associated with pulmonary arterial hypertension. The SOX17 locus had two independent signals associated with pulmonary arterial hypertension (rs13266183, 1·36 [1·25–1·48], p=1·69×10⁻¹²; and rs10103692). Functional and epigenomic data indicate that the risk variants near SOX17 alter gene regulation via an enhancer active in endothelial cells. Pulmonary arterial hypertension risk variants determined haplotype-specific enhancer activity, and CRISPR-mediated inhibition of the enhancer reduced SOX17 expression. The HLA-DPA1/DPB1 rs2856830 genotype was strongly associated with survival.
Median survival from diagnosis in patients with pulmonary arterial hypertension with the C/C homozygous genotype was double (13·50 years [95% CI 12·07 to >13·50]) that of those with the T/T genotype (6·97 years [6·02–8·05]), despite similar baseline disease severity. Interpretation: This is the first study to report that common genetic variation at loci in an enhancer near SOX17 and in HLA-DPA1/DPB1 is associated with pulmonary arterial hypertension. Impairment of SOX17 function might be more common in pulmonary arterial hypertension than suggested by rare mutations in SOX17. Further studies are needed to confirm the association between HLA typing or rs2856830 genotyping and survival, and to determine whether HLA typing or rs2856830 genotyping improves risk stratification in clinical practice or trials. Funding: UK NIHR, BHF, UK MRC, Dinosaur Trust, NIH/NHLBI, ERS, EMBO, Wellcome Trust, EU, AHA, ACClinPharm, Netherlands CVRI, Dutch Heart Foundation, Dutch Federation of UMC, Netherlands OHRD and RNAS, German DFG, German BMBF, APH Paris, INSERM, Université Paris-Sud, and French ANR.
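
To show how the odds ratio, confidence interval, and p value reported for the SOX17 locus relate to one another, here is a small worked sketch. It assumes a Wald-type interval that is symmetric on the log-odds scale, which the abstract does not confirm; the function name `wald_from_ci` is hypothetical. Because the inputs are rounded as printed, the result can only approximate the reported p value.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-OR standard error, z statistic, and two-sided p value
    from an odds ratio and its 95% CI (assumed symmetric on the log scale)."""
    log_or = math.log(odds_ratio)
    # The interval half-width on the log scale equals z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability via the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values for rs10103692 as printed in the abstract (decimal points normalised).
se, z, p = wald_from_ci(1.80, 1.55, 2.08)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
```

Under these assumptions the recovered p value lands on the same order of magnitude as the reported 5·13×10⁻¹⁵, which is a quick consistency check on the three published numbers rather than a reproduction of the study's analysis.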

UCL Discovery