162 research outputs found

STAR: predicting recombination sites from amino acid sequence

Author: A Crameri
C Berezin
CA Voigt
CR Otey
DA Drummond
Denis C Bauer
DT Jones
E Capriotti
Elizabeth M Gillam
G Pollastri
J Cheng
JB Endelman
K Hiraga
M Boden
M Ostermeier
MC Saraf
Mikael Bodén
MM Meyer
P Baldi
Ricarda Thier
S Hua
S Lutz
S Sundararajan
SF Altschul
U Hobohm
Z Yuan
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Designing novel proteins with site-directed recombination has enormous prospects. By locating effective recombination sites for swapping sequence parts, the probability that hybrid sequences have the desired properties is increased dramatically. The prohibitive requirements for applying current tools led us to investigate machine learning to assist in finding useful recombination sites from amino acid sequence alone. RESULTS: We present STAR, Site Targeted Amino acid Recombination predictor, which produces a score indicating the structural disruption caused by recombination for each position in an amino acid sequence. Example predictions, contrasted with those of alternative tools, illustrate STAR's utility in determining useful recombination sites. Overall, the correlation coefficient between the output of the experimentally validated protein design algorithm SCHEMA and the prediction of STAR is very high (0.89). CONCLUSION: STAR allows the user to explore useful recombination sites in amino acid sequences with unknown structure and unknown evolutionary origin. The predictor service is available from

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Queensland University of Technology ePrints Archive

University of Queensland eSpace

Genomic selection in commercial perennial crops: applicability and improvement in oil palm (Elaeis guineensis Jacq.)

Author: BJ Hayes
CC Li
CK Teh
CK Wong
D Cros
D Habier
FM Bassi
G Los Campos de
G Moser
H Muranty
J Crossa
J Pew
J Spindel
J Yang
JB Endelman
JE Spindel
MF Resende Jr.
MP Calus
N Zaitlen
P Perez
R Singh
S Purcell
SA Clark
T Park
TH Meuwissen
V Rao
WG Hill
ZA Desta
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Genomic selection (GS) uses genome-wide markers to select individuals with the desired overall combination of breeding traits. A total of 1,218 individuals from a commercial population of Ulu Remis x AVROS (UR x AVROS) were genotyped using the OP200K array. The traits of interest included: shell-to-fruit ratio (S/F, %), mesocarp-to-fruit ratio (M/F, %), kernel-to-fruit ratio (K/F, %), fruit per bunch (F/B, %), oil per bunch (O/B, %) and oil per palm (O/P, kg/palm/year). Genomic heritabilities of these traits were estimated to be in the range of 0.40 to 0.80. GS methods assessed were RR-BLUP, Bayes A (BA), Cπ (BC), Lasso (BL) and Ridge Regression (BRR). All methods resulted in almost equal prediction accuracy. The accuracy achieved ranged from 0.40 to 0.70, correlating with the heritability of traits. By selecting the most important markers, RR-BLUP B has the potential to outperform other methods. The marker density for certain traits can be further reduced based on the linkage disequilibrium (LD). Together with in silico breeding, GS is now being used in oil palm breeding programs to hasten parental palm selection.
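
As a rough illustration of the kind of model named in the abstract above, the sketch below fits marker effects by ridge regression (the penalised regression that RR-BLUP corresponds to when the penalty is the ratio of residual to marker-effect variance) and reports prediction accuracy as the Pearson correlation between predicted and observed phenotypes. Everything here is an assumption for illustration: the function names, the synthetic genotypes and phenotypes, and the fixed penalty `lam` are hypothetical and are not taken from the study, which estimated its models on the OP200K data.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Solve (X'X + lam*I) b = X'y for the marker effects b."""
    n_markers = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_markers), X.T @ y)

def prediction_accuracy(X_train, y_train, X_test, y_test, lam=1.0):
    """Pearson correlation between predicted and observed test phenotypes."""
    mu = y_train.mean()                  # intercept estimated from training data
    col_means = X_train.mean(axis=0)     # centre markers on training means
    b = ridge_marker_effects(X_train - col_means, y_train - mu, lam)
    y_hat = mu + (X_test - col_means) @ b
    return float(np.corrcoef(y_hat, y_test)[0, 1])

# Synthetic example (hypothetical data, not the oil palm population).
rng = np.random.default_rng(0)
n_individuals, n_markers = 300, 200
X = rng.integers(0, 3, size=(n_individuals, n_markers)).astype(float)  # 0/1/2 allele counts
true_effects = rng.normal(0.0, 0.1, size=n_markers)
y = X @ true_effects + rng.normal(0.0, 1.0, size=n_individuals)

# Train on the first 200 individuals, evaluate on the remaining 100.
acc = prediction_accuracy(X[:200], y[:200], X[200:], y[200:], lam=50.0)
print(f"prediction accuracy (r): {acc:.2f}")
```

In practice the penalty would be estimated from variance components or chosen by cross-validation rather than fixed by hand, and accuracy would be averaged over repeated training/validation splits.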

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

UM Digital Repository

ScholarBank@NUS

Computer vision and machine learning for robust phenotyping in genome-wide studies

Author: A Singh
AE Lipka
Alexander E. Lipka
BL Browning
BW Diers
CH Bock
Céline Rousseau
D Franzen
DV Charlson
DW Barker
GA Peiffer
J Morrissey
J Rodriguez-Celma
J Vollmann
J Zhang
JA Berni
JA O’Rourke
JB Endelman
MF Oliveira
N Terry
NB Schmid
NC Hansen
Qijian Song
R Bernardo
S Lin
S Mamidi
SF Lin
THE Meuwissen
WR Fehr
XH Huang
Zhiwu Zhang
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2017

Traditional evaluation of crop biotic and abiotic stresses is time-consuming and labor-intensive, limiting the ability to dissect the genetic basis of quantitative traits. A machine learning (ML)-enabled image-phenotyping pipeline for the genetic studies of abiotic stress iron deficiency chlorosis (IDC) of soybean is reported. IDC classification and severity for an association panel of 461 diverse plant-introduction accessions was evaluated using an end-to-end phenotyping workflow. The workflow consisted of a multi-stage procedure including: (1) optimized protocols for consistent image capture across plant canopies, (2) canopy identification and registration from cluttered backgrounds, (3) extraction of domain expert informed features from the processed images to accurately represent IDC expression, and (4) supervised ML-based classifiers that linked the automatically extracted features with expert-rating equivalent IDC scores. ML-generated phenotypic data were subsequently utilized for the genome-wide association study and genomic prediction. The results illustrate the reliability and advantage of the ML-enabled image-phenotyping pipeline by identifying a previously reported locus and a novel locus harboring a gene homolog involved in iron acquisition. This study demonstrates a promising path for integrating the phenotyping pipeline into genomic prediction, and provides a systematic framework enabling robust and quicker phenotyping through ground-based systems.

Digital Repository @ Iowa State University (ISU)

Crossref

PubMed Central

Optimization of genomic selection training populations with a genetic algorithm

Author: AC Atkinson
AE Hoerl
CR Henderson
D Gianola
D Habier
DE Goldberg
Deniz Akdemir
F Phocas
F Pukelsheim
G de Los Campos
HP Piepho
I Misztal
J Crossa
JB Endelman
Jean-Luc Jannink
JH Holland
JM Hickey
JO Ogutu
Julio I Sanchez
K Zhao
L Pronzato
LD Davis
MC Romay
MF Resende
N Heslot
P VanRaden
R Rincent
S Atwell
V Wimmer
VB Melas
VS Windhausen
VV Fedorov
WM Muir
Publication venue: Springer Science and Business Media LLC

Crossref

Evaluation of methods and marker systems in genomic selection of oil palm (Elaeis guineensis Jacq.)

Author: A Legarra
A Liaw MW
A Noh
Ai Ling Ong
B Oboh
BJ Hayes
C-K Teh
CC Li
Chee Keng Teh
CK Teh
CK Wong
CO Okwuagwu
D Boichard
D Cros
D Habier
D Kainer
David Ross Appleton
Fook Tim Chew
G Blaak
G de Los Campos
Harikrishna Kulaveerasingam
J Crossa
J Pew
J Spindel
J Yang
JB Endelman
Jennifer Ann Harikrishna
JJ Hardon
LC Ooi
M Schuelke
Martti Tammi
ME Goddard
MW Blair
P Perez
QB Kwong
Qi Bin Kwong
R Singh
Sean Mayes
Suat Hui Yeoh
TH Meuwissen
V Rao
YJ Shu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

BACKGROUND: Genomic selection (GS) uses genome-wide markers as an attempt to accelerate genetic gain in breeding programs of both animals and plants. This approach is particularly useful for perennial crops such as oil palm, which have long breeding cycles, and for which the optimal method for GS is still under debate. In this study, we evaluated the effect of different marker systems and modeling methods for implementing GS in an introgressed dura family derived from a Deli dura x Nigerian dura (Deli x Nigerian) with 112 individuals. This family is an important breeding source for developing new mother palms for superior oil yield and bunch characters. The traits of interest selected for this study were fruit-to-bunch (F/B), shell-to-fruit (S/F), kernel-to-fruit (K/F), mesocarp-to-fruit (M/F), oil per palm (O/P) and oil-to-dry mesocarp (O/DM). The marker systems evaluated were simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). RR-BLUP, Bayesian A, B, Cπ, LASSO, Ridge Regression and two machine learning methods (SVM and Random Forest) were used to evaluate GS accuracy of the traits. RESULTS: The kinship coefficient between individuals in this family ranged from 0.35 to 0.62. S/F and O/DM had the highest genomic heritability, whereas F/B and O/P had the lowest. The accuracies using 135 SSRs were low, with accuracies of the traits around 0.20. The average accuracy of machine learning methods was 0.24, as compared to 0.20 achieved by other methods. The trait with the highest mean accuracy was F/B (0.28), while the lowest were both M/F and O/P (0.18). By using whole genomic SNPs, the accuracies for all traits, especially for O/DM (0.43), S/F (0.39) and M/F (0.30), were improved. The average accuracy of machine learning methods was 0.32, compared to 0.31 achieved by other methods.
CONCLUSION: Due to high genomic resolution, the use of whole-genome SNPs improved the efficiency of GS dramatically for oil palm and is recommended for dura breeding programs. Machine learning slightly outperformed other methods, but required parameter optimization for GS implementation.

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

Directory of Open Access Journals

ScholarBank@NUS

Finding the sources of missing heritability in a yeast cross

Author: A Keinan
C Loader
D Bates
DB Goldstein
DC Amberg
DL Aylor
DM Ruderfer
DS Falconer
E Birney
EE Eichler
ES Buckler
G Pau
H Lango Allen
H Li
HA Orr
Ian M. Ehrenreich
IM Ehrenreich
J Maller
J Yang
JA Tennessen
JB Endelman
JD Storey
JK Pritchard
Joshua S. Bloom
KW Broman
L Chen
L Kruglyak
Leonid Kruglyak
M Lynch
MD Abramoff
MR Nelson
MV Rockman
O Zuk
PM Visscher
RB Brem
RD Dowell
RW Doerge
S Atwell
SH Lee
TA Manolio
TF Mackay
TFC Mackay
Thúy-Lan Võ Lite
Wesley T. Loo
WG Hill
Publication venue: Springer Science and Business Media LLC
Publication date: 14/08/2012

For many traits, including susceptibility to common diseases in humans, causal loci uncovered by genetic mapping studies explain only a minority of the heritable contribution to trait variation. Multiple explanations for this "missing heritability" have been proposed. Here we use a large cross between two yeast strains to accurately estimate different sources of heritable variation for 46 quantitative traits and to detect underlying loci with high statistical power. We find that the detected loci explain nearly the entire additive contribution to heritable variation for the traits studied. We also show that the contribution to heritability of gene-gene interactions varies among traits, from near zero to 50%. Detected two-locus interactions explain only a minority of this contribution. These results substantially advance our understanding of the missing heritability problem and have important implications for future studies of complex and quantitative traits.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Cold Spring Harbor Laboratory Institutional Repository

eScholarship - University of California

Incorporating pleiotropic quantitative trait loci in dissection of complex traits: seed yield in rapeseed as an example

Author: Annaliese S. Mason
B Chalhoub
B Goffinet
B Peng
Bruce D. L. Fitt
C Groos
C Jestin
C Jiang
C Silva Lda
Chunyu Zhang
D Qiu
DL Yang
EJ Chesler
F Chardon
F Sun
G Chen
GA Churchill
H Dargahi
H Raman
HU Jan
I Bancroft
J Feng
J Lee
J Liu
J Shi
J Xiao
J Zou
JB Endelman
JB Holland
Jinling Meng
Jinxia Xiang
JM Lacape
Jun Zou
Lei Shi
M El-Soda
M Maccaferri
M Radoev
Meng Wang
MI Vales
MS Khatkar
N Li
N Ramchiary
NJ Larkan
P Moncada
Peifa Liu
R Bernardo
R Core Team
Rod J. Snowdon
S Wright
SF Altschul
T Shi
T Würschum
TF Mackay
W Zhao
WA Cowling
WE Clarke
WK Zhang
X Chen
X Fan
Xiang Liu
XJ Song
Y Long
Y Xu
Y Zhang
Y Zhao
Yan Long
YJ Huang
Yongju Huang
Z Fang
ZB Zeng
Ziliang Luo
ZK Li
Publication venue: Springer Science and Business Media LLC
Publication date: 28/04/2017

Most agronomic traits of interest for crop improvement (including seed yield) are highly complex quantitative traits controlled by numerous genetic loci, which brings challenges for comprehensively capturing associated markers/genes. We propose that multiple trait interactions underlie complex traits such as seed yield, and that considering these component traits and their interactions can dissect individual quantitative trait loci (QTL) effects more effectively and improve yield predictions. Using a segregating rapeseed (Brassica napus) population, we analyzed a large set of trait data generated in 19 independent experiments to investigate correlations between seed yield and other complex traits, and further identified QTL in this population with a SNP-based genetic bin map. A total of 1904 consensus QTL accounting for 22 traits, including 80 QTL directly affecting seed yield, were anchored to the B. napus reference sequence. Through trait association analysis and QTL meta-analysis, we identified a total of 525 indivisible QTL that either directly or indirectly contributed to seed yield, of which 295 QTL were detected across multiple environments. A majority (81.5%) of the 525 QTL were pleiotropic. By considering associations between traits, we identified 25 yield-related QTL previously ignored due to contrasting genetic effects, as well as 31 QTL with minor complementary effects. Implementation of the 525 QTL in genomic prediction models improved seed yield prediction accuracy.
Dissecting the genetic and phenotypic interrelationships underlying complex quantitative traits using this method will provide valuable insights for genomics-based crop improvement.

Crossref

University of Hertfordshire Research Archive