48 research outputs found
Efficient Exact Maximum a Posteriori Computation for Bayesian SNP Genotyping in Polyploids
The problem of genotyping polyploids is extremely important for the creation of genetic maps and the assembly of complex plant genomes. Despite its significance, polyploid genotyping remains largely unsolved and lacks statistical formality. This paper introduces a graphical Bayesian model for SNP genotyping data that can infer genotypes even when the ploidy of the population is unknown. We also introduce an algorithm for finding the exact maximum a posteriori genotype configuration under this model. The algorithm is implemented in SuperMASSA, a freely available web-based software package. We demonstrate the utility, efficiency, and flexibility of the model and algorithm by applying them to polyploid data sets from two different platforms: Illumina GoldenGate data from potato and Sequenom MassARRAY data from sugarcane. Our method achieves state-of-the-art performance on both data sets and can be trivially adapted to use models that incorporate prior information about any platform or species.
Comparison of the rapid chain delineation and seriation algorithms for the construction of genetic maps
The objective of this work was to evaluate the efficiency of the seriation and rapid chain delineation algorithms for the construction of genetic linkage maps, as well as three criteria for evaluating marker orders when used with the "ripple" error-checking algorithm: minimum product of adjacent recombination fractions, minimum sum of adjacent recombination fractions, and maximum sum of adjacent LOD scores. A genetic linkage map was simulated containing 24 markers placed at random distances, with an average spacing of 10 cM. Using the Monte Carlo method, 1,000 backcross populations and 1,000 F2 populations were simulated, each with 200 individuals and with different combinations of dominant and codominant markers (100% codominant, 100% dominant, and a mixture of 50% codominant and 50% dominant). Missing data rates of 25, 50 and 75% were also simulated. Both algorithms performed similarly and were sensitive to missing data and to the presence of dominant markers; the latter made it difficult to obtain accurate estimates of both order and distance. Moreover, the ripple algorithm generally increased the number of correct orders when applied with the criteria sum of adjacent recombination fractions and product of adjacent recombination fractions.
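The three order-evaluation criteria and the ripple step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window size of 3, and the use of the Haldane mapping function to build a toy recombination-fraction matrix are all assumptions made for the example.

```python
import itertools
import math


def haldane_rf(distance_cm):
    """Recombination fraction from map distance (Haldane function; an assumption here)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def sarf(order, rf):
    """Sum of adjacent recombination fractions (smaller is better)."""
    return sum(rf[a][b] for a, b in zip(order, order[1:]))


def parf(order, rf):
    """Product of adjacent recombination fractions (smaller is better)."""
    return math.prod(rf[a][b] for a, b in zip(order, order[1:]))


def salod(order, lod):
    """Sum of adjacent LOD scores (larger is better)."""
    return sum(lod[a][b] for a, b in zip(order, order[1:]))


def ripple(order, rf, window=3, criterion=sarf):
    """Slide a window along the order, try every permutation of the markers
    inside it, and keep any candidate that strictly lowers the criterion."""
    best = list(order)
    for start in range(len(best) - window + 1):
        for perm in itertools.permutations(best[start:start + window]):
            candidate = best[:start] + list(perm) + best[start + window:]
            if criterion(candidate, rf) < criterion(best, rf):
                best = candidate
    return best


# Toy example: four markers at hypothetical positions 0, 10, 20 and 30 cM.
positions = [0.0, 10.0, 20.0, 30.0]
rf_matrix = [[haldane_rf(abs(a - b)) for b in positions] for a in positions]

wrong_order = [0, 2, 1, 3]
fixed_order = ripple(wrong_order, rf_matrix)
```

Here `ripple` minimises the chosen criterion, so it suits SARF and PARF directly; to use SALOD, which is maximised, one option is to pass a criterion that returns the negated sum.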
A novel linkage map of sugarcane with evidence for clustering of retrotransposon-based markers
The development of sugarcane as a sustainable crop has unlimited applications. The crop is one of the most economically viable for renewable energy production and CO2 balance. Linkage maps are valuable tools for understanding genetic and genomic organization, particularly in sugarcane, owing to its complex polyploid genome of multispecific origin. The overall objective of our study was to construct a novel sugarcane linkage map, compiling AFLP and EST-SSR markers, and to generate data on the distribution of markers anchored to sequences of scIvana_1, a complete sugarcane transposable element and member of the Copia superfamily. The mapping population parents ('IAC66-6' and 'TUC71-7') contributed equally to polymorphisms, independent of marker type, and generated markers that were distributed into nearly the same number of co-segregation groups (CGs). Bi-parentally inherited alleles allowed the integration of 19 CGs. The number of markers per CG ranged from two to 39. The total map length was 4,843.19 cM, with a marker density of 8.87 cM. Markers were assembled into 92 CGs that ranged in length from 1.14 to 404.72 cM, with an estimated average length of 52.64 cM. The greatest distance between two adjacent markers was 48.25 cM. The 56 scIvana_1-based markers were positioned on 21 CGs, but were not regularly distributed. Interestingly, distances of less than 5 cM between adjacent scIvana_1-based markers were observed on five CGs, suggesting a clustered organization. The results indicated that an NBS-profiling technique was efficient for developing retrotransposon-based markers in sugarcane. The strategy based on simultaneous maximum-likelihood estimation of linkage and linkage phase proved suitable for estimating linkage and constructing the linkage map.
Interestingly, using our genetic data it was possible to estimate the number of copies of the retrotransposon scIvana_1 (~60) in the sugarcane genome, confirming previously reported molecular results. In addition, this research may have indirect implications for crop economics, e.g., productivity enhancement via QTL studies, as the mapping population parents differ in their response to an important fungal disease. Funding: Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), grant 2010/51708-
Functional markers for gene mapping and genetic diversity studies in sugarcane
Background: The database of sugarcane expressed sequence tags (ESTs) offers a great opportunity for developing molecular markers that are directly associated with important agronomic traits. The development of new EST-SSR markers represents an important tool for genetic analysis. In sugarcane breeding programs, functional markers can be used to accelerate the breeding process and to select important agronomic traits, especially in the mapping of quantitative trait loci (QTL) and of pathogen resistance loci or qualitative resistance loci (QRL). The aim of this work was to develop new simple sequence repeat (SSR) markers in sugarcane using the sugarcane expressed sequence tag (SUCEST) database. Findings: A total of 365 EST-SSR molecular markers with trinucleotide motifs were developed and evaluated in a collection of 18 sugarcane genotypes (15 varieties and 3 species). In total, 287 of the EST-SSR markers amplified fragments of the expected size and were polymorphic in the analyzed sugarcane varieties. The number of alleles ranged from 2 to 18, with an average of 6 alleles per locus, while polymorphism information content values ranged from 0.21 to 0.92, with an average of 0.69. The discrimination power was high for the majority of the EST-SSRs, with an average value of 0.80. Among the markers characterized in this study, some are of particular interest: those related to bacterial defense responses, to the generation of precursor metabolites and energy, and to carbohydrate metabolic processes. Conclusions: The EST-SSR markers presented in this work can be used efficiently for genetic mapping studies of segregating sugarcane populations. Their high Polymorphism Information Content (PIC) and Discriminant Power (DP) facilitate QTL identification and marker-assisted selection, and their association with functional regions of the genome makes them an important tool for sugarcane breeding programs.
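For readers unfamiliar with the PIC statistic reported above, the following is a minimal sketch of the standard formula of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i are allele frequencies at one locus. The abstract does not state which formula or scoring scheme the authors used, so treating this as their calculation is an assumption; the function name `pic` and the example frequencies are hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one locus (Botstein et al. 1980).

    allele_freqs: allele frequencies that sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared allele frequencies.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross_term = sum(2.0 * p * p * q * q for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical loci: two and four equally frequent alleles.
pic_two_alleles = pic([0.5, 0.5])
pic_four_alleles = pic([0.25, 0.25, 0.25, 0.25])
```

PIC rises with the number of alleles and with the evenness of their frequencies, which is consistent with the multi-allelic loci described above reaching high values.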
Development of a model to build genetic maps in autopolyploids, with applications in sugarcane
Autopolyploid species are extremely important in agriculture. However, the complex structure of their genomes is not well understood. Despite all the advances in genetic mapping of autotetraploids, the vast majority of the models used to build maps in autopolyploid species with high ploidy levels, such as sugarcane, are approximations of those used for diploid organisms. Thus, the aim of this work was to develop a new model for building genetic linkage maps in autopolyploid species with any ploidy level, including markers with all possible dosages. To this end, hidden Markov models were used. The model presented here can be applied to dominant and codominant marker data, with biallelic or multiallelic behavior. The method is based on calculating the conditional probabilities that make up the transition matrix, followed by a reduction of its dimension using a computational approach. The method was illustrated with a sugarcane mapping population derived from a cross between two pre-commercial varieties (IACSP 95-3018 × IACSP 93-3046) and genotyped with three types of markers: SNPs, microsatellites and AFLPs. The results indicate that the new method is very efficient in obtaining genetic maps, even for high ploidy levels and for markers with high dosages, particularly when these markers have codominant behavior. It was also possible to estimate the likelihood, the recombination fractions and the linkage phases using the multipoint approach, which takes into account all markers of the analyzed linkage group simultaneously. The new model proposed in this work represents a major step towards locating genomic regions associated with variation in quantitative traits, understanding the genetic architecture of such traits, and assembling the genomes of autopolyploid species.
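To illustrate why reducing the dimension of the transition matrix matters at high ploidy, the sketch below counts hidden states under a simplified assumption that is not taken from the thesis: each parent of ploidy m transmits m/2 of its m homologous chromosomes, so one locus has C(m, m/2) possible transmitted sets per parent and the square of that for a cross of two such parents. The function names are hypothetical.

```python
import math


def states_per_parent(ploidy):
    """Number of ways a parent can transmit ploidy/2 of its homologs."""
    if ploidy < 2 or ploidy % 2 != 0:
        raise ValueError("ploidy must be an even integer >= 2")
    return math.comb(ploidy, ploidy // 2)


def joint_states(ploidy):
    """Hidden states per locus for a cross between two parents of equal ploidy."""
    return states_per_parent(ploidy) ** 2


# State counts and the implied transition-matrix size for a few ploidy levels.
for m in (2, 4, 6, 8):
    n = joint_states(m)
    print(f"ploidy {m}: {n} states, transition matrix {n} x {n}")
```

Under this assumption the state count grows combinatorially with ploidy, so a full transition matrix quickly becomes large; this is one plausible motivation for the dimension reduction step mentioned in the abstract.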