789 research outputs found

    Genome bioinformatics of tomato and potato

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool adopted worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been deciphered and are being exploited for fundamental research and applied to improve their breeding programs. The developments in sequencing technologies have also impacted the associated bioinformatics strategies and tools, both those that are required for data processing, management, and quality control, and those used for interpretation of the data. This thesis focuses on the application of genome sequencing, assembly and annotation to two members of the Solanaceae family, tomato and potato. Potato is economically the most important species within the Solanaceae, and its tubers contribute to dietary intake of starch, protein, antioxidants, and vitamins. Tomato fruits are the second most consumed vegetable after potato, and are a globally important dietary source of lycopene, beta-carotene, vitamin C, and fiber. The chapters in this thesis document the generation, exploitation and interpretation of genomic sequence resources for these two species and shed light on the contents, structure and evolution of their genomes. Chapter 1 introduces the concepts of genome sequencing, assembly and annotation, and explains the novel genome sequencing technologies that have been developed in the past decade. These so-called Next Generation Sequencing platforms display considerable variation in chemistry and workflow, and as a consequence the throughput and data quality differ by orders of magnitude between the platforms. The currently available sequencing platforms produce a vast variety of read lengths and facilitate the generation of paired sequences with an approximately fixed distance between them.
The choice of sequencing chemistry and platform combined with the type of sequencing template demands specifically adapted bioinformatics for data processing and interpretation. Irrespective of the sequencing and assembly strategy that is chosen, the resulting genome sequence, often represented by a collection of long linear strings of nucleotides, is of limited interest by itself. Interpretation of the genome can only be achieved through sequence annotation, that is, identification and classification of all functional elements in a genome sequence. Once these elements have been annotated, sequence alignments between multiple genomes of related accessions or species can be utilized to reveal the genetic variation on both the nucleotide and the structural level that underlies the differences between these species or accessions. Chapter 2 describes BlastIf, a novel software tool that exploits sequence similarity searches with BLAST to provide a straightforward annotation of long nucleotide sequences. Generally, two problems are associated with the alignment of a long nucleotide sequence to a database of short gene or protein sequences: (i) the large number of similar hits that can be generated due to database redundancy; and (ii) the relationships implied between aligned segments within a hit that in fact correspond to distinct elements on the sequence such as genes. BlastIf generates a comprehensible BLAST output for long nucleotide sequences by reducing the number of similar hits while revealing most of the variation present between hits. It is a valuable tool for molecular biologists who wish to get a quick overview of the genetic elements present in a newly sequenced segment of DNA, prior to more elaborate efforts of gene structure prediction and annotation. In Chapter 3 a first genome-wide comparison between the emerging genomic sequence resources of tomato and potato is presented.
Large collections of BAC end sequences from both species were annotated through repeat searches, transcript alignments and protein domain identification. In-depth comparisons of the annotated sequences revealed remarkable differences in both gene and repeat content between these closely related genomes. The tomato genome was found to be more repetitive than the potato genome, and substantial differences in the distribution of Gypsy and Copia retrotransposable elements as well as microsatellites were observed between the two genomes. A higher gene content was identified in the potato sequences, and in particular several large gene families including cytochrome P450 mono-oxygenases and serine-threonine protein kinases were significantly overrepresented in potato compared to tomato. Moreover, the cytochrome P450 gene family was found to be expanded in both tomato and potato when compared to Arabidopsis thaliana, suggesting an expanded network of secondary metabolic pathways in the Solanaceae. Together these findings present a first glimpse into the evolution of Solanaceous genomes, both within the family and relative to other plant species. Chapter 4 explores the physical and genetic organization of tomato chromosome 6 through integration of BAC sequence analysis, High Information Content Fingerprinting, genetic analysis, and BAC-FISH mapping data. A collection of BACs spanning substantial parts of the short and long arm euchromatin and several dispersed regions of the pericentromeric heterochromatin were sequenced and assembled into several tiling paths spanning approximately 11 Mb. Overall, the cytogenetic order of BACs was in agreement with the order of BACs anchored to the Tomato EXPEN 2000 genetic map, although a few striking discrepancies were observed. The integration of BAC-FISH, sequence and genetic mapping data furthermore provided a clear picture of the borders between eu- and heterochromatin on chromosome 6.
Annotation of the BAC sequences revealed that, although the majority of protein-coding genes were located in the euchromatin, the highly repetitive pericentromeric heterochromatin displayed an unexpectedly high gene content. Moreover, the short arm euchromatin was relatively rich in repeats, but the ratio of Gypsy and Copia retrotransposons across the different domains of the chromosome clearly distinguished euchromatin from heterochromatin. The ongoing whole-genome sequencing effort will reveal if these properties are unique to tomato chromosome 6, or a more general property of the tomato genome. Chapter 5 presents the potato genome, the first genome sequence of an Asterid. To overcome the problems associated with genome assembly due to the high level of heterozygosity that is observed in commercial tetraploid potato varieties, a homozygous doubled-monoploid potato clone was exploited to sequence and assemble 86% of the 844 Mb genome. This potato reference genome sequence was complemented with re-sequencing of a heterozygous diploid clone, revealing the form and extent of sequence polymorphism both between different genotypes and within a single heterozygous genotype. Gene presence/absence variants and other potentially deleterious mutations were found to occur frequently in potato and are a likely cause of inbreeding depression. Annotation of the genome was supported by deep transcriptome sequencing of both the doubled-monoploid and the heterozygous potato, resulting in the prediction of more than 39,000 protein-coding genes. Transcriptome analysis provided evidence for the contribution of gene family expansion, tissue-specific expression, and recruitment of genes to new pathways to the evolution of tuber development. The sequence of the potato genome has provided new insights into Eudicot genome evolution and has provided a solid basis for the elucidation of the evolution of tuberisation.
Many traits of interest to plant breeders are quantitative in nature and the potato sequence will simplify both their characterization and deployment to generate novel cultivars. The outstanding challenges in plant genome sequencing are addressed in Chapter 6. The high concentration of repetitive elements and the heterozygosity and polyploidy of many interesting crop plant species currently pose a barrier for the efficient reconstruction of their genome sequences. Nonetheless, the completion of a large number of new genome sequences in recent years and the ongoing advances in sequencing technology provide many exciting opportunities for plant breeding and genome research. Current sequencing platforms are being continuously updated and improved, and novel technologies are being developed and implemented in third-generation sequencing platforms that sequence individual molecules without need for amplification. While these technologies create exciting opportunities for new sequencing applications, they also require robust software tools to process the data produced through them efficiently. The ever-increasing number of available genome sequences creates the need for an intuitive platform for the automated and reproducible interrogation of these data in order to formulate new biologically relevant questions on datasets spanning hundreds or thousands of genome sequences.

    Survival prediction in head and neck cancer

    USING BALD EAGLES TO MONITOR HYDROELECTRIC PROJECTS LICENSE REQUIREMENTS ALONG THE AU SABLE, MANISTEE AND MUSKEGON RIVER, MICHIGAN

    Hydroelectric projects operated by Consumers Energy along the Au Sable, Manistee, and Muskegon Rivers underwent environmental studies in the late 1980s and early 1990s as part of Federal Energy Regulatory Commission relicensing. One of the questions posed during these studies was whether passage of Great Lakes fishes over barrier dams along these rivers would cause detrimental impacts to sensitive wildlife species. Relicensing also required that the operation of all hydroelectric projects on the Au Sable, Manistee, and Muskegon Rivers be maintained as run-of-river. Bald eagles (Haliaeetus leucocephalus) were chosen as a biomonitor. A risk assessment was conducted for the contaminants PCBs, DDT, dieldrin, TCDD-EQ and mercury in a fish diet, comparing exposure in Great Lakes-accessible regions to interior regions of the Au Sable, Manistee and Muskegon Rivers, using fish collected after 1990. This risk assessment included calculating new hazard quotients (HQs) from toxic reference values (TRVs) to determine whether it was safe for inland wildlife to be exposed to anadromous fish allowed past barrier dams. The bald eagle population nesting in the study area increased throughout the study period. Mean mercury was greater in fishes from inland areas than in fishes from Great Lakes-influenced areas. Mean total PCBs, sum DDT and dieldrin were greater in Great Lakes-influenced areas. Total PCBs and sum DDT were greater in Great Lakes-influenced nesting areas than in inland nesting areas. TCDD-EQ was the limiting factor for bald eagle reproduction in Great Lakes-influenced areas, with the greatest HQ, which was greater than the adverse population level. My data suggest that if protection of wildlife from environmental contaminants is the management goal, then fish passage should not be allowed past Foote, Tippy and Croton dams. Concentrations of environmental contaminants in nestling bald eagle blood plasma confirm these results.
Productivity and success increased on the Manistee and Muskegon Rivers after run-of-river implementation, but the evidence that run-of-river operation caused the increase was inconclusive.

    Mathematical modeling of black-and-white chromogenic image stability

    Agfapan Vario-XL film was faded at various levels of temperature, humidity, light, and fade time to determine the mathematical relationships of these variables and to examine whether interaction occurs between the factors. Light stability of the film was measured, and the Arrhenius relationship was used to predict dark stability at ambient storage conditions. It was found that the amount of fade, measured as a change in either transmittance or density, could be mathematically modeled with a high degree of correlation. Each independent variable (temperature, humidity, and time) was interactive with the other two variables. Under the specific conditions tested, a significant interaction existed between light and dark fading reactions. For example, the light and dark cyan dye reactions inhibit each other. However, in the case of the magenta and yellow dyes, a synergistic, or catalytic, effect occurs when light fading precedes dark fading. Agfapan Vario-XL is extremely light stable when irradiated by a conventional enlarger light source. An intermittency effect was noted. The dark stability compares with some of the least stable chromogenic print films: a ten percent loss in printing density is predicted by Arrhenius extrapolation when the Agfapan Vario-XL is stored at room temperature at 45 percent relative humidity for five years.

    Survival Prediction in Head and Neck Cancer: Impact of Tumor and Patient Specific Characteristics

    Head and neck cancer accounts for almost 5% of all malignant tumors in the Netherlands. The most up-to-date Dutch Cancer Registry (NCR) database from 2009 reported 2878 new patients with an invasive carcinoma of the lip, oral cavity, pharynx and larynx (general incidence 17 per 100,000). In this thesis we focus on head and neck squamous cell carcinoma (HNSCC). Head and neck squamous cell carcinomas originate from the mucosal lining of the upper aero-digestive tract. Tobacco and alcohol are irritants to this mucosal lining and therefore form major risk factors for the genesis of malignant epithelial tumors. Other reported etiological factors are malnutrition, viral factors (Epstein-Barr virus and human papillomavirus), genetic predispositions and occupational hazards.
