    Large-scale Multi-label Text Classification - Revisiting Neural Networks

    Neural networks have recently been proposed for multi-label classification because they are able to capture and model label dependencies in the output layer. In this work, we investigate limitations of BP-MLL, a neural network (NN) architecture that aims at minimizing pairwise ranking error. Instead, we propose to use a comparably simple NN approach with recently proposed learning techniques for large-scale multi-label text classification tasks. In particular, we show that BP-MLL's ranking loss minimization can be efficiently and effectively replaced with the commonly used cross entropy error function, and demonstrate that several advances in neural network training that have been developed in the realm of deep learning can be effectively employed in this setting. Our experimental results show that simple NN models equipped with advanced techniques such as rectified linear units, dropout, and AdaGrad perform as well as or even outperform state-of-the-art approaches on six large-scale textual datasets with diverse characteristics. Comment: 16 pages, 4 figures, submitted to ECML 201

    Genetic regulation of glucoraphanin accumulation in Beneforté® broccoli

    Diets rich in broccoli (Brassica oleracea var. italica) have been associated with maintenance of cardiovascular health and reduction in risk of cancer. These health benefits have been attributed to glucoraphanin that specifically accumulates in broccoli. The development of broccoli with enhanced concentrations of glucoraphanin may deliver greater health benefits. Three high-glucoraphanin F1 broccoli hybrids were developed in independent programmes through genome introgression from the wild species Brassica villosa. Glucoraphanin and other metabolites were quantified in experimental field trials. Global SNP analyses quantified the differential extent of B. villosa introgression. The high-glucoraphanin broccoli hybrids contained 2.5–3 times the glucoraphanin content of standard hybrids due to enhanced sulphate assimilation and modifications in sulphur partitioning between sulphur-containing metabolites. All of the high-glucoraphanin hybrids possessed an introgressed B. villosa segment which contained a B. villosa Myb28 allele. Myb28 expression was increased in all of the high-glucoraphanin hybrids. Two high-glucoraphanin hybrids have been commercialised as Beneforté broccoli. The study illustrates the translation of research on glucosinolate genetics from Arabidopsis to broccoli, the use of wild Brassica species to develop cultivars with potential consumer benefits, and the development of cultivars with contrasting concentrations of glucoraphanin for use in blinded human intervention studies.

    Association mapping in tetraploid potato

    The results of a four-year project within the Centre for BioSystems Genomics (www.cbsg.nl), entitled “Association mapping and family genotyping in potato”, are described in this thesis. This project was intended to investigate whether a recently emerged methodology, association mapping, could provide the means to improve potato breeding efficiency. To answer this research question, a set of potato cultivars representative of the commercial potato germplasm was selected. In total 240 cultivars and progenitor clones were chosen. At a later stage this set was expanded with 190 recent breeds contributed by five participating breeding companies, which resulted in a total of 430 genotypes. In a pilot experiment, the results of which are reported in Chapter 2, a subset of 220 of the abovementioned 240 cultivars and progenitor clones was used. Phenotypic data were retrieved through contributions of the participating breeding companies and represented summary statistics of recent observations for a number of traits across years and locations, calculated following company-specific procedures. With AFLP marker data, in the form of normalised log-transformed band intensities obtained from five well-known primer combinations, the extent of linkage disequilibrium (LD) was estimated using the r² statistic. Population structure within the set of 220 cultivars was analysed with a clustering approach. No apparent or statistically supported population structure was revealed, and with this set of marker data the LD seemed to decay below the threshold of 0.1 at a genetic distance of about 3 cM. Furthermore, marker-trait associations were investigated by fitting single-marker regression models for phenotypic traits on marker band intensities, with and without correction for population structure. 
Population structure correction was performed in a straightforward way by incorporating a design matrix into the model, assuming that each breeding company represented a different breeding germplasm pool. The potential of association mapping in tetraploid potato has been demonstrated in this pilot experiment, because existing phenotypic data, a modest number of AFLP markers, and a relatively straightforward statistical analysis allowed identification of interesting associations for a number of agro-morphological and quality traits. These promising results encouraged us to undertake a comprehensive genome-wide association mapping study in potato. Two association mapping panels were compiled. One panel comprised 205 genotypes, all of which were also present in the set used for the pilot experiment. The other panel contained a total of 299 genotypes, including the entire set of 190 recent breeds together with a series of standard cultivars, about 100 of which are in common with the first panel. Phenotypic data for the association panel with 205 genotypes were obtained in a field trial performed in 2006 in Wageningen at two locations with two replicates. We will refer to this set as the “2006 field trial”. Phenotypic data for the other panel with 299 genotypes were contributed by the five participating breeding companies and consisted of multi-year, multi-location data obtained during generations of clonal selection. The 2006 data were well balanced because the trial was designed that way. The historical breeding dataset was highly unbalanced. Analysis of these two differing phenotypic datasets was performed to provide insight into variance components for the genotypic main effects and the genotype by environment interaction (GEI), in addition to estimates of genotype main effects across environments. Both phenotypic datasets were analysed separately within a mixed model framework including terms for GEI. 
In Chapter 3 we describe both phenotypic datasets by comparing variance components, heritabilities (=repeatabilities), intra-dataset relationships and inter-dataset relationships. Broader aspects related to phenotypic datasets and their analysis are discussed as well. To retrieve information about hidden population structure and genetic relatedness, and to estimate the extent of LD in potato germplasm, we used marker information generated with 41 AFLP primer combinations and 53 microsatellite loci on a collection of 430 genotypes. These 430 genotypes contain all genotypes present in the two association mapping panels introduced before, plus a few extra genotypes to increase potato germplasm coverage. Two methods were used: a Bayesian approach and a distance-based clustering approach. Chapter 4 describes the results of this exercise. Both strategies revealed a weak level of structure in our material. Groups were detected which complied with criteria such as their intended market segment, as well as groups differing in their year of first registration on a national list. Linkage disequilibrium, measured with the r² statistic, appeared to decay below the threshold of 0.1 across linkage groups at a genetic distance of about 5 cM on average. The results described in Chapter 4 are promising for association mapping research in potato. The odds are reasonable that useful marker-trait associations can be detected and that the potential mapping resolution will suffice for detection of QTL in an association mapping context. In Chapter 5 a comprehensive genome-wide association mapping study is presented. The adjusted genotypic means obtained for the two association mapping panels from the phenotypic analysis in Chapter 3 were combined with marker information in two association mapping models. Marker information consisted of normalised log-transformed band intensities of 41 AFLP primer combinations and allele dosage information from 53 microsatellites. 
A baseline model without correction for population structure and a more advanced model with correction for population structure and genetic relatedness were applied. Population structure and genetic relatedness were estimated using available marker information. Interesting QTL could be identified for 19 agro-morphological and quality traits. The observed QTL partly confirm previous studies, e.g., for tuber shape and frying colour, but new QTL were also detected, e.g., for after-baking darkening and enzymatic browning. In the final chapter, the general discussion, results of preceding chapters are evaluated and their implications for research as well as breeding are discussed.

    Spatial and temporal variation in otolith chemistry for tautog (Tautoga onitis) in Narragansett Bay and Rhode Island coastal ponds

    The elemental composition of otoliths may provide valuable information for establishing connectivity between fish nursery grounds and adult fish populations. Concentrations of Rb, Mg, Ca, Mn, Sr, Na, K, Pb, and Ba were determined by using solution-based inductively coupled plasma mass spectrometry in otoliths of young-of-the-year tautog (Tautoga onitis) captured in nursery areas along the Rhode Island coast during two consecutive years. Stable oxygen (δ18O) and carbon (δ13C) isotopic ratios in young-of-the-year otoliths were also analyzed with isotope ratio mass spectrometry. Chemical signatures differed significantly among the distinct nurseries within Narragansett Bay and the coastal ponds across years. Significant differences were also observed within nurseries from year to year. Classification accuracy to each of the five tautog nursery areas ranged from 85% to 92% across years. Because accurate classification of juvenile tautog nursery sites was achieved, otolith chemistry can potentially be used as a natural habitat tag.

    Fourth-order dispersion mediated modulation instability in dispersion oscillating fibers

    We investigate the role played by fourth-order dispersion in the modulation instability process in dispersion oscillating fibers. It not only leads to the appearance of instability sidebands in the normal dispersion regime (as in uniform fibers), but also to a new class of large detuned instability peaks that we ascribe to the variation of dispersion. All these theoretical predictions are experimentally confirmed. © 2013 Optical Society of America

    Clinical application of high throughput molecular screening techniques for pharmacogenomics.

    Genetic analysis is one of the fastest-growing areas of clinical diagnostics. Fortunately, as our knowledge of clinically relevant genetic variants rapidly expands, so does our ability to detect these variants in patient samples. Increasing demand for genetic information may necessitate the use of high throughput diagnostic methods as part of clinically validated testing. Here we provide a general overview of our current and near-future abilities to perform large-scale genetic testing in the clinical laboratory. First we review in detail molecular methods used for high throughput mutation detection, including techniques able to monitor thousands of genetic variants for a single patient or to genotype a single genetic variant for thousands of patients simultaneously. These methods are analyzed in the context of pharmacogenomic testing in clinical laboratories, with a focus on tests that are currently validated as well as those that hold strong promise for widespread clinical application in the near future. We further discuss the unique economic and clinical challenges posed by pharmacogenomic markers. Our ability to detect genetic variants frequently outstrips our ability to accurately interpret them in a clinical context, carrying implications both for test development and introduction into patient management algorithms. These complexities must be taken into account prior to the introduction of any pharmacogenomic biomarker into routine clinical testing.