38 research outputs found

    Pareto optimal-based feature selection framework for biomarker identification

    Numerous computational techniques have been applied to identify the vital features of gene expression datasets, with the aim of increasing the efficiency of biomedical applications. The classification of microarray data samples is an important task for correctly recognising diseases by identifying small but clinically meaningful sets of genes. However, the identification of disease-representative genes, or biomarkers, in high-dimensional microarray gene expression datasets remains a challenging task. This thesis investigates the viability of Pareto optimisation for identifying relevant subsets of biomarkers in high-dimensional microarray datasets. A robust Pareto Optimal based feature selection framework for biomarker discovery is then proposed. First, a two-stage feature selection approach using ensemble filter methods and Pareto Optimality is proposed. The integration of the multi-objective approach employing Pareto Optimality starts with well-known filter methods applied to various microarray gene expression datasets. Although filter methods provide ranked lists of features, they do not give information about optimum subsets of features (genes, in this study). To address this limitation, Pareto Optimality is incorporated alongside the filter methods. The robustness of the proposed framework is demonstrated on several well-known microarray gene expression datasets, where it achieves comparable predictive accuracy, or up to 100%, with comparatively fewer features. Better performance is obtained than with the other, single-objective approaches. Furthermore, cross-validation and k-fold approaches are integrated into the framework, which can mitigate over-fitting and make the gene selection process more accurate under various conditions. The proposed framework is then developed in several phases. The Sequential Forward Selection (SFS) method is first used to represent wrapper techniques, and the developed Pareto Optimality based framework is applied multiple times and tested on different data types. Given the nature of most real-life data, imbalanced classes are examined using the proposed framework. Using the proposed Pareto Optimal based feature selection framework, which has a novel structure for imbalanced classes, the classifier achieves similarly high performance across the different cases. Comparable or better gene subset sizes are obtained using the proposed framework. Finally, the handling of missing data within the proposed framework is investigated, and it is demonstrated that different data imputation methods can also help in the effective integration of various feature selection methods.
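    The idea of combining several filter rankings through Pareto Optimality can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `pareto_front`, the two filter scores per gene, and the convention that higher scores are better are all assumptions made for illustration. A gene is kept if no other gene scores at least as well under every filter and strictly better under at least one.

```python
import numpy as np

def pareto_front(scores: np.ndarray) -> np.ndarray:
    """Return indices of non-dominated rows (higher is better in every column)."""
    n = scores.shape[0]
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        # Row j dominates row i if it is >= in every column and > in at least one.
        at_least_as_good = np.all(scores >= scores[i], axis=1)
        strictly_better = np.any(scores > scores[i], axis=1)
        if np.any(at_least_as_good & strictly_better):
            keep[i] = False
    return np.flatnonzero(keep)

# Hypothetical relevance scores for five genes from two filter methods.
scores = np.array([
    [0.9, 0.2],
    [0.5, 0.5],
    [0.2, 0.9],
    [0.4, 0.4],  # dominated by gene 1
    [0.1, 0.1],  # dominated by every other gene
])

front = pareto_front(scores)
print(front)  # indices of the genes on the Pareto front
```

    Unlike a single ranked list, the front does not require weighting one filter against another; the surviving genes are the candidates that a later stage (for example a classifier with cross-validation) would evaluate.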

    Pod shattering: A homologous series of variation underlying domestication and an avenue for crop improvement

    In wild habitats, fruit dehiscence is a critical strategy for seed dispersal; however, in cultivated crops it is one of the major sources of yield loss. Therefore, indehiscence of fruits, pods, etc., was likely to be one of the first traits strongly selected in crop domestication. Even with the historical selection against dehiscence in early domesticates, it is a trait still targeted in many breeding programs, particularly in minor or underutilized crops. Here, we review dehiscence in pulse (grain legume) crops, which are of growing importance as a source of protein in human and livestock diets, and which have received less attention than cereal crops and the model plant Arabidopsis thaliana. We specifically focus on (i) the history of indehiscence in domestication across legumes, (ii) the structures and mechanisms involved in shattering, (iii) the molecular pathways underlying this important trait, (iv) the extent of crop losses due to shattering and the effects of environmental factors on shattering, and (v) efforts to reduce shattering in crops. While our focus is mainly on pulse crops, we also include comparisons to crucifers and cereals because there is extensive research on shattering in these taxa.

    The diversity and evolution of pollination systems in large plant clades: Apocynaceae as a case study

    Background and Aims: Large clades of angiosperms are often characterized by diverse interactions with pollinators, but how these pollination systems are structured phylogenetically and biogeographically is still uncertain for most families. Apocynaceae is a clade of >5300 species with a worldwide distribution. A database representing >10 % of species in the family was used to explore the diversity of pollinators and evolutionary shifts in pollination systems across major clades and regions. Methods: The database was compiled from published and unpublished reports. Plants were categorized into broad pollination systems and then subdivided to include bimodal systems. These were mapped against the five major divisions of the family, and against the smaller clades. Finally, pollination systems were mapped onto a phylogenetic reconstruction that included those species for which sequence data are available, and transition rates between pollination systems were calculated. Key Results: Most Apocynaceae are insect pollinated, with few records of bird pollination. Almost three-quarters of species are pollinated by a single higher taxon (e.g. flies or moths); 7 % have bimodal pollination systems, whilst the remaining approx. 20 % are insect generalists. The less phenotypically specialized flowers of the Rauvolfioids are pollinated by a more restricted set of pollinators than are more complex flowers within the Apocynoids + Periplocoideae + Secamonoideae + Asclepiadoideae (APSA) clade. Certain combinations of bimodal pollination systems are more common than others. Some pollination systems are missing from particular regions, whilst others are over-represented. Conclusions: Within Apocynaceae, interactions with pollinators are highly structured both phylogenetically and biogeographically. Variation in transition rates between pollination systems suggests constraints on their evolution, whereas regional differences point to environmental effects such as filtering of certain pollinators from habitats. This is the most extensive analysis of its type so far attempted and gives important insights into the diversity and evolution of pollination systems in large clades.

    The Effects of Dispersal and Pollination on Plantaginaceae Diversification

    The rich diversity of flowering plants can be explained by a variety of mechanisms, including geographical distribution, range expansion, and floral variation, which correlates with different forms of biotic pollination. Plantaginaceae is an ideal model for examining the mechanisms that generate angiosperm diversity, as the family has diverse distribution patterns in both the Old World and the New World and includes representatives of many different pollination syndromes. Using molecular phylogenetics, ancestral reconstructions, and phylogenetic modeling and hypothesis testing, this study aimed to investigate the factors affecting the macroevolution of the angiosperm family Plantaginaceae. With 683 species from 72 genera, and a total of 6996 characters from 5 different molecular markers, the phylogenetic reconstruction revealed that Plantaginaceae have 12 strongly supported monophyletic tribes. The family was inferred to have a New World origin and to have experienced several long-distance dispersal events between the Old World and the New World. In some cases, these long-distance dispersals were linked to chromosome number changes in the family. Sympatric speciation was shown to be a significant diversification mode in the family, which showed some heterogeneity in speciation rates among the tribes. These diversification patterns were not correlated with geographic distribution, as diversification rates in the Old World and the New World were similar. However, long-distance dispersals were found to be the main drivers of speciation within the family. Lastly, pollination was shown to have no effect on diversification in the tribe Antirrhineae. In summary, this study investigated the diversification patterns within the diverse angiosperm family Plantaginaceae. Since its origin in the New World approximately 48.81 mya, the family has experienced several long-distance dispersal events between the Old World and the New World. Along with the changes in chromosome numbers, long-distance dispersal was found to be a strong contributor to the diversity in the family.

    Diversification in Monkeyflowers: An Investigation of the Effects of Elevation and Floral Color in the Genus Mimulus

    The vast diversity of floral colours in many flowering plant families, paired with the observation of preferences among pollinators, suggests that floral colour may be involved in the process of speciation in flowering plants. While transitions in floral colour have been examined in numerous genera, we have very little information on the consequences of floral colour transitions for the evolutionary success of a clade. Overlaid upon these patterns is the possibility that certain floral colours are more prevalent in certain environments, with the causes of differential diversification being more directly determined by geographical distribution. Here we examine transition rates to anthocyanin + carotenoid rich (red/orange/fuchsia) flowers and examine whether red/orange flowers are associated with differences in speciation and/or extinction rates in Mimulus. Because it has been suggested that reddish flowers are more prevalent at high elevation, we also examine the macroevolutionary evidence for this association and determine whether there is evidence for differential diversification at high elevations. We find that, while red/orange clades have equivalent speciation rates, the trait state of reddish flowers reverts more rapidly to the nonreddish trait state. Moreover, there is evidence for high speciation rates at high elevation and no evidence that transition rates in floral colour differ with elevation.