929 research outputs found

    Subset-Based Ant Colony Optimisation for the Discovery of Gene-Gene Interactions in Genome Wide Association Studies

    In this paper an ant colony optimisation approach for the discovery of gene-gene interactions in genome-wide association study (GWAS) data is proposed. The subset-based approach includes a novel encoding mechanism and tournament selection to analyse full-scale GWAS data consisting of hundreds of thousands of variables to discover associations between combinations of small DNA changes and Type II diabetes. The method is tested on a large established database from the Wellcome Trust Case Control Consortium and is shown to discover combinations that are statistically significant and biologically relevant within reasonable computational time. The work contained in this paper was supported by an EPSRC First Grant (EP/J007439/1). This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from http://www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under award 076113
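The abstract above names a subset-based construction with tournament selection but does not spell out the encoding. The following is only a minimal sketch of one way such a search could look, not the authors' method. The function name `subset_aco`, its parameters, the per-variable pheromone list, and the additive deposit rule are all illustrative assumptions, and `score_fn` stands in for whatever association statistic is used.

```python
import random


def subset_aco(n_vars, subset_size, score_fn, n_ants=20, n_iters=100,
               tournament_size=3, rho=0.1, seed=0):
    """Search for a high-scoring subset of variable indices.

    Assumptions (not from the paper): one pheromone value per variable,
    score_fn returns a non-negative number, tournament_size <= n_vars.
    """
    rng = random.Random(seed)
    pheromone = [1.0] * n_vars
    best_subset, best_score = None, float("-inf")

    for _ in range(n_iters):
        results = []
        for _ in range(n_ants):
            chosen = set()
            while len(chosen) < subset_size:
                # Tournament: draw a few variables uniformly at random and
                # keep the unused one with the most pheromone. This avoids
                # scanning all variables at every construction step.
                pool = [v for v in rng.sample(range(n_vars), tournament_size)
                        if v not in chosen]
                if pool:
                    chosen.add(max(pool, key=lambda v: pheromone[v]))
            subset = tuple(sorted(chosen))
            results.append((subset, score_fn(subset)))

        # Evaporation, then deposit proportional to each subset's score.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        for subset, score in results:
            for v in subset:
                pheromone[v] += score
            if score > best_score:
                best_subset, best_score = subset, score

    return best_subset, best_score
```

The tournament step is the design point worth noting: each choice inspects only `tournament_size` candidates, which is what keeps construction cheap when the number of variables is very large.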

    Discovering Higher-order SNP Interactions in High-dimensional Genomic Data

    In this thesis, a multifactor dimensionality reduction based method on associative classification is employed to identify higher-order SNP interactions for enhancing the understanding of the genetic architecture of complex diseases. Further, this thesis explored the application of deep learning techniques by providing new clues into the interaction analysis. The performance of the deep learning method is maximized by unifying deep neural networks with a random forest for achieving reliable interactions in the presence of noise

    Ant Colony Optimization

    Ant Colony Optimization (ACO) is the best example of how studies aimed at understanding and modeling the behavior of ants and other social insects can provide inspiration for the development of computational algorithms for the solution of difficult mathematical problems. Introduced by Marco Dorigo in his PhD thesis (1992) and initially applied to the travelling salesman problem, the ACO field has experienced a tremendous growth, standing today as an important nature-inspired stochastic metaheuristic for hard optimization problems. This book presents state-of-the-art ACO methods and is divided into two parts: (I) Techniques, which includes parallel implementations, and (II) Applications, where recent contributions of ACO to diverse fields, such as traffic congestion and control, structural optimization, manufacturing, and genomics are presented
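Since the abstract mentions that ACO was first applied to the travelling salesman problem, a minimal sketch of a basic Ant System loop may help make the idea concrete. It is not taken from the book. The function names and the default values for `alpha`, `beta`, and `rho` are illustrative assumptions, and the distance matrix is assumed symmetric with positive off-diagonal entries.

```python
import math
import random


def tour_length(tour, dist):
    """Total length of a closed tour over the distance matrix."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def ant_system_tsp(dist, n_ants=10, n_iters=50, alpha=1.0, beta=2.0,
                   rho=0.1, seed=0):
    """Basic Ant System for a symmetric TSP (illustrative defaults)."""
    rng = random.Random(seed)
    n = len(dist)
    pheromone = [[1.0] * n for _ in range(n)]
    best_tour, best_len = None, math.inf

    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            start = rng.randrange(n)
            tour, unvisited = [start], set(range(n)) - {start}
            while unvisited:
                current = tour[-1]
                candidates = sorted(unvisited)
                # Edge desirability: pheromone^alpha * (1 / distance)^beta
                weights = [
                    pheromone[current][j] ** alpha
                    * (1.0 / dist[current][j]) ** beta
                    for j in candidates
                ]
                nxt = rng.choices(candidates, weights=weights, k=1)[0]
                tour.append(nxt)
                unvisited.remove(nxt)
            tours.append((tour, tour_length(tour, dist)))

        # Evaporate pheromone on every edge.
        for i in range(n):
            for j in range(n):
                pheromone[i][j] *= 1.0 - rho

        # Each ant deposits pheromone inversely proportional to tour length.
        for tour, length in tours:
            for k in range(n):
                a, b = tour[k], tour[(k + 1) % n]
                pheromone[a][b] += 1.0 / length
                pheromone[b][a] += 1.0 / length
            if length < best_len:
                best_tour, best_len = tour, length

    return best_tour, best_len
```

Calling `ant_system_tsp(dist)` with a square distance matrix returns the best tour found and its length. Later ACO variants change mainly the deposit and evaporation rules.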

    Gene selection and classification in autism gene expression data

    Autism spectrum disorders (ASD) are neurodevelopmental disorders that are currently diagnosed on the basis of abnormal stereotyped behaviour as well as observable deficits in communication and social functioning. Although a variety of candidate genes have been attributed to the disorder, no single gene is applicable to more than 1–2% of the general ASD population. Despite extensive efforts, definitive genes that contribute to autism susceptibility have yet to be identified. The major problems in dealing with the autism gene expression dataset include the limited number of samples and substantial noise due to experimental measurement errors and natural variation. In this study, a systematic combination of three important filters, namely t-test (TT), Wilcoxon Rank Sum (WRS) and Feature Correlation (COR), is applied along with an efficient wrapper algorithm based on geometric binary particle swarm optimization-support vector machine (GBPSO-SVM), aiming at selecting and classifying the genes most strongly attributed to autism. A new approach based on the criteria of median ratio, mean ratio and variance deviations is also applied to reduce the initial dataset prior to its involvement. Results showed that the most discriminative genes identified in the first and last selection steps shared a recurring gene (CAPS2), which was designated the top ASD risk gene. The fused gene subset selected by the GBPSO-SVM algorithm increased the classification accuracy to about 92.10%, which is higher than the accuracies reported in the literature for the same autism dataset. Notably, an ensemble using random forest (RF) performed better than previous studies. However, the ensemble approach that employs SVM as an integrator of the fused genes from the output branches of GBPSO-SVM outperformed the RF integrator. The overall improvement was ascribed to the selection strategies used to reduce the dataset and to the use of the efficient wrapper-based GBPSO-SVM algorithm
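To illustrate the kind of filter step the abstract describes, here is a minimal sketch of ranking genes by a two-sample t statistic. It is not the study's pipeline. It uses Welch's unequal-variance form as an assumption, covers only the t-test filter (not WRS, COR, or the GBPSO-SVM wrapper), and the function names and data layout are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples.

    Assumes each sample has at least two values and that the two
    sample variances are not both zero.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_genes_by_t(expr, labels, top_k):
    """Return the top_k gene names ordered by |t|, largest first.

    expr:   dict mapping gene name -> list of expression values,
            one per sample (hypothetical layout).
    labels: list of 0/1 class labels aligned with the samples.
    """
    scores = {}
    for gene, values in expr.items():
        cases = [v for v, y in zip(values, labels) if y == 1]
        controls = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(welch_t(cases, controls))
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

A filter like this only shrinks the candidate list. A wrapper method would then search among the retained genes using classifier accuracy as the objective.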

    Learning Bayesian network equivalence classes using ant colony optimisation

    Bayesian networks have become an indispensable tool in the modelling of uncertain knowledge. Conceptually, they consist of two parts: a directed acyclic graph called the structure, and conditional probability distributions attached to each node known as the parameters. As a result of their expressiveness, understandability and rigorous mathematical basis, Bayesian networks have become one of the first methods investigated, when faced with an uncertain problem domain. However, a recurring problem persists in specifying a Bayesian network. Both the structure and parameters can be difficult for experts to conceive, especially if their knowledge is tacit. To counteract these problems, research has been ongoing, on learning both the structure and parameters of Bayesian networks from data. Whilst there are simple methods for learning the parameters, learning the structure has proved harder. Part of this stems from the NP-hardness of the problem and the super-exponential space of possible structures. To help solve this task, this thesis seeks to employ a relatively new technique, that has had much success in tackling NP-hard problems. This technique is called ant colony optimisation. Ant colony optimisation is a metaheuristic based on the behaviour of ants acting together in a colony. It uses the stochastic activity of artificial ants to find good solutions to combinatorial optimisation problems. In the current work, this method is applied to the problem of searching through the space of equivalence classes of Bayesian networks, in order to find a good match against a set of data. The system uses operators that evaluate potential modifications to a current state. Each of the modifications is scored and the results used to inform the search. In order to facilitate these steps, other techniques are also devised, to speed up the learning process. The techniques include [...] The techniques are tested by sampling data from gold standard networks and learning structures from this sampled data. These structures are analysed using various goodness-of-fit measures to see how well the algorithms perform. The measures include structural similarity metrics and Bayesian scoring metrics. The results are compared in depth against systems that also use ant colony optimisation and other methods, including evolutionary programming and greedy heuristics. Also, comparisons are made to well known state-of-the-art algorithms and a study performed on a real-life data set. The results show favourable performance compared to the other methods and on modelling the real-life data

    The evolution and mechanisms of caste plasticity in vespid wasps

    Social insects are ecologically dominant predators, pollinators, herbivores and detritivores across many terrestrial ecosystems. Key to the ecological success of these species is a uniquely strong division of labour between reproductives (‘queens’) and non-reproductives (‘workers’). In some social insect species, reproductive division of labour is obligate and developmentally determined, but many other taxa possess full reproductive plasticity, which is the basal state for social insect evolution. Answering the question of how division of reproductive labour is maintained in the presence of reproductive plasticity is an important prerequisite to understanding how and why this plasticity has been lost in the most derived social insect taxa. In this thesis, I address this question using two species of social wasp which exhibit strong division of reproductive labour but full reproductive plasticity. Two chapters of the thesis examine responses to queen loss in the European paper wasp P. dominula, in order to understand the mechanisms by which groups accommodate the loss of a reproductive. In Chapter 2 I show that in this species, groups generate replacement reproductives rapidly and with little conflict by relying on an age-based succession criterion. In Chapter 3 I analyse the transcriptomic mechanisms that underlie this succession process, and show that variation in individuals’ phenotypes only partially explains their transcriptomic responses, a result that suggests hidden costs of queen loss. In Chapter 4, I analyse individual-level transcriptomic data from a facultatively social tropical hover wasp, Liostenogaster flavolineata, which forms linearly age-based dominance hierarchies in which individuals exhibit progressively reduced foraging effort as they move up in rank. I show that despite differences in social structure, variation in gene expression in colonies of this species is surprisingly similar to that of obligately social species such as P. dominula. 
I also find that genes that are associated with indirect fitness in L. flavolineata are more strongly evolutionarily conserved than genes associated with direct fitness, a surprising result that runs counter to results obtained for other social insect species. Additionally, in Chapter 5 I argue for a reconceptualization of the loss of reproductive plasticity that has occurred in more complex insect societies. Taken as a whole, this thesis sheds light on the behavioural and transcriptomic mechanisms by which distinct fitness strategies are maintained in reproductively skewed societies as well as revealing potential limitations of these mechanisms, emphasising the value of reproductively plastic social insects as models for the evolution of sociality

    Investigation of genetic associations of mother to neonate group B streptococcus

    Group B Streptococcus (GBS, Streptococcus agalactiae) is a leading cause of neonatal infections and stillbirths in infants under 3 months old. Vertical transmission remains the most common route of transmission to infants ≤6 days old due to ingestion of GBS laden fluids during delivery, or in utero. The route of transmission to infants 7-90 days old is less well understood. To understand the genetic distribution of GBS strains from The Gambia, and the associations of mother to infant transmission, we used whole genome sequencing. Genomic analysis of 781 GBS isolates from 154 GBS colonised women and their infants ≤90 days old from The Gambia revealed the most common serotypes were serotypes V (42%), II (27%), III (13%), IV (9%), Ia (7%) and Ib (2%). Multilocus sequence typing (MLST) grouped the isolates into 20 STs, with four novel STs identified which were ST1354, ST1355, ST1356 and ST1357. All GBS isolates clustered into CC1 (43.9%), CC26 (25.9%), CC17 (10%), CC19 (7.2%), CC23 (7.2%) and CC10 (5.1%). Antibiotic resistance was low in GBS in The Gambia. No isolates were resistant to penicillin or clindamycin, but resistance to macrolides (5%), tetracycline (99%), and fluoroquinolones (2.3%) was observed. Using comparative genomics to identify genetic mutations that are associated with mother to infant GBS transmission, multiple mutations were identified in 94 mother-infant pairs colonised with the same ST. Mutations were found in key virulence factors such as bibA and bca, but no gene had a mutation in more than one mother-infant pair, except for ispE, bioB, rsmB, infB and nylA_1, each of which was mutated in two mother-infant pairs. Overall, this work shows GBS strains in colonisation are heterogeneous and GBS can undergo genetic changes within a short duration of ≤89 days.