612 research outputs found

    Ant Colony Optimisation for Exploring Logical Gene-Gene Associations in Genome Wide Association Studies.

    In this paper a search for the logical variants of gene-gene interactions in genome-wide association study (GWAS) data using ant colony optimisation is proposed. The method based on stochastic algorithms is tested on a large established database from the Wellcome Trust Case Control Consortium and is shown to discover logical operations between combinations of single nucleotide polymorphisms that can discriminate Type II diabetes. A variety of logical combinations are explored and the best discovered associations are found within reasonable computational time and are shown to be statistically significant. This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from http://www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under award 076113. The work contained in this paper was funded by an EPSRC First Grant (EP/J007439/1) and we acknowledge their kind support.
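
To make the idea of a "logical operation between SNPs" concrete, here is a minimal Python sketch of how one candidate rule might be scored. It is an illustration only, not the paper's method: the Pearson chi-square score, the boolean encoding of each SNP as "risk allele present", and the names `score_rule` and `OPERATORS` are all assumptions made for this example, and the data in the usage note is a toy.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no association signal
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical set of logical operators applied to two boolean SNP indicators.
OPERATORS = {
    "AND": lambda x, y: x and y,
    "OR": lambda x, y: x or y,
    "XOR": lambda x, y: x != y,
}


def score_rule(snp_a, snp_b, is_case, op):
    """Score how well `snp_a <op> snp_b` separates cases from controls."""
    combine = OPERATORS[op]
    a = b = c = d = 0
    for x, y, case in zip(snp_a, snp_b, is_case):
        fired = combine(x, y)
        if fired and case:
            a += 1  # rule fires on a case
        elif fired:
            b += 1  # rule fires on a control
        elif case:
            c += 1  # rule silent on a case
        else:
            d += 1  # rule silent on a control
    return chi_square_2x2(a, b, c, d)
```

With the toy inputs `snp_a = [True, True, False, False]`, `snp_b = [True, False, True, False]` and `is_case = [False, True, True, False]`, the XOR rule matches case status exactly and so scores higher than AND or OR. A search procedure such as ant colony optimisation would then explore which SNP pairs and operators to score.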

    Subset-Based Ant Colony Optimisation for the Discovery of Gene-Gene Interactions in Genome Wide Association Studies

    In this paper an ant colony optimisation approach for the discovery of gene-gene interactions in genome-wide association study (GWAS) data is proposed. The subset-based approach includes a novel encoding mechanism and tournament selection to analyse full scale GWAS data consisting of hundreds of thousands of variables to discover associations between combinations of small DNA changes and Type II diabetes. The method is tested on a large established database from the Wellcome Trust Case Control Consortium and is shown to discover combinations that are statistically significant and biologically relevant within reasonable computational time. The work contained in this paper was supported by an EPSRC First Grant (EP/J007439/1). This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from http://www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under award 076113.

    An adaptive ant colony optimization algorithm for rule-based classification

    Classification is an important data mining task with applications in many fields. Various classification algorithms have been developed to produce classification models with high accuracy. Unlike other complex and difficult classification models, rule-based classification algorithms produce models that are understandable to users. Ant-Miner is a variant of ant colony optimisation and a prominent intelligent algorithm widely used in rule-based classification. However, Ant-Miner suffers from overfitting and easily falls into local optima, which results in low classification accuracy and complex classification rules. In this study, a new Ant-Miner classifier named Adaptive Genetic Iterated-AntMiner (AGI-AntMiner) is developed that aims to avoid local optima and overfitting problems. The components of AGI-AntMiner include: i) an Adaptive AntMiner, a pre-pruning technique that dynamically selects the appropriate threshold based on the quality of the rules; ii) a Genetic AntMiner that improves post-pruning by adding/removing terms in a dual manner; and iii) an Iterated Local Search-AntMiner that improves exploitation based on a multiple-neighbourhood structure. The proposed AGI-AntMiner algorithm is evaluated on 16 benchmark datasets from medical, financial, gaming and social domains obtained from the University of California Irvine repository. The algorithm's performance was compared with other variants of Ant-Miner and state-of-the-art rule-based classification algorithms on classification accuracy and model complexity. Experimental results showed that the proposed AGI-AntMiner algorithm is superior in two aspects. Hybridization of local search in AGI-AntMiner has improved the exploitation mechanism, which leads to the discovery of more accurate classification rules. The new pre-pruning and post-pruning techniques have improved the pruning ability to produce shorter classification rules that are easier for users to interpret. Thus, the proposed AGI-AntMiner algorithm is capable of conducting an efficient search for the best classification rules that balance classification accuracy and model complexity to overcome overfitting and local optima problems.

    A Survey on Natural Inspired Computing (NIC): Algorithms and Challenges

    Nature employs interactive images to incorporate end users' awareness and implication aptitude form inspirations into statistical/algorithmic information investigation procedures. Nature-inspired Computing (NIC) is an energetic research exploration field that has applications in various areas, such as optimization, computational intelligence, evolutionary computation, multi-objective optimization, data mining, resource management, robotics, transportation and vehicle routing. The promising playing field of NIC focuses on managing substantial, assorted and self-motivated dimensions of information all the way through the incorporation of individual opinion by means of inspiration as well as communication methods in the study practices. In addition, it is the permutation of correlated study parts together with Bio-inspired computing, Artificial Intelligence and Machine learning that revolves efficient diagnostics interested in a competent pasture of study. This article aims to give a summary of Nature-inspired Computing, its capacity and concepts, and details the most significant scientific study algorithms in the field.

    Learning Bayesian network equivalence classes using ant colony optimisation

    Bayesian networks have become an indispensable tool in the modelling of uncertain knowledge. Conceptually, they consist of two parts: a directed acyclic graph called the structure, and conditional probability distributions attached to each node known as the parameters. As a result of their expressiveness, understandability and rigorous mathematical basis, Bayesian networks have become one of the first methods investigated when faced with an uncertain problem domain. However, a recurring problem persists in specifying a Bayesian network. Both the structure and parameters can be difficult for experts to conceive, especially if their knowledge is tacit. To counteract these problems, research has been ongoing on learning both the structure and parameters of Bayesian networks from data. Whilst there are simple methods for learning the parameters, learning the structure has proved harder. Part of this stems from the NP-hardness of the problem and the super-exponential space of possible structures. To help solve this task, this thesis seeks to employ a relatively new technique that has had much success in tackling NP-hard problems. This technique is called ant colony optimisation. Ant colony optimisation is a metaheuristic based on the behaviour of ants acting together in a colony. It uses the stochastic activity of artificial ants to find good solutions to combinatorial optimisation problems. In the current work, this method is applied to the problem of searching through the space of equivalence classes of Bayesian networks, in order to find a good match against a set of data. The system uses operators that evaluate potential modifications to a current state. Each of the modifications is scored and the results used to inform the search. In order to facilitate these steps, other techniques are also devised to speed up the learning process. The techniques include ... The techniques are tested by sampling data from gold standard networks and learning structures from this sampled data. These structures are analysed using various goodness-of-fit measures to see how well the algorithms perform. The measures include structural similarity metrics and Bayesian scoring metrics. The results are compared in depth against systems that also use ant colony optimisation and other methods, including evolutionary programming and greedy heuristics. Also, comparisons are made to well known state-of-the-art algorithms and a study performed on a real-life data set. The results show favourable performance compared to the other methods and on modelling the real-life data.

    Analysis of physiological signals using machine learning methods

    Technological advances in data collection enable scientists to suggest novel approaches, such as Machine Learning algorithms, to process and make sense of this information. However, during this process of collection, data loss and damage can occur for reasons such as faulty device sensors or miscommunication. In the context of time-series data such as multi-channel bio-signals, there is a possibility of losing a whole channel. In such cases, existing research suggests imputing the missing parts when the majority of data is available. One way of understanding and classifying complex signals is by using deep neural networks. The hyper-parameters of such models have been optimised using the process of back propagation. Over time, improvements have been suggested to enhance this algorithm. However, an essential drawback of back propagation can be the sensitivity to noisy data. This thesis proposes two novel approaches to address the missing data challenge and back propagation drawbacks. First, suggesting a gradient-free model in order to discover the optimal hyper-parameters of a deep neural network. The complexity of deep networks and high-dimensional optimisation parameters presents challenges to finding a suitable network structure and hyper-parameter configuration. This thesis proposes the use of a minimalist swarm optimiser, Dispersive Flies Optimisation (DFO), to enable the selected model to achieve better results in comparison with the traditional back propagation algorithm in certain conditions such as a limited number of training samples. The DFO algorithm offers a robust search process for finding and determining the hyper-parameter configurations. Second, imputing whole missing bio-signals within a multi-channel sample. This approach comprises two experiments, namely the two-signal and five-signal imputation models. The first experiment attempts to implement and evaluate the performance of a model mapping bio-signals from A to B and vice versa. Conceptually, this is an extension to transfer learning using Cycle Generative Adversarial Networks (CycleGANs). The second experiment attempts to suggest a mechanism imputing missing signals in instances where multiple data channels are available for each sample. The capability to map to a target signal through multiple source domains achieves a more accurate estimate for the target domain. The results of the experiments performed indicate that in certain circumstances, such as having a limited number of samples, finding the optimal hyper-parameters of a neural network using gradient-free algorithms outperforms traditional gradient-based algorithms, leading to more accurate classification results. In addition, Generative Adversarial Networks could be used to impute the missing data channels in multi-channel bio-signals, and the generated data used for further analysis and classification tasks.

    Investigation of genetic associations of mother to neonate group B streptococcus

    Group B Streptococcus (GBS, Streptococcus agalactiae) is a leading cause of neonatal infections and stillbirths in infants under 3 months old. Vertical transmission remains the most common route of transmission to infants ≤6 days old due to ingestion of GBS laden fluids during delivery, or in utero. The route of transmission to infants 7-90 days old is less well understood. To understand the genetic distribution of GBS strains from The Gambia, and the associations of mother to infant transmission, we used whole genome sequencing. Genomic analysis of 781 GBS isolates from 154 GBS colonised women and their infants ≤90 days old from The Gambia revealed the most common serotypes were serotypes V (42%), II (27%), III (13%), IV (9%), Ia (7%) and Ib (2%). Multilocus sequence typing (MLST) grouped the isolates into 20 STs, with four novel STs identified which were ST1354, ST1355, ST1356 and ST1357. All GBS isolates clustered into CC1 (43.9%), CC26 (25.9%), CC17 (10%), CC19 (7.2%), CC23 (7.2%) and CC10 (5.1%). Antibiotic resistance was low in GBS in The Gambia. No isolates were resistant to penicillin or clindamycin, but resistance to macrolides (5%), tetracycline (99%), and fluoroquinolones (2.3%) were observed. Using comparative genomics to identify genetic mutations that are associated with mother to infant GBS transmission, multiple mutations were identified in 94 mother-infant pairs colonised with the same ST. Mutations were found in key virulence factors such as bibA and bca, but no gene had a mutation in more than one mother-infant pair, except for mutations in ispE, bioB, rsmB, infB and nylA_1, where each of these genes were found in two mother-infant pairs. Overall, this work shows GBS strains in colonisation are heterogeneous and GBS can undergo genetic changes within a short duration of ≤89 days.

    The multiple pheromone Ant clustering algorithm

    Ant Colony Optimisation algorithms mimic the way ants use pheromones for marking paths to important locations. Pheromone traces are followed and reinforced by other ants, but also evaporate over time. As a consequence, optimal paths attract more pheromone, whilst the less useful paths fade away. In the Multiple Pheromone Ant Clustering Algorithm (MPACA), ants detect features of objects represented as nodes within graph space. Each node has one or more ants assigned to each feature. Ants attempt to locate nodes with matching feature values, depositing pheromone traces on the way. This use of multiple pheromone values is a key innovation. Ants record other ant encounters, keeping a record of the features and colony membership of ants. The recorded values determine when ants should combine their features to look for conjunctions and whether they should merge into colonies. This ability to detect and deposit pheromone representative of feature combinations, and the resulting colony formation, renders the algorithm a powerful clustering tool. The MPACA operates as follows: (i) initially each node has ants assigned to each feature; (ii) ants roam the graph space searching for nodes with matching features; (iii) when departing matching nodes, ants deposit pheromones to inform other ants that the path goes to a node with the associated feature values; (iv) ant feature encounters are counted each time an ant arrives at a node; (v) if the feature encounters exceed a threshold value, feature combination occurs; (vi) a similar mechanism is used for colony merging. The model varies from traditional ACO in that: (i) a modified pheromone-driven movement mechanism is used; (ii) ants learn feature combinations and deposit multiple pheromone scents accordingly; (iii) ants merge into colonies, the basis of cluster formation. The MPACA is evaluated over synthetic and real-world datasets and its performance compares favourably with alternative approaches
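
The deposit-and-evaporate mechanism described at the start of this abstract can be sketched in a few lines of Python. This is a generic single-pheromone ACO toy, not the MPACA itself: the two-route graph, the evaporation rate, the deposit of `1 / path length`, and every function name are assumptions chosen for illustration.

```python
import random


def choose_next(node, neighbours, pheromone, rng):
    """Pick a neighbour with probability proportional to the edge pheromone."""
    options = neighbours[node]
    weights = [pheromone[(node, nxt)] for nxt in options]
    return rng.choices(options, weights=weights, k=1)[0]


def evaporate(pheromone, rate):
    """Reduce every trail by a fixed fraction so unused paths fade away."""
    for edge in pheromone:
        pheromone[edge] *= 1.0 - rate


def deposit(pheromone, path, amount):
    """Reinforce each edge along the path an ant has walked."""
    for a, b in zip(path, path[1:]):
        pheromone[(a, b)] += amount


def run(iterations=200, ants=10, rate=0.1, seed=0):
    rng = random.Random(seed)
    # Toy graph (assumed): a short route via A and a longer route via B and C.
    neighbours = {"nest": ["A", "B"], "A": ["food"], "B": ["C"], "C": ["food"]}
    pheromone = {(a, b): 1.0 for a, nbrs in neighbours.items() for b in nbrs}
    for _ in range(iterations):
        paths = []
        for _ in range(ants):
            node, path = "nest", ["nest"]
            while node != "food":
                node = choose_next(node, neighbours, pheromone, rng)
                path.append(node)
            paths.append(path)
        evaporate(pheromone, rate)
        for path in paths:
            # Shorter paths receive a larger deposit per edge.
            deposit(pheromone, path, 1.0 / (len(path) - 1))
    return pheromone
```

Calling `run()` returns the final pheromone level on each edge. In this toy setup the shorter route tends to accumulate more pheromone than the longer one, which is the positive-feedback effect the abstract refers to. The MPACA extends this picture with one pheromone per feature, encounter counting and colony merging, none of which is modelled here.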