229 research outputs found

    Bioinformatics

    This book is divided into research areas relevant to bioinformatics, such as biological networks, next generation sequencing, high performance computing, molecular modeling, structural bioinformatics and intelligent data analysis. Each section introduces the basic concepts and then explains their application to problems of great relevance, so both novice and expert readers can benefit from the information and research presented here.

    Data mining using intelligent systems : an optimized weighted fuzzy decision tree approach

    Data mining aims to analyze observational datasets to find relationships and to present the data in ways that are both understandable and useful. In this thesis, existing intelligent systems techniques such as the Self-Organizing Map, Fuzzy C-Means and decision trees are used to analyze several datasets. The techniques provide flexible information processing capability for handling real-life situations. This thesis is concerned with the design, implementation, testing and application of these techniques to those datasets. The thesis also introduces a hybrid intelligent systems technique, the Optimized Weighted Fuzzy Decision Tree (OWFDT), with the aim of improving Fuzzy Decision Trees (FDT) and solving practical problems. The OWFDT uses Fuzzy C-Means to fuzzify the input instances while keeping the expected labels crisp. This leads to a different output layer activation function and weight connection in the neural network (NN) structure obtained by mapping the FDT to the NN. A momentum term is introduced into the learning process that trains the weight connections, to avoid oscillation or divergence. A new reasoning mechanism is also proposed to combine the constructed tree with the weights optimized in the learning process. The thesis compares the OWFDT with two benchmark algorithms, Fuzzy ID3 and weighted FDT. Six datasets ranging from material science to medical and civil engineering were introduced as case study applications. These datasets involve classification of composite material failure mechanisms, classification of electrocorticography (ECoG)/electroencephalogram (EEG) signals, eye bacteria prediction and wave overtopping prediction. Different intelligent systems techniques were used to cluster the patterns and predict the classes, although OWFDT was used to design classifiers for all the datasets. In the material dataset, Self-Organizing Map and Fuzzy C-Means were used to cluster the acoustic event signals and assign those events to different failure mechanisms; after this classification, OWFDT was used to design a classifier for the acoustic event signals. For the eye bacteria dataset, bagging (bootstrap aggregating) was used to improve the classification accuracy of Multilayer Perceptrons and Decision Trees. Bagging of Decision Trees also helped to select the most important sensors (features), so that the dimension of the data could be reduced. The most important features were then used to grow the OWFDT, so that the curse of dimensionality could be addressed using this approach. The last dataset, which is concerned with wave overtopping, was used to benchmark OWFDT against other intelligent systems techniques, such as Adaptive Neuro-Fuzzy Inference System (ANFIS), Evolving Fuzzy Neural Network (EFuNN), Genetic Neural Mathematical Method (GNMM) and Fuzzy ARTMAP. The analysis of these datasets shows that patterns and classes can be found or classified by combining these techniques. OWFDT has also demonstrated its efficiency and effectiveness compared with a conventional fuzzy Decision Tree and a weighted fuzzy Decision Tree.
    EThOS - Electronic Theses Online Service; University of Warwick; Overseas Research Students Awards Scheme (ORSAS); GB; United Kingdom
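    The fuzzification step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard Fuzzy C-Means membership formula with fuzzifier m = 2 and already-known cluster centres; the function name `fcm_memberships` and the sample centres are hypothetical and are not taken from the thesis.

```python
import math


def fcm_memberships(x, centers, m=2.0):
    """Return the fuzzy membership of point x in each cluster centre.

    Assumption: standard Fuzzy C-Means membership
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with Euclidean distances.
    """
    dists = [math.dist(x, c) for c in centers]
    # A point that coincides with one or more centres belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]


# Hypothetical example: two cluster centres and one input instance.
centers = [(0.0, 0.0), (4.0, 0.0)]
memberships = fcm_memberships((1.0, 0.0), centers)
```

    Each input instance receives a graded membership in every cluster while its class label stays crisp, which is the arrangement the abstract describes; how the thesis obtains the centres and uses the memberships in the tree is not shown here.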

    Biological applications of multimodal imaging involving Raman and 4Pi Raman microscopy

    Raman microscopy is becoming an increasingly important label-free imaging technique. It has proved to be a viable tool for life science applications, allowing bacteria, cells, and tissues to be analyzed at the molecular level. Combining Raman microscopy with complementary imaging modalities and techniques is explored here to: (1) analyze mild traumatic brain injury (mTBI) in combination with magnetic resonance imaging (MRI), to detect mild brain tissue damage that is invisible to medical imaging techniques; (2) reveal the complementarity of Raman and fluorescence microscopy approaches for investigating and tracking bovine lactoferrin inside calf rectal epithelial cells in the presence of enterohemorrhagic Escherichia coli (EHEC); (3) apply Raman microscopy alongside molecular analysis approaches (such as scanning transmission electron microscopy-energy dispersive X-ray (STEM-EDX), low energy X-ray fluorescence (LEXRF), and nanoscale secondary ion mass spectrometry (Nano-SIMS)) to uncover the origin of the long-range conductance in cable bacteria; (4) develop a multifunctional surface enhanced Raman scattering (SERS) platform based on calcium carbonate particles for enhancing the weak Raman scattering signal of biomolecules, and apply Raman microscopy for particle detection in vivo in Caenorhabditis elegans (C. elegans) worms; and (5) combine Raman microscopy and atomic force microscopy (AFM) to track Chlamydia psittaci in cells. Analysis of the samples and phenomena described above is based on Raman molecular fingerprint images, where, as in fluorescence light microscopy, the resolution is limited by the diffraction of light. Effort is therefore also made to enhance the resolution of Raman microscopy-based imaging by adding a 4Pi configuration to a confocal Raman microscope. As a result, the possibility of enhancing the axial (also called longitudinal) resolution is investigated by constructing a 4Pi confocal Raman microscope, which is also applied to study bacteria inside cells. The results presented in this work emphasize the added value of multimodal microscopy approaches, particularly those involving Raman microscopy, in a broad range of applications in bioengineering, biomedicine, and biology.

    Sequence analysis in bioinformatics: methodological and practical aspects

    2011 - 2012
    My PhD research activities have focused on the development of new computational methods for biological sequence analysis. To overcome a problem intrinsic to protein sequence analysis, namely inferring homologies in large biological protein databases with short queries, I developed a BLAST-based statistical framework to detect distant homologies conserved in transmembrane domains of different bacterial membrane proteins. Using this framework, the transmembrane protein domains of all Salmonella spp. have been screened and more than five thousand significant homologies have been identified. My results show that the proposed framework detects distant homologies that, because of their conservation in distinct bacterial membrane proteins, could represent ancient signatures of the existence of primeval genetic elements (or mini-genes) coding for short polypeptides that formed, through a primitive assembly process, more complex genes. Further, my statistical framework lays the foundation for new bioinformatics tools to detect domain-oriented homologies, or in other words, to find statistically significant homologies in specific target domains. The second problem that I faced deals with the analysis of transcripts obtained from RNA-Seq data. I developed a novel computational method that combines transcript borders, obtained from mapped RNA-Seq reads, with operon predictions based on sequence features to accurately infer operons in prokaryotic genomes. Since the transcriptome of an organism is dynamic and condition dependent, the mapped RNA-Seq reads are used to determine a set of confirmed or predicted operons, from which specific transcriptomic features are extracted and combined with standard genomic features to train and validate three operon classification models (Random Forests, RFs; Neural Networks, NNs; and Support Vector Machines, SVMs). These classifiers have been used to refine the operon map annotated by DOOR, one of the most used databases of prokaryotic operons. This method showed that the integration of genomic and transcriptomic features improves the accuracy of operon predictions, and that it is possible to predict the existence of potential new operons. An inherent limitation of using RNA-Seq to improve operon structure predictions is that it cannot be applied to genes not expressed under the condition studied. I evaluated my approach on different RNA-Seq based transcriptome profiles of Histophilus somni and Porphyromonas gingivalis. These transcriptome profiles were obtained using the standard RNA-Seq or the strand-specific RNA-Seq method. My experimental results demonstrate that the three classifiers achieved accurate operon maps, including reliable predictions of new operons. [edited by author]
    XI n.s

    1996 Eighth Annual IMSA Presentation Day

    Attached is the Eighth Annual Presentation Day 1996 Program and Abstract Packet. We will be showcasing the research and achievements of students and staff of the IMSA community, as well as several off-campus presenters.

    Modeling Gardnerella Host-Pathogen Interaction: Characterization of Vaginolysin and the Opacity Phenotype

    Bacterial vaginosis (BV) describes a dysbiotic state of the vaginal mucosa, during which the dominance by beneficial lactobacilli species is lost, the vaginal pH increases, and the flora comprising the vaginal microbiome becomes more diverse. Bacterial taxa associated with this dysbiotic state interact to form a dense polymicrobial biofilm on the surface of the vaginal epithelium. This shift in the vaginal microflora is clinically important, even when symptoms are not noted, as women with BV are at increased risk for acquisition of other sexually transmitted infections, including HIV, and are more than two times as likely to experience preterm birth. Moreover, BV has been associated with numerous other adverse outcomes, including pelvic inflammatory disease, recurring urinary tract infections, chorioamnionitis, postpartum endometritis, infertility, and adverse neonatal outcomes, even among full term infants. The prevalence of BV, in the United States alone, is approximately 30% for women of reproductive age, and is significantly increased in Hispanic and African American women. Standard metronidazole treatment carries a recurrence rate of 52% within 12 months of treatment and recent reports indicate that BV recurrence can, at least in part, be attributed to the inability to reestablish the beneficial lactobacilli population after antibiotic treatment. Thus, it is clear that more targeted approaches to BV treatment are needed. While the last few decades of research have given us a better understanding of the organisms that comprise the vaginal microbiome, there are still a lot of gaps in our understanding of what they are doing and how they are interacting with the host and each other, both in the context of health and disease. This is due to inadequacies of animal models and difficulty in culturing and genetically manipulating vaginally relevant microorganisms. 
The overarching aim of this dissertation research was to characterize the factors that contribute to the establishment and maintenance of vaginal dysbiosis, focusing on the role(s) of one organism, Gardnerella vaginalis. Various tools were developed to further characterize putative virulence determinants produced by Gardnerella to better understand how the bacterium interacts with the human host, with emphasis on the cholesterol-dependent toxin, vaginolysin (VLY), and an opacity phenotype presumptively driven by differential pilus production. Through the development of a human vaginal epithelial model of Gardnerella infection, in which Gardnerella spp. grow to in vivo densities, we uncovered a previously undescribed influence of tissue polarity on VLY activity. This influence was mediated by differential expression of the proposed VLY co-receptor, CD59. Further investigation of VLY in different Gardnerella strains revealed conserved amino acid differences within regions of the toxin known to interact with host cells, allowing for the characterization of distinct VLY types. Two predominant VLY types exhibited differential interaction with vaginal epithelial cells, putatively related to CD59 availability. Bioinformatic analyses showed that certain VLY types are restricted to select Gardnerella species and that the dominant VLY type in vaginal samples has implications for overall Gardnerella abundance and symptom frequency during BV. We then characterized a colony opacity phenotype potentially susceptible to phase variation. Both opaque and translucent variants were detected in all tested Gardnerella isolates, representing six different species. Opaque variants exhibited increased bacterial aggregation in suspension as well as increased surface biofilm formation, suggesting that opacity may promote resistance to host defenses. Finally, we report the development of the first method for performing site-specific mutagenesis in Gardnerella spp. 
After optimization, this oligonucleotide-based protocol yielded transformants at an efficiency similar to that observed in other bacterial species. Using these optimized methods, we were able to transform three strains of Gardnerella representing three different species. While this represents a significant advance in the field, continued optimization is necessary to further improve transformation efficiency and so make the tool more practical for generating non-selectable transformants. In sum, continuing to build upon the knowledge gleaned from this work will allow us to develop more targeted therapies for BV and ultimately reduce its global health burden.

    2010 IMSAloquium, Student Investigation Showcase

    IMSA students engage in investigations in nanotechnology, particle physics, law, neonatal medicine, literature, transplantation biology, water purity, the educational achievement gap, neurobiology and memory, ethics, theatre, discrete mathematics, economics, and more.

    06. 2010 IMSAloquium Student Investigation Showcase
