25 research outputs found

    Gene selection and classification in autism gene expression data

    Autism spectrum disorders (ASD) are neurodevelopmental disorders that are currently diagnosed on the basis of abnormal stereotyped behaviour as well as observable deficits in communication and social functioning. Although a variety of candidate genes have been attributed to the disorder, no single gene is applicable to more than 1–2% of the general ASD population. Despite extensive efforts, definitive genes that contribute to autism susceptibility have yet to be identified. The major problems in dealing with autism gene expression datasets are the limited number of samples and the substantial noise caused by experimental measurement error and natural variation. In this study, a systematic combination of three filters, namely the t-test (TT), Wilcoxon Rank Sum (WRS) and Feature Correlation (COR), is applied along with an efficient wrapper algorithm based on geometric binary particle swarm optimization-support vector machine (GBPSO-SVM), with the aim of selecting and classifying the genes most strongly associated with autism. A new approach based on the criteria of median ratio, mean ratio and variance deviations is also applied to reduce the initial dataset before it is processed further. The most discriminative genes identified in the first and last selection steps shared a recurring gene (CAPS2), which was designated the gene most associated with ASD risk. The fused subset of genes selected by the GBPSO-SVM algorithm increased the classification accuracy to about 92.10%, which is higher than the values reported in the literature for the same autism dataset. Notably, an ensemble using random forest (RF) performed better than previous studies. However, the ensemble approach that used an SVM as the integrator of the fused genes from the output branches of GBPSO-SVM outperformed the RF integrator. The overall improvement was ascribed to the selection strategies used to reduce the dataset and to the efficient wrapper-based GBPSO-SVM algorithm.
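    The filter stage described in this abstract can be illustrated with a minimal sketch. It covers only a t-test style filter (Welch's t-statistic), not the WRS or COR filters and not the GBPSO-SVM wrapper. The function names, the gene names and the toy expression values are all hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances).

    Assumes each sample has at least two values and that the two samples
    are not both constant (otherwise the standard error is zero).
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_genes(expr, labels, top_k):
    """Return the top_k gene names ranked by |t| between class 1 and class 0.

    expr:   dict mapping gene name -> list of expression values (one per sample)
    labels: list of 0/1 class labels aligned with the samples
    """
    scores = {}
    for gene, values in expr.items():
        case = [v for v, y in zip(values, labels) if y == 1]
        ctrl = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(welch_t(case, ctrl))
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: three case samples followed by three controls.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "gene_a": [5.1, 5.3, 4.9, 2.0, 2.2, 1.9],  # clearly separated classes
    "gene_b": [3.0, 3.1, 2.9, 3.0, 3.2, 2.8],  # same mean in both classes
}
top = rank_genes(expr, labels, top_k=1)
```

    In a pipeline of the kind the abstract outlines, a ranked list like `top` would be one of several filter outputs passed on to a wrapper method for the final selection.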

    Evolutionary Computation

    This book presents several recent advances in Evolutionary Computation, especially evolution-based optimization methods and hybrid algorithms for several applications, from optimization and learning to pattern recognition and bioinformatics. It also presents new algorithms based on several analogies and metaphors, one of which is based on philosophy, specifically on the philosophy of praxis and dialectics. The book also presents interesting applications in bioinformatics, especially the use of particle swarms to discover gene expression patterns in DNA microarrays. This book therefore features representative work in the field of evolutionary computation and applied sciences. The intended audience is graduate and undergraduate students, researchers, and anyone who wishes to become familiar with the latest research work in this field.

    Discovering Biomarkers of Alzheimer's Disease by Statistical Learning Approaches

    In this work, statistical learning approaches are exploited to discover biomarkers for Alzheimer's disease (AD). Contributions have been made in both biomarker-driven and software-driven studies. Surprising discoveries were made in the search for blood-based biomarkers. With the inclusion of existing biological knowledge and a proposed novel feature selection method, several blood-based protein models were found to have promising ability to separate AD patients from healthy individuals. A new statistical pattern was discovered which could serve as a new guideline for diagnosis methodology. In the field of brain-based biomarkers, the positive contribution of covariates such as age, gender and APOE genotype to an AD classifier was verified, and a panel of highly informative biomarkers comprising 26 RNA transcripts was discovered. The classifier trained on this panel of genes shows excellent capacity in discriminating patients from controls. Apart from biomarker-driven studies, the work also involved the development of statistical packages and applications. The R package metaUnion was designed and developed to provide an advanced meta-analytic approach applicable to microarray data. This package overcomes the defects of previous meta-analytic packages: 1) the neglect of missing data, 2) the inflexibility of feature dimension, and 3) the lack of functions to support post-analysis summary. metaUnion has been applied in a published study as part of the integrated genomic approaches and resulted in significant findings. To provide dementia researchers with benchmark references on the significance of features, a web-based platform, AlzExpress, was built to provide granular differential expression test and meta-analysis results. A combination of fashionable big data technologies and robust data mining algorithms makes AlzExpress a flexible, scalable and comprehensive bioinformatics platform for dementia research. Plymouth Universit

    Assigning function to genome wide association study variants associated with complex gastrointestinal disease

    The genome‐wide association study era has identified numerous loci associated with many common polygenic diseases. The next challenge is to identify the functional consequences of these variants and elucidate how they affect disease risk. Using a combination of protein‐based assays, large‐scale microarrays and high‐throughput generation sequencing platforms, this thesis aims to identify the functional effects of disease loci, with particular focus on Crohn’s disease and coeliac disease, two common complex gastrointestinal diseases. Variants located within the Interleukin 23 receptor are associated with both susceptibility to and protection from Crohn’s disease, a debilitating chronic inflammatory disease of the bowel. A study was undertaken to investigate the effect of these variants, at the mRNA as well as the protein level, on both cytokine and receptor levels. Coeliac disease is a dietary intolerance to the gluten component of wheat, barley and rye and has an estimated prevalence of approximately 1%. Genome‐wide association studies have identified eight different genomic loci associated with coeliac disease, but none have been functionally characterised. To investigate the effect that genotype has on gene transcript levels, a genetical genomics study was undertaken in patients with coeliac disease, generating results with relevance to a range of autoimmune disorders. Before disease‐based effects can be identified, it is first important to fully characterise the normal human transcriptome and methylome. To this end, CD4+ T cells were studied using novel high‐throughput sequencing techniques, with the aim of providing some insight into novel genomic properties that may illuminate current and future disease‐associated loci. Given the base‐pair resolution of high‐throughput sequencing, a novel method of assaying for SNP effects on gene expression was developed. This allele‐specific method, using whole transcriptome sequencing, is capable of identifying alterations in transcript expression on a genome‐wide scale.
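    The abstract does not say how allelic imbalance is tested. As a rough illustration only, one common approach is an exact two-sided binomial test on the reference and alternate read counts at a heterozygous SNP, assuming that both alleles are expected at equal frequency when there is no allele-specific effect. The function name and the read counts below are hypothetical and are not drawn from the thesis.

```python
from math import comb


def allelic_imbalance_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Sums the probabilities of all outcomes that are no more likely than
    the observed reference-allele count under Binomial(n, p).
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so outcomes tied with the observed one are included.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical read counts at two heterozygous SNPs.
balanced = allelic_imbalance_p(10, 10)  # equal counts: no evidence of imbalance
skewed = allelic_imbalance_p(20, 0)     # all reads from one allele
```

    A real analysis would also need to handle reference-mapping bias and correct for testing many SNPs, neither of which this sketch attempts.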

    Mediation of triple-negative breast cancer cell fate via cellular redox and Wnt signalling

    Breast cancer is the most common cause of malignancy affecting women worldwide. This thesis focusses on the role of DDX20 in regulating Wnt/β-catenin signalling and its impact on cell fate in triple-negative breast cancer (TNBC). The results of this study demonstrated a new role for DDX20-mediated Wnt signalling governing intracellular redox and mitochondrial function. Furthermore, we have determined that DDX20 is an essential regulator of Wnt/β-catenin signalling in TNBC stem cells.

    Characterisation and structural studies of a superoxide dismutase and OmpA-like proteins from Borrelia burgdorferi sensu lato

    Lyme borreliosis is the most common tick-borne human infection across the Northern hemisphere. The agent responsible, Borrelia burgdorferi sensu lato (s.l.), covers a family of Spirochaetes with unique characteristics which are shared by both Gram-negative and Gram-positive bacteria. The outer membrane (OM) is rich in lipoproteins but contains a relatively low density of integral membrane proteins (OMPs); of these OMPs, very few have been identified and even fewer are well characterised. The OmpA-like transmembrane domain defined by the Pfam family PF01389 is an 8-stranded membrane-spanning β-barrel and is well conserved among Gram-negative bacteria, but to date remains unknown in Spirochaetes. Building on previous computational work which had sought to identify possible OMPs from B. burgdorferi s.l., four OmpA-like proteins, BAPKO_0422 (Borrelia afzelii), BB_0562, BB_0406 (B. burgdorferi) and BG0408 (Borrelia garinii), have been identified and structurally characterised. The four proteins are encoded by chromosomal genes, are highly conserved between Borrelia species and may be of diagnostic or therapeutic value. Structural characterisation by both circular dichroism and small angle X-ray scattering suggests these four proteins adopt a compact globular structure rich in β-strand (~40%), with ab initio molecular envelopes resembling a cylindrical peanut shape with dimensions of ~25x45 Å, consistent with an 8-stranded β-barrel. The present work demonstrates that BAPKO_0422 can bind human factor H (hfH) and provides some evidence for a further interaction between the BAPKO_0422 protein and heparin. The interaction with hfH may contribute to the spirochaete’s immune evasion mechanisms by inhibiting the complement response. The zoonotic life-cycle of Borrelia and challenges by the host’s immune system create an ever-changing environment which often leads to fluctuations in O2 exposure. Although B. burgdorferi s.l. have a distinct lack of metabolic systems, including peroxidases and catalase enzymes, the Spirochaetes genome does encode a single superoxide dismutase gene (sodA - bb_0157). The protein was previously assigned as an Fe-SOD, and there has been some debate over whether it requires iron or manganese as a co-factor. The present work demonstrates that the B. burgdorferi enzyme SodA requires manganese for activity and does not display cambialistic behaviour. Structural and proteomic characterisation suggests the B. burgdorferi SodA enzyme shares significant sequence similarity with a superoxide dismutase from Thermus thermophilus.

    RFID Technology in Intelligent Tracking Systems in Construction Waste Logistics Using Optimisation Techniques

    Construction waste disposal is an urgent issue for protecting our environment. This paper proposes a waste management system and illustrates the work process using plasterboard waste as an example; plasterboard creates a hazardous gas when landfilled with household waste, and its recycling rate is less than 10% in the UK. The proposed system integrates RFID technology, Rule-Based Reasoning, Ant Colony optimization and knowledge technology for auditing and tracking plasterboard waste, guiding the operation staff, arranging vehicles and planning schedules, and it also provides evidence to verify disposal. It relies on RFID equipment for collecting logistical data and uses digital imaging equipment to give further evidence; the reasoning core in the third layer is responsible for generating schedules, route plans and guidance, and the last layer delivers the results to inform users. The paper first introduces the current plasterboard disposal situation and addresses the logistical problem that is now the main barrier to a higher recycling rate, then discusses the proposed system in terms of both system-level structure and process structure. Finally, an example scenario is given to illustrate the system’s utilization.