75 research outputs found

    Novel Algorithm Development for ‘Next Generation’ Sequencing Data Analysis

    In recent years, the decreasing cost of ‘Next generation’ sequencing (NGS) has spawned numerous applications for interrogating whole genomes and transcriptomes in research, diagnostic and forensic settings. While the innovations in sequencing have been explosive, the development of scalable and robust bioinformatics software and algorithms for analysing the new types of data generated by these technologies has struggled to keep up. As a result, large volumes of NGS data available in public repositories are severely underutilised, despite providing a rich resource for data mining applications. Indeed, the bottleneck in genome and transcriptome sequencing experiments has shifted from data generation to bioinformatics analysis and interpretation. This thesis focuses on the development of novel bioinformatics software to bridge the gap between data availability and interpretation. The work is split between two core topics: computational prioritisation/identification of disease gene variants, and identification of RNA N6-adenosine methylation from sequencing data. The first chapter briefly discusses the emergence and establishment of NGS technology as a core tool in biology, together with its current applications and perspectives. Chapter 2 introduces the problem of variant prioritisation in the context of Mendelian disease, where tens of thousands of potential candidates are generated by a typical sequencing experiment. Novel software developed for candidate gene prioritisation is described that utilises data mining of tissue-specific gene expression profiles (Chapter 3). The second part investigates an alternative approach to candidate variant prioritisation that leverages functional and phenotypic descriptions of genes and diseases from multiple biomedical domain ontologies (Chapter 4). Chapter 5 discusses N6-adenosine methylation, a recently re-discovered post-transcriptional modification of RNA. 
The core of the chapter describes novel software developed for transcriptome-wide detection of this epitranscriptomic mark from sequencing data. Chapter 6 presents a case study application of the software, reporting the previously uncharacterised RNA methylome of Kaposi’s Sarcoma Herpes Virus. The chapter further discusses a putative novel N6-methyl-adenosine RNA-binding protein and its possible roles in the progression of viral infection.

    Advances in Evolutionary Algorithms

    With the recent trends towards massive data sets and significant computational power, combined with advances in evolutionary algorithms, evolutionary computation is becoming much more relevant to practice. The aim of the book is to present recent improvements, innovative ideas and concepts in one part of the huge field of evolutionary algorithms (EA).

    Omics-based predictive and causative modeling of neurobehavioral traits

    Neurobehavioral disorders can be phenotypically and genetically complex, and are often diagnosed through observational study or subjective assessment alone. Certain neurobehavioral phenotypes, such as those caused by circadian rhythm related behavior, are biochemically well characterized; others do not yet have a well-understood genetic aetiology. Furthermore, circadian biology and psychological disorders are often intertwined. To advance our understanding of neurobehavioral trait/gene relationships, I first built a machine learning model that uses mouse transcriptomics to predict genes involved in circadian rhythms. Next, I used genome-wide association studies to model the causal influence of genetic exposure in humans to an evening chronotype on several mental health and social support traits, from depression to group religious participation. To more accurately model how neurobehaviors relate to one another, I mined psychological assessment instruments to build a species-agnostic psychological neurobehavior ontology encompassing autism and schizophrenia phenotypes. I then tested the utility of this ontology in clustering children on the autism spectrum based on phenotypic profiles. Lastly, I annotated genes to behaviors identified among subgroups through genome-wide association studies applied to phenotype profiles. This allowed for the gene prioritization of circadian-related experimentation results and the discovery of new, potentially causal relationships between chronotype and neurobehavioral traits. Finally, the semantic representation of schizophrenia endophenotypes in a consistent ontology framework enabled its application to the identification of novel gene-trait associations in humans. 
These contributions provide the scientific community with new knowledge of potential novel circadian functions for known genes and of the likely causal influence of chronotype on social and mental health. They also provide novel, robust ways of modeling the complex phenotypes of autism and schizophrenia patients, while annotating neurologically active genes to new behavioral traits for the first time.

    Self-identity and consumption : a study of consumer personality, brand personality, and brand relationship

    This thesis investigates the relationship between self-identity and consumption by discussing the conceptual and measurement issues of consumer personality, brand personality, and brand relationship. The investigation is based on the theories of personality, self-identity, and interpersonal relationship. The self-identity theories (Belk 1988; Cooley 1964; James 1890; Mead 1935) suggest that consumers may use brands to construct, maintain, and enhance their self-identities. Drawing on the literature of personality and self-identity, this thesis repositions the concept of personality for the context of consumption and relates it to self-identity (self-perception) rather than behaviour. This repositioning indicates that consumer personality and brand personality can be examined using the same personality concept. On the basis of the self-identity theories, a positive relationship is expected to exist between consumer personality and brand personality. Moreover, the interpersonal relationship theories (Aron et al. 1991; Rodin 1978; Thibaut and Kelley 1959) indicate that relationship partners become a part of the self-identity in a close relationship. Therefore, it is hypothesised that the closer consumers perceive the brand personality to be to their own personality (consumer-brand congruence), the better the brand relationship quality. This study applies a quasi-experiment in a field setting to examine the relationship among consumer personality, brand personality, and brand relationship. A 2 (high and low involvement) x 2 (high and low feeling) factorial design is used to explore the role of involvement and feeling in the relationship between self-identity and consumption. A total of 468 observations reveal that consumer and brand personality are strongly and positively related. The greater the consumer-brand congruence, the better the brand relationship. 
Minimal moderating effects of involvement and feeling are observed on the relationships between consumer personality and brand personality and between consumer-brand congruence and brand relationship quality. These findings suggest that consumers use brands from various product categories in different situations to maintain their self-identities. The study attempts to make contributions on the theoretical, methodological, and managerial levels. Theoretically, it clarifies the concepts of consumer personality and brand personality, and reaffirms the concept of brand relationship. In this way, some measurement issues of self-identity and brand personality are resolved. The findings suggest that brand personality can be used as a tool to investigate global markets and to facilitate market segmentation and communication. Finally, the limitations of the thesis are recognised and directions for future research are offered.

    Natural Language Processing: Emerging Neural Approaches and Applications

    This Special Issue highlights the most recent research in the NLP field and discusses related open issues. It focuses in particular on emerging approaches to language learning, understanding, production, and grounding, whether interactive or autonomous from data, in cognitive and neural systems, and on their potential or real applications in different domains.

    Pharmacogenomics of sickle cell disease therapeutics: pain and drug metabolism associated gene variants and hydroxyurea-induced post-transcriptional expression of miRNAs

    Sickle cell disease (SCD) is a common blood disease caused by a single nucleotide substitution (c.20A>T, p.Glu6Val) in the beta globin gene on chromosome 11. The prevalence of the disease is high throughout large areas of sub-Saharan Africa, the Mediterranean basin, the Middle East, and India because of the protection that the sickle cell trait provides against severe malaria. Approximately 300,000 infants are born per year with sickle cell anemia, which is defined as homozygosity for the sickle hemoglobin (HbS). The majority (nearly 75%) of these births occur in sub-Saharan Africa, particularly in two countries, Nigeria and the Democratic Republic of the Congo (DRC), where healthcare systems are poorly resourced. Early diagnosis, penicillin prophylaxis, blood transfusions, hydroxyurea (HU), and hematopoietic stem-cell transplantation can dramatically improve survival and quality of life for patients with SCD. However, our understanding of the role of genetic and clinical factors in explaining the complex phenotypic diversity of this disease is still limited. Early prediction of disease severity and of patients' responses to specific SCD therapeutics could lead to more precise treatment and management. Beyond well-known modifiers of disease severity, such as fetal hemoglobin (HbF) levels and α-thalassemia, other genetic variants might influence specific sub-phenotypes. New treatments and management strategies accounting for these genetic and non-genetic factors could substantially and rapidly improve the quality of life and reduce health care costs for patients with SCD. Patients with SCD are subjected to long-term administration of drugs, yet there are limited data on the pharmacogenomics of SCD therapeutics. Vaso-occlusive crises (VOC) are the main clinical events of SCD and are associated with recurrent and long-term use of analgesics/opioids and HU. 
This project aimed to investigate the clinical and genetic predictors of painful vaso-occlusive crises (VOC) among Cameroonian SCD patients by exploring pharmacokinetic determinants of treatment responses as well as post-transcriptional signatures triggered by hydroxyurea treatment, particularly miRNA expression. SCD patients were recruited from Yaounde Central Hospital and Laquintinie Hospital in Douala (Wonkam et al., 2018; Mnika et al., 2019 (b)), and recent migrant SCD patients from the DRC were recruited at the Haematology Clinic, Groote Schuur Hospital in Cape Town, South Africa (Mnika et al., 2019 (a); Mnika et al., 2019 (b)). Sociodemographic and clinical data were collected by means of a structured questionnaire. Patients' medical records were reviewed to extract their clinical features over the past 3 years. Specifically, the occurrences of VOC, hematological parameters, hospital outpatient visits, hospitalisation, overt strokes, blood transfusions, and administration of hydroxyurea were recorded. Height, weight, body mass index (BMI), and systolic and diastolic blood pressures (SBP and DBP) were measured. Detailed descriptions of patients and sampling methods used in the Cameroonian patients have been reported previously (Wonkam et al., 2018; Mnika et al., 2019 (a); Mnika et al., 2019 (b)). For the purpose of comparing variant frequencies, ethnically matched Cameroonian controls were randomly recruited from apparently healthy blood donors in Yaounde. All blood samples were collected for genomic characterisation and analysis. DNA was extracted from peripheral blood, following the instructions of a commercial kit [QIAamp DNA Blood Maxi Kit® (Qiagen, United States)]. Genotyping (TaqMan and MassArray) was performed for 40 variants in 17 pain-related genes, three fetal haemoglobin (HbF)-promoting loci, two kidney dysfunction-related genes, and HBA1/HBA2 genes for 436 patients. 
A subset of these samples was also genotyped to analyse 32 core and 267 extended pharmacogenes using the commercially available PharmacoScan® platform, in order to characterise pharmacokinetic determinants of response. We also compared the pharmacogene variants from these African groups with data extracted from the 1000 Genomes Project. Moreover, association studies were carried out between pharmacogene variants and SCD clinical variability. Additionally, the protein-protein interaction (PPI) network and enriched biological processes and pathways were investigated. For association studies, regression models analysing the 40 variants were fitted in R. For miRNA expression, total RNA was isolated using the miRNeasy kit according to the manufacturer's protocol (QIAGEN, Hilden, Germany) and sequenced by the Genomic and RNA Profiling Core at Baylor College of Medicine, United States, using the NanoString Platform (NanoString Technologies, Inc., Seattle, WA, United States), according to the manufacturer's instructions. Genes with statistically significant changes in expression were analysed using the Significance Analysis of Microarrays (SAM) tool. Female sex, body mass index, Hb/HbF, blood transfusions, leucocytosis and consultation or hospitalisation rates significantly correlated with VOC. Three pain-related gene variants correlated with VOC (CACNA2D3-rs6777055, P = 0·025; DRD2-rs4274224, P = 0·037; KCNS1-rs734784, P = 0·01). Five pain-related gene variants correlated with hospitalisation/consultation rates (COMT-rs6269, P = 0·027; FAAH-rs4141964, P = 0·003; OPRM1-rs1799971, P = 0·031; ADRB2-rs1042713, P < 0·001; UGT2B7-rs7438135, P = 0·037). The 3·7 kb HBA1/HBA2 deletion correlated with increased VOC (P = 0·002). HbF-promoting loci variants correlated with decreased hospitalisation (BCL11A-rs4671393, P = 0·026; HBS1L-MYB-rs28384513, P = 0·01). APOL1 G1/G2 correlated with increased hospitalisation (P = 0·048). 
A commercial genotyping array platform (PharmacoScan®) with 4627 markers located in 1191 genes was used to investigate 299 pharmacogenes (32 ADME core and 267 extended pharmacogenes). Based on the PharmacoScan analyses, no statistically significant differences in allele frequencies were detected between SCD cases and controls from Cameroon. A principal component analysis (PCA) revealed that the Cameroonian data clustered with data from other African populations but were significantly distinct from American, European and Asian population data. Variant allele frequencies in 21/32 core pharmacogenes were significantly different between the two SCD groups (Cameroon vs. Congo). No correlation between clinical variability and variants in the core genes was detected in either population under study. An association study of the core and extended PharmacoScan variants with VOC identified statistically significant associations between two single nucleotide polymorphisms (SNPs) and VOC after correction for multiple testing. These two SNPs mapped to 50 genes, with two SNPs located in core pharmacogenes (SLCO4A1-rs118042746, p = 1.21e-07; UGT1A10, UGT1A8-rs10176426, p = 1.22e-07). Functional enrichment analyses revealed that these 50 genes are involved in three biological processes and four pathways relevant to SCD pathophysiology, including xenobiotic glucuronidation (GO:0052697, p = 2.3e-03) and drug metabolism - other enzymes (p = 2.1e-02). Further analyses of the 50 genes identified key genes in human protein-protein networks: NTSR1, LRMDA, SMAD4 and CDH2. These four genes also interacted with three core pharmacogenes associated with VOC: UGT1A8, UGT1A10 and SLCO4A1. 
We found 22/798 miRNAs to be differentially expressed under HU treatment, with the majority (13/22) being functionally associated with HbF-regulatory genes, including BCL11A (miR-148b-3p, miR-32-5p, miR-340-5p, miR-29c-3p), MYB (miR-105-5p), KLF-3 (miR-106b-5), and SP1 (miR-29b-3p, miR-625-5p, miR-324-5p, miR-125a-5p, miR-99b-5p, miR-374b-5p, miR-145-5p). The present thesis started by highlighting the scarcity of studies investigating variable responses to pain in SCD patients and then proceeded to address this research gap. To our knowledge, this is the first body of work from Africa to provide evidence supporting the possible development of a genetic risk model for pain in SCD. This is also the first body of work to report an association between these two SNPs and VOC in core and extended pharmacogenes. Our data reveal that the commercial pharmacogene arrays investigated might need additional evidence of their appropriateness for African populations. This thesis therefore advocates investment in research exploring population-specific arrays, drug design, targeting, and efficacy, for improved clinical management of patients of African descent. Previous studies have investigated various mechanisms to understand the genomic variations affecting responses to HU, but a full understanding of the variable HU-mediated HbF production among individuals affected by SCD remains elusive. The present study showed that mechanisms of HbF production in response to HU could, in particular, be mediated through miRNA regulation. The data reveal some alternative perspectives and routes towards identifying new therapeutic targets and approaches for SCD. However, this study needs to be replicated in larger samples in multiple African populations.