317 research outputs found
Extracting Symptoms from Narrative Text using Artificial Intelligence
Indiana University-Purdue University Indianapolis (IUPUI)
Electronic health records collect an enormous amount of data about patients. However, the information about the patient’s illness is stored in progress notes that are in an unstructured format. It is difficult for humans to annotate symptoms listed in the free text. Recently, researchers have explored how advances in deep learning can be applied to process biomedical data. The information in the text can be extracted with the help of natural language processing. The research presented in this thesis aims at automating the process of symptom extraction. The proposed methods use pre-trained word embeddings such as BioWord2Vec, BERT, and BioBERT to generate vectors of the words based on the semantics and syntactic structure of sentences. BioWord2Vec embeddings are fed into a BiLSTM neural network with a CRF layer to capture the dependencies between correlated terms in the sentence. The pre-trained BERT and BioBERT embeddings are fed into the BERT model with a CRF layer to analyze the output tags of neighboring tokens. The research shows that with the help of the CRF layer in neural network models, longer phrases of symptoms can be extracted from the text. The proposed models are compared with the UMLS MetaMap tool, which uses various sources to categorize the terms in the text into different semantic types, and Stanford CoreNLP, a dependency parser that analyzes syntactic relations in the sentence to extract information. The performance of the models is analyzed using strict, relaxed, and n-gram evaluation schemes. The results show that BioBERT with a CRF layer can extract the majority of the human-labeled symptoms. Furthermore, the model is used to extract symptoms from COVID-19 tweets. The model was able to extract symptoms listed by CDC as well as new symptoms
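The abstract above names strict and relaxed evaluation schemes without defining them. As a minimal sketch, assuming the common convention that a strict match requires identical span boundaries and a relaxed match requires only some token overlap, the two schemes could be scored as follows. The function names, the span representation, and the toy spans are illustrative assumptions, not taken from the thesis.

```python
from typing import List, Tuple

# A span is (start, end) in token offsets, end exclusive (assumed representation).
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """True if the two spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def score(gold: List[Span], pred: List[Span], relaxed: bool = False):
    """Return (precision, recall, f1) under strict or relaxed matching."""
    if relaxed:
        # Relaxed: a span counts as matched if it overlaps any span on the other side.
        matched_pred = sum(any(overlaps(p, g) for g in gold) for p in pred)
        matched_gold = sum(any(overlaps(g, p) for p in pred) for g in gold)
    else:
        # Strict: boundaries must be identical.
        gold_set, pred_set = set(gold), set(pred)
        matched_pred = sum(p in gold_set for p in pred)
        matched_gold = sum(g in pred_set for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: two annotated symptom spans, three predicted spans.
gold_spans = [(0, 2), (5, 8)]
pred_spans = [(0, 2), (6, 8), (10, 11)]
strict_scores = score(gold_spans, pred_spans, relaxed=False)
relaxed_scores = score(gold_spans, pred_spans, relaxed=True)
```

Under these assumptions the partially overlapping prediction `(6, 8)` is rejected by the strict scheme and accepted by the relaxed one, which is why relaxed scores are never lower than strict scores on the same output.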
Natural Language Processing: Emerging Neural Approaches and Applications
This Special Issue highlights the most recent research being carried out in the NLP field to discuss related open issues, with a particular focus on both emerging approaches for language learning, understanding, production, and grounding interactively or autonomously from data in cognitive and neural systems, as well as on their potential or real applications in different domains
The genetics and pathophysiology of cluster headache and associated disorders
Cluster headache (CH) is described as one of the most painful conditions known to humans. It affects approximately 60,000 individuals in the UK and carries significant morbidity. It exhibits heritability, evidenced by reports of familial aggregation, and is categorised as a trigeminal autonomic cephalalgia (TAC). Despite this, the exact pathophysiological and genetic drivers of this condition remain elusive. The purpose of this thesis is to examine the clinical and genetic determinants of CH, and thus gain insights into the underlying neurobiological mechanisms. This work consists of two components. In the first section, I conduct clinical observational studies to further delineate the CH phenotype. I address the postulated association between pituitary adenomas and CH and question the utility of dedicated pituitary imaging in this patient group. I also describe the largest series of Post-Traumatic Headache of Cluster Headache (PTH-CH) and demonstrate its distinct features and increased intractability to treatment. Finally, through meta-analysis, I estimate the prevalence of familial CH to be 6.27% and demonstrate an overlap with concurrent short-lasting unilateral neuralgiform headache attacks with conjunctival injection and tearing (SUNCT) in familial cases. The second section explores the genetic architecture underlying CH. I perform a Genome-Wide Association Study (GWAS) to identify replicable susceptibility loci and conduct a downstream analysis. Subsequent genetic correlation analysis showed an overlap with migraine, depression, bipolar disorder and sleep disturbance, implying the possibility of a common genetic driver for these conditions, which frequently present concurrently. I then carry out linkage analysis in CH families and replicate a linked region suggestive of significance on chromosome 2 that also overlaps a genome-wide significant locus.
Finally, I perform whole-exome sequencing and utilise rare variant association tests and segregation analysis to identify causal variants for familial CH
Neurobiological mechanisms of heterogeneous nuclear ribonucleoprotein H1 in methamphetamine stimulant and addictive behaviors
Addiction to psychostimulants such as methamphetamine (MA) is a significant public health issue in the United States with no FDA-approved pharmacological interventions. MA addiction is a heritable neuropsychiatric disorder; however, its genetic basis is almost entirely unknown. Available human genome-wide association studies (GWAS) lack sufficient power to detect the influence of common genetic variation on the risk of addiction. Mammalian model organisms offer an attractive alternative to more rapidly uncover novel genetic factors that contribute to addiction-relevant neurobehavioral traits. Using quantitative trait locus (QTL) mapping in mice, we identified a locus on chromosome 11 that contributed to a decrease in sensitivity to the locomotor stimulant properties of MA. To fine map this QTL, we generated interval-specific congenic lines and deduced a 206 kb critical interval on chromosome 11 that contained only two protein coding genes (Rufy1 and Hnrnph1). Replicate mouse lines heterozygous for Transcription Activator-like Effector Nucleases (TALENs)-induced frameshift deletions in Hnrnph1 (Hnrnph1+/-), but not in Rufy1 (Rufy1+/-), recapitulated the decrease in MA sensitivity observed in congenic mice, thus identifying Hnrnph1 as a novel quantitative trait gene for MA sensitivity. Hnrnph1, an RNA-binding protein, has not previously been identified in human GWAS of neuropsychiatric disorders but has been implicated in mu-opioid receptor splicing associated with heroin dependence.
The primary objectives of this dissertation are to (1) detail the forward genetic and reverse genetic approaches taken to identify Hnrnph1 as a quantitative trait gene for MA sensitivity; (2) assess the MA addiction-relevant behaviors presented by Hnrnph1+/- mice through conditioned place preference (CPP) and oral self-administration procedures; and (3) identify the neurobiological mechanisms through which Hnrnph1 affects behavior via transcriptome, immunohistochemical and neurochemical assessments of the mesocorticolimbic dopamine circuit. Overall, Hnrnph1+/- mice display increased dopaminergic innervation and MA dose-dependent dopamine release in nucleus accumbens, which could underlie reduced drug sensitivity, reward, and reinforcement. The results of this thesis provide substantial evidence to implicate Hnrnph1 in MA addiction
Exploring the genetic contribution to idiopathic Parkinson disease
Background: Parkinson disease (PD) is a major cause of death and disability and has a devastating global socioeconomic impact. It affects 1-2% of the population above the age of 65 and its prevalence increases as the population ages. Several biological processes have been implicated in Parkinson disease, including mitochondrial dysfunction, aberrant protein clearance, and neuroinflammation. The degree to which these processes are cause, effect, or bystander in disease initiation and progression remains largely unknown. With only a limited understanding of the mechanisms underlying the pathogenesis and pathophysiology of Parkinson disease, we are unable to develop disease-modifying therapies, and patients face a future of progressive disability and premature death.
There is a clear hereditary component to idiopathic PD, established through both twin studies and genome-wide association studies. However, only a minor fraction of the total estimated heritability can be explained by known associated genetic variability. It has been hypothesized that the cumulative effects of rare, low-impact mutations spread across genes and biological pathways could explain some of this “missing heritability”.
Aims: The aim of this work was to explore the genetic contribution to idiopathic PD, focusing on the cumulative effects of rare mutations.
Materials and methods: The main study population utilized in all four papers was the ParkWest cohort, a Norwegian population-based cohort of incident PD. In paper I, ParkWest provided both cases and controls, including clinical longitudinal data up to and including 7 years after baseline. All ParkWest cases were whole-exome sequenced and combined with previously sequenced control samples to form the genetic cohort utilized in papers II-IV. Additionally, a whole-exome sequencing cohort from the Parkinson Progression Markers Initiative was used in papers II-IV. Finally, a publicly available chip-genotyped dataset (NeuroX) from the International Parkinson’s Disease Genomics Consortium was used as a replication cohort in paper IV. In paper I, we characterized the familial aggregation of Parkinson disease in the ParkWest cohort and explored the effect of family history on disease progression. Subsequently, we used genetic data from multiple cohorts to assess the impact of rare, protein-altering mutations in mitochondrial biological pathways (paper III) and in genes previously linked to PD (papers II and IV).
Results and conclusions: We show that, while familial aggregation is present in our Norwegian cohort, this has a slightly lower effect size compared to previous studies. Through regression analysis we also show that having a family history of PD among first degree relatives is associated with a slightly milder phenotype, which may be due to genetic variability.
In paper II, we attempted to replicate the results of a recently published study reporting an association between genetic variation in the TRAP1 gene and Parkinson disease. Our analyses did not replicate this association in our Norwegian cohort. Moreover, using stricter quality control parameters abolished the association in the same dataset used in the original study. Our results do not support the proposed role of TRAP1 in idiopathic PD.
In paper III, we sought to investigate the role of rare, amino acid changing variation in molecular pathways related to mitochondrial function. Using the sequence kernel association test (SKAT), we detected a statistically significant enrichment in the pathway of mitochondrial DNA maintenance. Impaired mitochondrial DNA homeostasis has previously been shown to be present in PD neurons, and our results indicate that this dysfunction could be partly mediated by inherited genetic mutations.
In paper IV, we performed a targeted single gene and gene-set association study on genes that had previously been implicated in PD through genome-wide association studies. We identified 303 genes of interest, but did not find statistically significant associations, either in the single gene or gene-set analyses. Our results therefore do not support a major role for rare variant enrichment in genes tagged by GWAS, but cannot rule out effects with small effect sizes.
Genomic characterization of a novel leporid Herpes simplex virus
The viral family Herpesviridae consists of large double stranded DNA viruses including eight species that infect humans with varying pathology from benign rashes to cancerous cell transformation. Of its three subfamilies, alpha-, beta- and gammaherpes, the alphaherpes contains the genera iltovirus, mardivirus, varicellovirus and simplex. Two members of the simplex genus, the human simplex viruses 1 and 2 (HSV), induce life-long infections and appear to have coevolved with their hosts from the origins of our species. Unique features of the simplex genus are latency, tropism in dorsal root ganglia neurons, extraordinarily high GC content ranging from 65 to 77%, and nucleosome formation of their genomes within the host's nucleus without integration. The basic molecular and genetic characteristics of herpes simplex are reviewed in Chapter 1, followed by the introduction of a newly sequenced, de novo assembled and predictively annotated herpes simplex virus, Leporid Herpes Virus-4 (LHV4). Isolated from a virulent outbreak in domesticated rabbits, LHV4 has the smallest reported simplex virus genome to date at roughly 125,600 base pairs and presents pathology similar to that seen in rabbit models infected with HSV. Comparative genomics revealed a high degree of sequence similarity and genome synteny between LHV4 and other simplex viruses. Four genes were not computationally predicted in our annotation and may be absent in the LHV4 genome. The absent proteins correspond to UL56, ICP34.5, US5 and US12, which have postulated roles in membrane trafficking, neurovirulence, apoptotic control and MHC I presentation respectively. The solved genome structure raises the questions of how this compacted genome functions with the noted absences to produce a similar pathology in rabbits to that of HSV, and whether other biological correlates will continue to be found in in vitro and in vivo infection.
The inverted repeat regions (IR), duplicated and inverted relative to the simplex virus's two larger blocks of protein-coding regions, are described in Chapter 3. The similarities and differences in critical genes from the IR that balance latency and replicative viral cycles are compared. A two-fold reduction in IR content indicates that a simplex virus can maintain infectivity despite this large truncation. The appendix describes the eukaryotic phylogeny of two initiating proteins of the mismatch repair (MMR) pathway. MMR proteins are present in the replicative foci of productive herpes virus infection and this analysis may indicate adaptive pressures involved in both genomic fidelity and host tropism. The emerging era of state-of-the-art genome sequencing and computational power advances this newly characterized herpes virus, along with its model host organism, as excellent candidates for systems interaction and experimental biology
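The abstract above cites GC content of 65 to 77% as a distinguishing feature of simplex virus genomes. As a minimal sketch of how such a figure is computed, GC content is the percentage of unambiguous bases that are G or C. The function name and the toy sequence are illustrative assumptions; the sequence is not LHV4 data.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are skipped rather than counted; this is an
    assumed convention, and other tools may handle them differently.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)


# Toy sequence, made up for illustration only.
toy_percent = gc_content("GGCCAT")
```

For a whole genome the same calculation would be applied to the assembled sequence read from a FASTA file, optionally in sliding windows to show local variation.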
Discovering circadian clocks in microbes
We humans experience the influence of our circadian clock every day. This clock mechanism causes, for example, jet lag during transatlantic air travel. We now believe that almost all organisms have developed a circadian clock mechanism. In this thesis I describe the analysis techniques we developed and the newly discovered molecular components of a circadian mechanism in Saccharomyces cerevisiae and Bacillus subtilis. To identify these molecular components, I applied structured zeitgebers, i.e. light and temperature cycling, to yeast and bacillus cultures, in conjunction with bioinformatic in silico approaches. In Bacillus biofilm populations, we found a free-running rhythm of ytvA and KinC activity of nearly 24 hours after entrainment and release to constant dark and temperature conditions. The free-running oscillations are temperature compensated. This is one of the most important features of a circadian clock mechanism, making it very likely that such a system exists in B. subtilis. We found in yeasts that temperature appears to mainly regulate metabolic processes. Light appears to act more indirectly via photo-oxidation of mitochondrial cytochromes. Finally, I present a hypothetical model for an integrated circadian clock mechanism in unicellular microbes with an emphasis on S. cerevisiae. This mechanism involves several metabolic pathways and the main regulator is the stress sensitive transcriptional activator Msn2p. The model shows that in the circadian clock mechanism in yeast, energy metabolism appears to be an important theme. Other relevant processes are the metabolism of nitrogen compounds, oxidation-reduction processes and fatty acid metabolism. All could serve as a starting point for further research on the circadian clock in yeast
In vitro expanded human CD4+CD25+ regulatory T cells suppress effector T cell proliferation.
Regulatory T cells (Tregs) have been shown to be critical in the balance between autoimmunity and tolerance and have been implicated in several human autoimmune diseases. However, the small number of Tregs in peripheral blood limits their therapeutic potential. Therefore, we developed a protocol that would allow for the expansion of Tregs while retaining their suppressive activity. We isolated CD4+CD25 hi cells from human peripheral blood and expanded them in vitro in the presence of anti-CD3 and anti-CD28 magnetic Xcyte Dynabeads and high concentrations of exogenous Interleukin (IL)-2. Tregs were effectively expanded up to 200-fold while maintaining surface expression of CD25 and other markers of Tregs: CD62L, HLA-DR, CCR6, and FOXP3. The expanded Tregs suppressed proliferation and cytokine secretion of responder PBMCs in co-cultures stimulated with anti-CD3 or alloantigen. Treg expansion is a critical first step before consideration of Tregs as a therapeutic intervention in patients with autoimmune or graft-versus-host disease