
    Ventricular arrhythmias classification and onset determination system

    Accurately differentiating between ventricular fibrillation (VF) and ventricular tachycardia (VT) episodes is crucial: a missed or wrong interpretation can lead to needless shocks that damage the patient's heart. Beyond classifying VT and VF correctly, determining the onset of a ventricular arrhythmia in advance is also important, since it allows more efficient patient monitoring and can potentially save lives. This research therefore focuses on developing a Classification and Onset Determination System (CODS) that classifies, tracks and monitors ventricular arrhythmias using the Second Order Dynamic Binary Decomposition (SOD-BD) technique. Two significant characteristics (the natural frequency and the input parameter) were extracted from electrocardiogram (ECG) signals provided by the PhysioBank database and analysed for significant differences between the arrhythmia types, so that the ECGs could be classified accordingly (N, VT and VF). The extracted parameters were also used to locate the onset of ventricular arrhythmia, which is useful for predicting the occurrence of heart abnormalities. All ECG analysis, parameter extraction, classification techniques and the CODS itself were developed in LabVIEW.
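    The abstract does not spell out the SOD-BD computation itself. Purely as an illustration of the general idea, a second-order model can be fitted to each ECG segment and a natural frequency derived from its poles; the sketch below (the AR(2) fit, function names and classification thresholds are all assumptions, not the published method) shows one way this could look:

```python
import numpy as np

def natural_frequency(segment, fs):
    """Fit a 2nd-order autoregressive model to an ECG segment and
    derive a natural frequency (Hz) from the dominant pole."""
    x = np.asarray(segment, dtype=float)
    x = x - x.mean()
    # Least-squares AR(2) fit: x[n] ~= a1*x[n-1] + a2*x[n-2]
    X = np.column_stack([x[1:-1], x[:-2]])
    a1, a2 = np.linalg.lstsq(X, x[2:], rcond=None)[0]
    # Poles of z^2 - a1*z - a2 = 0, mapped to continuous time: s = fs*ln(z)
    poles = np.roots([1.0, -a1, -a2]).astype(complex)
    s = fs * np.log(poles[np.argmax(np.abs(poles))])
    return abs(s) / (2 * np.pi)

def classify(segment, fs, vt_min=2.5, vf_min=5.0):
    """Hypothetical frequency thresholds, for illustration only (N / VT / VF)."""
    fn = natural_frequency(segment, fs)
    if fn >= vf_min:
        return "VF"
    return "VT" if fn >= vt_min else "N"
```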

    Pattern recognition beyond classification: An abductive framework for time series interpretation

    Time series interpretation aims to provide an explanation of what is observed in terms of its underlying processes. The present work is based on the assumption that common classification-based approaches to time series interpretation suffer from a set of inherent weaknesses, whose ultimate cause lies in the monotonic nature of the deductive reasoning paradigm. In this thesis we propose a new approach to the problem, based on the initial hypothesis that abductive reasoning properly accounts for the human ability to identify and characterize the patterns appearing in a time series. The result of this interpretation is a set of conjectures in the form of observations, organized into an abstraction hierarchy, that explain what has been observed. A knowledge-based framework and a set of algorithms for the interpretation task are provided, implementing a hypothesize-and-test cycle guided by an attentional mechanism. As a representative application domain, interpretation of the electrocardiogram allows us to highlight the strengths of the present approach in comparison with traditional classification-based approaches.
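    As a rough, domain-agnostic illustration of the hypothesize-and-test cycle described above (all names, the scoring interface and the acceptance threshold are hypothetical, not taken from the thesis), a loop of this kind repeatedly focuses attention on an unexplained finding, abduces candidate hypotheses for it, tests them against the series and keeps the best-scoring one:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    """A conjectured observation explaining part of the series."""
    pattern: str        # e.g. "P-wave", "QRS-complex"
    start: int
    end: int
    score: float = 0.0  # degree of fit against the evidence

def interpret(series, abduce, test, focus, max_cycles=100):
    """Generic hypothesize-and-test loop.

    abduce(finding)            -> candidate Hypothesis objects
    test(hypothesis, series)   -> score in [0, 1]
    focus(series, explained)   -> next unexplained finding, or None
    """
    explained, accepted = [], []
    for _ in range(max_cycles):
        finding = focus(series, explained)   # attentional mechanism
        if finding is None:
            break                            # everything is covered
        candidates = list(abduce(finding))
        for h in candidates:
            h.score = test(h, series)
        best = max(candidates, key=lambda h: h.score, default=None)
        if best is not None and best.score > 0.5:
            accepted.append(best)
        explained.append(finding)
    return accepted
```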

    The Application of Computer Techniques to ECG Interpretation

    This book presents some of the latest available information on automated ECG analysis, written by many of the leading researchers in the field. It contains a historical introduction, an outline of the latest international standards for signal processing and communications, and an exciting variety of studies on electrophysiological modelling, ECG imaging, artificial intelligence applied to resting and ambulatory ECGs, body surface mapping, big data in ECG-based prediction, enhanced reliability of patient monitoring, and atrial abnormalities on the ECG. It provides an extremely valuable contribution to the field.

    Literature-aided interpretation of gene expression data with the weighted global test

    Most methods for the interpretation of gene expression profiling experiments rely on the categorization of genes, as provided by the Gene Ontology (GO) and pathway databases. Because of the manual curation process, such databases are never up to date and tend to be limited in focus and coverage. Automated literature mining tools provide an attractive alternative approach. We review how they can be employed for the interpretation of gene expression profiling experiments. We illustrate that their comprehensive scope aids the interpretation of data from domains poorly covered by GO or alternative databases, and allows gene expression to be linked with diseases, drugs, tissues and other types of concepts. A framework for proper statistical evaluation of the associations between gene expression values and literature concepts was lacking and is now implemented in a weighted extension of the global test. The weights are the literature association scores and reflect the importance of a gene for the concept of interest. In a direct comparison with classical GO-based gene sets, we show that the use of literature-based associations results in the identification of much more specific GO categories. We demonstrate the possibilities for linking gene expression data to patient survival in breast cancer and to the action and metabolism of drugs. Coupling with online literature mining tools ensures transparency and allows further study of the identified associations. Literature mining tools are therefore powerful additions to the toolbox for the interpretation of high-throughput genomics data.
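    The weighted global test itself extends Goeman's global test; as a simplified stand-in for the idea, the sketch below scores a gene set by a literature-weighted sum of squared gene-phenotype covariances and assesses it with a permutation p-value (the function names and exact statistic are illustrative assumptions, not the published method):

```python
import numpy as np

def weighted_association_score(expr, y, weights):
    """Score-type statistic: weighted sum of squared gene-phenotype
    covariances, with weights taken from literature association scores.
    expr: samples x genes, y: phenotype per sample, weights: one per gene."""
    yc = y - y.mean()
    cov = expr.T @ yc                  # per-gene association with the phenotype
    return float(np.sum(weights * cov ** 2))

def permutation_pvalue(expr, y, weights, n_perm=1000, seed=0):
    """Empirical p-value obtained by permuting the phenotype labels."""
    rng = np.random.default_rng(seed)
    observed = weighted_association_score(expr, y, weights)
    null = [weighted_association_score(expr, rng.permutation(y), weights)
            for _ in range(n_perm)]
    return (1 + sum(s >= observed for s in null)) / (n_perm + 1)
```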

    Radiofrequency ablation planning for cardiac arrhythmias by combining modelling and machine learning

    Cardiac arrhythmias are heart rhythm disruptions which can lead to sudden cardiac death and require a deeper understanding for appropriate treatment planning. In this thesis, we integrate personalized structural and functional data into a 3D tetrahedral mesh of the biventricular myocardium. The Mitchell-Schaeffer (MS) simplified biophysical model is then used to study the spatial heterogeneity of electrophysiological (EP) tissue properties and their role in arrhythmogenesis. Radiofrequency ablation (RFA) with the elimination of local abnormal ventricular activities (LAVA) has recently arisen as a potentially curative treatment for ventricular tachycardia, but the EP studies required to locate LAVA are lengthy and invasive. LAVA are commonly found within the heterogeneous scar, which can be imaged non-invasively with 3D delayed-enhanced magnetic resonance imaging (DE-MRI). We evaluate the use of advanced image features in a random forest machine learning framework to identify areas of LAVA-inducing tissue. Furthermore, we detail the dataset's inherent error sources and their formal integration into the training process. Finally, we construct MRI-based structural patient-specific heart models and couple them with the MS model. We model a recording catheter using a dipole approach and generate distinct normal and LAVA-like electrograms at locations where they have been found in clinics. This enriches our predictions of the locations of LAVA-inducing tissue obtained through image-based learning. Confidence maps can be generated and analyzed prior to RFA to guide the intervention. These contributions have led to promising results and proofs of concept.
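    At a high level, the image-based learning step is supervised classification of DE-MRI-derived features, with label uncertainty folded in as sample weights. A minimal scikit-learn sketch (the feature set, labels and confidence weights below are random placeholders, not the study's data) could look like this:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# One row per myocardial site (e.g. local intensity statistics, distance to
# scar core, wall thickness); label 1 if LAVA were recorded nearby during the
# EP study, else 0. All values here are synthetic placeholders.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 6))
labels = rng.integers(0, 2, size=500)
# Sample weights can down-weight sites whose labels are least certain
# (e.g. large catheter-to-MRI registration error), as described above.
label_confidence = rng.uniform(0.5, 1.0, size=500)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(features, labels, sample_weight=label_confidence)

# Per-site probability of LAVA-inducing tissue -> basis of a confidence map
lava_probability = forest.predict_proba(features)[:, 1]
```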

    Learning Biosignals with Deep Learning

    The healthcare system, ubiquitously recognized as one of the most influential systems in society, has faced new challenges since the start of the decade. The myriad of physiological data generated by individuals, namely within the healthcare system, places a burden on physicians and reduces the effectiveness of patient data collection. Information systems, and in particular novel deep learning (DL) algorithms, offer a way to tackle this problem. This thesis aims to contribute to biosignal research and industry by presenting DL solutions that can empower the field. To this end, an extensive study of how to incorporate and implement Convolutional Neural Networks (CNN), Recursive Neural Networks (RNN) and fully connected networks in biosignal studies is discussed. Different architecture configurations were explored for signal processing and decision making and were implemented in three different scenarios: (1) biosignal learning and synthesis; (2) electrocardiogram (ECG) biometric systems; and (3) ECG anomaly detection systems. In (1), an RNN-based architecture was able to replicate three types of biosignals autonomously with a high degree of confidence. In (2), three CNN-based architectures and an RNN-based architecture (the same used in (1)) were applied to biometric identification, reaching accuracies above 90% for electrode-based datasets (Fantasia, ECG-ID and MIT-BIH) and 75% for an off-the-person dataset (CYBHi), and to biometric authentication, achieving Equal Error Rates (EER) near 0% for Fantasia and MIT-BIH and below 4% for CYBHi. In (3), a model of the healthy, clean ECG signal was abstracted and deviations from it were detected in two scenarios: presence of noise, using an autoencoder and a fully connected network (reaching 99% accuracy for binary classification and 71% for multi-class); and arrhythmia events, by adding an RNN to the previous architecture (57% accuracy and 61% sensitivity). In sum, these systems are shown to be capable of producing novel results. The incorporation of several AI systems into one could prove to be the next generation of preventive medicine: with access to different physiological and anatomical states, such systems could produce better-informed solutions and increase the performance of autonomous preventive systems usable in everyday life, including remote places where access to medicine is limited. These systems will also help the study of signal behaviour in real-life contexts, as explainable AI could link the inner states of a network with biological traits.
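    The exact network configurations are not given in this summary. As a minimal sketch of the anomaly-detection scenario, a small fully connected autoencoder can be trained on clean ECG segments and a segment flagged when its reconstruction error exceeds a threshold (the layer sizes, training loop and threshold below are assumptions, not the thesis architecture):

```python
import torch
from torch import nn

class ECGAutoencoder(nn.Module):
    """Small fully connected autoencoder over fixed-length ECG segments."""
    def __init__(self, segment_len=360):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(segment_len, 64), nn.ReLU(),
            nn.Linear(64, 16), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Linear(16, 64), nn.ReLU(),
            nn.Linear(64, segment_len))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train(model, clean_segments, epochs=20, lr=1e-3):
    """Train on clean (healthy) segments only, as in the scenario above."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(clean_segments), clean_segments)
        loss.backward()
        opt.step()
    return model

def is_anomalous(model, segment, threshold=0.05):
    """Flag a segment whose reconstruction error exceeds a chosen threshold."""
    with torch.no_grad():
        err = torch.mean((model(segment) - segment) ** 2).item()
    return err > threshold
```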

    Ontology-based methods for disease similarity estimation and drug repositioning

    Thesis (Ph.D.), School of Computing and Engineering and Department of Mathematics and Statistics, University of Missouri-Kansas City, 2012. Dissertation advisor: Deendayal Dinakarpandian. Includes bibliographic references (p. 174-181).
    Human genome sequencing and new biological data generation techniques have provided an opportunity to uncover mechanisms in human disease. Using gene-disease data, recent research has increasingly shown that many seemingly dissimilar diseases have similar or common molecular mechanisms. Understanding similarity between diseases aids early disease diagnosis and the development of new drugs. The growing collection of gene-function and gene-disease data has created a need for formal knowledge representation in order to extract information. Ontologies have been successfully applied to represent such knowledge, and data mining techniques have been applied to them to extract information. Informatics methods can be used with ontologies to find similarity between diseases, which can yield insight into how they are caused. This can lead to therapies which actually cure diseases rather than merely treating symptoms. Estimating disease similarity solely on the basis of shared genes can be misleading, as variable combinations of genes may be associated with similar diseases, especially for complex diseases. This deficiency can potentially be overcome by looking for common or similar biological processes rather than only explicit gene matches between diseases. Using semantic similarity between biological processes to estimate disease similarity could enhance the identification and characterization of disease similarity, besides identifying novel biological processes involved in the diseases. Also, if diseases have similar molecular mechanisms, then drugs that are currently in use could potentially be applied to diseases beyond their original indication. This can greatly benefit patients with diseases that lack adequate therapies, especially people with rare diseases, and can drastically reduce healthcare costs, as developing new drugs is far more expensive than re-using existing ones. In this research we present functions to measure similarity between terms in an ontology, and between entities annotated with terms drawn from the ontology, based on both co-occurrence and information content. The new similarity measure is shown to outperform existing methods using biological pathways. The similarity measure is then used to estimate similarity among diseases using the biological processes involved in them, and is evaluated against a manually curated dataset and external datasets with known disease similarities. Further, we use ontologies to encode diseases, drugs and biological processes and demonstrate a method that uses a network-based algorithm to combine biological data about diseases with drug information to find new uses for existing drugs. The effectiveness of the method is demonstrated by comparing the predicted new disease-drug pairs with existing drug-related clinical trials.
    Contents: Introduction and motivation; Ontologies in the biomedical domain; Methods to compute ontological similarity; Proposed approach for ontological term similarity; Augmentation of vocabulary and annotation in ontologies; Estimation of disease similarity; Use of ontologies for drug repositioning; Future directions, a perspective from the pharmaceutical industry; Appendix 1: Table of ontological similarity scores; Appendix 2: Test set of 200 records for evaluating mapping of disease text to Disease Ontology; Appendix 3: Curated set of disease similarities used as the benchmark set; Appendix 4: F-scores for different combinations of Score-Pvalues and GO-Process-Pvalues for PSB estimates of disease similarity; Appendix 5: Test set formed from opinions of medical residents (http://rxinformatics.umn.edu/SemanticRelatednessResources.html); Appendix 6: Drug repositioning candidate
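    The dissertation's similarity function combines co-occurrence with information content; as a minimal illustration of the information-content half, a Resnik-style similarity over a toy is-a hierarchy (the terms and annotation counts below are invented for the example, not taken from the thesis) can be computed as follows:

```python
import math

# Toy is-a hierarchy: child -> parents (a stand-in for GO biological process)
PARENTS = {
    "apoptotic process": ["cell death"],
    "necrotic cell death": ["cell death"],
    "cell death": ["biological_process"],
    "biological_process": [],
}
# Annotation counts per term (including annotations to descendants)
COUNTS = {"apoptotic process": 40, "necrotic cell death": 10,
          "cell death": 60, "biological_process": 200}

def ancestors(term):
    """All ancestors of a term, including the term itself."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS.get(t, []))
    return seen

def information_content(term):
    """IC(t) = -log p(t), with p estimated from annotation frequency."""
    return -math.log(COUNTS[term] / COUNTS["biological_process"])

def resnik_similarity(t1, t2):
    """IC of the most informative common ancestor of the two terms."""
    common = ancestors(t1) & ancestors(t2)
    return max(information_content(t) for t in common)

print(resnik_similarity("apoptotic process", "necrotic cell death"))
```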

    Systematising and scaling literature curation for genetically determined developmental disorders

    The widespread availability of genomic sequencing has transformed the diagnosis of genetically determined developmental disorders (GDD). However, this type of test often generates a number of genetic variants, which have to be reviewed and related back to the clinical features (phenotype) of the individual being tested. This frequently entails a time-consuming review of the peer-reviewed literature to look for case reports describing variants in the gene(s) of interest, particularly for newly described and/or very rare disorders not covered in phenotype databases. There is therefore a need for scalable, automated literature curation to increase the efficiency of this process. This should improve the speed with which a diagnosis is made and increase the number of individuals who are diagnosed through genomic testing. Phenotypic data in case reports and case series are not usually recorded in a standardised, computationally tractable format. Plain-text descriptions of similar clinical features may be recorded in several different ways; for example, a technical term such as ‘hypertelorism’ may be recorded as its synonym ‘widely spaced eyes’. In addition, case reports are found across a wide range of journals, with different structures and file formats for each publication. The Human Phenotype Ontology (HPO) was developed to store phenotypic data in a computationally accessible format. Several initiatives have been developed to link diseases to phenotype data in the form of HPO terms. However, these rely on manual expert curation, are therefore not inherently scalable, and cannot be updated automatically. Methods of extracting phenotype data from text at scale developed to date have relied on abstracts or open-access papers. At the time of writing, Europe PubMed Central (EPMC, https://europepmc.org/) contained approximately 39.5 million articles, of which only 3.8 million were open access. There is therefore likely a significant volume of phenotypic data which has not previously been used at scale, because of difficulties accessing non-open-access manuscripts. In this thesis, I present a method for literature curation which can utilise all relevant published full text through a newly developed package which can download almost all manuscripts licensed by a university or other institution. This is scalable to the full spectrum of GDD. Using manuscripts identified through manual literature review, I use a full-text download pipeline and NLP (natural language processing) based methods to generate disease models, comprising HPO terms weighted according to their frequency in the literature. I demonstrate iterative refinement of these models, and use a custom annotated corpus of 50 papers to show that the text mining process has high precision and recall. I demonstrate that these models clinically reflect true disease expressivity, as defined by manual comparison with expert literature reviews, for three well-characterised GDD. I compare these disease models to those in the most commonly used genetic disease phenotype databases. I show that the automated disease models have increased depth of phenotyping, i.e. they contain more terms than the manually generated models. I show that, in comparison to ‘real life’ prospectively gathered phenotypic data, automated disease models outperform existing phenotype databases in predicting diagnosis, as defined by an increased area under the curve (by 0.05 and 0.08 using different similarity measures) on ROC curve plots.
    I present a method for automated PubMed search at scale, to use as input for disease model generation. I annotated a corpus of 6500 abstracts, and using this corpus I show high precision (up to 0.80) and recall (up to 1.00) for machine learning classifiers used to identify manuscripts relevant to GDD. These classifiers use hand-picked domain-specific features, for example specific MeSH terms. This method can be used to scale automated literature curation to the full spectrum of GDD. I also present an analysis of the phenotypic terms used in one year of GDD-relevant papers in a prominent journal, which shows that using supplemental data and parsing the clinical report sections of manuscripts is likely to yield more patient-specific phenotype extraction in future. In summary, I present a method for automated curation of full text from the peer-reviewed literature in the context of GDD, and demonstrate that it is robust, reflects clinical disease expressivity, outperforms existing manual literature curation, and is scalable. Applying this process to clinical testing in future should improve the efficiency and accuracy of diagnosis.
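    As a minimal illustration of the disease-model idea described above, HPO terms can be weighted by how often they appear across curated papers, and a patient's terms scored by weighted overlap against each model (the HPO IDs, papers and scoring rule below are illustrative assumptions, not the thesis pipeline):

```python
from collections import Counter

def build_disease_model(papers_hpo_terms):
    """Weight each HPO term by the fraction of curated papers mentioning it."""
    counts = Counter(term for paper in papers_hpo_terms for term in set(paper))
    n_papers = len(papers_hpo_terms)
    return {term: c / n_papers for term, c in counts.items()}

def score_patient(patient_terms, disease_model):
    """Simple weighted-overlap score between patient HPO terms and a model."""
    return sum(disease_model.get(term, 0.0) for term in set(patient_terms))

# Illustrative HPO IDs only
papers = [["HP:0000252", "HP:0001250"],   # paper 1: microcephaly, seizures
          ["HP:0001250", "HP:0000486"],   # paper 2: seizures, strabismus
          ["HP:0000252", "HP:0001250"]]   # paper 3: microcephaly, seizures
model = build_disease_model(papers)
print(score_patient(["HP:0001250", "HP:0000252"], model))
```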