8,382 research outputs found

    Machine Learning Approaches for the Prioritisation of Cardiovascular Disease Genes Following Genome-wide Association Study

    Genome-wide association studies (GWAS) have revealed thousands of genetic loci and have established themselves as a valuable method for unravelling the complex biology of many diseases. Although GWAS have grown in size and improved in study design to detect effects, identifying real causal signals and disentangling them from other highly correlated markers in linkage disequilibrium (LD) remains challenging. This has severely limited GWAS findings and brought the method's value into question. Although thousands of disease susceptibility loci have been reported, the causal variants and genes at these loci remain elusive. Post-GWAS analysis aims to dissect the heterogeneity of variant and gene signals. In recent years, machine learning (ML) models have been developed for post-GWAS prioritisation. These models range from logistic regression to more complex ensemble models such as random forests and gradient boosting, as well as deep learning models (i.e., neural networks). When combined with functional validation, these methods have yielded important translational insights, providing a strong evidence-based approach to direct post-GWAS research. However, ML approaches are in their infancy across biological applications, and as they continue to evolve, an evaluation of their robustness for GWAS prioritisation is needed. Here, I investigate the landscape of ML across selected models, input features, bias risk, and output model performance, with a focus on building a prioritisation framework that is applied to blood pressure GWAS results and tested by re-application to blood lipid traits.

    Evaluation of different segmentation-based approaches for skin disorders from dermoscopic images

    Bachelor's degree final projects in Biomedical Engineering. Faculty of Medicine and Health Sciences. Universitat de Barcelona. Academic year: 2022-2023. Tutor/Director: Sala Llonch, Roser; Mata Miquel, Christian; Munuera, Josep.
    Skin cancer is the most common type of cancer in the world, and its incidence has been increasing over the past decades. Even with the most complex and advanced technologies, current image acquisition systems do not permit reliable identification of a skin lesion by visual examination, owing to the challenging structure of the malignancy. This creates a need for automatic skin lesion segmentation methods that assist physicians' diagnosis when determining the lesion's region and that serve as a preliminary step for classifying the lesion. Accurate and precise segmentation is crucial for rigorous screening and for monitoring the disease's progression. To address this need, the present project reviews the state of the art in the most prevalent conventional models for skin lesion segmentation, alongside a market analysis. With the rise of automatic segmentation tools, a large number of algorithms are currently in use, but they have many drawbacks when applied to dermatological disorders because of the high number of artefacts in the acquired images. In light of the above, three segmentation techniques were selected for this work: the level set method, an algorithm combining the GrabCut and k-means methods, and an automatic intensity-based algorithm developed by a research group at Hospital Sant Joan de Déu de Barcelona. In addition, their performance is validated with a view to their further implementation in clinical training. The methods were implemented and the results obtained using a publicly available skin lesion image database.

    Using machine learning to predict pathogenicity of genomic variants throughout the human genome

    More than 6,000 diseases are estimated to be caused by genomic variants. This can happen in many ways: a variant may stop the translation of a protein, interfere with gene regulation, or alter splicing of the transcribed mRNA into an unwanted isoform. All of these processes need to be investigated in order to evaluate which variant may be causal for the deleterious phenotype. Variant effect scores are a great help in this regard. Implemented as machine learning classifiers, they integrate annotations from different resources to rank genomic variants in terms of pathogenicity. Developing a variant effect score requires multiple steps: annotation of the training data, feature selection, model training, benchmarking, and finally deployment of the model for application. Here, I present a generalized workflow for this process. It makes it simple to configure how information is converted into model features, enabling the rapid exploration of different annotations. The workflow further implements hyperparameter optimization, model validation and ultimately deployment of a selected model via genome-wide scoring of genomic variants. The workflow is applied to train Combined Annotation Dependent Depletion (CADD), a variant effect model that scores SNVs and InDels genome-wide. I show that the workflow can be quickly adapted to novel annotations by porting CADD to the genome reference GRCh38. Further, I demonstrate the integration of deep neural network scores as features into a new CADD model, improving the annotation of RNA splicing events. Finally, I apply the workflow to train multiple variant effect models from training data based on variants selected by allele frequency. In conclusion, the developed workflow presents a flexible and scalable method to train variant effect scores.
    All software and developed scores are freely available from cadd.gs.washington.edu and cadd.bihealth.org.

    Question Answering with distilled BERT models: A case study for Biomedical Data

    In the healthcare industry today, 80% of data is unstructured (Razzak et al., 2019). The challenge this poses for healthcare providers is that they rely on unstructured data to inform their decision-making. Although Electronic Health Records (EHRs) exist to integrate patient data, healthcare providers still struggle to search for information and answers contained within unstructured data. Prior NLP and Deep Learning research has shown that these methods can improve information extraction from unstructured medical documents. This research expands upon those studies by developing a Question Answering system using distilled BERT models. Healthcare providers can use this system on their local computers to search for and receive answers to specific questions about patients. This paper's best TinyBERT and TinyBioBERT models achieved Mean Reciprocal Rank (MRR) scores of 0.522 and 0.284, respectively. Based on these findings, this paper concludes that TinyBERT performed better than TinyBioBERT on BioASQ task 9b data.
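    As a point of reference for the MRR figures quoted in this abstract, the metric averages, over all questions, the reciprocal of the rank at which the first correct answer appears (counting 0 when no correct answer is returned). The following is a minimal sketch of that definition only; the function name `mean_reciprocal_rank` and the toy inputs are illustrative assumptions, not the paper's evaluation code, and the official BioASQ evaluation may differ in detail.

```python
from typing import Hashable, Sequence, Set


def mean_reciprocal_rank(
    ranked_answers: Sequence[Sequence[Hashable]],
    gold_answers: Sequence[Set[Hashable]],
) -> float:
    """Mean over questions of 1/rank of the first correct answer (0 if none)."""
    if len(ranked_answers) != len(gold_answers):
        raise ValueError("one gold answer set is required per question")
    if not ranked_answers:
        return 0.0

    total = 0.0
    for candidates, gold in zip(ranked_answers, gold_answers):
        # Ranks are 1-based: the top candidate has rank 1.
        for rank, candidate in enumerate(candidates, start=1):
            if candidate in gold:
                total += 1.0 / rank
                break  # only the first correct answer counts
    return total / len(ranked_answers)


# Hypothetical toy data: three questions with ranked system answers.
system_output = [["a", "b"], ["x", "y"], ["p", "q"]]
gold = [{"a"}, {"y"}, {"z"}]
score = mean_reciprocal_rank(system_output, gold)  # (1 + 1/2 + 0) / 3
```

    Under this definition, an MRR of about 0.5 is what a system would score if its first correct answer appeared at rank 2 for every question, although many other rank distributions give the same average.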

    Facilitating prosociality through technology: Design to promote digital volunteerism

    Volunteerism covers many activities that involve no financial reward for volunteers but contribute to the common good. Existing work on designing technology for volunteerism in Human-Computer Interaction (HCI) and related disciplines focuses on motivation to improve performance, but it does not account for volunteer wellbeing. Here, I investigate digital volunteerism in three case studies with a focus on volunteer motivation, engagement, and wellbeing. My research involved volunteers and others in the volunteering context to generate recommendations for a volunteer-centric design for digital volunteerism. The thesis has three aims:
    1. To investigate motivational aspects critical for enhancing digital volunteers' experiences.
    2. To identify digital platform attributes linked to volunteer wellbeing.
    3. To create guidelines for effectively supporting volunteer engagement in digital volunteering platforms.
    In the first case study, I investigate the design of a chat widget for volunteers working in an organisation, with a view to developing a design that improves their workflow and wellbeing. The second case study investigates the needs, motivations, and wellbeing of volunteers who help medical students improve their medical communication skills. An initial mixed-methods study was followed by an experiment comparing two design strategies to improve volunteer relatedness, an important indicator of wellbeing. The third case study looks into volunteer needs, experiences, motivations, and wellbeing, with a focus on volunteer identity and meaning-making on a science-based research platform. I then analyse my findings from these case studies through the lens of care ethics to derive critical insights for design. The key contributions of this thesis are design strategies and critical insights, and a volunteer-centric design framework to enhance the motivation, wellbeing and engagement of digital volunteers.

    Discovering mHealth Users’ Privacy and Security Concerns through Social Media Mining

    The purpose of this study is to explore the privacy and security concerns that social media users express about mHealth wearable technologies, using Grounded Theory and Text Mining methodologies. The emerging theory explains that users' concerns can be categorized as relating to data management, data surveillance, data invasion, technical safety, or legal and policy issues. The results show that, over time, mHealth users remain concerned about areas such as security breaches, real-time data invasion, surveillance, and how companies use the data collected from these devices. Further, the emotion and sentiment analyses revealed that users generally exhibited anger and fear and expressed negative sentiments. Theoretically, the results also support the literature on user acceptance of mHealth wearables as influenced by distrust of companies and of their use of harvested personal data.

    Machine Learning in Smart Health Research: A Bibliometric Analysis

    The advent of new technologies such as Machine Learning has strongly influenced the health sector's activities, easing diagnosis and decision-making processes in the sector. Hence, this study aims to analyze the application of Machine Learning in Smart Health research. The study uses 192 records retrieved from the Scopus database with a carefully constructed search term to identify the nations with the highest publication output, the principal research subject areas, the top funding sponsors, and the research keywords in this subject area. The results show that the first document on machine learning in smart health was published in 2011. Research output on this subject has increased dramatically, with India now being the top nation where research in this area is conducted. The journal IEEE Access was also found to have the highest number of publications in this area. This analysis will help researchers, policy developers, and professionals in the health sector to better understand the development of Machine Learning in Smart Health research. Research on Machine Learning in Smart Health is poised for further growth.

    Eating Behavior In-The-Wild and Its Relationship to Mental Well-Being

    The motivation for eating goes beyond survival. Eating serves as a means for socializing, exploring cultures, and more. Computing researchers have developed various eating detection technologies that leverage passive sensors available on smart devices to automatically infer when and, to some extent, what an individual is eating. However, despite its significance in the eating literature, crucial contextual information such as meal company, type of food, location of meals, the motivation for eating episodes, and the timing of meals is difficult to detect through passive means. More importantly, the applications of currently developed automated eating detection systems are limited. My dissertation addresses several of these challenges by combining the strengths of passive sensing technologies and Ecological Momentary Assessments (EMAs). EMAs are a widely adopted tool, used across a variety of disciplines, that can gather in-situ information about individual experiences. In my dissertation, I demonstrate the relationship between various eating contexts and the mental well-being of college students and information workers through naturalistic studies. The contributions of my dissertation are four-fold. First, I develop a real-time meal detection system that can detect meal-level episodes and trigger EMAs to gather contextual data about one's eating episode. Second, I deploy this system in a college student population to understand their eating behavior in day-to-day life and investigate the relationship of these eating behaviors with various mental well-being outcomes. Third, motivated by the limitations of passive sensing systems in detecting the short and sporadic chewing episodes present in snacking, I develop a snacking detection system and operationalize the definition of snacking in this thesis. Finally, I investigate the causal relationship between the stress levels experienced by remote information workers during their workdays and their lunchtime.
    This dissertation situates the findings in an interdisciplinary context, including ubiquitous computing, psychology, and nutrition. Ph.D.