5,557 research outputs found

    Machine Learning Approaches for the Prioritisation of Cardiovascular Disease Genes Following Genome-wide Association Study

    Genome-wide association studies (GWAS) have revealed thousands of genetic loci, establishing the approach as a valuable method for unravelling the complex biology of many diseases. Even as GWAS have grown in size and improved in study design to detect effects, identifying real causal signals and disentangling them from other highly correlated markers associated by linkage disequilibrium (LD) remains challenging. This has severely limited GWAS findings and brought the method’s value into question. Although thousands of disease susceptibility loci have been reported, causal variants and genes at these loci remain elusive. Post-GWAS analysis aims to dissect the heterogeneity of variant and gene signals. In recent years, machine learning (ML) models have been developed for post-GWAS prioritisation. These models range from logistic regression to more complex ensemble models such as random forests and gradient boosting, as well as deep learning models (i.e., neural networks). When combined with functional validation, these methods have yielded important translational insights, providing a strong evidence-based approach to direct post-GWAS research. However, ML approaches are in their infancy across biological applications, and as they continue to evolve an evaluation of their robustness for GWAS prioritisation is needed. Here, I investigate the landscape of ML across selected models, input features, bias risk, and output model performance, with a focus on building a prioritisation framework that is applied to blood pressure GWAS results and tested on re-application to blood lipid traits.

    Novel 129Xe Magnetic Resonance Imaging and Spectroscopy Measurements of Pulmonary Gas-Exchange

    Gas-exchange is the primary function of the lungs and involves removing carbon dioxide from the body and exchanging it within the alveoli for inhaled oxygen. Several different pulmonary, cardiac and cardiovascular abnormalities have negative effects on pulmonary gas-exchange. Unfortunately, clinical tests do not always pinpoint the problem; sensitive and specific measurements are needed to probe the individual components participating in gas-exchange for a better understanding of pathophysiology, disease progression and response to therapy. In vivo Xenon-129 gas-exchange magnetic resonance imaging (129Xe gas-exchange MRI) has the potential to overcome these challenges. When participants inhale hyperpolarized 129Xe gas, it has different MR spectral properties as a gas, as it diffuses through the alveolar membrane and as it binds to red blood cells. 129Xe MR spectroscopy and imaging provide a way to tease out the different anatomic components of gas-exchange simultaneously and provide spatial information about where abnormalities may occur. In this thesis, I developed and applied 129Xe MR spectroscopy and imaging to measure gas-exchange in the lungs alongside other clinical and imaging measurements. I measured 129Xe gas-exchange in asymptomatic congenital heart disease and in prospective, controlled studies of long-COVID. I also developed mathematical tools to model 129Xe MR signals during acquisition and reconstruction. The insights gained from my work underscore the potential of 129Xe gas-exchange MRI biomarkers for a better understanding of cardiopulmonary disease. My work also provides a way to generate a deeper imaging and physiologic understanding of gas-exchange in vivo in healthy participants and patients with chronic lung and heart disease.

    The Development and Performance of the First BICEP Array Receiver at 30 and 40 GHz for Measuring the Polarized Synchrotron Foreground

    The existence of the CMB marks a big success of the lambda cold dark matter standard model, which describes the universe’s evolution with six free parameters. The inflationary theory was added to the picture in the ’80s to explain the initial conditions of the universe. Scalar perturbations from inflation seeded the formation of the large-scale structure and produced the curl-free E-mode polarization pattern in the CMB. On the other hand, tensor fluctuations sourced primordial gravitational waves (PGW), which could leave unique imprints in the CMB polarization: the gradient-free B-mode pattern. The amplitude of B modes is directly related to the tensor-to-scalar ratio r of the primordial fluctuations, which indicates the energy scale of inflation. The detection of the primordial B modes will be strong supporting evidence of inflation and give us opportunities to study physics at energy scales far beyond what can ever be accessed in laboratory experiments on the Earth. Currently, the main challenge for the B-mode experiments is to separate the primordial B modes from those sourced by matter between us and the last scattering surface: the galactic foregrounds and the gravitational lensing effect. The two most important foregrounds are thermal dust and synchrotron, which have very different spectral properties from the CMB. Thus the key to foreground cleaning is the high sensitivity data at multiple frequency bands and the accurate modeling of the foregrounds in data analyses and simulations. In this dissertation, I present my work on ISM and dust property studies which enriched our understanding of the foregrounds. The BICEP/Keck (BK) experiments build a series of polarization-sensitive microwave telescopes targeting degree-scale B-modes from the early universe. The latest publication from the collaboration with data taken through 2018 reported tensor-to-scalar ratio r_0.05 < 0.036 at 95% C.L., providing the tightest constraint on the primordial tensor mode. 
BICEP Array is the latest generation of this series of experiments. The final configuration of the BICEP Array has four BICEP3-class receivers spanning six frequency bands, aiming to achieve σ(r) ≲ 0.003. The first receiver of the BICEP Array operates at 30 and 40 GHz, constraining the synchrotron foregrounds. In this dissertation, I cover the development of this new receiver, focusing on the design and performance of the detectors. I report on the characterization and diagnostic tests for the receiver during its first few observing seasons.

    Leveraging a machine learning based predictive framework to study brain-phenotype relationships

    An immense collective effort has been put towards the development of methods for quantifying brain activity and structure. In parallel, a similar effort has focused on collecting experimental data, resulting in ever-growing data banks of complex human in vivo neuroimaging data. Machine learning, a broad set of powerful and effective tools for identifying multivariate relationships in high-dimensional problem spaces, has proven to be a promising approach toward better understanding the relationships between the brain and different phenotypes of interest. However, applied machine learning within a predictive framework for the study of neuroimaging data introduces several domain-specific problems and considerations, leaving the overarching question of how to best structure and run experiments ambiguous. In this work, I cover two explicit pieces of this larger question: the relationship between data representation and predictive performance, and a case study on issues related to data collected from disparate sites and cohorts. I then present the Brain Predictability toolbox, a software package to explicitly codify and make more broadly accessible to researchers the recommended steps in performing a predictive experiment, everything from framing a question to reporting results. This unique perspective ultimately offers recommendations, explicit analytical strategies, and example applications for using machine learning to study the brain.

    Unsupervised inference methods for protein sequence data

    The abstract is in the attachment.

    Targeting Fusion Proteins of HIV-1 and SARS-CoV-2

    Viruses are disease-causing pathogenic agents that require host cells to replicate. Fusion of host and viral membranes is critical for the lifecycle of enveloped viruses. Studying viral fusion proteins can allow us to better understand how they shape immune responses and inform the design of therapeutics such as drugs, monoclonal antibodies, and vaccines. This thesis discusses two approaches to targeting two fusion proteins: Env from HIV-1 and S from SARS-CoV-2. The first chapter of this thesis is an introduction to viruses with a specific focus on HIV-1 CD4 mimetic drugs and antibodies against SARS-CoV-2. It discusses the architecture of these viruses and fusion proteins and how small molecules, peptides, and antibodies can target these proteins successfully to treat and prevent disease. In addition, a brief overview is included of the techniques involved in structural biology and how they have informed the study of viruses. For the interested reader, chapter 2 contains a review article that serves as a more in-depth introduction to both viruses as well as to how structural biology has informed the study of viral surface proteins and neutralizing antibody responses to them. The subsequent chapters provide a body of work divided into two parts. The first part, in chapter 3, involves a study on conformational changes induced in the HIV-1 Env protein by CD4-mimetic drugs using single particle cryo-EM. The second part, encompassing chapters 4 and 5, includes two studies on antibodies isolated from convalescent COVID-19 donors. The former involves classification of antibody responses to the SARS-CoV-2 S receptor-binding domain (RBD). The latter discusses an anti-RBD antibody class that binds to a conserved epitope on the RBD and shows cross-binding and cross-neutralization to other coronaviruses in the sarbecovirus subgenus.

    Deciphering Regulation in Escherichia coli: From Genes to Genomes

    Advances in DNA sequencing have revolutionized our ability to read genomes. However, even in the most well-studied of organisms, the bacterium Escherichia coli, for ≈ 65% of promoters we remain ignorant of their regulation. Until we crack this regulatory Rosetta Stone, efforts to read and write genomes will remain haphazard. We introduce a new method, Reg-Seq, that links massively-parallel reporter assays with mass spectrometry to produce a base pair resolution dissection of more than 100 E. coli promoters in 12 growth conditions. We demonstrate that the method recapitulates known regulatory information. Then, we examine regulatory architectures for more than 80 promoters which previously had no known regulatory information. In many cases, we also identify which transcription factors mediate their regulation. This method clears a path for highly multiplexed investigations of the regulatory genome of model organisms, with the potential of moving to an array of microbes of ecological and medical relevance.

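    A model for automated support for the recognition, extraction, customization and reconstruction of static charts (Um modelo para suporte automatizado ao reconhecimento, extração, personalização e reconstrução de gráficos estáticos)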

    Data charts are widely used in our daily lives, being present in regular media such as newspapers, magazines, web pages, books, and many others. A well-constructed data chart leads to an intuitive understanding of its underlying data; conversely, when a chart embodies poor design choices, it may need to be redesigned. However, in most cases these charts are shown as static images, which means that the original data are not usually available. Automatic methods could therefore be applied to extract the underlying data from the chart images to allow these changes. The task of recognizing charts and extracting data from them is complex, largely due to the variety of chart types and their visual characteristics. Computer Vision techniques for image classification and object detection are widely used for the problem of recognizing charts, but only in images without any disturbance. Other features of real-world images that can make this task difficult, such as photo distortions, noise, and misalignment, are not addressed in most of the literature. Two computer vision techniques that can assist this task and have been little explored in this context are perspective detection and correction. These methods transform a distorted and noisy chart into a clean chart, with its type ready for data extraction or other uses. The task of reconstructing a chart is straightforward in itself, since the visualization can be rebuilt as long as the data are available, but reconstructing it in the same context is complex. Using a Visualization Grammar for this scenario is a key component, as these grammars usually have extensions for interaction, chart layers, and multiple views without requiring extra development effort. This work presents a model for automated support for custom recognition and reconstruction of charts in static images. The model automatically performs the process steps, such as reverse engineering, turning a static chart back into its data table for later reconstruction, while allowing the user to make modifications in case of uncertainties. This work also features a model-based architecture along with prototypes for various use cases. Validation is performed step by step, with methods inspired by the literature. This work features three use cases providing proof of concept and validation of the model. The first use case applies chart recognition methods to documents in the real world; the second use case focuses on vocalization of charts, using a visualization grammar to reconstruct a chart in audio format; and the third use case presents an Augmented Reality application that recognizes and reconstructs charts in the same context (a piece of paper), overlaying the new chart and interaction widgets. The results showed that, with slight changes, chart recognition and reconstruction methods are now ready for real-world charts when taking time, accuracy and precision into consideration. Programa Doutoral em Engenharia Informática (Doctoral Programme in Informatics Engineering).