8 research outputs found

    Incorporating Prior Knowledge in Deep Learning Models via Pathway Activity Autoencoders

    Motivation: Despite advances in the computational analysis of high-throughput molecular profiling assays (e.g. transcriptomics), a dichotomy exists between methods that are simple and interpretable and ones that are complex but less interpretable. Furthermore, very few methods attempt to translate interpretability into biologically relevant terms, such as known pathway cascades. Biological pathways reflecting signalling events or metabolic conversions are [...] Determining which pathways are implicated in disease and incorporating such pathway data as prior knowledge may enhance predictive modelling and personalised strategies for diagnosis, treatment and prevention of disease. Results: We propose a novel prior-knowledge-based deep auto-encoding framework, PAAE, together with its accompanying generative variant, PAVAE, for RNA-seq data in cancer. Through comprehensive comparisons among various learning models, we show that, despite having access to a smaller set of features, our PAAE and PAVAE models achieve better out-of-set reconstruction results than common methodologies. Furthermore, we compare our models with equivalent baselines on a classification task and show that they achieve better results than models that have access to the full input gene set. We also find that using vanilla variational frameworks might negatively impact both reconstruction outputs and classification performance. Finally, our work contributes comprehensive interpretability analyses of our models, in addition to improving prognostication for translational medicine.

    Aprendendo medidas de centralidade com redes grafo-neurais

    Centrality measures are important metrics used in Social Network Analysis. Such measures allow one to infer which entity in a network is more central (informally, more important) than another. Analyses based on centrality measures may help detect possible social influencers, security weak spots, etc. This dissertation investigates methods for learning how to predict these centrality measures using only the graph’s structure. More specifically, different ways of ranking the vertices according to their centrality measures are shown, as well as a brief analysis of how to approximate the centrality measures themselves. This is achieved by building on previous work that used neural networks to estimate centrality measures given other centrality measures. In this dissertation, we use the concept of a Graph Neural Network, a Deep Learning model that builds the computation graph according to the topology of a desired input graph. These models’ performance is evaluated with different centrality measures and briefly compared with other machine learning models in the literature. Both the approximation and the ranking of the centrality measures are evaluated, and we show that the ranking is easier to compute. The transfer between the tasks of predicting these different centralities is analysed, and the advantages of each model are highlighted. The models are tested on graphs from random distributions different from the ones they were trained on, on graphs larger than the ones they saw during training, and on real-world instances that are much larger than the largest training graphs. The internal embeddings of the vertices produced by the model are analysed through lower-dimensional projections, and conjectures are made about the behaviour seen in the experiments.
Finally, we identify possible future work suggested by the experimental results presented here.

    A Bayesian predictive analytics model for improving long range epidemic forecasting during an infection wave

    Following the outbreak of the coronavirus epidemic in early 2020, municipalities, regional governments and policymakers worldwide had to plan their Non-Pharmaceutical Interventions (NPIs) amidst a scenario of great uncertainty. At this early stage of an epidemic, when no vaccine or medical treatment is in sight, algorithmic prediction can become a powerful tool to inform local policymaking. However, when we replicated one prominent epidemiological model to inform health authorities in a region in the south of Brazil, we found that this model relied too heavily on manually predetermined covariates and was too reactive to changes in data trends. Our four proposed models use data on both daily reported deaths and infections and account for missing data (e.g., the under-reporting of cases) more explicitly, with two of the proposed versions also attempting to model the delay in test reporting. We simulated weekly forecasting of deaths for the period from 31/05/2020 until 31/01/2021, with the first week of data being used as a cold start for the algorithm, after which we use a lighter variant of the model for faster forecasting. Because our models are significantly more proactive in identifying trend changes, forecasting improved, especially for long-range predictions and after the peak of an infection wave, as the models were quicker to adapt to scenarios after these peaks in reported deaths. Assuming that cases were under-reported greatly improved the model's stability, whereas modelling retroactively added data (due to the “hot” nature of the data used) had a negligible impact on performance.

    Características clínicas da Doença Arterial Obstrutiva Periférica (DAOP): um estudo sistemático

    Introduction: Peripheral arterial obstructive disease (DAOP, from the Portuguese "doença arterial obstrutiva periférica") is a growing public health problem because of its high prevalence and major impact on quality of life. Associated with cardiovascular risk factors, DAOP originates from a chronic inflammatory response that, together with an imbalance in the production of vasodilator and vasoconstrictor substances, leads to progressive obstruction of the arteries. The most common clinical manifestations include intermittent claudication and pain at rest. Without adequate diagnosis and treatment, DAOP can lead to serious complications, such as non-healing ulcers and amputation. Recognizing the clinical signs is therefore essential for early diagnosis and appropriate management of the disease. Materials and methods: The search was carried out in June and July 2023 using the following databases: SciELO, PubMed, and Google Scholar. Articles were selected using the descriptors: peripheral arterial disease, peripheral arterial obstructive disease, clinical picture, signs and symptoms, clinical characteristics. Results: DAOP is characterized by different clinical manifestations. Intermittent claudication involves leg pain during exertion that is relieved by rest. Rest pain, which intensifies at night, indicates severe arterial obstruction and may signal the need for intervention. Changes in the skin and muscles of the lower limbs reflect reduced blood flow. The absence of a pulse in the extremities is a key indicator of the location of the arterial obstruction. Slow-healing wounds or ulcers arise from reduced blood flow, while gangrene (dead tissue) points to an advanced stage of ischemia. Impotence may be related to the extent of the vascular disease. Sensations of cold in the extremities are linked to reduced blood flow and are a sign of a more rapid decline in quality of life.
Finally, early diagnosis in asymptomatic patients is essential, combining monitoring, examinations, and counselling on lifestyle habits. Conclusion: DAOP is a disease of great relevance to public health, both for its high prevalence and for its impact on quality of life. Despite beneficial technological advances in diagnosing the disease, it is extremely important to understand the signs and symptoms in order to make an early diagnosis. In addition, awareness of DAOP is essential, both for the general public and for health professionals.

    NEOTROPICAL CARNIVORES: a data set on carnivore distribution in the Neotropics

    Mammalian carnivores are considered a key group in maintaining ecological health and can indicate potential ecological integrity in landscapes where they occur. Carnivores also hold high conservation value and their habitat requirements can guide management and conservation plans. The order Carnivora has 84 species from 8 families in the Neotropical region: Canidae; Felidae; Mephitidae; Mustelidae; Otariidae; Phocidae; Procyonidae; and Ursidae. Herein, we include published and unpublished data on native terrestrial Neotropical carnivores (Canidae; Felidae; Mephitidae; Mustelidae; Procyonidae; and Ursidae). NEOTROPICAL CARNIVORES is a publicly available data set that includes 99,605 data entries from 35,511 unique georeferenced coordinates. Detection/non-detection and quantitative data were obtained from 1818 to 2018 by researchers, governmental agencies, non-governmental organizations, and private consultants. Data were collected using several methods including camera trapping, museum collections, roadkill, line transect, and opportunistic records. Literature (peer-reviewed and grey literature) in Portuguese, Spanish and English was incorporated in this compilation. Most of the data set consists of detection data entries (n = 79,343; 79.7%) but also includes non-detection data (n = 20,262; 20.3%). Of those, 43.3% also include count data (n = 43,151). The information available in NEOTROPICAL CARNIVORES will contribute to macroecological, ecological, and conservation questions in multiple spatio-temporal perspectives. As carnivores play key roles in trophic interactions, a better understanding of their distribution and habitat requirements is essential to establish conservation management plans and safeguard the future ecological health of Neotropical ecosystems. Our data paper, combined with other large-scale data sets, has great potential to clarify species distribution and related ecological processes within the Neotropics.
There are no copyright restrictions and no restriction for using data from this data paper, as long as the data paper is cited as the source of the information used. We also request that users inform us of how they intend to use the data

    NEOTROPICAL ALIEN MAMMALS: a data set of occurrence and abundance of alien mammals in the Neotropics

    Biological invasion is one of the main threats to native biodiversity. For a species to become invasive, it must be voluntarily or involuntarily introduced by humans into a nonnative habitat. Mammals were among the first taxa to be introduced worldwide for game, meat, and labor, yet the number of species introduced in the Neotropics remains unknown. In this data set, we make available occurrence and abundance data on mammal species that (1) transposed a geographical barrier and (2) were voluntarily or involuntarily introduced by humans into the Neotropics. Our data set is composed of 73,738 historical and current georeferenced records on alien mammal species, of which around 96% correspond to occurrence data on 77 species belonging to eight orders and 26 families. Data cover 26 continental countries in the Neotropics, ranging from Mexico and its frontier regions (southern Florida and coastal-central Florida in the southeast United States) to Argentina, Paraguay, Chile, and Uruguay, and the 13 countries of the Caribbean islands. Our data set also includes neotropical species (e.g., Callithrix sp., Myocastor coypus, Nasua nasua) considered alien in particular areas of the Neotropics. The most numerous species in terms of records are from Bos sp. (n = 37,782), Sus scrofa (n = 6,730), and Canis familiaris (n = 10,084); 17 species were represented by only one record (e.g., Syncerus caffer, Cervus timorensis, Cervus unicolor, Canis latrans). Primates have the highest number of species in the data set (n = 20 species), partly because of uncertainties regarding taxonomic identification of the genus Callithrix, which includes the species Callithrix aurita, Callithrix flaviceps, Callithrix geoffroyi, Callithrix jacchus, Callithrix kuhlii, Callithrix penicillata, and their hybrids. This unique data set will be a valuable source of information on invasion risk assessments, biodiversity redistribution and conservation-related research. There are no copyright restrictions.
Please cite this data paper when using the data in publications. We also request that researchers and teachers inform us on how they are using the data