234 research outputs found

    Random walks on mutual microRNA-target gene interaction network improve the prediction of disease-associated microRNAs

    Background: MicroRNAs (miRNAs) have been shown to play an important role in pathological initiation, progression and maintenance. Because identification in the laboratory of disease-related miRNAs is not straightforward, numerous network-based methods have been developed to predict novel miRNAs in silico. Homogeneous networks (in which every node is a miRNA) based on the targets shared between miRNAs have been widely used to predict their role in disease phenotypes. Although such homogeneous networks can predict potential disease-associated miRNAs, they do not consider the roles of the target genes of the miRNAs. Here, we introduce a novel method based on a heterogeneous network that not only considers miRNAs but also the corresponding target genes in the network model. Results: Instead of constructing homogeneous miRNA networks, we built heterogeneous miRNA networks consisting of both miRNAs and their target genes, using databases of known miRNA-target gene interactions. In addition, as recent studies demonstrated reciprocal regulatory relations between miRNAs and their target genes, we considered these heterogeneous miRNA networks to be undirected, assuming mutual miRNA-target interactions. Next, we introduced a novel method (RWRMTN) operating on these mutual heterogeneous miRNA networks to rank candidate disease-related miRNAs using a random walk with restart (RWR) based algorithm. Using both known disease-associated miRNAs and their target genes as seed nodes, the method can identify additional miRNAs involved in the disease phenotype. Experiments indicated that RWRMTN outperformed two existing state-of-the-art methods: RWRMDA, a network-based method that also uses a RWR on homogeneous (rather than heterogeneous) miRNA networks, and RLSMDA, a machine learning-based method. Interestingly, we could relate this performance gain to the emergence of "disease modules" in the heterogeneous miRNA networks used as input for the algorithm. 
Moreover, we could demonstrate that RWRMTN is stable, performing well when using both experimentally validated and predicted miRNA-target gene interaction data for network construction. Finally, using RWRMTN, we identified 76 novel miRNAs associated with 23 disease phenotypes which were present in a recent database of known disease-miRNA associations. Conclusions: Summarizing, using random walks on mutual miRNA-target networks improves the prediction of novel disease-associated miRNAs because of the existence of "disease modules" in these networks
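The random walk with restart (RWR) that this abstract describes can be illustrated with a minimal sketch. The update rule p_next = (1 - r) * W * p + r * p0 is the standard RWR formulation; the toy network, the node names, the restart value of 0.5 and the function name are assumptions made for illustration and are not taken from RWRMTN itself.

```python
import numpy as np


def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    transition = adj / col_sums            # column-normalised transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # walker starts uniformly on the seed nodes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * transition @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Hypothetical undirected miRNA-target network: miR-A - gene-1 - miR-B - gene-2 - miR-C
nodes = ["miR-A", "miR-B", "miR-C", "gene-1", "gene-2"]
edges = [(0, 3), (1, 3), (1, 4), (2, 4)]
adj = np.zeros((len(nodes), len(nodes)))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0            # undirected: mutual miRNA-target interaction

# Seeds: one known disease miRNA (miR-A) and one of its target genes (gene-1)
scores = random_walk_with_restart(adj, seeds=[0, 3])
ranked = [nodes[i] for i in sorted([1, 2], key=lambda i: -scores[i])]
print(ranked)
```

In this toy network the candidate that shares a target gene with the seed miRNA receives the higher score, which is the intuition behind ranking candidates by their proximity to known disease nodes.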

    Prediction of miRNA-disease associations with a vector space model

    MicroRNAs play critical roles in many physiological processes. Their dysregulations are also closely related to the development and progression of various human diseases, including cancer. Therefore, identifying new microRNAs that are associated with diseases contributes to a better understanding of pathogenicity mechanisms. MicroRNAs also represent a tremendous opportunity in biotechnology for early diagnosis. To date, several in silico methods have been developed to address the issue of microRNA-disease association prediction. However, these methods have various limitations. In this study, we investigate the hypothesis that information attached to miRNAs and diseases can be revealed by distributional semantics. Our basic approach is to represent distributional information on miRNAs and diseases in a high-dimensional vector space and to define associations between miRNAs and diseases in terms of their vector similarity. Cross validations performed on a dataset of known miRNA-disease associations demonstrate the excellent performance of our method. Moreover, the case study focused on breast cancer confirms the ability of our method to discover new disease-miRNA associations and to identify putative false associations reported in databases
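As a rough illustration of scoring associations by vector similarity, the sketch below ranks miRNAs against a disease vector. The abstract does not name the similarity measure; cosine similarity is assumed here because it is a common choice in distributional semantics, and the three-dimensional vectors and miRNA names are invented for the example.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either vector is all zeros)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


# Hypothetical distributional vectors (real ones would be high-dimensional)
disease_vec = [1.0, 0.0, 2.0]
mirna_vecs = {
    "miR-X": [2.0, 0.0, 4.0],   # same direction as the disease vector
    "miR-Y": [0.0, 3.0, 0.0],   # orthogonal to the disease vector
    "miR-Z": [1.0, 1.0, 1.0],   # partially aligned
}

# Rank candidate miRNAs by similarity to the disease vector
ranking = sorted(
    mirna_vecs,
    key=lambda name: cosine_similarity(mirna_vecs[name], disease_vec),
    reverse=True,
)
print(ranking)
```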

    Untargeted sequencing of circulating microRNAs in a healthy and diseased older population

    We performed untargeted profiling of circulating microRNAs (miRNAs) in a well characterized cohort of older adults to verify associations of health and disease-related biomarkers with systemic miRNA expression. Differential expression analysis revealed 30 miRNAs that significantly differed between healthy active, healthy sedentary and sedentary cardiovascular risk patients. Increased expression of miRNAs miR-193b-5p, miR-122-5p, miR-885-3p, miR-193a-5p, miR-34a-5p, miR-505-3p, miR-194-5p, miR-27b-3p, miR-885-5p, miR-23b-5b, miR-365a-3p, miR-365b-3p, miR-22-5p was associated with a higher metabolic risk profile, unfavourable macro- and microvascular health, lower physical activity (PA) as well as cardiorespiratory fitness (CRF) levels. Increased expression of miR-342-3p, miR-1-3p, miR-92b-5p, miR-454-3p, miR-190a-5p and miR-375-3p was associated with a lower metabolic risk profile, favourable macro- and microvascular health as well as higher PA and CRF. Of note, the first two principal components explained as much as 20% and 11% of the data variance. miRNAs and their potential target genes appear to mediate disease- and health-related physiological and pathophysiological adaptations that need to be validated and supported by further downstream analysis in future studies. Clinical Trial Registration: ClinicalTrials.gov: NCT02796976 (https://clinicaltrials.gov/ct2/show/NCT02796976)

    A network-based approach to uncover microRNA-mediated disease comorbidities and potential pathobiological implications.

    Disease-disease relationships (e.g., disease comorbidities) play crucial roles in pathobiological manifestations of diseases and personalized approaches to managing those conditions. In this study, we develop a network-based methodology, termed the meta-path-based Disease Network (mpDisNet) capturing algorithm, to infer disease-disease relationships by assembling four biological networks: disease-miRNA, miRNA-gene, disease-gene, and the human protein-protein interactome. mpDisNet uses a meta-path-based random walk to reconstruct the heterogeneous neighbors of a given node, and a heterogeneous skip-gram model to learn the network representation of the nodes. We find that mpDisNet achieves high performance in inferring clinically reported disease-disease relationships, outperforming traditional gene/miRNA-overlap approaches. In addition, mpDisNet identifies network-based comorbidities for pulmonary diseases driven by underlying miRNA-mediated pathobiological pathways (i.e., hsa-let-7a- or hsa-let-7b-mediated airway epithelial apoptosis and pro-inflammatory cytokine pathways) as derived from the human interactome network analysis. mpDisNet offers a powerful tool for network-based identification of disease-disease relationships with miRNA-mediated pathobiological pathways

    NETWORK ANALYTICS FOR THE MIRNA REGULOME AND MIRNA-DISEASE INTERACTIONS

    miRNAs are non-coding RNAs of approximately 22 nucleotides in length that inhibit gene expression at the post-transcriptional level. By virtue of this gene regulation mechanism, miRNAs play a critical role in several biological processes and pathophysiological conditions, including cancers. miRNA behavior is the result of a multi-level complex interaction network involving miRNA-mRNA, TF-miRNA-gene, and miRNA-chemical interactions; hence the precise patterns through which a miRNA regulates a given disease are still elusive. Herein, I have developed an integrative genomics pipeline to (i) build a miRNA regulomics and data analytics repository, and (ii) model these interactions as networks and use optimization techniques, motif-based analyses, network inference strategies and influence diffusion concepts to predict miRNA regulation and its role in diseases, especially cancers. With these methods, we are able to determine the regulatory behavior of miRNAs, potential causal miRNAs in specific diseases, and potential biomarkers/targets for drugs and medicinal therapeutics

    TLHNMDA: Triple Layer Heterogeneous Network Based Inference for MiRNA-Disease Association Prediction

    In recent years, microRNAs (miRNAs) have been confirmed to be involved in many important biological processes and associated with various kinds of human complex diseases. Therefore, predicting potential associations between miRNAs and diseases with the huge number of verified heterogeneous biological datasets will provide a new perspective for disease therapy. In this article, we developed a novel computational model of Triple Layer Heterogeneous Network based inference for MiRNA-Disease Association prediction (TLHNMDA) by integrating the experimentally verified miRNA-disease associations, miRNA-long noncoding RNA (lncRNA) interactions, miRNA function similarity information, disease semantic similarity information and Gaussian interaction profile kernel similarity for lncRNAs into a triple layer heterogeneous network to predict new miRNA-disease associations. As a result, the AUCs of TLHNMDA are 0.8795 and 0.8795 ± 0.0010 based on leave-one-out cross validation (LOOCV) and 5-fold cross validation, respectively. Furthermore, TLHNMDA was implemented on three complex human diseases to evaluate predictive ability. As a result, 84% (kidney neoplasms), 78% (lymphoma) and 76% (prostate neoplasms) of top 50 predicted miRNAs for the three complex diseases can be verified by biological experiments. In addition, based on the HMDD v1.0 database, 98% of top 50 potential esophageal neoplasms-associated miRNAs were confirmed by experimental reports. It is expected that TLHNMDA could be a useful model to predict potential miRNA-disease associations with high prediction accuracy and stability
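The Gaussian interaction profile kernel similarity mentioned in this abstract can be sketched as follows. The abstract gives no formula, so this assumes the commonly used definition K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2), with gamma equal to gamma_prime divided by the mean squared norm of the profiles; the binary lncRNA-by-miRNA matrix, the default gamma_prime of 1.0 and the function name are illustrative assumptions.

```python
import numpy as np


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a binary matrix."""
    X = np.asarray(profiles, dtype=float)
    sq_norms = (X ** 2).sum(axis=1)
    # Bandwidth normalised by the average squared profile norm (assumed nonzero)
    gamma = gamma_prime / sq_norms.mean()
    # Pairwise squared Euclidean distances between interaction profiles
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


# Hypothetical profiles: rows are lncRNAs, columns are miRNAs (1 = known interaction)
profiles = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
K = gip_kernel(profiles)
print(np.round(K, 3))
```

Rows with identical interaction profiles get similarity 1, and the similarity decays as the profiles diverge.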

    Discovering lesser known molecular players and mechanistic patterns in Alzheimer's disease using an integrative disease modelling approach

    Convergence of exponentially advancing technologies is driving medical research with life-changing discoveries. In contrast, repeated failures of high-profile drugs to battle Alzheimer's disease (AD) have made it one of the least successful therapeutic areas. This failure pattern has provoked researchers to grapple with their beliefs about Alzheimer's aetiology. Thus, the growing realisation that Amyloid-β and tau are not 'the' but rather 'one of the' factors necessitates the reassessment of pre-existing data to add new perspectives. To enable a holistic view of the disease, integrative modelling approaches are emerging as a powerful technique. Combining data at different scales and modes could considerably increase the predictive power of the integrative model by filling biological knowledge gaps. However, the reliability of the derived hypotheses largely depends on the completeness, quality, consistency, and context-specificity of the data. Thus, there is a need for agile methods and approaches that efficiently interrogate and utilise existing public data. This thesis presents the development of novel approaches and methods that address intrinsic issues of data integration and analysis in AD research. It aims to prioritise lesser-known AD candidates using highly curated and precise knowledge derived from integrated data. Here much of the emphasis is put on quality, reliability, and context-specificity. This thesis work showcases the benefit of integrating well-curated and disease-specific heterogeneous data in a semantic web-based framework for mining actionable knowledge. Furthermore, it introduces the challenges encountered while harvesting information from literature and transcriptomic resources. A state-of-the-art text-mining methodology is developed to extract miRNAs and their regulatory roles in diseases and genes from the biomedical literature. 
To enable meta-analysis of biologically related transcriptomic data, a highly-curated metadata database has been developed, which explicates annotations specific to human and animal models. Finally, to corroborate common mechanistic patterns — embedded with novel candidates — across large-scale AD transcriptomic data, a new approach to generate gene regulatory networks has been developed. The work presented here has demonstrated its capability in identifying testable mechanistic hypotheses containing previously unknown or emerging knowledge from public data in two major publicly funded projects for Alzheimer's, Parkinson's and Epilepsy diseases

    Computational analysis of the colonic transciptome & in vitro biomarker analysis using a novel microfluidic quantum dot linked immunoassay

    DNA microarray technology facilitates the high throughput analysis of transcriptional disease regulation by measuring the relative expression levels of transcripts present within a tissue. While such computational approaches have been used to study the genetic regulation of a variety of illnesses, such studies often suffer from inadequate patient sample sizes and statistical power resulting in conflicting results and lab-specific bias. In order to overcome these limitations and fully utilize the wealth of publicly available genomic data, an integrated microarray analysis method was used to analyze and interpret microarray data in the context of colonic diseases including colorectal carcinoma (CRC) and inflammatory bowel disease (IBD). The results of this work indicate widespread genetic perturbations related to IBD in which a variety of cell types are implicated including resident host enterocytes, innate and adaptive immune cells as well as native luminal microflora. Our work has identified subtle genetic differences between IBD phenotypes for the realization of disease specific therapeutic treatments as well as novel diagnostic biomarkers. Furthermore, our analysis has revealed significant overlap in the genetic regulation and predisposition to IBD, lupus, type 1 diabetes, Graves' disease and rheumatoid arthritis, providing the first genetic link between the enteropathic disease symptoms associated with IBD. Druggable pathways involved in these diseases as well as known therapeutic drug targets were also analyzed for the potential repositioning of existing therapeutics for the treatment of IBD. IBD patients are known to be at an elevated risk for developing colorectal carcinoma, with risk increasing with the duration of the disease. 
In order to better understand the phenotype shift from IBD to cancerous phenotypes, integrated microarray analysis was used to identify gene signatures, implicated pathways and novel discriminatory biomarkers for differentiating between IBD and CRC phenotypes. Our diagnostic panels were shown to accurately differentiate between phenotypes using an independent dataset for validation. In order to transition the identified biomarkers to the clinic for diagnostic use, a novel microfluidic quantum dot linked immunosorbent assay platform was developed with enhanced surface chemistry and reaction kinetics. The developed prototype has the capability of multiplexed biomarker detection within clinically relevant samples for the stratification of disease phenotypes. In order to validate our design, human samples spiked with the fecal IBD biomarker lactoferrin were analyzed. Results indicate increased sensitivity and signal to noise ratios over our predicate device, with a reduction of the limit of detection. This proof of concept device shows great promise as a portable bedside diagnostic device for multiplexed biomarker analysis within a clinical setting. Ph.D., Biomedical Engineering -- Drexel University, 201

    Discovery of tissue specific network properties associated with cancer driver genes

    Master's thesis in Biochemistry, Faculdade de Ciências, Universidade de Lisboa, 2022. Using the notion of disease modules, network medicine has effectively identified disease-associated genes in recent years. In biological networks, genes linked to a particular illness tend to interact closely [1]. These networks allow both physical and functional connections between biomolecules to be identified, resulting in a map of cell components and processes that constitute biological systems [2]. Not all disease-associated genes, however, have a major impact on disease phenotype. The discovery of important genes able to produce or change disease phenotype paves the way to new therapies and a personalized medicine strategy. Recent research has found that biological network topological features per se may predict perturbation effects in a dynamical model of the system with 65-80% accuracy [3, 4]. Biological networks differ depending on the tissue or cell type being studied. As a result, each gene's topological features and ability to impact the system may change [5]. The main goal of this thesis is to discover network topological parameters associated with influential cancer driver genes using context-specific networks. To achieve this, we evaluated local network features around each driver gene across multiple tissue-specific networks, including tissues that are affected in the disease and others where the gene perturbation has no significant effect. We aimed to identify topological parameters and their characteristics that contribute to the cancer driver genes’ influential role. The results of this dissertation indicate that several topological parameters can be used to determine cancer “driver” genes. We found that these genes have higher values of topological parameters, such as Degree or Closeness, in tissues where they tend to cause cancer. We also found that this difference is present in oncogenes and tumor suppressor genes. 
Another factor that we found to influence the value of topological parameters is the number of tissues in which these genes cause the disease. Topological parameter values tend to increase with the number of tissues in which the genes cause cancer. Together, these results support the significant association of topological parameters like the Degree with the influential role of a driver gene in cancer. Using the notion of disease modules, network medicine has in recent years effectively identified disease-associated genes. In biological networks, genes linked to a given disease tend to interact closely [1]. These networks allow physical and functional connections between biomolecules to be identified, resulting in a map of the cellular components and processes that constitute biological systems [2]. Not all disease-associated genes, however, have a large impact on the disease phenotype. The discovery of important genes capable of producing or altering the disease phenotype paves the way for new therapies and a personalized medicine strategy. Recent research has found that the topological features of the biological network can predict perturbation effects in a dynamical model of the system with 65-80% accuracy [3, 4]. Biological networks differ depending on the tissue or cell type studied. As a result, each gene's topological features and its ability to impact the system may change [5]. The main goal of this dissertation is to discover network topological parameters associated with cancer driver genes using tissue-specific networks. To achieve this, we evaluated the local network features around each driver gene in several tissue-specific networks, including tissues affected by the disease and others where perturbation of the gene has no significant effect. 
In this way, we can identify topological parameters and the characteristics that contribute to the influential role of cancer driver genes. To achieve our goals, we began by building and optimizing our tissue-specific networks. Each tissue-specific network was built using four different databases of protein-protein interactions, signaling pathways and transcription factors. We tried four different methods of building the networks, including filtering for gene expression levels above 0.1 and 5 transcripts per million in each tissue. We also built a matrix associating the cancer driver genes (taken from an online database of cancer driver genes) with the tissues where they cause the disease. Each driver gene was placed in one of six categories according to the number of tissues where it causes cancer, with category six comprising the genes that cause the disease in six or more tissues. We began by comparing the values of the genes' topological parameters in tissues where they cause the disease with their values in tissues where they do not. These values were also compared with a list of cancer-associated genes that are not cancer drivers (taken from an online database of disease-associated genes) and a list of genes not associated with any disease. This study was carried out for the four different network construction methods. We continued the study by observing how the topological parameters differed at the tissue level. In each tissue, we analyzed the topological parameter values of the driver genes that cause the disease in that tissue versus the values of the genes that do not cause disease in that tissue. 
After comparing the topological parameter values using all driver genes together as one global group, we wanted to check whether the difference between their values in tissues where they cause cancer and their values in tissues where they do not was also present within the categories of number of tissues where the driver genes cause cancer, and how these values increase or decrease across those categories. We then evaluated the combined impact of the topological parameter values (selecting the topological parameter “Degree”) of cancer driver genes in tissues where they cause disease versus where they do not, and also the difference between these across the six categories of number of tissues where they cause cancer, using a Generalized Linear Model (GLM) to assess the interaction of these factors. From the database from which we took the list of cancer driver genes, we also took a list of oncogenes and tumor suppressor genes, which we used to likewise assess the differences in their topological parameter values between tissues where they cause cancer and tissues where they do not. To assess other variables that may have an impact beyond the topological parameters and that may also differ depending on the number of tissues where the driver genes cause the disease, we used data from the driver gene database that included information on the number of interactions each driver gene establishes with different miRNAs and on the number of protein complexes these genes are part of. We also evaluated the impact of gene expression across the different categories of number of tissues. Finally, we performed functional enrichment of the cancer driver genes using two different methods. In the first method, we used the genes that had a larger topological difference (for this study we used only the topological parameter “Degree”) between the tissues where they do or do not cause cancer. 
We classified each gene as positive, negative or non-significant based on the difference between the mean “Degree” value in tissues where it causes cancer and the value in tissues where it does not. The second method was enrichment of the different cancer driver genes according to the number of tissues in which they cause cancer. We carried out this study using the different categories of number of tissues. Overall, our results suggest that topological parameter values (for example, “Degree” and “Closeness”) tend to be highest in the tissues in which cancer driver genes cause the disease (“TissueDrivers”), followed by the values of cancer genes that are not cancer drivers but are associated with development of the disease (“Disease Genes”), the values of cancer driver genes in tissues where they do not cause cancer (“NonTissueDrivers”) and lastly, with the lowest topological parameter values, the genes that are not associated with any disease. The difference between the topological parameter values in “TissueDrivers” versus “NonTissueDrivers” is statistically significant for most of the topological parameters tested and across the different network methods used, except for the “JustHuRiTPM5Zminmax” method (using only the HuRI database). When we analyzed the topological parameter values in each tissue, we could see that “Degree” values tend to be higher in the cancer driver genes that cause cancer in that tissue compared with the driver genes that do not cause cancer in that tissue. This difference is statistically significant in many of the tissues analyzed. 
Regarding how the topological parameter values behave across the different categories of number of tissues in which the driver genes cause cancer, we found that for cancer driver genes that cause disease in only one or two tissues, the “Degree” value in tissues where they cause cancer is lower than the value in tissues where they do not. We observed the opposite trend in driver genes that cause cancer in six or more tissues (the “Degree” value is higher in tissues where they cause cancer). We also observed that the “Degree” value increases gradually with the tissue-number category, reaching its highest value in category six (made up of driver genes that cause cancer in six or more tissues). In the generalized linear model (GLM), we could see the combined effect of the tissue-type variable (whether or not the driver gene causes cancer in the tissue, with a statistically significant difference between these two situations) and the variable for the number of tissues where the driver genes cause cancer (also showing a statistically significant difference between the categories). The interaction between these two factors was also statistically significant. We could also observe statistically different “Degree” values for tumor suppressor driver genes between tissues where they cause cancer (with higher values) and tissues where they do not. We saw the same difference in oncogenes, but with lower significance. The “Degree” values of tumor suppressor genes were lower than those of oncogenes. We could likewise see a clear trend of correlation between the increase in the number of tissues and the increase in the number of complexes that the cancer driver genes are part of. The same behavior was observed for the number of miRNAs with which the driver genes interact. 
Regarding mRNA expression across the tissue-number categories, we could see a statistically significant difference in categories two and three between the values of the driver genes (with respect to the topological parameter “Degree”) in tissues where they cause cancer versus where they do not. Finally, in the functional enrichment study we could see that the biological processes, molecular functions and cellular components found to be enriched using the method based on the tissue-number categories are much more closely related to the cancer processes described in the literature (“hallmarks of cancer”). We could not find a very clear division between the enriched biological functions of genes with a “Degree” z-score difference above 1 and those with a difference below -1. We found no relevant functional enrichment process in either of these two groups of genes that could somehow distinguish them from each other. The results of this dissertation indicate that several topological parameters may be associated with cancer driver genes. We found that these genes have higher values of topological parameters, such as Degree or Closeness, in the tissues where they tend to cause cancer. We also found that this difference is present in oncogenes and tumor suppressor genes. Another factor that we found to influence the value of the topological parameters is the number of tissues in which these genes cause the disease. The topological value tends to increase with the number of tissues in which they cause cancer.
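Degree and Closeness, the two topological parameters named in this abstract, can be computed on an unweighted network as in the sketch below. It uses the textbook definitions (number of neighbours, and number of reachable nodes divided by the sum of shortest-path distances to them); the thesis may normalise these values differently, and the small star-shaped network with its node names is invented for illustration.

```python
from collections import deque


def degree(graph, node):
    """Number of direct interaction partners of a node."""
    return len(graph[node])


def closeness(graph, node):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    dist = {node: 0}
    queue = deque([node])
    while queue:                           # breadth-first search from the node
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Hypothetical tissue-specific network: one hub gene connected to three others
graph = {
    "HUB": {"A", "B", "C"},
    "A": {"HUB"},
    "B": {"HUB"},
    "C": {"HUB"},
}
hub_degree = degree(graph, "HUB")
hub_closeness = closeness(graph, "HUB")
leaf_closeness = closeness(graph, "A")
print(hub_degree, hub_closeness, leaf_closeness)
```

Computing the same two values for a gene in each tissue-specific network and comparing them across tissues is the kind of comparison the abstract describes.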