74 research outputs found
Elixir: synthesis of parallel irregular algorithms
Algorithms in new application areas like machine learning and data analytics usually operate on unstructured sparse graphs. Writing efficient parallel code to implement these algorithms is very challenging for a number of reasons.
First, there may be many algorithms to solve a problem, and each algorithm may have many implementations. Second, synchronization, which is necessary for correct parallel execution, introduces potential problems such as data races and deadlocks. These issues interact in subtle ways, making the best solution dependent both on the parallel platform and on properties of the input graph. Consequently, implementing and selecting the best parallel solution can be a daunting task for non-experts, since few performance models exist for predicting the performance of parallel sparse graph programs on parallel hardware.
This dissertation presents a synthesis methodology and a system, Elixir, that addresses these problems by (i) allowing programmers to specify solutions at a high level of abstraction, and (ii) generating many parallel implementations automatically and using search to find the best one. An Elixir specification consists of a set of operators capturing the main algorithm logic and a schedule specifying how to efficiently apply the operators. Elixir employs sophisticated automated reasoning to merge these two components, and uses techniques based on automated planning to insert synchronization and synthesize efficient parallel code.
Experimental evaluation of our approach demonstrates that the performance of Elixir-generated code is competitive with, and can even exceed, that of hand-optimized code written by expert programmers for many interesting graph benchmarks.
Computer Science
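The split between operators and a schedule described in the abstract above can be illustrated with a minimal sequential sketch. This is not Elixir's specification language or its generated code: the shortest-path relaxation operator, the two schedules, and the names `relax` and `run` are assumptions chosen only for illustration.

```python
import heapq
from collections import deque

def relax(dist, u, v, w):
    """Operator: lower dist[v] if the path through u is shorter; report any change."""
    if dist[u] + w < dist[v]:
        dist[v] = dist[u] + w
        return True
    return False

def run(graph, source, schedule):
    """Apply the same operator under a 'fifo' or 'priority' schedule (hypothetical names)."""
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    use_heap = schedule == "priority"
    work = [(0, source)] if use_heap else deque([source])
    while work:
        u = heapq.heappop(work)[1] if use_heap else work.popleft()
        for v, w in graph[u]:
            if relax(dist, u, v, w):
                # A changed node is rescheduled; only the order differs per schedule.
                if use_heap:
                    heapq.heappush(work, (dist[v], v))
                else:
                    work.append(v)
    return dist

# Toy graph: node -> list of (neighbour, edge weight).
graph = {"a": [("b", 5), ("c", 1)], "b": [("d", 1)], "c": [("b", 1)], "d": []}
print(run(graph, "a", "fifo"))
print(run(graph, "a", "priority"))
```

Both schedules reach the same distances here; the schedule changes only the order in which the operator is applied, which is the kind of choice the abstract says is explored by search.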
Computational Techniques for the Structural and Dynamic Analysis of Biological Networks
The analysis of biological systems involves the study of networks from different omics such as genomics, transcriptomics, metabolomics and proteomics. In general, the computational techniques used in the analysis of biological networks can be divided into those that perform (i) structural analysis, (ii) dynamic analysis of structural properties and (iii) dynamic simulation. Structural analysis concerns the topology or stoichiometry of the biological network, such as the important nodes of the network, network motifs and the flux distribution within the network. Dynamic analysis of structural properties generally takes advantage of the availability of interaction and expression datasets in order to analyze the structural properties of a biological network under different conditions or at different time points. Dynamic simulation is useful for studying those changes of the biological system over time that cannot be derived from a structural analysis, because they require additional information on the dynamics of the system. This thesis addresses each of these topics, proposing three computational techniques for studying different types of biological networks in which structural and dynamic analysis is crucial to answering specific biological questions. In particular, the thesis proposes computational techniques for the analysis of the network motifs of a biological network through the design of heuristics that efficiently solve the subgraph isomorphism problem, the construction of a new analysis workflow that integrates interaction and expression datasets to extract information about the chromosomal connectivity of miRNA-mRNA interaction networks and, finally, the design of a methodology that applies techniques from the Electronic Design Automation (EDA) field to allow the dynamic simulation of biochemical interaction networks and parameter estimation.
A Work-Stealing For Dynamic Workload Balancing On CPU-GPU Heterogeneous Computing Platforms
Although many general-purpose workloads have been accelerated on graphics processing units (GPUs) over the last decade, other applications whose runtime behavior is dynamic and irregular, such as those based on trees and graphs, have suffered from serious workload imbalance problems caused by architectural differences between CPU and GPU processors. In this thesis, we propose a work-stealing framework to overcome such problems. Our proposed framework allows CPU and GPU threads to steal tasks from each other as well as within the same device by leveraging the fine-grained data sharing and thread communication features available on modern CPU-GPU heterogeneous systems. The implementation of a BFS application on top of our framework achieves a minimum of 8.5% performance improvement over one using a coarse-grained task partitioning scheme. It also achieves a 16% performance improvement on average over its non-stealing counterpart.
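The general idea of work stealing can be sketched as follows. This is a single-threaded simulation under stated assumptions, not the thesis's CPU-GPU framework: the function name `run_with_stealing`, the round-robin worker loop, and the "steal from the busiest worker" policy are all hypothetical.

```python
from collections import deque

def run_with_stealing(task_lists):
    """Simulate round-robin workers; an idle worker steals one task from the busiest."""
    queues = [deque(tasks) for tasks in task_lists]
    done = [[] for _ in queues]  # tasks completed by each worker
    steals = 0
    while any(queues):
        for wid, queue in enumerate(queues):
            if not queue:
                # Idle worker: choose the victim with the most pending tasks.
                victim = max(range(len(queues)), key=lambda i: len(queues[i]))
                if not queues[victim]:
                    continue  # no work left anywhere
                # Steal from the head, the end opposite to the owner's pops.
                queue.append(queues[victim].popleft())
                steals += 1
            # The owner takes work from the tail of its own deque.
            done[wid].append(queue.pop())
    return done, steals

# Worker 0 starts with all the work; worker 1 starts idle.
done, steals = run_with_stealing([[1, 2, 3, 4, 5, 6], []])
print(done, steals)
```

Taking owner work from one end and stolen work from the other is a common convention in work-stealing deques; whether the thesis uses the same layout is not stated in the abstract.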
Drugst.One -- A plug-and-play solution for online systems medicine and network-based drug repurposing
In recent decades, the development of new drugs has become increasingly
expensive and inefficient, and the molecular mechanisms of most pharmaceuticals
remain poorly understood. In response, computational systems and network
medicine tools have emerged to identify potential drug repurposing candidates.
However, these tools often require complex installation and lack intuitive
visual network mining capabilities. To tackle these challenges, we introduce
Drugst.One, a platform that assists specialized computational medicine tools in
becoming user-friendly, web-based utilities for drug repurposing. With just
three lines of code, Drugst.One turns any systems biology software into an
interactive web tool for modeling and analyzing complex protein-drug-disease
networks. Demonstrating its broad adaptability, Drugst.One has been
successfully integrated with 21 computational systems medicine tools. Available
at https://drugst.one, Drugst.One has significant potential for streamlining
the drug discovery process, allowing researchers to focus on essential aspects
of pharmaceutical treatment research.
Comment: 45 pages, 6 figures, 7 tables
A Gene Co-Expression Network-Based Drug Repositioning Approach Identifies Candidates for Treatment of Hepatocellular Carcinoma
Hepatocellular carcinoma (HCC) is a malignant liver cancer that continues to cause increasing numbers of deaths worldwide owing to limited therapies and treatments. Computational drug repurposing is a promising strategy for discovering potential new indications for existing drugs. In this study, we present a systematic drug repositioning method based on comprehensive integration of molecular signatures in liver cancer tissue and cell lines. First, we identify robust prognostic genes and two gene co-expression modules enriched in unfavorable prognostic genes based on two independent HCC cohorts, which showed great consistency in function and network topology. Then, we screen 10 genes as potential target genes for HCC on the basis of network topology analysis in these two modules. Further, we perform drug repositioning by integrating the shRNA and drug perturbation data of liver cancer cell lines and identifying potential drugs for every target gene. Finally, we evaluate the effects of the candidate drugs in an in vitro model and observe that two identified drugs inhibited the protein levels of their corresponding target genes and cell migration, and also showed great binding affinity in protein docking analysis. Our study demonstrates the usefulness and efficiency of a network-based drug repositioning approach for discovering potential drugs for cancer treatment and for precision medicine.
A regulatory network comprising let-7 miRNA and SMUG1 is associated with good prognosis in ER+ breast tumours
Single-strand selective uracil-DNA glycosylase 1 (SMUG1) initiates base excision repair (BER) of uracil and oxidized pyrimidines. SMUG1 status has been associated with cancer risk and therapeutic response in breast carcinomas and other cancer types. However, SMUG1 is a multifunctional protein involved not only in BER but also in RNA quality control, and its function in cancer cells is unclear. Here we identify several novel SMUG1 interaction partners that function in many biological processes relevant for cancer development and treatment response. Based on this, we hypothesized that the dominant function of SMUG1 in cancer might be ascribed to functions other than BER. We define a bad prognosis signature for SMUG1 by mapping out the SMUG1 interaction network and find that high expression of genes in the bad prognosis network correlates with lower survival probability in ER(+) breast cancer. Interestingly, we identified hsa-let-7b-5p microRNA as an upstream regulator of the SMUG1 interactome. Expression of SMUG1 and hsa-let-7b-5p were negatively correlated in breast cancer, and we found an inhibitory auto-regulatory loop between SMUG1 and hsa-let-7b-5p in MCF7 breast cancer cells. We conclude that SMUG1 functions in a gene regulatory network that influences survival and treatment response in several cancers.
Análise integrativa dos mecanismos de patogênese em doenças lisossômicas (Integrative analysis of the mechanisms of pathogenesis in lysosomal diseases)
Lysosomal storage diseases (LSDs) cause intracellular accumulation of substrates and deficiency in the trafficking of macromolecules. Substrate storage can impact one or several pathways that contribute to cell damage. Morphogenic and growth pathways such as Hedgehog (Hh), mTOR and insulin are involved in the pathophysiology of LSDs. The Hh pathway is affected by abnormal expression and by changes in the levels and distribution of Hh proteins. mTOR may show delayed reactivation and deregulate the termination of autophagy and the reformation of lysosomes. Insulin resistance caused by changes in lipid rafts has also been described in different LSDs. Therefore, we explored how specific signaling pathways can be related to specific LSDs, showing that a systems medicine approach could be a valuable tool for better understanding LSD pathogenesis. Moreover, we used systems biology tools to investigate new elements that may be involved in aortic dilatation in mucopolysaccharidoses (MPS). We identified candidate genes associated with biological processes related to inflammatory responses, deposition of collagen, and lipid metabolism that may contribute to the pathogenesis of aortic dilatation in MPS I and MPS VII. Finally, we identified new candidate genes and pathways that converge on functional mechanisms involved in early neural circuit formation defects and that could provide clues about cognitive impairment in patients with MPS II. Such molecular changes during neurodevelopment may precede the morphological and clinical evidence, highlighting the importance of early diagnosis and of the development of new drugs.
Investigation of HIV-TB co-infection through analysis of the potential impact of host genetic variation on host-pathogen protein interactions
HIV and Mycobacterium tuberculosis (Mtb) co-infection causes treatment and diagnostic difficulties, placing a major burden on health care systems in settings with a high prevalence of both infectious diseases, such as South Africa. Human genetic variation adds further complexity, with variants affecting disease susceptibility and response to treatment. The identification of variants in African populations is affected by reference mapping bias, especially in complex regions like the Major Histocompatibility Complex (MHC), which plays an important role in the immune response to HIV and Mtb infection. We used a graph-based approach to identify novel variants in the MHC region within African samples without mapping to the canonical reference genome. We generated a host-pathogen functional interaction network made up of inter- and intraspecies protein interactions, gene expression during co-infection, drug-target interactions, and human genetic variation. Differential expression and network centrality properties were used to prioritise proteins that may be important in co-infection. Using the interaction network we identified 28 human proteins that interact with both pathogens ("bridge" proteins). Network analysis showed that while MHC proteins did not have significantly higher centrality measures than non-MHC proteins, bridge proteins had a significantly shorter distance to MHC proteins. Proteins that were significantly differentially expressed during co-infection or contained variants clinically associated with HIV or TB also had significantly stronger network properties. Finally, we identified common and consequential variants within prioritised proteins that may be clinically associated with HIV and TB. The integrated network was extensively annotated and stored in a graph database that enables rapid and high-throughput prioritisation of sets of genes or variants, facilitates detailed investigations and allows network-based visualisation.
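The definition of "bridge" proteins given above (human proteins that interact with both pathogens) can be sketched as a simple filter over an interaction list. The function name `bridge_proteins`, the input layout, and the protein identifiers below are placeholders invented for illustration; they are not the study's data or code.

```python
def bridge_proteins(interactions, human, pathogens):
    """Return human proteins with at least one interaction partner in every pathogen."""
    touched = {}  # human protein -> set of pathogen names it interacts with
    for a, b in interactions:
        # Interactions are undirected, so check both orientations.
        for host, other in ((a, b), (b, a)):
            if host in human:
                for name, proteins in pathogens.items():
                    if other in proteins:
                        touched.setdefault(host, set()).add(name)
    return sorted(p for p, seen in touched.items() if seen == set(pathogens))

# Toy data with made-up identifiers.
human = {"H1", "H2", "H3"}
pathogens = {"HIV": {"V1"}, "Mtb": {"M1"}}
interactions = [("H1", "V1"), ("M1", "H1"), ("H2", "V1"), ("H2", "H3")]
print(bridge_proteins(interactions, human, pathogens))
```

In this toy network only `H1` has a partner in both pathogens; `H2` touches one pathogen and another human protein, so it is not a bridge.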
Text and Network Mining for Literature-Based Scientific Discovery in Biomedicine.
Most of the new and important findings in biomedicine are only available in the
text of the published scientific articles. The first goal of this thesis is to design
methods based on natural language processing and machine learning to extract information about genes, proteins, and their interactions from text. We introduce a
dependency tree kernel based relation extraction method to identify the interacting
protein pairs in a sentence. We propose two kernel functions based on cosine similarity and edit distance among the dependency tree paths connecting the protein names.
Using these kernel functions with supervised and semi-supervised machine learning
methods, we report significant improvement (59.96% F-Measure performance over
the AIMED data set) compared to the previous results in the literature. We also
address the problem of distinguishing factual information from speculative information. Unlike previous methods that formulate the problem as a sentence classification
task, we propose a two-step method to identify the speculative fragments of sentences.
First, we use supervised classification to identify the speculation keywords using a
diverse set of linguistic features that represent their contexts. Next, we use the syntactic structures of the sentences to resolve their linguistic scopes. Our results show
that the method is effective in identifying speculative portions of sentences. The
speculation keyword identification results are close to the upper bound of human
inter-annotator agreement.
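The two similarity measures named above, cosine similarity and edit distance between dependency tree paths, can be sketched on paths represented as token lists. This is a generic illustration, not the thesis's kernel definitions: the bag-of-tokens representation, the function names, and the example paths are assumptions.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(path_a, path_b):
    """Cosine of the angle between bag-of-token count vectors of two paths."""
    a, b = Counter(path_a), Counter(path_b)
    dot = sum(a[token] * b[token] for token in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def edit_distance(path_a, path_b):
    """Levenshtein distance over path tokens (insert, delete, substitute cost 1)."""
    prev = list(range(len(path_b) + 1))
    for i, x in enumerate(path_a, 1):
        cur = [i]
        for j, y in enumerate(path_b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]

# Hypothetical dependency paths connecting two protein mentions.
p1 = ["nsubj", "interacts", "prep_with"]
p2 = ["nsubj", "binds", "prep_with"]
print(cosine_similarity(p1, p2), edit_distance(p1, p2))
```

A distance such as this one is typically converted to a similarity before being used as a kernel; how the thesis does so is not described in the abstract.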
The second goal of this thesis is to generate new scientific hypotheses using the
literature-mined protein/gene interactions. We propose a literature-based discovery
approach, where we start with a set of genes known to be related to a given concept
and integrate text mining with network centrality analysis to predict novel concept-related genes. We present the application of the proposed approach to two different
problems, namely predicting gene-disease associations and predicting genes that are
important for vaccine development. Our results provide new insights and hypotheses worth future investigations in these domains and show the effectiveness of the
proposed approach for literature-based discovery.
Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/78956/1/ozgur_1.pd