135 research outputs found

    Aplikace teorie grafů v predikci funkce proteinů (Application of graph theory in protein function prediction)

    The rapid development of whole-genome sequencing methods and their falling cost have resulted in a huge number of sequenced genomes. Developing reliable methods for in silico annotation of the rapidly growing number of sequenced genomes is the next challenge of modern biology. We describe a graph-theoretical approach to function prediction from protein-protein interaction networks and outline its strengths and weaknesses. We illustrate the principles of this approach on selected algorithms based on different ideas, compare them, and evaluate their reliability. Department of Cell Biology, Faculty of Science

    Bioinformatics

    This book is divided into research areas relevant to bioinformatics, such as biological networks, next-generation sequencing, high-performance computing, molecular modeling, structural bioinformatics and intelligent data analysis. Each section introduces the basic concepts and then explains their application to problems of great relevance, so both novice and expert readers can benefit from the information and research presented here.

    Data based system design and network analysis tools for chemical and biological processes

    Ph.D. (Doctor of Philosophy)

    Protein-protein docking for interactomic studies and its aplication to personalized medicine

    Proteins are the embodiment of the message encoded in the genes, and they act as the building blocks and effector part of the cell. From gene regulation to cell signalling, as well as cell recognition and movement, protein-protein interactions (PPIs) drive many important cellular events by forming intricate interaction networks. The number of all non-redundant human binary interactions, forming the so-called interactome, ranges from 130,000 to 650,000 interactions as estimated by different studies. In some diseases, like cancer, these PPIs are altered by the presence of mutations in individual proteins, which can change the interaction networks of the cell, resulting in a pathological state. In order to fully characterize the effect of a pathological mutation and have useful information for prediction purposes, it is important first to identify whether the mutation is located at a protein-binding interface, and second to understand the effect on the binding affinity of the affected interaction(s). To understand how these mutations can alter the PPIs, we need to look at the three-dimensional structure of the protein complexes at the atomic level. However, structures are available for less than 10% of the estimated human interactome. Computational approaches such as protein-protein docking can help to extend the structural coverage of known PPIs. In the protein-protein docking field, rigid-body docking is a widely used approach, since it is fast, computationally cheap and often capable of generating a pool of models within which a near-native structure can be found. These models need to be scored in order to select the acceptable ones from the set of poses. In the present thesis, we have characterized the synergy of combining protein-protein docking methods with several scoring functions. 
Our findings provide guidance on the most efficient scoring function for each docking method, and can inform future scoring function development efforts. We then used docking calculations to predict interaction hotspots, i.e. residues that contribute the most to the binding energy, and interface patches, by including neighbouring residues in the predictions. We developed and validated a method based on the Normalized Interface Propensity (NIP) score. The work of this thesis has extended the original NIP method to predict the location of disease-associated nsSNPs at protein-protein interfaces when there is no available structure for the protein-protein complex. We have applied this approach to the pathological interaction networks of six diseases with low structural data on PPIs. This approach can almost double the number of nsSNPs that can be characterized and identify edgetic effects in many nsSNPs that were previously unknown. This methodology was also applied to predict the location of 14,551 nsSNPs in 4,254 proteins, for more than 12,000 interactions without 3D structure. We found that 34% of the disease-associated nsSNPs were located at a protein-protein interface. This opens future opportunities for the high-throughput characterization of pathological mutations at atomic-level resolution, and can help to design novel therapeutic strategies to re-stabilize the PPIs affected by disease-associated nsSNPs.

    27th Fungal Genetics Conference

    Program and abstracts from the 27th Fungal Genetics Conference, Asilomar, March 12-17, 2013.

    Regulatory modules discovery and mesenchymal stem cells characterization from high-throughput cancer genomics data

    2013/2014. Cancer is a disease characterized by extreme molecular complexity. Omics approaches, which collect genome-wide data on genes, transcripts and proteins in public databases, attempt to overcome this complexity and find the functional modules that perform the functions involved in tumour-related processes. For instance, gene expression profiles from cancer tissues are widely used to define gene signatures and test their clinical relevance. I used this kind of information to characterise genes of interest in breast cancer models. On the other hand, datasets from cellular models can provide data that make it possible to focus on specific molecular mechanisms and probe the effects of molecules in a specific cancer model. One of the most recent omics projects is FANTOM5, which has generated a unique resource: the first single-molecule sequencing-based expression atlas in mammalian systems. Cap analysis of gene expression (CAGE) was used to measure transcription start sites (TSS) and promoter usage across a wide collection of human samples, thereby identifying and measuring levels of the majority of coding and non-coding transcripts in the human genome. I used this information to characterize a mesenchymal/stromal stem cell line (MSC) derived from high-grade serous ovarian cancer (HG-SOC-MSCs) or from normal tissue (N-MSCs) included in the FANTOM5 human dataset. I highlighted functional programs shared between HG-SOC-MSCs and N-MSCs, suggesting that the global differences between the two cell lines are based on quantitative levels of transcriptional output rather than on qualitative differences. 
The results also suggested that HG-SOC-MSCs are close relatives of mesothelial cells and smooth muscle cells. Furthermore, I analysed the entire dataset using ScanAll, a newly developed software tool, to predict ab initio the presence of enriched elements in the genomic regions surrounding FANTOM5 promoters. I pinpointed regulatory modules, i.e. groups of enriched motifs co-occurring in co-expressed regions within a fixed distance of one another. These modules are enriched in the co-expressed sequences of each sample relative to randomly generated sequences. Finally, I created a compendium of putative expressed and directly interacting transcription factors. XXVII Ciclo

    Implementation of machine learning for the evaluation of mastitis and antimicrobial resistance in dairy cows

    Bovine mastitis is one of the biggest concerns in the dairy industry, where it affects sustainable milk production, farm economy and animal health. Most mastitis pathogens are bacterial in origin, and diagnosing them accurately makes it possible to understand the epidemiology, prevent outbreaks and cure the disease rapidly. This thesis aimed to provide a diagnostic solution that couples Matrix-Assisted Laser Desorption/Ionization-Time of Flight (MALDI-TOF) mass spectrometry with machine learning (ML) for detecting bovine mastitis pathogens at the subspecies level based on their phenotypic characters. In Chapter 3, MALDI-TOF coupled with ML was used to discriminate bovine mastitis-causing Streptococcus uberis based on transmission route: contagious or environmental. S. uberis isolates collected from dairy farms across England and Wales were compared within and between farms. The findings of this chapter suggested that the proposed methodology has the potential for successful classification at the farm level. In Chapter 4, MALDI-TOF coupled with ML was used to show proteomic differences between bovine mastitis-causing Escherichia coli isolates with different clinical outcomes (clinical and subclinical) and disease phenotypes (persistent and non-persistent). The findings of this chapter showed that phenotypic differences can be detected by the proposed methodology even for genotypically identical isolates. In Chapter 5, MALDI-TOF coupled with ML was used to differentiate benzylpenicillin signatures of bovine mastitis-causing Staphylococcus aureus isolates. The findings of this chapter showed that the proposed methodology enables a fast, affordable and effective diagnostic solution for targeting resistant bacteria in dairy cows. 
Having shown in Chapter 5 that this methodology successfully differentiated benzylpenicillin-resistant and -susceptible S. aureus isolates, the same technique was applied in Chapter 6 to the other mastitis agents Enterococcus faecalis and Enterococcus faecium, and to profiling antimicrobials other than benzylpenicillin. The findings of this chapter demonstrated that MALDI-TOF coupled with ML allows monitoring of the disease epidemiology and provides suggestions for adjusting farm management strategies. Taken together, this thesis highlights that MALDI-TOF coupled with ML is capable of discriminating bovine mastitis pathogens at the subspecies level based on transmission route, clinical outcome and antimicrobial resistance profile, and could be used as a diagnostic tool for bovine mastitis at dairy farms.

    A proteomic and cytological characterisation of the buff-tailed bumblebee (Bombus terrestris) fat body and haemolymph - An immune perspective

    Bees, including solitary and social, native and managed, are vital insect pollinators that provide essential ecosystem services. Bombus terrestris (Linnaeus) is a widespread and important bumblebee pollinator of wild plants and cultivated crops and, although commonly found across Europe, is also available commercially to supplement pollination requirements. Due to their activity, B. terrestris workers encounter a variety of diseases which, in addition to habitat loss and agrichemical use, are key factors in global bumblebee declines. The profound economic and environmental consequences of this decline warrant a detailed investigation of the molecular and cellular aspects of bumblebee health. The principal components of the B. terrestris immune system, the fat body (FB) and haemolymph, were characterised here using proteomic and cytological methodologies. The FB proteome is highly enriched in metabolic, detoxification and proteostasis processes, whereas the haemolymph is enriched in cellular transport and immunity. At a cellular level, the FB was shown to predominantly comprise adipocytes and oenocytes, while spherulocytes, oenocytoids and plasmatocytes were the most frequently found haemocytes in bumblebee haemolymph. The FB and haemolymph were also investigated under various stresses and contexts. In general, typical immune responses to microbial challenge were observed, although immune signatures were lower than expected. Although specific responses to Gram-positive and Gram-negative bacteria and fungi were observed, a broad and conserved response to microbial challenge was found. The major responses in both the haemolymph and FB, however, involved energy metabolism, protein processing and detoxification, which provides insight into the mechanisms that support and regulate the immune response in bumblebees. Worryingly, pesticide exposure had a significant effect on the FB proteome and its ability to mount an immune response. 
Overall, these results provide novel insights into molecular aspects of bee health and highlight the importance of nutrition and the risks posed by pesticide use to our important pollinator species.