148 research outputs found

    Integration of multi-scale protein interactions for biomedical data analysis

    With the advancement of modern technologies, we observe an increasing accumulation of biomedical data about diseases. There is a need for computational methods to sift through and extract knowledge from the diverse data available in order to improve our mechanistic understanding of diseases and improve patient care. Biomedical data come in various forms, as exemplified by the various omics data. Existing studies have shown that each form of omics data gives only partial information on cell state, which has motivated jointly mining multi-omics, multi-modal data to extract integrated system knowledge. The interactome is of particular importance as it enables the modelling of dependencies arising from molecular interactions. This thesis takes a special interest in the multi-scale protein interactome and its integration with computational models to extract relevant information from biomedical data. We define multi-scale interactions at different omics scales that involve proteins: pairwise protein-protein interactions, multi-protein complexes, and biological pathways. Using hypergraph representations, we motivate considering higher-order protein interactions, highlighting the complementary biological information contained in the multi-scale interactome. Based on those results, we further investigate how those multi-scale protein interactions can be used as either prior knowledge or auxiliary data to develop machine learning algorithms. First, we design a neural network using the multi-scale organization of proteins in a cell into biological pathways as prior knowledge and train it to predict a patient's diagnosis based on transcriptomics data. From the trained models, we develop a strategy to extract biomedical knowledge pertaining to the diseases investigated. Second, we propose a general framework based on Non-negative Matrix Factorization to integrate the multi-scale protein interactome with multi-omics data.
We show that our approach outperforms the existing methods and provides biomedical insights and relevant hypotheses for specific cancer types.
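To make the factorization idea in this abstract concrete, here is a minimal sketch of plain Non-negative Matrix Factorization with multiplicative updates. It is not the framework proposed in the thesis, which also integrates interactome data; the function names (`nmf`, `frobenius_error`) and all parameters are illustrative assumptions, and the code uses only the Python standard library.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def frobenius_error(X, W, H):
    """Frobenius norm of the residual X - W @ H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry)) ** 0.5

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X (n x m) as W (n x rank) times H (rank x m).

    Uses multiplicative updates, which keep every entry of W and H non-negative.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H
```

In an omics setting, the rows of X would typically be samples and the columns molecular features; the rows of H can then be read as latent feature patterns and the rows of W as per-sample loadings.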

    Network-based identification of driver pathways in clonal systems

    Highly ethanol-tolerant bacteria for the production of biofuels, bacterial pathogens that are resistant to antibiotics, and cancer cells are examples of phenotypes that are of importance to society and are currently being studied. In order to better understand these phenotypes and their underlying genotype-phenotype relationships, it is now commonplace to investigate DNA and expression profiles using next-generation sequencing (NGS) and microarray techniques. These techniques generate large amounts of omics data, which result in lists of genes that have mutations or expression profiles that potentially contribute to the phenotype. These lists often include a multitude of genes and are troublesome to verify manually, as performing literature studies and wet-lab experiments for a large number of genes is very time- and resource-consuming. Therefore, computational methods are required that can narrow these gene lists down by removing the generally abundant false positives and can ideally provide additional information on the relationships between the selected genes. Other high-throughput techniques such as yeast two-hybrid (Y2H), ChIP-Seq and ChIP-chip, but also a myriad of small-scale experiments and predictive computational methods, have generated a treasure of interactomics data over the last decade, most of which is now publicly available. By combining these data into a biological interaction network, which contains all molecular pathways that an organism can utilize and is thus the equivalent of the organism's blueprint, it is possible to integrate the omics data obtained from experiments with these biological interaction networks. Biological interaction networks are key to the computational methods presented in this thesis, as they enable methods to account for important relations between genes (and gene products).
Doing so, it is possible not only to identify interesting genes but also to uncover molecular processes important to the phenotype. As the best way to analyze omics data from an interesting phenotype varies widely with the experimental setup and the available data, multiple methods were developed and applied in the context of this thesis. In a first approach, an existing method (PheNetic) was applied to a consortium of three bacterial species that together are able to degrade a herbicide efficiently, although none of the species can do so on its own. For each of the species, expression data (RNA-seq) were generated for the consortium and for the species in isolation. PheNetic identified molecular pathways that were differentially expressed and likely contribute to a cross-feeding mechanism between the species in the consortium. Having obtained proof of concept, PheNetic was adapted to cope with experimental evolution datasets in which, in addition to expression data, genomics data were also available. Two publicly available datasets were analyzed: amikacin resistance in E. coli and coexisting ecotypes in E. coli. The results revealed both well-known and newly found molecular pathways involved in these phenotypes. Experimental evolution sometimes generates datasets containing mutator phenotypes, which have high mutation rates. These datasets are hard to analyze due to the large amount of noise (most mutations have no effect on the phenotype). To this end, IAMBEE was developed. IAMBEE is able to analyze genomic datasets from evolution experiments even if they contain mutator phenotypes. IAMBEE was tested using an E. coli evolution experiment in which cells were exposed to increasing concentrations of ethanol. The results were validated in the wet lab. In addition to methods for the analysis of causal mutations and mechanisms in bacteria, a method for the identification of causal molecular pathways in cancer was developed.
As bacteria and cancerous cells are both clonal, they can be treated similarly in this context. The big differences are the amount of data available (many more samples are available in cancer) and the fact that cancer is a complex and heterogeneous phenotype. Therefore, we developed SSA-ME, which makes use of the concept that a causal molecular pathway has at most one mutation in a cancerous cell (mutual exclusivity). However, enforcing this criterion is computationally hard. SSA-ME is designed to cope with this problem and to search for mutually exclusive patterns in relatively large datasets. SSA-ME was tested on cancer data from the TCGA PAN-cancer dataset. From the results we could, in addition to already known molecular pathways and mutated genes, predict the involvement of a few rarely mutated genes.
nrpages: 246; status: published
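The mutual exclusivity criterion described in this abstract (a causal pathway carries at most one mutation per sample) can be illustrated with a small scoring function. This is only a sketch of the concept, not the SSA-ME search procedure; the function name, the gene names and the toy data are all hypothetical.

```python
from collections import Counter

def coverage_and_exclusivity(genes, mutations):
    """Score a candidate gene set for mutual exclusivity.

    mutations: dict mapping gene name -> set of sample IDs carrying a mutation.
    Returns (number of covered samples,
             fraction of covered samples mutated in exactly one gene of the set).
    """
    # Count, for every sample, how many genes of the set are mutated in it.
    hits = Counter(s for g in genes for s in mutations.get(g, set()))
    covered = len(hits)
    if covered == 0:
        return 0, 0.0
    exactly_once = sum(1 for c in hits.values() if c == 1)
    return covered, exactly_once / covered

# Hypothetical toy data: GENE_A and GENE_B never co-occur; GENE_C overlaps GENE_A.
toy_mutations = {
    "GENE_A": {"s1", "s2"},
    "GENE_B": {"s3", "s4"},
    "GENE_C": {"s1", "s2", "s3"},
}
```

A set such as GENE_A plus GENE_B covers many samples with every covered sample hit exactly once, which is the pattern a mutual exclusivity search looks for. The hard part, as the abstract notes, is searching over the very large number of candidate gene sets, which this sketch does not attempt.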

    Proteomic approaches for quantitative cancer cell signaling


    Of yeast and men: Dissecting the interaction between fungi and immune response

    The work reported in this thesis aims to understand the mechanisms governing the interaction between cells of the immune system and microorganisms, using a Systems Biology approach. This work combines traditional laboratory work and transcriptional analysis with computational and bioinformatic methods in order to understand the response of dendritic cells to fungi, in particular to the non-pathogenic yeast Saccharomyces cerevisiae. This work has two main goals: first, the development of an analytical approach able to facilitate the interpretation of data obtained from high-throughput systems (such as microarrays or proteomics) and to increase the comparability of different experiments. Alongside it, a model was developed to represent the signaling networks induced by the immune response, from receptor recognition to cellular activation. Second, through the integration of the different results, the ability of dendritic cells to discriminate between pathogenic and non-pathogenic microorganisms was investigated. By combining transcriptional analysis with experiments aimed at understanding the role of the different receptors in fungal recognition and the immune response induced by dendritic cells, we identified different response mechanisms to C. albicans and S. cerevisiae. We also observed that the interaction between spores and yeasts is crucial for the commensalism of S. cerevisiae. Integrating a Systems Biology approach with functional data offers new tools for understanding the mechanisms of virulence associated with fungi: not only dimorphism but also the use of different receptors on the surface of dendritic cells can mediate antigen presentation, condition the type of adaptive response and, ultimately, favor commensalism or infection.
The research presented here aimed at using a Systems Biology approach to understand the mechanism governing a fruitful interaction between microbes and the human immune system. Systems Biology requires the acquisition of information on the different levels of regulation of a biological system and its integration into models that could predict the outcome of stimuli and of changes in the variables controlling the dynamic nature of the system. This work combined traditional wet-lab work and genome-wide analyses of transcription and gene regulation with computational and bioinformatic methods to dissect the response of dendritic cells (DCs) to fungi, in particular to the harmless Saccharomyces cerevisiae. This work had two main goals: first, the implementation of an analytical approach that would facilitate the interpretation of the ‘-omics’ results and increase the comparability between different data sets, lessening the problems associated with the use of different types of data and array platforms, together with the development of a new pathway structure to allow temporal dissection of the immune response associated with pattern recognition receptor sensing; second, by integration of the different results, to investigate the immune response that discriminates between friends and foes. Combining transcriptional analysis with receptor-specific blocking and cytokine production assays, we determined that DCs respond differently to C. albicans and S. cerevisiae and that, in the latter case, the interplay between spores and yeasts is crucial for the commensalism of S. cerevisiae.
The integration of a Systems Biology approach with functional data offers new interpretive clues to the mechanisms of fungal virulence: rather than dimorphism per se, the engagement of different recognition receptors on DCs might select the mode of fungal internalization and antigen presentation, condition the nature of the T-helper response and, ultimately, favor saprophytism or infection.

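    Characterization of the microbiota of Serpa cheese and selection of native strains with probiotic aptitude (original title: Caracterización de la microbiota de queso Serpa y selección de cepas nativas con aptitud probiótica)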

    Serpa cheese, a Protected Designation of Origin (PDO) product, is a ripened cheese made from raw ewe's milk and produced in the south of Portugal (Alentejo), in a geographical production area established in Regulatory Decree No. 39/87. Although it is a highly appreciated cheese and important to the local economy, there are no consistent data on the indigenous microbial community involved. The presence of this microflora is essential to the final quality, safety and authenticity of the product. This work aimed to study the dominant microbial populations acting during cheese making. This was done through a combination of conventional culturing and molecular techniques, in order to establish the most influential strains. The identified and representative strains were characterized in terms of their bioactive potential. The total count of mesophilic bacteria at the end of ripening was 8.5 log cfu/g, with lactic acid bacteria being the predominant microbial group, followed by enterobacteria and yeasts. “Lactobacillus paracasei/casei” was the main species among the former and “Hafnia alvei” among the enterobacteria, while “Debaryomyces hansenii” and “Kluyveromyces marxianus” predominated among the yeasts. The results obtained by high-throughput sequencing reveal the genus “Lactococcus”, followed by the genera “Leuconostoc” and “Lactobacillus”. Considering the probiotic characteristics studied, three strains were selected, “Lb. brevis” C1Lb21, “Lb. plantarum” G1Lb5 and “Lb. pentosus” G4Lb7, because they were safe and showed good tolerance to the conditions of the gastrointestinal tract (GIT) and the ability to colonize the intestine.
Serpa cheese, a Protected Designation of Origin (PDO) product, is a ripened raw ewe's milk cheese produced in the south of Portugal (Alentejo), in a geographical area of production established in Regulatory Decree No. 39/87. Despite this being a highly appreciated cheese of great importance to the local economy, there are no consistent data, obtained with molecular methods, on the indigenous microbial community involved. The presence of this microflora is essential to the final quality, safety and authenticity of the product. Thus, this work aimed to study the dominant microbial populations that act during cheese making. This was done through a combination of conventional cultivation and molecular techniques, in order to establish the most influential strains. The identified and representative strains were characterized in terms of their bioactive potential. The total amount of mesophilic bacteria at the end of ripening was, on average, 8.5 log cfu/g, with lactic acid bacteria being the predominant microbial group, followed by Enterobacteria and yeasts. “Lactobacillus paracasei/casei” was the main species among the former and “Hafnia alvei” among the Enterobacteria, while “Debaryomyces hansenii” and “Kluyveromyces marxianus” predominated among yeasts. The results obtained by high-throughput sequencing reveal the “Lactococcus” genus, followed by the “Leuconostoc” and “Lactobacillus” genera. Considering the probiotic characteristics studied, three potential probiotic strains (PPS), namely “Lb. brevis”, “Lb. plantarum” and “Lb. pentosus”, were selected, as they were safe and showed good tolerance to the stress conditions found in the gastrointestinal tract (GIT) and the ability to colonize the intestine.
Ministry of Agriculture and Rural Development of Portugal; European Agricultural Fund for Rural Development (FEDER), through the Portugal 2020-PDR partnership agreement: project “SerpaFlora - Valorización de la flora nativa del queso Serpa” (PDR2020-101-031017

    A synthetic biology approach to monitoring transient interactions between cancer and immune cells

    Immune cells play an important role in tumour growth and progression, as well as in establishment at metastatic sites. Although the immune system is inherently designed to locate, target and eliminate malignant cells, evolutionary processes within a host allow tumourigenic cells to develop mechanisms and pathways to avoid immune recognition. There is a substantial amount of knowledge on how particular immune cell subtypes contribute to cancer growth and progression. Specifically, macrophages play an important role in mitigating the immune response and inducing an anti-inflammatory response. For this reason, macrophages are potential new therapeutic targets. However, knowledge of the underlying mechanisms is limited due to the absence of robust tools for studying transient cell-cell interactions between cancer cells and macrophages in the tumour microenvironment. Recent advances in synthetic biology have introduced a vast array of tools, particularly synthetic receptors, which have found a broad range of applications in biosensing. One such tool is the synNotch receptor, which is derived from the core of the Notch receptor and is activated by cell-cell contact. Both the extracellular and intracellular domains of synNotch can be substituted with custom sensing and signal transduction domains to implement custom input/output circuits. The aim of this thesis is to repurpose synNotch to detect interactions between cancer cells and macrophages, in order to develop a robust tool to aid in studying the mechanisms of metastasis development and to bring insights into potential therapeutic targets.

    Identification of novel effector proteins in Eph receptor signaling (original title: Identification de nouvelles protéines effectrices dans la signalisation des récepteurs Eph)

    The cellular response to extracellular stimuli is often mediated by signaling pathways acting downstream of transmembrane receptors, such as receptor tyrosine kinases (RTKs). With fourteen members, the Eph receptor family is the largest family of RTKs in humans. Unlike other RTKs, the ligands of Eph receptors, the ephrins, are proteins associated with the cell membrane. Eph-ephrin signaling is therefore mainly involved in communication events that involve cell-cell contacts, such as cell migration, repulsion and cell-cell adhesion. These events are crucial for certain biological processes such as axon guidance and tissue organization in the developing organism and in the adult. Eph receptors are frequently overexpressed or deregulated in various cancers, particularly in the most aggressive and lethal ones. Recently, Eph-ephrin signaling has become an emerging new target for cancer treatment. Although the biological functions of Eph receptors have been widely studied, our understanding of the molecular mechanisms by which Eph receptors regulate precise cellular phenotypes remains incomplete. To better understand the Eph signaling system, my work focused on identifying new effector proteins downstream of Eph receptors and on studying their involvement in the functions regulated by Eph receptors. To better understand the signaling complexes associated with Eph receptors under native conditions, I applied a mass spectrometry (MS)-based approach, BioID proximity labeling. This allowed me to overcome the limitations of conventional affinity purification approaches for mapping protein-protein interactions of transmembrane receptors.
I obtained a signaling network dependent on the EphA4, -B2, -B3 and -B4 receptors, comprising 395 proteins, most of which had never been linked to Eph-dependent signaling. To test the biological relevance of the identified partners, I examined the contribution of 17 candidates using a loss-of-function approach in an Eph receptor-dependent cell sorting experiment. I was able to show that depletion of a few candidates, including the protein Par3, blocks cell sorting. Using affinity purification combined with MS, I also identified a signaling complex involving C-terminal SRC kinase (CSK), whose recruitment to Par3 complexes depends on Eph receptor signals. To better understand the protein interactions that follow Eph-ephrin binding, I performed TurboID experiments. These studies allowed me to identify protein complexes associated with the EphA4 receptor when it is bound to ephrin-B2. I also studied the protein-protein interactions dependent on the binding of the EphB2 receptor to ephrin-B1 and ephrin-B2. To explore whether the interaction of EphB2 with these two ligands can lead to a different reverse signaling response, I identified the partners of ephrin-B1/-B2 when stimulated by EphB2. Finally, I mapped the signaling networks dependent on wild-type or kinase-inactive EphA4 and EphB2 receptors, which allowed me to conclude that the loss of their catalytic activity led to major changes in the interactomes dependent on these receptors. Together, my results have helped to better define Eph receptor-dependent protein complexes.
My studies have led to a better understanding of the molecular mechanisms underlying Eph receptors and of their contribution to the process of tissue boundary formation, a process often disrupted in diseases such as cancer.
The cellular response to extracellular stimuli is often mediated by signaling pathways that act downstream of transmembrane receptors, such as receptor tyrosine kinases (RTKs). With fourteen members, the Eph family of RTKs is the largest in humans. In contrast to other RTKs, Eph receptor cognate ligands, ephrins, are tethered to the cell surface. As a result, Eph receptor-ephrin signaling is mainly involved in short-range cell-cell communication events that regulate cell migration, repulsion and cell-cell adhesion. These events are crucial in biological processes such as axon guidance and tissue boundary formation in developing and adult organisms. Eph receptors are frequently overexpressed or deregulated in a variety of human tumors, especially in the more aggressive and lethal ones. In recent years, the Eph-ephrin signaling system has become an emerging new target for cancer treatment. Although a plethora of Eph receptor biological functions have been extensively studied, we still have only a vague idea of the molecular mechanisms of Eph receptor signal transduction that underlie how Eph receptors regulate precise cellular phenotypes. To better understand the Eph receptor signaling system, my studies focused on the identification of novel Eph receptor downstream effector proteins and the determination of their requirement for Eph receptor-regulated functions. To unravel Eph receptor-associated signaling complexes under native conditions, I applied a mass spectrometry (MS)-based approach, namely BioID proximity labeling. This allowed me to overcome the limitations of conventional affinity purification approaches for mapping protein-protein interactions of transmembrane receptors.
I obtained a composite signaling network from EphA4, -B2, -B3 and -B4 receptors that comprises 395 proteins, most of which were not previously linked to Eph signaling. To test the biological relevance of the identified Eph receptor proximity interactors, I examined the contribution of 17 candidates using a loss-of-function approach in an Eph receptor-dependent cell sorting assay. I showed that depletion of a few candidates, including the signaling scaffold Par3, blocks Eph receptor-dependent cell sorting. Using affinity purification combined with MS, I further delineated a signaling complex involving C-terminal SRC kinase (CSK), whose recruitment to Par3 complexes is dependent on Eph receptor signals. To further elucidate Eph receptor-centric signaling complexes that are formed upon ephrin binding and are affected by Eph receptor catalytic activity, I performed TurboID experiments. I systematically mapped ligand stimulation-dependent signaling networks downstream of EphA4 and EphB2 receptors. I dissected the impact of ephrin-B2 stimulation on the formation of EphA4-nucleated proximal protein complexes. Moreover, I showed the differential recruitment of EphB2 partners upon receptor binding to two ligands of the same subclass, ephrin-B1 and ephrin-B2. To explore whether the EphB2 interactions with these two ephrin-B ligands elicit different reverse signaling responses, I delineated ephrin-B1/-B2 proximity partners recruited upon EphB2 stimulation. I also determined that the kinase domain of EphA4/-B2 plays a major role in determining the composition of signaling networks around the receptors, as a loss of catalytic activity led to a drastic decrease in the number of interactors with the receptors. Collectively, my definition of Eph receptor signaling networks sheds light on physiologically relevant Eph receptor-centered protein complexes that occur in living cells.
These studies will lead to a better understanding of the mechanisms by which Eph receptors transmit signals at the membrane and give insight into how Eph receptor-mediated signaling pathways contribute to boundary formation, a process often disrupted in diseases like cancer.

    Pacific Symposium on Biocomputing 2023

    The Pacific Symposium on Biocomputing (PSB) 2023 is an international, multidisciplinary conference for the presentation and discussion of current research in the theory and application of computational methods in problems of biological significance. Presentations are rigorously peer reviewed and are published in an archival proceedings volume. PSB 2023 will be held on January 3-7, 2023 in Kohala Coast, Hawaii. Tutorials and workshops will be offered prior to the start of the conference.
PSB 2023 will bring together top researchers from the US, the Asian Pacific nations, and around the world to exchange research results and address open issues in all aspects of computational biology. It is a forum for the presentation of work in databases, algorithms, interfaces, visualization, modeling, and other computational methods, as applied to biological problems, with emphasis on applications in data-rich areas of molecular biology.
The PSB has been designed to be responsive to the need for critical mass in sub-disciplines within biocomputing. For that reason, it is the only meeting whose sessions are defined dynamically each year in response to specific proposals. PSB sessions are organized by leaders of research in biocomputing's 'hot topics.' In this way, the meeting provides an early forum for serious examination of emerging methods and approaches in this rapidly changing field.

    Comparative genomics of Dothideomycete fungi

    Fungi are a diverse group of eukaryotic micro-organisms particularly suited for comparative genomics analyses. Fungi are important to industry and fundamental science, and many of them are notorious pathogens of crops, thereby endangering global food supply. Dozens of fungi have been sequenced in the last decade and, with the advances of next-generation sequencing, thousands of new genome sequences will become available in coming years. In this thesis I have used bioinformatics tools to study different biological and evolutionary processes in various genomes, with a focus on the genomes of the Dothideomycete fungi Cladosporium fulvum, Dothistroma septosporum and Zymoseptoria tritici. Chapter 1 introduces the scientific disciplines of mycology and bioinformatics from a historical perspective. It exemplifies a typical whole-genome sequence analysis of a fungal genome, and focusses in particular on structural gene annotation and detection of transposable elements. In addition, it briefly reviews the microRNA pathway as known in animals and plants, in the context of the putative existence of similar yet subtly different small RNA pathways in other branches of the eukaryotic tree of life. Chapter 2 addresses the newly sequenced genomes of the closely related Dothideomycete plant pathogenic fungi Cladosporium fulvum and Dothistroma septosporum. Remarkably, it revealed a surprisingly high similarity at the protein level combined with striking differences at the DNA level, in gene repertoire and in gene expression. Most noticeably, the genome of C. fulvum appears to be at least twice as large, which is solely attributable to a much larger content of repetitive sequences. Chapter 3 describes a novel alignment-based fungal gene prediction method (ABFGP) that is particularly suitable for plastic genomes like those of fungi. It shows excellent performance when benchmarked on a dataset of 7,000 unigene-supported gene models from ten different fungi.
Applicability of the method was shown by revisiting the annotations of C. fulvum and D. septosporum and of various other fungal genomes from the first-generation sequencing era. Thousands of gene models were revised in each of the gene catalogues, revealing a correlation with the quality of the genome assembly and with the sequencing strategies used in the sequencing centres, and highlighting different types of errors in different annotation pipelines. Chapter 4 focusses on the unexpectedly high number of gene models identified by ABFGP that align well to informant genes, but only when frame shifts and in-frame stop codons are tolerated. These discordances could represent sequence errors (SEs) and/or disruptive mutations (DMs) that caused these truncated and erroneous gene models. We revisited the same fungal gene catalogues as in chapter 3, confirmed SEs by resequencing and successively removed those, yielding a large, high-confidence dataset of nearly 1,000 pseudogenes caused by DMs. This dataset of fungal pseudogenes, containing genes listed as bona fide genes in current gene catalogues, does not correspond to various observations previously made on fungal pseudogenes. Moreover, the degree of pseudogenization, which shows up to a ten-fold variation between the least and the most affected species, is generally higher in species that reproduce asexually than in those that also reproduce sexually. Chapter 5 describes explorative and comparative genomics analyses revealing the presence of introner-like elements (ILEs) in various Dothideomycete fungi, including Zymoseptoria tritici, in which they had not yet been identified although its genome sequence has been publicly available for several years. ILEs combine hallmark intron properties with the apparent capability of multiplying themselves as repetitive sequences. ILEs strongly associate with events of intron gain, thereby delivering in silico proof of their mobility.
Phylogenetic analyses at the intra- and inter-species level showed that most ILEs are related and likely share common ancestry. Chapter 6 provides additional evidence that ILE multiplication strongly dominates over other types of intron duplication in fungi. The observed high rate of ILE multiplication followed by rapid sequence degeneration led us to hypothesize that multiplication of ILEs has been the major cause and mechanism of intron gain in fungi, and we speculate that this could be generalized to all eukaryotes. Chapter 7 describes a new strategy for miRNA hairpin prediction using statistical distributions of the observed biological variation of properties (descriptors) of known miRNA hairpins. We show that the method outperforms miRNA prediction by previous, conventional methods that usually apply threshold filtering. Using this method, several novel candidate miRNAs were assigned in the genomes of Caenorhabditis elegans and two human viruses. Although this chapter does not deal with fungi, the study does provide a flexible method to find evidence for the existence of a putative miRNA-like pathway in fungi. Chapter 8 provides a general discussion on the advent of bioinformatics in mycological research and its implications. It highlights the necessity of a priori planning and integration of functional analysis and bioinformatics in order to achieve scientific excellence, and describes possible scenarios for the near future of fungal (comparative) genomics research. Moreover, it discusses the intrinsic error rate in large-scale, automatically inferred datasets and the implications of using and comparing those.