2 research outputs found

    Searching for gene clusters related to virulence by coding sequence conservation

    Motivation: As the world population grows, so does the need to improve food production. One way to help is to fight the pathogens that affect major crops such as maize, wheat, barley and sugar cane. These pathogens include biotrophic parasites such as smut fungi, and the model system Ustilago maydis can be used to study how these microorganisms infect their hosts. U. maydis secretes protein effectors to infect its host, and at least 25% of them are known to be grouped in 13 different gene clusters. In addition to these characterized clusters, 7 new clusters have been described in the literature but not experimentally tested. The aim of this work is to find new clusters with features similar to those of the known ones (controls), mainly low conservation, that may affect the infection process.
    Methods: Candidate gene clusters were first identified from coding sequence conservation using the computational tool AnABlast [1], which highlighted genomic coding regions with a conservation signal similar to that of the controls. The candidates were then functionally annotated with Sma3s_v2 [2]. To select the best candidates, a principal component analysis (PCA) was performed using the following factors, which were trained with the controls: sequence conservation, obtained from a BLAST similarity search against closely related organisms (Ensembl Fungi phylogeny); expression data during infection; and the presence of a signal peptide (SignalP and TargetP), which effectors usually carry. A laboratory experiment is now under way to determine whether the chosen candidates affect pathogenicity, by deleting them through homologous recombination.
    Results: We identified 49 new clusters by comparing their coding signal with that of the known clusters. After the subsequent analysis, three of them, plus one from the literature, were chosen for laboratory testing of their virulence phenotype (swelling and tumors). In the PCA, our best candidate lies among the clusters previously described as pathogenic, and its genes encode secreted proteins with high expression levels.
    Conclusions: We propose that putative clusters of virulence sequences can be found with the strategy presented here. It could therefore constitute a new in silico approach for finding specific genes involved in different biological processes, such as infection.
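
The candidate-selection step described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the cluster names and feature values are invented, only the first principal component is computed (by power iteration, in plain Python), and candidates are simply ranked by their distance from the controls along that component, which is an assumed way of reading the PCA.

```python
import math

# Hypothetical feature rows: (conservation, expression, signal-peptide fraction).
# All names and values are invented for illustration; none come from the study.
clusters = {
    "control_1": (0.20, 8.0, 0.90),
    "control_2": (0.25, 7.5, 0.80),
    "control_3": (0.15, 9.0, 1.00),
    "cand_A": (0.22, 8.2, 0.85),
    "cand_B": (0.90, 1.0, 0.10),
    "cand_C": (0.85, 1.5, 0.00),
}


def standardize(rows):
    """Z-score every column so the factors share one scale (no constant columns)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def covariance(z):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Power iteration: approaches the leading eigenvector of cov (PC1)."""
    v = [1.0 / (i + 1) for i in range(len(cov))]
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


names = list(clusters)
z = standardize([clusters[n] for n in names])
pc1 = first_component(covariance(z))

# Project every cluster onto PC1.
scores = {n: sum(a * b for a, b in zip(row, pc1)) for n, row in zip(names, z)}

controls = [n for n in names if n.startswith("control")]
candidates = [n for n in names if n.startswith("cand")]
control_mean = sum(scores[n] for n in controls) / len(controls)

# Rank candidates by how close they sit to the controls along PC1.
ranked = sorted(candidates, key=lambda n: abs(scores[n] - control_mean))
print(ranked)
```

With these toy values the poorly conserved, highly expressed, secreted candidate (`cand_A`) ranks closest to the controls. The sign of a principal component is arbitrary, so only relative positions along it are meaningful.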

    Searching for novel genes and pseudogenes in the human Y chromosome based on ancestral coding signals

    Motivation: The human Y chromosome has several features that contribute to extreme variation: the lack of a homologous partner for crossing over, a high rate of sequence amplification and low evolutionary pressure [1]. For these reasons, we think the Y chromosome could be an ideal candidate for discovering new coding regions and fossil regions such as pseudogenes. Gene finding is one of the great successes of modern biology. However, in silico identification of small and complex coding sequences remains challenging. Jiménez et al. [2] developed AnABlast, a computational tool that has successfully uncovered new genes as well as fossil coding sequences. The program generates profiles of accumulated alignments of conserved coding signals using a low-stringency BLAST strategy [2].
    Methods: We used AnABlast to locate new coding regions in the Y chromosome. The AnABlast-generated profiles were then loaded into a genome browser along with other informative data such as repeats and RNA expression data. The resulting candidate list was complemented by careful BLAST, InterPro and peak analysis. We also checked each result in the Genome Data Viewer (GDV).
    Results: We identified several Y chromosome regions that meet the following requirements: (1) no previous annotation as pseudogenes, genes or non-coding regions (Ensembl track); (2) no previous annotation as interspersed repeats or low-complexity sequence (RepeatMasker track); and (3) evidence of expression (RNA-seq of testis). The best candidate for a new coding region is located at Y:9912876-9919657 (-). BLAST and InterPro analysis indicated similarity to serine proteases found in rodents and in other organisms such as Rousettus aegyptiacus (Egyptian fruit bat). In the GDV search, only the first exon of the bat gene was not found in our candidate. Even so, we found a methionine codon in our candidate (specifically, in the first exon). Furthermore, the Y chromosome carries a 5'-truncated copy of this region.
    Conclusions: We have found several Y chromosome regions that could be new coding genes or pseudogenes. This in silico study thus provides a powerful protocol for searching for novel genes and fossil regions across the whole human genome. Although we added several RNA-seq tracks showing that these regions are expressed, clinical trials should be performed to verify our candidates.
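
The three requirements listed in the Results amount to an interval-overlap filter, sketched below. This is an assumption about how such a filter could be written, not the authors' procedure (which was carried out in a genome browser): the `Interval` type, the function names and all coordinates are hypothetical, and strand is ignored for simplicity.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int
    end: int  # inclusive, same coordinate system as start


def overlaps(a, b):
    """True when two closed intervals share at least one position."""
    return a.start <= b.end and b.start <= a.end


def passes_filters(region, annotated, repeats, expressed):
    """Apply the three requirements to one candidate region."""
    no_annotation = not any(overlaps(region, x) for x in annotated)  # (1)
    no_repeat = not any(overlaps(region, x) for x in repeats)        # (2)
    has_expression = any(overlaps(region, x) for x in expressed)     # (3)
    return no_annotation and no_repeat and has_expression


# Toy tracks with invented coordinates (not real Y chromosome positions).
candidate_peaks = [Interval(1000, 1500), Interval(3000, 3400), Interval(5000, 5600)]
annotation_track = [Interval(2900, 3100)]   # stands in for the Ensembl track
repeat_track = [Interval(5500, 5700)]       # stands in for the RepeatMasker track
expression_track = [Interval(1200, 1300), Interval(3050, 3200), Interval(5100, 5200)]

kept = [p for p in candidate_peaks
        if passes_filters(p, annotation_track, repeat_track, expression_track)]
print(kept)
```

In this toy example only the first peak survives: the second overlaps an existing annotation and the third overlaps a repeat, even though both show expression.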