69 research outputs found

    Analisi della struttura genomica di Arabidopsis thaliana L. [Analysis of the genomic structure of Arabidopsis thaliana L.]

    This thesis provides insight into the genomic structure of plant species by examining the model organism Arabidopsis thaliana. The research followed three directions. First, the correlation between expression profile and structural properties such as sequence length and GC (guanine + cytosine) content was studied. The results revealed that in plants highly expressed genes undergo selection for miniaturization, probably owing to the need to minimize the cost of transcription and translation. Next, the usage of synonymous codons (i.e. nucleotide triplets that code for the same amino acid) was investigated in 15 Arabidopsis tissues. The results showed that genes specifically expressed in certain tissues use a definite set of codons, whereas more widely expressed genes feature a codon composition that is, to a certain extent, a compromise between the codons used in the individual tissues. Finally, nucleotide composition as a function of position in the gene was studied in two monocots and two dicots. Compositional gradients were found for all the bases analysed. The observed trends, mostly describable with a linear model, revealed marked differences between monocots and dicots.
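
    The two sequence properties named in this abstract, GC content and synonymous codon usage, can be illustrated with a minimal Python sketch. The function names and the toy sequence are hypothetical and are not taken from the thesis; the code only shows the basic definitions, not the analysis pipeline the author used.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop 1 or 2 leftover bases, if any
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


# Toy coding sequence (hypothetical): Met, three synonymous Ala codons, stop.
toy_cds = "ATGGCCGCTGCGTAA"
toy_gc = gc_content(toy_cds)
toy_codons = codon_counts(toy_cds)
```

    In the toy sequence, GCC, GCT and GCG all encode alanine, so their relative counts are the kind of quantity a codon usage comparison between tissues would be built on.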

    Seforta, an integrated tool for detecting the signature of selection in coding sequences

    Background: The majority of amino acid residues are encoded by more than one codon, and a bias in the usage of such synonymous codons has been repeatedly demonstrated. One assumption is that this phenomenon has evolved to improve the efficiency of translation by reducing the time required for the recruitment of isoacceptors. The most abundant tRNA species are preferred at sites on the protein which are key for its functionality, a behavior which has been termed “translational accuracy”. Although translational accuracy has been observed in many species, no public domain software has yet been made available to quantify it. Findings: We present here Seforta (Selection for Translational Accuracy), a program designed to quantify translational accuracy. It searches for synonymous codon usage bias in both conserved and non-conserved regions of coding sequences and computes a cumulative odds ratio and a Z-score. The specification of a set of preferred codons is desirable, but the program can also generate these. Finally, a randomization protocol calculates the probability that preferred codon combinations could have arisen by chance. Conclusions: Seforta is the first public domain program able to quantify translational accuracy. It comes with a simple graphical user interface and can be readily installed and adjusted to the user's requirements.
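
    The abstract does not say how Seforta combines its per-table results into a cumulative odds ratio. As an illustration only, the sketch below uses the standard Mantel-Haenszel pooled estimator over 2x2 tables of preferred versus non-preferred codons at conserved versus non-conserved sites. The choice of estimator, the function name and the table layout are assumptions, not Seforta's documented behaviour.

```python
from typing import Iterable, Tuple

# One 2x2 table per stratum (for example, per amino acid or per gene):
# (a, b, c, d) = (conserved & preferred, conserved & non-preferred,
#                 non-conserved & preferred, non-conserved & non-preferred)
Table = Tuple[int, int, int, int]


def mantel_haenszel_or(tables: Iterable[Table]) -> float:
    """Pooled odds ratio across strata using the Mantel-Haenszel estimator.

    A value above 1 means preferred codons are enriched at conserved sites.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined (no b*c contribution)")
    return numerator / denominator


# Hypothetical counts for two strata.
example_tables = [(10, 5, 5, 10), (8, 4, 4, 8)]
pooled_or = mantel_haenszel_or(example_tables)
```

    Pooling within strata, rather than summing all counts into a single table, avoids confounding by differences in amino acid composition between conserved and non-conserved regions.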

    Mutational biases and selective forces shaping the structure of Arabidopsis genes

    Recently, features of gene expression profiles have been associated with structural parameters of gene sequences in organisms representing a diverse set of taxa. The emerging picture indicates that natural selection, mediated by gene expression profiles, has a significant role in determining genic structures. However, the situation is less clear in plants, as the available data indicate that the effect of natural selection mediated by gene expression is very weak. Moreover, the direction of the patterns in plants appears to contradict those observed in animal genomes. In the present work we analysed expression data for >18,000 Arabidopsis genes retrieved from public datasets obtained with different technologies (MPSS and high-density chip arrays) and compared them with gene parameters. Our results show that the impact of natural selection mediated by expression on gene sequences is significant and distinguishable from the effects of regional mutational biases. In addition, we provide evidence that the level and the breadth of gene expression are related in opposite ways to many structural parameters of gene sequences. Higher levels of expression are associated with smaller transcripts, consistent with the need to reduce the costs of both transcription and translation. Expression breadth, however, shows a contrasting pattern, i.e. longer genes have a higher breadth of expression, possibly to ensure those structural features associated with gene plasticity. Based on these results, we propose that the specific balance between these two selective forces plays a significant role in shaping the structure of Arabidopsis genes.

    Genome-wide characterisation and expression profile of the grapevine ATL ubiquitin ligase family reveal biotic and abiotic stress-responsive and development-related members

    The Arabidopsis Tóxicos en Levadura (ATL) protein family is a class of E3 ubiquitin ligases with a characteristic RING-H2 Zn-finger structure that mediates diverse physiological processes and stress responses in plants. We carried out a genome-wide survey of grapevine (Vitis vinifera L.) ATL genes and retrieved 96 sequences containing the canonical ATL RING-H2 domain. We analysed their genomic organisation, gene structure and evolution, protein domains and phylogenetic relationships. Clustering revealed several clades, as already reported in Arabidopsis thaliana and rice (Oryza sativa), with an expanded subgroup of grapevine-specific genes. Most of the grapevine ATL genes lacked introns and were scattered among the 19 chromosomes, with a high level of duplication retention. Expression profiling revealed that some ATL genes are expressed specifically during early or late development and may participate in the juvenile to mature plant transition, whereas others may play a role in pathogen and/or abiotic stress responses, making them key candidates for further functional analysis. Our data offer the first genome-wide overview and annotation of the grapevine ATL family, and provide a basis for investigating the roles of specific family members in grapevine physiology and stress responses, as well as potential biotechnological applications.

    GRACy: a tool for analysing human cytomegalovirus sequence data

    Modern DNA sequencing has instituted a new era in human cytomegalovirus (HCMV) genomics. A key development has been the ability to determine the genome sequences of HCMV strains directly from clinical material. This involves the application of complex and often non-standardized bioinformatics approaches to analysing data of variable quality in a process that requires substantial manual intervention. To relieve this bottleneck, we have developed GRACy (Genome Reconstruction and Annotation of Cytomegalovirus), an easy-to-use toolkit for analysing HCMV sequence data. GRACy automates and integrates modules for read filtering, genotyping, genome assembly, genome annotation, variant analysis and data submission. These modules were tested extensively on simulated and experimental data and outperformed generic approaches. GRACy is written in Python and is embedded in a graphical user interface with all required dependencies installed by a single command. It runs on the Linux operating system, and is designed to allow the future implementation of a cross-platform version. GRACy is distributed under a GPL 3.0 license and is freely available at https://bioinformatics.cvr.ac.uk/software/ with the manual and a test dataset.

    Comprehensive workflow for the genome-wide identification and expression meta-analysis of the ATL E3 ubiquitin ligase gene family in grapevine

    Classification and nomenclature of genes in a family can significantly contribute to the description of the diversity of encoded proteins and to the prediction of family functions based on several features, such as the presence of sequence motifs or of particular sites for post-translational modification and the expression profile of family members in different conditions. This work describes a detailed protocol for gene family characterization. Here, the procedure is applied to the characterization of the Arabidopsis Tóxicos en Levadura (ATL) E3 ubiquitin ligase family in grapevine. The methods include the genome-wide identification of family members, the characterization of gene localization, structure, and duplication, the analysis of conserved protein motifs, the prediction of protein localization and phosphorylation sites, and gene expression profiling across the family in different datasets. Such a procedure, which could be extended to further analyses depending on experimental purposes, could be applied to any gene family in any plant species for which genomic data are available. It provides valuable information for identifying interesting candidates for functional studies, giving insights into the molecular mechanisms of plant adaptation to the environment. The work was supported by the University of Verona within the frame of Joint Project 2014 (Characterization of the ATL gene family in grapevine and of its involvement in resistance to Plasmopara viticola).

    Whole-genome approach to assessing human cytomegalovirus dynamics in transplant patients undergoing antiviral therapy

    Human cytomegalovirus (HCMV) is the most frequent cause of opportunistic viral infection following transplantation. Viral factors of potential clinical importance include the selection of mutants resistant to antiviral drugs and the occurrence of infections involving multiple HCMV strains. These factors are typically addressed by analyzing relevant HCMV genes by PCR and Sanger sequencing, which involves independent assays of limited sensitivity. To assess the dynamics of viral populations with high sensitivity, we applied high-throughput sequencing coupled with HCMV-adapted target enrichment to samples collected longitudinally from 11 transplant recipients (solid organ, n=9, and allogeneic hematopoietic stem cell, n=2). Only the latter presented multiple-strain infections. Four cases presented resistance mutations (n=6), two (A594V and L595S) at high (100%) and four (V715M, V781I, A809V and T838A) at low (<25%) frequency. One allogeneic hematopoietic stem cell transplant recipient presented up to four resistance mutations, each at low frequency. The use of high-throughput sequencing to monitor mutations and strain composition in people at risk of HCMV disease is of potential value in helping clinicians implement the most appropriate therapy.

    Identifying high-confidence variants in human cytomegalovirus genomes sequenced from clinical samples

    Understanding the intrahost evolution of viral populations has implications for pathogenesis, diagnosis and treatment, and has recently advanced impressively owing to developments in high-throughput sequencing. However, the underlying analyses are very sensitive to sources of bias, error and artefact in the data, and it is important that these are addressed adequately if robust conclusions are to be drawn. The key factors include: (i) determining the number of viral strains present in the sample analysed; (ii) monitoring the extent to which the data represent these strains and assessing the quality of these data; (iii) dealing with the effects of cross-contamination; and (iv) ensuring that the results are reproducible. We investigated these factors by generating sequence datasets, including biological and technical replicates, directly from clinical samples obtained from a small cohort of patients who had been infected congenitally with the herpesvirus human cytomegalovirus, with the aim of developing a strategy for identifying high-confidence intrahost variants. We found that such variants were few in number and typically present in low proportions, and concluded that human cytomegalovirus exhibits a very low level of intrahost variability. In addition to clarifying the situation regarding human cytomegalovirus, our strategy has wider applicability to understanding the intrahost variability of other viruses.