256 research outputs found

    Monte Carlo Procedure for Protein Design

    A new method for sequence optimization in protein models is presented. The approach, which has inherited its basic philosophy from recent work by Deutsch and Kurosky [Phys. Rev. Lett. 76, 323 (1996)] by maximizing conditional probabilities rather than minimizing energy functions, is based upon a novel and very efficient multisequence Monte Carlo scheme. By construction, the method ensures that the designed sequences represent good folders thermodynamically. A bootstrap procedure for the sequence space search is devised making very large chains feasible. The algorithm is successfully explored on the two-dimensional HP model with chain lengths N=16, 18 and 32. Comment: 7 pages LaTeX, 4 Postscript figures; minor change
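    For readers unfamiliar with the two-dimensional HP model named in this abstract, the following is a minimal sketch of how a conformation is commonly scored. It assumes the usual convention of -1 per contact between hydrophobic (H) monomers that are lattice neighbours but not adjacent along the chain; the abstract itself does not state the energy function, and the function name `hp_energy` is hypothetical. It is not the multisequence Monte Carlo scheme of the paper.

```python
from typing import List, Tuple


def hp_energy(sequence: str, coords: List[Tuple[int, int]]) -> int:
    """Energy of an HP chain on the square lattice.

    Assumed convention (not taken from the abstract): each pair of H monomers
    that are nearest neighbours on the lattice, but not consecutive along the
    chain, contributes -1.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    # Map each occupied lattice site to the index of the monomer sitting on it.
    position = {xy: i for i, xy in enumerate(coords)}

    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that every neighbouring pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


# A four-monomer chain folded around a unit square: the two H ends touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_energy("HPPH", square))
```

    The function does not check that consecutive monomers occupy neighbouring sites; a fuller version would validate chain connectivity as well.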

    Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data

    Background: MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein coding genes. miRNAs play a very important role in diverse biological processes and various diseases. Many algorithms are able to predict miRNA genes and their targets, but their transcription regulation is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein coding genes) are co-transcribed with their host genes and that most intergenic miRNAs are transcribed from their own RNA polymerase II (Pol II) promoter. However, the length of the primary transcripts and the promoter organization are currently unknown. Methodology: We performed Pol II chromatin immunoprecipitation (ChIP)-chip using a custom array surrounding regions of known miRNA genes. To identify the true core transcription start sites of the miRNA genes we developed a new tool (CPPP). We showed that miRNA genes can be transcribed from promoters located several kilobases away and that their promoters share the same general features as those of protein coding genes. Finally, we found evidence that as many as 26% of the intragenic miRNAs may be transcribed from their own unique promoters. Conclusion: miRNA promoters have similar features to those of protein coding genes, but miRNA transcript organization is more complex. © 2009 Corcoran et al.

    Identification of TNF-alpha-Responsive Promoters and Enhancers in the Intestinal Epithelial Cell Model Caco-2

    The Caco-2 cell line is one of the most important in vitro models for enterocytes, and is used to study drug absorption and disease, including inflammatory bowel disease and cancer. In order to use the model optimally, it is necessary to map its functional entities. In this study, we have generated genome-wide maps of active transcription start sites (TSSs), and active enhancers in Caco-2 cells with or without tumour necrosis factor (TNF)-α stimulation to mimic an inflammatory state. We found 520 promoters that significantly changed their usage level upon TNF-α stimulation; of these, 52% are not annotated. A subset of these has the potential to confer change in protein function due to protein domain exclusion. Moreover, we locate 890 transcribed enhancer candidates, where ∼50% are changing in usage after TNF-α stimulation. These enhancers share motif enrichments with similarly responding gene promoters. As a case example, we characterize an enhancer regulating the laminin-5 γ2-chain (LAMC2) gene by nuclear factor (NF)-κB binding. This report is the first to present comprehensive TSS and enhancer maps over Caco-2 cells, and highlights many novel inflammation-specific promoters and enhancers.

    The PAZAR database of gene regulatory information coupled to the ORCA toolkit for the study of regulatory sequences

    The PAZAR database unites independently created and maintained data collections of transcription factor and regulatory sequence annotation. The flexible PAZAR schema permits the representation of diverse information derived from experiments ranging from biochemical protein–DNA binding to cellular reporter gene assays. Data collections can be made available to the public, or restricted to specific system users. The data ‘boutiques’ within the shopping-mall-inspired system facilitate the analysis of genomics data and the creation of predictive models of gene regulation. Since its initial release, PAZAR has grown in terms of data, features and through the addition of an associated package of software tools called the ORCA toolkit (ORCAtk). ORCAtk allows users to rapidly develop analyses based on the information stored in the PAZAR system. PAZAR is available at http://www.pazar.info. ORCAtk can be accessed through convenient buttons located in the PAZAR pages or via our website at http://www.cisreg.ca/ORCAtk

    Genome Desertification in Eutherians: Can Gene Deserts Explain the Uneven Distribution of Genes in Placental Mammalian Genomes?

    The evolution of genome size, as well as the structure and organization of genomes, is among the key questions of genome biology. Here we show, based on a comparative analysis of 30 genomes, that there is generally a tight correlation between the number of genes per chromosome and the length of the respective chromosome in eukaryotic genomes. The surprising exceptions to this pattern are placental mammalian genomes. We identify the number and, more importantly, the uneven distribution of gene deserts among chromosomes, i.e., long (>500 kb) stretches of DNA that do not encode genes, as the main contributing factor for the observed anomaly of eutherian genomes. Gene-rich placental mammalian chromosomes have smaller proportions of gene deserts and vice versa. We show that the uneven distribution of gene deserts is a derived character state of eutherians. The functional and evolutionary significance of this particular feature of eutherian genomes remains to be explained.

    European breast surgical oncology certification theoretical and practical knowledge curriculum 2020

    The Breast Surgery theoretical and practical knowledge curriculum comprehensively describes the knowledge and skills expected of a fully trained surgeon practicing in the European Union and European Economic Area (EEA). It forms part of a range of factors that contribute to the delivery of high quality cancer care. It has been developed by a panel of experts from across Europe and has been validated by professional breast surgery societies in Europe. The curriculum maps closely to the syllabus of the Union of European Medical Specialists (UEMS) Breast Surgery Exam, the UK FRCS (breast specialist interest) curriculum and other professional standards across Europe and globally (USA Society of Surgical Oncology, SSO). It is envisioned that this will serve as the basis for breast surgery training, examination and accreditation across Europe to harmonise and raise standards as breast surgery develops as a separate discipline from its parent specialties (general surgery, gynaecology, surgical oncology and plastic surgery). The curriculum is not static but will be revised and updated by the curriculum development group of the European Breast Surgical Oncology Certification group (BRESO) every 2 years.

    Genomic and Transcriptional Co-Localization of Protein-Coding and Long Non-Coding RNA Pairs in the Developing Brain

    Besides protein-coding mRNAs, eukaryotic transcriptomes include many long non-protein-coding RNAs (ncRNAs) of unknown function that are transcribed away from protein-coding loci. Here, we have identified 659 intergenic long ncRNAs whose genomic sequences individually exhibit evolutionary constraint, a hallmark of functionality. Of this set, those expressed in the brain are more frequently conserved and are significantly enriched with predicted RNA secondary structures. Furthermore, brain-expressed long ncRNAs are preferentially located adjacent to protein-coding genes that are (1) also expressed in the brain and (2) involved in transcriptional regulation or in nervous system development. This led us to the hypothesis that spatiotemporal co-expression of ncRNAs and nearby protein-coding genes represents a general phenomenon, a prediction that was confirmed subsequently by in situ hybridisation in developing and adult mouse brain. We provide the full set of constrained long ncRNAs as an important experimental resource and present, for the first time, substantive and predictive criteria for prioritising long ncRNA and mRNA transcript pairs when investigating their biological functions and contributions to development and disease.

    Algorithms for learning parsimonious context trees

    Parsimonious context trees, PCTs, provide a sparse parameterization of conditional probability distributions. They are particularly powerful for modeling context-specific independencies in sequential discrete data. Learning PCTs from data is computationally hard due to the combinatorial explosion of the space of model structures as the number of predictor variables grows. Under the score-and-search paradigm, the fastest algorithm for finding an optimal PCT, prior to the present work, is based on dynamic programming. While the algorithm can handle small instances fast, it becomes infeasible already when there are half a dozen four-state predictor variables. Here, we show that common scoring functions enable the use of new algorithmic ideas, which can significantly expedite the dynamic programming algorithm on typical data. Specifically, we introduce a memoization technique, which exploits regularities within the predictor variables by equating different contexts associated with the same data subset, and a bound-and-prune technique, which exploits regularities within the response variable by pruning parts of the search space based on score upper bounds. On real-world data from recent applications of PCTs within computational biology the ideas are shown to reduce the traversed search space and the computation time by several orders of magnitude in typical cases. Peer reviewed

    Reverse Engineering the Yeast RNR1 Transcriptional Control System

    Transcription is controlled by multi-protein complexes binding to short non-coding regions of genomic DNA. These complexes interact combinatorially. A major goal of modern biology is to provide simple models that predict this complex behavior. The yeast gene RNR1 is transcribed periodically during the cell cycle. Here, we present a pilot study to demonstrate a new method of deciphering the logic behind transcriptional regulation. We took regular samples from cell cycle synchronized cultures of Saccharomyces cerevisiae and extracted nuclear protein. We tested these samples to measure the amount of protein that bound to seven different 16 base pair sequences of DNA that have been previously identified as protein binding locations in the promoter of the RNR1 gene. These tests were performed using surface plasmon resonance. We found that the surface plasmon resonance signals showed significant variation throughout the cell cycle. We correlated the protein binding data with previously published mRNA expression data and interpreted this to show that transcription requires protein bound to a particular site and either five different sites or one additional site. We conclude that this demonstrates the feasibility of this approach to decipher the combinatorial logic of transcription.
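    The combinatorial rule stated in this abstract can be written as a Boolean expression. The sketch below is one reading of it, assuming that "five different sites" means all five must be occupied together. The site labels S1 to S7 are hypothetical placeholders, since the abstract does not name the seven binding sites or say which one is the required site.

```python
from typing import Iterable

# Hypothetical labels for the seven 16 bp binding sites; the abstract gives no names.
REQUIRED_SITE = "S1"
FIVE_SITE_GROUP = frozenset({"S2", "S3", "S4", "S5", "S6"})
ALTERNATIVE_SITE = "S7"


def transcription_predicted(bound_sites: Iterable[str]) -> bool:
    """Return True when the occupancy pattern satisfies the stated rule:
    required site AND (all five grouped sites OR the one additional site).
    """
    bound = frozenset(bound_sites)
    return REQUIRED_SITE in bound and (
        FIVE_SITE_GROUP <= bound or ALTERNATIVE_SITE in bound
    )


# The required site plus the single additional site is enough under this reading.
print(transcription_predicted({"S1", "S7"}))
# Without the required site, no other combination suffices.
print(transcription_predicted({"S2", "S3", "S4", "S5", "S6", "S7"}))
```

    Treating occupancy as binary is a simplification: the study measured graded surface plasmon resonance signals, so a threshold would be needed before applying a rule of this form.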