26 research outputs found

    TurboFold: Iterative probabilistic estimation of secondary structures for multiple RNA sequences

    Abstract
    Background: The prediction of secondary structure, i.e. the set of canonical base pairs between nucleotides, is a first step in developing an understanding of the function of an RNA sequence. The most accurate computational methods predict conserved structures for a set of homologous RNA sequences. These methods usually suffer from high computational complexity. In this paper, TurboFold, a novel and efficient method for secondary structure prediction for multiple RNA sequences, is presented.
    Results: TurboFold takes, as input, a set of homologous RNA sequences and outputs estimates of the base pairing probabilities for each sequence. The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences. The extrinsic information is introduced as free energy modifications for base pairing in a partition function computation based on the nearest neighbor thermodynamic model. This process yields updated estimates of base pairing probability. The updated base pairing probabilities in turn are used to recompute extrinsic information, resulting in the overall iterative estimation procedure that defines TurboFold. TurboFold is benchmarked on a number of ncRNA datasets and compared against alternative secondary structure prediction methods. The iterative procedure in TurboFold is shown to improve estimates of base pairing probability with each iteration, though only small gains are obtained beyond three iterations. Secondary structures composed of base pairs with estimated probabilities higher than a significance threshold are shown to be more accurate for TurboFold than for alternative methods that estimate base pairing probabilities. TurboFold-MEA, which uses base pairing probabilities from TurboFold in a maximum expected accuracy algorithm for secondary structure prediction, has accuracy comparable to the best performing secondary structure prediction methods. The computational and memory requirements for TurboFold are modest and, in terms of sequence length and number of sequences, scale much more favorably than joint alignment and folding algorithms.
    Conclusions: TurboFold is an iterative probabilistic method for predicting secondary structures for multiple RNA sequences that efficiently and accurately combines the information from the comparative analysis between sequences with the thermodynamic folding model. Unlike most other multi-sequence structure prediction methods, TurboFold does not enforce strict commonality of structures and is therefore useful for predicting structures for homologous sequences that have diverged significantly. TurboFold can be downloaded as part of the RNAstructure package at http://rna.urmc.rochester.edu.

    Are we doing enough? Evaluation of the Polio Eradication Initiative in a district of Pakistan's Punjab province: a LQAS study

    Abstract
    Background: The success of the Global Polio Eradication Initiative was remarkable, but four countries (Afghanistan, Pakistan, India and Nigeria) never interrupted polio transmission. Pakistan reportedly achieved all milestones except interrupting virus transmission. The aim of the study was to establish valid and reliable estimates of routine oral polio vaccine (OPV) coverage, logistics management and the quality of monitoring systems in health facilities, OPV coverage during National Immunization Days (NIDs), and the quality of NIDs service delivery in static centers and mobile teams, and ultimately to provide scientific evidence for tailoring future interventions.
    Methods: A cross-sectional study using lot quality assessment sampling was conducted in the District Nankana Sahib of Pakistan's Punjab province. Twenty primary health centers and their catchment areas were selected randomly as 'lots'. The study involved the evaluation of 1080 children aged 12-23 months for routine OPV coverage, 20 health centers for logistics management and quality of monitoring systems, 420 households for NIDs OPV coverage, and 20 static centers and 20 mobile teams for quality of NIDs service delivery. Study instruments were designed according to WHO guidelines.
    Results: Five out of twenty lots were rejected for unacceptably low routine immunization coverage. The validity of coverage was questionable to the extent that all lots were rejected. Among the 54.1% who were able to present immunization cards, only 74.0% had valid immunization. Routine coverage was significantly associated with card availability and socioeconomic factors. The main reasons for routine immunization failure were absence of a vaccinator and unawareness of the need for immunization. Health workers (96.9%) were a major source of information. All of the 20 lots were rejected for poor compliance in logistics management and quality of monitoring systems. Mean compliance score and compliance percentage for logistics management were 5.4 ± 2.0 (scale 0-9) and 59.4%, while those for quality of monitoring systems were 3.3 ± 1.2 (scale 0-6) and 54.2%. Fifteen of the 20 lots were rejected for unacceptably low NIDs coverage by finger-mark. All of the 20 lots were rejected for poor NIDs service delivery (mean compliance score = 11.7 ± 2.1 [scale 0-16]; compliance percentage = 72.8%).
    Conclusion: Low coverage, both routine and during NIDs, and poor quality of logistics management, monitoring systems and NIDs service delivery were highlighted as major constraints in polio eradication, and these should be considered in prioritizing future strategies.
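
The lot-by-lot accept/reject decisions reported in this abstract follow the general logic of lot quality assessment sampling, which can be sketched roughly as below. This is a minimal illustration and not the study's actual procedure: the per-lot sample size of 54 is only inferred from 1080 children across 20 lots, and the decision value and the two coverage thresholds are hypothetical placeholders, since the abstract does not state them.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def classify_lot(n_unvaccinated: int, decision_value: int) -> str:
    """Reject a lot when the unvaccinated count exceeds the decision value."""
    return "reject" if n_unvaccinated > decision_value else "accept"


def misclassification_risks(n: int, d: int, high_cov: float, low_cov: float):
    """Return (alpha, beta) for sample size n and decision value d.

    alpha: probability that a lot at the high coverage level is rejected.
    beta:  probability that a lot at the low coverage level is accepted.
    """
    alpha = 1 - binom_cdf(d, n, 1 - high_cov)
    beta = binom_cdf(d, n, 1 - low_cov)
    return alpha, beta


if __name__ == "__main__":
    n = 54            # assumption: 1080 children split evenly over 20 lots
    d = 10            # hypothetical decision value
    high, low = 0.90, 0.70  # hypothetical coverage thresholds
    alpha, beta = misclassification_risks(n, d, high, low)
    print(f"alpha={alpha:.3f}, beta={beta:.3f}")
    print(classify_lot(n_unvaccinated=14, decision_value=d))
```

Raising the decision value lowers the chance of wrongly rejecting a well-covered lot but raises the chance of accepting a poorly covered one, which is the trade-off the sample size and thresholds are chosen to balance.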

    From their own perspective - constraints in the Polio Eradication Initiative: perceptions of health workers and managers in a district of Pakistan's Punjab province

    Abstract
    Background: The success of the Global Polio Eradication Initiative was remarkable, but four countries (Afghanistan, Pakistan, India and Nigeria) never interrupted polio transmission. Pakistan reportedly achieved all milestones except interrupting virus transmission. This paper describes the perceptions of health workers and managers regarding constraints in the Polio Eradication Initiative (PEI), ultimately to provide evidence for designing future interventions.
    Methods: A qualitative cross-sectional study using focus group discussions (FGDs) and in-depth interviews was conducted in the Nankana Sahib District of Pakistan's Punjab province. Study subjects included staff at all levels in the PEI at district headquarters, in all 4 tehsils (sub-districts) and at 20 randomly selected primary health centers. In total, 4 FGDs and 7 interview sessions were conducted; individual session summary notes were prepared and later synthesized, consolidated and subjected to conceptual analysis.
    Results: The main constraints identified in the study were the poor condition of the cold chain in all aspects, poor skills and a lack of authority in resource allocation and human resource management, limited advocacy and communication resources, a lack of skills and training among staff at all levels in the PEI/EPI in almost all aspects of the program, a deficiency of public health professionals, poor health services structure, administrative issues (including ineffective means of performance evaluation, bureaucratic and political influences, problems in vaccination areas and field programs, no birth records at health facilities, and poor linkage between different preventive programs), unreliable reporting and poor monitoring and supervision systems, limited use of local data for interventions, and unclear roles and responsibilities after decentralization.
    Conclusion: The study highlights various shortcomings and bottlenecks in the PEI, and the barriers identified should be considered in prioritizing future strategies.

    Comparative analysis of the transcriptome across distant species

    The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.

    Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis.

    Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis.

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006
    Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and unobserved mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.

    Targeted gene expression profiling predicts meningioma outcomes and radiotherapy responses

    Surgery is the mainstay of treatment for meningioma, the most common primary intracranial tumor, but improvements in meningioma risk stratification are needed and indications for postoperative radiotherapy are controversial. Here we develop a targeted gene expression biomarker that predicts meningioma outcomes and radiotherapy responses. Using a discovery cohort of 173 meningiomas, we developed a 34-gene expression risk score and performed clinical and analytical validation of this biomarker on independent meningiomas from 12 institutions across 3 continents (N = 1,856), including 103 meningiomas from a prospective clinical trial. The gene expression biomarker improved discrimination of outcomes compared with all other systems tested (N = 9) in the clinical validation cohort for local recurrence (5-year area under the curve (AUC) 0.81) and overall survival (5-year AUC 0.80). The increase in AUC compared with the standard of care, World Health Organization 2021 grade, was 0.11 for local recurrence (95% confidence interval 0.07 to 0.17, P < 0.001). The gene expression biomarker identified meningiomas benefiting from postoperative radiotherapy (hazard ratio 0.54, 95% confidence interval 0.37 to 0.78, P = 0.0001) and suggested postoperative management could be refined for 29.8% of patients. In sum, our results identify a targeted gene expression biomarker that improves discrimination of meningioma outcomes, including prediction of postoperative radiotherapy responses.