265 research outputs found

    PPCAS: Implementation of a Probabilistic Pairwise Model for Consistency-Based Multiple Alignment in Apache Spark

    Large-scale data processing techniques, currently known as Big Data, are used to manage the huge amount of data generated by sequencers. Although these techniques have significant advantages, few biological applications have adopted them. In bioinformatics, Multiple Sequence Alignment (MSA) tools are widely applied for evolution and phylogenetic analysis, and for homology and domain structure prediction. Highly rated MSA tools, such as MAFFT, ProbCons and T-Coffee (TC), use probabilistic consistency as a step prior to the progressive alignment stage in order to improve the final accuracy. In this paper, a novel approach named PPCAS (Probabilistic Pairwise model for Consistency-based multiple alignment in Apache Spark) is presented. PPCAS is based on the MapReduce processing paradigm in order to enable large datasets to be processed, with the aim of improving the performance and scalability of the original algorithm. This work was supported by the MEyC-Spain [contract TIN2014-53234-C2-2-R].

    Measurement properties of asthma-specific quality-of-life measures: protocol for a systematic

    Background: Asthma is a frequent chronic inflammatory disease of the airways, and the assessment of health-related quality of life (HrQoL) is important in both research and routine care. Various asthma-specific measures of HrQoL exist, but it is uncertain which measures are best suited for use in research and routine care. Therefore, the aim of the proposed research is a comprehensive systematic assessment of the measurement properties of the existing measures that were developed to measure asthma-specific quality of life. Methods/design: This study is a systematic review of the measurement properties of asthma-specific measures of health-related quality of life. PubMed and Embase will be searched using a selection of relevant search terms. Eligible studies will be primary empirical studies evaluating, describing or comparing measurement properties of asthma-specific HrQoL tools. Eligibility assessment and data abstraction will be performed independently by two reviewers. Evidence tables will be generated for study characteristics, instrument characteristics, measurement properties and interpretability. The quality of the measurement properties will be assessed using predefined criteria. Methodological quality of studies will be assessed using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. A best evidence synthesis will be undertaken if more than one study has investigated a particular measurement property. Discussion: The proposed systematic review will produce a comprehensive assessment of measurement properties of existing measures of asthma-specific health-related quality of life. We also aim to derive recommendations in order to help researchers and practitioners alike in the choice of instrument.

    Natural Genetic Diversity in Tomato Flavor Genes

    Fruit flavor is defined as the perception of the food by the olfactory and gustatory systems, and is one of the main determinants of fruit quality. Tomato flavor is largely determined by the balance of sugars, acids and volatile compounds. Several genes controlling the levels of these metabolites in tomato fruit have been cloned, including LIN5, ALMT9, AAT1, CXE1, and LoxC. The aim of this study was to identify any association of these genes with trait variation and to describe the genetic diversity at these loci in the red-fruited tomato clade comprised of the wild ancestor Solanum pimpinellifolium, the semi-domesticated species Solanum lycopersicum cerasiforme and early domesticated Solanum lycopersicum. High genetic diversity was observed at these five loci, including novel haplotypes that could be incorporated into breeding programs to improve fruit quality of modern tomatoes. Using newly available high-quality genome assemblies, we assayed each gene for potential functional causative polymorphisms and resolved a duplication at the LoxC locus found in several wild and semi-domesticated accessions which caused lower accumulation of lipid-derived volatiles. In addition, we explored gene expression of the five genes in nine phylogenetically diverse tomato accessions. In general, the expression of these genes increased during fruit ripening but diverged between accessions without a clear relationship between expression and metabolite levels.

    Reaction rates and transport in neutron stars

    Understanding signals from neutron stars requires knowledge about the transport inside the star. We review the transport properties and the underlying reaction rates of dense hadronic and quark matter in the crust and the core of neutron stars and point out open problems and future directions. Comment: 74 pages; commissioned for the book "Physics and Astrophysics of Neutron Stars", NewCompStar COST Action MP1304; version 3: minor changes, references updated, overview graphic added in the introduction, improvements in Sec IV.A.

    Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

    We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.
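
    The contig and scaffold N50 figures quoted above follow the usual definition: the length L such that sequences of length L or longer together cover at least half of the total assembly. A minimal sketch of that calculation, assuming plain lists of lengths (the function name `n50` and the toy lengths are illustrative and do not come from the paper):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (made-up lengths, not data from the assembly).
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

    Comparing `2 * running` against `total` keeps the arithmetic in integers, so the halfway test is exact for odd totals as well.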

    The effects of phenoxodiol on the cell cycle of prostate cancer cell lines

    Background: Prostate cancer is associated with a poor survival rate. The ability of cancer cells to evade apoptosis and exhibit limitless replication potential allows for progression of cancer from a benign to a metastatic phenotype. The aim of this study was to investigate in vitro the effect of the isoflavone phenoxodiol on the expression of cell cycle genes. Methods: Three prostate cancer cell lines (LNCaP, DU145, and PC3) were cultured in vitro, and then treated with phenoxodiol (10 μM and 30 μM) for 24 and 48 h. The expression of cell cycle genes p21WAF1, c-Myc, Cyclin-D1, and Ki-67 was investigated by Real Time PCR. Results: Here we report that phenoxodiol induces cell cycle arrest in the G1/S phase of the cell cycle, with the resultant arrest due to the upregulation of p21WAF1 in all the cell lines in response to treatment, indicating that activation of p21WAF1 and subsequent cell arrest was occurring in a p53-independent manner, with induction of cytotoxicity independent of caspase activation. We found that c-Myc and Cyclin-D1 expression was not consistently altered across all cell lines but Ki-67 signalling expression was decreased in line with the cell cycle arrest. Conclusions: Phenoxodiol demonstrates an ability in prostate cancer cells to induce significant cytotoxicity by interacting with p21WAF1 and inducing cell cycle arrest irrespective of p53 status or caspase pathway interactions. These data indicate that phenoxodiol would be effective as a potential future treatment modality for both hormone sensitive and hormone refractory prostate cancer.

    Assembly complexity of prokaryotic genomes using short reads

    Background: De Bruijn graphs are a theoretical framework underlying several modern genome assembly programs, especially those that deal with very short reads. We describe an application of de Bruijn graphs to analyze the global repeat structure of prokaryotic genomes. Results: We provide the first survey of the repeat structure of a large number of genomes. The analysis gives an upper bound on the performance of genome assemblers for de novo reconstruction of genomes across a wide range of read lengths. Further, we demonstrate that the majority of genes in prokaryotic genomes can be reconstructed uniquely using very short reads even if the genomes themselves cannot. The non-reconstructible genes are overwhelmingly related to mobile elements (transposons, IS elements, and prophages). Conclusions: Our results improve upon previous studies on the feasibility of assembly with short reads and provide a comprehensive benchmark against which to compare the performance of the short-read assemblers currently being developed.
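
    The abstract does not spell out its graph construction, so the following is only a minimal sketch of the common textbook form of a de Bruijn graph: nodes are (k-1)-mers, and every k-mer in a read contributes an edge from its prefix to its suffix. Nodes with more than one successor mark places where a repeat makes the path ambiguous. The names `de_bruijn_graph` and `branching_nodes` and the toy read are assumptions for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    if k < 2:
        raise ValueError("k must be at least 2")
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k over the read; each k-mer is one edge.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def branching_nodes(graph):
    """Return nodes with more than one outgoing edge, sorted for stable output."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


# Toy read in which "ACG" occurs twice with different continuations.
graph = de_bruijn_graph(["ACGTACGA"], k=3)
print(branching_nodes(graph))
```

    Raising k past the length of a repeat removes the corresponding branch, which is one way to see why longer reads make more of a genome uniquely reconstructible.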

    Reconstructing cancer genomes from paired-end sequencing data

    Background: A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants - including duplications, deletions, inversions, translocations, and other rearrangements - result in a cancer genome that is a scrambling of intervals, or "blocks" of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data. Results: By aligning paired reads from a cancer genome - and a matched germline genome, if available - to the human reference genome, we derive: (i) a partition of the reference genome into intervals; (ii) adjacencies between these intervals in the cancer genome; (iii) an estimated copy number for each interval. We formulate the Copy Number and Adjacency Genome Reconstruction Problem of determining the cancer genome as a sequence of the derived intervals that is consistent with the measured adjacencies and copy numbers. We design an efficient algorithm, called Paired-end Reconstruction of Genome Organization (PREGO), to solve this problem by reducing it to an optimization problem on an interval-adjacency graph constructed from the data. The solution to the optimization problem results in an Eulerian graph, containing an alternating Eulerian tour that corresponds to a cancer genome that is consistent with the sequencing data. We apply our algorithm to five ovarian cancer genomes that were sequenced as part of The Cancer Genome Atlas. We identify numerous rearrangements, or structural variants, in these genomes, analyze reciprocal vs. non-reciprocal rearrangements, and identify rearrangements consistent with known mechanisms of duplication such as tandem duplications and breakage/fusion/bridge (B/F/B) cycles. Conclusions: We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data. An implementation of the PREGO algorithm is available at http://compbio.cs.brown.edu/software/.
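
    To make the two kinds of measurement in the abstract concrete, the sketch below goes in the opposite direction from PREGO: it starts from a known genome written as a sequence of signed blocks and computes the copy numbers and adjacencies that such a genome would produce. This is not the PREGO algorithm. The signed-integer encoding, the 't'/'h' labels for the two ends of a block, and the function names are all assumptions made for illustration.

```python
from collections import Counter


def copy_numbers(genome):
    """Count how many times each block appears, ignoring orientation."""
    return Counter(abs(block) for block in genome)


def block_ends(block):
    """Return (entry_end, exit_end) for a signed block.

    A forward block is entered at its tail 't' and left at its head 'h';
    an inverted (negative) block is traversed the other way round.
    """
    name = abs(block)
    if block > 0:
        return (name, "t"), (name, "h")
    return (name, "h"), (name, "t")


def adjacencies(genome):
    """Count adjacencies between block ends along the genome."""
    counts = Counter()
    for left, right in zip(genome, genome[1:]):
        exit_end = block_ends(left)[1]
        entry_end = block_ends(right)[0]
        # Sort the pair so that an adjacency is the same in either direction.
        counts[tuple(sorted((exit_end, entry_end)))] += 1
    return counts


# Toy genome: block 2 is duplicated and the second copy is inverted.
genome = [1, 2, -2, 3]
print(copy_numbers(genome))
print(adjacencies(genome))
```

    In this toy genome the adjacency joining the head of block 2 to itself is absent from the reference order 1, 2, 3, which is the kind of novel adjacency that paired-end reads can reveal.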

    Applying diagnosis and pharmacy-based risk models to predict pharmacy use in Aragon, Spain: The impact of a local calibration

    Background: In the financing of a national health system, where pharmaceutical spending is one of the main cost containment targets, predicting pharmacy costs for individuals and populations is essential for budget planning and care management. Although most efforts have focused on risk adjustment applying diagnostic data, the reliability of this information source has been questioned in the primary care setting. We sought to assess the usefulness of incorporating pharmacy data into claims-based predictive models (PMs). Because these models were developed primarily for the U.S. health care setting, a secondary objective was to evaluate the benefit of a local calibration in order to adapt the PMs to the Spanish health care system. Methods: The population was drawn from patients within the primary care setting of Aragon, Spain (n = 84,152). Diagnostic, medication and prior cost data were used to develop PMs based on the Johns Hopkins ACG methodology. Model performance was assessed through r-squared statistics and predictive ratios. The capacity to identify future high-cost patients was examined through c-statistic, sensitivity and specificity parameters. Results: The PMs based on pharmacy data had a higher capacity to predict future pharmacy expenses and to identify potential high-cost patients than the models based on diagnostic data alone and a capacity almost as high as that of the combined diagnosis-pharmacy-based PM. PMs provided considerably better predictions when calibrated to Spanish data. Conclusion: Understandably, pharmacy spending is more predictable using pharmacy-based risk markers compared with diagnosis-based risk markers. Pharmacy-based PMs can assist plan administrators and medical directors in planning the health budget and identifying high-cost-risk patients amenable to care management programs.
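
    The two model-performance measures named in the Methods can be sketched under their usual definitions, which the abstract itself does not state: the predictive ratio is total predicted cost divided by total observed cost for a group (1.0 means the group is neither over- nor under-predicted), and r-squared is the share of variance in observed cost explained by the predictions. The function names and the toy figures are illustrative assumptions, not values from the study.

```python
def predictive_ratio(predicted, actual):
    """Total predicted cost divided by total observed cost for one group."""
    total_actual = sum(actual)
    if total_actual == 0:
        raise ValueError("observed costs sum to zero")
    return sum(predicted) / total_actual


def r_squared(predicted, actual):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("observed costs have zero variance")
    return 1 - ss_res / ss_tot


# Toy figures (made up): the model is right on average but wrong per patient.
predicted = [200.0, 200.0, 200.0]
actual = [100.0, 200.0, 300.0]
print(predictive_ratio(predicted, actual))
print(r_squared(predicted, actual))
```

    The toy case shows why both measures are reported together: a model can match group totals, giving a predictive ratio of 1, while explaining none of the variation between individuals.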

    AID-Targeting and Hypermutation of Non-Immunoglobulin Genes Does Not Correlate with Proximity to Immunoglobulin Genes in Germinal Center B Cells

    Upon activation, B cells divide, form a germinal center, and express the activation induced deaminase (AID), an enzyme that triggers somatic hypermutation of the variable regions of immunoglobulin (Ig) loci. Recent evidence indicates that at least 25% of expressed genes in germinal center B cells are mutated or deaminated by AID. One of the most deaminated genes, c-Myc, frequently appears as a translocation partner with the Ig heavy chain gene (Igh) in mouse plasmacytomas and human Burkitt's lymphomas. This indicates that the two genes or their double-strand break ends come into close proximity at a biologically relevant frequency. However, the proximity of c-Myc and Igh has never been measured in germinal center B cells, where many such translocations are thought to occur. We hypothesized that in germinal center B cells, not only is c-Myc near Igh, but other mutating non-Ig genes are deaminated by AID because they are near Ig genes, the primary targets of AID. We tested this “collateral damage” model using 3D-fluorescence in situ hybridization (3D-FISH) to measure the distance from non-Ig genes to Ig genes in germinal center B cells. We also made mice transgenic for human MYC and measured expression and mutation of the transgenes. We found that there is no correlation between proximity to Ig genes and levels of AID targeting or gene mutation, and that c-Myc was not closer to Igh than were other non-Ig genes. In addition, the human MYC transgenes did not accumulate mutations and were not deaminated by AID. We conclude that proximity to Ig loci is unlikely to be a major determinant of AID targeting or mutation of non-Ig genes, and that the MYC transgenes are either missing important regulatory elements that allow mutation or are unable to mutate because their new nuclear position is not conducive to AID deamination.