Probabilistic techniques for obtaining accurate patient counts in Clinical Data Warehouses

Author: Gautier D.
Hersant F.
Tobie G.
Publication venue: Elsevier Inc.
Publication date: 01/01/2011

Proposal and execution of clinical trials, computation of quality measures, and discovery of correlations between medical phenomena are all applications where an accurate count of patients is needed. However, existing sources of this type of patient information, including Clinical Data Warehouses (CDWs), may be incomplete or inaccurate. This research explores applying probabilistic techniques, supported by the MayBMS probabilistic database, to obtain accurate patient counts from a Clinical Data Warehouse containing synthetic patient data. We present a synthetic Clinical Data Warehouse and populate it with simulated data using a custom patient data generation engine. We then implement, evaluate, and compare different techniques for obtaining patient counts. We model billing as a test for the presence of a condition. We compute billing’s sensitivity and specificity both by conducting a “Simulated Expert Review”, where a representative sample of records is reviewed and labeled by experts, and by obtaining the ground truth for every record. We compute the posterior probability of a patient having a condition in two ways. The first is a “Bayesian Chain”, which uses Bayes’ Theorem to update the probability of a patient having a condition after each visit. The second is a “one-shot” approach that computes the probability of a patient having a condition based on whether the patient is ever billed for the condition. Our results demonstrate the utility of probabilistic approaches, which improve on the accuracy of raw counts. In particular, the simulated review paired with a single application of Bayes’ Theorem produces the best results, with an average error rate of 2.1% compared to 43.7% for the straightforward billing counts. Overall, this research demonstrates that Bayesian probabilistic approaches improve patient counts on simulated patient populations. We believe that total patient counts based on billing data are one of the many possible applications of our Bayesian framework. Use of these probabilistic techniques will enable more accurate patient counts and better results for applications requiring this metric.
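
The two approaches described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (which the abstract says is built on MayBMS). The function names, the prior, and the example sensitivity and specificity values are all hypothetical, and the chain treats visits as conditionally independent, which is an assumption made here for simplicity.

```python
def posterior(prior, sensitivity, specificity, billed):
    """One application of Bayes' theorem: P(condition | billing outcome)."""
    if billed:
        p_obs_given_cond = sensitivity          # true positive rate
        p_obs_given_none = 1.0 - specificity    # false positive rate
    else:
        p_obs_given_cond = 1.0 - sensitivity    # false negative rate
        p_obs_given_none = specificity          # true negative rate
    numerator = p_obs_given_cond * prior
    return numerator / (numerator + p_obs_given_none * (1.0 - prior))


def bayesian_chain(prior, sensitivity, specificity, visits):
    """Update the probability after each visit (True = billed for the condition)."""
    p = prior
    for billed in visits:
        p = posterior(p, sensitivity, specificity, billed)
    return p


def one_shot(prior, sensitivity, specificity, visits):
    """Single update based on whether the patient was ever billed."""
    return posterior(prior, sensitivity, specificity, any(visits))


def expected_count(probabilities):
    """Expected number of patients with the condition."""
    return sum(probabilities)


# Hypothetical values, for illustration only.
patients = [[True, True], [False, False], [True, False]]
chain_probs = [bayesian_chain(0.1, 0.8, 0.9, v) for v in patients]
print(expected_count(chain_probs))
```

In the one-shot variant the sensitivity and specificity would need to describe the patient-level "ever billed" test rather than a single visit; the sketch reuses the same values only to keep the example short.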

Elsevier - Publisher Connector

Near-optimal RNA-Seq quantification

Author: Bray Nicolas
Melsted Páll
Pachter Lior
Pimentel Harold
Publication date: 11/05/2015

We present a novel approach to RNA-Seq quantification that is near optimal in speed and accuracy. Software implementing the approach, called kallisto, can be used to analyze 30 million unaligned paired-end RNA-Seq reads in less than 5 minutes on a standard laptop computer while providing results as accurate as those of the best existing tools. This removes a major computational bottleneck in RNA-Seq analysis. Comment: Added some results (paralog analysis, allele-specific expression analysis, alignment comparison, accuracy analysis with TPMs); switched bootstrap analysis to human sample from SEQC-MAQCIII; provided link to a snakefile that allows for reproducibility of all results and figures in the paper.

arXiv.org e-Print Archive

Caltech Authors

Comparison of otolith readability and reproducibility of counts of translucent zones using different otolith preparation methods for four endemic Labeobarbus species in Lake Tana, Ethiopia

Author: Anteneh Wassie
Bekaert Karen
Bruneel Stijn
Getahun Abebe
Goethals Peter
Kidane Shewit Gebremedhin
Torreele Els
Publication venue: 'MDPI AG'
Publication date: 01/01/2019

The analysis of fish age data is vital for the successful conservation of fish. Attempts to develop optimal management strategies for effective conservation of the endemic Labeobarbus species are strongly affected by the lack of accurate age estimates. Although methodological studies are key to acquiring a good insight into the age of fishes, up to now, there have not been any studies comparing different methods for these species. Thus, this study aimed at determining the best method for the endemic Labeobarbus species. Samples were collected from May 2016 to April 2017. Asteriscus otoliths from 150 specimens each of L. intermedius, L. tsanensis, L. platydorsus, and L. megastoma were examined. Six methods were evaluated; however, only three methods resulted in readable images. The procedure in which whole otoliths were first submerged in water, and subsequently placed in glycerol to take the image (MO1), was generally best. Except for L. megastoma, this method produced the clearest image as both the coefficient of variation and average percentage error between readers were lowest. Furthermore, except for L. megastoma, MO1 had high otolith readability and no systematic bias. Therefore, we suggest that MO1 should be used as the standard otolith preparation technique for the first three species, while for L. megastoma, other preparation techniques should be evaluated. This study provides a reference for researchers from Africa, particularly Ethiopia, to develop a suitable otolith preparation method for the different tropical fish species
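
The two between-reader precision measures named in this abstract, average percentage error (APE) and coefficient of variation (CV), can be sketched as follows. This uses the commonly cited per-fish definitions (APE as the mean absolute deviation from the mean reading, CV as the sample standard deviation, both relative to the mean and expressed in percent). The abstract does not give the exact formulas the authors used, so treat this as an assumption. The readings are hypothetical and are assumed to be positive counts.

```python
from statistics import mean, stdev


def ape(readings):
    """Average percentage error for one fish across readers, in percent."""
    m = mean(readings)
    return 100.0 * mean(abs(x - m) for x in readings) / m


def cv(readings):
    """Coefficient of variation for one fish across readers, in percent."""
    return 100.0 * stdev(readings) / mean(readings)


def summarize(all_readings):
    """Mean APE and mean CV over all fish."""
    return (mean(ape(r) for r in all_readings),
            mean(cv(r) for r in all_readings))


# Hypothetical counts of translucent zones: one inner list per fish,
# one value per reader.
fish = [[4, 5, 6], [7, 7, 8], [3, 3, 3]]
mean_ape, mean_cv = summarize(fish)
print(round(mean_ape, 2), round(mean_cv, 2))
```

Lower values of either index indicate better agreement between readers, which is the sense in which the abstract calls a preparation method best.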

Multidisciplinary Digital Publishing Institute

Ghent University Academic Bibliography

Archivsystem Ask23

Do News and Sentiment play a role in Stock Price Prediction?

Author: Gepp Adrian
Harris Geoffrey
Vanstone Bruce J
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2019

Bond University Research Portal

Archon Genomics X PRIZE Validation Protocol

Author: Edison Liu
Granger Sutton
Larry Kedes
Victor Jongeneel
Publication date: 24/02/2011

This document is a collective assembly of techniques designed to test the quality and accuracy of 100 whole human genome sequences resulting from the $10 Million Archon Genomics X PRIZE (AGXP) competition. The purpose of this article is to enlist constructive criticism from the genomic and genetic community on the outlined approaches. The intent for the final version of this Validation Protocol is to become a useful standard by which to gauge the capabilities of whole genome sequencing technologies that emerge even after 2012

Nature Precedings

Improving the value of public RNA-seq expression data by phenotype prediction.

Author: Andrew Jaffe
Aryee
Beery
Bernstein
Collado-Torres
Consortium
Denk
Eswaran
Frazee
Goodspeed
Houseman
Iorio
Irizarry
Jeffrey T Leek
Kalari
Kim
Leek
Leinonen
Leonardo Collado-Torres
Lister
Liu
Lonsdale
Mazure
Mortazavi
Nagalakshmi
Nellore
Pohl
Ritchie
Robinson
Seqc/Maqc-Iii Consortium.
Shannon E Ellis
Smallridge
Toker
Publication venue: eScholarship, University of California
Publication date: 01/05/2018

Publicly available genomic data are a valuable resource for studying normal human variation and disease, but these data are often not well labeled or annotated. The lack of phenotype information for public genomic data severely limits their utility for addressing targeted biological questions. We develop an in silico phenotyping approach for predicting critical missing annotation directly from genomic measurements using well-annotated genomic and phenotypic data produced by consortia like TCGA and GTEx as training data. We apply in silico phenotyping to a set of 70,000 RNA-seq samples we recently processed on a common pipeline as part of the recount2 project. We use gene expression data to build and evaluate predictors for both biological phenotypes (sex, tissue, sample source) and experimental conditions (sequencing strategy). We demonstrate how these predictions can be used to study cross-sample properties of public genomic data, select genomic projects with specific characteristics, and perform downstream analyses using predicted phenotypes. The methods to perform phenotype prediction are available in the phenopredict R package and the predictions for recount2 are available from the recount R package. With data and phenotype information available for 70,000 human samples, expression data is available for use on a scale that was not previously feasible.

Crossref

eScholarship - University of California

Validation of a consumer-grade activity monitor for continuous daily activity monitoring in individuals with multiple sclerosis.

Author: Block Valerie J
Cree Bruce Ac
Gelfand Jeffrey M
Henry Roland
Hollenbach Jill A
Marcus Gregory M
Olgin Jeffrey E
Pletcher Mark J
Zhao Chao
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

Background: Technological advancements of remote-monitoring used in clinical-care and research require validation of model updates. Objectives: To compare the output of a newer consumer-grade accelerometer to a previous model in people with multiple sclerosis (MS) and to the ActiGraph, a waist-worn device widely used in MS research. Methods: Thirty-one individuals with MS participated in a 7-day validation by the Fitbit Flex (Flex), Fitbit Flex-2 (Flex2) and ActiGraph GT3X. Primary outcome was step count. Valid epochs of 5-min block increments, where there was overlap of ≥1 step/min for both devices, were compared and summed to give a daily total for analysis. Results: Bland-Altman plots showed no systematic difference between the Flex and Flex2; mean step-count difference of 25 more steps-per-day recorded by Flex2 (95% confidence interval (CI) = 2, 48; p = 0.04), intraclass correlation coefficient (ICC) = 1.00. Compared to the ActiGraph, Flex2 (and Flex) tended to record more steps (808 steps-per-day more than the ActiGraph; 95% CI = -2380, 765; p < 0.01), although the ICC was high (0.98), indicating that the devices were likely measuring the same kind of activity. Conclusions: Steps from Flex and Flex2 can be used interchangeably. Differences in total step count between ActiGraph and Flex devices can make cross-device comparisons of numerical step-counts challenging, particularly for faster walkers.
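
The Bland-Altman comparison mentioned in the Results can be sketched in a few lines of Python. This is a generic illustration of the technique (mean difference plus 95% limits of agreement at 1.96 standard deviations), not the authors' analysis code. The daily step totals below are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(device_a, device_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences a - b; the limits of
    agreement are bias -/+ 1.96 sample standard deviations of the differences.
    """
    if len(device_a) != len(device_b):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical daily step totals from two devices worn on the same days.
flex2 = [5230, 6110, 4875, 7020, 3990]
flex = [5200, 6090, 4860, 6985, 3970]
bias, lower, upper = bland_altman(flex2, flex)
print(f"bias={bias:.1f}, limits of agreement=({lower:.1f}, {upper:.1f})")
```

A bias close to zero with narrow limits of agreement is what would support treating two devices as interchangeable; a large bias indicates a systematic difference between them.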

eScholarship - University of California

Gut microbiota in HIV-pneumonia patients is related to peripheral CD4 counts, lung microbiota, and in vitro macrophage dysfunction.

Author: Byanyima Patrick
Chang Emily
Davis J Lucian
Fadrosh Douglas W
Fong Serena
Huang Laurence
Kaswabuli Sylvia
Lin Din L
Lynch Susan V
McCauley Kathryn
Musisi Emmanuel
Sanyu Ingvar
Shenoy Meera K
Worodria William
Zawedde Josephine
Publication venue: eScholarship, University of California
Publication date: 01/03/2019

Pneumonia is common and frequently fatal in HIV-infected patients, due to rampant, systemic inflammation and failure to control microbial infection. While airway microbiota composition is related to local inflammatory response, gut microbiota has been shown to correlate with the degree of peripheral immune activation (IL6 and IP10 expression) in HIV-infected patients. We thus hypothesized that both airway and gut microbiota are perturbed in HIV-infected pneumonia patients, that the gut microbiota is related to peripheral CD4+ cell counts, and that its associated products differentially program immune cell populations necessary for controlling microbial infection in CD4-high and CD4-low patients. To assess these relationships, paired bronchoalveolar lavage and stool microbiota (bacterial and fungal) from a large cohort of Ugandan, HIV-infected patients with pneumonia were examined, and in vitro tests of the effect of gut microbiome products on macrophage effector phenotypes performed. While lower airway microbiota stratified into three compositionally distinct microbiota as previously described, these were not related to peripheral CD4 cell count. In contrast, variation in gut microbiota composition significantly related to CD4 cell count, lung microbiota composition, and patient mortality. Compared with patients with high CD4+ cell counts, those with low counts possessed more compositionally similar airway and gut microbiota, evidence of microbial translocation, and their associated gut microbiome products reduced macrophage activation and IL-10 expression and increased IL-1β expression in vitro. These findings suggest that the gut microbiome is related to CD4 status and plays a key role in modulating macrophage function, critical to microbial control in HIV-infected patients with pneumonia

Directory of Open Access Journals

eScholarship - University of California

InPhaDel: integrative shotgun and proximity-ligation sequencing to phase deletions with single nucleotide polymorphisms.

Author: Bafna Vineet
Bansal Vikas
Edge Peter
Patel Anand
Selvaraj Siddarth
Publication venue: eScholarship, University of California
Publication date: 21/04/2016

Phasing of single nucleotide variants (SNVs) and structural variations into chromosome-wide haplotypes in humans has been challenging, and has required either trio sequencing or restricting phasing to population-based haplotypes. Selvaraj et al. demonstrated that single-individual SNV phasing is possible with proximity-ligated (HiC) sequencing. Here, we demonstrate that HiC can phase structural variants into phased scaffolds of SNVs. Since HiC data is noisy and SV calling is challenging, we applied a range of supervised classification techniques, including Support Vector Machines and Random Forest, to phase deletions. Our approach was demonstrated on deletion calls and phasings on the NA12878 human genome. We used three NA12878 chromosomes and simulated chromosomes to train model parameters. The remaining NA12878 chromosomes withheld from training were used to evaluate phasing accuracy. Random Forest had the highest accuracy and correctly phased 86% of the deletions with allele-specific read evidence. Allele-specific read evidence was found for 76% of the deletions. HiC provides significant read evidence for accurately phasing 33% of the deletions. Also, eight of eight top-ranked deletions phased by only HiC were validated using long-range polymerase chain reaction and Sanger sequencing. Thus, deletions from a single individual can be accurately phased using a combination of shotgun and proximity ligation sequencing. InPhaDel software is available at: http://l337x911.github.io/inphadel/

PubMed Central

eScholarship - University of California