1,819 research outputs found

Integrating biomedical research and electronic health records to create knowledge-based biologically meaningful machine-readable embeddings.

Author: Baranzini Sergio E
Butte Atul J
Nelson Charlotte A
Publication venue: eScholarship, University of California
Publication date: 01/07/2019

In order to advance precision medicine, detailed clinical features ought to be described in a way that leverages current knowledge. Although data collected from biomedical research is expanding at an almost exponential rate, our ability to transform that information into patient care has not kept pace. A major barrier preventing this transformation is that multi-dimensional data collection and analysis is usually carried out without much understanding of the underlying knowledge structure. Here, in an effort to bridge this gap, Electronic Health Records (EHRs) of individual patients are connected to a heterogeneous knowledge network called Scalable Precision Medicine Oriented Knowledge Engine (SPOKE). Then an unsupervised machine-learning algorithm creates Propagated SPOKE Entry Vectors (PSEVs) that encode the importance of each SPOKE node for any code in the EHRs. We argue that these results, alongside the natural integration of PSEVs into any EHR machine-learning platform, provide a key step toward precision medicine.

eScholarship - University of California

Coanalysis of GWAS with eQTLs reveals disease-tissue associations.

Author: Butte Atul J
Chen Rong
Kang Hyunseok Peter
Morgan Alex A
Schadt Eric E
Publication venue: eScholarship, University of California
Publication date: 01/01/2012

Expression quantitative trait loci (eQTLs), or genetic variants associated with changes in gene expression, have the potential to assist in interpreting results of genome-wide association studies (GWAS). eQTLs also have varying degrees of tissue specificity. By correlating the statistical significance of eQTLs mapped in various tissue types to their odds ratios reported in a large GWAS by the Wellcome Trust Case Control Consortium (WTCCC), we discovered that there is a significant association between diseases studied genetically and their relevant tissues. This suggests that eQTL data sets can be used to determine tissues that play a role in the pathogenesis of a disease, thereby highlighting these tissue types for further post-GWAS functional studies.

PubMed Central

eScholarship - University of California

Accuracy of medical billing data against the electronic health record in the measurement of colorectal cancer screening rates.

Author: Avila Patrick
Butte Atul J
Glicksberg Benjamin S
Harding-Theobald Emily
Rudrapatna Vivek A
Wang Connie
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Objective: Medical billing data are an attractive source of secondary analysis because of their ease of use and potential to answer population-health questions with statistical power. Although these datasets have known susceptibilities to biases, the degree to which they can distort the assessment of quality measures such as colorectal cancer screening rates is not widely appreciated, nor are their causes and possible solutions. Methods: Using a billing code database derived from our institution's electronic health records, we estimated the colorectal cancer screening rate of average-risk patients aged 50-74 years seen in a primary care or gastroenterology clinic in 2016-2017. A total of 200 records (150 unscreened, 50 screened) were sampled to quantify the accuracy against manual review. Results: Out of 4611 patients, an analysis of billing data suggested a 61% screening rate, an estimate that matches the estimate by the Centers for Disease Control. Manual review revealed a positive predictive value of 96% (86%-100%), a negative predictive value of 21% (15%-29%) and a corrected screening rate of 85% (81%-90%). Most false negatives occurred due to examinations performed outside the scope of the database (both within and outside of our institution), but 21% of false negatives fell within the database's scope. False positives occurred due to incomplete examinations and inadequate bowel preparation. Reasons for screening failure include ordered but incomplete examinations (48%), lack of or incorrect documentation by primary care (29%) including incorrect screening intervals (13%), and patients declining screening (13%). Conclusions: Billing databases are prone to substantial bias that may go undetected even in the presence of confirmatory external estimates. Caution is recommended when performing population-level inference from these data. We propose several solutions to improve the use of these data for the assessment of healthcare quality.

eScholarship - University of California

Gene-network inference by message passing

Author: A Braunstein
A Pagnani
Alberts B
Braunstein A
Butte A J
Gasch A P
Kabashima Y
M Weigt
Murphy K
R Zecchina
Segal E
Publication venue: 'IOP Publishing'
Publication date: 01/01/2008

The inference of gene-regulatory processes from gene-expression data is among the major challenges of computational systems biology. Here we address the problem from a statistical-physics perspective and develop a message-passing algorithm which is able to infer sparse, directed and combinatorial regulatory mechanisms. Using the replica technique, the algorithmic performance can be characterized analytically for artificially generated data. The algorithm is applied to genome-wide expression data of baker's yeast under various environmental conditions. We find clear cases of combinatorial control, and enrichment in common functional annotations of regulated genes and their regulators. Comment: Proc. of International Workshop on Statistical-Mechanical Informatics 2007, Kyot

arXiv.org e-Print Archive

CiteSeerX

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Likelihood ratios for genome medicine

Author: Butte Atul J
Chen Rong
Morgan Alexander A
Publication venue: BioMed Central
Publication date: 01/01/2010

Patients are beginning to present to healthcare providers with the results of high-throughput individualized genotyping, and interpreting these results in the context of the explosive growth of literature linking individual variants with disease may seem daunting. However, we suggest that results of a personal genomic analysis may be viewed as a panel of many tests for multiple diseases. By using well-established methods of evidence-based medicine, these many parallel tests may be combined using likelihood ratios to report a post-test probability of disease for use in patient assessment.
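
The combination step described in this abstract can be illustrated with the standard odds form of Bayes' rule: post-test odds equal pre-test odds multiplied by the likelihood ratio of each test result. The sketch below is not taken from the paper; it assumes the test results are conditionally independent given disease status, and the function name and all numbers are hypothetical.

```python
from math import prod


def post_test_probability(pre_test_probability, likelihood_ratios):
    """Combine several test results into one post-test probability.

    Assumption (not from the abstract): the tests are conditionally
    independent, so their likelihood ratios can simply be multiplied.
    """
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    # Convert the probability to odds, apply every likelihood ratio,
    # then convert the resulting odds back to a probability.
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * prod(likelihood_ratios)
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical example: 10% pre-test probability and three variants
# with made-up likelihood ratios of 2.0, 1.5 and 0.8.
example = post_test_probability(0.10, [2.0, 1.5, 0.8])
print(f"post-test probability: {example:.3f}")
```

A likelihood ratio above 1 raises the probability of disease and one below 1 lowers it, so an empty list of ratios leaves the pre-test probability unchanged.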

Crossref

PubMed Central

eScholarship - University of California

341: Allogeneic Antibodies Specifically Target AML Antigen NuSAP1 after Bone Marrow Transplantation

Author: Butte A.
Coram M.
Miklos D.B.
Wadia P.P.
Publication venue: American Society for Blood and Marrow Transplantation. Published by Elsevier Inc.
Publication date: 29/02/2008

Elsevier - Publisher Connector

Current methodologies for translational bioinformatics

Author: Butte Atul J.
Hunter Lawrence
Lussier Yves A.
Publication venue: Elsevier Inc.
Publication date: 01/06/2010

Elsevier - Publisher Connector

PubMed Central

Random matrix analysis of localization properties of Gene co-expression network

Author: A. J. Butte
Baowen Li
Béla Bollobás
Gábor Vattay
H. Göhlmann
K. Zyczkowski
M. L. Mehta
Norbert Solymosi
Sarika Jalan
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2010

We analyze a gene co-expression network under the random matrix theory framework. The nearest neighbor spacing distribution of the adjacency matrix of this network follows Gaussian orthogonal statistics of random matrix theory (RMT). The spectral rigidity test follows the random matrix prediction for a certain range, and deviates afterwards. Eigenvector analysis of the network using the inverse participation ratio (IPR) suggests that the statistics of the bulk of the eigenvalues of the network are consistent with those of a real symmetric random matrix, whereas a few eigenvalues are localized. Based on these IPR calculations, we can divide the eigenvalues into three sets: (A) The non-degenerate part that follows RMT. (B) The non-degenerate part, at both ends and at intermediate eigenvalues, which deviates from RMT and is expected to contain information about important nodes in the network. (C) The degenerate part with zero eigenvalue, which fluctuates around the RMT predicted value. We identify nodes corresponding to the dominant modes of the corresponding eigenvectors and analyze their structural properties.
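
The abstract does not state the IPR formula. A minimal sketch, assuming the commonly used definition (the sum of the fourth powers of the components of a normalized eigenvector), shows why the quantity separates spread-out from localized eigenvectors. The function name and the example vectors are hypothetical.

```python
def inverse_participation_ratio(vector):
    """Return the IPR of a real vector under the assumed definition.

    The vector is normalized internally, so IPR = sum(x^4) / (sum(x^2))^2.
    It equals 1/N for a vector spread evenly over N components and
    1 for a vector concentrated on a single component.
    """
    norm_squared = sum(x * x for x in vector)
    if norm_squared == 0:
        raise ValueError("the zero vector has no defined IPR")
    return sum(x ** 4 for x in vector) / norm_squared ** 2


# Hypothetical vectors on a 4-node network.
delocalized = [1.0, 1.0, 1.0, 1.0]  # equal weight on every node
localized = [1.0, 0.0, 0.0, 0.0]    # all weight on one node

print(inverse_participation_ratio(delocalized))
print(inverse_participation_ratio(localized))
```

Under this definition, eigenvectors with an IPR well above the 1/N scale put most of their weight on a small number of nodes.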

arXiv.org e-Print Archive

Crossref

ELTE Digital Institutional Repository (EDIT)

ScholarBank@NUS