A user-friendly tool using systems biology models to infer cell functions from omics
UniPROBE, update 2011: expanded content and search tools in the online database of protein-binding microarray data on protein-DNA interactions
The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ("words") of length k ("k-mers"), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org. National Institutes of Health (U.S.) (grant number R01 HG003985)
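As a rough illustration of the terms used in this abstract, the sketch below enumerates every DNA word of length k and builds a simple per-position frequency matrix from a set of aligned sites. It is a minimal sketch under stated assumptions, not UniPROBE's own pipeline: the function names are hypothetical, and a plain frequency matrix stands in for a PWM, which is usually stored as log-odds scores.

```python
from itertools import product

ALPHABET = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def all_kmers(k):
    """Return every DNA word of length k (4**k words in total)."""
    return ["".join(letters) for letters in product(ALPHABET, repeat=k)]


def reverse_complement(seq):
    """Return the reverse complement of a DNA word."""
    return seq.translate(COMPLEMENT)[::-1]


def position_frequency_matrix(sites):
    """Per-position base frequencies for equal-length, aligned sites.

    Hypothetical helper: one dict per position, mapping base -> frequency.
    """
    if not sites:
        raise ValueError("at least one site is required")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for i in range(width):
        column = [site[i] for site in sites]
        matrix.append({base: column.count(base) / len(sites) for base in ALPHABET})
    return matrix
```

For k = 8 the enumeration yields 4^8 = 65,536 words. A word and its reverse complement describe the same double-stranded site, so the two are often reported together.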
On the design of clone-based haplotyping
Background: Haplotypes are important for assessing genealogy and disease susceptibility of individual genomes, but are difficult to obtain with routine sequencing approaches. Experimental haplotype reconstruction based on assembling fragments of individual chromosomes is promising, but with variable yields due to incompletely understood parameter choices. Results: We parameterize the clone-based haplotyping problem in order to provide theoretical and empirical assessments of the impact of different parameters on haplotype assembly. We confirm the intuition that long clones help link together heterozygous variants and thus improve haplotype length. Furthermore, given the length of the clones, we address how to choose the other parameters, including number of pools, clone coverage and sequencing coverage, so as to maximize haplotype length. We model the problem theoretically and show empirically the benefits of using larger clones with a moderate number of pools and moderate sequencing coverage. In particular, using 140 kb BAC clones, we construct haplotypes for a personal genome and assemble haplotypes with N50 values greater than 2.6 Mb. These assembled haplotypes are longer than, and at least as accurate as, haplotypes of existing clone-based strategies, whether in vivo or in vitro. Conclusions: Our results provide practical guidelines for the development and design of clone-based methods to achieve long range, high-resolution and accurate haplotypes.
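The N50 statistic quoted in this abstract can be computed as in the sketch below. This is the common definition (the length L such that blocks of length at least L cover at least half of the total assembled length), not code from the paper; the function name and the example lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of block lengths.

    Blocks are visited from longest to shortest; the N50 is the length of
    the block at which the running total first reaches half of the sum.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical haplotype block lengths in kb.
example_blocks = [2, 3, 4, 5, 6]
print(n50(example_blocks))
```

With the hypothetical lengths above, the total is 20, and the two longest blocks (6 and 5) already cover more than half of it, so the N50 is 5.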
HLAProfiler utilizes k-mer profiles to improve HLA calling accuracy for rare and common alleles in RNA-seq data
BACKGROUND: The human leukocyte antigen (HLA) system is a genomic region involved in regulating the human immune system by encoding cell membrane major histocompatibility complex (MHC) proteins that are responsible for self-recognition. Understanding the variation in this region provides important insights into autoimmune disorders, disease susceptibility, oncological immunotherapy, regenerative medicine, transplant rejection, and toxicogenomics. Traditional approaches to HLA typing are low throughput, target only a few genes, are labor intensive and costly, or require specialized protocols. RNA sequencing promises a relatively inexpensive, high-throughput solution for HLA calling across all genes, with the bonus of complete transcriptome information and widespread availability of historical data. Existing tools have been limited in their ability to accurately and comprehensively call HLA genes from RNA-seq data.
RESULTS: We created HLAProfiler (https://github.com/ExpressionAnalysis/HLAProfiler), a k-mer profile-based method for HLA calling in RNA-seq data which can identify rare and common HLA alleles with >99% accuracy at two-field precision in both biological and simulated data. For 68% of novel alleles not present in the reference database, HLAProfiler can correctly identify the two-field precision or exact coding sequence, a significant advance over existing algorithms.
CONCLUSIONS: HLAProfiler allows for accurate HLA calls in RNA-seq data, reliably expanding the utility of these data in HLA-related research and enabling advances across a broad range of disciplines. Additionally, by using the observed data to identify potential novel alleles and update partial alleles, HLAProfiler will facilitate further improvements to the existing database of reference HLA alleles. HLAProfiler is available at https://expressionanalysis.github.io/HLAProfiler/
Applicability of Precision Medicine Approaches to Managing Hypertension in Rural Populations
As part of the Heart Healthy Lenoir Project, we developed a practice level intervention to improve blood pressure control. The goal of this study was: (i) to determine if single nucleotide polymorphisms (SNPs) that associate with blood pressure variation, identified in large studies, are applicable to blood pressure control in subjects from a rural population; (ii) to measure the association of these SNPs with subjects' responsiveness to the hypertension intervention; and (iii) to identify other SNPs that may help understand patient-specific responses to an intervention. We used a combination of candidate SNPs and genome-wide analyses to test associations with either baseline systolic blood pressure (SBP) or change in systolic blood pressure one year after the intervention in two genetically defined ancestral groups: African Americans (AA) and Caucasian Americans (CAU). Of the 48 candidate SNPs, 13 SNPs associated with baseline SBP in our study; however, one candidate SNP, rs592582, also associated with a change in SBP after one year. Using our study data, we identified 4 and 15 additional loci that associated with a change in SBP in the AA and CAU groups, respectively. Our analysis of gene-age interactions identified genotypes associated with SBP improvement within different age groups of our populations. Moreover, our integrative analysis identified AQP4-AS1 and PADI2 as genes whose expression levels may contribute to the pleiotropy of complex traits involved in cardiovascular health and blood pressure regulation in response to an intervention targeting hypertension. In conclusion, the identification of SNPs associated with the success of a hypertension treatment intervention suggests that genetic factors in combination with age may contribute to an individual's success in lowering SBP.
If these findings prove to be applicable to other populations, the use of this genetic variation in making patient-specific interventions may help providers with making decisions to improve patient outcomes. Further investigation is required to determine the role of this genetic variance with respect to the management of hypertension such that more precise treatment recommendations may be made in the future as part of personalized medicine
Accurate Whole-Genome Sequencing and Haplotyping from 10 to 20 Human Cells
Recent advances in whole genome sequencing have brought the vision of personal genomics and genomic medicine closer to reality. However, current methods lack clinical accuracy and the ability to describe the context (haplotypes) in which genome variants co-occur in a cost-effective manner. Here we describe a low-cost DNA sequencing and haplotyping process, Long Fragment Read (LFR) technology, similar to sequencing long single DNA molecules without cloning or separation of metaphase chromosomes. In this study, ten LFR libraries were made using only ~100 pg of human DNA per sample. Up to 97% of the heterozygous single nucleotide variants (SNVs) were assembled into long haplotype contigs. Removal of false positive SNVs not phased by multiple LFR haplotypes resulted in a final genome error rate of 1 in 10 Mb. Cost-effective and accurate genome sequencing and haplotyping from 10-20 human cells, as demonstrated here, will enable comprehensive genetic studies and diverse clinical applications
Semantic integration of clinical laboratory tests from electronic health records for deep phenotyping and biomarker discovery.
Electronic Health Record (EHR) systems typically define laboratory test results using the Laboratory Observation Identifier Names and Codes (LOINC) and can transmit them using Fast Healthcare Interoperability Resource (FHIR) standards. LOINC has not yet been semantically integrated with computational resources for phenotype analysis. Here, we provide a method for mapping LOINC-encoded laboratory test results transmitted in FHIR standards to Human Phenotype Ontology (HPO) terms. We annotated the medical implications of 2923 commonly used laboratory tests with HPO terms. Using these annotations, our software assesses laboratory test results and converts each result into an HPO term. We validated our approach with EHR data from 15,681 patients with respiratory complaints and identified known biomarkers for asthma. Finally, we provide a freely available SMART on FHIR application that can be used within EHR systems. Our approach allows readily available laboratory tests in EHR to be reused for deep phenotyping and exploits the hierarchical structure of HPO to integrate distinct tests that have comparable medical interpretations for association studies
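The conversion step described in this abstract (a LOINC-coded result is assessed and turned into an HPO term) can be sketched as a lookup plus a range check. This is a minimal sketch, not the authors' software: the table structure, the function name and the reference range are assumptions, and the LOINC and HPO identifiers are given for illustration only and should be checked against current releases of both vocabularies.

```python
# Hypothetical annotation table: LOINC code -> reference range and HPO terms.
# The reference range (3.5 to 5.0 mmol/L for serum potassium) is an assumed
# example value, not taken from the source.
LOINC_TO_HPO = {
    "2823-3": {  # Potassium in serum or plasma (illustrative)
        "low": 3.5,
        "high": 5.0,
        "below": ("HP:0002900", "Hypokalemia"),
        "above": ("HP:0002153", "Hyperkalemia"),
    },
}


def result_to_hpo(loinc_code, value):
    """Map one numeric lab result to an (HPO id, label) pair.

    Returns None when the test is not annotated or the value lies inside
    the reference range.
    """
    rule = LOINC_TO_HPO.get(loinc_code)
    if rule is None:
        return None
    if value < rule["low"]:
        return rule["below"]
    if value > rule["high"]:
        return rule["above"]
    return None


print(result_to_hpo("2823-3", 6.1))
```

Because HPO is hierarchical, terms produced this way from different tests with comparable interpretations can be grouped under a shared parent term for association analysis, which is the integration benefit the abstract points to.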
Contribution of Distinct Homeodomain DNA Binding Specificities to Drosophila Embryonic Mesodermal Cell-Specific Gene Expression Programs
Homeodomain (HD) proteins are a large family of evolutionarily conserved transcription factors (TFs) having diverse developmental functions, often acting within the same cell types, yet many members of this family paradoxically recognize similar DNA sequences. Thus, with multiple family members having the potential to recognize the same DNA sequences in cis-regulatory elements, it is difficult to ascertain the role of an individual HD or a subclass of HDs in mediating a particular developmental function. To investigate this problem, we focused our studies on the Drosophila embryonic mesoderm where HD TFs are required to establish not only segmental identities (such as the Hox TFs), but also tissue and cell fate specification and differentiation (such as the NK-2 HDs, Six HDs and identity HDs (I-HDs)). Here we utilized the complete spectrum of DNA binding specificities determined by protein binding microarrays (PBMs) for a diverse collection of HDs to modify the nucleotide sequences of numerous mesodermal enhancers to be recognized by either no or a single subclass of HDs, and subsequently assayed the consequences of these changes on enhancer function in transgenic reporter assays. These studies show that individual mesodermal enhancers receive separate transcriptional input from both I-HD and Hox subclasses of HDs. In addition, we demonstrate that enhancers regulating upstream components of the mesodermal regulatory network are targeted by the Six class of HDs. Finally, we establish the necessity of NK-2 HD binding sequences to activate gene expression in multiple mesodermal tissues, supporting a potential role for the NK-2 HD TF Tinman (Tin) as a pioneer factor that cooperates with other factors to regulate cell-specific gene expression programs.
Collectively, these results underscore the critical role played by HDs of multiple subclasses in inducing the unique genetic programs of individual mesodermal cells, and in coordinating the gene regulatory networks directing mesoderm development. National Institutes of Health (U.S.) (Grant R01 HG005287)