1,451 research outputs found

Using Ontology Fingerprints to evaluate genome-wide association study results

Author: Lam C. Tsoi
Michael Boehnke
Richard L. Klein
W. Jim Zheng
Publication date: 14/08/2009

We describe an approach to characterize genes or phenotypes via ontology fingerprints, which are composed of Gene Ontology (GO) terms overrepresented among the PubMed abstracts linked to those genes or phenotypes. We then quantify the biological relevance between genes and phenotypes by comparing their ontology fingerprints to calculate a similarity score. We validated this approach by correctly identifying genes belonging to their biological pathways with high accuracy, and applied it to evaluate a GWA study by ranking genes associated with lipid concentrations in plasma, as well as to prioritize genes within linkage disequilibrium (LD) blocks. We found that the genes with the highest scores were: ABCA1, LPL, and CETP for HDL; LDLR, APOE and APOB for LDL; and LPL, APOA1 and APOB for triglyceride. In addition, we identified some top-ranked genes linked to lipid metabolism in the literature even in cases where such knowledge was not reflected in the current annotation of these genes. These results demonstrate that ontology fingerprints can be used effectively to prioritize genes from GWA studies for experimental validation

Nature Precedings

Integration of the Gene Ontology into an object-oriented architecture

Author: Shegogue Daniel
Zheng W Jim
Publication venue: BioMed Central
Publication date: 01/05/2005

BACKGROUND: To standardize gene product descriptions, a formal vocabulary defined as the Gene Ontology (GO) has been developed. GO terms have been categorized into biological processes, molecular functions, and cellular components. However, there is no single representation that integrates all the terms into one cohesive model. Furthermore, GO definitions have little information explaining the underlying architecture that forms these terms, such as the dynamic and static events occurring in a process. In contrast, object-oriented models have been developed to show dynamic and static events. A portion of the TGF-beta signaling pathway, which is involved in numerous cellular events including cancer, differentiation and development, was used to demonstrate the feasibility of integrating the Gene Ontology into an object-oriented model. RESULTS: Using object-oriented models we have captured the static and dynamic events that occur during a representative GO process, "transforming growth factor-beta (TGF-beta) receptor complex assembly" (GO:0007181). CONCLUSION: We demonstrate that the utility of GO terms can be enhanced by object-oriented technology, and that the GO terms can be integrated into an object-oriented model by serving as a basis for the generation of object functions and attributes

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Genome3D: A Viewer-Model Framework for Integrating and Visualizing Multi-Scale Epigenomic Information within a Three-Dimensional Genome

Author: Asbury Thomas M.
Mitman Matt
Tang Jijun
Zheng W. Jim
Publication venue: Scholar Commons
Publication date: 01/01/2010

Background: New technologies are enabling the measurement of many types of genomic and epigenomic information at scales ranging from the atomic to the nuclear. Much of this new data is increasingly structural in nature, and is often difficult to coordinate with other data sets. There is a legitimate need for integrating and visualizing these disparate data sets to reveal structural relationships not apparent when looking at these data in isolation. Results: We have applied object-oriented technology to develop a downloadable visualization tool, Genome3D, for integrating and displaying epigenomic data within a prescribed three-dimensional physical model of the human genome. In order to integrate and visualize large volumes of data, novel statistical and mathematical approaches have been developed to reduce the size of the data. To our knowledge, this is the first such tool that can visualize the human genome in three dimensions. We describe here the major features of Genome3D and discuss our multi-scale data framework using a representative basic physical model. We then demonstrate many of the issues and benefits of multi-resolution data integration. Conclusions: Genome3D is a software visualization tool that explores a wide range of structural genomic and epigenetic data. Data from various sources of differing scales can be integrated within a hierarchical framework that is easily adapted to new developments concerning the structure of the physical genome. In addition, our tool has a simple annotation mechanism to incorporate non-structural information. Genome3D is unique in its ability to manipulate large amounts of multi-resolution data from diverse sources to uncover complex and new structural relationships within the genome

Springer - Publisher Connector

Scholar Commons - Institutional Repository of the University of South Carolina

PubMed Central

Consistent Differential Expression Pattern (CDEP) on microarray to identify genes related to metastatic behavior

Author: Qin Tingting
Slate Elizabeth H
Tsoi Lam C
Zheng W Jim
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: To utilize the large volume of gene expression information generated from different microarray experiments, several meta-analysis techniques have been developed. Despite these efforts, there remain significant challenges to effectively increasing the statistical power and decreasing the Type I error rate while pooling the heterogeneous datasets from public resources. The objective of this study is to develop a novel meta-analysis approach, Consistent Differential Expression Pattern (CDEP), to identify genes with common differential expression patterns across different datasets. Results: We combined False Discovery Rate (FDR) estimation and the non-parametric RankProd approach to estimate the Type I error rate in each microarray dataset of the meta-analysis. These Type I error rates from all datasets were then used to identify genes with common differential expression patterns. Our simulation study showed that CDEP achieved higher statistical power and maintained low Type I error rate when compared with two recently proposed meta-analysis approaches. We applied CDEP to analyze microarray data from different laboratories that compared transcription profiles between metastatic and primary cancer of different types. Many genes identified as differentially expressed consistently across different cancer types are in pathways related to metastatic behavior, such as ECM-receptor interaction, focal adhesion, and blood vessel development. We also identified novel genes such as AMIGO2, Gem, and CXCL11 that have not been shown to associate with, but may play roles in, metastasis. Conclusions: CDEP is a flexible approach that borrows information from each dataset in a meta-analysis in order to identify genes being differentially expressed consistently. We have shown that CDEP can gain higher statistical power than other existing approaches under a variety of settings considered in the simulation study, suggesting its robustness and insensitivity to data variation commonly associated with microarray experiments. Availability: CDEP is implemented in R and freely available at: http://genomebioinfo.musc.edu/CDEP/ Contact: [email protected]

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Deep Blue Documents at the University of Michigan

Improving Transmission Efficiency of Large Sequence Alignment/Map (SAM) Files

Author: C Kozanitis
C Wang
Chin-Tser Huang
H Li
Jijun Tang
Leonardo Mariño-Ramírez
Muhammad Nazmus Sakib
S Deorowicz
W. Jim Zheng
Publication venue: Public Library of Science
Publication date: 01/01/2011

Research in bioinformatics primarily involves the collection and analysis of large volumes of genomic data. Naturally, it demands efficient storage and transfer of this data. In recent years, some research has been done to find efficient compression algorithms to reduce the size of various sequencing data. One way to improve the transmission time of large files is to apply maximal lossless compression to them. In this paper, we present SAMZIP, a specialized encoding scheme for sequence alignment data in SAM (Sequence Alignment/Map) format, which improves the compression ratio of existing compression tools. To achieve this, we exploit prior knowledge of the file format and specifications. Our experimental results show that our encoding scheme improves the compression ratio, thereby reducing overall transmission time significantly
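
The abstract does not describe SAMZIP's actual encoding, so the following is only a minimal sketch of one generic way to exploit knowledge of a file format before compression: grouping the tab-separated SAM fields by column so that similar values sit together, then compressing each column with zlib. The function names `encode_sam_columns` and `decode_sam_columns` and the container layout are hypothetical, and the sketch assumes well-formed input in which every alignment line has the 11 mandatory SAM fields.

```python
import struct
import zlib

MANDATORY_FIELDS = 11  # the SAM format defines 11 mandatory tab-separated fields


def encode_sam_columns(sam_text: str) -> bytes:
    """Losslessly encode SAM text column by column (illustrative only, not SAMZIP)."""
    lines = [ln for ln in sam_text.split("\n") if ln]
    header = [ln for ln in lines if ln.startswith("@")]
    rows = [ln.split("\t") for ln in lines if not ln.startswith("@")]

    # Stream 0 holds header lines, streams 1..11 hold one mandatory field each,
    # and the last stream holds each record's optional fields joined by tabs.
    streams = [header]
    for i in range(MANDATORY_FIELDS):
        streams.append([row[i] for row in rows])
    streams.append(["\t".join(row[MANDATORY_FIELDS:]) for row in rows])

    out = bytearray()
    for values in streams:
        blob = zlib.compress("\n".join(values).encode("utf-8"), 9)
        # Prefix each stream with its value count and compressed size.
        out += struct.pack(">II", len(values), len(blob)) + blob
    return bytes(out)


def decode_sam_columns(data: bytes) -> str:
    """Invert encode_sam_columns, returning newline-terminated SAM text."""
    streams = []
    pos = 0
    while pos < len(data):
        count, size = struct.unpack_from(">II", data, pos)
        pos += 8
        text = zlib.decompress(data[pos:pos + size]).decode("utf-8")
        pos += size
        streams.append(text.split("\n") if count else [])

    header = streams[0]
    columns = streams[1:1 + MANDATORY_FIELDS]
    optional = streams[-1]

    rows = []
    for i, extra in enumerate(optional):
        fields = [column[i] for column in columns]
        if extra:
            fields.append(extra)
        rows.append("\t".join(fields))
    return "".join(line + "\n" for line in header + rows)
```

Whether such a column-wise layout actually beats compressing the raw file depends on the data; the sketch only demonstrates that the transformation is reversible and makes no claim about compression ratio.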

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Cancer Informatics for Cancer Centers (CI4CC): Building a Community Focused on Sharing Ideas and Best Practices to Improve Cancer Care and Patient Outcomes.

Author: Barnholtz-Sloan Jill S
Basu Amrita
Borowsky Alexander D
Bui Alex
DiGiovanna Jack
Garcia-Closas Montserrat
Genkinger Jeanine M
Gerke Travis
Induni Marta
Kibbe Warren A
Lacey James V, Jr
Mirel Lisa
Nadaf Sorena
Permuth Jennifer B
Rollison Dana E
Saltz Joel
Shenkman Elizabeth A
Ulrich Cornelia M
Zheng W Jim
Publication venue: eScholarship, University of California
Publication date: 01/02/2020

Cancer Informatics for Cancer Centers (CI4CC) is a grassroots, nonprofit 501c3 organization intended to provide a focused national forum for engagement of senior cancer informatics leaders, primarily aimed at academic cancer centers anywhere in the world but with a special emphasis on the 70 National Cancer Institute-funded cancer centers. Although each of the participating cancer centers is structured differently, and leaders' titles vary, we know firsthand there are similarities in both the issues we face and the solutions we achieve. As a consortium, we have initiated a dedicated listserv, an open-initiatives program, and targeted biannual face-to-face meetings. These meetings are a place to review our priorities and initiatives, providing a forum for discussion of the strategic and pragmatic issues we, as informatics leaders, individually face at our respective institutions and cancer centers. Here we provide a brief history of the CI4CC organization and meeting highlights from the latest CI4CC meeting that took place in Napa, California from October 14-16, 2019. The focus of this meeting was "intersections between informatics, data science, and population science." We conclude with a discussion on "hot topics" on the horizon for cancer informatics

eScholarship - University of California

Distinct Profiles of Specialized Pro-resolving Lipid Mediators and Corresponding Receptor Gene Expression in Periodontal Inflammation

Author: Angelov Nikola
Ayilavarapu Srinivas
Bokka Nishantha R
Chen Wanqi
Ferguson Brittney
Lee Chun-Teh
Maddipati Krishna Rao
Van Dyke Thomas E
Weltman Robin
Zheng W Jim
Zhu Lisha
Publication venue: DigitalCommons@TMC
Publication date: 01/01/2020

Polyunsaturated fatty acid-derived specialized pro-resolving lipid mediators (SPMs) play an important role in modulating inflammation. The aim of the study was to compare profiles of SPMs, SPM-related lipid mediators and SPM receptor gene expression in the gingiva of subjects with periodontitis to those of healthy controls. A total of 28 subjects were included: 13 periodontally healthy and 15 with periodontitis before or after non-surgical periodontal therapy. Gingival tissues were collected from two representative posterior teeth prior to and 8 weeks after scaling and root planing; only once in the healthy group. Lipid mediator-SPM metabololipidomics was performed to identify metabolites in gingiva. qRT-PCR was performed to assess relative gene expression (

DigitalCommons@The Texas Medical Center

Combining comparative genomics with de novo motif discovery to identify human transcription factor DNA-binding motifs

Author: A Prakash
CT Harbison
G Pavesi
J Gertz
J Hu
Linyong Mao
M Blanchette
M Kellis
M Tompa
S Aerts
S Sinha
SR Eddy
T Wang
W Jim Zheng
WW Wasserman
X Li
X Xie
Y Liu
Publication venue: BioMed Central
Publication date: 12/12/2006

BACKGROUND: As more and more genomes are sequenced, comparative genomics approaches provide a methodology for identifying conserved regulatory elements that may be involved in gene regulation. RESULTS: We developed a novel method to combine comparative genomics with de novo motif discovery to identify human transcription factor binding motifs that are overrepresented and conserved in the upstream regions of a set of co-regulated genes. The method was validated by analyzing a well-characterized muscle-specific gene set, and the results showed that our approach performed better than existing programs in terms of sensitivity and prediction rate. CONCLUSION: The newly developed method can be used to extract regulatory signals in co-regulated genes, which can be derived from microarray clustering analysis

Crossref

Springer - Publisher Connector

PubMed Central

A genetic variation map for chicken with 2.8 million single-nucleotide polymorphisms

Author: Aerts Andrea
Andersson Björn
Andersson Leif
Bartley Neil
Boardman Paul E
Bovenhuis Henk
Brandström Mikael
Bumstead Nat
Burt David W
Chen Chen
Chen Jie
Cheng Hans H
Consortium International Chicken Polymorphism Map
Crooijmans Richard P M A
Dai Mingtao
de Koning Dirk-Jan
Dong Le
Dong Wei
Ellegren Hans
Glavina Tijana
Gordon Laurie
Groenen Martien A M
Gunnarsson Ulrika
Hao Bailin
He Dandan
He Ximiao
Hillier Ladeana W
Hocking Paul M
Hu Songnian
Huang Xiangang
Huang Yanqing
Hubbard Simon J
Hunt Henry
Kaiser Pete
Kaufman Jim
Kindlund Ellen
Lamont Susan J
Lan Fengdi
Law Andy
Li Dawei
Li Guangyuan
Li Guoqing
Li Heng
Li Jun
Li Ning
Li Ruiqiang
Li Shengting
Li Songgang
Li Wenjie
Li Yuanzhe
Lin Wei
Liu Bin
Lucas Susan
Meng Qingshun
Morrice David
Ni Peixiang
Ovcharenko Ivan
Overton Ian M
Ponting Chris
Qi Qiuhui
Ran Longhua
Rogers Sally
Rothwell Lisa
Ruan Jue
Shi Jianping
Stubbs Lisa
Sun Yongqiao
Tammi Martti T
Tang Haizhou
Tong Wei
van der Poel Jan J
van Hateren Andy
Wahlberg Per
Walker Brian A
Wang Jian
Wang Jianjun
Wang Jing
Wang Jun
Wang Miaoheng
Wang Pei
Wang Xiaoling
Warren Wesley C
Webber Caleb
Wei Dong Qing
Wei Ning
Wilson Richard K
Wilson Stuart A
Wong Gane Ka-Shu
Xi Yan
Xie Fei
Yang Huanming
Yang Ning
Yang Shiaw-Pyng
Yang Xu
Yang Zheng
Ye Chen
Ye Jia
Young John R
Yu Jun
Yu Yingpu
Zeng Changqing
Zhang Jianguo
Zhang Jingjing
Zhang Xiaowei
Zhang Yunze
Zhang Zengjin
Zhang Zhenpeng
Zhang Zhi-Yong
Zhao Wenming
Zhao Yiqiang
Zheng Hongkun
Zheng Weimou
Zhou Huaijun
Zhou Jun
Zhou Yan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2004

We describe a genetic variation map for the chicken genome containing 2.8 million single-nucleotide polymorphisms (SNPs). This map is based on a comparison of the sequences of three domestic chicken breeds (a broiler, a layer and a Chinese silkie) with that of their wild ancestor, red jungle fowl. Subsequent experiments indicate that at least 90% of the variant sites are true SNPs, and at least 70% are common SNPs that segregate in many domestic breeds. Mean nucleotide diversity is about five SNPs per kilobase for almost every possible comparison between red jungle fowl and domestic lines, between two different domestic lines, and within domestic lines, in contrast to the notion that domestic animals are highly inbred relative to their wild ancestors. In fact, most of the SNPs originated before domestication, and there is little evidence of selective sweeps for adaptive alleles on length scales greater than 100 kilobases

Queen's University Belfast Research Portal

Edinburgh Research Explorer

Wageningen University & Research Publications

University of Gloucestershire Research Repository

The University of Manchester - Institutional Repository

University of Queensland eSpace

Digital Repository @ Iowa State University (ISU)

Online Research @ Cardiff

Oxford University Research Archive