Can Zipf's law be adapted to normalize microarrays?
BACKGROUND: Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of data in gene expression databases for varying organisms and tissues have shown that the majority of expressed genes exhibit a power-law distribution with an exponent close to -1 (i.e. they obey Zipf's law). Based on the observation that our single channel and two channel microarray data sets also followed a power-law distribution, we were motivated to develop a normalization method based on this law and to examine how it compares with existing published techniques. A computationally simple and intuitively appealing technique based on this observation is presented. RESULTS: Using pairwise comparisons based on MA plots (log ratio vs. log intensity), we compared this novel method to previously published normalization techniques, namely global normalization to the mean, the quantile method, and a variation on the loess normalization method designed specifically for boutique microarrays. Results indicated that, for single channel microarrays, the quantile method was superior with regard to eliminating intensity-dependent effects (banana curves), but Zipf's law normalization does minimize this effect by rotating the data distribution such that the maximal number of data points lie on the zero of the log ratio axis. For two channel boutique microarrays, the Zipf's law normalizations performed as well as, or better than, existing techniques. CONCLUSION: Zipf's law normalization is a useful tool where the quantile method cannot be applied, as is the case with microarrays containing functionally specific gene sets (boutique arrays).
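The abstract's core idea of "rotating the data distribution" can be illustrated with a minimal sketch: rank each array's intensities, fit log-intensity against log-rank, and rescale in log space so the fitted slope becomes -1 (Zipf's law). This is an assumption-laden simplification, not the paper's published procedure, whose fitting details may differ.

```python
import numpy as np

def zipf_normalize(intensities, target_exponent=-1.0):
    """Rescale one array's intensities so that log-intensity vs. log-rank
    follows a power law with the target exponent (Zipf's law).
    A simplified sketch; the published method may fit and rescale differently."""
    x = np.asarray(intensities, dtype=float)
    order = np.argsort(-x)                 # rank 1 = highest intensity
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(x) + 1)
    logx, logr = np.log(x), np.log(ranks)
    slope, intercept = np.polyfit(logr, logx, 1)   # observed power-law fit
    # Rotate the distribution in log space: replace the fitted slope with the
    # target exponent while preserving each point's residual about the fit.
    logx_norm = logx + (target_exponent - slope) * logr
    return np.exp(logx_norm)
```

Applying this per array puts all arrays on a common power-law scale, which is what allows the MA-plot comparisons described in the results.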
Broad Epigenetic Signature of Maternal Care in the Brain of Adult Rats
BACKGROUND: Maternal care is associated with long-term effects on behavior and with epigenetic programming of the NR3C1 (GLUCOCORTICOID RECEPTOR) gene in the hippocampus of both rats and humans. In the rat, these effects are reversed by cross-fostering, demonstrating that they are defined by epigenetic rather than genetic processes. However, epigenetic changes at a single gene promoter are unlikely to account for the range of outcomes and the persistent change in expression of hundreds of additional genes in adult rats in response to differences in maternal care. METHODOLOGY/PRINCIPAL FINDINGS: Here we use high-density oligonucleotide arrays to examine the state of DNA methylation, histone acetylation, and gene expression across a 7-million-base-pair region of chromosome 18 containing the NR3C1 gene in the hippocampus of adult rats. Natural variations in maternal care are associated with coordinated epigenetic changes spanning over a hundred kilobase pairs. The adult offspring of high- compared to low-maternal-care mothers show epigenetic changes in promoters, exons, and gene ends associated with higher transcriptional activity across many genes within the locus examined. Other genes in this region remain unchanged, indicating a clustered yet specific and patterned response. Interestingly, the chromosomal region containing the protocadherin-α, -β, and -γ (Pcdh) gene families implicated in synaptogenesis shows the highest differential response to maternal care. CONCLUSIONS/SIGNIFICANCE: These results suggest for the first time that the epigenetic response to maternal care is coordinated in clusters across broad genomic areas. The data indicate that this response involves not only single candidate-gene promoters but also transcriptional and intragenic sequences, as well as sequences residing distant from transcription start sites.
These epigenetic and transcriptional profiles constitute the first tiling microarray data set exploring the relationship between epigenetic modifications and RNA expression across both protein-coding and non-coding regions of a chromosomal locus in the mammalian brain.
Insights into distributed feature ranking
This version of the article: Bolón-Canedo, V., Sechidis, K., Sánchez-Maroño, N., Alonso-Betanzos, A., & Brown, G. (2019). ‘Insights into distributed feature ranking’ has been accepted for publication in Information Sciences, 496, 378–398. The Version of Record is available online at https://doi.org/10.1016/j.ins.2018.09.045. [Abstract]: In an era in which the volume and complexity of datasets are continuously growing, feature selection techniques have become indispensable for extracting useful information from huge amounts of data. However, existing algorithms may not scale well when dealing with huge datasets, and a possible solution is to distribute the data across several nodes. In this work we explore different ways of distributing the data (by features and by samples) and evaluate to what extent it is possible to obtain results similar to those obtained with the whole dataset. To address the challenge of distributing the feature ranking process, we performed experiments with different aggregation methods and feature rankers, and also evaluated the effect of distributing the feature ranking process on the subsequent classification performance.
This research has been economically supported in part by the Spanish Ministerio de Economía y Competitividad and FEDER funds of the European Union through the research project TIN2015-65069-C2-1-R; and by the Consellería de Industria of the Xunta de Galicia through the research project GRC2014/035. Financial support from the Xunta de Galicia (Centro singular de investigación de Galicia accreditation 2016-2019) and the European Union (European Regional Development Fund - ERDF) is gratefully acknowledged (research project ED431G/01). V. Bolón-Canedo acknowledges support of the Xunta de Galicia under postdoctoral Grant code ED481B 2014/164-0.
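The distributed-by-samples scheme the abstract describes can be sketched as: partition the samples across nodes, rank features independently on each partition, and combine the per-node rankings with an aggregation method. The ranker below (absolute Pearson correlation) and the aggregator (mean rank) are illustrative stand-ins; the paper evaluates several of each.

```python
import numpy as np

def rank_features(X, y):
    """Score each feature by absolute Pearson correlation with the target,
    then return each feature's rank position (0 = most relevant).
    A stand-in for any univariate feature ranker."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    scores = np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)
    return np.argsort(np.argsort(-scores))

def distributed_ranking(X, y, n_nodes=4, seed=0):
    """Partition samples across nodes, rank features on each partition,
    and aggregate by mean rank (one simple aggregation method)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    parts = np.array_split(idx, n_nodes)
    all_ranks = np.stack([rank_features(X[p], y[p]) for p in parts])
    return np.argsort(all_ranks.mean(axis=0))  # features ordered best-first
```

The question the paper studies is precisely how close such an aggregated ranking comes to the one computed on the whole, undistributed dataset.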
New Trends in Artificial Intelligence: Applications of Particle Swarm Optimization in Biomedical Problems
Optimization is the process of finding the most effective element or solution from a set of all possible resources or solutions. Many biological problems, ranging from biomolecule structure prediction to drug discovery, can benefit from a standard optimization protocol. Particle swarm optimization (PSO), proposed by Dr. Eberhart and Dr. Kennedy in 1995, is a population-based stochastic optimization technique. The researchers designed the method after being inspired by the social behavior of flocking birds and schooling fish. It shares numerous resemblances with evolutionary computation procedures such as genetic algorithms (GA). Since PSO is easy to implement and requires the adjustment of only a few parameters, it has gained more attention and advantages over other population-based algorithms. Hence, PSO is widely used in various research fields, ranging from artificial neural network training to other areas where GA can be applied.
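The core PSO loop is compact: each particle updates its velocity from an inertia term plus attraction toward its personal best and the swarm's global best, then moves. A minimal sketch minimizing an arbitrary function over a box (parameter values here are common defaults, not prescribed by the reviewed applications):

```python
import random

def pso(f, dim, n_particles=30, iters=200, bounds=(-5.0, 5.0),
        w=0.7, c1=1.5, c2=1.5, seed=42):
    """Minimal particle swarm optimization minimizing f over a box.
    Velocity update: inertia + cognitive (personal best) + social (global best)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

In the biomedical settings surveyed, `f` would be a problem-specific cost such as a docking score or a neural network's training error.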
Intelligent techniques using molecular data analysis in leukaemia: an opportunity for personalized medicine support system
The use of intelligent techniques in medicine has brought a ray of hope for treating leukaemia patients. Personalized treatment uses a patient's genetic profile to select a mode of treatment. This process makes use of molecular technology and machine learning to determine the most suitable approach to treating a leukaemia patient. Until now, no reviews have been published from a computational perspective concerning the development of personalized-medicine intelligent techniques for leukaemia patients using molecular data analysis. This review studies the published empirical research on personalized medicine in leukaemia and synthesizes findings across studies related to intelligent techniques in leukaemia, with specific attention to particular categories of these studies, to help identify opportunities for further research into personalized medicine support systems in chronic myeloid leukaemia. A systematic search was carried out to identify studies using intelligent techniques in leukaemia and to categorize these studies based on leukaemia type as well as the task, data source, and purpose of the studies. Most studies used molecular data analysis for personalized medicine, but future advancement for leukaemia patients requires molecular models that use advanced machine-learning methods to automate decision-making in treatment management and deliver supportive medical information to the patient in clinical practice.
Haneen Banjar, David Adelson, Fred Brown, and Naeem Chaudhr
Meta-analysis of massively parallel reporter assays enables prediction of regulatory function across cell types.
Deciphering the potential of noncoding loci to influence gene regulation has been the subject of intense research, with important implications for understanding the genetic underpinnings of human diseases. Massively parallel reporter assays (MPRAs) can measure the regulatory activity of thousands of DNA sequences and their variants in a single experiment. With an increasing number of publicly available MPRA data sets, one can now develop data-driven models which, given a DNA sequence, predict its regulatory activity. Here, we performed a comprehensive meta-analysis of several MPRA data sets in a variety of cellular contexts. We first applied an ensemble of methods to predict MPRA output in each context and observed that the most predictive features are consistent across data sets. We then demonstrate that predictive models trained in one cellular context can be used to predict MPRA output in another, with the loss of accuracy attributed to cell-type-specific features. Finally, we show that our approach achieves top performance in the Fifth Critical Assessment of Genome Interpretation "Regulation Saturation" Challenge for predicting the effects of single-nucleotide variants. Overall, our analysis provides insights into how MPRA data can be leveraged to highlight functional regulatory regions throughout the genome and can guide effective design of future experiments by better prioritizing regions of interest.
Making open data work for plant scientists
Despite the clear demand for open data sharing, its implementation within plant science is still limited. This is, at least in part, because open data sharing raises several unanswered questions and challenges to current research practices. In this commentary, some of the challenges encountered by plant researchers at the bench when generating, interpreting, and attempting to disseminate their data are highlighted. The difficulties involved in sharing sequencing, transcriptomics, proteomics, and metabolomics data are reviewed. The benefits and drawbacks of three data-sharing venues currently available to plant scientists are identified and assessed: (i) journal publication; (ii) university repositories; and (iii) community and project-specific databases. It is concluded that community and project-specific databases are the most useful to researchers interested in effective data sharing, since these databases are explicitly created to meet the researchers’ needs, support extensive curation, and embody a heightened awareness of what it takes to make data reusable by others. Such bottom-up and community-driven approaches need to be valued by the research community, supported by publishers, and provided with long-term sustainable support by funding bodies and government. At the same time, these databases need to be linked to generic databases where possible, in order to be discoverable to the majority of researchers and thus promote effective and efficient data sharing. As we look forward to a future that embraces open access to data and publications, it is essential that data policies, data curation, data integration, data infrastructure, and data funding are linked together so as to foster data access and research productivity.