
    Can Zipf's law be adapted to normalize microarrays?

    BACKGROUND: Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of data in gene expression databases for various organisms and tissues have shown that the majority of expressed genes exhibit a power-law distribution with an exponent close to -1 (i.e. obey Zipf's law). Based on the observation that our single channel and two channel microarray data sets also followed a power-law distribution, we were motivated to develop a normalization method based on this law, and examine how it compares with existing published techniques. A computationally simple and intuitively appealing technique based on this observation is presented. RESULTS: Using pairwise comparisons on MA plots (log ratio vs. log intensity), we compared this novel method to previously published normalization techniques, namely global normalization to the mean, the quantile method, and a variation on the loess normalization method designed specifically for boutique microarrays. Results indicated that, for single channel microarrays, the quantile method was superior with regard to eliminating intensity-dependent effects (banana curves), but Zipf's law normalization does minimize this effect by rotating the data distribution such that the maximal number of data points lie on the zero of the log ratio axis. For two channel boutique microarrays, the Zipf's law normalizations performed as well as, or better than, existing techniques. CONCLUSION: Zipf's law normalization is a useful tool where the quantile method cannot be applied, as is the case with microarrays containing functionally specific gene sets (boutique arrays).
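To make the MA-plot comparison in the abstract above concrete, here is a minimal Python sketch of how the two axes are commonly computed, together with a simple global normalization to the mean. The function names, the use of base-2 logarithms, and the choice to rescale on the raw intensity scale are assumptions made for illustration; the abstract does not give these details, and this is not the authors' code. The Zipf's law method itself is not sketched because the abstract does not specify it.

```python
import math

def ma_values(x, y):
    """Return (M, A) for paired positive intensities x and y.

    M is the log ratio and A is the mean log intensity (base 2 assumed).
    """
    m = [math.log2(a) - math.log2(b) for a, b in zip(x, y)]
    a_vals = [0.5 * (math.log2(a) + math.log2(b)) for a, b in zip(x, y)]
    return m, a_vals

def global_mean_normalize(x, y):
    """Rescale y so that its mean intensity equals the mean of x.

    One simple reading of "global normalization to the mean" (assumed).
    """
    scale = (sum(x) / len(x)) / (sum(y) / len(y))
    return [v * scale for v in y]
```

After a successful normalization, the M values of most probes should scatter around zero across the whole range of A; a curved trend in M against A is the intensity-dependent "banana" effect the abstract refers to.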

    Broad Epigenetic Signature of Maternal Care in the Brain of Adult Rats

    BACKGROUND: Maternal care is associated with long-term effects on behavior and epigenetic programming of the NR3C1 (GLUCOCORTICOID RECEPTOR) gene in the hippocampus of both rats and humans. In the rat, these effects are reversed by cross-fostering, demonstrating that they are defined by epigenetic rather than genetic processes. However, epigenetic changes at a single gene promoter are unlikely to account for the range of outcomes and the persistent change in expression of hundreds of additional genes in adult rats in response to differences in maternal care. METHODOLOGY/PRINCIPAL FINDINGS: Using a high-density oligonucleotide array, we examine here the state of DNA methylation, histone acetylation and gene expression in a 7 million base pair region of chromosome 18 containing the NR3C1 gene in the hippocampus of adult rats. Natural variations in maternal care are associated with coordinate epigenetic changes spanning over a hundred kilobase pairs. The adult offspring of high compared to low maternal care mothers show epigenetic changes in promoters, exons, and gene ends associated with higher transcriptional activity across many genes within the locus examined. Other genes in this region remain unchanged, indicating a clustered yet specific and patterned response. Interestingly, the chromosomal region containing the protocadherin-α, -β, and -γ (Pcdh) gene families implicated in synaptogenesis shows the highest differential response to maternal care. CONCLUSIONS/SIGNIFICANCE: The results suggest for the first time that the epigenetic response to maternal care is coordinated in clusters across broad genomic areas. The data indicate that the epigenetic response to maternal care involves not only single candidate gene promoters but also transcriptional and intragenic sequences, as well as those residing distantly from transcription start sites. These epigenetic and transcriptional profiles constitute the first tiling microarray data set exploring the relationship between epigenetic modifications and RNA expression in both protein coding and non-coding regions across a chromosomal locus in the mammalian brain.

    Insights into distributed feature ranking

    This version of the article: Bolón-Canedo, V., Sechidis, K., Sánchez-Maroño, N., Alonso-Betanzos, A., & Brown, G. (2019). ‘Insights into distributed feature ranking’ has been accepted for publication in: Information Sciences, 496, 378–398. The Version of Record is available online at https://doi.org/10.1016/j.ins.2018.09.045. [Abstract]: In an era in which the volume and complexity of datasets are continuously growing, feature selection techniques have become indispensable to extract useful information from huge amounts of data. However, existing algorithms may not scale well when dealing with huge datasets, and a possible solution is to distribute the data across several nodes. In this work we explore the different ways of distributing the data (by features and by samples) and we evaluate to what extent it is possible to obtain results similar to those obtained with the whole dataset. To address the challenge of distributing the feature ranking process, we have performed experiments with different aggregation methods and feature rankers, and also evaluated the effect of distributing the feature ranking process on the subsequent classification performance. This research has been economically supported in part by the Spanish Ministerio de Economía y Competitividad and FEDER funds of the European Union through the research project TIN2015-65069-C2-1-R; and by the Consellería de Industria of the Xunta de Galicia through the research project GRC2014/035. Financial support from the Xunta de Galicia (Centro singular de investigación de Galicia accreditation 2016-2019) and the European Union (European Regional Development Fund - ERDF) is gratefully acknowledged (research project ED431G/01). V. Bolón-Canedo acknowledges support of the Xunta de Galicia under postdoctoral Grant code ED481B 2014/164-0. Xunta de Galicia; GRC2014/035. Xunta de Galicia; ED431G/01. Xunta de Galicia; ED481B 2014/164-
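The idea of distributing by samples, ranking features on each node, and then aggregating the rankings can be sketched as below. This is only an illustration under stated assumptions: the ranker (absolute difference of class means), the round-robin split, and the mean-position aggregation are hypothetical stand-ins, since the abstract does not name the specific rankers or aggregation methods used in the paper.

```python
from statistics import mean

def rank_features(rows, labels):
    """Rank feature indices, best first, by |mean(class 1) - mean(class 0)|.

    Hypothetical toy ranker; needs at least one sample of each class.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    return sorted(range(n_features), key=lambda j: -scores[j])

def aggregate_rankings(rankings):
    """Combine several rankings by mean position (lower position is better)."""
    n_features = len(rankings[0])
    mean_pos = [mean(r.index(j) for r in rankings) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: mean_pos[j])

def distributed_ranking(rows, labels, n_nodes):
    """Split samples round-robin over n_nodes, rank locally, then aggregate."""
    rankings = []
    for k in range(n_nodes):
        # Each "node" sees only every n_nodes-th sample, starting at offset k.
        rankings.append(rank_features(rows[k::n_nodes], labels[k::n_nodes]))
    return aggregate_rankings(rankings)
```

Comparing `distributed_ranking(rows, labels, n_nodes)` with `rank_features(rows, labels)` on the full data gives a simple check of how closely the distributed result matches the centralized one.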

    New Trends in Artificial Intelligence: Applications of Particle Swarm Optimization in Biomedical Problems

    Optimization is the process of finding the most effective element or solution from the set of all possible resources or solutions. Various biological problems, ranging from biomolecule structure prediction to drug discovery, can be improved by adopting a standard optimization protocol. Particle swarm optimization (PSO), proposed by Dr. Eberhart and Dr. Kennedy in 1995, is a population-based stochastic optimization technique. The researchers designed the method after being inspired by the social behavior of flocking birds and schooling fish. The method shares numerous similarities with evolutionary computation procedures such as genetic algorithms (GA). Because PSO algorithms are easy to implement and require adjusting only a few parameters, they have gained more attention than other population-based algorithms. Hence, PSO algorithms are widely used in various research fields, ranging from artificial neural network training to other areas where GA can be applied.
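For readers unfamiliar with PSO, the following is a minimal Python sketch of a common inertia-weight variant of the algorithm, minimizing an arbitrary objective function. The parameter names and default values (`inertia`, `c1`, `c2`, swarm size, iteration count) are illustrative assumptions rather than settings taken from the chapter, and the inertia term is a later refinement rather than part of the 1995 formulation.

```python
import random

def pso(objective, dim, bounds, n_particles=20, n_iters=100,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box using a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val
```

For example, calling `pso(lambda p: sum(v * v for v in p), dim=2, bounds=(-5.0, 5.0))` searches for the minimum of the sphere function, whose true minimum is at the origin. Only the swarm size, the inertia weight, and the two acceleration coefficients need tuning, which is the small number of parameters the abstract alludes to.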

    Intelligent techniques using molecular data analysis in leukaemia: an opportunity for personalized medicine support system

    The use of intelligent techniques in medicine has brought a ray of hope in terms of treating leukaemia patients. Personalized treatment uses a patient’s genetic profile to select a mode of treatment. This process makes use of molecular technology and machine learning to determine the most suitable approach to treating a leukaemia patient. Until now, no reviews have been published from a computational perspective concerning the development of personalized medicine intelligent techniques for leukaemia patients using molecular data analysis. This review studies the published empirical research on personalized medicine in leukaemia and synthesizes findings across studies related to intelligence techniques in leukaemia, with specific attention to particular categories of these studies to help identify opportunities for further research into personalized medicine support systems in chronic myeloid leukaemia. A systematic search was carried out to identify studies using intelligence techniques in leukaemia and to categorize these studies based on leukaemia type and also the task, data source, and purpose of the studies. Most studies used molecular data analysis for personalized medicine, but future advancement for leukaemia patients requires molecular models that use advanced machine-learning methods to automate decision-making in treatment management to deliver supportive medical information to the patient in clinical practice. Haneen Banjar, David Adelson, Fred Brown, and Naeem Chaudhr

    Making open data work for plant scientists

    Despite the clear demand for open data sharing, its implementation within plant science is still limited. This is, at least in part, because open data sharing raises several unanswered questions and challenges to current research practices. In this commentary, some of the challenges encountered by plant researchers at the bench when generating, interpreting, and attempting to disseminate their data have been highlighted. The difficulties involved in sharing sequencing, transcriptomics, proteomics, and metabolomics data are reviewed. The benefits and drawbacks of three data-sharing venues currently available to plant scientists are identified and assessed: (i) journal publication; (ii) university repositories; and (iii) community and project-specific databases. It is concluded that community and project-specific databases are the most useful to researchers interested in effective data sharing, since these databases are explicitly created to meet the researchers’ needs, support extensive curation, and embody a heightened awareness of what it takes to make data reusable by others. Such bottom-up and community-driven approaches need to be valued by the research community, supported by publishers, and provided with long-term sustainable support by funding bodies and government. At the same time, these databases need to be linked to generic databases where possible, in order to be discoverable to the majority of researchers and thus promote effective and efficient data sharing. As we look forward to a future that embraces open access to data and publications, it is essential that data policies, data curation, data integration, data infrastructure, and data funding are linked together so as to foster data access and research productivity.