Establishment of a integrative multi-omics expression database CKDdb in the context of chronic kidney disease (CKD)

Author: Fernandes Marco
Husi Holger
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Complex human traits such as chronic kidney disease (CKD) are a major health and financial burden in modern societies. Currently, CKD onset and progression are still not fully understood at the molecular level. Meanwhile, the prolific use of high-throughput omic technologies in disease biomarker discovery studies has yielded a vast amount of disjointed data that cannot be easily collated. Therefore, we aimed to develop a molecule-centric database featuring CKD-related experiments from the available literature. We established the Chronic Kidney Disease database CKDdb, an integrated and clustered information resource that covers multi-omic studies (microRNAs, genomics, peptidomics, proteomics and metabolomics) of CKD and related disorders, by performing literature data mining and manual curation. The CKDdb database contains differential expression data from 49395 molecule entries (redundant), of which 16885 are unique molecules (non-redundant), drawn from 377 manually curated studies of 230 publications. This database was intentionally built to allow disease pathway analysis through a systems approach in order to yield biological meaning by integrating all existing information, and it therefore has the potential to provide an in-depth understanding of the key molecular events that modulate CKD pathogenesis.

PubMed Central

Enlighten

Dimensionality Reduction Approach using Attributes Extraction and Attributes Selection in Gene Expression Databases

Author: Borges Helyane Bronoski
Matos Simone Nasser
Melo Rafael Felipe Tasaka de
Nievola Julio Cesar
Vieira Raimundo Osvaldo
Publication venue: American Academic Scientific Research Journal for Engineering, Technology, and Sciences
Publication date: 10/04/2021

Gene expression databases contain a large number of attributes. To cope with this, dimensionality reduction is used to reduce the number of attributes to be processed and to improve the generalization capability of learning methods by eliminating irrelevant and/or redundant data. This paper proposes an approach to dimensionality reduction that combines attribute extraction and attribute selection. For this, we used the Random Projection method for attribute extraction and the filter and wrapper approaches for attribute selection. The experiments were carried out on five gene expression microarray databases. The results showed that combining these approaches can provide promising results.

American Scientific Research Journal for Engineering, Technology, and Sciences (ASRJETS)
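
The combination described in the abstract above (attribute extraction by random projection, then attribute selection) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `random_projection` and `filter_select`, the ordering of the two steps, the Gaussian projection matrix, the class-mean filter score, and the synthetic data. The abstract does not specify the authors' pipeline, and the wrapper step is omitted.

```python
import random
import statistics


def random_projection(rows, k, seed=0):
    """Attribute extraction: project each row of d values onto k random directions."""
    rng = random.Random(seed)
    d = len(rows[0])
    # Gaussian projection matrix, scaled by 1/sqrt(k) (a common convention).
    matrix = [[rng.gauss(0.0, 1.0) / k ** 0.5 for _ in range(d)] for _ in range(k)]
    return [[sum(w * x for w, x in zip(direction, row)) for direction in matrix]
            for row in rows]


def filter_select(rows, labels, n_keep):
    """Attribute selection (filter style): keep the n_keep best-scoring columns.

    Score = |mean(class 1) - mean(class 0)| / overall standard deviation.
    This score is an illustrative choice, not the one used in the paper.
    """
    scores = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        pos = [v for v, y in zip(col, labels) if y == 1]
        neg = [v for v, y in zip(col, labels) if y == 0]
        spread = statistics.pstdev(col) or 1.0  # avoid division by zero
        gap = abs(statistics.fmean(pos) - statistics.fmean(neg))
        scores.append((gap / spread, j))
    keep = sorted(j for _, j in sorted(scores, reverse=True)[:n_keep])
    return [[r[j] for j in keep] for r in rows], keep


# Synthetic stand-in for a microarray matrix: 20 samples, 200 attributes,
# of which only the first 5 carry class signal.
rng = random.Random(1)
labels = [i % 2 for i in range(20)]
data = [[rng.gauss(float(y) if j < 5 else 0.0, 1.0) for j in range(200)]
        for y in labels]

projected = random_projection(data, k=30)             # 200 -> 30 attributes
reduced, kept = filter_select(projected, labels, 10)  # 30 -> 10 attributes
print(len(reduced), len(reduced[0]))
```

A wrapper approach would replace the fixed score with the cross-validated accuracy of a classifier trained on candidate attribute subsets, at a much higher computational cost.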

Inferring gene regulatory networks using ensembles of feature selection techniques

Author: Demeester Piet
Dhaene Tom
Geurts Pierre
Huynh-thu Vân anh
Ruyssinck Joeri
Saeys Yvan
Publication date: 01/01/2012

Ghent University Academic Bibliography

Challenges of Big Data Analysis

Author: Fan Jianqing
Han Fang
Liu Han
Publication venue: Oxford University Press (OUP)
Publication date: 06/02/2014

Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that cannot be discovered with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This article gives an overview of the salient features of Big Data and of how these features drive paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository
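
The spurious correlation named in the abstract above can be seen in a small simulation. This is a minimal sketch under stated assumptions, not the article's own experiment: the function names `pearson` and `max_abs_correlation`, the sample size of 50, and the predictor counts are all chosen here for illustration. The response and every predictor are drawn independently, so any sample correlation between them is spurious, yet the largest one grows as more predictors are screened.

```python
import random
import statistics


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def max_abs_correlation(n, p, seed=0):
    """Largest |correlation| between a response and p independent predictors.

    All variables are independent standard normals with n observations each,
    so the true correlation is zero for every predictor.
    """
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    best = 0.0
    for _ in range(p):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        best = max(best, abs(pearson(x, y)))
    return best


# With a fixed seed, the first 10 predictors are the same in every run,
# so the maximum can only stay equal or grow as p increases.
for p in (10, 100, 1000):
    print(p, round(max_abs_correlation(n=50, p=p), 3))
```

Selecting variables by their sample correlation with the response therefore becomes less reliable as dimensionality grows relative to sample size, which is one reason the abstract calls for new statistical paradigms.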

Simulating multiple faceted variability in single cell RNA sequencing.

Author: Xu Chenling
Yosef Nir
Zhang Xiuwei
Publication venue: eScholarship, University of California
Publication date: 01/06/2019

The abundance of new computational methods for processing and interpreting transcriptomes at a single cell level raises the need for in silico platforms for evaluation and validation. Here, we present SymSim, a simulator that explicitly models the processes that give rise to data observed in single cell RNA-Seq experiments. The components of the SymSim pipeline pertain to the three primary sources of variation in single cell RNA-Seq data: noise intrinsic to the process of transcription, extrinsic variation indicative of different cell states (both discrete and continuous), and technical variation due to low sensitivity and measurement noise and bias. We demonstrate how SymSim can be used for benchmarking methods for clustering, differential expression and trajectory inference, and for examining the effects of various parameters on their performance. We also show how SymSim can be used to evaluate the number of cells required to detect a rare population under various scenarios.

eScholarship - University of California