Previously Unidentified Changes in Renal Cell Carcinoma Gene Expression Identified by Parametric Analysis of Microarray Data

Author: Christman Michael F.
Cohen Herbert T.
Frampton Garrett M.
Gerry Norman P.
Lenburg Marc E.
Liou Louis S.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/11/2003

BACKGROUND. Renal cell carcinoma is a common malignancy that often presents as metastatic disease, for which there are no effective treatments. To gain insights into the mechanism of renal cell carcinogenesis, a number of genome-wide expression profiling studies have been performed. Surprisingly, there is very poor agreement among these studies as to which genes are differentially regulated. To better understand this lack of agreement, we profiled renal cell tumor gene expression using genome-wide microarrays (45,000 probe sets) and compared our analysis to previous microarray studies. METHODS. We hybridized total RNA isolated from renal cell tumors and adjacent normal tissue to Affymetrix U133A and U133B arrays. We removed samples with technical defects and removed probe sets that failed to exhibit sequence-specific hybridization in any of the samples. We detected differential gene expression in the resulting dataset with parametric methods and identified keywords that are overrepresented in the differentially expressed genes with the Fisher exact test. RESULTS. We identify 1,234 genes that are more than three-fold changed in renal tumors by t-test, 800 of which have not been previously reported to be altered in renal cell tumors. Of the only 37 genes that have been identified as differentially expressed in three or more of five previous microarray studies of renal tumor gene expression, our analysis finds 33 (89%). A key to the sensitivity and power of our analysis is filtering out defective samples and genes that are not reliably detected. CONCLUSIONS. The widespread use of sample-wise voting schemes for detecting differential expression that do not control for false positives likely accounts for the poor overlap among previous studies. Among the many genes we identified using parametric methods that were not previously reported as differentially expressed in renal cell tumors are several oncogenes and tumor suppressor genes that likely play important roles in renal cell carcinogenesis. This highlights the need for rigorous statistical approaches in microarray studies. National Institutes of Healt
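
The keyword overrepresentation step mentioned in the abstract above can be illustrated with a one-sided Fisher exact test. This is only a minimal sketch and not the authors' code: the function name `fisher_overrepresentation_p` and the example counts are hypothetical, and the abstract does not say whether a one-sided or two-sided test was used.

```python
from math import comb


def fisher_overrepresentation_p(k, n, K, N):
    """One-sided Fisher exact p-value for overrepresentation.

    N: total number of genes considered (hypothetical input)
    K: genes annotated with the keyword
    n: differentially expressed genes
    k: differentially expressed genes annotated with the keyword

    Returns P(X >= k) for X following a hypergeometric distribution.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the hypergeometric probabilities of observing k or more hits.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Made-up counts: all 5 differentially expressed genes carry the keyword,
# although only 5 of the 10 genes overall do.
p = fisher_overrepresentation_p(k=5, n=5, K=5, N=10)
print(p)  # 1/252, about 0.004
```

A small p-value suggests the keyword appears among the differentially expressed genes more often than chance alone would explain.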

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

PubMed Central

A systematic review of data quality issues in knowledge discovery tasks

Author: Corrales David Camilo
Corrales Juan Carlos
Ledezma Agapito Ismael
Publication venue: 'Universidad de Medellin'
Publication date: 07/11/2015

The volume of data is growing rapidly because organizations continuously capture data to support better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; however, much of the data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Universidad de Medellín: Revistas Científicas

Repositorio Institucional Universidad de Medellín

DIALNET

An integrated approach for identifying wrongly labeled samples when performing classification in microarray data

Author: Chang CQ
Hung YS
LEUNG YY
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

Directory of Open Access Journals

PubMed Central

HKU Scholars Hub

A kernel-based approach for detecting outliers of high-dimensional biological data

Author: A Malossini
B Schölkopf
C Aggarwal
D Koller
E Knorr
F Angiulli
H Ressom
J Oh
Jean Gao
JS Wang
Jung Hun Oh
K Kadota
L Manevitz
M Tumminello
R Lilien
S Bandyopadhyay
S Zhou
T Fawcett
T Golub
U Alon
W Lee
Publication venue: BioMed Central
Publication date: 29/04/2009

Crossref

Springer - Publisher Connector

PubMed Central

An integrated approach to feature selection and classification for microarray data with outlier detection

Author: Hung YS
Leung YY
Publication date: 01/01/2009

The 8th Annual International Conference on Computational Systems Bioinformatics (CSB 2009), Stanford University, 10-13 August 2009

HKU Scholars Hub

Exon expression arrays as a tool to identify new cancer genes

Author: A Hollestelle
A Schroeder
Antoinette Hollestelle
BJ Blencowe
C Greenman
Christopher Arendt
Elza Duijm
F Elstrodt
Fons Elstrodt
GH Su
J Li
Jord H. A. Nagel
Justine K. Peeters
KF Becker
L Frederick
Linda B. C. Bralten
M van de Wetering
M Wasielewski
Maartje J. Vuerhard
Marijke Wasielewski
Mieke Schutte
NA Faustino
P Huusko
PA Futreal
Peter A. Sillevis Smitt
Peter van der Spek
Pim J. French
PJ French
PS Mischel
R Nishikawa
SA Hahn
SA Tomlins
T Sjoblom
TA Clark
Y Samuels
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2008

Background: Identification of genes that are causally implicated in oncogenesis is a major goal in cancer research. An estimated 10-20% of cancer-related gene mutations result in skipping of one or more exons in the encoded transcripts. Here we report on a strategy to screen in a global fashion for such exon-skipping events using PAttern based Correlation (PAC). The PAC algorithm has been used previously to identify differentially expressed splice variants between two predefined subgroups. As genetic changes in cancer are sample specific, we tested the ability of PAC to identify aberrantly expressed exons in single samples. Principal Findings: As a proof of principle, we tested the PAC strategy on human cancer samples in which the complete coding sequence of eight cancer genes had been screened for mutations. PAC detected all seven exon-skipping mutants among 12 cancer cell lines. PAC also identified exon-skipping mutants in clinical cancer specimens, although detection was compromised due to heterogeneous (wild-type) transcript expression. PAC reduced the number of candidate genes/exons for subsequent mutational analysis by two to three orders of magnitude and had a substantial true positive rate. Importantly, of 112 randomly selected outlier exons, sequence analysis identified two novel exon-skipping events, two novel base changes and 21 previously reported base changes (SNPs). Conclusions: The ability of PAC to enrich for mutated transcripts and to identify known and novel genetic changes confirms its suitability as a strategy to identify candidate cancer genes.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

EUR Research Repository

Erasmus University Digital Repository

Detecting Outlier Microarray Arrays by Correlation and Percentage of Outliers Spots

Author: Guo Xiang
Heckman Caroline
Hooke Jeffrey
Hu Hai
Liebman Michael N.
Papcunik Denise
Shriver Craig D.
Yang Song
Yang Yaw-Ching
Publication venue: Libertas Academica
Publication date: 01/01/2006

We developed a quality assurance (QA) tool, namely microarray outlier filter (MOF), and have applied it to our microarray datasets for the identification of problematic arrays. Our approach is based on the comparison of the arrays using the correlation coefficient and the number of outlier spots generated on each array to reveal outlier arrays. For a human universal reference (HUR) dataset, which is used as a technical control in our standard hybridization procedure, 3 outlier arrays were identified out of 35 experiments. For a human blood dataset, 12 outlier arrays were identified from 185 experiments. In general, arrays from human blood samples displayed greater variation in their gene expression profiles than arrays from HUR samples. As a result, MOF identified two distinct patterns in the occurrence of outlier arrays. These results demonstrate that this methodology is a valuable QA practice to identify questionable microarray data prior to downstream analysis
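
The general idea in the abstract above, flagging arrays by their correlation with other arrays and by their count of outlier spots, can be sketched as follows. This is not the MOF tool itself: comparing each array against a spot-wise median reference, the thresholds `min_corr`, `spot_tol` and `max_outlier_frac`, and the toy data are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_outlier_arrays(arrays, min_corr=0.8, spot_tol=1.0, max_outlier_frac=0.3):
    """Flag arrays that correlate poorly with a median reference profile
    or that contain too many outlier spots (all thresholds are assumed)."""
    n_spots = len(next(iter(arrays.values())))
    # Spot-wise median across arrays serves as the reference profile.
    reference = [median(a[i] for a in arrays.values()) for i in range(n_spots)]
    flagged = {}
    for name, values in arrays.items():
        r = pearson(values, reference)
        n_out = sum(abs(v - ref) > spot_tol for v, ref in zip(values, reference))
        frac = n_out / n_spots
        if r < min_corr or frac > max_outlier_frac:
            flagged[name] = (r, frac)
    return flagged


# Toy log-intensity data: three consistent arrays and one scrambled array.
arrays = {
    "a1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "a2": [1.1, 2.0, 2.9, 4.2, 5.1, 5.9],
    "a3": [0.9, 2.1, 3.1, 3.9, 4.8, 6.2],
    "a4": [6.0, 1.0, 5.0, 2.0, 4.0, 3.0],
}
flagged = flag_outlier_arrays(arrays)
print(sorted(flagged))  # ['a4']
```

In practice the thresholds would need to be tuned per dataset, since the abstract notes that blood samples vary more than reference samples.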

Directory of Open Access Journals

PubMed Central

Multivariate classification of gene expression microarray data

Author: Botella Pérez Cristina
Publication venue: 'Universitat Rovira I Virgili'
Publication date: 01/01/2010

Gene expression data obtained from microarray analysis are often used to classify cells. In this thesis, a probabilistic version of the Discriminant Partial Least Squares method (p-DPLS) is used to classify samples based on their gene expression. p-DPLS is based on Bayes' rule for the posterior probability. Such classifiers are forced to always assign a class. To overcome this limitation, a reject option has been implemented. This option allows samples with a high risk of misclassification (that is, ambiguous samples and outliers) to be rejected. The reject option combines criteria based on the x-residuals, the leverage and the predicted values. In addition, a variable selection method is developed to choose the most relevant genes, since most of the genes analyzed with a microarray are irrelevant for the particular classification purpose and may confuse the classifier. Finally, DPLS is extended to multi-class classification by combining PLS with linear discriminant analysis.
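
The reject option described in the abstract above can be illustrated with a small decision rule. This is a hedged sketch, not the thesis's p-DPLS implementation: the function name `classify_with_reject`, the three thresholds, and the way the criteria are combined are assumptions, and the posteriors, x-residual and leverage are taken as already computed by some upstream model.

```python
def classify_with_reject(posteriors, x_residual, leverage,
                         min_posterior=0.9, max_residual=3.0, max_leverage=0.5):
    """Return a class label, or a rejection reason for risky samples.

    posteriors: dict mapping class label to posterior probability
    x_residual, leverage: diagnostics from a fitted model (assumed inputs)
    All thresholds are illustrative assumptions.
    """
    # Samples far from the model space are rejected as outliers.
    if x_residual > max_residual or leverage > max_leverage:
        return "reject: outlier"
    label, p = max(posteriors.items(), key=lambda kv: kv[1])
    # Samples without a clearly dominant class are rejected as ambiguous.
    if p < min_posterior:
        return "reject: ambiguous"
    return label


print(classify_with_reject({"tumor": 0.97, "normal": 0.03}, 1.2, 0.1))  # tumor
print(classify_with_reject({"tumor": 0.55, "normal": 0.45}, 1.2, 0.1))  # reject: ambiguous
print(classify_with_reject({"tumor": 0.97, "normal": 0.03}, 5.0, 0.1))  # reject: outlier
```

Checking the outlier criteria first means a sample outside the model space is never assigned a class, even if its posterior looks confident.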

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Tesis Doctorals en Xarxa

Repositori Institucional URV

Improved quality control processing of peptide-centric LC-MS proteomics data

Author: Amy C. Sims
Anderson
Barnett
Bobbie-Jo M. Webb-Robertson
Bukhman
Caroni
Cho
Croux
Daly
Dixon
Filzmoser
Grubbs
Hawkins
Hoaglin
Jain
Jaitly
Joel G. Pounds
Jon M. Jacobs
Karpievitch
Katrina M. Waters
Kauffmann
Kemmeren
Lee
Li
MacCoss
Mahalanobis
Melissa M. Matzke
Metz
Monroe
Oberg
Piening
Ralph S. Baric
Rocke
Rudnick
Schulz-Trieglaff
Smith
Stead
Thomas O. Metz
Webb-Robertson
Wilson
Xia
Publication venue: Oxford University Press
Publication date: 01/01/2011

Motivation: In the analysis of differential peptide peak intensities (i.e. abundance measures), LC-MS analyses with poor quality peptide abundance data can bias downstream statistical analyses and hence the biological interpretation for an otherwise high-quality dataset. Although considerable effort has been placed on assuring the quality of the peptide identification with respect to spectral processing, to date quality assessment of the subsequent peptide abundance data matrix has been limited to a subjective visual inspection of run-by-run correlation or individual peptide components. Identifying statistical outliers is a critical step in the processing of proteomics data as many of the downstream statistical analyses [e.g. analysis of variance (ANOVA)] rely upon accurate estimates of sample variance, and their results are influenced by extreme values

Crossref

PubMed Central

Carolina Digital Repository