Search CORE

3,881 research outputs found

Systematic bioinformatic analysis of expression levels of 17,330 human genes across 9,783 samples from 175 types of healthy and pathological tissues

Author: Elmar Bucher
Henri Sara
Henrik Edgren
Jaakko Astola
John-Patrick Mpindi
Kalle Ojala
Kristiina Iljin
Maija Wolf
Mari Bjorkman
Matthias Nees
Matti Saarela
Olli Kallioniemi
Paula Vainio
Reija Autio
Rolf I Skotheim
Saija Haapa-Paananen
Sami Kilpinen
Sampsa Hautaniemi
Tommi Pisto
Publication venue: Springer Nature
Publication date: 01/01/2008

Our knowledge of tissue- and disease-specific functions of human genes is rather limited and highly context-specific. Here, we have developed a method for the comparison of mRNA expression levels of most human genes across 9,783 Affymetrix gene expression array experiments representing 43 normal human tissue types, 68 cancer types, and 64 other diseases. This database of gene expression patterns in normal human tissues and pathological conditions covers 113 million datapoints and is available from the GeneSapiens website.

Crossref

Springer - Publisher Connector

PubMed Central

VTT Research System

Decorin protein core affects the global gene expression profile of the tumor microenvironment in a triple-negative orthotopic breast carcinoma xenograft model

Author: Buraschi Simone
Evans Barry
Iniguez Leonardo A.
Iozzo Renato V.
Neill Thomas
Owens Rick T.
Peiper Stephen C.
Purkins George
Schäfer Liliana
Vadigepalli Rajanikanth
Wang Zi-Xuan
Publication date: 01/01/2012

Decorin, a member of the small leucine-rich proteoglycan gene family, exists and functions wholly within the tumor microenvironment to suppress tumorigenesis by directly targeting and antagonizing multiple receptor tyrosine kinases, such as the EGFR and Met. This leads to potent and sustained signal attenuation, growth arrest, and angiostasis. We thus sought to evaluate the tumoricidal benefits of systemic decorin on a triple-negative orthotopic breast carcinoma xenograft model. To this end, we employed a novel high-density mixed expression array capable of differentiating and simultaneously measuring gene signatures of both Mus musculus (stromal) and Homo sapiens (epithelial) tissue origins. We found that decorin protein core modulated the differential expression of 374 genes within the stromal compartment of the tumor xenograft. Further, our top gene ontology classes strongly suggest an unexpected and preferential role for decorin protein core to inhibit genes necessary for immunomodulatory responses while simultaneously inducing expression of those possessing cellular adhesion and tumor suppressive gene properties. Rigorous verification of the top scoring candidates led to the discovery of three genes heretofore unlinked to malignant breast cancer that were reproducibly found to be induced in several models of tumor stroma. Collectively, our data provide highly novel and unexpected stromal gene signatures as a direct function of systemic administration of decorin protein core and reveal a fundamental basis of action for decorin to modulate the tumor stroma as a biological mechanism for the ascribed anti-tumorigenic properties.

Directory of Open Access Journals

PubMed Central

Jefferson Digital Commons

Hochschulschriftenserver - Universität Frankfurt am Main

FigShare

Statistical analysis of multivariate data in bioinformatics (Mitmemõõtmeliste andmete statistiline analüüs bioinformaatikas)

Author: Metsalu Tauno
Publication date: 29/01/2016

The electronic version of the dissertation does not include the publications. Proteins are one of the most important building blocks of an organism. By investigating the abundance of different proteins and the relations between them, it is possible to get information about the current state of the organism. Modern technologies make it possible to collect a large amount of protein-related data in a short period of time. Analyzing these data is quite complicated, however, and has given rise to a new field of science called bioinformatics.
The aim of the dissertation is to describe problems and solutions related to the statistical analysis of multivariate data. It is shown how this type of data can be presented as a matrix. An overview of data sources and analysis methods is given, and it is shown how they can be used in practice. The pan-European project PREDECT is described, in which many organizations contribute to developing better cancer models. An overview is given of collecting metadata from multiple partners and of the web tools created for initial data analysis. An analysis concerning a novel breast cancer model is described, and tissue slices are compared under different cultivation conditions. A freely available web tool is introduced that allows users to perform exploratory data analysis. The following chapters describe data analysis in various projects. Multiple novel genes with allele-specific expression were found in the human placenta. Molecular mechanisms of atopic dermatitis were examined, more specifically the influence of the protein interferon-gamma. MicroRNAs were found that can be used as markers for endometriosis, and a classifier was built to differentiate people with endometriosis from healthy people.

DSpace at Tartu University Library

Previously Unidentified Changes in Renal Cell Carcinoma Gene Expression Identified by Parametric Analysis of Microarray Data

Author: Christman Michael F.
Cohen Herbert T.
Frampton Garrett M.
Gerry Norman P.
Lenburg Marc E.
Liou Louis S.
Publication venue: Springer Science and Business Media LLC
Publication date: 27/11/2003

BACKGROUND. Renal cell carcinoma is a common malignancy that often presents as a metastatic disease for which there are no effective treatments. To gain insights into the mechanism of renal cell carcinogenesis, a number of genome-wide expression profiling studies have been performed. Surprisingly, there is very poor agreement among these studies as to which genes are differentially regulated. To better understand this lack of agreement, we profiled renal cell tumor gene expression using genome-wide microarrays (45,000 probe sets) and compared our analysis to previous microarray studies. METHODS. We hybridized total RNA isolated from renal cell tumors and adjacent normal tissue to Affymetrix U133A and U133B arrays. We removed samples with technical defects and removed probesets that failed to exhibit sequence-specific hybridization in any of the samples. We detected differential gene expression in the resulting dataset with parametric methods and identified keywords that are overrepresented in the differentially expressed genes with Fisher's exact test. RESULTS. We identify 1,234 genes that are more than three-fold changed in renal tumors by t-test, 800 of which have not been previously reported to be altered in renal cell tumors. Of the only 37 genes that have been identified as being differentially expressed in three or more of five previous microarray studies of renal tumor gene expression, our analysis finds 33 of these genes (89%). A key to the sensitivity and power of our analysis is filtering out defective samples and genes that are not reliably detected. CONCLUSIONS. The widespread use of sample-wise voting schemes for detecting differential expression that do not control for false positives likely accounts for the poor overlap among previous studies.
Among the many genes we identified using parametric methods that were not previously reported as being differentially expressed in renal cell tumors are several oncogenes and tumor suppressor genes that likely play important roles in renal cell carcinogenesis. This highlights the need for rigorous statistical approaches in microarray studies. National Institutes of Healt
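The keyword over-representation step mentioned in the METHODS of this abstract can be illustrated with a minimal sketch of a one-sided Fisher's exact test. This is not the authors' code: the function name `fisher_exact_enrichment`, its parameter names, and the counts in the usage line are all hypothetical, and the abstract does not say whether a one-sided or two-sided test was used.

```python
from math import comb


def fisher_exact_enrichment(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    N: total number of genes considered (hypothetical universe size)
    K: genes in the universe annotated with the keyword
    n: differentially expressed genes
    k: differentially expressed genes annotated with the keyword

    Returns P(X >= k), where X follows the hypergeometric distribution
    obtained by drawing n genes without replacement from N.
    """
    if not (0 <= k <= min(n, K) and max(n, K) <= N):
        raise ValueError("inconsistent counts")
    total = comb(N, n)  # number of ways to choose the n genes
    upper = min(n, K)   # largest possible overlap
    # Sum the probabilities of overlaps at least as large as the observed one.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    p = fisher_exact_enrichment(k=12, n=100, K=200, N=10000)
    print(f"enrichment p-value: {p:.3g}")
```

A small p-value here would indicate that the keyword appears among the differentially expressed genes more often than expected by chance; in practice the p-values for many keywords would also need a multiple-testing correction.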

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

PubMed Central

CAncer bioMarker Prediction Pipeline (CAMPP) - A standardized framework for the analysis of quantitative biological data

Author: Krogh Anders
Papaleo Elena
Terkelsen Thilde
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2020

With the improvement of -omics and next-generation sequencing (NGS) methodologies, along with the lowered cost of generating these types of data, the analysis of high-throughput biological data has become standard both for forming and testing biomedical hypotheses. Our knowledge of how to normalize datasets to remove latent undesirable variances has grown extensively, making for standardized data that are easily compared between studies. Here we present the CAncer bioMarker Prediction Pipeline (CAMPP), an open-source R-based wrapper (https://github.com/ELELAB/CAncer-bioMarker-Prediction-Pipeline-CAMPP) intended to aid bioinformatic software-users with data analyses. CAMPP is called from a terminal command line and is supported by a user-friendly manual. The pipeline may be run on a local computer and requires little or no knowledge of programming. To avoid issues relating to R-package updates, a renv .lock file is provided to ensure R-package stability. Data-management includes missing value imputation, data normalization, and distributional checks. CAMPP performs (I) k-means clustering, (II) differential expression/abundance analysis, (III) elastic-net regression, (IV) correlation and co-expression network analyses, (V) survival analysis, and (VI) protein-protein/miRNA-gene interaction networks. The pipeline returns tabular files and graphical representations of the results. We hope that CAMPP will assist in streamlining bioinformatic analysis of quantitative biological data, whilst ensuring an appropriate bio-statistical framework.

Directory of Open Access Journals

Copenhagen University Research Information System

Mammary molecular portraits reveal lineage-specific features and progenitor cell vulnerabilities.

Author: Alison E. Casey
Ankit Sinha
Cheryl Arrowsmith
Dalia Barsyte-Lovejoy
Daniel De Carvalho
Erik Drysdale
Gary Bader
Genevieve Deblois
Hal Berman
Hui Fang
Hyeyeon Kim
Jennifer Cruickshank
Julie Livingstone
Mathieu Lupien
Mona Shehata
Paul C. Boutros
Paul Waterhouse
Pirashaanthy Tharmapalan
Rajat Singhania
Rama Khokha
Ruth Isserlin
Stefan Hofer
Stefan Knapp
Swneke Bailey
Thomas Kislinger
Tiago Medina
Yu-Jia Shiah
Publication venue: eScholarship, University of California
Publication date: 01/08/2018

The mammary epithelium depends on specific lineages and their stem and progenitor function to accommodate hormone-triggered physiological demands in the adult female. Perturbations of these lineages underpin breast cancer risk, yet our understanding of normal mammary cell composition is incomplete. Here, we build a multimodal resource for the adult gland through comprehensive profiling of primary cell epigenomes, transcriptomes, and proteomes. We define systems-level relationships between chromatin-DNA-RNA-protein states, identify lineage-specific DNA methylation of transcription factor binding sites, and pinpoint proteins underlying progesterone responsiveness. Comparative proteomics of estrogen and progesterone receptor-positive and -negative cell populations, extensive target validation, and drug testing lead to discovery of stem and progenitor cell vulnerabilities. Top epigenetic drugs exert cytostatic effects; prevent adult mammary cell expansion, clonogenicity, and mammopoiesis; and deplete stem cell frequency. Select drugs also abrogate human breast progenitor cell activity in normal and high-risk patient samples. This integrative computational and functional study provides fundamental insight into mammary lineage and stem cell biology.

Crossref

eScholarship - University of California

iGPSe: A Visual Analytic System for Integrative Genomic Based Cancer Patient Stratification

Author: Ding Hao
Huang Kun
Machiraju Raghu
Wang Chao
Publication date: 01/01/2014

Background: Cancers are highly heterogeneous with different subtypes. These subtypes often possess different genetic variants, present different pathological phenotypes, and most importantly, show various clinical outcomes such as varied prognosis and response to treatment and likelihood for recurrence and metastasis. Recently, integrative genomics (or panomics) approaches have often been adopted with the goal of combining multiple types of omics data to identify integrative biomarkers for stratification of patients into groups with different clinical outcomes. Results: In this paper we present a visual analytic system called Interactive Genomics Patient Stratification explorer (iGPSe) which significantly reduces the computing burden for biomedical researchers in the process of exploring complicated integrative genomics data. Our system integrates unsupervised clustering with graph and parallel sets visualization and allows direct comparison of clinical outcomes via survival analysis. Using a breast cancer dataset obtained from The Cancer Genome Atlas (TCGA) project, we are able to quickly explore different combinations of gene expression (mRNA) and microRNA features and identify potential combined markers for survival prediction. Conclusions: Visualization plays an important role in the process of stratifying a given patient population. Visual tools allowed for the selection of possible features across various datasets for the given patient population. We essentially made a case for visualization for a very important problem in translational informatics. Comment: BioVis 2014 conferenc
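The abstract above mentions comparing clinical outcomes between patient groups via survival analysis but does not name an estimator. As a hedged illustration, the sketch below computes a Kaplan-Meier survival curve, one common choice that is assumed here rather than taken from the paper; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. death) was observed at that time,
            0 if the patient was censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)  # patients still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored patients both leave the risk set
    return curve


if __name__ == "__main__":
    # Toy data for one hypothetical patient cluster.
    for t, s in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
        print(t, s)
```

Computing one such curve per cluster and comparing them (for example with a log-rank test, not shown) is a typical way to check whether a stratification separates patients by outcome.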

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

Interaction-Based Learning for High-Dimensional Data with Continuous Predictors

Author: Huang Chien-Hsun
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2014

High-dimensional data, such as that relating to gene expression in microarray experiments, may contain a substantial amount of useful information to be explored. However, the information, relevant variables and their joint interactions are usually diluted by noise due to a large number of non-informative variables. Consequently, variable selection plays a pivotal role for learning in high-dimensional problems. Traditional feature selection methods, such as Pearson's correlation between response and predictors, stepwise linear regression and LASSO, are among the popular linear methods. These methods are effective in identifying linear marginal effects but are limited in detecting non-linear or higher-order interaction effects. It is well known that epistasis (gene-gene interactions) may play an important role in gene expression where unknown functional forms are difficult to identify. In this thesis, we propose a novel nonparametric measure to first screen and do feature selection based on information from nearest neighborhoods. The method is inspired by Lo and Zheng's earlier work (2002) on detecting interactions for discrete predictors. We apply a backward elimination algorithm based on this measure which leads to the identification of many influential clusters of variables. Those identified groups of variables can capture both marginal and interactive effects. Second, each identified cluster has the potential to perform predictions and classifications more accurately. We also study procedures for combining these groups of individual classifiers to form a final predictor. Through simulation and real data analysis, the proposed measure is shown to be capable of identifying important variable sets and patterns including higher-order interaction sets. The proposed procedure outperforms existing methods on three different microarray datasets.
Moreover, the nonparametric measure is quite flexible and can be easily extended and applied to other areas of high-dimensional data and studies.
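The limitation of marginal linear screening that this abstract describes can be seen in a toy example. The sketch below is not the thesis's nonparametric measure; it only shows, on made-up data where the response is the XOR of two binary predictors, that each predictor has zero Pearson correlation with the response while a joint interaction term determines it completely. All names are hypothetical.

```python
from itertools import product
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Toy data: every combination of two binary predictors, response = XOR.
rows = list(product([0, 1], repeat=2))
x1 = [a for a, _ in rows]
x2 = [b for _, b in rows]
y = [a ^ b for a, b in rows]

# Marginal screening: each predictor on its own looks uninformative.
marginal_x1 = pearson(x1, y)
marginal_x2 = pearson(x2, y)

# Joint view: the product of the centered predictors tracks the response.
interaction = [(a - 0.5) * (b - 0.5) for a, b in rows]
joint = pearson(interaction, y)

if __name__ == "__main__":
    print(marginal_x1, marginal_x2, joint)
```

A screening rule based only on marginal correlation would discard both predictors here, which is the kind of situation that motivates measures evaluating groups of variables together.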

Columbia University Academic Commons