Applying weighted network measures to microarray distance matrices

Author: D Garlaschelli
G Caldarelli
S E Ahnert
Spellman P T
T M A Fink
Zhang B
Publication venue: 'IOP Publishing'
Publication date: 01/01/2008

In recent work we presented a new approach to the analysis of weighted networks, by providing a straightforward generalization of any network measure defined on unweighted networks. This approach is based on the translation of a weighted network into an ensemble of edges, and is particularly suited to the analysis of fully connected weighted networks. Here we apply our method to several such networks including distance matrices, and show that the clustering coefficient, constructed by using the ensemble approach, provides meaningful insights into the systems studied. In the particular case of two data sets from microarray experiments the clustering coefficient identifies a number of biologically significant genes, outperforming existing identification approaches. Comment: Accepted for publication in J. Phys.
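As a rough illustration of the ensemble idea in this abstract, the sketch below computes a local clustering coefficient directly from edge probabilities. It assumes the weights have already been rescaled to the interval [0, 1] so that each entry can be read as the probability that the edge is present; that rescaling, the function name `ensemble_clustering`, and the matrix layout are assumptions made for this example and are not taken from the listing.

```python
def ensemble_clustering(p, i):
    """Local clustering coefficient of node i for a symmetric matrix p of
    edge probabilities (assumed to lie in [0, 1], with a zero diagonal).

    The unweighted formula (closed neighbour pairs / all neighbour pairs)
    is reused with each 0/1 adjacency entry replaced by a probability.
    """
    n = len(p)
    closed = 0.0  # expected number of closed (ordered) neighbour pairs
    pairs = 0.0   # expected number of (ordered) neighbour pairs
    for j in range(n):
        for k in range(n):
            if len({i, j, k}) < 3:
                continue  # skip repeated indices
            pairs += p[i][j] * p[i][k]
            closed += p[i][j] * p[i][k] * p[j][k]
    return closed / pairs if pairs > 0 else 0.0
```

For a 0/1 matrix this reduces to the usual unweighted clustering coefficient, which is the property that makes it a generalization rather than a new measure.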

arXiv.org e-Print Archive

Crossref

Archivio della ricerca della Scuola IMT Alti Studi Lucca

IMT Institutional Repository

Non-equilibrium dynamics of gene expression and the Jarzynski equality

Author: E. Davidson
Johannes Berg
L. Michaelis
N. van Kampen
P. T. Spellman
U. Alon
Publication venue: 'American Physical Society (APS)'
Publication date: 03/12/2007

In order to express specific genes at the right time, the transcription of genes is regulated by the presence and absence of transcription factor molecules. With transcription factor concentrations undergoing constant changes, gene transcription takes place out of equilibrium. In this paper we discuss a simple mapping between dynamic models of gene expression and stochastic systems driven out of equilibrium. Using this mapping, results of nonequilibrium statistical mechanics such as the Jarzynski equality and the fluctuation theorem are demonstrated for gene expression dynamics. Applications of this approach include the determination of regulatory interactions between genes from experimental gene expression data.
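For reference, the Jarzynski equality named in this abstract is usually written as below. The symbols are the standard ones from statistical mechanics and are not taken from this listing: W is the work done on the system along one realization of the driving protocol, ΔF is the free-energy difference between the initial and final equilibrium states, and k_B T is the thermal energy. How the paper maps these quantities onto gene expression is not stated in the abstract.

```latex
\[
  \left\langle e^{-\beta W} \right\rangle = e^{-\beta\,\Delta F},
  \qquad \beta = \frac{1}{k_B T}
\]
```

The average runs over repeated realizations of the same nonequilibrium process.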

arXiv.org e-Print Archive

Crossref

Dynamics of gene expression and the regulatory inference problem

Author: Alon U.
Chen W. England J. Shakhnovich E.
Hertz J.
Honerkamp J.
J. Berg
Lèbre S.
Michaelis L.
Spellman P. T.
Stokic D. Hanel R. Thurner S.
van Kampen N.
Publication venue: 'IOP Publishing'
Publication date: 05/03/2008

From the response to external stimuli to cell division and death, the dynamics of living cells is based on the expression of specific genes at specific times. The decision when to express a gene is implemented by the binding and unbinding of transcription factor molecules to regulatory DNA. Here, we construct stochastic models of gene expression dynamics and test them on experimental time-series data of messenger-RNA concentrations. The models are used to infer biophysical parameters of gene transcription, including the statistics of transcription factor-DNA binding and the target genes controlled by a given transcription factor. Comment: revised version to appear in Europhys. Lett., new titl

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

A single-sample method for normalizing and combining full-resolution copy numbers from multiple platforms, labs and analysis methods

Author: A. Ray
Barrick
Bengtsson
Bolstad
H. Bengtsson
Korn
McLendon
P. Spellman
PAGE
Redon
T. P. Speed
Ylstra
Publication venue: Oxford University Press

Motivation: The rapid expansion of whole-genome copy number (CN) studies brings a demand for increased precision and resolution of CN estimates. Recent studies have obtained CN estimates from more than one platform for the same set of samples, and it is natural to want to combine the different estimates in order to meet this demand. Estimates from different platforms show different degrees of attenuation of the true CN changes. Similar differences can be observed in CNs from the same platform run in different labs, or in the same lab with different analytical methods. This is why it is not straightforward to combine CN estimates from different sources (platforms, labs and analysis methods).

Crossref

PubMed Central

Universal Role for HLA-C and KIR2DL Ligand Mismatch in Severe Acute Graft-Versus-Host Disease After Unrelated Donor Hematopoietic Stem Cell Transplantation (U-HSCT) in Japanese and Caucasian Transplant Recipients: An Analysis on Behalf of International Histocompatibility Working Group in Hematopoietic Cell Transplantation

Author: Bardy P.
Bignon J.-D.
Dupont B.
Gooley T.A.
Hsu K.C.
Kawase T.
Madrigal A.
Malkki M.
Morishima Y.
Petersdorf E.W.
Spellman S.
Velardi A.
Publication date: 01/02/2011

Elsevier - Publisher Connector

Open Access Repository

The Iterative Signature Algorithm for the analysis of large scale gene expression data

Author: A. Brazma
A. Schulze
C.M. Perou
D.D. Lee
E. Lander
G. Getz
G. Sherlock
J. Ihmels
J.E. Staunton
J.L. DeRisi
Jan Ihmels
L. Lazzeroni
M. Bittner
M. Schena
M.B. Eisen
N.S. Holter
Naama Barkai
O. Alter
P. Tamayo
P.T. Spellman
R.B. Altman
S. Tavazoie
Sven Bergmann
T. Hastie
T.G. Kolda
U. Alon
U. Scherf
Y. Cheng
Publication venue: 'American Physical Society (APS)'
Publication date: 08/10/2002

We present a new approach for the analysis of genome-wide expression data. Our method is designed to overcome the limitations of traditional techniques when applied to large-scale data. Rather than allotting each gene to a single cluster, we assign both genes and conditions to context-dependent and potentially overlapping transcription modules. We provide a rigorous definition of a transcription module as the object to be retrieved from the expression data. An efficient algorithm that searches for the modules encoded in the data, by iteratively refining sets of genes and conditions until they match this definition, is established. Each iteration involves a linear map, induced by the normalized expression matrix, followed by the application of a threshold function. We argue that our method is in fact a generalization of Singular Value Decomposition, which corresponds to the special case where no threshold is applied. We show analytically that for noisy expression data our approach leads to better classification due to the implementation of the threshold. This result is confirmed by numerical analyses based on in-silico expression data. We discuss briefly results obtained by applying our algorithm to expression data from the yeast S. cerevisiae. Comment: LaTeX, 36 pages, 8 figure
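The iteration described in this abstract (a linear map through the expression matrix followed by a threshold) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes a single, already normalized gene-by-condition matrix, uses a simple "more than t standard deviations above the mean" threshold, and the names `isa_sketch`, `threshold`, `normalize`, and `support` are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def threshold(scores, t):
    """Zero every score that is not more than t standard deviations above the mean."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return [0.0] * len(scores)
    return [s if (s - mu) / sigma > t else 0.0 for s in scores]


def normalize(vec):
    """Rescale to unit Euclidean length so scores do not grow between iterations."""
    norm = sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def support(vec):
    """Indices of the non-zero entries, i.e. the currently selected set."""
    return {i for i, v in enumerate(vec) if v != 0.0}


def isa_sketch(expr, seed_genes, t_genes=1.0, t_conds=1.0, max_iter=50):
    """expr[g][c] is the (assumed normalized) expression of gene g in condition c.

    Starting from a seed gene set, alternately score conditions and genes
    until the selected gene set stops changing.
    """
    n_genes, n_conds = len(expr), len(expr[0])
    genes = normalize([1.0 if g in seed_genes else 0.0 for g in range(n_genes)])
    conds = [0.0] * n_conds
    for _ in range(max_iter):
        # Linear map genes -> conditions, then threshold.
        cond_scores = [sum(expr[g][c] * genes[g] for g in range(n_genes))
                       for c in range(n_conds)]
        conds = normalize(threshold(cond_scores, t_conds))
        # Linear map conditions -> genes, then threshold.
        gene_scores = [sum(expr[g][c] * conds[c] for c in range(n_conds))
                       for g in range(n_genes)]
        new_genes = normalize(threshold(gene_scores, t_genes))
        converged = support(new_genes) == support(genes)
        genes = new_genes
        if converged:
            break
    return support(genes), support(conds)
```

Calling `isa_sketch(expr, seed_genes={0, 1}, t_genes=0.5, t_conds=0.5)` returns a pair of index sets (genes, conditions). Without the `threshold` calls the loop becomes a plain power iteration, which is one way to read the abstract's remark that the no-threshold case corresponds to Singular Value Decomposition.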

arXiv.org e-Print Archive

Crossref

Improved functional prediction of proteins by learning kernel combinations in multilabel settings

Author: A Dempster
A Dubrulle
A Gasch
Bernd Fischer
D MacKay
F Bach
G Lanckriet
G Yvert
H Yoshimoto
K Crammer
M Brauer
N Kumar
P Spellman
S Sonnenburg
T Hastie
T Hughes
V Roth
Volker Roth
Y Grandvalet
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: We develop a probabilistic model for combining kernel matrices to predict the function of proteins. It extends previous approaches in that it can handle multiple labels which naturally appear in the context of protein function. Results: Explicit modeling of multilabels significantly improves the capability of learning protein function from multiple kernels. The performance and the interpretability of the inference model are further improved by simultaneously predicting the subcellular localization of proteins and by combining pairwise classifiers to consistent class membership estimates. Conclusion: For the purpose of functional prediction of proteins, multilabels provide valuable information that should be included adequately in the training process of classifiers. Learning of functional categories gains from co-prediction of subcellular localization. Pairwise separation rules allow very detailed insights into the relevance of different measurements like sequence, structure, interaction data, or expression data. A preliminary version of the software can be downloaded from http://www.inf.ethz.ch/personal/vroth/KernelHMM/. ISSN:1471-210

Repository for Publications and Research Data

Crossref

Springer - Publisher Connector

PubMed Central

SMART: Unique splitting-while-merging framework for gene clustering

Author: A Thalamuthu
AD Lanterman
AE Teschendorff
AK Jain
Asoke K. Nandi
B Abu-Jamous
B Fritzke
CR Lin
CS Wallace
D Dembele
D Jiang
David J. Roberts
G Celeux
H Akaike
J Qin
J Rissanen
KY Yeung
L Hubert
L Mavridis
L Zhao
MAT Figueiredo
P Tamayo
PT Spellman
R Xu
RJ Cho
Rui Fa
S Bandyopadhyay
S Monti
S Wu
Sergio Gómez
T Kohonen
T Pramila
TR Golub
WM Rand
YJ Zhang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 08/04/2014

Copyright © 2014 Fa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet, it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named “splitting merging awareness tactics” (SMART), which does not require any a priori knowledge of either the number of clusters or even the possible range of this number. Unlike existing self-splitting algorithms, which over-cluster the dataset to a large number of clusters and then merge some similar clusters, our framework has the ability to split and merge clusters automatically during the process and produces the most reliable clustering results, by intrinsically integrating many clustering techniques and tasks. The SMART framework is implemented with two distinct clustering paradigms in two algorithms: competitive learning and finite mixture model. Nevertheless, within the proposed SMART framework, many other algorithms can be derived for different clustering paradigms. The minimum message length algorithm is integrated into the framework as the clustering selection criterion. The usefulness of the SMART framework and its algorithms is tested in demonstration datasets and simulated gene expression datasets. Moreover, two real microarray gene expression datasets are studied using this approach. Based on the performance of many metrics, all numerical results show that SMART is superior to compared existing self-splitting algorithms and traditional algorithms. Three main properties of the proposed SMART framework are summarized as: (1) needing no parameters dependent on the respective dataset or a priori knowledge about the datasets, (2) extendible to many different applications, (3) offering superior performance compared with counterpart algorithms. National Institute for Health Researc

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Brunel University Research Archive

UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets

Author: A Huber
A Prelić
AA Shabalin
AP Gasch
Asoke K. Nandi
B Abu-Jamous
Basel Abu-Jamous
C Koch
CH Wade
CT Harbison
D Dikicioglu
D Liu
DA Orlando
David J. Roberts
IS Dhillon
J Bahler
J Yang
JK Choi
JK Limb
JM Pena
JM Stuart
KC Li
KY Yeung
L Lazzeroni
LP Zhao
MB Eisen
P Cahan
P Grandi
PC Roberts
PT Spellman
R Fa
R Lletía
R Nilsson
RJ Cho
RM Piro
Rui Fa
S Chu
S Fujii
S Sharma
S Vega-Pons
T Hayata
T Murali
T Pramila
TC Fleischer
VA Gennarino
X Liu
Y Cheng
Y Kluger
Z Tao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/06/2015

Background: Collective analysis of the increasingly emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets as it produces focused clusters that conform well to known biological facts. Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies. The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004)

Jyväskylä University Digital Archive

Crossref

Springer - Publisher Connector

PubMed Central

Brunel University Research Archive

An assessment of the Jenkinson and Collison synoptic classification to a continental mid-latitude location

Author: AF Jenkinson
AK El-Kadi
B Alijani
B Yarnal
BB Fitzharris
C Spence
D Chen
DS Wilks
E Kalnay
G Spellman
Greg Spellman
J Duan
JP Boulanger
JW Kidson
M Grimalt
M Mourad
O Jorba
P Post
PD Jones
PS Espinoza
R Romero
RM Petrone
RM Trigo
T Shoji
TA Buishand
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

A weather-type catalogue based on the Jenkinson and Collison method was developed for an area in south-west Russia for the period 1961-2010. Gridded sea level pressure data was obtained from the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis. The resulting catalogue was analysed for frequency of individual types and groups of weather types to characterise long-term atmospheric circulation in this region. Overall, the most frequent type is anticyclonic (A) (23.3%) followed by cyclonic (C) (11.9%); however, there are some key seasonal patterns with westerly circulation being significantly more common in winter than summer. The utility of this synoptic classification is evaluated by modelling daily rainfall amounts. A low level of error is found using a simple model based on the prevailing weather type. Finally, characteristics of the circulation classification are compared to those for the original JC British Isles catalogue and a much more equal distribution of flow types is seen in the former classification.

Crossref

Springer - Publisher Connector

University of Northampton's Research Explorer

NECTAR