141 research outputs found

Penalized Clustering of Large Scale Functional Data with Multiple Covariates

Authors: Ma Ping; Zhong Wenxuan
Publication date: 01/01/2008

In this article, we propose a penalized clustering method for large scale data with multiple covariates through a functional data approach. In the proposed method, responses and covariates are linked together through nonparametric multivariate functions (fixed effects), which have great flexibility in modeling a variety of function features, such as jump points, branching, and periodicity. Functional ANOVA is employed to further decompose multivariate functions in a reproducing kernel Hilbert space and provide associated notions of main effect and interaction. Parsimonious random effects are used to capture various correlation structures. The mixed-effect models are nested under a general mixture model, in which the heterogeneity of functional data is characterized. We propose a penalized Henderson's likelihood approach for model-fitting and design a rejection-controlled EM algorithm for the estimation. Our method selects smoothing parameters through generalized cross-validation. Furthermore, the Bayesian confidence intervals are used to measure the clustering uncertainty. Simulation studies and real-data examples are presented to investigate the empirical performance of the proposed method. Open-source code is available in the R package MFDA.
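
For reference, the generalized cross-validation score mentioned in this abstract is commonly written as follows for a linear smoother. This is the textbook form, assuming fitted values of the form A(λ)y for a smoothing matrix A(λ); the symbols n, y, A, and λ are notation introduced here, not taken from the abstract, and the criterion the authors use inside their mixture model may differ in detail.

```latex
% Textbook GCV score for a linear smoother \hat{y} = A(\lambda) y
% (assumed notation: n observations, response vector y, smoothing parameter \lambda)
V(\lambda) = \frac{n^{-1} \, \lVert (I - A(\lambda))\, y \rVert^{2}}
                  {\left[ n^{-1} \, \operatorname{tr}\bigl(I - A(\lambda)\bigr) \right]^{2}},
\qquad
\hat{\lambda} = \arg\min_{\lambda} V(\lambda)
```

The numerator is the mean squared residual and the denominator penalizes the effective degrees of freedom tr A(λ), so the minimizer balances fit against smoothness without needing an estimate of the noise variance.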

arXiv.org e-Print Archive

CiteSeerX

Statistical Assessment of the Global Regulatory Role of Histone Acetylation in Saccharomyces cerevisiae

Authors: Liu Jun; Ma Ping; Yuan Guo-Cheng; Zhong Wenxuan
Publication venue: Springer Science and Business Media LLC
Publication date: 30/09/2010

BACKGROUND: Histone acetylation plays important but incompletely understood roles in gene regulation. A comprehensive understanding of the regulatory role of histone acetylation is difficult because many different histone acetylation patterns exist and their effects are confounded by other factors, such as the transcription factor binding sequence motif information and nucleosome occupancy. RESULTS: We analyzed recent genome-wide histone acetylation data using a few complementary statistical models and tested the validity of a cumulative model in approximating the global regulatory effect of histone acetylation. Confounding effects due to transcription factor binding sequence information were estimated by using two independent motif-based algorithms followed by a variable selection method. We found that the sequence information has a significant role in regulating transcription, and we also found a clear additional histone acetylation effect. Our model fits well with observed genome-wide data. Strikingly, including more complicated combinatorial effects does not improve the model's performance. Through a statistical analysis of conditional independence, we found that H4 acetylation may not have significant direct impact on global gene expression. CONCLUSION: Decoding the combinatorial complexity of histone modification requires not only new data but also new methods to analyze the data. Our statistical analysis confirms that histone acetylation has a significant effect on gene transcription rates in addition to that attributable to upstream sequence motifs. Our analysis also suggests that a cumulative effect model for global histone acetylation is justified, although a more complex histone code may be important at specific gene loci. We also found that the regulatory roles among different histone acetylation sites have important differences.

Harvard University - DASH

Bayesian Functional Data Clustering for Temporal Microarray Data

Authors: Feng Yang; Liu Jun S.; Ma Ping; Zhong Wenxuan
Publication venue: Hindawi Publishing Corporation
Publication date: 05/10/2010

We propose a Bayesian procedure to cluster temporal gene expression microarray profiles, based on a mixed-effect smoothing-spline model, and design a Gibbs sampler to sample from the desired posterior distribution. Our method can determine the cluster number automatically based on the Bayesian information criterion, and handle missing data easily. When applied to a microarray dataset on the budding yeast, our clustering algorithm provides biologically meaningful gene clusters according to a functional enrichment analysis.
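
To illustrate the idea of choosing the cluster number by the Bayesian information criterion, here is a minimal Python sketch. It uses the textbook definition BIC = -2 log L + p log n and picks the candidate with the smallest score. The function names, the parameter counts, and the log-likelihood values are all hypothetical and are not taken from the paper, which does not state in this abstract how the likelihood is evaluated within its Gibbs sampler.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Textbook BIC: -2 * log-likelihood + (number of parameters) * log(sample size)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_cluster_number(candidates, n_obs):
    """Return the cluster count with the lowest BIC, plus all scores.

    candidates maps a cluster count to (maximized log-likelihood, parameter count).
    """
    scores = {k: bic(ll, p, n_obs) for k, (ll, p) in candidates.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Hypothetical fitted values for illustration only (not results from the paper).
candidates = {1: (-520.0, 4), 2: (-450.0, 8), 3: (-445.0, 12)}
best_k, scores = select_cluster_number(candidates, n_obs=200)
print(best_k)
```

With these made-up numbers the third cluster improves the log-likelihood only slightly, so the log n penalty on its extra parameters outweighs the gain and the two-cluster model is selected.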

Harvard University - DASH

PubMed Central

A Spatio-Temporal Graph Convolutional Network for Gesture Recognition from High-Density Electromyography

Authors: Fu Peiwen; Xiong Wenxuan; Zhang Mingming; Zhang Yuyang; Zhong Wenjuan
Publication date: 01/12/2023

Accurate hand gesture prediction is crucial for effective upper-limb prosthesis control. Given the high flexibility and multiple degrees of freedom of human hands, there has been a growing interest in integrating deep networks with high-density surface electromyography (HD-sEMG) grids to enhance gesture recognition capabilities. However, many existing methods fall short of fully exploiting the specific spatial topology and temporal dependencies present in HD-sEMG data. Additionally, these studies are often limited to a small number of gestures and lack generality. Hence, this study introduces a novel gesture recognition method, named STGCN-GR, which leverages spatio-temporal graph convolution networks for HD-sEMG-based human-machine interfaces. First, we construct muscle networks based on functional connectivity between channels, creating a graph representation of HD-sEMG recordings. Subsequently, a temporal convolution module is applied to capture the temporal dependencies in the HD-sEMG series, and a spatial graph convolution module is employed to learn the intrinsic spatial topology information among distinct HD-sEMG channels. We evaluate our proposed model on a public HD-sEMG dataset comprising a substantial number of gestures (i.e., 65). Our results show that the STGCN-GR method achieves an accuracy of 91.07% in predicting gestures, which surpasses state-of-the-art deep learning methods applied to the same dataset.
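
The pipeline described in this abstract (a channel graph from functional connectivity, then temporal and spatial graph convolutions) can be sketched in plain Python as below. This is not the STGCN-GR architecture: it assumes Pearson correlation as the connectivity measure, a fixed threshold of 0.5, the common symmetric normalization D^{-1/2}(A + I)D^{-1/2}, and a fixed averaging kernel in place of learned weights. All function names and the toy signal values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sample lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def build_adjacency(signals, threshold=0.5):
    """Connect two channels when |correlation| >= threshold (assumed rule)."""
    c = len(signals)
    adj = [[0.0] * c for _ in range(c)]
    for i in range(c):
        for j in range(i + 1, c):
            if abs(pearson(signals[i], signals[j])) >= threshold:
                adj[i][j] = adj[j][i] = 1.0
    return adj


def normalize_adjacency(adj):
    """Add self-loops, then scale symmetrically: D^{-1/2} (A + I) D^{-1/2}."""
    c = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(c)] for i in range(c)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(c)] for i in range(c)]


def temporal_conv(signals, kernel):
    """Valid 1-D sliding-window filter along time, applied to each channel."""
    w = len(kernel)
    return [
        [sum(kernel[m] * ch[k + m] for m in range(w)) for k in range(len(ch) - w + 1)]
        for ch in signals
    ]


def spatial_graph_conv(norm_adj, signals):
    """Mix every channel with its graph neighbours at each time step."""
    c, t = len(signals), len(signals[0])
    return [
        [sum(norm_adj[i][j] * signals[j][k] for j in range(c)) for k in range(t)]
        for i in range(c)
    ]


# Toy recording: 3 channels x 6 samples (hypothetical values, not real HD-sEMG).
signals = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
]
adj = build_adjacency(signals)
norm_adj = normalize_adjacency(adj)
features = spatial_graph_conv(norm_adj, temporal_conv(signals, [0.5, 0.5]))
```

In this toy input the first two channels are perfectly correlated and become neighbours, while the alternating third channel stays isolated, so only the first two rows of `features` are blended. A trainable model would replace the fixed kernel and add learned weight matrices and nonlinearities after each step.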

arXiv.org e-Print Archive

The Expression of irx7 in the Inner Nuclear Layer of Zebrafish Retina Is Essential for a Proper Retinal Development and Lamination.

Authors: Leung Yuk Fai; Trujillo Caleb; Yang Yifan; Zhang Yuqing; Zhong Wenxuan
Publication venue: Purdue University (bepress)
Publication date: 01/01/2012

Irx7, a member of the zebrafish iroquois transcription factor (TF) family, has been shown to control brain patterning. During retinal development, irx7's expression was found to appear exclusively in the inner nuclear layer (INL) as soon as the prospective INL cells withdraw from the cell cycle and during retinal lamination. In Irx7-deficient retinas, the formation of a proper retinal lamination was disrupted and the differentiation of INL cell types, including amacrine, horizontal, bipolar and Muller cells, was compromised. Despite irx7's exclusive expression in the INL, photoreceptor differentiation was also compromised in Irx7-deficient retinas. Compared with other retinal cell types, ganglion cells differentiated relatively well in these retinas, except for their dendritic projections into the inner plexiform layer (IPL). In fact, the neuronal projections of amacrine and bipolar cells into the IPL were also diminished. These findings indicate that the retinal lamination issue in the Irx7-deficient retinas is likely caused by the attenuation of the neurite outgrowth. Since the expression of known TFs that can specify specific retinal cell types was also altered in Irx7-deficient retinas, the irx7 gene network is possibly a novel regulatory circuit for retinal development and lamination.

CiteSeerX

Directory of Open Access Journals

PubMed Central

Purdue E-Pubs

A data-driven clustering method for time course gene expression data

Authors: Castillo-Davis Cristian I.; Liu Jun S.; Ma Ping; Zhong Wenxuan
Publication venue: Oxford University Press
Publication date: 01/03/2006

Gene expression over time is, biologically, a continuous process and can thus be represented by a continuous function, i.e. a curve. Individual genes often share similar expression patterns (functional forms). However, the shape of each function, the number of such functions, and the genes that share similar functional forms are typically unknown. Here we introduce an approach that allows direct discovery of related patterns of gene expression and their underlying functions (curves) from data without a priori specification of either cluster number or functional form. Smoothing spline clustering (SSC) models natural properties of gene expression over time, taking into account natural differences in gene expression within a cluster of similarly expressed genes, the effects of experimental measurement error, and missing data. Furthermore, SSC provides a visual summary of each cluster's gene expression function and goodness-of-fit by way of a ‘mean curve’ construct and its associated confidence bands. We apply this method to gene expression data over the life-cycle of Drosophila melanogaster and Caenorhabditis elegans to discover 17 and 16 unique patterns of gene expression in each species, respectively. New and previously described expression patterns in both species are discovered, the majority of which are biologically meaningful and exhibit statistically significant gene function enrichment. Software and source code implementing the algorithm, SSClust, is freely available.

Crossref

Harvard University - DASH

PubMed Central