Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery
We propose a calibrated multivariate regression method named CMR for fitting
high dimensional multivariate regression models. Compared with existing
methods, CMR calibrates regularization for each regression task with respect to
its noise level so that it simultaneously attains improved finite-sample
performance and tuning insensitiveness. Theoretically, we provide sufficient
conditions under which CMR achieves the optimal rate of convergence in
parameter estimation. Computationally, we propose an efficient smoothed
proximal gradient algorithm with a worst-case numerical rate of convergence
$\mathcal{O}(1/\epsilon)$, where $\epsilon$ is a pre-specified accuracy of the
objective function value. We conduct thorough numerical simulations to
illustrate that CMR consistently outperforms other high dimensional
multivariate regression methods. We also apply CMR to solve a brain activity
prediction problem and find that it is as competitive as a handcrafted model
created by human experts. The R package \texttt{camel} implementing the
proposed method is available on the Comprehensive R Archive Network
\url{http://cran.r-project.org/web/packages/camel/}.
Comment: Journal of Machine Learning Research, 201
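The idea of calibrating each regression task to its noise level can be illustrated with a small sketch. It assumes, as is common for calibrated (square-root-type) estimators, that the loss sums the unsquared residual norm of each task; the abstract does not state the objective, so this form, the function names, and the toy numbers are all assumptions made for illustration only.

```python
import math

def residual_columns(X, Y, B):
    """Return one residual vector per task: r_k = Y[:, k] - X @ B[:, k]."""
    n, d, m = len(X), len(X[0]), len(Y[0])
    cols = []
    for k in range(m):
        col = []
        for i in range(n):
            pred = sum(X[i][j] * B[j][k] for j in range(d))
            col.append(Y[i][k] - pred)
        cols.append(col)
    return cols

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def calibrated_loss(X, Y, B):
    # Assumed form: sum over tasks of the unsquared residual norm.
    return sum(l2_norm(r) for r in residual_columns(X, Y, B))

def squared_loss(X, Y, B):
    # Uncalibrated baseline: sum over tasks of the squared residual norm.
    return sum(l2_norm(r) ** 2 for r in residual_columns(X, Y, B))

def implicit_task_weights(X, Y, B):
    # The gradient of ||r_k||_2 with respect to r_k is r_k / ||r_k||_2, so each
    # task is implicitly rescaled by 1 / ||r_k||_2 (undefined if r_k is zero).
    return [1.0 / l2_norm(r) for r in residual_columns(X, Y, B)]

# Toy data (made up): task 1 is ten times noisier than task 0.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
Y = [[1.1, 1.0], [-0.1, 0.0], [1.0, 1.0]]

weights = implicit_task_weights(X, Y, B)
print("calibrated loss:", calibrated_loss(X, Y, B))
print("squared loss:   ", squared_loss(X, Y, B))
print("weight ratio task0/task1:", weights[0] / weights[1])
```

Under this assumed loss, the noisier task receives a proportionally smaller implicit weight, which is one way a single regularization parameter can serve tasks with different noise levels.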
The evaluation of protein folding rate constant is improved by predicting the folding kinetic order with a SVM-based method
Protein folding is a problem of great interest since it concerns the
mechanism by which the genetic information is translated into proteins with
well-defined three-dimensional (3D) structures and functions. Recently,
theoretical models have been developed to predict the protein folding rate
by relating the process to topological parameters
derived from the native (atomically resolved) protein structures. Previous works
classified proteins into two groups exhibiting either
single-exponential or multi-exponential folding kinetics. It is well known
that these two classes of proteins are related to different protein structural
features. The growing amount of available experimental kinetic data allows
a machine learning approach to be applied to the problem, in order to
predict the kinetic order of the folding process from the experimental
data collected so far. This information can be used to improve the prediction
of the folding rate. In this work, we first describe a support vector
machine-based method (SVM-KO) to predict the kinetic order
of the folding process for a given protein. Using this method, we correctly classify 78% of the
folding mechanisms in a set of 63 experimental data points. Second, we focus on the
prediction of the logarithm of the folding rate. This value can be obtained by
solving a linear regression task with an SVM-based method. In this paper we show that
the linear correlation between predicted and experimental data can improve when the
regression task is computed over two different sets, instead of one, each
composed of the proteins with a correctly predicted two-state or
multistate kinetic order.
Comment: The paper will be published on WSEAS Transaction on Biology and
Biomedicin
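The claim that regressing over two class-specific sets can beat a single pooled regression can be illustrated with a minimal sketch. It swaps in ordinary least squares on one feature for the SVM-based regression, takes the class labels as given (standing in for the SVM-KO output), and uses made-up toy numbers rather than protein data; all names here are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson(a, b):
    """Pearson linear correlation coefficient between two sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def pooled_predictions(xs, ys):
    # One regression fitted on all samples.
    slope, intercept = fit_line(xs, ys)
    return [slope * x + intercept for x in xs]

def split_predictions(xs, ys, labels):
    # One regression per class label; each sample is predicted by its class model.
    models = {}
    for label in set(labels):
        gx = [x for x, l in zip(xs, labels) if l == label]
        gy = [y for y, l in zip(ys, labels) if l == label]
        models[label] = fit_line(gx, gy)
    return [models[l][0] * x + models[l][1] for x, l in zip(xs, labels)]

# Toy data (made up): two classes with different feature/target relationships.
xs = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0, 4.0, 3.0, 2.0, 1.0]
labels = ["two-state"] * 4 + ["multistate"] * 4

r_pooled = pearson(pooled_predictions(xs, ys), ys)
r_split = pearson(split_predictions(xs, ys, labels), ys)
print("pooled correlation:", r_pooled)
print("split correlation: ", r_split)
```

In this toy setting the per-class fits correlate far better with the targets than the pooled fit, because the two classes follow different trends. Whether the same holds on real folding data depends on how accurately the kinetic order is predicted.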