Toward a multilevel representation of protein molecules: comparative approaches to the aggregation/folding propensity problem
This paper builds upon the fundamental work of Niwa et al. [34], which
provides the unique possibility to analyze the relative aggregation/folding
propensity of the elements of the entire Escherichia coli (E. coli) proteome in
a cell-free standardized microenvironment. The difficulty of the problem comes
from the superposition of the driving forces of intra- and inter-molecular
interactions, and it is mirrored by evidence that single-point mutations can
shift a protein from a folding to an aggregation phenotype [10]. Here we apply several
state-of-the-art classification methods coming from the field of structural
pattern recognition, with the aim to compare different representations of the
same proteins gathered from the Niwa et al. database; such representations
include sequences and labeled (contact) graphs enriched with chemico-physical
attributes. Through this comparison, we are also able to identify some
general properties of proteins. Notably, (i) we suggest a threshold of around 250
residues that separates "easily foldable" from "hardly foldable" molecules,
consistent with other independent experiments, and (ii) we highlight the
relevance of contact graph spectra for folding behavior discrimination and
characterization of the E. coli solubility data. The soundness of the
experimental results presented in this paper is supported by the statistically
relevant relationships found between the chemico-physical description of
proteins and the substitution cost matrix developed for the various
discrimination systems.
Comment: 17 pages, 3 figures, 46 references
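The labeled contact graph representation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the 8 angstrom cutoff, the toy coordinates, and the function name `contact_graph` are assumptions chosen only for illustration.

```python
import math

def contact_graph(coords, cutoff=8.0):
    """Build an undirected contact graph as a dict of adjacency sets.

    Two residues are connected when the distance between their
    representative atoms is at most `cutoff`. The 8.0 angstrom default
    is an assumed, commonly used value, not one taken from the paper.
    """
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj

# Hypothetical toy chain: four residues on a line, 5 angstroms apart.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
labels = ["M", "K", "V", "L"]  # residue types used as vertex labels

graph = contact_graph(coords)
degrees = [len(graph[i]) for i in range(len(coords))]
```

In a fuller treatment, each vertex would carry chemico-physical attributes rather than a single letter, and the graph spectrum would be obtained from the eigenvalues of its adjacency or Laplacian matrix.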
Forecasting and Forecast Combination in Airline Revenue Management Applications
Predicting a variable for a future point in time helps in planning for unknown
future situations and is common practice in many areas such as economics, finance,
manufacturing, weather and the natural sciences. This paper investigates and compares
approaches to forecasting and forecast combination that can be applied to the service
industry in general and to the airline industry in particular. Furthermore, possibilities
for including additionally available data, such as passenger-based information, are discussed.
Review of Nature-Inspired Forecast Combination Techniques
Effective and efficient planning in various areas can be significantly supported by forecasting a variable
such as an economic growth rate or product demand for a future point in time. More than one forecast for the same
variable is often available, which raises the question of whether one should choose a single model or combine
several of them to obtain a forecast with improved accuracy. In the almost 40 years of research in the area of forecast
combination, an impressive amount of work has been done. This paper reviews forecast combination techniques that are
nonlinear and have in some way been inspired by nature.
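The choice between selecting one model and combining several can be made concrete with a small sketch. It shows two linear baselines, a simple average and inverse-MSE weighting, not the nonlinear nature-inspired techniques the review covers. The toy data and all function names are assumptions for illustration.

```python
def mse(forecast, actual):
    """Mean squared error of one model's past forecasts."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)

def inverse_mse_weights(forecasts, actual):
    """Weight each model by the inverse of its past MSE, normalised to sum to 1.

    Assumes every model has a non-zero past error; a perfect model
    would cause a division by zero here.
    """
    inverse = [1.0 / mse(f, actual) for f in forecasts]
    total = sum(inverse)
    return [w / total for w in inverse]

def combine(values, weights):
    """Weighted linear combination of the individual forecasts."""
    return sum(v * w for v, w in zip(values, weights))

# Hypothetical history: observed values and two models' past forecasts.
actual = [10.0, 12.0, 11.0, 13.0]
model_a = [11.0, 13.0, 12.0, 14.0]  # off by 1 each period
model_b = [8.0, 14.0, 9.0, 15.0]    # off by 2 each period

weights = inverse_mse_weights([model_a, model_b], actual)

# New forecasts from the two models for the next period.
next_forecasts = [14.0, 12.0]
simple_average = combine(next_forecasts, [0.5, 0.5])
weighted = combine(next_forecasts, weights)
```

With this toy history the more accurate model receives the larger weight, so the weighted combination sits closer to its forecast than the simple average does.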
Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values
This work is motivated by the needs of predictive analytics on healthcare
data as represented by Electronic Medical Records. Such data is invariably
problematic: it is noisy, has missing entries, and is imbalanced across the
classes of interest, which leads to serious bias in predictive modeling. Since standard data
mining methods often produce poor performance measures, we argue for
development of specialized techniques of data-preprocessing and classification.
In this paper, we propose a new method to simultaneously classify large
datasets and reduce the effects of missing values. It is based on a multilevel
framework of the cost-sensitive SVM and the expectation-maximization (EM) imputation
method for missing values, which relies on iterated regression analyses. We
compare classification results of multilevel SVM-based algorithms on public
benchmark datasets with imbalanced classes and missing values as well as real
data in health applications, and show that our multilevel SVM-based method
produces fast, more accurate, and more robust classification results.
Comment: arXiv admin note: substantial text overlap with arXiv:1503.0625
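The cost-sensitive idea behind a weighted SVM can be sketched as a per-class misclassification cost that grows as a class becomes rarer. The inverse-frequency heuristic below is an assumption used for illustration, not necessarily the weighting the authors use, and `balanced_class_costs` is a hypothetical helper name.

```python
from collections import Counter

def balanced_class_costs(labels, base_cost=1.0):
    """Return a misclassification cost per class, inversely proportional
    to class frequency: cost_c = base_cost * n / (k * n_c), where n is the
    number of samples, k the number of classes, and n_c the class count.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: base_cost * n / (k * m) for c, m in counts.items()}

# Hypothetical imbalanced dataset: 8 majority samples, 2 minority samples.
labels = [0] * 8 + [1] * 2
costs = balanced_class_costs(labels)
```

Costs of this form would typically be passed to an SVM solver as per-class penalty multipliers, so that errors on the minority class are penalised more heavily.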