Optimal model-free prediction from multivariate time series
© 2015 American Physical Society. Forecasting a time series from multivariate predictors constitutes a challenging problem, especially using model-free approaches. Most techniques, such as nearest-neighbor prediction, quickly suffer from the curse of dimensionality and overfitting for more than a few predictors, which has limited their application mostly to the univariate case. Therefore, selection strategies are needed that harness the available information as efficiently as possible. Since often the right combination of predictors matters, ideally all subsets of possible predictors should be tested for their predictive power, but the exponentially growing number of combinations makes such an approach computationally prohibitive. Here a prediction scheme that overcomes this strong limitation is introduced utilizing a causal preselection step which drastically reduces the number of possible predictors to the most predictive set of causal drivers, making a globally optimal search scheme tractable. The information-theoretic optimality is derived and practical selection criteria are discussed. As demonstrated for multivariate nonlinear stochastic delay processes, the optimal scheme can even be less computationally expensive than commonly used suboptimal schemes like forward selection. The method suggests a general framework to apply the optimal model-free approach to select variables and subsequently fit a model to further improve a prediction or learn statistical dependencies. The performance of this framework is illustrated on a climatological index of El Niño Southern Oscillation.
Comment: 14 pages, 9 figures
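The idea of testing every subset of a small, already reduced candidate set can be sketched as follows. This is only an illustrative sketch, not the scheme from the paper: the causal preselection step is not implemented (the candidate list is simply assumed to be given), the score is a plain hold-out mean squared error rather than an information-theoretic criterion, and the names `knn_predict` and `best_subset` as well as the toy data are hypothetical.

```python
import itertools
import numpy as np

def knn_predict(X_train, y_train, X_test, k=5):
    """Predict each test target as the mean target of its k nearest training points."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def best_subset(X, y, candidates, k=5):
    """Score every non-empty subset of `candidates` by hold-out MSE; return the best."""
    n_train = len(y) // 2  # first half for training, second half for scoring
    best = (np.inf, ())
    for r in range(1, len(candidates) + 1):
        for subset in itertools.combinations(candidates, r):
            cols = list(subset)
            pred = knn_predict(X[:n_train, cols], y[:n_train], X[n_train:, cols], k)
            mse = float(np.mean((pred - y[n_train:]) ** 2))
            if mse < best[0]:
                best = (mse, subset)
    return best

# Toy data (assumption): the target depends on the product of predictors 0 and 1,
# so neither is informative alone; predictors 2 and 3 are irrelevant.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(600, 4))
y = X[:, 0] * X[:, 1] + 0.05 * rng.standard_normal(600)

mse, subset = best_subset(X, y, candidates=[0, 1, 2, 3])
print(subset, mse)
```

With four candidates there are only 15 subsets to score; the exhaustive loop becomes impractical once the candidate list is long, which is why some reduction step is needed before it.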
Precise influence evaluation in complex networks
Evaluating node influence is fundamental for identifying key nodes in complex
networks. Existing methods typically rely on generic indicators to rank node
influence across diverse networks, thereby ignoring the individualized features
of each network itself. In fact, node influence stems not only from general
features but also from multi-scale individualized information encompassing the
specific network structure and task. Here we design an active learning
architecture to predict node influence quantitatively and precisely, which
samples representative nodes based on a graph entropy correlation matrix
integrating multi-scale
individualized information. This brings two intuitive advantages: (1)
discovering potentially high-influence but weakly connected nodes that are
usually ignored by existing methods, and (2) improving the influence
maximization strategy
by deducing influence interference. Significantly, our architecture
demonstrates exceptional transfer learning capabilities across multiple types
of networks, and can identify key nodes whose ranking is strongly disputed
among existing methods. Additionally, our approach, combined with a simple
greedy algorithm, exhibits dominant performance in solving the influence
maximization problem. This architecture holds great potential for applications
in graph mining and prediction tasks.
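A simple greedy seed selection of the kind mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions: influence is approximated by one-hop coverage (a node plus its neighbours) rather than by the learned influence values described in the abstract, and the function `greedy_seeds` and the small example graph are hypothetical.

```python
def greedy_seeds(graph, budget):
    """Pick up to `budget` seeds, each time adding the node that covers most new nodes.

    `graph` maps each node to the set of its neighbours.
    """
    covered, seeds = set(), []
    for _ in range(budget):
        best_node, best_gain = None, -1
        for node in sorted(graph):  # sorted for deterministic tie-breaking
            if node in seeds:
                continue
            # Marginal gain: nodes reached by `node` that are not yet covered.
            gain = len(({node} | graph[node]) - covered)
            if gain > best_gain:
                best_node, best_gain = node, gain
        if best_node is None:
            break
        seeds.append(best_node)
        covered |= {best_node} | graph[best_node]
    return seeds, covered

# Two hubs (0 and 5) joined through node 4.
graph = {
    0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0},
    4: {0, 5}, 5: {4, 6, 7}, 6: {5}, 7: {5},
}
seeds, covered = greedy_seeds(graph, budget=2)
print(seeds, len(covered))
```

Because the gain is computed against the nodes already covered, a candidate whose neighbourhood overlaps with earlier seeds is discounted automatically.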
How complex climate networks complement eigen techniques for the statistical analysis of climatological data
Eigen techniques such as empirical orthogonal function (EOF) or coupled
pattern (CP) / maximum covariance analysis have been frequently used for
detecting patterns in multivariate climatological data sets. Recently,
statistical methods originating from the theory of complex networks have been
employed for the very same purpose of spatio-temporal analysis. This climate
network (CN) analysis is usually based on the same set of similarity matrices
as is used in classical EOF or CP analysis, e.g., the correlation matrix of a
single climatological field or the cross-correlation matrix between two
distinct climatological fields. In this study, formal relationships as well as
conceptual differences between both eigen and network approaches are derived
and illustrated using exemplary global precipitation, evaporation and surface
air temperature data sets. These results show that CN analysis can
complement classical eigen techniques and provides additional information on
the higher-order structure of statistical interrelationships in climatological
data. Hence, CNs are a valuable supplement to the statistical toolbox of the
climatologist, particularly for making sense out of very large data sets such
as those generated by satellite observations and climate model intercomparison
exercises.
Comment: 18 pages, 11 figures
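The shared starting point of both approaches, the correlation matrix of a single field, can be illustrated with a small sketch. Everything here is an assumption for illustration only: the synthetic six-point "field", the threshold of 0.5, and the variable names are hypothetical and are not taken from the study.

```python
import numpy as np

# Synthetic field (assumption): 500 time steps at 6 grid points,
# where the first three points share a common signal.
rng = np.random.default_rng(1)
n_time, n_grid = 500, 6
common = rng.standard_normal(n_time)
field = rng.standard_normal((n_time, n_grid))
field[:, :3] += 2.0 * common[:, None]

# Similarity matrix used by both approaches: correlation between grid points.
corr = np.corrcoef(field, rowvar=False)

# Eigen technique: leading EOF is the eigenvector with the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
eof1 = eigvecs[:, -1]
explained = eigvals[-1] / eigvals.sum()

# Network approach: threshold the same matrix into an adjacency matrix
# (no self-loops) and read off the degree of each grid point.
threshold = 0.5
adjacency = (np.abs(corr) > threshold) & ~np.eye(n_grid, dtype=bool)
degree = adjacency.sum(axis=1)

print(np.round(eof1, 2), round(float(explained), 2), degree)
```

The EOF summarizes the matrix through its dominant linear mode, while the degree is one of many network measures that can be computed from the thresholded matrix.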