Chapter 2 - Data-Driven Energy Efficient Driving Control in Connected Vehicle Environment
Query-driven learning for predictive analytics of data subspace cardinality
Fundamental to many predictive analytics tasks is the ability to estimate the cardinality (number of data items) of multi-dimensional data subspaces, defined by query selections over datasets. This is crucial for data analysts dealing with, e.g., interactive data subspace explorations, data subspace visualizations, and query processing optimization. However, in many modern data systems, predictive analytics may be (i) too costly financially, e.g., in clouds, (ii) unreliable, e.g., in modern Big Data query engines, where accurate statistics are difficult to obtain/maintain, or (iii) infeasible, e.g., because of privacy issues. We contribute a novel, query-driven, function estimation model of analyst-defined data subspace cardinality. The proposed estimation model is highly accurate in terms of prediction and accommodates the well-known selection queries: multi-dimensional range and distance-nearest neighbors (radius) queries. Our function estimation model: (i) quantizes the vectorial query space, by learning the analysts’ access patterns over a data space, (ii) associates query vectors with their corresponding cardinalities of the analyst-defined data subspaces, (iii) abstracts and employs query vectorial similarity to predict the cardinality of an unseen/unexplored data subspace, and (iv) identifies and adapts to possible changes of the query subspaces based on the theory of optimal stopping. The proposed model is decentralized, facilitating the scaling-out of such predictive analytics queries. The research significance of the model lies in that (i) it is an attractive solution when data-driven statistical techniques are undesirable or infeasible, (ii) it offers a scale-out, decentralized training solution, (iii) it is applicable to different selection query types, and (iv) it offers a performance that is superior to that of data-driven approaches.
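As a rough illustration of steps (i) to (iii) above, the sketch below quantizes query vectors into prototypes, keeps a running mean cardinality per prototype, and predicts an unseen query's cardinality from its similarity to the prototypes. It is not the authors' model: the class name `QueryDrivenEstimator`, the distance threshold, the inverse-distance weighting, and the `[center, radius]` query encoding are all assumptions made for this example, and the change-detection step (iv) is left out.

```python
import math


def distance(a, b):
    """Euclidean distance between two query vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class QueryDrivenEstimator:
    """Hypothetical sketch: query prototypes paired with mean cardinalities."""

    def __init__(self, new_prototype_threshold):
        self.threshold = new_prototype_threshold
        self.prototypes = []  # each entry: [vector, mean_cardinality, count]

    def observe(self, query, cardinality):
        """Learn from one executed query and its true answer size."""
        if self.prototypes:
            nearest = min(self.prototypes, key=lambda p: distance(p[0], query))
            if distance(nearest[0], query) <= self.threshold:
                # Move the prototype and its mean cardinality toward the new pair.
                nearest[2] += 1
                rate = 1.0 / nearest[2]
                nearest[0] = [w + rate * (q - w) for w, q in zip(nearest[0], query)]
                nearest[1] += rate * (cardinality - nearest[1])
                return
        # No prototype is close enough: open a new one (assumed growth rule).
        self.prototypes.append([list(query), float(cardinality), 1])

    def predict(self, query):
        """Inverse-distance weighted average of the prototype cardinalities."""
        if not self.prototypes:
            raise ValueError("no training queries observed yet")
        weights = [1.0 / (distance(p[0], query) + 1e-9) for p in self.prototypes]
        total = sum(weights)
        return sum(w * p[1] for w, p in zip(weights, self.prototypes)) / total


# Radius queries encoded as [center, radius]; the cardinalities are made up.
est = QueryDrivenEstimator(new_prototype_threshold=0.5)
est.observe([0.0, 1.0], 10)
est.observe([0.1, 1.0], 20)   # close to the first query, merges into its prototype
est.observe([10.0, 1.0], 50)  # far away, starts a second prototype
estimate = est.predict([0.05, 1.0])
```

Note that the estimator never touches the underlying data: it only sees past queries and their answers, which is the property the abstract emphasizes for costly or restricted-access settings.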
Data-driven Soft Sensors in the Process Industry
Over the last two decades, Soft Sensors have established themselves as a valuable alternative to the traditional means for the acquisition of critical process variables, process monitoring and other tasks related to process control. This paper discusses characteristics of process industry data which are critical for the development of data-driven Soft Sensors. These characteristics are common to a large number of process industry fields, like the chemical industry, bioprocess industry, steel industry, etc. The focus of this work is on data-driven Soft Sensors because of their growing popularity, already demonstrated usefulness and huge, though not yet fully realised, potential. The main contributions of this work are a comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques, and a discussion of some open issues in Soft Sensor development and maintenance together with their possible solutions.
Scalable aggregation predictive analytics: a query-driven machine learning approach
We introduce a predictive modeling solution that provides high quality predictive analytics over aggregation queries in Big Data environments. Our predictive methodology is generally applicable in environments in which large-scale data owners may or may not restrict access to their data and allow only aggregation operators like COUNT to be executed over their data. In this context, our methodology is based on historical queries and their answers to accurately predict ad-hoc queries’ answers. We focus on the widely used set-cardinality, i.e., COUNT, aggregation query, as COUNT is a fundamental operator for both internal data system optimizations and for aggregation-oriented data exploration and predictive analytics. We contribute a novel, query-driven Machine Learning (ML) model whose goals are to: (i) learn the query-answer space from past issued queries, (ii) associate the query space with local linear regression & associative function estimators, (iii) define query similarity, and (iv) predict the cardinality of the answer set of unseen incoming queries, referred to as the Set Cardinality Prediction (SCP) problem. Our ML model incorporates incremental ML algorithms for ensuring high quality prediction results. The significance of our contribution lies in that it (i) is the only query-driven solution applicable over general Big Data environments, which include restricted-access data, (ii) offers incremental learning adjusted for arriving ad-hoc queries, which is well suited for query-driven data exploration, and (iii) offers a performance (in terms of scalability, SCP accuracy, processing time, and memory requirements) that is superior to data-centric approaches. We provide a comprehensive performance evaluation of our model, evaluating its sensitivity, scalability and efficiency for quality predictive analytics. In addition, we report on the development and incorporation of our ML model in Spark, showing its superior performance compared to Spark’s COUNT method.
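To make the idea of an incrementally trained local linear estimator more concrete, here is a minimal sketch of one local model that maps a query vector to a COUNT answer and is updated one query at a time. The update rule is a normalized least-mean-squares step, chosen here for simplicity and not taken from the paper; the class `LocalLinearCount`, the `[center, radius]` query encoding, and the synthetic answer function are likewise assumptions.

```python
import random


class LocalLinearCount:
    """Hypothetical local linear model of COUNT around one query prototype."""

    def __init__(self, prototype, step=0.5):
        self.prototype = list(prototype)
        self.step = step
        # One intercept plus one slope per query dimension.
        self.coeffs = [0.0] * (len(prototype) + 1)

    def _features(self, query):
        # Constant term plus the query's offset from the prototype.
        return [1.0] + [q - p for q, p in zip(query, self.prototype)]

    def predict(self, query):
        return sum(c * f for c, f in zip(self.coeffs, self._features(query)))

    def update(self, query, count):
        """One normalized LMS step using a single (query, answer) pair."""
        feats = self._features(query)
        error = count - self.predict(query)
        norm = sum(f * f for f in feats)  # at least 1 because of the constant term
        self.coeffs = [c + self.step * error * f / norm
                       for c, f in zip(self.coeffs, feats)]


def synthetic_count(query):
    """Made-up, locally linear COUNT answer used only to exercise the sketch."""
    center, radius = query
    return 120.0 + 30.0 * (center - 5.0) + 200.0 * (radius - 1.0)


rng = random.Random(0)
model = LocalLinearCount(prototype=[5.0, 1.0])
for _ in range(2000):
    q = [5.0 + rng.uniform(-1.0, 1.0), 1.0 + rng.uniform(-0.5, 0.5)]
    model.update(q, synthetic_count(q))  # learn from the query and its answer only
```

Each update touches only the coefficients of one local model, so memory use stays constant per model regardless of how many queries have been seen.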
Particle filtering in high-dimensional chaotic systems
We present an efficient particle filtering algorithm for multiscale systems,
that is adapted for simple atmospheric dynamics models which are inherently
chaotic. Particle filters represent the posterior conditional distribution of
the state variables by a collection of particles, which evolves and adapts
recursively as new information becomes available. The difference between the
estimated state and the true state of the system constitutes the error in
specifying or forecasting the state, which is amplified in chaotic systems that
have a number of positive Lyapunov exponents. The purpose of the present paper
is to show that the homogenization method developed in Imkeller et al. (2011),
which is applicable to high dimensional multi-scale filtering problems, along
with importance sampling and control methods can be used as a basic and flexible
tool for the construction of the proposal density inherent in particle
filtering. Finally, we apply the general homogenized particle filtering
algorithm developed here to the Lorenz'96 atmospheric model that mimics
mid-latitude atmospheric dynamics with microscopic convective processes.
Comment: 28 pages, 12 figure
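For readers unfamiliar with the particle representation described above, the following is a minimal sketch of a generic bootstrap particle filter on a scalar random walk. It is not the homogenized filter of the paper and does not involve the Lorenz'96 model; the state model, noise levels, and the function name `bootstrap_particle_filter` are assumptions chosen to keep the example short. The proposal density here is simply the process model, which is the part the paper replaces with a more carefully constructed proposal.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles, process_std, obs_std, rng):
    """Return the posterior-mean state estimate after each observation."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # assumed prior
    estimates = []
    for y in observations:
        # Propagate every particle through the (assumed) random-walk dynamics.
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # Weight each particle by the Gaussian likelihood of the observation.
        weights = [math.exp(-0.5 * ((y - x) / obs_std) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed: fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample so that particles concentrate where the posterior has mass.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Synthetic experiment: a slowly drifting state observed with heavy noise.
rng = random.Random(0)
true_states, noisy_obs = [], []
state = 0.0
for _ in range(100):
    state += rng.gauss(0.0, 0.1)
    true_states.append(state)
    noisy_obs.append(state + rng.gauss(0.0, 1.0))

filtered = bootstrap_particle_filter(noisy_obs, 500, 0.1, 1.0, rng)
```

In high-dimensional chaotic systems this plain scheme tends to suffer from weight degeneracy, which is one motivation for designing better proposal densities.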
Gait learning for soft microrobots controlled by light fields
Soft microrobots based on photoresponsive materials and controlled by light
fields can generate a variety of different gaits. This inherent flexibility can
be exploited to maximize their locomotion performance in a given environment
and used to adapt them to changing conditions. However, because of the lack of
accurate locomotion models, and given the intrinsic variability among
microrobots, analytical control design is not possible. Common data-driven
approaches, on the other hand, require running prohibitive numbers of
experiments and lead to very sample-specific results. Here we propose a
probabilistic learning approach for light-controlled soft microrobots based on
Bayesian Optimization (BO) and Gaussian Processes (GPs). The proposed approach
results in a learning scheme that is data-efficient, enabling gait optimization
with a limited experimental budget, and robust against differences among
microrobot samples. These features are obtained by designing the learning
scheme through the comparison of different GP priors and BO settings on a
semi-synthetic data set. The developed learning scheme is validated in
microrobot experiments, resulting in a 115% improvement in a microrobot's
locomotion performance with an experimental budget of only 20 tests. These
encouraging results lead the way toward self-adaptive microrobotic systems
based on light-controlled soft microrobots and probabilistic learning control.
Comment: 8 pages, 7 figures, to appear in the proceedings of the IEEE/RSJ
International Conference on Intelligent Robots and Systems 201
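The combination of Gaussian Processes and Bayesian Optimization mentioned above can be sketched in a few dozen lines. The example below optimizes a single made-up "gait parameter" on a grid with a zero-mean GP, an RBF kernel, and an upper-confidence-bound acquisition rule, using a budget of 20 evaluations to mirror the experimental budget in the abstract. The kernel, its length scale, the acquisition rule, the noise level, and the objective `synthetic_gait_score` are all assumptions for illustration and are not the priors or settings compared in the paper.

```python
import math


def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel with unit prior variance (assumed prior)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_upper(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_fit(xs, ys, noise=1e-3):
    """Factor the noisy kernel matrix once and precompute alpha = K^-1 y."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    return L, solve_upper(L, solve_lower(L, ys))


def gp_predict(xs, L, alpha, x_new):
    """Posterior mean and standard deviation of the GP at x_new."""
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(t * t for t in v), 0.0)
    return mean, math.sqrt(var)


def bayes_opt(objective, budget, grid, kappa=2.0):
    """Maximize objective over grid with a GP surrogate and a UCB rule."""
    xs = [grid[0], grid[-1]]  # two initial evaluations at the grid ends
    ys = [objective(x) for x in xs]
    while len(xs) < budget:
        L, alpha = gp_fit(xs, ys)

        def ucb(x):
            mean, std = gp_predict(xs, L, alpha, x)
            return mean + kappa * std

        x_next = max(grid, key=ucb)
        xs.append(x_next)
        ys.append(objective(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], xs


def synthetic_gait_score(x):
    """Made-up stand-in for a measured locomotion performance."""
    return math.exp(-((x - 0.63) / 0.15) ** 2)


grid = [i / 100 for i in range(101)]
best_x, best_y, tried = bayes_opt(synthetic_gait_score, budget=20, grid=grid)
```

Each call to `objective` stands in for one physical experiment, which is why the surrogate model, and not the objective, is queried densely over the grid.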