Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources
Apache Calcite is a foundational software framework that provides query
processing, optimization, and query language support to many popular
open-source data processing systems such as Apache Hive, Apache Storm, Apache
Flink, Druid, and MapD. Calcite's architecture consists of a modular and
extensible query optimizer with hundreds of built-in optimization rules, a
query processor capable of processing a variety of query languages, an adapter
architecture designed for extensibility, and support for heterogeneous data
models and stores (relational, semi-structured, streaming, and geospatial).
This flexible, embeddable, and extensible architecture is what makes Calcite an
attractive choice for adoption in big-data frameworks. It is an active project
that continues to introduce support for new types of data sources, query
languages, and approaches to query processing and optimization. Comment: SIGMOD'1
Customer churn prediction in telecom using machine learning and social network analysis in big data platform
Customer churn is a major problem and one of the most important concerns for
large companies. Because churn directly affects revenue, especially in the
telecom sector, companies are seeking ways to predict which customers are
likely to churn. Identifying the factors that increase customer churn is
therefore important for taking the actions needed to reduce it. The
main contribution of our work is a churn prediction model that helps telecom
operators identify the customers most likely to churn. The model uses machine
learning techniques on a big data platform and introduces a new approach to
feature engineering and selection. In
order to measure the performance of the model, the standard Area Under the
Curve (AUC) measure is adopted, and the AUC value obtained is 93.3%. Another
main contribution is the use of the customer social network in the prediction
model by extracting Social Network Analysis (SNA) features. Using SNA features
improved the model's AUC from 84% to 93.3%. The model was
prepared and tested in a Spark environment on a large dataset
created by transforming big raw data provided by the SyriaTel telecom company.
The dataset contained all customers' information over 9 months and was used to
train, test, and evaluate the system at SyriaTel. Four algorithms were
evaluated: Decision Tree, Random Forest, Gradient Boosted Machine Tree (GBM),
and Extreme Gradient Boosting (XGBoost). The best results were obtained with
the XGBoost algorithm, which was used for classification in this churn
prediction model. Comment: 24 pages, 14 figures. PDF https://rdcu.be/budK
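As an illustration of the AUC measure that the abstract above reports, the following is a minimal sketch, not the authors' code. It computes AUC as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner, with ties counted as one half. The function name `auc_score` and the toy labels and scores are hypothetical and are not taken from the SyriaTel dataset.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 values (1 = churned); scores: predicted churn scores.
    Returns the fraction of (positive, negative) pairs ranked correctly,
    counting tied scores as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = customer churned, 0 = customer stayed.
toy_labels = [1, 0, 1, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]
print(round(auc_score(toy_labels, toy_scores), 3))  # prints 0.889
```

The pairwise loop is quadratic in the number of examples, so it is suited only to small illustrations; on a large dataset a rank-based formulation or a library routine would be the practical choice.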
Generalized Bregman Divergence and Gradient of Mutual Information for Vector Poisson Channels
We investigate connections between information-theoretic and
estimation-theoretic quantities in vector Poisson channel models. In
particular, we generalize the gradient of mutual information with respect to
key system parameters from the scalar to the vector Poisson channel model. We
also propose, as another contribution, a generalization of the classical
Bregman divergence that offers a means to encapsulate under a unifying
framework the gradient of mutual information results for scalar and vector
Poisson and Gaussian channel models. The so-called generalized Bregman
divergence is also shown to exhibit various properties akin to the properties
of the classical version. The vector Poisson channel model is drawing
considerable attention in view of its application in various domains: as an
example, the availability of the gradient of mutual information can be used in
conjunction with gradient descent methods to effect compressive-sensing
projection designs in emerging X-ray and document classification applications.
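For orientation, the classical Bregman divergence that the abstract above generalizes can be written as follows. This is the standard textbook definition, not the paper's generalized form; the symbols $F$, $p$, and $q$ are notation chosen here for illustration.

```latex
\begin{align}
  % Classical Bregman divergence generated by a strictly convex,
  % differentiable function F
  D_F(p, q) &= F(p) - F(q) - \langle \nabla F(q),\, p - q \rangle \\
  % Choosing F(x) = \|x\|_2^2 recovers the squared Euclidean distance
  D_F(p, q) &= \|p - q\|_2^2 \\
  % Choosing F(x) = \sum_i (x_i \log x_i - x_i) recovers the
  % generalized Kullback-Leibler divergence
  D_F(p, q) &= \sum_i \left( p_i \log \frac{p_i}{q_i} - p_i + q_i \right)
\end{align}
```

The squared-error case is the one usually associated with Gaussian models and the generalized Kullback-Leibler case with Poisson models, which gives some intuition for why a Bregman-type framework can cover both channel families.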
Towards the Safety of Human-in-the-Loop Robotics: Challenges and Opportunities for Safety Assurance of Robotic Co-Workers
The success of the human-robot co-worker team in a flexible manufacturing
environment where robots learn from demonstration heavily relies on the correct
and safe operation of the robot. How this can be achieved is a challenge that
requires addressing both technical as well as human-centric research questions.
In this paper we discuss the state of the art in safety assurance, existing as
well as emerging standards in this area, and the need for new approaches to
safety assurance in the context of learning machines. We then focus on robotic
learning from demonstration, the challenges these techniques pose to safety
assurance and indicate opportunities to integrate safety considerations into
algorithms "by design". Finally, from a human-centric perspective, we stipulate
that, to achieve high levels of safety and ultimately trust, the robotic
co-worker must meet the innate expectations of the humans it works with. It is
our aim to stimulate a discussion focused on the safety aspects of
human-in-the-loop robotics, and to foster multidisciplinary collaboration to
address the research challenges identified.
Collaborative Verification-Driven Engineering of Hybrid Systems
Hybrid systems with both discrete and continuous dynamics are an important
model for real-world cyber-physical systems. The key challenge is to ensure
their correct functioning w.r.t. safety requirements. Promising techniques to
ensure safety seem to be model-driven engineering to develop hybrid systems in
a well-defined and traceable manner, and formal verification to prove their
correctness. Their combination forms the vision of verification-driven
engineering. Often, hybrid systems are rather complex in that they require
expertise from many domains (e.g., robotics, control systems, computer science,
software engineering, and mechanical engineering). Moreover, despite the
remarkable progress in automating formal verification of hybrid systems, the
construction of proofs of complex systems often requires nontrivial human
guidance, since hybrid systems verification tools solve undecidable problems.
It is, thus, not uncommon for development and verification teams to consist of
many players with diverse expertise. This paper introduces a
verification-driven engineering toolset that extends our previous work on
hybrid and arithmetic verification with tools for (i) graphical (UML) and
textual modeling of hybrid systems, (ii) exchanging and comparing models and
proofs, and (iii) managing verification tasks. This toolset makes it easier to
tackle large-scale verification tasks.