Clustering-Based Predictive Process Monitoring
Business process enactment is generally supported by information systems that
record data about process executions, which can be extracted as event logs.
Predictive process monitoring is concerned with exploiting such event logs to
predict how running (uncompleted) cases will unfold up to their completion. In
this paper, we propose a predictive process monitoring framework for estimating
the probability that a given predicate will be fulfilled upon completion of a
running case. The predicate can be, for example, a temporal logic constraint or
a time constraint, or any predicate that can be evaluated over a completed
trace. The framework takes into account both the sequence of events observed in
the current trace and the data attributes associated with these events. The
prediction problem is approached in two phases. First, prefixes of previous
traces are clustered according to control flow information. Secondly, a
classifier is built for each cluster using event data to discriminate between
fulfillments and violations. At runtime, a prediction is made on a running case
by mapping it to a cluster and applying the corresponding classifier. The
framework has been implemented in the ProM toolset and validated on a log
pertaining to the treatment of cancer patients in a large hospital.
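The two-phase scheme described in this abstract can be illustrated with a minimal sketch. It is not the ProM implementation. It assumes an activity-frequency encoding for the control flow, a plain k-means for the clustering phase, and a k-nearest-neighbour vote as a stand-in for the per-cluster classifier; the abstract does not name the actual encoding, clustering method, or classifier. The function names (`encode`, `kmeans`, `train`, `predict`) and the toy log are hypothetical.

```python
from collections import Counter


def encode(prefix, alphabet):
    """Control-flow encoding (assumed): activity frequencies of a prefix."""
    counts = Counter(prefix)
    return [counts[a] for a in alphabet]


def dist2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))


def kmeans(points, k, iters=20):
    """Lloyd's k-means, seeded deterministically with the first k distinct points."""
    centroids = []
    for p in points:
        if p not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def train(history, alphabet, k):
    """Phase 1: cluster prefixes by control flow. Phase 2: store labelled
    data attributes per cluster. history holds (prefix, attributes, fulfilled)."""
    vectors = [encode(prefix, alphabet) for prefix, _, _ in history]
    centroids = kmeans(vectors, k)
    members = [[] for _ in centroids]
    for vec, (_, attrs, fulfilled) in zip(vectors, history):
        members[nearest(vec, centroids)].append((attrs, fulfilled))
    # Drop empty clusters so every remaining cluster can make a prediction.
    kept = [(c, m) for c, m in zip(centroids, members) if m]
    return [c for c, _ in kept], [m for _, m in kept]


def predict(model, prefix, attrs, alphabet, m=3):
    """Runtime: map the running case to a cluster, then estimate the probability
    of fulfilment as the share of fulfilled cases among the m nearest neighbours."""
    centroids, members = model
    cluster = members[nearest(encode(prefix, alphabet), centroids)]
    neighbours = sorted(cluster, key=lambda ex: dist2(ex[0], attrs))[:m]
    return sum(label for _, label in neighbours) / len(neighbours)


# Toy log (invented): one numeric attribute per prefix, label 1 = predicate fulfilled.
alphabet = ["A", "B", "C"]
history = [
    (["A", "B"], [30.0], 1), (["A", "B", "B"], [32.0], 1),
    (["A", "B"], [70.0], 0), (["A", "B", "B"], [72.0], 0),
    (["A", "C"], [30.0], 0), (["A", "C", "C"], [33.0], 0),
    (["A", "C"], [71.0], 1), (["A", "C", "C"], [69.0], 1),
]
model = train(history, alphabet, k=2)
probability = predict(model, ["A", "B"], [31.0], alphabet)
```

In the toy log the same attribute value points to opposite outcomes depending on the control-flow cluster, which is the situation where training one classifier per cluster, rather than one global classifier, would be expected to help.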
Estimating Cardinalities with Deep Sketches
We introduce Deep Sketches, which are compact models of databases that allow
us to estimate the result sizes of SQL queries. Deep Sketches are powered by a
new deep learning approach to cardinality estimation that can capture
correlations between columns, even across tables. Our demonstration allows
users to define such sketches on the TPC-H and IMDb datasets, monitor the
training process, and run ad-hoc queries against trained sketches. We also
estimate query cardinalities with HyPer and PostgreSQL to visualize the gains
over traditional cardinality estimators.
Comment: To appear in SIGMOD'1
Vectorwise: Beyond Column Stores
This paper tells the story of Vectorwise, a high-performance analytical
database system, from multiple perspectives: its history from academic project
to commercial product, the evolution of its technical architecture, customer
reactions to the product and its future research and development roadmap. One
take-away from this story is that the novelty in Vectorwise is much more than
just column-storage: it boasts many query processing innovations in its
vectorized execution model, and an adaptive mixed row/column data storage model
with indexing support tailored to analytical workloads. Another one is that
there is a long road from research prototype to commercial product, though
database research continues to achieve a strong innovative influence on product
development.