Mining Event Logs to Support Workflow Resource Allocation
Workflow technology is widely used to facilitate the business process in
enterprise information systems (EIS), and it has the potential to reduce design
time, enhance product quality and decrease product cost. However, significant
limitations still exist: although resource allocation is an important task in
workflow management, many allocation operations are still performed manually,
which is time-consuming. This paper presents a data mining approach to address the
resource allocation problem (RAP) and improve the productivity of workflow
resource management. Specifically, an Apriori-like algorithm is used to find
the frequent patterns from the event log, and association rules are generated
according to predefined resource allocation constraints. Subsequently, a
correlation measure named lift is utilized to annotate the negatively
correlated resource allocation rules for resource reservation. Finally, the
rules are ranked using the confidence measures as resource allocation rules.
Comparative experiments are performed using C4.5, SVM, ID3, Naïve Bayes and
the presented approach, and the results show that the presented approach is
effective in both accuracy and candidate resource recommendations.
Comment: T. Liu et al., Mining event logs to support workflow resource
allocation, Knowl. Based Syst. (2012), http://dx.doi.org/
10.1016/j.knosys.2012.05.01
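The support, confidence and lift measures named in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it counts only single activity-to-resource pairs rather than running a full Apriori-style search, and the event log, the function name `allocation_rules` and the `min_support` value are all hypothetical.

```python
from collections import Counter

# Hypothetical toy event log: each entry is (activity, resource).
event_log = [
    ("review", "alice"),
    ("review", "alice"),
    ("review", "bob"),
    ("approve", "bob"),
    ("approve", "bob"),
    ("approve", "carol"),
    ("archive", "carol"),
    ("archive", "carol"),
]


def allocation_rules(log, min_support=0.1):
    """Build activity -> resource rules with support, confidence and lift.

    Simplified assumption: only single (activity, resource) pairs are
    considered, so no multi-item candidate generation is needed.
    """
    n = len(log)
    activity_counts = Counter(activity for activity, _ in log)
    resource_counts = Counter(resource for _, resource in log)
    pair_counts = Counter(log)

    rules = []
    for (activity, resource), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # drop infrequent pairs
        confidence = count / activity_counts[activity]
        # lift = P(resource | activity) / P(resource)
        lift = confidence / (resource_counts[resource] / n)
        rules.append({
            "activity": activity,
            "resource": resource,
            "support": support,
            "confidence": confidence,
            "lift": lift,
            # lift below 1 marks a negatively correlated rule
            "negatively_correlated": lift < 1.0,
        })

    # Rank rules by confidence, highest first.
    rules.sort(key=lambda rule: rule["confidence"], reverse=True)
    return rules


if __name__ == "__main__":
    for rule in allocation_rules(event_log):
        print(rule)
```

In this toy log, a rule such as review -> bob has a lift below 1 because bob handles reviews less often than his overall share of the work would suggest. That is the kind of rule the abstract describes as annotated through lift.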
Graph BI & analytics: current state and future challenges
In an increasingly competitive market, making well-informed decisions requires the analysis of a wide range of heterogeneous, large and complex data. This paper focuses on the emerging field of graph warehousing. Graphs are widespread structures that yield a great expressive power. They are used for modeling highly complex and interconnected domains, and efficiently solving emerging big data applications. This paper presents the current status and open challenges of graph BI and analytics, and motivates the need for new warehousing frameworks aware of the topological nature of graphs. We survey the topics of graph modeling, management, processing and analysis in graph warehouses. Then we conclude by discussing future research directions and positioning them within a unified architecture of a graph BI and analytics framework.
Peer Reviewed. Postprint (author's final draft)
Diamond Dicing
In OLAP, analysts often select an interesting sample of the data. For
example, an analyst might focus on products bringing revenues of at least 100
000 dollars, or on shops having sales greater than 400 000 dollars. However,
current systems do not allow the application of both of these thresholds
simultaneously, selecting products and shops satisfying both thresholds. For
such purposes, we introduce the diamond cube operator, filling a gap among
existing data warehouse operations.
Because of the interaction between dimensions, the computation of diamond
cubes is challenging. We compare and test various algorithms on large data sets
of more than 100 million facts. We find that while it is possible to implement
diamonds in SQL, it is inefficient. Indeed, our custom implementation can be a
hundred times faster than popular database engines (including a row-store and a
column-store).
Comment: 29 pages
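The interaction between dimensions mentioned above can be illustrated with a minimal Python sketch that prunes products and shops repeatedly until both thresholds hold at once. It is a naive fixed-point loop for illustration, not the authors' optimized implementation. The function name `diamond`, the toy sales figures and the use of "at least" for both thresholds are assumptions.

```python
from collections import defaultdict


def diamond(facts, min_product_total, min_shop_total):
    """Return the cells that survive simultaneous thresholds on both dimensions.

    facts maps (product, shop) -> sales amount. Cells are removed repeatedly
    until every remaining product total and shop total meets its threshold.
    """
    cells = dict(facts)
    while True:
        product_totals = defaultdict(float)
        shop_totals = defaultdict(float)
        for (product, shop), amount in cells.items():
            product_totals[product] += amount
            shop_totals[shop] += amount

        kept = {
            (product, shop): amount
            for (product, shop), amount in cells.items()
            if product_totals[product] >= min_product_total
            and shop_totals[shop] >= min_shop_total
        }
        if len(kept) == len(cells):
            return kept  # nothing was pruned, so both thresholds now hold
        cells = kept  # pruning one dimension may invalidate the other


# Hypothetical toy data (amounts in thousands of dollars).
sales = {
    ("p1", "s1"): 300,
    ("p2", "s1"): 150,
    ("p2", "s2"): 60,
    ("p3", "s2"): 90,
    ("p4", "s2"): 300,
}

if __name__ == "__main__":
    print(diamond(sales, min_product_total=100, min_shop_total=400))
```

With these figures, removing product p3 (total 90) drops shop s2 below 400, so s2 is pruned in the next pass and takes product p4 with it. Applying each threshold once and intersecting the results would miss this cascade.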
Data geo-Science Approach for Modelling Unconventional Petroleum Ecosystems and their Visual Analytics
Storage, integration and interoperability are critical
challenges in unconventional exploration data
management. With a quest to explore unconventional
hydrocarbons, in particular shale gas from fractured shales,
we aim at investigating new petroleum data geoscience
approaches. Data geo-science describes the
integration of geoscience-domain expertise with
mathematical concepts, computing algorithms and machine learning
tools, including data and business analytics.
Further, to strengthen data-science services among
producing companies, we propose an integrated
multidimensional repository system, for which factual
instances are acquired on gas shales, to store, process and
deliver fractured-data views in new knowledge domains.
Data dimensions are categorized to examine their
suitability in the integrated prototype articulations that use
fracture-networks and attribute dimension model
descriptions. The factual instances are typically drawn from
seismic attributes, seismically interpreted geological
structures and reservoirs, well logs and production
data entities. For designing and developing
multidimensional repository systems, we create various
artefacts, describing conceptual, logical and physical
models. For exploring the connectivity between seismic
and geology entities, multidimensional ontology models
are construed using fracture network attribute dimensions
and their instances. Data warehousing and mining
techniques additionally support the management of ontologies that
bring together the data instances of fractured shales, to unify and
explore the associativity between densely fractured
shales and their orientations.
The models depicting collaboration of geology,
geophysics, reservoir engineering and geo-mechanics
entities and their dimensions can substantially reduce the
risk and uncertainty involved in modelling and interpreting
shale- and tight-gas reservoirs, including traps associated
with Coal Bed Methane (CBM). Anisotropy, Poisson's
ratio and Young's modulus properties corroborate the
interpretation of stress images from the 3D acoustic
characterization of shale reservoirs. The statistical analysis
of data views, their correlations and patterns further
helps us to visualize and interpret geoscientific
metadata meticulously. The data geo-science guided integrated
methodology can be applied in any basin, including frontier
basins
- …