11,599 research outputs found
Building Combined Classifiers
This chapter covers different approaches that may be taken when building an
ensemble method, through studying specific examples of each approach from research
conducted by the authors. A method called Negative Correlation Learning illustrates a
decision level combination approach with individual classifiers trained co-operatively. The
Model level combination paradigm is illustrated via a tree combination method. Finally,
another variant of the decision level paradigm, with individuals trained independently
instead of co-operatively, is discussed as applied to churn prediction in the
telecommunications industry
An Integrated Semantic Web Service Discovery and Composition Framework
In this paper we present a theoretical analysis of graph-based service
composition in terms of its dependency with service discovery. Driven by this
analysis we define a composition framework by means of integration with
fine-grained I/O service discovery that enables the generation of a graph-based
composition which contains the set of services that are semantically relevant
for an input-output request. The proposed framework also includes an optimal
composition search algorithm to extract the best composition from the graph
minimising the length and the number of services, and different graph
optimisations to improve the scalability of the system. A practical
implementation used for the empirical analysis is also provided. This analysis
proves the scalability and flexibility of our proposal and provides insights on
how integrated composition systems can be designed in order to achieve good
performance in real scenarios for the Web.Comment: Accepted to appear in IEEE Transactions on Services Computing 201
Scalable discovery of hybrid process models in a cloud computing environment
Process descriptions are used to create products and deliver services. To lead better processes and services, the first step
is to learn a process model. Process discovery is such a technique which can automatically extract process models from event logs.
Although various discovery techniques have been proposed, they focus on either constructing formal models which are very powerful
but complex, or creating informal models which are intuitive but lack semantics. In this work, we introduce a novel method that returns
hybrid process models to bridge this gap. Moreover, to cope with today’s big event logs, we propose an efficient method, called f-HMD,
aims at scalable hybrid model discovery in a cloud computing environment. We present the detailed implementation of our approach
over the Spark framework, and our experimental results demonstrate that the proposed method is efficient and scalabl
Fine-grained Search Space Classification for Hard Enumeration Variants of Subset Problems
We propose a simple, powerful, and flexible machine learning framework for
(i) reducing the search space of computationally difficult enumeration variants
of subset problems and (ii) augmenting existing state-of-the-art solvers with
informative cues arising from the input distribution. We instantiate our
framework for the problem of listing all maximum cliques in a graph, a central
problem in network analysis, data mining, and computational biology. We
demonstrate the practicality of our approach on real-world networks with
millions of vertices and edges by not only retaining all optimal solutions, but
also aggressively pruning the input instance size resulting in several fold
speedups of state-of-the-art algorithms. Finally, we explore the limits of
scalability and robustness of our proposed framework, suggesting that
supervised learning is viable for tackling NP-hard problems in practice.Comment: AAAI 201
A Heuristic Approach for Discovering Reference Models by Mining Process Model Variants
Recently, a new generation of adaptive Process-Aware Information Systems (PAISs) has emerged, which enables structural process changes during runtime while preserving PAIS robustness and consistency. Such flexibility, in turn, leads to a large number of process variants derived from the same model, but differing in structure. Generally, such variants are expensive to configure and maintain. This paper provides a heuristic search algorithm which fosters learning from past process changes by mining process variants. The algorithm discovers a reference model based on which the need for future process configuration and adaptation can be reduced. It additionally provides the flexibility to control the process evolution procedure, i.e., we can control to what degree the discovered reference model differs from the original one. As benefit, we can not only control the effort for updating the reference model, but also gain the flexibility to perform only the most important adaptations of the current reference model. Our mining algorithm is implemented and evaluated by a simulation using more than 7000 process models. Simulation results indicate strong performance and scalability of our algorithm even when facing large-sized process models
Towards Profit Maximization for Online Social Network Providers
Online Social Networks (OSNs) attract billions of users to share information
and communicate where viral marketing has emerged as a new way to promote the
sales of products. An OSN provider is often hired by an advertiser to conduct
viral marketing campaigns. The OSN provider generates revenue from the
commission paid by the advertiser which is determined by the spread of its
product information. Meanwhile, to propagate influence, the activities
performed by users such as viewing video ads normally induce diffusion cost to
the OSN provider. In this paper, we aim to find a seed set to optimize a new
profit metric that combines the benefit of influence spread with the cost of
influence propagation for the OSN provider. Under many diffusion models, our
profit metric is the difference between two submodular functions which is
challenging to optimize as it is neither submodular nor monotone. We design a
general two-phase framework to select seeds for profit maximization and develop
several bounds to measure the quality of the seed set constructed. Experimental
results with real OSN datasets show that our approach can achieve high
approximation guarantees and significantly outperform the baseline algorithms,
including state-of-the-art influence maximization algorithms.Comment: INFOCOM 2018 (Full version), 12 page
- …