582 research outputs found
Tree-Mining: Understanding Applications and Challenges
Tree mining is an essential set of techniques and software technologies for multi-level, multi-angle operations on databases. This manuscript explores several applications of the sub-techniques of tree mining. It investigates the major applications and challenges of the different types and techniques of tree mining, a topic that has so far received only patchy and scant attention. To this end, the author has reviewed some of the latest and most pertinent research articles of the last two decades.
EFFICIENT APPROACH FOR VIEW SELECTION FOR DATA WAREHOUSE USING TREE MINING AND EVOLUTIONARY COMPUTATION
Selecting a proper set of views to materialize plays an important role in database performance. Many view selection methods exist, using different techniques and frameworks to choose an efficient set of views for materialization. In this paper, we present a new efficient, scalable method for view selection under given storage constraints, using a tree mining approach and evolutionary optimization. The tree mining algorithm is designed to determine the exact frequency of (sub)queries in the historical SQL dataset. The query cost model maximizes the performance benefit of the final view set, which is derived from the frequent view set produced by the tree mining algorithm. The performance benefit of a query is defined as a function of query frequency, query creation cost, and query maintenance cost. The experimental results show that the proposed method succeeds in recommending a solution that is fairly close to the optimal solution.
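To make the cost model above concrete, here is a minimal Python sketch of view selection under a storage limit. It is not the paper's method: the abstract does not give the benefit formula, so `benefit` below (frequency times creation cost, minus maintenance cost) is an assumed form, and a greedy benefit-per-storage ranking stands in for the evolutionary optimization. The names `CandidateView`, `benefit`, and `select_views` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateView:
    name: str
    frequency: int           # how often the (sub)query occurs in the query log
    creation_cost: float     # cost of answering the query without the view
    maintenance_cost: float  # cost of keeping the materialized view up to date
    size: float              # storage needed to materialize the view


def benefit(view):
    """Assumed benefit form: work saved over all uses, minus upkeep."""
    return view.frequency * view.creation_cost - view.maintenance_cost


def select_views(candidates, storage_limit):
    """Greedy stand-in for the evolutionary search (an assumption):
    take views in order of benefit per unit of storage while they fit."""
    ranked = sorted(candidates, key=lambda v: benefit(v) / v.size, reverse=True)
    chosen, used = [], 0.0
    for view in ranked:
        if benefit(view) > 0 and used + view.size <= storage_limit:
            chosen.append(view)
            used += view.size
    return chosen


if __name__ == "__main__":
    views = [
        CandidateView("A", frequency=10, creation_cost=5, maintenance_cost=10, size=4),
        CandidateView("B", frequency=2, creation_cost=3, maintenance_cost=10, size=1),
        CandidateView("C", frequency=20, creation_cost=2, maintenance_cost=5, size=10),
        CandidateView("D", frequency=5, creation_cost=4, maintenance_cost=2, size=3),
    ]
    print([v.name for v in select_views(views, storage_limit=8)])
```

A greedy ranking can miss the best combination of views when sizes interact, which is one reason a global search such as evolutionary optimization may be preferred.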
Asynchronous Collective Tree Exploration by Tree-Mining
We investigate the problem of collaborative tree exploration with complete
communication introduced by [FGKP06], in which a group of $k$ agents is
assigned to collectively go through all edges of an unknown tree in an
efficient manner and then return to the origin. The agents have unrestricted
communication and computation capabilities. The algorithm's runtime is
typically compared to the cost of offline traversal, which is at least
$\max\{2n/k, 2D\}$, where $n$ is the number of nodes and $D$ is the tree depth.
Since its introduction, two types of guarantee have emerged on the topic: the
first is of the form $r(k)(n/k + D)$, where $r(k)$ is called the competitive
ratio, and the other is of the form $2n/k + f(k, D)$, where $f(k, D)$ is called
the competitive overhead. In this paper, we present the first algorithm with
linear-in-$D$ competitive overhead, thereby reconciling both approaches.
Specifically, our bound is in [formula lost in extraction] and thus leads to a
competitive ratio in [formula lost in extraction]. This is the first
improvement over the $O(k/\log k)$-competitive algorithm known since the
introduction of the problem in 2004. Our algorithm is obtained for an
asynchronous generalization of collective tree exploration (ACTE). It is an
instance of a general class of locally-greedy exploration algorithms that we
define. We show that the additive overhead analysis of locally-greedy
algorithms can be seen through the lens of a 2-player game that we call the
tree-mining game and that could be of independent interest.
Tree mining application to matching of heterogeneous knowledge
Matching of heterogeneous knowledge sources is of increasing importance in areas such as scientific knowledge management, e-commerce, enterprise application integration, and many emerging Semantic Web applications. Given the desire for knowledge sharing and reuse in these fields, knowledge from different organizations in the same domain often has to be matched. We propose a knowledge matching method based on our previously developed tree mining algorithms for extracting frequently occurring subtrees from a tree-structured database such as XML. Using the method, the common structure among the different representations can be extracted automatically. Our focus is on knowledge matching at the structural level, and we use a set of example XML schema documents from the same domain to evaluate the method. We discuss some important issues that arise when applying tree mining algorithms to the detection of common document structures. The experiments demonstrate the usefulness of the approach.
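As a rough illustration of extracting common structure from several XML documents, the Python sketch below counts root-to-node tag paths and keeps those that occur in at least `min_support` documents. This is a deliberate simplification and not the authors' algorithm: real frequent-subtree mining handles general subtrees, while this handles only paths. The function names and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_paths(xml_text):
    """Return the set of root-to-node tag paths found in one XML document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, f"{path}/{child.tag}"))
    return paths


def frequent_paths(documents, min_support):
    """Return the paths that occur in at least min_support documents."""
    counts = Counter()
    for doc in documents:
        # tag_paths returns a set, so each document counts a path at most once
        counts.update(tag_paths(doc))
    return {path for path, count in counts.items() if count >= min_support}


if __name__ == "__main__":
    docs = [
        "<order><customer><name/></customer><item/></order>",
        "<order><customer><name/><email/></customer></order>",
        "<order><item/><total/></order>",
    ]
    for path in sorted(frequent_paths(docs, min_support=2)):
        print(path)
```

Matching on exact tag names assumes the sources share a vocabulary; sources that name the same concept differently would first need some form of label alignment.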
Graph-based task libraries for robots: generalization and autocompletion
In this paper, we consider an autonomous robot that persists
over time performing tasks and the problem of providing one additional
task to the robot's task library. We present an approach to generalize
tasks, represented as parameterized graphs with sequences, conditionals,
and looping constructs of sensing and actuation primitives. Our approach
performs graph-structure task generalization, while maintaining task
executability and parameter value distributions. We present an algorithm
that, given the initial steps of a new task, proposes an autocompletion
based on a recognized past similar task. Our generalization and
autocompletion contributions are effective on different real robots. We show
concrete examples of the robot primitives and task graphs, as well as
results, with Baxter. In experiments with multiple tasks, we show a
significant reduction in the number of new task steps to be provided.
Mining Shared Decision Trees between Datasets
This thesis studies the problem of mining models, patterns and structures (MPS) shared by two datasets (applications): a well understood dataset, denoted WD, and a poorly understood one, denoted PD. Combined with users' familiarity with WD, the shared MPS can help users better understand PD, since they capture similarities between WD and PD. Moreover, knowledge of such similarities lets users focus their attention on analyzing the unique behavior of PD. Technically, this thesis focuses on the shared decision tree mining problem. To provide a view of the similarities between WD and PD, this thesis proposes to mine a high-quality shared decision tree that has (1) a highly similar data distribution and (2) high classification accuracy in both datasets. This thesis proposes an algorithm, SDT-Miner, for mining such a shared decision tree. The algorithm differs significantly from traditional decision tree mining, since it addresses the challenges caused by the presence of two datasets, by the data distribution similarity requirement, and by the tree accuracy requirement. The effectiveness of the algorithm is verified by experiments.