Efficient incremental constrained clustering
Clustering with constraints is an emerging area of data mining research. However, most work assumes that the constraints are given as one large batch. In this paper we explore the situation where the constraints are incrementally given. In this way the user after seeing a clustering can provide positive and negative feedback via constraints to critique a clustering solution. We consider the problem of efficiently updating a clustering to satisfy the new and old constraints rather than re-clustering the entire data set. We show that the problem of incremental clustering under constraints is NP-hard in general, but identify several sufficient conditions which lead to efficiently solvable versions. These translate into a set of rules on the types of constraints that can be added and constraint set properties that must be maintained. We demonstrate that this approach is more efficient than re-clustering the entire data set and has several other advantages.
General Bounds for Incremental Maximization
We propose a theoretical framework to capture incremental solutions to
cardinality constrained maximization problems. The defining characteristic of
our framework is that the cardinality/support of the solution is bounded by a
value k that grows over time, and we allow the solution to be
extended one element at a time. We investigate the best-possible competitive
ratio of such an incremental solution, i.e., the worst ratio over all k
between the incremental solution after k steps and an optimum solution of
cardinality k. We define a large class of problems that contains many
important cardinality constrained maximization problems like maximum matching,
knapsack, and packing/covering problems. We provide a general
-competitive incremental algorithm for this class of problems, and show
that no algorithm can have competitive ratio below in general.
In the second part of the paper, we focus on the inherently incremental
greedy algorithm that increases the objective value as much as possible in each
step. This algorithm is known to be -competitive for submodular objective
functions, but it has unbounded competitive ratio for the class of incremental
problems mentioned above. We define a relaxed submodularity condition for the
objective function, capturing problems like maximum (weighted) (-)matching
and a variant of the maximum flow problem. We show that the greedy algorithm
has competitive ratio (exactly) for the class of problems that satisfy
this relaxed submodularity condition.
Note that our upper bounds on the competitive ratios translate to
approximation ratios for the underlying cardinality constrained problems.
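
The greedy algorithm described in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `greedy_incremental`, the toy set-coverage objective, and the data in `SETS` are assumptions made purely for illustration. The sketch adds, in each step, the element with the largest marginal gain, so that the first k chosen elements form the incremental solution for cardinality k.

```python
from typing import Callable, FrozenSet, Iterable, List


def greedy_incremental(
    ground_set: Iterable[str],
    objective: Callable[[FrozenSet[str]], float],
    steps: int,
) -> List[str]:
    """Return elements in the order greedy adds them.

    The prefix of length k is the incremental solution of cardinality k.
    """
    chosen: List[str] = []
    remaining = set(ground_set)
    for _ in range(min(steps, len(remaining))):
        current = objective(frozenset(chosen))
        # Pick the element with the largest marginal gain; iterating in
        # sorted order makes tie-breaking deterministic.
        best = max(
            sorted(remaining),
            key=lambda e: objective(frozenset(chosen + [e])) - current,
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical toy instance: each element covers a set of items, and the
# objective is the number of distinct items covered (a submodular function).
SETS = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def coverage(selection: FrozenSet[str]) -> int:
    covered = set()
    for name in selection:
        covered |= SETS[name]
    return len(covered)


order = greedy_incremental(SETS, coverage, steps=3)
values = [coverage(frozenset(order[:k])) for k in range(1, len(order) + 1)]
```

Comparing `values[k - 1]` with the best value achievable by any set of k elements gives the ratio that the competitive analysis bounds in the worst case over all k.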
XML documents clustering using a tensor space model
The traditional Vector Space Model (VSM) is not able to represent both the structure and the content of XML documents. This paper introduces a novel method of representing XML documents in a Tensor Space Model (TSM) and then utilizing it for clustering. Empirical analysis shows that the proposed method scales to large datasets; in addition, the factorized matrices it produces help to improve the quality of clusters through an enriched document representation of both structure and content information.
Global Trajectory Optimisation : Can We Prune the Solution Space When Considering Deep Space Manoeuvres? [Final Report]
This document contains a report on the work done under the ESA/Ariadna study 06/4101 on the global optimization of space trajectories with multiple gravity assist (GA) and deep space manoeuvres (DSM). The study was performed by a joint team of scientists from the University of Reading and the University of Glasgow