
    Efficient incremental constrained clustering

    Clustering with constraints is an emerging area of data mining research. However, most work assumes that the constraints are given as one large batch. In this paper we explore the situation where the constraints are incrementally given. In this way the user after seeing a clustering can provide positive and negative feedback via constraints to critique a clustering solution. We consider the problem of efficiently updating a clustering to satisfy the new and old constraints rather than re-clustering the entire data set. We show that the problem of incremental clustering under constraints is NP-hard in general, but identify several sufficient conditions which lead to efficiently solvable versions. These translate into a set of rules on the types of constraints that can be added and constraint set properties that must be maintained. We demonstrate that this approach is more efficient than re-clustering the entire data set and has several other advantages.
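
    The idea of accepting constraints one at a time can be illustrated with a small sketch. This is not the algorithm from the paper: it assumes the positive and negative feedback takes the form of must-link and cannot-link pairs (a common convention in constrained clustering, not stated in the abstract), and it only checks whether a new constraint is consistent with the ones already accepted. The class name `ConstraintStore` and its methods are hypothetical.

```python
class ConstraintStore:
    """Accepts pairwise constraints incrementally (illustrative sketch only).

    Assumption: constraints are must-link / cannot-link pairs. The store
    rejects any new constraint that contradicts those accepted earlier.
    It does not update a clustering; it only tracks constraint consistency.
    """

    def __init__(self):
        self.parent = {}    # union-find forest over must-linked points
        self.cannot = []    # accepted cannot-link pairs

    def _find(self, x):
        # Find the representative of x, with path halving.
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def add_must_link(self, a, b):
        """Return True if accepted, False if it contradicts a cannot-link."""
        ra, rb = self._find(a), self._find(b)
        if ra == rb:
            return True  # already implied by earlier must-links
        for x, y in self.cannot:
            if {self._find(x), self._find(y)} == {ra, rb}:
                return False  # merging would join a cannot-linked pair
        self.parent[ra] = rb
        return True

    def add_cannot_link(self, a, b):
        """Return True if accepted, False if a and b are already must-linked."""
        if self._find(a) == self._find(b):
            return False
        self.cannot.append((a, b))
        return True
```

    Each call touches only the stored constraints rather than the whole data set, which mirrors, in a very limited way, the motivation for updating instead of re-clustering.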

    General Bounds for Incremental Maximization

    We propose a theoretical framework to capture incremental solutions to cardinality constrained maximization problems. The defining characteristic of our framework is that the cardinality/support of the solution is bounded by a value $k \in \mathbb{N}$ that grows over time, and we allow the solution to be extended one element at a time. We investigate the best-possible competitive ratio of such an incremental solution, i.e., the worst ratio over all $k$ between the incremental solution after $k$ steps and an optimum solution of cardinality $k$. We define a large class of problems that contains many important cardinality constrained maximization problems like maximum matching, knapsack, and packing/covering problems. We provide a general 2.618-competitive incremental algorithm for this class of problems, and show that no algorithm can have competitive ratio below 2.18 in general. In the second part of the paper, we focus on the inherently incremental greedy algorithm that increases the objective value as much as possible in each step. This algorithm is known to be 1.58-competitive for submodular objective functions, but it has unbounded competitive ratio for the class of incremental problems mentioned above. We define a relaxed submodularity condition for the objective function, capturing problems like maximum (weighted) ($b$-)matching and a variant of the maximum flow problem. We show that the greedy algorithm has competitive ratio (exactly) 2.313 for the class of problems that satisfy this relaxed submodularity condition. Note that our upper bounds on the competitive ratios translate to approximation ratios for the underlying cardinality constrained problems.
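
    The greedy algorithm and the competitive ratio described above can be sketched on a toy instance. The code below is a minimal illustration, not the paper's construction: the coverage objective, the example sets, and the function names `greedy_incremental`, `coverage`, and `competitive_ratio` are all assumptions chosen for the example, and the optimum is found by brute force, which is only feasible for tiny inputs.

```python
from itertools import combinations


def coverage(sets):
    """Build an objective: number of items covered by the chosen set names."""
    def objective(chosen):
        covered = set()
        for name in chosen:
            covered |= sets[name]
        return len(covered)
    return objective


def greedy_incremental(ground_set, objective):
    """Order all elements greedily: each step adds the element that gives
    the largest objective value (ties go to the earliest element)."""
    order, remaining = [], list(ground_set)
    while remaining:
        current = frozenset(order)
        best = max(remaining, key=lambda e: objective(current | {e}))
        order.append(best)
        remaining.remove(best)
    return order


def competitive_ratio(ground_set, objective):
    """Worst ratio over all k of (optimum of cardinality k) to
    (value of the first k greedy elements). Brute force, tiny inputs only."""
    elements = list(ground_set)
    order = greedy_incremental(elements, objective)
    worst = 1.0
    for k in range(1, len(elements) + 1):
        optimum = max(objective(frozenset(c)) for c in combinations(elements, k))
        incremental = objective(frozenset(order[:k]))
        worst = max(worst, optimum / incremental)
    return worst


# Hypothetical instance: greedy takes "a" first, but the best pair is "b" + "c".
example_sets = {"a": {1, 2, 3, 4}, "b": {1, 2, 5}, "c": {3, 4, 6}}
f = coverage(example_sets)
print(greedy_incremental(example_sets, f))  # ['a', 'b', 'c']
print(competitive_ratio(example_sets, f))   # 1.2
```

    On this instance the greedy prefix of length 2 covers 5 items while the best pair covers 6, so the ratio is 6/5; for $k = 1$ and $k = 3$ greedy matches the optimum.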

    XML documents clustering using a tensor space model

    The traditional Vector Space Model (VSM) is not able to represent both the structure and the content of XML documents. This paper introduces a novel method of representing XML documents in a Tensor Space Model (TSM) and then utilizing it for clustering. Empirical analysis shows that the proposed method is scalable for large datasets; in addition, the factorized matrices produced by the proposed method help to improve the quality of clusters through the enriched document representation of both structure and content information.

    Global Trajectory Optimisation: Can We Prune the Solution Space When Considering Deep Space Manoeuvres? [Final Report]

    This document contains a report on the work done under the ESA/Ariadna study 06/4101 on the global optimization of space trajectories with multiple gravity assist (GA) and deep space manoeuvres (DSM). The study was performed by a joint team of scientists from the University of Reading and the University of Glasgow.