46,659 research outputs found
Optimal clustering of frequency-constrained maintenance jobs with shared set-ups
Since maintenance jobs often require one or more set-up activities, joint execution or clustering of maintenance jobs is a powerful instrument to reduce shut-down costs. We consider a clustering problem for frequency-constrained maintenance jobs, i.e. maintenance jobs that must be carried out with a prescribed (or higher) frequency. For the clustering of maintenance jobs with identical, so-called common set-ups, several strong dominance rules are provided. These dominance rules are used in an efficient dynamic programming algorithm which solves the problem in polynomial time. For the clustering of maintenance jobs with partially identical, so-called shared set-ups, similar but less strong dominance rules are available. Nevertheless, a surprisingly well-performing greedy heuristic and a branch and bound procedure have been developed to solve this problem. For randomly generated test problems with 10 set-ups and 30 maintenance jobs, the heuristic was optimal in 47 out of 100 test problems, with an average deviation of 0.24% from the optimal solution. In addition, the branch and bound method found an optimal solution in only a few seconds of computation time on average.
Exact algorithms for minimum sum-of-squares clustering
NP-Hardness of Euclidean sum-of-squares clustering -- Computational complexity -- An incorrect reduction from the K-section problem -- A new proof by reduction from the densest cut problem -- Evaluating a branch-and-bound RLT-based algorithm for minimum sum-of-squares clustering -- Reformulation-Linearization technique for the MSSC -- Branch-and-bound for the MSSC -- An attempt at reproducing computational results -- Breaking symmetry and convex hull inequalities -- A branch-and-cut SDP-based algorithm for minimum sum-of-squares clustering -- Equivalence of MSSC to 0-1 SDP -- A branch-and-cut algorithm for the 0-1 SDP formulation -- Computational experiments -- An improved column generation algorithm for minimum sum-of-squares clustering -- Column generation algorithm revisited -- A geometric approach -- Generalization to the Euclidean space -- Computational results
An Exact Algorithm for Semi-supervised Minimum Sum-of-Squares Clustering
The minimum sum-of-squares clustering (MSSC), or k-means type clustering, is
traditionally considered an unsupervised learning task. In recent years, the
use of background knowledge to improve the cluster quality and promote
interpretability of the clustering process has become a hot research topic at
the intersection of mathematical optimization and machine learning research.
The problem of taking advantage of background information in data clustering is
called semi-supervised or constrained clustering. In this paper, we present a
branch-and-cut algorithm for semi-supervised MSSC, where background knowledge
is incorporated as pairwise must-link and cannot-link constraints. For the
lower bound procedure, we solve the semidefinite programming relaxation of the
MSSC discrete optimization model, and we use a cutting-plane procedure for
strengthening the bound. For the upper bound, instead, by using integer
programming tools, we use an adaptation of the k-means algorithm to the
constrained case. For the first time, the proposed global optimization
algorithm efficiently manages to solve real-world instances up to 800 data
points with different combinations of must-link and cannot-link constraints and
with a generic number of features. This problem size is about four times larger
than that of the instances solved by state-of-the-art exact algorithms.
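To make the terms in the abstract above concrete, here is a minimal Python sketch of the MSSC objective (the sum of squared Euclidean distances from each point to its cluster centroid) together with a feasibility check for pairwise must-link and cannot-link constraints. The function names `mssc_objective` and `satisfies_constraints` and the toy data are illustrative assumptions, not taken from the paper, and the sketch only evaluates a given labeling; it does not implement the branch-and-cut algorithm.

```python
def mssc_objective(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for p, k in zip(points, labels):
        clusters.setdefault(k, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def satisfies_constraints(labels, must_link=(), cannot_link=()):
    """Check pairwise constraints given as (i, j) index pairs (hypothetical format)."""
    ok_must = all(labels[i] == labels[j] for i, j in must_link)
    ok_cannot = all(labels[i] != labels[j] for i, j in cannot_link)
    return ok_must and ok_cannot


# Toy data: two well-separated pairs of points in the plane.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]
cost = mssc_objective(points, labels)
feasible = satisfies_constraints(labels, must_link=[(0, 1)], cannot_link=[(0, 2)])
```

A constrained solver would search only over labelings for which `satisfies_constraints` holds and return one that minimizes `mssc_objective`.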
A Parallel Branch-and-Bound Method for Cluster Analysis
Cluster analysis is a generic term coined for procedures that are used objectively to group entities based on their similarities and differences. The primary objective of these procedures is to group n items into K mutually exclusive clusters so that items within each cluster are relatively homogeneous in nature while the clusters themselves are distinct. In this research, we have developed, implemented and tested an asynchronous, dynamic parallel branch-and-bound algorithm to solve the clustering problem. In the developmental environment, several processes (tasks) work independently on various subproblems generated by the branch-and-bound procedure. This parallel algorithm can solve very large-scale, optimal clustering problems in a reasonable amount of wall-clock time. Linear and superlinear speedups are obtained. Thus, solutions to real-world, complex clustering problems, which could not be solved due to the lack of efficient parallel algorithms, can now be attempted.
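As a rough illustration of how branch-and-bound applies to clustering, the sketch below assigns points to clusters one at a time and prunes any partial assignment whose within-cluster sum of squares already reaches the best complete solution found so far. This is a sequential toy version under assumptions of our own (sum-of-squares objective, at most `k` clusters, the hypothetical names `sse` and `branch_and_bound_mssc`); it is not the asynchronous parallel algorithm described in the abstract, whose objective and bounding rules are not given here.

```python
def sse(members):
    """Within-cluster sum of squared distances to the centroid."""
    if not members:
        return 0.0
    dim = len(members[0])
    centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
    return sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))


def branch_and_bound_mssc(points, k):
    """Exact search over partitions into at most k clusters (tiny inputs only)."""
    n = len(points)
    best = {"cost": float("inf"), "labels": None}
    labels = [None] * n
    clusters = [[] for _ in range(k)]

    def recurse(i, used):
        # Lower bound: adding points never decreases a cluster's sum of
        # squares, so the cost of the partial assignment bounds every completion.
        bound = sum(sse(c) for c in clusters)
        if bound >= best["cost"]:
            return  # prune this subtree
        if i == n:
            best["cost"], best["labels"] = bound, labels.copy()
            return
        # Symmetry breaking: a point joins a used cluster or opens the next one.
        for c in range(min(used + 1, k)):
            clusters[c].append(points[i])
            labels[i] = c
            recurse(i + 1, max(used, c + 1))
            clusters[c].pop()
        labels[i] = None

    recurse(0, 0)
    return best["cost"], best["labels"]


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
best_cost, best_labels = branch_and_bound_mssc(points, 2)
```

In a parallel variant, the subtrees explored by the loop inside `recurse` are the natural units of work to hand to independent processes, with the incumbent cost shared among them.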
An investigation into the effects of partitioning the facilities assignment problem by hierarchical clustering methods
This thesis considers the possibility of partitioning larger problems by clustering the facilities according to the hierarchy of their mutual flows. Two different methods of accomplishing this clustering are developed and evaluated. A model is developed to partition the problem by these methods and to use a branch and bound algorithm at two levels. One level arranges the clusters in an optimal manner and the second level arranges the facilities within the clusters.
XClusters: Explainability-first Clustering
We study the problem of explainability-first clustering where explainability
becomes a first-class citizen for clustering. Previous clustering approaches
use decision trees for explanation, but only after the clustering is completed.
In contrast, our approach is to perform clustering and decision tree training
holistically where the decision tree's performance and size also influence the
clustering results. We assume the attributes for clustering and explaining are
distinct, although this is not necessary. We observe that our problem is a
monotonic optimization where the objective function is a difference of
monotonic functions. We then propose an efficient branch-and-bound algorithm
for finding the best parameters that lead to a balance of cluster distortion
and decision tree explainability. Our experiments show that our method can
improve the explainability of any clustering that fits in our framework.
Global Optimization for Cardinality-constrained Minimum Sum-of-Squares Clustering via Semidefinite Programming
The minimum sum-of-squares clustering (MSSC), or k-means type clustering, has
been recently extended to exploit prior knowledge on the cardinality of each
cluster. Such knowledge is used to increase performance as well as solution
quality. In this paper, we propose a global optimization approach based on the
branch-and-cut technique to solve the cardinality-constrained MSSC. For the
lower bound routine, we use the semidefinite programming (SDP) relaxation
recently proposed by Rujeerapaiboon et al. [SIAM J. Optim. 29(2), 1211-1239,
(2019)]. However, this relaxation can be used in a branch-and-cut method only
for small-size instances. Therefore, we derive a new SDP relaxation that scales
better with the instance size and the number of clusters. In both cases, we
strengthen the bound by adding polyhedral cuts. Benefiting from a tailored
branching strategy which enforces pairwise constraints, we reduce the
complexity of the problems arising in the children nodes. For the upper bound,
instead, we present a local search procedure that exploits the solution of the
SDP relaxation solved at each node. Computational results show that the
proposed algorithm globally solves, for the first time, real-world instances of
size 10 times larger than those solved by state-of-the-art exact methods.
Automated Search for Block Cipher Differentials: A GPU-Accelerated Branch-and-Bound Algorithm
Differential cryptanalysis of block ciphers requires the identification of differential characteristics with high probability. For block ciphers with large block sizes and number of rounds, identifying these characteristics is computationally intensive. The branch-and-bound algorithm was proposed by Matsui to automate this task. Since then, numerous improvements were made to the branch-and-bound algorithm by bounding the number of active s-boxes, incorporating a meet-in-the-middle approach, and adapting it to various block cipher architectures. Although mixed-integer linear programming (MILP) has been widely used to evaluate the differential resistance of block ciphers, MILP is still inefficient for clustering singular differential characteristics to obtain differentials (also known as the differential effect). The branch-and-bound method is still better suited for the task of trail clustering. However, it requires enhancements before being feasible for block ciphers with large block sizes, especially for a large number of rounds. Motivated by the need for a more efficient branch-and-bound algorithm to search for block cipher differentials, we propose a GPU-accelerated branch-and-bound algorithm. The proposed approach substantially increases the performance of the differential cluster search. We were able to derive a branch enumeration and evaluation kernel that is 5.95 times faster than its CPU counterpart. To showcase its practicality, the proposed algorithm is applied on TRIFLE-BC, a 128-bit block cipher. By incorporating a meet-in-the-middle approach with the proposed GPU kernel, we were able to improve the search efficiency (on 20 rounds of TRIFLE-BC) by approximately 58 times as compared to the CPU-based approach. Differentials consisting of up to 50 million individual characteristics can be constructed for 20 rounds of TRIFLE, leading to slight improvements to the overall differential probabilities. 
Even for a larger number of rounds (43 rounds), the proposed algorithm is still able to construct large clusters of over 500 thousand characteristics. This result demonstrates the practicality of the proposed algorithm in constructing large differentials even for a 128-bit block cipher, which could be used to improve cryptanalytic findings against other block ciphers in the future.