
    You Are What You Eat: A Preference-Aware Inverse Optimization Approach

    A key challenge in the emerging field of precision nutrition entails providing diet recommendations that reflect both the (often unknown) dietary preferences of different patient groups and known dietary constraints specified by human experts. Motivated by this challenge, we develop a preference-aware constrained-inference approach in which the objective function of an optimization problem is not pre-specified and can differ across various segments. Among existing methods, clustering models from machine learning are not naturally suited for recovering the constrained optimization problems, whereas constrained inference models such as inverse optimization do not explicitly address non-homogeneity in given datasets. By harnessing the strengths of both clustering and inverse optimization techniques, we develop a novel approach that recovers the utility functions of a constrained optimization process across clusters while providing optimal diet recommendations as cluster representatives. Using a dataset of patients' daily food intakes, we show how our approach generalizes stand-alone clustering and inverse optimization approaches in terms of adherence to dietary guidelines and partitioning observations, respectively. The approach makes diet recommendations by incorporating both patient preferences and expert recommendations for healthier diets, leading to structural improvements in both patient partitioning and nutritional recommendations for each cluster. An appealing feature of our method is its ability to consider infeasible but informative observations for a given set of dietary constraints. The resulting recommendations correspond to a broader range of dietary options, even when they limit unhealthy choices

    Deep Constrained Dominant Sets for Person Re-Identification

    In this work, we propose an end-to-end constrained clustering scheme to tackle the person re-identification (re-id) problem. Deep neural networks (DNNs) have recently proven effective on the person re-identification task. In particular, rather than relying solely on probe-gallery similarity, diffusing the similarities among the gallery images in an end-to-end manner has proven effective in yielding a robust probe-gallery affinity. However, existing methods do not use the probe image as a constraint and are prone to noise propagation during the similarity diffusion process. To overcome this, we propose a scheme, called deep constrained dominant sets (DCDS), which treats the person-image retrieval problem as a constrained clustering optimization problem. Given a probe and gallery images, we reformulate the person re-id problem as finding a constrained cluster, where the probe image is taken as a constraint (seed) and each cluster corresponds to a set of images of the same person. By optimizing the constrained clustering in an end-to-end manner, we naturally leverage the contextual knowledge of a set of images corresponding to the given person-images. We further enhance the performance by integrating an auxiliary net alongside DCDS, which employs a multi-scale ResNet. To validate the effectiveness of our method, we present experiments on several benchmark datasets and show that the proposed method can outperform state-of-the-art methods

    Clustering Multiple Sclerosis Medication Sequence Data with Mixture Markov Chain Analysis with covariates using Multiple Simplex Constrained Optimization Routine (MSiCOR)

    Multiple sclerosis (MS) is an autoimmune disease of the central nervous system that causes neurodegeneration. While disease-modifying therapies (DMTs) reduce inflammatory disease activity and delay worsening disability in MS, treatment responses vary significantly across people with MS (pwMS). pwMS often receive serial monotherapies of DMTs. Here, we propose a novel method to cluster pwMS according to the sequence of DMT prescriptions and associated clinical features (covariates). This is achieved via a mixture Markov chain analysis with covariates, where the sequence of prescribed DMTs for each patient is modeled as a Markov chain. Given the computational challenges of maximizing the mixture likelihood on the constrained parameter space, we develop a pattern search-based global optimization technique that can optimize any objective function on a collection of simplexes and is shown to outperform other related global optimization techniques. In simulation experiments, the proposed method is shown to outperform the Expectation-Maximization (EM) algorithm-based method for clustering sequence data without covariates. Based on the analysis, we divided MS patients into 3 clusters: interferon-beta dominated, multi-DMTs, and natalizumab dominated. Further cluster-specific summaries of relevant covariates indicate patient differences among the clusters. This method may guide the DMT prescription sequence based on clinical features
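
To make the mixture Markov chain idea concrete, the following is a minimal sketch, not the authors' MSiCOR routine: it omits covariates and the pattern-search optimizer entirely and only shows how a sequence is scored against each mixture component once parameters are given. The function names, the three integer-coded states, and all probabilities are hypothetical values chosen for illustration. Each initial distribution and each transition-matrix row sums to one, which is the "collection of simplexes" the parameters live on.

```python
import math

def sequence_log_likelihood(seq, init, trans):
    """Log-probability of one state sequence under a first-order Markov chain."""
    ll = math.log(init[seq[0]])
    for a, b in zip(seq[:-1], seq[1:]):
        ll += math.log(trans[a][b])
    return ll

def mixture_responsibilities(seq, weights, inits, transs):
    """Posterior probability that a sequence came from each mixture component."""
    log_post = [
        math.log(w) + sequence_log_likelihood(seq, pi, P)
        for w, pi, P in zip(weights, inits, transs)
    ]
    top = max(log_post)  # subtract the max for numerical stability
    unnorm = [math.exp(lp - top) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical two-component mixture over three treatment states 0, 1, 2.
weights = [0.5, 0.5]
inits = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
transs = [
    [[0.90, 0.05, 0.05], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]],  # state 0 is sticky
    [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.05, 0.05, 0.90]],  # state 2 is sticky
]

resp = mixture_responsibilities([0, 0, 0, 1], weights, inits, transs)
cluster = resp.index(max(resp))  # hard assignment to the most likely component
```

A sequence that mostly stays in state 0 is assigned to the first component, since that component's chain makes state 0 persistent. Fitting the weights, initial distributions, and transition matrices is the hard part that the abstract's optimization routine addresses.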

    Constrained K-means with General Pairwise and Cardinality Constraints

    In this work, we study constrained clustering, where constraints are utilized to guide the clustering process. In existing works, two categories of constraints have been widely explored, namely pairwise and cardinality constraints. Pairwise constraints enforce the cluster labels of two instances to be the same (must-link constraints) or different (cannot-link constraints). Cardinality constraints encourage cluster sizes to satisfy a user-specified distribution. However, most existing constrained clustering models can only utilize one category of constraints at a time. In this paper, we integrate both categories into a unified clustering model, starting from the integer program formulation of standard K-means. As these two categories provide useful information at different levels, utilizing both of them is expected to allow for better clustering performance. However, the optimization is difficult due to the binary and quadratic constraints in the proposed unified formulation. To alleviate this difficulty, we utilize two techniques: the first equivalently replaces the binary constraints with the intersection of two continuous constraints; the second transforms the quadratic constraints into bi-linear constraints by introducing extra variables. We then derive an equivalent continuous reformulation with simple constraints, which can be efficiently solved by the Alternating Direction Method of Multipliers (ADMM) algorithm. Extensive experiments on both synthetic and real data demonstrate that: (1) when utilizing a single category of constraint, the proposed model is superior to or competitive with state-of-the-art constrained clustering models, and (2) when utilizing both categories of constraints jointly, the proposed model shows better performance than with a single category
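
The two constraint categories defined above can be illustrated with a small checker. This is a sketch under stated assumptions, not the paper's ADMM formulation: the function name `violated_constraints` is hypothetical, and cardinality is modeled here as hard lower and upper bounds per cluster, whereas the abstract describes cardinality constraints as encouraging sizes toward a user-specified distribution.

```python
from collections import Counter

def violated_constraints(labels, must_link, cannot_link, size_bounds):
    """Return the constraints that a given cluster labeling violates.

    labels      : list where labels[i] is the cluster index of instance i
    must_link   : pairs (i, j) that must share a cluster
    cannot_link : pairs (i, j) that must be in different clusters
    size_bounds : dict mapping cluster index -> (min_size, max_size)
                  (hard bounds are an assumption made for this sketch)
    """
    violations = []
    for i, j in must_link:
        if labels[i] != labels[j]:
            violations.append(("must-link", i, j))
    for i, j in cannot_link:
        if labels[i] == labels[j]:
            violations.append(("cannot-link", i, j))
    sizes = Counter(labels)
    for k, (lo, hi) in size_bounds.items():
        size = sizes.get(k, 0)
        if not lo <= size <= hi:
            violations.append(("cardinality", k, size))
    return violations

# Hypothetical labeling of five instances into two clusters.
labels = [0, 0, 1, 1, 1]
found = violated_constraints(
    labels,
    must_link=[(0, 1)],
    cannot_link=[(1, 2), (3, 4)],
    size_bounds={0: (2, 3), 1: (2, 2)},
)
```

Here the must-link pair and the first cannot-link pair are satisfied, while instances 3 and 4 share a cluster despite a cannot-link constraint and cluster 1 exceeds its size bound. A checker like this only evaluates a labeling; finding a labeling that minimizes the K-means objective subject to both categories is the optimization problem the abstract addresses.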

    Low rank methods for optimizing clustering

    Complex optimization models and problems in machine learning often have the majority of information in a low rank subspace. By careful exploitation of these low rank structures in clustering problems, we find new optimization approaches that reduce the memory and computational cost. We discuss two cases where this arises. First, we consider the NEO-K-Means (Non-Exhaustive, Overlapping K-Means) objective as a way to address overlapping and outliers in an integrated fashion. Optimizing this discrete objective is NP-hard, and even though there is a convex relaxation of the objective, straightforward convex optimization approaches are too expensive for large datasets. We utilize low rank structures in the solution matrix of the convex formulation and use a low-rank factorization of the solution matrix directly as a practical alternative. The resulting optimization problem is non-convex, but has a smaller number of solution variables, and can be locally optimized using an augmented Lagrangian method. In addition, we consider two fast multiplier methods to accelerate the convergence of the augmented Lagrangian scheme: a proximal method of multipliers and an alternating direction method of multipliers. For the proximal augmented Lagrangian, we show a convergence result for the non-convex case with bound-constrained subproblems. When the clustering performance is evaluated on real-world datasets, we show this technique is effective in finding the ground-truth clusters and cohesive overlapping communities in real-world networks. The second case is where the low-rank structure appears in the objective function. Inspired by low rank matrix completion techniques, we propose a low rank symmetric matrix completion scheme to approximate a kernel matrix. For the kernel k-means problem, we show empirically that the clustering performance with the approximation is comparable to the full kernel k-means
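
The computational benefit of working with a low-rank factor directly can be sketched as follows. This is an illustration of the general idea only, assuming a symmetric solution matrix of the form X = Y Yᵀ with Y of size n × r; the function names and the tiny example matrix are hypothetical and are not taken from the work above. Multiplying by X through its factor needs only the n·r entries of Y, rather than the n² entries of X.

```python
def lowrank_matvec(Y, v):
    """Compute (Y Y^T) v using only the n x r factor Y, in O(n * r) operations."""
    n, r = len(Y), len(Y[0])
    # t = Y^T v, a vector of length r
    t = [sum(Y[i][k] * v[i] for i in range(n)) for k in range(r)]
    # result = Y t, a vector of length n
    return [sum(Y[i][k] * t[k] for k in range(r)) for i in range(n)]

def dense_matvec(Y, v):
    """Reference version: form the full n x n matrix Y Y^T, then multiply."""
    n = len(Y)
    X = [[sum(a * b for a, b in zip(Y[i], Y[j])) for j in range(n)]
         for i in range(n)]
    return [sum(X[i][j] * v[j] for j in range(n)) for i in range(n)]

# Hypothetical rank-2 factor for a 3 x 3 matrix.
Y = [[1, 0], [0, 1], [1, 1]]
v = [1, 2, 3]
fast = lowrank_matvec(Y, v)
slow = dense_matvec(Y, v)
```

Both routes give the same vector, but the factored route never stores the n × n matrix. The trade-off mentioned in the abstract still applies: optimizing over Y instead of X makes the problem non-convex, so only local optimality can generally be expected.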