
    Prioritized Metric Structures and Embedding

    Metric data structures (distance oracles, distance labeling schemes, routing schemes) and low-distortion embeddings provide a powerful algorithmic methodology, which has been successfully applied to approximation algorithms \cite{llr}, online algorithms \cite{BBMN11}, distributed algorithms \cite{KKMPT12}, and computing sparsifiers \cite{ST04}. However, this methodology appears to have a limitation: the worst-case performance inherently depends on the cardinality of the metric, and one cannot specify in advance which vertices/points should enjoy better service (i.e., stretch/distortion, label size/dimension) than the worst-case guarantee provides. In this paper we alleviate this limitation by devising a suite of {\em prioritized} metric data structures and embeddings. We show that given a priority ranking $(x_1,x_2,\ldots,x_n)$ of the graph vertices (respectively, metric points), one can devise a metric data structure (respectively, embedding) in which the stretch (resp., distortion) incurred by any pair containing a vertex $x_j$ will depend on the rank $j$ of the vertex. We also show that other important parameters, such as the label size and (in some sense) the dimension, may depend only on $j$. In some of our metric data structures (resp., embeddings) we achieve both prioritized stretch (resp., distortion) and label size (resp., dimension) {\em simultaneously}. The worst-case performance of our metric data structures and embeddings is typically asymptotically no worse than that of their non-prioritized counterparts.
    Comment: To appear at STOC 201
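
    To make the notion of a prioritized guarantee concrete, here is a toy distance labeling sketch (ours, not the paper's construction): each point $x_j$ stores exact distances to all higher-priority points, so any pair can be resolved from its lower-priority member's label alone. This gives distortion 1 with prioritized label size $\alpha(j)=j-1$ entries; the paper's schemes achieve far smaller (typically logarithmic) dependence on $j$.

```python
# Toy prioritized distance labeling (illustration only, not the paper's scheme).
# Point x_j stores exact distances to the higher-priority points x_1..x_{j-1},
# so the label size grows with the priority rank j while the stretch stays 1.

def build_labels(points, dist):
    """points: list ordered by priority (x_1 first); dist(u, v): exact metric."""
    labels = {}
    for j, x in enumerate(points):
        # Label of x_j: distances to all higher-priority points.
        labels[x] = {y: dist(x, y) for y in points[:j]}
    return labels

def query(labels, priority, u, v):
    """Recover d(u, v) from the lower-priority point's label."""
    lo, hi = (u, v) if priority[u] > priority[v] else (v, u)
    return 0.0 if u == v else labels[lo][hi]

# Example on points of a line metric, prioritized left to right.
points = [0.0, 3.0, 7.0, 12.0]
priority = {p: i for i, p in enumerate(points)}
labels = build_labels(points, lambda a, b: abs(a - b))
assert query(labels, priority, 7.0, 3.0) == 4.0
```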

    Labelings vs. Embeddings: On Distributed Representations of Distances

    We investigate for which metric spaces the performance of distance labeling and of $\ell_\infty$-embeddings differ, and how significant this difference can be. Recall that a distance labeling is a distributed representation of distances in a metric space $(X,d)$, where each point $x\in X$ is assigned a succinct label, such that the distance between any two points $x,y \in X$ can be approximated given only their labels. A highly structured special case is an embedding into $\ell_\infty$, where each point $x\in X$ is assigned a vector $f(x)$ such that $\|f(x)-f(y)\|_\infty$ is approximately $d(x,y)$. The performance of a distance labeling or an $\ell_\infty$-embedding is measured via its distortion and its label size/dimension. We also study the analogous question for the prioritized versions of these two measures. Here, a priority order $\pi=(x_1,\dots,x_n)$ of the point set $X$ is given, and higher-priority points should have shorter labels. Formally, a distance labeling has prioritized label size $\alpha(\cdot)$ if every $x_j$ has label size at most $\alpha(j)$. Similarly, an embedding $f: X \to \ell_\infty$ has prioritized dimension $\alpha(\cdot)$ if $f(x_j)$ is non-zero only in the first $\alpha(j)$ coordinates. In addition, we compare these prioritized measures to their classical (worst-case) versions. We answer these questions in several scenarios, uncovering a surprisingly diverse range of behaviors. First, in some cases labelings and embeddings have very similar worst-case performance, but in other cases there is a huge disparity. However, in the prioritized setting, we most often find a strict separation between the performance of labelings and embeddings. And finally, when comparing the classical and prioritized settings, we find that the worst-case bound for label size often ``translates'' to a prioritized one, but we also find a surprising exception to this rule.
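
    The $\ell_\infty$ special case is easy to instantiate: the classical Fréchet embedding $f(x) = (d(x,x_1),\dots,d(x,x_n))$ maps any finite metric isometrically into $\ell_\infty^n$, since the triangle inequality gives $\max_i |d(x,x_i)-d(y,x_i)| = d(x,y)$. A minimal sketch (dimension $n$, distortion 1; the paper's trade-offs concern shrinking the dimension):

```python
import numpy as np

# Frechet embedding: f(x) = (d(x, x_1), ..., d(x, x_n)) embeds any finite
# metric space isometrically into l_inf^n. The coordinate for x_i = y
# witnesses |d(x, y) - d(y, y)| = d(x, y); the triangle inequality gives
# the matching upper bound.

def frechet_embedding(D):
    """D: (n, n) matrix of pairwise distances. Row i is the embedding f(x_i)."""
    return np.asarray(D, dtype=float)  # f(x_i) is simply row i of D

# Example: a 4-point metric (shortest-path distances of a small weighted graph).
D = np.array([[0, 2, 5, 4],
              [2, 0, 3, 2],
              [5, 3, 0, 3],
              [4, 2, 3, 0]], dtype=float)
F = frechet_embedding(D)
linf = np.max(np.abs(F[:, None, :] - F[None, :, :]), axis=2)  # l_inf distances
assert np.allclose(linf, D)  # isometric: distortion exactly 1
```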

    Applications of Nonlinear Optimization

    We apply an interior point algorithm to two nonlinear optimization problems and achieve improved results. We also devise an approximate convex functional alternative for use in one of the problems and estimate its accuracy. The first problem is maximum variance unfolding in machine learning. The traditional method to solve this problem is to convert it to a semi-definite optimization problem by defining a kernel matrix. We obtain better unfolding and higher speeds with the interior point algorithm on the original non-convex problem for data with fewer than 10,000 points. The second problem is multi-objective dose optimization for intensity-modulated radiotherapy, whose goals are to achieve high radiation dose on tumors while sparing normal tissues. Due to tumor motion and patient set-up errors, a robust optimization against motion uncertainties is required to deliver a clinically acceptable treatment plan. The traditional method, irradiating an enlargement of the tumor region, is very conservative and can lead to high radiation dose on sensitive structures. We use a new robust optimization model within the framework of goal programming that consists of multiple optimization steps based on prescription priorities. One metric is defined for each structure of interest. A final robustness optimization step then minimizes the variance of all the goal metrics with respect to the motion probability space, and also pushes the mean values of these metrics toward a desired value. We show similarly high dose coverage on example tumors with reduced dose on sensitive structures. One clinically important metric for a radiation dose distribution, which describes tumor control probability or normal tissue complication probability, is $D_x$, the minimum dose value on the hottest $x\%$ of a structure. It is not mathematically well-behaved, which impedes its use in optimization. We approximate $D_x$ with a linear function of two generalized equivalent uniform dose metrics, also known as $l_p$ norms, requiring that the approximation is concave so that its maximization becomes a convex problem. Results with cross validation on a sampling of radiation therapy plans show that the error of this approximation is less than 1 Gy for the most used range of $x$ values, 80 to 95.
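
    A minimal sketch of the quantities involved (the weights and exponents below are illustrative placeholders, not the fitted values from the thesis): $D_x$ is read off a descending sort of voxel doses, the generalized equivalent uniform dose is the power mean $\mathrm{gEUD}_p(d) = (\frac{1}{n}\sum_i d_i^p)^{1/p}$, and the approximation has the form $D_x \approx w_1\,\mathrm{gEUD}_{p_1}(d) + w_2\,\mathrm{gEUD}_{p_2}(d)$.

```python
import numpy as np

def D_x(doses, x):
    """Minimum dose on the hottest x% of voxels: sort descending and take the
    dose at the x-percentile boundary. Piecewise constant in the doses, hence
    poorly behaved inside an optimizer."""
    hottest = np.sort(doses)[::-1]
    k = max(1, int(np.ceil(len(doses) * x / 100.0)))
    return hottest[k - 1]

def geud(doses, p):
    """Generalized equivalent uniform dose: a power-mean / l_p-style norm,
    smooth and (for p >= 1) convex in the dose vector."""
    return np.mean(np.asarray(doses, dtype=float) ** p) ** (1.0 / p)

def d_x_approx(doses, w1, p1, w2, p2):
    """Linear combination of two gEUD terms approximating D_x. These weights
    and exponents are hypothetical; the thesis fits them per x under a
    concavity constraint so that maximization stays a convex problem."""
    return w1 * geud(doses, p1) + w2 * geud(doses, p2)

doses = np.random.default_rng(0).uniform(60.0, 80.0, size=10_000)  # doses in Gy
# w2 < 0 makes the convex gEUD_8 term enter concavely, keeping the sum concave.
print(D_x(doses, 90), d_x_approx(doses, w1=1.6, p1=1.0, w2=-0.6, p2=8.0))
```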

    Enhancing drug and cell line representations via contrastive learning for improved anti-cancer drug prioritization

    Due to cancer's complex nature and variable response to therapy, precision oncology informed by omics sequence analysis has become the current standard of care. However, the amount of data produced for each patient makes it difficult to quickly identify the best treatment regimen. Moreover, limited data availability has hindered computational methods' ability to learn patterns associated with effective drug-cell line pairs. In this work, we propose the use of contrastive learning to improve learned drug and cell line representations by preserving relationship structures associated with drug mechanism of action and cell line cancer types. In addition to achieving enhanced performance relative to a state-of-the-art method, we find that classifiers using our learned representations exhibit a more balanced reliance on drug- and cell line-derived features when making predictions. This facilitates more personalized drug prioritizations that are informed by signals related to drug resistance.
    Comment: 60 pages, 4 figures, 4 tables, 11 supplementary tables, 1 supplementary note, submitted to Nature Communication
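
    The abstract does not spell out the exact objective, but the described idea, pulling together embeddings that share a label (drug mechanism of action, or cell line cancer type) while pushing apart the rest, matches a standard supervised contrastive loss. A generic sketch under that assumption (function and variable names are ours, not the paper's):

```python
import numpy as np

def sup_contrastive_loss(embeddings, labels, tau=0.1):
    """Supervised contrastive loss (generic sketch, forward pass only):
    embeddings sharing a label (e.g. drug mechanism of action, or cell line
    cancer type) are attracted; all other pairs are repelled."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / tau                          # cosine similarity / temperature
    n = len(labels)
    loss, eye = 0.0, np.eye(n, dtype=bool)
    for i in range(n):
        pos = (labels == labels[i]) & ~eye[i]    # same-label positives
        if not pos.any():
            continue
        log_denom = np.log(np.exp(sim[i][~eye[i]]).sum())  # all non-self pairs
        loss += -np.mean(sim[i][pos] - log_denom)          # -log p(positive)
    return loss / n

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))                   # e.g. 8 drug embeddings
moa = np.array([0, 0, 1, 1, 2, 2, 3, 3])         # mechanism-of-action labels
print(sup_contrastive_loss(emb, moa))
```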

    Stochastic Attraction-Repulsion Embedding for Large Scale Image Localization

    This paper tackles the problem of large-scale image-based localization (IBL), where the spatial location of a query image is determined by finding the most similar reference images in a large database. A critical task in solving this problem is to learn a discriminative image representation that captures the information relevant for localization. We propose a novel representation learning method with higher location-discriminating power. It provides the following contributions: 1) we represent a place (location) as a set of exemplar images depicting the same landmarks, and aim to maximize similarities among intra-place images while minimizing similarities among inter-place images; 2) we model a similarity measure as a probability distribution on $L_2$-metric distances between intra-place and inter-place image representations; 3) we propose a new Stochastic Attraction and Repulsion Embedding (SARE) loss function that minimizes the KL divergence between the learned and the actual probability distributions; 4) we give theoretical comparisons between SARE, triplet ranking, and contrastive losses, offering insights into why SARE is better by analyzing gradients. Our SARE loss is easy to implement and pluggable into any CNN. Experiments show that our proposed method improves localization performance on standard benchmarks by a large margin. Demonstrating the broad applicability of our method, we obtained third place out of 209 teams in the 2018 Google Landmark Retrieval Challenge. Our code and model are available at https://github.com/Liumouliu/deepIBL.
    Comment: ICC
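
    As described, SARE converts $L_2$ distances into matching probabilities via a softmax over negative squared distances; with the "actual" distribution putting all its mass on the intra-place (positive) image, the KL divergence reduces to a cross-entropy. A minimal sketch for one query, one positive, and several negatives (variable names are ours):

```python
import numpy as np

def sare_loss(q, pos, negs):
    """Stochastic Attraction-Repulsion Embedding loss (sketch).
    Candidate match probabilities are a softmax over negative squared L2
    distances to the query; the target distribution is one-hot on the
    positive, so KL(target || learned) reduces to -log p(positive)."""
    cands = np.vstack([pos, negs])                # positive stacked first
    logits = -np.sum((cands - q) ** 2, axis=1)    # negative squared L2 distances
    # Numerically stable log-softmax, then cross-entropy on the positive.
    m = logits.max()
    log_p = logits - m - np.log(np.exp(logits - m).sum())
    return -log_p[0]

rng = np.random.default_rng(0)
q = rng.normal(size=32)                           # query image embedding
pos = q + 0.1 * rng.normal(size=32)               # same-place image, nearby
negs = rng.normal(size=(5, 32))                   # different-place images
print(sare_loss(q, pos, negs))                    # small when pos is closest
```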