Tropical Principal Component Analysis and its Application to Phylogenetics
Principal component analysis is a widely-used method for the dimensionality
reduction of a given data set in a high-dimensional Euclidean space. Here we
define and analyze two analogues of principal component analysis in the setting
of tropical geometry. In one approach, we study the Stiefel tropical linear
space of fixed dimension closest to the data points in the tropical projective
torus; in the other approach, we consider the tropical polytope with a fixed
number of vertices closest to the data points. We then give approximative
algorithms for both approaches and apply them to phylogenetics, testing the
methods on simulated phylogenetic data and on an empirical dataset of
Apicomplexa genomes. Comment: 28 pages
Joint Coding and Scheduling Optimization in Wireless Systems with Varying Delay Sensitivities
Throughput and per-packet delay can present strong trade-offs that are
important in the cases of delay-sensitive applications. We investigate such
trade-offs using a random linear network coding scheme for one or more
receivers in single hop wireless packet erasure broadcast channels. We capture
the delay sensitivities across different types of network applications using a
class of delay metrics based on the norms of packet arrival times. With these
delay metrics, we establish a unified framework to characterize the rate and
delay requirements of applications and optimize system parameters. In the
single receiver case, we demonstrate the trade-off between average packet
delay, which we view as the inverse of throughput, and maximum ordered
inter-arrival delay for various system parameters. For a single broadcast
channel with multiple receivers having different delay constraints and feedback
delays, we jointly optimize the coding parameters and time-division scheduling
parameters at the transmitters. We formulate the optimization problem as a
Generalized Geometric Program (GGP). This approach allows the transmitters to
adjust adaptively the coding and scheduling parameters for efficient allocation
of network resources under varying delay constraints. In the case where the
receivers are served by multiple non-interfering wireless broadcast channels,
the same optimization problem is formulated as a Signomial Program, which is
NP-hard in general. We provide approximation methods using successive
formulation of geometric programs and show the convergence of approximations. Comment: 9 pages, 10 figures
Computing medians and means in Hadamard spaces
The geometric median as well as the Fréchet mean of points in an Hadamard
space are important in both theory and applications. Surprisingly, no
algorithms for their computation are hitherto known. To address this issue, we
use a split version of the proximal point algorithm for minimizing a sum of
convex functions and prove that this algorithm produces a sequence converging
to a minimizer of the objective function, which extends a recent result of D.
Bertsekas (2001) into Hadamard spaces. The method is quite robust and not only
does it yield algorithms for the median and the mean, but it also applies to
various other optimization problems. We moreover show that another algorithm
for computing the Fréchet mean can be derived from the law of large numbers due
to K.-T. Sturm (2002). In applications, computing medians and means is probably
most needed in tree space, which is an instance of an Hadamard space, invented
by Billera, Holmes, and Vogtmann (2001) as a tool for averaging phylogenetic
trees. It turns out, however, that it can also be used to model numerous other
tree-like structures. Since there now exists a polynomial-time algorithm for
computing geodesics in tree space due to M. Owen and S. Provan (2011), we
obtain efficient algorithms for computing medians and means, which can be
directly used in practice. Comment: Corrected version. Accepted in SIAM Journal on Optimization
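The split (cyclic) proximal point idea described in the abstract above can be illustrated in the simplest Hadamard space, ordinary Euclidean space. The sketch below is an assumption-laden illustration, not the authors' implementation: the function names, the step sizes 1/k, and the fixed iteration count are all hypothetical choices. For the median objective (a sum of distances to the data points), the proximal step for one term moves the current point along the geodesic toward that data point by at most the step size; in a general Hadamard space the straight line would be replaced by the geodesic.

```python
import math

def prox_distance(x, a, lam):
    """Proximal step for f(y) = d(y, a): move x toward a by at most lam."""
    d = math.dist(x, a)
    if d <= lam:
        return list(a)  # the step would overshoot, so land exactly on a
    t = lam / d
    return [xi + t * (ai - xi) for xi, ai in zip(x, a)]

def geometric_median(points, iters=2000):
    """Cyclic proximal point sketch for the geometric median (Euclidean case).

    Assumed step sizes lam_k = 1/k: their sum diverges while the sum of
    their squares converges.
    """
    x = list(points[0])
    for k in range(1, iters + 1):
        lam = 1.0 / k
        for a in points:  # one proximal step per term of the sum
            x = prox_distance(x, a, lam)
    return x

median = geometric_median([(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)])
```

For three collinear points the geometric median is the middle point, so the result here should lie close to (1, 0).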
Wireless Scheduling with Power Control
We consider the scheduling of arbitrary wireless links in the physical model
of interference to minimize the time for satisfying all requests. We study here
the combined problem of scheduling and power control, where we seek both an
assignment of power settings and a partition of the links so that each set
satisfies the signal-to-interference-plus-noise (SINR) constraints.
We give an algorithm that attains an approximation ratio of $O(\log n \cdot \log\log \Delta)$, where $n$ is the number of links and $\Delta$ is the ratio
between the longest and the shortest link length. Under the natural assumption
that lengths are represented in binary, this gives the first approximation
ratio that is polylogarithmic in the size of the input. The algorithm has the
desirable property of using an oblivious power assignment, where the power
assigned to a sender depends only on the length of the link. We give evidence
that this dependence on $\Delta$ is unavoidable, showing that any
reasonably-behaving oblivious power assignment results in a $\Omega(\log\log \Delta)$-approximation.
These results hold also for the (weighted) capacity problem of finding a
maximum (weighted) subset of links that can be scheduled in a single time slot.
In addition, we obtain improved approximation for a bidirectional variant of
the scheduling problem, give partial answers to questions about the utility of
graphs for modeling physical interference, and generalize the setting from the
standard 2-dimensional Euclidean plane to doubling metrics. Finally, we explore
the utility of graph models in capturing wireless interference. Comment: Revised full version
Convergence analysis of generalized iteratively reweighted least squares algorithms on convex function spaces
The computation of robust regression estimates often relies on minimization of a convex functional on a convex set. In this paper we discuss a general technique for iteratively computing the minimizers of a large class of convex functionals; the technique is closely related to majorization-minimization algorithms. Our approach is based on a quadratic approximation of the functional to be minimized and includes the iteratively reweighted least squares algorithm as a special case. We prove convergence on convex function spaces for general coercive and convex functionals F and derive geometric convergence in certain unconstrained settings. The algorithm is applied to TV-penalized quantile regression and is compared with a step-size-corrected Newton-Raphson algorithm. It is found that typically in the first steps the iteratively reweighted least squares algorithm performs significantly better, whereas the Newton-type method outpaces the former only after many iterations. Finally, in the setting of bivariate regression with unimodality constraints we illustrate how this algorithm allows one to use highly efficient algorithms for special quadratic programs in more complex settings. Keywords: regression analysis, monotone regression, quantile regression, shape constraints, L1 regression, nonparametric regression, total variation semi-norm, reweighted least squares, Fermat's problem, convex approximation, quadratic approximation, pool adjacent violators algorithm
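As a rough illustration of the special case named in the abstract above, the following sketch applies classical iteratively reweighted least squares to the simplest L1 problem: finding the location m that minimizes the sum of absolute deviations from a list of numbers. It is not the generalized algorithm of the paper; the function name, the guard `eps`, and the iteration count are hypothetical. Each step replaces |v - m| by a weighted quadratic with weight 1/|v - m| at the current m, then solves the resulting weighted least squares problem in closed form.

```python
def irls_l1_location(values, iters=100, eps=1e-12):
    """IRLS sketch for min over m of sum(|v - m|) in one dimension.

    eps is an assumed guard against division by zero when m hits a data point.
    """
    m = sum(values) / len(values)  # start from the least squares solution
    for _ in range(iters):
        # Quadratic majorizer of |v - m| at the current m has weight 1/|v - m|.
        weights = [1.0 / max(abs(v - m), eps) for v in values]
        # The weighted least squares minimizer is the weighted mean.
        m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return m

estimate = irls_l1_location([1.0, 2.0, 3.0, 4.0, 100.0])
```

Starting from the mean, the iterates drift toward the sample median, which is the L1 minimizer and is unaffected by the outlier at 100.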
A review of multi-instance learning assumptions
Multi-instance (MI) learning is a variant of inductive machine learning, where each learning example contains a bag of instances instead of a single feature vector. The term commonly refers to the supervised setting, where each bag is associated with a label. This type of representation is a natural fit for a number of real-world learning scenarios, including drug activity prediction and image classification, hence many MI learning algorithms have been proposed. Any MI learning method must relate instances to bag-level class labels, but many types of relationships between instances and class labels are possible. Although all early work in MI learning assumes a specific MI concept class known to be appropriate for a drug activity prediction domain, this ‘standard MI assumption’ is not guaranteed to hold in other domains. Much of the recent work in MI learning has concentrated on a relaxed view of the MI problem, where the standard MI assumption is dropped, and alternative assumptions are considered instead. However, it is often not clearly stated what particular assumption is used and how it relates to other assumptions that have been proposed. In this paper, we aim to clarify the use of alternative MI assumptions by reviewing the work done in this area.
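To make the relation between instance labels and bag labels concrete, here is a minimal sketch. It assumes the usual reading of the standard MI assumption, namely that a bag is positive exactly when at least one of its instances is positive; the abstract itself does not spell this out. The second function is a purely hypothetical relaxation, included only to show how an alternative assumption changes the bag label, and is not taken from the paper.

```python
def bag_label_standard(instance_labels):
    """Standard MI assumption (as assumed here): positive iff any instance is."""
    return any(instance_labels)

def bag_label_threshold(instance_labels, min_positive=2):
    """Hypothetical relaxation: positive iff at least min_positive instances are."""
    return sum(1 for label in instance_labels if label) >= min_positive

bag = [False, True, False]
standard = bag_label_standard(bag)    # one positive instance suffices
relaxed = bag_label_threshold(bag)    # needs at least two positive instances
```

The same bag receives different labels under the two rules, which is why stating the assumption in use matters when comparing MI methods.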