Streaming complexity of CSPs with randomly ordered constraints
We initiate a study of the streaming complexity of constraint satisfaction
problems (CSPs) when the constraints arrive in a random order. We show that
there exists a CSP for which random ordering
makes a provable difference. Whereas a approximation of
requires space with adversarial ordering,
we show that with random ordering of constraints there exists a
-approximation algorithm that only needs space. We also give
new algorithms for in variants of the adversarial ordering
setting. Specifically, we give a two-pass space
-approximation algorithm for general graphs and a single-pass
space -approximation algorithm for bounded degree
graphs.
On the negative side, we prove that CSPs where the satisfying assignments of
the constraints support a one-wise independent distribution require
-space for any non-trivial approximation, even when the
constraints are randomly ordered. This was previously known only for
adversarially ordered constraints. Extending the results to randomly ordered
constraints requires switching the hard instances from a union of random
matchings to simple Erdős-Rényi random (hyper)graphs and extending tools that
can perform Fourier analysis on such instances.
The only CSP to have been considered previously with random ordering is
where the ordering is not known to change the
approximability. Specifically it is known to be as hard to approximate with
random ordering as with adversarial ordering, for space
algorithms. Our results show a richer variety of possibilities and motivate
further study of CSPs with randomly ordered constraints.
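Since the named CSP and the exact space bounds are elided above, the following only illustrates the structural fact random-order streaming algorithms exploit: after a uniform shuffle, any prefix of the constraint stream is an unbiased sample of the whole instance. This is a minimal, hypothetical sketch (the toy instance and the brute-force search are our own, not the paper's algorithm):

```python
import random

def prefix_optimum(constraints, assignments, prefix_len, seed=0):
    """Best-satisfiable fraction measured on a short prefix of a randomly
    ordered constraint stream.  Under random ordering the prefix is a
    uniform sample of all constraints, so this estimate concentrates
    around the true optimum; under adversarial ordering the prefix can be
    arbitrarily unrepresentative."""
    rng = random.Random(seed)
    stream = list(constraints)
    rng.shuffle(stream)                 # simulate random arrival order
    prefix = stream[:prefix_len]
    # Brute force over a tiny assignment space (viable for toy instances only).
    return max(sum(c(a) for c in prefix) / len(prefix) for a in assignments)
```

A real streaming algorithm would of course not store the whole assignment space; the point is only that a short random-order prefix is statistically representative.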
Gamma-based clustering via ordered means with application to gene-expression analysis
Discrete mixture models provide a well-known basis for effective clustering
algorithms, although technical challenges have limited their scope. In the
context of gene-expression data analysis, a model is presented that mixes over
a finite catalog of structures, each one representing equality and inequality
constraints among latent expected values. Computations depend on the
probability that independent gamma-distributed variables attain each of their
possible orderings. Each ordering event is equivalent to an event in
independent negative-binomial random variables, and this finding guides a
dynamic-programming calculation. The structuring of mixture-model components
according to constraints among latent means leads to strict concavity of the
mixture log likelihood. In addition to its beneficial numerical properties, the
clustering method shows promising results in an empirical study.
Comment: Published at http://dx.doi.org/10.1214/10-AOS805 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
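The key computation above, P(X_1 < X_2 < ... < X_k) for independent gamma variables, is done exactly in the paper via an equivalent event in negative-binomial variables and dynamic programming. As a sanity check only, a Monte-Carlo estimate (the shapes and sample size below are arbitrary choices, not from the paper):

```python
import random

def ordering_probability(shapes, n_samples=20000, seed=0):
    """Monte-Carlo estimate of P(X_1 < X_2 < ... < X_k) for independent
    X_i ~ Gamma(shape_i, scale=1).  With identical shapes, exchangeability
    gives exactly 1/k!, which makes a convenient correctness check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        xs = [rng.gammavariate(a, 1.0) for a in shapes]
        if all(xs[i] < xs[i + 1] for i in range(len(xs) - 1)):
            hits += 1
    return hits / n_samples
```

For three iid Gamma(2, 1) variables the estimate should sit near 1/3! ≈ 0.167.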
On the Approximability of Digraph Ordering
Given an n-vertex digraph D = (V, A), the Max-k-Ordering problem is to compute
a labeling ℓ : V → {1, ..., k} maximizing the number of forward edges, i.e.
edges (u, v) such that ℓ(u) < ℓ(v). For different values of k, this
reduces to Maximum Acyclic Subgraph (k=n), and Max-Dicut (k=2). This work
studies the approximability of Max-k-Ordering and its generalizations,
motivated by their applications to job scheduling with soft precedence
constraints. We give an LP rounding based 2-approximation algorithm for
Max-k-Ordering for any k ∈ {2, ..., n}, improving on the known
2k/(k-1)-approximation obtained via random assignment. The tightness of this
rounding is shown by proving that for any k ∈ {2, ..., n} and constant
, Max-k-Ordering has an LP integrality gap of 2 -
for rounds of the
Sherali-Adams hierarchy.
A further generalization of Max-k-Ordering is the restricted maximum acyclic
subgraph problem or RMAS, where each vertex v has a finite set of allowable
labels . We prove an LP rounding based
approximation for it, improving on the
approximation recently given by Grandoni et al.
(Information Processing Letters, Vol. 115(2), Pages 182-185, 2015). In fact,
our approximation algorithm also works for a general version where the
objective counts the edges which go forward by at least a positive offset
specific to each edge.
The minimization formulation of digraph ordering is DAG edge deletion or
DED(k), which requires deleting the minimum number of edges from an n-vertex
directed acyclic graph (DAG) to remove all paths of length k. We show that
both the LP relaxation and a local-ratio approach for DED(k) yield a
k-approximation for any .
Comment: 21 pages, Conference version to appear in ESA 201
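The 2k/(k-1) random-assignment baseline mentioned above comes from a one-line calculation: with independent uniform labels in {1, ..., k}, an edge (u, v) goes forward with probability C(k,2)/k² = (k-1)/(2k). A small simulation of that baseline (the graph and trial count are arbitrary illustrative choices):

```python
import random

def random_labeling_forward_fraction(edges, n_vertices, k, trials=4000, seed=1):
    """Average fraction of forward edges (label(u) < label(v)) over many
    independent uniform k-labelings.  Each edge is forward with
    probability (k-1)/(2k), which is the calculation behind the 2k/(k-1)
    random-assignment approximation for Max-k-Ordering."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        lab = [rng.randrange(k) for _ in range(n_vertices)]
        total += sum(lab[u] < lab[v] for u, v in edges) / len(edges)
    return total / trials
```

For k = 3 the expected forward fraction is (3-1)/(2·3) = 1/3, regardless of the digraph.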
It's Good to Be First: Order Bias in Reading and Citing NBER Working Papers
When choices are made from ordered lists, individuals can exhibit biases toward selecting certain options as a result of the ordering. We examine this phenomenon in the context of consumer response to the ordering of economics papers in an e-mail announcement issued by the NBER. We show that despite the effectively random list placement, papers listed first each week are about 30% more likely to be viewed, downloaded, and subsequently cited. We suggest that a model of “skimming” behavior, where individuals focus on the first few papers in the list due to time constraints, would be most consistent with our findings.
An Approximately Optimal Algorithm for Scheduling Phasor Data Transmissions in Smart Grid Networks
In this paper, we devise a scheduling algorithm for ordering transmission of
synchrophasor data from the substation to the control center in as short a time
frame as possible, within the realtime hierarchical communications
infrastructure in the electric grid. The problem is cast in the framework of
the classic job scheduling with precedence constraints. The optimization setup
comprises the number of phasor measurement units (PMUs) to be installed on the
grid, a weight associated with each PMU, processing time at the control center
for the PMUs, and precedence constraints between the PMUs. The solution to the
PMU placement problem yields the optimum number of PMUs to be installed on the
grid, while the processing times are picked uniformly at random from a
predefined set. The weight associated with each PMU and the precedence
constraints are both assumed known. The scheduling problem is provably NP-hard,
so we resort to approximation algorithms which provide solutions that are
suboptimal yet possessing polynomial time complexity. A lower bound on the
optimal schedule is derived using branch and bound techniques, and its
performance evaluated using standard IEEE test bus systems. The scheduling
policy is power grid-centric, since it takes into account the electrical
properties of the network under consideration.
Comment: 8 pages, published in IEEE Transactions on Smart Grid, October 201
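The abstract casts PMU data transfer as weighted job scheduling under precedence constraints. As an illustration of that framework only (not the paper's branch-and-bound method), here is a standard list-scheduling heuristic that always runs the released job with the largest weight-to-processing-time (Smith) ratio; the instance encoding is our own:

```python
import heapq

def list_schedule(weights, proc, preds):
    """Single-machine list schedule for weighted completion time with
    precedence constraints: repeatedly run the released job (all
    predecessors done) with the largest weight/processing-time ratio.
    Returns the job order and the total weighted completion time."""
    n = len(weights)
    indeg = [len(preds[j]) for j in range(n)]
    succs = [[] for _ in range(n)]
    for j in range(n):
        for p in preds[j]:
            succs[p].append(j)
    # Max-heap on Smith ratio, implemented with negated keys.
    ready = [(-weights[j] / proc[j], j) for j in range(n) if indeg[j] == 0]
    heapq.heapify(ready)
    t, objective, order = 0, 0, []
    while ready:
        _, j = heapq.heappop(ready)
        t += proc[j]
        objective += weights[j] * t
        order.append(j)
        for s in succs[j]:
            indeg[s] -= 1
            if indeg[s] == 0:
                heapq.heappush(ready, (-weights[s] / proc[s], s))
    return order, objective
```

In the PMU setting, a job's weight would reflect the importance of that PMU's data and its predecessors the communication hierarchy.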
Exploiting Spatial Code Proximity and Order for Improved Source Code Retrieval for Bug Localization
Practically all Information Retrieval (IR) based approaches developed to date for automatic bug localization are based on the bag-of-words assumption that ignores any positional and ordering relationships between the terms in a query. In this paper we argue that bug reports are ill-served by this assumption since such reports frequently contain various types of structural information whose terms must obey certain positional and ordering constraints. It therefore stands to reason that the quality of retrieval for bug localization would improve if these constraints could be taken into account when searching for the most relevant files. In this paper, we demonstrate that such is indeed the case. We show how the well-known Markov Random Field (MRF) based retrieval framework can be used for taking into account the term-term proximity and ordering relationships in a query vis-a-vis the same relationships in the files of a source-code library to greatly improve the quality of retrieval of the most relevant source files. We have carried out our experimental evaluations on popular large software projects using over 4,000 bug reports. The results we present demonstrate unequivocally that the new proposed approach is far superior to the widely used bag-of-words based approaches.
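A toy illustration of the proximity-and-order idea: score a file by exact query-term matches plus query bigrams that appear in order within a small window of the file. The weights, window size, and scoring form below are hypothetical simplifications; real MRF retrieval uses smoothed language-model potentials rather than raw counts:

```python
def mrf_score(query_terms, doc_terms, w_unigram=0.8, w_bigram=0.2, window=4):
    """Toy sequential-dependence score: unigram term matches plus query
    bigrams occurring *in order* within `window` positions of the
    document.  Documents that preserve query-term order and proximity
    score higher than bag-of-words-equivalent documents."""
    unigram = sum(doc_terms.count(t) for t in query_terms)
    bigram = 0
    for a, b in zip(query_terms, query_terms[1:]):
        for i, t in enumerate(doc_terms):
            if t == a and b in doc_terms[i + 1:i + window]:
                bigram += 1
    return w_unigram * unigram + w_bigram * bigram
```

Under pure bag-of-words the two documents in the test below tie; the ordered-bigram term breaks the tie in favor of the one matching the query's term order.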
Improved Parameterized Algorithms for Constraint Satisfaction
For many constraint satisfaction problems, the algorithm which chooses a
random assignment achieves the best possible approximation ratio. For instance,
a simple random assignment for Max-E3-Sat achieves a 7/8-approximation, and
for every ε > 0 there is no polynomial-time (7/8+ε)-approximation
unless P=NP. Another example is the Permutation CSP of bounded arity.
Given the expected fraction ρ of the constraints satisfied by a random
assignment (i.e. permutation), there is no (ρ+ε)-approximation
algorithm for any ε > 0, assuming the Unique Games Conjecture (UGC).
In this work, we consider the following parameterization of constraint
satisfaction problems. Given a set of constraints of constant arity, can we
satisfy at least constraints, where ρ is the expected fraction
of constraints satisfied by a random assignment? Constraint Satisfaction
Problems above Average have been posed in different forms in the literature
[Niedermeier 2006; Mahajan, Raman, Sikdar 2009]. We present a faster parameterized
algorithm for deciding whether equations can be simultaneously
satisfied over . As a consequence, we obtain -variable
bikernels for Boolean CSPs of arity for every fixed , and for
permutation CSPs of arity 3. This implies linear bikernels for many problems
under the "above average" parameterization, such as Max--Sat,
Set-Splitting, Betweenness and Max Acyclic Subgraph. As a result,
all the parameterized problems we consider in this paper admit -time
algorithms.
We also obtain non-trivial hybrid algorithms for every Max -CSP: for every
instance , we can either approximate beyond the random assignment
threshold in polynomial time, or we can find an optimal solution to in
subexponential time.
Comment: A preliminary version of this paper has been accepted for IPEC 201
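The 7/8 baseline for Max-E3-Sat cited above can be checked exactly by brute force: a clause on three distinct variables is falsified by exactly 1 of its 8 local assignments, so a uniformly random assignment satisfies 7/8 of the clauses in expectation. A small sketch (the clause encoding as (variable, negated) pairs is our own choice):

```python
from itertools import product

def average_satisfied_fraction(clauses, n_vars):
    """Exact expected fraction of clauses satisfied by a uniformly random
    assignment, computed by enumerating all 2^n assignments.  Each clause
    is a list of (var_index, negated) literals; a literal holds when the
    variable's value differs from its negation flag.  For E3-SAT the
    result is exactly 7/8 by linearity of expectation."""
    total = 0.0
    for assign in product([False, True], repeat=n_vars):
        sat = sum(any(assign[v] != neg for v, neg in cl) for cl in clauses)
        total += sat / len(clauses)
    return total / 2 ** n_vars
```

The "above average" parameterization then asks how far beyond this guaranteed fraction one can provably go.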
Stochastic Ordering under Conditional Modelling of Extreme Values: Drug-Induced Liver Injury
Drug-induced liver injury (DILI) is a major public health issue and of
serious concern for the pharmaceutical industry. Early detection of signs of a
drug's potential for DILI is vital for pharmaceutical companies' evaluation of
new drugs. A combination of extreme values of liver-specific variables indicates
potential DILI (Hy's Law). We estimate the probability of severe DILI using the
Heffernan and Tawn (2004) conditional dependence model which arises naturally
in applications where a multidimensional random variable is extreme in at least
one component. We extend the current model by including the assumption of
stochastically ordered survival curves for different doses in a Phase 3 study.
Comment: 24 pages, 5 figures
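The quantity being modelled parametrically above is a conditional tail probability of the form P(Y > v | X > u), with the multivariate variable extreme in at least one component. As a purely empirical counterpart (no model, illustration only; all thresholds below are hypothetical), a naive plug-in estimate from paired observations:

```python
def cond_exceedance_prob(pairs, u, v):
    """Empirical estimate of P(Y > v | X > u) from paired observations
    (x, y): restrict to the X-tail, then count Y-exceedances.  The
    Heffernan-Tawn approach models this dependence parametrically so it
    can extrapolate beyond the observed tail; this estimator cannot."""
    tail = [(x, y) for x, y in pairs if x > u]
    if not tail:
        return 0.0
    return sum(y > v for _, y in tail) / len(tail)
```

With severe-DILI thresholds on two liver variables, this is the raw quantity whose model-based, extrapolated analogue the paper estimates.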