LIPIcs, Volume 251, ITCS 2023, Complete Volume
ℓ_p-Regression in the Arbitrary Partition Model of Communication
We consider the randomized communication complexity of the distributed
ℓ_p-regression problem in the coordinator model, for . In this
problem, there is a coordinator and servers. The -th server receives
and and the coordinator would like to find a -approximate
solution to . Here
for convenience. This model, where the data is
additively shared across servers, is commonly referred to as the arbitrary
partition model.
We obtain significantly improved bounds for this problem. For p = 2, i.e.,
least squares regression, we give the first optimal bound of
bits.
For , we obtain an upper bound. Notably, for sufficiently large,
our leading order term only depends linearly on rather than
quadratically. We also show communication lower bounds of for and for .
Our bounds considerably improve previous bounds due to (Woodruff et al.,
COLT, 2013) and (Vempala et al., SODA, 2020).
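To make the arbitrary partition model concrete, the following is a minimal sketch of the naive baseline for the least squares case: every server ships its entire share, and the coordinator adds the shares and solves the summed problem exactly. This is not the protocol from the paper, whose point is to communicate far less. The helper names `add_shares` and `least_squares` and the toy shares are assumptions made for illustration.

```python
from fractions import Fraction

def add_shares(shares):
    """Entrywise sum of equally shaped matrices given as lists of rows."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*shares)]

def least_squares(A, b):
    """Solve the normal equations A^T A x = A^T b exactly.

    Assumes A has full column rank, so the system has a unique solution.
    """
    d = len(A[0])
    # Augmented matrix [A^T A | A^T b] with exact rational entries.
    M = [[Fraction(sum(r[i] * r[j] for r in A)) for j in range(d)]
         + [Fraction(sum(r[i] * y for r, y in zip(A, b)))]
         for i in range(d)]
    # Gauss-Jordan elimination.
    for c in range(d):
        p = next(r for r in range(c, d) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(d):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [a - f * e for a, e in zip(M[r], M[c])]
    return [M[i][d] for i in range(d)]

# Two hypothetical servers; neither share alone resembles the true data.
A_shares = [[[2, 1], [0, -1], [1, 0]],
            [[-1, -1], [0, 2], [0, 1]]]
b_shares = [[3, 0, 1], [-2, 2, 2]]

A = add_shares(A_shares)                      # the matrix the coordinator cares about
b = [sum(vals) for vals in zip(*b_shares)]    # the summed right-hand side
x = least_squares(A, b)
```

The baseline sends every entry of every share, which is the cost the communication bounds in the abstract are measured against.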
LIPIcs, Volume 261, ICALP 2023, Complete Volume
Can You Solve Closest String Faster than Exhaustive Search?
We study the fundamental problem of finding the best string to represent a
given set, in the form of the Closest String problem: Given a set of strings, find the string minimizing the radius of the
smallest Hamming ball around that encloses all the strings in . In
this paper, we investigate whether the Closest String problem admits algorithms
that are faster than the trivial exhaustive search algorithm. We obtain the
following results for the two natural versions of the problem:
In the continuous Closest String problem, the goal is to find the
solution string anywhere in . For binary strings, the
exhaustive search algorithm runs in time and we prove that it
cannot be improved to time , for any , unless the Strong Exponential Time Hypothesis fails.
In the discrete Closest String problem, is required to be in
the input set . While this problem is clearly in polynomial time, its
fine-grained complexity has been pinpointed to be quadratic time whenever the dimension is . We complement
this known hardness result with new algorithms, proving essentially that
whenever falls out of this hard range, the discrete Closest String problem
can be solved faster than exhaustive search. In the small- regime, our
algorithm is based on a novel application of the inclusion-exclusion principle.
Interestingly, all of our results apply (and some are even stronger) to the
natural dual of the Closest String problem, called the Remotest String problem,
where the task is to find a string maximizing the Hamming distance to all the
strings in
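The exhaustive search baselines that the abstract compares against can be sketched as follows. This is only the trivial search, not the paper's faster algorithms; the function names and the three-string example are assumptions made for illustration.

```python
from itertools import product

def hamming(a, b):
    """Number of positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def radius(center, strings):
    """Radius of the smallest Hamming ball around `center` enclosing all strings."""
    return max(hamming(center, s) for s in strings)

def discrete_closest_string(strings):
    """Discrete version: the center must be one of the input strings.

    Tries every input string against every other, hence quadratically many
    distance computations.
    """
    return min(strings, key=lambda c: radius(c, strings))

def continuous_closest_string(strings, alphabet="01"):
    """Continuous version: the center may be any string of the same length.

    Enumerates all len(alphabet) ** d candidates, so it is exponential in d.
    """
    d = len(strings[0])
    candidates = ("".join(t) for t in product(alphabet, repeat=d))
    return min(candidates, key=lambda c: radius(c, strings))

strings = ["110", "101", "011"]
best_discrete = discrete_closest_string(strings)      # "110", radius 2
best_continuous = continuous_closest_string(strings)  # "111", radius 1
```

In this example the continuous optimum "111" is not an input string, which shows why the two versions of the problem can have different answers.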
Minimizing Hitting Time between Disparate Groups with Shortcut Edges
Structural bias or segregation of networks refers to situations where two or
more disparate groups are present in the network, so that the groups are highly
connected internally, but loosely connected to each other. In many cases it is
of interest to increase the connectivity of disparate groups so as to, e.g.,
minimize social friction, or expose individuals to diverse viewpoints. A
commonly-used mechanism for increasing the network connectivity is to add edge
shortcuts between pairs of nodes. In many applications of interest, edge
shortcuts typically translate to recommendations, e.g., what video to watch, or
what news article to read next. The problem of reducing structural bias or
segregation via edge shortcuts has recently been studied in the literature, and
random walks have been an essential tool for modeling navigation and
connectivity in the underlying networks. Existing methods, however, either do
not offer approximation guarantees, or engineer the objective so that it
satisfies certain desirable properties that simplify the optimization task. In
this paper we address the problem of adding a given number of shortcut edges in
the network so as to directly minimize the average hitting time and the maximum
hitting time between two disparate groups. Our algorithm for minimizing average
hitting time is a greedy bicriteria algorithm that relies on supermodularity. In
contrast, maximum hitting time is not supermodular. Despite this, we develop an
approximation algorithm for that objective as well, by leveraging connections
with average hitting time and the asymmetric k-center problem.
Comment: To appear in KDD 202
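As a rough illustration of the average hitting time objective, the sketch below computes exact hitting times of a uniform random walk by solving the standard linear system h(u) = 1 + average of h over the neighbours of u (with h = 0 on the target group), and then adds shortcut edges greedily. It is a plain greedy heuristic under simplifying assumptions, not the paper's bicriteria algorithm: the graph is undirected, candidate shortcuts are restricted to source-target pairs, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def hitting_times(adj, targets):
    """Expected number of steps for a uniform random walk to reach `targets`.

    Assumes every node can reach the target group; otherwise the system is singular.
    """
    free = [u for u in adj if u not in targets]
    idx = {u: i for i, u in enumerate(free)}
    n = len(free)
    # Augmented matrix for (I - Q) h = 1, where Q is the walk restricted to free nodes.
    M = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for u in free:
        i = idx[u]
        M[i][i] += 1
        M[i][n] = Fraction(1)
        for v in adj[u]:
            if v in idx:
                M[i][idx[v]] -= Fraction(1, len(adj[u]))
    # Gauss-Jordan elimination with exact rationals.
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [a - f * e for a, e in zip(M[r], M[c])]
    h = {t: Fraction(0) for t in targets}
    h.update({u: M[idx[u]][n] for u in free})
    return h

def avg_hitting_time(adj, sources, targets):
    """Mean hitting time from the source group to the target group."""
    h = hitting_times(adj, targets)
    return sum(h[u] for u in sources) / len(sources)

def greedy_shortcuts(adj, sources, targets, k):
    """Add up to k source-target edges, each time taking the best single edge."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    added = []
    for _ in range(k):
        best = None
        for u, v in product(sources, targets):
            if v in adj[u]:
                continue
            adj[u].add(v); adj[v].add(u)          # tentatively add the edge
            score = avg_hitting_time(adj, sources, targets)
            adj[u].remove(v); adj[v].remove(u)    # undo
            if best is None or score < best[0]:
                best = (score, u, v)
        if best is None:
            break
        _, u, v = best
        adj[u].add(v); adj[v].add(u)
        added.append((u, v))
    return added, avg_hitting_time(adj, sources, targets)

# Path 0-1-2-3-4-5 with one group at each end.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 5} for i in range(6)}
before = avg_hitting_time(path, [0, 1], [4, 5])
added, after = greedy_shortcuts(path, [0, 1], [4, 5], k=1)
```

Each greedy step re-solves the linear system for every candidate edge, which is fine for a toy graph but would need a smarter update on large networks.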
LIPIcs, Volume 274, ESA 2023, Complete Volume
LIPIcs, Volume 258, SoCG 2023, Complete Volume
Energy Data Analytics for Smart Meter Data
The principal advantage of smart electricity meters is their ability to transfer digitized electricity consumption data to remote processing systems. The data collected by these devices make the realization of many novel use cases possible, providing benefits to electricity providers and customers alike. This book includes 14 research articles that explore and exploit the information content of smart meter data, and provides insights into the realization of new digital solutions and services that support the transition towards a sustainable energy system. This volume has been edited by Andreas Reinhardt, head of the Energy Informatics research group at Technische Universität Clausthal, Germany, and Lucas Pereira, research fellow at Técnico Lisboa, Portugal.