Densest Diverse Subgraphs: How to Plan a Successful Cocktail Party with Diversity
Dense subgraph discovery methods are routinely used in a variety of
applications including the identification of a team of skilled individuals for
collaboration from a social network. However, when the network's node set is
associated with a sensitive attribute such as race, gender, religion, or
political opinion, the lack of diversity can lead to lawsuits.
In this work, we focus on the problem of finding a densest diverse subgraph
in a graph whose nodes have different attribute values/types that we refer to
as colors. We propose two novel formulations motivated by different realistic
scenarios. Our first formulation, called the densest diverse subgraph problem
(DDSP), guarantees that no color represents more than some fraction of the
nodes in the output subgraph, which generalizes the state-of-the-art due to
Anagnostopoulos et al. (CIKM 2020). By varying the fraction we can tune the
strength of the diversity constraint, interpolating from a diverse dense
subgraph in which all colors must be equally represented to an unconstrained
dense subgraph. We
design a scalable $\Omega(1/\sqrt{n})$-approximation algorithm, where $n$ is
the number of nodes. Our second formulation is motivated by the setting where
any specified color should not be overlooked. We propose the densest
at-least-$\vec{k}$-subgraph problem (Dal$\vec{k}$S), a novel generalization of
the classic Dal$k$S, where instead of a single value $k$, we have a vector
$\vec{k}$ of cardinality demands with one coordinate per color class. We
design a $1/3$-approximation algorithm using linear programming together with
an acceleration technique. Computational experiments using synthetic and
real-world datasets demonstrate that our proposed algorithms are effective in
extracting dense diverse clusters.
Comment: Accepted to KDD 2023
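As a purely illustrative aside (this is not the approximation algorithm from the paper), the sketch below shows one simple way to read the DDSP-style constraint: greedily peel low-degree nodes and keep the densest intermediate subgraph in which no color makes up more than a fraction alpha of the remaining nodes. The graph representation, the function name densest_diverse_peel, and the toy data are all hypothetical.

```python
from collections import Counter

def densest_diverse_peel(adj, colors, alpha):
    """Greedy peeling sketch: repeatedly drop a minimum-degree node and
    remember the densest intermediate subgraph in which no color exceeds
    an alpha fraction of the nodes. Illustration only; not the paper's
    approximation algorithm."""
    nodes = set(adj)
    degree = {v: len(adj[v]) for v in nodes}
    edges = sum(degree.values()) // 2
    best_density, best_set = 0.0, set()

    while nodes:
        counts = Counter(colors[v] for v in nodes)
        balanced = max(counts.values()) <= alpha * len(nodes)
        density = edges / len(nodes)          # average-degree style density
        if balanced and density > best_density:
            best_density, best_set = density, set(nodes)
        v = min(nodes, key=degree.get)        # peel a minimum-degree node
        nodes.remove(v)
        edges -= degree[v]
        for u in adj[v]:
            if u in nodes:
                degree[u] -= 1
    return best_set, best_density

# Toy example: a 4-clique of 'red' nodes attached to two 'blue' nodes.
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4}, 4: {3, 5}, 5: {4}}
colors = {0: 'red', 1: 'red', 2: 'red', 3: 'red', 4: 'blue', 5: 'blue'}
print(densest_diverse_peel(adj, colors, alpha=0.7))
```

With alpha close to 1 the constraint is essentially inactive and the peeling reduces to the usual densest-subgraph heuristic; as alpha shrinks toward 1/(number of colors), only near-balanced subgraphs remain feasible, which is the interpolation described in the abstract.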
Practical recommendations for gradient-based training of deep architectures
Learning algorithms related to artificial neural networks and in particular
for Deep Learning may seem to involve many bells and whistles, called
hyper-parameters. This chapter is meant as a practical guide with
recommendations for some of the most commonly used hyper-parameters, in
particular in the context of learning algorithms based on back-propagated
gradient and gradient-based optimization. It also discusses how to deal with
the fact that more interesting results can be obtained when allowing one to
adjust many hyper-parameters. Overall, it describes elements of the practice
used to successfully and efficiently train and debug large-scale and often deep
multi-layer neural networks. It closes with open questions about the training
difficulties observed with deeper architectures.
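To make the kind of hyper-parameters the chapter covers concrete, here is a minimal mini-batch SGD loop, assuming plain NumPy and a synthetic linear-regression task; the learning rate, mini-batch size, epoch count, and momentum values below are illustrative placeholders, not recommendations from the chapter.

```python
import numpy as np

# Hypothetical hyper-parameters of the kind discussed in the chapter.
learning_rate = 0.01     # step size for gradient descent
batch_size    = 32       # mini-batch size
num_epochs    = 50       # passes over the training set
momentum      = 0.9      # heavy-ball momentum coefficient

rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=1024)

w = np.zeros(10)
velocity = np.zeros(10)
for epoch in range(num_epochs):
    idx = rng.permutation(len(X))                         # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(batch)    # mini-batch MSE gradient
        velocity = momentum * velocity - learning_rate * grad
        w += velocity

print("final MSE:", float(np.mean((X @ w - y) ** 2)))
```

Even in this toy setting, the interaction the chapter highlights is visible: a larger learning rate may require a smaller momentum or more epochs, which is why such knobs are usually tuned jointly rather than one at a time.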
The path inference filter: model-based low-latency map matching of probe vehicle data
We consider the problem of reconstructing vehicle trajectories from sparse
sequences of GPS points, for which the sampling interval is between 10 seconds
and 2 minutes. We introduce a new class of algorithms, collectively called the
path inference filter (PIF), that maps GPS data onto the road network in real
time, for a variety of trade-offs and scenarios, and with high throughput.
Numerous prior approaches
in map-matching can be shown to be special cases of the path inference filter
presented in this article. We present an efficient procedure for automatically
training the filter on new data, with or without ground truth observations. The
framework is evaluated on a large San Francisco taxi dataset and is shown to
improve upon the current state of the art. The filter also provides insights
into the driving patterns of individual drivers. The path inference filter has been deployed
at an industrial scale inside the Mobile Millennium traffic information system,
and is used to map data from vehicle fleets in San Francisco, Sacramento,
Stockholm and Porto.
Comment: Preprint, 23 pages and 23 figures
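For orientation only (this is not the PIF itself), the sketch below illustrates the generic Viterbi-style decoding that map-matching methods of this kind build on: score candidate road segments per GPS point, score transitions between consecutive candidates, and recover the best segment sequence. The function viterbi_match, the scoring functions, and the toy data are hypothetical.

```python
def viterbi_match(candidates, emission, transition):
    """Generic Viterbi decoding over per-observation candidate states.
    candidates: list of lists of candidate road-segment ids per GPS point.
    emission(t, s): log-score of observing GPS point t from segment s.
    transition(s, s2): log-score of moving from segment s to segment s2.
    Returns the highest-scoring segment sequence. Sketch only, not the PIF."""
    # score[s] = best log-score of any path ending in state s at the current step
    score = {s: emission(0, s) for s in candidates[0]}
    back = []
    for t in range(1, len(candidates)):
        new_score, pointers = {}, {}
        for s2 in candidates[t]:
            prev, best = max(
                ((s, score[s] + transition(s, s2)) for s in candidates[t - 1]),
                key=lambda kv: kv[1],
            )
            new_score[s2] = best + emission(t, s2)
            pointers[s2] = prev
        score, back = new_score, back + [pointers]
    # Backtrack from the best final state.
    state = max(score, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return list(reversed(path))

# Toy example: three GPS points, two candidate segments each.
cands = [["A", "B"], ["A", "B"], ["B", "C"]]
dist = {(0, "A"): 5, (0, "B"): 40, (1, "A"): 10, (1, "B"): 15,
        (2, "B"): 8, (2, "C"): 30}
emission = lambda t, s: -dist[(t, s)] / 10.0          # closer segment = higher score
transition = lambda s, s2: 0.0 if s == s2 else -1.0   # penalize segment changes
print(viterbi_match(cands, emission, transition))
```

The trade-off mentioned in the abstract shows up here as the balance between the emission term (trusting the raw GPS position) and the transition term (preferring plausible, contiguous routes); low-frequency sampling makes the transition model do most of the work.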