Probabilistic Fair Clustering
In clustering problems, a central decision-maker is given a complete metric
graph over vertices and must provide a clustering of vertices that minimizes
some objective function. In fair clustering problems, vertices are endowed with
a color (e.g., membership in a group), and the features of a valid clustering
might also include the representation of colors in that clustering. Prior work
in fair clustering assumes complete knowledge of group membership. In this
paper, we generalize prior work by assuming imperfect knowledge of group
membership through probabilistic assignments. We present clustering algorithms
in this more general setting with approximation ratio guarantees. We also
address the problem of "metric membership", where different groups have a
notion of order and distance. Experiments are conducted using our proposed
algorithms as well as baselines to validate our approach and also surface
nuanced concerns when group membership is not known deterministically.
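To illustrate what probabilistic group membership means for a clustering, here is a minimal Python sketch that computes the expected fraction of each color in each cluster. The abstract does not state the exact fairness constraint, so the lower/upper bound check is an assumption, and the names `expected_color_fractions` and `is_fair_in_expectation` are hypothetical.

```python
from collections import defaultdict


def expected_color_fractions(assignment, color_probs, colors):
    """Expected share of each color per cluster.

    assignment  -- cluster id for each vertex
    color_probs -- for each vertex, a dict mapping color -> membership probability
    colors      -- all colors to report (missing entries count as probability 0)
    """
    mass = defaultdict(lambda: {c: 0.0 for c in colors})
    size = defaultdict(int)
    for cluster, probs in zip(assignment, color_probs):
        size[cluster] += 1
        for c in colors:
            mass[cluster][c] += probs.get(c, 0.0)
    return {k: {c: mass[k][c] / size[k] for c in colors} for k in size}


def is_fair_in_expectation(fractions, lower, upper):
    """Assumed constraint: every expected color share lies in [lower, upper]."""
    return all(
        lower <= share <= upper
        for per_cluster in fractions.values()
        for share in per_cluster.values()
    )


# Illustrative data only: four vertices, two clusters, two colors.
probs = [
    {"red": 0.9, "blue": 0.1},
    {"red": 0.3, "blue": 0.7},
    {"red": 0.5, "blue": 0.5},
    {"red": 0.5, "blue": 0.5},
]
fractions = expected_color_fractions([0, 0, 1, 1], probs, ["red", "blue"])
```

With deterministic membership each probability is 0 or 1, and the same function reduces to ordinary color counting.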
Visualizing probabilistic models: Intensive Principal Component Analysis
Unsupervised learning makes manifest the underlying structure of data without
curated training and specific problem definitions. However, the inference of
relationships between data points is frustrated by the 'curse of
dimensionality' in high dimensions. Inspired by replica theory from statistical
mechanics, we consider replicas of the system to tune the dimensionality and
take the limit as the number of replicas goes to zero. The result is the
intensive embedding, which is not only isometric (preserving local distances)
but allows global structure to be more transparently visualized. We develop the
Intensive Principal Component Analysis (InPCA) and demonstrate clear
improvements in visualizations of the Ising model of magnetic spins, a neural
network, and the dark energy cold dark matter (ΛCDM) model as applied
to the Cosmic Microwave Background.
Comment: 6 pages, 5 figures
iFair: Learning Individually Fair Data Representations for Algorithmic Decision Making
People are rated and ranked, towards algorithmic decision making in an
increasing number of applications, typically based on machine learning.
Research on how to incorporate fairness into such tasks has prevalently pursued
the paradigm of group fairness: giving adequate success rates to specifically
protected groups. In contrast, the alternative paradigm of individual fairness
has received relatively little attention, and this paper advances this less
explored direction. The paper introduces a method for probabilistically mapping
user records into a low-rank representation that reconciles individual fairness
and the utility of classifiers and rankings in downstream applications. Our
notion of individual fairness requires that users who are similar in all
task-relevant attributes such as job qualification, and disregarding all
potentially discriminating attributes such as gender, should have similar
outcomes. We demonstrate the versatility of our method by applying it to
classification and learning-to-rank tasks on a variety of real-world datasets.
Our experiments show substantial improvements over the best prior work for this
setting.
Comment: Accepted at ICDE 2019. Please cite the ICDE 2019 proceedings version
Analyzing Energy-efficiency and Route-selection of Multi-level Hierarchal Routing Protocols in WSNs
The development of Wireless Sensor Networks (WSNs) in
recent years has brought extremely small, low-cost sensors that
possess sensing, signal processing and wireless communication capabilities.
These sensors can be expended at a much lower cost and are capable of detecting
conditions such as temperature, sound, security or any other system. A good
protocol design should scale well in both energy-heterogeneous and
energy-homogeneous environments, meet the demands of different application
scenarios and guarantee reliability. On this basis, we compare six
protocols designed for different scenarios, each presenting its own scheme for
energy minimization, clustering and route selection to achieve more
effective communication. This research aims to give insight into
which of the protocols under consideration suits which application, and
can serve as a guideline for the design of a more robust and efficient protocol.
MATLAB simulations are performed to analyze and compare the performance of
LEACH, multi-level hierarchal LEACH and multihop LEACH.
Comment: NGWMN with 7th IEEE International Conference on Broadband and
Wireless Computing, Communication and Applications (BWCCA 2012), Victoria,
Canada, 2012
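For readers unfamiliar with LEACH, the following Python sketch shows the cluster-head election threshold from the original LEACH protocol, T = p / (1 - p * (r mod 1/p)), applied to nodes that have not yet served as cluster head in the current cycle. It is not taken from this abstract (which reports MATLAB simulations); the function names and the `eligible` set are illustrative assumptions.

```python
import random


def leach_threshold(p, round_number):
    """Election threshold for eligible nodes in the given round.

    p            -- desired fraction of cluster heads per round (e.g. 0.1)
    round_number -- current round r; the cycle length is 1/p rounds
    """
    period = round(1 / p)
    return p / (1 - p * (round_number % period))


def elect_cluster_heads(node_ids, eligible, p, round_number, rng):
    """Each eligible node becomes a cluster head if its random draw is below the threshold."""
    threshold = leach_threshold(p, round_number)
    return [n for n in node_ids if n in eligible and rng.random() < threshold]


# Illustrative run: 20 nodes, all still eligible, first round of a cycle.
rng = random.Random(0)
nodes = list(range(20))
heads = elect_cluster_heads(nodes, set(nodes), p=0.1, round_number=0, rng=rng)
```

The threshold grows over the cycle and reaches 1 in its last round, so every node that has not yet served is elected before the cycle restarts.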
Numeric Input Relations for Relational Learning with Applications to Community Structure Analysis
Most work in the area of statistical relational learning (SRL) is focussed on
discrete data, even though a few approaches for hybrid SRL models have been
proposed that combine numerical and discrete variables. In this paper we
distinguish numerical random variables for which a probability distribution is
defined by the model from numerical input variables that are only used for
conditioning the distribution of discrete response variables. We show how
numerical input relations can very easily be used in the Relational Bayesian
Network framework, and that existing inference and learning methods need only
minor adjustments to be applied in this generalized setting. The resulting
framework provides natural relational extensions of classical probabilistic
models for categorical data. We demonstrate the usefulness of RBN models with
numeric input relations by several examples.
In particular, we use the augmented RBN framework to define probabilistic
models for multi-relational (social) networks in which the probability of a
link between two nodes depends on numeric latent feature vectors associated
with the nodes. A generic learning procedure can be used to obtain a
maximum-likelihood fit of model parameters and latent feature values for a
variety of models that can be expressed in the high-level RBN representation.
Specifically, we propose a model that allows us to interpret learned latent
feature values as community centrality degrees by which we can identify nodes
that are central for one community, that are hubs between communities, or that
are isolated nodes. In a multi-relational setting, the model also provides a
characterization of how different relations are associated with each community.
MUSE: Modularizing Unsupervised Sense Embeddings
This paper proposes to address the word sense ambiguity issue in an
unsupervised manner, where word sense representations are learned along a word
sense selection mechanism given contexts. Prior work focused on designing a
single model to deliver both mechanisms, and thus suffered from either
coarse-grained representation learning or inefficient sense selection. The
proposed modular approach, MUSE, implements flexible modules to optimize
distinct mechanisms, achieving the first purely sense-level representation
learning system with linear-time sense selection. We leverage reinforcement
learning to enable joint training on the proposed modules, and introduce
various exploration techniques on sense selection for better robustness. The
experiments on benchmark data show that the proposed approach achieves the
state-of-the-art performance on synonym selection as well as on contextual word
similarities in terms of MaxSimC.
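As a rough illustration of the MaxSimC measure mentioned above, the sketch below follows its usual definition from the multi-sense embedding literature: pick the most probable sense of each word given its context, then take the cosine similarity of those two sense vectors. The abstract does not define the measure, so this reading, the function names and the toy vectors are assumptions, not the MUSE implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def max_sim_c(senses_a, probs_a, senses_b, probs_b):
    """Similarity between the most probable sense of each word.

    senses_* -- list of sense vectors for the word
    probs_*  -- probability of each sense given the word's context
    """
    best_a = max(range(len(probs_a)), key=lambda i: probs_a[i])
    best_b = max(range(len(probs_b)), key=lambda i: probs_b[i])
    return cosine(senses_a[best_a], senses_b[best_b])


# Toy example: two senses per word, two-dimensional vectors.
score = max_sim_c([[1, 0], [0, 1]], [0.2, 0.8], [[0, 2], [1, 0]], [0.9, 0.1])
```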