4,350 research outputs found
Multiuser detection in a dynamic environment Part I: User identification and data detection
In random-access communication systems, the number of active users varies
with time and has considerable bearing on the receiver's performance. Thus,
techniques aimed at identifying not only the information transmitted, but also
that number, play a central role in those systems. One application of
these techniques is multiuser detection (MUD). In typical MUD
analyses, receivers are based on the assumption that the number of active users
is constant and known at the receiver, and coincides with the maximum number of
users entitled to access the system. This assumption is often overly
pessimistic, since many users might be inactive at any given time, and
detection under the assumption of a number of users larger than the real one
may impair performance.
The main goal of this paper is to introduce a general approach to the problem
of identifying active users and estimating their parameters and data in a
random-access system where users are continuously entering and leaving the
system. The tool whose use we advocate is Random-Set Theory: applying this, we
derive optimum receivers in an environment where the set of transmitters
comprises an unknown number of elements. In addition, we can derive
Bayesian-filter equations which describe the evolution with time of the a
posteriori probability density of the unknown user parameters, and use this
density to derive optimum detectors. In this paper we restrict ourselves to
interferer identification and data detection, while in a companion paper we
shall examine the more complex problem of estimating users' parameters.
Comment: To be published in IEEE Transactions on Information Theor
Music Recommendations in Hyperbolic Space: An Application of Empirical Bayes and Hierarchical Poincaré Embeddings
Matrix Factorization (MF) is a common method for generating recommendations,
where the proximity of entities like users or items in the embedded space
indicates their similarity to one another. Though almost all applications
implicitly use a Euclidean embedding space to represent two entity types,
recent work has suggested that a hyperbolic Poincaré ball may be better
suited to representing multiple entity types and, in particular, hierarchies.
We describe a novel method to embed a hierarchy of related music entities in
hyperbolic space. We also describe how a parametric empirical Bayes approach
can be used to estimate link reliability between entities in the hierarchy.
Applying these methods together to build personalized playlists for users in a
digital music service yielded a large and statistically significant increase in
performance during an A/B test, as compared to the Euclidean model.
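As a rough illustration of why proximity behaves differently in a Poincaré ball than in Euclidean space, the sketch below computes the standard Poincaré-ball geodesic distance between two points. It is not the embedding method of the paper above; the function name `poincare_distance` and the example points are assumptions made for illustration only.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball.

    Uses the standard formula
        d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    The name and signature are illustrative, not taken from the paper.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)


if __name__ == "__main__":
    # Two pairs with the same Euclidean separation (0.1), one near the
    # origin and one near the boundary of the ball.
    print(poincare_distance([0.0, 0.0], [0.1, 0.0]))
    print(poincare_distance([0.85, 0.0], [0.95, 0.0]))
```

The same Euclidean gap corresponds to a much larger hyperbolic distance near the boundary, which is the property that makes the ball a natural fit for tree-like hierarchies.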
Inferring Anomalies from Data using Bayesian Networks
Existing studies on data mining have largely focused on the design of measures and algorithms to identify outliers in large, high-dimensional categorical and numeric databases. However, little emphasis has been placed on the interestingness of the reported outliers. One way to ascertain the interestingness and usefulness of a reported outlier is to make use of domain knowledge. In this thesis, we present measures to discover outliers based on background knowledge, represented by a Bayesian network. Using causal relationships between attributes encoded in the Bayesian framework, we demonstrate that meaningful outliers, i.e., outliers which encode important or new information, are those which violate causal relationships encoded in the model. Depending upon the nature of the data, several approaches are proposed to identify and explain anomalies using Bayesian knowledge. Outliers are often identified as data points which are "rare", "isolated", or "far away from their nearest neighbors". We show that these characteristics may not be an accurate way of describing interesting outliers. Through a critical analysis of several existing outlier detection techniques, we show why there is a mismatch between outliers as entities described by these characteristics and "real" outliers as identified using the Bayesian approach. We show that the Bayesian approaches presented in this thesis have better accuracy in mining genuine outliers while keeping a low false positive rate as compared to traditional outlier detection techniques.
Bayesian stochastic blockmodeling
This chapter provides a self-contained introduction to the use of Bayesian
inference to extract large-scale modular structures from network data, based on
the stochastic blockmodel (SBM), as well as its degree-corrected and
overlapping generalizations. We focus on nonparametric formulations that allow
their inference in a manner that prevents overfitting, and enables model
selection. We discuss aspects of the choice of priors, in particular how to
avoid underfitting via increased Bayesian hierarchies, and we contrast the task
of sampling network partitions from the posterior distribution with finding the
single point estimate that maximizes it, while describing efficient algorithms
to perform either one. We also show how inferring the SBM can be used to
predict missing and spurious links, and shed light on the fundamental
limitations of the detectability of modular structures in networks.
Comment: 44 pages, 16 figures. Code is freely available as part of graph-tool
at https://graph-tool.skewed.de . See also the HOWTO at
https://graph-tool.skewed.de/static/doc/demos/inference/inference.htm
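For readers unfamiliar with the stochastic blockmodel, the following minimal sketch samples an undirected graph from the plain (non-degree-corrected) SBM, in which the probability of an edge depends only on the blocks of its two endpoints. It illustrates the generative model only, not the nonparametric Bayesian inference described in the chapter, and it does not use the graph-tool API; `sample_sbm`, `block_of`, and `edge_prob` are hypothetical names chosen for this example.

```python
import random


def sample_sbm(block_of, edge_prob, seed=0):
    """Sample an undirected simple graph from a plain stochastic blockmodel.

    block_of[i]     -- block label (0-based integer) of node i
    edge_prob[r][s] -- probability of an edge between a node in block r
                       and a node in block s (assumed symmetric)
    Returns a list of edges (i, j) with i < j.
    All names here are illustrative assumptions, not a library API.
    """
    rng = random.Random(seed)
    n = len(block_of)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = edge_prob[block_of[i]][block_of[j]]
            # Each node pair is an independent Bernoulli trial.
            if rng.random() < p:
                edges.append((i, j))
    return edges


if __name__ == "__main__":
    # Two blocks of three nodes: dense within blocks, sparse between them.
    blocks = [0, 0, 0, 1, 1, 1]
    probs = [[0.9, 0.1],
             [0.1, 0.9]]
    print(sample_sbm(blocks, probs, seed=42))
```

Inference reverses this process: given only the edges, it seeks the block labels (and, in the nonparametric setting, the number of blocks) that best explain them.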