Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for exploratory
data analysis are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data. Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19)
Personalized Expert Recommendation: Models and Algorithms
Many large-scale information sharing systems, including social media systems, question-answering
sites, and rating and reviewing applications, have been growing rapidly, allowing
millions of human participants to generate and consume information on an unprecedented
scale. To manage the sheer growth of information generation, there comes the need to enable
personalization of information resources for users — to surface high-quality content
and feeds, to provide personally relevant suggestions, and so on. A fundamental task in
creating and supporting user-centered personalization systems is to build rich user profiles
to aid recommendation for a better user experience.
Therefore, in this dissertation research, we propose models and algorithms to facilitate
the creation of new crowd-powered personalized information sharing systems. Specifically,
we first give a principled framework to enable personalization of resources so that
information seekers can be matched with customized knowledgeable users based on their
past actions and contextual information; we then focus on creating rich
user models that allow accurate and comprehensive modeling of user profiles for long-tail
users, including discovering users’ known-for profiles, opinion bias and
geo-topic profiles. In particular, this dissertation research makes two unique contributions:
First, we introduce the problem of personalized expert recommendation and propose
the first principled framework for addressing this problem. To overcome the sparsity issue,
we investigate the use of user’s contextual information that can be exploited to build robust
models of personal expertise, study how spatial preference for personally-valuable expertise
varies across regions, across topics and based on different underlying social communities,
and integrate these different forms of preferences into a matrix factorization-based
personalized expert recommender.
Second, to support personalized expert recommendation, we focus on modeling
and inferring user profiles in online information sharing systems. To tap
the knowledge of the vast majority of users, we provide frameworks and algorithms to accurately
and comprehensively create user models by discovering users’ known-for profiles,
opinion bias and geo-topic profiles, each described briefly as follows:
—We develop a probabilistic model called Bayesian Contextual Poisson Factorization
to discover what users are known for by others. Our model considers as input a small fraction
of users whose known-for profiles are already known and the vast majority of users for
whom we have little (or no) information, learns the implicit relationships between users’
known-for profiles and their contextual signals, and finally predicts known-for profiles for
that majority of users.
—We explore user’s topic-sensitive opinion bias, propose a lightweight semi-supervised
system called “BiasWatch” to semi-automatically infer the opinion bias of long-tail users,
and demonstrate how users’ opinion bias can be exploited to recommend other users with
similar opinions in social networks.
— We study how a user’s topical profile varies geo-spatially and how we can model
a user’s geo-spatial known-for profile, the last step in this dissertation toward creating
rich user profiles. We propose a multi-layered Bayesian hierarchical user factorization to
overcome user heterogeneity, and an enhanced model that alleviates the sparsity issue by integrating
user contexts into the two-layered hierarchical user model to better represent
users’ geo-topic preferences as perceived by others.
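The matrix factorization-based recommender named in the first contribution can be illustrated with a minimal sketch. It assumes only a plain low-rank factorization fitted on observed entries. The dissertation's model additionally integrates contextual, spatial and community preferences, which are not reproduced here. The function name `factorize`, the toy user-by-expert matrix and all hyperparameters are hypothetical.

```python
import numpy as np

def factorize(ratings, mask, rank=2, lr=0.01, reg=0.01, epochs=5000, seed=0):
    """Fit ratings ~= U @ V.T on observed entries only, by full-batch gradient descent.

    Illustrative sketch; not the dissertation's context-aware model.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = 0.1 * rng.standard_normal((n_users, rank))
    V = 0.1 * rng.standard_normal((n_items, rank))
    for _ in range(epochs):
        err = mask * (ratings - U @ V.T)   # residual, zeroed on unobserved entries
        U_step = err @ V - reg * U         # negative gradient w.r.t. U (L2-regularized)
        V_step = err.T @ U - reg * V       # negative gradient w.r.t. V
        U += lr * U_step
        V += lr * V_step
    return U, V

# Hypothetical toy data: rows are information seekers, columns are candidate experts,
# 0 marks an unobserved preference.
ratings = np.array([[5, 3, 0, 1],
                    [4, 0, 0, 1],
                    [1, 1, 0, 5],
                    [1, 0, 0, 4],
                    [0, 1, 5, 4]], dtype=float)
mask = ratings > 0
U, V = factorize(ratings, mask)
pred = U @ V.T                             # predicted scores, including unobserved cells
rmse = np.sqrt(np.mean((ratings[mask] - pred[mask]) ** 2))
print("training RMSE on observed entries:", rmse)
```

The unobserved cells of `pred` are the scores one would rank to recommend experts. Sparsity, the issue the abstract highlights, shows up here as a `mask` that is mostly false.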
Fine-grained performance analysis of massive MTC networks with scheduling and data aggregation
Abstract. The Internet of Things (IoT) represents a substantial shift within wireless communication and constitutes a relevant topic of social, economic, and overall technical impact. It refers to resource-constrained devices communicating with little or no human intervention. However, communication among machines imposes several challenges compared to traditional human-type communication (HTC). Moreover, as the number of devices increases exponentially, different network management techniques and technologies are needed. Data aggregation is an efficient approach to handle the congestion introduced by a massive number of machine-type devices (MTDs). The aggregators not only collect data but also implement scheduling mechanisms to cope with scarce network resources.
This thesis provides an overview of the most common IoT applications and the network technologies to support them. We describe the most important challenges in machine-type communication (MTC). We use a stochastic geometry (SG) tool known as the meta distribution (MD) of the signal-to-interference ratio (SIR), which is the distribution of the conditional SIR distribution given the wireless nodes’ locations, to provide a fine-grained description of the per-link reliability. Specifically, we analyze the performance of two scheduling methods for data aggregation of MTC: random resource scheduling (RRS) and channel-aware resource scheduling (CRS). The results show the fraction of users in the network that achieves a target reliability, which is an important aspect to consider when designing wireless systems with stringent service requirements. Finally, the impact of increasing the aggregator density on the fraction of MTDs that communicate with a target reliability is investigated.
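To make the meta distribution concrete, the following sketch estimates it by Monte Carlo for a generic setting that is an assumption of this example, not the thesis's aggregation model: interferers form a Poisson point process in a disk, the desired link has fixed length, fading is Rayleigh, and all interferers are active. Under these assumptions the success probability conditioned on the interferer locations has the product form used in the code. The function names and all parameter values are hypothetical, and the RRS and CRS schemes are not modeled.

```python
import numpy as np

def conditional_success_probs(theta, lam=0.05, link_dist=1.0, alpha=4.0,
                              radius=30.0, n_real=2000, seed=0):
    """P(SIR > theta | interferer locations) for n_real independent realizations.

    Assumed setup (illustrative only): Poisson interferers of intensity lam in a
    disk of the given radius around the receiver, Rayleigh fading, path-loss
    exponent alpha, desired transmitter at distance link_dist.
    """
    rng = np.random.default_rng(seed)
    area = np.pi * radius ** 2
    probs = np.empty(n_real)
    for i in range(n_real):
        n = rng.poisson(lam * area)                   # number of interferers
        d = radius * np.sqrt(rng.random(n))           # distances of uniform points in a disk
        # With Rayleigh fading, each interferer contributes one factor.
        probs[i] = np.prod(1.0 / (1.0 + theta * (link_dist / d) ** alpha))
    return probs

def meta_distribution(probs, x):
    """Fraction of realizations whose conditional success probability exceeds x."""
    return float(np.mean(probs > x))

probs = conditional_success_probs(theta=1.0)
for x in (0.5, 0.9, 0.99):
    print(f"fraction of links with reliability above {x}: {meta_distribution(probs, x)}")
```

The mean of `probs` is the usual network-wide success probability. The meta distribution refines it by reporting what fraction of links reach a given reliability target, which is the quantity the abstract describes.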
Graph Theory and Networks in Biology
In this paper, we present a survey of the use of graph theoretical techniques
in Biology. In particular, we discuss recent work on identifying and modelling
the structure of bio-molecular networks, as well as the application of
centrality measures to interaction networks and research on the hierarchical
structure of such networks and network motifs. Work on the link between
structural network properties and dynamics is also described, with emphasis on
synchronization and disease propagation. Comment: 52 pages, 5 figures, Survey Paper
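As a small illustration of the centrality measures the survey mentions, the sketch below computes degree and closeness centrality on a toy undirected interaction network. The node labels and edges are hypothetical, and these two measures are chosen only as common examples; the abstract does not say which measures the survey covers.

```python
from collections import deque

def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def closeness_centrality(adj):
    """Reachable-node count divided by the sum of shortest-path distances (BFS)."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

# Hypothetical interaction network: node A acts as a hub.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
print(max(degree, key=degree.get))   # node with the highest degree centrality
```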
Economic Complexity Unfolded: Interpretable Model for the Productive Structure of Economies
Economic complexity reflects the amount of knowledge that is embedded in the
productive structure of an economy. It rests on the premise of hidden
capabilities - fundamental endowments underlying the productive structure. In
general, measuring the capabilities behind economic complexity directly is
difficult, and indirect measures have been suggested which exploit the fact
that the presence of the capabilities is expressed in a country's mix of
products. We complement these studies by introducing a probabilistic framework
which leverages Bayesian non-parametric techniques to extract the dominant
features behind the comparative advantage in exported products. Based on
economic evidence and trade data, we place a restricted Indian Buffet Process
on the distribution of countries' capability endowment, appealing to a culinary
metaphor to model the process of capability acquisition. The approach comes
with a unique level of interpretability, as it produces a concise and
economically plausible description of the instantiated capabilities.
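The culinary metaphor can be sketched by sampling from the standard one-parameter Indian Buffet Process. This is an assumption of the example: the paper uses a restricted variant whose details are not given in the abstract. Here customers stand for countries and dishes for capabilities. The function name `sample_ibp` and the parameter values are hypothetical.

```python
import numpy as np

def sample_ibp(n_customers, alpha, seed=0):
    """Draw a binary customer-by-dish matrix from the standard IBP.

    Illustrative only; the paper's restricted IBP is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    dish_counts = []                      # how many customers have taken each dish so far
    rows = []
    for i in range(1, n_customers + 1):
        # Customer i takes existing dish k with probability m_k / i.
        row = [bool(rng.random() < m / i) for m in dish_counts]
        for k, taken in enumerate(row):
            dish_counts[k] += int(taken)
        # Customer i then tries Poisson(alpha / i) new dishes.
        n_new = int(rng.poisson(alpha / i))
        dish_counts.extend([1] * n_new)
        row.extend([True] * n_new)
        rows.append(row)
    Z = np.zeros((n_customers, len(dish_counts)), dtype=int)
    for i, row in enumerate(rows):
        if row:
            Z[i, :len(row)] = row         # later dishes stay 0 for earlier customers
    return Z

Z = sample_ibp(n_customers=10, alpha=2.0)
print(Z.shape)                            # (countries, capabilities discovered)
```

Each row of `Z` marks which capabilities a country holds. The number of columns is not fixed in advance, which is the non-parametric aspect the abstract refers to.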