Optimal Grouping for Group Minimax Hypothesis Testing
Bayesian hypothesis testing and minimax hypothesis testing represent extreme
instances of detection in which the prior probabilities of the hypotheses are
either completely and precisely known, or are completely unknown. Group
minimax, also known as Gamma-minimax, is a robust intermediary between Bayesian
and minimax hypothesis testing that allows for coarse or partial advance
knowledge of the hypothesis priors by using information on sets in which the
prior lies. Existing work on group minimax, however, does not consider the
question of how to define the sets or groups of priors; it is assumed that the
groups are given. In this work, we propose a novel intermediate detection
scheme formulated through the quantization of the space of prior probabilities
that optimally determines groups and also representative priors within the
groups. We show that when viewed from a quantization perspective, group minimax
amounts to determining centroids with a minimax Bayes risk error divergence
distortion criterion: the appropriate Bregman divergence for this task.
Moreover, the optimal partitioning of the space of prior probabilities is a
Bregman Voronoi diagram. Together, the optimal grouping and representation
points are an epsilon-net with respect to Bayes risk error divergence, and
permit a rate-distortion type asymptotic analysis of detection performance with
the number of groups. Examples of detecting signals corrupted by additive white
Gaussian noise and of distinguishing exponentially-distributed signals are
presented.
Comment: 12 figures
Reliable Crowdsourcing for Multi-Class Labeling using Coding Theory
Crowdsourcing systems often have crowd workers that perform unreliable work
on the task they are assigned. In this paper, we propose the use of
error-control codes and decoding algorithms to design crowdsourcing systems for
reliable classification despite unreliable crowd workers. Coding-theory based
techniques also allow us to pose easy-to-answer binary questions to the crowd
workers. We consider three different crowdsourcing models: systems with
independent crowd workers, systems with peer-dependent reward schemes, and
systems where workers have common sources of information. For each of these
models, we analyze classification performance with the proposed coding-based
scheme. We develop an ordering principle for the quality of crowds and describe
how system performance changes with the quality of the crowd. We also show that
pairing among workers and diversification of the questions help in improving
system performance. We demonstrate the effectiveness of the proposed
coding-based scheme using both simulated data and real datasets from Amazon
Mechanical Turk, a crowdsourcing microtask platform. Results suggest that use
of good codes may improve the performance of the crowdsourcing task over
typical majority-voting approaches.
Comment: 20 pages, 11 figures, under revision, IEEE Journal of Selected Topics
in Signal Processing
A Coupon-Collector Model of Machine-Aided Discovery
Empirical studies of scientific discovery, so-called Eurekometrics, have
indicated that the output of exploration proceeds as a logistic growth curve.
Although logistic functions are prevalent in explaining population growth that
is resource-limited to a given carrying capacity, their derivations do not apply
to discovery processes. This paper develops a generative model for logistic
knowledge discovery using a novel extension of coupon collection, where
an explorer interested in discovering all unknown elements of a set is
supported by technology that can respond to queries. This discovery process is
parameterized by the novelty and quality of the set of discovered elements at
every time step, and randomness is demonstrated to improve performance.
Simulation results provide further intuition on the discovery process.
Comment: 5 pages, 9 figures, 2017 KDD Workshop on Data-Driven Discovery
Diffusive Molecular Communication with Nanomachine Mobility
This work presents a performance analysis for diffusive molecular
communication with mobile transmit and receive nanomachines. To begin with, the
optimal test is obtained for symbol detection at the receiver nanomachine.
Subsequently, closed-form expressions are derived for the probabilities of
detection and false alarm, the probability of error, and the capacity, also
accounting for aberrations such as multi-source interference, inter-symbol
interference, and counting errors. Simulation results are presented to
corroborate the theoretical results and to yield insights into the
performance of the system. Interestingly, it is shown that the performance of
mobile diffusive molecular communication can be significantly enhanced by
allocating a large fraction of the total available molecules for transmission
as the slot interval increases.
Comment: To be submitted to the 52nd Annual Conference on Information Sciences
and Systems (CISS)
Cognitive MIMO-RF/FSO Cooperative Relay Communication with Mobile Nodes and Imperfect Channel State Information
This work analyzes the performance of an underlay cognitive radio based
decode-and-forward mixed multiple-input multiple-output (MIMO) radio
frequency/free space optical (RF/FSO) cooperative relay system with multiple
mobile secondary and primary user nodes. The effect of imperfect channel state
information (CSI) arising from channel estimation error is also considered at
the secondary user transmitters (SU-TXs) and relay on the power control and
symbol detection processes, respectively. A unique aspect of this work is that
both fixed and proportional interference power constraints are employed to
limit the interference at the primary user receivers (PU-RXs). Analytical
results are derived to characterize the exact and asymptotic outage and bit
error probabilities of the above system under practical conditions of node
mobility and imperfect CSI, together with impairments of the optical channel,
such as path loss, atmospheric turbulence, and pointing errors, for orthogonal
space-time block coded transmission between each SU-TX and relay. Finally,
simulation results are presented to yield insights into system performance,
such as the benefits of a midamble versus a preamble for channel estimation.
Comment: revision submitted to IEEE Transactions on Cognitive Communications
and Networking
Multi-object Classification via Crowdsourcing with a Reject Option
Consider designing an effective crowdsourcing system for an M-ary
classification task. Crowd workers complete simple binary microtasks whose
results are aggregated to give the final result. We consider the novel scenario
where workers have a reject option so they may skip microtasks when they are
unable or choose not to respond. For example, in mismatched speech
transcription, workers who do not know the language may not be able to respond
to microtasks focused on phonological dimensions outside their categorical
perception. We present an aggregation approach using a weighted majority voting
rule, where each worker's response is assigned an optimized weight to maximize
the crowd's classification performance. We evaluate system performance in both
exact and asymptotic forms. Further, we consider the setting where there may be
a set of greedy workers that complete microtasks even when they are unable to
perform them reliably. We consider an oblivious and an expurgation strategy to
deal with greedy workers, developing an algorithm to adaptively switch between
the two based on the estimated fraction of greedy workers in the anonymous
crowd. Simulation results show improved performance compared with conventional
majority voting.
Comment: two column, 15 pages, 8 figures, submitted to IEEE Trans. Signal
Process.
Design and Performance Analysis of Dual and Multi-hop Diffusive Molecular Communication Systems
This work presents a comprehensive performance analysis of diffusion based
direct, dual-hop, and multi-hop molecular communication systems with Brownian
motion and drift in the presence of various distortions such as inter-symbol
interference (ISI), multi-source interference (MSI), and counting errors.
Optimal decision rules are derived employing the likelihood ratio tests (LRTs)
for symbol detection at each of the cooperative as well as the destination
nanomachines. Further, closed-form expressions are derived for the
probabilities of detection and false alarm at the individual cooperative and
destination nanomachines, as well as for the overall end-to-end probability of
error for source-destination communication. The results also characterize the
impact of detection performance of the intermediate cooperative nanomachine(s)
on the end-to-end performance of dual/multi-hop diffusive molecular
communication systems. In addition, capacity expressions are also derived for
direct, dual-hop, and multi-hop molecular communication scenarios. Simulation
results are presented to corroborate the theoretical results and to yield
insights into system performance.
Comment: in preparation
Flavor Pairing in Medieval European Cuisine: A Study in Cooking with Dirty Data
An important part of cooking with computers is using statistical methods to
create new, flavorful ingredient combinations. The flavor pairing hypothesis
states that culinary ingredients with common chemical flavor components combine
well to produce pleasant dishes. It has been recently shown that this design
principle is a basis for modern Western cuisine and is reversed for Asian
cuisine.
Such data-driven analysis compares the chemistry of ingredients to ingredient
sets found in recipes. However, analytics-based generation of novel flavor
profiles can only be as good as the underlying chemical and recipe data.
Incomplete, inaccurate, and irrelevant data may degrade flavor pairing
inferences. Chemical data on flavor compounds is incomplete due to the nature
of the experiments that must be conducted to obtain it. Recipe data may have
issues due to text parsing errors, imprecision in textual descriptions of
ingredients, and the fact that the same ingredient may be known by different
names in different recipes. Moreover, the process of matching ingredients in
chemical data and recipe data may be fraught with mistakes. Much of the
'dirtiness' of the data cannot be cleansed even with manual curation.
In this work, we collect a new data set of recipes from Medieval Europe
before the Columbian Exchange and investigate the flavor pairing hypothesis
historically. To investigate the role of data incompleteness and error as part
of this hypothesis testing, we use two separate chemical compound data sets
with different levels of cleanliness. Notably, the different data sets give
conflicting conclusions about the flavor pairing hypothesis in Medieval Europe.
As a contribution towards social science, we obtain inferences about the
evolution of culinary arts when many new ingredients are suddenly made
available.
Comment: IJCAI
How an Electrical Engineer Became an Artificial Intelligence Researcher, a Multiphase Active Contours Analysis
This essay examines how what is considered to be artificial intelligence (AI)
has changed over time and come to intersect with the expertise of the author.
Initially, AI developed on a separate trajectory, both topically and
institutionally, from pattern recognition, neural information processing,
decision and control systems, and allied topics by focusing on symbolic systems
within computer science departments rather than on continuous systems in
electrical engineering departments. The separate evolutions continued
throughout the author's lifetime, with some crossover in reinforcement learning
and graphical models, but were shocked into converging by the virality of deep
learning, thus making an electrical engineer into an AI researcher. Now that
this convergence has happened, opportunity exists to pursue an agenda that
combines learning and reasoning bridged by interpretable machine learning
models.
Toward a Comparative Cognitive History: Archimedes and D. H. J. Polymath
Is collective intelligence just individual intelligence writ large, or are
there fundamental differences? This position paper argues that a cognitive
history methodology can shed light on the nature of collective intelligence
and its differences from individual intelligence. To advance this proposed area
of research, a small case study on the structure of argument and proof is
presented. Quantitative metrics from network science are used to compare the
artifacts of deduction from two sources. The first is the work of Archimedes of
Syracuse, putatively an individual, and of other ancient Greek mathematicians.
The second is the work of the Polymath Project, a massively collaborative
mathematics project that used blog posts and comments to prove new results in
combinatorics.
Comment: Presented at Collective Intelligence conference, 2012
(arXiv:1204.2991)