Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence
Data in the form of pairwise comparisons arises in many domains, including
preference elicitation, sporting competitions, and peer grading among others.
We consider parametric ordinal models for such pairwise comparison data
involving a latent vector that represents the
"qualities" of the items being compared; this class of models includes the
two most widely used parametric models--the Bradley-Terry-Luce (BTL) and the
Thurstone models. Working within a standard minimax framework, we provide tight
upper and lower bounds on the optimal error in estimating the quality score
vector under this class of models. The bounds depend on the topology of
the comparison graph induced by the subset of pairs being compared via its
Laplacian spectrum. Thus, in settings where the subset of pairs may be chosen,
our results provide principled guidelines for making this choice. Finally, we
compare these error rates to those under cardinal measurement models and show
that the error rates in the ordinal and cardinal settings have identical
scalings apart from constant pre-factors.
Comment: 39 pages, 5 figures. Significant extension of arXiv:1406.661
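As a rough illustration of the objects this abstract refers to, the sketch below evaluates the standard BTL win probability and builds the Laplacian of a comparison graph. The abstract gives no formulas, so the function names, the example scores, and the choice of one unweighted edge per compared pair are assumptions made for illustration only.

```python
import numpy as np

def btl_win_prob(w, i, j):
    """Standard BTL model: probability that item i beats item j given scores w."""
    return 1.0 / (1.0 + np.exp(-(w[i] - w[j])))

def comparison_laplacian(n_items, pairs):
    """Laplacian L = D - A of the comparison graph (assumed: one unit edge per pair)."""
    L = np.zeros((n_items, n_items))
    for i, j in pairs:
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    return L

# Hypothetical example: three items compared along a path 0 - 1 - 2.
w = np.array([0.5, 0.0, -0.5])
pairs = [(0, 1), (1, 2)]
spectrum = np.linalg.eigvalsh(comparison_laplacian(3, pairs))
```

The second-smallest eigenvalue of this spectrum measures how well connected the comparison graph is, which is one way to read the statement that the bounds depend on the graph topology.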
Ad Hoc Microphone Array Calibration: Euclidean Distance Matrix Completion Algorithm and Theoretical Guarantees
This paper addresses the problem of ad hoc microphone array calibration where
only partial information about the distances between microphones is available.
We construct a matrix consisting of the pairwise distances and propose to
estimate the missing entries based on a novel Euclidean distance matrix
completion algorithm that alternates between low-rank matrix completion and projection
onto the Euclidean distance space. This approach confines the recovered matrix
to the EDM cone at each iteration of the matrix completion algorithm. The
theoretical guarantees of the calibration performance are obtained considering
the random and locally structured missing entries as well as the measurement
noise on the known distances. This study elucidates the links between the
calibration error and the number of microphones along with the noise level and
the ratio of missing distances. Thorough experiments on real data recordings
and simulated setups are conducted to demonstrate these theoretical insights. A
significant improvement is achieved by the proposed Euclidean distance matrix
completion algorithm over the state-of-the-art techniques for ad hoc microphone
array calibration.
Comment: In Press, available online, August 1, 2014.
http://www.sciencedirect.com/science/article/pii/S0165168414003508, Signal
Processing, 201
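The alternation described in this abstract can be pictured with a simplified sketch. This is not the authors' algorithm: it assumes squared distances, a known embedding dimension `dim`, a classical-MDS-style step standing in for the projection onto the EDM cone, and a hypothetical boolean `mask` marking the measured entries.

```python
import numpy as np

def edm_from_points(X):
    """Squared Euclidean distance matrix of the rows of X."""
    G = X @ X.T
    g = np.diag(G)
    return g[:, None] + g[None, :] - 2 * G

def project_to_edm(D, dim):
    """Heuristic projection: double-center, keep the top `dim` nonnegative
    eigenpairs of the Gram matrix, then rebuild squared distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * J @ D @ J
    vals, vecs = np.linalg.eigh(G)
    vals = np.clip(vals, 0.0, None)
    idx = np.argsort(vals)[-dim:]
    X = vecs[:, idx] * np.sqrt(vals[idx])
    return edm_from_points(X)

def complete_edm(D_obs, mask, dim, n_iter=200):
    """Alternate between the EDM projection and re-imposing measured entries."""
    D = np.where(mask, D_obs, D_obs[mask].mean())  # fill gaps with the mean
    np.fill_diagonal(D, 0.0)
    for _ in range(n_iter):
        D = project_to_edm(D, dim)
        D = np.where(mask, D_obs, D)  # keep the known distances fixed
    return project_to_edm(D, dim)

# Hypothetical setup: 6 microphones in the plane, one distance missing.
rng = np.random.default_rng(0)
positions = rng.normal(size=(6, 2))
D_true = edm_from_points(positions)
mask = np.ones_like(D_true, dtype=bool)
mask[0, 5] = mask[5, 0] = False
D_hat = complete_edm(D_true, mask, dim=2)
```

The sketch makes no claim about convergence or accuracy; the guarantees mentioned in the abstract concern the authors' own method.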
When is it Better to Compare than to Score?
When eliciting judgements from humans for an unknown quantity, one often has
the choice of making direct-scoring (cardinal) or comparative (ordinal)
measurements. In this paper we study the relative merits of either choice,
providing empirical and theoretical guidelines for the selection of a
measurement scheme. We provide empirical evidence based on experiments on
Amazon Mechanical Turk that in a variety of tasks, (pairwise-comparative)
ordinal measurements have lower per sample noise and are typically faster to
elicit than cardinal ones. Ordinal measurements however typically provide less
information. We then consider the popular Thurstone and Bradley-Terry-Luce
(BTL) models for ordinal measurements and characterize the minimax error rates
for estimating the unknown quantity. We compare these minimax error rates to
those under cardinal measurement models and quantify for what noise levels
ordinal measurements are better. Finally, we revisit the data collected from
our experiments and show that fitting these models confirms this prediction:
for tasks where the noise in ordinal measurements is sufficiently low, the
ordinal approach results in smaller errors in the estimation.
Computational exploration of molecular receptive fields in the olfactory bulb reveals a glomerulus-centric chemical map
© The Author(s) 2020. This article is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Progress in olfactory research is currently hampered by incomplete knowledge about chemical receptive ranges of primary receptors. Moreover, the chemical logic underlying the arrangement of computational units in the olfactory bulb has still not been resolved. We undertook a large-scale approach at characterising molecular receptive ranges (MRRs) of glomeruli in the dorsal olfactory bulb (dOB) innervated by the MOR18-2 olfactory receptor, also known as Olfr78, with human ortholog OR51E2. Guided by an iterative approach that combined biological screening and machine learning, we selected 214 odorants to characterise the response of MOR18-2 and its neighbouring glomeruli. We found that a combination of conventional physico-chemical and vibrational molecular descriptors performed best in predicting glomerular responses using nonlinear Support-Vector Regression. We also discovered several previously unknown odorants activating MOR18-2 glomeruli, and obtained detailed MRRs of MOR18-2 glomeruli and their neighbours. Our results confirm earlier findings that demonstrated tunotopy, that is, glomeruli with similar tuning curves tend to be located in spatial proximity in the dOB. In addition, our results indicate chemotopy, that is, a preference for glomeruli with similar physico-chemical MRR descriptions being located in spatial proximity. Together, these findings suggest the existence of a partial chemical map underlying glomerular arrangement in the dOB. Our methodology that combines machine learning and physiological measurements lights the way towards future high-throughput studies to deorphanise and characterise structure-activity relationships in olfaction.
Peer reviewed
Rank Minimization over Finite Fields: Fundamental Limits and Coding-Theoretic Interpretations
This paper establishes information-theoretic limits in estimating a finite
field low-rank matrix given random linear measurements of it. These linear
measurements are obtained by taking inner products of the low-rank matrix with
random sensing matrices. Necessary and sufficient conditions on the number of
measurements required are provided. It is shown that these conditions are sharp
and the minimum-rank decoder is asymptotically optimal. The reliability
function of this decoder is also derived by appealing to de Caen's lower bound
on the probability of a union. The sufficient condition also holds when the
sensing matrices are sparse - a scenario that may be amenable to efficient
decoding. More precisely, it is shown that if the $n\times n$ sensing matrices
contain, on average, $\Omega(n\log n)$ entries, the number of measurements
required is the same as that when the sensing matrices are dense and contain
entries drawn uniformly at random from the field. Analogies are drawn between
the above results and rank-metric codes in the coding theory literature. In
fact, we are also strongly motivated by understanding when minimum rank
distance decoding of random rank-metric codes succeeds. To this end, we derive
distance properties of equiprobable and sparse rank-metric codes. These
distance properties provide a precise geometric interpretation of the fact that
the sparse ensemble requires as few measurements as the dense one. Finally, we
provide a non-exhaustive procedure to search for the unknown low-rank matrix.
Comment: Accepted to the IEEE Transactions on Information Theory; Presented at
IEEE International Symposium on Information Theory (ISIT) 201
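To make the measurement model concrete, the sketch below specialises to the binary field GF(2), which is an assumption (the abstract covers general finite fields). It computes matrix rank by Gaussian elimination over GF(2) and one linear measurement as the inner product of a sensing matrix with the unknown matrix. The names `rank_gf2` and `measure` are hypothetical.

```python
import numpy as np

def rank_gf2(M):
    """Rank of a 0/1 matrix over GF(2) via Gaussian elimination."""
    A = np.array(M, dtype=np.uint8) % 2
    rows, cols = A.shape
    rank = 0
    for c in range(cols):
        # Find a row at or below `rank` with a 1 in column c.
        pivot = next((r for r in range(rank, rows) if A[r, c]), None)
        if pivot is None:
            continue
        A[[rank, pivot]] = A[[pivot, rank]]  # swap pivot row into place
        for r in range(rows):
            if r != rank and A[r, c]:
                A[r] ^= A[rank]  # addition over GF(2) is XOR
        rank += 1
        if rank == rows:
            break
    return rank

def measure(X, H):
    """One linear measurement <H, X> over GF(2): entrywise products summed mod 2."""
    return int(np.sum(np.asarray(H) * np.asarray(X)) % 2)

# This matrix has rank 3 over the reals but rank 2 over GF(2),
# because its three rows sum to zero modulo 2.
X = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
r = rank_gf2(X)
```

A minimum-rank decoder in this setting would search for the lowest-rank matrix consistent with all observed values of `measure`; the sketch does not attempt that search.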
Coarse-graining the Dynamics of a Driven Interface in the Presence of Mobile Impurities: Effective Description via Diffusion Maps
Developing effective descriptions of the microscopic dynamics of many
physical phenomena can both dramatically enhance their computational
exploration and lead to a more fundamental understanding of the underlying
physics. Previously, an effective description of a driven interface in the
presence of mobile impurities, based on an Ising variant model and a single
empirical coarse variable, was partially successful; yet it underlined the
necessity of selecting additional coarse variables in certain parameter
regimes. In this paper we use a data mining approach to help identify the
coarse variables required. We discuss the implementation of this diffusion map
approach, the selection of a similarity measure between system snapshots
required in the approach, and the correspondence between empirically selected
and automatically detected coarse variables. We conclude by illustrating the
use of the diffusion map variables in assisting the atomistic simulations, and
we discuss the translation of information between fine and coarse descriptions
using lifting and restriction operators.
Comment: 28 pages, 10 figures
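A minimal diffusion map sketch may help fix ideas. It assumes a Gaussian kernel on Euclidean distances between snapshot vectors as the similarity measure, which need not match the measure the authors selected. The function name, the `epsilon` bandwidth, and the toy data are all hypothetical.

```python
import numpy as np

def diffusion_map(X, epsilon, n_coords=2):
    """Return sorted eigenvalues of the Markov matrix and the leading
    nontrivial diffusion coordinates for the rows of X."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / epsilon)              # assumed Gaussian similarity
    d = K.sum(axis=1)
    # Symmetric matrix sharing its eigenvalues with P = D^{-1} K.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]       # right eigenvectors of P
    # Skip the trivial constant eigenvector (eigenvalue 1).
    coords = psi[:, 1:n_coords + 1] * vals[1:n_coords + 1]
    return vals, coords

# Hypothetical data: 40 snapshots lying near a one-dimensional curve in 3D.
t = np.linspace(0.0, 1.0, 40)
snapshots = np.column_stack([t, t ** 2, np.sin(t)])
eigvals, coords = diffusion_map(snapshots, epsilon=0.1)
```

The columns of `coords` play the role of automatically detected coarse variables, to be compared against empirically chosen ones.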
Probabilistic embeddings of the Fréchet distance
The Fréchet distance is a popular distance measure for curves which
naturally lends itself to fundamental computational tasks, such as clustering,
nearest-neighbor searching, and spherical range searching in the corresponding
metric space. However, its inherent complexity poses considerable computational
challenges in practice. To address this problem we study distortion of the
probabilistic embedding that results from projecting the curves to a randomly
chosen line. Such an embedding could be used in combination with, e.g.
locality-sensitive hashing. We show that in the worst case and under reasonable
assumptions, the discrete Fréchet distance between two polygonal curves of
complexity $t$ in $\mathbb{R}^d$, where $d\in\{2,3,4,5\}$, degrades
by a factor linear in $t$ with constant probability. We show upper and lower
bounds on the distortion. We also evaluate our findings empirically on a
benchmark data set. The preliminary experimental results stand in stark
contrast with our lower bounds. They indicate that highly distorted projections
happen very rarely in practice, and only for strongly conditioned input curves.
Keywords: Fréchet distance, metric embeddings, random projections
Comment: 27 pages, 11 figures
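The embedding studied in this abstract can be tried out with a short sketch: compute the discrete Fréchet distance by the usual dynamic program, project both curves onto a random line, and compare. The function names and the toy curves are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between curves given as (n, dim) and (m, dim) arrays."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)
    n, m = d.shape
    F = np.empty((n, m))
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                prev = 0.0
            elif i == 0:
                prev = F[0, j - 1]
            elif j == 0:
                prev = F[i - 1, 0]
            else:
                prev = min(F[i - 1, j], F[i, j - 1], F[i - 1, j - 1])
            F[i, j] = max(prev, d[i, j])
    return F[-1, -1]

def random_unit_vector(dim, rng):
    """Direction drawn uniformly from the unit sphere."""
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

def project_to_line(P, u):
    """Project each vertex onto the line spanned by unit vector u."""
    return (np.asarray(P, float) @ u)[:, None]

# Hypothetical curves: two parallel horizontal segments at distance 1.
P = [[0, 0], [1, 0], [2, 0]]
Q = [[0, 1], [1, 1], [2, 1]]
rng = np.random.default_rng(0)
u = random_unit_vector(2, rng)
original = discrete_frechet(P, Q)
projected = discrete_frechet(project_to_line(P, u), project_to_line(Q, u))
```

Projection onto a line cannot increase pairwise distances, so `projected` never exceeds `original`; the distortion discussed in the abstract is about how much smaller it can become.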