Entropy-Based Strategies for Multi-Bracket Pools
Much work in the March Madness literature has discussed how to estimate the
probability that any one team beats any other team. There has been strikingly
little work, however, on what to do with these win probabilities. Hence we pose
the multi-brackets problem: given these probabilities, what is the best way to
submit a set of brackets to a March Madness bracket challenge? This is an
extremely difficult question, so we begin with a simpler situation. In
particular, we compare various sets of randomly sampled brackets, subject
to different entropy ranges or levels of chalkiness (roughly, chalkier brackets
feature fewer upsets). We learn three lessons. First, the observed NCAA
tournament is a "typical" bracket with a certain "right" amount of entropy
(roughly, a "right" amount of upsets), not a chalky bracket. Second, to
maximize the expected score of a set of randomly sampled brackets, we
should be successively less chalky as the number of submitted brackets
increases. Third, to maximize the probability of winning a bracket challenge
against a field of opposing brackets, we should tailor the chalkiness of our
brackets to the chalkiness of our opponents' brackets.
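As an illustration of the entropy notion used above, here is a minimal sketch (our own, not the paper's method): sample one single-elimination bracket from a given win-probability model and score its "surprise" in bits, so that chalkier brackets accumulate fewer bits.

```python
import math
import random

def sample_bracket(teams, win_prob, rng):
    """Sample one single-elimination bracket and return (champion, bits).

    win_prob(a, b) is the model probability that team a beats team b; bits
    accumulates -log2 of each sampled outcome, so chalkier brackets (fewer
    upsets) carry fewer bits of surprise.
    """
    bits = 0.0
    current = list(teams)
    while len(current) > 1:
        nxt = []
        for a, b in zip(current[::2], current[1::2]):
            p = win_prob(a, b)
            if rng.random() < p:
                nxt.append(a)
                bits += -math.log2(p)
            else:
                nxt.append(b)
                bits += -math.log2(1.0 - p)
        current = nxt
    return current[0], bits
```

Sampling many brackets and binning them by `bits` gives exactly the entropy ranges the abstract compares.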
A Bayesian analysis of the time through the order penalty in baseball
As a baseball game progresses, batters appear to perform better the more
times they face a particular pitcher. The apparent drop-off in pitcher
performance from one time through the order to the next, known as the Time
Through the Order Penalty (TTOP), is often attributed to within-game batter
learning. Although the TTOP has largely been accepted within baseball and
influences many managers' in-game decision making, we argue that existing
approaches of estimating the size of the TTOP cannot disentangle continuous
evolution in pitcher performance over the course of the game from
discontinuities between successive times through the order. Using a Bayesian
multinomial regression model, we find that, after adjusting for confounders
like batter and pitcher quality, handedness, and home field advantage, there is
little evidence of strong discontinuity in pitcher performance between times
through the order. Our analysis suggests that the start of the third time
through the order should not be viewed as a special cutoff point in deciding
whether to pull a starting pitcher.
Comment: Accepted to JQA
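The identifiability problem the abstract raises can be seen in a toy simulation (ours, not the paper's model): a pitcher whose performance declines smoothly with batters faced, with no discontinuity at all, still shows a steady apparent "penalty" when outcomes are averaged by time through the order.

```python
def on_base_prob(batters_faced):
    """Toy pitcher: opponent on-base probability rises smoothly (linearly)
    with batters faced -- by construction there is NO discontinuity
    between times through the order."""
    return 0.30 + 0.002 * batters_faced

def tto_averages():
    """Average the smooth curve within each time through the order
    (9 batters per pass). The naive TTO comparison shows a steady jump
    of 0.018 per pass even though the decline is continuous."""
    return [sum(on_base_prob(start + i) for i in range(9)) / 9
            for start in (0, 9, 18)]
```

The three averages come out 0.308, 0.326, 0.344, which a naive analysis would read as a discontinuous "penalty" at each new time through the order.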
Achievement Trap: How America Is Failing Millions of High-Achieving Students From Lower-Income Families
Assesses the elementary school, high school, college, and graduate school experiences of students who score in the top 25 percent on national standardized tests and whose family incomes are below the national median.
Locking of accessible information and implications for the security of quantum cryptography
The unconditional security of a quantum key distribution protocol is often
defined in terms of the accessible information, that is, the maximum mutual
information between the distributed key S and the outcome of an optimal
measurement on the adversary's (quantum) system. We show that, even if this
quantity is small, certain parts of the key S might still be completely
insecure when S is used in applications, such as for one-time pad encryption.
This flaw is due to a locking property of the accessible information: one
additional (physical) bit of information might increase the accessible
information by more than one bit.
Comment: 5 pages; minor changes
Hard Discs on the Hyperbolic Plane
We examine a simple hard disc fluid with no long range interactions on the
two dimensional space of constant negative Gaussian curvature, the hyperbolic
plane. This geometry provides a natural mechanism by which global crystalline
order is frustrated, allowing us to construct a tractable model of disordered
monodisperse hard discs. We extend free area theory and the virial expansion to
this regime, deriving the equation of state for the system, and compare its
predictions with simulation near an isostatic packing in the curved space.
Comment: 4 pages, 3 figures included, final version
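For orientation, the flat-space baseline that such curved-space results are measured against can be written down directly. The sketch below (an illustration, not the paper's hyperbolic-plane derivation) gives the Euclidean hard-disc equation of state truncated at the second virial coefficient.

```python
import math

def compressibility_factor(eta, sigma=1.0):
    """Z = P / (rho kT) for Euclidean hard discs of diameter sigma,
    truncated at the second virial coefficient B2 = (pi/2) sigma**2.
    eta is the packing fraction eta = pi * rho * sigma**2 / 4.
    Flat-space baseline only -- the paper's hyperbolic-plane
    corrections are not reproduced here."""
    rho = 4.0 * eta / (math.pi * sigma**2)  # number density from eta
    B2 = 0.5 * math.pi * sigma**2           # excluded-area coefficient
    return 1.0 + B2 * rho                   # equals 1 + 2 * eta
```

At this order Z = 1 + 2η, independent of σ; curvature modifies the excluded area and hence B2.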
Security against eavesdropping in quantum cryptography
In this article we deal with the security of the BB84 quantum cryptography
protocol over noisy channels using generalized privacy amplification. For this
we estimate the fraction of bits needed to be discarded during the privacy
amplification step. This estimate is given for two scenarios, both of which
assume that the eavesdropper accesses each signal independently, and both of
which take error correction into account. In one scenario the eavesdropper may
not delay the measurement of a probe until additional classical information is
received; in this scenario we achieve a sharp bound. The other
scenario allows a measurement delay, so that the general attack of an
eavesdropper on individual signals is covered. This bound is not sharp but
allows a practical implementation of the protocol.
Comment: 11 pages including 3 figures, contains new results not contained in
my Phys. Rev. A paper
Linguistic Analysis of Requirements of a Space Project and Their Conformity with the Recommendations Proposed by a Controlled Natural Language
We propose a linguistic analysis of requirements written in French for a project carried out by the French National Space Agency (CNES). The aim is to determine to what extent they conform to some of the rules laid down in INCOSE, a recent guide for writing requirements, with a focus on the notion of sentence "comprehensibility". Although CNES engineers are not obliged to follow any Controlled Natural Language, we believe that language regularities are likely to emerge from this task, mainly due to the writers' experience. As a first step, we use natural language processing tools to identify sentences that do not comply with INCOSE rules. We further review these sentences to understand why the recommendations cannot (or should not) always be applied when specifying large-scale projects, and how they could be improved.
This paper presents a corpus linguistics approach to improving requirements writing. We propose a linguistic diagnosis of the way requirements are written in a space project by comparing these requirements with a guide for writing specifications (a controlled natural language). Initial results obtained from this analysis suggest that guides for writing specifications are not fully adapted to the real writing process: they are sometimes too constraining, and sometimes insufficiently so. In the medium term, the aim is to propose another guide based on the spontaneous regularities observed in requirements. The paper comprises two parts. In the first (see section 2), we present the context of our study and the tool-assisted method used for making the diagnosis. In the second (see section 3), we describe and discuss our preliminary results.
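The rule-checking step described above can be sketched as follows. The rules here are hypothetical, simplified stand-ins in the spirit of requirement-writing guides such as INCOSE's (the real rule set, and the paper's French-language tooling, are far richer).

```python
import re

# Hypothetical vague terms, a crude stand-in for a controlled-language
# lexicon; matching is plain substring search, so this is illustrative only.
VAGUE_TERMS = {"appropriate", "adequate", "as required", "and/or"}

def check_requirement(sentence):
    """Return a list of rule violations found in one requirement sentence."""
    issues = []
    lowered = sentence.lower()
    if not re.search(r"\bshall\b", lowered):
        issues.append("no 'shall' (modal verb rule)")
    for term in sorted(VAGUE_TERMS):
        if term in lowered:
            issues.append(f"vague term: {term!r}")
    if len(sentence.split()) > 30:
        issues.append("sentence too long (comprehensibility rule)")
    return issues
```

Running such checks over a requirements corpus and then manually reviewing the flagged sentences is the tool-assisted diagnosis workflow the abstract describes.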
Artificial Sequences and Complexity Measures
In this paper we exploit concepts of information theory to address the
fundamental problem of identifying and defining the most suitable tools to
extract, in an automatic and agnostic way, information from a generic string of
characters. We introduce in particular a class of methods which use in a
crucial way data compression techniques in order to define a measure of
remoteness and distance between pairs of sequences of characters (e.g. texts)
based on their relative information content. We also discuss in detail how
specific features of data compression techniques could be used to introduce the
notion of dictionary of a given sequence and of Artificial Text and we show how
these new tools can be used for information extraction purposes. We point out
the versatility and generality of our method that applies to any kind of
corpora of character strings independently of the type of coding behind them.
We consider as a case study linguistically motivated problems and we present
results for automatic language recognition, authorship attribution and
self-consistent classification.
Comment: Revised version, with major changes, of previous "Data Compression
approach to Information Extraction and Classification" by A. Baronchelli and
V. Loreto. 15 pages; 5 figures
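A closely related compression-based distance (the normalized compression distance, not necessarily the exact measure this paper defines) can be sketched in a few lines using an off-the-shelf compressor:

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings using zlib:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(s) is the compressed length of s. Sequences sharing structure
    compress well when concatenated, giving a small distance."""
    cx = len(zlib.compress(x, 9))
    cy = len(zlib.compress(y, 9))
    cxy = len(zlib.compress(x + y, 9))
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Clustering texts by such pairwise distances is the standard route to the language-recognition and authorship-attribution experiments mentioned in the abstract.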
End-to-End Joint Antenna Selection Strategy and Distributed Compress and Forward Strategy for Relay Channels
Multi-hop relay channels use multiple relay stages, each with multiple relay
nodes, to facilitate communication between a source and destination.
Previously, distributed space-time codes were proposed to maximize the
achievable diversity-multiplexing tradeoff; however, they fail to achieve all
the points of the optimal diversity-multiplexing tradeoff. In the presence of a
low-rate feedback link from the destination to each relay stage and the source,
this paper proposes an end-to-end antenna selection (EEAS) strategy as an
alternative to distributed space-time codes. The EEAS strategy uses a subset of
antennas of each relay stage for transmission of the source signal to the
destination with amplify and forwarding at each relay stage. The subsets are
chosen such that they maximize the end-to-end mutual information at the
destination. The EEAS strategy achieves the corner points of the optimal
diversity-multiplexing tradeoff (corresponding to maximum diversity gain and
maximum multiplexing gain) and achieves better diversity gain at intermediate
values of multiplexing gain, versus the best known distributed space-time
coding strategies. A distributed compress and forward (CF) strategy is also
proposed to achieve all points of the optimal diversity-multiplexing tradeoff
for a two-hop relay channel with multiple relay nodes.
Comment: Accepted for publication in the special issue on cooperative
communication in the Eurasip Journal on Wireless Communication and Networking
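The subset-selection criterion at the heart of such a strategy can be sketched for a single hop (our illustration; the paper applies the criterion end-to-end across multiple relay stages):

```python
import itertools
import numpy as np

def select_antennas(H, k, snr):
    """Exhaustively pick the k transmit antennas (columns of H) that
    maximize the MIMO mutual information
        log2 det(I + (snr / k) * H_S @ H_S^H),
    i.e. the subset giving the largest rate at the receiver. Exhaustive
    search is exponential in the number of antennas; fine for a sketch."""
    n_rx, n_tx = H.shape
    best, best_rate = None, -np.inf
    for subset in itertools.combinations(range(n_tx), k):
        Hs = H[:, subset]
        M = np.eye(n_rx) + (snr / k) * Hs @ Hs.conj().T
        rate = np.log2(np.linalg.det(M).real)
        if rate > best_rate:
            best, best_rate = subset, rate
    return best, best_rate
```

With low-rate feedback, the destination only needs to signal the winning subset index to each stage rather than full channel state.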
On the Communication Complexity of Secure Computation
Information theoretically secure multi-party computation (MPC) is a central
primitive of modern cryptography. However, relatively little is known about the
communication complexity of this primitive.
In this work, we develop powerful information theoretic tools to prove lower
bounds on the communication complexity of MPC. We restrict ourselves to a
3-party setting in order to bring out the power of these tools without
introducing too many complications. Our techniques include the use of a data
processing inequality for residual information (the gap between mutual
information and Gács-Körner common information), a new information
inequality for 3-party protocols, and the idea of distribution switching by
which lower bounds computed under certain worst-case scenarios can be shown to
apply for the general case.
Using these techniques we obtain tight bounds on communication complexity by
MPC protocols for various interesting functions. In particular, we show
concrete functions that have "communication-ideal" protocols, which achieve the
minimum communication simultaneously on all links in the network. Also, we
obtain the first explicit example of a function that incurs a higher
communication cost than the input length in the secure computation model of
Feige, Kilian and Naor (1994), who had shown that such functions exist. We also
show that our communication bounds imply tight lower bounds on the amount of
randomness required by MPC protocols for many interesting functions.
Comment: 37 pages
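The residual-information quantity used above can be computed directly for small joint distributions. A sketch (ours, for intuition only): Gács-Körner common information is the entropy of the finest common function of X and Y, which can be read off from the connected components of the bipartite support graph of the joint distribution.

```python
import math

def _H(probs):
    """Shannon entropy in bits of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X; Y) from a dict mapping (x, y) pairs to probabilities."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return _H(px.values()) + _H(py.values()) - _H(joint.values())

def gk_common_information(joint):
    """Gacs-Korner common information: entropy of the connected-component
    label in the bipartite graph with an edge (x, y) whenever p(x, y) > 0."""
    parent = {}
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u
    for (x, y), p in joint.items():
        if p <= 0:
            continue
        for node in (("x", x), ("y", y)):
            parent.setdefault(node, node)
        parent[find(("x", x))] = find(("y", y))
    comp_mass = {}
    for (x, y), p in joint.items():
        if p > 0:
            root = find(("x", x))
            comp_mass[root] = comp_mass.get(root, 0.0) + p
    return _H(comp_mass.values())

def residual_information(joint):
    """The gap I(X; Y) - CI_GK(X; Y) exploited in the lower bounds."""
    return mutual_information(joint) - gk_common_information(joint)
```

For perfectly correlated bits the residual information vanishes, while for noisily correlated bits the common information drops to zero and the entire mutual information is residual.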