Generalizing the Network Scale-Up Method: A New Estimator for the Size of Hidden Populations
The network scale-up method enables researchers to estimate the size of
hidden populations, such as drug injectors and sex workers, using sampled
social network data. The basic scale-up estimator offers advantages over other
size estimation techniques, but it depends on problematic modeling assumptions.
We propose a new generalized scale-up estimator that can be used in settings
with non-random social mixing and imperfect awareness about membership in the
hidden population. Further, the new estimator can be used when data are
collected via complex sample designs and from incomplete sampling frames.
However, the generalized scale-up estimator also requires data from two
samples: one from the frame population and one from the hidden population. In
some situations these data from the hidden population can be collected by
adding a small number of questions to already planned studies. For other
situations, we develop interpretable adjustment factors that can be applied to
the basic scale-up estimator. We conclude with practical recommendations for
the design and analysis of future studies.
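For concreteness, here is a minimal sketch of the two estimators as they are typically written in the scale-up literature (the notation is ours, not quoted from the paper):

```latex
% Basic scale-up: sample s drawn from a frame population F of size N.
% y_{i,H}: respondent i's reported ties to the hidden population H;
% \hat{d}_i: estimate of respondent i's total personal network size.
\hat{N}_H^{\mathrm{basic}} = \frac{\sum_{i \in s} y_{i,H}}{\sum_{i \in s} \hat{d}_i}\, N
% Generalized scale-up: the degree sum is replaced by the average
% visibility \bar{v}_{H,F} of hidden-population members to the frame,
% estimated from the second sample drawn from H itself.
\hat{N}_H^{\mathrm{gen}} = \frac{\hat{y}_{F,H}}{\hat{\bar{v}}_{H,F}}
```

The second sample is what makes the visibility term estimable, which is why the generalized estimator can relax the random-mixing and perfect-awareness assumptions.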
The Network Survival Method for Estimating Adult Mortality: Evidence From a Survey Experiment in Rwanda.
Adult death rates are a critical indicator of population health and well-being. Wealthy countries have high-quality vital registration systems, but poor countries lack this infrastructure and must rely on estimates that are often problematic. In this article, we introduce the network survival method, a new approach for estimating adult death rates. We derive the precise conditions under which it produces consistent and unbiased estimates. Further, we develop an analytical framework for sensitivity analysis. To assess the performance of the network survival method in a realistic setting, we conducted a nationally representative survey experiment in Rwanda (n = 4,669). Network survival estimates were similar to estimates from other methods, even though the network survival estimates were made with substantially smaller samples and are based entirely on data from Rwanda, with no need for model life tables or pooling of data from other countries. Our analytic results demonstrate that the network survival method has attractive properties, and our empirical results show that this method can be used in countries where reliable estimates of adult death rates are sorely needed.
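As a rough illustration of the network-reporting idea, here is a stylized Python sketch with hypothetical numbers; the published estimator additionally involves survey weights and visibility adjustments, which are omitted here:

```python
# Hypothetical reports from four respondents about one demographic group
# (e.g., women aged 25-34): ties to the group, and deaths among those
# ties reported for the past 12 months. Illustrative numbers only.
connections = [12, 8, 20, 15]   # reported network ties to the group
deaths = [0, 1, 0, 1]           # reported deaths among those ties

# Reported ties stand in for person-years of network exposure, so the
# death rate is total reported deaths over total reported exposure.
death_rate = sum(deaths) / sum(connections)
print(f"estimated death rate: {death_rate:.3f} per person-year")
```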
Breaking a one-dimensional chain: fracture in 1 + 1 dimensions
The breaking rate of an atomic chain stretched at zero temperature by a
constant force can be calculated in a quasiclassical approximation by finding
the localized solutions ("bounces") of the equations of classical dynamics in
imaginary time. We show that this theory is related to the critical cracks of
stressed solids, because the world lines of the atoms in the chain form a
two-dimensional crystal, and the bounce is a crack configuration in (unstable)
mechanical equilibrium. Thus the tunneling time, action, and breaking rate in
the limit of small forces are determined by the classical results of Griffith.
For the limit of large forces we give an exact bounce solution that describes
the quantum fracture and classical crack close to the limit of mechanical
stability. This limit can be viewed as a critical phenomenon for which we
establish a Levanyuk-Ginzburg criterion of weakness of fluctuations, and
propose a scaling argument for the critical regime. The post-tunneling dynamics
is understood by the analytic continuation of the bounce solutions to real
time.
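For readers without the classical reference at hand, the textbook Griffith energy balance for a straight crack of length 2a in a two-dimensional medium under remote stress sigma (notation ours) is:

```latex
% \gamma: surface energy per unit length; E': effective elastic modulus.
\Delta E(a) = 4\gamma a - \frac{\pi \sigma^2 a^2}{E'},
\qquad
a_c = \frac{2\gamma E'}{\pi \sigma^2},
\qquad
\Delta E(a_c) = \frac{4\gamma^2 E'}{\pi \sigma^2}.
% The barrier grows as 1/\sigma^2 at small stress, so the bounce action
% and hence the tunneling exponent are controlled by this Griffith scale.
```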
Controlling Fairness and Bias in Dynamic Learning-to-Rank
Rankings are the primary interface through which many online platforms match
users to items (e.g. news, products, music, video). In these two-sided markets,
not only do the users draw utility from the rankings, but the rankings also
determine the utility (e.g. exposure, revenue) for the item providers (e.g.
publishers, sellers, artists, studios). It has already been noted that
myopically optimizing utility to the users, as done by virtually all
learning-to-rank algorithms, can be unfair to the item providers. We,
therefore, present a learning-to-rank approach for explicitly enforcing
merit-based fairness guarantees to groups of items (e.g. articles by the same
publisher, tracks by the same artist). In particular, we propose a learning
algorithm that ensures notions of amortized group fairness, while
simultaneously learning the ranking function from implicit feedback data. The
algorithm takes the form of a controller that integrates unbiased estimators
for both fairness and utility, dynamically adapting both as more data becomes
available. In addition to its rigorous theoretical foundation and convergence
guarantees, we find empirically that the algorithm is highly practical and
robust.
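A minimal sketch of such a controller, assuming known relevances and a simple position-bias exposure model (the paper instead learns unbiased estimates from implicit feedback; the names and constants below are illustrative):

```python
import numpy as np

# Rank by relevance plus a correction that boosts items from groups whose
# accumulated exposure lags their merit share. Simplified sketch, not the
# authors' exact algorithm.
rng = np.random.default_rng(0)
n_items, n_steps, lam = 6, 1000, 0.01
relevance = rng.uniform(0.2, 1.0, n_items)      # stand-in for estimated merit
group = np.array([0, 0, 0, 1, 1, 1])            # item -> provider group
exposure = np.zeros(n_items)
position_bias = 1.0 / np.log2(np.arange(2, n_items + 2))  # exposure per rank

for t in range(1, n_steps + 1):
    merit = np.array([relevance[group == g].sum() for g in (0, 1)])
    exp_g = np.array([exposure[group == g].sum() for g in (0, 1)])
    # error: how far each group's exposure share lags its merit share
    err = merit / merit.sum() - exp_g / max(exp_g.sum(), 1e-9)
    scores = relevance + lam * t * err[group]   # proportional correction
    ranking = np.argsort(-scores)               # best score first
    exposure[ranking] += position_bias          # accumulate exposure

print("exposure share per group:",
      [round(exposure[group == g].sum() / exposure.sum(), 3) for g in (0, 1)])
```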
Why We Read Wikipedia
Wikipedia is one of the most popular sites on the Web, with millions of users
relying on it to satisfy a broad range of information needs every day. Although
it is crucial to understand what exactly these needs are in order to be able to
meet them, little is currently known about why users visit Wikipedia. The goal
of this paper is to fill this gap by combining a survey of Wikipedia readers
with a log-based analysis of user activity. Based on an initial series of user
surveys, we build a taxonomy of Wikipedia use cases along several dimensions,
capturing users' motivations to visit Wikipedia, the depth of knowledge they
are seeking, and their knowledge of the topic of interest prior to visiting
Wikipedia. Then, we quantify the prevalence of these use cases via a
large-scale user survey conducted on live Wikipedia with almost 30,000
responses. Our analyses highlight the variety of factors driving users to
Wikipedia, such as current events, media coverage of a topic, personal
curiosity, work or school assignments, or boredom. Finally, we match survey
responses to the respondents' digital traces in Wikipedia's server logs,
enabling the discovery of behavioral patterns associated with specific use
cases. For instance, we observe long and fast-paced page sequences across
topics for users who are bored or exploring randomly, whereas those using
Wikipedia for work or school spend more time on individual articles focused on
topics such as science. Our findings advance our understanding of reader
motivations and behavior on Wikipedia and can have implications for developers
aiming to improve Wikipedia's user experience, editors striving to cater to
their readers' needs, third-party services (such as search engines) providing
access to Wikipedia content, and researchers aiming to build tools such as
recommendation engines.
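A hypothetical sketch of the survey-to-log matching step (the field names and the shared token are illustrative, not Wikipedia's actual schema):

```python
import pandas as pd

# Survey responses and server-log traces keyed by a shared anonymized token.
survey = pd.DataFrame({"token": ["a1", "b2"],
                       "motivation": ["work/school", "bored/random"]})
logs = pd.DataFrame({"token": ["a1", "a1", "b2", "b2", "b2"],
                     "article": ["Enzyme", "Protein", "Cat", "Moon", "Jazz"],
                     "dwell_s": [310, 280, 15, 22, 9]})

joined = logs.merge(survey, on="token")
# e.g., compare time spent per article across stated motivations
print(joined.groupby("motivation")["dwell_s"].mean())
```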
Bias reduction in traceroute sampling: towards a more accurate map of the Internet
Traceroute sampling is an important technique in exploring the internet
router graph and the autonomous system graph. Although it is one of the primary
techniques used in calculating statistics about the internet, it can introduce
bias that corrupts these estimates. This paper reports on a theoretical and
experimental investigation of a new technique to reduce the bias of traceroute
sampling when estimating the degree distribution. We develop a new estimator
for the degree of a node in a traceroute-sampled graph; validate the estimator
theoretically in Erdős–Rényi graphs and, through computer experiments, for a
wider range of graphs; and apply it to produce a new picture of the degree
distribution of the autonomous system graph.
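To see the bias this paper targets, here is a small simulation sketch in which traceroute sampling is approximated by shortest-path trees from a few monitors; it illustrates the undersampling of degree, not the paper's corrected estimator:

```python
import networkx as nx

G = nx.gnp_random_graph(2000, 0.005, seed=1)    # Erdős–Rényi test graph
observed = nx.Graph()
for source in list(G.nodes)[:3]:                # a few monitor nodes
    # union of shortest paths from this monitor to every reachable target
    for path in nx.single_source_shortest_path(G, source).values():
        nx.add_path(observed, path)

node = max(G.degree, key=lambda kv: kv[1])[0]   # a high-degree node
print("true degree:", G.degree[node],
      "observed degree:", observed.degree[node] if node in observed else 0)
```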
Quantum Breaking of Elastic String
Breaking of an atomic chain under stress is a collective many-particle
tunneling phenomenon. We study classical dynamics in imaginary time using a
conformal mapping technique, and derive an analytic formula for the probability
of breaking. The result covers a broad temperature interval and interpolates
between two regimes: tunneling and thermal activation. Also, we consider the
breaking induced by an ultrasonic wave propagating in the chain, and propose to
observe it in an STM experiment.
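For context, the standard quasiclassical structure that such an interpolating formula must reproduce in its two limits is (notation ours):

```latex
% Below a crossover temperature T* the rate is dominated by the bounce
% (instanton) action; above it, by thermal activation over the barrier.
\Gamma \sim
\begin{cases}
e^{-S_{\mathrm{bounce}}/\hbar}, & T \ll T^{*},\\
e^{-\Delta E / k_B T}, & T \gg T^{*},
\end{cases}
\qquad
T^{*} \sim \frac{\hbar \omega_b}{2\pi k_B},
% with \omega_b the curvature frequency at the barrier top.
```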
Universality in movie rating distributions
In this paper, histograms of user ratings for movies (1,...,10) are analysed. The evolving stabilised shapes of the histograms follow the rule that all are either double- or triple-peaked. Moreover, at most one peak can lie on the central bins 2,...,9, and the distribution on these bins looks smooth and 'Gaussian-like', while changes at the extremes (1 and 10) often look abrupt. It is shown that this is well approximated under the assumption that histograms are confined and discretised probability density functions of Lévy skew alpha-stable distributions. These distributions are the only stable distributions that could emerge, by a generalized central limit theorem, from averaging various independent random variables, which is how the initial opinions of users can be viewed. Averaging is also an appropriate assumption about the social process underlying continuous opinion formation. Surprisingly, it is not the normal distribution that achieves the best fit to histograms observed on the web, but distributions with fat tails which decay as power laws with exponent -(1+alpha) (alpha = 4/3). The scale and skewness parameters of the Lévy skew alpha-stable distributions seem to depend on the deviation from an average movie (with mean about 7.6). The histogram of such an average movie has no skewness and is the narrowest one. If a movie deviates from the average, its distribution becomes broader and more skewed; the stronger the deviation, the more pronounced the skewness. This is used to construct a one-parameter fit which gives some evidence of universality in processes of continuous opinion dynamics about taste.
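A minimal sketch of a 'confined and discretised' stable law on the ten rating bins, with illustrative parameter values rather than fitted ones:

```python
import numpy as np
from scipy.stats import levy_stable

# Interior bins k take the mass of [k - 1/2, k + 1/2]; all mass beyond
# the ends is heaped onto bins 1 and 10, one way to produce the abrupt
# extremes described above. alpha, beta, loc, scale are illustrative.
alpha, beta, loc, scale = 4 / 3, -0.5, 7.6, 1.2
inner = np.arange(1.5, 10.0)                      # edges 1.5, 2.5, ..., 9.5
cdf = levy_stable.cdf(inner, alpha, beta, loc=loc, scale=scale)
p = np.diff(np.concatenate(([0.0], cdf, [1.0])))  # probabilities, bins 1..10
print({k: round(pk, 3) for k, pk in zip(range(1, 11), p)})
```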
Quantity Versus Quality: A Survey Experiment to Improve the Network Scale-up Method
The network scale-up method is a promising technique that uses sampled social network data to estimate the
sizes of epidemiologically important hidden populations, such as sex workers and people who inject illicit drugs.
Although previous scale-up research has focused exclusively on networks of acquaintances, we show that the
type of personal network about which survey respondents are asked to report is a potentially crucial parameter that
researchers are free to vary. This generalization leads to a method that is more flexible and potentially more accurate.
In 2011, we conducted a large, nationally representative survey experiment in Rwanda that randomized respondents
to report about one of 2 different personal networks. Our results showed that asking respondents for
less information can, somewhat surprisingly, produce more accurate size estimates. We also estimated the sizes
of 4 key populations at risk for human immunodeficiency virus infection in Rwanda. Our estimates were higher than
earlier estimates from Rwanda but lower than international benchmarks. Finally, in this article we develop a new
sensitivity analysis framework and use it to assess the possible biases in our estimates. Our design can be customized
and extended for other settings, enabling researchers to continue to improve the network scale-up method.
An Experimental Study of Cryptocurrency Market Dynamics
As cryptocurrencies gain popularity and credibility, marketplaces for
cryptocurrencies are growing in importance. Understanding the dynamics of these
markets can help to assess how viable the cryptocurrency ecosystem is and how
design choices affect market behavior. One existential threat to
cryptocurrencies is dramatic fluctuations in traders' willingness to buy or
sell. Using a novel experimental methodology, we conducted an online experiment
to study how susceptible traders in these markets are to peer influence from
trading behavior. We created bots that executed over one hundred thousand
trades costing less than a penny each in 217 cryptocurrencies over the course
of six months. We find that individual "buy" actions led to short-term
increases in subsequent buy-side activity hundreds of times the size of our
interventions. From a design perspective, we note that the design choices of
the exchange we study may have promoted this and other peer influence effects,
which highlights the potential social and economic impact of HCI in the design
of digital institutions.
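A stylized sketch of the underlying comparison, using simulated stand-in data rather than the study's actual trading records:

```python
import numpy as np

# Compare buy-side trade counts in windows following a tiny bot "buy"
# against matched control windows with no intervention. Numbers are
# simulated placeholders, not measured effects.
rng = np.random.default_rng(42)
treated = rng.poisson(5.0, 500)   # trades after bot buys (hypothetical)
control = rng.poisson(3.5, 500)   # trades in matched control windows

lift = treated.mean() - control.mean()
print(f"average extra buy-side trades per window: {lift:.2f}")
```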