SAS/IML Macros for a Multivariate Analysis of Variance Based on Spatial Signs
Recently, new nonparametric multivariate extensions of the univariate sign methods have been proposed. Randles (2000) introduced an affine invariant multivariate sign test for the multivariate location problem. Later on, Hettmansperger and Randles (2002) considered an affine equivariant multivariate median corresponding to this test. The new methods have promising efficiency and robustness properties. In this paper, we review these developments and compare them with the classical multivariate analysis of variance model. A new SAS/IML tool for performing a spatial sign based multivariate analysis of variance is introduced.
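The spatial sign underlying these methods is simply the direction vector x/||x|| of a centered observation; the test and median above are built from these signs. A minimal NumPy sketch (the function name and the median centering are our illustrative choices, not the SAS/IML tool's interface):

```python
import numpy as np

def spatial_sign(X, center=None):
    """Spatial sign S(x) = x / ||x|| (defined as 0 at the origin),
    computed row-wise for an (n, p) data matrix."""
    if center is not None:
        X = X - center
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # avoid division by zero for observations exactly at the center
    return np.where(norms > 0, X / np.where(norms > 0, norms, 1.0), 0.0)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
# center at the componentwise median (an illustrative choice)
S = spatial_sign(X, center=np.median(X, axis=0))
# each nonzero spatial sign lies on the unit sphere
```

Sign-based statistics are then averages of the rows of `S`, which is what gives these methods their robustness to heavy-tailed errors.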
Bootstrap based uncertainty bands for prediction in functional kriging
The increasing interest in spatially correlated functional data has led to
the development of appropriate geostatistical techniques that make it possible
to predict a curve at an unmonitored location using a functional kriging with
external drift model that takes into account the effect of exogenous variables
(either scalar or functional). Nevertheless, uncertainty evaluation for
functional spatial prediction remains an open issue. We propose a
semi-parametric bootstrap for spatially correlated functional data that allows
the uncertainty of a predicted curve to be evaluated, ensuring that the spatial dependence
structure is maintained in the bootstrap samples. The performance of the
proposed methodology is assessed via a simulation study. Moreover, the approach
is illustrated on a well-known data set of Canadian temperature and on a real
data set of PM concentration in the Piemonte region, Italy. Based on the
results it can be concluded that the method is computationally feasible and
suitable for quantifying the uncertainty around a predicted curve.
Supplementary material including R code is available upon request.
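One standard way to build a bootstrap that preserves spatial dependence, sketched below: decorrelate the residuals with a Cholesky factor of an estimated spatial covariance, resample the approximately iid innovations, and recorrelate. This is an illustrative sketch under an assumed exponential covariance, not the authors' exact semi-parametric procedure:

```python
import numpy as np

def spatial_bootstrap(residuals, cov, n_boot=200, rng=None):
    """Bootstrap residuals while keeping the spatial dependence structure:
    decorrelate with the Cholesky factor of `cov` (an (n, n) spatial
    covariance estimate), resample the innovations, then recorrelate."""
    rng = np.random.default_rng(rng)
    L = np.linalg.cholesky(cov)
    iid = np.linalg.solve(L, residuals)      # approximately iid innovations
    n = iid.shape[0]
    boots = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)     # resample with replacement
        boots.append(L @ iid[idx])           # restore spatial dependence
    return np.array(boots)

# toy example: exponential covariance on a 1-D transect of 20 sites
sites = np.linspace(0.0, 1.0, 20)
cov = np.exp(-np.abs(sites[:, None] - sites[None, :]) / 0.3)
res = np.linalg.cholesky(cov) @ np.random.default_rng(1).normal(size=20)
# pointwise 95% uncertainty band from the bootstrap replicates
bands = np.percentile(spatial_bootstrap(res, cov, rng=2), [2.5, 97.5], axis=0)
```

In the functional setting the same idea is applied to curve residuals; the pointwise percentiles then give the uncertainty band around the kriged curve.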
Analyzing and Modeling Special Offer Campaigns in Location-based Social Networks
The proliferation of mobile handheld devices in combination with the
technological advancements in mobile computing has led to a number of
innovative services that make use of the location information available on such
devices. Traditional yellow pages websites have now moved to mobile platforms,
giving local businesses and potential nearby customers the opportunity to
connect. These platforms can offer an affordable advertisement channel to local
businesses. One of the mechanisms offered by location-based social networks
(LBSNs) allows businesses to provide special offers to their customers that
connect through the platform. We collect a large time-series dataset from
approximately 14 million venues on Foursquare and analyze the performance of
such campaigns using randomization techniques and (non-parametric) hypothesis
testing with statistical bootstrapping. Our main finding indicates that this
type of promotion is not as effective as anecdotal success stories might
suggest. Finally, we design classifiers by extracting three different types of
features that can provide an educated decision on whether a special offer
campaign for a local business will succeed, in both the short and the long term.
Comment: in The 9th International AAAI Conference on Web and Social Media
(ICWSM 2015)
TRULLO - local trust bootstrapping for ubiquitous devices
Handheld devices have become sufficiently powerful
that it is easy to create, disseminate, and access digital content
(e.g., photos, videos) using them. The volume of such content is
growing rapidly and, from the perspective of each user, selecting
relevant content is key. To this end, each user may run a trust
model - a software agent that keeps track of who disseminates
content that its user finds relevant. This agent does so by
assigning an initial trust value to each producer for a specific
category (context); then, whenever it receives new content, the
agent rates the content and accordingly updates its trust value for
the producer in the content category. However, a problem with
such an approach is that, as the number of content categories
increases, so does the number of trust values to be initially set.
This paper focuses on how to effectively set initial trust values.
The most sophisticated of the current solutions employ predefined
context ontologies, with which initial trust in a given context is set
based on the trust already held in similar contexts.
However, universally accepted (and time invariant) ontologies
are rarely found in practice. For this reason, we propose a
mechanism called TRULLO (TRUst bootstrapping by Latently
Lifting cOntext) that assigns initial trust values based only on
local information (on the ratings of its user's past experiences)
and that, as such, does not rely on third-party recommendations.
We evaluate the effectiveness of TRULLO by simulating its use
in an informal antique market setting. We also evaluate the
computational cost of a J2ME implementation of TRULLO on
a mobile phone.
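The bookkeeping described above can be sketched as a per-(producer, category) trust table with a rating-driven update. The exponential-smoothing rule and its parameters below are our illustrative assumptions, not TRULLO's actual formulas:

```python
from collections import defaultdict

class TrustAgent:
    """Sketch of per-category trust tracking: each (producer, category)
    pair starts at an initial trust value and is nudged toward each new
    content rating by exponential smoothing (an illustrative rule)."""
    def __init__(self, initial_trust=0.5, alpha=0.2):
        self.alpha = alpha
        # trust[(producer, category)] -> value in [0, 1]
        self.trust = defaultdict(lambda: initial_trust)

    def rate(self, producer, category, rating):
        key = (producer, category)
        old = self.trust[key]
        self.trust[key] = (1 - self.alpha) * old + self.alpha * rating
        return self.trust[key]

agent = TrustAgent()
agent.rate("alice", "photos", 1.0)   # relevant content raises trust
agent.rate("bob", "videos", 0.0)     # irrelevant content lowers it
```

TRULLO's contribution sits upstream of this update: choosing sensible *initial* values for new categories from the user's own rating history rather than from third-party recommendations.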
Ambulance Emergency Response Optimization in Developing Countries
The lack of emergency medical transportation is viewed as the main barrier to
accessing emergency medical care in low- and middle-income countries
(LMICs). In this paper, we present a robust optimization approach to optimize
both the location and routing of emergency response vehicles, accounting for
uncertainty in travel times and spatial demand characteristic of LMICs. We
traveled to Dhaka, Bangladesh, the sixth largest and third most densely
populated city in the world, to conduct field research resulting in the
collection of two unique datasets that inform our approach. This data is
leveraged to develop machine learning methodologies to estimate demand for
emergency medical services in an LMIC setting and to predict the travel time
between any two locations in the road network for different times of day and
days of the week. We combine our robust optimization and machine learning
frameworks with real data to provide an in-depth investigation into three
policy-related questions. First, we demonstrate that outpost locations
optimized for weekday rush hour lead to good performance for all times of day
and days of the week. Second, we find that significant improvements in
emergency response times can be achieved by re-locating a small number of
outposts and that the performance of the current system could be replicated
using only 30% of the resources. Lastly, we show that a fleet of small
motorcycle-based ambulances has the potential to significantly outperform
traditional ambulance vans. In particular, they are able to capture three times
more demand while reducing the median response time by 42% due to increased
routing flexibility offered by nimble vehicles on a larger road network. Our
results provide practical insights for emergency response optimization that can
be leveraged by hospital-based and private ambulance providers in Dhaka and
other urban centers in LMICs.
A Lightweight Distributed Solution to Content Replication in Mobile Networks
Performance and reliability of content access in mobile networks is
conditioned by the number and location of content replicas deployed at the
network nodes. Facility location theory has been the traditional, centralized
approach to study content replication: computing the number and placement of
replicas in a network can be cast as an uncapacitated facility location
problem. The endeavour of this work is to design a distributed, lightweight
solution to the above joint optimization problem, while taking into account the
network dynamics. In particular, we devise a mechanism that lets nodes share
the burden of storing and providing content, so as to achieve load balancing,
and decide whether to replicate or drop the information so as to adapt to a
dynamic content demand and time-varying topology. We evaluate our mechanism
through simulation, by exploring a wide range of settings and studying
realistic content access mechanisms that go beyond the traditional assumption
of matching demand points to their closest content replica. Results show
that our mechanism, which uses local measurements only, is: (i) extremely
precise in approximating an optimal solution to content placement and
replication; (ii) robust against network mobility; (iii) flexible in
accommodating various content access patterns, including variation in time and
space of the content demand.
Comment: 12 pages
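For reference, the centralized baseline the abstract mentions can be sketched as a textbook greedy heuristic for uncapacitated facility location: open a facility whenever the assignment-cost saving exceeds its opening cost. This is an illustrative centralized baseline, not the paper's distributed mechanism:

```python
def greedy_ufl(open_cost, assign_cost):
    """Greedy heuristic for uncapacitated facility location.
    open_cost[f]: cost of opening facility f.
    assign_cost[f][d]: cost of serving demand point d from facility f."""
    n_fac, n_dem = len(open_cost), len(assign_cost[0])
    # start with the single facility that is cheapest on its own
    start = min(range(n_fac),
                key=lambda f: open_cost[f] + sum(assign_cost[f]))
    open_set = {start}
    best = list(assign_cost[start])          # current cost per demand point
    while True:
        gain, pick = 0.0, None
        for f in range(n_fac):
            if f in open_set:
                continue
            saved = sum(max(0.0, best[d] - assign_cost[f][d])
                        for d in range(n_dem))
            if saved - open_cost[f] > gain:  # net saving from opening f
                gain, pick = saved - open_cost[f], f
        if pick is None:
            break
        open_set.add(pick)
        best = [min(b, c) for b, c in zip(best, assign_cost[pick])]
    return open_set, sum(best) + sum(open_cost[f] for f in open_set)

# two candidate facilities, four demand points (hypothetical costs)
open_cost = [3.0, 3.0]
assign_cost = [[1.0, 1.0, 8.0, 8.0],
               [8.0, 8.0, 1.0, 1.0]]
opened, total = greedy_ufl(open_cost, assign_cost)
```

The distributed mechanism in the paper approximates this kind of optimum using only local measurements at each node.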
Fast calibrated additive quantile regression
We propose a novel framework for fitting additive quantile regression models,
which provides well calibrated inference about the conditional quantiles and
fast automatic estimation of the smoothing parameters, for model structures as
diverse as those usable with distributional GAMs, while maintaining equivalent
numerical efficiency and stability. The proposed methods are at once
statistically rigorous and computationally efficient, because they are based on
applying the general belief-updating framework of Bissiri et al. (2016) to
loss-based inference, but compute by adapting the stable fitting methods of Wood et al.
(2016). We show how the pinball loss is statistically suboptimal relative to a
novel smooth generalisation, which also gives access to fast estimation
methods. Further, we provide a novel calibration method for efficiently
selecting the 'learning rate' balancing the loss with the smoothing priors
during inference, thereby obtaining reliable quantile uncertainty estimates.
Our work was motivated by a probabilistic electricity load forecasting
application, used here to demonstrate the proposed approach. The methods
described here are implemented by the qgam R package, available on the
Comprehensive R Archive Network (CRAN).
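For concreteness, the pinball loss for quantile level τ is ρ_τ(u) = u(τ − 1{u<0}), and one standard smooth generalisation replaces its kink with a log-sum-exp. The smoothing below is a common illustrative choice and may differ from the exact smoothed loss used by qgam:

```python
import numpy as np

def pinball(u, tau):
    # classical pinball (check) loss: rho_tau(u) = u * (tau - 1{u < 0})
    return u * (tau - (u < 0))

def smooth_pinball(u, tau, lam=0.1):
    """Smooth generalisation of the pinball loss that tends to it as
    lam -> 0; the kink at u = 0 is replaced by a log-sum-exp, making the
    loss differentiable and amenable to fast smooth-optimisation methods."""
    return tau * u + lam * np.logaddexp(0.0, -u / lam)

u = np.linspace(-2.0, 2.0, 401)
# with small lam, the smooth loss is uniformly close to the pinball loss
err = np.max(np.abs(smooth_pinball(u, 0.9, lam=0.01) - pinball(u, 0.9)))
```

The maximum gap occurs at u = 0 and shrinks linearly with `lam`, which is why a smoothed loss can trade a negligible bias for access to stable second-order fitting methods.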
Wage distribution and the spatial sorting of workers and firms
Spatial sorting plays an important role in accounting for disparities in average wages among locations. This paper shows that sorting also matters when addressing the relation between spatial externalities and wage distribution, i.e. across workers located at different percentiles of the wage distribution. Using Italian employer-employee panel data we can control for individual and firm heterogeneity as well as for unobserved individual heterogeneity by means of quantile fixed effects estimates. After controlling for the sorting of workers, the spatial externality impacts dampen along the whole wage distribution and generally remain positive only in the upper tail. As for firm sorting, it becomes uniform along the wage distribution once individual fixed effects are considered. We also point out that the impact of worker sorting is not homogeneous across sectors: along the density dimension it occurs mainly in skill-intensive sectors, while along the specialization dimension it is concentrated in the unskill-intensive sectors.
Keywords: Spatial Externalities, Spatial Sorting, Wage Distribution, Quantile Fixed Effects
Bias-free Measurement of Giant Molecular Cloud Properties
(abridged) We review methods for measuring the sizes, line widths, and
luminosities of giant molecular clouds (GMCs) in molecular-line data cubes with
low resolution and sensitivity. We find that moment methods are robust and
sensitive -- making full use of both position and intensity information -- and
we recommend a standard method to measure the position angle, major and minor
axis sizes, line width, and luminosity using moment methods. Without
corrections for the effects of beam convolution and sensitivity to GMC
properties, the resulting properties may be severely biased. This is
particularly true for extragalactic observations, where resolution and
sensitivity effects often bias measured values by 40% or more. We correct for
finite spatial and spectral resolutions with a simple deconvolution and we
correct for sensitivity biases by extrapolating properties of a GMC to those we
would expect to measure with perfect sensitivity. The resulting method recovers
the properties of a GMC to within 10% over a large range of resolutions and
sensitivities, provided the clouds are marginally resolved with a peak
signal-to-noise ratio greater than 10. We note that interferometers
systematically underestimate cloud properties, particularly the flux from a
cloud. The degree of bias depends on the sensitivity of the observations and
the (u,v) coverage of the observations. In the Appendix to the paper we present
a conservative, new decomposition algorithm for identifying GMCs in
molecular-line observations. This algorithm treats the data in physical rather
than observational units, does not produce spurious clouds in the presence of
noise, and is sensitive to a range of morphologies. As a result, the output of
this decomposition should be directly comparable among disparate data sets.
Comment: Accepted to PASP (19 pgs., 12 figures). The submission describes an
IDL software package available from http://cfa-www.harvard.edu/~erosolow/cprops
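The moment measurements described above reduce to intensity-weighted first and second moments of the emission. A simplified sketch (function name ours; the published method additionally corrects for beam convolution and extrapolates to perfect sensitivity, both omitted here):

```python
import numpy as np

def moment_measures(x, v, T):
    """Intensity-weighted moment measurements of an emission distribution:
    rms spatial size, rms line width, and total flux.  Inputs are flat
    arrays of position x, velocity v, and brightness T per pixel (a
    simplified sketch; the paper works with full position-velocity cubes)."""
    w = T / T.sum()
    x0, v0 = (w * x).sum(), (w * v).sum()          # weighted centroids
    sigma_x = np.sqrt((w * (x - x0) ** 2).sum())   # rms size
    sigma_v = np.sqrt((w * (v - v0) ** 2).sum())   # rms line width
    return sigma_x, sigma_v, T.sum()

# synthetic Gaussian "cloud" with true sigma_x = 2.0, sigma_v = 1.0
xg, vg = np.meshgrid(np.linspace(-10, 10, 201), np.linspace(-5, 5, 101))
T = np.exp(-xg**2 / (2 * 2.0**2) - vg**2 / (2 * 1.0**2))
sx, sv, flux = moment_measures(xg.ravel(), vg.ravel(), T.ravel())
```

Because every pixel contributes in proportion to its intensity, these estimators use both position and intensity information, which is what makes the moment approach robust compared with, e.g., area-based size measures.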