Online Active Linear Regression via Thresholding
We consider the problem of online active learning to collect data for
regression modeling. Specifically, we consider a decision maker with a limited
experimentation budget who must efficiently learn an underlying linear
population model. Our main contribution is a novel threshold-based algorithm
for selection of most informative observations; we characterize its performance
and fundamental lower bounds. We extend the algorithm and its guarantees to
sparse linear regression in high-dimensional settings. Simulations suggest the
algorithm is remarkably robust: it provides significant benefits over passive
random sampling in real-world datasets that exhibit high nonlinearity and high
dimensionality --- significantly reducing both the mean and variance of the
squared error. Comment: Published in AAAI 201
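The abstract does not spell out the selection rule, so the following is only a minimal sketch of one way threshold-based selection could look. It assumes (this is an assumption, not a statement of the authors' method) that an arriving covariate vector is labelled when its Euclidean norm reaches a fixed threshold, on the grounds that larger-norm covariates reduce the variance of a least-squares estimate more. The names `threshold_select`, `budget`, and `threshold` are hypothetical.

```python
import math
import random


def threshold_select(stream, budget, threshold):
    """Return indices of covariate vectors chosen for labelling.

    Assumed rule (illustrative only): label an observation when its
    Euclidean norm is at least `threshold`, and stop once `budget`
    labels have been spent.
    """
    chosen = []
    for i, x in enumerate(stream):
        if len(chosen) >= budget:
            break  # experimentation budget exhausted
        norm = math.sqrt(sum(v * v for v in x))
        if norm >= threshold:
            chosen.append(i)
    return chosen


# Hypothetical usage: 1000 three-dimensional Gaussian covariates, 20 labels.
rng = random.Random(0)
stream = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(1000)]
picked = threshold_select(stream, budget=20, threshold=2.5)
```

The selected observations would then be labelled and passed to an ordinary least-squares fit; passive random sampling corresponds to setting `threshold=0.0`.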
An Analysis of Active Learning With Uniform Feature Noise
In active learning, the user sequentially chooses values for feature X and
an oracle returns the corresponding label Y. In this paper, we consider the
effect of feature noise in active learning, which could arise either because
X itself is being measured, or it is corrupted in transmission to the oracle,
or the oracle returns the label of a noisy version of the query point. In
statistics, feature noise is known as "errors in variables" and has been
studied extensively in non-active settings. However, the effect of feature
noise in active learning has not been studied before. We consider the
well-known Berkson errors-in-variables model with additive uniform noise of
width σ.
Our simple but revealing setting is one-dimensional binary classification,
where the goal is to learn a threshold (the point where the probability of a
positive label crosses half). We deal with regression functions
that are antisymmetric in a region of size σ around the threshold and
also satisfy Tsybakov's margin condition around the threshold. We prove minimax
lower and upper bounds which demonstrate that when σ is smaller than the
minimax active/passive noiseless error derived in \cite{CN07}, noise has
no effect on the rates and one achieves the same noiseless rates. For larger
σ, the unflattening of the regression function on convolution
with uniform noise, along with its local antisymmetry around the threshold,
together yield a behaviour where noise appears to be beneficial. Our
key result is that active learning can buy significant improvement over a
passive strategy even in the presence of feature noise. Comment: 24 pages, 2 figures, published in the proceedings of the 17th
International Conference on Artificial Intelligence and Statistics (AISTATS),
201
A meta analysis of real estate fund performance
This paper provides evidence regarding the risk-adjusted performance of 19 UK real estate funds over the period 1991-2001. Using Jensen’s alpha, the results are generally favourable towards the hypothesis that real estate fund managers showed superior risk-adjusted performance over this period. However, using three widely known parametric statistical procedures to jointly test for timing and selection ability, the results are less conclusive. The paper then utilises the meta-analysis technique to further examine the regression results in an attempt to estimate the proportion of variation in results attributable to sampling error. The meta-analysis results reveal strong evidence, across all models, that the variation in findings is real and may not be attributed to sampling error. Thus, the meta-analysis results provide strong evidence that, on average, the sample of real estate funds analysed in this study delivered significant risk-adjusted performance over this period. The meta-analysis for the three timing and selection models strongly indicates that this outperformance of the benchmark resulted from superior selection ability, while the evidence for the ability of real estate fund managers to time the market is at best weak. Thus, although real estate fund managers are unable to outperform a passive buy-and-hold strategy through timing, they are able to improve their risk-adjusted performance through selection ability.
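For readers unfamiliar with Jensen's alpha, it is the intercept from regressing a fund's excess return on the benchmark's excess return. The sketch below shows that calculation under simple assumptions: a constant risk-free rate and a single-factor model. The function name, argument names, and the sample figures are hypothetical and are not taken from the paper.

```python
def jensens_alpha(fund_returns, market_returns, risk_free):
    """Estimate (alpha, beta) by ordinary least squares.

    Fits: fund - rf = alpha + beta * (market - rf) + error.
    `risk_free` is assumed constant per period (an illustrative
    simplification).
    """
    y = [r - risk_free for r in fund_returns]    # fund excess returns
    x = [r - risk_free for r in market_returns]  # benchmark excess returns
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical per-period returns, constructed so alpha = 0.001 and beta = 0.8.
rf = 0.002
market = [0.010, -0.020, 0.030, 0.015]
fund = [rf + 0.001 + 0.8 * (m - rf) for m in market]
alpha, beta = jensens_alpha(fund, market, rf)
```

A positive alpha indicates a return above what the fund's benchmark exposure alone would explain; whether it is statistically significant requires a standard error, which this sketch omits.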
Measurement and Modeling of Ground-Level Ozone Concentration in Catania, Italy using Biophysical Remote Sensing and GIS
This experimental study examined spatial variation of ground level ozone (O3) in the city of Catania, Italy using thirty passive samplers deployed in a 500-m grid pattern. Significant spatial variation in ground level O3 concentrations (ranging from 12.8 to 41.7 µg/m3) was detected across Catania’s urban core and periphery. Biophysical measures derived from satellite imagery and built environment characteristics from GIS were evaluated as correlates of O3 concentrations. A land use regression model based on four variables (land surface temperature, building area, residential street length, and distance to the coast) explained 74% of the variance (adjusted R2) in measured O3. The results of the study suggest that biophysical remote sensing variables are worth further investigation as predictors of ground level O3 (and potentially other air pollutants) because they provide objective measurements that can be tested across multiple locations and over time.
A passive sampling method for radiocarbon analysis of atmospheric CO2 using molecular sieve
Radiocarbon (14C) analysis of atmospheric CO2 can provide information on CO2 sources and is potentially valuable for validating inventories of fossil fuel-derived CO2 emissions to the atmosphere. We tested zeolite molecular sieve cartridges, in both field and laboratory experiments, for passively collecting atmospheric CO2. Cartridges were exposed to the free atmosphere in two configurations which controlled CO2 trapping rate, allowing collection of sufficient CO2 in between 1.5 and 10 months at current levels. 14C results for passive samples were within measurement uncertainty of samples collected using a pump-based system, showing that the method collected samples with 14C contents representative of the atmosphere. δ13C analysis confirmed that the cartridges collected representative CO2 samples; however, fractionation during passive trapping means that δ13C values need to be adjusted by an amount which we have quantified. Trapping rate was proportional to atmospheric CO2 concentration, and was not affected by exposure time unless this exceeded a threshold. Passive sampling using molecular sieve cartridges provides an easy and reliable method to collect atmospheric CO2 for 14C analysis.
Reconciling surveillance systems with limited resources: an evaluation of passive surveillance for rabies in an endemic setting
Surveillance systems for rabies in endemic regions are often subject to severe constraints in terms of resources. The World Organisation for Animal Health (OIE) and the World Health Organisation (WHO) propose the use of an active surveillance system to substantiate claims of disease freedom, including rabies. However, many countries do not have the resources to establish active surveillance systems for rabies and the testing of dead dogs poses logistical challenges. This paper explores the potential of using a scenario tree model parameterised with data collected via questionnaires and interviews to estimate the sensitivity of passive surveillance, assessing its potential as a viable low-cost alternative to active surveillance systems. The results of this explorative study illustrated that given a large enough sample size, in this case the entire population of Colombo City, the sensitivity of passive surveillance can be 100% even at a low disease prevalence (0.1%), despite the low sensitivity of individual surveillance components (mean values in the range 4.077×10^-5 to 1.834×10^-3 at 1% prevalence). In addition, logistic regression was used to identify factors associated with increased recognition of rabies in dogs and reporting of rabies suspect dogs. Increased recognition was observed amongst dog owners (OR 3.8 (CI, 1.3-10.8)), people previously bitten by dogs (OR 5.9 (CI, 2.2-15.9)) and people who believed they had seen suspect dogs in the past (OR 4.7 (CI, 1.8-12.9)). Increased likelihood of reporting suspect dogs was observed amongst dog owners (OR 5.3 (CI, 1.1-25)). Further work is required to validate the data collection tool and the assumptions made in the model with respect to sample size in order to develop a robust methodology for evaluating passive rabies surveillance.
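The claim that very low per-unit sensitivity can still yield near-100% system sensitivity follows from how detection probabilities combine over many units. The sketch below illustrates this under assumptions that are not stated in the abstract: units are treated as independent, and the quoted component sensitivity is treated as a per-unit detection probability. The function name and the population sizes are hypothetical; the abstract does not give the population of Colombo City.

```python
import math


def system_sensitivity(unit_sensitivity, n_units):
    """Probability that at least one of `n_units` independent units
    leads to detection, i.e. 1 - (1 - p)^n, computed stably for tiny p.
    """
    if not 0.0 <= unit_sensitivity <= 1.0:
        raise ValueError("unit_sensitivity must lie in [0, 1]")
    if unit_sensitivity == 1.0:
        return 1.0  # avoid log(0); detection is certain
    return -math.expm1(n_units * math.log1p(-unit_sensitivity))


# Hypothetical population sizes, using the lowest component value quoted above.
for n in (1_000, 100_000, 500_000):
    print(n, system_sensitivity(4.077e-5, n))
```

Under these assumptions the system sensitivity rises towards 1 as the number of units grows, which is the qualitative behaviour the abstract reports.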
Workplace secondhand smoke exposure in the U.S. trucking industry.
Background: Although the smoking rate in the United States is declining because of an increase of smoke-free laws, among blue-collar workers it remains higher than that among many other occupational groups. Objectives: We evaluated the factors influencing workplace secondhand smoke (SHS) exposures in the U.S. unionized trucking industry. Methods: From 2003 through 2005, we measured workplace SHS exposure among 203 nonsmoking and 61 smoking workers in 25 trucking terminals. Workers in several job groups wore personal vapor-phase nicotine samplers on their lapels for two consecutive work shifts and completed a workplace SHS exposure questionnaire at the end of the personal sampling. Results: Median nicotine level was 0.87 µg/m3 for nonsmokers and 5.96 µg/m3 for smokers. As expected, smokers experienced higher SHS exposure duration and intensity than did nonsmokers. For nonsmokers, multiple regression analyses indicated that self-reported exposure duration combined with intensity, lack of a smoking policy as reported by workers, having a nondriver job, and lower educational level were independently associated with elevated personal nicotine levels (model R2 = 0.52). Nondriver job and amount of active smoking were associated with elevated personal nicotine level in smokers, but self-reported exposure, lack of a smoking policy, and lower educational level were not. Conclusions: Despite movements toward smoke-free laws, this population of blue-collar workers was still exposed to workplace SHS as recently as 2005. The perceived (reported by the workers), rather than the official (reported by the terminal managers), smoking policy was associated with measured SHS exposure levels among the nonsmokers. Job duties and educational level might also be important predictors of workplace SHS exposure.