130,116 research outputs found

    Online Active Linear Regression via Thresholding

    We consider the problem of online active learning to collect data for regression modeling. Specifically, we consider a decision maker with a limited experimentation budget who must efficiently learn an underlying linear population model. Our main contribution is a novel threshold-based algorithm for selecting the most informative observations; we characterize its performance and fundamental lower bounds. We extend the algorithm and its guarantees to sparse linear regression in high-dimensional settings. Simulations suggest the algorithm is remarkably robust: it provides significant benefits over passive random sampling in real-world datasets that exhibit high nonlinearity and high dimensionality, significantly reducing both the mean and variance of the squared error. Comment: Published in AAAI 201
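
The thresholding idea in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the paper's algorithm: the no-intercept model, the fixed threshold, and the names `threshold_sample`, `fit_slope`, and `make_stream` are all assumptions made for illustration. The sketch spends a labeling budget only on covariates with large magnitude, which carry more information about the slope than typical draws.

```python
import random

def threshold_sample(stream, budget, threshold):
    """Keep the first `budget` points whose |x| is at least `threshold`.

    In a real active-learning setting the label y would only be requested
    for the kept points; the stream carries y for every point here purely
    to keep the simulation short.
    """
    chosen = []
    for x, y in stream:
        if len(chosen) == budget:
            break
        if abs(x) >= threshold:
            chosen.append((x, y))
    return chosen

def fit_slope(points):
    """Least-squares slope for the no-intercept model y = beta * x."""
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return sxy / sxx

def make_stream(n, beta, noise_sd, seed):
    """Hypothetical data: standard normal x, Gaussian label noise."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        out.append((x, beta * x + rng.gauss(0.0, noise_sd)))
    return out

stream = make_stream(n=2000, beta=2.0, noise_sd=1.0, seed=0)
active = threshold_sample(stream, budget=20, threshold=1.5)
passive = stream[:20]  # passive baseline: label the first 20 arrivals
print("active slope:", fit_slope(active))
print("passive slope:", fit_slope(passive))
```

In this simplified model the variance of the slope estimate is the noise variance divided by the sum of squared covariates, so every kept point contributes at least `threshold**2` to that sum.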

    An Analysis of Active Learning With Uniform Feature Noise

    In active learning, the user sequentially chooses values for feature $X$ and an oracle returns the corresponding label $Y$. In this paper, we consider the effect of feature noise in active learning, which could arise either because $X$ itself is being measured, or it is corrupted in transmission to the oracle, or the oracle returns the label of a noisy version of the query point. In statistics, feature noise is known as "errors in variables" and has been studied extensively in non-active settings. However, the effect of feature noise in active learning has not been studied before. We consider the well-known Berkson errors-in-variables model with additive uniform noise of width $\sigma$. Our simple but revealing setting is one-dimensional binary classification, where the goal is to learn a threshold (the point where the probability of a $+$ label crosses half). We deal with regression functions that are antisymmetric in a region of size $\sigma$ around the threshold and also satisfy Tsybakov's margin condition around the threshold. We prove minimax lower and upper bounds which demonstrate that when $\sigma$ is smaller than the minimax active/passive noiseless error derived in \cite{CN07}, then noise has no effect on the rates and one achieves the same noiseless rates. For larger $\sigma$, the \textit{unflattening} of the regression function on convolution with uniform noise, along with its local antisymmetry around the threshold, together yield a behaviour where noise \textit{appears} to be beneficial. Our key result is that active learning can buy significant improvement over a passive strategy even in the presence of feature noise. Comment: 24 pages, 2 figures, published in the proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS), 201

    A meta analysis of real estate fund performance

    This paper provides evidence regarding the risk-adjusted performance of 19 real estate funds in the UK over the period 1991-2001. Using Jensen’s alpha, the results are generally favourable towards the hypothesis that real estate fund managers showed superior risk-adjusted performance over this period. However, using three widely known parametric statistical procedures to jointly test for timing and selection ability, the results are less conclusive. The paper then utilises the meta-analysis technique to further examine the regression results in an attempt to estimate the proportion of variation in results attributable to sampling error. The meta-analysis results reveal strong evidence, across all models, that the variation in findings is real and may not be attributed to sampling error. Thus, the meta-analysis results provide strong evidence that on average the sample of real estate funds analysed in this study delivered significant risk-adjusted performance over this period. The meta-analysis for the three timing and selection models strongly indicates that this outperformance of the benchmark resulted from superior selection ability, while the evidence for the ability of real estate fund managers to time the market is at best weak. Thus, we can say that although real estate fund managers are unable to outperform a passive buy and hold strategy through timing, they are able to improve their risk-adjusted performance through selection ability.
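
For readers unfamiliar with Jensen's alpha, it is the intercept of an ordinary least-squares regression of a fund's excess return on the market's excess return. The sketch below shows that calculation only. The function name `jensen_alpha` and the return series are hypothetical, and nothing here reproduces the paper's data or its timing and selection models.

```python
def jensen_alpha(fund_returns, market_returns, risk_free):
    """Return (alpha, beta) from an OLS fit of fund excess returns
    on market excess returns, one observation per period."""
    fund_ex = [r - f for r, f in zip(fund_returns, risk_free)]
    mkt_ex = [m - f for m, f in zip(market_returns, risk_free)]
    n = len(fund_ex)
    mean_f = sum(fund_ex) / n
    mean_m = sum(mkt_ex) / n
    cov = sum((m - mean_m) * (r - mean_f) for m, r in zip(mkt_ex, fund_ex))
    var = sum((m - mean_m) ** 2 for m in mkt_ex)
    beta = cov / var                 # sensitivity to the benchmark
    alpha = mean_f - beta * mean_m   # Jensen's alpha: the intercept
    return alpha, beta

# Hypothetical quarterly returns, for illustration only.
market = [0.010, -0.020, 0.030, 0.000]
risk_free = [0.002, 0.002, 0.002, 0.002]
fund = [0.015, -0.018, 0.036, 0.004]
alpha, beta = jensen_alpha(fund, market, risk_free)
print("alpha:", alpha, "beta:", beta)
```

A positive alpha means the fund earned more than its benchmark exposure alone would predict. Whether that excess is statistically significant is a separate question, which the regression and meta-analysis in the abstract address.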

    Measurement and Modeling of Ground-Level Ozone Concentration in Catania, Italy using Biophysical Remote Sensing and GIS

    This experimental study examined spatial variation of ground level ozone (O3) in the city of Catania, Italy using thirty passive samplers deployed in a 500-m grid pattern. Significant spatial variation in ground level O3 concentrations (ranging from 12.8 to 41.7 µg/m3) was detected across Catania’s urban core and periphery. Biophysical measures derived from satellite imagery and built environment characteristics from GIS were evaluated as correlates of O3 concentrations. A land use regression model based on four variables (land surface temperature, building area, residential street length, and distance to the coast) explained 74% of the variance (adjusted R2) in measured O3. The results of the study suggest that biophysical remote sensing variables are worth further investigation as predictors of ground level O3 (and potentially other air pollutants) because they provide objective measurements that can be tested across multiple locations and over time.

    A passive sampling method for radiocarbon analysis of atmospheric CO2 using molecular sieve

    Radiocarbon (14C) analysis of atmospheric CO2 can provide information on CO2 sources and is potentially valuable for validating inventories of fossil fuel-derived CO2 emissions to the atmosphere. We tested zeolite molecular sieve cartridges, in both field and laboratory experiments, for passively collecting atmospheric CO2. Cartridges were exposed to the free atmosphere in two configurations which controlled CO2 trapping rate, allowing collection of sufficient CO2 in between 1.5 and 10 months at current levels. 14C results for passive samples were within measurement uncertainty of samples collected using a pump-based system, showing that the method collected samples with 14C contents representative of the atmosphere. δ13C analysis confirmed that the cartridges collected representative CO2 samples; however, fractionation during passive trapping means that δ13C values need to be adjusted by an amount which we have quantified. Trapping rate was proportional to atmospheric CO2 concentration, and was not affected by exposure time unless this exceeded a threshold. Passive sampling using molecular sieve cartridges provides an easy and reliable method to collect atmospheric CO2 for 14C analysis.

    Reconciling surveillance systems with limited resources: an evaluation of passive surveillance for rabies in an endemic setting

    Surveillance systems for rabies in endemic regions are often subject to severe constraints in terms of resources. The World Organisation for Animal Health (OIE) and the World Health Organisation (WHO) propose the use of an active surveillance system to substantiate claims of disease freedom, including rabies. However, many countries do not have the resources to establish active surveillance systems for rabies and the testing of dead dogs poses logistical challenges. This paper explores the potential of using a scenario tree model parameterised with data collected via questionnaires and interviews to estimate the sensitivity of passive surveillance, assessing its potential as a viable low-cost alternative to active surveillance systems. The results of this explorative study illustrated that given a large enough sample size, in this case the entire population of Colombo City, the sensitivity of passive surveillance can be 100% even at a low disease prevalence (0.1%), despite the low sensitivity of individual surveillance components (mean values in the range 4.077×10^-5 to 1.834×10^-3 at 1% prevalence). In addition, logistic regression was used to identify factors associated with increased recognition of rabies in dogs and reporting of rabies suspect dogs. Increased recognition was observed amongst dog owners (OR 3.8 (CI, 1.3-10.8)), people previously bitten by dogs (OR 5.9 (CI, 2.2-15.9)) and people who believed they had seen suspect dogs in the past (OR 4.7 (CI, 1.8-12.9)). Increased likelihood of reporting suspect dogs was observed amongst dog owners (OR 5.3 (CI, 1.1-25)). Further work is required to validate the data collection tool and the assumptions made in the model with respect to sample size in order to develop a robust methodology for evaluating passive rabies surveillance.
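
The arithmetic behind "low component sensitivity, high system sensitivity" can be sketched as follows. This is not the paper's scenario tree model. It assumes that the reported component sensitivities can be treated as independent per-unit detection probabilities, and the population sizes are hypothetical, since the abstract does not state one. Under those assumptions, the chance of at least one detection across n units is 1 - (1 - p)^n.

```python
def system_sensitivity(unit_sensitivity, n_units):
    """Probability of at least one detection among n independent units,
    each detecting with probability `unit_sensitivity`."""
    return 1.0 - (1.0 - unit_sensitivity) ** n_units

# Component sensitivities quoted in the abstract (at 1% prevalence).
low, high = 4.077e-5, 1.834e-3

# Hypothetical population sizes, chosen only to show the trend.
for n in (1_000, 100_000, 1_000_000):
    print(n, system_sensitivity(low, n), system_sensitivity(high, n))
```

The overall sensitivity climbs towards 1 as the number of units grows, which is consistent with the abstract's observation for a whole-city population.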