    Linear Regression with Limited Observation

    We consider the most common variants of linear regression, including Ridge, Lasso and Support-vector regression, in a setting where the learner is allowed to observe only a fixed number of attributes of each example at training time. We present simple and efficient algorithms for these problems: for Lasso and Ridge regression they need the same total number of attributes (up to constants) as full-information algorithms do to reach a given accuracy. For Support-vector regression, we require exponentially fewer attributes than the state of the art. This resolves an open problem recently posed by Cesa-Bianchi et al. (2010). Experiments show the theoretical bounds to be justified by superior performance compared to the state of the art. Comment: ICML201
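
    As a rough illustration of the limited-observation setting, the sketch below builds an unbiased estimate of an example from only k of its d attributes, sampled uniformly with replacement and rescaled by d/k. This is a generic sampling device, not the algorithm of the paper; the function names and the toy vector are hypothetical.

```python
import random


def estimate_from_indices(x, indices):
    """Estimate x using only the attributes at `indices` (assumed drawn uniformly)."""
    d, k = len(x), len(indices)
    estimate = [0.0] * d
    for i in indices:
        # Each observed attribute is scaled by d/k so that the expectation equals x.
        estimate[i] += (d / k) * x[i]
    return estimate


def sample_estimate(x, k, rng):
    """Observe k attributes chosen uniformly with replacement and estimate x."""
    indices = [rng.randrange(len(x)) for _ in range(k)]
    return estimate_from_indices(x, indices)


# Hypothetical toy example with d = 3 attributes.
x = [1.0, 2.0, 4.0]

# Averaging the k = 1 estimate over every possible draw recovers x,
# which is what unbiasedness means here.
mean_estimate = [
    sum(estimate_from_indices(x, [i])[j] for i in range(len(x))) / len(x)
    for j in range(len(x))
]

one_draw = sample_estimate(x, 2, random.Random(0))
```

    A learner in this setting would feed such estimates into a gradient step instead of the full example; the price is extra variance, which grows as k shrinks.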

    Seismic Wavefield Reconstruction based on Compressed Sensing using Data-Driven Reduced-Order Model

    A seismic wavefield reconstruction framework based on compressed sensing using a data-driven reduced-order model (ROM) is proposed, and its characteristics are investigated through numerical experiments. The data-driven ROM is generated from a dataset of the wavefield using the singular value decomposition. The spatially continuous seismic wavefield is reconstructed from sparse, discrete observations and the data-driven ROM. The observation sites used for reconstruction are selected by a sensor optimization method for linear inverse problems based on a greedy algorithm. The proposed framework was applied to simulated theoretical waveforms for a subsurface structure of three horizontally stratified layers. The validity of the proposed method was confirmed by reconstruction from noise-free observations. Since the ROM of the wavefield is used as prior information, the reconstruction error is reduced to approximately the lower error bound of the present framework, even when the number of sensors used for reconstruction is limited and the sensors are randomly selected. In addition, the reconstruction error obtained by the proposed framework is much smaller than that obtained by Gaussian process regression. In the numerical experiment with noise-contaminated observations, the reconstructed wavefield is degraded by the observation noise, but the reconstruction error obtained by the present framework with all available observation sites is close to the lower error bound, whereas the wavefield reconstructed using Gaussian process regression collapses completely. Although the reconstruction error is larger than that obtained using all observation sites, combining the framework with the sensor optimization method allows the number of observation sites to be reduced while minimizing the deterioration and scatter of the reconstructed data.

    Event History Regression with Pseudo-Observations: Computational Approaches and an Implementation in R

    Due to tradition and ease of estimation, the vast majority of clinical and epidemiological papers with time-to-event data report hazard ratios from Cox proportional hazards regression models. Although hazard ratios are well known, they can be difficult to interpret, particularly as causal contrasts, in many settings. Nonparametric or fully parametric estimators allow for the direct estimation of more easily causally interpretable estimands such as the cumulative incidence and restricted mean survival. However, modeling these quantities as functions of covariates is limited to a few categorical covariates with nonparametric estimators, and often requires simulation or numeric integration with parametric estimators. Combining pseudo-observations based on nonparametric estimands with parametric regression on the pseudo-observations allows for the best of these two approaches and has many nice properties. In this paper, we develop a user-friendly, easy-to-understand way of doing event history regression for the cumulative incidence and the restricted mean survival, using the pseudo-observation framework for estimation. The interface uses the well-known formulation of a generalized linear model and allows for features including plotting of residuals, the use of sampling weights, and correct variance estimation.
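
    The pseudo-observation idea can be sketched with the standard jackknife construction: for an estimator computed on n subjects, the pseudo-observation for subject i is n times the full-sample estimate minus (n - 1) times the leave-one-out estimate. The sketch below applies this to a Kaplan-Meier estimate of survival at a fixed time. It is a minimal Python illustration with hypothetical function names and toy data, not the R interface described in the paper, and it targets survival probability rather than the cumulative incidence or restricted mean.

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of P(T > t); events[i] is 1 for an event, 0 for censoring."""
    surv = 1.0
    event_times = sorted({ti for ti, ei in zip(times, events) if ei and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, ei in zip(times, events) if ei and ti == u)
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_observations(times, events, t):
    """Jackknife pseudo-observations for survival at time t."""
    n = len(times)
    full = km_survival(times, events, t)
    pseudo = []
    for i in range(n):
        # Leave subject i out and re-estimate.
        loo_times = times[:i] + times[i + 1:]
        loo_events = events[:i] + events[i + 1:]
        loo = km_survival(loo_times, loo_events, t)
        pseudo.append(n * full - (n - 1) * loo)
    return pseudo


# Hypothetical toy data without censoring.
times = [2, 3, 5, 7, 11]
events = [1, 1, 1, 1, 1]
pseudo = pseudo_observations(times, events, t=4)
```

    Without censoring, each pseudo-observation reduces to the indicator that the subject survived past t, which is why the values can then be used as the response in an ordinary generalized linear model.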

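    Pengaruh Fasilitas Perpustakaan terhadap Minat Baca Siswa di Perpustakaan MAN Curup Rejang Lebong (The Effect of Library Facilities on Students' Reading Interest at the MAN Curup Rejang Lebong Library)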

    The purpose of this study was to assess the library facilities, the level of reading interest, and the effect of limited library facilities on students' reading interest at MAN Curup Rejang Lebong. The study used a quantitative approach to show the effect of variable X on variable Y. The population was the 541 students of MAN Curup Rejang Lebong, and the sample was 54 students (10%). Data were collected through observation, questionnaires and documentation, and analyzed using descriptive statistics and inferential statistics in the form of simple linear regression. The results show that the library facilities at MAN Curup Rejang Lebong are in the good category, at 79.80 percent, and that students' reading interest is in the high category, at 79.76 percent. The simple linear regression gives F-count > F-table (71.300 > 4.08), so Ho is rejected and Ha is accepted. It can therefore be concluded that library facilities influence students' reading interest at MAN Curup Rejang Lebong.
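
    The F-count versus F-table comparison can be sketched as follows: fit a simple linear regression by least squares, then divide the regression mean square by the residual mean square. The data below are hypothetical and are not the study's questionnaire scores; the function name is likewise an assumption for illustration.

```python
def simple_linear_regression_f(x, y):
    """Return (slope, intercept, F) for the least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Regression sum of squares (1 degree of freedom) and residual sum of squares (n - 2).
    ss_regression = sum((intercept + slope * xi - mean_y) ** 2 for xi in x)
    ss_residual = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    f_stat = ss_regression / (ss_residual / (n - 2))
    return slope, intercept, f_stat


# Hypothetical toy data: X could stand for a facility score, Y for a reading-interest score.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, f_stat = simple_linear_regression_f(x, y)

# The null hypothesis of zero slope is rejected when F exceeds the tabulated critical value.
f_table = 4.08  # critical value quoted in the abstract, reused here only for illustration
reject_h0 = f_stat > f_table
```

    The critical value depends on the significance level and the residual degrees of freedom, so in practice it should be looked up for the actual sample size rather than reused from another study.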

    Spatial Quantile Regression

    In a number of applications, a crucial problem consists in describing and analyzing the influence of a vector Xi of covariates on some real-valued response variable Yi. In the present context, where the observations are made over a collection of sites, this study is more difficult, due to the complexity of the possible spatial dependence among the various sites. In this paper, instead of spatial mean regression, we thus consider spatial quantile regression functions. Quantile regression has been considered in a spatial context. The main aim of this paper is to combine quantile regression and spatial econometric modeling. Substantial variation exists across quantiles, suggesting that ordinary regression is insufficient on its own. Quantile estimates of a spatial-lag model show considerable spatial dependence in the different parts of the distribution.
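
    What distinguishes quantile regression from mean regression is its loss function. The sketch below shows the standard check (pinball) loss and verifies on hypothetical data that the constant minimizing it is the corresponding sample quantile. It deliberately omits covariates and the spatial-lag structure discussed in the paper; the function names are assumptions.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(data, constant, tau):
    """Sum of check losses when every observation is predicted by one constant."""
    return sum(pinball_loss(y - constant, tau) for y in data)


def best_constant(data, tau):
    """Minimize the total check loss over the observed values (a minimizer lies among them)."""
    return min(sorted(data), key=lambda c: total_loss(data, c, tau))


# Hypothetical data with one outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(data, 0.5)
upper_fit = best_constant(data, 0.9)
```

    The outlier pulls the mean of these values up to 22, while the tau = 0.5 fit stays at the median, which illustrates why results can differ substantially across quantiles.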

    Distributed Kernel Regression: An Algorithm for Training Collaboratively

    This paper addresses the problem of distributed learning under communication constraints, motivated by distributed signal processing in wireless sensor networks and data mining with distributed databases. After formalizing a general model for distributed learning, an algorithm for collaboratively training regularized kernel least-squares regression estimators is derived. Noting that the algorithm can be viewed as an application of successive orthogonal projection algorithms, its convergence properties are investigated and the statistical behavior of the estimator is discussed in a simplified theoretical setting. Comment: To be presented at the 2006 IEEE Information Theory Workshop, Punta del Este, Uruguay, March 13-17, 200