120 research outputs found
Mixed Matrix Completion in Complex Survey Sampling under Heterogeneous Missingness
Modern surveys with large sample sizes and growing mixed-type questionnaires
require robust and scalable analysis methods. In this work, we consider
recovering a mixed dataframe matrix, obtained by complex survey sampling, with
entries following different canonical exponential distributions and subject to
heterogeneous missingness. To tackle this challenging task, we propose a
two-stage procedure: in the first stage, we model the entry-wise missing
mechanism by logistic regression, and in the second stage, we complete the
target parameter matrix by maximizing a weighted log-likelihood with a low-rank
constraint. We propose a fast and scalable estimation algorithm that achieves
sublinear convergence, and the upper bound for the estimation error of the
proposed method is rigorously derived. Experimental results support our
theoretical claims, and the proposed estimator shows its merits compared to
other existing methods. The proposed method is applied to analyze the National
Health and Nutrition Examination Survey data. Comment: Journal of Computational and Graphical Statistics, 202
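The two-stage idea in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm. It assumes Gaussian entries only (so the weighted log-likelihood reduces to weighted least squares), a rank-1 factorization in place of the paper's low-rank constraint, one scalar covariate per entry for the missingness model, and per-row design weights. All function and parameter names are hypothetical.

```python
import math


def response_prob(a, b, x):
    """Logistic response probability P(observed | x) = sigmoid(a + b * x)."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def fit_logistic(x, r, steps=2000, lr=0.5):
    """Stage 1: fit the missingness model by plain gradient ascent."""
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        grad_a, grad_b = 0.0, 0.0
        for xi, ri in zip(x, r):
            p = response_prob(a, b, xi)
            grad_a += ri - p
            grad_b += (ri - p) * xi
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def complete_rank1(y, w, iters=100):
    """Stage 2 (Gaussian case): minimise sum_ij w[i][j] * (y[i][j] - u[i] * v[j])**2
    by alternating weighted least squares. Missing entries carry weight 0,
    so their placeholder values never enter the fit."""
    n, m = len(y), len(y[0])
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        for i in range(n):
            den = sum(w[i][j] * v[j] ** 2 for j in range(m))
            if den > 0:
                u[i] = sum(w[i][j] * y[i][j] * v[j] for j in range(m)) / den
        for j in range(m):
            den = sum(w[i][j] * u[i] ** 2 for i in range(n))
            if den > 0:
                v[j] = sum(w[i][j] * y[i][j] * u[i] for i in range(n)) / den
    return [[u[i] * v[j] for j in range(m)] for i in range(n)]


def two_stage(y, observed, x, design_weight):
    """Combine both stages: weight each observed entry by
    design_weight[i] / estimated response probability, then complete."""
    n, m = len(y), len(y[0])
    flat_x = [x[i][j] for i in range(n) for j in range(m)]
    flat_r = [observed[i][j] for i in range(n) for j in range(m)]
    a, b = fit_logistic(flat_x, flat_r)
    w = [[observed[i][j] * design_weight[i] / response_prob(a, b, x[i][j])
          for j in range(m)] for i in range(n)]
    return complete_rank1(y, w)
```

For example, `two_stage([[1.0, 2.0], [2.0, 0.0]], [[1, 1], [1, 0]], [[0.0, 0.0], [0.0, 0.0]], [1.0, 1.0])` treats the bottom-right entry as missing and fills it from the rank-1 fit to the three observed entries.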
Topics in bootstrap methods for survey sampling and spatially balanced design
This dissertation consists of three parts. In the first part, we propose new bootstrap methods for three commonly used sampling designs: Poisson sampling, simple random sampling, and probability-proportional-to-size sampling. We show that the proposed bootstrap methods are second-order accurate and easy to implement in practice. Two simulation studies compare the proposed bootstrap methods with the Wald method, and the proposed methods outperform the Wald method in terms of coverage rate. It is well known that a spatially balanced sample, which spreads well over the study domain, can improve estimation efficiency under dependent settings. In the second part, we propose a block bootstrap method to estimate the variance and make inference based on a sample generated by a one-per-stratum sampling design. We show the validity of the block bootstrap method and compare it theoretically with another commonly used sampling design. A simulation study shows that the block bootstrap method provides a valid variance estimator and valid inference for the one-per-stratum sampling design. Although there is much research on spatially balanced sampling designs, few studies discuss spatio-temporal balanced sampling designs. In the third part, we propose a spatio-temporal balanced sampling design to generate annual samples, such that the sample for each year is spatially balanced and the sample combined over consecutive years is also spatially balanced. We also propose a design-based variance estimator for the estimates of annual status and annual change. The proposed sampling design is applied to the National Resources Inventory rangeland on-site survey, and the results show that the proposed design performs better than the current design and estimators.
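For context, the Wald method used as the comparison baseline in the first part can be sketched for Poisson sampling as below. This is a minimal illustration of the textbook Horvitz-Thompson total with a normal-approximation interval, not the dissertation's bootstrap procedure. The function name and the default critical value of 1.96 are assumptions made for the example.

```python
import math


def horvitz_thompson_wald(y, pi, z=1.96):
    """Return (total, variance estimate, (lower, upper)) for a population total.

    y:  observed values of the sampled units
    pi: their first-order inclusion probabilities
    """
    # Horvitz-Thompson estimator: weight each sampled value by 1 / pi.
    total = sum(yi / p for yi, p in zip(y, pi))
    # Under Poisson sampling units are selected independently, so the
    # unbiased variance estimator has no cross-product terms.
    var = sum((1.0 - p) / p ** 2 * yi ** 2 for yi, p in zip(y, pi))
    half_width = z * math.sqrt(var)
    return total, var, (total - half_width, total + half_width)
```

A bootstrap interval would replace the normal critical value `z` with quantiles of a resampled statistic. The abstract reports that doing so improves coverage relative to this baseline.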
Oral GS-441524 derivatives: Next-generation inhibitors of SARS-CoV-2 RNA-dependent RNA polymerase
GS-441524, an RNA-dependent RNA polymerase (RdRp) inhibitor, is a 1′-CN-substituted adenine C-nucleoside analog with broad-spectrum antiviral activity. However, the low oral bioavailability of GS-441524 poses a challenge to its anti-SARS-CoV-2 efficacy. Remdesivir, the intravenously administered version (version 1.0) of GS-441524, is the first FDA-approved agent for SARS-CoV-2 treatment. However, clinical trials have presented conflicting evidence on the value of remdesivir in COVID-19. Therefore, oral GS-441524 derivatives (VV116, ATV006, and GS-621763; version 2.0, targeting highly conserved viral RdRp) could be considered game-changers in treating COVID-19, because oral administration has the potential to maximize clinical benefits, including decreased duration of COVID-19 and reduced post-acute sequelae of SARS-CoV-2 infection, as well as limited side effects such as hepatic accumulation. This review summarizes the current research on the oral derivatives of GS-441524 and provides important insights into the potential factors underlying the controversial observations regarding the clinical efficacy of remdesivir; overall, it offers an effective launching pad for developing an oral version of GS-441524.
An adaptive weighting algorithm for accurate radio tomographic image in the environment with multipath and WiFi interference
Radio frequency device-free localization based on wireless sensor networks has proved feasible in buildings. With this technique, a target can be located from the changes in received signal strength caused by the moving object. However, the accuracy of many such systems deteriorates seriously in environments with WiFi and multipath interference. State-of-the-art methods do not efficiently solve the WiFi and multipath interference problems at the same time. In this article, we propose and evaluate an adaptive weighting radio tomographic imaging algorithm to improve the accuracy of radio frequency device-free localization in environments with multipath and varying intensities of WiFi interference. Field experiments show that our approach outperforms state-of-the-art radio frequency device-free localization systems in environments with multipath and WiFi interference.
Probability Weighted Clustered Coefficients Regression Models in Complex Survey Sampling
Regression analysis is commonly conducted in survey sampling. However,
existing methods fail when the relationships vary across different areas or
domains. In this paper, we propose a unified framework to study the group-wise
covariate effect under complex survey sampling based on pairwise penalties, and
the associated objective function is solved by the alternating direction method
of multipliers. Theoretical properties of the proposed method are investigated
under some generality conditions. Numerical experiments demonstrate the
superiority of the proposed method in terms of identifying groups and
estimation efficiency for both linear regression models and logistic regression
models. Comment: 35 pages, 2 figures
Multiple bias-calibration for adjusting selection bias of non-probability samples using data integration
Valid statistical inference is challenging when the sample is subject to
unknown selection bias. Data integration can be used to correct for selection
bias when we have a parallel probability sample from the same population with
some common measurements. How to model and estimate the selection probability
or the propensity score (PS) of a non-probability sample using an independent
probability sample is the challenging part of the data integration. We approach
this difficult problem by employing multiple candidate models for PS combined
with empirical likelihood. By incorporating multiple propensity score models
into the internal bias calibration constraint in the empirical likelihood
setup, the selection bias can be eliminated so long as the multiple candidate
models contain a true PS model. The bias calibration constraint under the
multiple PS models is called multiple bias calibration. Multiple PS models can
include both missing-at-random and missing-not-at-random models. Asymptotic
properties are discussed, and some limited simulation studies are presented to
compare the proposed method with some existing competitors. Plasmode simulation
studies using the Culture & Community in a Time of Crisis dataset demonstrate
the practical usage and advantages of the proposed method.
Sampling techniques for big data analysis in finite population inference
In analyzing big data for finite population inference, it is critical to
adjust for the selection bias in the big data. In this paper, we propose two
methods of reducing the selection bias associated with the big data sample. The
first method uses a version of inverse sampling by incorporating auxiliary
information from external sources, and the second one borrows the idea of data
integration by combining the big data sample with an independent probability
sample. Two simulation studies show that the proposed methods are unbiased and
have better coverage rates than their alternatives. In addition, the proposed
methods are easy to implement in practice. Comment: 24 pages, 3 tables
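The role of auxiliary information in reducing selection bias can be illustrated with a much simpler device than the paper's methods. The sketch below is a plain post-stratified mean, not the proposed inverse sampling or data integration estimator. It assumes that the population share of each stratum is known from an external source. The function name and the data layout are hypothetical.

```python
def poststratified_mean(sample, population_share):
    """Reweight a possibly biased sample so strata match known population shares.

    sample:           list of (stratum, y) pairs from the big data source
    population_share: dict mapping stratum -> known share of the population
    """
    totals, counts = {}, {}
    for stratum, y in sample:
        totals[stratum] = totals.get(stratum, 0.0) + y
        counts[stratum] = counts.get(stratum, 0) + 1
    # Average within each stratum, then combine the stratum means using the
    # population shares instead of the (biased) sample shares.
    return sum(population_share[s] * totals[s] / counts[s] for s in counts)
```

If a big data sample holds eight "young" units with value 10 and two "old" units with value 20, the naive mean is 12. With known population shares of 0.5 each, the post-stratified mean is 15, which removes the bias caused by over-representing the young stratum.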
- …