GRID for model structure discovering in high dimensional regression
Given a nonparametric regression model, we assume that the number of
covariates d → ∞ but only some of these covariates are relevant for the model. Our goal
is to identify the relevant covariates and to obtain some information about the structure of
the model. We propose a new nonparametric procedure, called GRID, having the following
features: (a) it automatically identifies the relevant covariates of the regression model, also
distinguishing the nonlinear from the linear ones (a covariate is defined linear/nonlinear
depending on the marginal relation between the response variable and such a covariate);
(b) the interactions between the covariates (mixed effect terms) are automatically identified,
without the necessity of considering some kind of stepwise selection method. In
particular, our procedure can identify the mixed terms of any order (two way, three way,
...) without increasing the computational complexity of the algorithm; (c) it is completely
data-driven, and hence easy to implement for the analysis of real datasets. In particular,
it does not depend on the selection of crucial regularization parameters, nor does it require the
estimation of the nuisance parameter σ² (self scaling). The acronym GRID has a twofold
meaning: first, it derives from Gradient Relevant Identification Derivatives, meaning that
the procedure is based on testing the significance of a partial derivative estimator; second,
it refers to a graphical tool which can help in representing the identified structure of the
regression model. The properties of the GRID procedure are investigated theoretically
On Inconsistency of the Jackknife-after-Bootstrap Bias Estimator for Dependent Data
B. Efron introduced jackknife-after-bootstrap as a computationally efficient method for estimating standard errors of bootstrap estimators. In a recent paper, consistency of the jackknife-after-bootstrap variance estimators has been established for different bootstrap quantities for independent and dependent data. In this paper, it is shown that in the dependent case, the standard jackknife-after-bootstrap estimator for the bias of block bootstrap quantities is inconsistent for almost any sensible choice of the blocking parameters. Some alternative bias estimators are proposed and shown to be consistent. Keywords: jackknife, block bootstrap, consistency, weak dependence
On Edgeworth Expansion and Moving Block Bootstrap for Studentized M-Estimators in Multiple Linear Regression Models
This paper considers the multiple linear regression model Y_i = x_i′β + ε_i, i = 1, ..., n, where the x_i's are known p × 1 vectors, β is a p × 1 vector of parameters, and ε_1, ε_2, ... are stationary, strongly mixing random variables. Let β_n denote an M-estimator of β corresponding to some score function ψ. Under some conditions on ψ, the x_i's and the ε_i's, a two-term Edgeworth expansion for the Studentized multivariate M-estimator is proved. Furthermore, it is shown that the moving block bootstrap is second-order correct for some suitable bootstrap analog of the Studentized β_n. Keywords: Edgeworth expansion, moving block bootstrap, M-estimators, multiple linear regression, stationarity, strong mixing, Studentization
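As an illustration of the resampling scheme named in this abstract, the following is a minimal sketch of a generic moving block bootstrap draw for a one-dimensional series. It is not the paper's Studentized regression procedure; the function name `moving_block_bootstrap` and its parameters are assumptions made for this example.

```python
import random


def moving_block_bootstrap(series, block_len, rng=None):
    """Return one moving-block-bootstrap resample of `series` (hypothetical helper).

    Overlapping blocks of length `block_len` are drawn with replacement and
    concatenated until the resample has the same length as the input.
    """
    rng = rng or random.Random()
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    # Overlapping blocks can start at indices 0 .. n - block_len.
    n_starts = n - block_len + 1
    resample = []
    while len(resample) < n:
        start = rng.randrange(n_starts)
        resample.extend(series[start:start + block_len])
    # The last block may overshoot, so trim to the original length.
    return resample[:n]


if __name__ == "__main__":
    data = list(range(10))
    print(moving_block_bootstrap(data, block_len=3, rng=random.Random(1)))
```

Keeping whole blocks together is what lets the resample retain short-range dependence; the block length is a tuning choice that this sketch leaves to the caller.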
Second order optimality of stationary bootstrap
This paper proves the second order correctness of the stationary bootstrap procedure for the normalized multivariate sample mean of weakly dependent observations. Similar results are shown to hold also for more general vector valued statistics based on sample means. Keywords: bootstrap, Edgeworth expansion, stationarity, weak dependence, normalized sample mean
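For readers unfamiliar with the stationary bootstrap, here is a minimal sketch of how one resample is commonly generated: blocks start at uniformly random positions, have geometrically distributed lengths, and wrap around the end of the series. The function name `stationary_bootstrap` and the parameter `p` (the probability of starting a new block at each step, so the mean block length is 1/p) are assumptions for this illustration; the abstract itself does not specify an implementation.

```python
import random


def stationary_bootstrap(series, p, rng=None):
    """Return one stationary-bootstrap resample of `series` (hypothetical helper).

    At each step a new block starts with probability `p`; otherwise the
    current block continues with the next observation, wrapping circularly.
    """
    rng = rng or random.Random()
    n = len(series)
    if n == 0:
        raise ValueError("series must be non-empty")
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    idx = rng.randrange(n)  # uniformly random start of the first block
    resample = []
    for _ in range(n):
        resample.append(series[idx])
        if rng.random() < p:
            idx = rng.randrange(n)  # start a new block at a random position
        else:
            idx = (idx + 1) % n  # continue the block, wrapping around
    return resample


if __name__ == "__main__":
    data = list(range(10))
    print(stationary_bootstrap(data, p=0.25, rng=random.Random(1)))
```

With `p = 1` every observation is drawn independently, which reduces the scheme to the ordinary i.i.d. bootstrap; smaller `p` gives longer blocks on average.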
Bootstrapping weighted empirical processes that do not converge weakly
We show that the bootstrap method provides valid approximations to the sampling distribution of a weighted empirical process on D[0,1] even in the cases where it fails to converge weakly. Furthermore, the result is applied to construct valid bootstrap confidence sets in such pathological cases. Keywords: weighted empirical process, bootstrap, weak convergence, confidence sets
In silico approach of receptor-ligand binding and interaction: Established phytoligands from Tagetes errecta Linn. against bacterial β-glucosidase receptor
The medicinal plant Tagetes errecta Linn. is a common ornamental plant whose leaves contain phytochemicals (volatile oil) that inhibit the growth of bacteria and fungi and are known natural antimicrobial agents. The objective of the present study was to determine, through molecular docking, the receptor-ligand binding energy and interaction of phytoligands established in the leaves of T. errecta against the β-glucosidase receptor (PDB ID: 3AHZ). Molecular docking was performed using PyRx (Version 0.8) for structure-based virtual screening, and the interactions were visualized in the Molecular Graphics Laboratory (MGL) tool (Version 1.5.6). Among the 25 phytochemicals, the binding energy value was highest for Bicyclogermacrene (-6.4 kcal/mol) and lowest for Octanol (-4.4 kcal/mol); the 2 synthetic compounds, Carbendazim and 2-Amino-2-hydroxymethyl-propane-1,3-diol, showed -6.7 kcal/mol and -3.5 kcal/mol, respectively. None of these showed hydrogen bonding. The phytocompound was found to bind at the mouth of the active site of the target protein and may therefore be treated as a competitive inhibitor. In conclusion, the phytocompound Bicyclogermacrene can be an alternative to the synthetic fungicide on the basis of binding energy value and interaction. Further pharmacological and toxicological assays with this phytocompound are suggested after its isolation from the ornamental plant (T. errecta).
GRID: A Variable selection and structure discovery method for high dimensional nonparametric regression
We consider nonparametric regression in high dimensions where
only a relatively small subset of a large number of variables are relevant
and may have nonlinear effects on the response. We develop
methods for variable selection, structure discovery and estimation of
the true low-dimensional regression function, allowing any degree of
interactions among the relevant variables that need not be specified
a priori. The proposed method, called the GRID, combines empirical
likelihood based marginal testing with the local linear estimation
machinery in a novel way to select the relevant variables. Further, it
provides a simple graphical tool for identifying the low dimensional
nonlinear structure of the regression function. Theoretical results establish
consistency of variable selection and structure discovery, and
also the oracle risk property of the GRID estimator of the regression
function, allowing the dimension d of the covariates to grow with the
sample size n at the rate for any and the
number of relevant covariates r to grow at a rate for some under some regularity conditions that, in particular, require finiteness of certain absolute moments of the error variables depending on . Finite sample properties of the GRID are investigated in a
moderately large simulation study
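
The local linear estimation machinery mentioned in this abstract can be illustrated, in one dimension only, by the sketch below. It fits a kernel-weighted least-squares line around a point x0 and returns both the fitted level and the slope, the latter being an estimate of the derivative at x0. This is not the GRID procedure itself: the function name `local_linear_fit`, the Gaussian kernel and the fixed bandwidth are assumptions chosen for this example.

```python
import math


def local_linear_fit(x, y, x0, bandwidth):
    """Return (level, slope) of the kernel-weighted line fitted at x0.

    Hypothetical helper: minimizes sum_i w_i * (y_i - a - b * (x_i - x0))**2
    with Gaussian kernel weights w_i, solving the 2x2 normal equations directly.
    """
    w = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    # Weighted moments of the centred covariate and of the response.
    s0 = sum(w)
    s1 = sum(wi * (xi - x0) for wi, xi in zip(w, x))
    s2 = sum(wi * (xi - x0) ** 2 for wi, xi in zip(w, x))
    t0 = sum(wi * yi for wi, yi in zip(w, y))
    t1 = sum(wi * (xi - x0) * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * s2 - s1 ** 2
    if det == 0:
        raise ValueError("degenerate design: cannot fit a local line at x0")
    level = (s2 * t0 - s1 * t1) / det  # estimate of the regression function at x0
    slope = (s0 * t1 - s1 * t0) / det  # estimate of its derivative at x0
    return level, slope


if __name__ == "__main__":
    xs = [i / 10 for i in range(21)]
    ys = [math.sin(xi) for xi in xs]
    print(local_linear_fit(xs, ys, x0=1.0, bandwidth=0.3))
```

A slope estimate close to zero at every evaluation point would suggest that the covariate is irrelevant, which is the intuition behind testing derivative estimates; how GRID formalizes that test is described in the paper, not in this sketch.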