Prediction Weighted Maximum Frequency Selection
Shrinkage estimators that possess the ability to produce sparse solutions
have become increasingly important to the analysis of today's complex datasets.
Examples include the LASSO, the Elastic-Net and their adaptive counterparts.
Estimation of penalty parameters still presents difficulties however. While
variable selection consistent procedures have been developed, their finite
sample performance can often be less than satisfactory. We develop a new
strategy for variable selection using the adaptive LASSO and adaptive
Elastic-Net estimators with a diverging number of predictors. The basic idea first involves
using the trace paths of their LARS solutions to bootstrap estimates of maximum
frequency (MF) models conditioned on dimension. Conditioning on dimension
effectively mitigates overfitting; however, to deal with underfitting, these MFs
are then prediction-weighted, and it is shown that not only can consistent
model selection be achieved, but that attractive convergence rates can as well,
leading to excellent finite sample performance. Detailed numerical studies are
carried out on both simulated and real datasets. Extensions to the class of
generalized linear models are also detailed.
Comment: This manuscript contains 41 pages and 14 figures
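The "maximum frequency models conditioned on dimension" step in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the bootstrap has already produced a list of selected variable sets (the hypothetical `selected` list below stands in for models read off LARS solution paths), and it omits the prediction-weighting step entirely. The function name `max_frequency_models` is also an assumption made for illustration.

```python
from collections import Counter, defaultdict


def max_frequency_models(bootstrap_models):
    """For each model size, return the most frequently selected model
    and its relative frequency among models of that size."""
    by_dim = defaultdict(Counter)
    for model in bootstrap_models:
        key = frozenset(model)
        by_dim[len(key)][key] += 1  # count models separately per dimension

    result = {}
    for dim, counts in by_dim.items():
        model, hits = counts.most_common(1)[0]
        result[dim] = (sorted(model), hits / sum(counts.values()))
    return result


# Hypothetical variable sets, standing in for models selected on bootstrap resamples.
selected = [{"x1"}, {"x1"}, {"x2"}, {"x1", "x2"}, {"x1", "x2"}, {"x1", "x3"}]
mf = max_frequency_models(selected)
```

Grouping by model size before counting is what "conditioning on dimension" amounts to in this sketch: a large model never competes directly with a small one at this stage.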
Spike and slab variable selection: Frequentist and Bayesian strategies
Variable selection in the linear regression model takes many apparent faces
from both frequentist and Bayesian standpoints. In this paper we introduce a
variable selection method referred to as a rescaled spike and slab model. We
study the importance of prior hierarchical specifications and draw connections
to frequentist generalized ridge regression estimation. Specifically, we study
the usefulness of continuous bimodal priors to model hypervariance parameters,
and the effect scaling has on the posterior mean through its relationship to
penalization. Several model selection strategies, some frequentist and some
Bayesian in nature, are developed and studied theoretically. We demonstrate the
importance of selective shrinkage for effective variable selection in terms of
risk misclassification, and show this is achieved using the posterior from a
rescaled spike and slab model. We also show how to verify a procedure's ability
to reduce model uncertainty in finite samples using a specialized forward
selection strategy. Using this tool, we illustrate the effectiveness of
rescaled spike and slab models in reducing model uncertainty.
Comment: Published at http://dx.doi.org/10.1214/009053604000001147 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
A Strategy for the Design of Flame Retardants: Cross-linking Processes
Cross-linking is identified as an effective means for flame retardation of polymers, and schemes for the cross-linking of poly(ethylene terephthalate) and poly(methyl methacrylate) are presented. For poly(ethylene terephthalate) the scheme involves polymerization of the initially produced vinyl ester. This is followed by chain-stripping, producing a polyene, and cyclization of this polyene. For poly(methyl methacrylate) the scheme entails the formation of anhydride linkages between adjacent polymer strands. Evidence is presented to show the efficacy of these processes and information is produced to aid in the identification of new flame retardants.
Groups which do not admit ghosts
A ghost in the stable module category of a group G is a map between
representations of G that is invisible to Tate cohomology. We show that the
only non-trivial finite p-groups whose stable module categories have no
non-trivial ghosts are the cyclic groups of order 2 and 3. We compare this to
the situation in the derived category of a commutative ring. We also determine
for which groups G the second power of the Jacobson radical of kG is stably
isomorphic to a suspension of k.
Comment: 9 pages, improved exposition and fixed several typos, to appear in
the Proceedings of the AMS
Boxicity of Series Parallel Graphs
The three well-known graph classes, planar graphs (P), series-parallel
graphs (SP) and outer planar graphs (OP) satisfy the following proper inclusion
relation: OP ⊂ SP ⊂ P. It is known that box(G) <= 3 if G belongs to P and
box(G) <= 2 if G belongs to OP. Thus it is interesting to decide whether the
maximum possible value of the boxicity of series-parallel graphs is 2 or 3. In
this paper we construct a series-parallel graph with boxicity 3, thus resolving
this question. Recently Chandran and Sivadasan showed that for any G, box(G) <=
treewidth(G)+2. They conjecture that for any k, there exists a k-tree with
boxicity k+1. (This would show that their upper bound is tight but for an
additive factor of 1, since the treewidth of any k-tree equals k.) The
series-parallel graph we construct in this paper is a 2-tree with boxicity 3
and is thus a first step towards proving their conjecture.
Comment: 10 pages, 0 figures
Hadwiger Number and the Cartesian Product Of Graphs
The Hadwiger number mr(G) of a graph G is the largest integer n for which the
complete graph K_n on n vertices is a minor of G. Hadwiger conjectured that for
every graph G, mr(G) >= chi(G), where chi(G) is the chromatic number of G. In
this paper, we study the Hadwiger number of the Cartesian product G [] H of
graphs.
As the main result of this paper, we prove that mr(G_1 [] G_2) >= h\sqrt{l}(1
- o(1)) for any two graphs G_1 and G_2 with mr(G_1) = h and mr(G_2) = l. We
show that the above lower bound is asymptotically best possible. This
asymptotically settles a question of Z. Miller (1978).
As consequences of our main result, we show the following:
1. Let G be a connected graph. Let the (unique) prime factorization of G be
given by G_1 [] G_2 [] ... [] G_k. Then G satisfies Hadwiger's conjecture if k
>= 2.log(log(chi(G))) + c', where c' is a constant. This improves the
2.log(chi(G))+3 bound of Chandran and Sivadasan.
2. Let G_1 and G_2 be two graphs such that chi(G_1) >= chi(G_2) >=
c.log^{1.5}(chi(G_1)), where c is a constant. Then G_1 [] G_2 satisfies
Hadwiger's conjecture.
3. Hadwiger's conjecture is true for G^d (Cartesian product of G taken d
times) for every graph G and every d >= 2. This settles a question by Chandran
and Sivadasan. (They had shown that Hadwiger's conjecture is true for G^d if
d >= 3.)
Comment: 10 pages, 2 figures, major update: lower and upper bound proofs have
been revised. The bounds are now asymptotically tight
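For readers unfamiliar with the product G [] H used in the abstract above, the following minimal sketch builds the Cartesian product of two small graphs using the standard definition: (u1, v1) and (u2, v2) are adjacent when u1 = u2 and v1 v2 is an edge of H, or v1 = v2 and u1 u2 is an edge of G. The function name and the vertex/edge-list representation are assumptions chosen for illustration. The sketch does not compute Hadwiger numbers.

```python
from itertools import product


def cartesian_product(vertices_g, edges_g, vertices_h, edges_h):
    """Return (vertices, edges) of the Cartesian product G [] H.

    Edges are stored as frozensets of two product vertices, so each
    undirected edge appears once.
    """
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    vertices = list(product(vertices_g, vertices_h))
    edges = set()
    for (u1, v1), (u2, v2) in product(vertices, repeat=2):
        same_g_vertex = u1 == u2 and frozenset((v1, v2)) in eh
        same_h_vertex = v1 == v2 and frozenset((u1, u2)) in eg
        if same_g_vertex or same_h_vertex:
            edges.add(frozenset(((u1, v1), (u2, v2))))
    return vertices, edges


# K_2 [] K_2 is the 4-cycle; K_2 [] K_3 is the triangular prism.
square_v, square_e = cartesian_product([0, 1], [(0, 1)], [0, 1], [(0, 1)])
prism_v, prism_e = cartesian_product(
    [0, 1], [(0, 1)], ["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")]
)
```

The edge count of the product is |V(G)||E(H)| + |V(H)||E(G)|, which gives a quick sanity check on the construction.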
The generating hypothesis for the stable module category of a p-group
Freyd's generating hypothesis, interpreted in the stable module category of a
finite p-group G, is the statement that a map between finite-dimensional
kG-modules factors through a projective if the induced map on Tate cohomology
is trivial. We show that Freyd's generating hypothesis holds for a non-trivial
finite p-group G if and only if G is either C_2 or C_3. We also give various
conditions which are equivalent to the generating hypothesis.
Comment: 6 pages, fixed minor typos, to appear in J. Algebra
Fence methods for mixed model selection
Many model search strategies involve trading off model fit with model
complexity in a penalized goodness of fit measure. Asymptotic properties for
these types of procedures in settings like linear regression and ARMA time
series have been studied, but these do not naturally extend to nonstandard
situations such as mixed effects models, where simple definition of the sample
size is not meaningful. This paper introduces a new class of strategies, known
as fence methods, for mixed model selection, which includes linear and
generalized linear mixed models. The idea involves a procedure to isolate a
subgroup of what are known as correct models (of which the optimal model is a
member). This is accomplished by constructing a statistical fence, or barrier,
to carefully eliminate incorrect models. Once the fence is constructed, the
optimal model is selected from among those within the fence according to a
criterion which can be made flexible. In addition, we propose two variations of
the fence. The first is a stepwise procedure to handle situations of many
predictors; the second is an adaptive approach for choosing a tuning constant.
We give sufficient conditions for consistency of fence and its variations, a
desirable property for a good model selection procedure. The methods are
illustrated through simulation studies and real data analysis.
Comment: Published at http://dx.doi.org/10.1214/07-AOS517 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
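The two-stage idea in the abstract above (build a fence to screen out incorrect models, then pick the optimal model among those inside it) can be sketched roughly as follows. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the fence is taken to be `q <= q_best + c * sigma` for a lack-of-fit value `q`, a tuning constant `c` and a single spread term `sigma`, and the selection criterion inside the fence is taken to be smallest dimension with ties broken by fit.

```python
def fence_select(candidates, c, sigma):
    """Pick a model from (name, lack_of_fit, dimension) tuples.

    Stage 1: keep models whose lack of fit is within c * sigma of the best fit.
    Stage 2: among those, choose the smallest dimension (ties broken by fit).
    The inequality and the criterion are illustrative assumptions.
    """
    q_best = min(q for _, q, _ in candidates)
    inside = [m for m in candidates if m[1] <= q_best + c * sigma]
    return min(inside, key=lambda m: (m[2], m[1]))


# Hypothetical candidates: (name, lack-of-fit measure, number of parameters).
candidates = [("full", 10.0, 5), ("medium", 10.4, 3), ("small", 14.0, 1)]
chosen = fence_select(candidates, c=1.0, sigma=0.5)
loose = fence_select(candidates, c=10.0, sigma=0.5)
```

With the narrower fence, the poorly fitting one-parameter model is excluded and the three-parameter model is chosen; widening the fence lets the smallest model through, which is why the choice of tuning constant matters.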