Partial Identifiability of Restricted Latent Class Models
Latent class models have wide applications in social and biological sciences.
In many applications, pre-specified restrictions are imposed on the parameter
space of latent class models, through a design matrix, to reflect
practitioners' assumptions about how the observed responses depend on subjects'
latent traits. Though widely used in various fields, such restricted latent
class models suffer from non-identifiability due to their discrete nature
and the complex structure of the restrictions. This work addresses the fundamental
identifiability issue of restricted latent class models by developing a general
framework for strict and partial identifiability of the model parameters. Under
correct model specification, the developed identifiability conditions depend
only on the design matrix and are easily checkable, providing useful
practical guidelines for designing statistically valid diagnostic tests.
Furthermore, the new theoretical framework is applied to establish, for the
first time, identifiability of several designs from cognitive diagnosis
applications.
Superconductivity in a two-dimensional superconductor with Rashba and Dresselhaus spin-orbit couplings
We present a general model with both Rashba and Dresselhaus spin-orbit
couplings to describe a two-dimensional noncentrosymmetric superconductor. The
combined effects of the two spin-orbit couplings on superconductivity are
investigated in the framework of mean-field theory. We find that the Rashba and
Dresselhaus spin-orbit couplings result in similar effects on superconductivity
when either one is present alone in the system. Mixing of spin-singlet and triplet
pairings in the electron band is induced under the assumption that each
quasiparticle band is p-wave paired. If the two types of spin-orbit couplings
appear jointly, both the singlet and triplet pairings are weakened and
decrease to their minimum values in the equal-Rashba-Dresselhaus case.
Comment: 5 pages, 4 figures
Learning Attribute Patterns in High-Dimensional Structured Latent Attribute Models
Structured latent attribute models (SLAMs) are a special family of discrete
latent variable models widely used in social and biological sciences. This
paper considers the problem of learning significant attribute patterns from a
SLAM with potentially high-dimensional configurations of the latent attributes.
We address the theoretical identifiability issue, propose a penalized
likelihood method for the selection of the attribute patterns, and further
establish the selection consistency in such an overfitted SLAM with a diverging
number of latent patterns. The good performance of the proposed methodology is
illustrated by simulation studies and two real datasets in educational
assessment.
The Sufficient and Necessary Condition for the Identifiability and Estimability of the DINA Model
Cognitive Diagnosis Models (CDMs) are useful statistical tools in cognitive
diagnosis assessment. However, like many other latent variable models, CDMs
often suffer from the non-identifiability issue. This work gives the sufficient
and necessary condition for the identifiability of the basic DINA model, which
not only addresses the open problem in Xu and Zhang (2016, Psychometrika,
81:625-649) on the minimal requirement for identifiability, but also sheds
light on the study of more general CDMs, which often cover the DINA as a
submodel. Moreover, we show that the identifiability condition ensures the
consistent estimation of the model parameters. From a practical perspective,
the identifiability condition depends only on the Q-matrix structure and is
easy to verify, which provides a guideline for designing statistically
valid and estimable cognitive diagnosis tests.
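To make the role of the Q-matrix concrete, the following is a minimal sketch of the standard DINA response probability, not the identifiability condition from the paper. The Q-matrix, slip values, and guess values below are hypothetical numbers chosen for illustration, and the function names are assumptions.

```python
from itertools import product


def ideal_response(alpha, q_row):
    """Return 1 if profile `alpha` has every attribute the item requires."""
    return int(all(a >= q for a, q in zip(alpha, q_row)))


def dina_prob(alpha, q_row, slip, guess):
    """P(correct response) for one item under the DINA model."""
    # Masters of all required attributes succeed unless they slip;
    # everyone else succeeds only by guessing.
    return 1.0 - slip if ideal_response(alpha, q_row) else guess


# Hypothetical design: 3 items (rows) measuring 2 binary attributes (columns).
Q = [[1, 0], [0, 1], [1, 1]]
slip = [0.1, 0.2, 0.1]    # assumed slipping parameters, one per item
guess = [0.2, 0.1, 0.3]   # assumed guessing parameters, one per item

# Response probabilities for each of the 2^2 latent attribute profiles.
table = {
    alpha: [dina_prob(alpha, Q[j], slip[j], guess[j]) for j in range(len(Q))]
    for alpha in product([0, 1], repeat=2)
}

for alpha, probs in table.items():
    print(alpha, probs)
```

Two attribute profiles can be told apart from data only if some item gives them different response probabilities, which is why conditions on the rows and columns of Q matter.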
Nonseparable Gaussian Stochastic Process: A Unified View and Computational Strategy
Gaussian stochastic process (GaSP) has been widely used as a prior over
functions due to its flexibility and tractability in modeling. However, the
computational cost of evaluating the likelihood is $O(n^3)$, where $n$ is the
number of observed points in the process, as it requires inverting the
covariance matrix. This bottleneck prevents GaSP from being widely used with
large-scale data. We propose a general class of nonseparable GaSP models for
multiple functional observations with a fast and exact algorithm, in which the
computation is linear ($O(n)$) and exact, requiring no approximation to compute
the likelihood. We show that the commonly used linear regression and separable
models are special cases of the proposed nonseparable GaSP model. In an
epigenetic application, the proposed nonseparable GaSP model
accurately predicts the genome-wide DNA methylation levels and compares
favorably to alternative methods, such as linear regression, random forests, and
the localized Kriging method.
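The likelihood bottleneck mentioned above can be sketched with a generic dense Gaussian process log-likelihood. This is a baseline, not the fast algorithm proposed in the paper. The zero mean, the squared-exponential kernel, and the names `length_scale` and `noise` are assumptions made for the example. The Cholesky factorization is the step whose cost grows cubically with the number of points.

```python
import math


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def gp_log_likelihood(xs, ys, length_scale=1.0, noise=1e-6):
    """Zero-mean GP log-likelihood with an assumed squared-exponential kernel."""
    n = len(xs)
    cov = [
        [
            math.exp(-0.5 * ((xs[i] - xs[j]) / length_scale) ** 2)
            + (noise if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    low = cholesky(cov)  # dominant cost: three nested loops over n
    # Forward substitution solves low @ z = ys, so z.z equals ys^T cov^{-1} ys.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (sum(v * v for v in z) + log_det + n * math.log(2.0 * math.pi))


print(gp_log_likelihood([0.0, 0.5, 1.0], [0.1, 0.4, 0.3], noise=1e-2))
```

Doubling the number of inputs in this dense version multiplies the factorization work by roughly eight, which is the scaling a linear-time exact method avoids.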
Constant Query Time -Approximate Distance Oracle for Planar Graphs
We give a -approximate distance oracle with query time
for an undirected planar graph with vertices and non-negative edge
lengths. For and any two vertices and in , our oracle
gives a distance with stretch in time.
The oracle has size and
pre-processing time , where
. This is the first -approximate
distance oracle with query time independent of and the size
and pre-processing time nearly linear in , and improves the query time
of previous -approximate distance oracle with
size nearly linear in
Near-Linear Time Constant-Factor Approximation Algorithm for Branch-Decomposition of Planar Graphs
We give an algorithm which for an input planar graph of vertices and
integer , in time either constructs a
branch-decomposition of with width at most , is a
constant, or a cylinder minor of
implying , is the branchwidth of . This is the first
time constant-factor approximation for branchwidth/treewidth and
largest grid/cylinder minors of planar graphs and improves the previous
( is a constant) time
constant-factor approximations. For a planar graph and , a
branch-decomposition of width at most and a
cylinder/grid minor with , is constant, can be
computed by our algorithm in time.
Comment: The main revision is in the algorithm part (Section 4):
added proofs for graphs with edge weights 1/2 and 1, and modified the proofs
for finding the minimum separating cycle.
On nonexistence and existence of positive global solutions to heat equation with a potential term on Riemannian manifolds
We reinvestigate the nonexistence and existence of global positive solutions to
the heat equation with a potential term on Riemannian manifolds. In particular, we
give a very natural sharp condition, stated only in terms of the volume of geodesic
balls, to obtain nonexistence results.
Comment: 25 pages
Stochastic Nested Variance Reduction for Nonconvex Optimization
We study finite-sum nonconvex optimization problems, where the objective
function is an average of nonconvex functions. We propose a new stochastic
gradient descent algorithm based on nested variance reduction. Compared with
the conventional stochastic variance reduced gradient (SVRG) algorithm, which uses
two reference points to construct a semi-stochastic gradient with diminishing
variance in each iteration, our algorithm uses nested reference points to
build a semi-stochastic gradient to further reduce its variance in each
iteration. For smooth nonconvex functions, the proposed algorithm converges to
an -approximate first-order stationary point (i.e., ) within number of stochastic
gradient evaluations. This improves the best known gradient complexity of SVRG
and that of SCSG . For gradient
dominated functions, our algorithm also achieves a better gradient complexity
than the state-of-the-art algorithms.
Comment: 28 pages, 2 figures, 1 table
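For orientation, here is a minimal sketch of the conventional SVRG estimator that the abstract uses as its baseline. It is not the nested algorithm proposed in the paper. The toy objective is a convex one-dimensional quadratic, chosen only so the answer can be checked by hand. The function name `svrg` and all numeric settings are assumptions for illustration.

```python
import random


def svrg(grads, x0, step, epochs, inner_steps, seed=0):
    """Plain SVRG on a finite sum; `grads` holds one gradient function per term."""
    rng = random.Random(seed)
    n = len(grads)
    x = x0
    for _ in range(epochs):
        snapshot = x  # reference point, refreshed once per epoch
        full_grad = sum(g(snapshot) for g in grads) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Semi-stochastic gradient: an unbiased estimate of the full
            # gradient whose variance shrinks as x approaches the snapshot.
            v = grads[i](x) - grads[i](snapshot) + full_grad
            x -= step * v
    return x


# Toy problem (assumed): f_i(x) = 0.5 * a_i * (x - b_i)^2.
weights = [1.0, 2.0, 3.0, 2.0]
targets = [1.0, 2.0, 3.0, 6.0]
grads = [lambda x, a=a, b=b: a * (x - b) for a, b in zip(weights, targets)]

x_hat = svrg(grads, x0=0.0, step=0.1, epochs=30, inner_steps=10)
x_star = sum(a * b for a, b in zip(weights, targets)) / sum(weights)
print(x_hat, x_star)
```

The correction term `grads[i](x) - grads[i](snapshot)` is what distinguishes this from plain stochastic gradient descent. A nested scheme extends the same idea by using more than one level of reference point.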
Control of the False Discovery Rate Under Arbitrary Covariance Dependence
Multiple hypothesis testing is a fundamental problem in high dimensional
inference, with wide applications in many scientific fields. In genome-wide
association studies, tens of thousands of tests are performed simultaneously to
find whether any genes are associated with certain traits, and those tests are
correlated. When test statistics are correlated, false discovery control
becomes very challenging under arbitrary dependence. In the current paper, we
propose a new methodology based on principal factor approximation, which
successfully subtracts the common dependence and significantly weakens the
correlation structure, to deal with an arbitrary dependence structure. We
derive the theoretical distribution for the false discovery proportion (FDP) in
large-scale multiple testing when a common threshold is used and provide a
consistent estimate of FDP. This result has important applications in controlling FDR and
FDP. Our estimate of FDP compares favorably with the approach of Efron (2007), as
demonstrated in the simulated examples. Our approach is further illustrated
by some real data applications.
Comment: 44 pages, 7 figures