5 research outputs found
Simultaneous Parameter Learning and Bi-Clustering for Multi-Response Models
We consider multi-response and multitask regression models, where the
parameter matrix to be estimated is expected to have an unknown grouping
structure. The groupings can be along tasks, or features, or both, the last one
indicating a bi-cluster or "checkerboard" structure. Discovering this grouping
structure along with parameter inference makes sense in several applications,
such as multi-response Genome-Wide Association Studies. This additional
structure can not only be leveraged for more accurate parameter estimation,
but it also provides valuable information on the underlying data mechanisms
(e.g. relationships among genotypes and phenotypes in GWAS). In this paper, we
propose two formulations to simultaneously learn the parameter matrix and its
group structures, based on convex regularization penalties. We present
optimization approaches to solve the resulting problems and provide numerical
convergence guarantees. Our approaches are validated on extensive simulations
and real datasets concerning phenotypes and genotypes of plant varieties. Comment: 15 pages, 15 figures
Modelling of temporal fluctuation scaling in online news network with independent cascade model
We show that the activity of online news outlets follows a temporal fluctuation
scaling law and we recover this feature using an independent cascade model
augmented with a varying hype parameter representing a viral potential of an
original article. We use the Event Registry platform to track activity of over
10,000 news outlets in 11 different topics in the course of the year 2016.
Analyzing over 22,000,000 articles, we found that fluctuation scaling exponents
depend on time window size in a characteristic way for all
the considered topics -- news outlets activities are partially synchronized for
with a cross-over for . The
proposed model was run on several synthetic network models as well as on a
network extracted from the real data. Our approach discards timestamps as not
fully reliable observables and focuses on co-occurrences of publishers in
cascades of similarly phrased news items. We make use of the Event Registry
news clustering feature to find correlations between content published by news
outlets in order to uncover common information propagation paths in published
articles and to estimate weights of edges in the independent cascade model.
While the independent cascade model follows the fluctuation scaling law with a
trivial exponent , we argue that besides the topology of the
underlying cooperation network a temporal clustering of articles with similar
hypes is necessary to qualitatively reproduce the fluctuation scaling observed
in the data.
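As a rough illustration of the mechanism described above, the following is a minimal sketch of an independent cascade run with a per-article hype value. It is not the authors' implementation. The function name `independent_cascade`, the dictionary-based graph layout, and the choice to let hype multiply every edge probability (capped at 1) are all assumptions made for this example.

```python
import random


def independent_cascade(graph, seed, hype=1.0, rng=None):
    """Run one independent cascade starting from `seed`.

    graph: dict mapping node -> {neighbor: base activation probability}.
    hype:  hypothetical multiplier applied to every edge probability,
           standing in for an article's viral potential (capped at 1).
    Returns the set of activated nodes.
    """
    rng = rng or random.Random()
    active = {seed}
    frontier = [seed]
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor, prob in graph.get(node, {}).items():
                if neighbor in active:
                    continue
                # A newly active node gets exactly one attempt per outgoing edge.
                if rng.random() < min(1.0, hype * prob):
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


# Toy publisher network: edge weights are made-up activation probabilities.
toy_graph = {"a": {"b": 1.0}, "b": {"c": 1.0}, "c": {}}
full_cascade = independent_cascade(toy_graph, "a", hype=1.0)
no_cascade = independent_cascade(toy_graph, "a", hype=0.0)
```

In this sketch a hype of 0 stops the cascade at the seed, while larger values make every edge more likely to fire; drawing a fresh hype value per article is one way cascade sizes could be made to vary over time.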
Learning Influence-Receptivity Network Structure with Guarantee
Traditional works on community detection from observations of information
cascade assume that a single adjacency matrix parametrizes all the observed
cascades. However, in reality the connection structure usually does not stay
the same across cascades. For example, different people have different topics
of interest, therefore the connection structure depends on the
information/topic content of the cascade. In this paper we consider the case
where we observe a sequence of noisy adjacency matrices triggered by
information/event with different topic distributions. We propose a novel latent
model using the intuition that a connection is more likely to exist between two
nodes if they are interested in similar topics, which are common with the
information/event. Specifically, we endow each node with two node-topic
vectors: an influence vector that measures how influential/authoritative they
are on each topic; and a receptivity vector that measures how
receptive/susceptible they are to each topic. We show how these two node-topic
structures can be estimated from observed adjacency matrices with theoretical
guarantees on estimation error, in cases where the topic distributions of the
information/event are known, as well as when they are unknown. Experiments on
synthetic and real data demonstrate the effectiveness of our model and superior
performance compared to state-of-the-art methods.
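The intuition behind the two node-topic vectors can be sketched in a few lines. The bilinear scoring rule below (influence times topic weight times receptivity, summed over topics) is an assumption chosen to illustrate the abstract's description, not a form confirmed by the source; the function name `connection_scores` and the toy vectors are hypothetical.

```python
def connection_scores(influence, receptivity, topic_weights):
    """Score every directed edge i -> j for one cascade.

    influence[i][t]:   how influential node i is on topic t.
    receptivity[j][t]: how receptive node j is to topic t.
    topic_weights[t]:  weight of topic t in the information/event.
    Assumed scoring rule: sum over topics of the three factors.
    """
    n = len(influence)
    topics = range(len(topic_weights))
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-loops in this sketch
            scores[i][j] = sum(
                influence[i][t] * topic_weights[t] * receptivity[j][t]
                for t in topics
            )
    return scores


# Node 0 is influential on topic 0; node 1 is receptive to topic 0.
influence = [[1.0, 0.0], [0.0, 0.0]]
receptivity = [[0.0, 0.0], [1.0, 0.0]]
on_topic = connection_scores(influence, receptivity, [1.0, 0.0])
off_topic = connection_scores(influence, receptivity, [0.0, 1.0])
```

Under this assumed rule the edge from node 0 to node 1 scores highly only when the cascade is about the topic the two nodes share, which matches the statement that the connection structure depends on the topic content of the cascade.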
Constrained High Dimensional Statistical Inference
In typical high dimensional statistical inference problems, confidence
intervals and hypothesis tests are performed for a low dimensional subset of
model parameters under the assumption that the parameters of interest are
unconstrained. However, in many problems, there are natural constraints on
model parameters and one is interested in whether the parameters are on the
boundary of the constraint or not, e.g., non-negativity constraints for
transmission rates in network diffusion. In this paper, we provide algorithms
to solve this problem of hypothesis testing in high-dimensional statistical
models under a constrained parameter space. We show that our testing
procedure asymptotically attains the designed Type I error under the null.
Numerical experiments demonstrate that our algorithm has greater power than the
standard algorithms where the constraints are ignored. We demonstrate the
effectiveness of our algorithms on two real datasets where we have
intrinsic constraints on the parameters.
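Why respecting a constraint can increase power is easiest to see in a textbook, low-dimensional setting. The sketch below is not the procedure from the paper; it assumes a single parameter with an asymptotically normal estimator and compares a one-sided test (which uses the knowledge that the parameter is non-negative) with a two-sided test (which ignores it). The function name `boundary_p_values` and the example numbers are hypothetical.

```python
from statistics import NormalDist


def boundary_p_values(estimate, std_error):
    """P-values for H0: theta = 0 given an approximately normal estimate.

    Returns (one_sided, two_sided):
      one_sided uses the constraint theta >= 0 (alternative theta > 0),
      two_sided ignores the constraint (alternative theta != 0).
    """
    z = estimate / std_error
    std_normal = NormalDist()
    one_sided = 1.0 - std_normal.cdf(z)
    two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    return one_sided, two_sided


# Illustrative numbers only: an estimate 1.8 standard errors above zero.
p_one, p_two = boundary_p_values(1.8, 1.0)
```

With these illustrative numbers the one-sided test rejects at the 5% level while the two-sided test does not, which is the kind of power gain the abstract attributes to taking the constraint into account.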
Tensor Canonical Correlation Analysis with Convergence and Statistical Guarantees
In many applications, such as classification of images or videos, it is of
interest to develop a framework for tensor data instead of an ad-hoc way of
transforming data to vectors due to the computational and under-sampling
issues. In this paper, we study convergence and statistical properties of
two-dimensional canonical correlation analysis (Lee, 2007) under an
assumption that data come from a probabilistic model. We show that, when carefully
initialized, the power method converges to the optimum and provide a finite
sample bound. Then we extend this framework to tensor-valued data and propose
the higher-order power method, which is commonly used in tensor decomposition,
to extract the canonical directions. Our method can be used effectively in a
large-scale data setting by solving the inner least squares problem with
stochastic gradient descent, and we justify convergence via the theory of
Łojasiewicz inequalities without any assumption on the data-generating process
and initialization. For practical applications, we further develop (a) an
inexact updating scheme which allows us to use the state-of-the-art stochastic
gradient descent algorithm, (b) an effective initialization scheme which
alleviates the problem of local optimum in non-convex optimization, and (c) a
deflation procedure for extracting several canonical components. Empirical
analyses on challenging data, including gene expression and air pollution
indexes in Taiwan, show the effectiveness and efficiency of the proposed
methodology. Our results fill a missing, but crucial, part in the literature on
tensor data.
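For readers unfamiliar with the power method mentioned above, here is a minimal sketch of the ordinary matrix version: repeated multiplication and renormalization. It is not the higher-order (tensor) variant or the inexact stochastic scheme from the paper; the function name `power_iteration` and the toy matrix are assumptions for illustration.

```python
import math


def power_iteration(matrix, start, iterations=100):
    """Estimate the leading eigenpair of a square matrix.

    The starting vector must not be orthogonal to the leading
    eigenvector, which is one reason initialization matters.
    Returns (eigenvalue estimate, unit-norm vector).
    """
    vec = list(start)
    for _ in range(iterations):
        # Multiply, then renormalize so the iterate stays on the unit sphere.
        product = [sum(a * x for a, x in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(p * p for p in product))
        vec = [p / norm for p in product]
    # Rayleigh quotient of the final unit vector.
    applied = [sum(a * x for a, x in zip(row, vec)) for row in matrix]
    value = sum(v * w for v, w in zip(vec, applied))
    return value, vec


# Toy symmetric matrix with eigenvalues 2 and 1.
leading_value, leading_vector = power_iteration([[2.0, 0.0], [0.0, 1.0]], [1.0, 1.0])
```

A deflation step, as mentioned in item (c), would remove the recovered component and rerun the iteration to extract the next one.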