Decomposable Principal Component Analysis
We consider principal component analysis (PCA) in decomposable Gaussian
graphical models. We exploit the prior information in these models in order to
distribute its computation. For this purpose, we reformulate the problem in the
sparse inverse covariance (concentration) domain and solve the global
eigenvalue problem using a sequence of local eigenvalue problems in each of the
cliques of the decomposable graph. We demonstrate the application of our
methodology in the context of decentralized anomaly detection in the Abilene
backbone network. Based on the topology of the network, we propose an
approximate statistical graphical model and distribute the computation of PCA.
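The concentration-domain reformulation rests on a basic spectral fact: a covariance matrix Sigma and its inverse (concentration matrix) K share eigenvectors, with eigenvalues inverted, so the leading principal component of Sigma is the eigenvector of the smallest eigenvalue of K. A minimal NumPy check of that fact (not the paper's distributed clique-by-clique algorithm, which additionally exploits the decomposable graph structure):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random positive-definite covariance matrix Sigma.
A = rng.standard_normal((5, 5))
Sigma = A @ A.T + 5 * np.eye(5)

# Concentration (inverse covariance) matrix.
K = np.linalg.inv(Sigma)

# Principal component of Sigma: eigenvector of the LARGEST eigenvalue.
w_cov, V_cov = np.linalg.eigh(Sigma)   # eigh returns ascending eigenvalues
pc = V_cov[:, -1]

# In the concentration domain the same direction is the eigenvector of
# the SMALLEST eigenvalue of K, since eig(K) inverts eig(Sigma).
w_con, V_con = np.linalg.eigh(K)
pc_from_K = V_con[:, 0]

# Same direction up to sign.
print(abs(pc @ pc_from_K))  # ~1.0
```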
Reduced-Dimension Linear Transform Coding of Correlated Signals in Networks
A model, called the linear transform network (LTN), is proposed to analyze
the compression and estimation of correlated signals transmitted over directed
acyclic graphs (DAGs). An LTN is a DAG network with multiple source and
receiver nodes. Source nodes transmit subspace projections of random correlated
signals by applying reduced-dimension linear transforms. The subspace
projections are linearly processed by multiple relays and routed to intended
receivers. Each receiver applies a linear estimator to approximate a subset of
the sources with minimum mean squared error (MSE) distortion. The model is
extended to include noisy networks with power constraints on transmitters. A
key task is to compute all local compression matrices and linear estimators in
the network to minimize end-to-end distortion. The non-convex problem is solved
iteratively within an optimization framework using constrained quadratic
programs (QPs). The proposed algorithm recovers as special cases the regular
and distributed Karhunen-Loeve transforms (KLTs). Cut-set lower bounds on the
distortion region of multi-source, multi-receiver networks are given for linear
coding based on convex relaxations. Cut-set lower bounds are also given for any
coding strategy based on information theory. The distortion region and
compression-estimation tradeoffs are illustrated for different communication
demands (e.g., multiple unicast) and graph structures.
Comment: 33 pages, 7 figures. To appear in IEEE Transactions on Signal Processing.
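The regular-KLT special case can be sketched directly: compress with the top-k eigenvectors of the source covariance, decode with the linear MMSE estimator (which, for this choice of transform, is simply the transpose), and the end-to-end MSE distortion equals the sum of the discarded eigenvalues. A minimal single-source, noiseless sketch only; the paper's network setting with relays, power constraints, and iterated QPs is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(1)

# Correlated Gaussian source with known covariance Sigma (assumed setup).
n, k, num = 6, 2, 200_000
A = rng.standard_normal((n, n))
Sigma = A @ A.T
X = rng.multivariate_normal(np.zeros(n), Sigma, size=num)  # samples as rows

# Reduced-dimension encoder: project onto the top-k eigenvectors (KLT).
w, V = np.linalg.eigh(Sigma)   # ascending eigenvalues
T = V[:, -k:].T                # k x n compression matrix
Y = X @ T.T                    # k-dimensional descriptions

# Linear MMSE decoder: for the KLT it reduces to the transpose of T.
X_hat = Y @ T

# End-to-end MSE distortion matches the sum of discarded eigenvalues.
mse = np.mean(np.sum((X - X_hat) ** 2, axis=1))
print(mse, w[:-k].sum())       # empirical vs. theoretical distortion
```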
Multivariate Generalized Gaussian Distribution: Convexity and Graphical Models
We consider covariance estimation in the multivariate generalized Gaussian
distribution (MGGD) and elliptically symmetric (ES) distribution. The maximum
likelihood optimization associated with this problem is non-convex, yet it has
been proved that its global solution can be often computed via simple fixed
point iterations. Our first contribution is a new analysis of this likelihood
based on geodesic convexity that requires weaker assumptions. Our second
contribution is a generalized framework for structured covariance estimation
under sparsity constraints. We show that the optimizations can be formulated as
convex minimization as long as the MGGD shape parameter is larger than one half and
the sparsity pattern is chordal. These include, for example, maximum likelihood
estimation of banded inverse covariances in multivariate Laplace distributions,
which are associated with time-varying autoregressive processes.
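The simple fixed-point iterations mentioned above can be illustrated with Tyler's M-estimator of scatter, a standard member of this family for elliptically distributed data (the exact MGGD likelihood weights depend on the shape parameter and are not reproduced here); a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

# Heavy-tailed elliptical samples: Gaussians with random per-sample scale.
p, n = 4, 5000
A = rng.standard_normal((p, p))
true_shape = A @ A.T
X = rng.multivariate_normal(np.zeros(p), true_shape, size=n)
X *= rng.gamma(1.0, 1.0, size=(n, 1))          # random magnitudes

# Tyler-style fixed-point iteration for the scatter (shape) matrix:
#   S <- (p/n) * sum_i x_i x_i^T / (x_i^T S^{-1} x_i), trace-normalized.
S = np.eye(p)
for _ in range(100):
    d = np.einsum("ij,jk,ik->i", X, np.linalg.inv(S), X)  # Mahalanobis terms
    S_new = (p / n) * (X / d[:, None]).T @ X
    S_new *= p / np.trace(S_new)                          # fix the scale
    converged = np.linalg.norm(S_new - S) < 1e-10
    S = S_new
    if converged:
        break

# The estimate recovers the shape of true_shape up to scale.
target = true_shape * p / np.trace(true_shape)
print(np.linalg.norm(S - target) / np.linalg.norm(target))  # small residual
```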
Principal component analysis in decomposable Gaussian graphical models
We consider principal component analysis (PCA) in decomposable Gaussian graphical models. We exploit the prior information in these models in order to distribute its computation. For this purpose, we reformulate the problem in the sparse inverse covariance (concentration) domain and solve the global eigenvalue problem using a sequence of local eigenvalue problems in each of the cliques of the decomposable graph. We demonstrate the application of our methodology in the context of decentralized anomaly detection in the Abilene backbone network. Based on the topology of the network, we propose an approximate statistical graphical model and distribute the computation of PCA. Index Terms — Principal component analysis, graphical models, distributed data mining.
USE OF PRINCIPAL COMPONENT ANALYSIS FOR VARIABLE SELECTION IN CLASSIFICATION OVER DATABASES CONTAMINATED BY WHITE NOISE
Model-induction techniques can be used to discover knowledge in databases; however, the sample-complexity requirement can make it infeasible to obtain reliable results. One way to reduce the sample-complexity demands is to select a subset of the variables. This work evaluates how independence assertions about the variables of the application domain affect the performance of the B2 and B4 methods, which are based on Principal Component Analysis, for variable selection in the induction of Artificial Neural Networks. The difference in classifier performance given the presence or absence of independence information was determined in experiments on synthetic and agricultural databases.
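A sketch of one common reading of B4-style PCA variable selection (assumed here: for each leading principal component, keep the variable with the largest absolute loading; the exact B2/B4 variants evaluated in the paper may differ):

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic data: 6 noisy variables, two of which carry the signal
# (a stand-in for the white-noise-contaminated databases in the text).
n = 1000
signal = rng.standard_normal((n, 2))
X = rng.standard_normal((n, 6)) * 0.1       # white-noise background
X[:, 0] += signal[:, 0]
X[:, 3] += 1.5 * signal[:, 1]

# B4-style selection: for each of the top-k principal components,
# keep the variable with the largest absolute loading.
k = 2
Xc = X - X.mean(axis=0)
w, V = np.linalg.eigh(np.cov(Xc.T))         # ascending eigenvalues
top = V[:, -k:]                             # loadings of the k leading PCs
selected = sorted({int(np.argmax(np.abs(top[:, j]))) for j in range(k)})
print(selected)  # should recover the signal-bearing variables 0 and 3
```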