Bayesian Nonparametric Hidden Semi-Markov Models

Author: Johnson Matthew J.
Willsky Alan S.
Publication date: 01/09/2012

There is much interest in the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) as a natural Bayesian nonparametric extension of the ubiquitous Hidden Markov Model for learning from sequential and time-series data. However, in many settings the HDP-HMM's strict Markovian constraints are undesirable, particularly if we wish to learn or encode non-geometric state durations. We can extend the HDP-HMM to capture such structure by drawing upon explicit-duration semi-Markovianity, which has been developed mainly in the parametric frequentist setting, to allow construction of highly interpretable models that admit natural prior information on state durations. In this paper we introduce the explicit-duration Hierarchical Dirichlet Process Hidden semi-Markov Model (HDP-HSMM) and develop sampling algorithms for efficient posterior inference. These algorithms also provide new methods for sampling inference in the finite Bayesian HSMM. Our modular Gibbs sampling methods can be embedded in samplers for larger hierarchical Bayesian models, adding semi-Markov chain modeling as another tool in the Bayesian inference toolbox. We demonstrate the utility of the HDP-HSMM and our inference methods on both synthetic and real experiments.
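To illustrate what "explicit-duration" means in the abstract above, here is a minimal sketch of sampling a state sequence from a finite explicit-duration semi-Markov chain. It is not the authors' sampler or the HDP-HSMM itself: the function names, the dictionary-based parameterization, and the example distributions are all hypothetical. In an ordinary HMM, the time spent in a state is geometric because the chain decides at every step whether to stay; here each visit instead draws its length from an arbitrary per-state duration distribution, and self-transitions are excluded.

```python
import random


def sample_hsmm_states(T, trans, durations, rng, start=0):
    """Sample T state labels from an explicit-duration semi-Markov chain.

    trans[i]     : dict mapping next state -> probability (no self-transition)
    durations[i] : dict mapping duration (>= 1) -> probability
    All names and the parameter layout are illustrative assumptions.
    """
    states = []
    state = start
    while len(states) < T:
        # Draw how long we stay in the current state (any distribution is allowed).
        d_vals, d_probs = zip(*durations[state].items())
        d = rng.choices(d_vals, weights=d_probs)[0]
        states.extend([state] * d)
        # Then move to a different state.
        n_vals, n_probs = zip(*trans[state].items())
        state = rng.choices(n_vals, weights=n_probs)[0]
    return states[:T]  # the final segment may be truncated at T


def run_lengths(states):
    """Collapse a label sequence into (state, run length) pairs."""
    runs = []
    for s in states:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [(s, n) for s, n in runs]
```

For example, with two alternating states where state 0 lasts 3 or 4 steps and state 1 always lasts 7 steps, every complete run in the sampled sequence has one of those lengths, which no geometric duration law could reproduce.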

arXiv.org e-Print Archive

DSpace@MIT

Dictionary-based Tensor Canonical Polyadic Decomposition

Author: Cohen Jérémy E.
Gillis Nicolas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/11/2017

To ensure interpretability of extracted sources in tensor decomposition, we introduce in this paper a dictionary-based tensor canonical polyadic decomposition which constrains one factor to belong exactly to a known dictionary. A new formulation of sparse coding is proposed which enables dictionary-based canonical polyadic decomposition of high-dimensional tensors. The benefits of using a dictionary in tensor decomposition models are explored both in terms of parameter identifiability and estimation accuracy. The performance of the proposed algorithms is evaluated on the decomposition of simulated data and the unmixing of hyperspectral images.

arXiv.org e-Print Archive

On the Distributed Complexity of Large-Scale Graph Computations

Author: Pandurangan Gopal
Robinson Peter
Scquizzato Michele
Publication date: 01/01/2018

Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where $k \geq 2$ machines jointly perform computations on graphs with $n$ nodes (typically, $n \gg k$). The input graph is assumed to be initially randomly partitioned among the $k$ machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation. Our main contribution is the General Lower Bound Theorem, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations. The General Lower Bound Theorem is established via an information-theoretic approach that relates the round complexity to the minimal amount of information required by machines to solve the problem. Our approach is generic and this theorem can be used in a "cookbook" fashion to show distributed lower bounds in the context of several problems, including non-graph problems. We present two applications by showing (almost) tight lower bounds for the round complexity of two fundamental graph problems, namely PageRank computation and triangle enumeration. Our approach, as demonstrated in the case of PageRank, can yield tight lower bounds for problems (including, and especially, under a stochastic partition of the input) where communication complexity techniques are not obvious. Our approach, as demonstrated in the case of triangle enumeration, can yield stronger round lower bounds as well as message-round tradeoffs compared to approaches that use communication complexity techniques.
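The input model in the abstract above can be pictured with a small sketch. The abstract only says the graph is "randomly partitioned" among the machines; the code below assumes one common realization, a random vertex partition, in which each vertex is assigned to a uniformly random home machine and that machine learns all edges incident to the vertex. The function name and data layout are hypothetical.

```python
import random


def random_vertex_partition(n, edges, k, rng):
    """Assign each of n vertices to one of k machines uniformly at random.

    Returns (home, local_edges) where home[v] is the machine holding vertex v
    and local_edges[i] is the set of edges known to machine i, i.e. those
    with at least one endpoint homed there. This is an assumed model, not
    necessarily the exact one used in the paper.
    """
    home = [rng.randrange(k) for _ in range(n)]
    local_edges = [set() for _ in range(k)]
    for u, v in edges:
        local_edges[home[u]].add((u, v))
        local_edges[home[v]].add((u, v))
    return home, local_edges
```

Under this assumption an edge is known to at most two machines, so any computation that needs information about edges held elsewhere must pay for it in communication rounds, which is the quantity the lower bounds are about.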

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Padova

Spectral Unmixing with Multiple Dictionaries

Author: Cohen Jeremy E.
Gillis Nicolas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/11/2017

Spectral unmixing aims at recovering the spectral signatures of materials, called endmembers, mixed in a hyperspectral or multispectral image, along with their abundances. A typical assumption is that the image contains one pure pixel per endmember, in which case spectral unmixing reduces to identifying these pixels. Many fully automated methods have been proposed in recent years, but little work has been done to allow users to select areas where pure pixels are present manually or using a segmentation algorithm. Additionally, in a non-blind approach, several spectral libraries may be available rather than a single one, with a fixed number (or an upper or lower bound) of endmembers to choose from each. In this paper, we propose a multiple-dictionary constrained low-rank matrix approximation model that addresses these two problems. We propose an algorithm to compute this model, dubbed M2PALS, and its performance is discussed on both synthetic and real hyperspectral images.
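The pure-pixel assumption mentioned in the abstract above is often written as a constrained low-rank model. The formulation below is a standard textbook form given for orientation only, not the multiple-dictionary model of the paper; the symbols $X$ (pixels as columns), $W$ (endmember signatures), $H$ (abundances), $r$ (number of endmembers) and $\mathcal{K}$ (indices of the pure pixels) are assumptions introduced here.

```latex
X \approx W H, \qquad H \ge 0, \qquad
W = X(:, \mathcal{K}) \quad \text{for some index set } \mathcal{K} \text{ with } |\mathcal{K}| = r .
```

Read this way, unmixing under the pure-pixel assumption amounts to choosing the $r$ columns of $X$ that serve as endmembers, which is why the abstract says the problem reduces to identifying these pixels.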

arXiv.org e-Print Archive

Real-time adaptive aircraft scheduling

Author: Kolitz Stephan E.
Terrab Mostafa

One of the most important functions of any air traffic management system is the assignment of ground-holding times to flights, i.e., the determination of whether and by how much the take-off of a particular aircraft headed for a congested part of the air traffic control (ATC) system should be postponed in order to reduce the likelihood and extent of airborne delays. An analysis is presented for the fundamental case in which flights from many destinations must be scheduled for arrival at a single congested airport; the formulation is also useful in scheduling the landing of airborne flights within the extended terminal area. A set of approaches is described for addressing a deterministic and a probabilistic version of this problem. For the deterministic case, where airport capacities are known and fixed, several models were developed with associated low-order polynomial-time algorithms. For general delay cost functions, these algorithms find an optimal solution. Under a particular natural assumption regarding the delay cost function, an extremely fast (O(n ln n)) algorithm was developed. For the probabilistic case, using an estimated probability distribution of airport capacities, a model was developed with an associated low-order polynomial-time heuristic algorithm with useful properties.

NASA Technical Reports Server

Linear Time Parameterized Algorithms via Skew-Symmetric Multicuts

Author: Ramanujan M. S.
Saurabh Saket
Publication date: 28/04/2013

A skew-symmetric graph $(D=(V,A),\sigma)$ is a directed graph $D$ with an involution $\sigma$ on the set of vertices and arcs. In this paper, we introduce a separation problem, $d$-Skew-Symmetric Multicut, where we are given a skew-symmetric graph $D$, a family $\mathcal{T}$ of $d$-sized subsets of vertices and an integer $k$. The objective is to decide if there is a set $X\subseteq A$ of $k$ arcs such that every set $J$ in the family has a vertex $v$ such that $v$ and $\sigma(v)$ are in different connected components of $D'=(V,A\setminus (X\cup \sigma(X)))$. In this paper, we give an algorithm for this problem which runs in time $O((4d)^{k}(m+n+\ell))$, where $m$ is the number of arcs in the graph, $n$ the number of vertices and $\ell$ the length of the family given in the input. Using our algorithm, we show that Almost 2-SAT has an algorithm with running time $O(4^k k^4 \ell)$ and we obtain algorithms for Odd Cycle Transversal and Edge Bipartization which run in time $O(4^k k^4 (m+n))$ and $O(4^k k^5 (m+n))$ respectively. This resolves an open problem posed by Reed, Smith and Vetta [Operations Research Letters, 2003] and improves upon the earlier almost linear time algorithm of Kawarabayashi and Reed [SODA, 2010]. We also show that Deletion q-Horn Backdoor Set Detection is a special case of 3-Skew-Symmetric Multicut, giving us an algorithm for Deletion q-Horn Backdoor Set Detection which runs in time $O(12^k k^5 \ell)$. This gives the first fixed-parameter tractable algorithm for this problem answering a question posed in a paper by a superset of the authors [STACS, 2013]. Using this result, we get an algorithm for Satisfiability which runs in time $O(12^k k^5 \ell)$ where $k$ is the size of the smallest q-Horn deletion backdoor set, with $\ell$ being the length of the input formula.
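A familiar example of a skew-symmetric graph, offered here only as an illustration and not taken from the abstract above, is the implication graph of a 2-CNF formula: the involution maps each literal to its negation and each arc $(u, v)$ to $(\lnot v, \lnot u)$. The sketch below builds that graph and checks the involution property. Literals are encoded as non-zero integers with `-x` meaning the negation of `x`; the function names and this encoding are assumptions.

```python
def implication_graph(clauses):
    """Build the arc set of the implication graph of a 2-CNF formula.

    clauses: iterable of pairs (a, b) of non-zero integer literals, where -x
    denotes the negation of x. The clause (a or b) contributes the two
    implications (not a -> b) and (not b -> a).
    """
    arcs = set()
    for a, b in clauses:
        arcs.add((-a, b))
        arcs.add((-b, a))
    return arcs


def sigma_arc(arc):
    """Involution on arcs induced by literal negation: (u, v) -> (-v, -u)."""
    u, v = arc
    return (-v, -u)


def is_skew_symmetric(arcs):
    """Check that the arc set is closed under the involution sigma_arc."""
    return all(sigma_arc(arc) in arcs for arc in arcs)
```

By construction the two arcs contributed by a clause are images of each other under `sigma_arc`, so every implication graph passes the check, while an arbitrary arc set such as `{(1, 2)}` does not.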

arXiv.org e-Print Archive

Crossref