533 research outputs found

    Weakly-Supervised Alignment of Video With Text

    Suppose that we are given a set of videos, along with natural language descriptions in the form of multiple sentences (e.g., manual annotations, movie scripts, sports summaries, etc.), and that these sentences appear in the same temporal order as their visual counterparts. We propose in this paper a method for aligning the two modalities, i.e., automatically providing a time stamp for every sentence. Given vectorial features for both video and text, we propose to cast this task as a temporal assignment problem, with an implicit linear mapping between the two feature modalities. We formulate this problem as an integer quadratic program, and solve its continuous convex relaxation using an efficient conditional gradient algorithm. Several rounding procedures are proposed to construct the final integer solution. After demonstrating significant improvements over the state of the art on the related task of aligning video with symbolic labels [7], we evaluate our method on a challenging dataset of videos with associated textual descriptions [36], using both bag-of-words and continuous representations for text. Comment: ICCV 2015 - IEEE International Conference on Computer Vision, Dec 2015, Santiago, Chile
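
The ordering constraint in the abstract above (sentences appear in the same temporal order as the video) can be illustrated with a small dynamic program. This is a minimal sketch under simplifying assumptions, not the paper's method: it takes a fixed cost matrix instead of the implicit linear mapping between features, assigns exactly one frame per sentence, and requires strictly increasing time stamps. The function name `align_monotone` and the cost values are hypothetical.

```python
def align_monotone(cost):
    """Return strictly increasing frame indices, one per sentence,
    minimising the sum of cost[i][t_i].

    cost[i][t] is an assumed, precomputed dissimilarity between
    sentence i and frame t (hypothetical input for this sketch).
    """
    n, T = len(cost), len(cost[0])
    if n > T:
        raise ValueError("need at least as many frames as sentences")
    INF = float("inf")
    best = [[INF] * T for _ in range(n)]   # best[i][t]: optimal cost with t_i = t
    back = [[-1] * T for _ in range(n)]    # back[i][t]: chosen t_{i-1}
    for t in range(T):
        best[0][t] = cost[0][t]
    for i in range(1, n):
        run_min, run_arg = INF, -1         # min of best[i-1][s] over s < t
        for t in range(T):
            if run_arg != -1:
                best[i][t] = cost[i][t] + run_min
                back[i][t] = run_arg
            if best[i - 1][t] < run_min:
                run_min, run_arg = best[i - 1][t], t
    # Backtrack from the cheapest final frame.
    t = min(range(T), key=lambda s: best[n - 1][s])
    stamps = [t]
    for i in range(n - 1, 0, -1):
        t = back[i][t]
        stamps.append(t)
    return stamps[::-1]


# Sentence 0 alone would prefer frame 2 and sentence 1 frame 0,
# but that would violate the temporal order.
print(align_monotone([[9, 1, 0], [0, 9, 3]]))
```

A linear problem over order-preserving assignments of this kind is the sort of subproblem a conditional gradient method solves at each iteration. How the paper sets up that step is not stated in the abstract.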

    Embedding multidimensional grids into optimal hypercubes

    Let $G$ and $H$ be graphs, with $|V(H)| \geq |V(G)|$, and $f: V(G) \rightarrow V(H)$ a one-to-one map of their vertices. Let $dilation(f) = \max\{ dist_{H}(f(x),f(y)) : xy \in E(G) \}$, where $dist_{H}(v,w)$ is the distance between vertices $v$ and $w$ of $H$. Now let $B(G,H) = \min_{f}\{ dilation(f) \}$, over all such maps $f$. The parameter $B(G,H)$ is a generalization of the classic and well-studied "bandwidth" of $G$, defined as $B(G,P(n))$, where $P(n)$ is the path on $n$ points and $n = |V(G)|$. Let $[a_{1} \times a_{2} \times \cdots \times a_{k}]$ be the $k$-dimensional grid graph with integer values $1$ through $a_{i}$ in the $i$'th coordinate. In this paper, we study $B(G,H)$ in the case when $G = [a_{1} \times a_{2} \times \cdots \times a_{k}]$ and $H$ is the hypercube $Q_{n}$ of dimension $n = \lceil \log_{2}(|V(G)|) \rceil$, the hypercube of smallest dimension having at least as many points as $G$. Our main result is that $B([a_{1} \times a_{2} \times \cdots \times a_{k}], Q_{n}) \le 3k$, provided $a_{i} \geq 2^{22}$ for each $1 \le i \le k$. For such $G$, the bound $3k$ improves on the previous best upper bound $4k + O(1)$. Our methods include an application of Knuth's result on two-way rounding and of the existence of spanning regular cyclic caterpillars in the hypercube. Comment: 47 pages, 8 figures
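
The definition of $dilation(f)$ above can be checked on small cases. The sketch below is an illustration only and does not reproduce the paper's construction. It assumes 0-based grid coordinates, encodes each vertex of $Q_n$ as an integer bit string, and uses Hamming distance as $dist_{H}$. All function names are hypothetical.

```python
from itertools import product


def grid_edges(dims):
    """Yield the edges of the grid graph [a_1 x ... x a_k] (0-based coordinates)."""
    for v in product(*(range(a) for a in dims)):
        for i, a in enumerate(dims):
            if v[i] + 1 < a:
                yield v, v[:i] + (v[i] + 1,) + v[i + 1:]


def hamming(x, y):
    """Distance in the hypercube between vertices encoded as integers."""
    return bin(x ^ y).count("1")


def dilation(dims, f):
    """Largest hypercube distance between images of adjacent grid vertices."""
    return max(hamming(f(v), f(w)) for v, w in grid_edges(dims))


def gray_embedding(dims):
    """Concatenate a reflected Gray code per coordinate.

    Lands in the smallest hypercube only when every a_i is a power of two.
    """
    widths = [max(1, (a - 1).bit_length()) for a in dims]

    def f(v):
        code = 0
        for x, w in zip(v, widths):
            code = (code << w) | (x ^ (x >> 1))
        return code
    return f


def row_major_embedding(dims):
    """Map a vertex to the binary representation of its row-major index."""
    def f(v):
        index = 0
        for x, a in zip(v, dims):
            index = index * a + x
        return index
    return f


print(dilation((4, 8), gray_embedding((4, 8))))       # 1
print(dilation((3, 5), row_major_embedding((3, 5))))  # 4
```

The 4 x 8 grid fits $Q_5$ exactly, so the Gray-code map achieves dilation 1. For the 3 x 5 grid the smallest hypercube is $Q_4$, and the naive row-major map there has dilation 4, the worst value possible in $Q_4$. Side lengths that are not powers of two are the case that needs a more careful construction.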

    Numerically optimized Markovian coupling and mixing in one-dimensional maps

    Algorithms are introduced that produce optimal Markovian couplings for large finite-state-space discrete-time Markov chains with sparse transition matrices; these algorithms are applied to some toy models motivated by fluid-dynamical mixing problems at high Péclet number. An alternative definition of the time-scale of a mixing process is suggested. Finally, these algorithms are applied to the problem of coupling diffusion processes in an acute-angled triangle, and some of the simplifications that occur in continuum coupling problems are discussed.

    COMPUTERIZED PRODUCTION CONTROL OF PROCESSES IN MACHINE INDUSTRY
