398 research outputs found

    Finite automata for caching in matrix product algorithms

    A diagram is introduced for visualizing matrix product states which makes transparent a connection between matrix product factorizations of states and operators, and complex weighted finite state automata. It is then shown how one can proceed in the opposite direction: writing an automaton that "generates" an operator gives one an immediate matrix product factorization of it. Matrix product factorizations have the advantage of reducing the cost of computing expectation values by facilitating caching of intermediate calculations. Thus our connection to complex weighted finite state automata yields insight into what allows for efficient caching in matrix product algorithms. Finally, these techniques are generalized to the case of multiple dimensions.
    Comment: 18 pages, 19 figures, LaTeX; numerous improvements have been made to the manuscript in response to referee feedback.
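To make the automaton-to-factorization direction concrete, here is a minimal sketch that is not taken from the paper. It assumes a toy operator, the sum of a single-site operator Z over all sites, and a hypothetical two-state automaton over the alphabet {I, Z}. The weight of an operator string is a product of the automaton's transition matrices, which is exactly a matrix product factorization with bond dimension 2. The running row vector `vec` plays the role of the cached intermediate.

```python
from itertools import product

# Transition matrices of a 2-state weighted automaton (illustrative assumption).
# State 0: no Z emitted yet; state 1: exactly one Z emitted.
TRANSITIONS = {
    "I": [[1, 0], [0, 1]],  # I keeps the current state
    "Z": [[0, 1], [0, 0]],  # Z moves state 0 -> 1; a second Z gets weight 0
}
START = [1, 0]   # begin in state 0
ACCEPT = [0, 1]  # accept only in state 1


def string_weight(symbols):
    """Weight of an operator string: START^T A[s1] ... A[sn] ACCEPT."""
    vec = START[:]
    for s in symbols:
        mat = TRANSITIONS[s]
        # Row vector times matrix; vec is the reusable (cacheable) partial product.
        vec = [sum(vec[i] * mat[i][j] for i in range(len(vec)))
               for j in range(len(mat[0]))]
    return sum(v * a for v, a in zip(vec, ACCEPT))


def generated_terms(n):
    """All length-n operator strings that the automaton generates with nonzero weight."""
    return ["".join(s) for s in product("IZ", repeat=n) if string_weight(s) != 0]


print(generated_terms(3))
```

For three sites the automaton generates exactly the strings containing a single Z, i.e. the terms of Z⊗I⊗I + I⊗Z⊗I + I⊗I⊗Z.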

    Utilisation de l'analyse formelle de concepts pour extraire le plus grand modèle commun

    The development of information systems follows a long and complex process in which various actors are involved. We report an experiment in which we observe the evolution of the analysis model of an information system through 15 successive versions. We use indicators on the underlying concept lattices built by applying Relational Concept Analysis (RCA) to each version. RCA is an extension of Formal Concept Analysis (FCA) which groups entities based on characteristics they share, including links to other entities. Here it helps in analyzing their evolution. From this experiment, we establish recommendations to monitor and verify the proper evolution of the analysis process.

    CUR Decompositions, Similarity Matrices, and Subspace Clustering

    A general framework for solving the subspace clustering problem using the CUR decomposition is presented. The CUR decomposition provides a natural way to construct similarity matrices for data that come from a union of unknown subspaces $\mathscr{U}=\bigcup_{i=1}^{M} S_i$. The similarity matrices thus constructed give the exact clustering in the noise-free case. Additionally, this decomposition gives rise to many distinct similarity matrices from a given set of data, which allow enough flexibility to perform accurate clustering of noisy data. We also show that two known methods for subspace clustering can be derived from the CUR decomposition. An algorithm based on the theoretical construction of similarity matrices is presented, and experiments on synthetic and real data are presented to test the method. Additionally, an adaptation of our CUR based similarity matrices is utilized to provide a heuristic algorithm for subspace clustering; this algorithm yields the best overall performance to date for clustering the Hopkins155 motion segmentation dataset.
    Comment: Approximately 30 pages. Current version contains improved algorithm and numerical experiments from the previous version.
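The abstract does not spell out how the similarity matrices are built, so the sketch below illustrates only the underlying CUR decomposition itself. It assumes a small rank-2 example matrix and hypothetical helper names. It relies on the standard fact that if U (the intersection of the chosen rows and columns) has the same rank as A, then A = C U⁻¹ R exactly; for a non-square or singular U one would use a pseudoinverse instead.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix using rational arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("chosen rows/columns give a singular intersection U")
    return [[d / det, -b / det], [-c / det, a / det]]


def cur_reconstruct(A, rows, cols):
    """Rebuild A from selected columns C, selected rows R, and their intersection U."""
    C = [[A[i][j] for j in cols] for i in range(len(A))]
    R = [[A[i][j] for j in range(len(A[0]))] for i in rows]
    U = [[A[i][j] for j in cols] for i in rows]
    return matmul(matmul(C, inverse_2x2(U)), R)


# Rank-2 example: rows 2 and 3 are linear combinations of rows 0 and 1.
A = [
    [1, 0, 1, 2],
    [0, 1, 1, 1],
    [1, 1, 2, 3],
    [2, 1, 3, 5],
]
A_hat = cur_reconstruct(A, rows=[0, 2], cols=[1, 3])
print(A_hat == A)
```

Different choices of `rows` and `cols` give different factorizations of the same matrix, which is the flexibility the abstract refers to when it mentions many distinct similarity matrices.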

    Curriculum Guidelines for Undergraduate Programs in Data Science

    The Park City Math Institute (PCMI) 2016 Summer Undergraduate Faculty Program met for the purpose of composing guidelines for undergraduate programs in Data Science. The group consisted of 25 undergraduate faculty from a variety of institutions in the U.S., primarily from the disciplines of mathematics, statistics and computer science. These guidelines are meant to provide some structure for institutions planning for or revising a major in Data Science.

    Image-based Recommendations on Styles and Substitutes

    Humans inevitably develop a sense of the relationships between objects, some of which are based on their appearance. Some pairs of objects might be seen as being alternatives to each other (such as two pairs of jeans), while others may be seen as being complementary (such as a pair of jeans and a matching shirt). This information guides many of the choices that people make, from buying clothes to their interactions with each other. We seek here to model this human sense of the relationships between objects based on their appearance. Our approach is not based on fine-grained modeling of user annotations but rather on capturing the largest dataset possible and developing a scalable method for uncovering human notions of the visual relationships within. We cast this as a network inference problem defined on graphs of related images, and provide a large-scale dataset for the training and evaluation of the same. The system we develop is capable of recommending which clothes and accessories will go well together (and which will not), amongst a host of other applications.
    Comment: 11 pages, 10 figures, SIGIR 201

    Sparse and Nonnegative Factorizations For Music Understanding

    In this dissertation, we propose methods for sparse and nonnegative factorization that are specifically suited for analyzing musical signals. First, we discuss two constraints that aid factorization of musical signals: harmonic and co-occurrence constraints. We propose a novel dictionary learning method that imposes harmonic constraints upon the atoms of the learned dictionary while allowing the dictionary size to grow appropriately during the learning procedure. When there is significant spectral-temporal overlap among the musical sources, our method outperforms popular existing matrix factorization methods as measured by the recall and precision of learned dictionary atoms. We also propose co-occurrence constraints -- three simple and convenient multiplicative update rules for nonnegative matrix factorization (NMF) that enforce dependence among atoms. Using examples in music transcription, we demonstrate the ability of these updates to represent each musical note with multiple atoms and cluster the atoms for source separation purposes. Second, we study how spectral and temporal information extracted by nonnegative factorizations can improve upon musical instrument recognition. Musical instrument recognition in melodic signals is difficult, especially for classification systems that rely entirely upon spectral information instead of temporal information. Here, we propose a simple and effective method of combining spectral and temporal information for instrument recognition. While existing classification methods use traditional features such as statistical moments, we extract novel features from spectral and temporal atoms generated by NMF using a biologically motivated multiresolution gamma filterbank. Unlike other methods that require thresholds, safeguards, and hierarchies, the proposed spectral-temporal method requires only simple filtering and a flat classifier. 
    Finally, we study how to perform sparse factorization when a large dictionary of musical atoms is already known. Sparse coding methods such as matching pursuit (MP) have been applied to problems in music information retrieval such as transcription and source separation with moderate success. However, when the set of dictionary atoms is large, identification of the best match in the dictionary with the residual is slow -- linear in the size of the dictionary. Here, we propose a variant called approximate matching pursuit (AMP) that is faster than MP while maintaining scalability and accuracy. Unlike MP, AMP uses an approximate nearest-neighbor (ANN) algorithm to find the closest match in a dictionary in sublinear time. One such ANN algorithm, locality-sensitive hashing (LSH), is a probabilistic hash algorithm that places similar, yet not identical, observations into the same bin. While the accuracy of AMP is comparable to similar MP methods, the computational complexity is reduced. Also, by using LSH, this method scales easily; the dictionary can be expanded without reorganizing any data structures.
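For orientation, the following is a minimal sketch of plain matching pursuit, not the dissertation's AMP. It assumes unit-norm dictionary atoms and uses hypothetical function names. The marked linear scan is the step whose cost grows with the dictionary size; AMP, as described above, would replace it with an approximate nearest-neighbor lookup such as LSH.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, dictionary, n_iter):
    """Greedy sparse coding; atoms in `dictionary` are assumed to have unit norm."""
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(n_iter):
        # Linear scan over all atoms: the bottleneck that an ANN search would replace.
        best = max(range(len(dictionary)),
                   key=lambda k: abs(dot(residual, dictionary[k])))
        c = dot(residual, dictionary[best])
        coeffs[best] += c
        # Subtract the selected atom's contribution from the residual.
        residual = [r - c * a for r, a in zip(residual, dictionary[best])]
    return coeffs, residual


# Toy dictionary: the standard basis of R^3 (orthonormal, so MP is exact here).
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs, residual = matching_pursuit([3.0, -2.0, 0.0], atoms, n_iter=2)
print(coeffs, residual)
```

With a redundant, non-orthogonal dictionary the residual generally shrinks over more iterations rather than vanishing after one pass per atom.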

    Tensor factorizations of local second-order Møller–Plesset theory

    Efficient electronic structure methods can be built around efficient tensor representations of the wavefunction. Here we first describe a general view of tensor factorization for the compact representation of electronic wavefunctions. Next, we use this language to construct a low-complexity representation of the doubles amplitudes in local second-order Møller–Plesset perturbation theory. We introduce two approximations: the direct orbital-specific virtual approximation and the full orbital-specific virtual approximation. In these approximations, each occupied orbital is associated with a small set of correlating virtual orbitals. Conceptually, the representation lies between the projected atomic orbital representation in Pulay–Saebø local correlation theories and pair natural orbital correlation theories. We have tested the orbital-specific virtual approximations on a variety of systems and properties including total energies, reaction energies, and potential energy curves. Compared to the Pulay–Saebø ansatz, we find that these approximations exhibit favorable accuracy and computational times while yielding smooth potential energy curves.