12,238 research outputs found

    Maximum entropy models capture melodic styles

    We introduce a Maximum Entropy model able to capture the statistics of melodies in music. The model can be used to generate new melodies that emulate the style of the musical corpus which was used to train it. Instead of using the n-body interactions of (n-1)-order Markov models, traditionally used in automatic music generation, we use a k-nearest neighbour model with pairwise interactions only. In that way, we keep the number of parameters low and avoid over-fitting problems typical of Markov models. We show that long-range musical phrases don't need to be explicitly enforced using high-order Markov interactions, but can instead emerge from multiple, competing, pairwise interactions. We validate our Maximum Entropy model by contrasting how much the generated sequences capture the style of the original corpus without plagiarizing it. To this end we use a data-compression approach to discriminate the levels of borrowing and innovation featured by the artificial sequences. The results show that our modelling scheme outperforms both fixed-order and variable-order Markov models. This shows that, despite being based only on pairwise interactions, this Maximum Entropy scheme opens the possibility to generate musically sensible alterations of the original phrases, providing a way to generate innovation.
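
The pairwise idea in this abstract can be illustrated with a minimal sketch. It assumes a model with local fields and distance-dependent pairwise couplings; the names `h`, `J`, `energy`, and `gibbs_sweep` are hypothetical and not taken from the paper, and the sampler is a generic Gibbs sweep rather than the authors' procedure. Notes are encoded as integers 0..q-1.

```python
import math
import random

def energy(seq, h, J):
    """Energy of a note sequence under a pairwise model (lower = more probable).

    h[a]       : local field for note a (hypothetical parameter name)
    J[k][a][b] : coupling between note a and the note b found k+1 positions later
    """
    K = len(J)  # maximum interaction distance
    e = 0.0
    for i, a in enumerate(seq):
        e -= h[a]
        for k in range(K):
            j = i + k + 1
            if j < len(seq):
                e -= J[k][a][seq[j]]
    return e

def gibbs_sweep(seq, h, J, rng):
    """Resample every position once from its conditional given its neighbours."""
    q = len(h)
    K = len(J)
    for i in range(len(seq)):
        log_weights = []
        for a in range(q):
            lw = h[a]
            for k in range(K):
                d = k + 1
                if i - d >= 0:          # neighbour d steps to the left
                    lw += J[k][seq[i - d]][a]
                if i + d < len(seq):    # neighbour d steps to the right
                    lw += J[k][a][seq[i + d]]
            log_weights.append(lw)
        m = max(log_weights)            # subtract the max for numerical stability
        weights = [math.exp(lw - m) for lw in log_weights]
        seq[i] = rng.choices(range(q), weights=weights)[0]
    return seq
```

With K distances and q notes, such a model has on the order of K·q² parameters, whereas a Markov model of order K needs on the order of q^(K+1); this is the parameter saving the abstract refers to.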

    Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art

    Information extraction, data integration, and uncertain data management are different areas of research that have received vast attention in the last two decades. Many studies have tackled these areas individually. However, information extraction systems should be integrated with data integration methods to make use of the extracted information. Handling uncertainty in the extraction and integration process is an important issue for enhancing the quality of the data in such integrated systems. This article presents the state of the art in these areas of research, shows their common ground, and shows how to integrate information extraction and data integration under uncertainty management cover.

    Many Roads to Synchrony: Natural Time Scales and Their Algorithms

    We consider two important time scales---the Markov and cryptic orders---that monitor how an observer synchronizes to a finitary stochastic process. We show how to compute these orders exactly and that they are most efficiently calculated from the epsilon-machine, a process's minimal unifilar model. Surprisingly, though the Markov order is a basic concept from stochastic process theory, it is not a probabilistic property of a process. Rather, it is a topological property and, moreover, it is not computable from any finite-state model other than the epsilon-machine. Via an exhaustive survey, we close by demonstrating that infinite Markov and infinite cryptic orders are a dominant feature in the space of finite-memory processes. We draw out the roles played in statistical mechanical spin systems by these two complementary length scales. Comment: 17 pages, 16 figures: http://cse.ucdavis.edu/~cmg/compmech/pubs/kro.htm. Santa Fe Institute Working Paper 10-11-02

    On Joint Source-Channel Coding for Correlated Sources Over Multiple-Access Relay Channels

    We study the transmission of correlated sources over discrete memoryless (DM) multiple-access-relay channels (MARCs), in which both the relay and the destination have access to side information arbitrarily correlated with the sources. As the optimal transmission scheme is an open problem, in this work we propose a new joint source-channel coding scheme based on a novel combination of the correlation preserving mapping (CPM) technique with Slepian-Wolf (SW) source coding, and obtain the corresponding sufficient conditions. The proposed coding scheme is based on the decode-and-forward strategy, and utilizes CPM for encoding information simultaneously to the relay and the destination, whereas the cooperation information from the relay is encoded via SW source coding. It is shown that there are cases in which the new scheme strictly outperforms the schemes available in the literature. This is the first instance of a source-channel code that uses CPM for encoding information to two different nodes (relay and destination). In addition to sufficient conditions, we present three different sets of single-letter necessary conditions for reliable transmission of correlated sources over DM MARCs. The newly derived conditions are shown to be at least as tight as the previously known necessary conditions. Comment: Accepted to TI

    Low-Complexity Approaches to Slepian–Wolf Near-Lossless Distributed Data Compression

    This paper discusses the Slepian–Wolf problem of distributed near-lossless compression of correlated sources. We introduce practical new tools for communicating at all rates in the achievable region. The technique employs a simple “source-splitting” strategy that does not require common sources of randomness at the encoders and decoders. This approach allows for pipelined encoding and decoding so that the system operates with the complexity of a single user encoder and decoder. Moreover, when this splitting approach is used in conjunction with iterative decoding methods, it produces a significant simplification of the decoding process. We demonstrate this approach for synthetically generated data. Finally, we consider the Slepian–Wolf problem when linear codes are used as syndrome-formers and consider a linear programming relaxation to maximum-likelihood (ML) sequence decoding. We note that the fractional vertices of the relaxed polytope compete with the optimal solution in a manner analogous to that observed when the “min-sum” iterative decoding algorithm is applied. This relaxation exhibits the ML-certificate property: if an integral solution is found, it is the ML solution. For symmetric binary joint distributions, we show that selecting easily constructable “expander”-style low-density parity check codes (LDPCs) as syndrome-formers admits a positive error exponent and therefore provably good performance
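
As a toy illustration of using a linear code as a syndrome-former, the sketch below compresses 7 source bits to a 3-bit syndrome with the (7,4) Hamming code. It assumes the decoder's side information `y` differs from the source `x` in at most one position. This is a textbook-style example, not the paper's source-splitting or linear-programming decoder, and the function names are hypothetical.

```python
# Parity-check matrix of the (7,4) Hamming code.
# Column j (0-based) is the binary expansion of j + 1, most significant bit in row 0.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def syndrome(bits):
    """Compute H * bits over GF(2) as a 3-tuple."""
    return tuple(sum(h * b for h, b in zip(row, bits)) % 2 for row in H)

def encode(x):
    """Encoder: send only the 3-bit syndrome of the 7 source bits."""
    return syndrome(x)

def decode(s, y):
    """Decoder: recover x from its syndrome s and side information y.

    Assumes x and y differ in at most one position (toy correlation model).
    """
    sy = syndrome(y)
    diff = tuple(a ^ b for a, b in zip(s, sy))   # syndrome of x XOR y
    pos = diff[0] * 4 + diff[1] * 2 + diff[2]    # 0 = no difference, else 1-based index
    x_hat = list(y)
    if pos:
        x_hat[pos - 1] ^= 1                      # flip the single differing bit
    return x_hat
```

For example, `decode(encode(x), y)` returns `x` whenever `y` is `x` with at most one bit flipped, so the encoder sends 3 bits instead of 7 without ever seeing `y`.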