    Analysing symbolic music with probabilistic grammars

    Recent developments in computational linguistics offer ways to approach the analysis of musical structure by inducing probabilistic models (in the form of grammars) over a corpus of music. Such models can produce idiomatic sentences from a probabilistic model of the musical language and thus offer explanations of the musical structures they model. This chapter surveys historical and current work in musical analysis using grammars, based on computational linguistic approaches. We outline the theory of probabilistic grammars, illustrate their implementation in Prolog using PRISM, and summarize our experiments on learning rule probabilities for simple grammars from pitch sequences in two kinds of symbolic musical corpora. The results support our claim that probabilistic grammars are a promising framework for computational music analysis, but also indicate that further work is required to establish their superiority over Markov models.
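    As a concrete illustration of the basic idea, here is a minimal, hypothetical sketch in Python (the chapter itself works in Prolog with PRISM): estimate rule probabilities from a pitch corpus by maximum likelihood, then generate sequences from the resulting grammar. The toy grammar, corpus, and function names are invented for illustration.

```python
import random
from collections import Counter

# A toy probabilistic grammar over pitch classes (0-11); the rules and
# corpus here are hypothetical, not the chapter's actual grammars.
# Phrase -> Note Phrase (p_continue) | Note (1 - p_continue)
# Note   -> pitch class c with probability p[c], estimated from a corpus.

def estimate_note_probs(corpus):
    """Maximum-likelihood estimates of the Note -> pitch-class probabilities."""
    counts = Counter(pitch for sequence in corpus for pitch in sequence)
    total = sum(counts.values())
    return {pitch: n / total for pitch, n in counts.items()}

def sample_phrase(note_probs, p_continue=0.8, rng=random.Random(0)):
    """Generate a pitch sequence by repeatedly expanding the Phrase rule."""
    pitches = []
    classes, weights = list(note_probs), list(note_probs.values())
    while True:
        pitches.append(rng.choices(classes, weights)[0])  # expand Note
        if rng.random() >= p_continue:                    # Phrase -> Note (stop)
            return pitches

# Hypothetical corpus: two short pitch-class sequences.
corpus = [[0, 4, 7, 4, 0], [2, 7, 11, 7]]
probs = estimate_note_probs(corpus)
print(sample_phrase(probs))
```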

    BOSS: Bayesian Optimization over String Spaces

    This article develops a Bayesian optimization (BO) method which acts directly over raw strings, proposing the first uses of string kernels and genetic algorithms within BO loops. Recent applications of BO over strings have been hindered by the need to map inputs into a smooth and unconstrained latent space, and learning this projection is both computation- and data-intensive. Our approach instead builds a powerful Gaussian process surrogate model based on string kernels, naturally supporting variable-length inputs, and performs efficient acquisition function maximization for spaces with syntactic constraints. Experiments demonstrate considerably improved optimization over existing approaches across a broad range of constraints, including the popular setting where syntax is governed by a context-free grammar.
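    A minimal sketch of the loop this describes, substituting a simple n-gram (spectrum) string kernel for the paper's string kernels and a mutation-only search for a full genetic algorithm; the objective, alphabet, and hyperparameters are invented for illustration.

```python
import numpy as np
from collections import Counter

ALPHABET = "ACGT"

def ngrams(s, n=3):
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def kernel(a, b, n=3):
    """Normalized spectrum kernel: cosine similarity of n-gram counts."""
    ca, cb = ngrams(a, n), ngrams(b, n)
    dot = sum(ca[g] * cb[g] for g in ca)
    na = np.sqrt(sum(v * v for v in ca.values())) or 1.0
    nb = np.sqrt(sum(v * v for v in cb.values())) or 1.0
    return dot / (na * nb)

def gp_posterior(X, y, x_star, noise=1e-4):
    """GP posterior mean/variance at x_star given observations (X, y)."""
    K = np.array([[kernel(a, b) for b in X] for a in X]) + noise * np.eye(len(X))
    k = np.array([kernel(a, x_star) for a in X])
    mean = k @ np.linalg.solve(K, y)
    var = max(kernel(x_star, x_star) - k @ np.linalg.solve(K, k), 1e-12)
    return mean, var

def mutate(s, rng):
    """Point mutation: replace one position with a random symbol."""
    i = int(rng.integers(len(s)))
    return s[:i] + str(rng.choice(list(ALPHABET))) + s[i + 1:]

def propose(X, y, rng, pool=200):
    """Genetic-style acquisition maximization: mutate incumbents, keep best UCB."""
    candidates = {mutate(X[int(rng.integers(len(X)))], rng) for _ in range(pool)}
    ucb = lambda s: (lambda m, v: m + 2.0 * np.sqrt(v))(*gp_posterior(X, y, s))
    return max(candidates, key=ucb)

def objective(s):          # toy black box: count of "GA" substrings
    return s.count("GA")

rng = np.random.default_rng(0)
X = ["".join(rng.choice(list(ALPHABET), 8)) for _ in range(5)]
y = np.array([objective(s) for s in X], dtype=float)
for _ in range(10):        # BO loop: propose, evaluate, augment data
    x_new = propose(X, y, rng)
    X, y = X + [x_new], np.append(y, objective(x_new))
print(max(zip(y, X)))
```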

    SynJax: Structured Probability Distributions for JAX

    The development of deep learning software libraries enabled significant progress in the field by allowing users to focus on modeling while the library takes care of the tedious and time-consuming task of optimizing execution for modern hardware accelerators. However, this has benefited only particular types of deep learning models, such as Transformers, whose primitives map easily to vectorized computation. Models that explicitly account for structured objects, such as trees and segmentations, did not benefit equally, because they require custom algorithms that are difficult to implement in a vectorized form. SynJax directly addresses this problem by providing an efficient vectorized implementation of inference algorithms for structured distributions covering alignment, tagging, segmentation, constituency trees, and spanning trees. With SynJax we can build large-scale differentiable models that explicitly model structure in the data. The code is available at https://github.com/deepmind/synjax.
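    The following is not SynJax's own API, only a hand-rolled JAX sketch of the kind of vectorized, differentiable inference the library packages up: the forward (log-partition) algorithm of a linear-chain tagging model, whose gradient yields per-position tag marginals.

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

def log_partition(emissions, transitions):
    """log Z of a linear-chain model; emissions: [T, K], transitions: [K, K]."""
    def step(alpha, emission_t):
        # alpha[j] = log-sum of scores over tag sequences ending in tag j.
        alpha = logsumexp(alpha[:, None] + transitions, axis=0) + emission_t
        return alpha, None
    alpha, _ = jax.lax.scan(step, emissions[0], emissions[1:])
    return logsumexp(alpha)

# Because log Z is differentiable, its gradient w.r.t. the emission scores
# gives the per-position tag marginals; jax.grad makes this a one-liner.
T, K = 6, 4
emissions = jax.random.normal(jax.random.PRNGKey(0), (T, K))
transitions = jnp.zeros((K, K))
print(log_partition(emissions, transitions))
print(jax.grad(log_partition)(emissions, transitions).sum(-1))  # each step sums to 1
```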

    Text analysis of handwritten production deviations

    Companies want to understand the latest trends and to summarize product status or public opinion based on social media data. Because such data is rich and diverse, there has been a need for automated, real-time opinion polling and data mining. This need has driven the huge popularity of text analysis, which is now being applied in more and more industries, not just for evaluating consumer feedback. Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence focused on enabling computers to understand and interpret human language; its strength lies specifically in programming computers to process and analyze large amounts of natural language. NLP techniques can extract data accurately from text and classify and organize it. Machine learning makes text analysis much faster and more efficient than manual processing, reducing labor costs and speeding up the handling of texts without compromising quality.

    The main focus of this thesis is to study textual material received from the client and to develop a prediction model for it using NLP techniques; a case study was used as the research strategy. The text data, about 9,000 sentences, cover the period 2016/11-2018/9 and come from production deviations observed in the welding and assembly process. Text sentences, i.e. user comments, were available at all stages from the detection of a deviation to its resolution; this study focuses on the first observational comment written about each deviation. Based on these, a predictive model was trained that can predict, from a given first comment, the likely root cause of the deviation. The material was analyzed using both traditional machine learning methods and more advanced deep learning methods, namely pre-trained FinBERT and multilingual BERT, with model accuracy as the key measure of performance. The result was a reliable prediction model that can predict whether a deviation falls into class 100 (missing part) or class 200 (other deviations). The best accuracy of the traditional machine learning model was 85.7 %, and of the transformer model 82.6 %. The most common word across all the Finnish sentences was "puuttua" ("to be missing"), in its various inflected forms.
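    A minimal sketch of the "traditional" branch described above, assuming a TF-IDF plus logistic-regression pipeline in scikit-learn; the Finnish comments and labels below are invented placeholders, not data from the study.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented placeholder comments (with rough English glosses), not study data.
comments = [
    "osa puuttuu kokoonpanosta",    # "part missing from assembly"
    "hitsaussauma epätasainen",     # "weld seam uneven"
    "kiinnike puuttuu",             # "fastener missing"
    "maalipinta vaurioitunut",      # "paint surface damaged"
]
labels = [100, 200, 100, 200]       # 100 = missing part, 200 = other deviation

# TF-IDF over word uni- and bigrams feeding a linear classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(comments, labels)
print(model.predict(["ruuvi puuttuu"]))  # hypothetical new first comment
```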