
    On simple matrix languages versus scattered context languages


    An approach to computing downward closures

    The downward closure of a word language is the set of all (not necessarily contiguous) subwords of its members. It is well known that the downward closure of any language is regular. While the downward closure appears to be a powerful abstraction, algorithms for computing a finite automaton for the downward closure of a given language have been established for only a few language classes. This work presents a simple general method for computing downward closures. For language classes that are closed under rational transductions, it is shown that the computation of downward closures can be reduced to checking a certain unboundedness property. This result is used to prove that downward closures are computable for (i) every language class that is closed under rational transductions and has effectively semilinear Parikh images, (ii) matrix languages, and (iii) indexed languages (equivalently, languages accepted by higher-order pushdown automata of order 2). Comment: Full version of contribution to ICALP 2015. Comments welcome.
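
    As a point of reference (standard notation, not part of the abstract itself), the downward closure admits a one-line formal definition in terms of the scattered-subword order, and the regularity claim is the classical consequence of Higman's lemma.

        % Formal restatement of the notion used above (standard notation, assumed
        % rather than quoted from the paper): u \preceq v means u is a scattered
        % subword of v.
        \[
          L{\downarrow} \;=\; \{\, u \in \Sigma^{*} \mid \exists v \in L :\ u \preceq v \,\}
        \]
        % Regularity for every L follows from Higman's lemma: \preceq is a
        % well-quasi-order, so the (upward-closed) complement of L\downarrow is
        % generated by finitely many minimal words and is therefore regular.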

    On the Reproducibility and Generalisation of the Linear Transformation of Word Embeddings

    A linear transformation learns a linear relationship between two word embeddings, such that words in the two different embedding spaces can be semantically related. In this paper, we examine the reproducibility and generalisation of the linear transformation of word embeddings. Linear transformation is particularly useful for translating between word embedding models trained on different languages, since it can capture the semantic relationships between the two models. We first reproduce two linear transformation approaches: a recent one using an orthogonal transformation and the original one using a simple matrix transformation. Previous findings on a machine translation task are re-examined, validating that linear transformation is indeed an effective way to transform word embedding models across languages. In particular, we show that the orthogonal transformation can better relate the different embedding models. Following the verification of previous findings, we then study the generalisation of linear transformation in a multi-language Twitter election classification task. We observe that the orthogonal transformation outperforms the matrix transformation; in particular, it significantly outperforms a random classifier by at least 10% under the F1 metric across English and Spanish datasets. In addition, we provide best practices for using linear transformation in multi-language Twitter election classification.
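
    For illustration only (not the authors' code), the two transformations compared above can be sketched in a few lines of numpy: the "simple matrix transformation" is the least-squares map between paired word vectors, and the orthogonal variant is the Procrustes solution obtained from an SVD. Matrix names, shapes, and the toy data are assumptions.

        import numpy as np

        def matrix_transform(X, Y):
            # Unconstrained linear map: W minimising ||X W - Y||_F (least squares).
            W, *_ = np.linalg.lstsq(X, Y, rcond=None)
            return W

        def orthogonal_transform(X, Y):
            # Orthogonal Procrustes: W = U V^T from the SVD of X^T Y, minimising
            # ||X W - Y||_F subject to W^T W = I (preserves distances and angles).
            U, _, Vt = np.linalg.svd(X.T @ Y)
            return U @ Vt

        # Toy usage: 1000 paired words with 300-dimensional embeddings.
        rng = np.random.default_rng(0)
        X = rng.standard_normal((1000, 300))     # source-language vectors
        Y = rng.standard_normal((1000, 300))     # target-language vectors
        mapped = X @ orthogonal_transform(X, Y)  # source vectors in the target space

    The orthogonality constraint preserves the geometry of the source space, which is consistent with the abstract's observation that the orthogonal transformation relates the two embedding models better.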

    Matrix Languages, Register Machines, Vector Addition Systems

    We give a direct and simple proof that the Parikh images of languages generated by matrix grammars with appearance checking coincide with the sets of vectors generated by register machines. As a particular case, we get that the Parikh images of languages generated by matrix grammars without appearance checking coincide with the sets of vectors generated by partially blind register machines. Then, we consider pure matrix grammars (i.e., grammars which do not distinguish terminal and nonterminal symbols), and prove the inclusion of the family of Parikh images of languages generated by such grammars (without appearance checking) in the family of sets of vectors generated by blind register machines, as well as the inclusion of reachability sets of vector addition systems in the family of Parikh images of pure matrix languages. For pure matrix grammars with a certain restriction on the form of matrices, the converse of the latter inclusion is also obtained. Thus, in view of the result from, we obtain the semilinearity of languages generated by pure matrix grammars (without appearance checking) over alphabets with at most five letters, under the considered restrictions on the form of matrices. A pure matrix grammar with five symbols, but without restrictions on the form of matrices, is produced which generates a non-semilinear language.
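
    For readers unfamiliar with the central object here, the Parikh image (standard definition, assumed rather than quoted from the paper) maps each word to its vector of letter counts, and semilinearity is a property of that vector set.

        % Parikh image over \Sigma = \{a_1, \dots, a_k\} (standard notation):
        \[
          \Psi(w) = \bigl(|w|_{a_1}, \dots, |w|_{a_k}\bigr), \qquad
          \Psi(L) = \{\, \Psi(w) \mid w \in L \,\} \subseteq \mathbb{N}^{k}
        \]
        % L is semilinear iff \Psi(L) is a finite union of linear sets
        % \{ c + \lambda_1 p_1 + \dots + \lambda_m p_m \mid \lambda_i \in \mathbb{N} \}.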

    Computationally efficient min-max MPC

    2005 IFAC 16th Triennial World Congress, Prague, Czech Republic. Min-Max MPC (MMMPC) controllers (Campo and Morari, 1987) suffer from a great computational burden that is often circumvented by using upper bounds of the worst possible case of a performance index. These upper bounds are usually computed by means of LMI techniques. In this paper a more efficient approach is shown: a computationally efficient MMMPC control strategy in which the worst-case cost is approximated by an upper bound that can be computed easily using simple matrix operations. This implies that the algorithm can be coded easily even in non-mathematically-oriented programming languages such as those found in industrial embedded control hardware. Simulation examples are given in the paper.
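
    For context, the underlying min-max MPC problem can be stated in the standard form below (a textbook formulation, not quoted from the abstract); the strategy described above replaces the inner maximisation with a cheaply computable upper bound.

        % Generic min-max MPC problem over a bounded disturbance set \mathcal{W}
        % (standard form, assumed for illustration):
        \[
          \min_{u_0,\dots,u_{N-1}} \; \max_{w \in \mathcal{W}} \; J(x, u, w)
          \qquad \longrightarrow \qquad
          \min_{u} \; \bar{J}(x, u), \quad
          \bar{J}(x, u) \,\ge\, \max_{w \in \mathcal{W}} J(x, u, w).
        \]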

    Min-Max MPC based on a computationally efficient upper bound of the worst case cost

    Min-Max MPC (MMMPC) controllers [P.J. Campo, M. Morari, Robust model predictive control, in: Proc. American Control Conference, June 10–12, 1987, pp. 1021–1026] suffer from a great computational burden which limits their applicability in industry. Upper bounds of the worst possible case of a performance index have sometimes been used to reduce the computational burden. This paper proposes a computationally efficient MMMPC control strategy in which the worst-case cost is approximated by an upper bound based on a diagonalization scheme. The upper bound can be computed with O(n³) operations using only simple matrix operations. This implies that the algorithm can be coded easily even in non-mathematically-oriented programming languages such as those found in industrial embedded control hardware. A simulation example is given in the paper.
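
    The O(n³) figure suggests an eigendecomposition-based bound. As a hedged illustration of that general idea (not the exact bound of the paper), the sketch below over-bounds the worst case of a quadratic cost over a unit box of disturbances using one symmetric eigendecomposition and a few norms; the cost structure, variable names, and the particular bound are assumptions.

        import numpy as np

        def worst_case_upper_bound(H, f, c=0.0):
            """Upper bound on  max_{w in [-1, 1]^n}  w^T H w + 2 f^T w + c.

            Illustrative diagonalization-based bound (an assumption, not the
            paper's formula): with H = Q diag(lam) Q^T,
                w^T H w = sum_i lam_i (q_i^T w)^2 <= sum_{lam_i > 0} lam_i ||q_i||_1^2
            since |q_i^T w| <= ||q_i||_1 on the unit box, and 2 f^T w <= 2 ||f||_1.
            Only an O(n^3) eigendecomposition and simple matrix operations are used.
            """
            H = 0.5 * (H + H.T)                 # symmetrize for eigh
            lam, Q = np.linalg.eigh(H)          # H = Q diag(lam) Q^T
            col_l1 = np.abs(Q).sum(axis=0)      # ||q_i||_1 for each eigenvector
            quad = np.sum(np.clip(lam, 0.0, None) * col_l1 ** 2)
            return quad + 2.0 * np.abs(f).sum() + c

        # Toy usage with a 4-dimensional disturbance.
        rng = np.random.default_rng(1)
        A = rng.standard_normal((4, 4))
        print(worst_case_upper_bound(A @ A.T, rng.standard_normal(4)))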

    Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages

    Pre-trained multilingual language models underpin a large portion of modern NLP tools outside of English. A strong baseline for specializing these models for specific languages is Language-Adaptive Pre-Training (LAPT). However, retaining a large cross-lingual vocabulary and embedding matrix comes at considerable excess computational cost during adaptation. In this study, we propose several simple techniques to replace a cross-lingual vocabulary with a compact, language-specific one. Namely, we address strategies for re-initializing the token embedding matrix after vocabulary specialization. We then provide a systematic experimental comparison of our techniques, in addition to the recently proposed Focus method. We demonstrate that: 1) embedding-replacement techniques in the monolingual transfer literature are inadequate for adapting multilingual models; 2) replacing cross-lingual vocabularies with smaller specialized ones provides an efficient method to improve performance in low-resource languages; and 3) simple embedding re-initialization techniques based on script-wise sub-distributions rival techniques such as Focus, which rely on similarity scores obtained from an auxiliary model.
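
    As a hedged sketch of the recipe described above (not the authors' code; the helper names and the Gaussian-sampling choice are assumptions), re-initialising the embedding matrix for a specialized vocabulary can copy trained vectors for shared tokens and draw new ones from per-script statistics of the old embeddings.

        import numpy as np

        def reinit_embeddings(old_emb, old_vocab, new_vocab, token_script):
            """Build an embedding matrix for a specialized vocabulary.

            Illustrative only: shared tokens keep their trained vectors; new
            tokens are sampled from a Gaussian fitted to the old embeddings of
            the same script ("script-wise sub-distribution"). `token_script`
            is a hypothetical helper mapping a token to a script label.
            """
            rng = np.random.default_rng(0)
            new_emb = np.empty((len(new_vocab), old_emb.shape[1]), dtype=old_emb.dtype)

            # Per-script mean/std of the existing embeddings.
            stats = {}
            for script in {token_script(t) for t in old_vocab}:
                rows = old_emb[[i for i, t in enumerate(old_vocab) if token_script(t) == script]]
                stats[script] = (rows.mean(axis=0), rows.std(axis=0) + 1e-8)
            fallback = (old_emb.mean(axis=0), old_emb.std(axis=0))

            old_index = {t: i for i, t in enumerate(old_vocab)}
            for j, tok in enumerate(new_vocab):
                if tok in old_index:                      # shared token: copy
                    new_emb[j] = old_emb[old_index[tok]]
                else:                                     # new token: sample
                    mean, std = stats.get(token_script(tok), fallback)
                    new_emb[j] = rng.normal(mean, std)
            return new_emb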