347,855 research outputs found

    Comparative Analysis of Urdu Based Stemming Techniques

    Stemming reduces the variant forms of a word to its base, stem, or root, which is necessary for many language processing applications, including those for Urdu. Urdu is a morphologically rich and resourceful language. Multilingual Urdu words are very challenging to process because of their complex morphology. Research on Urdu stemming is about a decade old. The present work introduces research on Urdu stemmers that perform better than the existing Urdu stemmer.

    Monoids and the State Complexity of the Operation root(L)

    In this thesis, we cover the general topic of state complexity. In particular, we examine the bounds on the state complexity of some different representations of regular languages. We also consider the state complexity of the operation root(L). We give a quick treatment of the deterministic state complexity bounds for nondeterministic finite automata and regular expressions. This includes an improvement on the worst-case lower bound for a regular expression, relative to its alphabetic length. The focus of this thesis is the study of the increase in state complexity of a regular language L under the operation root(L). This operation requires us to examine the connections between abstract algebra and formal languages. We present results, some original to this thesis, concerning the size of the largest monoid generated by two elements. Also, we give good bounds on the worst-case state complexity of root(L). In turn, these new results concerning root(L) allow us to improve previous bounds given for the state complexity of two-way deterministic finite automata.
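The abstract does not define root(L). A minimal sketch, assuming the common definition root(L) = { w : w^n is in L for some n >= 1 }: for a complete DFA, the word w induces a transformation on the states, and w^n is accepted exactly when applying that transformation n times to the start state lands in a final state. The function name `in_root` and the dictionary encoding of the DFA are illustrative choices, not taken from the thesis.

```python
def in_root(delta, start, finals, w):
    """Decide whether w is in root(L) for the language L of a complete DFA.

    Assumed definition: root(L) = { w : w^n in L for some n >= 1 }.
    delta  : dict mapping (state, symbol) -> state (must be total)
    start  : start state
    finals : set of accepting states
    """
    def apply_w(q):
        # The transformation of the state set induced by reading w.
        for symbol in w:
            q = delta[(q, symbol)]
        return q

    seen = set()
    q = apply_w(start)          # state reached after w^1
    while q not in seen:
        if q in finals:
            return True         # some power w^n is accepted
        seen.add(q)
        q = apply_w(q)          # state reached after the next power of w
    return False                # the orbit cycled without hitting a final state


# Hypothetical example: L = { a^n : n = 2 (mod 4) } over the alphabet {a}.
delta = {(q, "a"): (q + 1) % 4 for q in range(4)}
print(in_root(delta, 0, {2}, "a"))     # a^2 is in L
print(in_root(delta, 0, {2}, "aaaa"))  # every power has length 0 (mod 4)
```

The loop terminates after at most as many steps as there are states, since the orbit of the start state under a fixed transformation is eventually periodic. The transformations induced by all words form the monoid that links this operation to the algebraic questions the abstract mentions.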

    Tone lowering in nominal compounds of Copala Triqui

    This paper examines the distribution of tone lowering in the nominal compounds of Copala Triqui (Otomanguean). We consider two types of compounds: fused compounds and unfused compounds, which differ with respect to their prosodic complexity. For both types of compounds, the second root noun lowers in some examples, while it maintains its lexically specified tone in others. In this paper, we consider the advantages and disadvantages of a lexical as compared to a structural approach to tone lowering. More specifically, tonal overlay may belong to either a categorical head or a specific syntactic configuration, as in McPherson (2014)'s description of tone overlay in the Dogon language family.

    Assembly Language

    “Assembly Language” is a culmination of an exploration, through the medium of ceramics, in understanding complexity that arises through the interactions between simple components. In the realm of computer science, the term “Assembly Language” refers to a low-level programming language for any programmable digital device. It is typically just one step above writing in the raw ones and zeros of binary. Every program at some point needs to be translated into assembly language so that it can be understood by the device, and every program that has ever been written for a digital device is essentially composed of a series of these simple assembly language instructions. In this body of work, I use the metaphor of the role of assembly language in computer science to explore a similar process of breaking down complex systems into simple components and then using those simple components to construct new complex systems. The starting point for this investigation is the design of a root component that would have common physical interface points with other instances of that component. My choice of a root component is a five-degree tapered column with a height that is four times the length of one of the sides of its largest hexagonal end. I created a synthetic phylogeny of the components used in the creation of works for this show. A component’s ancestor within this phylogeny is the one with the most influence on the revisions to create the new component. All works created for this exploration are composed solely of components that are ceramic instances of the components shown in the phylogeny. Each grouping highlights a novel interface between individual components joining together to form something more complex. Each work showcases a particular instance of this interfacing between instances of components to form a unique sculpture.

    Privileged Words and Sturmian Words

    This dissertation has two almost unrelated themes: privileged words and Sturmian words. Privileged words are a new class of words introduced recently. A word is privileged if it is a complete first return to a shorter privileged word, the shortest privileged words being letters and the empty word. Here we give and prove almost all results on privileged words known to date. On the other hand, the study of Sturmian words is a well-established topic in combinatorics on words. In this dissertation, we focus on questions concerning repetitions in Sturmian words, reproving old results and giving new ones, and on establishing completely new research directions. The study of privileged words presented in this dissertation aims to derive their basic properties and to answer basic questions regarding them. We explore a connection between privileged words and palindromes and seek out answers to questions on context-freeness, computability, and enumeration. It turns out that the language of privileged words is not context-free, but privileged words are recognizable by a linear-time algorithm. A lower bound on the number of binary privileged words of given length is proven. The main interest, however, lies in the privileged complexity functions of the Thue-Morse word and Sturmian words. We derive recurrences for computing the privileged complexity function of the Thue-Morse word, and we prove that Sturmian words are characterized by their privileged complexity function. As a slightly separate topic, we give an overview of a certain method of automated theorem-proving and show how it can be applied to study privileged factors of automatic words. The second part of this dissertation is devoted to Sturmian words. We extensively exploit the interpretation of Sturmian words as irrational rotation words. The essential tools are continued fractions and elementary, but powerful, results of Diophantine approximation theory. 
With these tools at our disposal, we reprove old results on powers occurring in Sturmian words with emphasis on the fractional index of a Sturmian word. Further, we consider abelian powers and abelian repetitions and characterize the maximum exponents of abelian powers with given period occurring in a Sturmian word in terms of the continued fraction expansion of its slope. We define the notion of abelian critical exponent for Sturmian words and explore its connection to the Lagrange spectrum of irrational numbers. The results obtained are often specialized for the Fibonacci word; for instance, we show that the minimum abelian period of a factor of the Fibonacci word is a Fibonacci number. In addition, we propose a completely new research topic: the square root map. We prove that the square root map preserves the language of any Sturmian word. Moreover, we construct a family of non-Sturmian optimal squareful words whose language the square root map also preserves. This construction yields examples of aperiodic infinite words whose square roots are periodic.
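The recursive definition of privileged words given in this abstract can be checked directly. The following is a minimal sketch, not the linear-time algorithm the dissertation refers to: it reads "complete first return to u" as "a word that has u as both prefix and suffix and contains exactly two (possibly overlapping) occurrences of u". The function names are illustrative.

```python
from functools import lru_cache


def count_occurrences(w, u):
    """Count occurrences of u in w, allowing overlaps."""
    return sum(1 for i in range(len(w) - len(u) + 1) if w.startswith(u, i))


@lru_cache(maxsize=None)
def is_privileged(w):
    """Naive recursive test for privileged words.

    Base case: the empty word and single letters are privileged.
    Otherwise w must be a complete first return to a shorter
    nonempty privileged word u, i.e. u is a prefix and a suffix
    of w and occurs in w exactly twice.
    """
    if len(w) <= 1:
        return True
    for k in range(1, len(w)):
        u = w[:k]
        if w.endswith(u) and count_occurrences(w, u) == 2 and is_privileged(u):
            return True
    return False


for word in ["aba", "aaa", "abab", "abaaba"]:
    print(word, is_privileged(word))
```

Here "aba" and "aaa" are privileged (complete first returns to "a" and "aa"), "abab" is not because its only nonempty border "ab" is not privileged, and "abaaba" is a complete first return to "aba". The check is polynomial but far from linear time; it is meant only to make the definition concrete.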

    Time and Place in the Prehistory of the Aslian Languages

    The Aslian language family, located in the Malay Peninsula and southern Thai Isthmus, consists of four distinct branches comprising some 18 languages. These languages predate the now dominant Malay and Thai. The speakers of Aslian languages exhibit some of the highest degrees of phylogenetic and societal diversity present in Mainland Southeast Asia today, among them a foraging tradition particularly associated with locally ancient, Pleistocene genetic lineages. Little advance has been made in our understanding of the linguistic prehistory of this region or how such complexity arose. In this article we present a Bayesian phylogeographic analysis of a large sample of Aslian languages. An explicit geographic model of diffusion is combined with a cognate birth-word death model of lexical evolution to infer the location of the major events of Aslian cladogenesis. The resultant phylogenetic trees are calibrated against dates in the historical and archaeological record to infer a detailed picture of Aslian language history, addressing a number of outstanding questions, including (1) whether the root ancestor of Aslian was spoken in the Malay Peninsula, or whether the family had already divided before entry, and (2) the dynamics of the movement of Aslian languages across the peninsula, with a particular focus on its spread to the indigenous foragers.

    Root finding with threshold circuits

    We show that for any constant d, complex roots of degree d univariate rational (or Gaussian rational) polynomials---given by a list of coefficients in binary---can be computed to a given accuracy by a uniform TC^0 algorithm (a uniform family of constant-depth polynomial-size threshold circuits). The basic idea is to compute the inverse function of the polynomial by a power series. We also discuss an application to the theory VTC^0 of bounded arithmetic. Comment: 19 pages, 1 figure