4,555 research outputs found

    Is a Dataframe Just a Table?

    Querying data is core to databases and data science. However, the two communities have seemingly different concepts and use cases. As a result, both designers and users of the query languages disagree on whether the core abstractions, dataframes (data science) and tables (databases), and the operations are the same. To investigate the difference from a PL-HCI perspective, we identify the basic affordances provided by tables and dataframes and how programming experiences over tables and dataframes differ. We show that the data structures nudge programmers to query and store their data in different ways. We hope the case study can clear up confusion, dispel misinformation, increase cross-pollination between the two communities, and identify open PL-HCI questions.

    Syntax Analysis on The News Title of Cyber Media on Detik Twitter Account (@Detikcom)

    This study concerns the use of syntax in cyber media. The researcher examined news headlines on the @detikcom Twitter account, the official account of the website www.detik.com. The study aims to describe the sentence types and to categorize the word classes found in the @detikcom Twitter timeline. The researcher used a descriptive qualitative method, and the results were analyzed using a syntactic approach. For data collection, the researcher used synchronic language research, observing language phenomena at a particular point in time. The results show that there are four types of sentences on the @detikcom Twitter account based on syntactic structure: 93 complete sentences, 7 complete sentences, 4 ambiguous sentences, and 13 elliptical sentences. Of the 13 word-class categories proposed by Harimurti Kridalaksana, the researcher found 5: nouns, verbs, prepositions, adjectives, and adverbs. Keywords: syntax, headlines, twitte

    Abstract Syntax Networks for Code Generation and Semantic Parsing

    Tasks like code generation and semantic parsing require mapping unstructured (or partially structured) inputs to well-formed, executable outputs. We introduce abstract syntax networks, a modeling framework for these problems. The outputs are represented as abstract syntax trees (ASTs) and constructed by a decoder with a dynamically-determined modular structure paralleling the structure of the output tree. On the benchmark Hearthstone dataset for code generation, our model obtains 79.2 BLEU and 22.7% exact match accuracy, compared to previous state-of-the-art values of 67.1 and 6.1%. Furthermore, we perform competitively on the Atis, Jobs, and Geo semantic parsing datasets with no task-specific engineering. Comment: ACL 2017. MR and MS contributed equally.

    A generalized strategy for building resident database interfaces

    A strategy for building resident interfaces to host heterogeneous distributed database management systems is developed. The strategy is used to construct several interfaces. A set of guidelines is developed for users to construct their own interfaces.

    GENERATING AMHARIC PRESENT TENSE VERBS: A NETWORK MORPHOLOGY & DATR ACCOUNT

    In this thesis I attempt to model, that is, computationally reproduce, the natural transmission (i.e. inflectional regularities) of twenty present tense Amharic verbs (i.e. triradicals beginning with consonants) as used by the language’s speakers. I root my approach in the linguistic theory of network morphology (NM) and model it using the DATR evaluator. In Chapter 1, I provide an overview of Amharic and discuss the fidel as an abugida, the verb system’s root-and-pattern morphology, and how the radicals of each lexeme interact with prefixes and suffixes. I offer an overview of NM in Chapter 2 and DATR in Chapter 3. In both chapters I draw attention to and help interpret key terms used among scholars doing work in both fields. In Chapter 4 I set forth my full theory, along with notation, for generating the paradigms of twenty present tense Amharic verbs that follow four different patterns. Chapter 5, the final chapter, contains a summary and offers several conclusions. I provide the DATR output in the Appendix. In writing, my main hope is that this project will make a contribution, however minimal or sizeable, that might advance the field of Amharic studies in particular and (computational) linguistics in general.

    LAKI VERBAL MORPHOSYNTAX

    Most western Iranian languages, despite their broad differences, show a common quality when it comes to the verbal agreement of past transitive verbs. Dabir-moghaddam (2013) and Haig (2008) discuss it as a grammaticalized split-agreement to encode S, A, and P, which is sensitive to tense and transitivity, and uses split-ergative constructions for its past transitive verbs. Laki shows vestiges of the same kind of verb-agreement ergativity (Comrie 1978) by using a mixture of affixes and clitics for subject and object marking. In this thesis, I investigate how the different classes of verbs show agreement using four distinct property classes. Considering the special case of the {3 sg} and using Hopper and Traugott's pattern for the cline of grammaticality (2003), I argue that although Laki has already lost the main part of its ergative constructions, the case of the {3 sg} marking is yet another sign that this language is in the process of absolute de-ergativization and its hybrid alignment system is moving toward morphosyntactic unity. As a formal representation of the Laki data, the final part of the thesis provides a morphosyntactic HPSG analysis of the agreement patterns in Laki, using the grammar of cliticized verb-forms (Miller and Sag 1997).

    Stripping paradigmatic relations out of the syntax


    Learning Language from a Large (Unannotated) Corpus

    A novel approach to the fully automated, unsupervised extraction of dependency grammars and associated syntax-to-semantic-relationship mappings from large text corpora is described. The suggested approach builds on the authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well as on a number of prior papers and approaches from the statistical language learning literature. If successful, this approach would enable the mining of all the information needed to power a natural language comprehension and generation system, directly from a large, unannotated corpus. Comment: 29 pages, 5 figures, research proposal.