Algorithmic Programming Language Identification
Motivated by the amount of code that goes unidentified on the web, we
introduce a practical method for algorithmically identifying the programming
language of source code. Our work is based on supervised learning and
intelligent statistical features. We also explored, but abandoned, a
grammatical approach. In testing, our implementation greatly outperforms that
of an existing tool that relies on a Bayesian classifier. Code is written in
Python and available under an MIT license.
Comment: 11 pages. Code:
https://github.com/simon-weber/Programming-Language-Identificatio
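The abstract describes supervised classification of source code by language using statistical features. A minimal sketch of that general idea (an illustration only, not the paper's implementation): score a snippet against hand-picked per-language keyword sets and return the best-scoring language.

```python
# Toy language identifier: count occurrences of language-typical keywords
# and pick the language with the highest score. The keyword sets below are
# illustrative assumptions, not features from the paper.

LANGUAGE_KEYWORDS = {
    "python": {"def", "import", "self", "elif", "lambda", "None"},
    "c": {"#include", "printf", "int", "void", "struct", "malloc"},
    "ruby": {"def", "end", "puts", "require", "module", "nil"},
}

def identify_language(source: str) -> str:
    tokens = source.split()
    scores = {
        lang: sum(tokens.count(kw) for kw in kws)
        for lang, kws in LANGUAGE_KEYWORDS.items()
    }
    return max(scores, key=scores.get)

snippet = "import sys\ndef main():\n    print(sys.argv)"
print(identify_language(snippet))  # -> python
```

A real system would use a trained classifier over many such features rather than fixed keyword lists, but the feature-scoring shape is the same.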
Crossings as a side effect of dependency lengths
The syntactic structure of sentences exhibits a striking regularity:
dependencies tend to not cross when drawn above the sentence. We investigate
two competing explanations. The traditional hypothesis is that this trend
arises from an independent principle of syntax that reduces crossings
practically to zero. An alternative to this view is the hypothesis that
crossings are a side effect of dependency lengths, i.e. sentences with shorter
dependency lengths should tend to have fewer crossings. We are able to reject
the traditional view in the majority of languages considered. The alternative
hypothesis can lead to a more parsimonious theory of language.
Comment: the discussion section has been expanded significantly; in press in
Complexity (Wiley)
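A dependency can be represented as an arc between two word positions; two arcs drawn above the sentence cross exactly when their endpoints strictly interleave. A short sketch of counting crossings under that standard definition (a hypothetical helper, not code from the paper):

```python
# Count crossing pairs among dependency arcs. Each arc is a (head, dependent)
# pair of 1-based word positions; arcs (i, j) and (k, l), with endpoints
# sorted, cross iff i < k < j < l (or symmetrically k < i < l < j).

from itertools import combinations

def count_crossings(arcs):
    spans = [tuple(sorted(a)) for a in arcs]
    crossings = 0
    for (i, j), (k, l) in combinations(spans, 2):
        if i < k < j < l or k < i < l < j:
            crossings += 1
    return crossings

print(count_crossings([(1, 3), (2, 4)]))  # interleaved arcs -> 1
print(count_crossings([(1, 4), (2, 3)]))  # nested arcs -> 0
```

The hypothesis under test is then a correlation claim: sentences with shorter total dependency length should, on average, yield smaller values of this count.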
This Just In: Fake News Packs a Lot in Title, Uses Simpler, Repetitive Content in Text Body, More Similar to Satire than Real News
The problem of fake news has gained a lot of attention as it is claimed to
have had a significant impact on the 2016 US Presidential Election. Fake news
is not a new problem and its spread in social networks is well-studied. Often an
underlying assumption in fake news discussion is that it is written to look
like real news, fooling the reader who does not check for reliability of the
sources or the arguments in its content. Through a unique study of three data
sets and features that capture the style and the language of articles, we show
that this assumption is not true. Fake news in most cases is more similar to
satire than to real news, leading us to conclude that persuasion in fake news
is achieved through heuristics rather than the strength of arguments. We show
that overall title structure and the use of proper nouns in titles are highly
significant in differentiating fake from real news. This leads us to conclude
that fake news is targeted at audiences who are not likely to read beyond titles
and is aimed at creating mental associations between entities and claims.
Comment: Published at The 2nd International Workshop on News and Public
Opinion at ICWS
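The abstract highlights title structure and proper-noun use as discriminative features. A toy feature extractor in that spirit, using capitalisation as a crude proper-noun proxy (the paper's actual feature set is far richer, and this heuristic is an assumption of the sketch):

```python
# Extract simple title features: word count and a capitalisation-based
# proper-noun estimate (skipping the first word, which is capitalised
# regardless). Real systems would use a POS tagger instead.

def title_features(title: str) -> dict:
    words = title.split()
    proper = [w for i, w in enumerate(words) if i > 0 and w[0].isupper()]
    return {
        "n_words": len(words),
        "n_proper_nouns": len(proper),
        "proper_noun_ratio": len(proper) / len(words) if words else 0.0,
    }

features = title_features("Pope Francis Shocks World, Endorses Candidate")
print(features["n_words"], features["n_proper_nouns"])  # -> 6 5
```

Features like these would then feed a classifier separating fake from real titles.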
Teaching University Students to Read and Write
Recent government initiatives have required universities to include specific literacy and numeracy targets for their students. The authors – both members of the English discipline at Charles Sturt University – were invited to develop and run a two-semester program for all students studying to become early childhood, primary, and secondary teachers. This article outlines the nature of the two subjects which comprise the program: the first focused on reading and comprehension, the second on writing and composition. These subjects were conceived from collegial dialogues between academics in education and the humanities, and then developed from these different assumptions and starting points. Over the last five years, the shared experiences of teaching these prospective teachers have grown into a strongly coherent first year of study. This article seeks to describe the experiences of teaching literacy to first-year education students, and it is by turns hypothesising and speculative, reflective and qualitative, in its approach. In the process, this article offers colleagues across the country a reflection on the hypotheses of literacy education, some new ideas for teaching literacy, and some optimism for the future of the teaching profession and the dignity of those who aspire to be a part of it.
Improving the translation environment for professional translators
When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological one.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
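Fuzzy matching retrieves the translation-memory segment most similar to a new source sentence. A minimal sketch of the idea using a character-based similarity ratio (SCATE's improved matchers are considerably more sophisticated; the sentences and threshold below are illustrative assumptions):

```python
# Return the (source, target, score) translation-memory entry most similar
# to the query, or None if nothing clears the threshold. Uses difflib's
# sequence-similarity ratio as a stand-in for a real fuzzy-match metric.

from difflib import SequenceMatcher

def best_fuzzy_match(query, memory, threshold=0.7):
    best = None
    for src, tgt in memory:
        score = SequenceMatcher(None, query, src).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (src, tgt, score)
    return best

tm = [("Press the red button.", "Druk op de rode knop."),
      ("Close the window.", "Sluit het venster.")]
match = best_fuzzy_match("Press the green button.", tm)
print(match[0])  # -> Press the red button.
```

The retrieved target segment then serves as a starting point for the translator, or as input to be repaired by a machine-translation component.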
Efficient Deep Processing of Japanese
We present a broad-coverage Japanese grammar written in the HPSG formalism with MRS semantics. The grammar is created for use in real-world applications, so that robustness and performance issues play an important role. It is connected to a POS tagging and word segmentation tool. The grammar is being developed in a multilingual context, requiring MRS structures that are easily comparable across languages.