138,999 research outputs found
On Hilberg's Law and Its Links with Guiraud's Law
Hilberg (1990) supposed that finite-order excess entropy of a random human
text is proportional to the square root of the text length. Assuming that
Hilberg's hypothesis is true, we derive Guiraud's law, which states that the
number of word types in a text is greater than proportional to the square root
of the text length. Our derivation is based on some mathematical conjecture in
coding theory and on several experiments suggesting that words can be defined
approximately as the nonterminals of the shortest context-free grammar for the
text. Such operational definition of words can be applied even to texts
deprived of spaces, which do not allow for Mandelbrot's ``intermittent
silence'' explanation of Zipf's and Guiraud's laws. In contrast to
Mandelbrot's, our model assumes some probabilistic long-memory effects in human
narration and might be capable of explaining Menzerath's law.Comment: To appear in Journal of Quantitative Linguistic
A CNL for Contract-Oriented Diagrams
We present a first step towards a framework for defining and manipulating
normative documents or contracts described as Contract-Oriented (C-O) Diagrams.
These diagrams provide a visual representation for such texts, giving the
possibility to express a signatory's obligations, permissions and prohibitions,
with or without timing constraints, as well as the penalties resulting from the
non-fulfilment of a contract. This work presents a CNL for verbalising C-O
Diagrams, a web-based tool allowing editing in this CNL, and another for
visualising and manipulating the diagrams interactively. We then show how these
proof-of-concept tools can be used by applying them to a small example
Recommended from our members
Music-reading expertise modulates the visual span for English letters but not Chinese characters.
Recent research has suggested that the visual span in stimulus identification can be enlarged through perceptual learning. Since both English and music reading involve left-to-right sequential symbol processing, music-reading experience may enhance symbol identification through perceptual learning particularly in the right visual field (RVF). In contrast, as Chinese can be read in all directions, and components of Chinese characters do not consistently form a left-right structure, this hypothesized RVF enhancement effect may be limited in Chinese character identification. To test these hypotheses, here we recruited musicians and nonmusicians who read Chinese as their first language (L1) and English as their second language (L2) to identify music notes, English letters, Chinese characters, and novel symbols (Tibetan letters) presented at different eccentricities and visual field locations on the screen while maintaining central fixation. We found that in English letter identification, significantly more musicians achieved above-chance performance in the center-RVF locations than nonmusicians. This effect was not observed in Chinese character or novel symbol identification. We also found that in music note identification, musicians outperformed nonmusicians in accuracy in the center-RVF condition, consistent with the RVF enhancement effect in the visual span observed in English-letter identification. These results suggest that the modulation of music-reading experience on the visual span for stimulus identification depends on the similarities in the perceptual processes involved
Frame-Based Editing: Easing the Transition from Blocks to Text-Based Programming
Block-based programming systems, such as Scratch or Alice, are the most popular environments for introducing young children to programming. However, mastery of text-based programming continues to be the educational goal for stu- dents who continue to program into their teenage years and beyond. Transitioning across the significant gap between the two editing styles presents a difficult challenge in school- level teaching of programming. We propose a new style of program manipulation to bridge the gap: frame-based edit- ing. Frame-based editing has the resistance to errors and approachability of block-based programming while retaining the flexibility and more conventional programming seman- tics of text-based programming languages. In this paper, we analyse the issues involved in the transition from blocks to text and argue that they can be overcome by using frame- based editing as an intermediate step. A design and imple- mentation of a frame-based editor is provided
SPARQL Playground: A block programming tool to experiment with SPARQL
SPARQL is a powerful query language for SemanticWeb data sources but one which is quite complex to master. As the block programming paradigm has been succesfully used to teach programming skills, we propose a tool that allows users to build and run SPARQL queries on an endpoint without previous knowledge of the syntax of SPARQL and the model of the data in the endpoint (vocabularies and semantics). This user interface attempts to close the gap between tools for the lay user that do not allow to express complex queries and overtly complex technical tools
The Validation of Speech Corpora
1.2 Intended audience........................
Declarative Specification
Deriving formal specifications from informal requirements is extremely difficult since one has to overcome the conceptual gap between an application domain and the domain of formal specification methods. To reduce this gap we introduce application-specific specification languages, i.e., graphical and textual notations that can be unambiguously mapped to formal specifications in a logic language. We describe a number of realised approaches based on this idea, and evaluate them with respect to their domain specificity vs. generalit
SurveyMan: Programming and Automatically Debugging Surveys
Surveys can be viewed as programs, complete with logic, control flow, and
bugs. Word choice or the order in which questions are asked can unintentionally
bias responses. Vague, confusing, or intrusive questions can cause respondents
to abandon a survey. Surveys can also have runtime errors: inattentive
respondents can taint results. This effect is especially problematic when
deploying surveys in uncontrolled settings, such as on the web or via
crowdsourcing platforms. Because the results of surveys drive business
decisions and inform scientific conclusions, it is crucial to make sure they
are correct.
We present SurveyMan, a system for designing, deploying, and automatically
debugging surveys. Survey authors write their surveys in a lightweight
domain-specific language aimed at end users. SurveyMan statically analyzes the
survey to provide feedback to survey authors before deployment. It then
compiles the survey into JavaScript and deploys it either to the web or a
crowdsourcing platform. SurveyMan's dynamic analyses automatically find survey
bugs, and control for the quality of responses. We evaluate SurveyMan's
algorithms analytically and empirically, demonstrating its effectiveness with
case studies of social science surveys conducted via Amazon's Mechanical Turk.Comment: Submitted version; accepted to OOPSLA 201
- âŚ