66,866 research outputs found

    A Flexible Shallow Approach to Text Generation

    Full text link
    In order to support the efficient development of NL generation systems, two orthogonal methods are currently pursued with emphasis: (1) reusable, general, and linguistically motivated surface realization components, and (2) simple, task-oriented template-based techniques. In this paper we argue that, from an application-oriented perspective, the benefits of both are still limited. In order to improve this situation, we suggest and evaluate shallow generation methods associated with increased flexibility. We advise a close connection between domain-motivated and linguistic ontologies that supports the quick adaptation to new tasks and domains, rather than the reuse of general resources. Our method is especially designed for generating reports with limited linguistic variations.Comment: LaTeX, 10 page

    The structure of the Arts & Humanities Citation Index: A mapping on the basis of aggregated citations among 1,157 journals

    Full text link
    Using the Arts & Humanities Citation Index (A&HCI) 2008, we apply mapping techniques previously developed for mapping journal structures in the Science and Social Science Citation Indices. Citation relations among the 110,718 records were aggregated at the level of 1,157 journals specific to the A&HCI, and the journal structures are questioned on whether a cognitive structure can be reconstructed and visualized. Both cosine-normalization (bottom up) and factor analysis (top down) suggest a division into approximately twelve subsets. The relations among these subsets are explored using various visualization techniques. However, we were not able to retrieve this structure using the ISI Subject Categories, including the 25 categories which are specific to the A&HCI. We discuss options for validation such as against the categories of the Humanities Indicators of the American Academy of Arts and Sciences, the panel structure of the European Reference Index for the Humanities (ERIH), and compare our results with the curriculum organization of the Humanities Section of the College of Letters and Sciences of UCLA as an example of institutional organization

    Presenting GECO : an eyetracking corpus of monolingual and bilingual sentence reading

    Get PDF
    This paper introduces GECO, the Ghent Eye-tracking Corpus, a monolingual and bilingual corpus of eye-tracking data of participants reading a complete novel. English monolinguals and Dutch-English bilinguals read an entire novel, which was presented in paragraphs on the screen. The bilinguals read half of the novel in their first language, and the other half in their second language. In this paper we describe the distributions and descriptive statistics of the most important reading time measures for the two groups of participants. This large eye-tracking corpus is perfectly suited for both exploratory purposes as well as more directed hypothesis testing, and it can guide the formulation of ideas and theories about naturalistic reading processes in a meaningful context. Most importantly, this corpus has the potential to evaluate the generalizability of monolingual and bilingual language theories and models to reading of long texts and narratives

    Four Lessons in Versatility or How Query Languages Adapt to the Web

    Get PDF
    Exposing not only human-centered information, but machine-processable data on the Web is one of the commonalities of recent Web trends. It has enabled a new kind of applications and businesses where the data is used in ways not foreseen by the data providers. Yet this exposition has fractured the Web into islands of data, each in different Web formats: Some providers choose XML, others RDF, again others JSON or OWL, for their data, even in similar domains. This fracturing stifles innovation as application builders have to cope not only with one Web stack (e.g., XML technology) but with several ones, each of considerable complexity. With Xcerpt we have developed a rule- and pattern based query language that aims to give shield application builders from much of this complexity: In a single query language XML and RDF data can be accessed, processed, combined, and re-published. Though the need for combined access to XML and RDF data has been recognized in previous work (including the W3Cā€™s GRDDL), our approach differs in four main aspects: (1) We provide a single language (rather than two separate or embedded languages), thus minimizing the conceptual overhead of dealing with disparate data formats. (2) Both the declarative (logic-based) and the operational semantics are unified in that they apply for querying XML and RDF in the same way. (3) We show that the resulting query language can be implemented reusing traditional database technology, if desirable. Nevertheless, we also give a unified evaluation approach based on interval labelings of graphs that is at least as fast as existing approaches for tree-shaped XML data, yet provides linear time and space querying also for many RDF graphs. We believe that Web query languages are the right tool for declarative data access in Web applications and that Xcerpt is a significant step towards a more convenient, yet highly efficient data access in a ā€œWeb of Dataā€

    A Perfect Match for Reasoning, Explanation, and Reason Maintenance

    Get PDF
    Path query languages have been previously shown to com- plement RDF rule languages in a natural way and have been used as a means to implement the RDFS derivation rules. RPL is a novel path query language specifically designed to be incorporated with RDF rules and comes in three avors: Node-, edge- and path- avored expressions allow to express conditional regular expressions over the nodes, edges, or nodes and edges appearing on paths within RDF graphs. Providing reg- ular string expressions and negation, RPL is more expressive than other RDF path languages that have been proposed. We give a compositional semantics for RPL and show that it can be evaluated efficiently, while several possible extensions of it cannot
    • ā€¦
    corecore