42,142 research outputs found

    A graphical environment for change detection in structured documents

    Change detection in structured documents (e.g. SGML) is important in data warehousing, digital libraries and Internet databases. This thesis presents a graphical environment for detecting changes in structured documents. We represent each document by an ordered labeled tree based on the underlying markup language. We then compare two documents by invoking previously developed algorithms for approximate pattern matching and pattern discovery in trees. Several operators are developed to support the comparison of the documents; graphical devices are provided to facilitate the use of the operators. We believe the proposed tool is useful not only for document management but also for software maintenance, particularly configuration management and version control, where programs are represented as parse trees and detecting changes in the trees provides a way to find the syntactic differences between two program versions

    New Methods, Current Trends and Software Infrastructure for NLP

    The increasing use of 'new methods' in NLP, which the NeMLaP conference series exemplifies, occurs in the context of a wider shift in the nature and concerns of the discipline. This paper begins with a short review of this context and of significant trends in the field. The review motivates a set of requirements for general-purpose support software for NLP research and development workers. A freely available system designed to meet these requirements, GATE (a General Architecture for Text Engineering), is described. Information Extraction (IE), in the sense defined by the Message Understanding Conferences (ARPA \cite{Arp95}), is an NLP application in which many of the new methods have found a home (Hobbs \cite{Hob93}; Jacobs ed. \cite{Jac92}). An IE system based on GATE is also available for research purposes, and this is described. Lastly, we review related work. Comment: 12 pages, LaTeX, uses nemlap.sty (included)

    Content delivery and challenges in education hybrid students

    Traditionally, taught postgraduate programmes placed students in well-defined categories such as 'distance learning' and 'on-campus' or 'part-time' and 'full-time'. The practical reality is that postgraduate students rarely fall into such simple, diametric roles and can be more suitably generalised under the concept of the 'hybrid student'. Hybrid students are dynamic, with changing requirements in relation to their education. They expect flexibility and the ability to make changes relating to module participation level, study mechanism and lecture attendance, in order to suit personal preference and circumstance. This paper briefly introduces the concept of the hybrid student and how the concept has been handled within the School of Electronic Engineering at DCU. Following this, some discussion is provided in relation to a number of the content delivery technologies used in programmes facilitating these students: HTML, PowerPoint, Moodle, DocBook and Wiki. Finally, some of the general challenges, which have been encountered in supporting such diverse students, are briefly discussed

    Submission of content to a digital object repository using a configurable workflow system

    A prototype of a workflow system for the submission of content to a digital object repository is presented here. It is based entirely on standard open-source components and features a service-oriented architecture. The front-end consists of Java Business Process Management (jBPM), Java Server Faces (JSF), and Java Server Pages (JSP). A Fedora Repository and a MySQL database management system serve as the back-end. The communication between front-end and back-end uses a SOAP minimal binding stub. We describe the design principles and the construction of the prototype and discuss the possibilities and limitations of workflow creation by administrators. The code of the prototype is open-source and can be retrieved in the project escipub at http://sourceforge.ne

    Strange bedfellows? Keyword and conceptual search unite to make sense of relevant ESI in electronic discovery

    In the brief history of electronic discovery, the latter part of the twentieth century witnessed the demise of paper at the hands of a digital hero that emancipated the content of paper documents with OCR and TIFF. This technology added a third dimension to the realm of 2D paper document review and production, which led to a sea change in discovery methods. By many accounts, what we have before us is a three-stage evolution from paper to digital to clustering in order to overcome the problems of volume and complexity of ESI. The intent of this position paper is to describe the development of the digital hero and methodology that is emancipating the content and context of ESI – conceptual search that spans file formats, languages and techniques, and includes keyword search on a common, shared index

    Analytical Challenges in Modern Tax Administration: A Brief History of Analytics at the IRS
