A graphical environment for change detection in structured documents
Change detection in structured documents (e.g. SGML) is important in data warehousing, digital libraries and Internet databases. This thesis presents a graphical environment for detecting changes in structured documents. We represent each document by an ordered labeled tree based on the underlying markup language. We then compare two documents by invoking previously developed algorithms for approximate pattern matching and pattern discovery in trees. Several operators are developed to support the comparison of the documents; graphical devices are provided to facilitate the use of the operators. We believe the proposed tool is useful not only for document management but also for software maintenance, particularly configuration management and version control, where programs are represented as parse trees and detecting changes in the trees provides a way to find the syntactic differences between two program versions.
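As a rough illustration of the ordered-labeled-tree representation mentioned in the abstract above, the following minimal sketch parses two small XML fragments into trees and compares them. It is not the thesis's method: the abstract refers to approximate pattern matching in trees, whereas this sketch only pairs children by position. The function names `to_tree` and `diff` and the sample documents are hypothetical, chosen for illustration.

```python
import xml.etree.ElementTree as ET


def to_tree(elem):
    """Convert an XML element into an ordered labeled tree: (label, [children])."""
    return (elem.tag, [to_tree(child) for child in elem])


def diff(a, b, path="/"):
    """Naive positional comparison of two ordered labeled trees.

    Returns a list of (operation, location) pairs. This is NOT approximate
    tree matching: children are simply paired by their position.
    In a location, "[i]" is the child position under the preceding label.
    """
    changes = []
    label_a, kids_a = a
    label_b, kids_b = b
    here = path + label_a
    if label_a != label_b:
        changes.append(("relabel", f"{here} -> {label_b}"))
    for i in range(max(len(kids_a), len(kids_b))):
        child_path = f"{here}[{i}]/"
        if i >= len(kids_a):
            # Node exists only in the second tree.
            changes.append(("insert", child_path + kids_b[i][0]))
        elif i >= len(kids_b):
            # Node exists only in the first tree.
            changes.append(("delete", child_path + kids_a[i][0]))
        else:
            changes.extend(diff(kids_a[i], kids_b[i], child_path))
    return changes


# Hypothetical sample documents (two versions of the same small structure).
old = to_tree(ET.fromstring("<doc><title/><sec><p/><p/></sec></doc>"))
new = to_tree(ET.fromstring("<doc><title/><sec><p/><note/></sec><sec/></doc>"))
changes = diff(old, new)
for operation, location in changes:
    print(operation, location)
```

Positional pairing is easy to follow but reports a long cascade of relabels when a single node is inserted early in a child list; an edit-distance-based tree matching algorithm avoids that at a higher computational cost.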
New Methods, Current Trends and Software Infrastructure for NLP
The increasing use of `new methods' in NLP, which the NeMLaP conference
series exemplifies, occurs in the context of a wider shift in the nature and
concerns of the discipline. This paper begins with a short review of this
context and significant trends in the field. The review motivates and leads to
a set of requirements for support software of general utility for NLP research
and development workers. A freely-available system designed to meet these
requirements is described (called GATE - a General Architecture for Text
Engineering). Information Extraction (IE), in the sense defined by the Message
Understanding Conferences (ARPA \cite{Arp95}), is an NLP application in which
many of the new methods have found a home (Hobbs \cite{Hob93}; Jacobs ed.
\cite{Jac92}). An IE system based on GATE is also available for research
purposes, and this is described. Lastly we review related work.
Comment: 12 pages, LaTeX, uses nemlap.sty (included
Content delivery and challenges in education hybrid students
Traditionally, taught postgraduate programmes placed students in well-defined categories such as 'distance learning' and 'on-campus' or 'part-time' and 'full-time'. The practical reality is that postgraduate students rarely fall into such simple, diametric roles and can be more suitably generalised under the concept of the 'hybrid student'. Hybrid students are dynamic, with changing
requirements in relation to their education. They expect flexibility and the ability to make changes relating to module participation level, study mechanism and lecture attendance, in order to suit personal preference and circumstance. This paper briefly introduces the concept of the hybrid student and how the concept has been handled within the School of Electronic Engineering at DCU.
Following this, some discussion is provided in relation to a number of the content delivery technologies used in programmes facilitating these students: HTML, PowerPoint, Moodle, DocBook and Wiki. Finally, some of the general challenges, which have been encountered in supporting such
diverse students, are briefly discussed.
Submission of content to a digital object repository using a configurable workflow system
The prototype of a workflow system for the submission of content to a digital
object repository is here presented. It is based entirely on open-source
standard components and features a service-oriented architecture. The front-end
consists of Java Business Process Management (jBPM), Java Server Faces (JSF),
and Java Server Pages (JSP). A Fedora Repository and a MySQL database
management system serve as a back-end. The communication between front-end and
back-end uses a SOAP minimal binding stub. We describe the design principles
and the construction of the prototype and discuss the possibilities and
limitations of workflow creation by administrators. The code of the prototype is
open-source and can be retrieved in the project escipub at
http://sourceforge.ne
Strange bedfellows? Keyword and conceptual search unite to make sense of relevant ESI in electronic discovery
In the brief history of electronic discovery, the latter part of the twentieth century witnessed the
demise of paper by a digital hero that emancipated the content of paper documents with OCR
and TIFF. This technology added a third dimension to the realm of 2D paper document review
and production that led to a sea change in discovery methods. By many accounts what we have
before us is a three-stage evolution from paper to digital to clustering in order to overcome the
problems of volume and complexity of ESI. The intent of this position paper is to describe the
development of the digital hero and methodology that is emancipating the content and context of
ESI: conceptual search that spans file formats, languages and technique, and includes keyword
search on a common, shared index.