172 research outputs found
Recommended from our members
Introduction to the Special Issue on Software Architecture for Language Engineering
Every building, and every computer program, has an architecture: structural and organisational principles that underpin its design and construction. The garden shed
once built by one of the authors had an ad hoc architecture, extracted (somewhat painfully) from the imagination during a slow and non-deterministic process that, luckily, resulted in a structure which keeps the rain on the outside and the mower on the inside (at least for the time being). As well as being ad hoc (i.e. not informed by analysis of similar practice or relevant science or engineering) this architecture is implicit: no explicit design was made, and no records or documentation kept of the construction process. The pyramid in the courtyard of the Louvre, by contrast, was constructed in a process involving explicit design performed by qualified engineers with a wealth of theoretical and practical knowledge of the properties of materials, the relative merits and strengths of different construction techniques, et cetera. So it is with software: sometimes it is thrown together by enthusiastic amateurs; sometimes it is architected, built to last, and intended to be 'not something you finish, but something you start' (to paraphrase Brand (1994). A number of researchers argued in the early and middle 1990s that the field of computational infrastructure or architecture for human language computation merited an increase in attention. The reasoning was that the increasingly large-scale and technologically significant nature of language processing science was placing increasing burdens of an engineering nature on research and development workers seeking robust and practical methods (as was the increasingly collaborative nature of research in this field, which puts a large premium on software integration and interoperation). Over the intervening period a number of significant systems and practices have been developed in what we may call Software Architecture for Language Engineering (SALE). This special issue represented an opportunity for practitioners in this area to report their work in a coordinated setting, and to present a snapshot of the state-ofthe-art in infrastructural work, which may indicate where further development and further take-up of these systems can be of benefit
New Methods, Current Trends and Software Infrastructure for NLP
The increasing use of `new methods' in NLP, which the NeMLaP conference
series exemplifies, occurs in the context of a wider shift in the nature and
concerns of the discipline. This paper begins with a short review of this
context and significant trends in the field. The review motivates and leads to
a set of requirements for support software of general utility for NLP research
and development workers. A freely-available system designed to meet these
requirements is described (called GATE - a General Architecture for Text
Engineering). Information Extraction (IE), in the sense defined by the Message
Understanding Conferences (ARPA \cite{Arp95}), is an NLP application in which
many of the new methods have found a home (Hobbs \cite{Hob93}; Jacobs ed.
\cite{Jac92}). An IE system based on GATE is also available for research
purposes, and this is described. Lastly we review related work.Comment: 12 pages, LaTeX, uses nemlap.sty (included
Software Infrastructure for Natural Language Processing
We classify and review current approaches to software infrastructure for
research, development and delivery of NLP systems. The task is motivated by a
discussion of current trends in the field of NLP and Language Engineering. We
describe a system called GATE (a General Architecture for Text Engineering)
that provides a software infrastructure on top of which heterogeneous NLP
processing modules may be evaluated and refined individually, or may be
combined into larger application systems. GATE aims to support both researchers
and developers working on component technologies (e.g. parsing, tagging,
morphological analysis) and those working on developing end-user applications
(e.g. information extraction, text summarisation, document generation, machine
translation, and second language learning). GATE promotes reuse of component
technology, permits specialisation and collaboration in large-scale projects,
and allows for the comparison and evaluation of alternative technologies. The
first release of GATE is now available - see
http://www.dcs.shef.ac.uk/research/groups/nlp/gate/Comment: LaTeX, uses aclap.sty, 8 page
Visual Execution and Data Visualisation in Natural Language Processing
We describe GGI, a visual system that allows the user to execute an automatically generated data flow graph containing code modules that perform natural language processing tasks. These code modules operate on text documents. GGI has a suite of text visualisation tools that allows the user useful views of the annotation data that is produced by the modules in the executable graph. GGI forms part of the GATE natural language engineering system
Recommended from our members
The sustainable vegetables that thrive on a diet of fish poo
We currently rely on shipping large amounts of food across the world, leaving us remarkably vulnerable to disruption from wars, climate chaos, or volatile oil prices. Growing as much food as possible locally could boost resilience and allow us to start taking more control of what we eat, and reduce the intake of salt, sugar and preservatives, which drive diabetes and heart disease. Aquaponics is a sustainable and intensive alternative to both growing in soil and to hydroponics
Taxonomic Classification of IoT Smart Home Voice Control
Voice control in the smart home is commonplace, enabling the convenient
control of smart home Internet of Things hubs, gateways and devices, along with
information seeking dialogues. Cloud-based voice assistants are used to
facilitate the interaction, yet privacy concerns surround the cloud analysis of
data. To what extent can voice control be performed using purely local
computation, to ensure user data remains private? In this paper we present a
taxonomy of the voice control technologies present in commercial smart home
systems. We first review literature on the topic, and summarise relevant work
categorising IoT devices and voice control in the home. The taxonomic
classification of these entities is then presented, and we analyse our
findings. Following on, we turn to academic efforts in implementing and
evaluating voice-controlled smart home set-ups, and we then discuss open-source
libraries and devices that are applicable to the design of a privacy-preserving
voice assistant for smart homes and the IoT. Towards the end, we consider
additional technologies and methods that could support a cloud-free voice
assistant, and conclude the work
GATE -- an Environment to Support Research and Development in Natural Language Engineering
We describe a software environment to support research and development in natural language (NL) engineering. This environment -- GATE (General Architecture for Text Engineering) -- aims to advance research in the area of machine processing of natural languages by providing a software infrastructure on top of which heterogeneous NL component modules may be evaluated and refined individually or may be combined into larger application systems. Thus, GATE aims to support both researchers and developers working on component technologies (e.g. parsing, tagging, morphological analysis) and those working on developing end-user applications (e.g. information extraction, text summarisation, document generation, machine translation, and second language learning). GATE will promote reuse of component technology, permit specialisation and collaboration in large-scale projects, and allow for the comparison and evaluation of alternative technologies. The first release of GATE is now available
Using HLT for acquiring, retrieving and publishing knowledge in AKT:position paper
AKT is a major research project applying a variety of technologies to knowledge management. Knowledge is a dynamic, ubiquitous resource, which is to be found equally in an expert's head, under terabytes of data, or explicitly stated in manuals. AKT will extend knowledge management technologies to exploit the potential of the semantic web, covering the use of knowledge over its entire lifecycle, from acquisition to maintenance and deletion. In this paper we discuss how HLT will be used in AKT and how the use of HLT will affect different areas of KM, such as knowledge acquisition, retrieval and publishing
- …