274,172 research outputs found
Recommended from our members
Introduction to the Special Issue on Software Architecture for Language Engineering
Every building, and every computer program, has an architecture: structural and organisational principles that underpin its design and construction. The garden shed
once built by one of the authors had an ad hoc architecture, extracted (somewhat painfully) from the imagination during a slow and non-deterministic process that, luckily, resulted in a structure which keeps the rain on the outside and the mower on the inside (at least for the time being). As well as being ad hoc (i.e. not informed by analysis of similar practice or relevant science or engineering) this architecture is implicit: no explicit design was made, and no records or documentation kept of the construction process. The pyramid in the courtyard of the Louvre, by contrast, was constructed in a process involving explicit design performed by qualified engineers with a wealth of theoretical and practical knowledge of the properties of materials, the relative merits and strengths of different construction techniques, et cetera. So it is with software: sometimes it is thrown together by enthusiastic amateurs; sometimes it is architected, built to last, and intended to be 'not something you finish, but something you start' (to paraphrase Brand (1994). A number of researchers argued in the early and middle 1990s that the field of computational infrastructure or architecture for human language computation merited an increase in attention. The reasoning was that the increasingly large-scale and technologically significant nature of language processing science was placing increasing burdens of an engineering nature on research and development workers seeking robust and practical methods (as was the increasingly collaborative nature of research in this field, which puts a large premium on software integration and interoperation). Over the intervening period a number of significant systems and practices have been developed in what we may call Software Architecture for Language Engineering (SALE). This special issue represented an opportunity for practitioners in this area to report their work in a coordinated setting, and to present a snapshot of the state-ofthe-art in infrastructural work, which may indicate where further development and further take-up of these systems can be of benefit
Software engineering and Ada in design
Modern software engineering promises significant reductions in software costs and improvements in software quality. The Ada language is the focus for these software methodology and tool improvements. The IBM FSD approach, including the software engineering practices that guide the systematic design and development of software products and the management of the software process are examined. The revised Ada design language adaptation is revealed. This four level design methodology is detailed including the purpose of each level, the management strategy that integrates the software design activity with the program milestones, and the technical strategy that maps the Ada constructs to each level of design. A complete description of each design level is provided along with specific design language recording guidelines for each level. Finally, some testimony is offered on education, tools, architecture, and metrics resulting from project use of the four level Ada design language adaptation
A Query Language for Software Architecture Information (Extended version)
Software maintenance is an important part of a software system's life cycle.
Maintenance tasks of existing software systems suffer from architecture
information that is diverging over time (architectural drift). The Digital
Architecture Twin (DArT) can support software maintenance by providing
up-to-date architecture information. For this, the DArT gathers such
information and co-evolves with a software system, enabling continuous reverse
engineering. But the crucial link for stakeholders to retrieve this information
is missing. To fill this gap, we contribute the Architecture Information Query
Language (AIQL), which enables stakeholders to access up-to-date and tailored
architecture information. We derived four application scenarios in the context
of continuous reverse engineering. We showed that the AIQL provides the
required functionality to formulate queries for the application scenarios and
that the language scales for use with real-world software systems. In a user
study, stakeholders agreed that the language is easy to understand and assessed
its value to the specific stakeholder for the application scenarios
Lightweight Multilingual Software Analysis
Developer preferences, language capabilities and the persistence of older
languages contribute to the trend that large software codebases are often
multilingual, that is, written in more than one computer language. While
developers can leverage monolingual software development tools to build
software components, companies are faced with the problem of managing the
resultant large, multilingual codebases to address issues with security,
efficiency, and quality metrics. The key challenge is to address the opaque
nature of the language interoperability interface: one language calling
procedures in a second (which may call a third, or even back to the first),
resulting in a potentially tangled, inefficient and insecure codebase. An
architecture is proposed for lightweight static analysis of large multilingual
codebases: the MLSA architecture. Its modular and table-oriented structure
addresses the open-ended nature of multiple languages and language
interoperability APIs. We focus here as an application on the construction of
call-graphs that capture both inter-language and intra-language calls. The
algorithms for extracting multilingual call-graphs from codebases are
presented, and several examples of multilingual software engineering analysis
are discussed. The state of the implementation and testing of MLSA is
presented, and the implications for future work are discussed.Comment: 15 page
GATE -- an Environment to Support Research and Development in Natural Language Engineering
We describe a software environment to support research and development in natural language (NL) engineering. This environment -- GATE (General Architecture for Text Engineering) -- aims to advance research in the area of machine processing of natural languages by providing a software infrastructure on top of which heterogeneous NL component modules may be evaluated and refined individually or may be combined into larger application systems. Thus, GATE aims to support both researchers and developers working on component technologies (e.g. parsing, tagging, morphological analysis) and those working on developing end-user applications (e.g. information extraction, text summarisation, document generation, machine translation, and second language learning). GATE will promote reuse of component technology, permit specialisation and collaboration in large-scale projects, and allow for the comparison and evaluation of alternative technologies. The first release of GATE is now available
Two Case Studies of Subsystem Design for General-Purpose CSCW Software Architectures
This paper discusses subsystem design guidelines for the software architecture of general-purpose computer supported cooperative work systems, i.e., systems that are designed to be applicable in various application areas requiring explicit collaboration support. In our opinion, guidelines for subsystem level design are rarely given most guidelines currently given apply to the programming language level. We extract guidelines from a case study of the redesign and extension of an advanced commercial workflow management system and place them into the context of existing software engineering research. The guidelines are then validated against the design decisions made in the construction of a widely used web-based groupware system. Our approach is based on the well-known distinction between essential (logical) and physical architectures. We show how essential architecture design can be based on a direct mapping of abstract functional concepts as found in general-purpose systems to modules in the essential architecture. The essential architecture is next mapped to a physical architecture by applying software clustering and replication to achieve the required distribution and performance characteristics
Software Infrastructure for Natural Language Processing
We classify and review current approaches to software infrastructure for
research, development and delivery of NLP systems. The task is motivated by a
discussion of current trends in the field of NLP and Language Engineering. We
describe a system called GATE (a General Architecture for Text Engineering)
that provides a software infrastructure on top of which heterogeneous NLP
processing modules may be evaluated and refined individually, or may be
combined into larger application systems. GATE aims to support both researchers
and developers working on component technologies (e.g. parsing, tagging,
morphological analysis) and those working on developing end-user applications
(e.g. information extraction, text summarisation, document generation, machine
translation, and second language learning). GATE promotes reuse of component
technology, permits specialisation and collaboration in large-scale projects,
and allows for the comparison and evaluation of alternative technologies. The
first release of GATE is now available - see
http://www.dcs.shef.ac.uk/research/groups/nlp/gate/Comment: LaTeX, uses aclap.sty, 8 page
Convolutional Neural Networks over Tree Structures for Programming Language Processing
Programming language processing (similar to natural language processing) is a
hot research topic in the field of software engineering; it has also aroused
growing interest in the artificial intelligence community. However, different
from a natural language sentence, a program contains rich, explicit, and
complicated structural information. Hence, traditional NLP models may be
inappropriate for programs. In this paper, we propose a novel tree-based
convolutional neural network (TBCNN) for programming language processing, in
which a convolution kernel is designed over programs' abstract syntax trees to
capture structural information. TBCNN is a generic architecture for programming
language processing; our experiments show its effectiveness in two different
program analysis tasks: classifying programs according to functionality, and
detecting code snippets of certain patterns. TBCNN outperforms baseline
methods, including several neural models for NLP.Comment: Accepted at AAAI-1
- âŚ