New Methods, Current Trends and Software Infrastructure for NLP
The increasing use of `new methods' in NLP, which the NeMLaP conference
series exemplifies, occurs in the context of a wider shift in the nature and
concerns of the discipline. This paper begins with a short review of this
context and significant trends in the field. The review motivates and leads to
a set of requirements for support software of general utility for NLP research
and development workers. A freely available system designed to meet these
requirements, GATE (a General Architecture for Text Engineering), is
described. Information Extraction (IE), in the sense defined by the Message
Understanding Conferences (ARPA \cite{Arp95}), is an NLP application in which
many of the new methods have found a home (Hobbs \cite{Hob93}; Jacobs ed.
\cite{Jac92}). An IE system based on GATE is also available for research
purposes, and this is described. Lastly we review related work.
Comment: 12 pages, LaTeX, uses nemlap.sty (included
Systems and methods for automated detection of application vulnerabilities
Board of Regents, University of Texas Syste
Getting More out of Biomedical Documents with GATE's Full Lifecycle Open Source Text Analytics.
This software article describes the GATE family of open source text analysis tools and processes. GATE is one of the most
widely used systems of its type with yearly download rates of tens of thousands and many active users in both academic
and industrial contexts. In this paper we report three examples of GATE-based systems operating in the life sciences and in
medicine. First, in genome-wide association studies which have contributed to discovery of a head and neck cancer
mutation association. Second, medical records analysis which has significantly increased the statistical power of treatment/
outcome models in the UK’s largest psychiatric patient cohort. Third, richer constructs in drug-related searching. We also
explore the ways in which the GATE family supports the various stages of the lifecycle present in our examples. We conclude
that the deployment of text mining for document abstraction or rich search and navigation is best thought of as a process,
and that with the right computational tools and data collection strategies this process can be made defined and repeatable.
The GATE research programme is now 20 years old and has grown from its roots as a specialist development tool for text
processing to become a rather comprehensive ecosystem, bringing together software developers, language engineers and
research staff from diverse fields. GATE now has a strong claim to cover a uniquely wide range of the lifecycle of text analysis
systems. It forms a focal point for the integration and reuse of advances that have been made by many people (the majority
outside of the authors’ own group) who work in text processing for biomedicine and other areas. GATE is available online
under GNU open source licences and runs on all major operating systems. Support is available from an active user and
developer community and also on a commercial basis.
Evaluating Information Retrieval and Access Tasks
This open access book summarizes the first two decades of the NII Testbeds and Community for Information access Research (NTCIR). NTCIR is a series of evaluation forums run by a global team of researchers and hosted by the National Institute of Informatics (NII), Japan. The book is unique in that it discusses not just what was done at NTCIR, but also how it was done and the impact it has achieved. For example, in some chapters the reader sees the early seeds of what eventually grew to be the search engines that provide access to content on the World Wide Web, today’s smartphones that can tailor what they show to the needs of their owners, and the smart speakers that enrich our lives at home and on the move. We also get glimpses into how new search engines can be built for mathematical formulae, or for the digital record of a lived human life. Key to the success of the NTCIR endeavor was early recognition that information access research is an empirical discipline and that evaluation therefore lay at the core of the enterprise. Evaluation is thus at the heart of each chapter in this book. The chapters show, for example, how the recognition that some documents are more important than others has shaped thinking about evaluation design. The thirty-three contributors to this volume speak for the many hundreds of researchers from dozens of countries around the world who together shaped NTCIR as organizers and participants. This book is suitable for researchers, practitioners, and students—anyone who wants to learn about past and present evaluation efforts in information retrieval, information access, and natural language processing, as well as those who want to participate in an evaluation task or even to design and organize one.