22,911 research outputs found
Multiple Retrieval Models and Regression Models for Prior Art Search
This paper presents the system called PATATRAS (PATent and Article Tracking,
Retrieval and AnalysiS) realized for the IP track of CLEF 2009. Our approach
presents three main characteristics: 1. The usage of multiple retrieval models
(KL, Okapi) and term index definitions (lemma, phrase, concept) for the three
languages considered in the present track (English, French, German) producing
ten different sets of ranked results. 2. The merging of the different results
based on multiple regression models using an additional validation set created
from the patent collection. 3. The exploitation of patent metadata and of the
citation structures for creating restricted initial working sets of patents and
for producing a final re-ranking regression model. As we exploit specific
metadata of the patent documents and the citation relations only at the
creation of initial working sets and during the final post ranking step, our
architecture remains generic and easy to extend
Special Libraries, December 1961
Volume 52, Issue 10https://scholarworks.sjsu.edu/sla_sl_1961/1009/thumbnail.jp
Special Libraries, February 1978
Volume 69, Issue 2https://scholarworks.sjsu.edu/sla_sl_1978/1001/thumbnail.jp
Special Libraries, March 1968
Volume 59, Issue 3https://scholarworks.sjsu.edu/sla_sl_1968/1002/thumbnail.jp
Common Infrastructure for Finite-State Based Methods and Linguistics Descriptions
Finite-state methods have been adopted widely in computational morphology and related linguistic applications. To enable efficient development of finite-state based linguistic descriptions, these methods should be a freely available resource for academic language research and the language technology industry. The following needs can be identified: (i) a registry that maps the existing approaches, implementations and descriptions, (ii) managing the incompatibilities of the existing tools, (iii) increasing synergy and complementary functionality of the tools, (iv) persistent availability of the tools used to manipulate the archived descriptions, (v) an archive for free finite-state based tools and linguistic descriptions. Addressing these challenges contributes to building a common research infrastructure for advanced language technology.Peer reviewe
Special Libraries, January 1966
Volume 57, Issue 1https://scholarworks.sjsu.edu/sla_sl_1966/1000/thumbnail.jp
New Methods, Current Trends and Software Infrastructure for NLP
The increasing use of `new methods' in NLP, which the NeMLaP conference
series exemplifies, occurs in the context of a wider shift in the nature and
concerns of the discipline. This paper begins with a short review of this
context and significant trends in the field. The review motivates and leads to
a set of requirements for support software of general utility for NLP research
and development workers. A freely-available system designed to meet these
requirements is described (called GATE - a General Architecture for Text
Engineering). Information Extraction (IE), in the sense defined by the Message
Understanding Conferences (ARPA \cite{Arp95}), is an NLP application in which
many of the new methods have found a home (Hobbs \cite{Hob93}; Jacobs ed.
\cite{Jac92}). An IE system based on GATE is also available for research
purposes, and this is described. Lastly we review related work.Comment: 12 pages, LaTeX, uses nemlap.sty (included
- …