Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing
Linguistic typology aims to capture structural and semantic variation across
the world's languages. A large-scale typology could provide excellent guidance
for multilingual Natural Language Processing (NLP), particularly for languages
that suffer from the lack of human labeled resources. We present an extensive
literature survey on the use of typological information in the development of
NLP techniques. Our survey demonstrates that to date, the use of information in
existing typological databases has resulted in consistent but modest
improvements in system performance. We show that this is due to both intrinsic
limitations of databases (in terms of coverage and feature granularity) and
under-employment of the typological features included in them. We advocate for
a new approach that adapts the broad and discrete nature of typological
categories to the contextual and continuous nature of machine learning
algorithms used in contemporary NLP. In particular, we suggest that such an
approach could be facilitated by recent developments in data-driven induction
of typological knowledge.
A Framework for Reference Management in the Semantic Web
Much of the semantic web relies upon open and unhindered interoperability between diverse systems. The successful convergence of multiple ontologies and referencing schemes is key. This is hampered by a lack of any means for managing and communicating co-references. We have therefore developed an ontology and framework for the exploration and resolution of potential co-references, in the semantic web at large, that allow the user to a) discover and record uniquely identifying attributes, b) interface candidates with, and create pipelines of, other systems for reference management, c) record identified duplicates in a usable and retrievable manner, and d) provide a consistent reference service for accessing them. This paper describes this ontology and a framework of web services designed to support and utilise it.
Language technologies for a multilingual Europe
This volume of the series “Translation and Multilingual Natural Language Processing” includes most of the papers presented at the Workshop “Language Technology for a Multilingual Europe”, held at the University of Hamburg on September 27, 2011 in the framework of the conference GSCL 2011 with the topic “Multilingual Resources and Multilingual Applications”, along with several additional contributions. In addition to an overview article on Machine Translation and two contributions on the European initiatives META-NET and Multilingual Web, the volume includes six full research articles. Our intention with this workshop was to bring together various groups concerned with the umbrella topics of multilingualism and language technology, especially multilingual technologies. This encompassed, on the one hand, representatives from research and development in the field of language technologies, and, on the other hand, users from diverse areas such as, among others, industry, administration and funding agencies. The Workshop “Language Technology for a Multilingual Europe” was co-organised by the two GSCL working groups “Text Technology” and “Machine Translation” (http://gscl.info) as well as by META-NET (http://www.meta-net.eu).
Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing
Linguistic typology aims to capture structural and semantic variation across the world's languages. A large-scale typology could provide excellent guidance for multilingual Natural Language Processing (NLP), particularly for languages that suffer from the lack of human labeled resources. We present an extensive literature survey on the use of typological information in the development of NLP techniques. Our survey demonstrates that to date, the use of information in existing typological databases has resulted in consistent but modest improvements in system performance. We show that this is due to both intrinsic limitations of databases (in terms of coverage and feature granularity) and under-utilization of the typological features included in them. We advocate for a new approach that adapts the broad and discrete nature of typological categories to the contextual and continuous nature of machine learning algorithms used in contemporary NLP. In particular, we suggest that such an approach could be facilitated by recent developments in data-driven induction of typological knowledge.
prototypical implementations
In this technical report, we present prototypical implementations of
innovative tools and methods developed according to the working plan outlined
in Technical Report TR-B-09-05 [23]. We present an ontology modularization and
integration framework and the SVoNt server, the server-side end of an
SVN-based versioning system for ontologies in the Corporate Ontology Engineering
pillar. For the Corporate Semantic Collaboration pillar, we present the
prototypical implementation of a light-weight ontology editor for non-experts
and an ontology-based expert finder system. For the Corporate Semantic Search
pillar, we present a prototype for algorithmic extraction of relations in
folksonomies, a tool for trend detection using a semantic analyzer, a tool for
automatic classification of web documents using Hidden Markov models, a
personalized semantic recommender for multimedia content, and a semantic
search assistant developed in co-operation with the Museumsportal Berlin. The
prototypes complete the next milestone on the path to an integral Corporate
Semantic Web architecture based on the three pillars Corporate Ontology
Engineering, Corporate Semantic Collaboration, and Corporate Semantic Search,
as envisioned in [23].