36 research outputs found

    A Lexicalized Tree-Adjoining Grammar for Vietnamese

    In this paper, we present the first sizable grammar built for Vietnamese using LTAG, developed over the past two years, named vnLTAG. This grammar aims at modelling written language and is general enough to be both application- and domain-independent. It can be used for the morpho-syntactic tagging and syntactic parsing of Vietnamese texts, as well as text generation. We then present a robust parsing scheme using vnLTAG and a parser for the grammar. We finish with an evaluation using a test suite

    LREP: A language repository exchange protocol

    The recent increase in the number and complexity of the language resources available on the Internet has been followed by a similar increase in the tools available for linguistic analysis. Ideally, the user should not have to confront the question of how to match tools with resources. If resource repositories and tool repositories offer adequate metadata information and a suitable exchange protocol is developed, this matching process could be performed (semi-)automatically

    Foundation of a component-based flexible registry for language resources and technology

    Within the CLARIN e-science infrastructure project it is foreseen to develop a component-based registry for metadata for Language Resources and Language Technology. With this registry it is hoped to overcome the problems of the currently available systems with respect to inflexible fixed schemas, unsuitable terminology and interoperability problems. The registry will address interoperability needs by referring to a shared vocabulary registered in data category registries as they are suggested by ISO

    How much do bacterial growth properties and biodegradable dissolved organic matter control water quality at low flow?

    The development of accurate water quality modeling tools is necessary for integrated water quality management of river systems. Even though some water quality models can simulate dissolved oxygen (DO) concentrations accurately during high-flow periods and phytoplankton blooms in rivers, significant discrepancies remain during low-flow periods, when the dilution capacity of the rivers is reduced. We use the C-RIVE biogeochemical model to evaluate the influence of controlling parameters on DO simulations at low flow. Based on a coarse model pre-analysis, three sensitivity analyses (SAs) are carried out using the Sobol method. The parameters studied are related to bacterial community (e.g., bacterial growth rate), organic matter (OM; partitioning and degradation of OM into constituent fractions), and physical factors (e.g., reoxygenation of the river due to navigation and wind). Bacterial growth and mortality rates are found to be by far the two most influential parameters, followed by bacterial growth yield. More refined SA results indicate that the biodegradable fraction of dissolved organic matter (BDOM) and the bacterial growth yield are the most influential parameters under conditions of a high net bacterial growth rate (= growth rate − mortality rate), while bacterial growth yield is independently dominant in low net growth situations. Based on the results of this study, proposals are made for in situ measurement of BDOM under an urban area water quality monitoring network that provides high-frequency data. The results also indicate the need for bacterial community monitoring in order to detect potential bacterial community shifts after transient events such as combined sewer overflows and modifications in internal processes of treatment plants. Furthermore, we discuss the inclusion of BDOM in statistical water quality modeling software for improvement in the estimation of organic matter inflow from boundary conditions.

    Parallel alignment of structured documents

    Classical methods for parallel text alignment consider one specific level (e.g. sentences) along which two or more versions of a text are to be synchronised. This may lead to some problems when these documents are particularly long since alignment errors at some point in the text may, in the absence of any other linguistic information, propagate for some time without any chance of recovery. In this chapter we consider how multilingual parallel alignment can be based on the fact that more and more texts are now highly structured by means of tagging languages such as SGML. In particular we will describe recent efforts in multi-level alignment for which we will present the main advances as well as some of the difficulties to be dealt with, in particular when the text and its translation are associated with different encoding schemes or different encoding practices for the same scheme

    A large metadata domain of language resources

    The INTERA and ECHO projects were partly intended to create a critical mass of open and linked metadata descriptions of language resources, helping researchers to understand the benefits of an increased visibility of language resources on the Internet and motivating them to participate. The work was based on the new IMDI version 3.0.3, which is a result of experiences with the earlier versions and new requirements coming from the involved partners. While major data centers in Europe are participating in INTERA, the ECHO project focuses on resources that can be seen as part of cultural heritage. Currently, 27 institutions and projects are active with the goal of having a large browsable and searchable domain by the summer of 2004. Experience shows that the creation of high-quality metadata is not trivial and requires a considerable amount of effort and skill, since manual work alone is too time consuming