27 research outputs found

Transferring PoS-tagging and lemmatization tools from spoken to written Dutch corpus development

Author: Schuurman I.
van den Bosch A.
Vandeghinste V.
Publication venue: [s.n.]
Publication date: 01/01/2006

Tilburg University Repository

Introduction

Author: J. Belder De
J.F. Gemmeke
N. Oostdijk
V. Vandeghinste
Publication venue: Springer Berlin Heidelberg

Crossref

Springer - Publisher Connector

Querying large treebanks: Benchmarking GrETEL indexing

Author: Augustinus L
Vandeghinste V
Vanroy B
Publication venue: Computational Linguistics in the Netherlands
Publication date: 01/12/2017

The amount of data that is available for research grows rapidly, yet technology to efficiently interpret and excavate these data lags behind. For instance, when using large treebanks for linguistic research, the speed of a query leaves much to be desired. GrETEL Indexing, or GrInding, tackles this issue. The idea behind GrInding is to make the search space as small as possible before actually starting the treebank search, by pre-processing the treebank at hand. We recursively divide the treebank into smaller parts, called subtree-banks, which are then converted into database files. All subtree-banks are organized according to their linguistic dependency pattern, and labeled as such. Additionally, general patterns are linked to more specific ones. By doing so, we create millions of databases, and given a linguistic structure we know in which databases that structure can occur, leading up to a significant efficiency boost. We present the results of a benchmark experiment, testing the effect of the GrInding procedure on the SoNaR-500 treebank.

Lirias

From D-Coi to SoNaR: A reference corpus for Dutch

Author: Monachesi P.
Noord G. van
Oostdijk N.H.J.
Ordelman R.
Reynaert M.
Schuurman I.
Vandeghinste V.
Publication venue: Marrakech, Morocco : ELRA
Publication date: 01/01/2008

Contains fulltext: 67981.pdf (publisher's version) (Open Access)

CiteSeerX

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Radboud Repository

University of Twente Research Information

Tilburg University Repository

Dissertations of the University of Groningen

Discovery of association rules between syntactic variables. Data mining the Syntactic Atlas of the Dutch dialects.

Author: Dirix P.
Schuurman I.
Spruit M.R.
Van Eynde F.
Vandeghinste V.
Publication venue: LOT Publications
Publication date: 01/01/2007

This research applies an association rule mining technique to purely syntactic dialect data. The paper answers the research question of how relevant associations between syntactic variables can be discovered. The method calculates the proportional overlap between geographical distributions of syntactic microvariables and incorporates rule quality factors such as accuracy, coverage and completeness to measure the interestingness of the variable associations. The exploratory review of the results discusses several highly ranked association rules and also examines an implicational chain of syntactic variables.

Building a Multilingual Parallel Subtitle Corpus

Author: Dirix P.
Schuurman I.
Tiedemann J.
Van Eynde F.
Vandeghinste V.
Publication date: 01/01/2007

Lexico-Semantic Multiword Expression Extraction

Author: Dirix P.D.
Schuurman I.
Van de Cruys T.
Van Eynde F.
Vandeghinste V.
Villada Moiron M.B.
Publication venue: 'Leuven University Press'
Publication date: 01/01/2007

A memory-based classification approach to marker-based EBMT

Author: Schuurman I.
Stroppa N.
van den Bosch A.
van Eynde F.
Vandeghinste V.
Way A.
Publication venue: Katholieke Universiteit Leuven
Publication date: 01/01/2007

Conditional entropy measures intelligibility among related languages

Author: Dirix P.
Gooskens C.S.
Moberg J.
Nerbonne J.
Schuurman I.
Vaillette N.
Van Eynde F.
Vandeghinste V.
Publication venue: 'The Korean Society of Clothing and Textiles'
Publication date: 01/01/2007

A pilot study for automatic semantic role labeling in a Dutch corpus

Author: Dirix P.
Monachesi P.
Schuurman I.
Stevens G.
van den Bosch A.
van Eynde F.
Vandeghinste V.
Publication venue: 'The Korean Society of Clothing and Textiles'
Publication date: 01/01/2007