Search CORE

67 research outputs found

Comparison of resource discovery methods

Author: Broeder D.
Klassmann A.
Offenga F.
Skiba R.
Wittenburg P.
Publication date: 01/01/2006

MPG.PuRe

Metadata profile in the ISO data category registry

Author: Broeder D.
Ducret J.
Offenga F.
Romary L.
Wittenburg P.
Publication date: 01/01/2006

MPG.PuRe

Foundation of a component-based flexible registry for language resources and technology

Author: Broeder D.
Calzolari N.
Declerck T.
Hinrichs E.
Piperidis S.
Romary L.
Wittenburg P.
Publication date: 01/01/2008

Within the CLARIN e-science infrastructure project it is foreseen to develop a component-based registry for metadata for Language Resources and Language Technology. With this registry it is hoped to overcome the problems of the currently available systems with respect to inflexible fixed schemas, unsuitable terminology and interoperability problems. The registry will address interoperability needs by referring to a shared vocabulary registered in data category registries as they are suggested by ISO.

MPG.PuRe

Foundation of a Component-based Flexible Registry for Language Resources and Technology

Author: Broeder Daan
Calzolari Nicoletta
Declerck Thierry
Hinrichs Erhard
Piperidis Stelios
Romary Laurent
Wittenburg Peter
Publication venue: European Language Resources Association (ELRA), Paris, France
Publication date: 01/01/2008

INRIA a CCSD electronic archive server

PUblication MAnagement

MPG.PuRe

A Federation of Language Archives Enabling Future eHumanities Scenarios

Author: Broeder D.
Dimitriadis A.
Kemps-Snijders M.
Soddemann T.
Wittenburg P.
Publication date: 02/05/2007

This paper describes the need for new infrastructures for future eScience scenarios in the humanities. Three projects working on different aspects of these infrastructures are examined in detail. The first project is trying to achieve a federation of archives, developing an integration layer at the level of localization of, access to, and reference to an archive’s raw data objects. The other two try to achieve interoperability at the level of semantic interpretation of linguistic data types and tagging systems. The projects’ different approaches to this problem show the trade-off between flexibility and the user’s workload. All three approaches give an impression of the steps necessary to arrive at an eHumanities scenario.

MPG.PuRe

A large Metadata Domain for Language Resources

Author: Broeder Daan
Declerck Thierry
Romary Laurent
Strömqvist Sven
Uneson Markus
Villemonte de La Clergerie Éric
Wittenburg Peter
Publication venue: HAL CCSD
Publication date: 26/05/2004

The INTERA and ECHO projects were partly intended to create a critical mass of open and linked metadata descriptions of language resources, helping researchers to understand the benefits of an increased visibility of language resources on the Internet and motivating them to participate. The work was based on the new IMDI version 3.0.3, which is a result of experiences with the earlier versions and new requirements coming from the involved partners. While major data centers in Europe are participating in INTERA, the ECHO project focuses on resources that can be seen as part of cultural heritage. Currently, 27 institutions and projects are active with the goal of having a large browsable and searchable domain by the summer of 2004. Experience shows that the creation of high quality metadata is not trivial and requires a considerable amount of effort and skill, since manual work alone is too time consuming.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

Archiving and accessing language resources

Author: Wittenburg P.
Publication venue: Wiley
Publication date: 01/01/2010

Languages are among the most complex systems that evolution has created. With unforeseen speed many of these unique results of evolution are currently disappearing: every two weeks one of the 6500 still spoken languages is dying and many are subject to extreme changes due to globalization. Experts understood the need to document the languages and preserve the cultural and linguistic treasures embedded in them for future generations. Linguistic theory will also need to consider the variation of the linguistic systems encoded in languages to improve our understanding of how human minds process language material; thus, access to all types of resources is increasingly crucial. Deeper insights into human language processing and a higher degree of integration and interoperability between resources will also improve our language processing technology. The DOBES programme is focussing on the documentation and preservation of language material. The Max Planck Institute developed the Language Archiving Technology to help researchers when creating, archiving and accessing language resources. The recently started CLARIN research infrastructure has as its main goals broad visibility and easy accessibility of language resources.

MPG.PuRe

Digitizing Intangible Cultural Heritage

Author: Uneson Marcus
Wittenburg Peter
Publication venue: [Publisher information missing]
Publication date: 01/01/2004

As part of the UNESCO project "Establishment of a National Inventory and Electronic Database of Lithuanian Intangible Cultural Heritage", the authors, representing the EU-funded project "European Cultural Heritage Online" (ECHO), were invited to give a course in digital archiving called "Digitizing Intangible Cultural Heritage" in Vilnius, Lithuania, March 15 to 20, 2004. The present report summarizes very briefly the sessions given. Thereafter, the analyses of the state of the digitization work of the participating institutes and recommendations for the future are given in a dedicated, stand-alone section.

Lund University Publications

Naturalistic Emotional Speech Corpora with Large Scale Emotional Dimension Ratings

Author: Vaughan Brian
Publication venue: Dublin Institute of Technology
Publication date: 01/01/2011

The investigation of the emotional dimensions of speech is dependent on large sets of reliable data. Existing work has been carried out on the creation of emotional speech corpora and the acoustic analysis of emotional speech and this research seeks to build upon this work while suggesting new methods and areas of potential. A review of the literature determined that a two dimensional emotional model of activation and evaluation was the ideal method for representing the emotional states expressed in speech. Two case studies were carried out to investigate methods of obtaining natural underlying emotional speech in a high quality audio environment, the results of which were used to design a final experimental procedure to elicit natural underlying emotional speech. The speech obtained in this experiment was used in the creation of a speech corpus that was underpinned by a persistent backend database that incorporated a three-tiered annotation methodology. This methodology was used to comprehensively annotate the metadata, acoustic data and emotional data of the recorded speech. Structuring the three levels of annotation and the assets in a persistent backend database allowed interactive web-based tools to be developed; a web-based listening tool was developed to obtain a large amount of ratings for the assets that were then written back to the database for analysis. Once a large amount of ratings had been obtained, statistical analysis was used to determine the dimensional rating for each asset. Acoustic analysis of the underlying emotional speech was then carried out and determined that certain acoustic parameters were correlated with the activation dimension of the dimensional model. This substantiated some of the findings in the literature review and further determined that spectral energy was strongly correlated with the activation dimension in relation to underlying emotional speech. The lack of a correlation for certain acoustic parameters in relation to the evaluation dimension was also determined, again substantiating some of the findings in the literature.
The work contained in this thesis makes a number of contributions to the field: the development of an experimental design to elicit natural underlying emotional speech in a high quality audio environment; the development and implementation of a comprehensive three-tiered corpus annotation methodology; the development and implementation of large scale web based listening tests to rate the emotional dimensions of emotional speech; the determination that certain acoustic parameters are correlated with the activation dimension of a dimensional emotional model in relation to natural underlying emotional speech and the determination that certain acoustic parameters are not correlated with the evaluation dimension of a two dimensional emotional model in relation to natural underlying emotional speech.

Arrow@TUDublin

Foundation of a Component-based Flexible Registry for Language Resources and Technology

Author: Broeder Daan
Calzolari Nicoletta
Declerck Thierry
Hinrichs Erhard
Piperidis Stelios
Romary Laurent
Wittenburg Peter
Publication venue: HAL CCSD
Publication date: 28/05/2008

Within the CLARIN e-science infrastructure project it is foreseen to develop a component-based registry for metadata for Language Resources and Language Technology. With this registry it is hoped to overcome the problems of the currently available systems with respect to inflexible fixed schemas, unsuitable terminology and interoperability problems. The registry will address interoperability needs by referring to a shared vocabulary registered in data category registries as they are suggested by ISO.

INRIA a CCSD electronic archive server