8 research outputs found
User, who art thou? User profiling for oral corpus platforms
This contribution presents the background, design and results of a study of users of three oral corpus platforms in Germany. Roughly 5,000 registered users of the Database for Spoken German (DGD), the GeWiss corpus and the corpora of the Hamburg Centre for Language Corpora (HZSK) were asked to participate in a user survey. This quantitative approach was complemented by qualitative interviews with selected users. We briefly introduce the corpus resources involved in the study in Section 2. Section 3 describes the methods employed in the user studies. Section 4 summarizes the results of the studies, focusing on selected key topics. Section 5 attempts a generalization of these results to larger contexts.
Towards MerMEId 2.0
The “Metadata Editor and Repository for MEI Data” (MerMEId) is a web-based tool to capture and enrich data in the MEI header. The tool was originally developed by Axel Teich Geertinger and Sigfrid Lundberg at the “Danish Centre for Music Editing” under an open-source license. This poster describes the transfer of the project to community-based releases, produced by members of the MerMEId community under a formal governance structure, as a step towards a true open-source community venture.
VIKUS Viewer (overlay extension)
The VIKUS Viewer (overlay extension) is an adaptation of VIKUS Viewer, a web-based system for the dynamic, interactive visualization of metadata (originally cultural heritage data) that allows for the exploration of thematic and temporal patterns in large collections. The extension aims to support the multimedia data found in spoken language corpora (esp. audio and video).
VIKUS Viewer was designed and developed by Christopher Pietsch. The VIKUS Viewer software is based on the visualization code behind Past Visions, a collaborative effort by Katrin Glinka, Christopher Pietsch, and Marian Dörk carried out at the University of Applied Sciences Potsdam in the context of the Urban Complexity Lab during the research project VIKUS (2014-2017). Related Paper: Past Visions and Reconciling Views. The t-SNE view has been implemented for the Sphaera project with funding from Chronoi-RE.
A toolkit for multi-dimensional markup
Stührenberg M, Jettka D. A toolkit for multi-dimensional markup. In: Proceedings of Balisage: The Markup Conference 2009. Balisage series on markup technologies. Vol 3. Mulberry Technologies, Inc.; 2009
Visualization of concurrent markup
Jettka D, Stührenberg M. Visualization of concurrent markup. In: Proceedings of Balisage: The Markup Conference 2011. Balisage series on markup technologies. Vol 7. Mulberry Technologies, Inc.; 2011
Conversion and annotation web services for spoken language data in CLARIN
We present an approach to making existing CLARIN web services usable for spoken language transcriptions. Our approach is based on a new TEI-based ISO standard for such transcriptions. We show how existing tool formats can be transformed to this standard, how an encoder/decoder pair for the TCF format enables users to feed this type of data through a WebLicht tool chain, and why and how web services operating directly on the standard format would be useful.
Vernetzung statt Vereinheitlichung. Digitale Forschungsinfrastrukturen in den Geisteswissenschaften
The development of the digital infrastructure at the Hamburg Center for Language Corpora (Hamburger Zentrum für Sprachkorpora, HZSK) can be seen as an example of the evolution of individual technical solutions towards community-specific virtual workspaces and research environments that are interconnected in the context of supranational research infrastructures for the digital humanities. In the case of the HZSK, the focus lies on ensuring the long-term accessibility of research data (multimedia data of spoken language) by developing a virtual research environment, which on the one hand is connected to the center-based research infrastructure CLARIN-D and on the other hand provides community-specific user interfaces.
Markup Infrastructure for the Anaphoric Bank, Part I: Supporting Web Collaboration
Poesio M, Diewald N, Stührenberg M, et al. Markup Infrastructure for the Anaphoric Bank, Part I: Supporting Web Collaboration. In: Mehler A, Kühnberger K-U, Lobin H, Lüngen H, Storrer A, Witt A, eds. Modeling, Learning and Processing of Text Technological Data Structures. Studies in Computational Intelligence. Vol 370. Berlin; Heidelberg: Springer; 2011: 175-195.
Modern NLP systems rely either on unsupervised methods, or on data created as part of governmental initiatives such as MUC, ACE, or GALE. The data created in these efforts tend to be annotated according to task-specific schemes. The Anaphoric Bank is an attempt to create large quantities of data annotated with anaphoric information according to a general purpose and linguistically motivated scheme. We do this by pooling smaller amounts of data annotated according to rich schemes that are by and large compatible, and by taking advantage of Web collaboration. In this chapter we discuss the markup infrastructure that underpins the two modalities of Web collaboration in the project: expert annotation and game-based annotation.