Swiss-AL : language data platform for applied sciences
Language data is used not only by linguists but also in many other research disciplines. In the Swiss-AL project, we aim to develop practices for a diverse community interested in exploring the potential of language data, while implementing FAIR data management principles and integrating them into the Swiss-AL workbench.
Swiss-AL: platform for language data in applied sciences : on challenges in the field of language open research data
Open Science is transforming the way researchers collect, process, analyze, and store empirical research data, particularly in the social sciences and humanities, where language data is crucial. This transformation process especially concerns developers and providers of large language corpora and manifests itself in at least three challenges when providing these corpora as Open Research Data (ORD). The challenges concern the heterogeneous practices that researchers apply when working with language data, the research data lifecycle, and legal and ethical aspects. In this paper, we present Swiss-AL, a language data platform developed in Switzerland that is being transformed into an Open Research Data Resource for Applied Sciences within the Swiss Open Science Strategy. The paper gives an overview of the data contained in Swiss-AL and the infrastructure that is used to process and analyze the data. Furthermore, it presents approaches to the three above-mentioned challenges to language ORD.
Using computational linguistics to enhance protest event analysis
For more than four decades now, quantitative protest event analysis (PEA) has routinely contributed to the testing and refinement of theories on political processes from different perspectives. However, it is commonly agreed that PEA data face serious challenges regarding their data sources. Specifically, researchers applying PEA struggle with the fact that they cannot use multiple sources for large geographical areas and long time periods. As a consequence, most of the scholarship still focuses on a narrow set of European countries or the United States and does not cover the period since the early 2000s. We bring PEA and computational linguistics together to suggest and evaluate an approach that will enable political scientists to extend their research designs with data collection that is both more efficient and reliable. The approach relies on hidden topic models, word space models, and named entity recognition to identify and code protest events.
„Korporatheken“: Die digitale und verdatete Bibliothek
The digital era opens up a whole range of new methodological approaches for research in the humanities, whose origins, however, can be traced far back. Their very source is the index, which can now take the form of a full-text index with much less effort than before. Once text is turned into data, it becomes computationally tractable and can be used in novel ways. Which role can libraries play in a digitized world? This paper explores the technical possibilities, but it also points to problems specific to the arts and social sciences.
Using layout analysis for tracing the evolution of digital society in the 19th century
Presentation in the track "Natural Language Processing for GLAM (Galleries, Libraries, Archives and Museum) Collections"
Qualitative vs. quantitative methods? : an approach to qualitative questions in applied linguistics using Swiss-AL-Workbench
Slides are available here: https://cdn.ymaws.com/pragmatics.international/resource/collection/2A1FB89E-A6BC-4582-B10C-1756818FE74E/Webinar_IPrA_2023.pd