10 research outputs found

    Creation and Evaluation of Datasets for Distributional Semantics Tasks in the Digital Humanities Domain

    Full text link
    Word embeddings are already well studied in the general domain, usually trained on large text corpora, and have been evaluated for example on word similarity and analogy tasks, but also as an input to downstream NLP processes. In contrast, in this work we explore the suitability of word embedding technologies in the specialized digital humanities domain. After training embedding models of various types on two popular fantasy novel book series, we evaluate their performance on two task types: term analogies, and word intrusion. To this end, we manually construct test datasets with domain experts. Among the contributions are the evaluation of various word embedding techniques on the different task types, with the findings that even embeddings trained on small corpora perform well for example on the word intrusion task. Furthermore, we provide extensive and high-quality datasets in digital humanities for further investigation, as well as the implementation to easily reproduce or extend the experiments

    Scalable Algorithms for the Analysis of Massive Networks

    Get PDF
    Die Netzwerkanalyse zielt darauf ab, nicht-triviale Erkenntnisse aus vernetzten Daten zu gewinnen. Beispiele für diese Erkenntnisse sind die Wichtigkeit einer Entität im Verhältnis zu anderen nach bestimmten Kriterien oder das Finden des am besten geeigneten Partners für jeden Teilnehmer eines Netzwerks - bekannt als Maximum Weighted Matching (MWM). Da der Begriff der Wichtigkeit an die zu betrachtende Anwendung gebunden ist, wurden zahlreiche Zentralitätsmaße eingeführt. Diese Maße stammen hierbei aus Jahrzehnten, in denen die Rechenleistung sehr begrenzt war und die Netzwerke im Vergleich zu heute viel kleiner waren. Heute sind massive Netzwerke mit Millionen von Kanten allgegenwärtig und eine triviale Berechnung von Zentralitätsmaßen ist oft zu zeitaufwändig. Darüber hinaus ist die Suche nach der Gruppe von k Knoten mit hoher Zentralität eine noch kostspieligere Aufgabe. Skalierbare Algorithmen zur Identifizierung hochzentraler (Gruppen von) Knoten in großen Graphen sind von großer Bedeutung für eine umfassende Netzwerkanalyse. Heutigen Netzwerke verändern sich zusätzlich im zeitlichen Verlauf und die effiziente Aktualisierung der Ergebnisse nach einer Änderung ist eine Herausforderung. Effiziente dynamische Algorithmen sind daher ein weiterer wesentlicher Bestandteil moderner Analyse-Pipelines. Hauptziel dieser Arbeit ist es, skalierbare algorithmische Lösungen für die zwei oben genannten Probleme zu finden. Die meisten unserer Algorithmen benötigen Sekunden bis einige Minuten, um diese Aufgaben in realen Netzwerken mit bis zu Hunderten Millionen von Kanten zu lösen, was eine deutliche Verbesserung gegenüber dem Stand der Technik darstellt. Außerdem erweitern wir einen modernen Algorithmus für MWM auf dynamische Graphen. Experimente zeigen, dass unser dynamischer MWM-Algorithmus Aktualisierungen in Graphen mit Milliarden von Kanten in Millisekunden bewältigt.Network analysis aims to unveil non-trivial insights from networked data by studying relationship patterns between the entities of a network. Among these insights, a popular one is to quantify the importance of an entity with respect to the others according to some criteria. Another one is to find the most suitable matching partner for each participant of a network knowing the pairwise preferences of the participants to be matched with each other - known as Maximum Weighted Matching (MWM). Since the notion of importance is tied to the application under consideration, numerous centrality measures have been introduced. Many of these measures, however, were conceived in a time when computing power was very limited and networks were much smaller compared to today's, and thus scalability to large datasets was not considered. Today, massive networks with millions of edges are ubiquitous, and a complete exact computation for traditional centrality measures are often too time-consuming. This issue is amplified if our objective is to find the group of k vertices that is the most central as a group. Scalable algorithms to identify highly central (groups of) vertices on massive graphs are thus of pivotal importance for large-scale network analysis. In addition to their size, today's networks often evolve over time, which poses the challenge of efficiently updating results after a change occurs. Hence, efficient dynamic algorithms are essential for modern network analysis pipelines. In this work, we propose scalable algorithms for identifying important vertices in a network, and for efficiently updating them in evolving networks. In real-world graphs with hundreds of millions of edges, most of our algorithms require seconds to a few minutes to perform these tasks. Further, we extend a state-of-the-art algorithm for MWM to dynamic graphs. Experiments show that our dynamic MWM algorithm handles updates in graphs with billion edges in milliseconds

    35th Symposium on Theoretical Aspects of Computer Science: STACS 2018, February 28-March 3, 2018, Caen, France

    Get PDF

    36th International Symposium on Theoretical Aspects of Computer Science: STACS 2019, March 13-16, 2019, Berlin, Germany

    Get PDF

    Effects of Irrigation Rate and Planting Density on Maize Yield and Water Use Efficiency in the Temperate Climate of Serbia

    Get PDF
    Scarce water resources severely limit maize (Zea mays L.) cultivation in the temperate regions of northern Serbia. A two-year field experiment was conducted to investigate the effects of irrigation and planting density on yield and water use efficiency in temperate climate under sprinkler irrigation. The experiment included five irrigation treatments (full irrigated treatment – FIT; 80% FIT, 60% FIT, 40% FIT, and rainfed) and three planting densities (PD1: 54,900 plants ha–1 ; PD2: 64,900 plants ha–1; PD3: 75,200 plants ha–1). There was increase in yield with the irrigation (1.05–80.00%) as compared to the rainfed crop. Results showed that decreasing irrigation rates resulted in a decrease in yield, crop water use efficiency (WUE), and irrigation water use efficiency (IWUE). Planting density had significant effects on yield, WUE, and IWUE which differed in both years. Increasing planting density gradually increased yield, WUE, and IWUE. For the pooled data, irrigation rate, planting density and their interaction was significant (P < 0.05). The highest two-year average yield, WUE, and IWUE were found for FIT-PD3 (14,612 kg ha–1), rainfed-PD2 (2.764 kg m–3), and 60% FITPD3 (2.356 kg m–3), respectively. The results revealed that irrigation is necessary for maize cultivation because rainfall is insufficient to meet the crop water needs. In addition, if water becomes a limiting factor, 80% FIT-PD3 with average yield loss of 15% would be the best agronomic practices for growing maize with a sprinkler irrigation system in a temperate climate of Serbia

    Handbook of Lexical Functional Grammar

    Get PDF
    Lexical Functional Grammar (LFG) is a nontransformational theory of linguistic structure, first developed in the 1970s by Joan Bresnan and Ronald M. Kaplan, which assumes that language is best described and modeled by parallel structures representing different facets of linguistic organization and information, related by means of functional correspondences. This volume has five parts. Part I, Overview and Introduction, provides an introduction to core syntactic concepts and representations. Part II, Grammatical Phenomena, reviews LFG work on a range of grammatical phenomena or constructions. Part III, Grammatical modules and interfaces, provides an overview of LFG work on semantics, argument structure, prosody, information structure, and morphology. Part IV, Linguistic disciplines, reviews LFG work in the disciplines of historical linguistics, learnability, psycholinguistics, and second language learning. Part V, Formal and computational issues and applications, provides an overview of computational and formal properties of the theory, implementations, and computational work on parsing, translation, grammar induction, and treebanks. Part VI, Language families and regions, reviews LFG work on languages spoken in particular geographical areas or in particular language families. The final section, Comparing LFG with other linguistic theories, discusses LFG work in relation to other theoretical approaches
    corecore