67 research outputs found

    Sharing Cultural Heritage: the Clavius on the Web Project

    In the last few years the number of manuscripts digitized and made available on the Web has been constantly increasing. However, there is still a considerable lack of results concerning both the explication of their content and the tools developed to make it available. The objective of the Clavius on the Web project is to develop a Web platform exposing a selection of Christophorus Clavius's letters along with three different levels of analysis: linguistic, lexical and semantic. The multilayered annotation of the corpus involves an XML-TEI encoding followed by a tokenization step in which each token is uniquely identified through a CTS URN notation and then associated with a part of speech and a lemma. The text is lexically and semantically annotated on the basis of a lexicon and a domain ontology, the former structuring the most relevant terms occurring in the text and the latter representing the domain entities of interest (e.g. people and places). Moreover, each entity is connected to linked and non-linked resources, including DBpedia and VIAF. Finally, the results of the three layers of analysis are gathered and shown through interactive visualization and storytelling techniques. A demo version of the integrated architecture was developed.
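The tokenization step described in this abstract can be illustrated with a minimal sketch. It assumes the CTS convention of addressing a word inside a passage with a subreference of the form `@word[n]`, where `n` counts occurrences of that word in the passage. The passage URN, the function name and the record fields are hypothetical and are not taken from the project itself.

```python
import re
from collections import Counter


def tokenize_with_cts_urns(text, passage_urn):
    """Split text into word tokens and give each one a unique CTS-style URN.

    The subreference "@form[n]" marks the n-th occurrence of a word form
    within the passage, so repeated words still get distinct identifiers.
    """
    occurrences = Counter()
    tokens = []
    for match in re.finditer(r"\w+", text):
        form = match.group(0)
        occurrences[form] += 1
        tokens.append({
            "urn": f"{passage_urn}@{form}[{occurrences[form]}]",
            "form": form,
            # Placeholders for the later annotation layers (not computed here).
            "pos": None,
            "lemma": None,
        })
    return tokens


# Hypothetical passage URN, used only for illustration.
tokens = tokenize_with_cts_urns("et cetera et", "urn:cts:example:clavius.letter1.ed1:1")
for token in tokens:
    print(token["urn"])
```

Counting occurrences per word form keeps the identifiers stable under re-tokenization of the same passage, which is what allows part-of-speech and lemma annotations to be attached to them later.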

    Word Meaning


    Cognitive lexicon


    LSP Journal Vol 4, No 2 (2013)


    Proučavanje asocijacija u multilingualnih osoba: međujezično istraživanje (A study of associations in multilingual speakers: a cross-linguistic investigation)

    In the literature on the mental lexicon, associations are used as a way to inspect that elusive human mechanism. Until recently, researchers mostly opted for studies based on monolingual participants, but language communities, and therefore cultural communities, are nowadays perceived as “melting pots” and are mostly multilingual because of the different cultural backgrounds of their members. This thesis aims to explore the associations and mental lexicon organisation of multilingual speakers of Croatian, English and Russian. Association questionnaires were used to collect data, which were then statistically analysed and explained in terms of associative fields and conceptualisation overlaps caused by the typological closeness of the languages and by their status in the users’ repertoires. According to the literature on linguistic culturology, the Slavic etymology and tradition that Croatian and Russian share shape the way in which speakers form their linguistic picture of the world. Analysis of the responses has shown that the conceptual categories in participants’ languages are often mediated by the L1 concept and that many variables, such as participants’ language proficiency and variables related to the stimulus words, have an effect on the answers in the three languages. Because of the small sample of participants, the results obtained are only tentative and further research is needed.

    Ontology Learning from the Arabic Text of the Qur’an: Concepts Identification and Hierarchical Relationships Extraction

    Recent developments in ontology learning have highlighted the growing role ontologies play in linguistic and computational research areas such as language teaching and natural language processing. The ever-growing availability of annotations for the text of the Qur’an has made the acquisition of ontological knowledge promising. However, the availability of resources and tools for Arabic ontologies is not comparable with that for other languages. Manual ontology development is labour-intensive and time-consuming, and it requires the knowledge and skills of domain experts. This thesis aims to develop new methods for ontology learning from the Arabic text of the Qur’an, including concept identification and hierarchical relationship extraction. The thesis presents a methodology for reducing human intervention in building an ontology from the Classical Arabic text of the Qur’an. The set of candidate concepts, whose construction is a crucial step in ontology learning, was generated from a set of patterns made of lexical and inflectional information. The concepts were identified with an adapted weighting scheme that exploits a combination of knowledge to learn the relevance degree of a term. Statistical knowledge, domain-specific knowledge and the internal information of Multi-Word Terms (MWTs) were combined to learn the relevance of generated terms. This methodology, which represents the major contribution of the thesis, was experimentally investigated using different term generation methods. As a result, we provide the Arabic Qur’anic Terms (AQT) as a training resource for machine-learning-based term extraction. This thesis also introduces a new approach to hierarchical relation extraction from the Arabic text of the Qur’an. A set of hierarchical relations occurring between identified concepts is extracted with hybrid methods that include head-modifier analysis, a set of markers for copula constructions in Arabic text, and referents.
We also compared a number of ontology alignment methods for matching bilingual Qur’anic ontological resources. In addition, a multi-dimensional resource about the Qur’an, named the Arabic Qur’anic Database (AQD), was built for Arabic computational researchers; it allows regular expression queries over the included annotations. The search tool was successfully applied to find instances matching a given complex rule made of different combined resources.
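The idea of a regular expression query that combines several annotation layers can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the field names (`form`, `pos`, `root`) and the placeholder values are hypothetical and do not reflect the actual AQD schema or its search tool.

```python
import re


def search_annotations(records, query):
    """Return the records whose fields match every pattern in the query.

    `query` maps a field name to a regular expression; a record is kept
    only if all patterns are found in the corresponding field values.
    """
    compiled = {field: re.compile(pattern) for field, pattern in query.items()}
    return [
        record
        for record in records
        if all(rx.search(str(record.get(field, ""))) for field, rx in compiled.items())
    ]


# Hypothetical annotated tokens with placeholder values.
records = [
    {"form": "w1", "pos": "N", "root": "ktb"},
    {"form": "w2", "pos": "V", "root": "ktb"},
    {"form": "w3", "pos": "N", "root": "qwl"},
]

# A "complex rule": nouns whose root starts with "k".
hits = search_annotations(records, {"pos": r"^N$", "root": r"^k"})
print([record["form"] for record in hits])
```

Requiring every pattern to match is one simple way to combine resources in a single rule; a real tool might also support alternatives or constraints across neighbouring tokens.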

    Using semantic technologies to resolve heterogeneity issues in sustainability and disaster management knowledge bases

    This thesis examines issues of semantic heterogeneity in the domains of sustainability indicators and disaster management. We propose a model that links the two domains with the following logic. While disaster management implies a proper and efficient response to a risk that has materialised as a disaster, sustainability can be defined as preparedness for unexpected situations, achieved by applying measurements such as sustainability indicators. As a step in this direction, we investigate how semantic technologies can tackle the issues of heterogeneity in these domains. First, we consider approaches to resolving the heterogeneity issues of representing the key concepts of sustainability indicator sets. To develop a knowledge base, we apply the METHONTOLOGY approach to guide the construction of two ontology design candidates: generic and specific. Of the two, the generic design is more abstract, with fewer classes and properties. Documents describing two indicator systems - the Global Reporting Initiative and the Organisation for Economic Co-operation and Development - are used in the design of both candidate ontologies. We then evaluate both ontology designs using the ROMEO approach to calculate their level of coverage against the seen indicators, as well as against an unseen third indicator set (the United Nations Statistics Division). We also show that the use of existing structured approaches like METHONTOLOGY and ROMEO can reduce ambiguity in ontology design and evaluation for domain-level ontologies. We conclude that where an ontology needs to be designed for both seen and unseen indicator systems, a generic and reusable design is preferable.
Second, having addressed the heterogeneity issues at the data level of sustainability indicators in the first phase of the research, we develop software for a sustainability reporting framework - Circles of Sustainability - which provides two mechanisms for browsing heterogeneous sustainability indicator sets: a Tabular view and a Circular view. In particular, the generic ontology design developed during the first phase of the research is applied to this software. Next, we evaluate the overall usefulness and ease of use of the software and the associated user interfaces by conducting a user study. The analysis of the quantitative and qualitative results of the user study concludes that the Circular view is the interface preferred by most participants for browsing semantically heterogeneous indicators. Third, in the context of disaster management, we present a geotagger method for the OzCrisisTracker application that automatically detects and disambiguates heterogeneous georeferences mentioned in tweet content, with three possible outcomes: definite, ambiguous and no-location. Our method semantically annotates the tweet components using existing and new ontologies. We also conclude that our geotagger determines geographic focus considerably more accurately than other systems. From a more general perspective, the research contributions can be articulated as follows. The knowledge bases developed in this research have been applied to the two domain applications. The thesis therefore demonstrates how semantic technologies, such as ontology design patterns, browsing tools and geocoding, can untangle the data representation and navigation issues of semantic heterogeneity in the sustainability and disaster management domains.

    Theorizing about resource integration : Studies of actors and service innovation in dynamic contexts

    Resource integration represents the most foundational construct in service-dominant (S-D) logic, but efforts to theorize about the concept are scarce, especially with regard to actors and service innovation in dynamic contexts. By theorizing about resource integration from an S-D logic perspective, this thesis aims to help fill that gap through two conceptual and two empirical papers, based on extensive literature reviews and interviews. The thesis builds on a wide range of literature to theorize about resource integration, but it predominantly focuses on literature that takes a service perspective and on literature in psychology, in order to study actors as resource integrators in value co-creation processes and service innovation, as well as the mechanisms that enable and drive some actors to perform these activities with greater success than others. The actor as a driver of activities in value co-creation processes not only needs to integrate the right resources, but also to integrate the resources right. Furthermore, actors must be agile in responding to change and must continuously innovate to maintain competitive advantages. Hence, explorative and exploitative resource integration should be considered the norm in companies, rather than the exception. The results of this study contribute to linking two central concepts of service research, namely resource integration and service innovation, and to theorizing about resource integration as a phenomenon. They do so by combining conceptual and exploratory research about actors’ ability to integrate resources effectively and efficiently and to develop innovative solutions in services and service delivery through explorative and exploitative resource integration. By zooming in on micro-level phenomena, we have investigated elements and mechanisms that energize and drive the actors performing activities.
In a complex, dynamic world filled with problems and challenges, being able to adapt to changing environments, being resourceful and creative, and being able to solve problems under stress may be the most important abilities actors need to face the unpredictable future.

    On link predictions in complex networks with an application to ontologies and semantics

    It is assumed that ontologies can be represented and treated as networks and that these networks show properties of so-called complex networks. Just as with ontologies, “our current pictures of many networks are substantially incomplete” (Clauset et al., 2008, p. 3ff.). For this reason, networks have been analyzed and methods for identifying missing edges have been proposed. The goal of this thesis is to show how treating and understanding an ontology as a network can be used to extend and improve existing ontologies, and how measures from graph theory and techniques developed in recent years for social network analysis and other complex networks can be applied to semantic networks in the form of ontologies. Given a large enough amount of data, here data organized according to an ontology, and the relations defined in the ontology, the goal is to find patterns that help reveal information given only implicitly in an ontology. Unlike reasoning and methods of inference, the approach does not rely on predefined patterns of relations; instead, it is meant to identify patterns of relations or of other structural information taken from the ontology graph, in order to calculate probabilities of as yet unknown relations between entities. The methods adopted from network theory and the social sciences presented in this thesis are expected to reduce considerably the work and time necessary to build an ontology by automating it. They are believed to be applicable to any ontology and can be used in either a supervised or an unsupervised fashion to automatically identify missing relations, add new information, and thereby enlarge the data set and increase the information explicitly available in an ontology. As seen in the IBM Watson example, different knowledge bases are applied in NLP tasks. An ontology like WordNet contains lexical and semantic knowledge on lexemes, while general knowledge ontologies like Freebase and DBpedia contain information on entities of the non-linguistic world.
In this thesis, examples from both kinds of ontologies are used: WordNet and DBpedia. WordNet is a manually crafted resource that establishes a network of representations of word senses, connected to the word forms used to express them, and connects these senses and forms with lexical and semantic relations in a machine-readable form. As will be shown, although a lot of work has been put into WordNet, it can still be improved. While it already contains many lexical and semantic relations, it does not make it possible to distinguish between polysemous and homonymous words. As will be explained later, this distinction can be useful for NLP problems involving word sense disambiguation and hence QA. Using graph- and network-based centrality and path measures, the goal is to train a machine learning model that is able to identify new, missing relations in the ontology and assign this new relation to the whole data set (i.e., WordNet). The approach presented here will be based on a deep analysis of the ontology and the network structure it exposes. Using different measures from graph theory as features and a set of manually created examples, a so-called training set, a supervised machine learning approach will be presented and evaluated to show the benefit of interpreting an ontology as a network compared to other approaches that do not take the network structure into account. DBpedia is an ontology derived from Wikipedia. The structured information given in Wikipedia infoboxes is parsed, and relations according to an underlying ontology are extracted. Unlike Wikipedia, it only contains the small amount of structured information (e.g., the infoboxes of each page) and not the large amount of unstructured information (i.e., the free text) of Wikipedia pages. Hence DBpedia is missing a large number of possible relations that are described in Wikipedia. Also, compared to Freebase, an ontology used and maintained by Google, DBpedia is quite incomplete.
This, and the fact that Wikipedia can be expected to serve as a reference against which possible results are compared, makes DBpedia a good subject of investigation. The approach used to extend DBpedia presented in this thesis will be based on a thorough analysis of the network structure and the assumed evolution of the network, which will point to the locations in the network where information is most likely to be missing. Since the structure of the ontology and the resulting network is assumed to reveal patterns that are connected to certain relations defined in the ontology, these patterns can be used to identify what kind of relation is missing between two entities of the ontology. This will be done using unsupervised methods from the fields of data mining and machine learning.
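To make the idea of scoring missing edges from network structure concrete, here is a minimal sketch. The abstract does not name the specific measures used, so this example assumes one well-known local similarity index, the Adamic-Adar score, as a stand-in. The toy graph and the function name are hypothetical.

```python
import math
from itertools import combinations


def adamic_adar_scores(edges):
    """Score every unconnected node pair by its shared neighbours.

    Each shared neighbour z contributes 1 / log(degree(z)), so rarely
    connected neighbours count as stronger evidence for a missing edge.
    """
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(neighbours), 2):
        if v in neighbours[u]:
            continue  # already connected, nothing to predict
        shared = neighbours[u] & neighbours[v]
        # A shared neighbour is linked to both u and v, so its degree is >= 2
        # and the logarithm is strictly positive.
        score = sum(1.0 / math.log(len(neighbours[z])) for z in shared)
        if score > 0:
            scores[(u, v)] = score
    return scores


# Toy undirected graph, for illustration only.
edges = [
    ("dog", "animal"), ("cat", "animal"),
    ("dog", "mammal"), ("cat", "mammal"),
    ("animal", "entity"),
]
ranked = sorted(adamic_adar_scores(edges).items(), key=lambda item: -item[1])
for pair, score in ranked:
    print(pair, round(score, 3))
```

Scores like these could serve either as a direct ranking of candidate relations or as one feature among several in a supervised model, in the spirit of the two settings the abstract describes.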