114 research outputs found

Natural language processing for similar languages, varieties, and dialects: A survey

Author: Nakov Preslav, Scherrer Yves, Zampieri Marcos
Publication date: 20/11/2020

There has been a lot of recent interest in the natural language processing (NLP) community in the computational processing of language varieties and dialects, with the aim of improving the performance of applications such as machine translation, speech recognition, and dialogue systems. Here, we attempt to survey this growing field of research, with a focus on computational methods for processing similar languages, varieties, and dialects. In particular, we discuss the most important challenges when dealing with diatopic language variation, and we present some of the available datasets, the process of data collection, and the most common data collection strategies used to compile datasets for similar languages, varieties, and dialects. We further present a number of studies on computational methods developed and/or adapted for preprocessing, normalization, part-of-speech tagging, and parsing similar languages, language varieties, and dialects. Finally, we discuss relevant applications such as language and dialect identification and machine translation for closely related languages, language varieties, and dialects. Non peer reviewed.

Helsingin yliopiston digitaalinen arkisto

Histories of Australian Rock Art Research

Publication venue: ANU Press
Publication date: 25/10/2022

Australia has one of the largest inventories of rock art in the world with pictographs and petroglyphs found almost anywhere that has suitable rock surfaces – in rock shelters and caves, on boulders and rock platforms. First Nations people have been marking these places with figurative imagery, abstract designs, stencils and prints for tens of thousands of years, often engaging with earlier rock markings. The art reflects and expresses changing experiences within landscapes over time, spirituality, history, law and lore, as well as relationships between individuals and groups of people, plants, animals, land and Ancestral Beings that are said to have created the world, including some rock art. Since the late 1700s, people arriving in Australia have been fascinated with the rock art they encountered, with detailed studies commencing in the late 1800s. Through the 1900s an impressive body of research on Australian rock art was undertaken, with dedicated academic study using archaeological methods employed since the late 1940s. Since then, Australian rock art has been researched from various perspectives, including that of Traditional Owners, custodians and other community members. Through the 1900s, there was also growing interest in Australian rock art from researchers across the globe, leading many to visit or migrate to Australia to undertake rock art research. In this volume, the varied histories of Australian rock art research from different parts of the country are explored not only in terms of key researchers, developments and changes over time, but also the crucial role of First Nations people themselves in investigations of this key component of their living heritage

The Nature of attachment: An Australian experience

Author: Brown Steve
Publication venue: Taylor & Francis
Publication date: 01/01/2018

Throughout the world, protected area management regimes typically separate cultural and natural heritage in legislation, policy, administrative structures, disciplinary expertise, and on-ground practice. Within settler colonial nations, including Australia, cultural heritage is itself habitually separated into indigenous heritage and 'historic' (or non-indigenous) heritage. A consequence of these multiple binaries and disconnected regimes is that they work across rather than with one another. In this chapter, I use the frame of place-attachment to consider issues arising from the separation of natural and cultural heritage in the management of protected areas. The case examples are homestead gardens within protected areas, and my concern is for the recognition of Anglo-Australian place-attachment to domestic gardens.

University of Canberra Research Repository

IKUWA6. Shared Heritage

Celebrating the theme ‘Shared heritage’, IKUWA6 (the 6th International Congress for Underwater Archaeology) was the first such major conference to be held in the Asia-Pacific region, and the first IKUWA meeting hosted outside Europe since the organisation’s inception in Germany in the 1990s. A primary objective of holding IKUWA6 in Australia was to give greater voice to practitioners and emerging researchers across the Asia and Pacific regions, who are often not well represented in northern hemisphere scientific gatherings of this scale, and to focus on the areas of overlap in our mutual heritage, techniques and technology. Drawing together peer-reviewed presentations by delegates from across the world who converged in Fremantle in 2016 to participate, this volume covers a stimulating diversity of themes and niche topics of value to maritime archaeology practitioners, researchers, students, historians and museum professionals across the world.

OAPEN Library

Cross-Domain information extraction from scientific articles for research knowledge graphs

Author: Brack Arthur
Publication venue: Hannover : Institutionelles Repositorium der Leibniz Universität Hannover
Publication date: 01/01/2022

Today’s scholarly communication is a document-centred process and, as such, rather inefficient. Fundamental contents of research papers are not accessible to computers since they are only present in unstructured PDF files. Therefore, current research infrastructures are not able to assist scientists appropriately in their core research tasks. This thesis addresses this issue and proposes methods to automatically extract relevant information from scientific articles for Research Knowledge Graphs (RKGs) that represent scholarly knowledge in a structured and interlinked form. First, this thesis conducts a requirements analysis for an Open Research Knowledge Graph (ORKG). We present literature-related use cases of researchers that should be supported by an ORKG-based system and their specific requirements for the underlying ontology and instance data. Based on this analysis, the identified use cases are categorised into two groups: The first group of use cases needs manual or semi-automatic approaches for knowledge graph (KG) construction since they require high correctness of the instance data. The second group requires high completeness and can tolerate noisy instance data. Thus, this group needs automatic approaches for KG population. This thesis focuses on the second group of use cases and provides contributions for machine learning tasks that aim to support them. To assess the relevance of a research paper, scientists usually skim through titles, abstracts, introductions, and conclusions. An organised presentation of the articles' essential information would make this process more time-efficient. The task of sequential sentence classification addresses this issue by classifying sentences in an article into categories like research problem, used methods, or obtained results. To address this problem, we propose a novel unified cross-domain multi-task deep learning approach that makes use of datasets from different scientific domains (e.g. biomedicine and computer graphics) and varying structures (e.g. datasets covering either only abstracts or full papers). Our approach outperforms the state of the art on full paper datasets significantly while being competitive for datasets consisting of abstracts. Moreover, our approach enables the categorisation of sentences in a domain-independent manner. Furthermore, we present the novel task of domain-independent information extraction to extract scientific concepts from research papers in a domain-independent manner. This task aims to support the use cases "find related work" and "get recommended articles". For this purpose, we introduce a set of generic scientific concepts that are relevant across ten domains in Science, Technology, and Medicine (STM) and release an annotated dataset of 110 abstracts from these domains. Since the annotation of scientific text is costly, we suggest an active learning strategy based on a state-of-the-art deep learning approach. The proposed method enables us to nearly halve the amount of required training data. Then, we extend this domain-independent information extraction approach with the task of coreference resolution. Coreference resolution aims to identify mentions that refer to the same concept or entity. Baseline results on our corpus with current state-of-the-art approaches for coreference resolution showed that current approaches perform poorly on scientific text. Therefore, we propose a sequential transfer learning approach that exploits annotated datasets from non-academic domains. Our experimental results demonstrate that our approach noticeably outperforms the state-of-the-art baselines. Additionally, we investigate the impact of coreference resolution on KG population. We demonstrate that coreference resolution has a small impact on the number of resulting concepts in the KG, but improves its quality significantly. Consequently, using our domain-independent information extraction approach, we populate an RKG from 55,485 abstracts of the ten investigated STM domains. We show that every domain mainly uses its own terminology and that the populated RKG contains useful concepts. Moreover, we propose a novel approach for the task of citation recommendation. This task can help researchers improve the quality of their work by finding or recommending relevant related work. Our approach exploits RKGs that interlink research papers based on mentioned scientific concepts. Using our automatically populated RKG, we demonstrate that the combination of information from RKGs with existing state-of-the-art approaches is beneficial. Finally, we conclude the thesis and sketch possible directions of future work.

Bloody but unbowed : how international and national legal norms and frameworks can improve recognition and inclusion of Aboriginal and Torres Strait Islander knowledge in Australian environmental decision-making

Author: Preston Judith A.
Publication venue: American Psychological Association (APA)
Publication date: 01/01/2019

The Earth’s natural and cultural resources are disappearing at an unrelenting pace, adversely affecting both human and non-human species. Solutions to these challenges derived from legal frameworks require a shift towards creative and interdisciplinary approaches. Increased reference to knowledges held and protected by Indigenous custodians, and inclusion of their input into decision-making, particularly related to natural resource protection and development, form part of this shift. Legal frameworks globally have begun to include references to Indigenous knowledges (IK) as important knowledge sources in national constitutions as well as in environmental and cultural heritage protection laws. In Australia, Indigenous knowledges have been referenced as important sources of knowledge in laws at national, state and local levels. Custodians of IK may be consulted in the decision-making process, particularly in environmental matters; however, this is generally subject to the discretion of the decision-makers. This thesis considers IK in the context of the relationship, rights and needs of Aboriginal and Torres Strait Islander knowledge custodians as determined by them. A key starting point is to consider whether these knowledges, and their custodians, are respected and consulted effectively. Consideration is also given to whether their voices are heard within Australian environmental law and governance frameworks in accordance with international law and policy, which incorporates Indigenous self-determination. This requires enabling laws and institutions to address the dispossession and disempowerment suffered by Aboriginal and Torres Strait Islander communities at the hands of colonial powers and the impediments of both past and current governance. The thesis cannot address and analyse in depth all these complex issues simultaneously. It does not attempt to know, or articulate, Aboriginal and Torres Strait Islander knowledges in the vastly different nations, cultures and ecosystems throughout Australia. This is sui generis to Indigenous peoples and their Country. Direct Indigenous voices and institutions should determine whether IK can be revealed and applied in the wider context of environmental decision-making (EDM). However, this thesis can delve further into these issues to provide a base of knowledge for the reader to begin comprehending the complexity and importance of IK and its place in the Australian legal framework to improve EDM. Focus is placed on the value and recognition of IK within the legal system and processes for empowering the holders of IK to be a powerful and effective voice within EDM. The proposed approaches can only facilitate improved environmental governance. Achieving the integration and implementation of IK in EDM requires wider legal, political, economic and social change towards Aboriginal and Torres Strait Islander self-determination.

Multimodal sentiment analysis in real-life videos

Author: Stappen Lukas
Publication date: 24/11/2022

This thesis extends the emerging field of multimodal sentiment analysis of real-life videos, taking two components into consideration: the emotion and the emotion's target. The emotion component of media is traditionally represented as a segment-based intensity model of emotion classes. This representation is replaced here by a value- and time-continuous view. Adjacent research fields, such as affective computing, have largely neglected the linguistic information available from automatic transcripts of audio-video material. As is demonstrated here, this text modality is well-suited for time- and value-continuous prediction. Moreover, source-specific problems, such as trustworthiness, have been largely unexplored so far. This work examines perceived trustworthiness of the source, and its quantification, in user-generated video data and presents a possible modelling path. Furthermore, the transfer between the continuous and discrete emotion representations is explored in order to summarise the emotional context at a segment level. The other component deals with the target of the emotion, for example, the topic the speaker is addressing. Emotion targets in a video dataset can, as is shown here, be coherently extracted based on automatic transcripts without limiting a priori parameters, such as the expected number of targets. Furthermore, alternatives to purely linguistic investigation in predicting targets, such as knowledge bases and multimodal systems, are investigated. A new dataset is designed for this investigation, and, in conjunction with proposed novel deep neural networks, extensive experiments are conducted to explore the components described above. The developed systems show robust prediction results and demonstrate strengths of the respective modalities, feature sets, and modelling techniques. Finally, foundations are laid for cross-modal information prediction systems with applications to the correction of corrupted in-the-wild signals from real-life videos.

OPUS Augsburg

Enterprise reference architectures for higher education institutions: Analysis, comparison and practical uses

Author: Sanchez Puchol Felix
Publication venue: Fundacio per la Universitat Oberta de Catalunya
Publication date: 29/04/2022

Enterprise Architecture (EA) is currently accepted as one of the major instruments for enabling organisations in their transformation processes to achieve business-technology alignment. Although EA has been successfully adopted in many industries over recent years, Higher Education still represents one of the sectors with lower levels of adoption and maturity of EA practices. The present thesis puts particular emphasis on the study of Enterprise Reference Architectures (ERAs), as a particular type of EA artefact, in Higher Education Institutions (HEIs). After formally clarifying the concept of ERAs and giving a panoramic view of the current state of the art of existing HEI-oriented ERAs, the thesis proposes an artefact framework, built through a Design Science Research (DSR) approach, that is intended to help practitioners (re-)use or apply these ERAs in their own practical settings. The purpose of the constructed artefact is to support practitioners in making the necessary adjustments to existing HEI-oriented ERAs so that they can be applied successfully to their specific needs.

Subject: Information and network technologies