6,431 research outputs found

    Graduate Catalog of Studies, 2023-2024

    Get PDF

    Improving Cross-Lingual Transfer Learning for Event Detection

    Get PDF
    The widespread adoption of applications powered by Artificial Intelligence (AI) backbones has unquestionably changed the way we interact with the world around us. Applications such as automated personal assistants, automatic question answering, and machine-based translation systems have become mainstays of modern culture thanks to the recent considerable advances in Natural Language Processing (NLP) research. Nonetheless, with over 7000 spoken languages in the world, there still remain a considerable number of marginalized communities that are unable to benefit from these technological advancements largely due to the language they speak. Cross-Lingual Learning (CLL) looks to address this issue by transferring the knowledge acquired from a popular, high-resource source language (e.g., English, Chinese, or Spanish) to a less favored, lower-resourced target language (e.g., Urdu or Swahili). This dissertation leverages the Event Detection (ED) sub-task of Information Extraction (IE) as a testbed and presents three novel approaches that improve cross-lingual transfer learning from distinct perspectives: (1) direct knowledge transfer, (2) hybrid knowledge transfer, and (3) few-shot learning

    Dataflow Programming and Acceleration of Computationally-Intensive Algorithms

    Get PDF
    The volume of unstructured textual information continues to grow due to recent technological advancements. This resulted in an exponential growth of information generated in various formats, including blogs, posts, social networking, and enterprise documents. Numerous Enterprise Architecture (EA) documents are also created daily, such as reports, contracts, agreements, frameworks, architecture requirements, designs, and operational guides. The processing and computation of this massive amount of unstructured information necessitate substantial computing capabilities and the implementation of new techniques. It is critical to manage this unstructured information through a centralized knowledge management platform. Knowledge management is the process of managing information within an organization. This involves creating, collecting, organizing, and storing information in a way that makes it easily accessible and usable. The research involved the development textual knowledge management system, and two use cases were considered for extracting textual knowledge from documents. The first case study focused on the safety-critical documents of a railway enterprise. Safety is of paramount importance in the railway industry. There are several EA documents including manuals, operational procedures, and technical guidelines that contain critical information. Digitalization of these documents is essential for analysing vast amounts of textual knowledge that exist in these documents to improve the safety and security of railway operations. A case study was conducted between the University of Huddersfield and the Railway Safety Standard Board (RSSB) to analyse EA safety documents using Natural language processing (NLP). A graphical user interface was developed that includes various document processing features such as semantic search, document mapping, text summarization, and visualization of key trends. For the second case study, open-source data was utilized, and textual knowledge was extracted. Several features were also developed, including kernel distribution, analysis offkey trends, and sentiment analysis of words (such as unique, positive, and negative) within the documents. Additionally, a heterogeneous framework was designed using CPU/GPU and FPGAs to analyse the computational performance of document mapping

    Graduate Catalog of Studies, 2023-2024

    Get PDF

    Location Reference Recognition from Texts: A Survey and Comparison

    Full text link
    A vast amount of location information exists in unstructured texts, such as social media posts, news stories, scientific articles, web pages, travel blogs, and historical archives. Geoparsing refers to recognizing location references from texts and identifying their geospatial representations. While geoparsing can benefit many domains, a summary of its specific applications is still missing. Further, there is a lack of a comprehensive review and comparison of existing approaches for location reference recognition, which is the first and core step of geoparsing. To fill these research gaps, this review first summarizes seven typical application domains of geoparsing: geographic information retrieval, disaster management, disease surveillance, traffic management, spatial humanities, tourism management, and crime management. We then review existing approaches for location reference recognition by categorizing these approaches into four groups based on their underlying functional principle: rule-based, gazetteer matching–based, statistical learning-–based, and hybrid approaches. Next, we thoroughly evaluate the correctness and computational efficiency of the 27 most widely used approaches for location reference recognition based on 26 public datasets with different types of texts (e.g., social media posts and news stories) containing 39,736 location references worldwide. Results from this thorough evaluation can help inform future methodological developments and can help guide the selection of proper approaches based on application needs

    Sound Event Detection by Exploring Audio Sequence Modelling

    Get PDF
    Everyday sounds in real-world environments are a powerful source of information by which humans can interact with their environments. Humans can infer what is happening around them by listening to everyday sounds. At the same time, it is a challenging task for a computer algorithm in a smart device to automatically recognise, understand, and interpret everyday sounds. Sound event detection (SED) is the process of transcribing an audio recording into sound event tags with onset and offset time values. This involves classification and segmentation of sound events in the given audio recording. SED has numerous applications in everyday life which include security and surveillance, automation, healthcare monitoring, multimedia information retrieval, and assisted living technologies. SED is to everyday sounds what automatic speech recognition (ASR) is to speech and automatic music transcription (AMT) is to music. The fundamental questions in designing a sound recognition system are, which portion of a sound event should the system analyse, and what proportion of a sound event should the system process in order to claim a confident detection of that particular sound event. While the classification of sound events has improved a lot in recent years, it is considered that the temporal-segmentation of sound events has not improved in the same extent. The aim of this thesis is to propose and develop methods to improve the segmentation and classification of everyday sound events in SED models. In particular, this thesis explores the segmentation of sound events by investigating audio sequence encoding-based and audio sequence modelling-based methods, in an effort to improve the overall sound event detection performance. In the first phase of this thesis, efforts are put towards improving sound event detection by explicitly conditioning the audio sequence representations of an SED model using sound activity detection (SAD) and onset detection. To achieve this, we propose multi-task learning-based SED models in which SAD and onset detection are used as auxiliary tasks for the SED task. The next part of this thesis explores self-attention-based audio sequence modelling, which aggregates audio representations based on temporal relations within and between sound events, scored on the basis of the similarity of sound event portions in audio event sequences. We propose SED models that include memory-controlled, adaptive, dynamic, and source separation-induced self-attention variants, with the aim to improve overall sound recognition

    "It's not a career": Platform work among young people aged 16-19

    Get PDF
    In the online gig economy, or platform work as it is sometimes known, work can be organised through websites and smartphone apps. People can drive for Uber or Deliveroo, sell items on eBay or Etsy, or rent their properties on Airbnb. This research examines the views of young people between the ages of 16 and 19 in the United Kingdom to see whether they knew about the online gig economy, whether they were using it already to earn money, and whether they expected to use it for their careers. It discovers careers professionals’ levels of knowledge, and their ability (and desire) to include the gig economy in their professional practice. This research contributes to discussions about what constitutes decent work, and whether it can be found within the online gig economy. The results point to ways in which careers practice could include platform work as a means of extending young people’s knowledge about alternative forms of work. This study also makes a theoretical contribution to literature, bringing together elements of careership, cognitive schema theory, and motivational theory and psychology of working theory, in a novel combination, to explain how young people were thinking about platform work in the context of their careers

    Explainable Representations for Relation Prediction in Knowledge Graphs

    Full text link
    Knowledge graphs represent real-world entities and their relations in a semantically-rich structure supported by ontologies. Exploring this data with machine learning methods often relies on knowledge graph embeddings, which produce latent representations of entities that preserve structural and local graph neighbourhood properties, but sacrifice explainability. However, in tasks such as link or relation prediction, understanding which specific features better explain a relation is crucial to support complex or critical applications. We propose SEEK, a novel approach for explainable representations to support relation prediction in knowledge graphs. It is based on identifying relevant shared semantic aspects (i.e., subgraphs) between entities and learning representations for each subgraph, producing a multi-faceted and explainable representation. We evaluate SEEK on two real-world highly complex relation prediction tasks: protein-protein interaction prediction and gene-disease association prediction. Our extensive analysis using established benchmarks demonstrates that SEEK achieves significantly better performance than standard learning representation methods while identifying both sufficient and necessary explanations based on shared semantic aspects.Comment: 16 pages, 3 figure

    A BIM - GIS Integrated Information Model Using Semantic Web and RDF Graph Databases

    Get PDF
    In recent years, 3D virtual indoor and outdoor urban modelling has become an essential geospatial information framework for civil and engineering applications such as emergency response, evacuation planning, and facility management. Building multi-sourced and multi-scale 3D urban models are in high demand among architects, engineers, and construction professionals to achieve these tasks and provide relevant information to decision support systems. Spatial modelling technologies such as Building Information Modelling (BIM) and Geographical Information Systems (GIS) are frequently used to meet such high demands. However, sharing data and information between these two domains is still challenging. At the same time, the semantic or syntactic strategies for inter-communication between BIM and GIS do not fully provide rich semantic and geometric information exchange of BIM into GIS or vice-versa. This research study proposes a novel approach for integrating BIM and GIS using semantic web technologies and Resources Description Framework (RDF) graph databases. The suggested solution's originality and novelty come from combining the advantages of integrating BIM and GIS models into a semantically unified data model using a semantic framework and ontology engineering approaches. The new model will be named Integrated Geospatial Information Model (IGIM). It is constructed through three stages. The first stage requires BIMRDF and GISRDF graphs generation from BIM and GIS datasets. Then graph integration from BIM and GIS semantic models creates IGIMRDF. Lastly, the information from IGIMRDF unified graph is filtered using a graph query language and graph data analytics tools. The linkage between BIMRDF and GISRDF is completed through SPARQL endpoints defined by queries using elements and entity classes with similar or complementary information from properties, relationships, and geometries from an ontology-matching process during model construction. The resulting model (or sub-model) can be managed in a graph database system and used in the backend as a data-tier serving web services feeding a front-tier domain-oriented application. A case study was designed, developed, and tested using the semantic integrated information model for validating the newly proposed solution, architecture, and performance
    • …
    corecore