
    Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference

    We propose Conditional Adapter (CoDA), a parameter-efficient transfer learning method that also improves inference efficiency. CoDA generalizes beyond standard adapter approaches to enable a new way of balancing speed and accuracy using conditional computation. Starting with an existing dense pretrained model, CoDA adds sparse activation together with a small number of new parameters and a light-weight training phase. Our experiments demonstrate that the CoDA approach provides an unexpectedly efficient way to transfer knowledge. Across a variety of language, vision, and speech tasks, CoDA achieves a 2x to 8x inference speed-up compared to the state-of-the-art Adapter approach with moderate to no accuracy loss and the same parameter efficiency.
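
    A minimal PyTorch-style sketch of the general idea, assuming a transformer residual stream; the class name, the top-k router, and all dimensions are illustrative inventions for this listing, not the paper's actual architecture:

        import torch
        import torch.nn as nn

        class ConditionalAdapterSketch(nn.Module):
            """Illustrative conditional adapter: a learned router picks the
            top-k tokens, only those pass through the bottleneck adapter MLP,
            and the remaining tokens skip it, saving compute at inference."""
            def __init__(self, d_model: int, d_bottleneck: int, k: int):
                super().__init__()
                self.router = nn.Linear(d_model, 1)  # scores each token
                self.down = nn.Linear(d_model, d_bottleneck)
                self.up = nn.Linear(d_bottleneck, d_model)
                self.k = k

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                # x: (batch, seq_len, d_model)
                scores = self.router(x).squeeze(-1)         # (batch, seq_len)
                topk = scores.topk(self.k, dim=-1).indices  # (batch, k)
                idx = topk.unsqueeze(-1).expand(-1, -1, x.size(-1))
                picked = torch.gather(x, 1, idx)            # (batch, k, d_model)
                delta = self.up(torch.relu(self.down(picked)))
                return x.scatter_add(1, idx, delta)         # sparse residual update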

    A Decision Support System for Economic Viability and Environmental Impact Assessment of Vertical Farms

    Vertical farming (VF) is the practice of growing crops or animals using the vertical dimension via multi-tier racks or vertically inclined surfaces. In this thesis, I focus on the emerging industry of plant-specific VF. Vertical plant farming (VPF) is a promising and relatively novel practice that can be conducted in buildings with environmental control and artificial lighting. However, the nascent sector has experienced challenges in economic viability, standardisation, and environmental sustainability. Practitioners and academics call for a comprehensive financial analysis of VPF, but efforts are stifled by a lack of valid and available data. A review of economic estimation and horticultural software identifies a need for a decision support system (DSS) that facilitates risk-empowered business planning for vertical farmers. This thesis proposes an open-source DSS framework to evaluate business sustainability through financial risk and environmental impact assessments. Data from the literature, alongside lessons learned from industry practitioners, would be centralised in the proposed DSS using imprecise data techniques. These techniques have been applied in engineering but are seldom used in financial forecasting. They could benefit complex sectors that have only scarce data with which to predict business viability. To begin the execution of the DSS framework, VPF practitioners were interviewed using a mixed-methods approach. Learnings from over 19 shuttered and operational VPF projects provide insights into the barriers inhibiting scalability and identify risks, which are organised into a risk taxonomy. Labour was the most commonly reported top challenge. Therefore, research was conducted to explore lean principles for improving productivity. A probabilistic model representing a spectrum of variables and their associated uncertainty was built according to the DSS framework to evaluate the financial risk of VF projects. This enabled flexible computation without precise production or financial data, improving the accuracy of economic estimation. The model assessed two VPF cases (one in the UK and another in Japan), demonstrating the first risk and uncertainty quantification of VPF business models in the literature. The results highlighted measures to improve economic viability and assessed the viability of the UK and Japan cases. An environmental impact assessment model was also developed, allowing VPF operators to evaluate their carbon footprint compared to traditional agriculture using life-cycle assessment. I explore strategies for net-zero carbon production through sensitivity analysis. Renewable energies, especially solar, geothermal, and tidal power, show promise for reducing the carbon emissions of indoor VPF. Results show that renewably powered VPF can reduce carbon emissions compared to field-based agriculture when land-use change is considered. The drivers for DSS adoption have been researched, showing a pathway of compliance and design thinking to overcome the ‘problem of implementation’ and enable commercialisation. Further work is suggested to standardise VF equipment, collect benchmarking data, and characterise risks. This work will reduce risk and uncertainty and accelerate the sector’s emergence.
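
    The kind of risk-based viability analysis described can be illustrated with a small Monte Carlo sketch; every distribution and figure below is invented purely for illustration and is not drawn from the thesis:

        import numpy as np

        rng = np.random.default_rng(0)
        n = 10_000  # simulated scenarios

        # Hypothetical uncertain inputs for one vertical-farm year
        # (all distributions and numbers are invented for illustration).
        yield_kg   = rng.triangular(60_000, 80_000, 95_000, n)  # crop output
        price      = rng.normal(6.0, 0.8, n)                    # GBP per kg
        energy     = rng.normal(350_000, 40_000, n)             # GBP per year
        labour     = rng.triangular(200_000, 260_000, 340_000, n)
        other_opex = 120_000

        profit = yield_kg * price - (energy + labour + other_opex)

        print(f"P(loss)        = {np.mean(profit < 0):.1%}")
        print(f"median profit  = {np.median(profit):,.0f} GBP")
        print(f"5th percentile = {np.percentile(profit, 5):,.0f} GBP")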

    Learning disentangled speech representations

    A variety of informational factors are contained within the speech signal, and a single short recording of speech reveals much more than the spoken words. The best method to extract and represent informational factors from the speech signal ultimately depends on which informational factors are desired and how they will be used. In addition, sometimes methods will capture more than one informational factor at the same time, such as speaker identity, spoken content, and speaker prosody. The goal of this dissertation is to explore different ways to deconstruct the speech signal into abstract representations that can be learned and later reused in various speech technology tasks. This task of deconstructing, also known as disentanglement, is a form of distributed representation learning. As a general approach to disentanglement, there are some guiding principles that elaborate what a learned representation should contain as well as how it should function. In particular, learned representations should contain all of the requisite information in a more compact manner, be interpretable, remove nuisance factors of irrelevant information, be useful in downstream tasks, and be independent of the task at hand. The learned representations should also be able to answer counterfactual questions. In some cases, learned speech representations can be re-assembled in different ways according to the requirements of downstream applications. For example, in a voice conversion task, the speech content is retained while the speaker identity is changed. And in a content-privacy task, some targeted content may be concealed without affecting how surrounding words sound. While there is no single best method to disentangle all types of factors, some end-to-end approaches demonstrate a promising degree of generalization to diverse speech tasks. This thesis explores a variety of use-cases for disentangled representations including phone recognition, speaker diarization, linguistic code-switching, voice conversion, and content-based privacy masking. Speech representations can also be utilised for automatically assessing the quality and authenticity of speech, such as automatic MOS ratings or detecting deep fakes. The meaning of the term "disentanglement" is not well defined in previous work, and it has acquired several meanings depending on the domain (e.g. image vs. speech). Sometimes the term "disentanglement" is used interchangeably with the term "factorization". This thesis proposes that disentanglement of speech is distinct, and offers a viewpoint of disentanglement that can be considered both theoretically and practically.
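
    As a rough illustration of what separating content from speaker identity can look like, here is a minimal PyTorch-style sketch assuming mel-spectrogram inputs; the two-branch layout and all names are illustrative, not the dissertation's actual models. Swapping the pooled speaker code between utterances is the voice-conversion-style reassembly mentioned above:

        import torch
        import torch.nn as nn

        class DisentangledSketch(nn.Module):
            """Illustrative encoder pair: one branch keeps frame-level
            (content) information, the other is pooled over time so it can
            only carry utterance-level (e.g. speaker) information."""
            def __init__(self, n_mels: int = 80, d: int = 128):
                super().__init__()
                self.content_enc = nn.GRU(n_mels, d, batch_first=True)
                self.speaker_enc = nn.GRU(n_mels, d, batch_first=True)
                self.decoder = nn.GRU(2 * d, n_mels, batch_first=True)

            def forward(self, mel, mel_other_utterance):
                content, _ = self.content_enc(mel)           # (B, T, d)
                spk_seq, _ = self.speaker_enc(mel_other_utterance)
                speaker = spk_seq.mean(dim=1, keepdim=True)  # pooled: (B, 1, d)
                speaker = speaker.expand(-1, content.size(1), -1)
                recon, _ = self.decoder(torch.cat([content, speaker], dim=-1))
                return recon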

    On the Mechanism of Building Core Competencies: a Study of Chinese Multinational Port Enterprises

    This study aims to explore how Chinese multinational port enterprises (MNPEs) build their core competencies. Core competencies are a firm’s special capabilities and sources of sustainable competitive advantage (SCA) in the marketplace, and the concept has led to extensive research and debate. However, few studies include inquiries about the mechanisms of building core competencies in the context of Chinese MNPEs. Accordingly, answers were sought to three research questions: 1. What are the core competencies of the Chinese MNPEs? 2. What are the mechanisms that the Chinese MNPEs use to build their core competencies? 3. What are the paths that the Chinese MNPEs pursue to build their resource bases? The study adopted a multiple-case study design, focusing on the mechanism of building core competencies through the lens of the resource-based view (RBV). It purposively selected five leading Chinese MNPEs and three industry associations as Case Companies. The study revealed three main findings. First, it identified three generic core competencies possessed by the Case Companies, i.e., innovation in business models and operations, utilisation of technologies, and acquisition of strategic resources. Second, it developed the conceptual framework of the Mechanism of Building Core Competencies (MBCC), which is a process of change in a firm’s collective learning about the effective and efficient utilisation of its resources in response to critical events. Third, it proposed three paths to build core competencies, i.e., enhancing collective learning, selecting sustainable processes, and building the resource base. The study contributes to the knowledge of core competencies and the RBV in three ways: (1) presenting three generic core competencies of the Chinese MNPEs, (2) proposing a new conceptual framework to explain how Chinese MNPEs build their core competencies, and (3) suggesting a solid anchor point (the MBCC) to explain the links among resources, core competencies, and SCA. The findings set benchmarks for the Chinese logistics industry and provide guidelines for building core competencies.

    Modelling and Solving the Single-Airport Slot Allocation Problem

    Currently, there are about 200 overly congested airports where airport capacity does not suffice to accommodate airline demand. These airports play a critical role in the global air transport system since they concern 40% of global passenger demand and act as a bottleneck for the entire air transport system. This imbalance between airport capacity and airline demand leads to excessive delays, as well as multi-billion economic costs and huge environmental and societal costs. Concurrently, the implementation of airport capacity expansion projects requires time and space, and is subject to significant resistance from local communities. As a short- to medium-term response, Airport Slot Allocation (ASA) has been used as the main demand management mechanism. The main goal of this thesis is to improve ASA decision-making through the proposition of models and algorithms that provide enhanced ASA decision support. In doing so, this thesis is organised into three distinct chapters that shed light on the following questions (I–V), which remain unaddressed by the existing literature. In parentheses, we identify the chapters of this thesis that relate to each research question. I. How to improve the modelling of airline demand flexibility and the utility that each airline assigns to each available airport slot? (Chapters 2 and 4) II. How can one model the dynamic and endogenous adaptation of the airport’s landside and airside infrastructure to the characteristics of airline demand? (Chapter 2) III. How to consider operational delays in strategic ASA decision-making? (Chapter 3) IV. How to involve the pertinent stakeholders in the ASA decision-making process to select a commonly agreed schedule; and how can one reduce the inherent decision-complexity without compromising the quality and diversity of the schedules presented to the decision-makers? (Chapter 3) V. Given that the ASA process involves airlines (submitting requests for slots) and coordinators (assigning slots to requests based on a set of rules and priorities), how can one jointly consider the interactions between these two sides to improve ASA decision-making? (Chapter 4) With regard to research questions (I) and (II), the thesis proposes a Mixed Integer Programming (MIP) model that considers airlines’ timing flexibility (research question I) and constraints that enable the dynamic and endogenous allocation of the airport’s resources (research question II). The proposed modelling variant addresses several additional problem characteristics and policy rules, and considers multiple efficiency objectives, while integrating all constraints that may affect airport slot scheduling decisions, including the asynchronous use of the different airport resources (runway, aprons, passenger terminal) and the endogenous consideration of the capabilities of the airport’s infrastructure to adapt to the airline demand’s characteristics and the aircraft/flight type associated with each request. The proposed model is integrated into a two-stage solution approach that considers all primary and several secondary policy rules of ASA. New combinatorial results and valid tightening inequalities that facilitate the solution of the problem are proposed and implemented. An extension of the above MIP model that considers the trade-offs among schedule displacement, maximum displacement, and the number of displaced requests is integrated into a multi-objective solution framework.
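
    For orientation, the textbook core of a displacement-minimising single-airport slot allocation MIP can be written as below; this is a deliberately stripped-down sketch, omitting the asynchronous resource use, policy rules, and endogenous capacity adaptation that the thesis's model adds:

        \min \sum_{r \in R} \sum_{t \in T} |t - t_r|\, x_{rt}
        \quad \text{s.t.} \quad
        \sum_{t \in T} x_{rt} = 1 \;\; \forall r \in R, \qquad
        \sum_{r \in R} x_{rt} \le C_t \;\; \forall t \in T, \qquad
        x_{rt} \in \{0, 1\},

    where x_{rt} = 1 assigns request r to slot t, t_r is the requested time (so |t - t_r| is the displacement), and C_t is the declared capacity of slot t.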
The proposed framework holistically considers the preferences of all ASA stakeholder groups (research question IV) concerning multiple performance metrics and models the operational delays associated with each airport schedule (research question III). The delays of each schedule/solution are macroscopically estimated, and a subtractive clustering algorithm and a parameter tuning routine reduce the inherent decision complexity by pruning non-dominated solutions without compromising the representativeness of the alternatives offered to the decision-makers (research question IV). Following the determination of the representative set, the expected delay estimates of each schedule are further refined by considering the whole airfield’s operations, the landside, and the airside infrastructure. The representative schedules are ranked based on the preferences of all ASA stakeholder groups concerning each schedule’s displacement-related and operational-delay performance. Finally, in considering the interactions between airlines’ timing flexibility and utility, and the policy-based priorities assigned by the coordinator to each request (research question V), the thesis models the ASA problem as a two-sided matching game and provides guarantees on the stability of the proposed schedules. A Stable Airport Slot Allocation Model (SASAM) capitalises on the flexibility considerations introduced for addressing research question (I) through the exploitation of data submitted by the airlines during the ASA process and provides functions that proxy each request’s value considering both the airlines’ timing flexibility for each submitted request and the requests’ prioritisation by the coordinators when considering the policy rules defining the ASA process. The thesis argues on the compliance of the proposed functions with the primary regulatory requirements of the ASA process and demonstrates their applicability for different types of slot requests. SASAM guarantees stability through sets of inequalities that prune allocations blocking the formation of stable schedules. A multi-objective Deferred-Acceptance (DA) algorithm guaranteeing the stability of each generated schedule is developed. The algorithm can generate all stable non-dominated points by considering the trade-off between the spilled airline and passenger demand and maximum displacement. The work conducted in this thesis addresses several problem characteristics and sheds light on their implications for ASA decision-making, hence having the potential to improve ASA decision-making. Our findings suggest that the consideration of airlines’ timing flexibility (research question I) results in improved capacity utilisation and scheduling efficiency. The endogenous consideration of the ability of the airport’s infrastructure to adapt to the characteristics of airline demand (research question II) enables a more efficient representation of airport declared capacity that results in the scheduling of additional requests. The concurrent consideration of airlines’ timing flexibility and the endogenous adaptation of airport resources to airline demand achieves an improved alignment between the airport infrastructure and the characteristics of airline demand, ergo proposing schedules of improved efficiency. 
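    A request-proposing deferred-acceptance sketch in the spirit of the two-sided matching description above, assuming preference lists as a proxy for airline timing flexibility and per-slot coordinator priority ranks; all names and data structures are illustrative, not SASAM's actual formulation:

        from collections import deque

        def deferred_acceptance(request_prefs, slot_priority, capacity):
            """Gale-Shapley-style deferred acceptance.
            request_prefs: dict request -> list of slots, most-preferred first.
            slot_priority: dict slot -> dict request -> rank
                           (lower rank = higher coordinator priority).
            capacity:      dict slot -> number of movements it can take.
            Returns a stable assignment dict slot -> list of requests."""
            held = {s: [] for s in slot_priority}
            nxt = {r: 0 for r in request_prefs}   # next slot each request tries
            free = deque(request_prefs)
            while free:
                r = free.popleft()
                if nxt[r] >= len(request_prefs[r]):
                    continue                       # r exhausts its list: unassigned
                s = request_prefs[r][nxt[r]]
                nxt[r] += 1
                held[s].append(r)
                held[s].sort(key=lambda q: slot_priority[s][q])
                if len(held[s]) > capacity[s]:
                    free.append(held[s].pop())     # lowest-priority request bumped
            return held

        # Toy usage: two requests compete for one early slot.
        prefs = {"BA123": ["08:00", "08:05"], "LH456": ["08:00", "08:10"]}
        prio = {s: {"BA123": 0, "LH456": 1} for s in ("08:00", "08:05", "08:10")}
        cap = {"08:00": 1, "08:05": 1, "08:10": 1}
        print(deferred_acceptance(prefs, prio, cap))

    No held request and slot ever prefer each other to their final match, which is the stability property the thesis argues removes the incentive for airlines or coordinators to reject the proposed timings.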
The modelling and evaluation of the peak operational delays associated with the different airport schedules (research question III) allows the study of the implications of strategic ASA decision-making for operations and quantifies the impact of the airport’s declared capacity on each schedule’s operational performance. In considering the preferences of the relevant ASA stakeholders (airlines, coordinators, airport, and air traffic authorities) concerning multiple operational and strategic ASA efficiency metrics (research question IV), the thesis assesses the impact of alternative preference considerations and indicates a commonly preferred schedule that balances the stakeholders’ preferences. The proposition of representative subsets of alternative schedules reduces decision-complexity without significantly compromising the quality of the alternatives offered to the decision-making process (research question IV). The modelling of ASA as a two-sided matching game (research question V) results in stable schedules consisting of request-to-slot assignments that provide no incentive to airlines and coordinators to reject or alter the proposed timings. Furthermore, the proposition of stable schedules results in more intensive use of airport capacity, while simultaneously improving scheduling efficiency. The models and algorithms developed as part of this thesis are tested using airline requests and airport capacity data from coordinated airports. Computational results that are relevant to the context of the considered airport instances provide evidence on the potential improvements for the current ASA process and facilitate data-driven policy and decision-making. In particular, with regard to the alignment of airline demand with the capabilities of the airport’s infrastructure (questions I and II), computational results report improved slot allocation efficiency and airport capacity utilisation, which for the considered airport instance translate to improvements ranging between 5-24% for various schedule performance metrics. In reducing the difficulty associated with the assessment of multiple ASA solutions by the stakeholders (question IV), instance-specific results suggest reductions to the number of alternative schedules by 87%, while maintaining the quality of the solutions presented to the stakeholders above 70% (expressed in relation to the initially considered set of schedules). Meanwhile, computational results suggest that the concurrent consideration of ASA stakeholders’ preferences (research question IV) with regard to both operational (research question III) and strategic performance metrics leads to alternative airport slot scheduling solutions that inform on the trade-offs between the schedules’ operational and strategic performance and the stakeholders’ preferences. Concerning research question (V), the application of SASAM and the DA algorithm suggests improvements in the number of unaccommodated flights and passengers (13% and 40% improvements, respectively) at the expense of requests concerning fewer passengers and days of operations (increasing the number of rejected requests by 1.2% in relation to the total number of submitted requests). The research conducted in this thesis aids in the identification of limitations that should be addressed by future studies to further improve ASA decision-making. First, the thesis focuses on exact solution approaches that consider the landside and airside infrastructure of the airport and generate multiple schedules.
Pre-processing techniques that identify the bottleneck of the airport’s capacity, i.e., landside and/or airside, could be used to reduce the size of the proposed formulations and improve the required computational times. Meanwhile, the development of multi-objective heuristic algorithms that consider several problem characteristics and generate multiple efficient schedules in reasonable computational times could extend the capabilities of the models proposed in this thesis and provide decision support for some of the world’s most congested airports. Furthermore, the thesis models and evaluates the operational implications of strategic airport slot scheduling decisions. The explicit consideration of operational delays as an objective in ASA optimisation models and algorithms is an issue that merits investigation, since it may further improve the operational performance of the generated schedules. In accordance with current practice, the models proposed in this work have considered deterministic capacity parameters. Future research could propose formulations that consider stochastic representations of airport declared capacity and improve strategic ASA decision-making through the anticipation of operational uncertainty and weather-induced capacity reductions. Finally, in modelling airlines’ utility for each submitted request and available time slot, the thesis proposes time-dependent functions that utilise available data to approximate airlines’ scheduling preferences. Future studies wishing to improve the accuracy of the proposed functions could utilise commercial data sources that provide route-specific information; or, in cases where such data are unavailable, employ data mining and machine learning methodologies to extract airlines’ time-dependent utility and preferences.

    From wallet to mobile: exploring how mobile payments create customer value in the service experience

    This study explores how mobile proximity payments (MPP) (e.g., Apple Pay) create customer value in the service experience compared to traditional payment methods (e.g., cash and card). The main objectives were firstly to understand how customer value manifests as an outcome in the MPP service experience, and secondly to understand how the customer activities in the process of using MPP create customer value. To achieve these objectives a conceptual framework is built upon the Grönroos-Voima Value Model (Grönroos and Voima, 2013), and uses the Theory of Consumption Value (Sheth et al., 1991) to determine the customer value constructs for MPP, which is complemented with Script theory (Abelson, 1981) to determine the value-creating activities the consumer performs in the process of paying with MPP. The study uses a sequential exploratory mixed methods design, wherein the first qualitative stage uses two methods, self-observations (n=200) and semi-structured interviews (n=18). The subsequent second quantitative stage uses an online survey (n=441) and Structural Equation Modelling analysis to further examine the relationships and effects between the value-creating activities and customer value constructs identified in stage one. The academic contributions include the development of a model of mobile payment services value creation in the service experience, introducing the concept of in-use barriers which occur after adoption and constrain the consumers’ existing use of MPP, and revealing the importance of the mobile in-hand momentary condition as an antecedent state. Additionally, the customer value perspective of this thesis demonstrates an alternative to the dominant Information Technology approaches to researching mobile payments and broadens the view of technology from purely an object a user interacts with to an object that is immersed in consumers’ daily life.

    Innovative Hybrid Approaches for Vehicle Routing Problems

    This thesis deals with the efficient resolution of Vehicle Routing Problems (VRPs). The first chapter addresses the archetype of all VRPs: the Capacitated Vehicle Routing Problem (CVRP). Despite having been introduced more than 60 years ago, it remains an extremely challenging problem. In this chapter I design a Fast Iterated-Local-Search Localized Optimization algorithm for the CVRP, shortened to FILO. The simplicity of the CVRP definition allowed me to experiment with advanced local search acceleration and pruning techniques that eventually became the core optimization engine of FILO. FILO is experimentally shown to be extremely scalable and able to solve very large-scale instances of the CVRP in a fraction of the computing time required by existing state-of-the-art methods, while still obtaining competitive solutions in terms of quality. The second chapter deals with an extension of the CVRP called the Extended Single Truck and Trailer Vehicle Routing Problem, or simply XSTTRP. The XSTTRP models a broad class of VRPs in which a single vehicle, composed of a truck and a detachable trailer, has to serve a set of customers with accessibility constraints that make some of them unreachable by the entire vehicle. This problem moves towards VRPs with more realistic constraints, modelling scenarios such as parcel deliveries in crowded city centers or rural areas, where maneuvering a large vehicle is forbidden or dangerous. The XSTTRP generalizes several well-known VRPs such as the Multiple Depot VRP and the Location Routing Problem. For its solution I developed a hybrid metaheuristic which combines a fast heuristic optimization with a polishing phase based on the resolution of a limited set partitioning problem. Finally, the thesis includes a final chapter aimed at guiding the computational evaluation of new approaches to VRPs proposed by the machine learning community.
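
    The outer loop that iterated-local-search methods like FILO build on can be sketched generically, as below; the real contribution lies in the fast, localized moves and pruning inside local_search, which this sketch deliberately leaves abstract:

        def iterated_local_search(initial, local_search, perturb, cost, iters=1000):
            """Generic iterated-local-search skeleton: repeatedly perturb the
            incumbent solution, re-optimise it locally, and keep the better
            of the two. All four callables are supplied by the user."""
            best = local_search(initial)
            for _ in range(iters):
                candidate = local_search(perturb(best))
                if cost(candidate) < cost(best):
                    best = candidate
            return best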

    Foundations for programming and implementing effect handlers

    First-class control operators provide programmers with an expressive and efficient means for manipulating control through reification of the current control state as a first-class object, enabling programmers to implement their own computational effects and control idioms as shareable libraries. Effect handlers provide a particularly structured approach to programming with first-class control by naming control-reifying operations and separating them from their handling. This thesis is composed of three strands of work in which I develop operational foundations for programming and implementing effect handlers as well as explore the expressive power of effect handlers. The first strand develops a fine-grain call-by-value core calculus of a statically typed programming language with a structural notion of effect types, as opposed to the nominal notion of effect types that dominates the literature. With the structural approach, effects need not be declared before use. The usual safety properties of statically typed programming are retained by making crucial use of row polymorphism to build and track effect signatures. The calculus features three forms of handlers: deep, shallow, and parameterised. They each offer a different approach to manipulating the control state of programs. Traditional deep handlers are defined by folds over computation trees, and are the original construct proposed by Plotkin and Pretnar. Shallow handlers are defined by case splits (rather than folds) over computation trees. Parameterised handlers are deep handlers extended with a state value that is threaded through the folds over computation trees. To demonstrate the usefulness of effects and handlers as a practical programming abstraction I implement the essence of a small UNIX-style operating system complete with multi-user environment, time-sharing, and file I/O. The second strand studies continuation passing style (CPS) and abstract machine semantics, which are foundational techniques that admit a unified basis for implementing deep, shallow, and parameterised effect handlers in the same environment. The CPS translation is obtained through a series of refinements of a basic first-order CPS translation for a fine-grain call-by-value language into an untyped language. Each refinement moves toward a more intensional representation of continuations, eventually arriving at the notion of generalised continuation, which admits simultaneous support for deep, shallow, and parameterised handlers. The initial refinement adds support for deep handlers by representing stacks of continuations and handlers as a curried sequence of arguments. The image of the resulting translation is not properly tail-recursive, meaning some function application terms do not appear in tail position. To rectify this, the CPS translation is refined once more to obtain an uncurried representation of stacks of continuations and handlers. Finally, the translation is made higher-order in order to contract administrative redexes at translation time. The generalised continuation representation is used to construct an abstract machine that provides simultaneous support for deep, shallow, and parameterised effect handlers. The third strand explores the expressiveness of effect handlers.
First, I show that deep, shallow, and parameterised notions of handlers are interdefinable by way of typed macro-expressiveness, which provides a syntactic notion of expressiveness that affirms the existence of encodings between handlers but provides no information about the computational content of the encodings. Second, using a semantic notion of expressiveness, I show that for a class of programs a programming language with first-class control (e.g. effect handlers) admits asymptotically faster implementations than are possible in a language without first-class control.
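
    Python has no native effect handlers, but generators reify enough control to give a rough analogue of a deep handler folding over a computation; this is only an illustrative approximation of the calculus described above, with all names invented:

        def handle_deep(gen, handler):
            """Deep-handler analogue: folds over the whole computation,
            handling every operation the generator performs and resuming
            it with the handled result each time."""
            try:
                op = next(gen)                    # first reified operation
                while True:
                    op = gen.send(handler(op))    # resume with handled result
            except StopIteration as done:
                return done.value                 # the computation's return value

        def program():
            x = yield ("ask", "name")             # perform an effect
            y = yield ("ask", "greeting")
            return f"{y}, {x}!"

        env = {"name": "world", "greeting": "hello"}
        print(handle_deep(program(), lambda op: env[op[1]]))  # hello, world!

    In this analogy, a shallow handler would handle only the first operation and hand back the still-suspended remainder of the computation, while a parameterised handler would thread an extra state value through each resumption.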

    Full stack development toward a trapped ion logical qubit

    Quantum error correction is a key step toward the construction of a large-scale quantum computer, preventing small infidelities in quantum gates from accumulating over the course of an algorithm. Detecting and correcting errors is achieved by using multiple physical qubits to form a smaller number of robust logical qubits. The physical implementation of a logical qubit requires multiple qubits on which high-fidelity gates can be performed. The project aims to realize a logical qubit based on ions confined on a microfabricated surface trap. Each physical qubit will be a microwave dressed-state qubit based on 171Yb+ ions. Gates are intended to be realized through RF and microwave radiation in combination with magnetic field gradients. The project vertically integrates software down to hardware compilation layers in order to deliver, in the near future, a fully functional small device demonstrator. This thesis presents novel results on multiple layers of a full stack quantum computer model. On the hardware level a robust quantum gate is studied and ion displacement over the X-junction geometry is demonstrated. The experimental organization is optimized through automation and compressed waveform data transmission. A new quantum assembly language purely dedicated to trapped ion quantum computers is introduced. The demonstrator is aimed at testing the implementation of quantum error correction codes while preparing for larger scale iterations.
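
    The redundancy idea behind a logical qubit can be illustrated with its classical analogue, the three-bit repetition code under independent bit-flips; real quantum codes (which must also handle phase errors) are substantially more involved, and the error rate below is an invented illustration:

        import random

        def encode(bit):
            return [bit, bit, bit]            # 3-bit repetition "logical" bit

        def noisy(codeword, p=0.1):
            # Flip each physical bit independently with probability p.
            return [b ^ (random.random() < p) for b in codeword]

        def decode(codeword):
            return int(sum(codeword) >= 2)    # majority vote corrects one flip

        trials = 100_000
        fails = sum(decode(noisy(encode(0))) != 0 for _ in range(trials))
        # Expected logical error rate is 3p^2 - 2p^3 ~ 0.028, below the
        # physical rate p = 0.1: redundancy suppresses errors when p < 1/2.
        print(f"logical error rate ~ {fails / trials:.4f}")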

    Machine learning for managing structured and semi-structured data

    As the digitalization of private, commercial, and public sectors advances rapidly, an increasing amount of data is becoming available. In order to gain insights or knowledge from these enormous amounts of raw data, a deep analysis is essential. The immense volume requires highly automated processes with minimal manual interaction. In recent years, machine learning methods have taken on a central role in this task. In addition to the individual data points, their interrelationships often play a decisive role, e.g., whether two patients are related to each other or whether they are treated by the same physician. Hence, relational learning is an important branch of research, which studies how to harness this explicitly available structural information between different data points. Recently, graph neural networks have gained importance. These can be considered an extension of convolutional neural networks from regular grids to general (irregular) graphs. Knowledge graphs play an essential role in representing facts about entities in a machine-readable way. While great efforts are made to store as many facts as possible in these graphs, they often remain incomplete, i.e., true facts are missing. Manual verification and expansion of the graphs is becoming increasingly difficult due to the large volume of data and must therefore be assisted or substituted by automated procedures which predict missing facts. The field of knowledge graph completion can be roughly divided into two categories: Link Prediction and Entity Alignment. In Link Prediction, machine learning models are trained to predict unknown facts between entities based on the known facts. Entity Alignment aims at identifying shared entities between graphs in order to link several such knowledge graphs based on some provided seed alignment pairs. In this thesis, we present important advances in the field of knowledge graph completion. For Entity Alignment, we show how to reduce the number of required seed alignments while maintaining performance by means of novel active learning techniques. We also discuss the power of textual features and show that graph-neural-network-based methods have difficulties with noisy alignment data. For Link Prediction, we demonstrate how to improve the prediction for entities unknown at training time by exploiting additional metadata on individual statements, often available in modern graphs. Supported by results from a large-scale experimental study, we present an analysis of the effect of individual components of machine learning models, e.g., the interaction function or loss criterion, on the task of link prediction. We also introduce a software library that simplifies the implementation and study of such components and makes them accessible to a wide research community, ranging from relational learning researchers to applied fields such as the life sciences. Finally, we propose a novel metric for evaluating ranking results, as used for both completion tasks. It allows for easier interpretation and comparison, especially in cases with different numbers of ranking candidates, as encountered in the de-facto standard evaluation protocols for both tasks.
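
    To illustrate what an "interaction function" means in link prediction, here is a scorer in the style of DistMult, a standard model from the literature, over toy random embeddings; it is only an example of the component being analysed, not the thesis's specific contribution, and all sizes are invented:

        import numpy as np

        rng = np.random.default_rng(0)
        n_entities, n_relations, dim = 1000, 20, 64

        # Randomly initialised embeddings stand in for trained ones.
        E = rng.normal(size=(n_entities, dim))
        R = rng.normal(size=(n_relations, dim))

        def score_tails(h, r):
            """DistMult-style interaction: score(h, r, t) = <e_h * w_r, e_t>.
            Returns a score for every candidate tail entity at once."""
            return (E[h] * R[r]) @ E.T

        scores = score_tails(h=42, r=3)
        ranking = np.argsort(-scores)          # best tail candidates first
        true_tail = 7
        rank = int(np.where(ranking == true_tail)[0][0]) + 1
        print(f"rank of true tail: {rank} of {n_entities}")

    Rank-based figures like this one are exactly what the proposed evaluation metric aggregates across test triples, which is why comparability across different candidate-set sizes matters.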