2,280 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume


    Guided rewriting and constraint satisfaction for parallel GPU code generation

    Graphics Processing Units (GPUs) are notoriously hard to optimise for manually due to their scheduling and memory hierarchies. What is needed are good automatic code generators and optimisers for such parallel hardware. Functional approaches such as Accelerate, Futhark and LIFT leverage a high-level algorithmic Intermediate Representation (IR) to expose parallelism and abstract the implementation details away from the user. However, producing efficient code for a given accelerator remains challenging. Existing code generators either depend on user input to choose from a set of hard-coded optimisations, or rely on automated exploration of the implementation search space. The former lacks extensibility, while the latter is too costly due to the size of the search space. A hybrid approach is needed, where a space of valid implementations is built automatically and explored with the aid of human expertise. This thesis presents a solution combining user-guided rewriting and automatically generated constraints to produce high-performance code. The first contribution is an automatic tuning technique to find a balance between performance and memory consumption. Leveraging its functional patterns, the LIFT compiler is empowered to infer tuning constraints and limit the search to valid tuning combinations only. Next, the thesis reframes parallelisation as a constraint satisfaction problem. Parallelisation constraints are extracted automatically from the input expression, and a solver is used to identify valid rewritings. The constraints truncate the search space to valid parallel mappings only by capturing the scheduling restrictions of the GPU in the context of a given program. A synchronisation barrier insertion technique is proposed to prevent data races and improve the efficiency of the generated parallel mappings. The final contribution of this thesis is the guided rewriting method, where the user encodes a design space of structural transformations using high-level IR nodes called rewrite points. These strongly typed pragmas express macro rewrites and expose design choices as explorable parameters. The thesis proposes a small set of reusable rewrite points to achieve tiling, cache locality, data reuse and memory optimisation. A comparison with the vendor-provided handwritten kernels of the ARM Compute Library and the TVM code generator demonstrates the effectiveness of this thesis' contributions. With convolution as a use case, LIFT-generated direct and GEMM-based convolution implementations are shown to perform on par with the state-of-the-art solutions on a mobile GPU. Overall, this thesis demonstrates that a functional IR lends itself well to user-guided and automatic rewriting for high-performance code generation.
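    To make the constraint-satisfaction framing more concrete, the following is a minimal, hypothetical Python sketch: it enumerates assignments of nested maps to GPU execution levels and keeps only those satisfying toy scheduling constraints. The level names and rules are illustrative assumptions, not the LIFT compiler's actual mapping primitives or constraint solver.

```python
"""Toy sketch: choosing parallel mappings for nested maps as a constraint
satisfaction problem. All names (levels, constraints) are hypothetical
illustrations, not the LIFT compiler's actual rules."""

from itertools import product

# Hardware levels a `map` could be assigned to, ordered from coarse to fine.
LEVELS = ["mapWorkgroup", "mapLocal", "mapSeq"]
LEVEL_RANK = {lvl: i for i, lvl in enumerate(LEVELS)}

def valid(assignment):
    """Constraints on a tuple of levels, one per map-nesting depth."""
    # 1. Each parallel level may be used at most once per nest.
    for lvl in ("mapWorkgroup", "mapLocal"):
        if assignment.count(lvl) > 1:
            return False
    # 2. An outer map must not be mapped to a finer level than an inner map
    #    (e.g. mapLocal enclosing mapWorkgroup is invalid).
    ranks = [LEVEL_RANK[lvl] for lvl in assignment]
    return all(outer <= inner for outer, inner in zip(ranks, ranks[1:]))

def enumerate_mappings(depth):
    """Brute-force the space of valid parallel mappings for `depth` nested maps."""
    return [a for a in product(LEVELS, repeat=depth) if valid(a)]

if __name__ == "__main__":
    for mapping in enumerate_mappings(3):
        print(mapping)
```

    A real system would hand such constraints to a solver instead of brute-forcing them, but the sketch shows how scheduling restrictions prune the space of parallel mappings before any code is generated.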

    Experience matters: women's experience of care during facility-based childbirth. A mixed-methods study on postpartum outcomes

    Background: The poor treatment women receive during facility-based childbirth is an escalating global issue with potentially adverse postnatal consequences. My thesis aims to enhance understanding of these consequences, with a focus on postnatal care-seeking behaviour, maternal mental health and breastfeeding patterns in Tucumán, Argentina.
    Objective: I sought to investigate the impact of mistreatment during childbirth (MDC) on postnatal outcomes and explore the influence of individual, interpersonal and societal factors on this relationship.
    Methods: Employing a pragmatic epistemological framework, I adopted a mixed-methods approach. First, a systematic review of existing literature on mistreatment and its postnatal effects provided a comprehensive foundation for my research. Subsequently, I conducted semi-structured interviews and focus group discussions with women from an underserved community in Tucumán to gain qualitative insights. To complement this, I carried out a prospective cohort study with women who delivered in a public maternity hospital. Data analysis involved using the capability, opportunity, motivation, and behaviour (COM-B) model, directed acyclic graphs, and factor analysis to examine behavioural impacts, association pathways, and the operationalisation of MDC. Multivariable models were applied to measure the association between MDC and postnatal outcomes.
    Results: The study revealed that MDC should not be operationalised as a single construct, as women perceive breaches of quality of care differently from direct physical or verbal abuse. Health literacy, social support and self-esteem were identified as psychosocial confounders in the relationship between mistreatment and postnatal outcomes. Only 26% of women in the cohort study in Tucumán accessed postnatal care, with incidences of postpartum depression and anxiety of 67% and 21%, respectively. No statistically significant association was found between MDC and care-seeking behaviour, although a possible trend emerged suggesting that women experiencing physical or verbal MDC could be more likely to seek care than those who were not mistreated.
    Conclusion: Several exploratory hypotheses are presented to explain the trend suggesting that women who are verbally or physically mistreated are more prone to seek care after birth. Additionally, three concrete contributions emerged from this work: 1) the need to differentiate the conceptualisation of MDC from its operationalisation when assessing postnatal effects; 2) the importance of integrating psychosocial factors into the theory of change when designing effective interventions; and 3) the urgency of enhancing postnatal care access to improve maternal and newborn health outcomes, regardless of women's childbirth experiences.

    Tools for efficient Deep Learning

    In the era of Deep Learning (DL), there is a fast-growing demand for building and deploying Deep Neural Networks (DNNs) on various platforms. This thesis proposes five tools to address the challenges of designing DNNs that are efficient in time, resources and power consumption. We first present Aegis and SPGC to address the challenges in improving the memory efficiency of DL training and inference. Aegis makes mixed precision training (MPT) stabler through layer-wise gradient scaling. Empirical experiments show that Aegis can improve MPT accuracy by up to 4%. SPGC focuses on structured pruning: replacing standard convolution with group convolution (GConv) to avoid irregular sparsity. SPGC formulates GConv pruning as a channel permutation problem and proposes a novel heuristic polynomial-time algorithm. Common DNNs pruned by SPGC achieve up to 1% higher accuracy than prior work. This thesis also addresses the gap between DNN descriptions and executables, with Polygeist for software and POLSCA for hardware. Many novel techniques, e.g. statement splitting and memory partitioning, are explored and used to expand polyhedral optimisation. Polygeist speeds up sequential and parallel software execution by 2.53 and 9.47 times, respectively, on Polybench/C. POLSCA achieves a 1.5 times speedup over hardware designs directly generated from high-level synthesis on Polybench/C. Moreover, this thesis presents Deacon, a framework that generates FPGA-based DNN accelerators with streaming architectures and advanced pipelining techniques to address the challenges posed by heterogeneous convolutions and residual connections. Deacon provides fine-grained pipelining, graph-level optimisation, and heuristic exploration by graph colouring. Compared with prior designs, Deacon improves resource/power consumption efficiency by 1.2x/3.5x for MobileNets and 1.0x/2.8x for SqueezeNets. All these tools are open source, and some have already gained public engagement. We believe they can make efficient deep learning applications easier to build and deploy.
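    To illustrate the idea of formulating group-convolution pruning as a channel permutation problem, here is a minimal, hypothetical Python sketch: a greedy heuristic permutes input channels so that a block-diagonal (grouped) structure keeps as much weight magnitude as possible. This greedy procedure is only an illustration of the idea; it is not the SPGC algorithm itself.

```python
"""Toy sketch of structured pruning into group convolution via channel
permutation. Illustrative only; not the SPGC heuristic described above."""

import numpy as np

def group_by_permutation(weight, groups):
    """Permute input channels of a (out_ch, in_ch) weight matrix so that the
    block-diagonal structure of a group convolution keeps as much weight
    magnitude as possible. Returns (permutation, kept_fraction)."""
    out_ch, in_ch = weight.shape
    assert out_ch % groups == 0 and in_ch % groups == 0
    out_per_g, in_per_g = out_ch // groups, in_ch // groups

    saliency = np.abs(weight)            # importance of each connection
    perm, remaining = [], list(range(in_ch))
    for g in range(groups):
        rows = slice(g * out_per_g, (g + 1) * out_per_g)
        # Greedily pick the input channels most useful to this output group.
        scores = saliency[rows][:, remaining].sum(axis=0)
        best = np.argsort(scores)[::-1][:in_per_g]
        chosen = [remaining[i] for i in best]
        perm.extend(chosen)
        remaining = [c for c in remaining if c not in chosen]

    permuted = saliency[:, perm]
    kept = sum(permuted[g * out_per_g:(g + 1) * out_per_g,
                        g * in_per_g:(g + 1) * in_per_g].sum()
               for g in range(groups))
    return perm, kept / saliency.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((8, 8))
    perm, kept = group_by_permutation(w, groups=2)
    print("input-channel permutation:", perm, "kept |W| fraction: %.2f" % kept)
```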

    A survey of Bayesian Network structure learning


    Discovering Causal Relations and Equations from Data

    Physics is a field of science that has traditionally used the scientific method to answer questions about why natural phenomena occur and to make testable models that explain the phenomena. Discovering equations, laws and principles that are invariant, robust and causal explanations of the world has been fundamental in physical sciences throughout the centuries. Discoveries emerge from observing the world and, when possible, performing interventional studies in the system under study. With the advent of big data and the use of data-driven methods, causal and equation discovery fields have grown and made progress in computer science, physics, statistics, philosophy, and many applied fields. All these domains are intertwined and can be used to discover causal relations, physical laws, and equations from observational data. This paper reviews the concepts, methods, and relevant works on causal and equation discovery in the broad field of Physics and outlines the most important challenges and promising future lines of research. We also provide a taxonomy for observational causal and equation discovery, point out connections, and showcase a complete set of case studies in Earth and climate sciences, fluid dynamics and mechanics, and the neurosciences. This review demonstrates that discovering fundamental laws and causal relations by observing natural phenomena is being revolutionised with the efficient exploitation of observational data, modern machine learning algorithms and the interaction with domain knowledge. Exciting times are ahead with many challenges and opportunities to improve our understanding of complex systems.
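    As a generic illustration of data-driven equation discovery (not any particular method from the review), the following Python sketch recovers the terms of a simple dynamical law by sparse regression over a library of candidate functions; the synthetic law, library and threshold are assumptions chosen for the example.

```python
"""Toy sketch of data-driven equation discovery: recover the terms of a
simple dynamical law by sparse regression over a candidate library."""

import numpy as np

# Synthetic data from the (known) law  dx/dt = -2*x + 0.5*x**3
rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, size=400)
dxdt = -2.0 * x + 0.5 * x**3 + 0.01 * rng.standard_normal(x.size)

# Library of candidate terms the "discoverer" may combine.
library = {"1": np.ones_like(x), "x": x, "x^2": x**2, "x^3": x**3, "sin(x)": np.sin(x)}
names = list(library)
Theta = np.column_stack([library[n] for n in names])

# Sequential thresholded least squares: fit, zero out small coefficients, refit.
coef = np.linalg.lstsq(Theta, dxdt, rcond=None)[0]
for _ in range(10):
    small = np.abs(coef) < 0.1
    coef[small] = 0.0
    active = ~small
    if active.any():
        coef[active] = np.linalg.lstsq(Theta[:, active], dxdt, rcond=None)[0]

discovered = " + ".join(f"{c:.2f}*{n}" for c, n in zip(coef, names) if c != 0.0)
print("discovered: dx/dt =", discovered)   # expected to be close to -2.00*x + 0.50*x^3
```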

    Adapting distributed embedded applications during operation

    The availability of third-party apps is among the key success factors for software ecosystems: users benefit from more features and innovation speed, while third-party solution vendors can leverage the platform to create successful offerings. However, this requires a decoupling of the different parties' engineering activities that has not yet been achieved for distributed control systems. While late and dynamic integration of third-party components would be required, the resulting control systems must provide high reliability regarding real-time requirements, which leads to integration complexity. Closing this gap would particularly contribute to the vision of software-defined manufacturing, where an ecosystem of modern IT-based control system components could lead to faster innovation due to their higher level of abstraction and the availability of various frameworks. Therefore, this thesis addresses the research question: How can we use modern IT technologies to enable independent evolution and easy third-party integration of software components in distributed control systems where deterministic end-to-end reactivity is required, and, in particular, how can we apply distributed changes to such systems consistently and reactively during operation? This thesis describes the challenges and related approaches in detail and points out that existing approaches do not fully address our research question. To tackle this gap, a formal specification of a runtime platform concept is presented in conjunction with a model-based engineering approach. The engineering approach decouples the engineering steps of component definition, integration, and deployment. The runtime platform supports this approach by isolating the components while still offering predictable end-to-end real-time behavior. Independent evolution of software components is supported through a concept for synchronous reconfiguration during full operation, i.e., dynamic orchestration of components. Time-critical state transfer is also supported and incurs at most bounded quality degradation. Reconfiguration planning is supported by analysis concepts, including simulation of a formally specified system and its reconfiguration, and analysis of potential quality degradation with the evolving dataflow graph (EDFG) method. A platform-specific realization of the concepts, the real-time container architecture, is described as a reference implementation. The feasibility and applicability of the concepts are evaluated with the model and the prototype in two case studies. The first case study is a minimalistic distributed control system used in different setups with different component variants and reconfiguration plans to compare the model and the prototype and to gather runtime statistics. The second case study is a smart factory showcase system with more challenging application components and interface technologies. The conclusion is that the concepts are feasible and applicable, even though the concepts and the prototype still need further work, for example to reach shorter cycle times.
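    To illustrate the idea of synchronous reconfiguration during operation, here is a deliberately minimal, hypothetical Python sketch: a component is swapped only at a cycle boundary and its state is handed over before the next cycle starts. It is a toy illustration of the concept, not the formally specified platform or the real-time container architecture.

```python
"""Toy sketch of synchronous reconfiguration at a cycle boundary with
state transfer. Purely illustrative; component names are invented."""

class Counter:
    """Old component version: counts processed inputs and passes them through."""
    def __init__(self, state=0):
        self.count = state
    def step(self, value):
        self.count += 1
        return value
    def export_state(self):
        return self.count

class ScalingCounter(Counter):
    """New component version: keeps the same state, scales its output."""
    def step(self, value):
        self.count += 1
        return 2 * value

def run(cycles, reconfigure_at):
    component = Counter()
    for cycle in range(cycles):
        if cycle == reconfigure_at:
            # Synchronous reconfiguration: swap only at the cycle boundary,
            # transferring the old component's state to the new one.
            component = ScalingCounter(state=component.export_state())
        out = component.step(cycle)
        print(f"cycle {cycle}: output={out}, processed={component.count}")

if __name__ == "__main__":
    run(cycles=6, reconfigure_at=3)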

    Blockchain-Coordinated Frameworks for Scalable and Secure Supply Chain Networks

    Supply chains have progressed through time from being limited to a few regional traders to becoming complicated business networks. As a result, supply chain management systems now rely significantly on the digital revolution for the privacy and security of data. Due to key qualities of blockchain, such as transparency, immutability and decentralization, it has recently gained a lot of interest as a way to solve security, privacy and scalability problems in supply chains. However, conventional blockchains are not appropriate for supply chain ecosystems because they are computationally costly, have limited potential to scale and fail to provide trust. Consequently, due to limited trust and coordination, supply chains tend to fail to foster trust among the network's participants. Assuring data privacy in a supply chain ecosystem is another challenge. If information is shared with a large number of participants without establishing data privacy, access control risks arise in the network. Protecting data privacy is a concern when sending corporate data, including locations, manufacturing supplies and demand information. The third challenge in supply chain management is scalability, which continues to be a significant barrier to adoption: the number of transactions in a supply chain tends to grow along with the number of nodes in the network, so scalability is essential for blockchain adoption in supply chain networks. This thesis seeks to address the challenges of privacy, scalability and trust by providing frameworks for how to effectively combine blockchains with supply chains. This thesis makes four novel contributions. It first develops a blockchain-based framework with an Attribute-Based Access Control (ABAC) model to assure data privacy, adopting a distributed design that enables fine-grained, dynamic access control management for supply chain management. To solve the data privacy challenge, AccessChain is developed. The proposed AccessChain model has two types of ledgers in the system: local and global. Local ledgers are used to store business contracts between stakeholders and the ABAC model management, whereas the global ledger is used to record transaction data. AccessChain can enable decentralized, fine-grained and dynamic access control management in SCM when combined with the ABAC model and blockchain technology (BCT). The framework enables a systematic approach that benefits the supply chain, and the experiments yield convincing results. Furthermore, the results of performance monitoring show that AccessChain's response time with four local ledgers is acceptable, and therefore it provides significantly greater scalability. Next, a framework for reducing the bullwhip effect (BWE) in SCM is proposed. The framework also focuses on combining data visibility with trust. BWE is first observed in the supply chain, and a blockchain architecture design is then used to minimize it. Full sharing of demand data has been shown to help improve the robustness of overall performance in a multi-echelon SC environment, especially for BWE mitigation and cumulative cost reduction. It is observed that, when it comes to providing access to data, information sharing using a blockchain has some obvious benefits in a supply chain. Furthermore, when data sharing is distributed, parties in the supply chain will have fair access to other parties' data, even if they are farther downstream. Sharing customer demand data is important in a supply chain to enhance decision-making, reduce costs and improve the end product. This work also explores the ability of BCT, as a distributed ledger approach, to create a trust-enhanced environment in which stakeholders can share their information effectively. To provide visibility and coordination along with a blockchain consensus process, a new consensus algorithm, namely Reputation-based Proof-of-Cooperation (RPoC), is proposed for blockchain-based SCM; it does not require validators to solve a mathematical puzzle before storing a new block. The RPoC algorithm is an efficient and scalable consensus algorithm that selects consensus nodes dynamically and permits a large number of nodes to participate in the consensus process. The algorithm decreases the workload on individual nodes while increasing consensus performance by allocating the transaction verification process to specific nodes. Through extensive theoretical analysis and experimentation, the suitability of the proposed algorithm is demonstrated in terms of scalability and efficiency. The thesis concludes with a blockchain-enabled framework that addresses the issue of preserving privacy and security for an open-bid auction system. This work implements a bid management system in a private BC environment to provide a secure bidding scheme. The novelty of this framework derives from an enhanced approach to integrating BC structures by replacing the original chain structure with a tree structure. Throughout the online world, user privacy is a primary concern, because the electronic environment enables the collection of personal data. Hence a suitable cryptographic protocol for an open-bid auction atop BC is proposed. Here the primary aim is to achieve security and privacy with greater efficiency, which largely depends on the effectiveness of the encryption algorithms used by the BC. Essentially this work considers Elliptic Curve Cryptography (ECC) and a dynamic cryptographic accumulator encryption algorithm to enhance security between auctioneer and bidder. The proposed e-bidding scheme and the findings from this study should foster the further growth of BC strategies.
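    As a concrete illustration of the ABAC idea underlying AccessChain, the following is a minimal, hypothetical Python sketch of a deny-by-default attribute-based access decision. The policy format, attribute names and roles are invented for illustration; they are not the thesis's actual data model or ledger logic.

```python
"""Toy sketch of an attribute-based access control (ABAC) decision.
Hypothetical policy format; not AccessChain's implementation."""

POLICIES = [
    {   # Retailers may read demand data during business hours.
        "effect": "permit",
        "subject": {"role": "retailer"},
        "resource": {"type": "demand_data"},
        "action": "read",
        "environment": {"business_hours": True},
    },
]

def matches(required, actual):
    """Every required attribute must be present with the same value."""
    return all(actual.get(k) == v for k, v in required.items())

def decide(subject, resource, action, environment):
    """Deny-by-default ABAC evaluation over the policy list."""
    for p in POLICIES:
        if (p["action"] == action
                and matches(p["subject"], subject)
                and matches(p["resource"], resource)
                and matches(p["environment"], environment)):
            return p["effect"]
    return "deny"

if __name__ == "__main__":
    print(decide({"role": "retailer"}, {"type": "demand_data"}, "read",
                 {"business_hours": True}))   # permit
    print(decide({"role": "carrier"}, {"type": "demand_data"}, "read",
                 {"business_hours": True}))   # deny
```

    In a blockchain setting, such policies would live on the local ledgers and the decision logic would run as part of the access control management, so that fine-grained rules can change without touching the global transaction ledger.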

    Machine learning in portfolio management

    Financial markets are difficult learning environments. The data generation process is time-varying, returns exhibit heavy tails and the signal-to-noise ratio tends to be low. These factors contribute to the challenge of applying sophisticated, high-capacity learning models in financial markets. Driven by recent advances in deep learning in other fields, we focus on applying deep learning in a portfolio management context. This thesis contains three distinct but related contributions to the literature. First, we consider the problem of neural network training in a time-varying context. This results in a neural network that can adapt to a data generation process that changes over time. Second, we consider the problem of learning in noisy environments. We propose to regularise the neural network using a supervised autoencoder and show that this improves the generalisation performance of the neural network. Third, we consider the problem of quantifying forecast uncertainty in time series with volatility clustering. We propose a unified framework for the quantification of forecast uncertainty that results in uncertainty estimates that closely match actual realised forecast errors in cryptocurrencies and U.S. stocks.
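    To make the supervised-autoencoder regularisation concrete, below is a minimal PyTorch sketch in which one encoder feeds both a reconstruction head and a return-prediction head, and the two losses are combined. The layer sizes, loss weight and toy data are illustrative assumptions, not the architecture or data used in the thesis.

```python
"""Minimal sketch of regularising a forecasting network with a supervised
autoencoder: reconstruction loss plus prediction loss share one encoder."""

import torch
import torch.nn as nn

class SupervisedAutoencoder(nn.Module):
    def __init__(self, n_features, n_latent=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, n_latent), nn.ReLU())
        self.decoder = nn.Linear(n_latent, n_features)   # reconstruct inputs
        self.predictor = nn.Linear(n_latent, 1)          # forecast next return
    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.predictor(z)

def training_step(model, optimiser, x, y, recon_weight=0.5):
    optimiser.zero_grad()
    x_hat, y_hat = model(x)
    # Supervision plus reconstruction acts as a regulariser in noisy markets.
    loss = nn.functional.mse_loss(y_hat.squeeze(-1), y) \
         + recon_weight * nn.functional.mse_loss(x_hat, x)
    loss.backward()
    optimiser.step()
    return loss.item()

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(256, 20)                       # toy feature matrix
    y = x[:, 0] * 0.1 + 0.01 * torch.randn(256)    # toy next-period returns
    model = SupervisedAutoencoder(n_features=20)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(5):
        print("loss:", training_step(model, opt, x, y))
```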