4,478 research outputs found

    Bayesian inference for challenging scientific models

    Get PDF
    Advances in technology and computation have led to ever more complicated scientific models of phenomena across a wide variety of fields. Many of these models present challenges for Bayesian inference, as a result of computationally intensive likelihoods, high-dimensional parameter spaces or large dataset sizes. In this thesis we show how we can apply developments in probabilistic machine learning and statistics to do inference with examples of these types of models. As a demonstration of an applied inference problem involving a non-trivial likelihood computation, we show how a combination of optimisation and MCMC methods along with careful consideration of priors can be used to infer the parameters of an ODE model of the cardiac action potential. We then consider the problem of pileup, a phenomenon that occurs in astronomy when using CCD detectors to observe bright sources. It complicates the fitting of even simple spectral models by introducing an observation model with a large number of continuous and discrete latent variables that scales with the size of the dataset. We develop an MCMC-based method that can work in the presence of pileup by explicitly marginalising out discrete variables and using adaptive HMC on the remaining continuous variables. We show with synthetic experiments that it allows us to fit spectral models in the presence of pileup without biasing the results. We also compare it to neural Simulation- Based Inference approaches, and find that they perform comparably to the MCMC-based approach whilst being able to scale to larger datasets. As an example of a problem where we wish to do inference with extremely large datasets, we consider the Extreme Deconvolution method. The method fits a probability density to a dataset where each observation has Gaussian noise added with a known sample-specific covariance, originally intended for use with astronomical datasets. The existing fitting method is batch EM, which would not normally be applied to large datasets such as the Gaia catalog containing noisy observations of a billion stars. In this thesis we propose two minibatch variants of extreme deconvolution, based on an online variation of the EM algorithm, and direct gradient-based optimisation of the log-likelihood, both of which can run on GPUs. We demonstrate that these methods provide faster fitting, whilst being able to scale to much larger models for use with larger datasets. We then extend the extreme deconvolution approach to work with non- Gaussian noise, and to use more flexible density estimators such as normalizing flows. Since both adjustments lead to an intractable likelihood, we resort to amortized variational inference in order to fit them. We show that for some datasets that flows can outperform Gaussian mixtures for extreme deconvolution, and that fitting with non-Gaussian noise is now possible

    Resource-aware scheduling for 2D/3D multi-/many-core processor-memory systems

    Get PDF
    This dissertation addresses the complexities of 2D/3D multi-/many-core processor-memory systems, focusing on two key areas: enhancing timing predictability in real-time multi-core processors and optimizing performance within thermal constraints. The integration of an increasing number of transistors into compact chip designs, while boosting computational capacity, presents challenges in resource contention and thermal management. The first part of the thesis improves timing predictability. We enhance shared cache interference analysis for set-associative caches, advancing the calculation of Worst-Case Execution Time (WCET). This development enables accurate assessment of cache interference and the effectiveness of partitioned schedulers in real-world scenarios. We introduce TCPS, a novel task and cache-aware partitioned scheduler that optimizes cache partitioning based on task-specific WCET sensitivity, leading to improved schedulability and predictability. Our research explores various cache and scheduling configurations, providing insights into their performance trade-offs. The second part focuses on thermal management in 2D/3D many-core systems. Recognizing the limitations of Dynamic Voltage and Frequency Scaling (DVFS) in S-NUCA many-core processors, we propose synchronous thread migrations as a thermal management strategy. This approach culminates in the HotPotato scheduler, which balances performance and thermal safety. We also introduce 3D-TTP, a transient temperature-aware power budgeting strategy for 3D-stacked systems, reducing the need for Dynamic Thermal Management (DTM) activation. Finally, we present 3QUTM, a novel method for 3D-stacked systems that combines core DVFS and memory bank Low Power Modes with a learning algorithm, optimizing response times within thermal limits. This research contributes significantly to enhancing performance and thermal management in advanced processor-memory systems

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volum

    Semi-online Scheduling with Lookahead

    Full text link
    The knowledge of future partial information in the form of a lookahead to design efficient online algorithms is a theoretically-efficient and realistic approach to solving computational problems. Design and analysis of semi-online algorithms with extra-piece-of-information (EPI) as a new input parameter has gained the attention of the theoretical computer science community in the last couple of decades. Though competitive analysis is a pessimistic worst-case performance measure to analyze online algorithms, it has immense theoretical value in developing the foundation and advancing the state-of-the-art contributions in online and semi-online scheduling. In this paper, we study and explore the impact of lookahead as an EPI in the context of online scheduling in identical machine frameworks. We introduce a kk-lookahead model and design improved competitive semi-online algorithms. For a 22-identical machine setting, we prove a lower bound of 43\frac{4}{3} and design an optimal algorithm with a matching upper bound of 43\frac{4}{3} on the competitive ratio. For a 33-identical machine setting, we show a lower bound of 1511\frac{15}{11} and design a 1611\frac{16}{11}-competitive improved semi-online algorithm.Comment: 14 pages, 1 figur

    Regulating ChatGPT and other Large Generative AI Models

    Full text link
    Large generative AI models (LGAIMs), such as ChatGPT or Stable Diffusion, are rapidly transforming the way we communicate, illustrate, and create. However, AI regulation, in the EU and beyond, has primarily focused on conventional AI models, not LGAIMs. This paper will situate these new generative models in the current debate on trustworthy AI regulation, and ask how the law can be tailored to their capabilities. After laying technical foundations, the legal part of the paper proceeds in four steps, covering (1) direct regulation, (2) data protection, (3) content moderation, and (4) policy proposals. It suggests a novel terminology to capture the AI value chain in LGAIM settings by differentiating between LGAIM developers, deployers, professional and non-professional users, as well as recipients of LGAIM output. We tailor regulatory duties to these different actors along the value chain and suggest four strategies to ensure that LGAIMs are trustworthy and deployed for the benefit of society at large. Rules in the AI Act and other direct regulation must match the specificities of pre-trained models. In particular, regulation should focus on concrete high-risk applications, and not the pre-trained model itself, and should include (i) obligations regarding transparency and (ii) risk management. Non-discrimination provisions (iii) may, however, apply to LGAIM developers. Lastly, (iv) the core of the DSA content moderation rules should be expanded to cover LGAIMs. This includes notice and action mechanisms, and trusted flaggers. In all areas, regulators and lawmakers need to act fast to keep track with the dynamics of ChatGPT et al.Comment: under revie

    Mapping the Focal Points of WordPress: A Software and Critical Code Analysis

    Get PDF
    Programming languages or code can be examined through numerous analytical lenses. This project is a critical analysis of WordPress, a prevalent web content management system, applying four modes of inquiry. The project draws on theoretical perspectives and areas of study in media, software, platforms, code, language, and power structures. The applied research is based on Critical Code Studies, an interdisciplinary field of study that holds the potential as a theoretical lens and methodological toolkit to understand computational code beyond its function. The project begins with a critical code analysis of WordPress, examining its origins and source code and mapping selected vulnerabilities. An examination of the influence of digital and computational thinking follows this. The work also explores the intersection of code patching and vulnerability management and how code shapes our sense of control, trust, and empathy, ultimately arguing that a rhetorical-cultural lens can be used to better understand code\u27s controlling influence. Recurring themes throughout these analyses and observations are the connections to power and vulnerability in WordPress\u27 code and how cultural, processual, rhetorical, and ethical implications can be expressed through its code, creating a particular worldview. Code\u27s emergent properties help illustrate how human values and practices (e.g., empathy, aesthetics, language, and trust) become encoded in software design and how people perceive the software through its worldview. These connected analyses reveal cultural, processual, and vulnerability focal points and the influence these entanglements have concerning WordPress as code, software, and platform. WordPress is a complex sociotechnical platform worthy of further study, as is the interdisciplinary merging of theoretical perspectives and disciplines to critically examine code. Ultimately, this project helps further enrich the field by introducing focal points in code, examining sociocultural phenomena within the code, and offering techniques to apply critical code methods

    R^3: On-device Real-Time Deep Reinforcement Learning for Autonomous Robotics

    Full text link
    Autonomous robotic systems, like autonomous vehicles and robotic search and rescue, require efficient on-device training for continuous adaptation of Deep Reinforcement Learning (DRL) models in dynamic environments. This research is fundamentally motivated by the need to understand and address the challenges of on-device real-time DRL, which involves balancing timing and algorithm performance under memory constraints, as exposed through our extensive empirical studies. This intricate balance requires co-optimizing two pivotal parameters of DRL training -- batch size and replay buffer size. Configuring these parameters significantly affects timing and algorithm performance, while both (unfortunately) require substantial memory allocation to achieve near-optimal performance. This paper presents R^3, a holistic solution for managing timing, memory, and algorithm performance in on-device real-time DRL training. R^3 employs (i) a deadline-driven feedback loop with dynamic batch sizing for optimizing timing, (ii) efficient memory management to reduce memory footprint and allow larger replay buffer sizes, and (iii) a runtime coordinator guided by heuristic analysis and a runtime profiler for dynamically adjusting memory resource reservations. These components collaboratively tackle the trade-offs in on-device DRL training, improving timing and algorithm performance while minimizing the risk of out-of-memory (OOM) errors. We implemented and evaluated R^3 extensively across various DRL frameworks and benchmarks on three hardware platforms commonly adopted by autonomous robotic systems. Additionally, we integrate R^3 with a popular realistic autonomous car simulator to demonstrate its real-world applicability. Evaluation results show that R^3 achieves efficacy across diverse platforms, ensuring consistent latency performance and timing predictability with minimal overhead.Comment: Accepted by RTSS 202

    Efficient Model Checking: The Power of Randomness

    Get PDF

    NCC: Natural Concurrency Control for Strictly Serializable Datastores by Avoiding the Timestamp-Inversion Pitfall

    Full text link
    Strictly serializable datastores greatly simplify the development of correct applications by providing strong consistency guarantees. However, existing techniques pay unnecessary costs for naturally consistent transactions, which arrive at servers in an order that is already strictly serializable. We find these transactions are prevalent in datacenter workloads. We exploit this natural arrival order by executing transaction requests with minimal costs while optimistically assuming they are naturally consistent, and then leverage a timestamp-based technique to efficiently verify if the execution is indeed consistent. In the process of designing such a timestamp-based technique, we identify a fundamental pitfall in relying on timestamps to provide strict serializability, and name it the timestamp-inversion pitfall. We find timestamp-inversion has affected several existing works. We present Natural Concurrency Control (NCC), a new concurrency control technique that guarantees strict serializability and ensures minimal costs -- i.e., one-round latency, lock-free, and non-blocking execution -- in the best (and common) case by leveraging natural consistency. NCC is enabled by three key components: non-blocking execution, decoupled response control, and timestamp-based consistency check. NCC avoids timestamp-inversion with a new technique: response timing control, and proposes two optimization techniques, asynchrony-aware timestamps and smart retry, to reduce false aborts. Moreover, NCC designs a specialized protocol for read-only transactions, which is the first to achieve the optimal best-case performance while ensuring strict serializability, without relying on synchronized clocks. Our evaluation shows that NCC outperforms state-of-the-art solutions by an order of magnitude on many workloads
    • …
    corecore