312 research outputs found

    TANDEM: taming failures in next-generation datacenters with emerging memory

    The explosive growth of online services, leading to unforeseen scales, has made modern datacenters highly prone to failures. Taming these failures hinges on fast and correct recovery that minimizes service interruptions. To enable recovery, applications must take additional measures during failure-free execution to maintain a recoverable state of their data and computation logic. However, these precautionary measures have severe implications for performance, correctness, and programmability, making recovery incredibly challenging to realize in practice. Emerging memory, particularly non-volatile memory (NVM) and disaggregated memory (DM), offers a promising opportunity to achieve fast recovery with maximum performance. However, incorporating these technologies into datacenter architecture presents significant challenges: their architectural attributes, which differ significantly from those of traditional memory devices, introduce new semantic challenges for implementing recovery, complicating correctness and programmability. Can emerging memory enable fast, performant, and correct recovery in the datacenter? This thesis aims to answer this question while addressing the associated challenges. When architecting datacenters with emerging memory, system architects face four key challenges: (1) how to guarantee correct semantics; (2) how to efficiently enforce correctness with optimal performance; (3) how to validate end-to-end correctness including recovery; and (4) how to preserve programmer productivity (programmability). This thesis addresses these challenges through the following approaches: (a) defining precise consistency models that formally specify correct end-to-end semantics in the presence of failures (consistency models also play a crucial role in programmability); (b) developing new low-level mechanisms to efficiently enforce the prescribed models given the capabilities of emerging memory; and (c) creating robust testing frameworks to validate end-to-end correctness and recovery. We start our exploration with non-volatile memory (NVM), which offers fast persistence capabilities directly accessible through the processor's load-store (memory) interface. Notably, these capabilities can be leveraged to enable fast recovery for Log-Free Data Structures (LFDs) while maximizing performance. However, due to the complexity of modern cache hierarchies, data rarely persists in any specific order, jeopardizing recovery and correctness. Therefore, recovery needs primitives that explicitly control the order of updates to NVM, known as persistency models. We outline the precise specification of a novel persistency model, Release Persistency (RP), that provides a consistency guarantee for LFDs on what remains in non-volatile memory upon failure. To efficiently enforce RP, we propose a novel microarchitectural mechanism, lazy release persistence (LRP). Using standard LFD benchmarks, we show that LRP achieves fast recovery while incurring minimal performance overhead. We continue our discussion with memory disaggregation, which decouples memory from traditional monolithic servers and offers a promising pathway to very high availability in replicated in-memory data stores. Achieving such availability hinges on transaction protocols that can efficiently handle recovery in this setting, where compute and memory are independent. However, there is a challenge: disaggregated memory (DM) does not work with RPC-style protocols, mandating one-sided transaction protocols. Exacerbating the problem, one-sided transactions expose critical low-level ordering decisions to architects, posing a threat to correctness. We present a highly available transaction protocol, Pandora, that is specifically designed to achieve fast recovery in disaggregated key-value stores (DKVSes). Pandora is the first one-sided transactional protocol that ensures correct, non-blocking, and fast recovery in DKVSes. Experiments on our implementation demonstrate that Pandora achieves fast recovery and high availability while causing minimal disruption to services. Finally, we introduce a novel targeted litmus-testing framework, DART, to validate the end-to-end correctness of transactional protocols with recovery. Using DART's targeted testing capabilities, we found several critical bugs in Pandora, highlighting the need for robust end-to-end testing methods in the design loop to iteratively fix correctness bugs. Crucially, DART is lightweight and black-box, requiring no intervention from programmers.
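    The ordering problem described above can be illustrated with a small toy model in Python. This is purely an illustrative sketch of why unordered persistence breaks a log-free data structure and how release-style ordering restores a consistent recoverable state; it is not the thesis's Release Persistency specification or the LRP hardware mechanism, and the ToyNVM class and its methods are hypothetical names introduced only for this example.

```python
# Toy model of why log-free data structures need a persistency model.
# Illustrative only: not the thesis's Release Persistency specification
# or the LRP microarchitecture; all names below are hypothetical.
import random

class ToyNVM:
    """Stores are buffered in a volatile cache; a crash persists an
    arbitrary subset of buffered writes unless ordering is enforced."""
    def __init__(self):
        self.persistent = {}   # contents that survive a crash
        self.buffered = []     # (addr, value) writes not yet persisted

    def store(self, addr, value):
        self.buffered.append((addr, value))

    def store_release(self, addr, value):
        # Release-style ordering: everything written before the release
        # reaches NVM before the release itself can become persistent.
        for a, v in self.buffered:
            self.persistent[a] = v
        self.buffered = [(addr, value)]

    def crash(self):
        # Without ordering, any subset of buffered writes may have landed.
        for a, v in self.buffered:
            if random.random() < 0.5:
                self.persistent[a] = v
        self.buffered = []
        return dict(self.persistent)

def publish_node(nvm, ordered):
    nvm.store("node.payload", 42)                 # initialize the node
    if ordered:
        nvm.store_release("list.head", "node")    # publish after payload persists
    else:
        nvm.store("list.head", "node")            # unordered publish
    return nvm.crash()

random.seed(1)
for ordered in (False, True):
    broken = 0
    for _ in range(1000):
        state = publish_node(ToyNVM(), ordered)
        if state.get("list.head") == "node" and "node.payload" not in state:
            broken += 1                           # reachable but uninitialized
    print(f"ordered={ordered}: {broken}/1000 crashes left an inconsistent structure")
```

    With ordered publication, no simulated recovery ever observes the published node without its payload; without ordering, a fraction of crashes do, which is precisely the kind of state a recovery procedure cannot repair.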

    Configuration Management of Distributed Systems over Unreliable and Hostile Networks

    Economic incentives from large criminal profits and the threat of legal consequences have pushed criminals to continuously improve their malware, especially its command and control channels. This thesis applied concepts from successful malware command and control to explore the survivability and resilience of benign configuration management systems. The work expands on existing stage models of the malware life cycle to contribute a new model for identifying malware concepts applicable to benign configuration management. The Hidden Master architecture is a contribution to master-agent network communication: communication between master and agent is asynchronous and can operate through intermediate nodes. This protects the master secret key, which gives full control of all computers participating in configuration management. Multiple improvements to idempotent configuration were proposed, including the definition of a minimal base resource dependency model, simplified resource revalidation, and the use of an imperative general-purpose language for defining idempotent configuration. Following the constructive research approach, the improvements to configuration management were designed into two prototypes, which allowed validation in laboratory testing, in two case studies, and in expert interviews. In laboratory testing, the Hidden Master prototype was more resilient than leading configuration management tools under high load and low memory, and against packet loss and corruption. Only the research prototype was adaptable to a network without a stable topology, owing to the asynchronous nature of the Hidden Master architecture. The main case study used the research prototype in a complex environment to deploy a multi-room, authenticated audiovisual system for a client of an organization deploying the configuration. The case studies indicated that an imperative general-purpose language can be used for idempotent configuration in real life, both for defining new configurations in unexpected situations using the base resources and for abstracting those using standard language features, and that such a system seems easy to learn. Potential business benefits were identified and evaluated using individual semi-structured expert interviews. Respondents agreed that the models and the Hidden Master architecture could reduce costs and risks, improve developer productivity, and allow faster time-to-market. Protection of master secret keys and the reduced need for incident response were seen as key drivers of improved security. Low-cost geographic scaling and leveraging the file-serving capabilities of commodity servers were seen to improve scaling and resiliency. Respondents identified jurisdictional legal limitations on encryption and requirements for cloud operator auditing as factors potentially limiting the full use of some concepts.
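    The idea of expressing idempotent configuration in an imperative general-purpose language can be sketched with a minimal, hypothetical resource in Python. This is our own illustration, not the thesis's prototype; the ensure_line function and the example file path are made up. Applying the resource converges the system to the desired state, and applying it again changes nothing.

```python
# Minimal sketch of an idempotent "line in file" resource written in an
# imperative general-purpose language. Hypothetical example, not the
# thesis's prototype or its base resource model.
from pathlib import Path

def ensure_line(path: str, line: str) -> bool:
    """Ensure `line` is present in `path`. Returns True if a change was made;
    running it again makes no further changes (idempotence)."""
    p = Path(path)
    current = p.read_text().splitlines() if p.exists() else []
    if line in current:
        return False                       # desired state already holds
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text("\n".join(current + [line]) + "\n")
    return True

if __name__ == "__main__":
    first = ensure_line("/tmp/demo.conf", "PermitRootLogin no")
    second = ensure_line("/tmp/demo.conf", "PermitRootLogin no")
    print(first, second)                   # True False: second run is a no-op
```

    Because the resource checks the current state before acting, it can be re-applied after partial failures or over unreliable channels, which is what makes it suitable for the asynchronous delivery assumed by a Hidden Master style architecture.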

    Infrastructure management in multicloud environments

    With the increasing number of cloud service providers and data centres around the world, cloud service users are becoming increasingly concerned about where their data is stored and who has access to it. The legal reach of a customer's country does not extend beyond its borders without special agreements that can take a long time to obtain. Because it is safer for a cloud service customer to use a cloud service provider that is domestically legally accountable, customers are moving to such providers. For the case company this causes both a technical and a managerial problem. The technical problem is how to manage cloud environments when the business expands to multiple countries whose customers require that data is stored within their own country. Different cloud service providers can also be heterogeneous in their infrastructure management features, which makes managing and developing the infrastructure even more difficult; for example, the application programming interfaces (APIs) that make automation easier can vary between providers. From a management point of view, different time zones also make it harder to respond quickly to issues in the IT infrastructure when the case company's employees all work in the same time zone. The objective of this thesis is to investigate which tools and functionalities are commonly used to automate IT infrastructure, are supported by cloud service providers, and are compatible with the specific requirements of the organization in question. The research will help the case organization replace and add tools for maintaining the IT infrastructure. This thesis does not investigate the managerial problem of case company employees working in the same time zone. It also does not cover security, version control, desktop and laptop management, or log collection tools, and it does not produce a code-based solution for setting up an IT environment, since further research is needed after the tools presented in this thesis have been decided upon. The research also does not examine every cloud service provider in every country, as the case company's business strategies can change and the scope of the thesis would grow too large. A qualitative research method is used, and the data comes from literature and articles from various sources. The literature and article review provided the theoretical basis of this research. Data was also gathered by examining a few countries with companies whose business is cloud service provision and comparing the findings regarding infrastructure management and automation. The research is divided into five parts. The first part introduces the background, research objective, and structure of the research, while the second part explains the theoretical background. The third part describes the research methodology, what material was used, how it was gathered, and the results; the fourth part analyses the results; and the fifth and final part concludes the research.

    Lazy Merging: From a Potential of Universes to a Universe of Potentials

    Current collaboration workflows force participants to resolve conflicts eagerly, despite having insufficient knowledge and not being aware of their collaborators' intentions. This is a major cause of bad decisions because it can disregard opinions within the team and cover up disagreements. In our concept of lazy merging, we propose to aggregate conflicts as variant potentials. Variant potentials preserve concurrent changes and present the different options to the participants. They can be further merged and edited without restrictions and behave robustly even in complex collaboration scenarios. We use lattice theory to prove important properties and to show the correctness and robustness of the collaboration protocol. With lazy merging, conflicts can be resolved deliberately, once all opinions within the team have been explored and discussed. This facilitates alignment among team members and prepares them to arrive at the best possible decision that considers the knowledge of the whole team.
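    The core idea of preserving conflicts instead of resolving them eagerly can be sketched as a join-semilattice over sets of variants. The following Python toy is our own illustration, not the paper's formal model or protocol; the lift, merge, and resolve functions are hypothetical. It shows why merging is commutative and idempotent, so variant potentials behave the same no matter how often or in which order replicas merge.

```python
# Toy "variant potential": a merge never discards a concurrent change,
# it keeps every option as a set, and set union is the lattice join.
# Illustrative sketch only; the paper's model and proofs are richer.

def lift(value):
    """A single, unconflicted value is a one-element potential."""
    return frozenset({value})

def merge(a, b):
    """Lattice join: commutative, associative, idempotent."""
    return a | b

def resolve(potential, choice):
    """Deliberate resolution: pick one of the preserved options."""
    assert choice in potential, "can only resolve to a preserved option"
    return lift(choice)

alice = lift("title: Lazy Merging")
bob = lift("title: Deferred Merging")

assert merge(alice, bob) == merge(bob, alice)              # order does not matter
assert merge(merge(alice, bob), bob) == merge(alice, bob)  # re-merging is harmless
print(sorted(merge(alice, bob)))      # both opinions stay visible to the team
print(sorted(resolve(merge(alice, bob), "title: Lazy Merging")))
```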

    Algebraic Replicated Data Types: Programming Secure Local-First Software


    Co-designing reliability and performance for datacenter memory

    Memory is one of the key components that affect the reliability and performance of datacenter servers. Memory in today's servers is organized and shared in several ways to provide the most performant and efficient access to data: for example, cache hierarchies in multi-core chips reduce access latency, non-uniform memory access (NUMA) in multi-socket servers improves scalability, and disaggregation increases memory capacity. In all these organizations, hardware coherence protocols are used to maintain the consistency of this shared memory and to implicitly move data to the requesting cores. This thesis aims to provide fault-tolerance against newer failure models in the organization of memory in datacenter servers. While designing for improved reliability, it explores solutions that can also enhance application performance. The solutions build on modern coherence protocols to achieve these properties. First, we observe that DRAM memory system failure rates have increased, demanding stronger forms of memory reliability. To combat this, the thesis proposes Dvé, a hardware-driven replication mechanism in which data blocks are replicated across two different memory controllers in a cache-coherent NUMA system. Data blocks are accompanied by a code with strong error detection capabilities so that when an error is detected, correction is performed using the replica. Dvé's organization offers two independent points of access to data, which enables (a) strong error correction that can recover from a range of faults affecting any of the components in the memory and (b) higher performance by providing another, nearer point of memory access. Dvé's coherent replication keeps the replicas in sync for reliability and also provides coherent access to read replicas during fault-free operation for improved performance. Dvé can flexibly provide these benefits on demand at runtime. Next, we observe that the coherence protocol itself needs to be hardened against failures. Memory in datacenter servers is being disaggregated from the compute servers into dedicated memory servers, driven by standards like CXL. CXL specifies the coherence protocol semantics for compute servers to access and cache data from a shared region of the disaggregated memory. However, the CXL specification lacks the level of fault-tolerance necessary to operate at inter-server scale within the datacenter. Compute servers can fail or become unresponsive, and it is therefore important that the coherence protocol remain available in the presence of such failures. The thesis proposes Āpta, a CXL-based shared disaggregated memory system that keeps cached data consistent without compromising availability in the face of compute server failures. Āpta architects a high-performance, fault-tolerant, object-granular memory server that significantly improves performance for stateless function-as-a-service (FaaS) datacenter applications.
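    A rough software analogy of the Dvé replication idea can be sketched in Python: each block is written to two replicas guarded by an error-detecting code, reads are served from the nearer replica, and a detected error is corrected from the other copy. This is purely our illustration; Dvé itself is a hardware mechanism in the memory controllers, and the ReplicatedStore class below is hypothetical.

```python
# Software analogy of coherent replication with error detection: two replicas
# per block, each guarded by a checksum; reads prefer the nearer replica and
# fall back to the other one when corruption is detected.
# Illustrative only; Dvé is a hardware mechanism, not a Python library.
import zlib

class ReplicatedStore:
    def __init__(self):
        self.replicas = [{}, {}]            # e.g. two memory controllers

    def write(self, addr, data):
        record = (data, zlib.crc32(data))
        for replica in self.replicas:       # keep both replicas in sync
            replica[addr] = record

    def read(self, addr, near=0):
        for idx in (near, 1 - near):        # try the nearer replica first
            data, crc = self.replicas[idx][addr]
            if zlib.crc32(data) == crc:     # error check passed
                return data
        raise IOError("uncorrectable: both replicas failed the check")

store = ReplicatedStore()
store.write(0x40, b"cache line contents")
# Simulate corruption of the near replica; the read recovers from the other.
_, crc = store.replicas[0][0x40]
store.replicas[0][0x40] = (b"corrupted!!", crc)
print(store.read(0x40, near=0))             # b'cache line contents'
```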

    Selected Topics in Gravity, Field Theory and Quantum Mechanics

    Quantum field theory has achieved some extraordinary successes over the past sixty years; however, it retains a set of challenging problems. It is not yet able to describe gravity in a mathematically consistent manner. CP violation remains unexplained. Grand unified theories have been eliminated by experiment, and a viable unification model has yet to replace them. Even the highly successful quantum chromodynamics, despite significant computational achievements, struggles to provide theoretical insight into the low-energy regime of quark physics, where the nature and structure of hadrons are determined. The only proposal for resolving the fine-tuning problem, low-energy supersymmetry, has been eliminated by results from the LHC. Since mathematics is the true and proper language for quantitative physical models, we expect new mathematical constructions to provide insight into physical phenomena and fresh approaches for building physical theories.

    Data Management for Dynamic Multimedia Analytics and Retrieval

    Multimedia data in its various manifestations poses a unique challenge from a data storage and data management perspective, especially when search, analysis, and analytics in large data corpora are considered. The inherently unstructured nature of the data itself, and the curse of dimensionality that afflicts the representations we typically work with in its stead, cause a broad range of issues that require sophisticated solutions at different levels. This has given rise to a huge body of research focused on techniques for effective and efficient multimedia search and exploration. Many of these contributions have led to an array of purpose-built multimedia search systems. However, recent progress in multimedia analytics and interactive multimedia retrieval has demonstrated that several of the assumptions usually made for such multimedia search workloads do not hold once a session has a human user in the loop. Firstly, many of the required query operations cannot be expressed by mere similarity search, and since the concrete requirement cannot always be anticipated, one needs a flexible and adaptable data management and query framework. Secondly, the widespread notion of static data collections does not hold for analytics workloads, whose purpose is to produce and store new insights and information. And finally, it is impossible even for an expert user to specify exactly how a data management system should produce and arrive at the desired outcomes of the potentially many different queries. Guided by these shortcomings, and motivated by the fact that similar questions have been answered for structured data in classical database research, this thesis presents three contributions that seek to mitigate the aforementioned issues. We present a query model that generalises the notion of proximity-based query operations and formalises the connection between those queries and high-dimensional indexing. We complement this with a cost model that makes the often implicit trade-off between query execution speed and result quality transparent to the system and the user. And we describe a model for the transactional and durable maintenance of high-dimensional index structures. All contributions are implemented in the open-source multimedia database system Cottontail DB, on top of which we present an evaluation that demonstrates the effectiveness of the proposed models. We conclude by discussing avenues for future research in the quest to converge the fields of databases on the one hand and (interactive) multimedia retrieval and analytics on the other.
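    The speed-versus-quality trade-off that such a cost model exposes can be illustrated with a toy proximity query in Python: scanning only a fraction of the collection answers a k-nearest-neighbour query faster but may miss true neighbours. This is our own illustration on synthetic vectors, not Cottontail DB's query or cost model; the knn function and its scan_fraction parameter are hypothetical.

```python
# Toy proximity query with an explicit speed/quality knob: scanning only a
# fraction of the collection is faster but can reduce recall.
# Illustrative only; Cottontail DB's query and cost models are more general.
import numpy as np

def knn(query, vectors, k, scan_fraction=1.0):
    """Return indices of the k nearest vectors, examining only a random
    subset of the collection when scan_fraction < 1.0."""
    n = max(k, int(len(vectors) * scan_fraction))
    candidates = np.random.permutation(len(vectors))[:n]
    dists = np.linalg.norm(vectors[candidates] - query, axis=1)
    return candidates[np.argsort(dists)[:k]]

np.random.seed(0)
rng = np.random.default_rng(0)
vectors = rng.normal(size=(10_000, 128))      # stand-in feature vectors
query = rng.normal(size=128)

exact = set(knn(query, vectors, k=10, scan_fraction=1.0))
approx = set(knn(query, vectors, k=10, scan_fraction=0.1))
print(f"recall of the cheap plan: {len(exact & approx) / 10:.1f}")
```

    A cost model makes exactly this kind of choice, how much of the collection or index to touch for a given quality target, explicit to both the system and the user.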

    Development of semiconductor light sources for photonic-enabled quantum communication

    Quantum information technologies have attracted tremendous attention and development effort from research organizations and governments worldwide over the past decades. They comprise the generation, manipulation, and transfer of quantum bits (qubits) based on the laws of quantum mechanics, enabling applications in quantum metrology, quantum computation, quantum communication, and more. As one of the frontier quantum technologies, quantum communication features, in theory, unconditionally secure data transfer between parties over long distances, which can be accomplished with quantum states of light (photons) owing to their weak interaction with the environment and their coherence over long distances. Meanwhile, quantum repeaters, similar to amplifiers in classical communication, are believed to be indispensable components for addressing photon absorption and decoherence in noisy quantum channels, which scale exponentially with distance. Quantum repeaters generally consist of three basic elements: entanglement swapping, entanglement purification, and quantum memories. In spite of significant theoretical and experimental breakthroughs with a variety of optical protocols, the lack of near-perfect deterministic light sources with fast repetition rates, a high degree of single-photon purity, indistinguishability, and entanglement still impedes practical applications. Semiconductor quantum dots are one of the leading systems that have exhibited the potential for on-demand generation of high-quality single photons and entangled photon pairs for the above applications. In this work, epitaxially grown III-V semiconductor quantum dots are investigated to drive their application in future quantum networks. First, an individual quantum dot emitting two pairs of entangled photons under pulsed two-photon resonant excitation is used to realize entanglement swapping, with the swapped photon pairs yielding a fidelity of 0.81 ± 0.04 to the Bell state Ψ+. To explore the practical limits of future quantum networks featuring multiple semiconductor-based sources, we scrutinize the consequences of device fabrication, dynamic tuning techniques, the time evolution of entanglement, and statistical effects on two separate quantum dot devices used in an entanglement swapping scheme. A numerical model based on the observed experimental data is proposed, serving not only as a benchmark for the scalability of quantum dot devices but also laying out a roadmap for optimizing solid-state quantum emitters in quantum networks. For the real-world quantum applications envisioned with quantum dots, the brightness of the quantum light source is one of the key enabling factors; it is determined by the source excitation and extraction efficiency as well as the detection system efficiency. The primary issue restricting the extraction of photons from III-V semiconductor quantum dots is usually the high refractive index of the host matrix, which causes total internal reflection at the semiconductor-vacuum interface. To improve the photon extraction efficiency, a simple and efficient structure based on the principle of optical antennas is developed, resulting in an observed extraction of 17% of single photons in the telecom O-band and a broadband enhancement of up to 180 times compared to the as-grown sample. A further factor limiting source efficiency is the presence of charges in the solid-state environment. Charge fluctuations quench radiative emission in resonant excitation schemes and induce fluorescence intermittence (blinking) that deteriorates the quantum yield. The photo-neutralization of GaAs/AlGaAs quantum dots excited by two-photon resonant pumping is investigated. Applying a weak gate laser to the quantum dot allows the charge capture processes to be controlled. By adjusting the gate laser power and wavelength, an increase in excitation efficiency of 30% is observed compared to two-photon resonant excitation without optical gating. The transition rates between the neutral and charged ground states are investigated by means of auto- and cross-correlation measurements. Furthermore, by studying a series of surface-passivated samples with dot-to-surface distances as small as 20 nm, ODT was found to be an effective compound for neutralizing surface states, reducing the formation of non-radiative transition channels. It is anticipated that such a passivation method paves the way for near-field-coupled nano-photonic devices, and for the elimination of surface states to preserve emission properties in uncapped structures that largely avoid total internal reflection.
    European Research Council (ERC)/Starting Grant/QD-NOMS/E

    Large Scale Kernel Methods for Fun and Profit

    Kernel methods are among the most flexible classes of machine learning models with strong theoretical guarantees. Wide classes of functions can be approximated arbitrarily well with kernels, and fast convergence and learning rates have been formally shown to hold. However, exact kernel methods are known to scale poorly with increasing dataset size, and we believe that one of the factors limiting their usage in modern machine learning is the lack of scalable and easy-to-use algorithms and software. The main goal of this thesis is to study kernel methods from the point of view of efficient learning, with particular emphasis on large-scale data but also on low-latency training and user efficiency. We improve the state of the art in scaling kernel solvers to datasets with billions of points using the Falkon algorithm, which combines random projections with fast optimization. Running it on GPUs, we show how to fully utilize the available computing power for training kernel machines. To boost the ease of use of approximate kernel solvers, we propose an algorithm for automated hyperparameter tuning: by minimizing a penalized loss function, a model can be learned together with its hyperparameters, reducing the time needed for user-driven experimentation. In the setting of multi-class learning, we show that, under stringent but realistic assumptions on the separation between classes, a wide set of algorithms needs far fewer data points than in the more general setting (without assumptions on class separation) to reach the same accuracy. The first part of the thesis develops a framework for efficient and scalable kernel machines. This raises the question of whether our approaches can be used successfully in real-world applications, especially compared to alternatives based on deep learning, which are often deemed hard to beat. The second part investigates this question in two main applications, chosen because an efficient algorithm is of paramount importance in both. First, we consider the problem of instance segmentation of images taken from the iCub robot; here Falkon is used as part of a larger pipeline, but the efficiency afforded by our solver is essential to ensure smooth human-robot interactions. Second, we consider time-series forecasting of wind speed, analysing the relevance of different physical variables to the predictions themselves, and we investigate different schemes for adapting i.i.d. learning to the time-series setting. Overall, this work aims to demonstrate, through novel algorithms and examples, that kernel methods are up to computationally demanding tasks and that there are concrete applications in which their use is warranted and more efficient than that of other, more complex, and less theoretically grounded models.
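    The scaling idea behind solvers such as Falkon, replacing the full n-by-n kernel system with one built on a small random set of centres (a Nyström approximation), can be sketched in a few lines of Python. This is a simplified illustration on synthetic data, not the Falkon algorithm itself, which additionally uses preconditioning, conjugate-gradient optimization, and GPU kernels to reach billion-point scale; the function names below are our own.

```python
# Simplified Nystrom kernel ridge regression: use m random centres instead of
# the full n-by-n kernel matrix, so only an m-by-m system has to be solved.
# Illustrative sketch only; Falkon adds a preconditioner, conjugate gradients,
# and GPU kernels on top of this idea.
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def nystrom_krr_fit(X, y, m=200, lam=1e-3, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=m, replace=False)]
    Knm = gaussian_kernel(X, centres, sigma)           # n x m
    Kmm = gaussian_kernel(centres, centres, sigma)     # m x m
    # Normal equations of the penalized loss, an m x m linear system:
    # (Knm^T Knm + n * lam * Kmm) alpha = Knm^T y
    alpha = np.linalg.solve(Knm.T @ Knm + len(X) * lam * Kmm, Knm.T @ y)
    return centres, alpha

def nystrom_krr_predict(X, centres, alpha, sigma=1.0):
    return gaussian_kernel(X, centres, sigma) @ alpha

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(5_000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=len(X))
centres, alpha = nystrom_krr_fit(X, y)
X_test = np.linspace(-3, 3, 5)[:, None]
print(nystrom_krr_predict(X_test, centres, alpha))     # approximately sin(x)
```

    The point of the sketch is the complexity shift: the training cost is dominated by an m-by-m solve with m much smaller than n, which is what makes billion-point datasets approachable once the remaining bottlenecks are handled by preconditioning and GPU execution.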