
    Depth-Bounded Quantum Cryptography with Applications to One-Time Memory and More

    With the power of quantum information, we can achieve cryptographic primitives that are classically impossible. However, almost all quantum cryptography faces extreme difficulties on near-term intermediate-scale quantum (NISQ) technology, namely the short lifespan of quantum states and the limited amount of sequential quantum computation. At the same time, restricting attention to limited quantum adversaries may still enable us to achieve never-before-possible tasks. In this work, we consider quantum cryptographic primitives against a class of limited quantum adversaries: depth-bounded adversaries. We introduce a model of (depth-bounded) NISQ computers as classical circuits interleaved with shallow quantum circuits. We then show that one-time memory can be achieved against any depth-bounded quantum adversary in this model, where the depth bound may be any pre-fixed polynomial. As applications, we obtain one-time programs and one-time proofs. Finally, we show that our one-time memory remains correct even under constant-rate errors.
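    To make the circuit model concrete, here is a minimal, hypothetical Python sketch of the hybrid model the abstract describes: unbounded classical processing interleaved with quantum layers whose depth may not exceed a pre-fixed bound. All class and parameter names are illustrative, not taken from the paper.

```python
# Hypothetical sketch (names are ours, not the paper's): a depth-bounded
# NISQ computation alternates arbitrary classical processing with quantum
# layers, and every quantum layer must stay within the pre-fixed depth bound.

from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Stage:
    classical: Callable[[bytes], bytes]  # arbitrary classical pre/post-processing
    quantum_depth: int                   # depth of the shallow quantum layer

@dataclass
class DepthBoundedComputation:
    depth_bound: int                     # pre-fixed bound on each quantum layer
    stages: List[Stage] = field(default_factory=list)

    def add_stage(self, classical: Callable[[bytes], bytes],
                  quantum_depth: int) -> None:
        # Enforce the model's constraint: every quantum layer is shallow.
        if quantum_depth > self.depth_bound:
            raise ValueError("quantum layer deeper than the model allows")
        self.stages.append(Stage(classical, quantum_depth))
```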

    Brief strategy training enhances targeted memory and beliefs and promotes near transfer

    A traditional and common approach to cognitive interventions for adults is memory strategy training, but limited work of this type has examined whether self-regulatory factors (e.g., self-evaluative beliefs) might benefit from these programs or moderate other training-related gains. Further, while interventions focused on intensive practice or core capacity training have demonstrated near transfer (performance improvement, following training, on untrained tasks related to the target task), evidence of near transfer from strategy training programs is quite rare. The present research, Everyday Memory Clinic–Revised (EMC-R), addressed self-regulation and transfer issues in memory strategy training. EMC-R examined whether (1) short-term strategy training led to improvement in beliefs and strategy usage, as well as score gains, (2) training designed to emphasize self-regulatory changes was more effective than a content- and duration-matched program focused solely on strategy training, and (3) the training evidenced near transfer. Participants were 122 relatively healthy and well-educated middle-aged and older adults (51-90 years old; M=73.24, SD=8.31 years). They were recruited from the community and randomly assigned to one of three training conditions: a waitlist control group (CT; n=38), a traditional strategy-only training group (SO; n=46), or the target strategy-plus-beliefs training group, designed to enhance self-regulatory factors as well as memory (SB; n=38). The training programs consisted of two hours of in-person, instructor-led training and approximately two to three hours of continued self-study at home. EMC-R used a 2 time (within: pretest, posttest) x 3 training condition (between: CT, SO, SB) mixed-model design. After training, compared with pretest data and with inactive waitlist control participants, trainees demonstrated improved name recall memory, higher levels of memory self-efficacy, and more effective use of memory strategies. Contrary to expectations, benefits were similar for the beliefs-focused strategy training approach and for the training approach without the focus on beliefs. The benefits of both training approaches extended beyond the name recall task targeted in training to performance on similar, but untrained, associative memory tasks. The results underscore the value of brief memory training that improves self-efficacy for middle-aged and older individuals: even with only five hours of training over a single week, performance was enhanced not only on the trained memory task but also on untrained memory tasks. From a practical standpoint, a brief but effective training regimen can be easily disseminated through existing programming, and the low cost of a single-session program may incentivize initial participation and full completion of the training.

    CIMAR, NIMAR, and LMMA: novel algorithms for thread and memory migrations in user space on NUMA systems using hardware counters

    This paper introduces two novel algorithms for thread migration, named CIMAR (Core-aware Interchange and Migration Algorithm with performance Record, IMAR) and NIMAR (Node-aware IMAR), and a new algorithm for the migration of memory pages, LMMA (Latency-based Memory pages Migration Algorithm), in the context of Non-Uniform Memory Access (NUMA) systems. Such systems have complex memory hierarchies in which extracting the best possible performance is challenging, and thread and memory mapping play a critical role. The presented algorithms gather and process the information provided by hardware counters to decide which migrations to perform, trying to find the optimal mapping. They have been implemented as a user-space tool that aims to improve system performance, particularly in, but not restricted to, scenarios where multiple programs with different characteristics are running. This approach has the advantage of requiring no modification of the target programs or the Linux kernel, while keeping a low overhead. Two different benchmark suites were used to validate the algorithms: the NAS parallel benchmarks, mainly devoted to computational routines, and the LevelDB database benchmark, focused on read-write operations. These benchmarks illustrate the influence of our proposal on these two important types of codes. Note that these codes are state-of-the-art implementations of their routines, so little improvement could be expected a priori. Experiments were designed and conducted to emulate three different scenarios: a single program running in the system with full resources, an interactive server where multiple programs run concurrently with varying resource availability, and a queue of tasks where granted resources are limited. The proposed algorithms produced significant benefits, especially in systems with higher latency penalties for remote accesses. When more than one benchmark is executed simultaneously, performance improvements were obtained, reducing execution times by up to 60%. In such situations, the behaviour of the system is more critical, and the NUMA topology plays a more relevant role. Even in the worst case, when isolated benchmarks are executed using the whole system, that is, just one task at a time, performance is not degraded. This research work received financial support from the Ministerio de Ciencia e Innovación, Spain, within the project PID2019-104834GB-I00. It was also funded by the Consellería de Cultura, Educación e Ordenación Universitaria of Xunta de Galicia (accr. 2019–2022, ED431G 2019/04 and reference competitive group 2019–2021, ED431C 2018/19).
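    As a rough illustration of the latency-driven idea behind an algorithm like LMMA (this is a sketch under our own assumptions, not the authors' implementation), a user-space tool could aggregate per-page access samples from memory-sampling hardware counters and plan migrations for pages that are mostly accessed, at high latency, from a remote node; on Linux the actual move would then be issued with the move_pages(2) system call.

```python
# Illustrative decision loop (not the paper's code): pick a destination
# NUMA node for each sampled page from observed access latencies.

from collections import defaultdict
from statistics import mean

def plan_page_migrations(samples, page_node, threshold_ns=200):
    """samples: page -> list of (accessing_node, latency_ns) events, as
    gathered from precise memory-sampling hardware counters.
    page_node: page -> NUMA node where the page currently resides.
    Returns a page -> destination-node plan for pages worth moving."""
    plan = {}
    for page, events in samples.items():
        by_node = defaultdict(list)
        for node, latency in events:
            by_node[node].append(latency)
        # The node that touches this page most often is the candidate home.
        candidate = max(by_node, key=lambda n: len(by_node[n]))
        remote = [lat for node, lat in events if node != page_node[page]]
        # Migrate only when the page is mostly accessed remotely and the
        # observed latency penalty is significant.
        if candidate != page_node[page] and remote and mean(remote) > threshold_ns:
            plan[page] = candidate
    return plan
```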

    The multi-program performance model: debunking current practice in multi-core simulation

    Composing a representative multi-program multi-core workload is non-trivial. A multi-core processor can execute multiple independent programs concurrently, and hence, any program mix can form a potential multi-program workload. Given the very large number of possible multi-program workloads and the limited speed of current simulation methods, it is impossible to evaluate them all. This paper presents the Multi-Program Performance Model (MPPM), a method for quickly estimating multi-program multi-core performance based on single-core simulation runs. MPPM employs an iterative method to model the tight performance entanglement between co-executing programs on a multi-core processor with shared caches. Because MPPM involves analytical modeling, it is very fast, and it estimates multi-core performance for a very large number of multi-program workloads in a reasonable amount of time. In addition, it provides confidence bounds on its performance estimates. Using SPEC CPU2006 and up to 16 cores, we report an average performance prediction error of 2.3% for system throughput (STP) and 2.9% for average normalized turnaround time (ANTT), while being up to five orders of magnitude faster than detailed simulation. Subsequently, we demonstrate that randomly picking a limited number of multi-program workloads, as done in current practice, can lead to incorrect design decisions in practical design and research studies, which is alleviated using MPPM. In addition, MPPM can be used to quickly identify multi-program workloads that stress multi-core performance through excessive conflict behavior in shared caches; these stress workloads can then be used to drive the design process further.
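    For reference, the two metrics the model predicts are standard in the multi-program simulation literature: STP sums each program's co-run progress relative to isolated execution, and ANTT averages the per-program slowdowns. A small sketch of the commonly used definitions (the example numbers are ours, not results from the paper):

```python
# Standard multi-program metrics: given each program's IPC when running
# alone (single-program, SP) and when co-running (multi-program, MP).

def stp(ipc_sp, ipc_mp):
    """System throughput: total work rate relative to isolated runs.
    Higher is better; equals n when co-running costs nothing."""
    return sum(mp / sp for sp, mp in zip(ipc_sp, ipc_mp))

def antt(ipc_sp, ipc_mp):
    """Average normalized turnaround time: mean per-program slowdown.
    Lower is better; equals 1 when co-running costs nothing."""
    return sum(sp / mp for sp, mp in zip(ipc_sp, ipc_mp)) / len(ipc_sp)

# Example: four co-running programs, each slowed to 80% of isolated IPC.
ipc_alone = [2.0, 1.5, 1.0, 0.5]
ipc_shared = [1.6, 1.2, 0.8, 0.4]
print(stp(ipc_alone, ipc_shared))   # 3.2 -> throughput worth 3.2 programs
print(antt(ipc_alone, ipc_shared))  # 1.25 -> 25% average slowdown
```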

    Trace-level reuse

    Trace-level reuse is based on the observation that some traces (dynamic sequences of instructions) are frequently repeated during the execution of a program, and that in many cases the instructions making up such traces have the same source operand values. The execution of such traces will obviously produce the same outcome, so their execution can be skipped if the processor records the outcome of previous executions. This paper presents an analysis of the performance potential of trace-level reuse and discusses a preliminary realistic implementation. Like instruction-level reuse, trace-level reuse can improve performance by decreasing resource contention and the latency of some instructions. However, we show that trace-level reuse is more effective than instruction-level reuse because the former can avoid fetching the instructions of reused traces. This has two important benefits: it reduces the fetch bandwidth requirements, and it increases the effective instruction window size, since these instructions do not occupy window entries. Moreover, trace-level reuse can compute the result of a chain of dependent instructions all at once, which may allow the processor to avoid the serialization caused by data dependences and thus to potentially exceed the dataflow limit.
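    As a concrete illustration of the mechanism (a hypothetical sketch, not the paper's design), a trace reuse table can be keyed by the trace's starting PC together with its live-in (source operand) values; on a hit, the recorded live-out values and next PC are supplied directly and the trace's instructions are never fetched or executed.

```python
# Illustrative trace reuse table: maps (start PC, live-in values) to the
# recorded results of a previous execution of that trace.

class TraceReuseTable:
    def __init__(self):
        # start_pc -> list of (live-in snapshot, live-out values, next PC)
        self.entries = {}

    def record(self, start_pc, live_ins, live_outs, next_pc):
        """Remember a trace outcome keyed by its source operand values."""
        self.entries.setdefault(start_pc, []).append(
            (dict(live_ins), live_outs, next_pc))

    def lookup(self, start_pc, regs):
        """Return (live_outs, next_pc) if the current register values match
        a recorded execution; the processor then skips the whole trace."""
        for live_ins, live_outs, next_pc in self.entries.get(start_pc, []):
            if all(regs.get(r) == v for r, v in live_ins.items()):
                return live_outs, next_pc
        return None
```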

    Non-intrusive on-the-fly data race detection using execution replay

    This paper presents a practical solution for detecting data races in parallel programs. The solution combines execution replay (RecPlay) with automatic on-the-fly data race detection. This combination enables us to perform the data race detection on an unaltered execution (with almost no probe effect). Furthermore, the use of multilevel bitmaps and snooped matrix clocks limits the amount of memory used. As the record phase of RecPlay is highly efficient, there is no need to switch it off, thereby eliminating the possibility of Heisenbugs, because tracing can be left on all the time.
    Comment: In M. Ducasse (ed.), Proceedings of the Fourth International Workshop on Automated Debugging (AADEBUG 2000), August 2000, Munich. cs.SE/001003
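    As background for the on-the-fly detection step, the classic logical-clock criterion is: two accesses to the same location race if neither happens before the other and at least one is a write. The sketch below shows this criterion with plain vector clocks; it deliberately simplifies away the multilevel bitmaps and snooped matrix clocks that the paper uses to bound memory consumption.

```python
# Simplified vector-clock race check (background illustration only).

def happens_before(c1, c2):
    """True if vector clock c1 precedes c2 (c1 <= c2 componentwise, c1 != c2)."""
    return all(a <= b for a, b in zip(c1, c2)) and c1 != c2

def is_race(access_a, access_b):
    """Each access is (vector_clock, is_write). Two accesses to the same
    location race if they are causally unordered and one is a write."""
    (ca, wa), (cb, wb) = access_a, access_b
    unordered = not happens_before(ca, cb) and not happens_before(cb, ca)
    return unordered and (wa or wb)

# Example: thread 0 writes at clock [1, 0]; thread 1 reads at clock [0, 1].
# Neither clock precedes the other, so the accesses race.
print(is_race(([1, 0], True), ([0, 1], False)))  # True
```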