
    A Survey of Techniques for Dynamic Branch Prediction

    The branch predictor (BP) is an essential component of modern processors, since high BP accuracy can improve performance and reduce energy by decreasing the number of instructions executed on the wrong path. However, reducing the latency and storage overhead of a BP while maintaining high accuracy presents significant challenges. In this paper, we present a survey of dynamic branch prediction techniques. We classify the works based on key features to underscore their differences and similarities. We believe this paper will spark further research in this area and will be useful for computer architects, processor designers, and researchers.
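
    As a concrete illustration of the dynamic schemes such a survey covers, here is a minimal sketch of the classic bimodal predictor: a table of two-bit saturating counters indexed by the branch address. The table size and index hash are illustrative assumptions of this sketch, not details from the paper.

```python
# Minimal sketch of a bimodal (two-bit saturating counter) dynamic
# branch predictor. Table size and index hash are illustrative.

class BimodalPredictor:
    def __init__(self, entries=1024):
        self.entries = entries
        # Counter states 0..3: 0,1 predict not-taken; 2,3 predict taken.
        self.counters = [1] * entries  # start weakly not-taken

    def _index(self, pc):
        return (pc >> 2) % self.entries  # drop byte-offset bits

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

# A loop branch taken 9 times and then falling through is mispredicted
# only at the transitions (twice), not on every iteration.
bp = BimodalPredictor()
mispredicts = 0
for taken in [True] * 9 + [False]:
    if bp.predict(0x400) != taken:
        mispredicts += 1
    bp.update(0x400, taken)
print(mispredicts)  # 2
```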

    On the design of state-of-the-art pseudorandom number generators by means of genetic programming

    Congress on Evolutionary Computation, Portland, USA, 19-23 June 2004. The design of pseudorandom number generators by means of evolutionary computation is a classical problem. To date it has mostly, and most successfully, been tackled with cellular automata, and few proposals, inside or outside that paradigm, can claim to be both robust (passing all the statistical tests, including the most demanding ones) and fast, as the proposal we present here is. Furthermore, to obtain these generators we take a radical approach: our fitness function is not based on any measure of randomness, as is frequently the case in the literature, but on nonlinearity. Efficiency is assured by using only very efficient operators (both in hardware and software) and by limiting the number of terminals in the genetic programming implementation.
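
    To make the search space concrete, here is a toy sketch of the kind of individual such a genetic programming run could evolve: an update function built only from cheap operators (add, XOR, rotate), scored by an avalanche test as a crude stand-in for a nonlinearity fitness. All constants, the operator set, and the scoring procedure are illustrative assumptions, not taken from the paper.

```python
import random

MASK = 0xFFFFFFFF  # 32-bit words; word size is an assumption

def rotl(x, r):
    return ((x << r) | (x >> (32 - r))) & MASK

def candidate(state):
    # One individual a GP run might evolve: a short composition of
    # hardware- and software-cheap operators with arbitrary constants.
    state = (state + 0x9E3779B9) & MASK
    state ^= rotl(state, 13)
    return (state + rotl(state, 7)) & MASK

def avalanche_score(f, trials=2000):
    # Average fraction of output bits flipped by a one-bit input flip;
    # 0.5 is ideal. A crude proxy for nonlinearity, used as fitness.
    total = 0
    for _ in range(trials):
        x = random.getrandbits(32)
        bit = 1 << random.randrange(32)
        total += bin(f(x) ^ f(x ^ bit)).count("1")
    return total / (trials * 32)

print(avalanche_score(candidate))  # closer to 0.5 is better
```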

    Automated extraction of absorption features from Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and Geophysical and Environmental Research Imaging Spectrometer (GERIS) data

    Automated techniques were developed for the extraction and characterization of absorption features from reflectance spectra. The absorption feature extraction algorithms were successfully tested on laboratory, field, and aircraft imaging spectrometer data. A suite of laboratory spectra of the most common minerals was analyzed and absorption band characteristics tabulated. A prototype expert system was designed, implemented, and successfully tested to allow identification of minerals based on the extracted absorption band characteristics. AVIRIS spectra for a site in the northern Grapevine Mountains, Nevada, have been characterized and the minerals sericite (fine-grained muscovite) and dolomite were identified. The minerals kaolinite, alunite, and buddingtonite were identified and mapped for a site at Cuprite, Nevada, by applying the feature extraction algorithms to data from the new 64-channel Geophysical and Environmental Research imaging spectrometer (GERIS). The feature extraction routines (written in FORTRAN and C) were interfaced to the expert system (written in PROLOG) to allow both efficient processing of numerical data and logical spectrum analysis.
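
    The abstract does not spell out the algorithms, but absorption-feature extraction of this kind is conventionally built on continuum removal followed by band characterization. The sketch below shows one plausible version of that pipeline; the convex-hull continuum and the width estimate are assumptions of this sketch, not the paper's exact method.

```python
import numpy as np

def continuum_removed(wavelengths, reflectance):
    # Use the upper convex hull of the spectrum as the continuum,
    # then divide it out so band depths become comparable.
    pts = list(zip(wavelengths, reflectance))
    hull = [pts[0]]
    for p in pts[1:]:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Pop points that would make the hull dip downward.
            if (x2 - x1) * (p[1] - y1) - (p[0] - x1) * (y2 - y1) >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    hx, hy = zip(*hull)
    continuum = np.interp(wavelengths, hx, hy)
    return np.asarray(reflectance) / continuum

def band_features(wavelengths, cr):
    # Band position, depth, and a crude full width at half depth,
    # computed from the continuum-removed spectrum cr.
    wavelengths = np.asarray(wavelengths)
    i = int(np.argmin(cr))
    depth = 1.0 - cr[i]
    half = 1.0 - depth / 2.0
    below = np.where(cr <= half)[0]
    width = wavelengths[below[-1]] - wavelengths[below[0]]
    return wavelengths[i], depth, width
```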

    Saudi Arabia in 2004: Can It Survive the Terrorist Threat?

    This analysis examines the socio-political structure of Saudi Arabia and attempts to assess whether it can withstand the combined pressures of youth unemployment, the extremists' call to jihad, and its deteriorating relationship with the US at a moment of uncertainty in the country's leadership. The Kingdom of Saudi Arabia, so essential to the world economy, must confront not only the threat of extremist terrorists, but also youth unemployment and a worsening relationship with the US. Caught between the need to crack down hard on every form of dissent and the need to open up its society to bring the country into the 21st century, the Saudi leadership itself is in a state of uncertainty about the future, unsure whether to cling to its privileged position or to adapt and introduce some formula allowing a degree of power sharing. Faced with such challenges, this year will prove crucial for the Saudi regime. The country needs change, but that change should come from within rather than be imposed on it by terrorists or by a foreign power.

    Rotting bandits are not harder than stochastic ones

    In stochastic multi-armed bandits, the reward distribution of each arm is assumed to be stationary. This assumption is often violated in practice (e.g., in recommendation systems), where the reward of an arm may change whenever it is selected, i.e., the rested bandit setting. In this paper, we consider the non-parametric rotting bandit setting, where rewards can only decrease. We introduce the filtering on expanding window average (FEWA) algorithm, which constructs moving averages over increasing windows to identify arms that are more likely to return high rewards when pulled once more. We prove that, for an unknown horizon $T$ and without any knowledge of the decreasing behavior of the $K$ arms, FEWA achieves a problem-dependent regret bound of $\widetilde{\mathcal{O}}(\log(KT))$ and a problem-independent one of $\widetilde{\mathcal{O}}(\sqrt{KT})$. Our result substantially improves over the algorithm of Levine et al. (2017), which suffers regret $\widetilde{\mathcal{O}}(K^{1/3}T^{2/3})$. FEWA also matches known bounds for the stochastic bandit setting, thus showing that rotting bandits are not harder. Finally, we report simulations confirming the theoretical improvements of FEWA.
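
    A simplified sketch of the FEWA selection rule may help: arms are compared on averages over expanding windows of their most recent pulls, and arms that look significantly worse than the current best at some window size are filtered out. The confidence width and its constants below are illustrative assumptions, not the paper's exact choices.

```python
import math

def fewa_choose(history, t, sigma=1.0, alpha=4.0):
    # history[i] is the list of rewards observed so far for arm i.
    unseen = [i for i, h in enumerate(history) if not h]
    if unseen:                      # pull each arm once first
        return unseen[0]
    active = list(range(len(history)))
    h = 1
    while True:
        # An active arm with fewer than h samples cannot be compared
        # at this window size: pull the least-pulled such arm.
        short = [i for i in active if len(history[i]) < h]
        if short:
            return min(short, key=lambda i: len(history[i]))
        means = {i: sum(history[i][-h:]) / h for i in active}
        width = sigma * math.sqrt(alpha * math.log(t + 1) / h)
        best = max(means.values())
        # Keep only arms still plausibly near-best at this window.
        active = [i for i in active if means[i] >= best - 2 * width]
        h += 1
```

    Each round, the caller pulls the returned arm and appends the observed reward to the corresponding history list; because the windows look only at the most recent pulls, an arm whose reward has rotted away is eventually filtered out.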

    Convergence of the Linear Delta Expansion in the Critical O(N) Field Theory

    The linear delta expansion is applied to the 3-dimensional O(N) scalar field theory at its critical point in a way that is compatible with the large-N limit. For a range of the arbitrary mass parameter, the linear delta expansion converges, with errors decreasing like a power of the order n in delta. If the principle of minimal sensitivity is used to optimize the convergence rate, the errors seem to decrease exponentially with n. Comment: 26 pages, latex, 8 figures
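
    For reference, a standard statement of the principle of minimal sensitivity (generic, not specific to this paper) fixes the arbitrary mass parameter, at each order, where the truncated approximant is flattest:

```latex
% Generic PMS condition: at order n, choose the arbitrary mass
% parameter \mu = \mu_n where the order-n approximant F_n is
% stationary, i.e. least sensitive to \mu.
\left. \frac{\partial F_n(\mu)}{\partial \mu} \right|_{\mu = \mu_n} = 0
```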

    ShoutFight!

    A primarily sound-driven game developed to accompany an interactive game design talk called Playing Along, by Yann Seznec and Dr Niall Moody. Playing Along was a talk focusing on the way small changes to a game can have a sometimes outsized impact on the experience of playing that game. ShoutFight! comprises the main interactive element of the talk, and is designed in a modular fashion so that different elements can be gradually added to demonstrate this principle as the talk progresses. The Playing Along talk and ShoutFight! game have been presented at two events to date: the BFI Video Games Day 2018 at the Traverse Theatre, Edinburgh, and Continue Edinburgh 2018 at the University of Edinburgh Business School.

    Yet Another Compressed Cache: a Low Cost Yet Effective Compressed Cache

    Cache memories play a critical role in bridging the latency, bandwidth, and energy gaps between cores and off-chip memory. However, caches frequently consume a significant fraction of a multicore chip's area, and thus account for a significant fraction of its cost. Compression has the potential to improve the effective capacity of a cache, providing the performance and energy benefits of a larger cache while using less area. The design of a compressed cache must address two important issues: i) a low-latency, low-overhead compression algorithm that can represent a fixed-size cache block using fewer bits and ii) a cache organization that can efficiently store the resulting variable-size compressed blocks. This paper focuses on the latter issue. We propose YACC (Yet Another Compressed Cache), a new compressed cache design that uses super-blocks to reduce tag overheads and variable-size blocks to reduce internal fragmentation, but eliminates two major sources of complexity in previous work: decoupled tag-data mapping and address skewing. YACC's cache layout is similar to conventional caches, eliminating the back-pointers used to maintain a decoupled tag-data mapping and the extra decoders used to implement skewed associativity. An additional advantage of YACC is that it enables modern replacement mechanisms, such as RRIP. For our benchmark set, YACC performs comparably to the recently proposed Skewed Compressed Cache (SCC) [Sardashti et al. 2014], but with a simpler, more area-efficient design without the complexity and overheads of skewing. Compared to a conventional uncompressed 8MB LLC, YACC improves performance by 8% on average and up to 26%, and reduces total energy by 6% on average and up to 20%. An 8MB YACC achieves approximately the same performance and energy improvements as a 16MB conventional cache at a much smaller silicon footprint, with only 1.6% more area than an 8MB conventional cache.
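
    A toy sketch of the super-block tagging idea may clarify where the savings come from: one tag is shared by four adjacent blocks, each carrying only a small per-block compression state, so tag overhead stays low without back-pointers. Field sizes and the state encoding below are illustrative assumptions, not YACC's actual layout.

```python
BLOCKS_PER_SUPERBLOCK = 4  # illustrative choice

class SuperBlockTag:
    # One shared address tag covers four adjacent cache blocks.
    def __init__(self, sb_tag):
        self.sb_tag = sb_tag
        # Per-block state: None = not present; otherwise a small
        # size class saying what fraction of a data entry it uses.
        self.state = [None] * BLOCKS_PER_SUPERBLOCK

def lookup(set_tags, addr, block_bits=6):
    block = addr >> block_bits           # 64B blocks assumed
    sb, offset = divmod(block, BLOCKS_PER_SUPERBLOCK)
    for tag in set_tags:                 # the tags of one cache set
        if tag.sb_tag == sb and tag.state[offset] is not None:
            return tag, offset           # hit: size class locates data
    return None                          # miss
```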

    Selecting Benchmarks Combinations for the Evaluation of Multicore Throughput

    Most high-performance processors today are able to execute multiple threads of execution simultaneously. Threads share processor resources, like the last-level cache, which may decrease throughput in non-obvious ways, depending on the threads' characteristics. Computer architects usually study multiprogrammed workloads by considering a set of benchmarks and some combinations of these benchmarks. Because cycle-accurate microarchitecture simulators are slow, we want a set of combinations that is as small as possible, yet representative. However, there is no standard method for selecting such a sample, and different authors have used different methods. It is not clear how the choice of a particular sample impacts the conclusions of a study. We propose and compare different sampling methods for defining multiprogrammed workloads for computer architecture studies. We evaluate their effectiveness on a case study: the comparison of several multicore last-level cache replacement policies. We show that random sampling, the simplest method, is robust for defining a representative sample of workloads, provided the sample is large enough. We propose a method for estimating the required sample size based on fast approximate simulation. We also propose a new method, workload stratification, which is very effective at reducing the sample size in situations where random sampling would require large samples.
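
    A minimal sketch of the stratification step, under assumed strata boundaries: benchmark combinations are ranked by a cheap metric obtained from fast approximate simulation and then sampled evenly across strata, so that rare behaviors are represented without a huge random sample. The benchmark names and the placeholder metric below are hypothetical.

```python
import random

def stratified_sample(combinations, approx_metric, n_strata=10, per_stratum=5):
    # Rank combinations by the approximate metric, cut the ranking
    # into equal-sized strata, and sample a few from each stratum.
    ranked = sorted(combinations, key=approx_metric)
    size = max(1, len(ranked) // n_strata)
    sample = []
    for start in range(0, len(ranked), size):
        stratum = ranked[start:start + size]
        sample.extend(random.sample(stratum, min(per_stratum, len(stratum))))
    return sample

# Hypothetical usage: all pairs of benchmarks, scored by a placeholder
# standing in for a fast-approximate-simulation throughput estimate.
benchmarks = ["bzip2", "gcc", "mcf", "soplex", "lbm", "milc"]
pairs = [(a, b) for a in benchmarks for b in benchmarks]
sample = stratified_sample(pairs, approx_metric=lambda p: hash(p) % 100,
                           n_strata=6, per_stratum=2)
```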

    Alternative Schemes for High-Bandwidth Instruction Fetching

    Future processors combining out-of-order execution with aggressive speculation techniques will need to fetch multiple non-consecutive instruction blocks in a single cycle to achieve high performance. Several high-bandwidth instruction fetching schemes have been proposed in the past few years. The Two-Block Ahead (TBA) branch predictor predicts two non-consecutive instruction blocks per cycle while relying on a conventional instruction cache. The trace cache (TC) records traces of instructions and delivers multiple non-consecutive instruction blocks to the execution core. The aim of this paper is to investigate the pros and cons of both approaches. Maintaining consistency between memory and the TC is not a straightforward issue. We propose a simple hardware scheme to maintain consistency at a reasonable performance loss (1 to 5%). We also introduce a new fill unit heuristic for the TC, the mispredict hint, that leads to significantly better performance (up to 20%), mainly through better branch prediction accuracy and lower TC miss ratios. TBA requires double-ported or bank-interleaved structures to supply two non-consecutive blocks in a single cycle. We show that a 4-way interleaving scheme is cost-effective since it impairs performance by only 3 to 5%. Finally, simulation results show that such an enhanced TC scheme delivers higher performance than TBA when caches are large, due to a lower branch misprediction penalty and a higher instruction bandwidth on mispredictions. When the hardware budget is smaller, TBA outperforms the TC because of the latter's higher miss ratio and branch misprediction rate.
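
    As a minimal illustration of the interleaving constraint, the check below models when two non-consecutive fetch blocks can be read in the same cycle; the bank count matches the 4-way scheme evaluated above, while the block size is an assumption of this sketch.

```python
N_BANKS = 4      # 4-way bank interleaving, as evaluated in the paper
BLOCK_BITS = 6   # 64-byte fetch blocks (an assumption of this sketch)

def bank(addr):
    # Consecutive blocks map to consecutive banks.
    return (addr >> BLOCK_BITS) % N_BANKS

def same_cycle_fetch(addr_a, addr_b):
    # Two blocks can be supplied in one cycle only if they fall in
    # different banks; a conflict delays the second block, which is
    # the source of the small performance loss reported above.
    return bank(addr_a) != bank(addr_b)
```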