    Non-traditional Calculations of Elementary Mathematical Operations: Part 1. Multiplication and Division

    Computing systems of various kinds are built from functional units and processors that can work together and exchange data when required. In most cases, data transmission is organized so that each node of the system can be connected to any other node. A computer system thus consists of components that perform arithmetic operations, together with an integrated data communication system that supports information exchange between the nodes and combines them into a single whole. When designing this type of computer system, problems can arise if:
    – the computing nodes of the system cannot start and finish data processing simultaneously within a given time interval;
    – the data-processing procedures in the nodes do not start and end at fixed times;
    – the numbers of inputs and outputs of the system's computational nodes differ.
    This article proposes an unconventional approach to constructing a mathematical model of adaptive-quantum computation of the arithmetic operations of multiplication and division. It uses the principle of predetermined random self-organization proposed by Ashby in 1966, together with the method of the dynamics of averages and adaptive integration of a system of logical-differential equations for the dynamics of the number-average states of particles in the sets S1 and S2. This would make some of the problems listed above easier to solve.
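
    The abstract does not reproduce the model's equations, but the general flavor of obtaining arithmetic results from continuous dynamics can be illustrated. The Python sketch below is not the authors' adaptive-quantum model; it is a minimal stand-in, under assumed step sizes and positive operands, showing how a product and a quotient can fall out of integrating trivial differential equations.

```python
# Illustrative only: computes a*b and a/b by numerically integrating
# trivial ODEs, mimicking the idea of arithmetic emerging from dynamics.
# This is NOT the adaptive-quantum model from the paper; step size and
# the assumption of positive operands are ours.

def multiply_by_integration(a: float, b: float, dt: float = 1e-4) -> float:
    """Integrate dp/dt = a over [0, b]; then p(b) = a*b."""
    p, t = 0.0, 0.0
    while t < b:          # assumes b > 0
        p += a * dt       # forward Euler step
        t += dt
    return p

def divide_by_integration(a: float, b: float, dt: float = 1e-4) -> float:
    """Integrate ds/dt = b until s reaches a; the elapsed time is a/b."""
    s, t = 0.0, 0.0
    while s < a:          # assumes a > 0 and b > 0
        s += b * dt
        t += dt
    return t

if __name__ == "__main__":
    print(multiply_by_integration(3.0, 4.0))  # ~12.0, up to step error
    print(divide_by_integration(12.0, 4.0))   # ~3.0
```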

    DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks

    Data movement between the CPU and main memory is a first-order obstacle to improving performance, scalability, and energy efficiency in modern systems. Computer systems employ a range of techniques to reduce the overheads tied to data movement, spanning from traditional mechanisms (e.g., deep multi-level cache hierarchies, aggressive hardware prefetchers) to emerging techniques such as Near-Data Processing (NDP), where some computation is moved close to memory. Our goal is to methodically identify potential sources of data movement over a broad set of applications and to comprehensively compare traditional compute-centric data movement mitigation techniques to more memory-centric techniques, thereby developing a rigorous understanding of the best techniques to mitigate each source of data movement. With this goal in mind, we perform the first large-scale characterization of a wide variety of applications, across a wide range of application domains, to identify fundamental program properties that lead to data movement to/from main memory. We develop the first systematic methodology to classify applications based on the sources contributing to data movement bottlenecks. From our large-scale characterization of 77K functions across 345 applications, we select 144 functions to form the first open-source benchmark suite (DAMOV) for main memory data movement studies. We select a diverse range of functions that (1) represent different types of data movement bottlenecks, and (2) come from a wide range of application domains. Using NDP as a case study, we identify new insights about the different data movement bottlenecks and use these insights to determine the most suitable data movement mitigation mechanism for a particular application. We open-source DAMOV and the complete source code for our new characterization methodology at https://github.com/CMU-SAFARI/DAMOV.
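
    The paper's actual methodology is defined in the repository linked above; as a rough illustration of the kind of classification it performs, the sketch below sorts hypothetical profiled functions into compute-bound and memory-bound classes using two commonly cited indicators, arithmetic intensity and last-level-cache misses per kilo-instruction (MPKI). The MemoryProfile record, the thresholds, and the function names are all assumptions for illustration, not values from the paper.

```python
# Hypothetical sketch of bottleneck classification in the spirit of
# DAMOV-style methodologies. Metrics and thresholds below are assumed;
# see the paper and repository for the real methodology.
from dataclasses import dataclass

@dataclass
class MemoryProfile:
    name: str
    arithmetic_intensity: float  # ops per byte moved to/from memory
    llc_mpki: float              # LLC misses per kilo-instruction

def classify(p: MemoryProfile) -> str:
    """Coarse, assumed rule: memory-bound functions with very low
    arithmetic intensity are candidates for near-data processing (NDP);
    compute-bound ones are better served by the cache hierarchy."""
    if p.llc_mpki > 10.0 and p.arithmetic_intensity < 0.25:
        return "memory-bound: likely NDP-friendly"
    if p.llc_mpki > 10.0:
        return "memory-bound: try prefetching / cache tuning first"
    return "compute-bound: keep on CPU"

if __name__ == "__main__":
    for prof in [MemoryProfile("spmv_kernel", 0.12, 34.0),
                 MemoryProfile("dense_gemm", 8.00, 1.5)]:
        print(prof.name, "->", classify(prof))
```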

    Putting checkpoints to work in thread level speculative execution

    With the advent of Chip Multiprocessors (CMPs), improving performance relies on programmers and compilers to expose thread-level parallelism to the underlying hardware. Unfortunately, this is a difficult and error-prone process for programmers, while state-of-the-art compiler techniques are unable to provide significant benefits for many classes of applications. An interesting alternative is offered by systems that support Thread Level Speculation (TLS), which relieve the programmer and compiler from checking for thread dependencies and instead use the hardware to enforce them. Unfortunately, data misspeculation carries a high cost, since all intermediate results have to be discarded and threads have to roll back to the beginning of the speculative task. For this reason, intermediate checkpointing of the state of TLS threads has been proposed: when a violation does occur, execution rolls back only to a checkpoint before the violating instruction rather than to the start of the task. However, previous work omits the microarchitectural details and implementation issues that are essential for effective checkpointing, and checkpoints have only been proposed and evaluated for a narrow class of benchmarks. This thesis studies checkpoints on a state-of-the-art TLS system running a variety of benchmarks. The mechanisms required for checkpointing and their associated costs are described. Hardware modifications required to make checkpointed execution efficient in time and power are proposed and evaluated. Further, the need to accurately identify suitable points for placing checkpoints is established, and various techniques for identifying these points are analysed in terms of both effectiveness and viability, including an extensive evaluation of data dependence prediction techniques. The results show that checkpointing thread-level speculative execution yields consistent power savings and, for many benchmarks, speedups as well.
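
    A back-of-the-envelope model makes the motivation for checkpointing concrete. The sketch below, with made-up task lengths and checkpoint intervals rather than anything measured in the thesis, compares the work discarded on a misspeculation when rolling back to the start of the task versus to the most recent checkpoint.

```python
# Toy model (not from the thesis): instructions wasted on a TLS
# misspeculation, with and without intermediate checkpoints.
# All numbers below are illustrative.

def wasted_without_checkpoints(violation_at: int) -> int:
    """Roll back to the start of the speculative task."""
    return violation_at

def wasted_with_checkpoints(violation_at: int, interval: int) -> int:
    """Roll back only to the last checkpoint before the violation."""
    last_checkpoint = (violation_at // interval) * interval
    return violation_at - last_checkpoint

if __name__ == "__main__":
    violation_at = 9_500  # instruction index where the violation hits
    interval = 1_000      # instructions between checkpoints
    print("no checkpoints:", wasted_without_checkpoints(violation_at))           # 9500
    print("checkpointed  :", wasted_with_checkpoints(violation_at, interval))    # 500
```

    The gap between the two numbers is why checkpoint placement matters so much: a checkpoint taken just before a likely violation saves nearly the whole task's worth of re-execution, which is what motivates the thesis's evaluation of data dependence prediction for choosing checkpoint locations.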

    Applying Perceptrons to Speculation in Computer Architecture

    Speculation plays an ever-increasing role in optimizing the execution of programs in computer architecture. Speculative decision-makers are typically required to be fast and small, which limits their complexity and capability. Because of these restrictions, predictors often consider only a small subset of the available data when making decisions, and consequently do not realize their potential accuracy. Perceptrons, or simple neural networks, can be highly useful in speculation for their ability to examine larger quantities of available data and identify which data lead to accurate results. Recent research has demonstrated that perceptrons can operate successfully within the strict size and latency restrictions of speculation in computer architecture. This dissertation first studies how perceptrons can be made to predict accurately when they directly replace the traditional pattern-table predictor. Several weight-training methods and multiple-bit perceptron topologies are modeled and evaluated on their ability to learn the data patterns that pattern tables can learn. The effects of interference between past data on perceptrons are evaluated, and different interference-reduction strategies are explored. Perceptrons are then applied to two speculative applications: data value prediction and dataflow critical path prediction. Several new perceptron value predictors are proposed that can consider longer or more varied data histories than existing table-based value predictors. These include a global-based local predictor that uses global correlations between data values to predict past local values, a global-based global predictor that uses global correlations to predict past global values, and a bitwise predictor that can use global correlations to generate new data values. Several new perceptron criticality predictors are proposed that use global correlations between instruction behaviors to accurately determine whether instructions lie on the critical path. These predictors are evaluated against local table-based approaches on a custom cycle-accurate processor simulator, and are shown on average to have both superior accuracy and higher instructions-per-cycle performance. Finally, the perceptron predictors are simulated using the different weight-training approaches and multiple-bit topologies. It is shown that, for these applications, perceptron topologies and training approaches must be chosen that respond well to highly imbalanced and poorly correlated past data patterns.
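
    For context, the sketch below implements the classic hardware perceptron predictor that this line of work builds on, in the style of Jimenez and Lin's branch predictor: a dot product of signed weights with a history of +1/-1 outcomes, trained only on a misprediction or a low-confidence output. The history length, threshold formula, and test pattern are illustrative choices on our part; the dissertation's value and criticality predictors are more elaborate.

```python
# Minimal perceptron predictor in the style of Jimenez & Lin's branch
# predictor, the scheme this dissertation generalizes to value and
# criticality prediction. Sizes and the training threshold are illustrative.

class PerceptronPredictor:
    def __init__(self, history_len: int = 16, theta: int | None = None):
        # Common rule-of-thumb training threshold from the literature.
        self.theta = theta if theta is not None else int(1.93 * history_len + 14)
        self.weights = [0] * (history_len + 1)  # index 0 is the bias weight
        self.history = [1] * history_len        # outcomes encoded as +1 / -1

    def predict(self) -> tuple[bool, int]:
        """Dot product of weights with history; sign gives the prediction."""
        y = self.weights[0] + sum(w * x for w, x in zip(self.weights[1:], self.history))
        return (y >= 0, y)

    def update(self, taken: bool) -> None:
        pred, y = self.predict()
        t = 1 if taken else -1
        # Train only on a misprediction or when confidence is below theta.
        if pred != taken or abs(y) <= self.theta:
            self.weights[0] += t
            for i, x in enumerate(self.history):
                self.weights[i + 1] += t * x
        self.history = [t] + self.history[:-1]  # shift in the newest outcome

if __name__ == "__main__":
    p = PerceptronPredictor()
    correct = 0
    for i in range(1000):               # alternating pattern, learnable from history
        outcome = (i % 2 == 0)
        if p.predict()[0] == outcome:
            correct += 1
        p.update(outcome)
    print(f"accuracy on alternating pattern: {correct / 1000:.2%}")
```

    Because each weight records the correlation between one history bit and the outcome, a perceptron can exploit a long history at linear cost in storage, which is exactly the property the dissertation leans on when feeding predictors longer and more varied data histories than a pattern table can index.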