Search CORE

4,710 research outputs found

Fluid flow queue models for fixed-mobile network evaluation

Author: Bernal Escribano Álvaro
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2019
Field of study

A methodology for fast and accurate end-to-end KPI, like throughput and delay, estimation is proposed based on the service-centric traffic flow analysis and the fluid flow queuing model named CURSA-SQ. Mobile network features, like shared medium and mobility, are considered defining the models to be taken into account such as the propagation models and the fluid flow scheduling model. The developed methodology provides accurate computation of these KPIs, while performing orders of magnitude faster than discrete event simulators like ns-3. Finally, this methodology combined to its capacity for performance estimation in MPLS networks enables its application for near real-time converged fixed-mobile networks operation as it is proven in three use case scenarios

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Optimal Control of Software Ensuring Safety and Functionality

Author: Easwaran Arvind
Kannan Sampath
Lee Insup
Publication venue: ScholarlyCommons
Publication date: 11/10/2005
Field of study

Existing verification and validation methodologies can detect software violations very effectively but fail to provide any mechanism for correcting faults once they are detected. Detection of faults, their diagnosis and corrective actions are all essential components of any software rectification framework. In this paper, we propose a framework for correction of violations in software systems ensuring that the desired goals of the system are achieved. We describe a stochastic finite state machine used to abstract a software system along with the uncertainty in its operating environment. Safety property violations and satisfaction of functionalities are abstracted using penalties and rewards on the states, respectively. Rectification of software is then formulated as a stochastic optimal control problem over this abstraction. Algorithms polynomial in the size of the abstraction have been developed for solving this optimization problem exactly. The paper also applies the developed framework to a variety of examples from different domains

ScholarlyCommons@Penn

Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis

Author: Alger L. S.
Babikyan C. A.
Butler B. P.
Friend S. A.
Ganska R. J.
Harper R. E.
Lala J. H.
Masotto T. K.
Meyer A. J.
Morton D. P.
Publication venue
Publication date
Field of study

Described here is the Army Fault Tolerant Architecture (AFTA) hardware architecture and components and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions

NASA Technical Reports Server

Selectable Heaps and Their Application to Lazy Search Trees

Author: Zhang Lingyi
Publication venue: 'University of Waterloo'
Publication date: 21/04/2021
Field of study

We show the O(log n) time extract minimum function of efficient priority queues can be generalized to the extraction of the k smallest elements in O(k log(n/k)) time. We first show the heap-ordered tree selection of Kaplan et al. can be applied on the heap-ordered trees of the classic Fibonacci heap to support the extraction in O(k \log(n/k)) amortized time. We then show selection is possible in a priority queue with optimal worst-case guarantees by applying heap-ordered tree selection on Brodal queues, supporting the operation in O(k log(n/k)) worst-case time. Via a reduction from the multiple selection problem, Ω(k log(n/k)) time is necessary. We then apply the result to the lazy search trees of Sandlund & Wild, creating a new interval data structure based on selectable heaps. This gives optimal O(B+n) lazy search tree performance, lowering insertion complexity into a gap Δi to O(log(n/|Δi|))$ time. An O(1)-time merge operation is also made possible under certain conditions. If Brodal queues are used, all runtimes of the lazy search tree can be made worst-case. The presented data structure uses soft heaps of Chazelle, biased search trees, and efficient priority queues in a non-trivial way, approaching the theoretically-best data structure for ordered data

University of Waterloo's Institutional Repository

Recommended from our members

Scalable hardware memory disambiguation

Author: Sethumadhavan Lakshminarasimhan, 1978-
Publication venue
Publication date: 01/12/2007
Field of study

This dissertation deals with one of the long-standing problems in Computer Architecture – the problem of memory disambiguation. Microprocessors typically reorder memory instructions during execution to improve concurrency. Such microprocessors use hardware memory structures for memory disambiguation, known as LoadStore Queues (LSQs), to ensure that memory instruction dependences are satisfied even when the memory instructions execute out-of-order. A typical LSQ implementation (circa 2006) holds all in-flight memory instructions in a physically centralized LSQ and performs a fully associative search on all buffered instructions to ensure that memory dependences are satisfied. These LSQ implementations do not scale because they use large, fully associative structures, which are known to be slow and power hungry. The increasing trend towards distributed microarchitectures further exacerbates these problems. As on-chip wire delays increase and high-performance processors become necessarily distributed, centralized structures such as the LSQ can limit scalability. This dissertation describes techniques to create scalable LSQs in both centralized and distributed microarchitectures. The problems and solutions described in this thesis are motivated and validated by real system designs. The dissertation starts with a description of the partitioned primary memory system of the TRIPS processor, of which the LSQ is an important component, and then through a series of optimizations describes how the power, area, and centralization problems of the LSQ can be solved with minor performance losses (if at all) even for large number of in flight memory instructions. The four solutions described in this dissertation — partitioning, filtering, late binding and efficient overflow management — enable power-, area-efficient, distributed and scalable LSQs, which in turn enable aggressive large-window processors capable of simultaneously executing thousands of instructions. To mitigate the power problem, we replaced the power-hungry, fully associative search with a power-efficient hash table lookup using a simple address-based Bloom filter. Bloom filters are probabilistic data structures used for testing set membership and can be used to quickly check if an instruction with the same data address is likely to be found in the LSQ without performing the associative search. Bloom filters typically eliminate more than 80% of the associative searches and they are highly effective because in most programs, it is uncommon for loads and stores to have the same data address and be in execution simultaneously. To rectify the area problem, we observe the fact that only a small fraction of all memory instructions are dependent, that only such dependent instructions need to be buffered in the LSQ, and that these instructions need to be in the LSQ only for certain parts of the pipelined execution. We propose two mechanisms to exploit these observations. The first mechanism, area filtering, is a hardware mechanism that couples Bloom filters and dependence predictors to dynamically identify and buffer only those instructions which are likely to be dependent. The second mechanism, late binding, reduces the occupancy and hence size of the LSQ. Both of these optimizations allows the number of LSQ slots to be reduced by up to one-half compared to a traditional organization without any performance degradation. Finally, we describe a new decentralized LSQ design for handling LSQ structural hazards in distributed microarchitectures. Decentralization of LSQs, and to a large extent distributed microarchitectures with memory speculation, has proved to be impractical because of the high performance penalties associated with the mechanisms for dealing with hazards. To solve this problem, we applied classic flow-control techniques from interconnection networks for handling resource con- flicts. The first method, memory-side buffering, buffers the overflowing instructions in a separate buffer near the LSQs. The second scheme, execution-side NACKing, sends the overflowing instruction back to the issue window from which it is later re-issued. The third scheme, network buffering, uses the buffers in the interconnection network between the execution units and memory to hold instructions when the LSQ is full, and uses virtual channel flow control to avoid deadlocks. The network buffering scheme is the most robust of all the overflow schemes and shows less than 1% performance degradation due to overflows for a subset of SPEC CPU 2000 and EEMBC benchmarks on a cycle-accurate simulator that closely models the TRIPS processor. The techniques proposed in this dissertation are independent, architectureneutral and their cumulative benefits result in LSQs that can be partitioned at a fine granularity and have low design complexity. Each of these partitions selectively buffers only memory instructions with true dependences and can be closely coupled with the execution units thus minimizing power, area, and latency. Such LSQ designs with near-ideal characteristics are well suited for microarchitectures with thousands of instructions in-flight and may enable even more aggressive microarchitectures in the future.Computer Science

Texas ScholarWorks