235,118 research outputs found

    From FPGA to ASIC: A RISC-V processor experience

    Get PDF
    This work document a correct design flow using these tools in the Lagarto RISC- V Processor and the RTL design considerations that must be taken into account, to move from a design for FPGA to design for ASIC

    Towards more accurate real time testing

    Get PDF
    The languages Message Sequence Charts (MSC) [1], System Design Language1 (SDL) [2] and Testing and Test Control Notation Testing2 (TTCN-3) [3] have been developed for the design, modelling and testing of complex software systems. These languages have been developed to complement one another in the software development process. Each of these languages has features for describing, analysing or testing the real time properties of systems. Robust toolsets exist which provide integrated environments for the design, analysis and testing of systems, and it is claimed, for the complete development of real time systems. It was shown in [4] however, that there are fundamental problems with the SDL language and its associated tools for modelling and reasoning about real time systems. In this paper we present the limitations of TTCN-3 and propose recommendations which help minimise the timing inaccuracies that would otherwise occur in using the language directly

    Spatio-temporal wavelet regularization for parallel MRI reconstruction: application to functional MRI

    Get PDF
    Parallel MRI is a fast imaging technique that enables the acquisition of highly resolved images in space or/and in time. The performance of parallel imaging strongly depends on the reconstruction algorithm, which can proceed either in the original k-space (GRAPPA, SMASH) or in the image domain (SENSE-like methods). To improve the performance of the widely used SENSE algorithm, 2D- or slice-specific regularization in the wavelet domain has been deeply investigated. In this paper, we extend this approach using 3D-wavelet representations in order to handle all slices together and address reconstruction artifacts which propagate across adjacent slices. The gain induced by such extension (3D-Unconstrained Wavelet Regularized -SENSE: 3D-UWR-SENSE) is validated on anatomical image reconstruction where no temporal acquisition is considered. Another important extension accounts for temporal correlations that exist between successive scans in functional MRI (fMRI). In addition to the case of 2D+t acquisition schemes addressed by some other methods like kt-FOCUSS, our approach allows us to deal with 3D+t acquisition schemes which are widely used in neuroimaging. The resulting 3D-UWR-SENSE and 4D-UWR-SENSE reconstruction schemes are fully unsupervised in the sense that all regularization parameters are estimated in the maximum likelihood sense on a reference scan. The gain induced by such extensions is illustrated on both anatomical and functional image reconstruction, and also measured in terms of statistical sensitivity for the 4D-UWR-SENSE approach during a fast event-related fMRI protocol. Our 4D-UWR-SENSE algorithm outperforms the SENSE reconstruction at the subject and group levels (15 subjects) for different contrasts of interest (eg, motor or computation tasks) and using different parallel acceleration factors (R=2 and R=4) on 2x2x3mm3 EPI images.Comment: arXiv admin note: substantial text overlap with arXiv:1103.353

    Loop pipelining with resource and timing constraints

    Get PDF
    Developing efficient programs for many of the current parallel computers is not easy due to the architectural complexity of those machines. The wide variety of machine organizations often makes it more difficult to port an existing program than to reprogram it completely. Therefore, powerful translators are necessary to generate effective code and free the programmer from concerns about the specific characteristics of the target machine. This work focuses on techniques to be used by an important class of translators, whose objective is to transform sequential programs into equivalent more parallel programs. The transformations are performed at instruction level in order to exploit low level parallelism and increase memory locality.Most of the current applications are programmed in languages which do not allow us to express parallelism between high-level sentences (as Pascal, C or Fortran). Furthermore, a lot of applications written ten or more years ago are still used today, and it is not feasible to rewrite such applications for many reasons (not only technical reasons, but also economic ones). Translators enable programmers to write the application in a familiar sequential programming language, without concerning their selves with the architecture of the target machine. Current compilers for parallel architectures not only translate a program written on a high-level language to the appropriate machine language, but also perform some transformations in the final code in order to execute the program in a more parallel way. The transformations improve the performance in the execution of the program by making use of the knowledge that the compiler has about the machine architecture. The semantics of the program remain intact after any transformation.Experiments show that limiting parallelization to basic blocks not included in loops limits maximum speedup. This is because loops often comprise a large portion of the parallelism available to be exploited in a program. For this reason, a lot of effort has been devoted in the recent years to parallelize loop execution. Several parallel computer architectures and compilation techniques have been proposed to exploit such a parallelism at different granularities. Multiprocessors exploit coarse grained parallelism by distributing entire loop iterations to different processors. Systems oriented to the high-level synthesis (HLS) of VLSI circuits, superscalar processors and very long instruction word (VLIW) processors exploit fine-grained parallelism at instruction level. This work addresses fine-grained parallelization of loops addressed to the HLS of VLSI circuits. Two algorithms are proposed for resource constraints and for timing constraints. An algorithm to reduce the number of registers required to execute a loop in a given architecture is also proposed.Postprint (published version

    Improving early design stage timing modeling in multicore based real-time systems

    Get PDF
    This paper presents a modelling approach for the timing behavior of real-time embedded systems (RTES) in early design phases. The model focuses on multicore processors - accepted as the next computing platform for RTES - and in particular it predicts the contention tasks suffer in the access to multicore on-chip shared resources. The model presents the key properties of not requiring the application's source code or binary and having high-accuracy and low overhead. The former is of paramount importance in those common scenarios in which several software suppliers work in parallel implementing different applications for a system integrator, subject to different intellectual property (IP) constraints. Our model helps reducing the risk of exceeding the assigned budgets for each application in late design stages and its associated costs.This work has received funding from the European Space Agency under Project Reference AO=17722=13=NL=LvH, and has also been supported by the Spanish Ministry of Science and Innovation grant TIN2015-65316-P. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.Peer ReviewedPostprint (author's final draft
    • …
    corecore