11 research outputs found

    Prototypage rapide d'un decodeur mpeg-4 optimise sur architectures heterogenes paralleles

    Current MPEG-4 solutions are sequential and try to integrate as many features as possible into a single piece of software; they are generally oversized compared with the services actually needed. Moreover, they are hard to use in a multiprocessor context because of their large code and data sizes, and because they make suboptimal use of the architecture's parallelism. This paper presents a distributed MPEG-4 application in which the system part runs on a standard PC while the compute-intensive video decoding is handled by a multi-DSP board. We present the AVS/SynDEx methodology used to create this application. AVS/SynDEx allows the video decoder to be updated easily and supports quasi-automatic prototyping on a multi-C6x platform. We also define a global schedule that allows the system part and the video decoding to run in parallel.

    A new parallel SystemC kernel leveraging manycore architectures

    Conference of 19th Design, Automation and Test in Europe Conference and Exhibition, DATE 2016; Conference Date: 14 March 2016 through 18 March 2016; Conference Code: 121520. International audience.
    The complexity of system-level modeling is continuously increasing. Electronic System Level (ESL) design requires fast simulation techniques to control future SoC development cost and time-to-market. However, SystemC simulations are sequential and therefore limited by single-thread performance. In this paper, we present a new parallel SystemC kernel that efficiently leverages the multiple cores of a host machine, reaching high simulation performance without relaxing accuracy. It supports atomic parallel evaluation of SystemC processes and repeatable execution for HW/SW debugging. This new kernel is fully compliant with existing standards and easy to integrate into any existing SystemC model. Evaluations show a maximum acceleration of 34× compared to Accellera SystemC on a 64-core AMD Opteron machine.

    Highly-parallel special-purpose multicore architecture for SystemC/TLM simulations

    Conference of 14th International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, SAMOS 2014; Conference Date: 14 July 2014 through 17 July 2014; Conference Code: 114504. International audience.
    The complexity of SystemC virtual prototyping is continuously increasing. Accelerating RTL/TLM SystemC simulations is essential to control future SoC development cost and time-to-market. In this paper, we present RAVES, a highly-parallel special-purpose multicore architecture that achieves simulation performance more efficiently by parallel execution of light-weight user-level threads on many small cores. We present a design study based on the virtual prototype of RAVES processors running a co-designed custom SystemC kernel. Our evaluation suggests that a 64-core RAVES processor can deliver up to 4.47× more simulation performance than a high-end x86 processor.

    A context saving fault tolerant approach for a shared memory many-core architecture

    Conference of IEEE International Symposium on Circuits and Systems, ISCAS 2015; Conference Date: 24 May 2015 through 27 May 2015; Conference Code: 115760. International audience.
    Mechanisms for runtime fault tolerance in many-core architectures are mandatory to cope with transient and permanent faults. This issue is even more relevant with aggressive technology nodes due to process variability, aging effects, and susceptibility to upsets, among other factors. This work proposes to periodically save the context and, when a fault occurs, to re-schedule tasks from the last known reliable state while avoiding the faulty processor. This technique is implemented on an embedded multicore architecture named P2012. The proposed fault-tolerant approach induces a limited overhead of 9.37% in an industrial image processing application while guaranteeing full error recovery if any error is detected.
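
The save-and-roll-back idea summarized in this abstract can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the P2012 implementation: the `Task` class, the `run_with_recovery` function, the processor names, and the single injected fault are all hypothetical, and the context is copied in software with `copy.deepcopy`.

```python
import copy
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task: sums the integers 0..n-1 one step at a time."""
    n: int
    i: int = 0
    total: int = 0

    def step(self):
        self.total += self.i
        self.i += 1

    @property
    def done(self):
        return self.i >= self.n


def run_with_recovery(task, processors, checkpoint_every, fault_at_step=None):
    """Run `task`, saving its context every `checkpoint_every` steps.

    `fault_at_step` injects one simulated fault on the current processor
    the first time the task is about to execute that step (an assumption
    made for illustration only).
    Returns (result, healthy processors, number of re-executed steps).
    """
    healthy = list(processors)
    current = healthy[0]
    checkpoint = copy.deepcopy(task)      # last known reliable context
    redone = 0
    fault_pending = fault_at_step is not None

    while not task.done:
        if fault_pending and task.i == fault_at_step:
            fault_pending = False
            healthy.remove(current)            # avoid the faulty processor
            if not healthy:
                raise RuntimeError("no healthy processor left")
            current = healthy[0]               # re-schedule elsewhere
            redone += task.i - checkpoint.i    # work lost since the checkpoint
            task = copy.deepcopy(checkpoint)   # roll back to the saved context
            continue
        task.step()
        if task.i % checkpoint_every == 0:
            checkpoint = copy.deepcopy(task)   # periodic context save

    return task.total, healthy, redone
```

In this toy model the cost of a fault is the work done since the last checkpoint, so a shorter checkpoint period trades more frequent context saves for less re-execution.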

    Protein-tyrosine phosphorylation interaction network in Bacillus subtilis reveals new substrates, kinase activators and kinase cross-talk

    Signal transduction in eukaryotes is generally transmitted through phosphorylation cascades that involve a complex interplay of transmembrane receptors, protein kinases, phosphatases and their targets. Our previous work indicated that bacterial protein-tyrosine kinases and phosphatases may exhibit similar properties, since they act on many different substrates. To capture the complexity of this phosphorylation-based network, we performed a comprehensive interactome study focused on the protein-tyrosine kinases and phosphatases in the model bacterium Bacillus subtilis. The resulting network identified many potential new substrates of kinases and phosphatases, some of which were experimentally validated. Our study highlighted the role of tyrosine and serine/threonine kinases and phosphatases in DNA metabolism, transcriptional control and cell division. This interaction network reveals significant crosstalk among different classes of kinases. We found that tyrosine kinases can bind to several modulators, transmembrane or cytosolic, consistent with a branching of signaling pathways. Most particularly, we found that the division site regulator MinD can form a complex with the tyrosine kinase PtkA and modulate its activity in vitro. In vivo, it acts as a scaffold protein which anchors the kinase at the cell pole. This network highlighted a role of tyrosine phosphorylation in the spatial regulation of the Z-ring during cytokinesis

    Beyond Do Loops: Data Transfer Generation with Convex Array Regions

    Automatic data transfer generation is a critical step for guided or automatic code generation for accelerators using distributed memories. Although good results have been achieved for loop nests, more complex control flows such as switches or while loops are generally not handled. This paper shows how to leverage the convex array regions abstraction to generate data transfers. The scope of this study ranges from inter-procedural analysis in simple loop nests with function calls, to inter-iteration data reuse optimization and arbitrary control flow in loop bodies. Generated transfers are approximated when an exact solution cannot be found. Array regions are also used to extend redundant load store elimination to array variables. The approach has been successfully applied to GPUs and domain-specific hardware accelerators.
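
As a rough illustration of transfers derived from array regions, the sketch below reduces regions to one-dimensional index intervals and over-approximates a branching loop body by the convex hull of the regions each branch may read. It is a simplification made for this example and not the paper's analysis: the `Region` class, its `shift` and `hull` methods, the `body` function, and the list slice standing in for a host-to-device copy are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """1-D convex region of array indices: lo <= index < hi."""
    lo: int
    hi: int

    def shift(self, offset):
        """Region touched by a[i + offset] when i ranges over this region."""
        return Region(self.lo + offset, self.hi + offset)

    def hull(self, other):
        """Smallest convex region containing both (may over-approximate)."""
        return Region(min(self.lo, other.lo), max(self.hi, other.hi))


def body(i, buf, base):
    """Loop body with a branch; `base` maps global indices into `buf`."""
    if i % 2 == 0:
        return buf[i - base]        # reads a[i]
    return buf[i + 2 - base]        # reads a[i + 2]


host = list(range(100))
iterations = Region(2, 6)           # for i in range(2, 6)

# Either branch may run, so the transferred region must cover both reads.
may_read = iterations.shift(0).hull(iterations.shift(2))

device = host[may_read.lo:may_read.hi]   # simulated host-to-device copy
result = [body(i, device, may_read.lo) for i in range(iterations.lo, iterations.hi)]
reference = [body(i, host, 0) for i in range(iterations.lo, iterations.hi)]
```

Only the six elements in `may_read` are copied instead of the whole array, and `result` matches `reference` computed directly on the host data. The hull is safe but not exact: it includes `a[3]` and `a[6]`, which this particular loop never reads.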

    PNeuro: A scalable energy-efficient programmable hardware accelerator for neural networks

    Proceedings of a meeting held 19-23 March 2018, Dresden, Germany. International audience.
    Artificial intelligence, and especially machine learning, has recently gained a lot of interest from industry. Indeed, a new generation of neural networks built with a large number of successive computing layers enables a large number of new applications and services, implemented from smart sensors to data centers. These Deep Neural Networks (DNN) can interpret signals to recognize objects or situations and drive decision processes. However, their integration into embedded systems remains challenging due to their high computing needs. This paper presents PNeuro, a scalable energy-efficient hardware accelerator for the inference phase of DNN processing chains. Simple programmable processing elements organized in SIMD clusters perform all the operations needed by DNN (convolutions, pooling, non-linear functions, etc.). An FDSOI 28 nm prototype shows an energy efficiency of 700 GMACS/s/W at 800 MHz. These results open important perspectives regarding the development of smart energy-efficient solutions based on Deep Neural Networks.

    Interaction of bacterial fatty-acid-displaced regulators with DNA is interrupted by tyrosine phosphorylation in the helix-turn-helix domain

    Bacteria possess transcription regulators (of the TetR family) specifically dedicated to repressing genes for cytochrome P450, involved in oxidation of polyunsaturated fatty acids. Interaction of these repressors with operator sequences is disrupted in the presence of fatty acids, and they are therefore known as fatty-acid-displaced regulators. Here, we describe a novel mechanism of inactivating the interaction of these proteins with DNA, illustrated by the example of Bacillus subtilis regulator FatR. FatR was found to interact in a two-hybrid assay with TkmA, an activator of the protein-tyrosine kinase PtkA. We show that FatR is phosphorylated specifically at the residue tyrosine 45 in its helix-turn-helix domain by the kinase PtkA. Structural modelling reveals that the hydroxyl group of tyrosine 45 interacts with DNA, and we show that this phosphorylation reduces FatR DNA binding capacity. Point mutants mimicking phosphorylation of FatR in vivo lead to a strong derepression of the fatR operon, indicating that this regulatory mechanism works independently of derepression by polyunsaturated fatty acids. Tyrosine 45 is a highly conserved residue, and PtkA from B. subtilis can phosphorylate FatR homologues from other bacteria. This indicates that phosphorylation of tyrosine 45 may be a general mechanism of switching off bacterial fatty-acid-displaced regulators

    Rapid Prototyping for Heterogeneous Multicomponent Systems: An MPEG-4 Stream over a UMTS Communication Link

    Future generations of mobile phones, including advanced video and digital communication layers, represent a great challenge in terms of real-time embedded systems. Programmable multicomponent architectures can provide suitable target solutions combining flexibility and computation power. The aim of our work is to develop a fast and automatic prototyping methodology dedicated to signal processing application implementation on parallel heterogeneous architectures, two major features required by future systems. This paper presents the whole methodology based on the SynDEx CAD tool that directly generates a distributed implementation onto various platforms from a high-level application description, taking real-time aspects into account. It illustrates the methodology in the context of real-time distributed executives for multilayer applications based on an MPEG-4 video codec and a UMTS telecommunication link.