117 research outputs found

    Hard real-time performances in multiprocessor-embedded systems using ASMP-Linux

    Multiprocessor systems, especially those based on multicore or multithreaded processors, together with new operating system architectures, can satisfy the ever-increasing computational requirements of embedded systems. ASMP-Linux is a modified, highly responsive, open-source hard real-time operating system for multiprocessor systems. It provides high real-time performance while keeping the code simple and without impacting the performance of the rest of the system. Moreover, ASMP-Linux does not require code changes or application recompiling/relinking. To assess the performance of ASMP-Linux, benchmarks have been performed on several hardware platforms and configurations.
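    The core idea behind ASMP-style systems is to dedicate one or more processors exclusively to hard real-time tasks while the remaining processors run the general-purpose workload. The abstract contains no code, but a minimal sketch of the same partitioning idea on a stock Linux system, using the standard sched_setaffinity and sched_setscheduler interfaces (both real APIs; the CPU number and priority are illustrative assumptions), could look like this:

    /* Sketch: pin the calling process to an "isolated" CPU and give it a
     * real-time FIFO priority -- an approximation of ASMP-style CPU
     * partitioning on stock Linux.  CPU 3 and priority 80 are arbitrary
     * example values, not taken from the paper. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(3, &set);                      /* assumed dedicated CPU */
        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return EXIT_FAILURE;
        }

        struct sched_param sp = { .sched_priority = 80 };  /* example */
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
            perror("sched_setscheduler");      /* typically needs root */
            return EXIT_FAILURE;
        }

        /* ... hard real-time work would run here ... */
        return EXIT_SUCCESS;
    }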

    Towards Work-Efficient Parallel Parameterized Algorithms

    Parallel parameterized complexity theory studies how fixed-parameter tractable (fpt) problems can be solved in parallel. Previous theoretical work focused on parallel algorithms that are very fast in principle, but did not take into account that when we only have a small number of processors (between 2 and, say, 1024), it is more important that the parallel algorithms are work-efficient. In the present paper we investigate how work-efficient fpt algorithms can be designed. We review standard methods from fpt theory, like kernelization, search trees, and interleaving, and prove trade-offs for them between work efficiency and runtime improvements. This results in a toolbox for developing work-efficient parallel fpt algorithms.
    Comment: Prior full version of the paper that will appear in Proceedings of the 13th International Conference and Workshops on Algorithms and Computation (WALCOM 2019), February 27 - March 02, 2019, Guwahati, India. The final authenticated version is available online at https://doi.org/10.1007/978-3-030-10564-8_2
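    To illustrate the kind of method the paper analyzes: the classic bounded search tree for k-Vertex Cover branches on an arbitrary uncovered edge {u, v}; one of u or v must be in the cover, giving a tree of depth at most k with O(2^k) nodes, and a parallel variant can explore the two branches concurrently. The sketch below is not from the paper; it shows the sequential skeleton on an assumed toy adjacency-matrix graph:

    /* Sketch: classic O(2^k) bounded search tree for k-Vertex Cover on a
     * small graph stored as an adjacency matrix.  Toy illustration only;
     * the paper studies work-efficient *parallel* versions of such trees. */
    #include <stdbool.h>
    #include <stdio.h>

    #define N 5                         /* number of vertices (example) */
    static bool adj[N][N];              /* adjacency matrix, upper triangle */
    static bool in_cover[N];

    /* Returns true if the uncovered edges admit a cover of size <= k. */
    static bool vc(int k)
    {
        /* Find any uncovered edge {u, v}. */
        for (int u = 0; u < N; u++)
            for (int v = u + 1; v < N; v++)
                if (adj[u][v] && !in_cover[u] && !in_cover[v]) {
                    if (k == 0)
                        return false;   /* budget exhausted, edge uncovered */
                    /* Branch 1: put u in the cover.  (A parallel version
                     * would explore the two branches concurrently.) */
                    in_cover[u] = true;
                    if (vc(k - 1)) return true;
                    in_cover[u] = false;
                    /* Branch 2: put v in the cover. */
                    in_cover[v] = true;
                    if (vc(k - 1)) return true;
                    in_cover[v] = false;
                    return false;
                }
        return true;                    /* no uncovered edge left */
    }

    int main(void)
    {
        adj[0][1] = adj[1][2] = adj[2][3] = adj[3][4] = true;  /* a path */
        printf("cover of size 2 %s\n", vc(2) ? "exists" : "does not exist");
        return 0;
    }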

    Portable, scalable, per-core power estimation for intelligent resource management

    Performance, power, and temperature are now all first-order design constraints. Balancing power efficiency, thermal constraints, and performance requires some means to convey data about real-time power consumption and temperature to intelligent resource managers. Resource managers can use this information to meet performance goals, maintain power budgets, and obey thermal constraints. Unfortunately, obtaining the required machine introspection is challenging. Most current chips provide no support for per-core power monitoring, and when support exists, it is not exposed to software. We present a methodology for deriving per-core power models using sampled performance counter values and temperature sensor readings. We develop application-independent models for four different (four- to eight-core) platforms, validate their accuracy, and show how they can be used to guide scheduling decisions in power-aware resource managers. Model overhead is negligible, and estimates exhibit 1.1%-5.2% per-suite median error on the NAS, SPEC OMP, and SPEC 2006 benchmarks (and 1.2%-4.4% overall).
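    The methodology fits per-core power models offline from sampled counter and sensor data; at run time the estimate reduces to an inexpensive combination of a few event rates. A minimal sketch of such a run-time estimator follows; the chosen events and the coefficients are illustrative assumptions, not the paper's fitted values:

    /* Sketch: run-time per-core power estimate as a linear combination of
     * performance-counter rates plus a temperature term.  The events and
     * coefficients are illustrative assumptions; the paper derives
     * platform-specific weights by offline fitting. */
    #include <stdio.h>

    struct core_sample {
        double ipc;             /* instructions per cycle */
        double l2_miss_rate;    /* L2 misses per 1k instructions */
        double mem_rate;        /* memory accesses per 1k instructions */
        double temp_c;          /* core temperature sensor, Celsius */
    };

    /* Example coefficients (would come from offline regression). */
    static const double C0 = 4.2;   /* idle/static watts */
    static const double C1 = 3.1;   /* weight of IPC */
    static const double C2 = 0.08;  /* weight of L2 miss rate */
    static const double C3 = 0.05;  /* weight of memory rate */
    static const double C4 = 0.02;  /* weight of temperature (leakage) */

    static double estimate_power_w(const struct core_sample *s)
    {
        return C0 + C1 * s->ipc + C2 * s->l2_miss_rate
                  + C3 * s->mem_rate + C4 * s->temp_c;
    }

    int main(void)
    {
        struct core_sample s = { .ipc = 1.4, .l2_miss_rate = 6.0,
                                 .mem_rate = 12.0, .temp_c = 58.0 };
        printf("estimated core power: %.2f W\n", estimate_power_w(&s));
        return 0;
    }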

    Independent Sets in Restricted Line of Sight Networks

    Line of Sight (LoS) networks were designed to model wireless networks in settings that may contain obstacles restricting the visibility of sensors.

    Sparse parameterized problems

    Sparse languages play an important role in classical structural complexity theory. In this paper we introduce a natural definition of sparse problems for parameterized complexity theory. We prove an analog of Mahaney's theorem: there is no sparse parameterized problem which is hard for the t-th level of the W hierarchy, unless the W hierarchy itself collapses up to level t. The main result is proved for the most general form of parametric many:1 reducibility, where the parameter functions are not assumed to be recursive. This provides one of the few instances in parameterized complexity theory of a full analog of a major classical theorem. The proof involves not only the standard technique of left sets, but also substantial circuit combinatorics to deal with the problem of small weft, and a diagonalization to cope with potentially nonrecursive parameter functions. The latter techniques are potentially of interest for further explorations of parameterized complexity analogs of classical structural results.

    Real-time cache management framework for multi-core architectures

    Multi-core architectures are shaking the fundamental assumption of real-time systems that the WCET (worst-case execution time) used to compute the schedulability of the complete system can be calculated on individual tasks in isolation. This no longer holds, even approximately, on a modern multi-core chip, due to the interference caused by hardware resource sharing. In this work we propose a complete framework that (1) analyzes and profiles task memory access patterns and (2) provides a novel kernel-level cache management technique to enforce a deterministic cache allocation of the most frequently accessed memory areas. In this way, we provide a powerful tool to address the main sources of interference in a system where the last-level cache is shared among two or more CPUs. The technique has been implemented on commercial hardware, and our evaluations show that it can be used to significantly improve the predictability of a given set of critical tasks.
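    A common kernel-level mechanism for deterministic last-level-cache allocation is page coloring: physical pages that map to disjoint groups of cache sets are given different "colors", and a critical task's hot memory areas are backed only by pages of its reserved colors. The sketch below illustrates the general technique, not necessarily the paper's actual implementation, and the cache geometry values are example assumptions:

    /* Sketch: page coloring for a shared last-level cache.  Pages whose
     * physical addresses map to different groups of cache sets get
     * different colors; reserving colors per task yields a deterministic
     * cache partition.  Geometry values below are example assumptions. */
    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SIZE      4096u
    #define LLC_SIZE       (8u * 1024 * 1024)   /* 8 MiB shared LLC */
    #define LLC_WAYS       16u

    /* Colors = (sets covered by one way) / (sets touched by one page). */
    #define WAY_SIZE       (LLC_SIZE / LLC_WAYS)        /* 512 KiB */
    #define NUM_COLORS     (WAY_SIZE / PAGE_SIZE)       /* 128 colors */

    static unsigned page_color(uint64_t phys_addr)
    {
        return (unsigned)((phys_addr / PAGE_SIZE) % NUM_COLORS);
    }

    int main(void)
    {
        /* Two pages one way-size apart collide in the same cache sets... */
        printf("color(0x%x) = %u\n", 0u,        page_color(0));
        printf("color(0x%x) = %u\n", WAY_SIZE,  page_color(WAY_SIZE));
        /* ...while adjacent pages land in different colors. */
        printf("color(0x%x) = %u\n", PAGE_SIZE, page_color(PAGE_SIZE));
        return 0;
    }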

    A global operating system for HPC clusters

    Modern supercomputers consist of clusters of thousands of independent nodes interconnected through fast networks. These nodes run independent operating system kernels, so synchronization among them is left to user-mode programs; as a consequence, temporal synchronization of the nodes is a daunting task. On the other hand, HPC cluster applications often require rather strict temporal synchronization for activities like performance analysis, application debugging, or data checkpointing. The performance of an HPC parallel application may therefore be severely impaired by the lack of temporal synchronization among the activities of the nodes of the cluster; this poses a severe limit on the scalability of such architectures. In this paper we introduce CAOS, an extension of the Linux kernel that aims to address the temporal synchronization problems of modern HPC clusters. We describe the general ideas behind CAOS and discuss some details of a possible implementation. We also illustrate experiments performed on a prototype implementation of CAOS that includes a centralized network time tick, which allows a master node to synchronize the activities of all other nodes in the cluster, and a task scheduler tailored for HPC applications. These experiments, performed on a modern HPC cluster, show that the new component has no measurable impact on the efficiency of the nodes while reducing OS noise and providing better performance prediction. An implementation of CAOS based on this component can achieve a significant gain in terms of synchronization, global control, and scalability of the cluster.
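    The centralized network time tick amounts to a master node periodically broadcasting a tick that the other nodes use as a common time base for their scheduling decisions. A minimal user-space sketch of this idea follows; it is illustrative only, since CAOS implements the mechanism inside the kernel, and the port, period, and broadcast address are assumptions:

    /* Sketch: user-space analog of a centralized network time tick -- a
     * master broadcasts a sequence number every 10 ms; nodes that receive
     * it can align their scheduling epochs to the same global tick.
     * Port, period, and broadcast address are illustrative assumptions;
     * CAOS implements this inside the kernel. */
    #include <arpa/inet.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        int on = 1;
        setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on));

        struct sockaddr_in dst;
        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(5555);                      /* assumed port */
        dst.sin_addr.s_addr = inet_addr("255.255.255.255");

        for (uint64_t seq = 0; ; seq++) {                /* global tick */
            sendto(fd, &seq, sizeof(seq), 0,
                   (struct sockaddr *)&dst, sizeof(dst));
            usleep(10 * 1000);                           /* 10 ms period */
        }
        /* not reached */
    }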