1,917 research outputs found

    Precision analysis for hardware acceleration of numerical algorithms

    The precision used in an algorithm affects the error and performance of individual computations, the memory usage, and the potential parallelism for a fixed hardware budget. However, when migrating an algorithm onto hardware, the potential improvements that can be obtained by tuning the precision throughout the algorithm to meet a range or error specification are often overlooked; the major reason is that it is hard to choose a number system which can guarantee that any such specification is met. Instead, the problem is mitigated by opting for IEEE standard double-precision arithmetic so as to be ‘no worse’ than a software implementation. However, flexibility in the number representation is one of the key factors that can be exploited on reconfigurable hardware such as FPGAs, and ignoring this potential significantly limits the achievable performance. To optimise the performance of hardware reliably, we require a method that can tractably calculate tight bounds on the error or range of any variable within an algorithm, but currently only a handful of methods to calculate such bounds exist, and these sacrifice either tightness or tractability, whilst simulation-based methods cannot guarantee the error estimates they give. This thesis presents a new method to calculate these bounds, taking into account both input ranges and finite-precision effects, which we show to be, in general, tighter than existing methods; this in turn can be used to tune the hardware to the algorithm's specifications. We demonstrate the use of this software to optimise hardware for various algorithms that accelerate the solution of a system of linear equations, which forms the basis of many problems in engineering and science, and show that significant performance gains can be obtained by using this new approach in conjunction with more traditional hardware optimisations.
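    The bound-calculation problem can be made concrete with plain interval arithmetic, the baseline this thesis improves upon: ranges are propagated through each operation and widened to account for finite-precision rounding. The sketch below is illustrative only (the class, the 2**-10 rounding step, and the example data path are our own, not the thesis's tool), and it exhibits exactly the looseness the thesis targets, since intervals ignore correlations between variables.

```python
# Minimal interval-arithmetic sketch: propagate ranges through a data
# path and widen each result to model worst-case rounding error.
class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

    def rounded(self, ulp):
        # Widen by one unit in the last place: the worst-case error of
        # a single finite-precision operation at this word length.
        return Interval(self.lo - ulp, self.hi + ulp)

# Range of y = a*x + b for x in [-1, 1] with 2**-10 rounding per op.
x, a, b = Interval(-1.0, 1.0), Interval(0.5, 0.5), Interval(0.25, 0.25)
y = ((a * x).rounded(2**-10) + b).rounded(2**-10)
print(y.lo, y.hi)  # guaranteed, if conservative, bounds on y
```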

    TIME-PREDICTABLE EXECUTION OF EMBEDDED SOFTWARE ON MULTI-CORE PLATFORMS

    Ph.D. (Doctor of Philosophy)

    Conservative collision prediction and avoidance for stochastic trajectories in continuous time and space

    Existing work in multi-agent collision prediction and avoidance typically assumes discrete-time trajectories with Gaussian uncertainty, or trajectories that are completely deterministic. We propose an approach that allows detection of collisions even between continuous, stochastic trajectories, with the only restriction that means and variances can be computed. To this end, we employ probabilistic bounds to derive criterion functions whose negative sign provably indicates probable collisions. For criterion functions that are Lipschitz, an algorithm is provided to rapidly find negative values or prove their absence. We propose an iterative policy-search approach that avoids prior discretisations and yields collision-free trajectories with adjustably high certainty. We test our method with both fixed-priority and auction-based protocols for coordinating the iterative planning process. Results are provided in collision-avoidance simulations of feedback-controlled plants. This preprint is an extended version of a conference paper to appear in the Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2014).
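    As a hedged illustration of such a criterion function, the sketch below uses a Chebyshev-style bound: it needs only the means and standard deviations of the two trajectories at each instant, and a positive value conservatively certifies that the collision probability is below delta. The constant k = sqrt(2/delta) and the example trajectories are our own illustrative choices, not the exact bound derived in the paper.

```python
import numpy as np

def criterion(mu1, sig1, mu2, sig2, d_safe, delta):
    """Chebyshev-style criterion: a positive value conservatively
    certifies P(agents closer than d_safe) < delta at this instant."""
    k = np.sqrt(2.0 / delta)  # union bound over both agents' deviations
    return np.linalg.norm(mu1 - mu2) - d_safe - k * (sig1 + sig2)

# Two crossing trajectories, sampled on a grid; a Lipschitz constant for
# the criterion would let the gaps between samples be certified as well.
ts = np.linspace(0.0, 5.0, 200)
f = [criterion(np.array([t, 0.0]), 0.1,
               np.array([5.0 - t, 0.1 * t]), 0.1,
               d_safe=0.5, delta=0.05) for t in ts]
print(min(f))  # negative values flag probable collisions
```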

    An Abstraction-Refinement Theory for the Analysis and Design of Concurrent Real-Time Systems

    Concurrent real-time systems with shared resources belong to the class of safety-critical systems for which it is required to determine both temporally and functionally conservative guarantees. However, the growing complexity of real-time systems makes it more and more challenging to apply standard techniques for their analysis. Especially the presence of both cyclic data dependencies and cyclic resource dependencies makes many related analysis approaches inapplicable. The usage of Static Priority Preemptive (SPP) scheduling further impedes the employment of many "classical" analysis techniques. To address this growing complexity, and to be able to give guarantees nevertheless, we present an abstraction-refinement theory for real-time systems. We introduce a timed component model that is defined in such a generic way that both real-time system implementations and any kind of analysis model for such applications can be expressed therein. Thereafter, we devise three different abstraction-refinement theories for the timed component model: exclusion, inclusion, and bounding. Exclusion can be used to remove unconsidered corner cases, inclusion allows for the substitution of uncertainty with non-determinism, while bounding permits replacing non-determinism with determinism. The latter enables the creation of efficiently analyzable models that can be used to give temporal or functional guarantees on non-deterministic and non-monotone implementations. We use such abstractions to construct analysis models from concurrent real-time systems with shared resources and SPP scheduling. On these models we apply various analysis techniques, with the goal of increasing analysis accuracy. Our first accuracy improvement is achieved by combining the rather coarse state-of-the-art period-and-jitter interference characterization with an explicit consideration of cyclic data dependencies. The interference-limiting effect of such cycles can be exploited even more with an "iterative buffer sizing". Next, we replace period-and-jitter with execution intervals, resulting in an even higher accuracy. In our last approach we increase both accuracy and applicability by enabling the support of real-time systems with tasks consisting of multiple phases and operating at different rates. With a modification of this approach we further enable the analysis of applications with multiple shared resources. Finally, we also present the so-called HAPI simulator, which is capable of simulating any kind of concurrent real-time system with shared resources.
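    The period-and-jitter interference characterization the thesis starts from can be stated compactly: under SPP scheduling, a task's worst-case response time is the smallest fixed point of a recursion in which every higher-priority task with period T and jitter J interferes ceil((R + J) / T) times. The sketch below shows that baseline recursion with invented task parameters; the thesis's contribution is to tighten precisely this characterization using cyclic dependencies and execution intervals.

```python
import math

def wcrt(C, hp):
    """Worst-case response time of a task with execution time C under
    SPP scheduling; hp holds (C_j, T_j, J_j) triples for each
    higher-priority task with period T_j and release jitter J_j."""
    R = C
    while True:
        R_new = C + sum(math.ceil((R + J) / T) * c for (c, T, J) in hp)
        if R_new == R:
            return R
        R = R_new

print(wcrt(3, hp=[(1, 4, 1), (2, 10, 2)]))  # -> 7
```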

    Model-driven timing analysis of embedded software

    Ph.D. (Doctor of Philosophy)

    Converting existing analysis to the EDP resource model

    In (hard) real-time embedded systems, it is necessary to guarantee that tasks always meet their deadlines, i.e., results should be neither too early nor too late. In the context of fixed-priority systems, this is usually done by performing schedulability analysis, in which the (best-case and) worst-case response time of each task is computed and compared with its (best-case and) worst-case deadline to determine schedulability. Resource reservation has been proposed as a means to provide temporal isolation between applications. Building upon this notion, hierarchical scheduling frameworks for different resource models have been proffered in the literature, with complementary schedulability conditions. Unfortunately, these novel ideas do not directly allow for the reuse of existing results, but rather favor derivations from first principles. In this document, we investigate a means to reuse existing results from non-hierarchical scheduling theory by modeling the unavailability of a resource in a two-level hierarchical framework using two fictive tasks with the highest priorities. We show that this novel method using our unavailability model not only allows for unifying the analysis but can also be easily applied in determining linear response-time upper bounds. For the latter, we also consider approaches for obtaining tighter bounds for harmonic tasks.
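    A hedged sketch of the reuse idea: under an EDP resource model (Pi, Theta, Delta), where the resource supplies Theta units of processing every period Pi within deadline Delta, the processor's unavailability is encoded as two fictive highest-priority tasks, after which an unchanged non-hierarchical response-time analysis can be run. The parameterisation of the two fictive tasks below (a recurring gap of Pi - Theta every Pi, plus a one-off extra blackout of Delta - Theta) is our illustrative reading, not necessarily the paper's exact construction.

```python
import math

def wcrt(C, hp):
    """Standard non-hierarchical response-time iteration; hp holds
    (C_j, T_j) pairs of higher-priority tasks."""
    R = C
    while True:
        R_new = C + sum(math.ceil(R / T) * c for (c, T) in hp)
        if R_new == R:
            return R
        R = R_new

def fictive_tasks(Pi, Theta, Delta, horizon=10**9):
    recurring = (Pi - Theta, Pi)        # Pi - Theta units withheld every Pi
    startup = (Delta - Theta, horizon)  # extra blackout, interferes once
    return [recurring, startup]

# Task with C = 2 and one real higher-priority task (C=1, T=10) running
# on an EDP resource with Pi=5, Theta=3, Delta=4:
hp = fictive_tasks(5, 3, 4) + [(1, 10)]
print(wcrt(2, hp))
```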

    AN INVESTIGATION INTO PARTITIONING ALGORITHMS FOR AUTOMATIC HETEROGENEOUS COMPILERS

    Automatic Heterogeneous Compilers (AHCs) allow blended hardware-software solutions to be explored without the cost of a full-fledged design team, but limited research exists on current partitioning algorithms, which are responsible for separating hardware from software. The purpose of this thesis is to implement various partitioning algorithms on the same automatic heterogeneous compiler platform to create an apples-to-apples comparison of AHC partitioning algorithms. Both the estimated and actual outcomes of the solutions generated are studied and scored. The platform used to implement the algorithms is Cal Poly’s own Twill compiler, created by Doug Gallatin last year. Twill’s original partitioning algorithm is chosen along with two other partitioning algorithms: Tabu Search + Simulated Annealing (TSSA) and Genetic Search (GS). These algorithms are implemented inside Twill, and test bench input code from the CHStone HLS benchmark suite is used as stimulus. Along with the algorithms’ cost models, one key attribute of interest is the number of queues generated, as each cut between hardware and software requires a queue to pass data across the partition crossing. These high communication costs can end up damaging the heterogeneous solution’s performance. The Genetic, TSSA, and Twill’s original partitioning algorithms are all scored against each other’s cost models as well, combining the fitness and performance cost models with queue counts to evaluate each partitioning algorithm. The solutions generated by TSSA are rated better by both the TSSA algorithm’s cost model and the Genetic algorithm’s cost model, while producing low queue counts.
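    As a hedged sketch of the simulated-annealing half of TSSA (the tabu list, Twill's actual cost models, and its IR are omitted), the partitioner below flips one node between hardware and software per step and scores a candidate as its placement cost plus a penalty for every data edge crossing the hardware/software boundary, i.e., for every queue the cut would require. All names and constants are illustrative.

```python
import math
import random

def cost(assign, nodes, edges, queue_penalty=5.0):
    placement = sum(nodes[n][assign[n]] for n in nodes)  # hw/sw cost
    queues = sum(1 for a, b in edges if assign[a] != assign[b])
    return placement + queue_penalty * queues            # queues hurt

def flip(side):
    return "hw" if side == "sw" else "sw"

def anneal(nodes, edges, steps=10000, t0=10.0):
    assign = {n: random.choice(("hw", "sw")) for n in nodes}
    cur = cost(assign, nodes, edges)
    best, best_cost = dict(assign), cur
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9
        n = random.choice(list(nodes))
        assign[n] = flip(assign[n])
        new = cost(assign, nodes, edges)
        if new <= cur or random.random() < math.exp((cur - new) / temp):
            cur = new                      # accept (maybe uphill) move
            if cur < best_cost:
                best, best_cost = dict(assign), cur
        else:
            assign[n] = flip(assign[n])    # reject: undo the flip
    return best, best_cost

# Tiny example: per-node hw/sw costs and the data edges between nodes.
nodes = {"fir": {"hw": 1, "sw": 6}, "ctl": {"hw": 7, "sw": 2},
         "dma": {"hw": 2, "sw": 5}}
print(anneal(nodes, [("fir", "ctl"), ("ctl", "dma")]))
```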

    Real-time scheduling for media processing using conditionally guaranteed budgets

    In this thesis we address a scheduling problem that originates in the cost-effective processing of various media in software in consumer devices, such as digital televisions. In recent years there have been trends from analogue to digital systems, and from the processing of digital signals by dedicated, application-specific hardware towards processing in software. Powerful programmable processors are used for the software-based processing of digital media. To compete with existing solutions, it is important that this programmable hardware is used very cost-effectively. In addition, the existing properties of these consumer devices, such as robustness, stability, and predictability, must be preserved when software is used. Furthermore, a consumer device must be able to process multiple media streams simultaneously. This challenge was taken up within the research laboratories of Philips in the so-called Video-Quality-of-Service programme, and the work described in this thesis originated within that programme. The approach chosen within that programme is based on scalable algorithms for media processing, budgets for those algorithms, and software that adjusts the settings of those algorithms and the sizes of the budgets during media processing. For cost-effective use of the programmable processors, the budgets are tightly dimensioned. This thesis gives a detailed description of that approach, and of a model of a device that demonstrates its feasibility. We then show that this approach leads to a problem when multiple streams with different relative relevances to the user of the device are processed simultaneously. To solve this problem, we propose the new concept of conditionally guaranteed budgets, and describe how that concept can be realised. The techniques for analysing the scheduling problem for budgets are based on existing techniques for worst-case analysis of periodic real-time tasks. We extend those existing techniques with techniques for best-case analysis, so that we can analyse devices that use this new type of budget.
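    The closing point, extending worst-case analysis for periodic real-time tasks with a best-case counterpart, can be illustrated with the classical response-time recursions: the worst case counts ceil(R/T) interfering activations of each higher-priority task, while the known best-case recursion counts one activation fewer and is iterated downward from the worst case. The task set below is invented for illustration; the thesis applies such bounds to budgets rather than to bare tasks.

```python
import math

def wcrt(C, hp):
    """Worst-case response time; hp holds (C_j, T_j) pairs."""
    R = C
    while True:
        R_new = C + sum(math.ceil(R / T) * c for (c, T) in hp)
        if R_new == R:
            return R
        R = R_new

def bcrt(C, hp, start):
    """Best-case response time, iterated downward from the worst case;
    each higher-priority task interferes one activation less."""
    R = start
    while True:
        R_new = C + sum((math.ceil(R / T) - 1) * c for (c, T) in hp)
        if R_new == R:
            return R
        R = R_new

hp = [(1, 5), (2, 10)]
worst = wcrt(3, hp)
print(worst, bcrt(3, hp, worst))  # -> 7 3
```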