
    Advances in ILP-based Modulo Scheduling for High-Level Synthesis

    In today's heterogeneous computing world, field-programmable gate arrays (FPGAs) represent an energy-efficient alternative to generic processor cores and graphics accelerators. However, due to their radically different computing model, automatic design methods, such as high-level synthesis (HLS), are needed to harness their full power. HLS raises the abstraction level to behavioural descriptions of algorithms, thus freeing designers from tedious low-level concerns and enabling rapid exploration of different microarchitectures for the same input specification. In an HLS tool, scheduling is the step most influential on the performance of the generated accelerator. Specifically, modulo schedulers enable pipelined execution, a key technique for speeding up computation by extracting more parallelism from the input description. In this thesis, we make a case for the use of integer linear programming (ILP) as a framework for modulo scheduling approaches. First, we argue that ILP-based modulo schedulers are practically usable in the HLS context. Secondly, we show that the ILP framework enables a novel approach to the automatic design of FPGA accelerators. We substantiate the first claim by proposing a new, flexible ILP formulation for the modulo scheduling problem, and evaluate it experimentally with a diverse set of realistic test instances. While solving an ILP may incur exponential runtime in the worst case, we observe that simple countermeasures, such as setting a time limit, help to contain the practical impact of outlier instances. Furthermore, we present an algorithm to compress problems before the actual scheduling. An HLS-generated microarchitecture is composed of operators, i.e. single-purpose functional units such as a floating-point multiplier. Usually, the allocation of operators is determined before scheduling, even though both problems are interdependent. We therefore investigate an extension of the modulo scheduling problem that combines both concerns in a single model. Based on this extension, we present a novel multi-loop scheduling approach capable of finding the fastest microarchitecture that still fits on a given FPGA device, an optimisation problem that current commercial HLS tools cannot solve. This proves our second claim.
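
    As a rough, hypothetical illustration of the kind of model such schedulers solve (not the formulation proposed in the thesis), the sketch below encodes a tiny modulo scheduling instance as an ILP using the PuLP library: each start time is decomposed as t = k·II + m so that m is the modulo slot, dependence edges constrain start times, and the schedule length is minimised. The operations, latencies, and the initiation interval are invented for the example.

```python
# A minimal, hypothetical modulo-scheduling ILP (not the thesis's
# formulation), modelled with the PuLP library.
from pulp import LpProblem, LpMinimize, LpVariable

II = 4  # candidate initiation interval; must be at least the recurrence
        # bound, here ceil((3 + 1) / 1) = 4 for the mul -> add -> mul cycle
ops = ["load", "mul", "add", "store"]
edges = [                       # (src, dst, latency, loop-carried distance)
    ("load", "mul", 2, 0),
    ("mul", "add", 3, 0),
    ("add", "store", 1, 0),
    ("add", "mul", 1, 1),       # loop-carried dependence
]

prob = LpProblem("modulo_scheduling", LpMinimize)
t = {o: LpVariable(f"t_{o}", lowBound=0, cat="Integer") for o in ops}
k = {o: LpVariable(f"k_{o}", lowBound=0, cat="Integer") for o in ops}
m = {o: LpVariable(f"m_{o}", lowBound=0, upBound=II - 1, cat="Integer")
     for o in ops}

# Decompose each start time as t = k * II + m, so m is the modulo slot;
# resource conflicts (omitted for brevity) would constrain the m variables.
for o in ops:
    prob += t[o] == k[o] * II + m[o]

# Dependence constraints: t_dst >= t_src + latency - II * distance.
for src, dst, lat, dist in edges:
    prob += t[dst] >= t[src] + lat - II * dist

# Objective: minimise the latest start time, a proxy for schedule length.
makespan = LpVariable("makespan", lowBound=0, cat="Integer")
for o in ops:
    prob += makespan >= t[o]
prob += makespan

prob.solve()
print({o: int(t[o].value()) for o in ops}, "II =", II)
```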

    Automatic performance optimisation of parallel programs for GPUs via rewrite rules

    Graphics Processing Units (GPUs) are now commonplace in computing systems and are the most successful parallel accelerators. Their performance is orders of magnitude higher than that of traditional Central Processing Units (CPUs), making them attractive for many application domains with high computational demands. However, achieving their full performance potential is extremely hard, even for experienced programmers, as it requires specialised software tailored for specific devices and written in low-level languages such as OpenCL. Differences in device characteristics between manufacturers, and even between hardware generations, often lead to large performance variations when different optimisations are applied. This inevitably leads to code that is not performance portable across different hardware. This thesis demonstrates that achieving performance portability is possible using LIFT, a functional data-parallel language which allows programs to be expressed at a high level in a hardware-agnostic way. The LIFT compiler automatically explores the optimisation space using a set of well-defined rewrite rules to transform programs seamlessly between different high-level algorithmic forms before translating them to a low-level OpenCL-specific form. The first contribution of this thesis is the development of techniques to compile functional LIFT programs, with optimisations explicitly encoded, into efficient imperative OpenCL code. Producing efficient code is non-trivial, as many performance-sensitive details such as memory allocation, array accesses, or synchronisation are not explicitly represented in the functional LIFT language. The thesis shows that the newly developed techniques are essential for achieving performance on par with manually optimised code for GPU programs with the exact same complex optimisations applied. The second contribution of this thesis is a set of techniques that enable the LIFT compiler to perform complex optimisations, which usually require tens to hundreds of individual rule applications, by grouping them as macro-rules that cut through the optimisation space. Using matrix multiplication as an example, starting from a single high-level program the compiler automatically generates highly optimised and specialised implementations for desktop and mobile GPUs with very different architectures, achieving performance portability. The final contribution of this thesis is a demonstration of how low-level, GPU-specific features can be extracted directly from the high-level functional LIFT program, enabling a statistical performance model that makes accurate predictions about the performance of differently optimised program variants. This performance model is then used to drastically reduce the time taken by optimisation space exploration, by ranking the different variants based on their predicted performance. Overall, this thesis demonstrates that performance portability is achievable using LIFT.
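
    To make the rewrite-rule idea concrete, here is a toy sketch of a semantics-preserving algorithmic rewrite, map fusion, over a hypothetical mini-AST (the Map and Compose node types are invented for illustration and are not LIFT's actual intermediate representation):

```python
# A toy illustration of a rewrite rule in the spirit of LIFT:
# Map(f, Map(g, xs)) -> Map(Compose(f, g), xs), which fuses two
# traversals into one without changing the program's meaning.
from dataclasses import dataclass

@dataclass(frozen=True)
class Map:
    f: object      # function (or nested expression) applied per element
    arg: object    # input expression

@dataclass(frozen=True)
class Compose:
    f: object
    g: object

def fuse_maps(expr):
    """Apply the map-fusion rewrite at the root, if it matches."""
    if isinstance(expr, Map) and isinstance(expr.arg, Map):
        inner = expr.arg
        return Map(Compose(expr.f, inner.f), inner.arg)
    return expr

prog = Map("plus1", Map("times2", "xs"))
print(fuse_maps(prog))   # Map(f=Compose(f='plus1', g='times2'), arg='xs')
```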

    Timing in Technical Safety Requirements for System Designs with Heterogeneous Criticality Requirements

    Traditionally, timing requirements arising as (technical) safety requirements have been avoided through clever functional designs. New vehicle automation concepts and other applications, however, make this harder or even impossible, and challenge design automation for cyber-physical systems to provide a solution. This thesis takes up this challenge by introducing cross-layer dependency analysis to relate timing dependencies in the bounded execution time (BET) model to the functional model of the artifact. In doing so, the analysis is able to reveal where timing dependencies may violate freedom-from-interference requirements on the functional layer and other intermediate model layers. For design automation, this leaves the challenge of how such dependencies can be avoided, or at least bounded, so that the design remains feasible. The results are synthesis strategies for implementation requirements and a system-level placement strategy for run-time measures to avoid potentially catastrophic consequences of timing dependencies that are not eliminated from the design. Their applicability is shown in experiments and case studies. However, all the proposed run-time measures, as well as very strict implementation requirements, become ever more expensive in terms of design effort for contemporary embedded systems, due to system complexity. Hence, the second part of this thesis reflects on the design aspect rather than the analysis aspect of embedded systems and proposes a timing-predictable design paradigm based on System-Level Logical Execution Time (SL-LET). Leveraging a timing-design model in SL-LET, the methods from the first part can be applied to improve the quality of a design: timing-error handling can now be separated from the run-time measures and from the implementation requirements intended to guarantee them. The thesis therefore introduces timing diversity as a timing-predictable execution scheme that handles timing errors without having to deal with them in the implemented application. An automotive 3D-perception case study demonstrates the applicability of timing diversity to ensure predictable end-to-end timing while masking certain types of timing errors.
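
    As a loose illustration of the Logical Execution Time idea that SL-LET builds on (a generic sketch under assumed semantics, not the thesis's model), the following toy simulation makes task inputs visible at release and outputs visible exactly at the end of each logical interval, independent of actual execution time. The task names and periods are invented:

```python
# A generic toy simulation of the Logical Execution Time (LET) idea:
# inputs are read at task release, and outputs become visible exactly
# at the end of the logical interval, regardless of how long the
# computation actually took, which makes end-to-end timing predictable.
def let_schedule(tasks, horizon):
    """tasks: list of (name, period) pairs; returns sorted (time, event)."""
    events = []
    for name, period in tasks:
        for release in range(0, horizon, period):
            events.append((release, f"{name}: read inputs"))
            events.append((release + period, f"{name}: publish outputs"))
    return sorted(events)

# Hypothetical tasks loosely modelled on an automotive perception chain.
for time, event in let_schedule([("perception", 10), ("fusion", 20)], 40):
    print(f"t={time:2d}  {event}")
```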

    Clafer: Lightweight Modeling of Structure, Behaviour, and Variability

    Embedded software is growing fast in size and complexity, leading to an intimate mixture of complex architectures and complex control. Consequently, software specification requires modeling both the structure and the behaviour of systems. Unfortunately, existing languages do not integrate these aspects well, usually prioritizing one of them; it is common to develop a separate language for each of these facets. In this paper, we contribute Clafer: a small language that attempts to tackle this challenge. It combines rich structural modeling with state-of-the-art behavioural formalisms. We are not aware of any other modeling language that seamlessly combines these facets common to system and software modeling. We show how Clafer, with a single unified syntax and semantics, allows capturing feature models (variability), component models, discrete control models (automata), and variability encompassing all these aspects. The language is built on top of first-order logic with quantifiers over basic entities (for modeling structures) combined with linear temporal logic (for modeling behaviour). On top of this semantic foundation we build a simple but expressive syntax, enriched with carefully selected syntactic expansions that cover hierarchical modeling, associations, automata, scenarios, and Dwyer's property patterns. We evaluate Clafer using a power window case study, and compare it against other notations that substantially overlap with its scope (SysML, AADL, Temporal OCL, and Live Sequence Charts), discussing the benefits and perils of using a single notation for this purpose.
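
    For concreteness, one of Dwyer's property patterns mentioned above, the globally scoped "response" pattern, is the standard LTL formula stating that every occurrence of p is eventually followed by q:

```latex
% Dwyer's "response" pattern, global scope:
% every request p is eventually followed by a response q.
\[ \mathbf{G}\,(p \rightarrow \mathbf{F}\,q) \]
```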

    A domain-extensible compiler with controllable automation of optimisations

    In high-performance domains like image processing, physics simulation, or machine learning, program performance is critical. Programmers called performance engineers are responsible for the challenging task of optimising programs. Two major challenges prevent modern compilers targeting heterogeneous architectures from reliably automating optimisation. First, domain-specific compilers such as Halide for image processing and TVM for machine learning are difficult to extend with the new optimisations required by new algorithms and hardware. Second, automatic optimisation is often unable to achieve the required performance, and performance engineers often fall back to painstaking manual optimisation. This thesis shows the potential of the Shine compiler to achieve domain-extensibility, controllable automation, and high-performance code generation. Domain-extensibility facilitates adapting compilers to new algorithms and hardware. Controllable automation enables performance engineers to gradually take control of the optimisation process. The first research contribution is to add 3 code generation features to Shine, namely synchronisation barrier insertion, kernel execution, and storage folding. Adding these features requires making novel design choices in terms of compiler extensibility and controllability. The rest of this thesis builds on these features to generate code with competitive runtime compared to established domain-specific compilers. The second research contribution is to demonstrate how extensibility and controllability are exploited to optimise a standard image processing pipeline for corner detection. Shine applies 6 well-known image processing optimisations, 2 of which are not supported by Halide. Our results on 4 ARM multi-core CPUs show that the code generated by Shine for corner detection runs up to 1.4× faster than the Halide code. However, we observe that controlling rewriting is tedious, motivating the need for more automation. The final research contribution is to introduce sketch-guided equality saturation, a semi-automated technique that allows performance engineers to guide program rewriting by specifying rewrite goals as sketches: program patterns that leave details unspecified. We evaluate this approach by applying 7 realistic optimisations of matrix multiplication. Without guidance, the compiler fails to apply the 5 most complex optimisations, even given an hour and 60GB of RAM. With the guidance of at most 3 sketch guides, each 10 times smaller than the complete program, the compiler applies the optimisations in seconds using less than 1GB of RAM.
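
    As a toy illustration of the guidance idea only (Shine's sketch-guided equality saturation works on e-graphs; this is not its implementation), the following search explores rewrites breadth-first and accepts any program matching a user-given "sketch" predicate that leaves details unspecified. Programs are plain strings here, and the rewrite rule and sketch predicate are invented:

```python
# A toy sketch-guided rewrite search: explore rule applications
# breadth-first, stopping at the first program that matches the sketch.
from collections import deque

def rewrite_search(start, rules, matches_sketch, max_steps=10_000):
    seen, queue = {start}, deque([start])
    steps = 0
    while queue and steps < max_steps:
        prog = queue.popleft()
        if matches_sketch(prog):
            return prog
        for rule in rules:
            for nxt in rule(prog):      # each rule yields rewritten programs
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        steps += 1
    return None

# Toy domain: strings stand in for programs; one map-fusion rule.
rules = [
    lambda p: [p.replace("map f . map g", "map (f . g)")],
]
goal = lambda p: "map (" in p           # the "sketch": a fused map, details open
print(rewrite_search("map f . map g $ xs", rules, goal))  # map (f . g) $ xs
```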

    LMS-Verify: abstraction without regret for verified systems programming

    Performance-critical software is almost always developed in C, as programmers do not trust high-level languages to deliver the same reliable performance. This is bad, because low-level code in unsafe languages attracts security vulnerabilities and because development is far less productive, with PL advances mostly lost on programmers operating under tight performance constraints. High-level languages provide memory safety out of the box, but they are deemed too slow and unpredictable for serious system software. Recent years have seen a surge in staging and generative programming: the key idea is to use high-level languages and their abstraction power as glorified macro systems to compose code fragments in first-order, potentially domain-specific, intermediate languages, from which fast C can be emitted. But what about security? Since the end result is still C code, the safety guarantees of the high-level host language are lost. In this paper, we extend this generative approach to emit ACSL specifications along with C code. We demonstrate that staging achieves "abstraction without regret" for verification: we show how high-level programming models, in particular higher-order composable contracts from dynamic languages, can be used at generation time to compose and generate first-order specifications that can be statically checked by existing tools. We also show how type classes can automatically attach invariants to data types, reducing the need for repetitive manual annotations. We evaluate our system on several case studies that variously exercise verification of memory safety, overflow safety, and functional correctness. We feature an HTTP parser that is (1) fast, (2) high-level, implemented using staged parser combinators, and (3) secure, with verified memory safety. This result is significant, as input parsing is a key attack vector, and vulnerabilities related to HTTP parsing have been documented in all widely used web servers.
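
    As a loose illustration of the higher-order composable contracts mentioned above (a minimal Python sketch, not the paper's Scala/LMS system, which stages such specifications into first-order ACSL annotations), a contract wraps a function with pre- and post-condition checks; the example function and its conditions are invented:

```python
# A minimal sketch of higher-order composable contracts: a decorator
# that checks a precondition on the input and a postcondition relating
# input and output, in the spirit of contracts from dynamic languages.
def contract(pre, post):
    def wrap(f):
        def checked(x):
            assert pre(x), f"precondition failed for {x!r}"
            y = f(x)
            assert post(x, y), f"postcondition failed for {x!r} -> {y!r}"
            return y
        return checked
    return wrap

@contract(pre=lambda n: n >= 0, post=lambda n, r: r * r <= n < (r + 1) ** 2)
def isqrt(n):
    """Integer square root by simple linear search."""
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r

print(isqrt(10))   # 3; isqrt(-1) would trip the precondition assertion
```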

    Cautiously Optimistic Program Analyses for Secure and Reliable Software

    Modern computer systems still have various security and reliability vulnerabilities. Well-known dynamic analysis solutions can mitigate them using runtime monitors that serve as lifeguards. But the additional work of enforcing these security and safety properties incurs exorbitant performance costs, and such tools are rarely used in practice. Our work addresses this problem by constructing a novel technique: Cautiously Optimistic Program Analysis (COPA). COPA is optimistic: it infers likely program invariants from dynamic observations, and assumes them in its static reasoning to precisely identify and elide wasteful runtime monitors. The resulting system is fast, but also ensures soundness by recovering to a conservatively optimized analysis on the rare occasions when a likely invariant fails at runtime. COPA is also cautious: by carefully restricting optimizations to only safe elisions, the recovery is greatly simplified. It avoids unbounded rollbacks upon recovery, thereby enabling analysis of live production software. We demonstrate the effectiveness of Cautiously Optimistic Program Analyses in three areas. First, Information-Flow Tracking (IFT) can help prevent security breaches and information leaks, but it is rarely used in practice due to its high performance overhead (>500% for web/email servers); COPA dramatically reduces this cost by eliding wasteful IFT monitors, making it practical (9% overhead, 4x speedup). Second, automatic Garbage Collection (GC) in managed languages (e.g. Java) simplifies programming tasks while ensuring memory safety; however, there is no correct GC for weakly typed languages (e.g. C/C++), and manual memory management is prone to errors that have been exploited in high-profile attacks. We develop the first sound GC for C/C++, and use COPA to optimize its performance (16% overhead). Third, Sequential Consistency (SC) provides intuitive semantics for concurrent programs, simplifying reasoning about their correctness; however, ensuring SC behavior on commodity hardware remains expensive. We use COPA to efficiently ensure SC for Java at the language level, significantly reducing its cost (from 24% down to 5% on x86). COPA provides a way to realize strong software security, reliability, and semantic guarantees at practical costs.
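
    As a hypothetical sketch of the optimistic-with-recovery pattern described above (invented names and checks, not the thesis's system), the fast path assumes a dynamically observed likely invariant and elides monitoring, while a failed check falls back to the fully monitored path, preserving soundness:

```python
# A toy rendering of the COPA pattern: assume a likely invariant to
# skip a costly runtime monitor, but recover to the conservatively
# monitored slow path whenever the invariant check fails.
def run_with_copa(data, likely_invariant, fast_path, monitored_path):
    if likely_invariant(data):
        return fast_path(data)          # monitors elided: cheap common case
    return monitored_path(data)         # rare recovery: conservative

# Invented example: a likely invariant that the buffer is plain ASCII.
ascii_only = lambda buf: all(b < 128 for b in buf)
fast = lambda buf: bytes(buf).decode("ascii")                    # unmonitored
slow = lambda buf: bytes(buf).decode("utf-8", errors="replace")  # "monitored"

print(run_with_copa([104, 105], ascii_only, fast, slow))  # 'hi' via fast path
print(run_with_copa([104, 200], ascii_only, fast, slow))  # falls back safely
```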

    Fundamental Approaches to Software Engineering

    This open access book constitutes the proceedings of the 24th International Conference on Fundamental Approaches to Software Engineering, FASE 2021, which took place during March 27–April 1, 2021, and was held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg but changed to an online format due to the COVID-19 pandemic. The 16 full papers presented in this volume were carefully reviewed and selected from 52 submissions. The book also contains 4 Test-Comp contributions.

    A soft computing decision support framework for e-learning

    Thesis by compendium of publications. Supported by technological development and its impact on everyday activities, e-Learning and b-Learning (Blended Learning) have experienced rapid growth, mainly in higher education and training. Their inherent ability to bridge both physical and cultural distances, to disseminate knowledge, and to decrease the costs of the teaching-learning process allows them to reach anywhere and anyone. The educational community is divided as to their role in the future. It is believed that by 2019 half of the world's higher education courses will be delivered through e-Learning. While supporters say that this will be the educational mode of the future, its detractors point out that it is a fashion, that there are huge abandonment rates, and that its massification and potentially low quality will cause its fall, leaving it a major role of accompanying traditional education. There are, however, two interrelated features on which there seems to be consensus. On the one hand, the enormous amount of information and evidence that Learning Management Systems (LMS) generate during the e-Learning process, which is the basis of the part of the process that can be automated. On the other hand, the fundamental role of e-tutors and e-trainers, who are the guarantors of educational quality. These are continually overwhelmed by the need to provide timely and effective feedback to students, to manage an endless stream of particular situations and cases that require decision making, and to process the stored information. In this sense, the tools that e-Learning platforms currently provide to obtain reports and a certain level of follow-up are neither sufficient nor adequate. It is at this point of convergence between information and trainer that current LMS developments are centered, and it is here that the proposed thesis aims to innovate. This research proposes and develops a platform focused on decision support in e-Learning environments. Using soft computing and data mining techniques, it extracts knowledge from the data produced and stored by e-Learning systems, allowing the classification, analysis, and generalization of the extracted knowledge. It includes tools to identify models of students' learning behavior and, from them, to predict their future performance and to enable trainers to provide adequate feedback. Likewise, students can self-assess, avoid ineffective behavior patterns, and obtain real clues about how to improve their performance in the course, through appropriate routes and strategies based on the behavioral model of successful students. The methodological basis of the mentioned functionalities is Fuzzy Inductive Reasoning (FIR), which is particularly useful in the modeling of dynamic systems. During the development of the research, the FIR methodology has been improved and empowered by the inclusion of several algorithms. First, an algorithm called CR-FIR, which allows determining the causal relevance of the variables involved in the modeling of students' learning and assessment. In the present thesis, CR-FIR has been tested on a comprehensive set of classical test data, as well as on real data sets belonging to different areas of knowledge. Secondly, the detection of atypical behaviors in virtual campuses was approached using the Generative Topographic Mapping (GTM) methodology, a probabilistic alternative to the well-known Self-Organizing Maps. GTM was used simultaneously for clustering, visualization, and detection of atypical data.
The core of the platform has been the development of an algorithm for extracting linguistic rules in a language understandable to educational experts, which helps them to obtain patterns of student learning behavior. To achieve this functionality, the LR-FIR algorithm (extraction of Linguistic Rules in FIR) was designed and developed as an extension of FIR that allows both characterizing general behavior and identifying interesting patterns. In the application of the platform to several real e-Learning courses, the results obtained demonstrate its feasibility and originality. The teachers' perception of the usability of the tool is very good, and they consider that it could be a valuable resource to mitigate the demands on trainers' time that e-Learning courses impose. The identification of student behavior models and the prediction processes have been validated as useful by expert trainers. LR-FIR has been applied and evaluated on a wide set of real problems, not all of them in the educational field, obtaining good results. The structure of the platform makes it reasonable to assume that its use is potentially valuable in domains where knowledge management plays a preponderant role, or where decision-making processes are a key element, e.g. e-business, e-marketing, and customer management, to mention just a few. The soft computing tools used and developed in this research (FIR, CR-FIR, LR-FIR, and GTM) have been applied successfully in other real domains, such as music, medicine, and weather behavior.
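
    As a generic illustration of the fuzzification step on which FIR-style linguistic rules rest (a hypothetical sketch, not the thesis's CR-FIR or LR-FIR algorithms), the following maps a numeric score onto overlapping linguistic classes with membership degrees; the labels and breakpoints are invented:

```python
# A minimal sketch of fuzzification into linguistic classes, the kind
# of representation over which linguistic rules can be formulated.
def triangular(x, a, b, c):
    """Triangular membership function rising from a to b, falling to c."""
    if x < a or x > c:
        return 0.0
    if x <= b:
        return 1.0 if b == a else (x - a) / (b - a)
    return 1.0 if c == b else (c - x) / (c - b)

# Invented linguistic labels for a 0-100 performance score.
labels = {"low": (0, 0, 50), "medium": (25, 50, 75), "high": (50, 100, 100)}

def fuzzify(score):
    return {name: round(triangular(score, *abc), 2)
            for name, abc in labels.items()}

print(fuzzify(60))   # {'low': 0.0, 'medium': 0.6, 'high': 0.2}
```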