266 research outputs found

    Anomaly Detection using Autoencoders in High Performance Computing Systems

    Anomaly detection in supercomputers is a difficult problem due to the large scale of the systems and the high number of components. The current state of the art for automated anomaly detection employs Machine Learning methods or statistical regression models in a supervised fashion, meaning that the detection tool is trained to distinguish among a fixed set of behaviour classes (healthy and unhealthy states). We propose a novel approach for anomaly detection in High Performance Computing systems based on a Machine (Deep) Learning technique, namely a type of neural network called an autoencoder. The key idea is to train a set of autoencoders to learn the normal (healthy) behaviour of the supercomputer nodes and, after training, use them to identify abnormal conditions. This differs from previous approaches, which were based on learning the abnormal conditions, for which far smaller datasets are available (since such conditions are very hard to identify to begin with). We test our approach on a real supercomputer equipped with a fine-grained, scalable monitoring infrastructure that can provide large amounts of data to characterize the system behaviour. The results are extremely promising: after the training phase to learn the normal system behaviour, our method is capable of detecting anomalies that have never been seen before with very good accuracy (values ranging between 88% and 96%). Comment: 9 pages, 3 figures
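
    The key idea in this abstract (fit a model on healthy data only, then flag samples it reconstructs poorly) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it swaps the neural autoencoder for a linear one with two input features and one latent value, fitted in closed form. The synthetic `healthy` data, the feature meanings (normalized load and power), and the threshold rule are all assumptions made for this example.

```python
import math

def fit_linear_autoencoder(healthy):
    """Fit a 2-feature, 1-latent linear autoencoder on healthy samples only."""
    n = len(healthy)
    mx = sum(x for x, _ in healthy) / n
    my = sum(y for _, y in healthy) / n
    cxx = sum((x - mx) ** 2 for x, _ in healthy) / n
    cyy = sum((y - my) ** 2 for _, y in healthy) / n
    cxy = sum((x - mx) * (y - my) for x, y in healthy) / n
    # Closed-form principal direction of the 2x2 covariance matrix;
    # this is the optimal 1-D linear code for the healthy data.
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    return (mx, my), (math.cos(theta), math.sin(theta))

def reconstruction_error(sample, model):
    """Squared distance between a sample and its encode/decode round trip."""
    (mx, my), (ux, uy) = model
    dx, dy = sample[0] - mx, sample[1] - my
    code = dx * ux + dy * uy          # encode: project onto the learned direction
    rx, ry = code * ux, code * uy     # decode: map the code back to feature space
    return (dx - rx) ** 2 + (dy - ry) ** 2

# Assumed synthetic healthy data: normalized power tracks normalized load,
# with a small bounded wiggle standing in for measurement noise.
healthy = [(l / 100, 0.2 + 0.6 * (l / 100) + 0.01 * math.sin(l)) for l in range(101)]
model = fit_linear_autoencoder(healthy)

# Assumed threshold rule: three times the worst error seen on healthy data.
threshold = 3 * max(reconstruction_error(s, model) for s in healthy)

def is_anomalous(sample):
    return reconstruction_error(sample, model) > threshold

print(is_anomalous((0.5, 0.5)))   # load and power consistent with training data
print(is_anomalous((0.1, 0.9)))   # near-idle load but very high power
```

    No labelled anomalies are needed at training time, which is the property the abstract highlights: the second sample is flagged only because it falls far from the structure learned from healthy behaviour.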

    ExaMon-X: a Predictive Maintenance Framework for Automatic Monitoring in Industrial IoT Systems

    In recent years, the Industrial Internet of Things (IIoT) has led to significant steps forward in many industries, thanks to the exploitation of several technologies, ranging from Big Data processing to Artificial Intelligence (AI). Among the various IIoT scenarios, large-scale data centers can reap significant benefits from adopting Big Data analytics and AI-boosted approaches, since these technologies can enable effective predictive maintenance. However, most currently available off-the-shelf solutions are not ideally suited to the HPC context: for example, they do not sufficiently take into account the very heterogeneous data sources and the privacy issues that hinder the adoption of cloud solutions, or they do not fully exploit the computing capabilities available in loco in a supercomputing facility. In this paper, we tackle this issue and propose a holistic and vertical IIoT framework for predictive maintenance in supercomputers. The framework is based on a lightweight Big Data monitoring infrastructure, specialized databases suited for heterogeneous data, and a set of high-level AI-based functionalities tailored to HPC actors’ specific needs. We present the deployment and assess the usage of this framework in several in-production HPC systems

    Pricing Schemes for Energy-Efficient HPC Systems: Design and Exploration

    Energy efficiency is of paramount importance for the sustainability of HPC systems. Energy consumption limits the peak performance of supercomputers and accounts for a large share of total cost of ownership. Consequently, system owners and final users have started exploring mechanisms to trade off performance for power consumption, for example through frequency and voltage scaling. However, only a limited number of studies have been devoted to exploring the economic viability of performance scaling solutions and to devising pricing mechanisms that foster a more energy-conscious usage of resources without adversely impacting return on investment for the HPC facility. We present a parametrized model to analyze the impact of frequency scaling on energy and to assess the potential total cost benefits for the HPC facility and the user. We evaluate four pricing schemes, considering both the facility manager's and the user's perspectives. We then perform a design space exploration considering current and near-future HPC systems and technologies
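
    The trade-off between frequency scaling and energy mentioned in this abstract can be illustrated with a common textbook approximation. The abstract does not state the paper's model, so everything here is an assumption: a compute-bound job whose runtime scales as 1/f, a power draw of a static term plus a term cubic in frequency, and the parameter values `work_gcycles`, `static_w`, and `dyn_coeff`.

```python
def energy_per_job(freq_ghz, work_gcycles=1000.0, static_w=100.0, dyn_coeff=25.0):
    """Energy in joules for a compute-bound job under an assumed cubic power model."""
    runtime_s = work_gcycles / freq_ghz               # T(f) = W / f
    power_w = static_w + dyn_coeff * freq_ghz ** 3    # P(f) = P_static + c * f^3
    return power_w * runtime_s

def energy_optimal_frequency(static_w=100.0, dyn_coeff=25.0):
    """Frequency minimizing E(f) = W * (P_static / f + c * f^2), from dE/df = 0."""
    return (static_w / (2.0 * dyn_coeff)) ** (1.0 / 3.0)

f_opt = energy_optimal_frequency()
for f in (1.0, f_opt, 2.0):
    print(f"{f:.2f} GHz -> {energy_per_job(f):.0f} J")
```

    Under these assumptions, running slower than the optimum wastes energy on static power over a longer runtime, while running faster wastes it on dynamic power. A pricing scheme has to account for both effects as well as for the longer occupation of the machine at low frequency.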

    Topic Analysis of the Scientific Literature on Computer Chess with Unsupervised Text Mining Methods

    Design and implementation of unsupervised text mining models on a dataset of unstructured data: articles on the history of computer chess. The work therefore addresses topics related to Natural Language Processing (NLP). In addition, text augmentation techniques were applied to balance the classes of the dataset. The models used include LDA, Word Embeddings, clustering algorithms, and Transformers

    Integration of Optimization and Simulations for the Regional Energy Plan of Emilia-Romagna


    Power-Aware Job Dispatching in High Performance Computing Systems

    This work deals with the power-aware job dispatching problem in supercomputers; broadly speaking, dispatching consists of assigning finite-capacity resources to a set of activities, with a special concern toward power- and energy-efficient solutions. We introduce novel optimization approaches to address its multiple aspects. The proposed techniques have a broad application range but are aimed at applications in the field of High Performance Computing (HPC) systems. Devising a power-aware HPC job dispatcher is a complex task, where contrasting goals must be satisfied. Furthermore, the online nature of the problem requires that solutions be computed in real time, respecting stringent limits. This aspect has historically discouraged the usage of exact methods and favoured instead the adoption of heuristic techniques. The application of optimization approaches to the dispatching task is still an unexplored area of research and can drastically improve the performance of HPC systems. In this work we tackle the job dispatching problem on a real HPC machine, the Eurora supercomputer hosted at the Cineca research center, Bologna. We propose a Constraint Programming (CP) model that outperforms the dispatching software currently in use. An essential element for taking power-aware decisions during the job dispatching phase is the possibility of estimating jobs' power consumption before their execution. To this end, we applied Machine Learning techniques to create a prediction model that was trained and tested on the Eurora supercomputer, showing great prediction accuracy. Finally, we develop a power-aware solution, considering the same target machine, and we devise different approaches to solve the dispatching problem while keeping the power consumption of the whole system under a given threshold. We propose a heuristic technique and a CP/heuristic hybrid method, both able to solve practical-size instances and outperform the current state-of-the-art techniques
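
    To make the problem statement concrete, the sketch below shows a simple greedy dispatcher that respects both a finite node capacity and a system-wide power threshold, using a per-job power estimate obtained before execution. It is a hypothetical illustration only: it is neither the CP model nor the heuristic or hybrid method from this work, and the `Job` fields, the `dispatch` function, and the numbers are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    nodes: int                 # nodes requested by the job
    predicted_power_w: float   # power estimate available before execution

def dispatch(queue, free_nodes, power_budget_w):
    """Greedy pass in queue order: start each job that fits the free nodes
    and the remaining power budget, and skip the ones that do not."""
    started = []
    for job in queue:
        if job.nodes <= free_nodes and job.predicted_power_w <= power_budget_w:
            started.append(job.name)
            free_nodes -= job.nodes
            power_budget_w -= job.predicted_power_w
    return started

queue = [Job("a", 4, 1200.0), Job("b", 8, 2600.0), Job("c", 2, 500.0)]
started = dispatch(queue, free_nodes=10, power_budget_w=2000.0)
print(started)  # ['a', 'c']
```

    A greedy pass like this runs in linear time, which suits the online setting, but it can leave capacity unused or delay large jobs indefinitely. Limitations of this kind are what exact or hybrid optimization approaches aim to address.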

    An Analysis of Regularized Approaches for Constrained Machine Learning

    Lombardi, Michele; Baldo, Federico; Borghesi, Andrea; Milano, Michela

    Flying through congested airspaces: imaging of chronic rhinosinusitis

    The complex regional anatomy of the nose and paranasal sinuses makes the interpretation of imaging studies of these structures intimidating to many radiologists. This paper aims to provide a key to interpretation by presenting a simplified approach to the functional anatomy of the paranasal sinuses and their most common (and most relevant) variants. This knowledge is basic for the full understanding of chronic rhinosinusitis and its computed tomography (CT) patterns. As fungal infections may be observed in the setting of chronic rhinosinusitis, these are also discussed. Chronic sinus inflammation produces bone changes, clearly depicted on CT images. Finally, clues to suspecting neoplastic lesions underlying inflammatory sinus conditions are provided