Anomaly Detection using Autoencoders in High Performance Computing Systems
Anomaly detection in supercomputers is a very difficult problem due to the
large scale of the systems and the high number of components. The current state
of the art for automated anomaly detection employs Machine Learning methods or
statistical regression models in a supervised fashion, meaning that the
detection tool is trained to distinguish among a fixed set of behaviour classes
(healthy and unhealthy states).
We propose a novel approach for anomaly detection in High Performance
Computing systems based on a Machine (Deep) Learning technique, namely a type
of neural network called autoencoder. The key idea is to train a set of
autoencoders to learn the normal (healthy) behaviour of the supercomputer nodes
and, after training, use them to identify abnormal conditions. This is
different from previous approaches, which were based on learning the abnormal
conditions, for which there are much smaller datasets (since it is very hard to
identify them to begin with).
We test our approach on a real supercomputer equipped with a fine-grained,
scalable monitoring infrastructure that can provide large amounts of data to
characterize the system behaviour. The results are extremely promising: after
the training phase to learn the normal system behaviour, our method is capable
of detecting anomalies that have never been seen before with very good
accuracy (values ranging between 88% and 96%).
Comment: 9 pages, 3 figures
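The key idea above (fit a model on healthy data only, then flag samples it reconstructs poorly) can be sketched as follows. This is a minimal illustration and not the authors' implementation: a linear autoencoder with tied weights, solved in closed form via SVD (equivalent to PCA), stands in for the deep autoencoders of the paper. The synthetic telemetry, the latent size and the 99th-percentile threshold are all assumptions made for the example.

```python
import numpy as np

def fit_linear_autoencoder(healthy, n_latent):
    """Fit a linear autoencoder on healthy samples only (closed form via SVD)."""
    mean = healthy.mean(axis=0)
    # Rows of vt are principal directions; the top n_latent rows serve as
    # tied encoder/decoder weights of a linear autoencoder.
    _, _, vt = np.linalg.svd(healthy - mean, full_matrices=False)
    return mean, vt[:n_latent]

def reconstruction_error(x, mean, weights):
    """Per-sample mean squared error between input and its reconstruction."""
    centered = x - mean
    latent = centered @ weights.T   # encode
    rebuilt = latent @ weights      # decode
    return np.mean((centered - rebuilt) ** 2, axis=1)

def make_healthy(rng, n, mixing):
    """Hypothetical healthy telemetry: 2 hidden factors drive 6 metrics."""
    factors = rng.normal(size=(n, mixing.shape[0]))
    noise = 0.05 * rng.normal(size=(n, mixing.shape[1]))
    return factors @ mixing + noise

rng = np.random.default_rng(0)
mixing = rng.normal(size=(2, 6))
train = make_healthy(rng, 500, mixing)
mean, weights = fit_linear_autoencoder(train, n_latent=2)

# Assumed decision rule: flag anything above the 99th percentile of the
# reconstruction error observed on healthy training data.
threshold = np.percentile(reconstruction_error(train, mean, weights), 99)

healthy_test = make_healthy(rng, 200, mixing)
# Synthetic anomalies ignore the correlation structure learned from healthy data.
anomalous_test = rng.normal(scale=2.0, size=(200, 6))

healthy_flags = reconstruction_error(healthy_test, mean, weights) > threshold
anomaly_flags = reconstruction_error(anomalous_test, mean, weights) > threshold
print("flagged healthy:", healthy_flags.mean(), "flagged anomalous:", anomaly_flags.mean())
```

No anomalous sample is needed at training time, which is the property the abstract emphasizes; only the threshold choice depends on how many false alarms are acceptable.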
ExaMon-X: a Predictive Maintenance Framework for Automatic Monitoring in Industrial IoT Systems
In recent years, the Industrial Internet of Things (IIoT) has led to significant steps forward in many industries, thanks to the exploitation of several technologies, ranging from Big Data processing to Artificial Intelligence (AI). Among the various IIoT scenarios, large-scale data centers can reap significant benefits from adopting Big Data analytics and AI-boosted approaches, since these technologies enable effective predictive maintenance. However, most currently available off-the-shelf solutions are not ideally suited to the HPC context: for example, they do not sufficiently take into account the very heterogeneous data sources and the privacy issues that hinder the adoption of cloud solutions, or they do not fully exploit the computing capabilities available on site in a supercomputing facility. In this paper, we tackle this issue and propose a holistic, vertical IIoT framework for predictive maintenance in supercomputers. The framework is based on a lightweight Big Data monitoring infrastructure, specialized databases suited for heterogeneous data, and a set of high-level AI-based functionalities tailored to the specific needs of HPC actors. We present the deployment and assess the usage of this framework in several in-production HPC systems.
Pricing Schemes for Energy-Efficient HPC Systems: Design and Exploration
Energy efficiency is of paramount importance for the sustainability of HPC
systems. Energy consumption limits the peak performance of supercomputers and
accounts for a large share of total cost of ownership. Consequently, system
owners and final users have started exploring mechanisms to trade off
performance for power consumption, for example through frequency and voltage
scaling.
However, only a limited number of studies have been devoted to exploring the
economic viability of performance scaling solutions and to devising pricing
mechanisms that foster a more energy-conscious usage of resources without
adversely impacting return on investment for the HPC facility. We present a
parametrized model to analyze the impact of frequency scaling on energy and to
assess the potential total cost benefits for the HPC facility and the user. We
evaluate four pricing schemes, considering both the facility manager's and the
user's perspectives. We then perform a design space exploration considering
current and near-future HPC systems and technologies.
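To make the trade-off between frequency, energy and price concrete, here is a small sketch of a parametrized node model. It is not the model from the paper: the cubic dynamic-power law, the compute-bound runtime fraction, every numeric parameter and the two pricing functions are illustrative assumptions, and the four schemes evaluated by the authors are not reproduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NodeModel:
    # All values are hypothetical and chosen only for illustration.
    f_max_ghz: float = 3.0
    static_power_w: float = 100.0    # power drawn regardless of frequency
    dynamic_power_w: float = 200.0   # dynamic power at f_max
    compute_fraction: float = 0.6    # share of runtime that scales with 1/f

    def runtime_h(self, base_runtime_h, f_ghz):
        """Runtime at f_ghz, given the runtime at f_max."""
        slowdown = self.f_max_ghz / f_ghz
        scaled = self.compute_fraction * slowdown + (1.0 - self.compute_fraction)
        return base_runtime_h * scaled

    def power_w(self, f_ghz):
        """Assumed power law: static part plus cubic dynamic part."""
        return self.static_power_w + self.dynamic_power_w * (f_ghz / self.f_max_ghz) ** 3

    def energy_kwh(self, base_runtime_h, f_ghz):
        return self.power_w(f_ghz) * self.runtime_h(base_runtime_h, f_ghz) / 1000.0

def time_only_price(node, base_runtime_h, f_ghz, rate_per_hour=0.5):
    """Hypothetical scheme A: the user pays for node-hours only."""
    return rate_per_hour * node.runtime_h(base_runtime_h, f_ghz)

def time_plus_energy_price(node, base_runtime_h, f_ghz,
                           rate_per_hour=0.1, rate_per_kwh=1.0):
    """Hypothetical scheme B: a lower hourly rate plus an energy charge."""
    return (rate_per_hour * node.runtime_h(base_runtime_h, f_ghz)
            + rate_per_kwh * node.energy_kwh(base_runtime_h, f_ghz))

node = NodeModel()
for f_ghz in (3.0, 2.0):
    print(
        f"f={f_ghz} GHz",
        f"runtime={node.runtime_h(10.0, f_ghz):.2f} h",
        f"energy={node.energy_kwh(10.0, f_ghz):.3f} kWh",
        f"A={time_only_price(node, 10.0, f_ghz):.2f}",
        f"B={time_plus_energy_price(node, 10.0, f_ghz):.2f}",
    )
```

Under these assumed parameters, running at the lower frequency saves energy but costs the user more under the time-only scheme, while the scheme with an energy component rewards it. That kind of misaligned incentive is what a pricing study of this sort sets out to examine.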
Topic Analysis of the Scientific Literature on Computer Chess Using Unsupervised Text Mining Methods
Design and implementation of unsupervised text mining models on a dataset of unstructured data: articles on the history of computer chess. The work therefore addresses topics related to Natural Language Processing (NLP). In addition, text augmentation techniques were applied to balance the classes of the dataset. The models used include LDA, word embeddings, clustering algorithms, and Transformers.
Power-Aware Job Dispatching in High Performance Computing Systems
This work deals with the power-aware job dispatching problem in supercomputers; broadly speaking, dispatching consists of assigning finite-capacity resources to a set of activities, with a special concern for power- and energy-efficient solutions. We introduce novel optimization approaches to address its multiple aspects.
The proposed techniques have a broad application range but are aimed at applications in the field of High Performance Computing (HPC) systems.
Devising a power-aware HPC job dispatcher is a complex task in which contrasting goals must be satisfied. Furthermore, the online nature of the problem requires that solutions be computed in real time, within stringent limits. This aspect has historically discouraged the use of exact methods and favoured instead the adoption of heuristic techniques. The application of optimization approaches to the dispatching task is still an unexplored area of research and can drastically improve the performance of HPC systems.
In this work we tackle the job dispatching problem on a real HPC machine, the Eurora supercomputer hosted at the Cineca research center in Bologna. We propose a Constraint Programming (CP) model that outperforms the dispatching software currently in use. An essential element for making power-aware decisions during the job dispatching phase is the ability to estimate the power consumption of jobs before their execution. To this end, we applied Machine Learning techniques to create a prediction model that was trained and tested on the Eurora supercomputer and showed high prediction accuracy. Finally, we develop a power-aware solution for the same target machine and devise different approaches to solve the dispatching problem while keeping the power consumption of the whole system under a given threshold. We propose a heuristic technique and a hybrid CP/heuristic method, both able to solve practical-size instances and outperform the current state-of-the-art techniques.
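As a rough illustration of dispatching under a power cap, the sketch below implements a generic greedy list-scheduling heuristic. It is neither the CP model nor the heuristic proposed in the thesis: the `Job` fields, the `dispatch` function and the example workload are hypothetical, and the per-job power figures are assumed to come from some predictor such as the ML model mentioned above.

```python
import heapq
from dataclasses import dataclass

@dataclass(frozen=True)
class Job:
    name: str
    nodes: int      # nodes requested
    power_w: int    # predicted power draw in watts (assumed to be given)
    duration: int   # expected runtime in arbitrary time units

def dispatch(jobs, total_nodes, power_cap_w):
    """Greedy list scheduling: start each queued job as early as both the
    free node count and the power cap allow. Returns {job name: start time}."""
    for job in jobs:
        if job.nodes > total_nodes or job.power_w > power_cap_w:
            raise ValueError(f"{job.name} can never fit on this machine")
    queue = list(jobs)
    running = []    # min-heap of (end time, tie-breaker, job)
    free_nodes, used_power = total_nodes, 0
    now, seq, starts = 0, 0, {}
    while queue:
        # Start every queued job that fits right now, in queue order.
        waiting = []
        for job in queue:
            if job.nodes <= free_nodes and used_power + job.power_w <= power_cap_w:
                starts[job.name] = now
                free_nodes -= job.nodes
                used_power += job.power_w
                heapq.heappush(running, (now + job.duration, seq, job))
                seq += 1
            else:
                waiting.append(job)
        queue = waiting
        if queue:
            # Advance time to the next completion and release its resources.
            now, _, done = heapq.heappop(running)
            free_nodes += done.nodes
            used_power -= done.power_w
    return starts

# Hypothetical workload on a 4-node machine capped at 500 W.
jobs = [
    Job("A", nodes=2, power_w=300, duration=4),
    Job("B", nodes=2, power_w=300, duration=2),
    Job("C", nodes=1, power_w=150, duration=3),
    Job("D", nodes=2, power_w=200, duration=1),
]
starts = dispatch(jobs, total_nodes=4, power_cap_w=500)
print(starts)
```

In this example job B has enough free nodes at time 0 but must wait because starting it would exceed the cap, which shows why power has to be treated as a resource alongside nodes. A greedy rule like this runs quickly but gives no optimality guarantee, which is the gap that exact and hybrid methods aim to close.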
An Analysis of Regularized Approaches for Constrained Machine Learning
Lombardi, Michele; Baldo, Federico; Borghesi, Andrea; Milano, Michela
Flying through congested airspaces: imaging of chronic rhinosinusitis
The complex regional anatomy of the nose and paranasal sinuses makes the interpretation of imaging studies of these structures intimidating to many radiologists. This paper aims to provide a key to interpretation by presenting a simplified approach to the functional anatomy of the paranasal sinuses and their most common (and most relevant) variants. This knowledge is essential for a full understanding of chronic rhinosinusitis and its computed tomography (CT) patterns. As fungal infections may be observed in the setting of chronic rhinosinusitis, these are also discussed. Chronic sinus inflammation produces bone changes, which are clearly depicted on CT images. Finally, clues to suspecting neoplastic lesions underlying inflammatory sinus conditions are provided.
- …