157 research outputs found

    Mining for misconfigured machines in grid systems

    Full text link

    Detecting Abnormal Machine Characteristics in Cloud Infrastructures

    Get PDF
    In the cloud computing environment resources are accessed as services rather than as a product. Monitoring this system for performance is crucial because of typical pay-peruse packages bought by the users for their jobs. With the huge number of machines currently in the cloud system, it is often extremely difficult for system administrators to keep track of all machines using distributed monitoring programs such as Ganglia1 which lacks system health assessment and summarization capabilities. To overcome this problem, we propose a technique for automated anomaly detection using machine performance data in the cloud. Our algorithm is entirely distributed and runs locally on each computing machine on the cloud in order to rank the machines in order of their anomalous behavior for given jobs. There is no need to centralize any of the performance data for the analysis and at the end of the analysis, our algorithm generates error reports, thereby allowing the system administrators to take corrective actions. Experiments performed on real data sets collected for different jobs validate the fact that our algorithm has a low overhead for tracking anomalous machines in a cloud infrastructure

    HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation

    Full text link
    Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing nterest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized both local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. In addition, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.Comment: 15 pages, 9 figure

    Thermal Neural Networks: Lumped-Parameter Thermal Modeling With State-Space Machine Learning

    Full text link
    With electric power systems becoming more compact and increasingly powerful, the relevance of thermal stress especially during overload operation is expected to increase ceaselessly. Whenever critical temperatures cannot be measured economically on a sensor base, a thermal model lends itself to estimate those unknown quantities. Thermal models for electric power systems are usually required to be both, real-time capable and of high estimation accuracy. Moreover, ease of implementation and time to production play an increasingly important role. In this work, the thermal neural network (TNN) is introduced, which unifies both, consolidated knowledge in the form of heat-transfer-based lumped-parameter models, and data-driven nonlinear function approximation with supervised machine learning. A quasi-linear parameter-varying system is identified solely from empirical data, where relationships between scheduling variables and system matrices are inferred statistically and automatically. At the same time, a TNN has physically interpretable states through its state-space representation, is end-to-end trainable -- similar to deep learning models -- with automatic differentiation, and requires no material, geometry, nor expert knowledge for its design. Experiments on an electric motor data set show that a TNN achieves higher temperature estimation accuracies than previous white-/grey- or black-box models with a mean squared error of 3.18 K23.18~\text{K}^2 and a worst-case error of 5.84 K5.84~\text{K} at 64 model parameters.Comment: Preprint; Fix typos, streamline math. notation; 10 page

    Towards Secure Cloud Orchestration for Multi-Cloud Deployments

    Get PDF
    Cloud orchestration frameworks are commonly used to deploy and operate cloud infrastructure. Their role spans both vertically (deployment on infrastructure, platform, application and microservice levels) and horizontally (deployments from many distinct cloud resource providers). However, despite the central role of orchestration, the popular orchestration frameworks lack mechanisms to provide security guarantees for cloud operators. In this work, we analyze the security landscape of cloud orchestration frameworks for multicloud infrastructure. We identify a set of attack scenarios, define security enforcement enablers and propose an architecture for a security-enabled cloud orchestration framework for multi-cloud application deployments

    BEHAVIORAL CHARACTERIZATION OF ATTACKS ON THE REMOTE DESKTOP PROTOCOL

    Get PDF
    The Remote Desktop Protocol (RDP) is popular for enabling remote access and administration of Windows systems; however, attackers can take advantage of RDP to cause harm to critical systems using it. Detection and classification of RDP attacks is a challenge because most RDP traffic is encrypted, and it is not always clear which connections to a system are malicious after manual decryption of RDP traffic. In this research, we used open-source tools to generate and analyze RDP attack data using a power-grid honeypot under our control. We developed methods for detecting and characterizing RDP attacks through malicious signatures, Windows event log entries, and network traffic metadata. Testing and evaluation of our characterization methods on actual attack data collected by four instances of our honeypot showed that we could effectively delineate benign and malicious RDP traffic and classify the severity of RDP attacks on unprotected or misconfigured Windows systems. The classification of attack patterns and severity levels can inform defenders of adversarial behavior in RDP attacks. Our results can also help protect national critical infrastructure, including Department of Defense systems.DOE, Washington DC 20805Civilian, SFSApproved for public release. Distribution is unlimited
    corecore