615 research outputs found

    Skyline: Interactive In-Editor Computational Performance Profiling for Deep Neural Network Training

    Training a state-of-the-art deep neural network (DNN) is a computationally expensive and time-consuming process, which incentivizes deep learning developers to debug their DNNs for computational performance. However, performing this debugging effectively requires intimate knowledge of the underlying software and hardware systems, something that the typical deep learning developer may not have. To help bridge this gap, we present Skyline: a new interactive tool for DNN training that supports in-editor computational performance profiling, visualization, and debugging. Skyline's key contribution is that it leverages special computational properties of DNN training to provide (i) interactive performance predictions and visualizations, and (ii) directly manipulatable visualizations that, when dragged, mutate the batch size in the code. As an in-editor tool, Skyline allows users to leverage these diagnostic features to debug the performance of their DNNs during development. An exploratory qualitative user study of Skyline produced promising results; all the participants found Skyline to be useful and easy to use. Comment: 14 pages, 5 figures. Appears in the proceedings of UIST'2

    Modeling and Selection of Software Service Variants

    Providers and consumers have to deal with variants, meaning alternative instances of a service's design, implementation, deployment, or operation, when developing or delivering software services. This work presents service feature modeling to deal with associated challenges, comprising a language to represent software service variants and a set of methods for modeling and subsequent variant selection. This work's evaluation includes a proof-of-concept (POC) implementation and two real-life use cases.

    Deployment and Operation of Complex Software in Heterogeneous Execution Environments

    This open access book provides an overview of the work developed within the SODALITE project, which aims at facilitating the deployment and operation of distributed software on top of heterogeneous infrastructures, including cloud, HPC and edge resources. The experts participating in the project describe how SODALITE works and how it can be exploited by end users. While multiple languages and tools are available in the literature to support DevOps teams in automating deployment and operation steps, these activities still require specific know-how and skills that cannot be found in average teams. The SODALITE framework tackles this problem by offering modelling and smart editing features that allow those we call Application Ops Experts to work without knowing low-level details about the adopted, potentially heterogeneous, infrastructures. The framework also offers mechanisms to verify the quality of the defined models, generate the corresponding executable infrastructural code, automatically wrap application components within proper execution containers, orchestrate all activities concerned with deployment and operation of all system components, and support on-the-fly self-adaptation and refactoring.

    Exploring Strategies that IT Leaders Use to Adopt Cloud Computing

    Information Technology (IT) leaders must leverage cloud computing to maintain competitive advantage. Evidence suggests that IT leaders who have leveraged cloud computing in small and medium-sized organizations have saved an average of $1 million in IT services for their organizations. The purpose of this qualitative single case study was to explore strategies that IT leaders use to adopt cloud computing for their organizations. The target population consisted of 15 IT leaders who had experience with designing and deploying cloud computing solutions at their organization in Long Island, New York within the past 2 years. The conceptual framework of this research project was the disruptive innovation theory. Semistructured interviews were conducted and company documents were gathered. Data were inductively analyzed for emergent themes, then subjected to member checking to ensure the trustworthiness of findings. Four main themes emerged from the data: the essential elements for strategies to adopt cloud computing; most effective strategies; leadership essentials; and barriers, critical factors, and ineffective strategies affecting adoption of cloud computing. These findings may contribute to social change by providing insights that help IT leaders in small and medium-sized organizations save money while gaining competitive advantage and ensuring sustainable business growth that could enhance community standards of living.

    Toward a More Accurate Web Service Selection Using Modified Interval DEA Models with Undesirable Outputs

    With the growing number of Web services on the internet, it is a challenge to select the best Web service, one that offers the highest quality-of-service (QoS) values at the lowest price. Another challenge is the uncertainty of QoS values over time due to the unpredictable nature of the internet. In this paper, we modify the interval data envelopment analysis (DEA) models [Wang, Greatbanks and Yang (2005)] for QoS-aware Web service selection, considering the uncertainty of QoS attributes in the presence of desirable and undesirable factors. We conduct a set of experiments using a synthesized dataset to show the capabilities of the proposed models. The experimental results show that the correlation between the proposed models and the interval DEA models is significant. Also, the proposed models provide nearly robust results and behave more stably than the interval DEA models under QoS variations. Finally, we demonstrate the usefulness of the proposed models for QoS-aware Web service composition. Experimental results indicate that the proposed models significantly improve the fitness of the resultant compositions when they filter out unsatisfactory candidate services for each abstract service in the preprocessing phase. These models help users to select the best possible cloud service considering the dynamic internet environment, and they help service providers to improve their Web services in the market.

    Challenges and Experiences in Designing Interpretable KPI-diagnostics for Cloud Applications

    Automated root cause analysis of performance problems in modern cloud computing infrastructures has high technological value in the self-driving context. These systems have evolved into large-scale, complex solutions that are core to running most of today’s business applications. Hence, cloud management providers realize their mission through “total” monitoring of data center flows, enabling full visibility into the cloud. Appropriate machine learning methods and software products rely on such observation data for real-time identification and remediation of potential sources of performance degradation in cloud operations, to minimize their impact. We describe the existing technology challenges and our experiences in designing root cause analysis mechanisms that are automatic, application agnostic, and, at the same time, interpretable enough for human operators to trust. The paper focuses on the diagnosis of cloud ecosystems through their Key Performance Indicators (KPIs). These indicators are used to build automatically labeled data sets and train explainable AI models for identifying conditions and processes “responsible” for misbehaviors. Our experiments on a large time series data set from a cloud application demonstrate that these approaches are effective in obtaining models that explain unacceptable KPI behaviors and localize sources of issues.

    Modeling and Selection of Software Service Variants

    Providers and consumers have to deal with variants of software services, which are alternative instances of a service's design, implementation, deployment, or operation. This work develops the service feature modeling language to represent software service variants and a suite of methods to select variants for development or delivery. An evaluation describes the systems implemented to make use of service feature modeling and its application to two real-world use cases.

    Service workload patterns for QoS-driven cloud resource management

    Cloud service providers negotiate SLAs for the customer services they offer based on the reliability of performance and availability of their lower-level platform infrastructure. While availability management is more mature, performance management is less reliable. An accurate and efficient solution is required to support a continuous approach that covers the initial static infrastructure configuration as well as dynamic reconfiguration and auto-scaling. We propose a prediction technique that combines a workload pattern mining approach with a traditional collaborative filtering solution to meet the accuracy and efficiency requirements. Service workload patterns abstract common infrastructure workloads from monitoring logs and act as part of a first-stage, high-performance configuration mechanism before more complex traditional methods are considered. This hybrid prediction solution enhances current reactive rule-based scalability approaches and basic prediction techniques. Uncertainty and noise are additional challenges that emerge in multi-layered, often federated cloud architectures. We specifically add log smoothing combined with a fuzzy logic approach to make the prediction solution more robust in the context of these challenges.

    Human system interaction with confident computing

    This keynote will give an overview of the last 30 years of human system interaction, the key elements of Human Computer Interaction (HCI), and the transition from traditional HCI into the frontier of Human System Interaction (HSI). This leads to a discussion of why HSI is about Digital Ecosystems and the world we live in rather than just ICT. We explain the 5 Mega Trends and the emergence of Confident Computing, and how these are leading to the next generation of Human System Interaction (version 2.0) and Usability (version 2.0). This is followed by the challenges and research issues within HSI.