    An Experiment on Bare-Metal BigData Provisioning

    Many big data customers use on-demand platforms in the cloud, where they can get a dedicated virtual cluster in a couple of minutes and pay only for the time they use. Increasingly, there is demand for bare-metal big data solutions for applications that cannot tolerate the unpredictability and performance degradation of virtualized systems. Existing bare-metal solutions can introduce delays of tens of minutes to provision a cluster by installing operating systems and applications on the local disks of servers. This has motivated recent research into sophisticated mechanisms to optimize this installation. These approaches assume that using network-mounted boot disks incurs unacceptable run-time overhead. Our analysis suggests that while this assumption holds for application data, it is incorrect for operating systems and applications: network-mounting the boot disk and applications results in negligible run-time impact while leading to faster provisioning times. This research was supported in part by the MassTech Collaborative Research Matching Grant Program, NSF awards 1347525 and 1414119, and several commercial partners of the Massachusetts Open Cloud, who may be found at http://www.massopencloud.org.
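
    To make the trade-off concrete, below is a minimal back-of-envelope sketch in Python. Every step name and timing in it is a hypothetical placeholder chosen for illustration, not a measurement from the paper.

        # Toy model of the provisioning trade-off described above: install the OS
        # and applications onto each server's local disk vs. attach a pre-built,
        # network-mounted boot disk. All timings are invented placeholders.

        LOCAL_INSTALL_STEPS = {           # per-node local-disk installation
            "pxe_boot_installer": 2.0,    # minutes (assumed)
            "write_os_image": 12.0,
            "install_applications": 8.0,
            "reboot_to_local_disk": 2.0,
        }

        NETWORK_MOUNT_STEPS = {           # per-node network-mounted boot
            "clone_boot_volume": 0.5,     # copy-on-write snapshot (assumed)
            "attach_boot_volume": 0.5,
            "network_boot": 2.0,
        }

        def provisioning_time(steps: dict[str, float]) -> float:
            """Total per-node provisioning time in minutes."""
            return sum(steps.values())

        print(f"local install : {provisioning_time(LOCAL_INSTALL_STEPS):5.1f} min")
        print(f"network mount : {provisioning_time(NETWORK_MOUNT_STEPS):5.1f} min")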

    Public survey instruments for business administration using social network analysis and big data

    Purpose: The subject matter of this research is closely intertwined with the scientific discussion about the necessity of developing and implementing practice-oriented means of measuring social well-being that take into account the intensity of contacts between individuals. The aim of the research is to test a toolkit for analyzing social networks and to develop a research algorithm for identifying sources of consolidation of public opinion and key agents of influence. The research methodology is based on the postulates of sociology, graph theory, social network analysis, and cluster analysis. Design/Methodology/Approach: The empirical basis of the research was data reflecting how social media users perceive the existing image of Russia and its activities in the Arctic, chosen as a model case. Findings: The algorithm makes it possible to estimate the density and intensity of connections between actors, to trace the main channels through which public opinion forms and the key agents of influence, to identify implicit patterns and trends, and to relate information flows and events to current news triggers and stories, allowing the subsequent formation of a "cleansed" image of the object under study and of the key actors with whom this object is associated. Practical Implications: The work helps fill a gap in the scientific literature caused by insufficient elaboration of how social network analysis can be applied to sociological problems. Originality/Value: The work addresses the underdeveloped practical side of using social network analysis to solve sociological problems. Peer-reviewed.
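
    As a rough illustration of such an algorithm, the sketch below uses the networkx library to estimate connection density, rank agents of influence by in-degree centrality, and surface implicit clusters with modularity-based community detection. The edge list is invented, and the particular centrality and community measures are our assumptions rather than the paper's exact toolkit.

        # Toy social-network-analysis pipeline: density, key agents, clusters.
        import networkx as nx
        from networkx.algorithms.community import greedy_modularity_communities

        # Hypothetical "who replies to whom" edges harvested from social media.
        edges = [("anna", "boris"), ("boris", "anna"), ("anna", "clara"),
                 ("dmitri", "anna"), ("clara", "boris"), ("elena", "dmitri")]
        G = nx.DiGraph(edges)

        # Density of connections between actors.
        print("density:", nx.density(G))

        # Key agents of influence: actors that many others point to.
        influence = nx.in_degree_centrality(G)
        print("top agents:", sorted(influence, key=influence.get, reverse=True)[:3])

        # Implicit patterns: community structure on the undirected projection.
        communities = greedy_modularity_communities(G.to_undirected())
        for i, community in enumerate(communities):
            print(f"cluster {i}:", sorted(community))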

    HIL: designing an exokernel for the data center

    We propose a new exokernel-like layer to allow mutually untrusting, physically deployed services to efficiently share the resources of a data center. We believe that such a layer offers not only efficiency gains but may also enable new economic models, new applications, and new security-sensitive uses. A prototype (currently in active use) demonstrates that the proposed layer is viable and can support a variety of existing provisioning tools and use cases. Partial support for this work was provided by the MassTech Collaborative Research Matching Grant Program, National Science Foundation awards 1347525 and 1149232, as well as several commercial partners of the Massachusetts Open Cloud, who may be found at http://www.massopencloud.org.
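
    To make the idea tangible, here is a deliberately simplified, hypothetical sketch in Python of what such a layer's core bookkeeping might look like: it only records which project owns which node and network, and leaves OS provisioning to the tools layered on top. The class and method names are our invention, not the actual HIL API.

        from dataclasses import dataclass, field

        @dataclass
        class Project:
            name: str
            nodes: set = field(default_factory=set)
            networks: set = field(default_factory=set)

        class IsolationLayer:
            """Tracks which project owns which node and network; nothing more."""

            def __init__(self, free_nodes: set):
                self.free_nodes = free_nodes
                self.projects = {}

            def allocate_node(self, project: str, node: str) -> None:
                # Mutually untrusting projects: a node is never granted twice.
                if node not in self.free_nodes:
                    raise ValueError(f"{node} is not free")
                self.free_nodes.remove(node)
                self.projects.setdefault(project, Project(project)).nodes.add(node)

            def connect_network(self, project: str, network: str) -> None:
                # Isolation is expressed as per-project networks (e.g. VLAN ids).
                self.projects[project].networks.add(network)

        layer = IsolationLayer({"node01", "node02"})
        layer.allocate_node("moc-research", "node01")
        layer.connect_network("moc-research", "vlan-1001")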

    The Case for Graph-Based Recommendations

    Recommender systems have been used intensively to create personalised profiles that enhance the user experience. In certain areas, such as e-learning, this approach is short-sighted, since each student masters each concept through different means: the progress from one concept to the next, or from one lesson to another, does not necessarily follow a fixed pattern. In these settings, we can no longer use simple structures (vectors, strings, etc.) to represent each user's interactions with the system, because the sequence of events and its mapping to the user's intentions build up into more complex synergies. As a consequence, we propose a graph-based interpretation of the problem and identify the challenges behind (a) using graphs to model users' journeys, and hence as the input to the recommender system, and (b) producing recommendations in the form of graphs of actions to be taken.
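
    As a minimal sketch of this graph-based framing, the Python snippet below (with an invented event log and function names of our own choosing) models a learner's journey as a weighted directed graph over concepts and recommends the strongest outgoing transition from the current concept.

        from collections import defaultdict

        def build_journey_graph(event_log):
            """Turn (from_concept, to_concept) transitions into a weighted digraph."""
            graph = defaultdict(lambda: defaultdict(int))
            for src, dst in event_log:
                graph[src][dst] += 1      # edge weight = observed transition count
            return graph

        def recommend_next(graph, current):
            """Recommend the successor concept with the strongest observed transition."""
            successors = graph.get(current)
            if not successors:
                return None
            return max(successors, key=successors.get)

        log = [("fractions", "ratios"), ("fractions", "decimals"),
               ("fractions", "ratios"), ("ratios", "percentages")]
        print(recommend_next(build_journey_graph(log), "fractions"))  # -> ratios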

    User-profile-based analytics for detecting cloud security breaches

    While the growth of cloud-based technologies has benefited society tremendously, it has also increased the surface area for cyber attacks. Given that cloud services are prevalent today, it is critical to devise systems that detect intrusions. One form of security breach in the cloud occurs when cyber-criminals compromise the Virtual Machines (VMs) of unwitting users and then utilize user resources to run time-consuming, malicious, or illegal applications for their own benefit. This work proposes a method to detect unusual resource usage trends and alert the user and the administrator in real time. We experiment with three categories of methods: simple statistical techniques, unsupervised classification, and regression. So far, our approach successfully detects anomalous resource usage when experimenting with typical trends synthesized from published real-world web server logs and cluster traces. We observe the best results with unsupervised classification, which gives an average F1-score of 0.83 for the web server logs and 0.95 for the cluster traces.
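
    The sketch below illustrates the unsupervised flavour of this idea, assuming an isolation forest over synthetic CPU-usage windows; the model choice, parameters, and data are our illustration, not necessarily the authors' exact method.

        import numpy as np
        from sklearn.ensemble import IsolationForest
        from sklearn.metrics import f1_score

        rng = np.random.default_rng(0)
        normal = rng.normal(loc=30, scale=5, size=(500, 1))    # typical CPU %
        hijacked = rng.normal(loc=95, scale=2, size=(25, 1))   # cryptomining-like load
        X = np.vstack([normal, hijacked])
        y_true = np.array([0] * 500 + [1] * 25)                # 1 = anomalous window

        model = IsolationForest(contamination=0.05, random_state=0).fit(X)
        y_pred = (model.predict(X) == -1).astype(int)          # -1 marks an outlier
        print("F1:", f1_score(y_true, y_pred))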

    What does fault tolerant Deep Learning need from MPI?

    Deep Learning (DL) algorithms have become the de facto Machine Learning (ML) algorithms for large-scale data analysis. DL algorithms are computationally expensive: even distributed DL implementations that use MPI require days of training (model learning) time on commonly studied datasets. Long-running DL applications become susceptible to faults, requiring the development of a fault-tolerant system infrastructure in addition to fault-tolerant DL algorithms. This raises an important question: what is needed from MPI for designing fault-tolerant DL implementations? In this paper, we address this problem for permanent faults. We motivate the need for a fault-tolerant MPI specification through an in-depth consideration of recent innovations in DL algorithms and their properties, which drive the need for specific fault tolerance features. We present an in-depth discussion of the suitability of different parallelism types (model, data, and hybrid); the need (or lack thereof) for checkpointing of any critical data structures; and, most importantly, several fault tolerance proposals in MPI (user-level fault mitigation (ULFM) and Reinit) and their applicability to fault-tolerant DL implementations. We leverage a distributed-memory implementation of Caffe, currently available under the Machine Learning Toolkit for Extreme Scale (MaTEx). We implement our approach by extending MaTEx-Caffe to use a ULFM-based implementation. Our evaluation using the ImageNet dataset and the AlexNet and GoogLeNet neural network topologies demonstrates the effectiveness of the proposed fault-tolerant DL implementation using Open MPI-based ULFM.
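
    As a rough sketch of the recovery pattern such a specification enables for data-parallel training, the snippet below uses mpi4py and assumes it is built against an ULFM-enabled Open MPI; the Shrink-and-restore loop and the checkpoint helper are our assumptions, not MaTEx-Caffe's actual implementation.

        from mpi4py import MPI
        import numpy as np

        def load_checkpoint():
            # Hypothetical helper: a real system would restore the last saved weights.
            return np.zeros(1000)

        comm = MPI.COMM_WORLD
        comm.Set_errhandler(MPI.ERRORS_RETURN)   # surface failures instead of aborting
        weights = load_checkpoint()              # stand-in for model parameters

        for step in range(100):
            grad = np.random.rand(weights.size)  # stand-in for a local gradient
            try:
                # Data parallelism: average gradients across all surviving ranks.
                comm.Allreduce(MPI.IN_PLACE, grad, op=MPI.SUM)
                grad /= comm.Get_size()
            except MPI.Exception:
                comm = comm.Shrink()             # ULFM: drop failed ranks, continue
                weights = load_checkpoint()      # roll back to the last checkpoint
                continue
            weights -= 0.01 * grad               # SGD step on the averaged gradient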