14,312 research outputs found

    Fundamental principles in drawing inference from sequence analysis

    Individual life courses are dynamic and can be represented as a sequence of states for some portion of their experiences. More generally, such sequences have been studied in many fields across the social sciences, for example sociology, linguistics, and psychology, and the conceptualisation of subjects progressing through a sequence of states is common. However, many models and sets of data allow only for the treatment of aggregates or transitions, rather than the interpretation of whole sequences. The temporal aspect of the analysis is fundamental to any inference about the evolution of the subjects, but assumptions about time are not normally made explicit. Moreover, without a clear idea of what sequences look like, it is impossible to determine, when something is not seen, whether it was actually absent. Some principles are proposed which link the ideas of sequences, hypothesis, analytical framework, categorisation and representation, each one being underpinned by the consideration of time. To make inferences about sequences, one needs to: understand what these sequences represent; state the hypotheses and assumptions that can be derived about sequences; identify the categories within the sequences; and choose the data representation at each stage. These ideas are obvious in themselves but they are interlinked, imposing restrictions on each other and on the inferences which can be drawn

    Towards Operator-less Data Centers Through Data-Driven, Predictive, Proactive Autonomics

    Continued reliance on human operators for managing data centers is a major impediment to their ever reaching extreme dimensions. Large computer systems in general, and data centers in particular, will ultimately be managed using predictive computational and executable models obtained through data-science tools, and at that point, the intervention of humans will be limited to setting high-level goals and policies rather than performing low-level operations. Data-driven autonomics, where management and control are based on holistic predictive models that are built and updated using live data, opens one possible path towards limiting the role of operators in data centers. In this paper, we present a data-science study of a public Google dataset collected in a 12K-node cluster with the goal of building and evaluating predictive models for node failures. Our results support the practicality of a data-driven approach by showing the effectiveness of predictive models based on data found in typical data center logs. We use BigQuery, the big data SQL platform from the Google Cloud suite, to process massive amounts of data and generate a rich feature set characterizing node state over time. We describe how an ensemble classifier can be built out of many Random Forest classifiers, each trained on these features, to predict if nodes will fail in a future 24-hour window. Our evaluation reveals that if we limit false positive rates to 5%, we can achieve true positive rates between 27% and 88% with precision varying between 50% and 72%. This level of performance allows us to recover a large fraction of jobs' executions (by redirecting them to other nodes when a failure of the present node is predicted) that would otherwise have been wasted due to failures. [...

    An integrated approach for analysing and assessing the performance of virtual learning groups

    Collaborative distance learning involves a variety of elements and factors that have to be considered and measured in order to analyse and assess group and individual performance more effectively and objectively. This paper presents an approach that integrates qualitative, social network analysis (SNA) and quantitative techniques for evaluating online collaborative learning interactions. Integration of different data sources, tools and techniques provides a more complete and robust framework for group modelling and guarantees a more efficient evaluation of group effectiveness and individual competence. Our research relies on the analysis of a real, long-term, complex collaborative experience, which is initially evaluated in terms of principled criteria and a basic qualitative process. At the end of the experience, the coded student interactions are further analysed through the SNA technique to assess participatory aspects and to identify the most effective groups and the most prominent actors. Finally, the approach is contrasted and completed through a statistical technique which sheds more light on the results obtained thus far. The proposal draws a well-founded line toward the development of a principled framework for the monitoring and analysis of group interaction and group scaffolding, which can be considered a major issue towards the actual application of the CSCL proposals to real classrooms. Peer Reviewed. Postprint (author's final draft)

    Big Data Transforms Discovery-Utilization Therapeutics Continuum.

    Enabling omic technologies adopt a holistic view to produce unprecedented insights into the molecular underpinnings of health and disease, in part, by generating massive high-dimensional biological data. Leveraging these systems-level insights as an engine driving the healthcare evolution is maximized through integration with medical, demographic, and environmental datasets from individuals to populations. Big data analytics has accordingly emerged to add value to the technical aspects of storage, transfer, and analysis required for merging vast arrays of omic-, clinical-, and eco-datasets. In turn, this new field at the interface of biology, medicine, and information science is systematically transforming modern therapeutics across discovery, development, regulation, and utilization

    IoTSan: Fortifying the Safety of IoT Systems

    Today's IoT systems include event-driven smart applications (apps) that interact with sensors and actuators. A problem specific to IoT systems is that buggy apps, unforeseen bad app interactions, or device/communication failures can cause unsafe and dangerous physical states. Detecting flaws that lead to such states requires a holistic view of installed apps, component devices, their configurations, and more importantly, how they interact. In this paper, we design IoTSan, a novel practical system that uses model checking as a building block to reveal "interaction-level" flaws by identifying events that can lead the system to unsafe states. In building IoTSan, we design novel techniques tailored to IoT systems, to alleviate the state explosion associated with model checking. IoTSan also automatically translates IoT apps into a format amenable to model checking. Finally, to understand the root cause of a detected vulnerability, we design an attribution mechanism to identify problematic and potentially malicious apps. We evaluate IoTSan on the Samsung SmartThings platform. From 76 manually configured systems, IoTSan detects 147 vulnerabilities. We also evaluate IoTSan with malicious SmartThings apps from a previous effort. IoTSan detects the potential safety violations and also effectively attributes these apps as malicious. Comment: Proc. of the 14th ACM CoNEXT, 201

    Proceedings of the ECCS 2005 satellite workshop: embracing complexity in design - Paris 17 November 2005

    Embracing complexity in design is one of the critical issues and challenges of the 21st century. As the realization grows that design activities and artefacts display properties associated with complex adaptive systems, so grows the need to use complexity concepts and methods to understand these properties and inform the design of better artefacts. It is a great challenge because complexity science represents an epistemological and methodological shift that promises a holistic approach in the understanding and operational support of design. But design is also a major contributor in complexity research. Design science is concerned with problems that are fundamental in the sciences in general and complexity sciences in particular. For instance, design has been perceived and studied as a ubiquitous activity inherent in every human activity, as the art of generating hypotheses, as a type of experiment, or as a creative co-evolutionary process. Design science and its established approaches and practices can be a great source for advancement and innovation in complexity science. These proceedings are the result of a workshop organized as part of the activities of a UK government AHRB/EPSRC funded research cluster called Embracing Complexity in Design (www.complexityanddesign.net) and the European Conference in Complex Systems (complexsystems.lri.fr).