DC-Prophet: Predicting Catastrophic Machine Failures in DataCenters
When will a server fail catastrophically in an industrial datacenter? Is it
possible to forecast these failures so preventive actions can be taken to
increase the reliability of a datacenter? To answer these questions, we have
studied what are probably the largest publicly available datacenter traces,
containing more than 104 million events from 12,500 machines. Among these
samples, we observe and categorize three types of machine failures, all of
which are catastrophic and may lead to information loss, or even worse,
reliability degradation of a datacenter. We further propose a two-stage
framework, DC-Prophet, based on One-Class Support Vector Machine and Random
Forest. DC-Prophet extracts surprising patterns and accurately predicts the
next failure of a machine. Experimental results show that DC-Prophet achieves
an AUC of 0.93 in predicting the next machine failure, and an F3-score of 0.88
(out of 1). On average, DC-Prophet outperforms other classical machine learning
methods by 39.45% in F3-score.
Comment: 13 pages, 5 figures, accepted by ECML PKDD 2017
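The F3-score reported above is the F-beta measure with beta = 3, which weights recall nine times as heavily as precision (beta squared = 9), a natural choice when missing a machine failure costs far more than a false alarm. A minimal sketch of the metric (illustrative only, not the authors' code):

```python
def f_beta(precision: float, recall: float, beta: float = 3.0) -> float:
    """F-beta score: recall is weighted beta^2 times as heavily as precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)
```

For example, a classifier with precision 0.8 and recall 0.9 scores about 0.89 under F3, much closer to its recall than its precision.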
Forecasting Player Behavioral Data and Simulating in-Game Events
Understanding player behavior is fundamental in game data science. Video
games evolve as players interact with the game, so being able to foresee player
experience would help ensure successful game development. In particular,
game developers need to evaluate beforehand the impact of in-game events.
Simulation optimization of these events is crucial to increase player
engagement and maximize monetization. We present an experimental analysis of
several methods to forecast game-related variables, with two main aims: to
obtain accurate predictions of in-app purchases and playtime in an operational
production environment, and to perform simulations of in-game events in order
to maximize sales and playtime. Our ultimate purpose is to take a step towards
the data-driven development of games. The results suggest that, even though the
performance of traditional approaches such as ARIMA is still better, the
outcomes of state-of-the-art techniques like deep learning are promising. Deep
learning emerges as a well-suited general model that could be used to forecast
a variety of time series with different dynamic behaviors.
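As a rough illustration of the ARIMA-family baselines this abstract compares against, the sketch below fits the simplest autoregressive model, AR(1), by ordinary least squares and forecasts recursively. It is a toy stand-in under assumed names, not the production models evaluated in the paper:

```python
def ar1_forecast(series, steps):
    """Fit y[t] = c + phi * y[t-1] by ordinary least squares,
    then forecast the next `steps` values recursively."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    phi = cov / var
    c = my - phi * mx
    preds, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        preds.append(last)
    return preds
```

On a perfectly linear trend such as [1, 2, 3, 4, 5], the fit recovers phi = 1 and c = 1, so the two-step forecast is [6.0, 7.0]; real playtime or purchase series are far noisier, which is where richer models earn their keep.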
Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline
From medical charts to national census, healthcare has traditionally operated
under a paper-based paradigm. However, the past decade has marked a long and
arduous transformation bringing healthcare into the digital age. Ranging from
electronic health records, to digitized imaging and laboratory reports, to
public health datasets, healthcare now generates an incredible amount of
digital information. Such a wealth of data presents an exciting opportunity for
integrated machine learning solutions to address problems across multiple
facets of healthcare practice and administration. Unfortunately, the ability to
derive accurate and informative insights requires more than the ability to
execute machine learning models. Rather, a deeper understanding of the data on
which the models are run is imperative for their success. While a significant
effort has been undertaken to develop models able to process the volume of data
obtained during the analysis of millions of digitized patient records, it is
important to remember that volume represents only one aspect of the data. In
fact, drawing on data from an increasingly diverse set of sources, healthcare
data presents an incredibly complex set of attributes that must be accounted
for throughout the machine learning pipeline. This chapter focuses on
highlighting such challenges, and is broken down into three distinct
components, each representing a phase of the pipeline. We begin with attributes
of the data accounted for during preprocessing, then move to considerations
during model building, and end with challenges to the interpretation of model
output. For each component, we present a discussion around data as it relates
to the healthcare domain and offer insight into the challenges each may impose
on the efficiency of machine learning techniques.
Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20 pages, 1 figure
Computer Aided Inspection: design of a customer-oriented benchmark for non-contact 3D scanner evaluation
Estimation of changes in the force of infection for intestinal and urogenital schistosomiasis in countries with Schistosomiasis Control Initiative-assisted programmes
The last decade has seen an expansion of national schistosomiasis control programmes in Africa based on large-scale preventative chemotherapy. In many areas this has resulted in considerable reductions in infection and morbidity levels in treated individuals. In this paper, we quantify changes in the force of infection (FOI), defined here as the per (human) host parasite establishment rate, to ascertain the impact on transmission of some of these programmes under the umbrella of the Schistosomiasis Control Initiative (SCI).
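The paper defines the FOI as a per-host parasite establishment rate estimated within its own transmission framework. As a loosely related textbook illustration only (an assumption, not the authors' method), the simple catalytic model links a constant FOI lambda to age-prevalence via P(a) = 1 - exp(-lambda * a), which can be inverted to back out lambda from survey data:

```python
import math

def foi_from_prevalence(prevalence: float, age: float) -> float:
    """Invert the simple catalytic model P(a) = 1 - exp(-lambda * a)
    to recover the force of infection lambda from prevalence at age a."""
    return -math.log(1.0 - prevalence) / age
```

For instance, 50% prevalence by age 10 implies lambda of roughly 0.069 per year under this (deliberately crude) model, which ignores parasite aggregation and treatment history.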
Annealing study and thermal investigation on bismuth sulfide thin films prepared by chemical bath deposition in basic medium
This is a post-peer-review, pre-copyedit version of an article published in Applied Physics A 124.2 (2018): 166. The final authenticated version is available online at: http://doi.org/10.1007/s00339-018-1584-7

Bismuth sulfide thin films were prepared by chemical bath deposition using thiourea as the sulfide ion source in basic medium. First, the effects of the deposition parameters on film growth, as well as the effect of annealing under argon and sulfur atmospheres on the as-deposited films, were studied. The parameters were found to be influential using the Doehlert matrix experimental design methodology. Ranges for a maximum surface mass of films (3 mg cm-2) were determined. A well-crystallized major phase of bismuth sulfide with stoichiometric composition was achieved at 190 °C for 3 hours. The prepared thin films were characterized using Grazing Incidence X-ray Diffraction (GIXRD), Scanning Electron Microscopy (SEM), and Energy Dispersive X-ray analysis (EDX). Second, the band gap energy was found to be 1.5 eV. Finally, the thermal properties were studied for the first time by means of the electropyroelectric (EPE) technique. The thermal conductivity varied in the range of 1.20 to 0.60 W m-1 K-1, while the thermal diffusivity increased with annealing from 1.8 × 10-7 to 3.5 × 10-7 m2 s-1.

This work was financially supported by the Tunisian Ministry of Higher Education and Scientific Research and by the WINCOST (ENE2016-80788-C5-2-R) project funded by the Spanish Ministry of Economy and Competitiveness.
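The reported thermal conductivity k and thermal diffusivity alpha are tied together by the standard relation alpha = k / (rho * c_p). As a quick illustrative check (the pairing of endpoint values below is an assumption, not stated in the abstract), the implied volumetric heat capacity rho * c_p follows directly:

```python
def volumetric_heat_capacity(k: float, alpha: float) -> float:
    """Volumetric heat capacity rho * c_p (J m^-3 K^-1) from the
    definition of thermal diffusivity: alpha = k / (rho * c_p)."""
    return k / alpha
```

Taking, hypothetically, k = 1.2 W m-1 K-1 with alpha = 1.8e-7 m2 s-1 gives rho * c_p of about 6.7e6 J m-3 K-1.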
Recent Advances in Our Understanding of the Role of Meltwater in the Greenland Ice Sheet System
Nienow, Sole and Cowton's Greenland research has been supported by a number of UK NERC research grants (NER/O/S/2003/00620; NE/F021399/1; NE/H024964/1; NE/K015249/1; NE/K014609/1), and Slater has been supported by a NERC PhD studentship.

Purpose of the review: This review discusses the role that meltwater plays within the Greenland ice sheet system. The ice sheet's hydrology is important because it affects mass balance through its impact on meltwater runoff processes and ice dynamics. The review considers recent advances in our understanding of the storage and routing of water through the supraglacial, englacial, and subglacial components of the system, and their implications for the ice sheet.

Recent findings: There have been dramatic increases in surface meltwater generation and runoff since the early 1990s, due both to increased air temperatures and to decreasing surface albedo. Processes in the subglacial drainage system have similarities to those of valley glaciers, and in a warming climate the efficiency of meltwater routing to the ice sheet margin is likely to increase. The behaviour of the subglacial drainage system appears to limit the impact of increased surface melt on annual rates of ice motion in sections of the ice sheet that terminate on land, while the large volumes of meltwater routed subglacially deliver significant volumes of sediment and nutrients to downstream ecosystems.

Summary: Considerable advances have been made recently in our understanding of Greenland ice sheet hydrology and its wider influences. Nevertheless, critical gaps persist both in our understanding of hydrology-dynamics coupling, notably at tidewater glaciers, and in runoff processes, which ensures that projecting Greenland's future mass balance remains challenging.
Cellular Traffic Prediction and Classification: a comparative evaluation of LSTM and ARIMA
Prediction of user traffic in cellular networks has attracted profound
attention for improving resource utilization. In this paper, we study the
problem of network traffic prediction and classification by employing
standard machine learning and statistical learning time series prediction
methods, including long short-term memory (LSTM) and autoregressive integrated
moving average (ARIMA), respectively. We present an extensive experimental
evaluation of the designed tools over a real network traffic dataset. Within
this analysis, we explore the impact of different parameters on the
effectiveness of the predictions. We further extend our analysis to the problem
of network traffic classification and prediction of traffic bursts. The
results, on the one hand, demonstrate superior performance of LSTM over ARIMA
in general, especially when the training time series is long enough and it is
augmented by a wisely selected set of features. On the other hand, the results
shed light on the circumstances in which ARIMA performs close to optimal with
lower complexity.
Comment: arXiv admin note: text overlap with arXiv:1906.0095
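Both LSTM training and the feature-augmented setups described above require recasting a traffic trace as supervised (window, target) pairs. A minimal sketch of that framing, with `lag` and `horizon` as assumed parameter names (illustrative, not the paper's pipeline):

```python
def make_windows(series, lag, horizon=1):
    """Turn a univariate series into (features, target) pairs: each
    sample uses `lag` past values to predict the value `horizon`
    steps ahead, the standard framing for training forecasters
    such as LSTMs on traffic traces."""
    X, y = [], []
    for t in range(lag, len(series) - horizon + 1):
        X.append(series[t - lag:t])
        y.append(series[t + horizon - 1])
    return X, y
```

For example, `make_windows([1, 2, 3, 4, 5], lag=2)` yields windows [[1, 2], [2, 3], [3, 4]] with targets [3, 4, 5]; longer series simply yield more overlapping samples.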
Population mortality during the outbreak of Severe Acute Respiratory Syndrome in Toronto
Background: Extraordinary infection control measures limited access to medical care in the Greater Toronto Area during the 2003 Severe Acute Respiratory Syndrome (SARS) outbreak. The objective of this study was to determine if the period of these infection control measures was associated with changes in overall population mortality due to causes other than SARS.

Methods: Observational study of death registry data, using Poisson regression and interrupted time-series analysis to examine all-cause mortality rates (excluding deaths due to SARS) before, during, and after the SARS outbreak. The population of Ontario was grouped into the Greater Toronto Area (N = 2.9 million) and the rest of Ontario (N = 9.3 million) based upon the level of restrictions on delivery of clinical services during the SARS outbreak.

Results: There was no significant change in mortality in the Greater Toronto Area before, during, and after the period of the SARS outbreak in 2003 compared to the corresponding time periods in 2002 and 2001. The rate ratio for all-cause mortality during the SARS outbreak was 0.99 [95% Confidence Interval (CI) 0.93–1.06] compared to 2002 and 0.96 [95% CI 0.90–1.03] compared to 2001. An interrupted time-series analysis found no significant change in mortality rates in the Greater Toronto Area associated with the period of the SARS outbreak.

Conclusion: Limitations on access to medical services during the 2003 SARS outbreak in Toronto had no observable impact on short-term population mortality. Effects on morbidity and long-term mortality were not assessed. Efforts to contain future infectious disease outbreaks due to influenza or other agents must consider effects on access to essential health care services.
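The rate ratios with 95% confidence intervals reported above come from Poisson regression on registry data. A hypothetical back-of-the-envelope version of the same quantity, using the usual normal approximation on the log scale (the counts and person-time in the example are made up, not the study's data):

```python
import math

def rate_ratio(deaths_a, persontime_a, deaths_b, persontime_b):
    """Crude mortality rate ratio of group A vs group B with a 95% CI,
    using var(log RR) ~ 1/deaths_a + 1/deaths_b on the log scale."""
    rr = (deaths_a / persontime_a) / (deaths_b / persontime_b)
    se = math.sqrt(1.0 / deaths_a + 1.0 / deaths_b)
    return rr, rr * math.exp(-1.96 * se), rr * math.exp(1.96 * se)
```

With identical hypothetical rates in both periods (say 100 deaths per 1000 person-years in each), the point estimate is 1.0 and the interval straddles 1, mirroring the null finding of the study.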
