Application of the Empirical Mode Decomposition On the Characterization and Forecasting of the Arrival Data of an Enterprise Cluster
Characterization and forecasting are two important processes in capacity planning. While they are closely related, their approaches have traditionally differed. In this research, a decomposition method called Empirical Mode Decomposition (EMD) is applied as a preprocessing tool to bridge the inputs of both the characterization and forecasting processes for the job arrivals of an enterprise cluster. Based on the facts that an enterprise cluster follows a standard preset working schedule and that EMD can extract hidden patterns within a data stream, we have developed a set of procedures that preprocess the data for characterization as well as for forecasting. This comprehensive empirical study demonstrates that the addition of the preprocessing step improves on the standard approaches to both characterization and forecasting. In addition, EMD is shown to be better than the popular wavelet-based decomposition in terms of extracting different patterns from within a data stream.
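As a rough illustration of the sifting idea behind EMD, the sketch below performs a single sifting step in plain Python. It is a hypothetical simplification: real EMD implementations fit cubic-spline envelopes through the extrema and repeat sifting until a stopping criterion yields an intrinsic mode function, whereas this sketch uses linear interpolation and one pass; the signal and all names are invented.

```python
import math

def _interp(xs, ys, n):
    """Linearly interpolate the points (xs, ys) onto indices 0..n-1."""
    out = []
    j = 0
    for i in range(n):
        while j < len(xs) - 2 and xs[j + 1] < i:
            j += 1
        x0, x1 = xs[j], xs[j + 1]
        t = (i - x0) / (x1 - x0) if x1 != x0 else 0.0
        out.append(ys[j] + t * (ys[j + 1] - ys[j]))
    return out

def sift_once(signal):
    """One sifting step: subtract the mean of the extrema envelopes."""
    n = len(signal)
    maxima = [i for i in range(1, n - 1)
              if signal[i] >= signal[i - 1] and signal[i] >= signal[i + 1]]
    minima = [i for i in range(1, n - 1)
              if signal[i] <= signal[i - 1] and signal[i] <= signal[i + 1]]
    # Anchor envelopes at the endpoints so interpolation covers the range.
    maxima = [0] + maxima + [n - 1]
    minima = [0] + minima + [n - 1]
    upper = _interp(maxima, [signal[i] for i in maxima], n)
    lower = _interp(minima, [signal[i] for i in minima], n)
    return [s - (u + l) / 2 for s, u, l in zip(signal, upper, lower)]

# A fast oscillation riding on a slow trend, mimicking daily arrival cycles
# superimposed on a workload drift (invented data).
x = [math.sin(0.5 * i) + 0.1 * i for i in range(64)]
candidate_imf = sift_once(x)  # oscillation with the local mean removed
```

Repeating `sift_once` on the residue would peel off successive components, which is how EMD separates schedule-driven cycles from the underlying workload trend.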
An overview of decision table literature 1982-1995.
This report gives an overview of the literature on decision tables over the past 15 years. As far as possible, each reference is provided with an author-supplied abstract, a number of keywords and a classification. In some cases our own comments are added; their purpose is to show where, how and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country of origin (not necessarily the country of publication) and the language of the document. After a description of the scope of the review, classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification and comments.
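For readers unfamiliar with the formalism, a decision table maps each combination of condition outcomes to an action. The sketch below is a hypothetical Python rendering (the conditions and actions are invented for illustration, not drawn from the surveyed literature):

```python
# Conditions: (customer_is_member, order_over_100); each combination of
# outcomes appears exactly once, so the table is complete and unambiguous --
# two properties decision-table verification tools typically check.
DISCOUNT_TABLE = {
    (True,  True):  "apply 15% discount",
    (True,  False): "apply 5% discount",
    (False, True):  "apply 10% discount",
    (False, False): "no discount",
}

def decide(is_member: bool, over_100: bool) -> str:
    """Look up the action for a combination of condition outcomes."""
    return DISCOUNT_TABLE[(is_member, over_100)]

print(decide(True, False))  # -> apply 5% discount
```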
Modelling cross-dependencies between Spain’s regional tourism markets with an extension of the Gaussian process regression model
This study presents an extension of the Gaussian process regression model for multiple-input multiple-output forecasting. This approach allows modelling the cross-dependencies between a given set of input variables and generating a vectorial prediction. Making use of the existing correlations in international tourism demand across all seventeen regions of Spain, the performance of the proposed model is assessed in a multiple-step-ahead forecasting comparison. The results of the experiment in a multivariate setting show that the Gaussian process regression model significantly improves the forecasting accuracy of a multi-layer perceptron neural network used as a benchmark. The results reveal that incorporating the connections between different markets in the modelling process may prove very useful to refine predictions at a regional level.
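The multiple-output extension is beyond a short sketch, but the underlying GP machinery can be illustrated. Below is a minimal single-output Gaussian process regression predictive mean in plain Python with an RBF kernel; the data, hyperparameters and function names are invented for illustration and stand in for the paper's model only loosely.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential (RBF) covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination (small systems only)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(xs, ys, x_star, noise=1e-6):
    """GP predictive mean: k_*^T (K + noise*I)^{-1} y."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = solve(K, ys)
    return sum(rbf(x_star, xi) * ai for xi, ai in zip(xs, alpha))

xs = [0.0, 1.0, 2.0, 3.0]            # invented training inputs
ys = [math.sin(x) for x in xs]       # invented training targets
print(gp_predict(xs, ys, 1.5))       # near sin(1.5)
```

A multi-output variant would replace the scalar targets with vectors (one per region) and use a kernel over both inputs and outputs, which is the direction the paper pursues.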
Machine Learning Approaches for Traffic Flow Forecasting
Intelligent Transport Systems (ITS) have emerged as a field quite rapidly in recent years. Competitive solutions, coupled with the big data gathered for ITS applications, need the latest AI to drive smart and effective public transport planning and management, and there is a strong need for ITS applications such as Advanced Route Planning (ARP) and Traffic Control Systems (TCS) to take charge with the minimum possible human intervention. This thesis develops models that can predict traffic link flows at the junction level, such as road traffic flows on a freeway or highway, for all traffic conditions.
The research first reviews the state-of-the-art time-series prediction techniques, with a deep focus on transport engineering, along with the existing statistical and machine learning methods and their applications to freeway traffic flow prediction. This review establishes a firm footing for examining whether individual statistical or machine learning models are superior to one another in terms of prediction performance. Detailed theoretical attention is given to the structure and working of each chosen prediction model in relation to the traffic flow data.
To model traffic flows from a real-world dataset gathered by Highways England (HE), a traffic flow objective function for highway road prediction models is proposed within a 3-stage framework: the traffic network is topologically broken down into virtual patches, then into nodes, and finally into estimates of the basic link flow profile behaviour. The proposed objective function is tested with ten different prediction models, including statistical, shallow and deep learning hybrid models, for bi-directional link flow prediction. The proposed objective function greatly enhances the accuracy of traffic flow prediction, regardless of the machine learning model used.
The proposed objective-function-based framework offers a new approach to modelling the traffic network so as to better understand unknown traffic flow waves and the resulting congestion at the junction level. In addition, the results of the applied machine learning models indicate that LSTM-based RNN variants, in conjunction with neural networks and deep CNNs, outperform the other chosen machine learning methods for link flow prediction when applied through the proposed objective function. The practical, experiment-based findings reveal that to arrive at an efficient, robust, offline and accurate prediction model, apart from feeding the ML model the correct representation of the network data, attention should be paid to the deep learning model structure, data pre-processing (i.e. normalisation) and the error metrics used for learning the data's behaviour.
In future, the proposed framework can be utilised to address one of the main aims of smart transport systems: to reduce the error rates in network-wide congestion predictions and the resulting general traffic travel time delays in real time.
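The thesis highlights data normalisation and error metrics as key ingredients of a link-flow prediction pipeline. The sketch below shows what such helpers might look like in plain Python; it is an illustration under assumed conventions, not the thesis's actual framework, and the flow values are invented.

```python
def min_max_scale(values):
    """Scale values to [0, 1]; also return (lo, hi) so predictions can
    be mapped back to vehicle counts."""
    lo, hi = min(values), max(values)
    span = hi - lo or 1.0        # avoid division by zero on constant series
    return [(v - lo) / span for v in values], (lo, hi)

def inverse_scale(scaled, bounds):
    """Map scaled predictions back to the original flow units."""
    lo, hi = bounds
    return [s * (hi - lo) + lo for s in scaled]

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the flows."""
    n = len(actual)
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n) ** 0.5

def mape(actual, predicted):
    """Mean absolute percentage error (assumes no zero actuals)."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

flows = [120, 340, 560, 480, 300]     # hypothetical 15-minute link flows
scaled, bounds = min_max_scale(flows)
print(rmse([100, 200], [110, 190]))   # 10.0
```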
Data-Driven Methods for Data Center Operations Support
During the last decade, cloud technologies have been evolving at an impressive pace, such that we now live in a cloud-native era where developers can leverage an unprecedented landscape of (possibly managed) services for orchestration, compute, storage, load-balancing, monitoring, etc. The possibility of on-demand access to a diverse set of configurable virtualized resources allows for building more elastic, flexible and highly resilient distributed applications. Behind the scenes, cloud providers sustain the heavy burden of maintaining the underlying infrastructures, consisting of large-scale distributed systems, partitioned and replicated among many geographically dispersed data centers to guarantee scalability, robustness to failures, high availability and low latency. The larger the scale, the more cloud providers have to deal with complex interactions among the various components, such that monitoring, diagnosing and troubleshooting issues become incredibly daunting tasks.
To keep up with these challenges, development and operations practices have undergone significant transformations, especially in terms of improving the automation that makes releasing new software, and responding to unforeseen issues, faster and sustainable at scale. The resulting paradigm is nowadays referred to as DevOps. However, while such automation can be very sophisticated, traditional DevOps practices fundamentally rely on reactive mechanisms that typically require careful manual tuning and supervision from human experts. To minimize the risk of outages, and the related costs, it is crucial to provide DevOps teams with suitable tools that enable a proactive approach to data center operations.
This work presents a comprehensive data-driven framework to address the most relevant problems that can be experienced in large-scale distributed cloud infrastructures. These environments are characterized by a very large availability of diverse data, collected at each level of the stack, such as: time series (e.g., physical host measurements, virtual machine or container metrics, networking component logs, application KPIs); graphs (e.g., network topologies, fault graphs reporting dependencies among hardware and software components, performance issue propagation networks); and text (e.g., source code, system logs, version control system history, code review feedback). Such data are also typically updated with relatively high frequency, and are subject to distribution drifts caused by continuous configuration changes to the underlying infrastructure. In such a highly dynamic scenario, traditional model-driven approaches alone may be inadequate for capturing the complexity of the interactions among system components. DevOps teams would certainly benefit from robust data-driven methods to support their decisions based on historical information. For instance, effective anomaly detection capabilities may help in conducting more precise and efficient root-cause analysis, while accurate forecasting and intelligent control strategies would improve resource management.
Given their ability to deal with high-dimensional, complex data, Deep Learning-based methods are the most straightforward option for realizing the aforementioned support tools. On the other hand, because of their complexity, these kinds of models often require huge processing power, and suitable hardware, to be operated effectively at scale. These aspects must be carefully addressed when applying such methods in the context of data center operations: automated operations approaches must be dependable and cost-efficient, so as not to degrade the services they are built to improve.
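As a toy illustration of the proactive, data-driven monitoring argued for above, the sketch below flags anomalies in a metric stream with a rolling z-score. It is a hypothetical, classical stand-in for the Deep Learning methods the work actually targets; the class name, thresholds and data are invented.

```python
from collections import deque

class RollingZScoreDetector:
    """Flag points that deviate strongly from a sliding window of history."""

    def __init__(self, window=30, threshold=3.0):
        self.window = deque(maxlen=window)   # recent metric values
        self.threshold = threshold           # z-score cutoff

    def observe(self, value):
        """Return True if value is anomalous w.r.t. the recent window."""
        anomalous = False
        if len(self.window) >= 5:            # require a minimal history
            n = len(self.window)
            mean = sum(self.window) / n
            var = sum((v - mean) ** 2 for v in self.window) / n
            std = var ** 0.5 or 1e-9         # guard against zero variance
            anomalous = abs(value - mean) / std > self.threshold
        self.window.append(value)
        return anomalous

det = RollingZScoreDetector(window=20, threshold=3.0)
stream = [50.0, 51.0, 49.5, 50.5, 50.0, 49.0, 50.2, 95.0]  # spike at the end
flags = [det.observe(v) for v in stream]
print(flags[-1])  # True: the spike deviates far from the recent mean
```

A production detector would also have to cope with the distribution drifts mentioned above, e.g. by widening the window or retraining, which is where learned models earn their keep.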
Improving Throughput and Predictability of High-volume Business Processes Through Embedded Modeling
Being faster is good. Being predictable is better. A faithful model of a system, loaded to reflect the system's current state, can be used to look into the future and predict performance. Building faithful models of processes with high degrees of uncertainty can be very challenging, especially where this uncertainty exists in terms of processing times, queuing behavior and re-work rates. Within the context of an electronic, multi-tiered workflow management system (WFMS), the author builds such a model to endogenously quote due dates. A WFMS that manages business objects can be recast as a flexible flow shop in which the stations that a job (representing the business object) passes through are known, as are the jobs in the stations' queues at any point. All of the other parameters associated with the flow shop, including job processing times per station and station queuing behavior, are uncertain, though there is a significant body of past performance data that might be brought to bear. The objective, in this environment, is to meet the delivery date promised when the job is accepted. To attack the problem, the author develops a novel heuristic algorithm for decomposing the WFMS's event logs to expose non-standard queuing behavior, develops a new simulation component to implement that behavior, and assembles a prototypical system to automate the required historical analysis and allow for on-demand due date quoting through the use of embedded discrete event simulation modeling.
The developed software components are flexible enough to allow both the analysis of past performance in conjunction with the WFMS's event logs and on-demand analysis of new jobs entering the system. Using the proportion of jobs completed within the predicted interval as the measure of effectiveness, the author validates the performance of the system over six months of historical data and during live operations, with both samples achieving the targeted 90% service level.
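The idea of quoting a due date from an embedded simulation can be sketched as follows: simulate the new job's completion time many times and quote the value met in 90% of replications. This is a deliberately simplified single-station FIFO queue with exponential service times; the actual system is a multi-station flexible flow shop with empirically fitted processing-time and queuing models, and every parameter below is invented.

```python
import random

def simulate_completion(jobs_ahead, mean_service=2.0, rng=random):
    """Time until a newly accepted job finishes, given the number of jobs
    queued ahead of it at a single FIFO station."""
    t = 0.0
    for _ in range(jobs_ahead + 1):      # the queued jobs, then the new job
        t += rng.expovariate(1.0 / mean_service)
    return t

def quote_due_date(jobs_ahead, service_level=0.90, replications=2000, seed=7):
    """Quote the completion time achieved in `service_level` of replications,
    i.e. the empirical 90th-percentile lead time."""
    rng = random.Random(seed)            # fixed seed for a reproducible quote
    samples = sorted(simulate_completion(jobs_ahead, rng=rng)
                     for _ in range(replications))
    return samples[int(service_level * replications) - 1]

quote = quote_due_date(jobs_ahead=4)     # five exponential stages in series
print(round(quote, 2))                   # around the Erlang-5 90th percentile
```

Quoting a percentile rather than the mean is what turns "faster" into "predictable": the promised date is one the shop expects to meet 90% of the time, matching the service level the abstract reports.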