Forecasting: theory and practice
Forecasting has always been in the forefront of decision making and planning.
The uncertainty that surrounds the future is both exciting and challenging,
with individuals and organisations seeking to minimise risks and maximise
utilities. The lack of a free-lunch theorem implies the need for a diverse set
of forecasting methods to tackle an array of applications. This unique article
provides a non-systematic review of the theory and the practice of forecasting.
We offer a wide range of theoretical, state-of-the-art models, methods,
principles, and approaches to prepare, produce, organise, and evaluate
forecasts. We then demonstrate how such theoretical concepts are applied in a
variety of real-life contexts, including operations, economics, finance,
energy, environment, and social good. We do not claim that this review is an
exhaustive list of methods and applications. The list was compiled based on the
expertise and interests of the authors. However, we wish that our encyclopedic
presentation will offer a point of reference for the rich work that has been
undertaken over the last decades, with some key insights for the future of
forecasting theory and practice.
Forecasting: theory and practice
Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts.
We do not claim that this review is an exhaustive list of methods and applications. However, we wish that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear. We offer cross-references to allow the readers to navigate through the various topics. We complement the theoretical concepts and applications covered by large lists of free or open-source software implementations and publicly-available databases.
New Fundamental Technologies in Data Mining
The progress of data mining technology and its wide public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to covering each section in depth, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.
The role of classifiers in feature selection: Number vs nature
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Wrapper feature selection approaches are widely used to select a small subset of relevant features from a dataset. However, Wrappers suffer from the fact that they only use a single classifier when selecting the features. The problem of using a single classifier is that each classifier is of a different nature and will have its own biases. This means that each classifier will select different feature subsets. To address this problem, this thesis aims to investigate the effects of using different classifiers for Wrapper feature selection. More specifically, it aims to investigate the effects of using different numbers of classifiers and classifiers of different natures.
This aim is achieved by proposing a new data mining method called Wrapper-based Decision Trees (WDT). The WDT method has the ability to combine multiple classifiers from four different families, including Bayesian Network, Decision Tree, Nearest Neighbour and Support Vector Machine, to select relevant features and visualise the relationships among the selected features using decision trees. Specifically, the WDT method is applied to investigate three research questions of this thesis: (1) the effects of number of classifiers on feature selection results; (2) the effects of nature of classifiers on feature selection results; and (3) which of the two (i.e., number or nature of classifiers) has more of an effect on feature selection results. Two types of user preference datasets derived from Human-Computer Interaction (HCI) are used with WDT to assist in answering these three research questions.
The results from the investigation revealed that the number of classifiers and nature of classifiers greatly affect feature selection results. In terms of number of classifiers, the results showed that few classifiers selected many relevant features whereas many classifiers selected few relevant features. In addition, it was found that using three classifiers resulted in highly accurate feature subsets. In terms of nature of classifiers, it was shown that Decision Tree, Bayesian Network and Nearest Neighbour classifiers caused significant differences in both the number of features selected and the accuracy levels of the features. A comparison of results regarding number of classifiers and nature of classifiers revealed that the former has more of an effect on feature selection than the latter.
The thesis makes contributions to three communities: data mining, feature selection, and HCI. For the data mining community, this thesis proposes a new method called WDT which integrates the use of multiple classifiers for feature selection and decision trees to effectively select and visualise the most relevant features within a dataset. For the feature selection community, the results of this thesis have shown that the number of classifiers and nature of classifiers can truly affect the feature selection process. The results and suggestions based on the results can provide useful insight about classifiers when performing feature selection. For the HCI community, this thesis has shown the usefulness of feature selection for identifying a small number of highly relevant features for determining the preferences of different users.
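The idea of running wrapper feature selection with several classifiers of different natures can be illustrated with a small sketch. This is not the WDT method from the thesis: the two toy classifiers, the leave-one-out scoring, the greedy forward search, the majority-vote rule and the tiny dataset are all assumptions chosen only to keep the example self-contained.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def one_nn_predict(train, labels, row):
    """Toy 1-nearest-neighbour classifier."""
    nearest = min(range(len(train)), key=lambda i: sq_dist(train[i], row))
    return labels[nearest]


def centroid_predict(train, labels, row):
    """Toy nearest-centroid classifier."""
    centroids = {}
    for label in sorted(set(labels)):
        members = [r for r, l in zip(train, labels) if l == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return min(centroids, key=lambda label: sq_dist(centroids[label], row))


def loo_accuracy(predict, rows, labels, features):
    """Leave-one-out accuracy of `predict` using only the given feature indices."""
    correct = 0
    for i, row in enumerate(rows):
        train = [[r[f] for f in features] for j, r in enumerate(rows) if j != i]
        train_labels = [l for j, l in enumerate(labels) if j != i]
        if predict(train, train_labels, [row[f] for f in features]) == labels[i]:
            correct += 1
    return correct / len(rows)


def forward_select(predict, rows, labels):
    """Greedy wrapper search: add the feature that most improves accuracy."""
    selected, best = [], 0.0
    while len(selected) < len(rows[0]):
        candidates = [
            (loo_accuracy(predict, rows, labels, selected + [f]), -f)
            for f in range(len(rows[0])) if f not in selected
        ]
        score, neg_f = max(candidates)  # ties go to the lower feature index
        if score <= best:
            break  # no candidate improves accuracy, so stop
        selected.append(-neg_f)
        best = score
    return selected


def vote(selections, min_votes):
    """Keep the features chosen by at least `min_votes` classifiers."""
    counts = Counter(f for chosen in selections for f in chosen)
    return sorted(f for f, c in counts.items() if c >= min_votes)


# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
rows = [[0.0, 5], [0.1, 1], [0.2, 9], [0.3, 3],
        [1.0, 4], [1.1, 8], [1.2, 2], [1.3, 6]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

selections = [forward_select(p, rows, labels)
              for p in (one_nn_predict, centroid_predict)]
consensus = vote(selections, min_votes=2)
print(consensus)
```

Changing `min_votes` or the set of classifiers passed to `forward_select` is one simple way to explore how the number and the nature of classifiers change the selected subset.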
Data-driven method for enhanced corrosion assessment of reinforced concrete structures
Corrosion is a major problem affecting the durability of reinforced concrete structures. Corrosion-related maintenance and repair of reinforced concrete structures cost billions of USD per annum globally. Corrosion is often triggered by the ingress of carbon dioxide and/or chloride into the pores of concrete. Estimating these corrosion-causing factors with conventional models results in suboptimal assessment, since such models are incapable of capturing the complex interaction of parameters. Hygrothermal interaction also plays a role in aggravating the corrosion of reinforcement bars, and this is usually counteracted by applying surface protection systems. These systems offer different degrees of protection and may even unintentionally cause deterioration of the structure.
The overall objective of this dissertation is to provide a framework that enhances the assessment reliability of the corrosion controlling factors. The framework is realized through the development of data-driven carbonation depth, chloride profile and hygrothermal performance prediction models.
The carbonation depth prediction model integrates a neural network, a decision tree, and boosted and bagged ensembles of decision trees. The ensemble-tree-based chloride profile prediction models evaluate the significance of chloride ingress controlling variables from various perspectives. The hygrothermal interaction prediction models are developed using neural networks to evaluate the status of corrosion and other unexpected deteriorations in surface-treated concrete elements. Long-term data for all models were obtained from three different field experiments.
The performance comparison of the developed carbonation depth prediction model with the conventional one confirmed the prediction superiority of the data-driven model. The variable importance measure revealed that plasticizers and air contents are among the top six carbonation governing parameters out of 25. The discovered topmost chloride penetration controlling parameters representing the composition of the concrete are aggregate size distribution, amount and type of plasticizers and supplementary cementitious materials. The performance analysis of the developed hygrothermal model revealed its prediction capability with low error. The exploratory data analysis technique integrated with the hygrothermal model identified the surface-protection systems that are able to protect against corrosion, chemical and frost attacks.
All the developed corrosion assessment models are valid, reliable, robust and easily reproducible, which helps in defining proactive maintenance plans. In addition, the determined influential parameters could help companies to produce optimized concrete mixes that are able to resist carbonation and chloride penetration. Hence, the outcomes of this dissertation enable reduction of lifecycle costs.
The Technological Emergence of AutoML: A Survey of Performant Software and Applications in the Context of Industry
With most technical fields, there exists a delay between fundamental academic
research and practical industrial uptake. Whilst some sciences have robust and
well-established processes for commercialisation, such as the pharmaceutical
practice of regimented drug trials, other fields face transitory periods in
which fundamental academic advancements diffuse gradually into the space of
commerce and industry. For the still relatively young field of
Automated/Autonomous Machine Learning (AutoML/AutonoML), that transitory period
is under way, spurred on by a burgeoning interest from broader society. Yet, to
date, little research has been undertaken to assess the current state of this
dissemination and its uptake. Thus, this review makes two primary contributions
to knowledge around this topic. Firstly, it provides the most up-to-date and
comprehensive survey of existing AutoML tools, both open-source and commercial.
Secondly, it motivates and outlines a framework for assessing whether an AutoML
solution designed for real-world application is 'performant'; this framework
extends beyond the limitations of typical academic criteria, considering a
variety of stakeholder needs and the human-computer interactions required to
service them. Thus, additionally supported by an extensive assessment and
comparison of academic and commercial case-studies, this review evaluates
mainstream engagement with AutoML in the early 2020s, identifying obstacles and
opportunities for accelerating future uptake.
Shortest Route at Dynamic Location with Node Combination-Dijkstra Algorithm
Abstract— Online transportation has become a basic
requirement of the general public, supporting everyday activities
such as travelling to work, school or holiday destinations. Public
transportation services compete to provide the best service so that
consumers feel comfortable using them, and every aspect of the
service receives attention; one of these is the search for the
shortest route when picking up a customer or delivering to a
destination. Like Dijkstra's algorithm, the Node Combination
method searches for the shortest route; it can minimize memory
usage and compares favorably with A* and Ant Colony, but it
cannot store the history of the nodes that have been passed.
The node combination algorithm is therefore well suited to
finding the shortest distance, but not the shortest route. This
paper modifies the node combination algorithm to solve the
problem of finding the shortest route at a dynamic location
obtained from the transport fleet, by displaying the nodes that
have the shortest distance. The result will be implemented in a
geographic information system in the form of a map to facilitate
use of the system.
Keywords— Shortest Path, Dijkstra Algorithm, Node
Combination, Dynamic Location
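The difference between the shortest distance and the shortest route can be sketched with a standard Dijkstra search that also records each node's predecessor. This is not the paper's modified Node Combination algorithm; the function name `shortest_route` and the small graph are hypothetical and serve only to show how keeping a node history makes the route recoverable.

```python
import heapq


def shortest_route(graph, source, target):
    """Dijkstra's algorithm returning (distance, list of nodes on the route).

    `graph` maps each node to a dict of {neighbour: non-negative edge weight}.
    Returns (inf, []) when the target cannot be reached.
    """
    dist = {source: 0.0}
    prev = {}  # predecessor of each node: the "history" needed for the route
    heap = [(0.0, source)]
    visited = set()

    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        if node == target:
            break
        for neighbour, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                prev[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))

    if target not in dist:
        return float("inf"), []

    # Walk the predecessor links back from the target to rebuild the route.
    route = [target]
    while route[-1] != source:
        route.append(prev[route[-1]])
    route.reverse()
    return dist[target], route


# Hypothetical road network with edge weights as distances.
graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 3},
    "C": {"B": 2, "D": 7},
    "D": {},
}
distance, route = shortest_route(graph, "A", "D")
print(distance, route)
```

Without the `prev` map the search would still return the correct distance, but the sequence of nodes to display on a map would be lost.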