16 research outputs found

    Co-movement clustering: A novel approach for predicting inflation in the food and beverage industry

    In the food and beverage business, inflation poses a significant hurdle because it affects pricing, profitability, and consumers' purchasing power, setting the sector apart from other industries. This study proposes a novel approach, co-movement clustering, to predict which items will inflate together based on historical time-series data. Experiments were conducted to evaluate the proposed approach on real-world data obtained from the UK Office for National Statistics. The predictions of the proposed approach were compared against four classical methods (correlation, Euclidean distance, cosine similarity, and DTW). According to our experimental results, the proposed approach is more accurate than these classical methods. Moreover, its accuracy is higher when an additional filter is applied. Our approach aids hospitality operators in accurately predicting food and beverage inflation, enabling the development of effective strategies to navigate the current challenging business environment in hospitality management. Little previous work has explored how time series clustering can be applied to support inflation prediction. This study opens a new research direction in the field and can serve as a useful reference for future research in this emerging area. In addition, this work contributes to the data analytics research stream in the hospitality management literature.
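
The four classical baselines named in this abstract can be illustrated with a minimal sketch. This is not the paper's co-movement clustering, which the abstract does not specify; it only shows how correlation, Euclidean distance, cosine similarity, and DTW each compare two series. The function names and the `bread` and `butter` price indices are made up for illustration, and the DTW variant (absolute-difference cost, no window constraint) is an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def euclidean(x, y):
    """Point-by-point Euclidean distance of two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_similarity(x, y):
    """Cosine of the angle between the two series viewed as vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def dtw(x, y):
    """Dynamic time warping distance (assumed: absolute-difference cost, no window)."""
    inf = float("inf")
    d = [[inf] * (len(y) + 1) for _ in range(len(x) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three allowed warping moves.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[len(x)][len(y)]

# Hypothetical price indices, invented for this example only.
bread = [100.0, 101.0, 103.0, 106.0, 110.0]
butter = [200.0, 202.0, 206.0, 212.0, 220.0]

print(pearson(bread, butter))
print(euclidean(bread, butter))
print(cosine_similarity(bread, butter))
print(dtw(bread, butter))
```

In this made-up example `butter` is exactly twice `bread`, so the two shape-based measures (correlation and cosine similarity) treat the series as moving together, while the two level-based measures (Euclidean distance and DTW) report a large gap. That difference is one reason the choice of measure matters when grouping items by price movement.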

    Unsupervised Shift-invariant Feature Learning from Time-series Data

    Unsupervised feature learning is one of the key components of machine learning and artificial intelligence. Learning features from high-dimensional streaming data is an important and difficult problem that comes with a number of challenges. Moreover, feature learning algorithms need to be evaluated and generalized for time series with different patterns and components. A detailed study is needed to clarify when simple algorithms fail to learn features and whether we need more complicated methods. In this thesis, we show that the systematic way to learn meaningful features from time series is by using convolutional or shift-invariant versions of unsupervised feature learning. We experimentally compare the shift-invariant versions of clustering, sparse coding and non-negative matrix factorization algorithms for reconstruction, noise separation, prediction, classification and simulating auditory filters from acoustic signals. The results show that the most efficient and highly scalable clustering algorithm, with a simple modification in the inference and learning phases, is able to produce meaningful results. Clustering features are also comparable with sparse coding and non-negative matrix factorization in most of the tasks (e.g. classification) and even more successful in some (e.g. prediction). Shift-invariant sparse coding is also used in a novel application, inferring hearing loss from speech signals, and produced promising results. The performance of the algorithms with regard to some important factors, such as time series components, number of features and size of receptive field, is also analyzed. The results show that there is a significant positive correlation between the performance of clustering and the degree of trend, frequency skewness, frequency kurtosis and serial correlation of the data, whereas the correlation is negative in the case of dataset average bandwidth. The performance of shift-invariant sparse coding is affected by frequency skewness, frequency kurtosis and serial correlation of the data.
Non-negative matrix factorization is influenced by the same data characteristics as clustering.

    Online pattern recognition in subsequence time series clustering

    One of the open issues in subsequence time series clustering is online pattern recognition. This kind of clustering is applied in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. Across these fields, pattern recognition is one of the essential concepts. To implement the idea of online pattern recognition, we choose sequences of ECG data as the subsequence time series data. Additionally, using ECG data can help to interpret heart activity for detecting heart diseases. This paper offers a way to perform online pattern recognition in subsequence time series clustering in order to obtain results at runtime.

    Subspace discovery for video anomaly detection

    PhD. In automated video surveillance, anomaly detection is a challenging task. We address this task as a novelty detection problem where pattern description is limited and labelling information is available only for a small sample of normal instances. Classification under these conditions is prone to over-fitting. The contribution of this work is to propose a novel video abnormality detection method that does not need object detection and tracking. The method is based on subspace learning to discover a subspace where abnormality detection is easier to perform, without the need for detailed annotation and description of these patterns. The problem is formulated as one-class classification utilising a low-dimensional subspace, where a novelty classifier is used to learn normal actions automatically and then to detect abnormal actions from low-level features extracted from a region of interest. The subspace is discovered (using both labelled and unlabelled data) by a locality-preserving graph-based algorithm that utilises the Graph Laplacian of a specially designed parameter-less nearest neighbour graph. The methodology compares favourably with alternative subspace learning algorithms (both linear and non-linear) and direct one-class classification schemes commonly used for off-line abnormality detection in synthetic and real data. Based on these findings, the framework is extended to on-line abnormality detection in video sequences, utilising multiple independent detectors deployed over the image frame to learn the local normal patterns and infer abnormality for the complete scene. The method is compared with an alternative linear method to establish advantages and limitations in on-line abnormality detection scenarios.
Analysis shows that the alternative approach is better suited for cases where the subspace learning is restricted to the labelled samples, while in the presence of additional unlabelled data the proposed approach using graph-based subspace learning is more appropriate.

    Segmentación de series temporales mediante un algoritmo multiobjetivo evolutivo (Time series segmentation using an evolutionary multi-objective algorithm)

    Extraordinary Master's Thesis Award, academic year 2015-2016. Computer Engineering

    BARCH: a business analytics problem formulation and solving framework

    The BARCH framework is a business framework that is specifically formulated to help analysts and management who want to identify and formulate a scenario to which Analytics can be applied and the outcome will have a direct impact on the business. This is the overarching public work that I have used extensively in various projects and research. This framework has been developed initially in the banking sector and has evolved progressively with successive projects. The framework’s name represents five aspects for the formulation and identification of an area that one can use Analytics to answer. The five aspects are Business, Analytics, Revenue, Cost and Human. The five aspects represent the entire system and approach to the identification, formulation, understanding and modelling of Analytic problems. The five aspects are not necessarily sequential but are interrelated in some ways where certain aspects are dependent on the other aspects. For example, revenue and cost are related to business and depend on the business from which they are derived. However, in most practices involving Analytics, Analytics are conducted independent of business and the techniques in Analytics are not derived from business directly. This lack of harmony between business and Analytics creates an unfortunate combination of factors that has led to the failure of Analytics projects for many businesses. In intensely practising Analytics and critically reflecting on every piece of work I have done, I have learned the importance of combining knowledge with skills and experience to come up with new knowledge and a form of practical wisdom. I also realize now the importance of understanding fields that are not directly related to my field of specialization. 
Through this context statement I have been able to better articulate my thinking and the complexities of practice through approaches to knowledge such as transdisciplinarity, which further supports the translation of what I can do and what needs to be done in a way that business clients can understand. Having the opportunity to explore concepts new to me from other academic fields and seeking their relevance and application in my own area of expertise has helped me considerably in the ongoing development of the BARCH framework and the successful implementation of Analytics projects. I have selected the results of three projects published in papers that are listed in Appendices A-C to demonstrate how the model can be applied to solve problems successfully compared to other frameworks. The evolution of the model involves a continual feedback loop of learning from each successive project, which contributes to the BARCH model being able not only to continuously demonstrate its applicability to various problems but also to consistently produce better and more refined results. The majority of analytical models applied to the many problems in the business environment address the problems only superficially (Bose, 2009; Krioukov et al., 2011), that is, without understanding the impact on the business as a whole. Many Analytics projects have not delivered the promised impact because the models applied are too complicated (Stubbs, 2013) to solve the root causes of the business problem. This situation is compounded by an increasing number of analysts applying Analytics to business problems without a proper understanding of the context, technique and environment (Stubbs, 2013). While many experts in the field interpret the problem as a multidisciplinary problem, the problem is in my opinion transdisciplinary in nature.

    Mining previously unknown patterns in time series data

    The emerging importance of distributed computing systems raises the need for a better understanding of system performance. As a major indicator of system performance, CPU host load can be analysed to evaluate system performance in many ways. Discovering similar patterns in CPU host load is very useful, since many applications rely on the patterns mined from it, such as pattern-based prediction, classification and relative rule mining of CPU host load. Essentially, the problem of mining patterns in CPU host load is one of mining time series data. Due to the complexity of the problem, many traditional mining techniques for time series data are no longer suitable. Compared with mining known patterns in time series, mining unknown patterns is a much more challenging task. In this thesis, we investigate the major difficulties of the problem and develop techniques for mining unknown patterns by extending the traditional techniques for mining known patterns. We develop two different CPU host load discovery methods, the segment-based method and the reduction-based method, to optimize the pattern discovery process. The segment-based method works by extracting segment features, while the reduction-based method works by reducing the size of the raw data. The segment-based pattern discovery method maps the CPU host load segments to a 5-dimensional space, then applies the DBSCAN clustering method to discover similar segments. The reduction-based method reduces the dimensionality and numerosity of the CPU host load to reduce the search space. A cascade method is proposed to support accurate pattern mining while maintaining efficiency. The investigations into the CPU host load data inspired us to further develop a pattern mining algorithm for general time series data. The method filters out the unlikely starting positions for reoccurring patterns at an early stage and then iteratively locates all best-matching patterns.
The results obtained by our method do not contain any meaningless patterns, which has been a difficult issue for a long time. Compared with state-of-the-art techniques, our method is more efficient and effective in most scenarios.
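
The segment-based idea described in this abstract (map each segment to a 5-dimensional feature vector, then cluster with DBSCAN) can be sketched roughly as follows. The abstract does not say which five features the thesis uses, so the ones below (mean, standard deviation, minimum, maximum, slope) are assumptions chosen for illustration. The fixed segment length, the `eps` and `min_pts` values, and the load values are likewise hypothetical, and the small `dbscan` function is a textbook version written here to keep the example self-contained.

```python
import math

def segment_features(seg):
    """Map one segment to 5 numbers (assumed features, not the thesis's own)."""
    n = len(seg)
    mean = sum(seg) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in seg) / n)
    slope = (seg[-1] - seg[0]) / (n - 1)  # crude end-to-end slope
    return (mean, std, min(seg), max(seg), slope)

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one cluster label per point, -1 meaning noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep growing
    return labels

# Hypothetical CPU host load trace, cut into fixed-length segments of 4 samples.
load = [0.1, 0.1, 0.2, 0.1,
        0.9, 0.9, 0.8, 0.9,
        0.1, 0.2, 0.1, 0.1,
        0.9, 0.8, 0.9, 0.9,
        0.1, 0.9, 0.1, 0.9]
segments = [load[i:i + 4] for i in range(0, len(load), 4)]
features = [segment_features(s) for s in segments]
labels = dbscan(features, eps=0.2, min_pts=2)
print(labels)
```

With these made-up values the two low-load segments share one label, the two high-load segments share another, and the oscillating last segment is marked as noise. In practice the features would probably need scaling before clustering, because DBSCAN applies a single `eps` radius to all dimensions.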

    Smart Urban Water Networks

    This book presents the paper form of the Special Issue (SI) on Smart Urban Water Networks. The number and topics of the papers in the SI confirm the growing interest of operators and researchers in the new paradigm of smart networks, as part of the more general smart city. The SI showed that digital information and communication technology (ICT), with the implementation of smart meters and other digital devices, can significantly improve the modelling and the management of urban water networks, contributing to a radical transformation of the traditional paradigm of water utilities. The paper collection in this SI includes different crucial topics such as the reliability, resilience, and performance of water networks, innovative demand management, and the novel challenge of real-time control and operation, along with their implications for cyber-security. The SI collected fourteen papers that provide a wide perspective of solutions, trends, and challenges in the context of smart urban water networks. Some solutions have already been implemented in pilot sites (i.e., for water network partitioning, cyber-security, and water demand disaggregation and forecasting), while further investigations are required for other methods, e.g., the data-driven approaches for real-time control. In all cases, a new deal between academia, industry, and governments must be embraced to start the new era of smart urban water systems.