335 research outputs found

    Dirichlet belief networks for topic structure learning

    Full text link
    Recently, considerable research effort has been devoted to developing deep architectures for topic models to learn topic structures. Although several deep models have been proposed to learn better topic proportions of documents, how to leverage the benefits of deep structures for learning word distributions of topics has not yet been rigorously studied. Here we propose a new multi-layer generative process on word distributions of topics, where each layer consists of a set of topics and each topic is drawn from a mixture of the topics of the layer above. As the topics in all layers can be directly interpreted by words, the proposed model is able to discover interpretable topic hierarchies. As a self-contained module, our model can be flexibly adapted to different kinds of topic models to improve their modelling accuracy and interpretability. Extensive experiments on text corpora demonstrate the advantages of the proposed model.Comment: accepted in NIPS 201

    An Overview of the Use of Neural Networks for Data Mining Tasks

    Get PDF
    In the recent years the area of data mining has experienced a considerable demand for technologies that extract knowledge from large and complex data sources. There is a substantial commercial interest as well as research investigations in the area that aim to develop new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies, whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights from a data mining perspective the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks

    Topic Modelling of Swedish Newspaper Articles about Coronavirus: a Case Study using Latent Dirichlet Allocation Method

    Full text link
    Topic Modelling (TM) is from the research branches of natural language understanding (NLU) and natural language processing (NLP) that is to facilitate insightful analysis from large documents and datasets, such as a summarisation of main topics and the topic changes. This kind of discovery is getting more popular in real-life applications due to its impact on big data analytics. In this study, from the social-media and healthcare domain, we apply popular Latent Dirichlet Allocation (LDA) methods to model the topic changes in Swedish newspaper articles about Coronavirus. We describe the corpus we created including 6515 articles, methods applied, and statistics on topic changes over approximately 1 year and two months period of time from 17th January 2020 to 13th March 2021. We hope this work can be an asset for grounding applications of topic modelling and can be inspiring for similar case studies in an era with pandemics, to support socio-economic impact research as well as clinical and healthcare analytics. Our data and source code are openly available at https://github. com/poethan/Swed_Covid_TM Keywords: Latent Dirichlet Allocation (LDA); Topic Modelling; Coronavirus; Pandemics; Natural Language Understanding; BERT-topicComment: 14 pages, 14 figure

    Strategic decision-making in multi-agent markets: The emergence of endogenous crises and volatility

    Get PDF
    Traditional economic frameworks are built upon perfectly rational agents and equilibrium outcomes. However, during times of crises, these frameworks prove insufficient. In this thesis, we take an alternative perspective based on "Complexity Economics", relaxing the assumption of perfectly rational agents and allowing for out-of-equilibrium dynamics. While many contemporary approaches explain crises and non-equilibrium market phenomena as the rational reaction to external news, the emergence of endogenous crises remains an open question. We begin addressing this question by demonstrating how a multi-agent model of heterogeneous boundedly rational agents acting according to heuristics can reproduce and forecast key non-linear price movements in the Australian housing market, during boom and bust cycles. In order to provide foundations for such heuristic-based reasoning, we then propose a novel information-theoretic approach, Quantal Hierarchy, for modelling limitations in strategic reasoning, demonstrating how this convincingly and generically captures the decision-making of interacting agents in competitive markets outperforming existing approaches. In addition, we demonstrate how a concise generalised market model can generate important stylised facts, such as fat-tails and volatility clustering, and allow for the emergence of crises, purely endogenously. This thesis provides support to the interacting agent hypothesis, addressing a crucial question of whether crisis emergence and various stylised facts can be seen as endogenous phenomena, and provides a generic method for representing strategic agent reasoning

    Display Advertising with Real-Time Bidding (RTB) and Behavioural Targeting

    Get PDF
    The most significant progress in recent years in online display advertising is what is known as the Real-Time Bidding (RTB) mechanism to buy and sell ads. RTB essentially facilitates buying an individual ad impression in real time while it is still being generated from a user’s visit. RTB not only scales up the buying process by aggregating a large amount of available inventories across publishers but, most importantly, enables direct targeting of individual users. As such, RTB has fundamentally changed the landscape of digital marketing. Scientifically, the demand for automation, integration and optimisation in RTB also brings new research opportunities in information retrieval, data mining, machine learning and other related fields. In this monograph, an overview is given of the fundamental infrastructure, algorithms, and technical solutions of this new frontier of computational advertising. The covered topics include user response prediction, bid landscape forecasting, bidding algorithms, revenue optimisation, statistical arbitrage, dynamic pricing, and ad fraud detection

    Display Advertising with Real-Time Bidding (RTB) and Behavioural Targeting

    Get PDF
    The most significant progress in recent years in online display advertising is what is known as the Real-Time Bidding (RTB) mechanism to buy and sell ads. RTB essentially facilitates buying an individual ad impression in real time while it is still being generated from a user’s visit. RTB not only scales up the buying process by aggregating a large amount of available inventories across publishers but, most importantly, enables direct targeting of individual users. As such, RTB has fundamentally changed the landscape of digital marketing. Scientifically, the demand for automation, integration and optimisation in RTB also brings new research opportunities in information retrieval, data mining, machine learning and other related fields. In this monograph, an overview is given of the fundamental infrastructure, algorithms, and technical solutions of this new frontier of computational advertising. The covered topics include user response prediction, bid landscape forecasting, bidding algorithms, revenue optimisation, statistical arbitrage, dynamic pricing, and ad fraud detection

    Analyzing Granger causality in climate data with time series classification methods

    Get PDF
    Attribution studies in climate science aim for scientifically ascertaining the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested

    Energy efficient enabling technologies for semantic video processing on mobile devices

    Get PDF
    Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object based paradigm has many undeniable benefits, numerous technical challenges remain before the applications becomes pervasive, particularly on computational constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery powered mobile computing devices, the additional algorithmic complexity of semantic object based processing compared to conventional video processing is highly undesirable both from a real-time operation and battery life perspective. This thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to be offloaded from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object based shape encoding, a novel energy efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourable with the relevant prior art

    Forecasting: theory and practice

    Get PDF
    Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts. We do not claim that this review is an exhaustive list of methods and applications. However, we wish that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear. We offer cross-references to allow the readers to navigate through the various topics. We complement the theoretical concepts and applications covered by large lists of free or open-source software implementations and publicly-available databases.info:eu-repo/semantics/publishedVersio
    corecore