Dirichlet belief networks for topic structure learning
Recently, considerable research effort has been devoted to developing deep
architectures for topic models to learn topic structures. Although several deep
models have been proposed to learn better topic proportions of documents, how
to leverage the benefits of deep structures for learning word distributions of
topics has not yet been rigorously studied. Here we propose a new multi-layer
generative process on word distributions of topics, where each layer consists
of a set of topics and each topic is drawn from a mixture of the topics of the
layer above. As the topics in all layers can be directly interpreted by words,
the proposed model is able to discover interpretable topic hierarchies. As a
self-contained module, our model can be flexibly adapted to different kinds of
topic models to improve their modelling accuracy and interpretability.
Extensive experiments on text corpora demonstrate the advantages of the
proposed model. Comment: accepted in NIPS 201
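The layered generative process described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Dirichlet parameterisation, the symmetric prior `eta` on the top layer, the Dirichlet-distributed mixture weights, the `concentration` scale, and the small smoothing constant are all assumptions made for illustration, and every function name is hypothetical.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_topic_hierarchy(vocab_size, topics_per_layer, rng,
                           eta=0.5, concentration=50.0):
    """Return a list of layers, top layer first.

    Each layer is a list of topics; each topic is a word distribution
    (a list of vocab_size probabilities that sums to 1).
    """
    # Top layer has no layer above it, so it uses a symmetric prior (assumption).
    top = [sample_dirichlet([eta] * vocab_size, rng)
           for _ in range(topics_per_layer[0])]
    layers = [top]
    for n_topics in topics_per_layer[1:]:
        parents = layers[-1]
        layer = []
        for _ in range(n_topics):
            # Mixture weights over the topics of the layer above (assumed prior).
            weights = sample_dirichlet([1.0] * len(parents), rng)
            mixture = [sum(w * p[v] for w, p in zip(weights, parents))
                       for v in range(vocab_size)]
            # The child topic is a Dirichlet draw centred on that mixture;
            # the 1e-6 term only keeps every parameter strictly positive.
            alpha = [concentration * m + 1e-6 for m in mixture]
            layer.append(sample_dirichlet(alpha, rng))
        layers.append(layer)
    return layers

if __name__ == "__main__":
    rng = random.Random(0)
    hierarchy = sample_topic_hierarchy(vocab_size=8, topics_per_layer=[2, 3, 4], rng=rng)
    for depth, layer in enumerate(hierarchy):
        print("layer", depth, "has", len(layer), "topics")
```

Because every topic at every layer is a distribution over the same vocabulary, each one can be read off directly as a ranked word list, which is the interpretability property the abstract emphasises.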
An Overview of the Use of Neural Networks for Data Mining Tasks
In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.
Topic Modelling of Swedish Newspaper Articles about Coronavirus: a Case Study using Latent Dirichlet Allocation Method
Topic Modelling (TM) is a research branch of natural language
understanding (NLU) and natural language processing (NLP) that facilitates
insightful analysis of large documents and datasets, such as summarising the
main topics and tracking topic changes. This kind of discovery is becoming more
popular in real-life applications due to its impact on big data analytics. In
this study, from the social-media and healthcare domain, we apply popular
Latent Dirichlet Allocation (LDA) methods to model the topic changes in Swedish
newspaper articles about Coronavirus. We describe the corpus we created,
comprising 6515 articles, the methods applied, and statistics on topic changes
over a period of approximately one year and two months, from 17th January 2020
to 13th March 2021. We hope this work can be an asset for grounding applications
of topic modelling and can be inspiring for similar case studies in an era with
pandemics, to support socio-economic impact research as well as clinical and
healthcare analytics. Our data and source code are openly available at
https://github.com/poethan/Swed_Covid_TM
Keywords: Latent Dirichlet Allocation (LDA); Topic Modelling; Coronavirus;
Pandemics; Natural Language Understanding; BERT-topic
Comment: 14 pages, 14 figures
Strategic decision-making in multi-agent markets: The emergence of endogenous crises and volatility
Traditional economic frameworks are built upon perfectly rational agents and equilibrium outcomes. However, during times of crises, these frameworks prove insufficient. In this thesis, we take an alternative perspective based on "Complexity Economics", relaxing the assumption of perfectly rational agents and allowing for out-of-equilibrium dynamics. While many contemporary approaches explain crises and non-equilibrium market phenomena as the rational reaction to external news, the emergence of endogenous crises remains an open question.
We begin addressing this question by demonstrating how a multi-agent model of heterogeneous boundedly rational agents acting according to heuristics can reproduce and forecast key non-linear price movements in the Australian housing market during boom and bust cycles. In order to provide foundations for such heuristic-based reasoning, we then propose a novel information-theoretic approach, Quantal Hierarchy, for modelling limitations in strategic reasoning, demonstrating how this convincingly and generically captures the decision-making of interacting agents in competitive markets, outperforming existing approaches. In addition, we demonstrate how a concise generalised market model can generate important stylised facts, such as fat tails and volatility clustering, and allow for the emergence of crises, purely endogenously. This thesis provides support to the interacting agent hypothesis, addressing a crucial question of whether crisis emergence and various stylised facts can be seen as endogenous phenomena, and provides a generic method for representing strategic agent reasoning.
Display Advertising with Real-Time Bidding (RTB) and Behavioural Targeting
The most significant progress in recent years in online display advertising is what is known as the Real-Time Bidding (RTB) mechanism to buy and sell ads. RTB essentially facilitates buying an individual ad impression in real time while it is still being generated from a user’s visit. RTB not only scales up the buying process by aggregating a large amount of available inventories across publishers but, most importantly, enables direct targeting of individual users. As such, RTB has fundamentally changed the landscape of digital marketing. Scientifically, the demand for automation, integration and optimisation in RTB also brings new research opportunities in information retrieval, data mining, machine learning and other related fields. In this monograph, an overview is given of the fundamental infrastructure, algorithms, and technical solutions of this new frontier of computational advertising. The covered topics include user response prediction, bid landscape forecasting, bidding algorithms, revenue optimisation, statistical arbitrage, dynamic pricing, and ad fraud detection
Analyzing Granger causality in climate data with time series classification methods
Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships while using traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.
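For readers unfamiliar with the autoregressive baseline mentioned in this abstract, the following is a minimal sketch of a Granger-style test, not the procedure used in the article. It assumes a single lag, no intercept, and synthetic data in which `x` drives `y`; the function names and all numeric settings are hypothetical. The idea is to check whether adding the lagged value of `x` reduces the error of an autoregressive model of `y`.

```python
import random

def lag1_granger_f(x, y):
    """F-statistic for 'x Granger-causes y' using lag-1 models without intercept."""
    y_now, y_prev, x_prev = y[1:], y[:-1], x[:-1]
    n = len(y_now)

    # Sums needed for the least-squares normal equations.
    syy = sum(p * p for p in y_prev)
    sxx = sum(q * q for q in x_prev)
    syx = sum(p * q for p, q in zip(y_prev, x_prev))
    sy = sum(p * c for p, c in zip(y_prev, y_now))
    sx = sum(q * c for q, c in zip(x_prev, y_now))

    # Restricted model: y[t] = a * y[t-1]
    a_r = sy / syy
    rss_r = sum((c - a_r * p) ** 2 for p, c in zip(y_prev, y_now))

    # Unrestricted model: y[t] = a * y[t-1] + b * x[t-1] (2x2 system, Cramer's rule)
    det = syy * sxx - syx * syx
    a_u = (sy * sxx - sx * syx) / det
    b_u = (syy * sx - syx * sy) / det
    rss_u = sum((c - a_u * p - b_u * q) ** 2
                for p, q, c in zip(y_prev, x_prev, y_now))

    # One added parameter; the unrestricted model has two parameters in total.
    return (rss_r - rss_u) / (rss_u / (n - 2))

def make_series(n, rng):
    """Synthetic pair in which y[t] depends on x[t-1] plus small noise (assumption)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(0.0, 0.1)]
    for t in range(1, n):
        y.append(0.8 * x[t - 1] + rng.gauss(0.0, 0.1))
    return x, y

if __name__ == "__main__":
    x, y = make_series(500, random.Random(0))
    print("F for x -> y:", lag1_granger_f(x, y))
    print("F for y -> x:", lag1_granger_f(y, x))
```

A large F-statistic in one direction and a small one in the other is the asymmetry such a test looks for; a practical analysis would also choose the lag order, include an intercept, and compare the statistic against an F distribution.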
Energy efficient enabling technologies for semantic video processing on mobile devices
Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object-based paradigm has many undeniable benefits, numerous technical challenges remain before the applications become pervasive, particularly on computationally constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery-powered mobile computing devices, the additional algorithmic complexity of semantic object-based processing compared to conventional video processing is highly undesirable from both a real-time operation and a battery life perspective. This thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to offloading from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object-based shape encoding, a novel energy-efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourably with the relevant prior art.
Forecasting: theory and practice
Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts.
We do not claim that this review is an exhaustive list of methods and applications. However, we wish that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear. We offer cross-references to allow the readers to navigate through the various topics. We complement the theoretical concepts and applications covered by large lists of free or open-source software implementations and publicly-available databases.