Infinite Factorial Finite State Machine for Blind Multiuser Channel Estimation
New communication standards need to deal with machine-to-machine
communications, in which users may start or stop transmitting at any time in an
asynchronous manner. Thus, the number of users is an unknown and time-varying
parameter that needs to be accurately estimated in order to properly recover
the symbols transmitted by all users in the system. In this paper, we address
the problem of joint channel parameter and data estimation in a multiuser
communication channel in which the number of transmitters is not known. For
that purpose, we develop the infinite factorial finite state machine model, a
Bayesian nonparametric model based on the Markov Indian buffet that allows for
an unbounded number of transmitters with arbitrary channel length. We propose
an inference algorithm that makes use of slice sampling and particle Gibbs with
ancestor sampling. Our approach is fully blind as it does not require a prior
channel estimation step, prior knowledge of the number of transmitters, or any
signaling information. Our experimental results, loosely based on the LTE
random access channel, show that the proposed approach can effectively recover
the data-generating process for a wide range of scenarios, with varying number
of transmitters, number of receivers, constellation order, channel length, and
signal-to-noise ratio. Comment: 15 pages, 15 figures
Anomaly Detection in Streaming Sensor Data
In this chapter we consider a cell phone network as a set of automatically
deployed sensors that records movement and interaction patterns of the
population. We discuss methods for detecting anomalies in the streaming data
produced by the cell phone network. We motivate this discussion by describing
the Wireless Phone Based Emergency Response (WIPER) system, a proof-of-concept
decision support system for emergency response managers. We also discuss some
of the scientific work enabled by this type of sensor data and the related
privacy issues. We describe scientific studies that use the cell phone data set
and steps we have taken to ensure the security of the data. We describe the
overall decision support system and discuss three methods of anomaly detection
that we have applied to the data. Comment: 35 pages. Book chapter to appear in "Intelligent Techniques for
Warehousing and Mining Sensor Network Data" (IGI Global), edited by A.
Cuzzocre
Statistical modelling of clickstream behaviour to inform real-time advertising decisions
Online user browsing generates vast quantities of typically unexploited data. Investigating this data and uncovering the valuable information it contains can be of substantial value to online businesses, and statistics plays a key role in this process.
The data takes the form of an anonymous digital footprint associated with each unique visitor, resulting in unique profiles across individual page visits on a daily basis. Exploring, cleaning and transforming data of this scale and high dimensionality (2TB+ of memory) is particularly challenging, and requires cluster computing.
We outline a variable selection method to summarise clickstream behaviour with a single value, and make comparisons to other dimension reduction techniques. We illustrate how to apply generalised linear models and zero-inflated models to predict sponsored search advert clicks based on keywords.
We consider the problem of predicting customer purchases (known as conversions), from the customer’s journey or clickstream, which is the sequence of pages seen during a single visit to a website. We consider each page as a discrete state with probabilities of transitions between the pages, providing the basis for a simple Markov model.
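The simple Markov model described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `fit_transition_probs`, the page labels, and the example sessions are all hypothetical. Transition probabilities are estimated by counting page-to-page moves within each visit and normalising each row.

```python
from collections import Counter, defaultdict


def fit_transition_probs(sessions):
    """Estimate first-order page-to-page transition probabilities.

    `sessions` is a list of visits, each a list of page labels in the
    order they were seen. Returns {source_page: {next_page: probability}}.
    """
    counts = defaultdict(Counter)
    for pages in sessions:
        # Count every consecutive (current page, next page) pair in a visit.
        for src, dst in zip(pages, pages[1:]):
            counts[src][dst] += 1

    probs = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        # Normalise counts so each source page's probabilities sum to 1.
        probs[src] = {dst: n / total for dst, n in dsts.items()}
    return probs


# Hypothetical clickstreams; the page names are made up for illustration.
sessions = [
    ["home", "product", "basket", "checkout"],
    ["home", "product", "home"],
    ["home", "search", "product", "basket"],
]
probs = fit_transition_probs(sessions)
print(probs["home"])
```

Pages that only ever appear at the end of a visit have no outgoing row in this sketch; a fuller model would typically add an explicit exit or conversion state.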
Further, hidden Markov models (HMMs) are applied to relate the observed clickstream to a sequence of hidden states, uncovering meta-states of user activity. We also apply conventional logistic regression to model conversions in terms of summaries of the profile’s browsing behaviour. Both approaches are incorporated into a set of tools that covers a wide range of conversion types and allows the predictive capability of each model to be compared directly.
In real time, predicting which profiles are likely to follow behaviour patterns similar to those of known conversions will have a critical impact on targeted advertising. We illustrate these analyses with results from real data collected by an Audience Management Platform (AMP) - Carbon
Representing Conversations for Scalable Overhearing
Open distributed multi-agent systems are gaining interest in the academic
community and in industry. In such open settings, agents are often coordinated
using standardized agent conversation protocols. The representation of such
protocols (for analysis, validation, monitoring, etc.) is an important aspect of
multi-agent applications. Recently, Petri nets have been shown to be an
interesting approach to such representation, and radically different approaches
using Petri nets have been proposed. However, their relative strengths and
weaknesses have not been examined. Moreover, their scalability and suitability
for different tasks have not been addressed. This paper addresses both these
challenges. First, we analyze existing Petri net representations in terms of
their scalability and appropriateness for overhearing, an important task in
monitoring open multi-agent systems. Then, building on the insights gained, we
introduce a novel representation using Colored Petri nets that explicitly
represent legal joint conversation states and messages. This representation
approach offers significant improvements in scalability and is particularly
suitable for overhearing. Furthermore, we show that this new representation
offers a comprehensive coverage of all conversation features of FIPA
conversation standards. We also present a procedure for transforming AUML
conversation protocol diagrams (a standard human-readable representation) to
our Colored Petri net representation
Bayesian Nonparametric Approaches for Modelling Stochastic Temporal Events
Modelling stochastic temporal events is a classic machine learning problem that has drawn enormous research attention over recent decades. Traditional approaches focused heavily on parametric models that pre-specify model complexity. Comprehensive model comparison and selection are necessary to prevent over-fitting and under-fitting problems.
The recently developed Bayesian nonparametric learning framework provides an appealing alternative to traditional approaches. It can automatically learn the model complexity from data. In this thesis, I propose a set of Bayesian nonparametric approaches for stochastic temporal event modelling that take into account event similarity, interaction, occurrence time and emitted observation. Specifically, I tackle the following three main challenges in the modelling.
1. Data sparsity. The data sparsity problem is common in many real-world temporal event modelling applications, e.g., water pipe failure prediction. A Bayesian nonparametric model that allows pipes with similar behaviour to share failure data is proposed to attain more effective failure prediction. It is shown that flexible event clustering can help alleviate the data sparsity problem. The clustering process is fully data-driven and does not require predefining the number of clusters.
2. Event interaction. Stochastic events can interact with each other over time. One event can cause or repel the occurrence of other events. A previously unexplored theoretical bridge is established between interaction point processes and the distance-dependent Chinese restaurant process. Hence an integrated model, namely the infinite branching model, is developed to estimate point event intensity, interaction mechanism and branching structure simultaneously.
3. Event correlation. Stochastic temporal events are correlated not only between arrival times but also between observations. A novel unified Bayesian nonparametric model that generalizes the hidden Markov model and interaction point processes is constructed to exploit the two types of underlying correlation in a well-integrated way rather than individually. The proposed model provides a comprehensive insight into the interaction mechanism and correlation between events.
Finally, a future vision of Bayesian nonparametric research for stochastic temporal events is highlighted from both application and modelling perspectives
Tools and Algorithms for the Construction and Analysis of Systems
This open access two-volume set constitutes the proceedings of the 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2021, which was held during March 27 – April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The 41 full papers presented in the proceedings were carefully reviewed and selected from 141 submissions. The volume also contains 7 tool papers, 6 Tool Demo papers, and 9 SV-Comp Competition papers. The papers are organized in topical sections as follows: Part I: Game Theory; SMT Verification; Probabilities; Timed Systems; Neural Networks; Analysis of Network Communication. Part II: Verification Techniques (not SMT); Case Studies; Proof Generation/Validation; Tool Papers; Tool Demo Papers; SV-Comp Tool Competition Papers