27 research outputs found

    A contribution to adaptive robust estimation

    Includes bibliography. This study initially set out to consider the possibility of constructing an adaptive robust estimation procedure for the standard linear regression model when the disturbance vector deviates from normality. After initial success in that field, it seemed appropriate to extend the approach to robust location parameter estimation, which is a particular case of the regression model and an area in which a number of different estimators have been proposed and a great deal of comparative research has been done. Because of the wider scope of such research, the greater part of the thesis is devoted to this field, which led to many interesting and useful results and conclusions.

    Modelling computer network traffic using wavelets and time series analysis

    Modelling of network traffic is a notoriously difficult problem. This is primarily due to the ever-increasing complexity of network traffic and the different ways in which a network may be excited by user activity. The ongoing development of new network applications, protocols, and usage profiles further necessitates models that are able to adapt to the specific networks in which they are deployed. These considerations have in large part driven the evolution of statistical profiles of network traffic from simple Poisson processes to non-Gaussian models that incorporate traffic burstiness, non-stationarity, self-similarity, long-range dependence (LRD) and multi-fractality. The need for ever more sophisticated network traffic models has since led to the specification of a myriad of traffic models, many of which are listed in [91, 14]. In networks composed of IoT devices, much of the traffic is generated by devices that function autonomously and in a more deterministic fashion. This dissertation therefore undertakes the building of time series models for IoT network traffic. A broad review of the historical development of network traffic modelling is presented, tracing a path that leads to the use of time series analysis for this task. An introduction to time series analysis is provided in order to facilitate the theoretical discussion of the feasibility and suitability of time series analysis techniques for modelling network traffic. The theory is followed by a summary of the techniques and methodology that might be used to detect, remove and/or model the typical characteristics associated with network traffic, such as linear trends, cyclic trends, periodicity, fractality, and long-range dependence. A set of experiments is conducted in order to determine the effect of fractality on the estimation of the AR and MA components of a time series model. A comparison of various Hurst estimation techniques is also performed on synthetically generated data. The wavelet-based Abry-Veitch Hurst estimator is found to perform consistently well with respect to its competitors, and the subsequent removal of fractality via fractional differencing is found to provide a substantial improvement in the estimation of time series model parameters.
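The fractional differencing step mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes the standard binomial expansion of (1 - B)^d, where B is the backshift operator. The function names `frac_diff_weights` and `frac_diff`, and the choice to truncate the filter at the series length, are illustrative assumptions and are not taken from the dissertation.

```python
import numpy as np

def frac_diff_weights(d, n):
    """First n coefficients of the expansion of (1 - B)^d.

    Uses the recursion w[0] = 1, w[k] = w[k-1] * (k - 1 - d) / k.
    """
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

def frac_diff(x, d):
    """Apply the truncated fractional difference filter to a 1-D series.

    y[t] = sum_{k=0}^{t} w[k] * x[t - k], so early values use fewer lags.
    """
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, len(x))
    # x[t::-1] is x[t], x[t-1], ..., x[0], matching w[0], ..., w[t]
    return np.array([np.dot(w[: t + 1], x[t::-1]) for t in range(len(x))])
```

With d = 1 the filter reduces to the ordinary first difference, which gives a quick sanity check. For a stationary long-memory series with Hurst exponent H, the differencing order is commonly taken as d = H - 0.5, so an estimate of H can be plugged in directly.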

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking is established for these four hotel units, all located in Portugal, using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiency in the estimation process, and so to investigate the main causes of inefficiency. Several suggestions for improving efficiency are made for each hotel studied.

    Vol. 16, No. 1 (Full Issue)

    Vol. 16, No. 2 (Full Issue)

    On factor models for high-dimensional time series

    The aim of this thesis is to develop statistical methods for use with factor models for high-dimensional time series. We consider three broad areas: estimation, changepoint detection, and determination of the number of factors. In Chapter 1, we sketch the backdrop for our thesis and review key aspects of the literature. In Chapter 2, we develop a method to estimate the factors and parameters in an approximate dynamic factor model. Specifically, we present a spectral expectation-maximisation (or "spectral EM") algorithm, whereby we derive the E and M step equations in the frequency domain. Our E step relies on the Wiener-Kolmogorov smoother, the frequency domain counterpart of the Kalman smoother, and our M step is based on maximisation of the Whittle likelihood with respect to the parameters of the model. We initialise our procedure using dynamic principal components analysis (or "dynamic PCA"), and by leveraging results on lag-window estimators of spectral density by Wu and Zaffaroni (2018), we establish consistency-with-rates of our spectral EM estimator of the parameters and factors as both the dimension (N) and the sample size (T) go to infinity. We find rates commensurate with the literature. Finally, we conduct a simulation study to numerically validate our theoretical results. In Chapter 3, we develop a sequential procedure to detect changepoints in an approximate static factor model. Specifically, we define a ratio of eigenvalues of the covariance matrix of N observed variables. We compute this ratio each period using a rolling window of size m over time, and declare a changepoint when its value breaches an alarm threshold. We investigate the asymptotic behaviour (as N, m → ∞) of our ratio, and prove that, for specific eigenvalues, the ratio will spike upwards when a changepoint is encountered but not otherwise. We use a block-bootstrap to obtain alarm thresholds. We present simulation results and an empirical application based on Financial Times Stock Exchange 100 Index (or "FTSE 100") data. In Chapter 4, we conduct an exploratory analysis which aims to extend the randomised sequential procedure of Trapani (2018) into the frequency domain. Specifically, we aim to estimate the number of dynamically loaded factors by applying the test of Trapani (2018) to eigenvalues of the estimated spectral density matrix (as opposed to the covariance matrix) of the data.
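The rolling-window monitoring scheme described for Chapter 3 in the abstract above can be sketched roughly in Python. The abstract does not say which eigenvalues enter the ratio, so the indices `i` and `j` are left as hypothetical parameters. The function names and the fixed scalar threshold are likewise assumptions for illustration only; the thesis obtains its thresholds by block bootstrap, which is not reproduced here.

```python
import numpy as np

def rolling_eig_ratio(X, m, i, j):
    """Ratio of the i-th to the j-th largest eigenvalue (1-based) of the
    sample covariance matrix, computed over each rolling window of m rows.

    X has shape (T, N) with N >= 2: T time periods, N observed variables.
    Returns an array of length T - m + 1, one ratio per window.
    """
    X = np.asarray(X, dtype=float)
    T = X.shape[0]
    ratios = np.empty(T - m + 1)
    for start in range(T - m + 1):
        window = X[start:start + m]
        cov = np.cov(window, rowvar=False)     # (N, N) sample covariance
        eig = np.linalg.eigvalsh(cov)[::-1]    # eigenvalues, descending
        ratios[start] = eig[i - 1] / eig[j - 1]
    return ratios

def first_alarm(ratios, threshold):
    """Index of the first window whose ratio exceeds the threshold, or None."""
    hits = np.flatnonzero(np.asarray(ratios) > threshold)
    return int(hits[0]) if hits.size else None
```

In use, `rolling_eig_ratio` would be called on the observed panel and `first_alarm` on its output, with the threshold supplied by whatever calibration procedure is adopted.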

    Robust Adaptation and Learning Over Networks

    This doctoral dissertation centers on robust adaptive networks. Robust adaptation strategies are devised to solve typical network inference tasks such as estimation and detection in a decentralized manner in the presence of impulsive contamination. Typical in wireless communication environments, an impulsive noise process can be described as one whose realizations contain sparse, random samples of amplitude much higher than nominally accounted for. An attractive feature that these robust adaptive strategies enjoy is that neither their development nor operation hinges on the availability of exact knowledge of the noise distribution: The robust adaptive strategies are capable of learning it on-the-fly and adapting their parameters accordingly. Forgoing data fusion centers, the network agents employing these strategies rely solely on local interactions and in-network processing to perform inference tasks, which renders networks more reliable, resilient to node and link failure, scalable, and resource efficient. Distributed cooperative processing finds applications in many areas including wireless sensor networks in smart-home, environmental, and industrial monitoring; healthcare; and military surveillance. Since adaptive systems based on the mean-square-error criterion see their performance degrade in the presence of non-Gaussian noise, the robust adaptive strategies developed in this dissertation harness nonlinear data processing and robust statistics instead to mitigate the detrimental effects of impulsive noise. To this end, a robust adaptive filtering algorithm is developed that employs an adaptive error nonlinearity. The error nonlinearity is chosen to be a convex combination of preselected basis functions where the combination coefficients are adapted jointly with the estimate of the parameter of interest such that the mean-square-error relative to the optimal error nonlinearity is minimized in each iteration. 
Then, a robust diffusion adaptation algorithm of the adapt-then-combine variety is developed as an extension of its stand-alone counterpart for distributed estimation over networks where the measurements may be corrupted by impulsive noise. Each node in the network runs a combination of its neighbors’ estimates through one iteration of a local robust adaptive filter update to ameliorate the effects of contamination, leading to better overall network performance matching that of a centralized strategy at steady-state. Finally, the robust diffusion adaptation algorithm is extended further to solve the problem of distributed detection over adaptive networks where the measurements may be corrupted by impulsive noise. The estimates generated by the robust algorithm are used as a basis for the design of robust local detectors, where the form of the test statistics and the rule for the computation of the detection thresholds are motivated by the analysis of the algorithm dynamics. Each node in the network cooperates with its neighbors, utilizing their estimates, to update its local detector. Effectively, information pertaining to the event of interest percolates across the network, leading to enhanced detection performance. The transient and steady-state behavior of the developed algorithms is analyzed in the mean and mean-square sense using the energy conservation framework. The performance of the algorithm is also examined in the context of distributed detection. Performance is validated extensively through numerical simulations in an impulsive noise scenario, revealing the robustness of the proposed strategies in comparison with state-of-the-art algorithms as well as good agreement between theory and practice.
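To make the adapt-then-combine idea in the abstract above concrete, here is a minimal Python sketch of a single network iteration. It deliberately swaps in a simpler technique: the dissertation adapts a convex combination of basis functions as its error nonlinearity, whereas this sketch uses a fixed clipping (Huber-type) nonlinearity with an assumed cut-off `delta`. The function name, the step size `mu`, and the layout of the combination matrix `A` are all hypothetical.

```python
import numpy as np

def atc_robust_step(W, U, d, A, mu, delta):
    """One adapt-then-combine iteration with a clipped (Huber-type) error.

    W: (K, M) current estimates, one row per node.
    U: (K, M) regressors observed by each node at this time step.
    d: (K,) scalar measurements, one per node.
    A: (K, K) combination weights; A[l, k] is the weight node k assigns
       to node l, so each column is assumed to sum to one.
    """
    W = np.asarray(W, dtype=float)
    U = np.asarray(U, dtype=float)
    # Adapt: local LMS-type update driven by the clipped error, so a single
    # impulsive sample cannot move the estimate by more than mu * delta * |u|.
    err = np.asarray(d, dtype=float) - np.sum(U * W, axis=1)
    clipped = np.clip(err, -delta, delta)
    psi = W + mu * clipped[:, None] * U
    # Combine: each node averages its neighbours' intermediate estimates.
    return np.asarray(A, dtype=float).T @ psi
```

Clipping bounds the influence of any one measurement, which is the basic reason a bounded error nonlinearity helps against impulsive noise; the adaptive nonlinearity described in the abstract goes further by learning the shape of the nonlinearity itself.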

    Vol. 5, No. 1 (Full Issue)

    Bayesian Gaussian Process Models: PAC-Bayesian Generalisation Error Bounds and Sparse Approximations

    Non-parametric models and techniques enjoy a growing popularity in the field of machine learning, and among these Bayesian inference for Gaussian process (GP) models has recently received significant attention. We feel that GP priors should be part of the standard toolbox for constructing models relevant to machine learning in the same way as parametric linear models are, and the results in this thesis help to remove some obstacles on the way towards this goal. In the first main chapter, we provide a distribution-free finite sample bound on the difference between generalisation and empirical (training) error for GP classification methods. While the general theorem (the PAC-Bayesian bound) is not new, we give a much simplified and somewhat generalised derivation and point out the underlying core technique (convex duality) explicitly. Furthermore, the application to GP models is novel (to our knowledge). A central feature of this bound is that its quality depends crucially on task knowledge being encoded faithfully in the model and prior distributions, so there is a mutual benefit between a sharp theoretical guarantee and empirically well-established statistical practices. Extensive simulations on real-world classification tasks indicate an impressive tightness of the bound, in spite of the fact that many previous bounds for related kernel machines fail to give non-trivial guarantees in this practically relevant regime. In the second main chapter, sparse approximations are developed to address the problem of the unfavourable scaling of most GP techniques with large training sets. Due to its high importance in practice, this problem has received a lot of attention recently. We demonstrate the tractability and usefulness of simple greedy forward selection with information-theoretic criteria previously used in active learning (or sequential design) and develop generic schemes for automatic model selection with many (hyper)parameters. 
We suggest two new generic schemes and evaluate some of their variants on large real-world classification and regression tasks. These schemes and their underlying principles (which are clearly stated and analysed) can be applied to obtain sparse approximations for a wide regime of GP models far beyond the special cases we studied here