Generalising Exponential Distributions Using an Extended Marshall-Olkin Procedure
This paper presents a three-parameter family of distributions that includes the ordinary
exponential and the Marshall–Olkin exponential as special cases. The distribution exhibits a
monotone failure rate function, which makes it appealing for practitioners interested in reliability
and places it alongside the three-parameter gamma and Weibull families in the catalogue of
non-symmetric distributions suited to such modelling problems. Given the lack of symmetry of
this kind of distribution, various statistical and reliability properties of the model are examined.
Numerical examples based on real data illustrate the suitable behaviour of this distribution
for modelling purposes.
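The three-parameter extension itself is not specified in the abstract, but the classical Marshall–Olkin exponential that it generalises can be sketched directly. A minimal Python illustration of its survival and failure-rate functions (parameter names `lam` and `alpha` are ours); note that `alpha = 1` recovers the plain exponential with its constant hazard, while `alpha > 1` gives an increasing (monotone) failure rate:

```python
import math

def mo_exp_sf(x, lam, alpha):
    """Survival function of the Marshall-Olkin exponential:
    S(x) = alpha * exp(-lam*x) / (1 - (1 - alpha) * exp(-lam*x))."""
    s = math.exp(-lam * x)
    return alpha * s / (1.0 - (1.0 - alpha) * s)

def mo_exp_hazard(x, lam, alpha, h=1e-6):
    """Failure (hazard) rate, computed as the numerical derivative
    of the cumulative hazard -log S(x)."""
    return (math.log(mo_exp_sf(x, lam, alpha))
            - math.log(mo_exp_sf(x + h, lam, alpha))) / h

# alpha = 1 reduces to exponential(lam): constant hazard equal to lam.
# alpha > 1 yields a hazard increasing from lam/alpha toward lam.
```

This monotone-hazard behaviour is what the abstract highlights as the family's appeal for reliability work.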
Data Sketches for Disaggregated Subset Sum and Frequent Item Estimation
We introduce and study a new data sketch for processing massive datasets. It
addresses two common problems: 1) computing a sum given arbitrary filter
conditions and 2) identifying the frequent items or heavy hitters in a data
set. For the former, the sketch provides unbiased estimates with state-of-the-art
accuracy. It handles the challenging scenario in which the data is
disaggregated so that computing the per unit metric of interest requires an
expensive aggregation. For example, the metric of interest may be total clicks
per user while the raw data is a click stream with multiple rows per user. Thus
the sketch is suitable for use in a wide range of applications including
computing historical click-through rates for ad prediction, reporting user
metrics from event streams, and measuring network traffic for IP flows.
We prove and empirically show the sketch has good properties for both the
disaggregated subset sum estimation and frequent item problems. On i.i.d. data,
it not only picks out the frequent items but gives strongly consistent
estimates for the proportion of each frequent item. The resulting sketch
asymptotically draws a probability proportional to size sample that is optimal
for estimating sums over the data. For non-i.i.d. data, we show that it
typically does much better than random sampling for the frequent item problem
and never does worse. For subset sum estimation, we show that even for
pathological sequences, the variance is close to that of an optimal sampling
design. Empirically, despite the disadvantage of operating on disaggregated
data, our method matches or bests priority sampling, a state-of-the-art method
for pre-aggregated data, and performs orders of magnitude better on skewed data
compared to uniform sampling. We propose extensions to the sketch that allow it
to be used in combining multiple data sets, in distributed systems, and for
time-decayed aggregation.
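The paper's own sketch is not described in this abstract, but priority sampling, the state-of-the-art baseline it is compared against, is a concrete way to see what an unbiased subset-sum estimator looks like once the data has been aggregated (e.g. a click stream rolled up to per-user totals). A minimal sketch under that assumption (function names are ours):

```python
import random

def priority_sample(weights, k, rng=random):
    """Priority sampling: each item i gets priority q_i = w_i / u_i with
    u_i ~ Uniform(0,1]; keep the k items with the largest priorities and
    record tau, the (k+1)-th largest priority."""
    # 1.0 - rng.random() lies in (0, 1], avoiding division by zero.
    priorities = [(w / (1.0 - rng.random()), i, w)
                  for i, w in enumerate(weights)]
    priorities.sort(reverse=True)
    tau = priorities[k][0] if len(priorities) > k else 0.0
    # A kept item's weight is estimated as max(w_i, tau); dropped items
    # count as 0.  This makes every subset-sum estimate unbiased.
    return {i: max(w, tau) for _, i, w in priorities[:k]}, tau

def estimate_subset_sum(sample, subset):
    """Unbiased estimate of sum(w_i for i in subset) from the sample."""
    return sum(est for i, est in sample.items() if i in subset)
```

For example, with per-user click totals as `weights`, a subset query such as "total clicks from users in a given country" is answered from the small sample alone.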
Blind image separation based on exponentiated transmuted Weibull distribution
Blind image separation has been widely investigated in recent years, and as a
result a number of feature-extraction algorithms for direct application to such
image structures have been developed. For example, a mixture of two or more
fingerprints found at a crime scene must be separated before the individual
prints can be identified. In this paper, we propose a new technique for
separating multiple mixed images based on the exponentiated transmuted Weibull
distribution. To adaptively estimate the parameters of the score functions, an
efficient method based on maximum likelihood and a genetic algorithm is used.
We also evaluate the accuracy of the proposed distribution and compare the
performance of the efficient approach with that of other previously proposed
generalized distributions. The numerical results show that the proposed
distribution is flexible and yields efficient results.
Comment: 14 pages, 12 figures, 4 tables. International Journal of Computer
Science and Information Security (IJCSIS), Vol. 14, No. 3, March 2016 (pp.
423-433).
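The abstract does not give the density, but one parameterization of the exponentiated transmuted Weibull found in the literature raises the transmuted Weibull CDF to a power. A hedged sketch of that form (the paper's exact parameterization may differ; parameter names are ours):

```python
import math

def etw_cdf(x, shape, scale, lam, a):
    """CDF of the exponentiated transmuted Weibull under one common
    parameterization:
        F(x) = [(1 + lam) * G(x) - lam * G(x)**2] ** a
    where G is the Weibull(shape, scale) CDF, |lam| <= 1, and a > 0."""
    if x <= 0:
        return 0.0
    g = 1.0 - math.exp(-((x / scale) ** shape))  # baseline Weibull CDF
    return ((1.0 + lam) * g - lam * g * g) ** a

# lam = 0 and a = 1 recover the plain Weibull CDF, so the family nests
# the Weibull and its transmuted and exponentiated variants.
```

Score functions for the separation step would then follow from the corresponding density, with `lam` and `a` among the parameters fitted by the maximum-likelihood/genetic-algorithm procedure the abstract describes.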