Inflated Beta Distributions
This paper considers the issue of modeling fractional data observed in the
interval [0,1), (0,1] or [0,1]. Mixed continuous-discrete distributions are
proposed. The beta distribution is used to describe the continuous component of
the model since its density can have quite different shapes depending on the
values of the two parameters that index the distribution. Properties of the
proposed distributions are examined. Also, maximum likelihood and method of
moments estimation are discussed. Finally, practical applications that employ
real data are presented.
Comment: 15 pages, 4 figures. Submitted to Statistical Paper
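As a rough illustration of the mixed continuous-discrete idea in the abstract above, the sketch below evaluates a zero-inflated beta distribution on [0,1). The shape parameterization (a, b), the mixing weight alpha, and all function names are assumptions made for this example; the paper may use a different parameterization.

```python
import math


def beta_pdf(y, a, b):
    """Density of the Beta(a, b) distribution at a point y in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(y) + (b - 1) * math.log(1 - y))


def zero_inflated_beta(y, alpha, a, b):
    """Mixed 'density' on [0, 1): probability mass alpha at 0,
    and (1 - alpha) times the beta density on (0, 1)."""
    if y == 0:
        return alpha  # discrete component: P(Y = 0)
    if 0 < y < 1:
        return (1 - alpha) * beta_pdf(y, a, b)  # continuous component
    return 0.0


def zero_inflated_beta_mean(alpha, a, b):
    """Mean of the mixture: the point mass at 0 contributes nothing."""
    return (1 - alpha) * a / (a + b)
```

A one-inflated version on (0,1] follows by moving the point mass to y = 1, in which case the mean becomes alpha + (1 - alpha) * a / (a + b).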
A Note on the Shannon Entropy of Short Sequences
For source sequences of length L symbols, we propose a more realistic value
than the usual benchmark for the number of code letters per source letter. Our
idea is based on a quantifier of information fluctuation of a source, F(U),
which corresponds to the second central moment of the random variable that
measures the information content of a source symbol. An alternative
interpretation of typical sequences is additionally provided through this
approach.
Comment: 3 figure
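The quantity F(U) described above, the second central moment of the per-symbol information content, can be sketched as follows. The base-2 logarithm and the function name are assumptions for this example; the abstract does not say how the paper turns F(U) into its proposed benchmark, so that step is not shown.

```python
import math


def entropy_and_fluctuation(probs):
    """Return (H, F) for a discrete source with symbol probabilities `probs`.

    H is the Shannon entropy, i.e. the mean of I(u) = -log2 p(u).
    F is the variance (second central moment) of I(u).
    """
    support = [p for p in probs if p > 0]  # zero-probability symbols never occur
    info = [-math.log2(p) for p in support]
    h = sum(p * i for p, i in zip(support, info))
    f = sum(p * (i - h) ** 2 for p, i in zip(support, info))
    return h, f
```

For a uniform source every symbol carries the same information, so F is zero. Any non-uniform source gives F > 0.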
Classification and Verification of Online Handwritten Signatures with Time Causal Information Theory Quantifiers
We present a new approach for online handwritten signature classification and
verification based on descriptors stemming from Information Theory. The
proposal uses the Shannon Entropy, the Statistical Complexity, and the Fisher
Information evaluated over the Bandt and Pompe symbolization of the horizontal
and vertical coordinates of signatures. These six features are easy and fast to
compute, and they are the input to a One-Class Support Vector Machine
classifier. The results produced surpass state-of-the-art techniques that
employ higher-dimensional feature spaces which often require specialized
software and hardware. We assess the consistency of our proposal with respect
to the size of the training sample, and we also use it to classify the
signatures into meaningful groups.
Comment: Submitted to PLOS On
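A minimal sketch of the Bandt and Pompe symbolization mentioned above, together with the normalized Shannon (permutation) entropy computed from it. The embedding dimension, the unit time delay, the tie-breaking rule, and the function names are assumptions for this example. The Statistical Complexity and Fisher Information features are not shown.

```python
import math
from collections import Counter


def ordinal_patterns(series, dim=3):
    """Map each window of `dim` consecutive values to the permutation
    that sorts it (ties are broken by order of appearance)."""
    return [
        tuple(sorted(range(dim), key=lambda k: series[i + k]))
        for i in range(len(series) - dim + 1)
    ]


def permutation_entropy(series, dim=3):
    """Shannon entropy of the ordinal-pattern histogram,
    normalized by log(dim!) so the result lies in [0, 1]."""
    counts = Counter(ordinal_patterns(series, dim))
    total = sum(counts.values())
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(math.factorial(dim))
```

Applied separately to the horizontal and vertical coordinate series of a signature, this would give two of the six features the abstract describes.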
Galton-Watson Process
We present a synthesis of the main features involved in an analysis of the Galton-Watson process: the extinction time of the process, the asymptotic results for the critical, subcritical, and supercritical cases, maximum likelihood estimation of the reproduction mean, and the construction of some simulated random variables to verify their asymptotically normal behavior.
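To make the simulation and estimation steps above concrete, the sketch below simulates generation sizes of a Galton-Watson process and estimates the reproduction mean with the ratio of total children to total parents, an estimator commonly attributed to Harris. The abstract does not say which estimator the article uses, so this choice, the function names, and the seed handling are assumptions for this example.

```python
import random


def simulate_generations(offspring_probs, n_generations, z0=1, seed=0):
    """Simulate generation sizes Z_0, ..., Z_n.

    offspring_probs[k] is the probability that an individual has k children.
    """
    rng = random.Random(seed)
    support = range(len(offspring_probs))
    sizes = [z0]
    for _ in range(n_generations):
        current = sizes[-1]
        if current == 0:
            sizes.append(0)  # extinction is absorbing
            continue
        children = rng.choices(support, weights=offspring_probs, k=current)
        sizes.append(sum(children))
    return sizes


def estimate_offspring_mean(sizes):
    """Ratio estimator (Z_1 + ... + Z_n) / (Z_0 + ... + Z_{n-1})."""
    parents = sum(sizes[:-1])
    if parents == 0:
        raise ValueError("no parents observed")
    return sum(sizes[1:]) / parents
```

Repeating the simulation over many seeds and inspecting the standardized estimates is one way to check the asymptotic normality the abstract mentions.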
Choosing the right strategy to model longitudinal count data in Epidemiology: An application with CD4 cell counts
Background: Statistical models for analysis of correlated count data are important for answering epidemiological questions that involve taking individual count measurements repeatedly over time through the use of longitudinal studies. Conventional regression models for this type of data are inadequate, leading to improper conclusions and inference. An important application of longitudinal studies in Public Health is the evaluation and monitoring of patients with infectious diseases, such as HIV/AIDS, to determine their health status, to verify the treatment effects, and to make prognoses concerning the evolution of the disease, including interdependencies of clinical manifestations. The purpose of this article is to characterize different statistical strategies for analysis of longitudinal count data, emphasizing how to choose the most suitable model for the data and how to interpret the results.
Methods: We illustrate their applicability by evaluating the effect of associated factors on lymphocyte CD4+T cell count in HIV seropositive patients in Salvador/Bahia - Brazil. We describe Poisson and Negative Binomial models using a multilevel (ML) approach and generalized estimating equations (GEE) for analysis of longitudinal count data.
Results: It is worth noting that the interpretation of the results from ML and GEE differs and they should not be compared directly.
Conclusion: We believe that the statistical methodology for analysis of longitudinal studies with correlated count data can be useful to address several important questions in public health, particularly by helping to monitor patients and check the effectiveness of treatments.