458 research outputs found

    Robust estimation and forecasting for beta-mixed hierarchical models of grouped binary data

    The paper focuses on robust estimation and forecasting techniques for grouped binary data with misclassified responses. The data are assumed to follow a beta-mixed hierarchical model (beta-binomial or beta-logistic), with misclassifications caused by stochastic additive distortions of the binary observations. For these models, the effect of ignoring the misclassifications is evaluated: expressions are given for the biases of the method-of-moments and maximum likelihood estimators, and for the increase in the mean square forecasting error of the Bayes predictor. To compensate for the misclassification effects, new consistent estimators and a new Bayes predictor that take the distortion model into account are constructed. The robustness of the developed techniques is demonstrated via computer simulations and a real-life case study. Peer reviewed.
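The bias from ignoring misclassification can be illustrated with a generic two-rate misclassification model. This is only a sketch under stated assumptions, not the distortion model or the estimators of the paper: the names `eps0` (probability of recording 1 when the true response is 0) and `eps1` (probability of recording 0 when the true response is 1) are hypothetical, and both rates are assumed known.

```python
def observed_success_prob(p, eps0, eps1):
    """Probability of *recording* a 1 when the true success probability is p.

    Assumed model (not taken from the paper):
      eps0 = P(record 1 | true response is 0)
      eps1 = P(record 0 | true response is 1)
    """
    return p * (1.0 - eps1) + (1.0 - p) * eps0


def corrected_success_prob(p_obs, eps0, eps1):
    """Invert the relation above to recover p from the observed frequency."""
    if eps0 + eps1 >= 1.0:
        raise ValueError("eps0 + eps1 must be below 1 for the correction to exist")
    return (p_obs - eps0) / (1.0 - eps0 - eps1)


if __name__ == "__main__":
    p_true = 0.3
    p_obs = observed_success_prob(p_true, eps0=0.05, eps1=0.10)
    # The naive estimate targets p_obs, not p_true, so it is biased.
    print("naive target:", p_obs)
    print("corrected:", corrected_success_prob(p_obs, eps0=0.05, eps1=0.10))
```

Under these assumptions the naive frequency converges to `p_obs` instead of `p_true`, and the correction removes that bias only when the two rates are known or can be estimated separately.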

    How Noisy Data Affects Geometric Semantic Genetic Programming

    Noise is a consequence of acquiring and pre-processing data from the environment, and arises from different sources, e.g., sensors, signal processing technology or even human error. As a machine learning technique, Genetic Programming (GP) is not immune to this problem, which the field has frequently addressed. Recently, Geometric Semantic Genetic Programming (GSGP), a semantic-aware branch of GP, has shown robustness and high generalization capability. Researchers believe these characteristics may be associated with a lower sensitivity to noisy data. However, there is no systematic study on this matter. This paper performs a deep analysis of GSGP performance in the presence of noise. Using 15 synthetic datasets where noise can be controlled, we added different ratios of noise to the data and compared the results obtained with those of a canonical GP. The results show that, as we increase the percentage of noisy instances, the degradation in generalization performance is more pronounced in GSGP than in GP. However, in general, GSGP is more robust to noise than GP in the presence of up to 10% of noise, and presents no statistical difference for values higher than that in the test bed. Comment: 8 pages, in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2017), Berlin, Germany

    Global changes in extreme daily temperature since 1950

    Copyright 2008 by the American Geophysical Union. Extreme value analysis of observed daily temperature anomalies from a new quasi-global data set indicates that extreme daily maximum and minimum temperatures (above the 98.5th or below the 1.5th percentile) have warmed for most regions since 1950. Changes in extreme anomalous daily temperatures are determined by fitting extreme value distributions with time-varying parameters. Changes in the distribution of anomaly exceedances above a high threshold are found to be statistically significant at the 10% level for most land areas when compared with a time-invariant distribution and with the unforced natural variability produced by a coupled climate model. The largest positive trends in the location parameter of the extreme distribution are found in Canada and Eurasia, where daily maximum temperatures have typically warmed by 1 to 3 degrees C since 1950. The total area exhibiting positive trends is significantly greater than can be attributed to unforced natural variability. For most regions, positive trend magnitudes are larger and cover a greater area for daily minimum temperatures than for maximum temperatures. The comparatively small areas of cooling are found to be consistent with unforced natural climate variability. The North Atlantic Oscillation (NAO) is found to have a significant influence on extreme winter daily temperatures for many areas, with a negative NAO of one standard deviation reducing expected extreme winter daily temperatures by approximately 2 degrees C over Eurasia but increasing temperatures over northeastern North America.

    Many kinases for controlling the water channel aquaporin‐2

    Aquaporin-2 (AQP2) is a member of the aquaporin water channel family. In the kidney, AQP2 is expressed in collecting duct principal cells, where it facilitates water reabsorption in response to antidiuretic hormone (arginine vasopressin, AVP). AVP induces the redistribution of AQP2 from intracellular vesicles and its incorporation into the plasma membrane. The plasma membrane insertion of AQP2 represents the crucial step in AVP-mediated water reabsorption. Dysregulation that prevents this plasma membrane insertion causes diabetes insipidus (DI), a disease characterised by an impaired urine concentrating ability and polydipsia. No satisfactory treatment for DI is available. This review discusses kinases that control the localisation of AQP2 and points out potential kinase-directed targets for the treatment of DI.

    Statistical Diagnostics of Metastatic Involvement of Regional Lymph Nodes

    A statistical classification method that flags patients who require more detailed diagnostics is proposed and analysed.

    ЦИВИЛИЗАЦИОННЫЕ СЦЕНАРИИ РАЗВИТИЯ РОССИИ (Civilizational Scenarios of Russia's Development)

    The civilizational factor plays an important role in the modern world. Russia therefore faces the task of forming a civilizational identity. The difficulty is that several scenarios of civilizational development exist. This article is devoted to examining them.

    Последовательное статистическое принятие решений в задачах анализа потоков данных (Sequential Statistical Decision Making in Problems of Data Flow Analysis)

    In data flow analysis, problems of statistical decision making about the parameters of the observed flows are important. To solve them, the paper proposes using sequential statistical decision rules. The rules are constructed for three models of observation flows: a sequence of independent homogeneous observations; a sequence of observations forming a time series with a trend; and a sequence of dependent observations forming a homogeneous Markov chain. For each case, the situation is also considered where the model describes the observed stochastic data with distortions. "Outliers" ("contamination") are used as the admissible distortions, since they adequately describe the situations that arise most often in practice. For such situations, families of sequential decision rules are proposed, within which robust decision rules are constructed that reduce the influence of distortions on the efficiency characteristics. Results of computer experiments are given to illustrate the advantages of the constructed decision rules.

    МЕТОДЫ АНАЛИЗА ЭФФЕКТИВНОСТИ ПОСЛЕДОВАТЕЛЬНЫХ СТАТИСТИЧЕСКИХ ТЕСТОВ (Methods for Analysing the Performance of Sequential Statistical Tests)

    The article considers sequential statistical tests of simple hypotheses about the parameters of probability distributions of independent observations, as well as of observations forming a Markov chain. Methods for analysing the performance characteristics of sequential statistical tests (type I and type II error probabilities, conditional expected sample sizes) are constructed both on the basis of approximations of the test statistics and on the basis of absorbing Markov chain theory. The proposed methods allow the performance characteristics of sequential statistical tests to be assessed not only for the hypothetical model of the data, but also under deviations from this model, which can be used for robustness analysis of sequential tests.
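To make the notion of a sequential test of simple hypotheses concrete, here is a minimal sketch of a textbook Wald sequential probability ratio test for independent 0/1 observations. It is an illustration only, not the tests or the analysis methods of the article: the function name `sprt_bernoulli`, the Bernoulli model, and the use of Wald's approximate thresholds are all assumptions made for the example.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT of H0: p = p0 against H1: p = p1 for 0/1 observations.

    alpha and beta are the target type I and type II error probabilities.
    Returns (decision, n_used); decision is "accept H0", "accept H1", or
    "continue" if the data run out before a boundary is crossed.
    """
    # Wald's approximate boundaries for the log-likelihood ratio.
    upper = math.log((1.0 - beta) / alpha)   # reaching it accepts H1
    lower = math.log(beta / (1.0 - alpha))   # reaching it accepts H0

    llr = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        # Add the log-likelihood ratio contribution of one observation.
        if x == 1:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1.0 - p1) / (1.0 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "continue", n


if __name__ == "__main__":
    print(sprt_bernoulli([1, 1, 1, 1, 1, 1], p0=0.3, p1=0.7))
    print(sprt_bernoulli([0, 0, 0, 0, 0, 0], p0=0.3, p1=0.7))
```

The number of observations used before stopping is random, which is why the expected sample size appears alongside the two error probabilities as a performance characteristic. Because the boundaries are approximate, the achieved error probabilities can differ from `alpha` and `beta`, and they can change further when the data deviate from the assumed model.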