21 research outputs found

### Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families

We study online learning under logarithmic loss with regular parametric
models. Hedayati and Bartlett (2012b) showed that a Bayesian prediction
strategy with Jeffreys prior and sequential normalized maximum likelihood
(SNML) coincide and are optimal if and only if the latter is exchangeable, and
if and only if the optimal strategy can be calculated without knowing the time
horizon in advance. They posed the question of which families have
exchangeable SNML strategies. This paper fully answers this open problem for
one-dimensional exponential families. The exchangeability can happen only for
three classes of natural exponential family distributions, namely the Gaussian,
Gamma, and the Tweedie exponential family of order 3/2. Keywords: SNML
Exchangeability, Exponential Family, Online Learning, Logarithmic Loss,
Bayesian Strategy, Jeffreys Prior, Fisher Information. Comment: 23 pages

### Global, regional, and national burden of tuberculosis, 1990–2016: results from the Global Burden of Diseases, Injuries, and Risk Factors 2016 Study

Background
Although a preventable and treatable disease, tuberculosis causes more than a million deaths each year. As countries work towards achieving the Sustainable Development Goal (SDG) target to end the tuberculosis epidemic by 2030, robust assessments of the levels and trends of the burden of tuberculosis are crucial to inform policy and programme decision making. We assessed the levels and trends in the fatal and non-fatal burden of tuberculosis by drug resistance and HIV status for 195 countries and territories from 1990 to 2016.
Methods
We analysed 15 943 site-years of vital registration data, 1710 site-years of verbal autopsy data, 764 site-years of sample-based vital registration data, and 361 site-years of mortality surveillance data to estimate mortality due to tuberculosis using the Cause of Death Ensemble model. We analysed all available data sources, including annual case notifications, prevalence surveys, population-based tuberculin surveys, and estimated tuberculosis cause-specific mortality to generate internally consistent estimates of incidence, prevalence, and mortality using DisMod-MR 2.1, a Bayesian meta-regression tool. We assessed how the burden of tuberculosis differed from the burden predicted by the Socio-demographic Index (SDI), a composite indicator of income per capita, average years of schooling, and total fertility rate.
Findings
Globally in 2016, among HIV-negative individuals, the number of incident cases of tuberculosis was 9·02 million (95% uncertainty interval [UI] 8·05–10·16) and the number of tuberculosis deaths was 1·21 million (1·16–1·27). Among HIV-positive individuals, the number of incident cases was 1·40 million (1·01–1·89) and the number of tuberculosis deaths was 0·24 million (0·16–0·31). Globally, among HIV-negative individuals the age-standardised incidence of tuberculosis decreased annually at a slower rate (−1·3% [−1·5 to −1·2]) than mortality did (−4·5% [−5·0 to −4·1]) from 2006 to 2016. Among HIV-positive individuals during the same period, the rate of change in annualised age-standardised incidence was −4·0% (−4·5 to −3·7) and mortality was −8·9% (−9·5 to −8·4). Several regions had higher rates of age-standardised incidence and mortality than expected on the basis of their SDI levels in 2016. For drug-susceptible tuberculosis, the highest observed-to-expected ratios were in southern sub-Saharan Africa (13·7 for incidence and 14·9 for mortality), and the lowest ratios were in high-income North America (0·4 for incidence) and Oceania (0·3 for mortality). For multidrug-resistant tuberculosis, eastern Europe had the highest observed-to-expected ratios (67·3 for incidence and 73·0 for mortality), and high-income North America had the lowest ratios (0·4 for incidence and 0·5 for mortality).
Interpretation
If current trends in tuberculosis incidence continue, few countries are likely to meet the SDG target to end the tuberculosis epidemic by 2030. Progress needs to be accelerated by improving the quality of and access to tuberculosis diagnosis and care, by developing new tools, scaling up interventions to prevent risk factors for tuberculosis, and integrating control programmes for tuberculosis and HIV.

### Global, regional, and national disability-adjusted life-years (DALYs) for 333 diseases and injuries and healthy life expectancy (HALE) for 195 countries and territories, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016

BACKGROUND: Measurement of changes in health across locations is useful to compare and contrast changing epidemiological patterns against health system performance and identify specific needs for resource allocation in research, policy development, and programme decision making. Using the Global Burden of Diseases, Injuries, and Risk Factors Study 2016, we drew from two widely used summary measures to monitor such changes in population health: disability-adjusted life-years (DALYs) and healthy life expectancy (HALE). We used these measures to track trends and benchmark progress compared with expected trends on the basis of the Socio-demographic Index (SDI).
METHODS: We used results from the Global Burden of Diseases, Injuries, and Risk Factors Study 2016 for all-cause mortality, cause-specific mortality, and non-fatal disease burden to derive HALE and DALYs by sex for 195 countries and territories from 1990 to 2016. We calculated DALYs by summing years of life lost and years of life lived with disability for each location, age group, sex, and year. We estimated HALE using age-specific death rates and years of life lived with disability per capita. We explored how DALYs and HALE differed from expected trends when compared with the SDI: the geometric mean of income per person, educational attainment in the population older than age 15 years, and total fertility rate.
FINDINGS: The highest globally observed HALE at birth for both women and men was in Singapore, at 75·2 years (95% uncertainty interval 71·9-78·6) for females and 72·0 years (68·8-75·1) for males. The lowest for females was in the Central African Republic (45·6 years [42·0-49·5]) and for males was in Lesotho (41·5 years [39·0-44·0]). From 1990 to 2016, global HALE increased by an average of 6·24 years (5·97-6·48) for both sexes combined. Global HALE increased by 6·04 years (5·74-6·27) for males and 6·49 years (6·08-6·77) for females, whereas HALE at age 65 years increased by 1·78 years (1·61-1·93) for males and 1·96 years (1·69-2·13) for females. Total global DALYs remained largely unchanged from 1990 to 2016 (-2·3% [-5·9 to 0·9]), with decreases in communicable, maternal, neonatal, and nutritional (CMNN) disease DALYs offset by increased DALYs due to non-communicable diseases (NCDs). The exemplars, calculated as the five lowest ratios of observed to expected age-standardised DALY rates in 2016, were Nicaragua, Costa Rica, the Maldives, Peru, and Israel. The leading three causes of DALYs globally were ischaemic heart disease, cerebrovascular disease, and lower respiratory infections, comprising 16·1% of all DALYs. Total DALYs and age-standardised DALY rates due to most CMNN causes decreased from 1990 to 2016. Conversely, the total DALY burden rose for most NCDs; however, age-standardised DALY rates due to NCDs declined globally.
INTERPRETATION: At a global level, DALYs and HALE continue to show improvements. At the same time, we observe that many populations are facing growing functional health loss. Rising SDI was associated with increases in cumulative years of life lived with disability and decreases in CMNN DALYs offset by increased NCD DALYs. Relative compression of morbidity highlights the importance of continued health interventions, which has changed in most locations in pace with the gross domestic product per person, education, and family planning. The analysis of DALYs and HALE and their relationship to SDI represents a robust framework with which to benchmark location-specific health performance. Country-specific drivers of disease burden, particularly for causes with higher-than-expected DALYs, should inform health policies, health system improvement initiatives, targeted prevention efforts, and development assistance for health, including financial and research investments for all countries, regardless of their level of sociodemographic development. The presence of countries that substantially outperform others suggests the need for increased scrutiny for proven examples of best practices, which can help to extend gains, whereas the presence of underperforming countries suggests the need for devotion of extra attention to health systems that need more robust support.
FUNDING: Bill & Melinda Gates Foundation
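
The two calculations named in the METHODS paragraph (DALYs as the sum of years of life lost and years lived with disability, and SDI as a geometric mean of three components) can be illustrated with a minimal Python sketch. The function names and the input values are hypothetical, and the sketch assumes each SDI component has already been rescaled to an index between 0 and 1; the abstract does not describe that rescaling.

```python
from math import prod


def dalys(years_of_life_lost: float, years_lived_with_disability: float) -> float:
    """DALYs for one location, age group, sex, and year: YLL + YLD."""
    return years_of_life_lost + years_lived_with_disability


def sdi(income_index: float, education_index: float, fertility_index: float) -> float:
    """Geometric mean of three component indices.

    Assumption (not from the abstract): every component is already
    rescaled to the interval [0, 1].
    """
    components = (income_index, education_index, fertility_index)
    if any(not 0.0 <= c <= 1.0 for c in components):
        raise ValueError("each component index must lie in [0, 1]")
    return prod(components) ** (1.0 / 3.0)


# Hypothetical inputs, for illustration only.
print(dalys(120.0, 80.0))
print(sdi(0.5, 0.5, 0.5))
```

A geometric mean penalises imbalance between components more than an arithmetic mean does, so a location with one very low index cannot fully compensate with the other two.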

### Minimax Optimality in Online Learning under Logarithmic Loss with Parametric Constant Experts

We study online prediction of individual sequences under logarithmic loss with parametric experts. The goal is to predict a sequence of outcomes, revealed one at a time, almost as well as a set of experts. At round t, the forecaster's prediction takes the form of a conditional probability density. The loss that the forecaster suffers at that round is the negative log of the conditional probability of the outcome revealed after the forecaster's prediction. The performance of the prediction strategy is measured relative to the best in a reference set of experts, a parametric class of i.i.d. distributions. The difference between the accumulated loss of the prediction strategy and the best expert in the reference set is called the regret. We focus on the minimax regret, which is the regret of the strategy with the minimum of the worst-case regret over outcome sequences. The minimax regret is achieved by the normalized maximum likelihood (NML) strategy. This strategy knows the length of the sequence in advance, and the probability it assigns to each sequence is proportional to the maximum likelihood of the sequence. Conditionals are computed at each round by marginalization, which is very costly for NML. Due to this drawback, much focus has been given to alternative strategies such as sequential normalized maximum likelihood (SNML) and Bayesian strategies. The conditional probability that SNML assigns to the next outcome is proportional to the maximum likelihood of the data seen so far and the next outcome. We investigate conditions that lead to optimality of SNML and Bayesian strategies. A major part of this thesis is dedicated to showing that optimality of SNML and optimality of a certain Bayesian strategy, namely the Bayesian strategy under Jeffreys prior, are equivalent to each other, i.e. if SNML is optimal, then so is the Bayesian strategy under Jeffreys prior, and if the Bayesian strategy under Jeffreys prior is optimal, then so is SNML.
Note that Jeffreys prior in parametric families is proportional to the square root of the determinant of the Fisher information. Furthermore, we show that optimality of SNML happens if and only if the joint distribution on sequences defined by SNML is exchangeable, i.e. the probability that SNML assigns to any sequence is invariant under any permutation of the sequence. These results are proven for exponential families and any parametric family for which the maximum likelihood estimator is asymptotically normal. The most important implication of these results is that when SNML-exchangeability holds, NML becomes horizon-independent, and it could be calculated either through a Bayesian update with Jeffreys prior or through a one-step-ahead maximum likelihood calculation as in SNML. Another major part of this thesis is focused on showing that SNML-exchangeability holds for a large class of one-dimensional exponential family distributions, namely for the Gaussian, the gamma, and the Tweedie exponential family of order 3/2, and any one-to-one transformation of them, and that it cannot hold for other one-dimensional exponential family distributions. Finally, we investigate horizon-dependent priors when Jeffreys prior is not optimal. Only Jeffreys prior can make a Bayesian strategy optimal. This means that if Jeffreys prior is not optimal then nor is any other prior, except for possibly a horizon-dependent prior. This is because if there does not exist a prior that can make the Bayesian strategy optimal for all horizons, then the only possibilities are priors that depend on the horizon of the game. We investigate the behavior of a natural horizon-dependent prior called the NML prior. We show that the NML prior converges in distribution to Jeffreys prior, which makes it asymptotically optimal, but not necessarily optimal for an arbitrary horizon.
Furthermore, we show that there are exactly three families, namely Gaussian, gamma, and inverse Gaussian, where the NML prior is equal to Jeffreys prior and hence horizon-independent. Two of these families, namely gamma and Gaussian, have an optimal NML prior. We also investigate the problem of finding an optimal horizon-dependent prior for online binary prediction with Bernoulli experts. We could not solve this problem, but we describe insights gained from our investigation and possible directions that researchers can take in tackling this open problem.
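
The quantities described in words in this abstract can be written compactly as follows. The notation is an assumption chosen for illustration and is not taken from the thesis: $x^n = (x_1, \dots, x_n)$ is the outcome sequence, $p_\theta$ is an i.i.d. expert with parameter $\theta$, $q$ is the forecaster's strategy, and $I(\theta)$ is the Fisher information.

```latex
\begin{align}
  % Regret of strategy q on sequence x^n against the best expert in hindsight
  R_n(q, x^n) &= \sum_{t=1}^{n} -\log q(x_t \mid x^{t-1})
                 \;-\; \inf_{\theta} \sum_{t=1}^{n} -\log p_\theta(x_t)
               = \log \frac{\sup_\theta p_\theta(x^n)}{q(x^n)}, \\
  % NML: joint probability proportional to the maximized likelihood (needs the horizon n)
  p^{(n)}_{\mathrm{nml}}(x^n) &= \frac{\sup_\theta p_\theta(x^n)}
                                      {\int \sup_\theta p_\theta(y^n)\, dy^n}, \\
  % SNML: next-outcome conditional proportional to the maximized likelihood of past plus next outcome
  p_{\mathrm{snml}}(x_t \mid x^{t-1}) &= \frac{\sup_\theta p_\theta(x^{t-1}, x_t)}
                                              {\int \sup_\theta p_\theta(x^{t-1}, y)\, dy}, \\
  % Jeffreys prior
  \pi_{\mathrm{J}}(\theta) &\propto \sqrt{\det I(\theta)}.
\end{align}
```

For discrete outcomes the integrals become sums. The normaliser of NML ranges over all length-$n$ sequences, which is why computing its conditionals is costly, whereas the SNML normaliser ranges over a single next outcome.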

### Exchangeability characterizes optimality of sequential normalized maximum likelihood and Bayesian prediction

We study online learning under logarithmic loss with regular parametric models. In this setting, each strategy corresponds to a joint distribution on sequences. The minimax optimal strategy is the normalized maximum likelihood (NML) strategy. We show that the sequential NML (SNML) strategy predicts minimax optimally (i.e., as NML) if and only if the joint distribution on sequences defined by SNML is exchangeable. This property also characterizes the optimality of a Bayesian prediction strategy. In that case, the optimal prior distribution is Jeffreys prior for a broad class of parametric models for which the maximum likelihood estimator is asymptotically normal. The optimal prediction strategy, NML, depends on the number n of rounds of the game, in general. However, when a Bayesian strategy is optimal, NML becomes independent of n. Our proof uses this to exploit the asymptotics of NML. The asymptotic normality of the maximum likelihood estimator is responsible for the necessity of Jeffreys prior.
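
The exchangeability condition can be checked numerically in a small case. The following Python sketch builds the SNML joint distribution for Bernoulli experts and tests whether it is invariant under permutations of the sequence. The Bernoulli model is chosen here only as a convenient illustration and is not taken from the abstract, and all function names are hypothetical.

```python
from itertools import product


def max_likelihood(ones: int, n: int) -> float:
    """Maximized Bernoulli likelihood of a sequence with `ones` ones among n outcomes."""
    if n == 0:
        return 1.0
    theta = ones / n  # maximum likelihood estimate
    # Python evaluates 0.0 ** 0 as 1.0, which handles all-zeros and all-ones sequences.
    return theta ** ones * (1.0 - theta) ** (n - ones)


def snml_conditional(x: int, ones_so_far: int, t_prev: int) -> float:
    """SNML probability of next outcome x after t_prev outcomes containing ones_so_far ones."""
    weights = [max_likelihood(ones_so_far + y, t_prev + 1) for y in (0, 1)]
    return weights[x] / sum(weights)


def snml_joint(seq) -> float:
    """Joint probability of a binary sequence as the product of SNML conditionals."""
    prob, ones = 1.0, 0
    for t_prev, x in enumerate(seq):
        prob *= snml_conditional(x, ones, t_prev)
        ones += x
    return prob


def is_exchangeable(n: int, tol: float = 1e-12) -> bool:
    """True if all length-n sequences with the same number of ones get equal probability."""
    by_count = {}
    for seq in product((0, 1), repeat=n):
        by_count.setdefault(sum(seq), []).append(snml_joint(seq))
    return all(max(v) - min(v) <= tol for v in by_count.values())


print(snml_joint((1, 0, 0)), snml_joint((0, 0, 1)))
print(is_exchangeable(2), is_exchangeable(3))
```

For length 3 the two orderings of one 1 and two 0s receive different probabilities (0.05 against 1.6/31, roughly 0.0516), so the SNML joint for Bernoulli experts is not exchangeable there. By the characterization stated in the abstract, SNML would then not coincide with NML in this model.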

### The Optimality of Jeffreys Prior for Online Density Estimation and the Asymptotic Normality of Maximum Likelihood Estimators

We study online learning under logarithmic loss with regular parametric models. We show that a Bayesian strategy predicts optimally only if it uses Jeffreys prior. This result was known for canonical exponential families; we extend it to parametric models for which the maximum likelihood estimator is asymptotically normal. The optimal prediction strategy, normalized maximum likelihood, depends on the number n of rounds of the game, in general. However, when a Bayesian strategy is optimal, normalized maximum likelihood becomes independent of n. Our proof uses this to exploit the asymptotics of normalized maximum likelihood. The asymptotic normality of the maximum likelihood estimator is responsible for the necessity of Jeffreys prior.