1,128 research outputs found

    F-measure Maximization in Multi-Label Classification with Conditionally Independent Label Subsets

    We discuss a method to improve the exact F-measure maximization algorithm called GFM, proposed in (Dembczynski et al. 2011) for multi-label classification, assuming the label set can be partitioned into conditionally independent subsets given the input features. If the labels were all independent, the estimation of only m parameters (m denoting the number of labels) would suffice to derive Bayes-optimal predictions in O(m^2) operations. In the general case, m^2+1 parameters are required by GFM to solve the problem in O(m^3) operations. In this work, we show that the number of parameters can be reduced further to m^2/n, in the best case, assuming the label set can be partitioned into n conditionally independent subsets. As this label partition needs to be estimated from the data beforehand, we first use the procedure proposed in (Gasse et al. 2015) to find such a partition and then infer the required parameters locally in each label subset. The latter are aggregated and serve as input to GFM to form the Bayes-optimal prediction. We show on a synthetic experiment that the reduction in the number of parameters brings about significant benefits in terms of performance.
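
To illustrate the fully independent case mentioned in this abstract, the following is a minimal sketch that derives an F1-optimal prediction from m marginal probabilities alone. It is not GFM and not the O(m^2) procedure: it enumerates all label vectors, so it is only usable for very small m. The function names and the convention F1 = 1 when both vectors are empty are assumptions made for this illustration.

```python
from itertools import product


def f1(y, h):
    """F1 between two binary label vectors (assumed to be 1 when both are all-zero)."""
    denom = sum(y) + sum(h)
    if denom == 0:
        return 1.0
    return 2.0 * sum(a * b for a, b in zip(y, h)) / denom


def joint_prob(y, marginals):
    """Probability of label vector y when labels are independent given the input."""
    prob = 1.0
    for label, p in zip(y, marginals):
        prob *= p if label else 1.0 - p
    return prob


def bayes_optimal_f1(marginals):
    """Brute-force search for the prediction h maximizing the expected F1.

    Exponential in the number of labels; for illustration only.
    """
    vectors = list(product((0, 1), repeat=len(marginals)))
    best_h, best_value = None, -1.0
    for h in vectors:
        value = sum(joint_prob(y, marginals) * f1(y, h) for y in vectors)
        if value > best_value:
            best_h, best_value = h, value
    return best_h, best_value
```

For example, `bayes_optimal_f1([0.9, 0.05])` selects only the first label, since adding the unlikely second label lowers the expected F1.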

    Multiple morbidities in companion dogs: a novel model for investigating age-related disease

    The proportion of men and women surviving over 65 years has been steadily increasing over the last century. In their later years, many of these individuals are afflicted with multiple chronic conditions, placing increasing pressure on healthcare systems. The accumulation of multiple health problems with advanced age is well documented, yet the causes are poorly understood. Animal models have long been employed in attempts to elucidate these complex mechanisms with limited success. Recently, the domestic dog has been proposed as a promising model of human aging for several reasons. Mean lifespan shows twofold variation across dog breeds. In addition, dogs closely share the environments of their owners, and substantial veterinary resources are dedicated to comprehensive diagnosis of conditions in dogs. However, while dogs are therefore useful for studying multimorbidity, little is known about how aging influences the accumulation of multiple concurrent disease conditions across dog breeds. The current study examines how age, body weight, and breed contribute to variation in multimorbidity in over 2,000 companion dogs visiting private veterinary clinics in England. In common with humans, we find that the number of diagnoses increases significantly with age in dogs. However, we find no significant weight or breed effects on morbidity number. This surprising result reveals that while breeds may vary in their average longevity and causes of death, their age-related trajectories of morbidities differ little, suggesting that age of onset of disease may be the source of variation in lifespan across breeds. Future studies with increased sample sizes and longitudinal monitoring may help us discern more breed-specific patterns in morbidity. Overall, the large increase in multimorbidity seen with age in dogs mirrors that seen in humans and lends even more credence to the value of companion dogs as models for human morbidity and mortality

    Calculating partial expected value of perfect information via Monte Carlo sampling algorithms

    Partial expected value of perfect information (EVPI) calculations can quantify the value of learning about particular subsets of uncertain parameters in decision models. Published case studies have used different computational approaches. This article examines the computation of partial EVPI estimates via Monte Carlo sampling algorithms. The mathematical definition shows 2 nested expectations, which must be evaluated separately because of the need to compute a maximum between them. A generalized Monte Carlo sampling algorithm uses nested simulation with an outer loop to sample parameters of interest and, conditional upon these, an inner loop to sample remaining uncertain parameters. Alternative computation methods and shortcut algorithms are discussed and mathematical conditions for their use considered. Maxima of Monte Carlo estimates of expectations are biased upward, and the authors show that the use of small samples results in biased EVPI estimates. Three case studies illustrate 1) the bias due to maximization and also the inaccuracy of shortcut algorithms 2) when correlated variables are present and 3) when there is nonlinearity in net benefit functions. If relatively small correlation or nonlinearity is present, then the shortcut algorithm can be substantially inaccurate. Empirical investigation of the numbers of Monte Carlo samples suggests that fewer samples on the outer level and more on the inner level could be efficient and that relatively small numbers of samples can sometimes be used. Several remaining areas for methodological development are set out. A wider application of partial EVPI is recommended both for greater understanding of decision uncertainty and for analyzing research priorities
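
The nested simulation described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `partial_evpi`, `toy_net_benefits`, and the sampler arguments are hypothetical names, and the toy decision model (two options, standard normal parameters) is invented for the example.

```python
import random
from statistics import fmean


def partial_evpi(net_benefits, sample_phi, sample_psi, n_outer, n_inner, rng):
    """Two-level Monte Carlo estimate of partial EVPI for the parameters phi.

    net_benefits(phi, psi) returns one net benefit per decision option.
    sample_phi(rng) draws the parameters of interest (outer loop).
    sample_psi(rng, phi) draws the remaining parameters given phi (inner loop).
    """
    outer_maxima = []   # max over decisions of the conditional mean, per outer draw
    totals = None       # running sums of conditional means, per decision
    for _ in range(n_outer):
        phi = sample_phi(rng)
        sums = None
        for _ in range(n_inner):
            nb = net_benefits(phi, sample_psi(rng, phi))
            sums = list(nb) if sums is None else [s + x for s, x in zip(sums, nb)]
        cond_means = [s / n_inner for s in sums]
        outer_maxima.append(max(cond_means))
        totals = cond_means if totals is None else [t + c for t, c in zip(totals, cond_means)]
    # Value of deciding now: best decision under the overall expected net benefit.
    baseline = max(t / n_outer for t in totals)
    return fmean(outer_maxima) - baseline


def toy_net_benefits(phi, psi):
    """Hypothetical model: option 0 pays nothing, option 1 pays phi + psi."""
    return [0.0, phi + psi]
```

With `phi` and `psi` both drawn from a standard normal, the exact partial EVPI for `phi` in this toy model is 1/sqrt(2*pi), about 0.399. The estimate carries the upward bias from taking a maximum over noisy inner-loop means that the abstract mentions, so small inner sample sizes inflate it.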

    New distance measures for classifying X-ray astronomy data into stellar classes

    The classification of X-ray sources into classes (such as extragalactic sources, background stars, ...) is an essential task in astronomy. Typically, one of the classes corresponds to extragalactic radiation, whose photon emission behaviour is well characterized by a homogeneous Poisson process. We propose to use normalized versions of the Wasserstein and Zolotarev distances to quantify the deviation of the distribution of photon interarrival times from the exponential class. Our main motivation is the analysis of a massive dataset from X-ray astronomy obtained by the Chandra Orion Ultradeep Project (COUP). This project yielded a large catalog of 1616 X-ray cosmic sources in the Orion Nebula region, with their series of photon arrival times and associated energies. We consider the plug-in estimators of these metrics, determine their asymptotic distributions, and illustrate their finite-sample performance with a Monte Carlo study. We estimate these metrics for each COUP source from three different classes. We conclude that our proposal provides a striking amount of information on the nature of the photon-emitting sources. Further, these variables can identify X-ray sources that were previously miscatalogued. As an appealing conclusion, we show that some sources, previously classified as extragalactic emissions, have a much higher probability of being young stars in the Orion Nebula.
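
As a rough illustration of the idea in this abstract, the sketch below computes a plug-in Wasserstein-1 distance between a sample of interarrival times and the unit-rate exponential distribution. The abstract does not give the paper's normalization, so dividing by the sample mean is an assumption here, as are the function name and the midpoint quantile rule used to approximate the integral.

```python
import math


def wasserstein_to_exponential(interarrival_times):
    """Approximate W1 distance between mean-scaled data and Exp(1).

    Assumption: scaling by the sample mean stands in for the paper's
    normalization, which is not specified in the abstract.
    """
    n = len(interarrival_times)
    mean = sum(interarrival_times) / n
    scaled = sorted(t / mean for t in interarrival_times)
    total = 0.0
    for i, x in enumerate(scaled):
        u = (i + 0.5) / n                      # midpoint of the i-th quantile bin
        exp_quantile = -math.log(1.0 - u)      # quantile function of Exp(1)
        total += abs(x - exp_quantile)
    return total / n
```

Values near zero suggest interarrival times compatible with a homogeneous Poisson process, while strictly periodic arrivals give a value close to 2/e.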

    The impact of predation by marine mammals on Patagonian toothfish longline fisheries

    Predatory interaction of marine mammals with longline fisheries is observed globally, leading to partial or complete loss of the catch and in some parts of the world to considerable financial loss. Depredation can also create additional unrecorded fishing mortality of a stock and has the potential to introduce bias to stock assessments. Here we aim to characterise depredation in the Patagonian toothfish (Dissostichus eleginoides) fishery around South Georgia focusing on the spatio-temporal component of these interactions. Antarctic fur seals (Arctocephalus gazella), sperm whales (Physeter macrocephalus), and orcas (Orcinus orca) frequently feed on fish hooked on longlines around South Georgia. A third of longlines encounter sperm whales, but loss of catch due to sperm whales is insignificant when compared to that due to orcas, which interact with only 5% of longlines but can take more than half of the catch in some cases. Orca depredation around South Georgia is spatially limited and focused in areas of putative migration routes, and the impact is compounded as a result of the fishery also concentrating in those areas at those times. Understanding the seasonal behaviour of orcas and the spatial and temporal distribution of “depredation hot spots” can reduce marine mammal interactions, will improve assessment and management of the stock and contribute to increased operational efficiency of the fishery. Such information is valuable in the effort to resolve the human-mammal conflict for resources

    Major depressive disorder and current psychological distress moderate the effect of polygenic risk for obesity on body mass index

    We are grateful to the families who took part in GS:SFHS, the GPs and Scottish School of Primary Care for their help in recruiting them, and the whole GS team, which includes academic researchers, clinic staff, laboratory technicians, clerical workers, IT staff, statisticians and research managers. This work is supported by the Wellcome Trust through a Strategic Award, reference 104036/Z/14/Z. The Chief Scientist Office of the Scottish Government and the Scottish Funding Council provided core support for Generation Scotland. GS:SFHS was funded by a grant from the Scottish Government Health Department, Chief Scientist Office, number CZD/16/6. We acknowledge with gratitude the financial support received for this work from the Dr Mortimer and Theresa Sackler Foundation. PT, DJP, IJD and AMM are members of The University of Edinburgh Centre for Cognitive Ageing and Cognitive Epidemiology, part of the cross council Lifelong Health and Wellbeing Initiative (MR/K026992/1). Funding from the Biotechnology and Biological Sciences Research Council and Medical Research Council is gratefully acknowledged. DJM is an NRS Career Fellow, funded by the CSO. Supplementary Information accompanies the paper on the Translational Psychiatry website. Peer reviewed.

    Light smoking at base-line predicts a higher mortality risk to women than to men; evidence from a cohort with long follow-up

    BACKGROUND: There is conflicting evidence as to whether smoking is more harmful to women than to men. The UK Cotton Workers’ Cohort was recruited in the 1960s and contained a high proportion of men and women smokers who were well matched in terms of age, job and length of time in job. The cohort has been followed up for 42 years. METHODS: Mortality in the cohort was analysed using an individual relative survival method and Cox regression. Whether smoking, ascertained at baseline in the 1960s, was more hazardous to women than to men was examined by estimating the relative risk ratio (women to men, smokers to never-smokers) for light (1–14), medium (15–24) and heavy (25+ cigarettes per day) smoking and for former smoking. RESULTS: For all-cause mortality, relative risk ratios were 1.35 for light smoking at baseline (95% CI 1.07-1.70), 1.15 for medium smoking (95% CI 0.89-1.49) and 1.00 for heavy smoking (95% CI 0.63-1.61). The relative risk ratio for light smoking at baseline was 1.42 (95% CI 1.01-1.98) for circulatory system disease and 1.89 (95% CI 0.99-3.63) for respiratory disease. Heights of participants provided no explanation for the gender difference. CONCLUSIONS: Light smoking at baseline was shown to be significantly more hazardous to women than to men, but the effect decreased as consumption increased, indicating a dose-response relationship. Heavy smoking was equally hazardous to both genders. This result may help explain the conflicting evidence seen elsewhere. However, gender differences in smoking cessation may provide an alternative explanation.

    Surfactant status and respiratory outcome in premature infants receiving late surfactant treatment.

    BACKGROUND: Many premature infants with respiratory failure are deficient in surfactant, but the relationship to occurrence of bronchopulmonary dysplasia (BPD) is uncertain. METHODS: Tracheal aspirates were collected from 209 treated and control infants enrolled at 7-14 days in the Trial of Late Surfactant. The content of phospholipid, surfactant protein B, and total protein were determined in large aggregate (active) surfactant. RESULTS: At 24 h, surfactant treatment transiently increased surfactant protein B content (70%, p < 0.01), but did not affect recovered airway surfactant or total protein/phospholipid. The level of recovered surfactant during dosing was directly associated with content of surfactant protein B (r = 0.50, p < 0.00001) and inversely related to total protein (r = 0.39, p < 0.0001). For all infants, occurrence of BPD was associated with lower levels of recovered large aggregate surfactant, higher protein content, and lower SP-B levels. Tracheal aspirates with lower amounts of recovered surfactant had an increased proportion of small vesicle (inactive) surfactant. CONCLUSIONS: We conclude that many intubated premature infants are deficient in active surfactant, in part due to increased intra-alveolar metabolism, low SP-B content, and protein inhibition, and that the severity of this deficit is predictive of BPD. Late surfactant treatment at the frequency used did not provide a sustained increase in airway surfactant.

    Environmental variables, habitat discontinuity and life history shaping the genetic structure of Pomatoschistus marmoratus

    Coastal lagoons are semi-isolated ecosystems exposed to wide fluctuations of environmental conditions and showing habitat fragmentation. These features may play an important role in separating species into different populations, even at small spatial scales. In this study, we evaluate the concordance between mitochondrial (previously published data) and nuclear data by analyzing the genetic variability of Pomatoschistus marmoratus in five localities, inside and outside the Mar Menor coastal lagoon (SE Spain), using eight microsatellites. High genetic diversity and similar levels of allele richness were observed across all loci and localities, although significant genic and genotypic differentiation was found between populations inside and outside the lagoon. In contrast to the FST values obtained from previous mitochondrial DNA analyses (control region), the microsatellite data exhibited significant differentiation among samples inside the Mar Menor and between lagoonal and marine samples. This pattern was corroborated using Cavalli-Sforza genetic distances. The habitat fragmentation inside the coastal lagoon and between lagoon and marine localities could be acting as a barrier to gene flow and contributing to the observed genetic structure. Our results from generalized additive models point to a significant link between extreme lagoonal environmental conditions (mainly maximum salinity) and P. marmoratus genetic composition. These environmental features could therefore also be acting on the genetic structure of coastal lagoon populations of P. marmoratus, favoring their genetic divergence. The mating strategy of P. marmoratus could also be influencing our results obtained from mitochondrial and nuclear DNA. Therefore, special consideration must be given to the selection of DNA markers depending on the reproductive strategy of the species.