91 research outputs found

    Error Rates for Unvalidated Medical Age Assessment Procedures

    During 2014-15 Sweden received asylum applications from more than 240,000 people, of whom more than 40,000 were termed unaccompanied minors. In a large number of cases, claims by asylum seekers of being below 18 years were not trusted by Swedish authorities. To handle the situation, the Swedish National Board of Forensic Medicine (Rättsmedicinalverket, RMV) was assigned by the government to create a centralized system for medical age assessments. RMV introduced a procedure including two biological age indicators: x-ray of the third molars and magnetic resonance imaging of the distal femoral epiphysis. In 2017 a total of 9617 males and 337 females were subjected to this procedure. However, no validation study for the procedure was published, and the observed numbers of cases with different maturity combinations in teeth and femur were unexpected given the claims originally made by RMV. Such unexpected results might be caused by systematic errors and need to be analysed thoroughly. In the present paper we present a general stochastic model enabling us to study which combinations of age indicator model parameters and age population profiles are consistent with the observed 2017 data for males. We find that, contrary to some RMV claims, maturity of the femur, as observed by RMV, appears on average well before maturity of teeth. Although results naturally contain much uncertainty, we find that classification error rates for certain groups who, based on the RMV procedure, are classified as above 18 years may be around 10-30%, possibly as high as 50%.

    Using Object Oriented Bayesian Networks to Model Linkage, Linkage Disequilibrium and Mutations between STR Markers

    In a number of applications there is a need to determine the most likely pedigree for a group of persons based on genetic markers. Adequate models are needed to reach this goal. The markers used to perform the statistical calculations can be linked and there may also be linkage disequilibrium (LD) in the population. The purpose of this paper is to present a graphical Bayesian Network framework to deal with such data. Potential LD is normally ignored and it is important to verify that the resulting calculations are not biased. Even if linkage does not influence results for regular paternity cases, it may have substantial impact on likelihood ratios involving other, more extended pedigrees. Models for LD influence likelihoods for all pedigrees to some degree and an initial estimate of the impact of ignoring LD and/or linkage is desirable, going beyond mere rules of thumb based on marker distance. Furthermore, we show how one can readily include a mutation model in the Bayesian Network; extending other programs or formulas to include such models may require considerable amounts of work and will in many cases not be practical. As an example, we consider the two STR markers vWa and D12S391. We estimate probabilities for population haplotypes to account for LD using a method based on data from trios, while an estimate for the degree of linkage is taken from the literature. The results show that accounting for haplotype frequencies is unnecessary in most cases for this specific pair of markers. When doing calculations on regular paternity cases, the markers can be considered statistically independent. In more complex cases of disputed relatedness, for instance cases involving siblings or so-called deficient cases, or when small differences in the LR matter, independence should not be assumed. (The networks are freely available at http://arken.umb.no/~dakl/BayesianNetworks.)

    Improved computations for relationship inference using low-coverage sequencing data

    Pedigree inference, for example determining whether two persons are second cousins or unrelated, can be done by comparing their genotypes at a selection of genetic markers. When the data for one or more of the persons is from low-coverage next generation sequencing (lcNGS), currently available computational methods either ignore genetic linkage or do not take advantage of the probabilistic nature of lcNGS data, relying instead on first estimating the genotype. We provide a method and software (see familias.name/lcNGS) bridging this gap. Simulations indicate that our results are considerably more accurate than some previously available alternatives. Our method, utilizing a version of the Lander-Green algorithm, uses a group of symmetries to speed up calculations. This group may be of further interest in other calculations involving linked loci.

    Slope and generalization properties of neural networks

    Neural networks are very successful tools in, for example, advanced classification. From a statistical point of view, fitting a neural network may be seen as a kind of regression, where we seek a function from the input space to a space of classification probabilities that follows the "general" shape of the data, but avoids overfitting by avoiding memorization of individual data points. In statistics, this can be done by controlling the geometric complexity of the regression function. We propose to do something similar when fitting neural networks by controlling the slope of the network. After defining the slope and discussing some of its theoretical properties, we go on to show empirically in examples, using ReLU networks, that the distribution of the slope of a well-trained neural network classifier is generally independent of the width of the layers in a fully connected network, and that the mean of the distribution only has a weak dependence on the model architecture in general. The slope is of similar size throughout the relevant volume, and varies smoothly. It also behaves as predicted in rescaling examples. We discuss possible applications of the slope concept, such as using it as a part of the loss function or stopping criterion during network training, or ranking data sets in terms of their complexity.

    Exact spectral norm regularization for neural networks

    We pursue a line of research that seeks to regularize the spectral norm of the Jacobian of the input-output mapping for deep neural networks. While previous work relies on upper bounding techniques, we provide a scheme that targets the exact spectral norm. We show that our algorithm achieves improved generalization performance compared to previous spectral regularization techniques while simultaneously maintaining a strong safeguard against natural and adversarial noise. Moreover, we further explore some previous reasoning concerning the strong adversarial protection that Jacobian regularization provides and show that it can be misleading.
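
    To make the quantity in this abstract concrete, the following is a minimal sketch of how the spectral norm of an input-output Jacobian can be estimated by generic power iteration. It is not the scheme proposed in the paper, whose details are not given here; the bias-free ReLU network and the function names relu_mlp_jacobian and spectral_norm are assumptions made purely for illustration.

```python
import numpy as np


def relu_mlp_jacobian(weights, x):
    """Jacobian at x of a bias-free ReLU MLP whose last layer is linear.

    Hypothetical toy network, used only to have a Jacobian to work with.
    """
    jac = np.eye(x.shape[0])
    h = x
    last = len(weights) - 1
    for i, w in enumerate(weights):
        z = w @ h
        jac = w @ jac                      # chain rule through the linear map
        if i < last:
            mask = (z > 0).astype(float)   # derivative of ReLU at z
            h = z * mask
            jac = mask[:, None] * jac      # chain rule through the ReLU
        else:
            h = z
    return jac


def spectral_norm(jac, iters=200, seed=0):
    """Estimate the largest singular value of jac by power iteration on J^T J."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(jac.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = jac.T @ (jac @ v)
        norm = np.linalg.norm(v)
        if norm == 0.0:                    # Jacobian maps v to zero
            return 0.0
        v /= norm
    return float(np.linalg.norm(jac @ v))
```

    For a unit vector v the value ||Jv|| never exceeds the true spectral norm, so the estimate approaches it from below; in a regularizer one would typically add a multiple of this estimate to the training loss.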

    Response to: DNA identification by pedigree likelihood ratio accommodating population substructure and mutations.

    Mutation models are important in many areas of genetics including forensics. This letter criticizes the model of the paper 'DNA identification by pedigree likelihood ratio accommodating population substructure and mutations' by Ge et al. (2010). Furthermore, we argue that the paper in some cases misrepresents previously published papers. Please see related letter: http://www.investigativegenetics.com/content/2/1/8. Rights: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief, you may copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

    Transcriptome analysis of a respiratory Saccharomyces cerevisiae strain suggests the expression of its phenotype is glucose insensitive and predominantly controlled by Hap4, Cat8 and Mig1

    BACKGROUND: We previously described the first respiratory Saccharomyces cerevisiae strain, KOY.TM6*P, by integrating the gene encoding a chimeric hexose transporter, Tm6*, into the genome of an hxt null yeast. Subsequently we transferred this respiratory phenotype in the presence of up to 50 g/L glucose to a yeast strain, V5 hxt1-7Delta, in which only HXT1-7 had been deleted. In this study, we compared the transcriptome of the resultant strain, V5.TM6*P, with that of its wild-type parent, V5, at different glucose concentrations. RESULTS: cDNA array analyses revealed that alterations in gene expression that occur when transitioning from a respiro-fermentative (V5) to a respiratory (V5.TM6*P) strain are very similar to those in cells undergoing a diauxic shift. We also undertook an analysis of transcription factor binding sites in our dataset by examining previously published biological data for Hap4 (in complex with Hap2, 3, 5), Cat8 and Mig1, and used this in combination with verified binding consensus sequences to identify genes likely to be regulated by one or more of these. Of the induced genes in our dataset, 77% had binding sites for the Hap complex, with 72% having at least two. In addition, 13% were found to have a binding site for Cat8 and 21% had a binding site for Mig1. Unexpectedly, both the up- and down-regulation of many of the genes in our dataset had a clear glucose dependence in the parent V5 strain that was not present in V5.TM6*P. This indicates that the relief of glucose repression is already operable at much higher glucose concentrations than is widely accepted and suggests that glucose sensing might occur inside the cell. CONCLUSION: Our dataset gives a remarkably complete view of the involvement of genes in the TCA cycle, glyoxylate cycle and respiratory chain in the expression of the phenotype of V5.TM6*P. Furthermore, 88% of the transcriptional response of the induced genes in our dataset can be related to the potential activities of just three proteins: Hap4, Cat8 and Mig1. Overall, our data support genetic remodelling in V5.TM6*P consistent with a respiratory metabolism which is insensitive to external glucose concentrations.

    Empirical Bayes models for multiple probe type microarrays at the probe level

    BACKGROUND: When analyzing microarray data a primary objective is often to find differentially expressed genes. With empirical Bayes and penalized t-tests the sample variances are adjusted towards a global estimate, producing more stable results compared to ordinary t-tests. However, for Affymetrix type data a clear dependency between variability and intensity-level generally exists, even for logged intensities, most clearly for data at the probe level but also for probe-set summaries such as the MAS5 expression index. As a consequence, adjustment towards a global estimate results in an intensity-level dependent false positive rate. RESULTS: We propose two new methods for finding differentially expressed genes, Probe level Locally moderated Weighted median-t (PLW) and Locally Moderated Weighted-t (LMW). Both methods use an empirical Bayes model taking the dependency between variability and intensity-level into account. A global covariance matrix is also used, allowing for differing variances between arrays as well as array-to-array correlations. PLW is specially designed for Affymetrix type arrays (or other multiple-probe arrays). Instead of making inference on probe-set summaries, comparisons are made separately for each perfect-match probe and are then summarized into one score for the probe-set. CONCLUSION: The proposed methods are compared to 14 existing methods using five spike-in data sets. For RMA and GCRMA processed data, PLW has the most accurate ranking of regulated genes in four out of the five data sets, and LMW consistently performs better than all examined moderated t-tests when used on RMA, GCRMA, and MAS5 expression indexes.
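
    The background idea of adjusting sample variances towards a global estimate can be sketched as below. This shows only the standard globally moderated t-statistic for paired log-ratios, not PLW or LMW, which additionally model the dependence of variability on intensity. The function name moderated_t and the hyperparameters prior_df and prior_var are assumptions for illustration; in practice the hyperparameters would be estimated from all genes rather than supplied by hand.

```python
import numpy as np


def moderated_t(log_ratios, prior_df, prior_var):
    """Globally moderated one-sample t-statistic per gene.

    log_ratios: array of shape (genes, replicates).
    prior_df, prior_var: assumed prior degrees of freedom and prior variance.
    """
    x = np.asarray(log_ratios, dtype=float)
    n = x.shape[1]
    df = n - 1
    mean = x.mean(axis=1)
    sample_var = x.var(axis=1, ddof=1)
    # Shrink each gene's variance towards the global prior variance.
    post_var = (prior_df * prior_var + df * sample_var) / (prior_df + df)
    return mean / np.sqrt(post_var / n)
```

    Setting prior_df to 0 recovers the ordinary t-statistic, while larger values pull every gene's variance closer to prior_var and so stabilize genes whose sample variance happens to be very small.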

    Some Applications of Bayesian Statistics

    This paper is intended as an introduction to Bayesian statistics for mathematicians who have little or no previous experience with the subject. We start with a rather philosophical presentation of central concepts, as the philosophical approach to statistics differs from the standard approach of frequentist statistics. We also define probability distributions in a non-standard way, mostly to illustrate how "integration constants" can often be disregarded in Bayesian statistics, simplifying computations. We continue with a quick presentation of some central computational methods, followed by three longer examples of Bayesian data analysis and modelling. The goal is to give an impression of the applicability of the general concepts, and hopefully spark ideas for new applications in the reader.
