
    Approximate Statistical Solutions to the Forensic Identification of Source Problem

    Currently in forensic science, the statistical methods for solving identification of source problems are inherently subjective and generally ad hoc. The formal Bayesian decision framework provides the most statistically rigorous foundation for these problems to date. However, computing a solution under this framework, which relies on a Bayes Factor, tends to be computationally intensive and highly sensitive to the subjective choice of prior distributions for the parameters. Therefore, this dissertation aims to develop statistical solutions to the forensic identification of source problems which are less subjective, but which retain the statistical rigor of the Bayesian solution. First, this dissertation focuses on computational issues during the subjective quantification of the Bayes Factor, and on characterizing the numerical error associated with the resulting quantification. Second, the asymptotic properties of the Bayes Factor for a fixed set of unknown-source evidence are considered as the number of control samples increases. Under the formal Bayesian paradigm, Doob's Consistency Theorem implies that a Bayesian believes in the existence of a value of evidence analogous to a true likelihood ratio in the Frequentist paradigm. Finally, two approximations to the value of evidence for the forensic identification of source problems are derived relative to the existence of a true likelihood ratio. The first approximation is derived as a result of the Bernstein-von Mises Theorem. This Bernstein-von Mises approximation eliminates the determination of prior distributions for the parameters. Under suitable conditions, the Bernstein-von Mises approximation converges in probability to the Bayes Factor as the size of the control samples increases. However, the Bernstein-von Mises approximation suffers from similar computational issues as the Bayes Factor. The second approximation is derived as a result of various theorems regarding the asymptotic properties of M-estimators. This Neyman-Pearson approximation requires no prior distributions and is generally more computationally tractable. Under suitable conditions, the Neyman-Pearson approximation converges in probability to the true likelihood ratio as the number of control samples increases. In addition, the Neyman-Pearson approximation can replace the Bayes Factor in the forensic identification of source problems and result in decisions that are approximately equivalent to using the Bayes Factor.
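
    As an illustrative sketch only (the notation below is an exposition aid, not quoted from the dissertation), the prior-dependent Bayes Factor and a prior-free, plug-in approximation to the value of evidence can be contrasted roughly as follows, with e_u the unknown-source evidence and e_s, e_a the control samples from the specified source and the alternative-source population:

```latex
% Illustrative notation only; not the dissertation's exact formulation.
\[
  \mathrm{BF}
  = \frac{\int f(e_u \mid \theta_s)\,\pi(\theta_s \mid e_s)\,d\theta_s}
         {\int f(e_u \mid \theta_a)\,\pi(\theta_a \mid e_a)\,d\theta_a},
  \qquad
  \widehat{\lambda}
  = \frac{f(e_u \mid \hat{\theta}_s)}
         {f(e_u \mid \hat{\theta}_a)} .
\]
% The hatted parameters are estimates (e.g., M-estimates) computed from the
% control samples, so the plug-in form requires no prior distributions.
```

    The contrast mirrors the abstract's distinction between the Bayes Factor, which averages over posterior distributions of the parameters, and the prior-free Neyman-Pearson-style approximation based on M-estimators.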

    Session 8: Ensemble of Score Likelihood Ratios for the common source problem

    Machine learning-based Score Likelihood Ratios (SLRs) have been proposed as an alternative to traditional Likelihood Ratios and Bayes Factors to quantify the value of evidence when contrasting two opposing propositions. Under the common source problem, the opposing propositions relate to the inferential problem of assessing whether two items come from the same source. Machine learning techniques can be used to construct a (dis)similarity score for complex data when developing a traditional model is infeasible, and density estimation is used to estimate the likelihood of the scores under both propositions. In practice, the score metric and its distribution are developed using pairwise comparisons constructed from a sample of the background population. Generating these comparisons results in a complex dependence structure that violates assumptions fundamental to most methods. To remedy this lack of independence, we introduce a sampling approach that constructs training and estimation sets in which these assumptions are met. Using these newly created datasets, we construct multiple base SLR systems and aggregate their information into a final score to quantify the value of evidence. Our experimental results show that the ensembled SLR can outperform a traditional SLR in terms of the rate of misleading evidence and discriminatory power, and that it is more reliable.
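
    The following is a minimal sketch of how such a score-based system could be assembled, assuming numeric feature vectors, a plain Euclidean dissimilarity score, and kernel density estimates of the score distributions; the disjoint-pair sampling and log-SLR averaging are simplifying assumptions, not the authors' implementation.

```python
# Minimal SLR sketch: Euclidean dissimilarity score + KDE score densities.
# The disjoint-pair sampling and the log-SLR averaging are illustrative
# assumptions, not the authors' ensemble construction.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

def disjoint_pairs(labels, same_source, n_pairs):
    """Sample item pairs so each item is used at most once, avoiding the
    dependence created when items are reused across pairwise comparisons."""
    order = list(rng.permutation(len(labels)))
    pairs, used = [], set()
    for i in order:
        if len(pairs) >= n_pairs or i in used:
            continue
        for j in order:
            if j in used or j == i:
                continue
            if (labels[i] == labels[j]) == same_source:
                pairs.append((i, j))
                used.update((i, j))
                break
    return pairs

def fit_base_slr(features, labels, n_pairs=50):
    """Fit one base SLR system: KDEs of same-source and different-source scores."""
    score = lambda i, j: np.linalg.norm(features[i] - features[j])
    ss = [score(i, j) for i, j in disjoint_pairs(labels, True, n_pairs)]
    ds = [score(i, j) for i, j in disjoint_pairs(labels, False, n_pairs)]
    f_ss, f_ds = gaussian_kde(ss), gaussian_kde(ds)
    return lambda s: float(f_ss(s) / f_ds(s))

def ensemble_slr(base_systems, s):
    """Aggregate base systems by averaging their log-SLRs for a new score s."""
    return float(np.exp(np.mean([np.log(slr(s)) for slr in base_systems])))
```

    In this sketch, each base system would be fit on an independent resample of the background data, and the ensemble evaluated at the score computed between the questioned and reference items.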

    Two-Stage Approach for Forensic Handwriting Analysis

    Trained experts currently perform the handwriting analysis required in the criminal justice field, but this can create biases, delays, and expenses, leaving room for improvement. Prior research has sought to address this by analyzing handwriting through feature-based and score-based likelihood ratios for assessing evidence within a probabilistic framework. However, error rates are not well defined within this framework, which makes the method difficult to evaluate and can lead to a greater-than-expected number of errors when the approach is applied. This research explores a method for assessing handwriting within the Two-Stage framework, which allows error rates to be quantified as recommended by the federal PCAST report (Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods). The coincidence probabilities produced here can be used in later research to assess error rates using a ROC curve.
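
    As a hedged sketch of the two-stage logic, assuming handwriting samples reduced to numeric feature vectors; the distance measure and threshold are placeholders, not the study's actual criteria:

```python
# Two-stage sketch: (1) similarity decision, (2) coincidence probability.
# The Euclidean distance and the fixed threshold are illustrative assumptions.
import numpy as np

def similar(questioned, known_samples, threshold):
    """Stage 1: associate the questioned document with the known writer when it
    falls within `threshold` of the mean of that writer's samples."""
    return np.linalg.norm(questioned - known_samples.mean(axis=0)) <= threshold

def coincidence_probability(questioned, background_writers, threshold):
    """Stage 2: fraction of background writers who would also be associated
    with the questioned document by chance under the same criterion."""
    hits = [similar(questioned, samples, threshold) for samples in background_writers]
    return float(np.mean(hits))
```

    Sweeping the threshold and recording the resulting error rates is one way such coincidence probabilities could later be summarized with a ROC curve.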

    Session 8: Statistical Discrimination Methods for Forensic Source Interpretation of Aluminum Powders in Explosives

    Aluminum (Al) powder is often used as a fuel in explosive devices; therefore, individuals attempting to make illegal improvised explosive devices often obtain it from legitimate commercial products or make it themselves using readily available Al starting materials. The characterization and differentiation of sources of Al powder for additional investigative and intelligence value has become increasingly important. Previous research modeled the distributions of micromorphometric features of Al powder particles within a subsample to support Al source discrimination. Since then, additional powder samples from a variety of different source types have been obtained and analyzed, providing a more comprehensive dataset for applying two statistical methods for interpretation and discrimination of source. Here, we compare these two statistical techniques: one using linear discriminant analysis (LDA), and the other using a modification to the method used in ASTM E2927-16e1 and E2330-19. The LDA method results in an Al source classification for each questioned sample. Alternatively, our modification to the ASTM method uses an interval-based match criterion to associate or exclude each of the known sources as the actual source of a trace. Although the outcomes of these two statistical methods are fundamentally different, their performance with respect to the closed-set identification of source problem is compared. Additionally, the modified ASTM method will be adapted to provide a vector of scores in lieu of the binary decision as a first step towards a score-based likelihood ratio for interpreting Al powder micromorphometric measurement data.
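
    For illustration only, the two interpretation routes could be prototyped along the following lines, with per-subsample summary features as input; the feature layout and the k = 2 standard-deviation match window are assumptions, not the ASTM parameters.

```python
# Sketch of the two interpretation routes for Al powder micromorphometric data.
# Feature layout and the k = 2 standard-deviation window are assumptions.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def lda_classify(known_features, known_sources, questioned_features):
    """LDA route: a closed-set source classification for each questioned sample."""
    lda = LinearDiscriminantAnalysis().fit(known_features, known_sources)
    return lda.predict(questioned_features)

def interval_match(known_subsamples, questioned_subsample, k=2.0):
    """Modified-ASTM route: associate a known source when every feature mean of
    the questioned subsample lies within +/- k SD of that source's feature means."""
    mu = known_subsamples.mean(axis=0)
    sd = known_subsamples.std(axis=0, ddof=1)
    q = questioned_subsample.mean(axis=0)
    return bool(np.all(np.abs(q - mu) <= k * sd))

def interval_scores(known_subsamples, questioned_subsample):
    """Score vector (one standardized distance per feature) in lieu of the
    binary decision, as a starting point for a score-based likelihood ratio."""
    mu = known_subsamples.mean(axis=0)
    sd = known_subsamples.std(axis=0, ddof=1)
    return np.abs(questioned_subsample.mean(axis=0) - mu) / sd
```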

    Statistical Analysis of Handwriting: Probabilistic Outcomes for Closed-Set Writer Identification.

    Learning Overview: The goal of this presentation is to provide insights into features of handwritten documents that are important for statistical modeling for the task of writer identification.

    Development of Strategies for Estimating a Response Surface to Characterize a Black-box Algorithm in Terms of a White-box Algorithm

    In forensic identification of source problems, there is an increasing lack of explainability in the complex black-box algorithms used to assign evidential value. Generally speaking, black-box algorithms are designed with prediction in mind. Although the information fed into the algorithm and the features used to make the prediction are often known to the user, the complexity of the algorithm limits the end user's ability to understand how the input features are used. On the other hand, more transparent algorithms (sometimes referred to as “white-box”) are typically less accurate, even though they provide direct information on how the input object is used to predict a class or outcome. In this work, we begin developing a response surface that characterizes the output of a black-box algorithm in terms of the output of a white-box algorithm. Using a set of handwriting samples, we apply a complex black-box algorithm across multiple features to produce one set of pairwise scores and a simple, transparent algorithm that uses individual features to produce another set of pairwise scores. A generalized least squares method is used to test the null hypothesis that there is no relationship between the two types of scores. The outcome of the significance tests helps to determine which of the individual feature scores have an influence on the black-box scores.
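
    A minimal sketch of the testing step, assuming the black-box and white-box pairwise scores are aligned row by row and that a covariance matrix describing the dependence among pairwise comparisons (called sigma below) is available; the function name and the use of statsmodels are illustrative assumptions.

```python
# Sketch: regress black-box pairwise scores on white-box feature scores by GLS
# and test each feature's coefficient against the null of no relationship.
# `sigma` (the comparison-level covariance) is assumed to be supplied.
import numpy as np
import statsmodels.api as sm

def feature_influence_pvalues(black_box_scores, white_box_scores, sigma):
    """Return one p-value per white-box feature score column."""
    X = sm.add_constant(white_box_scores)            # intercept + feature scores
    fit = sm.GLS(black_box_scores, X, sigma=sigma).fit()
    return np.asarray(fit.pvalues)[1:]               # drop the intercept term
```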

    Statistical Analysis of Handwriting for Writer Identification

    Posted with permission of CSAFE.