
    The Critical Role of Statistics in Demonstrating the Reliability of Expert Evidence

    Federal Rule of Evidence 702, which covers testimony by expert witnesses, allows a witness to testify “in the form of an opinion or otherwise” if “the testimony is based on sufficient facts or data” and “is the product of reliable principles and methods” that have been “reliably applied.” The determination of what is “sufficient” (facts or data), and whether the “reliable principles and methods” relate to the scientific question at hand, involves more discrimination than the current Rule 702 may suggest. Using examples from latent fingerprint matching and trace evidence (bullet lead and glass), I offer some criteria that scientists often consider in assessing the “trustworthiness” of evidence, to enable courts to better distinguish between “trustworthy” and “questionable” evidence. The codification of such criteria may ultimately strengthen the current Rule 702 so courts can better distinguish between demonstrably scientific sufficiency and “opinion” based on inadequate (or inappurtenant) methods.

    Statistical Assessment of the Significance of Fracture Fits in Trace Evidence

    Fracture fits are often regarded as the highest degree of association of trace materials due to the common belief that inherently random fracturing events produce individualizing patterns. Often referred to as physical matches, fracture matches, or physical fits, these assessments consist of the realignment of two or more items with distinctive features and edge morphologies to demonstrate they were once part of the same object. Separated materials may provide a valuable link between items, individuals, or locations in forensic casework in a variety of criminal situations. Physical fit examinations require the use of the examiner’s judgment, which can rarely be supported by quantifiable uncertainty or widely reported error rates. Therefore, there is a need to develop, validate, and standardize fracture fit examination methodology and the respective interpretation protocols. This research aimed to develop systematic methods of examination and quantitative measures to assess the significance of trace evidence physical fits. This was facilitated through four main objectives: 1) an in-depth review manuscript consisting of 112 case reports, fractography studies, and quantitative-based studies to provide an organized summary establishing the current physical fit research base, 2) a pilot inter-laboratory study of a systematic, score-based technique previously developed by our research group for the evaluation of duct tape physical fit pairs and referred to as the Edge Similarity Score (ESS), 3) the initial expansion of the ESS methodology into textile materials, and 4) an expanded optimization and evaluation study of X-ray Fluorescence (XRF) Spectroscopy for electrical tape backing analysis, for implementation with an amorphous material for which physical fits may not be feasible due to a lack of distinctive features. Objective 1 was completed through a large-scale literature review and manuscript compilation of 112 fracture fit reports and research studies. 
Literature was evaluated in three overall categories: case reports, fractography or qualitative-based studies, and quantitative-based studies. In addition, 12 standard operating protocols (SOP) provided by various state and federal-level forensic laboratories were reviewed to provide an assessment of current physical fit practice. A review manuscript was submitted to Forensic Science International and has been accepted for publication. This manuscript provides, for the first time, a literature review of physical fits of trace materials and served as the basis for this project. The pilot inter-laboratory study (Objective 2) consisted of three study kits, each consisting of 7 duct tape comparison pairs with a ground truth of 4 matching pairs (3 of the expected M+ qualifier range, 1 of the more difficult M- range) and 3 non-matching pairs (NM). The kits were distributed as a Round Robin study, resulting in 16 participants and 112 physical fit comparisons. Prior to kit distribution, a consensus on each sample’s ESS was reached between 4 examiners with an agreement criterion of better than ± 10% ESS. Along with the physical comparison pairs, the study included a brief post-study survey allowing the distributors to receive feedback on the participants’ opinions on method ease of use and practicality. No misclassifications were observed across all study kits. The majority (86.6%) of reported ESS scores were within ± 20 ESS of the consensus values determined before the administration of the test. Accuracy ranged from 88% to 100%, depending on the criteria used for evaluation of the error rates. In addition, on average, 77% of reported ESS showed no significant differences from the respective pre-distribution consensus mean scores when subjected to ANOVA-Dunnett’s analysis using the level of difficulty as a blocking variable. 
These differences were more often observed on sets of higher difficulty (M-, 5 out of 16 participants, or 31%) than on lower difficulty sets (M+, 3 out of 16 participants, or 19%). Three main observations were derived from the participant results: 1) overall good agreement between the ESS reported by examiners was observed, 2) the ESS represented a good indicator of the quality of the match and yielded low error rates on conclusions, and 3) examiners who did not participate in formal method training tended to report ESS falling outside the expected pre-distribution ranges. This inter-laboratory study serves as an important precedent, as it represents the largest inter-laboratory study ever reported using a quantitative assessment of physical fits of duct tapes. In addition, the study provides valuable insights to move forward with the standardization of examination and interpretation protocols. Objective 3 consisted of a preliminary study assessing 274 total comparisons of stabbed (N=100) and hand-torn (N=174) textile pairs, as completed by two examiners. The first 74 comparisons resulted in a high incidence of false exclusions (63%) on textiles prone to distortion, revealing the need to assess suitability prior to physical fit examination of fabrics. For the remaining dataset, five clothing items of various textile compositions and constructions were subjected to fracture. The overall set consisted of 100 comparison pairs, 20 per textile item, 10 each per separation method of stabbed or hand-torn fractured edges, each examined by two analysts. Examiners determined ESS through the analysis of 10 bins of equal divisions of the total fracture edge length. A weighted ESS was also determined with the addition of three optional weighting factors per bin due to the continuation of a pattern, separation characteristics (i.e. damage or protrusions/gaps), or partial pattern fluorescence across the fractured edges. 
With the addition of a weighted ESS, a rarity ratio was determined as the ratio between the weighted ESS and the non-weighted ESS. In addition, the frequency of occurrence of all noted distinctive characteristics leading to the addition of a weighting factor by the examiner was determined. Overall, 93% accuracy was observed for the hand-torn set while 95% accuracy was observed for the stabbed set. Higher misclassification in the hand-torn set was observed in textile items of either 100% polyester composition or jersey knit construction, as higher elasticity led to greater fracture edge distortion. In addition, higher misclassification was observed in the stabbed set for those textiles with no pattern, as the stabbed edges led to straight, featureless bins often only associated due to pattern continuation. The results of this study are anticipated to provide valuable knowledge for the future development of protocols for the evaluation of relevant features of textile fractures and assessments of suitability for fracture fit comparisons. Finally, the XRF methodology optimization and evaluation study (Objective 4) expanded upon our group’s previous discrimination studies by broadening the total sample set of characterized tapes and evaluating the use of spectral overlay, spectral contrast angle, and Quadratic Discriminant Analysis (QDA) for the comparison of XRF spectra. The expanded sample set consisted of 114 samples, 94 from different sources and 20 from the same roll. Twenty sections from the same roll were used to assess intra-roll variability, and for each sample, replicate measurements on different locations of the tape were analyzed (n=3) to assess intra-sample variability. Inter-source variability was evaluated through 94 rolls of tapes of a variety of labeled brands, manufacturers, and product names. Parameter optimization included a comparison of atmospheric conditions, collection times, and instrumental filters. 
A study of the effects of adhesive and backing thickness on spectrum collection revealed key implications for the method that required modification of the sample support material. Figures of merit assessed included accuracy and discrimination over time, precision, sensitivity, and selectivity. One of the most important contributions of this study is the proposal of alternative objective methods of spectral comparison. The performance of different methods for comparing and contrasting spectra was evaluated. The optimization of this method was part of an assessment to incorporate XRF into a forensic laboratory protocol for rapid, highly informative elemental analysis of electrical tape backings and to expand examiners’ casework capabilities in circumstances where a physical fit conclusion is limited by the amorphous nature of electrical tape backings. Overall, this work strengthens the fracture fit research base by further developing quantitative methodologies for duct tape and textile materials and by initiating widespread distribution of the technique through an inter-laboratory study, as a first step towards laboratory implementation. Additional projects established the current state of forensic physical fit to provide the foundation from which future quantitative work such as the studies presented here must grow, and provided highly sensitive techniques of analysis for materials that present limited fracture fit capabilities.
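
The ESS arithmetic summarized above (binned scoring, optional weighting factors, and the rarity ratio) can be sketched in a few lines. This is a hedged illustration only: the bin agreements, weight values, and normalization below are hypothetical and do not reproduce the authors' exact protocol.

```python
# Hypothetical sketch of an Edge Similarity Score (ESS) computation.
# The fracture edge is divided into bins; each bin scores 1 if the
# examiner judges its features to correspond across the two edges.

def edge_similarity_score(bin_agreements):
    """ESS as the percentage of bins judged to correspond."""
    return 100.0 * sum(bin_agreements) / len(bin_agreements)

def weighted_ess(bin_agreements, weights):
    """Weighted ESS: a corresponding bin may receive extra weight for
    distinctive features (pattern continuation, damage, fluorescence)."""
    total = sum(a * w for a, w in zip(bin_agreements, weights))
    return 100.0 * total / len(bin_agreements)

# Example: 10 bins, 8 judged to correspond; two of those bins carry a
# (hypothetical) weight of 2 for a continuing printed pattern.
agreements = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
weights    = [2, 1, 1, 1, 1, 1, 1, 2, 1, 1]

ess  = edge_similarity_score(agreements)
wess = weighted_ess(agreements, weights)
rarity_ratio = wess / ess   # ratio of weighted to non-weighted ESS
```

A rarity ratio above 1 simply flags that the examiner attached extra distinctiveness to some corresponding bins; the interpretation thresholds are the subject of the study, not this sketch.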

    A Class of Regression Models for Pairwise Comparisons of Forensic Handwriting Comparison Systems

    Handwriting analysis is a complex field largely living in forensic science and the legal realm. One task of a forensic document examiner (FDE) may be to determine the writer(s) of handwritten documents. Automated identification systems (AIS) were built to aid FDEs in their examinations. Part of the use of these AIS (such as FISH [5][7], WANDA [6], CEDAR-FOX [17], and FLASHID®2) is to measure features of a handwriting sample and to provide the user with a numeric value of the evidence. These systems use their own algorithms and definitions of features to quantify the writing and can be considered black boxes. The outputs of two AIS are compared to the results of a survey of FDE writership opinions. In this dissertation I focus on the development of a response surface that characterizes the feature outputs of AIS. Using a set of handwriting samples, a pairwise metric, or scoring method, is applied to each of the individual features provided by the AIS to produce sets of pairwise scores. The pairwise scores lead to a degenerate U-statistic. We use a generalized least squares method to test the null hypothesis that there is no relationship between two metrics (ÎČ1 = 0). Monte Carlo simulations are developed and run to verify that the results, given the structure of the pairwise metric, behave as expected under the null hypothesis, and that the modeling detects a relationship under the alternative hypothesis. The outcome of the significance tests helps to determine which of the metrics are related to each other.
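
The pairwise-score structure that produces the degenerate U-statistic can be illustrated with a toy metric. The feature values and the absolute-difference score below are hypothetical, not the AIS algorithms themselves; the point is that any two scores sharing a sample, such as (0,1) and (0,2), are statistically dependent.

```python
import itertools

def pairwise_scores(features):
    """All pairwise scores for one hypothetical AIS feature across a set
    of handwriting samples. Scores that share a sample index are
    dependent, which is the degenerate-U-statistic structure the
    dissertation's generalized least squares model accounts for."""
    return {(i, j): abs(features[i] - features[j])
            for i, j in itertools.combinations(range(len(features)), 2)}

# One made-up feature value per handwriting sample, for 4 samples.
feat = [2.0, 3.5, 3.0, 5.0]
scores = pairwise_scores(feat)   # C(4,2) = 6 pairwise scores
```

In the dissertation, score vectors like this one (one per metric) are related through a linear model and the hypothesis ÎČ1 = 0 is tested; the dependence between shared-sample scores is why ordinary least squares would be inappropriate.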

    Geochemical wolframite fingerprinting - the likelihood ratio approach for laser ablation ICP-MS data

    Wolframite has been specified as a ‘conflict mineral’ by a U.S. Government Act, which obliges companies that use these minerals to report their origin. Minerals originating from conflict regions in the Democratic Republic of the Congo shall be excluded from the market as their illegal mining, trading, and taxation are supposed to fuel ongoing violent conflicts. The German Federal Institute for Geosciences and Natural Resources (BGR) developed a geochemical fingerprinting method for wolframite based on laser ablation inductively coupled plasma-mass spectrometry. Concentrations of 46 elements in about 5300 wolframite grains from 64 mines were determined. The issue of verifying the declared origins of the wolframite samples may be framed as a forensic problem by considering two contrasting hypotheses: the examined sample and a sample collected from the declared mine originate from the same mine (H1), and the two samples come from different mines (H2). The solution is found using the likelihood ratio (LR) theory. On account of the multidimensionality, the lack of normal distribution of data within each sample, and the huge within-sample dispersion in relation to the dispersion between samples, the classic LR models had to be modified. Robust principal component analysis and linear discriminant analysis were used to characterize samples. The similarity of two samples was expressed by Kolmogorov-Smirnov distances, which were interpreted in view of the H1 and H2 hypotheses within the LR framework. The performance of the models, controlled by the levels of incorrect responses and the empirical cross entropy, demonstrated that the proposed LR models are successful in verifying the authenticity of the wolframite samples.
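
A minimal sketch of the score-based LR idea described above, under stated assumptions: the Kolmogorov-Smirnov distance is computed as the two-sample statistic on a shared grid, and the same-mine and different-mine distance distributions are modelled as normal purely for illustration. The paper's actual models are modified, robust LR models; the parameters below are hypothetical.

```python
from math import exp, pi, sqrt

def ks_distance(xs, ys):
    """Two-sample Kolmogorov-Smirnov distance between element
    concentration measurements from two wolframite samples."""
    grid = sorted(set(xs) | set(ys))
    def ecdf(data, t):
        return sum(v <= t for v in data) / len(data)
    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in grid)

def gaussian_pdf(x, mu, sigma):
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def likelihood_ratio(d, same_mu, same_sd, diff_mu, diff_sd):
    """LR = f(d | H1: same mine) / f(d | H2: different mines), with both
    KS-distance distributions modelled here (an assumption) as normal."""
    return gaussian_pdf(d, same_mu, same_sd) / gaussian_pdf(d, diff_mu, diff_sd)

# Hypothetical concentrations for a questioned sample and a reference;
# a small KS distance yields LR > 1, supporting H1.
d = ks_distance([1.0, 1.2, 0.9], [1.1, 1.0, 1.3])
lr = likelihood_ratio(d, same_mu=0.3, same_sd=0.15, diff_mu=0.8, diff_sd=0.1)
```

In practice the two distance distributions would be estimated from many known same-mine and different-mine pairs rather than fixed by hand.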

    Likelihood ratio calibration in a transparent and testable forensic speaker recognition framework

    D. Ramos, J. González-Rodríguez, J. Ortega-García, “Likelihood Ratio Calibration in a Transparent and Testable Forensic Speaker Recognition Framework,” in The Speaker and Language Recognition Workshop, ODYSSEY, San Juan (Puerto Rico), 2006, pp. 1-8.
    A recently reopened debate about the infallibility of some classical forensic disciplines is leading to new requirements in forensic science. Standardization of procedures, proficiency testing, transparency in the scientific evaluation of the evidence, and testability of the system and protocols are emphasized in order to guarantee the scientific objectivity of the procedures. Those ideas are exploited in this paper in order to move towards an appropriate framework for the use of forensic speaker recognition in courts. Evidence is interpreted using the Bayesian approach, as a scientific and logical methodology, in a two-stage approach based on the similarity-typicality pair, which facilitates transparency in the process. The concept of calibration as a way of reporting reliable and accurate opinions is also addressed in depth, presenting experimental results which illustrate its effects. The testability of the system is then accomplished by the use of the NIST SRE 2005 evaluation protocol. Recently proposed application-independent evaluation techniques (Cllr and APE curves) are finally addressed as a proper way of presenting results of proficiency testing in courts, as these evaluation metrics clearly show the influence of calibration errors on the accuracy of the inferential decision process. This work has been supported by the Spanish Ministry for Science and Technology under project TIC2003-09068-C02-01.
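
The Cllr metric mentioned above has a standard definition (the average log-likelihood-ratio cost over same-speaker and different-speaker trials) that can be computed directly; the trial LR values below are hypothetical.

```python
from math import log2

def cllr(same_lrs, diff_lrs):
    """Application-independent log-likelihood-ratio cost (Cllr).
    same_lrs: LRs from same-speaker trials (should be large);
    diff_lrs: LRs from different-speaker trials (should be small).
    Miscalibrated LRs inflate the cost even if discrimination is good."""
    p_same = sum(log2(1 + 1 / lr) for lr in same_lrs) / len(same_lrs)
    p_diff = sum(log2(1 + lr) for lr in diff_lrs) / len(diff_lrs)
    return 0.5 * (p_same + p_diff)

# An uninformative system (LR = 1 on every trial) has Cllr = 1;
# well-calibrated, discriminating LRs drive Cllr towards 0.
baseline = cllr([1.0, 1.0], [1.0, 1.0])
good     = cllr([1000.0, 500.0], [0.001, 0.002])
```

This is why the paper proposes Cllr for reporting proficiency-testing results: unlike a raw error rate, it penalizes overconfident or underconfident LRs.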

    Euclidean distances as measures of speaker similarity including identical twin pairs: a forensic investigation using source and filter voice characteristics

    There is a growing consensus that hybrid approaches are necessary for successful speaker characterization in Forensic Speaker Comparison (FSC); hence this study explores the forensic potential of voice features combining source and filter characteristics. The former relate to the action of the vocal folds while the latter reflect the geometry of the speaker’s vocal tract. This set of features has been extracted from pause fillers, which are long enough for robust feature estimation yet spontaneous enough to be extracted from voice samples in real forensic casework. Speaker similarity was measured using standardized Euclidean Distances (ED) between pairs of speakers: 54 different-speaker (DS) comparisons, 54 same-speaker (SS) comparisons, and 12 comparisons between monozygotic twins (MZ). Results revealed that the differences between DS and SS comparisons were significant in both high quality and telephone-filtered recordings, with no false rejections and limited false acceptances; this finding suggests that this set of voice features is highly speaker-dependent and therefore forensically useful. Mean ED for MZ pairs lies between the average ED for SS comparisons and DS comparisons, as expected according to the literature on twin voices. Specific cases of MZ speakers with very high ED (i.e. strong dissimilarity) are discussed in the context of sociophonetic and twin studies. A preliminary simplification of the Vocal Profile Analysis (VPA) Scheme is proposed, which enables the quantification of voice quality features in the perceptual assessment of speaker similarity and allows for the calculation of perceptual-acoustic correlations. The adequacy of z-score normalization for this study is also discussed, as well as the relevance of heat maps for detecting the so-called phantoms in recent approaches to the biometric menagerie.
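
The standardized Euclidean distance used for these speaker comparisons can be sketched directly; the feature values and standard deviations below are hypothetical stand-ins for the source and filter features (e.g. fundamental frequency and formants).

```python
def standardized_euclidean(u, v, feature_sds):
    """Euclidean distance between two speakers' feature vectors after
    scaling each feature by its population standard deviation, so that
    features on different scales (Hz vs. kHz, etc.) contribute equally."""
    return sum(((a - b) / s) ** 2
               for a, b, s in zip(u, v, feature_sds)) ** 0.5

# Hypothetical two-feature vectors for two voice samples.
speaker_a = [120.0, 500.0]   # e.g. mean f0 (Hz), first formant (Hz)
speaker_b = [130.0, 520.0]
sds       = [10.0, 20.0]     # assumed population standard deviations

ed = standardized_euclidean(speaker_a, speaker_b, sds)
```

A small ED between two samples supports a same-speaker hypothesis; in the study, distributions of ED over SS, DS, and MZ pairs are compared rather than single values.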

    Evidence evaluation in craniofacial superimposition using likelihood ratios

    Craniofacial Superimposition is a forensic identification technique that supports decision-making when skeletal remains are involved. It is based on the analysis of the overlap of a post-mortem skull with ante-mortem facial photographs. Despite its importance and wide applicability, the process remains complex and challenging. To address this, computerized methods have been proposed, but subjectivity and qualitative reporting persist in decision-making. This study introduces an evidence evaluation system based on likelihood ratios (LRs), previously used in other forensic fields such as DNA, voice, fingerprint, and facial comparison. We present a novel application of this framework to Craniofacial Superimposition. Our work comprises three experiments in which our LR system is trained and tested under distinct conditions concerning facial images: the first utilizes frontal facial photographs; the second employs lateral facial photographs; and the last integrates both frontal and lateral facial photographs. In all three experiments, the proposed LR system stands out in terms of calibration and discriminating power, providing practitioners with a quantitative tool for evidence evaluation and integration. However, the lack of large volumes of real data obliged us to focus our study on synthetic data only; the work should therefore be considered a proof of concept. Nevertheless, the resulting likelihood-ratio system offers objective decision support in Craniofacial Superimposition, and further studies are required to validate the conclusions in a real scenario.
    Funding: R&D project CONFIA (grant PID2021-122916NB-I00), funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU, “ERDF A way of making Europe”. Grant FORAGE (B-TIC-456-UGR20), funded by Consejería de Universidad, Investigación e Innovación and by “ERDF A way of making Europe”. Ms. Martínez-Moreno is supported by grant PRE2022-102029, funded by MICIU/AEI/10.13039/501100011033 and the FSE+. Dr. Valsecchi’s work is supported by Red.es under grant Skeleton-ID2.0 (2021/C005/00141299). Dr. Ibáñez’s work is funded by the Spanish Ministry of Science, Innovation and Universities under grant RYC2020-029454-I and by Xunta de Galicia, Spain, under grant ED431F 2022/21. Funding for open access charge: Universidad de Granada / CBU.

    Development and Properties of Kernel-based Methods for the Interpretation and Presentation of Forensic Evidence

    The inference of the source of forensic evidence is related to model selection. Many forms of evidence can only be represented by complex, high-dimensional random vectors and cannot be assigned a likelihood structure. A common approach to circumvent this is to measure the similarity between pairs of objects composing the evidence. Such methods are ad hoc and unstable approaches to the judicial inference process. While these methods address the dimensionality issue, they also engender dependencies between scores, when two scores have one object in common, that are not taken into account in these models. The model developed in this research captures the dependencies between pairwise scores from a hierarchical sample and models them in the kernel space using a linear model. Our model is flexible enough to accommodate any kernel satisfying basic conditions and as a result is applicable to any type of complex, high-dimensional data. An important result of this work is the asymptotic multivariate normality of the scores as the data dimension increases. As a result, we can: 1) model very high-dimensional data when other methods fail; and 2) determine the source of multiple samples from a single trace in one calculation. Our model can be used to address high-dimensional model selection problems in different situations, and we show how to use it to assign Bayes factors to forensic evidence. We provide examples of real-life problems using data from very small particles and dust analyzed by SEM/EDX, and colors of fibers quantified by microspectrophotometry.
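
A kernel over pairs of high-dimensional measurements, and the resulting matrix of pairwise scores, can be sketched as follows. The RBF kernel and the toy vectors are assumptions for illustration, not the dissertation's chosen kernel; the point is the structure of the score matrix, whose shared-object entries are exactly the dependent scores the model accounts for.

```python
from math import exp

def rbf_kernel(x, y, gamma=0.5):
    """Similarity score between two high-dimensional trace measurements
    (an RBF kernel, used here only as one kernel satisfying the basic
    conditions: symmetric, bounded, maximal for identical inputs)."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def gram_matrix(samples, kernel=rbf_kernel):
    """Matrix of all pairwise kernel scores. Entries in the same row or
    column share a sample, so they are statistically dependent."""
    return [[kernel(x, y) for y in samples] for x in samples]

# Two toy 2-dimensional measurements (real traces would be very
# high-dimensional, e.g. SEM/EDX particle profiles).
G = gram_matrix([[0.0, 0.0], [1.0, 1.0]])
```

Because the kernel is symmetric with a maximum of 1 on the diagonal, G is a symmetric similarity matrix; the dissertation's contribution is a linear model over such scores rather than the kernel itself.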