
    Inferring the Latent Incidence of Inefficiency from DEA Estimates and Bayesian Priors

    Data envelopment analysis (DEA) is among the most popular empirical tools for measuring cost and productive efficiency. Because DEA is a linear programming technique, establishing formal statistical properties for outcomes is difficult. We show that the incidence of inefficiency within a population of Decision Making Units (DMUs) is a latent variable, with DEA outcomes providing only noisy sample-based categorizations of inefficiency. We then use a Bayesian approach to infer an appropriate posterior distribution for the incidence of inefficient DMUs based on a random sample of DEA outcomes and a prior distribution on the incidence of inefficiency. The methodology applies to both finite and infinite populations and to sampling DMUs with and without replacement, and it accounts for the noise in the DEA characterization of inefficiency within a coherent Bayesian approach to the problem. The result is an appropriately up-scaled, noise-adjusted inference regarding the incidence of inefficiency in a population of DMUs. Keywords: Data Envelopment Analysis, latent inefficiency, Bayesian inference, Beta priors, posterior incidence of inefficiency
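    As a rough illustration of the Beta-prior updating described above (not the paper's noise-adjusted or finite-population machinery), the sketch below updates a Beta prior on the incidence of inefficiency with a hypothetical sample of DEA outcomes; all parameter values are placeholders.

```python
# Minimal Beta-Binomial sketch: posterior over the incidence of inefficiency
# given k DEA-flagged DMUs out of a sample of n, under a Beta(a, b) prior.
# The paper's noise adjustment and finite-population handling are omitted.
from scipy import stats

a, b = 1.0, 1.0   # hypothetical Beta prior parameters
n, k = 50, 18     # hypothetical sample: n DMUs, k flagged inefficient by DEA

posterior = stats.beta(a + k, b + n - k)   # conjugate update
print(posterior.mean())                    # posterior mean incidence
print(posterior.interval(0.95))            # 95% credible interval
```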

    Face Liveness Detection under Processed Image Attacks

    Face recognition is a mature and reliable technology for identifying people. Due to high-definition cameras and supporting devices, it is considered the fastest and the least intrusive biometric recognition modality. Nevertheless, effective spoofing attempts on face recognition systems were found to be possible. As a result, various anti-spoofing algorithms were developed to counteract these attacks; they are commonly referred to in the literature as liveness detection tests. In this research we highlight the effectiveness of some simple, direct spoofing attacks and test one of the current robust liveness detection algorithms, i.e. the logistic-regression-based face liveness detection from a single image proposed by Tan et al. in 2010, against malicious attacks using processed imposter images. In particular, we study experimentally the effect of common image processing operations such as sharpening and smoothing, as well as corruption with salt and pepper noise, on the face liveness detection algorithm, and we find that it is especially vulnerable to spoofing attempts using processed imposter images. We design and present a new facial database, the Durham Face Database, which is the first, to the best of our knowledge, to contain client, imposter, and processed imposter images. Finally, we evaluate our claim on the effectiveness of processed imposter image attacks using transfer learning on Convolutional Neural Networks. We verify that such attacks are more difficult to detect even when using high-end, expensive machine learning techniques.
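    A minimal sketch of the three processed-imposter operations studied (smoothing, sharpening, salt-and-pepper noise), assuming OpenCV and NumPy; the file name and parameter values are illustrative, not the settings used in the paper.

```python
# Sketch of the image processing operations studied as attacks: smoothing,
# sharpening, and salt-and-pepper noise applied to an imposter photograph.
# File name and parameter values are placeholders, not the paper's settings.
import cv2
import numpy as np

img = cv2.imread("imposter.png", cv2.IMREAD_GRAYSCALE)

smoothed = cv2.GaussianBlur(img, (5, 5), sigmaX=0)

sharpen_kernel = np.array([[0, -1, 0],
                           [-1, 5, -1],
                           [0, -1, 0]], dtype=np.float32)
sharpened = cv2.filter2D(img, ddepth=-1, kernel=sharpen_kernel)

noisy = img.copy()
mask = np.random.rand(*img.shape)
noisy[mask < 0.02] = 0       # pepper
noisy[mask > 0.98] = 255     # salt
```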

    Repeatability and Reproducibility of Decisions by Latent Fingerprint Examiners

    The interpretation of forensic fingerprint evidence relies on the expertise of latent print examiners. We tested latent print examiners on the extent to which they reached consistent decisions. This study assessed intra-examiner repeatability by retesting 72 examiners on comparisons of latent and exemplar fingerprints, after an interval of approximately seven months; each examiner was reassigned 25 image pairs for comparison, out of a total pool of 744 image pairs. We compare these repeatability results with reproducibility (inter-examiner) results derived from our previous study. Examiners repeated 89.1% of their individualization decisions, and 90.1% of their exclusion decisions; most of the changed decisions resulted in inconclusive decisions. Repeatability of comparison decisions (individualization, exclusion, inconclusive) was 90.0% for mated pairs, and 85.9% for nonmated pairs. Repeatability and reproducibility were notably lower for comparisons assessed by the examiners as “difficult” than for “easy” or “moderate” comparisons, indicating that examiners' assessments of difficulty may be useful for quality assurance. No false positive errors were repeated (n = 4); 30% of false negative errors were repeated. One percent of latent value decisions were completely reversed (no value even for exclusion vs. of value for individualization). Most of the inter- and intra-examiner variability concerned whether the examiners considered the information available to be sufficient to reach a conclusion; this variability was concentrated on specific image pairs such that repeatability and reproducibility were very high on some comparisons and very low on others. Much of the variability appears to be due to making categorical decisions in borderline cases.
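    For concreteness, a repeatability percentage of the kind reported above could be computed from paired round-one/round-two decisions roughly as follows; the table layout and column names are hypothetical, not the study's actual data format.

```python
# Sketch: percentage of repeated decisions across two test rounds, from a table
# of (examiner, image pair, round-1 decision, round-2 decision, rated difficulty).
# Column names are hypothetical, not from the study's data files.
import pandas as pd

df = pd.read_csv("decisions.csv")  # columns: examiner_id, pair_id, decision_t1, decision_t2, difficulty

overall = (df["decision_t1"] == df["decision_t2"]).mean() * 100
print(f"Overall repeatability: {overall:.1f}%")

# Repeatability broken down by examiner-rated difficulty
by_difficulty = (df.assign(same=df["decision_t1"] == df["decision_t2"])
                   .groupby("difficulty")["same"].mean() * 100)
print(by_difficulty)
```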

    Identifying Phenotypes Based on TCR Repertoire Using Machine Learning Methods

    The adaptive immune system protects human beings from infection by pathogens. T cells, a type of lymphocyte in the adaptive immune system, recognise antigens by T cell receptors (TCRs) and then generate cell-mediated immune responses. After primary immune responses, the adaptive immune system generates corresponding immunological memory. TCRs are produced by a process of somatic gene rearrangement and therefore have high diversity. An individual's TCR repertoire can reveal his or her pathogen exposure history, which can assist in biological studies such as disease diagnosis. This master's thesis aims to predict phenotype status based on high-throughput TCR sequencing data using machine learning approaches, to see how accurate phenotype identification based on the TCR repertoire can be. The raw TCR data are preprocessed in three different ways, and the subsequent steps are carried out separately for each. Several feature selection approaches are applied to obtain the most important TCRs. Machine learning algorithms, including a Beta-binomial model (baseline), logistic regression, random forest, and the boosting algorithm LightGBM, are trained and evaluated. Two datasets, Cytomegalovirus (CMV) and rheumatoid arthritis (RA), are explored. For the CMV dataset, random forest performs best, although only slightly better than the baseline model. However, classification results for the RA dataset are poor regardless of the model used; there the best classifier is LightGBM. The results imply that the TCR data need to be large enough to make powerful predictions. With a sufficiently large dataset, the baseline model already predicts well, and certain algorithms such as random forest may outperform it.
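    A schematic sketch of the classifier comparison described above, assuming a precomputed subject-by-TCR presence/absence feature matrix; the data below are random placeholders and the hyperparameters are not those tuned in the thesis.

```python
# Schematic sketch: classify phenotype status from a TCR feature matrix
# (rows = subjects, columns = selected TCR clonotypes). Not the thesis code;
# X and y are random placeholders standing in for real sequencing-derived data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
import lightgbm as lgb

X = (np.random.rand(100, 200) > 0.9).astype(float)  # placeholder presence/absence matrix
y = np.random.randint(0, 2, size=100)                # placeholder phenotype labels (e.g. CMV+/-)

for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                  ("random forest", RandomForestClassifier(n_estimators=500)),
                  ("LightGBM", lgb.LGBMClassifier())]:
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean AUC = {auc:.3f}")
```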

    Equivalence testing for identity authentication using pulse waves from photoplethysmograph

    Doctor of Philosophy, Department of Statistics, Suzanne Dubnicka, Christopher Vahl.
    Photoplethysmograph sensors use a light-based technology to sense the rate of blood flow as controlled by the heart’s pumping action. This allows for a graphical display of a patient’s pulse wave form and the description of its key features. A person’s pulse wave has been proposed as a tool in a wide variety of applications. For example, it could be used to diagnose the cause of coldness felt in the extremities or to measure stress levels while performing certain tasks. It could also be applied to quantify the risk of heart disease in the general population. In the present work, we explore its use for identity authentication. First, we visualize the pulse waves from individual patients using functional boxplots, which assess the overall behavior and identify unusual observations. Functional boxplots are also shown to be helpful in preprocessing the data by shifting individual pulse waves to a proper starting point. We then employ functional analysis of variance (FANOVA) and permutation tests to demonstrate that the identities of a group of subjects can be differentiated and compared by their pulse wave forms. One of the primary tasks of the project is to confirm the identity of a person, i.e., to decide whether a given person is who they claim to be. We use an equivalence test to determine whether the pulse wave of the person under verification and that of the actual person are close enough to be considered equivalent. A nonparametric bootstrap functional equivalence test is applied to evaluate equivalence by constructing point-wise confidence intervals for the metric of identity assurance. We also propose new testing procedures, including the construction of the equivalence hypothesis and test statistics and the determination of the evaluation range and equivalence bands, to authenticate identity.
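    A minimal sketch of point-wise bootstrap confidence intervals for the difference between two sets of aligned pulse-wave curves, in the spirit of the equivalence test described above; the thesis' actual identity-assurance metric and equivalence bands are not reproduced, and the band width delta below is a placeholder.

```python
# Minimal sketch: point-wise bootstrap confidence intervals for the difference
# between two sets of pre-aligned pulse-wave curves (claimed identity vs. probe).
# Data and the equivalence band delta are placeholders.
import numpy as np

rng = np.random.default_rng(0)
ref = rng.normal(size=(40, 200))    # placeholder: 40 reference waves, 200 time points
probe = rng.normal(size=(40, 200))  # placeholder: 40 probe waves

def mean_diff(a, b):
    return a.mean(axis=0) - b.mean(axis=0)

boots = np.empty((1000, ref.shape[1]))
for i in range(1000):
    boots[i] = mean_diff(ref[rng.integers(0, len(ref), len(ref))],
                         probe[rng.integers(0, len(probe), len(probe))])

lower, upper = np.percentile(boots, [2.5, 97.5], axis=0)
delta = 0.5   # hypothetical equivalence band
equivalent = bool(np.all((lower > -delta) & (upper < delta)))
print(equivalent)
```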

    Maritime threat response

    This report was prepared by Systems Engineering and Analysis Cohort Nine (SEA-9) Maritime Threat Response (MTR) team members. Background: The 2006 Naval Postgraduate School (NPS) Cross-Campus Integrated Study, titled “Maritime Threat Response,” involved the combined effort of 7 NPS Systems Engineering students, 7 Singaporean Temasek Defense Systems Institute (TDSI) students, 12 students from the Total Ship Systems Engineering (TSSE) curriculum, and numerous NPS faculty members from different NPS departments. After receiving tasking provided by the Wayne E. Meyer Institute of Systems Engineering at NPS in support of the Office of the Assistant Secretary of Defense for Homeland Defense, the study examined ways to validate intelligence and respond to maritime terrorist attacks against United States coastal harbors and ports. After assessing likely harbors and waterways on which to base the study, San Francisco Bay was selected as a representative test-bed for the integrated study. The SEA-9 MTR team, in conjunction with the TDSI students, used the Systems Engineering Lifecycle Process (SELP) [shown in Figure ES-1, p. xxiii] as a systems engineering framework to conduct the multi-disciplinary study. While no hardware was actually fabricated, the process was well suited for tailoring to the team’s research efforts and project focus. The SELP was an iterative process used to bound and scope the MTR problem; determine needs, requirements, and functions; and design architecture alternatives to satisfy stakeholder needs and desires. The System of Systems (SoS) approach taken [shown in Figure ES-2, p. xxiv] enabled the team to apply a systematic approach to problem definition, needs analysis, requirements analysis, functional analysis, and then architecture development and assessment. In the twenty-first century, the threat of asymmetric warfare in the form of terrorism is one of the most likely direct threats to the United States homeland. It has been recognized that perhaps the key element in protecting the continental United States from terrorist threats is obtaining intelligence of impending attacks in advance. Enormous amounts of resources are currently allocated to obtaining and parsing such intelligence. However, it remains a difficult problem to deal with such attacks once intelligence is obtained. In this context, the Maritime Threat Response Project applied Systems Engineering processes to propose cost-effective SoS architecture solutions to surface-based terrorist threats emanating from the maritime domain. The project applied a five-year time horizon to provide near-term solutions to prospective decision makers, take maximum advantage of commercial off-the-shelf (COTS) solutions, and emphasize new Concepts of Operations (CONOPS) for existing systems. Results provided insight into requirements for interagency interactions in support of Maritime Security and demonstrated the criticality of timely and accurate intelligence in support of counterterror operations. This report was prepared for the Office of the Assistant Secretary of Defense for Homeland Defense. Approved for public release; distribution is unlimited.

    Speaker Recognition in Unconstrained Environments

    Speaker recognition is applied in smart home devices, interactive voice response systems, call centers, online banking and payment solutions, as well as in forensic scenarios. This dissertation is concerned with speaker recognition systems in unconstrained environments. Before this dissertation, research on making better decisions in unconstrained environments was insufficient. Aside from decision making, unconstrained environments imply two other subjects: security and privacy. Within the scope of this dissertation, these research subjects are regarded as both security against short-term replay attacks and privacy preservation within state-of-the-art biometric voice comparators in the light of a potential leak of biometric data. The aforementioned research subjects are united in this dissertation to sustain good decision making processes facing uncertainty from varying signal quality and to strengthen security as well as preserve privacy. Conventionally, biometric comparators are trained to classify between mated and non-mated reference-probe pairs under idealistic conditions but are expected to operate well in the real world. However, the more the voice signal quality degrades, the more erroneous decisions are made. The severity of their impact depends on the requirements of a biometric application. In this dissertation, quality estimates are proposed and employed for the purpose of making better decisions on average in a formalized way (quantitative method), while the specifications of decision requirements of a biometric application remain unknown. By using the Bayesian decision framework, the specification of application-dependent decision requirements is formalized, outlining operating points: the decision thresholds. The assessed quality conditions combine ambient and biometric noise, both of which occur in commercial as well as in forensic application scenarios. Dual-use (civil and governmental) technology is investigated. As it seems unfeasible to train systems for every possible signal degradation, a small number of quality conditions is used. After examining the impact of degrading signal quality on biometric feature extraction, the extraction is assumed ideal in order to conduct a fair benchmark. This dissertation proposes and investigates methods for propagating information about quality to decision making. By employing quality estimates, a biometric system's output (comparison scores) is normalized in order to ensure that each score encodes the least-favorable decision trade-off in its value. Application development is segregated from requirement specification. Furthermore, class discrimination and score calibration performance is improved over all decision requirements for real world applications. In contrast to the ISO/IEC 19795-1:2006 standard on biometric performance (error rates), this dissertation is based on biometric inference for probabilistic decision making (subject to prior probabilities and cost terms). This dissertation elaborates on the paradigm shift from requirements by error rates to requirements by beliefs in priors and costs. Binary decision error trade-off plots are proposed, interrelating error rates with prior and cost beliefs, i.e., formalized decision requirements. Verbal tags are introduced to summarize categories of least-favorable decisions: the plot's canvas follows from Bayesian decision theory. Empirical error rates are plotted, encoding categories of decision trade-offs by line styles.
Performance is visualized in the latent decision subspace for evaluating empirical performance regarding changes in prior- and cost-based decision requirements. Security against short-term audio replay attacks (collages of sound units such as phonemes and syllables) is strengthened. The unit-selection attack is posed by the ASVspoof 2015 challenge (English speech data), representing the most difficult-to-detect voice presentation attack of this challenge. In this dissertation, unit-selection attacks are created for German speech data, where support vector machine and Gaussian mixture model classifiers are trained to detect collage edges in speech representations based on wavelet and Fourier analyses. Competitive results are reached compared to the challenge submissions. Homomorphic encryption is proposed to preserve the privacy of biometric information in the case of database leakage. In this dissertation, log-likelihood ratio scores, representing biometric evidence objectively, are computed in the latent biometric subspace. Whereas conventional comparators rely on the feature extraction to ideally represent biometric information, latent subspace comparators are trained to find ideal representations of the biometric information in the voice reference and probe samples to be compared. Two protocols are proposed for the two-covariance comparison model, a special case of probabilistic linear discriminant analysis. Log-likelihood ratio scores are computed in the encrypted domain based on encrypted representations of the biometric reference and probe. As a consequence, the biometric information conveyed in voice samples is, in contrast to many existing protection schemes, stored protected and without information loss. The first protocol preserves privacy of end-users, requiring one public/private key pair per biometric application. The second protocol preserves privacy of end-users and comparator vendors with two key pairs. Comparators estimate the biometric evidence in the latent subspace, such that the subspace model requires data protection as well. In both protocols, log-likelihood ratio based decision making meets the requirements of the ISO/IEC 24745:2011 biometric information protection standard in terms of unlinkability, irreversibility, and renewability properties of the protected voice data.
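    As a small worked example of the Bayesian decision framework mentioned above, the sketch below derives the decision threshold on a log-likelihood-ratio score implied by a prior and by miss/false-accept costs; the numbers are illustrative, not the dissertation's operating points.

```python
# Sketch: Bayes decision threshold on a log-likelihood-ratio (LLR) score
# implied by prior probabilities and cost terms. Numbers are illustrative only.
import math

p_target = 0.01           # prior belief that the probe is mated
c_miss, c_fa = 1.0, 10.0  # costs of a miss and of a false accept

# Accept if LLR exceeds log( (c_fa * (1 - p_target)) / (c_miss * p_target) )
threshold = math.log((c_fa * (1 - p_target)) / (c_miss * p_target))

def decide(llr_score):
    return "accept" if llr_score > threshold else "reject"

print(threshold, decide(5.0))
```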

    Modeling the ongoing dynamics of short and long-range temporal correlations in broadband EEG during movement

    The electroencephalogram (EEG) undergoes complex temporal and spectral changes during voluntary movement intention. Characterization of such changes has focused mostly on narrowband spectral processes such as Event-Related Desynchronization (ERD) in the sensorimotor rhythms, because EEG is mostly considered to emerge from oscillations of neuronal populations. However, the changes in the temporal dynamics, especially in the broadband arrhythmic EEG, have not been investigated for movement intention detection. Long-Range Temporal Correlations (LRTC) are ubiquitously present in several neuronal processes but typically require longer timescales to detect. In this paper, we study the ongoing changes in the dynamics of long- as well as short-range temporal dependencies in the single-trial broadband EEG during movement intention. We obtained LRTC in 2 s windows of broadband EEG and modeled it using the Autoregressive Fractionally Integrated Moving Average (ARFIMA) model, which allowed simultaneous modeling of short- and long-range temporal correlations. There were significant (p < 0.05) changes in both broadband long- and short-range temporal correlations during movement intention and execution. We discovered that the broadband LRTC and narrowband ERD are complementary processes providing distinct information about movement, because eliminating LRTC from the signal did not affect the ERD and, conversely, eliminating ERD from the signal did not affect LRTC. Exploring the possibility of applications in Brain Computer Interfaces (BCI), we used hybrid features with combinations of LRTC, ARFIMA, and ERD to detect movement intention. A significantly higher (p < 0.05) classification accuracy of 88.3 ± 4.2% was obtained using the combination of ARFIMA and ERD features together, which also gave the earliest prediction of movement, at 1 s before its onset. The ongoing changes in the long- and short-range temporal correlations in broadband EEG contribute to effectively capturing the motor command generation and can be used to detect movement successfully. These temporal dependencies provide different and additional information about the movement.
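    As a rough illustration (not the paper's pipeline of 2 s windows, ARFIMA modeling, and ERD features), the sketch below estimates an LRTC exponent with detrended fluctuation analysis, a common estimator of long-range temporal correlations; the signal and scale choices are placeholders.

```python
# Compact sketch of detrended fluctuation analysis (DFA), a common estimator of
# long-range temporal correlations. The signal below is a random placeholder.
import numpy as np

def dfa_exponent(x, scales=(16, 32, 64, 128, 256)):
    y = np.cumsum(x - np.mean(x))             # integrated (profile) signal
    flucts = []
    for s in scales:
        n_seg = len(y) // s
        segs = y[:n_seg * s].reshape(n_seg, s)
        t = np.arange(s)
        f2 = []
        for seg in segs:
            coef = np.polyfit(t, seg, 1)       # linear detrend per segment
            f2.append(np.mean((seg - np.polyval(coef, t)) ** 2))
        flucts.append(np.sqrt(np.mean(f2)))
    # Slope of log F(s) vs log s: ~0.5 for uncorrelated noise, >0.5 indicates LRTC
    return np.polyfit(np.log(scales), np.log(flucts), 1)[0]

eeg = np.random.randn(5000)                    # placeholder broadband EEG segment
print(dfa_exponent(eeg))
```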

    Quality of Information in Mobile Crowdsensing: Survey and Research Challenges

    Smartphones have become the most pervasive devices in people's lives, and are clearly transforming the way we live and perceive technology. Today's smartphones benefit from almost ubiquitous Internet connectivity and come equipped with a plethora of inexpensive yet powerful embedded sensors, such as accelerometer, gyroscope, microphone, and camera. This unique combination has enabled revolutionary applications based on the mobile crowdsensing paradigm, such as real-time road traffic monitoring, air and noise pollution monitoring, crime control, and wildlife monitoring, to name just a few. Unlike in prior sensing paradigms, humans are now the primary actors of the sensing process, since they are fundamental to retrieving reliable and up-to-date information about the event being monitored. As humans may behave unreliably or maliciously, assessing and guaranteeing Quality of Information (QoI) becomes more important than ever. In this paper, we provide a new framework for defining and enforcing the QoI in mobile crowdsensing, and analyze in depth the current state-of-the-art on the topic. We also outline novel research challenges, along with possible directions for future work. Comment: To appear in ACM Transactions on Sensor Networks (TOSN).