12 research outputs found

    A Risk Based Approach to Node Insertion within Social Networks

    Social Network Analysis (SNA) is a primary tool for counter-terrorism operations, ranging from resiliency and influence to interdiction on threats stemming from illicit overt and clandestine network operations. In an ideal world, SNA would provide a perfect course of action to eliminate dangerous situations that terrorist organizations bring. Unfortunately, the covert nature of terrorist networks makes the effects of these techniques unknown and possibly detrimental. To avoid potentially harmful changes to enemy networks, tactical involvement must evolve, beginning with the intelligent use of network infiltration through the application of the node insertion problem. The framework for the node insertion problem includes a risk-benefit model to assess the utility of various node insertion scenarios. This model incorporates local, intermediate and global SNA measures, such as Laplacian centrality and assortative mixing, to account for the benefit and risk. Application of the model to the Zachary Karate Club produces a set of recommended insertion scenarios. A designed experiment validates the robustness of the methodology against network structure and characteristics. Ultimately, the research provides an SNA method to identify optimal and near-optimal node insertion strategies and extend past node utility models into a general form with the inclusion of benefit, risk, and bias functions.
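
    The abstract names Laplacian centrality as one of the SNA measures in the risk-benefit model. As a rough illustration only, the sketch below uses one common definition for unweighted graphs: a node's centrality is the drop in Laplacian energy (sum of squared degrees plus twice the edge count) when that node is removed. The toy edge list and all function names are assumptions made for this example; how the paper combines the measure with its benefit, risk, and bias functions is not shown here.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def laplacian_energy(adj):
    """Laplacian energy of an unweighted graph: sum of squared degrees + 2 * edge count."""
    degree_sq = sum(len(nbrs) ** 2 for nbrs in adj.values())
    edge_count = sum(len(nbrs) for nbrs in adj.values()) // 2
    return degree_sq + 2 * edge_count


def remove_node(adj, node):
    """Return a copy of the graph with `node` and its incident edges deleted."""
    return {u: nbrs - {node} for u, nbrs in adj.items() if u != node}


def laplacian_centrality(adj):
    """Centrality of each node = energy of the full graph minus energy without the node."""
    total = laplacian_energy(adj)
    return {v: total - laplacian_energy(remove_node(adj, v)) for v in adj}


# Hypothetical toy network (not the Zachary Karate Club data).
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
scores = laplacian_centrality(build_graph(toy_edges))
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, score)
```

    In this toy graph the hub node "a" receives the highest score, which matches the intuition that removing it disrupts the most structure.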

    Using Conformal Win Probability to Predict the Winners of the Cancelled 2020 NCAA Basketball Tournaments

    The COVID-19 pandemic was responsible for the cancellation of both the men's and women's 2020 National Collegiate Athletic Association (NCAA) Division 1 basketball tournaments. Starting from the point at which the Division 1 tournaments and any unfinished conference tournaments were cancelled, we deliver closed-form probabilities for each team of making the Division 1 tournaments, had they not been cancelled, aided by use of conformal predictive distributions. We also deliver probabilities of a team winning March Madness, given a tournament bracket. We then compare single-game win probabilities generated with conformal predictive distributions, aptly named conformal win probabilities, to those generated through linear and logistic regression on seven years of historical college basketball data, specifically from the 2014-2015 season through the 2020-2021 season. Conformal win probabilities are shown to be better calibrated than other methods, resulting in more accurate win probability estimates, while requiring fewer distributional assumptions. Comment: preprint submitted to Journal of Quantitative Analysis in Sports, 28 pages without figures; figures included at end of document.
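
    To make the idea of a win probability read off a conformal predictive distribution more concrete, here is a minimal sketch under simplifying assumptions. It assumes a point-margin model has already been fitted and that residuals (actual margin minus predicted margin) are available from held-out calibration games. The function name, the fixed tie-breaking value `tau=0.5`, and the residual values are all hypothetical; the paper's actual construction may differ.

```python
def conformal_win_probability(predicted_margin, calibration_residuals, tau=0.5):
    """Estimate P(final margin > 0) from a split-conformal style predictive distribution.

    The predictive CDF at zero is approximated by the share of calibration
    residuals that would leave the final margin at or below zero, smoothed by
    `tau` and normalized by n + 1 (an assumption of this sketch).
    """
    n = len(calibration_residuals)
    at_or_below_zero = sum(
        1 for r in calibration_residuals if predicted_margin + r <= 0
    )
    cdf_at_zero = (at_or_below_zero + tau) / (n + 1)
    return 1.0 - cdf_at_zero


# Made-up calibration residuals, in points.
residuals = [-12.0, -7.0, -3.0, -1.0, 0.5, 2.0, 4.0, 6.0, 9.0]

p_favorite = conformal_win_probability(5.0, residuals)
p_big_favorite = conformal_win_probability(10.0, residuals)
print(round(p_favorite, 2), round(p_big_favorite, 2))
```

    Because the estimate depends only on the empirical residuals, no normality assumption on the margins is needed, which is in the spirit of the abstract's remark about fewer distributional assumptions.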

    Anomaly Detection in the Molecular Structure of Gallium Arsenide Using Convolutional Neural Networks

    This paper concerns the development of a machine learning tool to detect anomalies in the molecular structure of Gallium Arsenide. We employ a combination of a CNN and a PCA reconstruction to create the model, using real images taken with an electron microscope in training and testing. The methodology developed allows for the creation of a defect detection model, without any labeled images of defects being required for training. The model performed well on all tests under the established assumptions, allowing for reliable anomaly detection. To the best of our knowledge, such methods are not currently available in the open literature; thus, this work fills a gap in current capabilities.
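
    The PCA reconstruction step mentioned above can be illustrated with a generic reconstruction-error detector. This is only a sketch of the general technique, not the paper's model: synthetic vectors stand in for CNN features of defect-free images, and the 99th-percentile threshold, the dimensions, and all names are assumptions chosen for the example. It requires NumPy.

```python
import numpy as np


def fit_pca(normal_features, n_components):
    """Fit PCA on defect-free feature vectors; return the mean and top components."""
    mean = normal_features.mean(axis=0)
    _, _, vt = np.linalg.svd(normal_features - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(features, mean, components):
    """Distance between each vector and its projection onto the PCA subspace."""
    centered = features - mean
    reconstructed = centered @ components.T @ components
    return np.linalg.norm(centered - reconstructed, axis=1)


rng = np.random.default_rng(0)

# Synthetic stand-in for CNN features of normal images: points near a 2-D subspace of R^8.
basis = rng.normal(size=(2, 8))
normal = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 8))

mean, components = fit_pca(normal, n_components=2)
normal_errors = reconstruction_error(normal, mean, components)

# Assumed rule: flag anything above the 99th percentile of errors on normal data.
threshold = np.quantile(normal_errors, 0.99)

# A vector pushed off the learned subspace plays the role of a defect.
candidate = normal[:1] + rng.normal(size=(1, 8))
candidate_error = reconstruction_error(candidate, mean, components)[0]
is_anomaly = bool(candidate_error > threshold)
print(is_anomaly)
```

    Only defect-free examples are used to fit the subspace and set the threshold, which mirrors the abstract's point that no labeled defect images are needed for training.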

    Shape-restricted random forests and semiparametric prediction intervals

    This dissertation is made up of three projects, all of which focus on prediction or uncertainty quantification with parametric and nonparametric methods. Chapter 2 introduces novel approaches to generating semiparametric prediction intervals for linear models. We compare these new methods to other prediction interval methods with simulated and real-world data. We show our method is competitive with other methods in most cases, and better in a subset of cases. We provide multiple theorems related to the marginal coverage of the new methods. We also use these methods to provide estimation for event outcomes in the sports realm. The results show the effectiveness of our methods in providing asymptotically valid, semiparametric prediction intervals. Chapter 3 introduces a new R package that implements multiple state-of-the-art methodologies to generate prediction intervals for random forests. We compare these methods via simulation. We also apply a subset of these methods to a drug-discovery data analysis problem. Chapter 4 introduces multiple monotone restriction methods for random forest predictions. We compare our methods to other tree-based monotone restriction methods, showing that our method stays competitive, while guaranteeing partially monotone predictions. We also extend our monotone restriction methods to generate monotone restricted prediction intervals for random forests.

    Toward Automated Instructor Pilots in Legacy Air Force Systems: Physiology-based Flight Difficulty Classification via Machine Learning

    The United States Air Force (USAF) is struggling to train enough pilots to meet operational requirements. Technology has advanced rapidly over the last 70 years but USAF pilot training has not. Modern operational requirements demand a change and, for this reason, USAF senior leadership has advocated for innovation. The automation of instructor and evaluator pilots in select bottlenecks (e.g., simulators) is one such measure. However, to implement this vision, numerous technical issues must be mitigated. Accurate classification of flight difficulty is a foundational problem underpinning many of these technical issues, which requires either the acquisition of new systems or the development of new procedures. Therefore, given this need and the costly nature of purchasing new equipment, physiological-based classification of flight difficulty is our focus herein. Leveraging multimodal data from a designed experiment of pilots landing a simulated aircraft, we develop a high-quality machine learning pipeline for classifying flight difficulty, called the Multi-Modal Functional-based Decision Support System (MMF-DSS). MMF-DSS distills a tabular set of features from our multimodal and functional data through the use of functional principal component analysis, summary statistics, and BorutaSHAP. In this manner, information is derived from the time-series data via the generation of hundreds of features, of which a small subset having the most predictive capability is discerned. Four full factorial designs are used to perform hyperparameter tuning on a set of classifiers. In so doing, a superlative technique is identified. Impacts on executive decision making are examined as well as associated policymaking implications. Alternative classifiers are considered for use within our pipeline that trade predictive accuracy for cost efficiency, and recommendations for choosing among these alternatives are provided.