
    Guided data augmentation for improved semi-supervised image classification in low data regime.

    Deep learning models have achieved state-of-the-art performance, especially in computer vision applications. Much of this recent success can be attributed to the existence of large, high-quality, labeled datasets. However, in many real-world applications, collecting similar datasets is often cumbersome and time-consuming. For instance, developing robust automatic target recognition models from infrared images still faces major challenges, mainly due to the difficulty of acquiring high-resolution inputs and to sensitivity to thermal sensor calibration, meteorological conditions, and variations in target scale and viewpoint. Ideally, a good training set should contain enough variation within each class for the model to learn optimal decision boundaries. However, when there are under-represented regions in the training feature space, especially in low data regimes or in the presence of low-quality inputs, the model risks learning sub-optimal decision boundaries, resulting in sub-optimal predictions. This dissertation presents novel data augmentation (DA) strategies aimed at improving the performance of machine learning models in low data regimes. The proposed techniques are designed to augment limited labeled datasets, providing the models with additional information to learn from.

    The first contribution of this work is Confidence-Guided Generative Augmentation (CGG-DA), a technique that trains a generative model, such as a Variational Autoencoder (VAE) or a Deep Convolutional Generative Adversarial Network (DCGAN), to generate synthetic augmentations. These generative models can produce labeled and/or unlabeled data drawn from the same distribution as the under-performing samples, identified with respect to a baseline reference model. By augmenting the training dataset with these synthetic images, CGG-DA aims to bridge the performance gap across different regions of the training feature space. We also introduce Tool-Supported Contextual Augmentation (TSC-DA), a technique that leverages existing ML models, such as classifiers or object detectors, to label available unlabeled data. Samples with consistent, high-confidence predictions are used as labeled augmentations. Samples with low-confidence predictions, on the other hand, may still carry useful information even though they are more likely to be noisy and inconsistent; hence, we keep them and use them as unlabeled samples during training. Our third proposed DA technique explores the use of existing ML tools and external image repositories for data augmentation. This approach, called Guided External Data Augmentation (EG-DA), leverages external image repositories to augment the available dataset. External repositories are typically noisy and may include many out-of-distribution (OOD) samples. If included in the training process without proper handling, OOD samples can confuse the model and degrade its performance. To tackle this issue, we design and train a VAE-based anomaly detection component and use it to filter out OOD samples. Since our DA output includes both labeled data and a larger set of unlabeled data, we use semi-supervised training to exploit the information contained in the generated augmentations. This can guide the network to learn complex representations and generalize to new data.

    The proposed data augmentation techniques are evaluated on two computer vision applications and under multiple scenarios. We also compare our approach, using benchmark datasets, to baseline models trained on the initial labeled data only and to existing data augmentation techniques. We show that each proposed augmentation consistently improves the results, and we perform an in-depth analysis to justify the observed improvements.
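    The confidence-based split at the heart of TSC-DA and the reconstruction-error filter behind EG-DA can be sketched as follows. This is a minimal illustration only; the classifier, the VAE interface, the threshold values, and the helper names are assumptions made for the sketch, not the dissertation's actual implementation.

```python
import torch
import torch.nn.functional as F

# Illustrative cutoffs; real values would be tuned on validation data.
CONF_THRESHOLD = 0.95   # high-confidence cutoff for pseudo-labels (assumed)
RECON_THRESHOLD = 0.05  # VAE reconstruction-error cutoff for OOD filtering (assumed)

def split_by_confidence(classifier, unlabeled_batch):
    """TSC-DA-style split: confident predictions become labeled augmentations,
    the rest are kept as unlabeled samples for semi-supervised training."""
    with torch.no_grad():
        probs = F.softmax(classifier(unlabeled_batch), dim=1)
    conf, pseudo_labels = probs.max(dim=1)
    confident = conf >= CONF_THRESHOLD
    labeled_aug = (unlabeled_batch[confident], pseudo_labels[confident])
    unlabeled_aug = unlabeled_batch[~confident]
    return labeled_aug, unlabeled_aug

def filter_ood(vae, external_batch):
    """EG-DA-style filter: discard external images whose VAE reconstruction
    error suggests they are out-of-distribution."""
    with torch.no_grad():
        recon, _, _ = vae(external_batch)  # assumes VAE returns (recon, mu, logvar)
        err = F.mse_loss(recon, external_batch, reduction="none")
        err = err.flatten(1).mean(dim=1)   # per-sample mean reconstruction error
    return external_batch[err <= RECON_THRESHOLD]
```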

    Virtual environmental applications for buried waste characterization technology evaluation report


    Model-Informed Drug Development: In Silico Assessment of Drug Bioperformance following Oral and Percutaneous Administration

    The pharmaceutical industry has faced significant changes in recent years, driven primarily by regulatory standards, market competition, and the need to accelerate drug development. Model-informed drug development (MIDD) leverages quantitative computational models to facilitate decision-making processes. This approach sheds light on the complex interplay between a drug’s performance and the resulting clinical outcomes. This comprehensive review aims to explain the mechanisms that control the dissolution and/or release of drugs and their subsequent permeation through biological membranes, and it emphasizes the importance of simulating these processes with a variety of in silico models. Advanced compartmental absorption models provide an analytical framework for understanding the kinetics of transit, dissolution, and absorption of orally administered drugs. In contrast, for topical and transdermal drug delivery systems, the prediction of drug permeation is predominantly based on quantitative structure–permeation relationships and molecular dynamics simulations. This review describes a variety of modeling strategies, ranging from mechanistic to empirical equations, and highlights the growing importance of state-of-the-art tools such as artificial intelligence, as well as advanced imaging and spectroscopic techniques.
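    As a concrete illustration of the compartmental absorption idea, a minimal one-compartment model with first-order dissolution, absorption, and elimination can be integrated as coupled ODEs. The rate constants below are placeholders for the sketch, not values from the review.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative first-order rate constants (assumed, in 1/h).
k_d = 1.2  # dissolution of the solid dose in the gut
k_a = 0.8  # absorption from gut lumen into plasma
k_e = 0.2  # elimination from the plasma compartment

def rhs(t, y):
    """y = [solid drug, dissolved drug in gut, drug in plasma]."""
    solid, dissolved, plasma = y
    return [-k_d * solid,
            k_d * solid - k_a * dissolved,
            k_a * dissolved - k_e * plasma]

# Integrate a 100 mg dose over 24 h and sample the plasma compartment.
sol = solve_ivp(rhs, (0.0, 24.0), [100.0, 0.0, 0.0], dense_output=True)
t = np.linspace(0.0, 24.0, 5)
print(np.round(sol.sol(t)[2], 2))  # plasma amounts at sampled times
```

    For the transdermal route, a classic example of the quantitative structure–permeation relationships mentioned above is the Potts–Guy equation, approximately log kp (cm/h) ≈ −2.74 + 0.71 log P − 0.0061·MW, which predicts skin permeability from lipophilicity and molecular weight.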

    Novel micron- and nano-scale energetic materials for advanced gun propulsion, their material properties, and their effects on ballistic performance

    This dissertation focused on the investigation of novel materials, both energetic and inert, in their micron- and nano-scale crystalline forms. Their material properties, and their effects on ballistic performance when incorporated into a composite gun propellant, were evaluated for application in a future US Army weapon system. Some of these materials may find dual use in civilian applications. Small and medium arms, artillery, tank, aircraft, and shipboard gun systems would all benefit from these advancements. Not only would gun system performance be improved for greater stand-off range and accuracy, but such systems would also perform consistently across a broad temperature range. In addition, improved performance and longer gun barrel life, achievable by tailoring the combustion products, lowering the propellant flame temperature, and minimizing the sensitivity of burning velocity to pressure, temperature, and gas velocity (erosive burning), together with munitions that are insensitive to outside stimulus attack, would give such systems a significant advantage during military use. Green chemistry and lower lifecycle cost were also taken into consideration during this research. The approach taken was to incorporate these novel materials into a gun propellant formulation using nitramine-based micron-scale cyclotrimethylene trinitramine (RDX) explosives in combination with synthesized novel ingredients in nano-scale crystalline form, to characterize the material properties, and to predict the ballistic performance across the ballistic temperature range. The nano-scale crystalline materials evaluated consisted of polymeric nitrogen stabilized in single-wall carbon nanotubes (SWNTs), nitrogenated boron nanotubes/nanofibers (BNNTs/BNNFs), nano-aluminum, and titanium dioxide. The polymeric nitrogen and the nitrogenated boron nanotubes/nanofibers should enhance the propellant burn rate by achieving the burn-rate differential goal of 3:1 between the fast- and slow-burning propellants, while also improving gun propellant performance by lowering the CO/CO2 ratio and raising the N2/CO ratio, thereby mitigating gun bore wear and erosion. For the synthesis of polymeric nitrogen stabilized in carbon nanotubes, the following methods were performed, optimized, and compared: electrochemical reactions, microwave-induced electrochemical reactions, and plasma-enhanced chemical vapor deposition (PE-CVD). The electrochemical reaction process proved to be the most efficient synthesis approach for the polymeric nitrogen, based on analytical results obtained through Raman spectroscopy, laser ablation mass spectrometry, scanning electron microscopy, Fourier transform infrared-attenuated total reflectance (FTIR-ATR), and differential scanning calorimetry/thermogravimetric analysis (DSC/TGA). PE-CVD is the second recommended synthesis approach, although a cost-benefit economic analysis, which is beyond the objectives of this research work, would have to be performed. For the synthesis of the nitrogenated boron nanotubes, using magnesium borohydride to initiate the reaction proved to be the most optimized process, owing to a much lower reaction temperature of approximately 500 °C, compared with the 950 °C required when using magnesium boride (MgB2) in the thermally induced CVD process.
The small-scale synthesis of boron nanotubes/nanofibers using MgB2 powder, nickel boride (Ni2B) powder catalysts, and mesostructured hexagonal-framework zeolite powder was successfully achieved at 950 °C. The quality of the nanotubes produced was checked by Raman spectroscopy and transmission electron microscopy (TEM). The TEM data show the production of 10-20 nm boron nanotubes using MgB2, Ni2B, and Mobil Composition of Matter (MCM-41) in the synthesis process.
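    The sensitivity of burning velocity to pressure discussed above is conventionally described by Vieille's law, r = a·P^n, where a small pressure exponent n indicates low sensitivity. A toy comparison of a fast- and a slow-burning propellant, with coefficients invented purely for illustration and not taken from this work, might look like this:

```python
# Vieille's burn-rate law: r = a * P**n.
# The coefficients below are invented for illustration only.
def burn_rate(pressure_mpa: float, a: float, n: float) -> float:
    """Linear burn rate as a function of chamber pressure."""
    return a * pressure_mpa ** n

fast = burn_rate(200.0, a=0.9, n=0.8)  # hypothetical fast propellant
slow = burn_rate(200.0, a=0.3, n=0.8)  # hypothetical slow propellant
print(f"burn-rate ratio at 200 MPa: {fast / slow:.1f}:1")  # matches the 3:1 goal
```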

    Convolutional Neural Networks - Generalizability and Interpretations


    Archangel: A Hybrid UAV-based Human Detection Benchmark with Position and Pose Metadata

    Learning to detect objects, such as humans, in imagery captured by an unmanned aerial vehicle (UAV) usually suffers from tremendous variations caused by the UAV's position relative to the objects. In addition, existing UAV-based benchmark datasets do not provide adequate metadata, which is essential for precise model diagnosis and for learning features invariant to those variations. In this paper, we introduce Archangel, the first UAV-based object detection dataset composed of real and synthetic subsets captured under similar imaging conditions and accompanied by UAV position and object pose metadata. A series of experiments are carefully designed with a state-of-the-art object detector to demonstrate the benefits of leveraging the metadata during model evaluation. Moreover, several crucial insights involving both real and synthetic data during model optimization are presented. Finally, we discuss the advantages, limitations, and future directions of Archangel to highlight its distinct value for the broader machine learning community.
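    The kind of metadata-driven diagnosis Archangel is designed to support can be sketched as slicing detection scores by UAV position before aggregating. The record schema, field names, and numbers below are hypothetical, not the dataset's actual format:

```python
from collections import defaultdict

# Hypothetical per-image evaluation records; Archangel's real schema differs.
results = [
    {"altitude_m": 15, "pitch_deg": 30, "ap": 0.72},
    {"altitude_m": 15, "pitch_deg": 60, "ap": 0.55},
    {"altitude_m": 40, "pitch_deg": 30, "ap": 0.48},
    {"altitude_m": 40, "pitch_deg": 60, "ap": 0.31},
]

# Slice average precision by UAV altitude to expose position-driven variation.
by_altitude = defaultdict(list)
for r in results:
    by_altitude[r["altitude_m"]].append(r["ap"])

for alt, aps in sorted(by_altitude.items()):
    print(f"altitude {alt:>3} m: mean AP = {sum(aps) / len(aps):.2f}")
```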

    Conceptual Framework and Methodology for Analysing Previous Molecular Docking Results

    Modern drug discovery relies on in-silico computational simulations such as molecular docking. Molecular docking models biochemical interactions to predict where and how two molecules would bind. The results of large-scale molecular docking simulations can provide valuable insight into the relationship between two molecules. This is useful to a biomedical scientist before conducting in-vitro or in-vivo wet-lab experiments. Although this field has seen great advancements, feedback from biomedical scientists shows that there is a need for storage and further analysis of molecular docking results. To meet this need, biomedical scientists need access to computing, data, and network resources, and require specific knowledge or skills they might lack. Therefore, a conceptual framework specifically tailored to enable biomedical scientists to reuse molecular docking results, and a methodology which uses regular input from scientists, have been proposed. The framework is composed of 5 types of elements and 13 interfaces. The methodology is light and relies on frequent communication between biomedical science and computer science experts, specified by particular roles. It shows how developers can benefit from using the framework, which allows them to determine whether a scenario fits the framework, whether an already implemented element can be reused, or whether a newly proposed tool can be used as an element. Three scenarios that show the versatility of this new framework and the methodology based on it have been identified and implemented. A methodical planning and design approach was used, and it was shown that the implementations are at least as usable as existing solutions. To eliminate the need for access to expensive computing infrastructure, state-of-the-art cloud computing techniques are used. The implementations enable faster identification of new molecules for use in docking, direct querying of existing databases, and simpler learning of good molecular docking practice without the need to manually run multiple tools. Thus, the framework and methodology enable more user-friendly implementations and less error-prone use of computational methods in drug discovery. Their use could lead to more effective discovery of new drugs.
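    The "direct querying of existing databases" capability could, under such a framework, reduce to something like the following sketch; the table name, columns, and energy threshold are hypothetical, not the framework's actual schema:

```python
import sqlite3

# Minimal sketch of querying stored docking results.
# The schema (table "docking_results" and its columns) is assumed.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE docking_results (
    ligand TEXT, receptor TEXT, binding_energy REAL)""")
conn.executemany(
    "INSERT INTO docking_results VALUES (?, ?, ?)",
    [("lig_A", "rec_1", -9.4), ("lig_B", "rec_1", -6.1),
     ("lig_C", "rec_1", -10.2)],
)

# Retrieve strong binders (more negative energy = stronger predicted binding).
rows = conn.execute(
    "SELECT ligand, binding_energy FROM docking_results "
    "WHERE receptor = ? AND binding_energy <= ? ORDER BY binding_energy",
    ("rec_1", -9.0),
).fetchall()
print(rows)  # candidate ligands worth prioritizing for wet-lab follow-up
```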