
    Thyroid nodule ultrasound image analysis and feature extraction

    In this study, I introduce a novel workflow for extracting useful features from thyroid ultrasound images using deep learning and machine learning methods. The methodology combines a Convolutional Auto-Encoder, Local Binary Patterns, Histograms of Oriented Gradients, and professional image characterization to extract useful information from medical images. Multiple machine learning classifiers are then used to build an effective thyroid tumor diagnosis model from the extracted features. The experimental results show that a Support Vector Machine, with a specifically designed preprocessing scheme and a customized objective function, outperforms human experts on the test set. The final model can effectively reduce both the number of unnecessary biopsies and the number of missed malignancies.
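One of the hand-crafted descriptors named in the abstract, Local Binary Patterns, can be sketched in a few lines. The sketch below is a minimal 8-neighbor LBP histogram in plain NumPy (libraries such as scikit-image provide a production version); the synthetic image and the 256-bin layout are illustrative assumptions, and the auto-encoder and HOG components of the workflow are not shown.

```python
import numpy as np

def lbp_features(img, bins=256):
    """Basic 8-neighbor Local Binary Pattern histogram.

    For every interior pixel, each of its 8 neighbors contributes one
    bit (1 if neighbor >= center), yielding an 8-bit pattern code.
    The feature vector is the normalized histogram of those codes.
    """
    img = np.asarray(img, dtype=float)
    center = img[1:-1, 1:-1]
    # Neighbor offsets in clockwise order starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:img.shape[0] - 1 + dy,
                       1 + dx:img.shape[1] - 1 + dx]
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist, _ = np.histogram(codes, bins=bins, range=(0, bins))
    return hist / hist.sum()

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64))  # stand-in for an ultrasound patch
features = lbp_features(image)               # 256-dim texture descriptor
```

Descriptors like this one would be concatenated with the learned and HOG features before being fed to the classifiers.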

    Automatic Machine Learning in Optimization and Data Augmentation

    This dissertation introduces Automatic Machine Learning (AutoML) as a potential approach to overcoming current deep learning challenges in efficiency and cost. It also proposes two novel AutoML workflows for areas of deep learning where AutoML is less recognized by the AI community: optimization and data augmentation. The proposed AutoML workflow in optimization can automatically adjust the learning rate for deep learning tasks. It monitors the signals generated during optimization and dynamically changes the learning rate based on the signals observed. The workflow has been successfully deployed in image classification, instance detection, and language modeling tasks, delivering better performance and faster convergence than widely used static and learning-based schedulers under a variety of settings. The new AutoML workflow in data augmentation helps deep neural networks achieve better generalization performance through automated optimization of data augmentation policies. Compared with the prior best method, the workflow halves the computation required while achieving equivalent or better results on the same benchmarks. It also removes the need for human intervention, making the workflow truly automated for deep learning applications. Finally, the dissertation concludes that AutoML can play a significant role in various aspects of deep learning through further efficiency improvement and cost reduction. The hope is to inspire more AutoML research across all areas of deep learning, so that AutoML can eventually facilitate fully automated learning, an important milestone in our long pursuit of Artificial Intelligence.
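The monitor-and-adjust loop described for the optimization workflow can be caricatured in a few lines. The heuristic below (compare a recent smoothed loss against an older one, raise the learning rate while it improves, cut it on a plateau) is an illustrative assumption for this sketch, not the dissertation's actual controller or policy.

```python
class SignalDrivenLR:
    """Toy learning-rate controller: watch a training signal (here, the
    loss) and adjust the learning rate based on what is observed."""

    def __init__(self, lr=0.1, window=5, up=1.1, down=0.5):
        self.lr, self.window = lr, window
        self.up, self.down = up, down
        self.history = []

    def step(self, loss):
        self.history.append(loss)
        if len(self.history) < 2 * self.window:
            return self.lr                     # not enough signal yet
        recent = sum(self.history[-self.window:]) / self.window
        older = sum(self.history[-2 * self.window:-self.window]) / self.window
        if recent < older:                     # still improving: nudge up
            self.lr *= self.up
        else:                                  # plateau/divergence: cut
            self.lr *= self.down
        return self.lr

ctrl = SignalDrivenLR(lr=0.1)
# Feed a loss curve that improves and then plateaus.
losses = [1.0, 0.9, 0.8, 0.7, 0.6] + [0.55] * 10
lrs = [ctrl.step(l) for l in losses]
```

A real deployment would hook a controller like this into the training loop as a per-epoch callback.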

    Data Mining Applications in Reservoir Modeling

    Data science has gained great attention in many fields over the last decade. In this thesis, I further explore the use of data science techniques in the oil industry and develop three data mining applications useful for reservoir modeling and exploratory data analysis. A detailed illustration of data mining algorithms such as Support Vector Machines (SVM), Probabilistic Neural Networks (PNN), and ensemble learning is incorporated in the thesis. The performance of the proposed workflows is tested on real field data, including the Barnett Shale play and the Mississippi Limestone. The first two applications are for the Barnett Shale play. In the first, I used Support Vector Machines to predict lithotypes derived from core data; the algorithm takes a set of well log curves as input and outputs a lithotype. The test results showed 76% accuracy on the blind test well, indicating that lithotypes can be identified in uncored wells with high accuracy. In the second, I proposed a workflow that uses ensemble learning and a Probabilistic Neural Network to predict Total Organic Content (TOC) from a different set of well log curves. The blind test results showed that the predicted TOC zones closely match the core-based TOC measurements from the lab. The last application is for the Mississippi Lime in the north-central Anadarko shelf of Oklahoma. I introduced a new porosity modeling workflow that combines Sequential Gaussian Simulation and Support Vector Machines. The results showed that the proposed workflow makes better use of exploratory data and produces a more accurate estimate of porosity in the reservoir model.
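The lithotype application is a standard supervised-classification setup: well log curves in, lithotype label out. A minimal sketch with scikit-learn is below; the synthetic "logs" and three-class labels are stand-ins for the field data (real curve names, units, and the thesis's preprocessing are not reproduced).

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-ins for well log curves (e.g. gamma ray, resistivity,
# density) and core-derived lithotype labels.
rng = np.random.default_rng(42)
n = 300
lith = rng.integers(0, 3, size=n)               # three lithotype classes
logs = rng.normal(size=(n, 3)) + lith[:, None] * 1.5

# Scale the curves, then fit an RBF-kernel SVM on the "cored" wells.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(logs[:200], lith[:200])
acc = model.score(logs[200:], lith[200:])        # blind-test accuracy
```

The TOC workflow follows the same pattern with regression targets and an ensemble/PNN in place of the SVM.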

    Incorporating the 10th Edition Institute of Traffic Engineers (ITE) Trip Generation Rates Into Virginia Department of Transportation Guidelines

    The Institute of Transportation Engineers (ITE) released the Trip Generation (TG) 10th edition in 2017, significantly updating its database; some of its trip generation rates are substantially lower than those of earlier editions. This study aims to investigate the applicability of the TG 10th edition in various Virginia contexts and to recommend how to incorporate it into state guidelines. The research team surveyed 31 state transportation agencies to obtain a clear understanding of current practices in the adoption of trip rates and trip estimation approaches. We systematically compared trip rates of the TG 9th and 10th editions using hypothesis tests and identified land uses with significant rate reductions. Trip generation data were collected from 37 sites in Virginia during weekday PM peaks for mixed-use sites and for single-use sites with significantly reduced 10th edition rates (multi-family low-rise and general office). To investigate the use of trip rates in different settings, general offices in both general urban/suburban and dense multi-use urban contexts were considered. For mixed-use developments, we explored combinations of four internal trip capture models and the TG rates of the 9th and 10th editions to identify the best trip estimation approach. Given that all trip data were collected after the outbreak of the COVID-19 pandemic, StreetLight data were used to adjust trip counts for the impacts of COVID. This study recommends that VDOT's Office of Land Use provide guidance to VDOT districts to accept traffic impact analysis reports that use ITE's 10th edition Trip Generation and the 3rd edition of the Trip Generation Handbook, including the Handbook's methodology for estimating internal capture at mixed-use developments.
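The edition-to-edition comparison is, at its core, a two-sample hypothesis test per land use. The sketch below shows one such test with SciPy; the rate values are synthetic placeholders, not ITE data, and the study's actual test choices (one- vs two-sided, pairing, corrections) may differ.

```python
import numpy as np
from scipy.stats import ttest_ind

# Illustrative weekday PM peak trip rates (trips per unit) for one
# land use under the 9th vs 10th editions -- synthetic numbers only.
rng = np.random.default_rng(1)
rates_9th = rng.normal(loc=0.62, scale=0.05, size=25)
rates_10th = rng.normal(loc=0.51, scale=0.05, size=25)

# Two-sample t-test: is the 10th-edition rate significantly lower?
t_stat, p_val = ttest_ind(rates_9th, rates_10th)
significant_reduction = (p_val < 0.05) and (rates_10th.mean() < rates_9th.mean())
```

Land uses flagged this way (multi-family low-rise and general office in the study) are the ones targeted for field data collection.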

    Automating Intersection Marking Data Collection and Condition Assessment at Scale With An Artificial Intelligence-Powered System

    Intersection markings play a vital role in providing road users with guidance and information. Their condition gradually degrades due to vehicular traffic, rain, and snowplowing, and degraded markings can confuse drivers, increasing the risk of traffic crashes. Obtaining high-quality information about intersection markings in a timely manner lays a foundation for informed decisions in safety management and maintenance prioritization. However, current labor-intensive, high-cost data collection practices make it very challenging to gather intersection data at a large scale. This paper develops an automated system that intelligently detects intersection markings and assesses their degradation conditions using existing roadway Geographic Information System (GIS) data and aerial images. The system harnesses emerging artificial intelligence (AI) techniques such as deep learning and multi-task learning to enhance its robustness, accuracy, and computational efficiency. AI models were developed to detect lane-use arrows (85% mean average precision) and crosswalks (89% mean average precision) and to assess the degradation conditions of markings (91% overall accuracy for lane-use arrows and 83% for crosswalks). The data acquisition and computer vision modules were integrated, and a graphical user interface (GUI) was built for the system. The proposed system can fully automate marking data collection and condition assessment at a large scale with almost zero cost and short processing time. It has great potential to propel urban science forward by providing fundamental urban infrastructure data for analysis and decision-making in critical areas such as data-driven safety management and prioritization of infrastructure maintenance.
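The mean-average-precision figures quoted above rest on a standard matching rule: a detected marking counts as a true positive when its box overlaps a ground-truth box with sufficient intersection-over-union. A minimal sketch of that rule (the boxes here are made up; the paper's exact IoU threshold is an assumption):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A detection is a true positive if IoU >= 0.5, the threshold commonly
# behind mAP figures like those reported for arrows and crosswalks.
detected = (10, 10, 50, 50)   # predicted crosswalk box in image pixels
truth = (12, 8, 52, 48)       # annotated ground-truth box
hit = iou(detected, truth) >= 0.5
```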

    Adversarial Focal Loss: Asking Your Discriminator for Hard Examples

    Focal Loss has reached incredible popularity because it uses a simple technique to identify and exploit hard examples for better classification performance. However, the method does not easily generalize beyond classification tasks, such as to keypoint detection. In this paper, we propose a novel adaptation of Focal Loss for keypoint detection tasks, called Adversarial Focal Loss (AFL). AFL is not only semantically analogous to Focal Loss but also works as a plug-and-chug upgrade for arbitrary loss functions. While Focal Loss requires output from a classifier, AFL leverages a separate adversarial network to produce a difficulty score for each input. This difficulty score can then be used to dynamically prioritize learning on hard examples, even in the absence of a classifier. In this work, we show AFL's effectiveness in enhancing existing methods in keypoint detection and verify its capability to re-weight examples based on difficulty.
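The core mechanic, re-weighting per-example losses by a difficulty score, can be sketched independently of any network. In AFL the score comes from a separate adversarial network; in the sketch below it is simply an input array, and the particular weighting form (normalized scores) is an illustrative assumption, not the paper's exact loss.

```python
import numpy as np

def difficulty_weighted_loss(losses, difficulty):
    """Re-weight per-example losses by difficulty scores in [0, 1].

    Hard examples (high difficulty) receive proportionally more weight,
    mimicking how Focal Loss emphasizes hard examples -- but without
    needing classifier output.
    """
    losses = np.asarray(losses, dtype=float)
    difficulty = np.asarray(difficulty, dtype=float)
    weights = difficulty / difficulty.sum()      # emphasize hard examples
    # Rescale so uniform difficulty recovers the plain summed loss.
    return float((weights * losses).sum() * len(losses))

# One hard example (loss 2.0, difficulty 0.9) among three easy ones.
weighted = difficulty_weighted_loss([2.0, 0.5, 0.5, 0.5], [0.9, 0.1, 0.1, 0.1])
uniform = difficulty_weighted_loss([2.0, 0.5, 0.5, 0.5], [1, 1, 1, 1])
```

With skewed difficulty the hard example dominates the gradient signal, which is exactly the behavior the abstract describes.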

    Automating Augmentation Through Random Unidimensional Search

    It is no secret among deep learning researchers that finding the right data augmentation strategy during training can mean the difference between a state-of-the-art result and a run-of-the-mill ranking. To that end, the community has seen many efforts to automate the process of finding the perfect augmentation procedure for any task at hand. Unfortunately, even recent cutting-edge methods bring massive computational overhead, requiring as many as 100 full model trainings to settle on an ideal configuration. We show how to achieve even better performance in just 7 trainings, with Random Unidimensional Augmentation. Source code is available at https://github.com/fastestimator/RU
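One way to read "unidimensional" search is to vary a single randomly chosen augmentation hyperparameter per trial rather than the full joint space. The sketch below implements that reading with a toy objective in place of a model training; both the reading and the proxy objective are assumptions for illustration, not the paper's exact procedure.

```python
import random

def random_unidimensional_search(evaluate, n_dims, trials=7, seed=0):
    """Toy search: start from default augmentation strengths, pick one
    random dimension per trial, try a random strength for it, and keep
    the best configuration seen."""
    rng = random.Random(seed)
    best_cfg = [0.5] * n_dims          # default strengths for each op
    best_score = evaluate(best_cfg)
    for _ in range(trials):            # each trial = one model training
        cfg = list(best_cfg)
        dim = rng.randrange(n_dims)    # vary only one dimension
        cfg[dim] = rng.random()
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Proxy for validation accuracy: peaks when every strength is 0.7.
proxy = lambda cfg: -sum((x - 0.7) ** 2 for x in cfg)
cfg, score = random_unidimensional_search(proxy, n_dims=3)
```

Because the search keeps the best configuration seen, seven trials suffice whenever improvement along one dimension at a time is enough, which is the bet the title makes.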

    Numerical Investigation of the Effect of Grids and Turbulence Models on Critical Heat Flux in a Vertical Pipe

    Numerical simulation has been widely used in nuclear reactor safety analyses to gain insight into key phenomena. The Critical Heat Flux (CHF) is one of the limiting criteria in the design and operation of nuclear reactors; it is a two-phase flow phenomenon that rapidly degrades heat transfer performance at the rod surface. This paper presents a numerical simulation of steady-state flow in a vertical pipe to predict CHF. A detailed Computational Fluid Dynamics (CFD) modeling methodology was developed in FLUENT, with an Eulerian two-phase model used for the flow and heat transfer. To obtain the peak wall temperature accurately and stably, the effects of different turbulence models and wall functions were investigated on different grids. Results show that an O-type grid should be used for CHF simulation. Grids with y+ larger than 70 are recommended, since all the turbulence models give acceptable results there, while grids with y+ lower than 50 should be avoided. To predict the dry-out position accurately on a fine grid, the Realizable k-ε model with the standard wall function is recommended. These conclusions provide a useful reference for better predicting CHF in vertical pipes, and the approach can be extended to the rod bundles of Boiling Water Reactors (BWRs) under the same pressure conditions.
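The grid recommendations hinge on the dimensionless wall distance y+, which follows from the standard definitions y+ = y·u_τ/ν with friction velocity u_τ = √(τ_w/ρ) and kinematic viscosity ν = μ/ρ. A minimal sketch for checking a mesh's first-cell height (the property values below are illustrative, not from the paper):

```python
import math

def y_plus(y, tau_wall, rho, mu):
    """Dimensionless wall distance: y+ = y * u_tau / nu,
    with u_tau = sqrt(tau_wall / rho) and nu = mu / rho."""
    u_tau = math.sqrt(tau_wall / rho)   # friction velocity [m/s]
    nu = mu / rho                       # kinematic viscosity [m^2/s]
    return y * u_tau / nu

# Check whether a 1 mm first cell lands in the recommended y+ > 70
# range for illustrative wall shear stress and fluid properties.
yp = y_plus(y=1.0e-3, tau_wall=5.0, rho=750.0, mu=9.0e-5)
in_recommended_range = yp > 70
```

In practice τ_w is read back from the CFD solution, so meshes are typically adjusted iteratively until the first cell falls in the desired y+ band.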

    To Raise or Not To Raise: The Autonomous Learning Rate Question

    There is a parameter ubiquitous throughout the deep learning world: the learning rate. There is likewise a ubiquitous question: what should that learning rate be? The true answer is often tedious and time-consuming to obtain, and a great deal of arcane knowledge has accumulated in recent years about how to pick and modify learning rates to achieve optimal training performance. Moreover, the long hours spent carefully crafting the perfect learning rate can come to nothing the moment your network architecture, optimizer, dataset, or initial conditions change ever so slightly. But it need not be this way. We propose a new answer to the great learning rate question: the Autonomous Learning Rate Controller. Find it at https://github.com/fastestimator/AR
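The "arcane knowledge" alluded to usually crystallizes into hand-tuned static schedules. Two of the most common, step decay and cosine annealing, are sketched below as the baselines an autonomous controller would replace (the formulas are the standard ones; the particular constants are illustrative, not from the paper).

```python
import math

def step_decay(base_lr, epoch, drop=0.1, every=30):
    """Multiply the learning rate by `drop` every `every` epochs."""
    return base_lr * drop ** (epoch // every)

def cosine_decay(base_lr, epoch, total_epochs):
    """Cosine annealing from base_lr down to 0 over total_epochs."""
    return base_lr * 0.5 * (1 + math.cos(math.pi * epoch / total_epochs))

lr_step = step_decay(0.1, epoch=60)                  # two drops applied
lr_cos = cosine_decay(0.1, epoch=45, total_epochs=90)  # halfway point
```

Every constant here (drop factor, interval, horizon) is exactly the kind of brittle, hand-picked choice that breaks when the architecture or dataset changes, which is the failure mode the controller is meant to eliminate.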