2 research outputs found

    Development of in-field data acquisition systems and machine learning-based data processing and analysis approaches for turfgrass quality rating and peanut flower detection

    Digital image processing and machine vision techniques provide scientists with an objective measure of crop quality that adds to the validity of study results without burdening the evaluation process. This dissertation aimed to develop in-field data acquisition systems and supervised machine learning-based data processing and analysis approaches for turfgrass quality classification and peanut flower detection. The new 3D Scanner App, running on an Apple iPhone 12 Pro with its camera and LiDAR sensor, produced high-resolution rendered turfgrass images. The phone's battery lasted for the entire data acquisition session over an experimental field (49 m × 15 m) containing 252 warm-season turfgrass plots. Used as an image acquisition tool, the smartphone achieved outcomes similar to those of the traditional image acquisition methods described in other studies. Experiments were carried out on turfgrass quality classification with two classes (“Poor”, “Acceptable”) and four classes (“Very poor”, “Poor”, “Acceptable”, “High”) using supervised machine learning techniques. For two classes, the Gray-level Co-occurrence Matrix (GLCM) feature extractor with a Random Forest classifier achieved the highest accuracy (81%) on the testing dataset. For four classes, the Gabor filter was the best feature extractor, reaching 82% accuracy with both the Support Vector Machine (SVM) and XGBoost classifiers. The presented method will further assist researchers in developing a smartphone application for turfgrass quality rating. The study also used deep learning-based features as input to machine learning classifiers. The ResNet-101 deep feature extractor with an SVM classifier achieved 91% accuracy for two classes, and the ResNet-152 deep feature extractor with an SVM classifier achieved 86% accuracy for four classes.
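To make the GLCM feature extractor mentioned above more concrete, the following is a minimal pure-Python sketch of how a co-occurrence matrix and two common texture statistics (contrast and homogeneity) can be computed. It is an illustration only, not the dissertation's pipeline: the function names, the single horizontal offset, and the choice of statistics are assumptions made for this example.

```python
def glcm(image, levels, dr=0, dc=1):
    """Normalised gray-level co-occurrence matrix for one pixel offset.

    image  : 2D list of integer gray levels in range(levels)
    dr, dc : row/column offset to the neighbour pixel (assumed: right neighbour)
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("image too small for the requested offset")
    # Convert pair counts to joint probabilities.
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Large when neighbouring pixels differ strongly in gray level."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def homogeneity(p):
    """Close to 1 when neighbouring pixels have similar gray levels."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))


# A tiny two-level "texture": every horizontal neighbour pair differs.
patch = [[0, 1],
         [1, 0]]
p = glcm(patch, levels=2)
features = [contrast(p), homogeneity(p)]
```

In a setup like the one described, feature vectors of this kind would be computed per plot image and passed to a classifier such as a Random Forest; libraries such as scikit-image provide optimised GLCM routines for real images.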
YOLOX-L and YOLOX-X models were compared under different data augmentation configurations to find the best YOLOX object detector for peanut flower detection. Peanut flowers were detected in images collected from a research field. YOLOX-X with weak data augmentation achieved the highest mean average precision at an Intersection over Union (IoU) threshold of 50%. The presented method will further assist researchers in developing a method for counting flowers in images. With minor modifications, the presented detection technique can be applied to other crops or flowers.
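The Intersection over Union criterion used for the detection results above can be sketched as follows. This is a generic illustration, not code from the dissertation: the box format `(x_min, y_min, x_max, y_max)` and the helper names are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches(predicted, ground_truth, threshold=0.5):
    """A prediction counts as a match when IoU reaches the threshold."""
    return iou(predicted, ground_truth) >= threshold


# Hypothetical flower boxes: the prediction covers half of the labelled box.
labelled = (0, 0, 2, 2)
predicted = (0, 0, 2, 1)
overlap = iou(predicted, labelled)
```

Mean average precision at an IoU threshold of 50% builds on this test: each detection is counted as correct or incorrect using the 0.5 cut-off, and precision is then averaged over recall levels and classes.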

    Predictive Learning from Real-World Medical Data: Overcoming Quality Challenges

    Randomized controlled trials (RCTs) are the gold standard in medical research but face challenges, especially with specific groups such as pregnant women and newborns. Real-world data (RWD), from sources such as electronic medical records and insurance claims, complements RCTs in areas such as disease risk prediction and diagnosis. However, RWD's retrospective nature leads to issues such as missing values and data imbalance, requiring intensive data preprocessing. To enhance RWD's quality for predictive modeling, this thesis introduces a suite of algorithms developed to automatically resolve RWD's low-quality issues. The AMI-Net method is introduced first; it treats samples as bags of feature-value pairs and unifies them in an embedding space using a multi-instance neural network. It excels at handling incomplete datasets, a frequent issue in real-world scenarios, and shows resilience to noise and class imbalance. AMI-Net's ability to discern informative instances minimizes the effects of low-quality data. The enhanced version, AMI-Net+, improves instance selection, boosting performance and generalization. However, the AMI-Net series initially processes only binary input features, a constraint overcome by AMI-Net3, which supports binary, nominal, ordinal, and continuous features. Despite these advancements, challenges such as missing values, data inconsistencies, and labeling errors persist in real-world data. The AMI-Net series also shows promise for regression and multi-task learning, potentially mitigating low-quality data issues. Tested on various hospital datasets, these methods prove effective, though risks of overfitting and bias remain, necessitating further research. Overall, while the methods are promising for clinical studies and other applications, ensuring data quality and reliability is crucial for their success.
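The idea of treating a sample as a bag of feature-value pairs can be illustrated with a small sketch. It shows only the bag construction step under simple assumptions; the record fields and the `to_bag` helper are hypothetical, and the embedding and multi-instance network stages of AMI-Net are not reproduced here.

```python
def to_bag(record):
    """Turn one patient record into a bag of (feature, value) instances.

    Missing values (None) are simply left out of the bag, so no imputation
    is needed and bags may differ in size from sample to sample.
    """
    return [(name, value) for name, value in record.items() if value is not None]


# Hypothetical records with binary features; None marks a missing value.
records = [
    {"hypertension": 1, "smoker": None, "diabetes": 0},
    {"hypertension": None, "smoker": 1, "diabetes": None},
]
bags = [to_bag(r) for r in records]
```

Because each bag holds only the observed pairs, a downstream model that pools over instances can in principle work with incomplete records directly, which is the property the abstract highlights.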