International Journal of Advances in Intelligent Informatics
174 research outputs found
Cocoa bean quality identification using a computer vision-based color and texture feature extraction
Strict quality control is a pressing issue in the downstream processing of cocoa beans. Visual inspection of raw cocoa beans reveals the need for advanced technological solutions, especially in the context of Industry 4.0. This paper introduces an image-processing approach that extracts color and texture features to identify cocoa bean quality. Images were acquired by capturing video with a data acquisition box connected to a conveyor, yielding a dataset of good-quality and poor-quality non-cut cocoa beans. The methodology includes pre-processing, sharpening techniques, and a comparative analysis of feature extraction using Hue-Saturation-Value (HSV) and Gray Level Co-occurrence Matrix (GLCM) features selected by correlation. The study used the 15 features with the highest correlation. Classification used a Support Vector Machine (SVM) with an RBF kernel under several parameter settings. Several measures were used to compare the approaches, and the results show that pre-processing without sharpening achieves better accuracy, with the HSV and GLCM combination reaching 0.99 accuracy. Adequate lighting during data acquisition is crucial for accuracy. This study sheds light on the efficacy of image processing in cocoa bean quality identification, addressing a critical gap in the industrial-scale implementation of technological solutions and advancing quality control measures in the cocoa industry
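The two feature families named in the abstract can be illustrated with a minimal, standard-library sketch. This is not the authors' pipeline: the function names, the single horizontal offset, the symmetric normalization, and the choice of contrast and homogeneity as texture descriptors are all assumptions made for illustration.

```python
import colorsys

def mean_hsv(pixels):
    """Average H, S, V over (r, g, b) pixels in 0..255.
    Note: hue is circular, so a plain mean is only a rough summary."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    n = len(hsv)
    return tuple(sum(c[i] for c in hsv) / n for i in range(3))

def glcm(gray, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for offset (dx, dy).
    `gray` is a 2D list of integer gray levels in 0..levels-1."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(gray), len(gray[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                i, j = gray[y][x], gray[ny][nx]
                counts[i][j] += 1  # pair (i, j)
                counts[j][i] += 1  # mirrored pair, makes the matrix symmetric
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def glcm_contrast(p):
    """Sum of p[i][j] * (i - j)^2: large when neighbors differ strongly."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

def glcm_homogeneity(p):
    """Sum of p[i][j] / (1 + (i - j)^2): large when neighbors are similar."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))

# Hypothetical 2x2 patches with two gray levels.
flat_rows = [[0, 0], [1, 1]]      # horizontal neighbors are identical
striped_cols = [[0, 1], [0, 1]]   # horizontal neighbors always differ
print(glcm_contrast(glcm(flat_rows, 2)), glcm_contrast(glcm(striped_cols, 2)))
```

In a setup like the one described, per-image HSV statistics and GLCM descriptors would be concatenated into one feature vector before correlation-based selection and SVM training.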
Lightweight deep learning model with ResNet14 and spatial attention for anterior cruciate ligament diagnosis
The accuracy of diagnosing an Anterior Cruciate Ligament (ACL) tear depends on the radiologist’s or surgeon’s expertise, experience, and skills. In this study, we contribute to the development of an automated diagnostic model for ACL tears using a lightweight deep learning model, specifically ResNet-14, combined with a Spatial Attention mechanism to enhance diagnostic performance while conserving computational resources. The model processes knee MRI scans using a ResNet architecture, comprising a series of residual blocks and a spatial attention mechanism, to focus on the essential features in the imaging data. Training and evaluation were conducted using the Stanford dataset, comprising 1,370 knee MRI scans. Data augmentation techniques were also implemented to mitigate biases. The model is assessed using ROC-AUC, sensitivity, and specificity. The results show that the proposed model achieved an ROC-AUC score of 0.8696, a sensitivity of 79.81%, and a specificity of 79.82%. At 6.67 MB in size, with 1,684,517 parameters, the model is significantly more compact than existing models, such as MRNet. The findings demonstrate that embedding spatial attention into a lightweight deep learning framework augments the diagnostic accuracy for ACL tears while maintaining computational efficiency. Therefore, lightweight models have the potential to enhance diagnostic capability in medical imaging, allowing them to be deployed in resource-constrained clinical settings
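For readers less familiar with the reported metrics, the sketch below shows how sensitivity and specificity are derived from binary predictions. The label vectors are hypothetical and are not taken from the Stanford dataset; the function name is likewise an assumption.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = tear, 0 = no tear."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of true tears that were detected
    specificity = tn / (tn + fp)  # share of healthy knees correctly cleared
    return sensitivity, specificity

# Hypothetical labels for ten scans.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Near-equal sensitivity and specificity, as reported in the abstract, indicate that the decision threshold treats missed tears and false alarms roughly symmetrically.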
Traffic light optimization (TLO) using reinforcement learning for automated transport systems
Current traffic light systems follow predefined timing sequences, so a light may turn green when no cars are waiting while a side road with waiting vehicles still faces a red light. Reinforcement learning can help by training an intelligent model to analyze real-time traffic conditions and dynamically adjust signals based on actual demand. An intelligent, autonomous traffic light can significantly reduce the time wasted every day in commuting due to predetermined timing sequences. In our previous work, we used fuzzy logic to control the traffic light with a fixed time; in this paper, the waiting time becomes a variable that depends on other road variables such as vehicles, pedestrians, and time of day. Moreover, we trained an agent using reinforcement learning to optimize the traffic flow at junctions with traffic lights. The trained agent used the greedy method to improve traffic flow, maximizing rewards by changing the signals appropriately. The agent has two states and only two actions. The training results are promising. In normal situations, the average waiting time was 9.16 seconds. After applying our fuzzy rules, the average waiting time was reduced to 0.26 seconds, and after applying reinforcement learning, it was 0.12 seconds in a simulator. The average waiting time was thus reduced by 97-98%. These models have the potential to improve real-world traffic efficiency by approximately 67-68%
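The reported reduction in average waiting time can be checked directly from the three figures given in the abstract. The helper name below is an assumption; the inputs are the abstract's own numbers.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return (baseline - improved) / baseline * 100

baseline_wait = 9.16  # seconds, fixed-timing signals
fuzzy_wait = 0.26     # seconds, fuzzy rules
rl_wait = 0.12        # seconds, reinforcement learning (simulator)

print(round(percent_reduction(baseline_wait, fuzzy_wait), 2))  # 97.16
print(round(percent_reduction(baseline_wait, rl_wait), 2))     # 98.69
```

The fuzzy-rule result corresponds to roughly a 97% reduction and the reinforcement learning result to just under 99%. The 67-68% estimate for real-world traffic cannot be derived from the figures in the abstract.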
Geometry-aware light field angular super-resolution using multiple representations
Light Field Angular Super-Resolution (LFASR) is a critical task that enables applications such as depth estimation, refocusing, and 3D scene reconstruction. Light fields acquired with plenoptic cameras have an inherent trade-off between angular and spatial resolution due to sensor limitations. To address this challenge, many learning-based LFASR methods have been proposed; however, reconstructing light fields with a wide baseline remains a significant challenge. In this study, we propose an end-to-end learning-based geometry-aware network using multiple representations. A multi-scale residual network with varying receptive fields is employed to effectively extract spatial and angular features, enabling angular resolution enhancement without compromising spatial fidelity. Extensive experiments demonstrate that the proposed method effectively recovers fine details with high angular resolution while preserving the intricate parallax structure of the light field. Quantitative and qualitative evaluations on both synthetic and real-world datasets further confirm that the proposed approach outperforms existing state-of-the-art methods. This research improves the angular resolution of the light field without reducing spatial sharpness, supporting applications such as depth estimation and 3D reconstruction
LUNGINFORMER: A Multiclass of lung pneumonia diseases detection based on chest X-ray image using contrast enhancement and hybridization inceptionresnet and transformer
Pneumonia is a serious disease worldwide. COVID-19, first identified in Wuhan, China, in December 2019, causes severe pneumonia. Most pneumonia cases are diagnosed using traditional medical tools and specialized medical personnel, a process that is both time-consuming and expensive. To address this problem, many researchers have employed deep learning algorithms to develop automated pneumonia detection systems. Deep learning, however, faces the issue of low-quality X-ray images and biased X-ray image information. Because the X-ray image is the primary input for building a transfer learning model, these dataset problems lead to inaccurate classification results, as seen in many previous deep learning works. To address this, we propose a novel framework, named LUNGINFORMER, that utilizes two essential mechanisms: advanced image contrast enhancement based on Contrast Limited Adaptive Histogram Equalization (CLAHE) and a hybrid deep learning model combining InceptionResNet and Transformer. In the experiments, LUNGINFORMER achieved an accuracy of 0.98, a recall of 0.97, an F1-score of 0.98, and a precision of 0.96. In the AUC test, LUNGINFORMER achieved a score of 1.00 for each class. We believe that the model's performance was influenced by the contrast enhancement and the hybrid deep learning model
A genetic algorithm approach to green vehicle routing: Optimizing vehicle allocation and route planning for perishable products
This paper introduces a novel approach to the Green Vehicle Routing Problem (GVRP) by integrating multiple trips, heterogeneous vehicles, and time windows, specifically applied to the distribution of bakery products. The primary objective of the proposed model is to optimize route planning and vehicle allocation, aiming to minimize transportation costs and carbon emissions while maximizing product quality upon delivery to retailers. Utilizing a Genetic Algorithm (GA), the model demonstrates its effectiveness in achieving near-optimal solutions that balance economic, environmental, and quality-focused goals. Empirical results reveal a total transportation cost of Rp. 856,458.12, carbon emissions of 365.43 kgCO2e, and an impressive average product quality of 99.90% across all vehicle trips. These findings underscore the capability of the model to efficiently navigate the complexities of real-world logistics while maintaining high standards of product delivery. The proposed GVRP model serves as a valuable tool for industries seeking sustainable and cost-effective distribution strategies, with implications for broader advancements in supply chain management
Modified particle swarm optimization (MPSO) optimized CNN’s hyperparameters for classification
This paper proposes a convolutional neural network architectural design approach using the modified particle swarm optimization (MPSO) algorithm. Adjusting hyperparameters and searching for an optimal convolutional neural network (CNN) architecture is an interesting challenge. Network performance and the efficiency of learning models on a given problem depend on the hyperparameter values, which results in large and complex search spaces. Heuristic-based searches are well suited to this type of problem, and the main contribution of this research is to apply the MPSO algorithm to find the optimal parameters of a CNN, including the number of convolution layers, the filters used in the convolution process, the number of convolution filters, and the batch size. The parameters obtained using MPSO are kept the same in each convolution layer, and the objective function evaluated by MPSO is the classification rate. The optimized architecture is applied to the Batik motif database. The proposed model produced the best results, with a classification rate higher than 94%, comparing well with other state-of-the-art approaches. This research demonstrates the performance of the MPSO algorithm in optimizing CNN architectures, highlighting its potential for improving image recognition tasks
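The abstract does not specify how MPSO modifies the base algorithm, so the sketch below shows only standard (unmodified) particle swarm optimization. The objective is a hypothetical stand-in for "1 minus classification rate": a smooth surrogate whose minimum is placed arbitrarily at 3 convolution layers and 64 filters. In a real run, each evaluation would train and validate a CNN, and the continuous positions would be rounded to integers. All names and parameter values here are assumptions.

```python
import random

def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard PSO minimizer. Returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def surrogate_error(x):
    """Hypothetical error surface over (num_layers, num_filters)."""
    layers, filters = x
    return (layers - 3) ** 2 + ((filters - 64) / 16) ** 2

best, best_val = pso(surrogate_error, bounds=[(1, 6), (16, 128)])
print([round(v) for v in best], best_val)
```

Because every objective call in the real setting is a full training run, the swarm size and iteration count dominate the cost of the search.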
Advanced deep learning techniques for sentiment analysis: combining Bi-LSTM, CNN, and attention layers
Online platforms enhance customer engagement and provide businesses with valuable data for predictive analysis, which is critical for strategic sales forecasting and customer relationship management. This study explores the potential of sentiment analysis (SA) to enhance sales forecasting and customer retention for small and large businesses. We collected a large dataset of product review tweets, representing a rich source of consumer sentiment, and developed an artificial neural network on this dataset that captures both positive and negative sentiments. The core of our model is a Bi-LSTM (Bidirectional Long Short-Term Memory) architecture, enhanced by an attention mechanism to capture relationships between words and emphasize key terms. Then, a one-dimensional convolutional neural network with 64 filters of size 3x3 is applied, followed by Average_Max_Pooling to reduce the feature map. Finally, two dense layers classify the sentiment as positive or negative. This research contributes to sentiment analysis by accurately identifying consumer sentiment in product review tweets. The proposed model, which integrates Bi-LSTM with an attention mechanism and a CNN, detects negative sentiment with a precision of 0.97, recall of 0.98, and F1-score of 0.98, allowing companies to proactively address customer concerns and improve satisfaction and brand loyalty. In addition, the proposed model achieves better average sentiment classification for both positive and negative sentiments, and higher accuracy (96%), than the other baselines. It ensures high-quality input data by reducing noise and inconsistencies in product review tweets. Moreover, the dataset collected in this study serves as a valuable benchmark for future research in sentiment analysis and predictive analytics
Optimized image-based grouping of e-commerce products using deep hierarchical clustering
Managing large and constantly evolving product catalogs is a significant challenge for e-commerce platforms, especially when visually similar products cannot be reliably distinguished using text-based methods. This study proposes a product grouping method that combines a fine-tuned EfficientNetV2M model with an adaptive Agglomerative Clustering strategy. Unlike conventional CNN-based approaches, which have limited scalability and a fixed number of clusters, the proposed method dynamically adjusts similarity thresholds and automatically forms clusters for unseen product variations. By linking deep visual feature extraction with adaptive clustering, the method enhances flexibility in handling product diversity. Experiments on the Shopee product image dataset show that it achieves a high Normalized Mutual Information (NMI) score of 0.924, outperforming standard baselines. These results demonstrate the method’s effectiveness in automating catalog organization and offer a scalable solution for inventory management and personalized recommendations in e-commerce platforms
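The Normalized Mutual Information score reported above compares a predicted clustering with ground-truth groups and is invariant to how cluster IDs are named. The abstract does not say which normalization was used; the sketch below assumes the arithmetic mean of the two entropies, which is one common convention. The label lists are hypothetical.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log(c * n / (count_a[x] * count_b[y]))
               for (x, y), c in count_ab.items())

def nmi(a, b):
    """MI normalized by the arithmetic mean of the entropies (assumed convention)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:
        return 1.0  # both labelings put everything in a single cluster
    return mutual_information(a, b) / ((h_a + h_b) / 2)

# Hypothetical product groups vs. predicted clusters.
truth = [0, 0, 1, 1]
same_up_to_renaming = [1, 1, 0, 0]
unrelated = [0, 1, 0, 1]
print(nmi(truth, same_up_to_renaming), nmi(truth, unrelated))
```

A score of 1 means the clusters match the true groups exactly up to renaming, and a score of 0 means the clustering carries no information about them.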
Semantic-BERT and semantic-FastText model for education question classification
Question classification (QC) is critical in an educational question-answering (QA) system. However, most existing models suffer from limited semantic accuracy, particularly when dealing with complex or ambiguous education queries. The problem lies in their reliance on surface-level features, such as keyword matching, which hampers their ability to capture deeper syntactic and semantic relationships in questions. This results in misclassification and generic responses that fail to address the specific intent of prospective students. This study addresses this gap by integrating semantic dependency parsing into Semantic-BERT (S-BERT) and Semantic-FastText (S-FastText) to enhance question classification performance. Semantic dependency parsing is applied to structure the semantics of interrogative sentences before classification by BERT and FastText. A dataset of 2,173 educational questions covering five question classes (5W1H) is used for training and validation. The models are evaluated using a confusion matrix and K-Fold cross-validation to ensure robust performance assessment. Experimental results show that both models achieve 100% accuracy, precision, and recall in classifying question sentences, demonstrating their effectiveness in educational question classification. These findings contribute to the development of intelligent educational assistants, paving the way for more efficient and accurate automated question-answering systems in academic environments
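The K-Fold evaluation mentioned above can be sketched as a plain index split. The abstract does not state the number of folds or whether the data were shuffled or stratified, so `k = 5` and the unshuffled, contiguous folds below are assumptions; only the dataset size of 2,173 comes from the abstract.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) for k contiguous folds.
    Fold sizes differ by at most one; every sample is validated exactly once."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # first `extra` folds get one more
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

n_questions = 2173  # dataset size from the abstract
k = 5               # assumed number of folds
fold_sizes = [len(val) for _, val in k_fold_indices(n_questions, k)]
print(fold_sizes)  # [435, 435, 435, 434, 434]
```

Each model would be trained on `train_idx` and scored on `val_idx` once per fold, with the per-fold confusion matrices aggregated afterwards. With class-imbalanced question types, a stratified split would be the usual refinement.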