IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
460 research outputs found
The Impact of Data Augmentation Techniques on Improving Speech Recognition Performance for English in Indonesian Children Based on Wav2Vec 2.0
Early childhood education is a crucial phase in shaping children's character and language skills. This study develops an Automatic Speech Recognition (ASR) model to recognize the speech of Indonesian children speaking English. The process begins with collecting and processing a dataset of children's speech recordings, which is then expanded using data augmentation techniques to increase pronunciation variation. The pre-trained Wav2Vec 2.0 ASR model is fine-tuned on both the original and augmented datasets. Evaluation using Word Error Rate (WER) and Character Error Rate (CER) shows a significant accuracy improvement, with WER decreasing from 53% to 45% and CER from 33% to 27%, reflecting a performance increase of approximately 15%. Further analysis reveals pronunciation errors in phonemes such as /ð/, /θ/, /r/, /v/, /z/, and /ʃ/, which are uncommon in the Indonesian language, manifesting as substitutions, omissions, or additions in words like "three," "that," "rabbit," "very," and "zebra." These findings highlight the need for targeted phoneme training, audio-based approaches with ASR feedback, and the listen-and-repeat technique in English language instruction for children. Keywords: Early childhood education, Automatic Speech Recognition, Augmentation, Character Error Rate, Word Error Rate
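As a rough illustration of the two metrics named in this abstract, the sketch below computes WER and CER as edit distance divided by reference length, plus the relative reduction between two error rates. The function names and the example sentences are assumptions made for illustration; they do not come from the paper or from any ASR toolkit.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def relative_reduction(before, after):
    """Relative drop in an error rate, e.g. 0.53 -> 0.45."""
    return (before - after) / before


# Hypothetical transcript pair with two substituted words.
print(wer("the rabbit is very fast", "the wabbit is fery fast"))
# Reported WER values from the abstract: 53% before, 45% after augmentation.
print(relative_reduction(0.53, 0.45))
```

Under this reading, the drop in WER from 53% to 45% is a relative reduction of about 15%, which is consistent with the figure quoted in the abstract; the same calculation on the CER values gives a somewhat larger relative reduction.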
DECISION SUPPORT SYSTEM OF LAND SUITABILITY FOR CORN SEED VARIETIES
Decision making in selecting suitable agricultural land is a key factor in the success of corn cultivation. Land selection is still largely based on farmers' experience, which lacks a strong analytical foundation. This can reduce production, as evidenced in 2023, when dry corn kernel production declined by 12.5% compared to the previous year. This research develops a Decision Support System (DSS) to analyze land suitability for corn varieties, using the Analytical Hierarchy Process (AHP) method to calculate the priority weights of each evaluation criterion and the Profile Matching (PM) method to rank agricultural land. The research uses data from 22 sub-districts in Blitar Regency as alternatives and 5 corn varieties as ideal profiles. The ranking results indicate that the best agricultural land for varieties V1, V2, V3, and V4 is in Sanankulon Sub-district, while for variety V5 it is in Doko Sub-district. The validity test showed a "Strong" coefficient, and the reliability test yielded a Cronbach's alpha of 0.8019, indicating a "Good" level of consistency
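To make the AHP weighting step concrete, here is a minimal sketch that derives priority weights from a pairwise comparison matrix and checks its consistency. It assumes the geometric-mean approximation and Saaty's random index values; the abstract does not say which derivation the authors used, and the 3x3 matrix is a hypothetical example rather than the paper's criteria.

```python
import math

# Saaty's random consistency index for small matrix sizes.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Priority weights via the row geometric mean, normalized to sum to 1."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    weighted = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(weighted[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical comparison of three criteria: A is 2x as important as B
# and 4x as important as C; B is 2x as important as C.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
print(weights)
print(consistency_ratio(pairwise, weights))
```

A consistency ratio below 0.1 is the conventional threshold for accepting the judgments; the example matrix is perfectly consistent, so its ratio is effectively zero.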
Comparison of Artificial Intelligence Methods for Tuberculosis Detection Using X-Ray Images
Tuberculosis (TB), caused by the bacterium Mycobacterium tuberculosis, is a highly dangerous infectious disease. In Indonesia, TB is the deadliest infectious disease after COVID-19, and it ranks 13th among causes of death globally. Early detection of TB is essential to improve the chances of recovery, but the limited number of radiologists is a major challenge. Deep learning technology, particularly the Convolutional Neural Network (CNN), is an effective solution to this problem. This study therefore compares two CNN architectures, AlexNet and VGG-19, for detecting TB in lung X-ray images, applying image enhancement methods such as Histogram Equalization (HE), Adaptive Histogram Equalization (AHE), Contrast Limited Adaptive Histogram Equalization (CLAHE), and Gamma Correction. The dataset was obtained from Kaggle and includes both normal and TB lung X-ray images. Performance is evaluated in terms of accuracy, precision, recall, and F1-score. The results show that VGG-19 with CLAHE gives the best performance, with an accuracy of 93.5%, precision of 98.88%, recall of 88%, and F1-score of 93.12%. VGG-19 with Gamma Correction also performs very well, with an accuracy of 91%, precision of 97.67%, recall of 84%, and F1-score of 90.32%. These findings underscore the effectiveness of combining CNNs with image processing methods in improving TB detection
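The reported F1-scores can be cross-checked from the reported precision and recall, since F1 is their harmonic mean. The short sketch below does that arithmetic; the function name is an illustrative choice, and the inputs are the figures quoted in this abstract.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        raise ValueError("precision and recall cannot both be zero")
    return 2 * precision * recall / (precision + recall)


# Reported values: VGG-19 with CLAHE, then VGG-19 with Gamma Correction.
clahe_f1 = f1_score(0.9888, 0.88)
gamma_f1 = f1_score(0.9767, 0.84)
print(round(clahe_f1 * 100, 2), round(gamma_f1 * 100, 2))
```

Both recomputed values agree with the abstract's 93.12% and 90.32% to two decimal places, so the three reported metrics are mutually consistent for each configuration.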
Enhancing Image Classification Performance Using Multi CNN Feature Fusion Method
This research aims to overcome general challenges in image pattern recognition with convolutional neural networks (CNNs), which still face the complexity and limitations of image data. Achieving high accuracy is essential because it strongly influences effectiveness and success in many application areas. Although deep learning technology, especially CNNs, offers the potential to improve accuracy, accuracy often remains limited to the 70–80% range, short of the expected level. In this research, a fusion method was developed that combines pre-trained models using concatenation to increase accuracy. By using pre-trained models such as ResNet50, VGG16, and MobileNet-v2, adapted to various datasets and evaluated with cross-validation, the researchers achieved significant improvements in accuracy. The results show an improvement in the accuracy of the Fusion Multi-CNN model across datasets. On the Fashion-MNIST dataset, the model achieved an accuracy of 0.87840, while on CIFAR-10 and Oxford-102 the accuracy was 0.81260 and 0.84004, respectively
Support Vector Machine for Accurate Classification of Diabetes Risk Levels
This research explores the application of Support Vector Machines (SVM) for accurately classifying diabetes risk levels based on a publicly available dataset containing 768 instances and 9 attributes, including glucose levels, BMI, blood pressure, and insulin levels. The model's systematic development process involved data preprocessing, feature selection, and hyperparameter optimization to ensure robust performance. Results indicate an overall accuracy of 76%, with high precision and recall for the non-diabetic risk class but relatively lower performance for the diabetic risk class, highlighting the challenges posed by class imbalance and overlapping data features. To address these issues, future research should incorporate advanced resampling techniques, refined feature engineering, and alternative machine learning models such as Random Forest or XGBoost. This research underscores the potential of SVM as a valuable tool for early diabetes detection, offering healthcare professionals a reliable means to identify at-risk individuals and personalize intervention strategies. By bridging theoretical advancements and practical applications, the research contributes to enhancing predictive analytics in medical diagnostics, paving the way for improved patient outcomes and efficient public health management
Two-Step Iris Recognition Verification Using 2D Gabor Wavelet and Domain-Specific Binarized Statistical Image Features
The iris is one of the most reliable biometric features due to its complex textural properties. However, colored contact lenses make the iris unreliable in iris recognition systems. Colored contact lenses are a spoofing method in biometrics that can conceal a person's identity. To prevent spoofing, a two-step verification process is needed in the iris recognition system. The first verification step detects colored contact lenses, while the second recognizes or matches a person's identity. The feature extraction methods used are Domain-Specific Binarized Statistical Image Features (DSBSIF) and Gabor Wavelet. Contact lenses are detected with a Support Vector Machine (SVM), and matching is performed using Hamming Distance (HD). This study conducted experiments using single features, feature fusion, and hybrid feature extraction methods combining DSBSIF and Gabor Wavelet for two-step iris recognition verification. The results indicate that the hybrid feature extraction method of DSBSIF and Gabor Wavelet achieved the highest accuracy: 99.95% for the first verification and 95.40% for the second. These results are 0.02 and 0.31 percentage points better, respectively, than previous methods in the first and second verifications
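The matching step by Hamming Distance can be sketched as the fraction of differing bits between two binary iris codes, compared against a decision threshold. This is a generic illustration, not the paper's pipeline: the bit lists, the function names, and the 0.32 threshold are all assumptions, and real systems typically also apply occlusion masks and rotation shifts.

```python
def hamming_distance(code_a, code_b):
    """Normalized Hamming distance: share of bit positions that differ."""
    if len(code_a) != len(code_b):
        raise ValueError("iris codes must have the same length")
    if not code_a:
        raise ValueError("iris codes must not be empty")
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)


def is_match(code_a, code_b, threshold=0.32):
    """Accept the pair as the same iris if the distance is under the threshold.

    The default threshold is a hypothetical value chosen for illustration.
    """
    return hamming_distance(code_a, code_b) < threshold


# Hypothetical 8-bit codes that differ in two positions.
enrolled = [1, 0, 1, 1, 0, 0, 1, 0]
probe = [1, 0, 0, 1, 0, 1, 1, 0]
print(hamming_distance(enrolled, probe), is_match(enrolled, probe))
```

A distance near 0 indicates the same iris, while codes from unrelated irises tend toward 0.5, which is why a threshold between those values separates the two cases.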
Implementation of Chi-Square Feature Selection for Parkinson’s Disease Classification Using LightGBM
Parkinson's disease is caused by damage to nerve cells in the brain and is among the diseases whose case numbers are rising rapidly worldwide. One way to curb the increase in Parkinson's disease cases is diagnosis through classification methods based on algorithmic learning. This study implements the Chi-Square technique for selecting relevant features, together with the Light Gradient Boosting Machine (LightGBM) algorithm, for Parkinson's disease classification. Chi-Square feature selection aims to remove less relevant features and thereby improve model performance. In addition, SMOTE is applied to handle data imbalance, and hyperparameter tuning is used to determine the optimal parameter combination. Testing covered ten variations in the number of features, with the best result obtained using 200 features, which yielded an accuracy of 96.05%. With Chi-Square feature selection, LightGBM performance improved compared to performance without feature selection. Applying this combination of methods can significantly improve classification performance and has the potential to be used in Parkinson's disease diagnosis support systems
Breast Cancer Classification Based on Mammogram Images Using CNN Method with NASNet Mobile Model
In Indonesia, breast cancer is the cancer type with the highest death rate, so there is a great need for early examination, clinical examination, and screening, including mammography. Mammography is currently the most effective method for detecting early-stage breast cancer. This study aims to classify breast cancer based on mammogram images. The method used is a Convolutional Neural Network (CNN) with the NASNet Mobile model, classifying three classes: normal, benign, and malignant. A CNN can learn a wide range of input attributes, which lets it capture more detailed data characteristics and gives it better detection capabilities. The best model obtained in this research reached accuracy, sensitivity, and specificity values of 99.67%, 98.78%, and 99.35%, respectively. This research can support radiologists as an additional consideration when making breast cancer diagnosis decisions
Personality Classification of Myers Briggs Type Indicators (MBTI) Using BERT and Machine Learning
Personality classification using textual data from social media or online forums is a complex task due to the unstructured text and the multifaceted nature of personality. While the Myers-Briggs Type Indicator (MBTI) provides a comprehensive framework, adapting it to media data and handling diverse linguistic patterns requires effective algorithms. The psychological basis of MBTI is intricate, which makes it challenging to model, especially with complex methods like deep learning. This study classifies personality types based on each individual's behavior on an online forum by observing the linguistic patterns of posted textual data using the SVM, Random Forest, BERT, and Word2Vec algorithms. SVM and Random Forest are traditional machine learning algorithms known for their effectiveness in text classification. Meanwhile, BERT and Word2Vec identify semantic relationships and contextual information in textual data. In addition, the IndoBERT model is used as the BERT model because this study focuses on the classification of Indonesian language texts. Testing was carried out using textual data from posts on the PersonalityCafe forum. The test results showed that the combination of the SVM and IndoBERT models outperformed other models, with an accuracy of 82% and an F1 score of 75%
Systematic Review of High Interaction Honeypots for Microsoft SQL Server
This systematic review examines high-interaction honeypots for Microsoft SQL Server. Topics covered include various honeypot environments (bare-metal, virtual machine, container) and monitoring methods (network-based, VMM-based, honeypot-based) to understand how to effectively monitor encrypted communications. The main focus is to compare different data monitoring techniques for high-interaction honeypots, especially considering the challenges posed by encrypted protocols such as TDS, which is used by Microsoft SQL Server. This research identifies limitations in current research and proposes the use of encrypted MITM proxies as a potential solution. Ultimately, it highlights the need for further research in this area, given the limited existing literature on high-interaction honeypots for Microsoft SQL Server