Deep Learning-based Method for Enhancing the Detection of Arabic Authorship Attribution using Acoustic and Textual-based Features
Authorship attribution (AA) is defined as the identification of the original author of an unseen text. An author’s writing style can change from one topic to another, but the author’s habits remain the same across texts. Authorship attribution has been extensively studied for texts written in languages such as English. However, few studies have investigated Arabic authorship attribution (AAA) because of the special challenges posed by Arabic script. Additionally, there is a need to identify the authors of texts extracted from livestream broadcasts and recorded speeches to protect the intellectual property of these authors. This paper aims to enhance the detection of Arabic authorship attribution by extracting different features and fusing the outputs of two deep learning models. The dataset used in this study was collected from the weekly livestreamed and recorded Arabic sermons that are publicly available on the official website of Al-Haramain in Saudi Arabia. Acoustic, textual and stylometric features were extracted for five authors. Then, the data were pre-processed and fed into the deep learning-based models (a CNN architecture and the pre-trained ResNet34). After that, hard and soft voting ensemble methods were applied to combine the outputs of the models and improve overall performance. The experimental results showed that the CNN with textual data achieved acceptable performance on all evaluation metrics. The ResNet34 model with acoustic features outperformed the other models, with an accuracy of 90.34%. Finally, the results showed that the soft voting ensemble method enhanced the performance of AAA and outperformed the other method in terms of accuracy and precision, obtaining 93.19% and 0.9311, respectively.
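The difference between hard and soft voting can be illustrated with a minimal Python sketch. The three models, the probability values, and the function names are hypothetical and chosen only for illustration; they are not the paper's data or implementation.

```python
from collections import Counter

def hard_vote(prob_rows):
    """Majority vote over each model's top class.

    Ties are broken in favour of the lowest class index (an assumption).
    """
    votes = [max(range(len(p)), key=p.__getitem__) for p in prob_rows]
    counts = Counter(votes)
    best = max(counts.values())
    return min(c for c, n in counts.items() if n == best)

def soft_vote(prob_rows):
    """Average the class probabilities across models, then take the argmax."""
    n_classes = len(prob_rows[0])
    avg = [sum(p[c] for p in prob_rows) / len(prob_rows) for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)

# Hypothetical class probabilities for one sample: three models, three authors.
probs = [
    [0.40, 0.35, 0.25],
    [0.45, 0.30, 0.25],
    [0.05, 0.90, 0.05],
]

hard_choice = hard_vote(probs)  # two of three models pick author 0
soft_choice = soft_vote(probs)  # averaged probabilities favour author 1
```

In this toy case the two schemes disagree because soft voting takes each model's confidence into account, while hard voting counts only the top choice of each model.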
Machine Learning-Based Predictive Models for Detection of Cardiovascular Diseases
Cardiovascular diseases present a significant global health challenge that emphasizes the critical need for developing accurate and more effective detection methods. Several studies have contributed valuable insights in this field, but it is still necessary to advance the predictive models and address the gaps in the existing detection approaches. For instance, some of the previous studies have not considered the challenge of imbalanced datasets, which can lead to biased predictions, especially when the datasets include minority classes. This study’s primary focus is the early detection of heart diseases, particularly myocardial infarction, using machine learning techniques. It tackles the challenge of imbalanced datasets by conducting a comprehensive literature review to identify effective strategies. Seven machine learning and deep learning classifiers, including K-Nearest Neighbors, Support Vector Machine, Logistic Regression, Convolutional Neural Network, Gradient Boost, XGBoost, and Random Forest, were deployed to enhance the accuracy of heart disease predictions. The research explores different classifiers and their performance, providing valuable insights for developing robust prediction models for myocardial infarction. The study’s outcomes emphasize the effectiveness of meticulously fine-tuning an XGBoost model for cardiovascular diseases. This optimization yields remarkable results: 98.50% accuracy, 99.14% precision, 98.29% recall, and a 98.71% F1 score. Such optimization significantly enhances the model’s diagnostic accuracy for heart disease
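As a quick consistency check, the reported F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below is only an arithmetic illustration; the function name is an assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

reported_precision = 99.14  # percent, as stated in the abstract
reported_recall = 98.29     # percent, as stated in the abstract

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # agrees with the 98.71% F1 score in the abstract
```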
An Ensemble Machine Learning and Data Mining Approach to Enhance Stroke Prediction
Stroke poses a significant health threat, affecting millions annually. Early and precise prediction is crucial to providing effective preventive healthcare interventions. This study applied an ensemble machine learning and data mining approach to enhance the effectiveness of stroke prediction. By employing the cross-industry standard process for data mining (CRISP-DM) methodology, various techniques, including random forest, ExtraTrees, XGBoost, artificial neural network (ANN), and genetic algorithm with ANN (GANN) were applied on two benchmark datasets to predict stroke based on several parameters, such as gender, age, various diseases, smoking status, BMI, HighCol, physical activity, hypertension, heart disease, lifestyle, and others. Due to dataset imbalance, Synthetic Minority Oversampling Technique (SMOTE) was applied to the datasets. Hyperparameter tuning optimized the models via grid search and randomized search cross-validation. The evaluation metrics included accuracy, precision, recall, F1-score, and area under the curve (AUC). The experimental results show that the ensemble ExtraTrees classifier achieved the highest accuracy (98.24%) and AUC (98.24%). Random forest also performed well, achieving 98.03% in both accuracy and AUC. Comparisons with state-of-the-art stroke prediction methods revealed that the proposed approach demonstrates superior performance, indicating its potential as a promising method for stroke prediction and offering substantial benefits to healthcare
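The idea behind SMOTE, interpolating between a minority-class sample and one of its nearest minority-class neighbours, can be sketched in plain Python. This is a simplified illustration under assumed toy data and parameter names, not the implementation used in the study.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Hypothetical minority-class samples with two features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class.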
Viral shedding and antibody response in 37 patients with MERS-coronavirus infection
Background. The Middle East respiratory syndrome (MERS) coronavirus causes isolated cases and outbreaks of severe respiratory disease. Essential features of the natural history of disease are poorly understood.
Methods. We studied 37 adult patients infected with MERS coronavirus for viral load in the lower and upper respiratory tracts (LRT and URT, respectively), blood, stool, and urine. Antibodies and serum neutralizing activities were determined over the course of disease.
Results. One hundred ninety-nine LRT samples collected during the 3 weeks following diagnosis yielded virus RNA in 93% of tests. Average (maximum) viral loads were 5 × 10⁶ (6 × 10¹⁰) copies/mL. Viral loads (positive detection frequencies) in 84 URT samples were 1.9 × 10⁴ copies/mL (47.6%). Thirty-three percent of all 108 serum samples tested yielded viral RNA. Only 14.6% of stool and 2.4% of urine samples yielded viral RNA. All seroconversions occurred during the first 2 weeks after diagnosis, which corresponds to the second and third week after symptom onset. Immunoglobulin M detection provided no advantage in sensitivity over immunoglobulin G (IgG) detection. All surviving patients, but only slightly more than half of all fatal cases, produced IgG and neutralizing antibodies. The levels of IgG and neutralizing antibodies were weakly and inversely correlated with LRT viral loads. Presence of antibodies did not lead to the elimination of virus from LRT.
Conclusions. The timing and intensity of respiratory viral shedding in patients with MERS closely matches that of those with severe acute respiratory syndrome. Blood viral RNA does not seem to be infectious. Extrapulmonary loci of virus replication seem possible. Neutralizing antibodies do not suffice to clear the infection.
Prediction of the SARS-CoV-2 Derived T-Cell Epitopes’ Response Against COVID Variants
The COVID-19 outbreak began in December 2019 and was declared a global health emergency by the World Health Organization. The four most dominant variants are Beta, Gamma, Delta, and Omicron. After the administration of vaccine doses, a marked decline in new cases has been observed. The COVID-19 vaccine induces neutralizing antibodies and T-cells in our bodies. However, strong variants like Delta and Omicron tend to escape the neutralizing antibodies elicited by COVID-19 vaccination. Therefore, it is indispensable to study, analyze and, most importantly, predict the response of SARS-CoV-2-derived T-cell epitopes against COVID variants in vaccinated and unvaccinated persons. In this regard, machine learning can be effectively utilized for predicting the response of COVID-derived T-cell epitopes. In this study, the response of T-cell epitopes was predicted for vaccinated and unvaccinated people for the Beta, Gamma, Delta, and Omicron variants. The dataset was divided into two classes, i.e., vaccinated and unvaccinated, and the predicted response of T-cell epitopes was divided into three categories, i.e., Strong, Impaired, and Over-activated. For this prediction task, a self-proposed Bayesian neural network has been designed by combining variational inference and flow normalization optimizers. Furthermore, a Hidden Markov Model has also been trained on the same dataset to compare the results of the self-proposed Bayesian neural network with this state-of-the-art statistical approach. Extensive experimentation and results demonstrate the efficacy of the proposed network in terms of accurate prediction and reduced error.
An Adaptive Early Stopping Technique for DenseNet169-Based Knee Osteoarthritis Detection Model
Knee osteoarthritis (OA) detection is an important area of research in health informatics that aims to improve the accuracy of diagnosing this debilitating condition. In this paper, we investigate the ability of DenseNet169, a deep convolutional neural network architecture, to detect knee osteoarthritis from X-ray images. We focus on the use of the DenseNet169 architecture and propose an adaptive early stopping technique that utilizes gradual cross-entropy loss estimation. The proposed approach allows for the efficient selection of the optimal number of training epochs, thus preventing overfitting. To achieve the goal of this study, an adaptive early stopping mechanism that uses the validation accuracy as a threshold was designed. Then, the gradual cross-entropy (GCE) loss estimation technique was developed and integrated into the epoch training mechanism. Both adaptive early stopping and GCE were incorporated into DenseNet169 for the OA detection model. The performance of the model was measured using several metrics including accuracy, precision, and recall. The obtained results were compared with those reported in existing works. The comparison shows that the proposed model outperformed the existing solutions in terms of accuracy, precision, recall, and loss performance, which indicates that adaptive early stopping coupled with GCE improved the ability of DenseNet169 to accurately detect knee OA.
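For readers unfamiliar with early stopping, the following Python sketch shows the conventional patience-based variant that monitors validation loss. It is a generic baseline under assumed parameter names and toy loss values; it does not reproduce the adaptive mechanism or the GCE loss estimation proposed in the paper, whose details are not given in the abstract.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical per-epoch validation losses.
losses = [0.90, 0.70, 0.60, 0.62, 0.61, 0.65, 0.58]
stop_epoch, best_epoch = early_stopping_epoch(losses)
```

With these toy values, training halts at epoch 5 and the weights from epoch 2 would be kept, so the later improvement at epoch 6 is never seen. A fixed patience can stop too early in this way, which is one motivation for adaptive schemes.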
Presentation and outcome of Middle East respiratory syndrome in Saudi intensive care unit patients.
BACKGROUND: Middle East respiratory syndrome coronavirus infection is associated with high mortality rates but limited clinical data have been reported. We describe the clinical features and outcomes of patients admitted to an intensive care unit (ICU) with Middle East respiratory syndrome coronavirus (MERS-CoV) infection. METHODS: Retrospective analysis of data from all adult (>18 years old) patients admitted to our 20-bed mixed ICU with Middle East respiratory syndrome coronavirus infection between October 1, 2012 and May 31, 2014. Diagnosis was confirmed in all patients using real-time reverse transcription polymerase chain reaction on respiratory samples. RESULTS: During the observation period, 31 patients were admitted with MERS-CoV infection (mean age 59 ± 20 years, 22 [71 %] males). Cough and tachypnea were reported in all patients; 22 (77.4 %) patients had bilateral pulmonary infiltrates. Invasive mechanical ventilation was applied in 27 (87.1 %) and vasopressor therapy in 25 (80.6 %) patients during the intensive care unit stay. Twenty-three (74.2 %) patients died in the ICU. Nonsurvivors were older, had greater APACHE II and SOFA scores on admission, and were more likely to have received invasive mechanical ventilation and vasopressor therapy. After adjustment for the severity of illness and the degree of organ dysfunction, the need for vasopressors was an independent risk factor for death in the ICU (odds ratio = 18.33, 95 % confidence interval: 1.11-302.1, P = 0.04). CONCLUSIONS: MERS-CoV infection requiring admission to the ICU is associated with high morbidity and mortality. The need for vasopressor therapy is the main risk factor for death in these patients
Query refinement for correlation-based time series exploration
In this paper, we focus on the problem of exploring sequential data to discover time sub-intervals that satisfy certain pairwise correlation constraints. Unlike most existing works, we use the deviation from targeted pairwise correlation constraints as an objective to minimize in our problem. Moreover, we include users’ preferences as an objective in the form of maximizing similarity to users’ initial sub-intervals. The combination of these two objectives is prevalent in applications where users explore time series data to locate time sub-intervals in which targeted patterns exist. Discovering these sub-intervals among time series data is extremely useful in various application areas such as network and environment monitoring. Towards finding the optimal sub-interval (i.e., optimal query) satisfying these objectives, we propose applying query refinement techniques to enable efficient processing of candidate queries. Specifically, we propose QFind, an efficient algorithm which refines a user’s initial query to discover the optimal query by applying novel pruning techniques. QFind applies two-level pruning techniques to safely skip processing unqualified candidate queries, and to abandon early the computation of correlation for some pairs based on a monotonic property. We experimentally validate the efficiency of our proposed algorithm against a state-of-the-art algorithm under different settings using real and synthetic data.
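The two objectives described above can be made concrete with a brute-force Python sketch that scores every candidate sub-interval. It is an exhaustive baseline, not QFind: it applies no pruning, and the weighted-sum cost, the Jaccard overlap used as the similarity measure, the parameter names, and the toy series are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # undefined for a constant window; treated as 0 here
    return sxy / math.sqrt(sxx * syy)

def jaccard(a, b):
    """Overlap of two half-open index intervals [start, end)."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union

def best_interval(x, y, initial, target, alpha=0.5, min_len=3):
    """Exhaustively find the sub-interval minimising
    alpha * |corr - target| + (1 - alpha) * (1 - overlap with initial)."""
    best, best_cost = None, float("inf")
    for s in range(len(x) - min_len + 1):
        for e in range(s + min_len, len(x) + 1):
            dev = abs(pearson(x[s:e], y[s:e]) - target)
            cost = alpha * dev + (1 - alpha) * (1 - jaccard((s, e), initial))
            if cost < best_cost:
                best, best_cost = (s, e), cost
    return best, best_cost

# Toy series: perfectly correlated over indices 0..4, then diverging.
x = [1, 2, 3, 4, 5, 4, 3, 2]
y = [2, 4, 6, 8, 10, 1, 7, 3]
best, cost = best_interval(x, y, initial=(0, 6), target=1.0)
```

Here the user's initial query (0, 6) is refined to (0, 5), which drops the one point that breaks the target correlation while staying close to the initial interval. The exhaustive search evaluates a quadratic number of candidates, which is the cost that pruning techniques aim to avoid.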