12 research outputs found

    Experiments on the Use of Feature Selection and Machine Learning Methods in Automatic Malay Text Categorization

    Due to the rapid growth of documents in digital form, research on automatic text categorization into predefined categories has attracted booming interest. Although a wide range of supervised machine learning methods has been applied to categorize English text, relatively few studies have addressed Malay text categorization. This paper reports our comparative evaluation of three machine learning methods on Malay text categorization. Two feature selection methods (Information Gain (IG) and Chi-square) and three machine learning methods (k-Nearest Neighbor (k-NN), Naive Bayes (NB) and N-gram) were investigated. The three supervised machine learning models were evaluated on a categorized Malay corpus, and experimental results showed that k-NN with Chi-square feature selection gave the best performance (Macro-F1 = 96.14).
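
The abstract above names Chi-square feature selection without giving a formula. As a minimal sketch, assuming the standard textbook 2x2 term/category contingency form (the paper's exact formulation is not stated here), the score can be computed as follows. The function names, argument names, and sample terms are illustrative assumptions, not taken from the paper.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/category contingency table.

    n11: documents in the category that contain the term
    n10: documents outside the category that contain the term
    n01: documents in the category that lack the term
    n00: documents outside the category that lack the term
    """
    n = n11 + n10 + n01 + n00
    numerator = n * (n11 * n00 - n10 * n01) ** 2
    denominator = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    # A zero marginal means the term or category never varies: no signal.
    return numerator / denominator if denominator else 0.0


def select_top_k(term_counts, k):
    """Rank terms by chi-square score and keep the k highest.

    term_counts maps each term to its (n11, n10, n01, n00) tuple.
    """
    scored = {term: chi_square(*counts) for term, counts in term_counts.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical counts: "bola" occurs only inside the category,
    # "dan" is spread evenly and so carries no category signal.
    counts = {"bola": (10, 0, 0, 10), "dan": (5, 5, 5, 5)}
    print(select_top_k(counts, 1))
```

Terms that are distributed independently of the category score near zero and are dropped first, which is why a function word such as "dan" ranks below a topical word in this toy input.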

    Automatic Extraction Of Malay Compound Nouns Using A Hybrid Of Statistical And Machine Learning Methods

    Identifying compound nouns is important for a wide spectrum of applications in natural language processing, such as machine translation and information retrieval. Extraction of compound nouns requires deep or shallow syntactic preprocessing tools and large corpora. This paper investigates several methods for extracting noun compounds from Malay text corpora. First, we present the empirical results of sixteen statistical association measures for extracting Malay <N+N> compound nouns. Second, we introduce the possibility of integrating multiple association measures. Third, this work provides a standard dataset intended as a common platform for evaluating research on the identification of compound nouns in the Malay language. The standard dataset contains 7,235 unique N-N candidates, of which 2,970 are N-N compound noun collocations. The extraction algorithms are evaluated against this reference dataset. The experimental results demonstrate that a group of association measures (T-test, Piatetsky-Shapiro (PS), C_value, FGM and the rank combination method) are the best association measures and outperform the other association measures for <N+N> collocations in the Malay corpus. Finally, we describe several classification methods for combining the scores of the basic association measures, followed by their evaluation. Evaluation results show that classification algorithms significantly outperform individual association measures. The experimental results obtained are quite satisfactory in terms of precision, recall and F-score.
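
Of the association measures listed above, the T-test is the simplest to illustrate. The sketch below uses the common textbook formulation for bigram collocations, in which the sample variance is approximated by the observed bigram probability. This is an assumption, since the abstract does not say which variant the authors used. The function names and the sample counts are hypothetical.

```python
import math


def t_score(pair_count, w1_count, w2_count, total_bigrams):
    """t-test association score for a candidate bigram (w1, w2).

    Compares the observed bigram probability with the probability
    expected if w1 and w2 occurred independently of each other.
    """
    if pair_count == 0 or total_bigrams == 0:
        return 0.0
    observed = pair_count / total_bigrams
    expected = (w1_count / total_bigrams) * (w2_count / total_bigrams)
    # For rare events the variance is approximately the observed probability.
    return (observed - expected) / math.sqrt(observed / total_bigrams)


def rank_candidates(candidates, total_bigrams):
    """Sort (w1, w2) candidates by descending t-score.

    candidates maps (w1, w2) to (pair_count, w1_count, w2_count).
    """
    return sorted(
        candidates,
        key=lambda pair: t_score(*candidates[pair], total_bigrams),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical counts, not drawn from the paper's dataset.
    cands = {
        ("kereta", "api"): (8, 10, 10, ),
        ("rumah", "itu"): (2, 50, 200),
    }
    for pair in rank_candidates(cands, 1000):
        print(pair, round(t_score(*cands[pair], 1000), 3))
```

A ranked list like this is what gets compared against a reference set of known compounds to compute precision, recall and F-score at a chosen cutoff.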

    Effects of hospital facilities on patient outcomes after cancer surgery: an international, prospective, observational study

    Background: Early death after cancer surgery is higher in low-income and middle-income countries (LMICs) compared with in high-income countries, yet the impact of facility characteristics on early postoperative outcomes is unknown. The aim of this study was to examine the association between hospital infrastructure, resource availability, and processes on early outcomes after cancer surgery worldwide. Methods: A multimethods analysis was performed as part of the GlobalSurg 3 study, a multicentre, international, prospective cohort study of patients who had surgery for breast, colorectal, or gastric cancer. The primary outcomes were 30-day mortality and 30-day major complication rates. Potentially beneficial hospital facilities were identified by variable selection to select those associated with 30-day mortality. Adjusted outcomes were determined using generalised estimating equations to account for patient characteristics and country-income group, with population stratification by hospital. Findings: Between April 1, 2018, and April 23, 2019, facility-level data were collected for 9685 patients across 238 hospitals in 66 countries (91 hospitals in 20 high-income countries; 57 hospitals in 19 upper-middle-income countries; and 90 hospitals in 27 low-income to lower-middle-income countries). The availability of five hospital facilities was inversely associated with mortality: ultrasound, CT scanner, critical care unit, opioid analgesia, and oncologist. After adjustment for case-mix and country income group, hospitals with three or fewer of these facilities (62 hospitals, 1294 patients) had higher mortality compared with those with four or five (adjusted odds ratio [OR] 3.85 [95% CI 2.58-5.75]; p<0.0001), with excess mortality predominantly explained by a limited capacity to rescue following the development of major complications (63.0% vs 82.7%; OR 0.35 [0.23-0.53]; p<0.0001). Across LMICs, improvements in hospital facilities would prevent one to three deaths for every 100 patients undergoing surgery for cancer. Interpretation: Hospitals with higher levels of infrastructure and resources have better outcomes after cancer surgery, independent of country income. Without urgent strengthening of hospital infrastructure and resources, the reductions in cancer-associated mortality associated with improved access will not be realised.

    Arabic English Cross-Lingual Plagiarism Detection Based on Keyphrases Extraction, Monolingual and Machine Learning Approach

    Due to the rapid growth of research articles in various languages, the cross-lingual plagiarism detection problem has received increasing interest in recent years. Cross-lingual plagiarism detection is a more challenging task than monolingual plagiarism detection. This paper addresses the problem of cross-lingual plagiarism detection (CLPD) by proposing a method that combines keyphrase extraction, monolingual detection methods and a machine learning approach. The research methodology used in this study supported the design, development, and implementation of efficient Arabic-English cross-lingual plagiarism detection. This paper empirically evaluates five monolingual plagiarism detection methods, namely i) N-Grams Similarity, ii) Longest Common Subsequence, iii) Dice Coefficient, iv) Fingerprint-based Jaccard Similarity and v) Fingerprint-based Containment Similarity. In addition, three machine learning classifiers, namely i) naïve Bayes, ii) Support Vector Machine, and iii) linear logistic regression, are used for Arabic-English cross-language plagiarism detection. Several experiments are conducted to evaluate the performance of the keyphrase extraction methods. Further experiments investigate the performance of the machine learning techniques to find the best method for Arabic-English cross-language plagiarism detection. In these experiments, the highest result was obtained using the SVM classifier, with a 92% F-measure. In addition, all classifiers achieved their highest results when most of the monolingual plagiarism detection methods were used.
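
Three of the monolingual measures named above (Dice, Jaccard and containment) are set-overlap ratios. The sketch below computes them over plain word n-gram sets, which is a simplifying assumption: the paper applies Jaccard and containment to fingerprints, and the abstract does not describe the fingerprinting scheme. All function names and the sample sentences are illustrative.

```python
def word_ngrams(text, n=2):
    """Return the set of word n-grams (as tuples) in a lowercased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Shared n-grams divided by all distinct n-grams in either text."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    """Twice the shared n-grams divided by the sum of both set sizes."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def containment(a, b):
    """Fraction of the n-grams of a that also appear in b (asymmetric)."""
    return len(a & b) / len(a) if a else 0.0


if __name__ == "__main__":
    source = word_ngrams("the cat sat on the mat")
    suspect = word_ngrams("the cat sat")
    print(jaccard(source, suspect))
    print(dice(source, suspect))
    print(containment(suspect, source))
```

Containment is the only asymmetric measure of the three, which makes it useful when a short suspicious passage is checked against a much longer source. Scores like these could then serve as input features for a classifier such as the SVM mentioned in the abstract.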

    Global variation in postoperative mortality and complications after cancer surgery: a multicentre, prospective cohort study in 82 countries

    Background: 80% of individuals with cancer will require a surgical procedure, yet little comparative data exist on early outcomes in low-income and middle-income countries (LMICs). We compared postoperative outcomes in breast, colorectal, and gastric cancer surgery in hospitals worldwide, focusing on the effect of disease stage and complications on postoperative mortality. Methods: This was a multicentre, international prospective cohort study of consecutive adult patients undergoing surgery for primary breast, colorectal, or gastric cancer requiring a skin incision done under general or neuraxial anaesthesia. The primary outcome was death or major complication within 30 days of surgery. Multilevel logistic regression determined relationships within three-level nested models of patients within hospitals and countries. Hospital-level infrastructure effects were explored with three-way mediation analyses. This study was registered with ClinicalTrials.gov, NCT03471494. Findings: Between April 1, 2018, and Jan 31, 2019, we enrolled 15 958 patients from 428 hospitals in 82 countries (high income 9106 patients, 31 countries; upper-middle income 2721 patients, 23 countries; or lower-middle income 4131 patients, 28 countries). Patients in LMICs presented with more advanced disease compared with patients in high-income countries. 30-day mortality was higher for gastric cancer in low-income or lower-middle-income countries (adjusted odds ratio 3·72, 95% CI 1·70–8·16) and for colorectal cancer in low-income or lower-middle-income countries (4·59, 2·39–8·80) and upper-middle-income countries (2·06, 1·11–3·83). No difference in 30-day mortality was seen in breast cancer. The proportion of patients who died after a major complication was greatest in low-income or lower-middle-income countries (6·15, 3·26–11·59) and upper-middle-income countries (3·89, 2·08–7·29). 
Postoperative death after complications was partly explained by patient factors (60%) and partly by hospital or country (40%). The absence of consistently available postoperative care facilities was associated with seven to 10 more deaths per 100 major complications in LMICs. Cancer stage alone explained little of the early variation in mortality or postoperative complications. Interpretation: Higher levels of mortality after cancer surgery in LMICs was not fully explained by later presentation of disease. The capacity to rescue patients from surgical complications is a tangible opportunity for meaningful intervention. Early death after cancer surgery might be reduced by policies focusing on strengthening perioperative care systems to detect and intervene in common complications. Funding: National Institute for Health Research Global Health Research Unit

    Global variation in postoperative mortality and complications after cancer surgery: a multicentre, prospective cohort study in 82 countries

    © 2021 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY-NC-ND 4.0 license. Background: 80% of individuals with cancer will require a surgical procedure, yet little comparative data exist on early outcomes in low-income and middle-income countries (LMICs). We compared postoperative outcomes in breast, colorectal, and gastric cancer surgery in hospitals worldwide, focusing on the effect of disease stage and complications on postoperative mortality. Methods: This was a multicentre, international prospective cohort study of consecutive adult patients undergoing surgery for primary breast, colorectal, or gastric cancer requiring a skin incision done under general or neuraxial anaesthesia. The primary outcome was death or major complication within 30 days of surgery. Multilevel logistic regression determined relationships within three-level nested models of patients within hospitals and countries. Hospital-level infrastructure effects were explored with three-way mediation analyses. This study was registered with ClinicalTrials.gov, NCT03471494. Findings: Between April 1, 2018, and Jan 31, 2019, we enrolled 15 958 patients from 428 hospitals in 82 countries (high income 9106 patients, 31 countries; upper-middle income 2721 patients, 23 countries; or lower-middle income 4131 patients, 28 countries). Patients in LMICs presented with more advanced disease compared with patients in high-income countries. 30-day mortality was higher for gastric cancer in low-income or lower-middle-income countries (adjusted odds ratio 3·72, 95% CI 1·70–8·16) and for colorectal cancer in low-income or lower-middle-income countries (4·59, 2·39–8·80) and upper-middle-income countries (2·06, 1·11–3·83). No difference in 30-day mortality was seen in breast cancer. The proportion of patients who died after a major complication was greatest in low-income or lower-middle-income countries (6·15, 3·26–11·59) and upper-middle-income countries (3·89, 2·08–7·29). 
Postoperative death after complications was partly explained by patient factors (60%) and partly by hospital or country (40%). The absence of consistently available postoperative care facilities was associated with seven to 10 more deaths per 100 major complications in LMICs. Cancer stage alone explained little of the early variation in mortality or postoperative complications. Interpretation: Higher levels of mortality after cancer surgery in LMICs was not fully explained by later presentation of disease. The capacity to rescue patients from surgical complications is a tangible opportunity for meaningful intervention. Early death after cancer surgery might be reduced by policies focusing on strengthening perioperative care systems to detect and intervene in common complications. Funding: National Institute for Health Research Global Health Research Unit

    Effects of hospital facilities on patient outcomes after cancer surgery: an international, prospective, observational study

    © 2022 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. Background: Early death after cancer surgery is higher in low-income and middle-income countries (LMICs) compared with in high-income countries, yet the impact of facility characteristics on early postoperative outcomes is unknown. The aim of this study was to examine the association between hospital infrastructure, resource availability, and processes on early outcomes after cancer surgery worldwide. Methods: A multimethods analysis was performed as part of the GlobalSurg 3 study, a multicentre, international, prospective cohort study of patients who had surgery for breast, colorectal, or gastric cancer. The primary outcomes were 30-day mortality and 30-day major complication rates. Potentially beneficial hospital facilities were identified by variable selection to select those associated with 30-day mortality. Adjusted outcomes were determined using generalised estimating equations to account for patient characteristics and country-income group, with population stratification by hospital. Findings: Between April 1, 2018, and April 23, 2019, facility-level data were collected for 9685 patients across 238 hospitals in 66 countries (91 hospitals in 20 high-income countries; 57 hospitals in 19 upper-middle-income countries; and 90 hospitals in 27 low-income to lower-middle-income countries). The availability of five hospital facilities was inversely associated with mortality: ultrasound, CT scanner, critical care unit, opioid analgesia, and oncologist. 
After adjustment for case-mix and country income group, hospitals with three or fewer of these facilities (62 hospitals, 1294 patients) had higher mortality compared with those with four or five (adjusted odds ratio [OR] 3·85 [95% CI 2·58–5·75]; p<0·0001), with excess mortality predominantly explained by a limited capacity to rescue following the development of major complications (63·0% vs 82·7%; OR 0·35 [0·23–0·53]; p<0·0001). Across LMICs, improvements in hospital facilities would prevent one to three deaths for every 100 patients undergoing surgery for cancer. Interpretation: Hospitals with higher levels of infrastructure and resources have better outcomes after cancer surgery, independent of country income. Without urgent strengthening of hospital infrastructure and resources, the reductions in cancer-associated mortality associated with improved access will not be realised. Funding: National Institute for Health and Care Research