112 research outputs found

    Diagnosa Prediksi Penyakit Thypoid Fever Menggunakan Data Mining Dengan Metode Algoritma Naive Bayes Classifier

    Get PDF
    Wabah penyakit Typhoid Fever di Indonesia memang tengah memuncak, penyakit tersebut disebabkan oleh kuman Salmonella Typosa dan menyebar ke manusia melalui makanan dan minuman yang sudah terkotaminasi. Berdasarkan pada data tahun 2018 awal di RS Budi Asih didapatkan bahwa typhoid fever memasuki 3 besar penyakit yang banyak terjadi selama tahun 2018. Seiring dengan banyaknya pasien kasus typhoid fever akan memungkinkan data dengan jumlah skala yang sangat besar dapat terakumulasi, dengan memanfaatkan data tersebut penulis ingin menerapkan salah satu teknik data mining dengan perhitungan statiska dalam melakukan diagnosis penyakit typhoid fever. Metode yang digunakan adalah Naive Bayes dengan menggunakan sebanyak 250 data pasien kasus typhoid fever. Diagnosa Prediksi Penyakit Typhoid Fever menggunakan metode Naive Bayes merupakan aplikasi juga bertujuan membantu masyarakat dalam mendiagnosis penyakit typhoid fever secara dini. Hasil analisis menunjukkan bahwa gejala demam, mual muntah, pusing, batuk, diare, bradikardi bisa menjadi indikator untuk mendiagnosis penyakit typhoid fever. Hasil analisis juga menunjukkan bahwa ketepatan klasifikasi pasien kasus typhoid fever menggunakan metode naive bayes pada penelitian ini adalah sebesar 92%

    An automated, broad-based, near real-time public health surveillance system using presentations to hospital Emergency Departments in New South Wales, Australia

    Get PDF
    BACKGROUND: In a climate of concern over bioterrorism threats and emergent diseases, public health authorities are trialling more timely surveillance systems. The 2003 Rugby World Cup (RWC) provided an opportunity to test the viability of a near real-time syndromic surveillance system in metropolitan Sydney, Australia. We describe the development and early results of this largely automated system that used data routinely collected in Emergency Departments (EDs). METHODS: Twelve of 49 EDs in the Sydney metropolitan area automatically transmitted surveillance data from their existing information systems to a central database in near real-time. Information captured for each ED visit included patient demographic details, presenting problem and nursing assessment entered as free-text at triage time, physician-assigned provisional diagnosis codes, and status at departure from the ED. Both diagnoses from the EDs and triage text were used to assign syndrome categories. The text information was automatically classified into one or more of 26 syndrome categories using automated "naïve Bayes" text categorisation techniques. Automated processes were used to analyse both diagnosis and free text-based syndrome data and to produce web-based statistical summaries for daily review. An adjusted cumulative sum (cusum) was used to assess the statistical significance of trends. RESULTS: During the RWC the system did not identify any major public health threats associated with the tournament, mass gatherings or the influx of visitors. This was consistent with evidence from other sources, although two known outbreaks were already in progress before the tournament. Limited baseline in early monitoring prevented the system from automatically identifying these ongoing outbreaks. Data capture was invisible to clinical staff in EDs and did not add to their workload. CONCLUSION: We have demonstrated the feasibility and potential utility of syndromic surveillance using routinely collected data from ED information systems. Key features of our system are its nil impact on clinical staff, and its use of statistical methods to assign syndrome categories based on clinical free text information. The system is ongoing, and has expanded to cover 30 EDs. Results of formal evaluations of both the technical efficiency and the public health impacts of the system will be described subsequently

    Disaster and Pandemic Management Using Machine Learning: A Survey

    Get PDF
    This article provides a literature review of state-of-the-art machine learning (ML) algorithms for disaster and pandemic management. Most nations are concerned about disasters and pandemics, which, in general, are highly unlikely events. To date, various technologies, such as IoT, object sensing, UAV, 5G, and cellular networks, smartphone-based system, and satellite-based systems have been used for disaster and pandemic management. ML algorithms can handle multidimensional, large volumes of data that occur naturally in environments related to disaster and pandemic management and are particularly well suited for important related tasks, such as recognition and classification. ML algorithms are useful for predicting disasters and assisting in disaster management tasks, such as determining crowd evacuation routes, analyzing social media posts, and handling the post-disaster situation. ML algorithms also find great application in pandemic management scenarios, such as predicting pandemics, monitoring pandemic spread, disease diagnosis, etc. This article first presents a tutorial on ML algorithms. It then presents a detailed review of several ML algorithms and how we can combine these algorithms with other technologies to address disaster and pandemic management. It also discusses various challenges, open issues and, directions for future research

    New perspectives and methods for stream learning in the presence of concept drift.

    Get PDF
    153 p.Applications that generate data in the form of fast streams from non-stationary environments, that is,those where the underlying phenomena change over time, are becoming increasingly prevalent. In thiskind of environments the probability density function of the data-generating process may change overtime, producing a drift. This causes that predictive models trained over these stream data become obsoleteand do not adapt suitably to the new distribution. Specially in online learning scenarios, there is apressing need for new algorithms that adapt to this change as fast as possible, while maintaining goodperformance scores. Examples of these applications include making inferences or predictions based onfinancial data, energy demand and climate data analysis, web usage or sensor network monitoring, andmalware/spam detection, among many others.Online learning and concept drift are two of the most hot topics in the recent literature due to theirrelevance for the so-called Big Data paradigm, where nowadays we can find an increasing number ofapplications based on training data continuously available, named as data streams. Thus, learning in nonstationaryenvironments requires adaptive or evolving approaches that can monitor and track theunderlying changes, and adapt a model to accommodate those changes accordingly. In this effort, Iprovide in this thesis a comprehensive state-of-the-art approaches as well as I identify the most relevantopen challenges in the literature, while focusing on addressing three of them by providing innovativeperspectives and methods.This thesis provides with a complete overview of several related fields, and tackles several openchallenges that have been identified in the very recent state of the art. Concretely, it presents aninnovative way to generate artificial diversity in ensembles, a set of necessary adaptations andimprovements for spiking neural networks in order to be used in online learning scenarios, and finally, adrift detector based on this former algorithm. All of these approaches together constitute an innovativework aimed at presenting new perspectives and methods for the field

    New perspectives and methods for stream learning in the presence of concept drift.

    Get PDF
    153 p.Applications that generate data in the form of fast streams from non-stationary environments, that is,those where the underlying phenomena change over time, are becoming increasingly prevalent. In thiskind of environments the probability density function of the data-generating process may change overtime, producing a drift. This causes that predictive models trained over these stream data become obsoleteand do not adapt suitably to the new distribution. Specially in online learning scenarios, there is apressing need for new algorithms that adapt to this change as fast as possible, while maintaining goodperformance scores. Examples of these applications include making inferences or predictions based onfinancial data, energy demand and climate data analysis, web usage or sensor network monitoring, andmalware/spam detection, among many others.Online learning and concept drift are two of the most hot topics in the recent literature due to theirrelevance for the so-called Big Data paradigm, where nowadays we can find an increasing number ofapplications based on training data continuously available, named as data streams. Thus, learning in nonstationaryenvironments requires adaptive or evolving approaches that can monitor and track theunderlying changes, and adapt a model to accommodate those changes accordingly. In this effort, Iprovide in this thesis a comprehensive state-of-the-art approaches as well as I identify the most relevantopen challenges in the literature, while focusing on addressing three of them by providing innovativeperspectives and methods.This thesis provides with a complete overview of several related fields, and tackles several openchallenges that have been identified in the very recent state of the art. Concretely, it presents aninnovative way to generate artificial diversity in ensembles, a set of necessary adaptations andimprovements for spiking neural networks in order to be used in online learning scenarios, and finally, adrift detector based on this former algorithm. All of these approaches together constitute an innovativework aimed at presenting new perspectives and methods for the field

    New Fundamental Technologies in Data Mining

    Get PDF
    The progress of data mining technology and large public popularity establish a need for a comprehensive text on the subject. The series of books entitled by "Data Mining" address the need by presenting in-depth description of novel mining algorithms and many useful applications. In addition to understanding each section deeply, the two books present useful hints and strategies to solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence will lead to significant development in the field of data mining

    Predictive analytics framework for electronic health records with machine learning advancements : optimising hospital resources utilisation with predictive and epidemiological models

    Get PDF
    The primary aim of this thesis was to investigate the feasibility and robustness of predictive machine-learning models in the context of improving hospital resources’ utilisation with data- driven approaches and predicting hospitalisation with hospital quality assessment metrics such as length of stay. The length of stay predictions includes the validity of the proposed methodological predictive framework on each hospital’s electronic health records data source. In this thesis, we relied on electronic health records (EHRs) to drive a data-driven predictive inpatient length of stay (LOS) research framework that suits the most demanding hospital facilities for hospital resources’ utilisation context. The thesis focused on the viability of the methodological predictive length of stay approaches on dynamic and demanding healthcare facilities and hospital settings such as the intensive care units and the emergency departments. While the hospital length of stay predictions are (internal) healthcare inpatients outcomes assessment at the time of admission to discharge, the thesis also considered (external) factors outside hospital control, such as forecasting future hospitalisations from the spread of infectious communicable disease during pandemics. The internal and external splits are the thesis’ main contributions. Therefore, the thesis evaluated the public health measures during events of uncertainty (e.g. pandemics) and measured the effect of non-pharmaceutical intervention during outbreaks on future hospitalised cases. This approach is the first contribution in the literature to examine the epidemiological curves’ effect using simulation models to project the future hospitalisations on their strong potential to impact hospital beds’ availability and stress hospital workflow and workers, to the best of our knowledge. The main research commonalities between chapters are the usefulness of ensembles learning models in the context of LOS for hospital resources utilisation. The ensembles learning models anticipate better predictive performance by combining several base models to produce an optimal predictive model. These predictive models explored the internal LOS for various chronic and acute conditions using data-driven approaches to determine the most accurate and powerful predicted outcomes. This eventually helps to achieve desired outcomes for hospital professionals who are working in hospital settings

    Implementing decision tree-based algorithms in medical diagnostic decision support systems

    Get PDF
    As a branch of healthcare, medical diagnosis can be defined as finding the disease based on the signs and symptoms of the patient. To this end, the required information is gathered from different sources like physical examination, medical history and general information of the patient. Development of smart classification models for medical diagnosis is of great interest amongst the researchers. This is mainly owing to the fact that the machine learning and data mining algorithms are capable of detecting the hidden trends between features of a database. Hence, classifying the medical datasets using smart techniques paves the way to design more efficient medical diagnostic decision support systems. Several databases have been provided in the literature to investigate different aspects of diseases. As an alternative to the available diagnosis tools/methods, this research involves machine learning algorithms called Classification and Regression Tree (CART), Random Forest (RF) and Extremely Randomized Trees or Extra Trees (ET) for the development of classification models that can be implemented in computer-aided diagnosis systems. As a decision tree (DT), CART is fast to create, and it applies to both the quantitative and qualitative data. For classification problems, RF and ET employ a number of weak learners like CART to develop models for classification tasks. We employed Wisconsin Breast Cancer Database (WBCD), Z-Alizadeh Sani dataset for coronary artery disease (CAD) and the databanks gathered in Ghaem Hospital’s dermatology clinic for the response of patients having common and/or plantar warts to the cryotherapy and/or immunotherapy methods. To classify the breast cancer type based on the WBCD, the RF and ET methods were employed. It was found that the developed RF and ET models forecast the WBCD type with 100% accuracy in all cases. To choose the proper treatment approach for warts as well as the CAD diagnosis, the CART methodology was employed. The findings of the error analysis revealed that the proposed CART models for the applications of interest attain the highest precision and no literature model can rival it. The outcome of this study supports the idea that methods like CART, RF and ET not only improve the diagnosis precision, but also reduce the time and expense needed to reach a diagnosis. However, since these strategies are highly sensitive to the quality and quantity of the introduced data, more extensive databases with a greater number of independent parameters might be required for further practical implications of the developed models

    Complexity Reduction in Image-Based Breast Cancer Care

    Get PDF
    The diversity of malignancies of the breast requires personalized diagnostic and therapeutic decision making in a complex situation. This thesis contributes in three clinical areas: (1) For clinical diagnostic image evaluation, computer-aided detection and diagnosis of mass and non-mass lesions in breast MRI is developed. 4D texture features characterize mass lesions. For non-mass lesions, a combined detection/characterisation method utilizes the bilateral symmetry of the breast s contrast agent uptake. (2) To improve clinical workflows, a breast MRI reading paradigm is proposed, exemplified by a breast MRI reading workstation prototype. Instead of mouse and keyboard, it is operated using multi-touch gestures. The concept is extended to mammography screening, introducing efficient navigation aids. (3) Contributions to finite element modeling of breast tissue deformations tackle two clinical problems: surgery planning and the prediction of the breast deformation in a MRI biopsy device

    CLADAG 2021 BOOK OF ABSTRACTS AND SHORT PAPERS

    Get PDF
    The book collects the short papers presented at the 13th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS). The meeting has been organized by the Department of Statistics, Computer Science and Applications of the University of Florence, under the auspices of the Italian Statistical Society and the International Federation of Classification Societies (IFCS). CLADAG is a member of the IFCS, a federation of national, regional, and linguistically-based classification societies. It is a non-profit, non-political scientific organization, whose aims are to further classification research
    • …
    corecore