1,110 research outputs found

    Cluster Analysis on Dengue Incidence and Weather Data Using K-Medoids and Fuzzy C-Means Clustering Algorithms (Case Study: Spread of Dengue in the DKI Jakarta Province)

    In Indonesia, Dengue incidence tends to increase every year but has been fluctuating in recent years. The potential for Dengue outbreaks in DKI Jakarta, the capital city, deserves serious attention. Weather factors are suspected of being associated with the incidence of Dengue in Indonesia. This research used weather and Dengue incidence data for five regions of DKI Jakarta, Indonesia, from December 30, 2008, to January 2, 2017. The study used a clustering approach on time-series and non-time-series data using K-Medoids and Fuzzy C-Means Clustering. The clustering results for the non-time-series data showed a positive correlation between the number of Dengue incidents and both average relative humidity and amount of rainfall. However, Dengue incidence and average temperature were negatively correlated. Moreover, the clustering implementation on the time-series data showed that rainfall patterns most closely resembled those of Dengue incidence. Therefore, rainfall can be used to estimate Dengue incidence. Both results suggest that the government could use weather data to predict possible spikes in DHF incidence, especially at the start of the rainy season, and alert the public to the greater probability of a Dengue outbreak.
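As a rough illustration of the K-Medoids step named in this abstract, the following is a minimal sketch in plain Python. The toy records, the function names (`euclidean`, `assign`, `k_medoids`), the use of Euclidean distance, and the simple alternating update are all assumptions made for illustration; none of them is taken from the paper, and the Fuzzy C-Means part is not covered.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, medoids):
    """Label each point with the position of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: euclidean(p, points[medoids[c]]))
        for p in points
    ]


def k_medoids(points, initial_medoids, max_iter=100):
    """Alternate between assigning points and re-picking each cluster's medoid.

    `initial_medoids` holds indices into `points`. Returns (medoids, labels).
    """
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if its cluster happens to be empty.
                new_medoids.append(old)
                continue
            # The new medoid is the member with the smallest total distance
            # to every other member of the same cluster.
            best = min(
                members,
                key=lambda i: sum(euclidean(points[i], points[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


# Hypothetical toy records: (weekly rainfall in mm, weekly case count).
records = [(10, 2), (12, 3), (11, 1), (80, 20), (85, 22), (90, 25)]
medoids, labels = k_medoids(records, initial_medoids=[0, 1])
print(medoids, labels)
```

Unlike K-Means, each cluster centre here is an actual observation, which makes the result less sensitive to outlying records.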

    Feature selection algorithms for Malaysian dengue outbreak detection model

    Dengue fever is considered one of the most common mosquito-borne diseases worldwide. Dengue outbreak detection can be very useful in practical efforts to overcome the rapid spread of the disease by providing the knowledge to predict the next outbreak occurrence. Many studies have been conducted to model and predict dengue outbreaks using different data mining techniques. This research aimed to identify the best features that lead to better predictive accuracy of dengue outbreaks using three different feature selection algorithms: particle swarm optimization (PSO), genetic algorithm (GA) and rank search (RS). Based on the selected features, three predictive modeling techniques (J48, DTNB and Naive Bayes) were applied for dengue outbreak detection. The dataset used in this research was obtained from the Public Health Department, Seremban, Negeri Sembilan, Malaysia. The experimental results showed that predictive accuracy improved when a feature selection process was applied before the predictive modeling process. The study also identified the set of features that represent dengue outbreak detection for Malaysian health agencies.

    Early Diagnosis for Dengue Disease Prediction Using Efficient Machine Learning Techniques Based on Clinical Data

    Dengue fever is a worldwide issue, especially in Yemen. Although early detection is critical to reducing dengue disease deaths, accurate dengue diagnosis takes a long time because of the numerous clinical examinations required. Thus, this issue necessitates the development of a new diagnostic schema. The objective of this work is to develop a diagnostic model for the earlier diagnosis of dengue disease using Efficient Machine Learning Techniques (EMLT). This paper proposes prediction models for dengue disease based on EMLT. Five machine learning models are used: K-Nearest Neighbor (KNN), Gradient Boosting Classifier (GBC), Extra Tree Classifier (ETC), eXtreme Gradient Boosting (XGB), and Light Gradient Boosting Machine (LightGBM). All classifiers are trained and tested on the dataset using 10-Fold Cross-Validation and Holdout Cross-Validation approaches. On a test set, all models were evaluated using different metrics: accuracy, F1-score, recall, precision, AUC, and operating time. Based on the findings, the ETC model achieved the highest accuracy in holdout and 10-fold cross-validation, with 99.12% and 99.03%, respectively. In the holdout cross-validation approach, we conclude that the best classifier with high accuracy is ETC, which achieved 99.12%. Finally, the experimental results indicate that classifier performance in holdout cross-validation outperforms 10-fold cross-validation. Accordingly, the proposed dengue prediction system demonstrates its efficacy and effectiveness in assisting doctors in accurately predicting dengue disease.
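The difference between the two evaluation schemes mentioned in this abstract can be sketched with index splitting alone. The snippet below is a minimal illustration in plain Python; the function names (`holdout_split`, `k_fold_splits`), the 20% test fraction, the fixed seed, and the sample count are assumptions for illustration and do not reflect the paper's actual setup or data.

```python
import random


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) for a single shuffled holdout split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Hypothetical dataset size, used only to show the shapes of the splits.
train, test = holdout_split(25)
print(len(train), len(test))
for train, test in k_fold_splits(25, k=10):
    print(len(train), len(test))
```

A holdout score comes from one random partition, whereas a k-fold score averages over k partitions, so a small gap between the two, such as the one reported above, can reflect split-to-split variation as much as a real difference in model quality.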

    HYBRID K MEANS-MULTIVARIATE ADAPTIVE REGRESSION SPLINES FOR DISTRIBUTION OF DENGUE FEVER RISK MAPPING IN BOJONEGORO DISTRICT

    Dengue Hemorrhagic Fever (DHF) is a dangerous disease transmitted by the bites of Aedes aegypti and Aedes albopictus mosquitoes. WHO data show that almost half of the world's population is exposed to Dengue Hemorrhagic Fever. Dengue causes around 20,000 deaths every year. In East Java, Bojonegoro District has the highest number of dengue hemorrhagic fever cases (416). To reduce this number, the causative factors need to be known. Additionally, it is important to pinpoint the region or cluster where the variables driving the spread are located so that prevention and treatment efforts are effective. Based on the elements contributing to the transmission of Dengue Hemorrhagic Fever, this study seeks to identify and categorize locations at risk for the spread of the illness. This study uses Hybrid K Means-Multivariate Adaptive Regression Splines (MARS), a combination of the K-Means and MARS methods, in the hope of providing better analytical results. This is because the data was divided into simpler parts by considering the Oakley distance. The results obtained from the K Means-MARS hybrid show the relationship between response variables and predictor variables for each cluster. There are three clusters of risk for the spread of dengue hemorrhagic fever in Bojonegoro District: a high risk cluster, a medium risk cluster and a low risk cluster. The high risk cluster consists of 7 sub-districts (Baureno, Kepohbaru, Balen, Sumberrejo, Kedungadem, Bojonegoro and Dander). The variables affecting the number of DHF sufferers in the high risk cluster were Population Density (X2), Altitude (X3) and Health Worker (X6). Meanwhile, the medium risk cluster consists of 10 sub-districts (Kalitidu, Kanor, Kapas, Ngasem, Ngraho, Padangan, Sugihwaras, Sukosewu, Tambakrejo, and Trucuk). The variables affecting the number of DHF sufferers in the medium risk cluster were Number of Dead (X1), Population Density (X2) and Health Facility (X5). The low risk cluster consists of 11 sub-districts (Bubulan, Gayam, Gondang, Kasiman, Kedewan, Malo, Margomulyo, Ngambon, Purwosari, Sekar, and Temayang). The variables affecting the number of DHF sufferers in the low risk cluster were Number of Dead (X1) and Population Density (X2).

    Infectious Disease Ontology

    Technological developments have resulted in tremendous increases in the volume and diversity of the data and information that must be processed in the course of biomedical and clinical research and practice. Researchers are at the same time under ever greater pressure to share data and to take steps to ensure that data resources are interoperable. The use of ontologies to annotate data has proven successful in supporting these goals and in providing new possibilities for the automated processing of data and information. In this chapter, we describe different types of vocabulary resources and emphasize those features of formal ontologies that make them most useful for computational applications. We describe current uses of ontologies and discuss future goals for ontology-based computing, focusing on its use in the field of infectious diseases. We review the largest and most widely used vocabulary resources relevant to the study of infectious diseases and conclude with a description of the Infectious Disease Ontology (IDO) suite of interoperable ontology modules that together cover the entire infectious disease domain.

    A Multi-Stage Machine Learning Approach to Predict Dengue Incidence: A Case Study in Mexico

    The mosquito-borne dengue fever is a major public health problem in tropical countries, where it is strongly conditioned by climate factors such as temperature. In this paper, we formulate a holistic machine learning strategy to analyze the temporal dynamics of temperature and dengue data and use this knowledge to produce accurate predictions of dengue, based on temperature on an annual scale. The temporal dynamics are extracted from historical data by utilizing a novel multi-stage combination of auto-encoding, window-based data representation and trend-based temporal clustering. The prediction is performed with a trend association-based nearest neighbour predictor. The effectiveness of the proposed strategy is evaluated in a case study that comprises the number of dengue and dengue hemorrhagic fever cases collected over the period 1985-2010 in 32 federal states of Mexico. The empirical study proves the viability of the proposed strategy and confirms that it outperforms various state-of-the-art competitor methods formulated both in regression and in time series forecasting analysis.
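Two of the ingredients named in this abstract, window-based data representation and nearest-neighbour prediction, can be illustrated with a much simpler sketch than the paper's pipeline. The code below omits the auto-encoding and trend-based temporal clustering stages entirely and uses plain squared Euclidean distance; the function names (`sliding_windows`, `nearest_neighbour_forecast`) and the toy series are hypothetical.

```python
def sliding_windows(series, width):
    """Return (window, next_value) pairs taken from a 1-D sequence."""
    return [
        (tuple(series[i:i + width]), series[i + width])
        for i in range(len(series) - width)
    ]


def nearest_neighbour_forecast(history, query):
    """Predict the value that followed the historical window closest to `query`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, next_value = min(history, key=lambda pair: sq_dist(pair[0], query))
    return next_value


# Hypothetical toy series standing in for an annual indicator.
series = [1, 2, 3, 4, 5, 6]
history = sliding_windows(series, width=2)
print(nearest_neighbour_forecast(history, query=(2.9, 4.2)))
```

Each window acts as a fixed-length feature vector, so any distance-based predictor can be applied to what was originally a time series.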

    Automatic domain-specific learning: towards a methodology for ontology enrichment

    At the current rate of technological development, in a world where enormous amounts of data are constantly created and in which the Internet is used as the primary means for information exchange, there exists a need for tools that help process, analyze and use that information. However, while the growth of information offers many opportunities for social and scientific advance, it has also highlighted the difficulties of extracting meaningful patterns from massive data. Ontologies have been claimed to play a major role in the processing of large-scale data, as they serve as universal models of knowledge representation, and are being studied as possible solutions to this. This paper presents a method for the automatic expansion of ontologies based on corpus and terminological data exploitation. The proposed "ontology enrichment method" (OEM) consists of a sequence of tasks aimed at classifying an input keyword automatically under its corresponding node within a target ontology. Results prove that the method can be successfully applied for the automatic classification of specialized units into a reference ontology. Financial support for this research has been provided by the DGI, Spanish Ministry of Education and Science, grant FFI2011-29798-C0201. Ureña Gómez-Moreno, P.; Mestre-Mestre, EM. (2017). Automatic domain-specific learning: towards a methodology for ontology enrichment. LFE. Revista de Lenguas para Fines Específicos. 23(2):63-85. http://hdl.handle.net/10251/148357