10 research outputs found

    Clustering of Regencies/Cities in Central Java in 2022 Based on the Number of Disease Occurrence Cases Using the K-Means Algorithm

    This research clusters the regencies and cities of Central Java Province by the number of occurrences of specific diseases in 2022, using the K-Means algorithm. The analysis yields three clusters (high, medium, and low) covering 29 regencies and 6 cities. Cluster 1 holds 34.29% of the regions (10 regencies and 2 cities), cluster 2 holds 40.00% (11 regencies and 3 cities), and cluster 3 holds 25.71% (8 regencies and 1 city). These clustering results can serve as a basis for strategic decisions on disease prevention and control efforts in each region.
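The cluster shares quoted above follow directly from the reported region counts. A minimal sketch of that arithmetic, assuming only the counts given in the abstract (the names `clusters` and `cluster_shares` are illustrative, not from the paper):

```python
# Region counts per cluster as reported in the abstract: (regencies, cities).
clusters = {
    "cluster 1": (10, 2),
    "cluster 2": (11, 3),
    "cluster 3": (8, 1),
}


def cluster_shares(counts):
    """Return each cluster's share of all regions in percent, rounded to 2 places."""
    total = sum(r + c for r, c in counts.values())  # 29 regencies + 6 cities = 35
    return {name: round(100 * (r + c) / total, 2) for name, (r, c) in counts.items()}


shares = cluster_shares(clusters)
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

The three shares sum to 100% over the 35 regions, which matches the 29 regencies and 6 cities stated in the abstract.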

    30th Anniversary of Applied Intelligence: A combination of bibliometrics and thematic analysis using SciMAT

    Applied Intelligence is one of the most important international scientific journals in the field of artificial intelligence. Since 1991, Applied Intelligence has supported research advances in new and innovative intelligent systems, methodologies, and their applications in solving complex real-life problems. The journal hosts more than 2,400 publications, which have received around 31,800 citations. Moreover, Applied Intelligence is recognized by the industrial, academic, and scientific communities as a source of the latest innovative and advanced solutions in intelligent manufacturing, privacy-preserving systems, risk analysis, knowledge-based management, modern techniques to improve healthcare systems, methods to assist government, and industrial problems that are too complex to be solved through conventional approaches. Because Applied Intelligence celebrates its 30th anniversary in 2021, it is appropriate to analyze its bibliometric performance, conceptual structure, and thematic evolution. To that end, this paper conducts a bibliometric performance and conceptual structure analysis of Applied Intelligence from 1991 to 2020 using SciMAT. First, the performance of the journal is analyzed using data retrieved from Scopus, focusing on the productivity of the authors, citations, countries, organizations, funding agencies, and most relevant publications. Second, the conceptual structure of the journal is analyzed with the bibliometric software tool SciMAT, identifying the main thematic areas that have been the object of research and their composition, relationship, and evolution during the period analyzed.

    A Review of Research Methodologies Employed in Serendipity Studies in the Context of Information Research

    Background: Serendipity has attracted growing research interest in recent years. However, serendipitous encounters are subjective and rare in a real-world context, making this an extremely challenging subject to study. Methods: Various methods have been proposed to enable researchers to understand and measure serendipity, but there is no broad consensus on which methods to use in different experimental settings. A comprehensive literature review was first conducted to summarize the research methods employed to study serendipity. It was followed by a series of interviews with experts, who specified the relative strengths and weaknesses of each method identified in the literature review, as well as the challenges usually confronted in serendipity research. Results: The findings suggest using mixed research methods to produce a more complete picture of serendipity and to help verify research findings. Several challenges and implications relating to empirical studies of serendipity have been derived from this study. Conclusions: This paper investigated research methods employed to study serendipity by synthesizing findings from a literature review and interviews with experts. It provides a methodological contribution to serendipity studies by systematically summarizing the methods employed in studies of serendipity and identifying the strengths and weaknesses of each method. It also suggests the novel approach of using mixed research methods to study serendipity. This study has potential limitations related to the small number of experts interviewed. However, the topic is a relatively focused area, and it was observed during the interviews that new data no longer contributed to the findings because the comments had become repetitive.

    A Hybrid Chimp Optimization Algorithm and Generalized Normal Distribution Algorithm with Opposition-Based Learning Strategy for Solving Data Clustering Problems

    This paper is concerned with data clustering to separate clusters based on the connectivity principle for categorizing similar and dissimilar data into different groups. Although classical clustering algorithms such as K-means are efficient techniques, they often become trapped in local optima and converge slowly on high-dimensional problems. To address these issues, many successful meta-heuristic optimization algorithms and intelligence-based methods have been introduced to attain the optimal solution in a reasonable time. They are designed to escape from local optima by allowing flexible movements or random behaviors. In this study, we attempt to conceptualize a powerful approach using three main components: the Chimp Optimization Algorithm (ChOA), the Generalized Normal Distribution Algorithm (GNDA), and the Opposition-Based Learning (OBL) method. Firstly, two versions of ChOA with two different independent groups' strategies and seven chaotic maps, entitled ChOA(I) and ChOA(II), are presented to achieve the best possible result for data clustering purposes. Secondly, a novel combination of the ChOA and GNDA algorithms with the OBL strategy is devised to solve the major shortcomings of the original algorithms. Lastly, the proposed ChOAGNDA method is a Selective Opposition (SO) algorithm based on ChOA and GNDA, which can be used to tackle large and complex real-world optimization problems, particularly data clustering applications. The results are evaluated against seven popular meta-heuristic optimization algorithms and eight recent state-of-the-art clustering techniques. Experimental results illustrate that the proposed work significantly outperforms other existing methods in minimizing the Sum of Intra-Cluster Distances (SICD), obtaining the lowest Error Rate (ER), accelerating the convergence speed, and finding the optimal cluster centers. Comment: 48 pages, 14 tables, 12 figures.
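Two ingredients named in this abstract can be illustrated in a few lines: the SICD objective and the basic opposition step of OBL. The sketch below assumes Euclidean distance, a flat vector encoding of the cluster centers, and box bounds on each coordinate. It is not the ChOAGNDA method, whose update rules and selective-opposition scheme the abstract does not specify. All function names are hypothetical.

```python
import math


def sicd(points, centers):
    """Sum of intra-cluster distances: each point adds its Euclidean
    distance to its nearest center (the center of its assigned cluster)."""
    return sum(min(math.dist(p, c) for c in centers) for p in points)


def opposite(solution, lower, upper):
    """Basic opposition-based learning: mirror each coordinate inside its bounds."""
    return [lo + hi - x for x, lo, hi in zip(solution, lower, upper)]


def unflatten(vector, dim):
    """Split a flat candidate vector into k centers of `dim` coordinates each."""
    return [tuple(vector[i:i + dim]) for i in range(0, len(vector), dim)]


def keep_better(candidate, points, lower, upper, dim):
    """Return whichever of the candidate or its opposite has the lower SICD."""
    opp = opposite(candidate, lower, upper)
    return min(candidate, opp, key=lambda v: sicd(points, unflatten(v, dim)))


# Synthetic toy data, for illustration only.
points = [(0, 0), (0, 1), (10, 10)]
candidate = [9, 9, 8, 8]           # two 2-D centers: (9, 9) and (8, 8)
lower, upper = [0] * 4, [10] * 4   # assumed search bounds
best = keep_better(candidate, points, lower, upper, dim=2)
```

Here the opposite candidate, with centers (1, 1) and (2, 2), lies closer to two of the three points, so it has the lower SICD and is the one kept.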

    Document clustering with optimized unsupervised feature selection and centroid allocation

    An effective document clustering system can significantly improve the tasks of document analysis, grouping, and retrieval. The performance of a document clustering system mainly depends on document preparation and the allocation of cluster positions. As achieving optimal document clustering is a combinatorial NP-hard optimization problem, it becomes essential to utilize non-traditional methods to look for optimal or near-optimal solutions. During the allocation of cluster positions, or the centroid allocation process, the extra text features that represent keywords in each document affect the clustering results. A large number of features need to be reduced using dimensionality reduction techniques. Feature selection is an important step that can be used to remove redundant and inconsistent features. Due to the large number of potential feature combinations, text feature selection is considered a complicated process. The persistent drawbacks of current text feature selection methods, such as local optima and the absence of class labels for features, were addressed in this thesis. Both supervised and unsupervised feature selection methods were investigated. To optimize supervised feature selection so as to improve document clustering, a memetic hybridization between filter and wrapper feature selection, known as Memetic Algorithm Feature Selection, was presented first. To deal with unlabelled features, an unsupervised feature selection method was also proposed. The proposed unsupervised feature selection method integrates Simulated Annealing into the global search using Differential Evolution. This combination also aims to combine the advantages of both the wrapper and filter methods in a memetic scheme, but on an unsupervised basis. Two versions of this hybridization were proposed. The first was named Differential Evolution Simulated Annealing, which uses the standard mutation of Differential Evolution, and the second was named Dichotomous Differential Evolution Simulated Annealing, which uses the dichotomous mutation of Differential Evolution. After feature selection, two centroid allocation methods were proposed. The first is the combination of Chaotic Logistic Search and Discrete Differential Evolution global search, which was named Differential Evolution Memetic Clustering (DEMC). The second is based on a gradient search that uses k-means as a local search together with a modified Differential Harmony global search; the resulting method was named Memetic Differential Harmony Search (MDHS). To intensify the exploitation aspect of MDHS, a binomial crossover was added to it, and the improved method was named Crossover Memetic Differential Harmony Search (CMDHS). The test results using the F-measure, the Average Distance of Document to Cluster (ADDC), and nonparametric statistical tests showed the superiority of CMDHS over the baseline methods, namely HS, DHS, k-means, and MDHS. The tests also show that CMDHS is better than the DEMC proposed earlier. Finally, the proposed CMDHS was compared with two current state-of-the-art methods, namely a Krill Herd (KH) based centroid allocation method and an Artificial Bee Colony (ABC) based method, and was found to outperform these two methods in most cases.
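The binomial crossover mentioned above is a standard Differential Evolution operator. The sketch below shows the textbook form on plain Python lists. How the thesis combines it with harmony search in CMDHS is not described in the abstract, so this is an illustration of the operator only. The function name and the `rng` parameter are assumptions.

```python
import random


def binomial_crossover(target, mutant, cr, rng=random):
    """Textbook DE binomial crossover.

    Each component of the trial vector is taken from `mutant` with
    probability `cr`, otherwise from `target`. One randomly chosen index
    (j_rand) always comes from `mutant`, so the trial vector never equals
    the target exactly when the two parents differ at that index.
    """
    j_rand = rng.randrange(len(target))
    return [
        m if (rng.random() < cr or j == j_rand) else t
        for j, (t, m) in enumerate(zip(target, mutant))
    ]


# Usage with a seeded generator so that runs are repeatable.
rng = random.Random(0)
trial = binomial_crossover([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], cr=0.5, rng=rng)
```

A higher `cr` copies more components from the mutant. With `cr` set to 0, only the forced `j_rand` component is inherited from it.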

    Advances in Meta-Heuristic Optimization Algorithms in Big Data Text Clustering

    This paper presents a comprehensive survey of meta-heuristic optimization algorithms in text clustering applications and highlights their main procedures. These Artificial Intelligence (AI) algorithms are recognized as promising swarm intelligence methods due to their success in solving machine learning problems, especially text clustering problems. This paper reviews the relevant literature on meta-heuristic-based text clustering applications, including many variants, such as basic, modified, hybridized, and multi-objective methods. In addition, the main procedures of text clustering and critical discussions are given. The review reports the advantages and disadvantages of these methods and recommends potential future research paths. The main keywords considered in this paper are text, clustering, meta-heuristic, optimization, and algorithm.

    Modelling spatio-temporal human behaviour with mobile phone data : a data analytical approach

    Machine Learning Modeling for Image Segmentation in Manufacturing and Agriculture Applications

    Doctor of Philosophy; Department of Industrial & Manufacturing Systems Engineering; Shing I Chang. This dissertation focuses on applying machine learning (ML) modelling to image segmentation tasks in various applications such as additive manufacturing monitoring, agricultural soil cover classification, and laser scribing quality control. The proposed ML framework uses various ML models, such as a gradient boosting classifier and a deep convolutional neural network, to improve and automate image segmentation tasks. In recent years, supervised ML methods have been widely adopted for image processing applications in various industries. Cameras installed in production processes have generated a vast amount of image data that can potentially be used for process monitoring. Specifically, deep supervised machine learning models have been successfully implemented to build automatic tools for filtering and classifying useful information for process monitoring. However, successful implementations of deep supervised learning algorithms depend on several factors, such as the distribution and size of the training data, the selected ML models, and consistency in the target domain distribution, which may change with environmental conditions over time. The proposed framework takes advantage of general-purpose, trained supervised learning models and applies them to process monitoring applications related to manufacturing and agriculture. In Chapter 2, a layer-wise framework is proposed to monitor the quality of 3D printing parts based on top-view images. The proposed statistical process monitoring method starts with self-start control charts that require only two successful initial prints. Unsupervised machine learning methods can be used for problems in which high accuracy is not required, but statistical process monitoring usually demands high classification accuracies to avoid Type I and II errors. To address the challenges that lighting poses for unsupervised image processing methods, a supervised Gradient Boosting Classifier (GBC) with 93 percent accuracy is adopted to classify each printed layer from the printing bed. Although GBC and other decision-tree-based ML models are powerful compared to unsupervised ML models, their capability is limited in terms of accuracy and running time for complex classification problems such as soil cover classification. In Chapter 3, a deep convolutional neural network (DCNN) for semantic segmentation is trained to quantify and monitor soil coverage in agricultural fields. The trained model is capable of accurately quantifying green canopy cover, counting plants, and classifying stubble. Due to the wide variety of scenarios in a real agricultural field, 3942 high-resolution images were collected and labeled for the training and test data sets. The difficulty of collecting, cleaning, and labeling this dataset motivated the search for a better approach to alleviate the data-wrangling burden of ML model training. One of the most influential factors is the need for a high volume of labeled data from the exact problem domain in terms of feature space and the distributions of data of all classes. Image data preparation for deep learning model training is expensive in terms of labelling time due to tedious manual processing. Multiple human labelers can work simultaneously, but inconsistent labeling will generate a training data set that often compromises model performance. In addition, training an ML model for a complicated problem from scratch also demands vast computational power. One potential approach for alleviating data wrangling challenges is transfer learning (TL). In Chapter 4, a TL approach was adopted to answer these challenges when monitoring three laser scribing characteristics: scribe width, straightness, and debris. The proposed transfer deep convolutional neural network (TDCNN) model can reduce time-consuming and costly data preparation. The proposed framework leverages a deep learning model already trained for a similar problem and uses only 21 images gleaned from the problem domain. The proposed TDCNN overcame the data challenge by leveraging the DCNN model called VGG16, already trained for basic geometric features using more than two million pictures. Appropriate image processing techniques were provided to measure scribe width and line straightness as well as total scribe and debris area using classified images with 96 percent accuracy. In addition to the fact that the TDCNN functions with fewer trainable parameters (i.e., 5 million versus 15 million for VGG16), increasing the training size to 154 did not provide a significant improvement in accuracy, which shows that the TDCNN does not need a high volume of data to be successful. Finally, Chapter 5 summarizes the proposed work and lays out topics for future research.

    Advances in Artificial Intelligence: Models, Optimization, and Machine Learning

    The present book contains all the articles accepted and published in the Special Issue “Advances in Artificial Intelligence: Models, Optimization, and Machine Learning” of the MDPI Mathematics journal, which covers a wide range of topics connected to the theory and applications of artificial intelligence and its subfields. These topics include, among others, deep learning and classic machine learning algorithms, neural modelling, architectures and learning algorithms, biologically inspired optimization algorithms, algorithms for autonomous driving, probabilistic models and Bayesian reasoning, intelligent agents, and multiagent systems. We hope that the scientific results presented in this book will serve as valuable sources of documentation and inspiration for anyone willing to pursue research in artificial intelligence, machine learning, and their widespread applications.