4 research outputs found

    Swarm Intelligence Optimization Algorithms and Their Application

    Swarm intelligence optimization algorithms are an emerging technology that simulates the laws of natural evolution and the behavior of biological communities; they are simple and robust, and have been successfully applied in many fields. This paper summarizes the research status and application progress of swarm intelligence optimization algorithms. It explains the basic principles of the ant colony algorithm and the particle swarm algorithm, analyzes in detail the drosophila (fruit fly) algorithm and the firefly algorithm developed in recent years, and identifies the deficiencies of each algorithm and directions for improvement.
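
As a rough illustration of the particle swarm principle this abstract refers to, the sketch below implements a textbook global-best particle swarm optimizer. It is not taken from the paper: the function names `pso_minimize` and `sphere`, the parameter values (`w`, `c1`, `c2`, swarm size, iteration count), and the test objective are all assumptions chosen for illustration.

```python
import random


def pso_minimize(f, dim, bounds, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box using a basic global-best particle swarm.

    All defaults here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares one global best.
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clamp the coordinate back into the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Hypothetical test objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    best, best_val = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
    print(best, best_val)
```

The inertia weight `w` and the two acceleration coefficients trade off exploration against convergence speed; the values used here are common starting points rather than recommendations from the paper.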

    Adaptive semi-supervised affinity propagation clustering algorithm based on structural similarity

    In view of the unsatisfactory clustering results of the affinity propagation (AP) clustering algorithm on data sets with complex structures, this paper proposes an adaptive semi-supervised affinity propagation clustering algorithm based on structural similarity (SAAP-SS). First, a novel structural similarity is proposed by solving a non-linear, low-rank representation problem. Then affinity propagation is performed after adjusting the similarity matrix using the known pairwise constraints. Finally, the idea of fireworks explosion is introduced into the algorithm. By adaptively searching the preference space bi-directionally, the algorithm’s global and local search abilities are balanced in order to find the optimal clustering structure. Experiments on both synthetic and real data sets show that the proposed algorithm performs better than the AP, FEO-SAP and K-means methods.

    Data mining for heart failure : an investigation into the challenges in real life clinical datasets

    Clinical data presents a number of challenges, including missing data, class imbalance, high dimensionality and non-normal distribution. A motivation for this research is to investigate and analyse how these challenges affect the performance of algorithms. The challenges were explored with the help of a real-life heart failure clinical dataset known as Hull LifeLab, obtained from a live cardiology clinic at the Hull Royal Infirmary Hospital. A Clinical Data Mining Workflow (CDMW) was designed with three intuitive stages, namely descriptive, predictive and prescriptive. The naming of these stages reflects the nature of the analysis that is possible within each stage; therefore a number of different algorithms are employed. Most algorithms require the data to be normally distributed. However, the distribution is not explicitly used within the algorithms. Approaches based on Bayes use the properties of the distributions very explicitly, and thus provide valuable insight into the nature of the data. The first stage of the analysis is to investigate whether the assumptions made for Bayes hold, e.g. the strong independence assumption and the assumption of a Gaussian distribution. The next stage is to investigate the role of missing values. The results show that imputation affects performance less than the records that are initially complete do. These records are often not outliers, but contain problem variables. A method was developed to identify these. The effect of skews in the data was also investigated within the CDMW. However, it was found that methods based on Bayes were able to handle these, albeit with a small variability in performance. The thesis provides an insight into the reasons why clinical data often causes problems. Even class imbalance turns out not to be an issue, because Bayes is independent of it.

    Use of Entropy for Feature Selection with Intrusion Detection System Parameters

    Entropy provides a measure of the randomness of data and of the information gained by comparing different attributes. Intrusion detection systems can collect very large amounts of data, which are not necessarily manageable by manual means. Collected intrusion detection data often contains redundant, duplicate, and irrelevant entries, which makes analysis computationally intensive and likely to produce unreliable results. Reducing the data to what is relevant and pertinent to the analysis requires the use of data mining techniques and statistics. Identifying patterns in the data is part of intrusion detection analysis, in which the patterns are categorized as normal or anomalous. Anomalous data needs to be further characterized to determine whether representative attacks on the network are in progress. Oftentimes, subtleties in the data may be too muted to identify certain types of attacks. Many statistics, including entropy, are used in a number of analysis techniques for identifying attacks, but these analyses can be improved upon. This research expands the use of Approximate entropy and Sample entropy for feature selection and attack analysis to identify specific types of subtle attacks on network systems. Through enhanced analysis techniques using entropy, the granularity of feature selection and attack identification is improved.
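
To make the two roles of entropy named in this abstract concrete, the sketch below computes plain Shannon entropy (randomness of a column) and information gain (entropy reduction in the labels after splitting on an attribute). This is an illustrative assumption: the research itself works with Approximate entropy and Sample entropy, which are different measures and are not implemented here. The function names and the toy `protocol`/`label` data are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    n = len(labels)
    # Group the labels by the feature value they co-occur with.
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # Weighted average of the label entropy inside each group.
    conditional = sum(len(g) / n * shannon_entropy(g) for g in groups.values())
    return shannon_entropy(labels) - conditional


if __name__ == "__main__":
    # Hypothetical toy records: one attribute per connection plus a class label.
    protocol = ["tcp", "tcp", "udp", "udp"]
    label = ["normal", "normal", "attack", "attack"]
    print(shannon_entropy(label))
    print(information_gain(protocol, label))
```

In a feature-selection setting, attributes would typically be ranked by a score such as this gain and the lowest-scoring ones dropped; how the research ranks features with Approximate and Sample entropy is not described in the abstract.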