47 research outputs found

    Shared Nearest-Neighbor Quantum Game-Based Attribute Reduction with Hierarchical Coevolutionary Spark and Its Application in Consistent Segmentation of Neonatal Cerebral Cortical Surfaces

    © 2012 IEEE. The unprecedented increase in data volume has become a severe challenge for conventional patterns of data mining and learning systems tasked with handling big data. The recently introduced Spark platform is a new processing method for big data analysis and related learning systems, which has attracted increasing attention from both the scientific community and industry. In this paper, we propose a shared nearest-neighbor quantum game-based attribute reduction (SNNQGAR) algorithm that incorporates the hierarchical coevolutionary Spark model. We first present a shared coevolutionary nearest-neighbor hierarchy with self-evolving compensation that considers the features of nearest-neighborhood attribute subsets and calculates the similarity between attribute subsets according to the shared neighbor information of attribute sample points. We then present a novel attribute weight tensor model to generate ranking vectors of attributes and apply them to balance the relative contributions of different neighborhood attribute subsets. To optimize the model, we propose an embedded quantum equilibrium game paradigm (QEGP) to ensure that noisy attributes do not degrade the big data reduction results. A combination of the hierarchical coevolutionary Spark model and an improved MapReduce framework is then constructed so that it can better parallelize the SNNQGAR to efficiently determine the preferred reduction solutions of the distributed attribute subsets. The experimental comparisons demonstrate the superior performance of the SNNQGAR, which outperforms most of the state-of-the-art attribute reduction algorithms. Moreover, the results indicate that the SNNQGAR can be successfully applied to segment overlapping and interdependent fuzzy cerebral tissues, and it exhibits a stable and consistent segmentation performance for neonatal cerebral cortical surfaces.

    DETERMINING THE OPTIMAL NUMBER OF CLOTHING SIZES AS THE DESIGN OF A CLOTHING SIZING SYSTEM FOR BOYS IN INDONESIA USING BALANCE ANALYSIS AND ARTIFICIAL BEE COLONY-BASED FUZZY C-MEANS

    The number of sizes is one of the most important considerations in designing a clothing sizing system. The more sizes there are, the better the garments fit consumers' body shapes, so consumer satisfaction is achieved in terms of the match between garment size and body size. From the producer's side, however, a larger number of sizes affects setup costs or requires additional production lines because of the added size variations. This study develops a new clothing sizing system that looks for the balance points between production cost and the maximum number of sizes. The balance point is the optimal number of sizes that meets the needs of consumers and producers together. The FCM ABC method is used to cluster body measurements into several groups. The sample consists of 106 boys aged 8-10 years. The study comprises the following stages: factor analysis, determination of the optimal number of clusters, and evaluation. Seven optimal clothing size groups were produced. The aggregate loss value satisfies the validation requirement, so the newly developed sizing system can be used as a technique for obtaining the optimal number of clothing sizes.

    Evolutionary approaches for feature selection in biological data

    Data mining techniques have been used widely in many areas such as business, science, engineering and medicine. The techniques allow a vast amount of data to be explored in order to extract useful information from the data. One of the foci in the health area is finding interesting biomarkers from biomedical data. Mass throughput data generated from microarrays and mass spectrometry from biological samples are high dimensional and small in sample size. Examples include DNA microarray datasets with up to 500,000 genes and mass spectrometry data with 300,000 m/z values. While the availability of such datasets can aid in the development of techniques/drugs to improve diagnosis and treatment of diseases, a major challenge involves their analysis to extract useful and meaningful information. The aims of this project are: 1) to investigate and develop feature selection algorithms that incorporate various evolutionary strategies, 2) to use the developed algorithms to find the “most relevant” biomarkers contained in biological datasets and 3) to evaluate the goodness of extracted feature subsets for relevance (examined in terms of existing biomedical domain knowledge and from classification accuracy obtained using different classifiers). The project aims to generate good predictive models for classifying diseased samples from control.

    Multi-Criterion Mammographic Risk Analysis Supported with Multi-Label Fuzzy-Rough Feature Selection

    Context and background: Breast cancer is one of the most common diseases threatening human lives globally, requiring effective and early risk analysis for which learning classifiers supported with automated feature selection offer a potential robust solution. Motivation: Computer aided risk analysis of breast cancer typically works with a set of extracted mammographic features which may contain significant redundancy and noise, thereby requiring technical developments to improve runtime performance in both computational efficiency and classification accuracy. Hypothesis: Use of advanced feature selection methods based on multiple diagnosis criteria may lead to improved results for mammographic risk analysis. Methods: An approach for multi-criterion based mammographic risk analysis is proposed, by adapting the recently developed multi-label fuzzy-rough feature selection mechanism. Results: A system for multi-criterion mammographic risk analysis is implemented with the aid of multi-label fuzzy-rough feature selection and its performance is positively verified experimentally, in comparison with representative popular mechanisms. Conclusions: The novel approach for mammographic risk analysis based on multiple criteria helps improve classification accuracy using selected informative features, without suffering from the redundancy caused by such complex criteria, with the implemented system demonstrating practical efficacy.

    Front Matter - Soft Computing for Data Mining Applications

    Efficient tools and algorithms for knowledge discovery in large data sets have been devised in recent years. These methods exploit the capability of computers to search huge amounts of data in a fast and effective manner. However, the data to be analyzed is imprecise and afflicted with uncertainty. In the case of heterogeneous data sources such as text, audio and video, the data might moreover be ambiguous and partly conflicting. Besides, patterns and relationships of interest are usually vague and approximate. Thus, in order to make the information mining process more robust, or say, human-like, methods for searching and learning require tolerance towards imprecision, uncertainty and exceptions. Thus, they have approximate reasoning capabilities and are capable of handling partial truth. Properties of the aforementioned kind are typical of soft computing. Soft computing techniques like Genetic

    Dealing with imbalanced and weakly labelled data in machine learning using fuzzy and rough set methods


    Can bank interaction during rating measurement of micro and very small enterprises ipso facto determine the collapse of PD status?

    This paper begins with an analysis of trends - over the period 2012-2018 - for total bank loans, non-performing loans, and the number of active, working enterprises. A review survey was conducted on national data from Italy, with a comparison developed on a local subset from the Sardinia Region. Empirical evidence appears to support the hypothesis of the paper: can the rating class assigned by banks - using current IRB and A-IRB systems - to micro and very small enterprises, whose ability to replace financial resources using endogenous means is structurally impaired, ipso facto orient the results of performance in the same terms of PD assigned by the algorithm, thereby upending the principle of cause and effect? The thesis is developed through mathematical modeling that demonstrates the interaction of the measurement tool (the rating algorithm applied by banks) on the collapse of the loan status (default, performing, or some intermediate point) of the assessed micro-entity. Emphasis is given, in conclusion, to the phenomenon using evidence of the intrinsically mutualistic link of the two populations of banks and (micro) enterprises provided by a system of differential equations.

    Parking lot monitoring system using an autonomous quadrotor UAV

    The main goal of this thesis is to develop a drone-based parking lot monitoring system using low-cost hardware and open-source software. Similar to wall-mounted surveillance cameras, a drone-based system can monitor parking lots without affecting the flow of traffic while also offering the mobility of patrol vehicles. The Parrot AR Drone 2.0 is the quadrotor drone used in this work due to its modularity and cost efficiency. Video and navigation data (including GPS) are communicated to a host computer using a Wi-Fi connection. The host computer analyzes navigation data using a custom flight control loop to determine control commands to be sent to the drone. A new license plate recognition pipeline is used to identify license plates of vehicles from video received from the drone