420 research outputs found

    Attribute Equilibrium Dominance Reduction Accelerator (DCCAEDR) Based on Distributed Coevolutionary Cloud and Its Application in Medical Records

    Full text link
    © 2013 IEEE. Aimed at the tremendous challenge of attribute reduction for big data mining and knowledge discovery, we propose a new attribute equilibrium dominance reduction accelerator (DCCAEDR) based on the distributed coevolutionary cloud model. First, the framework of an N-population distributed coevolutionary MapReduce model is designed to divide the entire population into N subpopulations, sharing the reward of the different subpopulations' solutions under a MapReduce cloud mechanism. Because this achieves a better adaptive balance between exploration and exploitation, the reduction performance is guaranteed to match that obtained using the whole independent data set. Second, a novel Nash equilibrium dominance strategy of elitists under the N bounded rationality regions is adopted to help the subpopulations attain a stable status of Nash equilibrium dominance. This further enhances the accelerator's robustness against complex noise in big data. Third, an approximation parallelism mechanism based on MapReduce is constructed to implement rule reduction by accelerating the computation of attribute equivalence classes. Consequently, the entire attribute reduction set with the equilibrium dominance solution can be achieved. Extensive simulation results illustrate the effectiveness and robustness of the proposed DCCAEDR accelerator for attribute reduction on big data. Furthermore, the DCCAEDR is applied to attribute reduction for traditional Chinese medical records and to segmenting cortical surfaces in neonatal brain 3-D MRI records, where it shows superior results compared with representative algorithms.
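
    To make the equivalence-class step concrete, below is a minimal single-machine Python sketch of grouping records into attribute equivalence classes, the quantity the MapReduce stage parallelizes; the function, field names, and toy data are illustrative assumptions, not the authors' implementation.

    ```python
    from collections import defaultdict

    def equivalence_classes(records, attributes):
        """Group record indices by their value tuple on the chosen attributes;
        emitting the key corresponds to a map phase and the grouping to a
        reduce phase in the MapReduce scheme the abstract describes."""
        classes = defaultdict(list)
        for idx, record in enumerate(records):
            key = tuple(record[a] for a in attributes)  # map: emit (key, idx)
            classes[key].append(idx)                    # reduce: group by key
        return list(classes.values())

    # Toy records: the first two are indiscernible on {fever, cough}.
    data = [
        {"fever": 1, "cough": 0, "age": 30},
        {"fever": 1, "cough": 0, "age": 45},
        {"fever": 0, "cough": 1, "age": 50},
    ]
    print(equivalence_classes(data, ["fever", "cough"]))  # [[0, 1], [2]]
    ```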

    Streaming Feature Grouping and Selection (SFGS) for Big Data Classification

    Get PDF
    Real-time data has always been an essential element for organizations when the speed of data delivery is critical to their businesses. Today, organizations understand the importance of real-time data analysis to maintain the benefits derived from their generated data. Real-time data analysis is also known as real-time analytics, streaming analytics, real-time streaming analytics, and event processing. Stream processing is the key to getting results in real time: it allows us to process a data stream as it arrives. Streaming data means the data are generated dynamically, and the full stream is unknown or even infinite. Such data become massive and diverse and form what is known as the big data challenge. In machine learning, streaming feature selection has long been a preferred method for preprocessing streaming data. Recently, feature grouping, which can measure the hidden information between selected features, has begun gaining attention. This dissertation's main contribution is to address the extremely high dimensionality of streaming big data by delivering a streaming feature grouping and selection algorithm. The literature review also presents a comprehensive survey of current streaming feature selection approaches and highlights the state-of-the-art algorithms in this area. The proposed algorithm is designed around grouping similar features to reduce redundancy while handling the stream of features in an online fashion. It has been implemented and evaluated on benchmark datasets against state-of-the-art streaming feature selection algorithms and feature grouping techniques, and the results showed better prediction accuracy than the state-of-the-art algorithms.
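
    As an illustration of the general idea only (not the dissertation's actual SFGS algorithm), the sketch below groups arriving features by Pearson correlation against each group's representative and keeps the most label-relevant member as the group's selected feature; the similarity measure, threshold, and data are assumptions.

    ```python
    import numpy as np

    def stream_feature_grouping(feature_stream, labels, sim_threshold=0.9):
        """Toy online feature grouping: an arriving feature joins the first
        group whose representative it is strongly correlated with (redundancy),
        or opens a new group; the most label-relevant member represents it."""
        groups = []  # each group: {"name", "rep" (vector), "rep_score"}
        for name, f in feature_stream:        # features arrive one at a time
            relevance = abs(np.corrcoef(f, labels)[0, 1])
            for g in groups:
                if abs(np.corrcoef(f, g["rep"])[0, 1]) >= sim_threshold:
                    if relevance > g["rep_score"]:   # better representative
                        g.update(name=name, rep=f, rep_score=relevance)
                    break
            else:                             # no similar group: start one
                groups.append({"name": name, "rep": f, "rep_score": relevance})
        return [g["name"] for g in groups]    # one selected feature per group

    # Toy usage: f2 duplicates f1 (same group); f3 is independent (new group).
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, 50)
    f1 = y + rng.normal(0, 0.1, 50)
    stream = [("f1", f1), ("f2", f1 * 2.0), ("f3", rng.normal(size=50))]
    print(stream_feature_grouping(stream, y))  # ['f1', 'f3']
    ```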

    Uncertain Multi-Criteria Optimization Problems

    Get PDF
    Most real-world search and optimization problems naturally involve multiple criteria as objectives. Generally, symmetry, asymmetry, and anti-symmetry are basic characteristics of the binary relations used when modeling optimization problems, and the notion of symmetry appears in many articles on the uncertainty theories employed in multi-criteria problems. Different solutions may produce trade-offs (conflicting scenarios) among different objectives: a better solution with respect to one objective may compromise other objectives. Various factors need to be considered to address problems in multidisciplinary research, which is critical for the overall sustainability of human development and activity. In this regard, decision-making theory has been the subject of intense research in recent decades due to its wide applications in different areas, and it has become an important means of providing real-time solutions to uncertainty problems. Theories available in the existing literature, such as probability theory, fuzzy set theory, type-2 fuzzy set theory, rough set theory, and uncertainty theory, deal with such uncertainties. Nevertheless, the uncertain multi-criteria characteristics of such problems have not yet been explored in depth, and much remains to be achieved in this direction. Hence, different mathematical models of real-life multi-criteria optimization problems can be developed in various uncertain frameworks, with special emphasis on optimization problems.
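
    To make the trade-off notion concrete, here is a minimal Python sketch of Pareto dominance for a minimization problem; the objective vectors are illustrative and no particular uncertainty framework is assumed.

    ```python
    def dominates(a, b):
        """True if solution a Pareto-dominates b (minimization): a is no worse
        on every objective and strictly better on at least one."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def pareto_front(solutions):
        """Keep only the non-dominated solutions."""
        return [s for s in solutions
                if not any(dominates(t, s) for t in solutions if t != s)]

    # Toy (cost, risk) pairs: (1, 5) and (3, 2) trade off; (4, 6) is dominated.
    print(pareto_front([(1, 5), (3, 2), (4, 6)]))  # [(1, 5), (3, 2)]
    ```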

    Combining rough and fuzzy sets for feature selection

    Get PDF

    Discrete Mathematics and Symmetry

    Get PDF
    Some of the most beautiful studies in mathematics are related to symmetry and geometry. For this reason, we select here some contributions on these aspects and on discrete geometry. Symmetry in a system means invariance of its elements under transformations; for network structures, it means invariance of node adjacency under permutations of the node set. Graph isomorphism is an equivalence relation on the set of graphs and therefore partitions the class of all graphs into equivalence classes. The underlying idea of isomorphism is that some objects have the same structure if we omit the individual character of their components. A set of graphs isomorphic to each other is called an isomorphism class of graphs. An automorphism of a graph G is an isomorphism from G onto itself, and the family of all automorphisms of G forms a permutation group.
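
    As a concrete illustration of adjacency invariance under node permutations, the brute-force Python sketch below enumerates the automorphisms of a small graph; it is exponential in the number of nodes and intended only as a toy demonstration.

    ```python
    from itertools import permutations

    def automorphisms(adj):
        """Enumerate all automorphisms of an undirected graph by brute force:
        permutations of the node set that leave the edge set invariant.
        adj is a set of frozensets {u, v} representing edges."""
        nodes = sorted({v for e in adj for v in e})
        autos = []
        for p in permutations(nodes):
            mapping = dict(zip(nodes, p))
            image = {frozenset(mapping[v] for v in e) for e in adj}
            if image == adj:                 # adjacency preserved
                autos.append(mapping)
        return autos

    # Path graph 0-1-2: the identity and the reflection swapping 0 and 2.
    path = {frozenset({0, 1}), frozenset({1, 2})}
    print(len(automorphisms(path)))  # 2
    ```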

    Interpretability-oriented data-driven modelling of bladder cancer via computational intelligence

    Get PDF

    New Fundamental Technologies in Data Mining

    Get PDF
    The progress of data mining technology and its broad public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. Beyond deep coverage of each topic, the two books offer useful hints and strategies for solving the problems presented in the respective chapters. The contributing authors have highlighted many future research directions that will foster multidisciplinary collaborations and hence lead to significant development in the field of data mining.

    Computer aided diagnosis algorithms for digital microscopy

    Get PDF
    Automatic analysis and information extraction from an image is still a highly challenging research problem in computer vision, attempting to describe image content with computational and mathematical techniques. Moreover, the information extracted from the image should be meaningful and as discriminative as possible, since it will be used to categorize the content according to the analysed problem. In the medical imaging domain this issue is felt even more keenly, because many important decisions that affect patient care depend on the usefulness of the information extracted from the image. Managing medical images is further complicated not only by the importance of the problem, but also because a fair amount of prior medical knowledge is needed to represent, as data, the visual information to which pathologists refer. Today, medical decisions that impact patient care rely on the results of laboratory tests to a greater extent than ever before, due to the marked expansion in the number and complexity of offered tests. These developments promise to improve patient care, but as the number and complexity of tests grow, so does the possibility of misapplying and misinterpreting the tests themselves, leading to inappropriate diagnoses and therapies. Moreover, with the increased number of tests, the amount of data to be analysed also increases, forcing pathologists to devote much time to the analysis of the tests themselves rather than to patient care and the prescription of the right therapy, especially considering that most of the tests performed are just check-up tests and most of the analysed samples come from healthy patients. A quantitative evaluation of medical images is therefore essential to overcome uncertainty and subjectivity, and also to greatly reduce the amount of data and the time required for the analysis. In the last few years, many computer-assisted diagnosis systems have been developed that attempt to mimic pathologists by extracting features from the images. Image analysis involves complex algorithms that identify and characterize cells or tissues using image pattern recognition technology. This thesis addresses the main problems associated with digital microscopy analysis in histology and haematology diagnosis, developing algorithms that extract useful information from different digital images and distinguish the different biological structures within them. The proposed methods aim not only to improve the accuracy of the analysis and reduce its time when used as the sole means of diagnosis, but can also serve as intermediate tools for skimming the number of samples that pathologists must analyse directly, or as double-check systems to verify the results of the automated facilities used today.

    Transformation of graphical models to support knowledge transfer

    Get PDF
    Human experts are able to flexibly adjust their decision behaviour to the situation at hand. This capability pays off especially when decisions must be made under limited resources such as time restrictions. In such situations it is particularly advantageous to adapt the representation of the underlying knowledge and to use decision models at different levels of abstraction. Furthermore, human experts can include not only uncertain information but also vague perceptions in decision making. Classical decision-theoretic models are based on the concept of rationality, whereby the decision behaviour is prescribed by the principle of maximum expected utility: for each observation, an optimal decision function prescribes the action that maximizes expected utility. Modern graph-based methods such as Bayesian networks or influence diagrams make decision-theoretic methods attractive from a modelling perspective. Their main disadvantage is complexity: finding an optimal decision can become very expensive, and inference in decision networks is known to be NP-hard. This dissertation aims at combining the advantages of decision-theoretic models with rule-based systems by transforming a decision-theoretic model into a fuzzy rule-based system. Fuzzy rule bases can be evaluated efficiently, are suited to approximating non-linear functional dependencies, and guarantee the interpretability of the resulting behaviour model. The translation of a decision model into a fuzzy rule base is supported by a new transformation process. First, an agent can apply the parameterized structure learning algorithm newly introduced in this work to identify the structure of a Bayesian network. Subsequently, applying preference learning methods and refining the probability information allows decision and utility nodes to be modelled, yielding a consolidated decision-theoretic model. A transformation algorithm then compiles a rule base from this model, with an approximation measure computing the expected utility loss as a quality criterion. The practicality of the concept is demonstrated with an example on condition monitoring of a rotating spindle.
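
    As an illustration of such an approximation measure, the hedged Python sketch below computes an expected utility loss over a finite observation space with known probabilities; all names, the utility table, and the toy policies are hypothetical, not the dissertation's implementation.

    ```python
    def expected_utility_loss(observations, probs, utility, opt_policy, rule_policy):
        """Expected utility forgone when the compiled fuzzy rule base's action
        replaces the decision-theoretic optimum, averaged over observations."""
        return sum(
            p * (utility(o, opt_policy(o)) - utility(o, rule_policy(o)))
            for o, p in zip(observations, probs)
        )

    # Toy check: the rule base deviates on one of two equally likely observations.
    U = {("hot", "stop"): 10, ("hot", "run"): 2, ("ok", "run"): 8, ("ok", "stop"): 3}
    utility = lambda o, a: U[(o, a)]
    opt = lambda o: "stop" if o == "hot" else "run"
    rules = lambda o: "run"  # a degenerate rule base that always runs
    print(expected_utility_loss(["hot", "ok"], [0.5, 0.5], utility, opt, rules))  # 4.0
    ```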