    Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

    Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Optimal recovery by the algorithm is proven analytically for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat or diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed. Comment: 13 figures, 35 references

    Comparison of Thematic Maps Using Symbolic Entropy

    Comparison of thematic maps is an important task in a number of disciplines. Map comparison has traditionally been conducted using cell-by-cell agreement indicators, such as the Kappa measure. More recently, other methods have been proposed that take into account not only spatially coincident cells in two maps, but also their surroundings or the spatial structure of their differences. The objective of this paper is to propose a framework for map comparison that considers 1) the patterns of spatial association in two maps, in other words, the map elements in their surroundings; 2) the equivalence of those patterns; and 3) the independence of patterns between maps. Two new statistics for the spatial analysis of qualitative data are introduced. These statistics are based on the symbolic entropy of the maps, and function as measures of map compositional equivalence and independence. All inferential elements needed to conduct hypothesis testing are also developed. The framework is illustrated using real and synthetic maps. Key words: Thematic maps, map comparison, qualitative variables, spatial association, symbolic entropy, hypothesis tests
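
The cell-by-cell Kappa measure mentioned in this abstract can be illustrated with a minimal sketch. It assumes each map is flattened into a list of class labels; the function name `cohen_kappa` and the sample labels are hypothetical and are not taken from the paper. It shows only the traditional baseline, not the symbolic-entropy statistics the paper proposes.

```python
from collections import Counter


def cohen_kappa(map_a, map_b):
    """Cell-by-cell Cohen's kappa for two equally sized categorical maps."""
    if len(map_a) != len(map_b) or not map_a:
        raise ValueError("maps must be non-empty and the same size")
    n = len(map_a)
    # Observed agreement: share of cells with the same class in both maps.
    observed = sum(a == b for a, b in zip(map_a, map_b)) / n
    # Chance agreement: product of the marginal class frequencies.
    freq_a = Counter(map_a)
    freq_b = Counter(map_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both maps consist of one identical class
    return (observed - expected) / (1.0 - expected)


# Hypothetical 6-cell maps, flattened row by row.
map_a = ["forest", "forest", "urban", "water", "urban", "forest"]
map_b = ["forest", "urban", "urban", "water", "urban", "forest"]
kappa = cohen_kappa(map_a, map_b)  # 17/23, roughly 0.74
```

Because kappa compares only spatially coincident cells, it ignores each cell's surroundings, which is the limitation the abstract sets out to address.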

    Matrices of forests, analysis of networks, and ranking problems

    The matrices of spanning rooted forests are studied as a tool for analysing the structure of networks and measuring their properties. The problems of revealing the basic bicomponents, measuring vertex proximity, and ranking from preference relations / sports competitions are considered. It is shown that the vertex accessibility measure based on spanning forests has a number of desirable properties. An interpretation for the stochastic matrix of out-forests in terms of information dissemination is given. Comment: 8 pages. This article draws heavily from arXiv:math/0508171. Published in Proceedings of the First International Conference on Information Technology and Quantitative Management (ITQM 2013). This version contains some corrections and additions
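
A forest-based accessibility matrix of the kind this abstract describes can be sketched as follows. The sketch assumes the standard matrix-forest result that, for an unweighted undirected graph with Laplacian L, the matrix Q = (I + L)^{-1} has entries equal to the fraction of spanning rooted forests in which vertex j lies in a tree rooted at i. The abstract does not state this formula, so treat it as an assumption; the helper names `invert` and `forest_matrix` are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def forest_matrix(adjacency):
    """Return Q = (I + L)^-1 for an undirected graph given as a 0/1 matrix."""
    n = len(adjacency)
    shifted = [
        [(1 + sum(adjacency[i]) if i == j else 0) - adjacency[i][j]
         for j in range(n)]
        for i in range(n)
    ]  # I + L, with L = D - A
    return invert(shifted)


# Path graph 0 - 1 - 2, which has 8 spanning rooted forests.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
Q = forest_matrix(path)  # equals [[5, 2, 1], [2, 4, 2], [1, 2, 5]] / 8
```

Each row of Q sums to 1, which is the stochastic property the abstract refers to, and entries shrink as vertices get farther apart.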

    Decision support model for the selection of asphalt wearing courses in highly trafficked roads

    Choosing suitable materials for the wearing course of highly trafficked roads is a delicate task because of their direct interaction with vehicles. Furthermore, modern roads must be planned according to sustainable development goals, which is complex because some of these goals might be in conflict. Under this premise, this paper develops a multi-criteria decision support model based on the analytic hierarchy process and the technique for order of preference by similarity to ideal solution to facilitate the selection of wearing courses in European countries. Variables were modelled using either fuzzy logic or Monte Carlo methods, depending on their nature. The views of a panel of experts on the problem were collected and processed using the generalized reduced gradient algorithm and a distance-based aggregation approach. The results showed a clear preponderance of stone mastic asphalt over the remaining alternatives in different scenarios evaluated through sensitivity analysis. The research leading to these results was framed in the European FP7 Project DURABROADS (No. 605404) and received funding from the European Union Seventh Framework Programme (FP7/2007–2013) under Grant Agreement No. 605404.
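
The ranking step named in this abstract, the technique for order of preference by similarity to ideal solution (TOPSIS), can be sketched in its textbook crisp form. This is not the paper's model: the fuzzy and Monte Carlo variables are omitted, the criterion weights are simply given rather than derived with the analytic hierarchy process, and the alternatives, criteria, and numbers are hypothetical.

```python
import math


def topsis(scores, weights, is_benefit):
    """Return the closeness coefficient of each alternative (higher is better).

    scores[i][j] is the rating of alternative i on criterion j; every
    criterion column is assumed to contain at least one non-zero value.
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) for j in range(n_crit)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in scores
    ]
    columns = list(zip(*weighted))
    # Ideal solution: best value per criterion; anti-ideal: worst value.
    ideal = [max(c) if b else min(c) for c, b in zip(columns, is_benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(columns, is_benefit)]
    closeness = []
    for row in weighted:
        d_plus = math.dist(row, ideal)
        d_minus = math.dist(row, anti)
        total = d_plus + d_minus
        # All alternatives identical: no basis for preference.
        closeness.append(d_minus / total if total > 0 else 0.5)
    return closeness


# Hypothetical alternatives rated on durability (benefit) and cost (cost).
scores = [[9, 2], [5, 5], [3, 8]]
closeness = topsis(scores, weights=[0.6, 0.4], is_benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: closeness[i], reverse=True)
```

In this toy data the first alternative is best on both criteria, so it coincides with the ideal solution and ranks first.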

    Organic Farming in Europe by 2010: Scenarios for the future

    How will organic farming in Europe evolve by the year 2010? The answer provides a basis for the development of different policy options and for anticipating the future relative competitiveness of organic and conventional farming. The authors tackle the question using an innovative approach based on scenario analysis, offering the reader a range of scenarios that encompass the main possible evolutions of the organic farming sector. This book constitutes an innovative and reliable decision-supporting tool for policy makers, farmers and the private sector. Researchers and students working in the field of agricultural economics will also benefit from the methodological approach adopted for the scenario analysis.

    ISIPTA'07: Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications

    B

    Feature and Decision Level Fusion Using Multiple Kernel Learning and Fuzzy Integrals

    The work collected in this dissertation addresses the problem of data fusion. In other words, this is the problem of making decisions (also known as the problem of classification in the machine learning and statistics communities) when data from multiple sources are available, or when decisions/confidence levels from a panel of decision-makers are accessible. This problem has become increasingly important in recent years, especially with the ever-increasing popularity of autonomous systems outfitted with suites of sensors and the dawn of the "age of big data." While data fusion is a very broad topic, the work in this dissertation considers two very specific techniques: feature-level fusion and decision-level fusion. In general, the fusion methods proposed throughout this dissertation rely on kernel methods and fuzzy integrals. Both are very powerful tools; however, they also come with challenges, some of which are summarized below. I address these challenges in this dissertation. Kernel-based classification is a well-studied area in which data are implicitly mapped from a lower-dimensional space to a higher-dimensional space to improve classification accuracy. However, for most kernel methods, one must still choose a kernel to use for the problem. Since there is, in general, no way of knowing which kernel is the best, multiple kernel learning (MKL) is a technique used to learn the aggregation of a set of valid kernels into a single (ideally) superior kernel. The aggregation can be done using weighted sums of the pre-computed kernels, but determining the summation weights is not a trivial task. Furthermore, MKL does not work well with large datasets because of limited storage space and prediction speed. These challenges are tackled by the introduction of many new algorithms in the following chapters. I also address MKL's storage and speed drawbacks, allowing MKL-based techniques to be applied to big data efficiently.
Some algorithms in this work are based on the Choquet fuzzy integral, a powerful nonlinear aggregation operator parameterized by the fuzzy measure (FM). These decision-level fusion algorithms learn a fuzzy measure by minimizing a sum of squared error (SSE) criterion based on a set of training data. The flexibility of the Choquet integral comes with a cost, however: given a set of N decision makers, the size of the FM the algorithm must learn is 2^N. This means that the training data must be diverse enough to include 2^N independent observations, though this is rarely encountered in practice. I address this in the following chapters via many different regularization functions, a popular technique in machine learning and statistics used to prevent overfitting and increase model generalization. Finally, it is worth noting that the aggregation behavior of the Choquet integral is not intuitive. I tackle this by proposing a quantitative visualization strategy allowing the FM and Choquet integral behavior to be shown simultaneously.
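
The discrete Choquet integral described in this abstract can be illustrated with a minimal sketch. It assumes the fuzzy measure is stored as a dictionary keyed by frozensets of decision-maker indices, one entry per subset; the function names and the example values are hypothetical, and the SSE-based learning of the measure is not shown.

```python
from itertools import combinations


def all_subsets(n):
    """All 2^n subsets of {0, ..., n-1}, i.e. the domain of the fuzzy measure."""
    return [frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)]


def choquet_integral(values, measure):
    """Aggregate per-source confidences with respect to a fuzzy measure.

    measure maps every subset of source indices to a number in [0, 1],
    with measure[empty set] = 0 and measure[all sources] = 1.
    """
    # Visit the sources from the largest confidence to the smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    total, previous, selected = 0.0, 0.0, set()
    for i in order:
        selected.add(i)
        current = measure[frozenset(selected)]
        # Each value is weighted by the measure gained when its source joins.
        total += values[i] * (current - previous)
        previous = current
    return total


# Hypothetical measure for N = 2 decision makers: 2^2 = 4 entries.
measure = {
    frozenset(): 0.0,
    frozenset({0}): 0.3,
    frozenset({1}): 0.5,
    frozenset({0, 1}): 1.0,
}
confidences = [0.8, 0.4]
fused = choquet_integral(confidences, measure)  # 0.8*0.3 + 0.4*0.7, about 0.52

# A measure equal to 1 on every non-empty subset reduces the integral to max.
max_measure = {s: (1.0 if s else 0.0) for s in all_subsets(2)}
fused_max = choquet_integral(confidences, max_measure)
```

The dictionary grows as 2^N with the number of decision makers, which is the learning burden the abstract points out.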