26 research outputs found

    Sublinear Computation Paradigm

    This open access book gives an overview of cutting-edge work on a new paradigm called the “sublinear computation paradigm,” which was proposed in the large multiyear academic research project “Foundations of Innovative Algorithms for Big Data.” The project ran in Japan from October 2014 to March 2020. To handle the unprecedented explosion of big data sets in research, industry, and other areas of society, there is an urgent need to develop novel methods and approaches for big data analysis. To meet this need, innovative changes in algorithm theory for big data are being pursued. For example, polynomial-time algorithms have thus far been regarded as “fast,” but a quadratic-time algorithm applied to a petabyte-scale or larger data set runs into problems of computational resources or running time. To deal with this critical computational and algorithmic bottleneck, linear, sublinear, and constant-time algorithms are required. The sublinear computation paradigm is proposed here in order to support innovation in the big data era. A foundation of innovative algorithms has been created by developing computational procedures, data structures, and modelling techniques for big data. The project was organized into three teams that focus on sublinear algorithms, sublinear data structures, and sublinear modelling. The work has provided high-level academic research results of strong computational and algorithmic interest, which are presented in this book. The book consists of five parts: Part I is a single chapter on the concept of the sublinear computation paradigm; Parts II, III, and IV review results on sublinear algorithms, sublinear data structures, and sublinear modelling, respectively; and Part V presents application results. The information presented here will inspire the researchers who work in the field of modern algorithms.
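As a rough illustration of what "sublinear" means in practice, the sketch below estimates the fraction of items satisfying a predicate by inspecting a fixed number of random samples, so the work does not grow with the size of the data set. It is a generic sampling estimator, not a method taken from the book; the function name `estimate_fraction`, its parameters, and the example data are all assumptions made for illustration.

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def estimate_fraction(
    data: Sequence[T],
    predicate: Callable[[T], bool],
    num_samples: int,
    seed: int = 0,
) -> float:
    """Estimate the fraction of items in `data` that satisfy `predicate`.

    Only `num_samples` items are read (sampling with replacement), so the
    running time depends on `num_samples`, not on len(data).
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(data)
    hits = 0
    for _ in range(num_samples):
        item = data[rng.randrange(n)]  # random access into the data
        if predicate(item):
            hits += 1
    return hits / num_samples


# Hypothetical data set: one million integers, a quarter of them divisible by 4.
data = range(1_000_000)
estimate = estimate_fraction(data, lambda x: x % 4 == 0, num_samples=2_000)
print(f"estimated fraction: {estimate:.3f} (true value 0.25)")
```

The estimate is approximate; its error shrinks roughly with the square root of the number of samples, independently of how large the data set is.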

    Graph representation learning for security analytics in decentralized software systems and social networks

    With the rapid advancement in digital transformation, various daily interactions, transactions, and operations typically depend on extensive network-structured systems. The inherent complexity of these platforms has become a critical challenge in ensuring their security and robustness, with impacts spanning individual users to large-scale organizations. Graph representation learning has emerged as a potential methodology to address various security analytics within these complex systems, especially in software code and social network analysis, and its applications in criminology. For software code, graph representations can capture the information of control-flow graphs and call graphs, which can be leveraged to detect vulnerabilities and improve software reliability. In the case of social network analysis in criminal investigation, graph representations can capture the social connections and interactions between individuals, which can be used to identify key players, detect illegal activities, and predict new or unobserved criminal cases. In this thesis, we focus on two critical security topics using graph learning-based approaches: (1) addressing criminal investigation issues and (2) detecting vulnerabilities of Ethereum blockchain smart contracts. First, we propose the SoChainDB database, which facilitates obtaining data from blockchain-based social networks and conducting extensive analyses to understand Hive blockchain social data. Moreover, to apply social network analysis in criminal investigation, two graph-based machine learning frameworks are presented to address investigation issues in a burglary use case, one being transductive link prediction and the other being inductive link prediction. Then, we propose MANDO, an approach that utilizes a new heterogeneous graph representation of control-flow graphs and call graphs to learn the structures of heterogeneous contract graphs. Building upon MANDO, two deep graph learning-based frameworks, MANDO-GURU and MANDO-HGT, are proposed for accurate vulnerability detection at both the coarse-grained contract and fine-grained line levels. Empirical results show that the MANDO frameworks significantly improve on the detection accuracy of other state-of-the-art techniques for various vulnerability types in either source code or bytecode.

    29th International Symposium on Algorithms and Computation: ISAAC 2018, December 16-19, 2018, Jiaoxi, Yilan, Taiwan

    An Analysis of Some Algorithms and Heuristics for Multiobjective Graph Search

    Many real problems require the examination of an exponential number of alternatives in order to find the best choice. These are the so-called combinatorial optimization problems. Moreover, real problems usually involve several conflicting magnitudes. When multiple objectives must be optimized simultaneously, there is generally no single optimal value that satisfies the requirements for all the criteria at the same time. Solving these multiobjective combinatorial problems commonly results in a large set of Pareto-optimal solutions, which define the optimal tradeoffs between the objectives under consideration. One of the most recurrent multiobjective problems is considered in this thesis: the search for shortest paths in a graph, taking into account several objectives at the same time. Many practical applications of multiobjective search in different domains can be pointed out: routing in multimedia networks (Clímaco et al., 2003), satellite scheduling (Gabrel & Vanderpooten, 2002), transportation problems (Pallottino & Scutellà, 1998), routing in railway networks (Müller-Hannemann & Weihe, 2006), route planning in road maps (Jozefowiez et al., 2008), robot surveillance (delle Fave et al., 2009) or domain-independent planning (Refanidis & Vlahavas, 2003). Multiobjective route planning over realistic road maps has been considered as a potential application scenario for the multiobjective algorithms and heuristics considered in this thesis. Hazardous material transportation (Erkut et al., 2007), another related multiobjective routing problem, has also been considered as an interesting potential application scenario. Single-criterion shortest path methods are well known and have been widely studied. Heuristic search allows the reduction of the space and time requirements of these methods, exploiting estimates of the actual distance to the goal. Multiobjective problems are much more complex than their single-objective counterparts, and require specific methods. These range from exact solution techniques to approximate ones, including the metaheuristic approximate methods usually found in the literature. This thesis is concerned with exact best-first algorithms and, in particular, with the use of heuristic information to improve their performance. It contributes both formal and empirical analyses of algorithms and heuristics for multiobjective search. The formal characterization of algorithms is important for the field. However, empirical evaluation is also of great importance for the real application of these methods. Several well-known classes of problems have been used to test their performance, including some realistic scenarios as described above. The results of this thesis provide a better understanding of which of the available methods are better in practical situations. Formal and empirical explanations of their behaviour are presented. Heuristic search is shown to considerably reduce space and time requirements in most situations. In particular, the first systematic results showing the advantages of applying precalculated multiobjective heuristics are presented. The thesis also contributes an improved method for heuristic precalculation, and explores the convenience of more informed precalculated heuristics. This work is partially funded by the Consejería de Economía, Innovación, Ciencia y Empresa, Junta de Andalucía (Spain), reference P07-TIC-0301.
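The notion of Pareto-optimality used in the abstract above can be made concrete with a small sketch. It filters a set of cost vectors (for example, the costs of candidate paths under two objectives) down to the non-dominated ones, assuming every objective is minimized. This is a plain quadratic-time filter written for illustration, not one of the thesis's search algorithms; the names `dominates` and `pareto_front` and the sample costs are assumptions.

```python
from typing import List, Sequence, Tuple

Cost = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(costs: Sequence[Cost]) -> List[Cost]:
    """Return the non-dominated cost vectors, in input order, without duplicates."""
    front: List[Cost] = []
    for c in costs:
        if c in front:
            continue  # skip exact duplicates
        if any(dominates(other, c) for other in costs):
            continue  # some other vector is at least as good everywhere
        front.append(c)
    return front


# Hypothetical path costs as (distance, time) pairs.
path_costs = [(1, 5), (2, 2), (3, 3), (5, 1), (2, 2)]
print(pareto_front(path_costs))  # → [(1, 5), (2, 2), (5, 1)]
```

Here (3, 3) is discarded because (2, 2) is better in both objectives, while the three remaining vectors are mutually incomparable tradeoffs.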

    Seventh Biennial Report : June 2003 - March 2005

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, a joint spatio-temporal modelling is needed, in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at fixed time, to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernov

    A REST Model for High Throughput Scheduling in Computational Grids

    Current grid computing architectures have been based on cluster management and batch queuing systems, extended to a distributed, federated domain. These have shown shortcomings in terms of scalability, stability, and modularity. To address these problems, this dissertation applies architectural styles from the Internet and Web to the domain of generic computational grids. Using the REST style, a flexible model for grid resource interaction is developed which removes the need for any centralised services or specific protocols, thereby allowing a range of implementations and layering of further functionality. The context for resource interaction is a generalisation and formalisation of the Condor ClassAd match-making mechanism. This set-theoretic model is described in depth, including the advantages and features which it realises. This RESTful style is also motivated by operational experience with existing grid infrastructures, and the design, operation, and performance of a proto-RESTful grid middleware package named DIRAC. This package was designed to provide for the LHCb particle physics experiment's “off-line” computational infrastructure, and was first exercised during a 6-month data challenge which utilised over 670 years of CPU time and produced 98 TB of data through 300,000 tasks executed at computing centres around the world. The design of DIRAC and performance measures from the data challenge are reported. The main contribution of this work is the development of a REST model for grid resource interaction. In particular, it allows resource templating for scheduling queues which provide a novel distributed and scalable approach to resource scheduling on the grid.
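The match-making idea mentioned in the abstract above can be sketched as follows: each party publishes an advertisement with attributes and a requirements expression, and two advertisements match when each one's requirements hold against the other's attributes. This is a simplified illustration of symmetric matching only; it is not the dissertation's set-theoretic model and does not use the real ClassAd language. The `Ad` class, the `matches` function, and all attribute names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

Attrs = Dict[str, Any]


@dataclass
class Ad:
    """A toy advertisement: descriptive attributes plus a requirements
    predicate evaluated against the other party's attributes."""
    attrs: Attrs
    requirements: Callable[[Attrs], bool]


def matches(a: Ad, b: Ad) -> bool:
    """Symmetric match: each ad's requirements must accept the other ad."""
    return a.requirements(b.attrs) and b.requirements(a.attrs)


# Hypothetical job and machine advertisements.
job = Ad(
    attrs={"owner": "alice", "memory_needed_mb": 2048},
    requirements=lambda m: m.get("os") == "linux" and m.get("memory_mb", 0) >= 2048,
)
big_machine = Ad(
    attrs={"os": "linux", "memory_mb": 4096},
    requirements=lambda j: j.get("memory_needed_mb", 0) <= 4096,
)
small_machine = Ad(
    attrs={"os": "linux", "memory_mb": 1024},
    requirements=lambda j: j.get("memory_needed_mb", 0) <= 1024,
)

print(matches(job, big_machine))    # → True
print(matches(job, small_machine))  # → False
```

Because the check is symmetric, neither side needs a central service to decide on its behalf; any party holding both advertisements can evaluate the match.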