10 research outputs found

Parallel particle swarm optimization based on spark for academic paper co-authorship prediction

Author: Congmin Yang
Huansheng Ning
Liming Chen
Tao Zhu
Yang Zhang
Zhenyu Liu
Publication venue: MDPI AG
Publication date: 20/12/2021

The particle swarm optimization (PSO) algorithm has been widely used in various optimization problems. Although PSO has been successful in many fields, solving optimization problems in big data applications often requires processing massive amounts of data, which traditional PSO on a single machine cannot handle. Several Spark-based parallel PSO algorithms have been proposed; however, almost all of them target numerical optimization problems, and few address big data optimization problems. In this paper, we propose a new Spark-based parallel PSO algorithm to predict the co-authorship of academic papers, which we formulate as an optimization problem over massive academic data. Experimental results show that the proposed parallel PSO can achieve good prediction accuracy
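For readers unfamiliar with the technique named in this abstract, the following is a minimal sketch of textbook global-best PSO on a single machine. It is not the Spark-based parallel algorithm of the paper, and the function name `pso_minimize`, the parameter values (`w`, `c1`, `c2`) and the `sphere` test objective are assumptions chosen purely for illustration.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO (illustrative sketch, single machine)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_val = [objective(p) for p in pos]        # personal best values
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global best so far
    history = [gbest_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Hypothetical test objective: sum of squares, minimised at the origin."""
    return sum(v * v for v in x)


best_pos, best_val, history = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

The inner loop over particles is the part a parallel variant would distribute; how the paper does so on Spark is not described in this abstract.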

Multidisciplinary Digital Publishing Institute

Ulster University's Research Portal

New contributions to spatial partitioning and parallel global illumination algorithms

Author: Garmann Robert
Publication date: 27/09/2000

This thesis lies at the interface of two disciplines in computer science: computer graphics (global illumination) and parallel computing (dynamic partitioning). On the one hand, the hierarchical radiosity algorithm (HRA) - a well-known and efficient global illumination algorithm - is examined with respect to its capability of being parallelized. On the other hand, a dynamic orthogonal recursive bisection tool for the dynamic partitioning of spatially mapped tasks is developed and analyzed theoretically and experimentally. The HRA is a special instance of algorithms that can be formulated as a collection of spatially mapped tasks. As a proof of the practicability of our tool, we apply it to the HRA and observe well-scalable behaviour and useful speedup values
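As a rough illustration of the orthogonal recursive bisection idea mentioned in this abstract, the sketch below splits a static set of points at the median along alternating coordinate axes. It is a simplified static variant, not the dynamic tool developed in the thesis, and the name `orb_partition` and the sample grid are assumptions made for this example.

```python
def orb_partition(points, depth, axis=0):
    """Split points into up to 2**depth groups by recursive median bisection."""
    if depth == 0 or len(points) <= 1:
        return [points]
    # Sort along the current axis and cut at the median index.
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    # Alternate the cutting axis at each level of the recursion.
    next_axis = (axis + 1) % len(points[0])
    return (orb_partition(ordered[:mid], depth - 1, next_axis)
            + orb_partition(ordered[mid:], depth - 1, next_axis))


# A 4 x 4 grid of points split into four spatially compact groups.
points = [(x, y) for x in range(4) for y in range(4)]
parts = orb_partition(points, depth=2)
```

Each group could then be assigned to one processor, so that tasks that are close in space tend to land on the same processor.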

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung

Semi-supervised Eigenvectors for Large-scale Locally-biased Learning

Author: Hansen Toke J.
Mahoney Michael W.
Publication date: 28/04/2013

In many applications, one has side information, e.g., labels that are provided in a semi-supervised manner, about a specific target region of a large data set, and one wants to perform machine learning and data analysis tasks "nearby" that prespecified target region. For example, one might be interested in the clustering structure of a data graph near a prespecified "seed set" of nodes, or one might be interested in finding partitions in an image that are near a prespecified "ground truth" set of pixels. Locally-biased problems of this sort are particularly challenging for popular eigenvector-based machine learning and data analysis tools. At root, the reason is that eigenvectors are inherently global quantities, thus limiting the applicability of eigenvector-based methods in situations where one is interested in very local properties of the data. In this paper, we address this issue by providing a methodology to construct semi-supervised eigenvectors of a graph Laplacian, and we illustrate how these locally-biased eigenvectors can be used to perform locally-biased machine learning. These semi-supervised eigenvectors capture successively-orthogonalized directions of maximum variance, conditioned on being well-correlated with an input seed set of nodes that is assumed to be provided in a semi-supervised manner. We show that these semi-supervised eigenvectors can be computed quickly as the solution to a system of linear equations; and we also describe several variants of our basic method that have improved scaling properties. We provide several empirical examples demonstrating how these semi-supervised eigenvectors can be used to perform locally-biased learning; and we discuss the relationship between our results and recent machine learning algorithms that use global eigenvectors of the graph Laplacian

arXiv.org e-Print Archive

Online Research Database In Technology

Improving scalability of large-scale distributed Spiking Neural Network simulations on High Performance Computing systems using novel architecture-aware streaming hypergraph partitioning

Author: Fernandez Musoles Carlos
Publication date: 01/12/2020

After theory and experimentation, modelling and simulation is regarded as the third pillar of science, helping scientists to further their understanding of a complex system. In recent years there has been a growing scientific focus on computational neuroscience as a means to understand the brain and its functions, with large international projects (Human Brain Project, Brain Activity Map, MindScope and China Brain Project) aiming to further our knowledge of high-level cognitive functions. They are a testament to the enormous interest, difficulty and importance of solving the mysteries of the brain. Spiking Neural Network (SNN) simulations are widely used in the domain to facilitate experimentation. Scaling SNN simulations to large networks usually results in a more-than-linear increase in computational complexity. The computing resources required for brain-scale simulation far surpass the capabilities of today's personal computers. If those demands are to be met, distributed computation models need to be adopted, since improvements in individual processor speed are slowing down due to physical limitations on heat dissipation. This is a significant change that requires careful management of the workload at many levels: partition of work, communication and workload balancing, efficient inter-process communication and efficient use of available memory. If large-scale neuronal network models are to be run successfully, simulators must consider these, and offer a viable solution to the challenges they pose. Large-scale SNN simulations exhibit most of the issues that general HPC systems face in large distributed computations. Commonly used workload distribution algorithms (round robin, random and manual allocation) do not take into consideration the connectivity locality that is natural in biological networks, which can lead to increased communication requirements when distributing the simulation across multiple computing nodes. 
State-of-the-art SNN simulations use dense communication collectives to distribute spike data. The common method of point-to-point communication in distributed computation is through dense patterns. Sparse communication collectives have been suggested to incur lower overheads when the application's pattern of communication is sparse. In this work we characterise the bottlenecks in communication-bound SNN simulations and identify communication balance and sparsity as the main contributors to scalability. We propose hypergraph partitioning to distribute neurons across computing nodes to minimise communication (increasing sparsity). A hypergraph is a generalisation of a graph, where a (hyper)edge can link 2 or more vertices at once. Coupled with a novel use of a sparse-aware communication collective, computational efficiency increases by up to 40.8 percentage points and simulation time is reduced by up to 73%, compared to the common round-robin allocation in neuronal simulators. HPC systems have, by design, highly hierarchical communication network links, with qualitative differences in communication speed and latency between computing nodes. This can create a mismatch between the distributed simulation communication patterns and the physical capabilities of the hardware. If large distributed simulations are to take full advantage of these systems, the communication properties of the HPC system need to be taken into consideration when allocating workload, so as to route frequent, heavy communication through fast network links. Strategies that consider the heterogeneous physical communication capabilities are called architecture-aware. After demonstrating that hypergraph partitioning leads to more efficient workload allocation in SNN simulations, this thesis proposes a novel sequential hypergraph partitioning algorithm that incorporates network bandwidth via profiling. 
This leads to a significant reduction in execution time (up to 14x speedup in synthetic benchmark simulations compared to architecture-agnostic partitioners). The motivating context of this work is large-scale brain simulations; however, in the era of social media, large graphs and hypergraphs are increasingly relevant in many other scientific applications. A common feature of such graphs is that they are too big for a single machine to cope with, both in terms of performance and memory requirements. State-of-the-art multilevel partitioning has been shown to struggle to scale to large graphs in distributed memory, not just because it takes a long time to process them, but also because it requires full knowledge of the graph (not possible in dynamic graphs) and needs to fit the graph entirely in memory (not possible for very large graphs). To address those limitations we propose a parallel implementation of our architecture-aware streaming hypergraph partitioning algorithm (HyperPRAW) to model distributed applications. Results demonstrate that HyperPRAW produces consistent speedup over previous streaming approaches that only consider hyperedge overlap (up to 5.2x speedup). Compared to a multilevel global partitioner on dense hypergraphs (those with high average cardinality), HyperPRAW is able to produce workload allocations that speed up runtime in a synthetic simulation benchmark (up to 4.3x). HyperPRAW has the potential to scale to very large hypergraphs as it only requires local information to make allocation decisions, with an order of magnitude smaller memory footprint than global partitioners. The combined contributions of this thesis lead to a novel, parallel, scalable, streaming hypergraph partitioning algorithm (HyperPRAW) that can be used to help scale large distributed simulations in HPC systems. 
HyperPRAW helps tackle three of the main scalability challenges: it produces highly balanced distributed computation and communication, minimising idle time between computing nodes; it reduces the communication overhead by placing frequently communicating simulation elements close to each other (where the communication cost is minimal); and it provides a solution with a reasonable memory footprint that allows tackling larger problems than state-of-the-art alternatives such as global multilevel partitioning
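The contrast this abstract draws between round-robin allocation and locality-aware partitioning can be made concrete with a toy example. The sketch below stores a hypergraph as a list of vertex sets and counts the hyperedges that span more than one partition, used here as a crude proxy for inter-node communication. It is not HyperPRAW, and the helper names (`round_robin`, `cut_hyperedges`) and the sample data are assumptions for illustration only.

```python
def round_robin(n_vertices, n_parts):
    """Assign vertex v to partition v mod n_parts, ignoring connectivity."""
    return [v % n_parts for v in range(n_vertices)]


def cut_hyperedges(hyperedges, assignment):
    """Count hyperedges whose vertices span more than one partition."""
    return sum(1 for edge in hyperedges
               if len({assignment[v] for v in edge}) > 1)


# Toy hypergraph: each hyperedge links 2 or more vertices at once.
hyperedges = [{0, 1, 2}, {1, 2}, {3, 4, 5}, {4, 5}]

rr = round_robin(6, 2)          # [0, 1, 0, 1, 0, 1]
local = [0, 0, 0, 1, 1, 1]      # hand-made, locality-aware allocation

rr_cut = cut_hyperedges(hyperedges, rr)        # 4: every hyperedge is split
local_cut = cut_hyperedges(hyperedges, local)  # 0: no hyperedge is split
```

Both allocations are perfectly balanced (three vertices per partition), yet only the second keeps connected vertices together, which is the property a hypergraph partitioner tries to optimise.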

White Rose E-theses Online

Large-scale Machine Learning in High-dimensional Datasets

Author: Hansen Toke Jansen
Publication venue: Technical University of Denmark
Publication date: 01/01/2013

Online Research Database In Technology

Designing for adaptability in architecture

Author: Robert Schmidt III (7175630)
Publication date: 01/01/2014

The research is framed on the premise that designing buildings that can adapt by accommodating change more easily and cost-effectively provides an effective means to a desired end: a more sustainable built environment. In this context, adaptability can be viewed as a means to decrease the amount of new construction (reduce), (re)activate underused or vacant building stock (reuse) and enhance disassembly/deconstruction of components (reuse, recycle) - prolonging the useful life of buildings (reduce, reuse, recycle). The aim of the research is to gain a holistic overview of the concept of adaptability in the construction industry and provide an improved framework to design for, deploy and implement adaptability. An over-arching research question was posited to guide the inquiry: how can architects understand, communicate, design for and test the concept of adaptability in the context of the design process? The research followed Dubois and Gadde's (2002) systematic combining as an over-arching approach that continuously moves between the empirical world and theoretical models, allowing the co-evolution of data collection and theory from the beginning as part of a non-linear process with the objective of matching theory with reality. An initial framework was abducted from a preliminary collection of data, from which a set of mixed research methods was deployed to explore adaptability (interviews, building case studies, dependency structural matrices, practitioner surveys and a workshop). Emergent from the data is an expanded and revised theory on designing for adaptability consisting of concepts, models and propositions. The models illustrate many of the causal links between the physical design structure of the building (e.g. plan depth, storey height) and the soft contingencies of a messy design/construction/occupation process (e.g. procurement route, funding methods, stakeholder mindsets). 
In an effort to enhance building adaptability, the abducted propositions suggest a shift in the way the industry values buildings and conducts aspects of the design process, and in how designers approach designing for adaptability

Loughborough University Institutional Repository

31st International Symposium on Theoretical Aspects of Computer Science: STACS '14, March 5th to March 8th, 2014, Lyon, France

Author: STACS <31 2014, Lyon>
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 01/03/2014

Digitale Bibliothek Thüringen

Algorithms and Techniques for Dynamic Resource Management across Cloud-Edge Resource Spectrum

Author: Shekhar Shashank
Publication venue: VANDERBILT

Protocolos de pertenencia a grupos para entornos dinámicos

Author: Bañuls Polo María del Carmen
Publication venue: Universitat Politecnica de Valencia
Publication date: 06/05/2008

Distributed systems are today of fundamental importance among information systems, owing to their potential for fault tolerance and scalability, which allows them to meet the needs of current, increasingly demanding applications. On the other hand, developing distributed applications also presents specific difficulties, precisely in delivering the scalability, fault tolerance and high availability that constitute their advantages. It is therefore very useful to have distributed components specifically designed to provide, at a lower level, a set of well-defined services on which higher-level applications can more easily build their own semantics. This is the case of group-oriented services, which are widely used by distributed applications and allow them to abstract away from the details of communication. Such services provide basic primitives for communication between two group members or, above all, for transmitting messages to the whole group, with specific guarantees. Group membership services, on which this thesis focuses, are a particular case of group-oriented services. They provide their users with a view of the set of processes or machines in the system that remain simultaneously connected and correct. Moreover, the various participants receive this information with specific consistency guarantees. Membership services are thus a fundamental component for the development of group communication systems and other distributed applications. The group membership problem has been widely treated in the literature from both a theoretical and a practical point of view, and there are multiple usable implementations of membership services. Despite this, the definition of the problem is not unique. On the contrary, depending [...]
Bañuls Polo, MDC. (2006). Protocolos de pertenencia a grupos para entornos dinámicos [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/1886

Crossref

RiuNet