105 research outputs found

    A Web Aggregation Approach for Distributed Randomized PageRank Algorithms

    Full text link
    The PageRank algorithm employed at Google assigns a measure of importance to each web page for ranking search results. In our recent papers, we have proposed a distributed randomized approach for this algorithm, where web pages are treated as agents computing their own PageRank by communicating with linked pages. This paper builds upon this approach to reduce the computation and communication loads of the algorithm. In particular, we develop a method to systematically aggregate the web pages into groups by exploiting the sparsity inherent in the web. For each group, an aggregated PageRank value is computed, which can then be distributed among the group members. We provide a distributed update scheme for the aggregated PageRank along with an analysis of its convergence properties. The method is especially motivated by results on singular perturbation techniques for large-scale Markov chains and multi-agent consensus.
    Comment: To appear in the IEEE Transactions on Automatic Control, 201
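    As a rough illustration of the aggregation idea, the sketch below (plain Python/NumPy, not the authors' algorithm; the 4-page link matrix and 2-group partition are hypothetical) runs PageRank on a small group-level chain and then spreads each group's value uniformly over its members:

    import numpy as np

    def pagerank(P, m=0.15, iters=100):
        """Standard power iteration on a column-stochastic matrix P."""
        n = P.shape[0]
        x = np.full(n, 1.0 / n)
        for _ in range(iters):
            x = (1 - m) * P @ x + m / n
        return x

    # Hypothetical 4-page web: entry (i, j) = 1 if page j links to page i.
    A = np.array([[0, 0, 1, 0],
                  [1, 0, 0, 1],
                  [1, 1, 0, 1],
                  [0, 1, 0, 0]], dtype=float)
    P = A / A.sum(axis=0)                          # column-stochastic link matrix

    groups = [np.array([0, 1]), np.array([2, 3])]  # assumed page grouping

    # Group-level chain: average link mass flowing between groups.
    Q = np.array([[P[np.ix_(gi, gj)].sum() / len(gj) for gj in groups]
                  for gi in groups])
    Q = Q / Q.sum(axis=0)                          # renormalize columns

    agg = pagerank(Q)                              # aggregated PageRank per group
    # Distribute each group's value uniformly among its members (the paper
    # uses a more refined intra-group distribution).
    x = np.concatenate([np.full(len(g), agg[i] / len(g))
                        for i, g in enumerate(groups)])
    print(x.round(3))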

    Control Theory: A Mathematical Perspective on Cyber-Physical Systems

    Get PDF
    Control theory is an interdisciplinary field located at the crossroads of pure and applied mathematics with systems engineering and the sciences. Recently, the control field has been facing new challenges motivated by application domains that involve networks of systems. Examples are interacting robots, networks of autonomous cars, and the smart grid. In order to address the new challenges posed by these application disciplines, the special focus of this workshop was on the currently very active field of Cyber-Physical Systems, which forms the underlying basis for many network control applications. A series of lectures in this workshop was devoted to giving an overview of current theoretical developments in Cyber-Physical Systems, emphasizing in particular the mathematical aspects of the field. Special focus was on the dynamics and control of networks of systems, distributed optimization and formation control, fundamentals of nonlinear interconnected systems, as well as open problems in control.

    Application of Graph Neural Networks and graph descriptors for graph classification

    Full text link
    Graph classification is an important area in both modern research and industry. Multiple applications, especially in chemistry and novel drug discovery, encourage rapid development of machine learning models in this area. To keep up with the pace of new research, proper experimental design, fair evaluation, and independent benchmarks are essential, and the design of strong baselines is an indispensable element of such works. In this thesis, we explore multiple approaches to graph classification. We focus on Graph Neural Networks (GNNs), which have emerged as the de facto standard deep learning technique for graph representation learning. Classical approaches, such as graph descriptors and molecular fingerprints, are also addressed. We design a fair experimental evaluation protocol and choose a proper collection of datasets. This allows us to perform numerous experiments and rigorously analyze modern approaches. We arrive at many conclusions that shed new light on the performance and quality of novel algorithms. We investigate the application of the Jumping Knowledge GNN architecture to graph classification, which proves to be an efficient tool for improving base graph neural network architectures. Multiple improvements to baseline models are also proposed and experimentally verified, which constitutes an important contribution to the field of fair model comparison.
    Comment: Master's thesis submitted at AGH University of Science and Technology
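    For concreteness, here is a minimal sketch of a Jumping Knowledge GNN for graph classification, written against PyTorch Geometric; the depth and layer sizes are illustrative assumptions, not the thesis's exact models:

    import torch
    from torch_geometric.nn import GCNConv, JumpingKnowledge, global_mean_pool

    class JKGraphClassifier(torch.nn.Module):
        def __init__(self, in_dim, hidden, num_classes, num_layers=3):
            super().__init__()
            self.convs = torch.nn.ModuleList(
                [GCNConv(in_dim if i == 0 else hidden, hidden)
                 for i in range(num_layers)])
            # "cat" keeps the representation from every layer, letting the
            # classifier pick the most useful neighborhood range per task.
            self.jk = JumpingKnowledge(mode='cat')
            self.lin = torch.nn.Linear(num_layers * hidden, num_classes)

        def forward(self, x, edge_index, batch):
            xs = []
            for conv in self.convs:
                x = conv(x, edge_index).relu()
                xs.append(x)
            x = self.jk(xs)                  # jump: combine all layer outputs
            x = global_mean_pool(x, batch)   # graph-level readout
            return self.lin(x)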

    Understanding human-machine networks: A cross-disciplinary survey

    Get PDF
    © 2017 ACM. In the current hyperconnected era, modern Information and Communication Technology (ICT) systems form sophisticated networks where not only do people interact with other people, but machines also take an increasingly visible and participatory role. Such Human-Machine Networks (HMNs) are embedded in the daily lives of people, both for personal and professional use. They can have a significant impact by producing synergy and innovations. The challenge in designing successful HMNs is that they cannot be developed and implemented in the same manner as networks of machine nodes alone, or following a wholly human-centric view of the network; the problem requires an interdisciplinary approach. Here, we review current research of relevance to HMNs across many disciplines. Extending the previous theoretical concepts of sociotechnical systems, actor-network theory, cyber-physical-social systems, and social machines, we concentrate on the interactions among humans and between humans and machines. We identify eight types of HMNs: public-resource computing, crowdsourcing, web search engines, crowdsensing, online markets, social media, multiplayer online games and virtual worlds, and mass collaboration. We systematically select literature on each of these types and review it with a focus on implications for designing HMNs. Moreover, we discuss risks associated with HMNs and identify emerging design and development trends.

    The Democratization of News - Analysis and Behavior Modeling of Users in the Context of Online News Consumption

    Get PDF
    The invention of the Internet paved the way for the democratization of information. The fact that news became accessible to the broad public held important political promises, such as reaching previously uninformed, and therefore often inactive, citizens, who could now use the Internet to follow daily political events and become politically engaged themselves. While many politicians and journalists were content with this development for a decade, the situation changed with the rise of online social networks (OSNs). These OSNs are now nearly ubiquitous; 67% of Americans get at least part of their news via social media. This trend has further lowered the cost of publishing content. What at first looked like a positive development has meanwhile become a serious problem for democracies. Instead of a virtually unlimited amount of easily accessible information making us wiser, the sheer volume of content becomes a burden: a balanced news selection gives way to a flood of posts and topics filtered through the user's digital social environment, which fosters political polarization and ideological segregation. Moreover, more than half of OSN users no longer trust the news they read (54% worry about fake news). Consistent with this picture, studies report that OSN users are more exposed to the populism of far-left and far-right political actors than people without access to social media. To mitigate the negative effects of this development, my work contributes to the understanding of the problem and pursues fundamental research in behavior modeling; finally, we address the threat of social bots influencing Internet users and present a solution based on behavior modeling. To better understand the news consumption of German-speaking users in OSNs, we analyzed their behavior on Twitter and compared reactions to controversial, in part anti-constitutional, and non-controversial content. In addition, we investigated the existence of echo chambers and similar phenomena. Regarding user behavior, we concentrated on networks that permit more complex user behavior and developed probabilistic behavior-modeling solutions for clustering and for time-series segmentation. Besides the contributions to understanding the problem, we developed solutions for detecting automated accounts. Such bots play an important role in the early phase of the spread of fake news. Our expert model, based on state-of-the-art deep learning techniques, identifies automated accounts by, for example, their behavior. My work raises awareness of this negative development, pursues fundamental research in behavior modeling, addresses the threat of influence by social bots, and presents a solution based on behavior modeling.
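    As a rough illustration of the time-series segmentation component, the sketch below fits a Gaussian hidden Markov model with hmmlearn to a synthetic activity trace; the model choice and the trace are illustrative assumptions, not the thesis's exact method:

    import numpy as np
    from hmmlearn.hmm import GaussianHMM

    rng = np.random.default_rng(0)
    # Synthetic stand-in for a per-user activity signal with three regimes.
    X = np.concatenate([rng.normal(0.0, 0.3, 100),
                        rng.normal(3.0, 0.3, 100),
                        rng.normal(1.0, 0.3, 100)]).reshape(-1, 1)

    # One latent state per behavioral regime; decoding X segments the series.
    model = GaussianHMM(n_components=3, covariance_type='diag', n_iter=50)
    model.fit(X)
    segments = model.predict(X)          # regime label per time step
    print(np.unique(segments[:100]), np.unique(segments[100:200]))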

    Straggler-Resilient Distributed Computing

    Get PDF
    The number and scale of distributed computing systems being built have increased significantly in recent years. Primarily, that is because: i) our computing needs are increasing at a much higher rate than computers are becoming faster, so we need to use more of them to meet demand, and ii) systems that are fundamentally distributed, e.g., because the components that make them up are geographically distributed, are becoming increasingly prevalent. This paradigm shift is the source of many engineering challenges. Among them is the straggler problem, which is caused by latency variations in distributed systems, where faster nodes are held up by slower ones. The straggler problem can significantly impair the effectiveness of distributed systems: a single node experiencing a transient outage (e.g., due to being overloaded) can lock up an entire system.
In this thesis, we consider schemes for making a range of computations resilient against such stragglers, thus allowing a distributed system to proceed in spite of some nodes failing to respond on time. The schemes we propose are tailored for particular computations: distributed matrix-vector multiplication (a fundamental operation in many computing applications), distributed machine learning (in the form of a straggler-resilient first-order optimization method), and distributed tracking of a time-varying process (e.g., tracking the locations of a set of vehicles for a collision avoidance system). The proposed schemes rely on exploiting redundancy, either introduced as part of the scheme or existing naturally in the underlying problem, to compensate for missing results; that is, they are a form of forward error correction for computations. Further, for one of the proposed schemes we exploit redundancy to also improve the effectiveness of multicasting, thus reducing the amount of data that needs to be communicated over the network. Such inter-node communication, like the straggler problem, can significantly limit the effectiveness of distributed systems. For the schemes we propose, we show significant improvements in latency and reliability compared to previous schemes.
Doctoral thesis
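    The forward-error-correction idea can be sketched in a few lines; the sketch below (plain Python/NumPy, an illustration of the general parity-coding idea rather than the thesis's exact schemes) splits A into two row blocks plus one parity block, so the product A @ x survives any single straggling worker:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 3))
    x = rng.standard_normal(3)

    # Encode: two row blocks plus a parity block; any 2 of 3 results suffice.
    A1, A2 = A[:2], A[2:]
    tasks = {'w1': A1, 'w2': A2, 'parity': A1 + A2}

    # Suppose worker 'w2' straggles and never returns its result.
    results = {name: block @ x for name, block in tasks.items() if name != 'w2'}

    # Decode: the missing block is parity minus the block that arrived.
    y_missing = results['parity'] - results['w1']
    y = np.concatenate([results['w1'], y_missing])

    assert np.allclose(y, A @ x)   # full product recovered despite the straggler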

    Eighth Biennial Report: April 2005 – March 2007

    No full text

    Near Data Processing for Efficient and Trusted Systems

    Full text link
    We live in a world that constantly produces data at a rate that only increases with time. Conventional processor architectures fail to process this abundant data efficiently, as they expend significant energy on instruction processing and on moving data over deep memory hierarchies. Furthermore, to process large amounts of data in a cost-effective manner, there is increased demand for remote computation. While cloud service providers have come up with innovative solutions to cater to this increased demand, the security concerns users have about their data remain a strong impediment to wide-scale adoption. An exciting technique in our repertoire for dealing with these challenges is near-data processing (NDP), a data-centric paradigm which moves computation to where data resides. This dissertation exploits NDP both to process the data deluge we face efficiently and to design low-overhead secure hardware. To this end, we first propose Compute Caches, a novel NDP technique. Simple augmentations to the underlying SRAM design enable caches to perform commonly used operations. In-place computation in caches not only avoids excessive data movement over the memory hierarchy, but also significantly reduces instruction processing energy, as independent sub-units inside caches perform computation in parallel. Compute Caches significantly improve performance and reduce the energy expended for a suite of data-intensive applications. Second, this dissertation identifies security advantages of NDP. While the memory bus side channel has received much attention, a low-overhead hardware design that defends against it remains elusive. We observe that smart memory, i.e., memory with compute capability, can dramatically simplify this problem. To exploit this observation, we propose InvisiMem, which uses the logic layer in smart memory to implement cryptographic primitives that address the memory bus side channel efficiently. Our solutions obviate the need for expensive constructs like Oblivious RAM (ORAM) and Merkle trees, and have one to two orders of magnitude lower overheads for performance, space, energy, and memory bandwidth compared to prior solutions. This dissertation also addresses the related vulnerability of the page fault side channel, in which the Operating System (OS) induces page faults to learn an application's address trace and deduces application secrets from it. To tackle it, we propose Sanctuary, which obfuscates the page fault channel while allowing the OS to manage memory as a resource. To do so, we design a novel construct, Oblivious Page Management (OPAM), which is derived from ORAM but customized for the page-management context. We employ near-memory page moves to reduce OPAM's overhead and also propose a novel memory partition to reduce the number of OPAM transactions required. For a suite of cloud applications that process sensitive data, we show that the page fault channel can be tackled at reasonable overheads.
    PhD thesis, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/144139/1/shaizeen_1.pd
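    A back-of-the-envelope sketch of the data-movement argument: compare the bytes that cross the memory bus when a reduction runs at the processor versus next to the data. The array size and 8-byte words are illustrative assumptions:

    import numpy as np

    data = np.arange(1_000_000, dtype=np.int64)   # resides "in memory"
    WORD = data.itemsize                          # 8 bytes per element

    # Conventional core: every word crosses the bus before the reduce.
    bytes_moved_cpu = data.size * WORD

    # Near-data: the reduce runs beside the array; only the 8-byte
    # result crosses the bus.
    result = int(data.sum())
    bytes_moved_ndp = WORD

    print(f"core-side reduce: {bytes_moved_cpu:,} bytes moved")
    print(f"near-data reduce: {bytes_moved_ndp:,} bytes moved")   # 1,000,000x less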