
    Mining and Managing Large-Scale Temporal Graphs

    Large-scale temporal graphs are everywhere in our daily life. From online social networks, mobile networks, and brain networks to computer systems, entities in these large complex systems communicate with each other, and their interactions evolve over time. Unlike traditional graphs, temporal graphs are dynamic: both topologies and attributes on nodes/edges may change over time. On the one hand, the dynamics have inspired new applications that rely on mining and managing temporal graphs. On the other hand, the dynamics also raise new technical challenges. First, it is difficult to discover or retrieve knowledge from complex temporal graph data. Second, because of the extra time dimension, we also face new scalability problems. To address these challenges, we need to develop new methods that model temporal information in graphs so that we can deliver useful knowledge, new queries with temporal and structural constraints through which users can obtain the desired knowledge, and new algorithms that are cost-effective for both mining and management tasks.

    In this dissertation, we discuss our recent work on mining and managing large-scale temporal graphs. First, we investigate two mining problems: node ranking and link prediction. In these works, temporal graphs model the data generated by computer systems and online social networks. We formulate data mining tasks that extract knowledge from temporal graphs. The discovered knowledge can help domain experts identify critical alerts in system monitoring applications and recover the complete traces of information propagation in online social networks. To address computational efficiency problems, we leverage the unique properties of temporal graphs to simplify the mining processes. The resulting mining algorithms scale well to temporal graphs with millions of nodes and billions of edges. Through experimental studies on real-life and synthetic data, we confirm the effectiveness and efficiency of our algorithms.

    Second, we focus on temporal graph management problems. In these studies, temporal graphs model datacenter networks, mobile networks, and subscription relationships between stream queries and data sources. We formulate graph queries that retrieve knowledge to support applications in cloud service placement, information routing in mobile networks, and query assignment in stream processing systems. We investigate three types of queries: subgraph matching, temporal reachability, and graph partitioning. By utilizing the relatively stable components in these temporal graphs, we develop flexible data management techniques that enable fast query processing and handle graph dynamics. We evaluate the soundness of the proposed techniques on both real and synthetic data.

    Through these studies, we have learned valuable lessons. For temporal graph mining, the temporal dimension does not necessarily increase computational complexity; on the contrary, it may reduce it if temporal information is wisely utilized. For temporal graph management, temporal graphs in real applications often include relatively stable components, which help us develop flexible data management techniques that enable fast query processing and handle dynamic changes in temporal graphs.
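    One of the query types mentioned above, temporal reachability, asks which nodes can be reached via time-respecting paths, i.e., paths whose contact timestamps are non-decreasing. The sketch below is the textbook one-pass earliest-arrival computation over time-sorted contacts, not the dissertation's own algorithm; the function name and the `(u, v, t)` edge format are illustrative assumptions.

```python
def temporal_reachable(edges, source, t_start):
    """Nodes reachable from `source` via time-respecting paths.

    `edges` is a list of (u, v, t) contacts. A path respects time if the
    timestamps along it are non-decreasing, starting no earlier than
    `t_start`.
    """
    # earliest[v] = earliest time at which v can be reached. Scanning the
    # contacts once in time order suffices, because any prefix of a
    # time-respecting path uses only earlier (or equal-time) contacts.
    earliest = {source: t_start}
    for u, v, t in sorted(edges, key=lambda e: e[2]):
        if u in earliest and earliest[u] <= t:
            if v not in earliest or t < earliest[v]:
                earliest[v] = t
    return set(earliest)
```

    Note how the temporal dimension simplifies rather than complicates the computation here, echoing the lesson stated in the abstract: a single time-ordered scan replaces a general graph search.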

    GPU ν™˜κ²½μ—μ„œ λ¨Έμ‹ λŸ¬λ‹ μ›Œν¬λ‘œλ“œμ˜ 효율적인 μ‹€ν–‰

    ν•™μœ„λ…Όλ¬Έ(박사) -- μ„œμšΈλŒ€ν•™κ΅λŒ€ν•™μ› : κ³΅κ³ΌλŒ€ν•™ 컴퓨터곡학뢀, 2023. 2. 전병곀.Machine learning (ML) workloads are becoming increasingly important in many types of real-world applications. We attribute this trend to the development of software systems for ML, which have facilitated the widespread adoption of heterogeneous accelerators such as GPUs. Todays ML software stack has made great improvements in terms of efficiency, however, not all use cases are well supported. In this dissertation, we study how to improve execution efficiency of ML workloads on GPUs from a software system perspective. We identify workloads where current systems for ML have inefficiencies in utilizing GPUs and devise new system techniques that handle those workloads efficiently. We first present Nimble, a ML execution engine equipped with carefully optimized GPU scheduling. The proposed scheduling techniques can be used to improve execution efficiency by up to 22.34Γ—. Second, we propose Orca, an inference serving system specialized for Transformer-based generative models. By incorporating new scheduling and batching techniques, Orca significantly outperforms state-of-the-art systems – 36.9Γ— throughput improvement at the same level of latency. The last topic of this dissertation is WindTunnel, a framework that translates classical ML pipelines into neural networks, providing GPU training capabilities for classical ML workloads. WindTunnel also allows joint training of pipeline components via backpropagation, resulting in improved accuracy over the original pipeline and neural network baselines.졜근 κ²½ν–₯을 보면 λ‹€μ–‘ν•œ μ’…λ₯˜μ˜ μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ—μ„œ λ¨Έμ‹  λŸ¬λ‹(ML) μ›Œν¬λ‘œλ“œκ°€ 점 점 더 μ€‘μš”ν•˜κ²Œ ν™œμš©λ˜κ³  μžˆλ‹€. μ΄λŠ” ML용 μ‹œμŠ€ν…œ μ†Œν”„νŠΈμ›¨μ–΄μ˜ κ°œλ°œμ„ 톡해 GPU 와 같은 이기쒅 κ°€μ†κΈ°μ˜ κ΄‘λ²”μœ„ν•œ ν™œμš©μ΄ κ°€λŠ₯ν•΄μ‘ŒκΈ° λ•Œλ¬Έμ΄λ‹€. 
λ§Žμ€ μ—°κ΅¬μžλ“€μ˜ 관심 덕에 ML용 μ‹œμŠ€ν…œ μ†Œν”„νŠΈμ›¨μ–΄ μŠ€νƒμ€ λΆ„λͺ… ν•˜λ£¨κ°€ λ‹€λ₯΄κ²Œ κ°œμ„ λ˜κ³  μžˆμ§€λ§Œ, μ—¬μ „νžˆ λͺ¨λ“  μ‚¬λ‘€μ—μ„œ 높은 νš¨μœ¨μ„±μ„ λ³΄μ—¬μ£Όμ§€λŠ” λͺ»ν•œλ‹€. 이 ν•™μœ„λ…Όλ¬Έμ—μ„œλŠ” μ‹œμŠ€ ν…œ μ†Œν”„νŠΈμ›¨μ–΄ κ΄€μ μ—μ„œ GPU ν™˜κ²½μ—μ„œ ML μ›Œν¬λ‘œλ“œμ˜ μ‹€ν–‰ νš¨μœ¨μ„±μ„ κ°œμ„ ν•˜λŠ” 방법을 μ—°κ΅¬ν•œλ‹€. κ΅¬μ²΄μ μœΌλ‘œλŠ” μ˜€λŠ˜λ‚ μ˜ ML용 μ‹œμŠ€ν…œμ΄ GPUλ₯Ό 효율적으둜 사 μš©ν•˜μ§€ λͺ»ν•˜λŠ” μ›Œν¬λ‘œλ“œλ₯Ό 규λͺ…ν•˜κ³  더 λ‚˜μ•„κ°€μ„œ ν•΄λ‹Ή μ›Œν¬λ‘œλ“œλ₯Ό 효율적으둜 μ²˜λ¦¬ν•  수 μžˆλŠ” μ‹œμŠ€ν…œ κΈ°μˆ μ„ κ³ μ•ˆν•˜λŠ” 것을 λͺ©ν‘œλ‘œ ν•œλ‹€. λ³Έ λ…Όλ¬Έμ—μ„œλŠ” λ¨Όμ € μ΅œμ ν™”λœ GPU μŠ€μΌ€μ€„λ§μ„ κ°–μΆ˜ ML μ‹€ν–‰ 엔진인 Nimble 을 μ†Œκ°œν•œλ‹€. μƒˆ μŠ€μΌ€μ€„λ§ 기법을 톡해 Nimble은 κΈ°μ‘΄ λŒ€λΉ„ GPU μ‹€ν–‰ νš¨μœ¨μ„± 을 μ΅œλŒ€ 22.34λ°°κΉŒμ§€ ν–₯μƒμ‹œν‚¬ 수 μžˆλ‹€. λ‘˜μ§Έλ‘œ Transformer 기반의 생성 λͺ¨λΈμ— νŠΉν™”λœ μΆ”λ‘  μ„œλΉ„μŠ€ μ‹œμŠ€ν…œ Orcaλ₯Ό μ œμ•ˆν•œλ‹€. μƒˆλ‘œμš΄ μŠ€μΌ€μ€„λ§ 및 batching κΈ° μˆ μ— νž˜μž…μ–΄, OrcaλŠ” λ™μΌν•œ μˆ˜μ€€μ˜ 지연 μ‹œκ°„μ„ κΈ°μ€€μœΌλ‘œ ν–ˆμ„ λ•Œ κΈ°μ‘΄ μ‹œμŠ€ν…œ λŒ€λΉ„ 36.9λ°° ν–₯μƒλœ μ²˜λ¦¬λŸ‰μ„ 보인닀. λ§ˆμ§€λ§‰μœΌλ‘œ 신경망을 μ‚¬μš©ν•˜μ§€ μ•ŠλŠ” κ³ μ „ ML νŒŒμ΄ν”„λΌμΈμ„ μ‹ κ²½λ§μœΌλ‘œ λ³€ν™˜ν•˜λŠ” ν”„λ ˆμž„μ›Œν¬ WindTunnel을 μ†Œκ°œν•œλ‹€. 이 λ₯Ό 톡해 κ³ μ „ ML νŒŒμ΄ν”„λΌμΈ ν•™μŠ΅μ„ GPUλ₯Ό μ‚¬μš©ν•΄ 진행할 수 있게 λœλ‹€. 
λ˜ν•œ WindTunnel은 gradient backpropagation을 톡해 νŒŒμ΄ν”„λΌμΈμ˜ μ—¬λŸ¬ μš”μ†Œλ₯Ό ν•œ λ²ˆμ— κ³΅λ™μœΌλ‘œ ν•™μŠ΅ ν•  수 있으며, 이λ₯Ό 톡해 νŒŒμ΄ν”„λΌμΈμ˜ 정확도λ₯Ό 더 ν–₯μƒμ‹œν‚¬ 수 μžˆμŒμ„ ν™•μΈν•˜μ˜€λ‹€.Chapter 1 Introduction 1 1.1 Motivation 1 1.2 Dissertation Overview 2 1.3 Previous Publications 4 1.4 Roadmap 5 Chapter 2 Background 6 2.1 ML Workloads 6 2.2 The GPU Execution Model 7 2.3 GPU Scheduling in ML Frameworks 8 2.4 Engine Scheduling in Inference Servers 10 2.5 Inference Procedure of Generative Models 11 Chapter 3 Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning 17 3.1 Introduction 17 3.2 Motivation 21 3.3 System Design 24 3.3.1 Ahead-of-time (AoT) Scheduling 25 3.3.2 Stream Assignment Algorithm 28 3.4 Evaluation 32 3.4.1 Inference Latency 36 3.4.2 Impact of Multi-stream Execution 36 3.4.3 Training Throughput 37 3.5 Summary 38 Chapter 4 Orca: A Distributed Serving System for Transformer-Based Generative Models 40 4.1 Introduction 40 4.2 Challenges and Proposed Solutions 44 4.3 Orca System Design 51 4.3.1 Distributed Architecture 51 4.3.2 Scheduling Algorithm 54 4.4 Implementation 60 4.5 Evaluation 61 4.5.1 Engine Microbenchmark 63 4.5.2 End-to-end Performance 66 4.6 Summary 71 Chapter 5 WindTunnel: Towards Differentiable ML Pipelines Beyond a Single Model 72 5.1 Introduction 72 5.2 Pipeline Translation 78 5.2.1 Translating Arithmetic Operators 80 5.2.2 Translating Algorithmic Operators: GBDT 81 5.2.3 Translating Algorithmic Operators for Categorical Features 85 5.2.4 Fine-Tuning 87 5.3 Implementation 87 5.4 Experiments 88 5.4.1 Experimental Setup 89 5.4.2 Overall Performance 94 5.4.3 Ablation Study 95 5.5 Summary 98 Chapter 6 Related Work 99 Chapter 7 Conclusion 105 Bibliography 107 Appendix A Appendix: Nimble 131 A.1 Proofs on the Stream Assignment Algorithm of Nimble 131 A.1.1 Proof of Theorem 1 132 A.1.2 Proof of Theorem 2 134 A.1.3 Proof of Theorem 3 135 A.1.4 Time Complexity Analysis 137 A.2 Evaluation Results on Various GPUs 139 
A.3 Evaluation Results on Different Training Batch Sizes 139λ°•

    Graph database management systems: storage, management and query processing

    The proliferation of graph data, generated from diverse sources, has given rise to many research efforts concerning graph analysis. Interactions in social networks, publication networks, protein networks, software code dependencies, and transportation systems are all examples of graph-structured data originating from a variety of application domains and demonstrating different characteristics. In recent years, graph database management systems (GDBMS) have been introduced for the management and analysis of graph data. Motivated by the growing number of real-life applications making use of graph database systems, this thesis focuses on the effectiveness and efficiency aspects of such systems. Specifically, we study the following topics relevant to graph database systems: (i) modeling large-scale applications in GDBMS; (ii) storage and indexing issues in GDBMS; and (iii) efficient query processing in GDBMS.

    In this thesis, we adopt two different application scenarios to examine how graph database systems can model complex features and perform relevant queries on each of them. Motivated by the popular application of social network analytics, we selected Twitter, a microblogging platform, to conduct our detailed analysis. Addressing limitations of existing models, we propose a data model for the Twittersphere that proactively captures Twitter-specific interactions. We examine the feasibility of running analytical queries on GDBMS and offer an empirical analysis of the performance of the proposed approach. Next, we consider a use case of modeling software code dependencies in a graph database system, and investigate how these systems can support capturing the evolution of a codebase over time. We study a code comprehension tool that extracts software dependencies and stores them in a graph database. On a versioned graph built using a very large codebase, we demonstrate how existing code comprehension queries can be efficiently processed and also show the benefit of running queries across multiple versions.

    Another important aspect of this thesis is the study of storage aspects of graph systems. The throughput of many graph queries can be significantly affected by disk I/O performance; therefore, graph database systems need to focus on effective graph storage that optimises disk operations. We observe that the locality of edges plays an important role, and we address the edge-labeling problem, which aims to label both incoming and outgoing edges of a graph so as to maximize the 'edge-consecutiveness' metric. By achieving a better layout and locality of edges on disk, we show that our proposed algorithms result in significantly improved disk I/O performance, leading to faster execution of neighbourhood queries.

    Some applications require the integrated processing of queries from the graph and textual domains within a graph database system. Aggregating these dimensions facilitates gaining key insights in several application scenarios. For example, in a social network setting, one may want to find the closest k users in the network (graph traversal) who talk about a particular topic A (textual search). Motivated by such practical use cases, in this thesis we study the top-k social-textual ranking query, which essentially requires an efficient combination of a keyword search query with a graph traversal. We propose algorithms that leverage graph partitioning techniques, based on the premise that socially close users will be placed within the same partition, allowing more localised computations. We show that our proposed approaches achieve significantly better results than standard baselines and demonstrate robust behaviour under changing parameters.
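    The top-k social-textual ranking query described above can be made concrete with a naive baseline that combines BFS hop distance with keyword filtering. This sketch deliberately scans the whole reachable graph; the thesis's partition-based algorithms avoid exactly that cost. The function name, the adjacency-dict representation, and the substring keyword match are all illustrative assumptions.

```python
import heapq
from collections import deque

def top_k_social_textual(adj, docs, query_terms, source, k):
    """Return up to k users closest to `source` (in BFS hops) whose
    text in `docs` contains every term in `query_terms`."""
    # Graph traversal: BFS distances from the query user.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Textual search: keep users matching every keyword, ranked by hops.
    matches = [
        (d, u) for u, d in dist.items()
        if u != source and all(t in docs.get(u, "") for t in query_terms)
    ]
    return [u for _, u in heapq.nsmallest(k, matches)]
```

    A partition-aware variant would first explore the partition containing `source` and only expand to neighbouring partitions when fewer than k matches are found locally, which is the intuition behind the localised computation the abstract mentions.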

    Big Graph Analyses: From Queries to Dependencies and Association Rules


    Towards effective analysis of big graphs: from scalability to quality

    This thesis investigates the central issues underlying graph analysis, namely, scalability and quality. We first study the incremental problems for graph queries, which aim to compute the changes to the old query answer in response to updates to the input graph. An incremental problem is called bounded if its cost is decided by the sizes of the query and the changes only. No matter how desirable, however, our first results are negative: for common graph queries such as graph traversal, connectivity, keyword search, and pattern matching, the incremental problems are unbounded. In light of the negative results, we propose two new characterizations for the effectiveness of incremental computation, and show that the incremental computations above can still be effectively conducted, by either reducing the computations on big graphs to small data, or incrementalizing batch algorithms by minimizing unnecessary recomputation.

    We next study the problems of improving the quality of graphs. To uniquely identify entities represented by vertices in a graph, we propose a class of keys that are recursively defined in terms of graph patterns and are interpreted with subgraph isomorphism. As an application, we study the entity matching problem, which is to find all pairs of entities in a graph that are identified by a given set of keys. Although the problem is proved to be intractable and cannot be parallelized in logarithmic rounds, we provide two parallel scalable algorithms for it. In addition, to catch numeric inconsistencies in real-life graphs, we extend graph functional dependencies with linear arithmetic expressions and comparison predicates, referred to as NGDs. Indeed, NGDs strike a balance between expressivity and complexity: if we allow non-linear arithmetic expressions, even of degree at most 2, the satisfiability and implication problems become undecidable. A localizable incremental algorithm is developed to detect errors using NGDs, where the cost is determined by small neighborhoods of the nodes in the updates instead of the entire graph.

    Finally, a rule-based method to clean graphs is proposed. We extend graph entity dependencies (GEDs) as data quality rules. Given a graph, a set of GEDs, and a block of ground truth, we fix violations of GEDs in the graph by combining data repairing and object identification. The method finds certain fixes to errors detected by GEDs, i.e., as long as the GEDs and the ground truth are correct, the fixes are assured correct as their logical consequences. Several fundamental results underlying the method are established, and an algorithm is developed to implement the method. We also parallelize the method and guarantee to reduce its running time as the number of processors increases.
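    The idea of a localizable incremental check — re-examining only the neighborhood of an update rather than the whole graph — can be sketched with a toy rule standing in for an NGD. Here the "dependency" is simply that adjacent nodes' numeric attributes must differ by at most a bound; the real NGDs involve graph patterns with linear arithmetic and comparison predicates, and all names below are illustrative assumptions.

```python
def incremental_violations(edges_by_node, attr, bound, updated):
    """After the nodes in `updated` change, re-check only edges incident
    to them, so the cost depends on the update's neighborhood rather
    than the entire graph (the 'localizable' property)."""
    seen, bad = set(), []
    for u in updated:
        for v in edges_by_node.get(u, ()):
            edge = (min(u, v), max(u, v))  # undirected; dedupe pairs
            if edge in seen:
                continue
            seen.add(edge)
            # Toy rule: neighbor attributes may differ by at most `bound`.
            if abs(attr[u] - attr[v]) > bound:
                bad.append(edge)
    return bad
```

    The same skeleton extends to pattern-based rules by replacing the per-edge predicate with a local pattern match rooted at the updated nodes.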

    Change Management Systems for Seamless Evolution in Data Centers

    Revenue for data centers today is highly dependent on the satisfaction of their enterprise customers. These customers often require various features to migrate their businesses and operations to the cloud. Thus, clouds today introduce new features at a swift pace to onboard new customers and to meet the needs of existing ones. This pace of innovation continues to grow superlinearly; e.g., Amazon deployed 1400 new features in 2017. However, such a rapid pace of evolution adds complexity both for users and for the cloud. Clouds struggle to keep up with the deployment speed, and users struggle to learn which features they need and how to use them. The pace of these evolutions has brought us to a tipping point: we can no longer use rules of thumb to deploy new features, and customers need help to identify which features they need. We have built two systems, Janus and Cherrypick, to address these problems. Janus helps data center operators roll out new changes to the data center network. It automatically adapts to the data center topology, routing, traffic, and failure settings. The system reduces the risk of new deployments for network operators, as they can now pick deployment strategies that are less likely to impact users' performance. Cherrypick finds near-optimal cloud configurations for big data analytics. It allows users to search through the new machine types the clouds are constantly introducing and find ones with near-optimal performance that meet their budget. Cherrypick can adapt to new big-data frameworks and applications as well as to new machine types. As the pace of cloud innovation increases, it is critical to have tools that allow operators to deploy new changes, as well as tools that enable users to achieve good performance at low cost. The tools and algorithms discussed in this thesis help accomplish these goals.

    HIGH PERFORMANCE DECENTRALISED COMMUNITY DETECTION ALGORITHMS FOR BIG DATA FROM SMART COMMUNICATION APPLICATIONS

    Many systems in the world can be represented as models of complex networks and subsequently be analysed fruitfully. One fundamental property of real-world networks is that they usually exhibit inhomogeneity: the network tends to organise according to an underlying modular structure, commonly referred to as community structure or clustering. Analysing such communities in large networks can help people better understand the structural makeup of the networks. For example, it can be used in mobile ad-hoc and sensor networks to improve energy consumption and communication tasks. Thus, community detection in networks has become an important research area within many application fields such as computer science, the physical sciences, mathematics, and biology.

    Driven by the recent emergence of big data, clustering real-world networks using traditional methods and algorithms is almost impossible on a single machine. The existing methods are limited by their computational requirements, and most of them cannot be directly parallelised. Furthermore, in many cases the data set is very big and does not fit into the main memory of a single machine, and therefore needs to be distributed among several machines. The main topic of this thesis is network community detection within these big data networks. More specifically, in this thesis, a novel approach, namely the Decentralized Iterative Community Clustering Approach (DICCA), for clustering large and undirected networks is introduced. An important property of this approach is its ability to cluster the entire network without global knowledge of the network topology. Moreover, an extension of DICCA called the Parallel Decentralized Iterative Community Clustering Approach (PDICCA) is proposed for efficiently processing data distributed across several machines. PDICCA is based on the MapReduce computing platform to work efficiently in a distributed and parallel fashion.

    In addition, real-world networks are usually noisy and imperfect, with missing and false edges. These imperfections are often difficult to eliminate and highly affect the quality and accuracy of conventional methods used to find the community structure in the network. However, in real-world networks, node attribute information is also available in addition to topology information. Considering more than one source of information for community detection can produce meaningful clusters and improve robustness. Therefore, a pre-processing approach that considers the attribute information, shared neighbours, and connectivity aspects of the network for community detection is presented in this thesis as part of my research. Finally, a set of real-world mobile phone usage data obtained from Cambridge Laboratories (Device Analyzer) has been analysed as an exploratory step to assess the viability of applying the algorithms developed in this thesis. All the proposed approaches have been evaluated and verified for feasibility using large real-world data sets. The evaluation results of these experiments are very promising for the type of large data networks considered.
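    To illustrate the flavour of decentralized community detection — each node deciding its community from local information only — the sketch below implements plain asynchronous label propagation. This is a well-known baseline, not the DICCA or PDICCA algorithms themselves, and the adjacency-dict representation, round limit, and tie-breaking rule are illustrative assumptions.

```python
def label_propagation(adj, rounds=20):
    """Asynchronous label propagation: every node repeatedly adopts the
    most frequent label among its neighbors, using only local
    information. Returns a node -> community-label mapping."""
    labels = {v: v for v in adj}               # start with unique labels
    for _ in range(rounds):
        changed = False
        for v in sorted(adj):                  # fixed order, in-place updates
            counts = {}
            for n in adj[v]:
                counts[labels[n]] = counts.get(labels[n], 0) + 1
            if not counts:
                continue                       # isolated node keeps its label
            top = max(counts.values())
            if counts.get(labels[v], 0) < top:
                # Keep the current label on ties; otherwise adopt the
                # largest label among the most frequent ones.
                labels[v] = max(l for l, c in counts.items() if c == top)
                changed = True
        if not changed:
            break                              # converged
    return labels
```

    Because every step touches only a node's neighbourhood, the update rule maps naturally onto a MapReduce-style implementation in which each node emits its label to its neighbours in one phase and aggregates counts in the next.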

    Automated managed cloud-platforms based on energy policies

    Delivering environmentally friendly services has become an important issue in cloud computing due to the awareness raised by governments and environmental conservation organisations about the impact of electricity usage on carbon footprints. Cloud providers and cloud consumers (organisations/enterprises) have their own defined green policies to control energy consumption at their data centers. At the service management level, green policies can be mapped to energy management policies or management policies. Focusing on the cloud consumer's side, management policies are described by business managers and can change regularly. This continuous change is driven by the nature of the technical environment, changes in regulation, and business requirements. Therefore, there is a gap between describing and implementing management policies in the cloud environment. This thesis provides a method to bridge that gap by (a) defining a specification for formulating management policies into executable form for an infrastructure-as-a-service (IaaS) cloud model; (b) designing a framework to execute the described management policies automatically; and (c) proposing a modelling and analysis method to identify the potential energy management policy that would save energy cost. Each aspect covered in the thesis is evaluated with the help of an Energy Management Case Study for a private cloud scenario.