    Parallel Algorithms for Scalable Graph Mining: Applications on Big Data and Machine Learning

    Parallel computing plays a crucial role in processing large-scale graph data. Complex network analysis is an active area of research with applications in many scientific domains, e.g., sociology, biology, online media, and recommendation systems. Graph mining covers diverse problems drawn from many domains of daily life. With advances in data and computing technologies, graph data is growing at an enormous rate; for example, the number of links in social networks grows every millisecond. Machine and deep learning play a significant role in the technological advances that make it possible to work with big data in the modern era. We work on a well-known graph problem, community detection (CD). We design parallel algorithms for the Louvain method on static networks and show a speedup of around 12-fold. The implementations use both shared-memory and distributed-memory parallel algorithms. We also show how communities change across time phases in dynamic networks by computing several graph metrics based on their temporal definitions. We detect temporal communities in dynamic networks representing social, brain, communication, and citation networks in a more concrete way. We present both shared-memory and distributed-memory parallel algorithms for CD in dynamic graphs using permanence, a vertex-based metric. To the best of our knowledge, the parallel CD algorithm for temporal graphs implemented with the Message Passing Interface (MPI) is the first MPI-based algorithm of its kind. Our algorithm achieves a 30× speedup on the largest network, which has billions of edges. We present a scalable method for CD based on a Graph Convolutional Network (GCN) via semi-supervised node classification, using PyTorch with CUDA in a GPU environment (4× performance gain). Our model achieves up to 86.9% accuracy and a 0.85 F1 score on real-world datasets from diverse domains.
    We provide a scalable solution to the Sparse Deep Neural Network (DNN) Challenge by designing a data-parallel Sparse DNN using TensorFlow on GPU (4.7× speedup). We include applications to webspam detection in webgraphs (billions of edges), sentiment analysis on a social network, Twitter (1.2 million tweets), to reveal insights about COVID-19 vaccination awareness among the public, and time-series forecasting of the vaccinated population in the USA, to portray the importance of graph mining in our daily activities.
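
    For readers unfamiliar with the Louvain method mentioned in the abstract above: it greedily moves vertices between communities to increase modularity. The following is a minimal serial sketch of that objective only, not the authors' parallel implementation. The function name `modularity`, the edge-list input format, and the toy graph are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every vertex to a community label.
    (Both input formats are illustrative assumptions.)
    """
    m = 0                        # total number of edges
    degree = defaultdict(int)    # degree of each vertex
    internal = defaultdict(int)  # edges with both endpoints in one community
    for u, v in edges:
        m += 1
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    if m == 0:
        return 0.0

    # Sum of vertex degrees inside each community.
    total_degree = defaultdict(int)
    for node, d in degree.items():
        total_degree[community[node]] += d

    # Q = sum over communities c of (internal_c / m) - (total_degree_c / 2m)^2
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {n: "a" for n in range(6)}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

    On this toy graph, placing each triangle in its own community gives Q = 5/14, while putting every vertex in one community gives Q = 0, so a modularity-maximizing method would prefer the two-community split.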

    Intelligent Management and Efficient Operation of Big Data

    This chapter details how Big Data can be used and implemented in networking and computing infrastructures. Specifically, it addresses three main aspects: the timely extraction of relevant knowledge from heterogeneous and very often unstructured large data sources; the enhancement of the performance of the processing and networking (cloud) infrastructures that are the most important foundational pillars of Big Data applications or services; and novel ways to efficiently manage network infrastructures with high-level composed policies that support the transmission of large amounts of data with distinct requirements (video vs. non-video). A case study involving an intelligent management solution to route data traffic with diverse requirements in a wide-area Internet Exchange Point is presented, discussed in the context of Big Data, and evaluated. Comment: In book Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence, IGI Global, 201