Parallel Algorithms for Scalable Graph Mining: Applications on Big Data and Machine Learning
Parallel computing plays a crucial role in processing large-scale graph data. Complex network analysis is an exciting area of research with applications in many scientific domains, e.g., sociology, biology, online media, recommendation systems and many more. Graph mining is an area of interest with diverse problems from different domains of our daily life. Due to the advancement of data and computing technologies, graph data is growing at an enormous rate; for example, the number of links in social networks is growing every millisecond. Machine and deep learning play a significant role in the technological advances that make it possible to work with big data in the modern era. We work on a well-known graph problem, community detection (CD). We design parallel algorithms for the Louvain method for static networks and show around 12-fold speedup. The implementations use both shared-memory and distributed-memory parallel algorithms. We also show how communities change in dynamic networks across different time phases by computing several graph metrics based on their temporal definition. We detect temporal communities in dynamic networks representing social/brain/communication/citation networks in a more concrete way. We present both shared-memory and distributed-memory parallel algorithms for CD in dynamic graphs using permanence, a vertex-based metric. The parallel CD algorithm implemented using Message Passing Interface (MPI) for temporal graphs is the first MPI-based algorithm to the best of our knowledge. Our algorithm achieves 30× speedup for the largest network with billions of edges. We present a scalable method for CD based on Graph Convolutional Network (GCN) via semi-supervised node classification using PyTorch with CUDA in a GPU environment (4× performance gain). Our model achieves up to 86.9% accuracy and 0.85 F1 Score on different real-world datasets from diverse domains.
We provide a scalable solution to the Sparse Deep Neural Network (DNN) Challenge by designing a data-parallel Sparse DNN using TensorFlow on GPU (4.7× speedup). We include applications of webspam detection from webgraphs (billions of edges), sentiment analysis on a social network, Twitter (1.2 million tweets), to reveal insights about COVID-19 vaccination awareness among the public, and time-series forecasting of the vaccinated population in the USA to portray the importance of graph mining in our daily activities.
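The Louvain method named in the abstract above is generally described as a greedy procedure that moves vertices between communities to increase modularity. As a minimal illustrative sketch, and not the authors' parallel implementation, the Python function below computes the modularity of a given partition of an undirected, unweighted graph. The function name `modularity`, the edge-list input format, and the example graph are assumptions made for illustration only.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every vertex to a community label.
    (Both input conventions are assumptions of this sketch.)
    """
    m = 0                      # total number of edges
    intra = defaultdict(int)   # edges with both endpoints inside community c
    degree = defaultdict(int)  # sum of vertex degrees inside community c
    for u, v in edges:
        m += 1
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    if m == 0:
        return 0.0
    # Q = sum over communities of (intra-edge fraction - expected fraction)
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical example: two triangles joined by a single bridge edge.
example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {v: "a" for v in range(6)}

if __name__ == "__main__":
    print(modularity(example_edges, two_groups))
    print(modularity(example_edges, one_group))
```

Splitting the example graph into its two triangles scores higher than placing every vertex in one community, which is the kind of improvement a modularity-based method searches for at each step.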
Intelligent Management and Efficient Operation of Big Data
This chapter details how Big Data can be used and implemented in networking
and computing infrastructures. Specifically, it addresses three main aspects:
the timely extraction of relevant knowledge from heterogeneous, and very often
unstructured large data sources, the enhancement of the performance of
processing and networking (cloud) infrastructures that are the most important
foundational pillars of Big Data applications or services, and novel ways to
efficiently manage network infrastructures with high-level composed policies
for supporting the transmission of large amounts of data with distinct
requisites (video vs. non-video). A case study involving an intelligent
management solution to route data traffic with diverse requirements in a wide
area Internet Exchange Point is presented, discussed in the context of Big
Data, and evaluated.
Comment: In book Handbook of Research on Trends and Future Directions in Big
Data and Web Intelligence, IGI Global, 201