Energy-Efficient GPU Clusters Scheduling for Deep Learning
Training deep neural networks (DNNs) is a major workload in datacenters
today, and it is driving rapid growth in energy consumption. It is therefore
important to reduce energy consumption while still completing deep learning (DL)
training jobs early. In this paper, we propose PowerFlow, a GPU cluster
scheduler that reduces the average Job Completion Time (JCT) under an energy
budget. We first present performance models for DL training jobs that predict
their throughput and energy consumption under different configurations.
Based on the performance models, PowerFlow dynamically allocates GPUs and
adjusts the GPU-level or job-level configurations of DL training jobs.
PowerFlow applies network packing and buddy allocation to job placement, thus
avoiding the extra energy consumed by cluster fragmentation. Evaluation results
show that, under the same energy consumption, PowerFlow improves the average JCT
by up to 1.57-3.39x compared to competitive baselines.
Learning Interpretable Features of Graphs and Time Series Data
Graphs and time series are two of the most ubiquitous representations of data in modern times. Representation learning of real-world graphs and time-series data is a key component of downstream supervised and unsupervised machine learning tasks such as classification, clustering, and visualization. Because of the inherent high dimensionality of these data, representation learning, i.e., low-dimensional vector-based embedding of graphs and time-series data, is very challenging. Learning interpretable features makes the roles of the features transparent and facilitates downstream analytics tasks, in addition to maximizing the performance of downstream machine learning models. In this thesis, we leverage tensor (multidimensional array) decomposition to generate interpretable, low-dimensional feature spaces for graphs and time-series data drawn from three domains: social networks, neuroscience, and heliophysics. We present theoretical models and empirical results on node embedding of social networks, biomarker embedding on fMRI-based brain networks, and prediction and visualization of flaring and non-flaring solar events based on multivariate time series.