22 research outputs found
Topic Sentiment Joint Model with Word Embeddings
Abstract. The topic sentiment joint model is an extended topic model that aims to detect sentiments and topics simultaneously from online reviews. Most existing topic sentiment joint modeling algorithms infer the resulting distributions from the co-occurrence of words, but when the training corpus is short and small, the resulting distributions may not be satisfactory. In this paper, we propose a novel topic sentiment joint model with word embeddings (TSWE), which introduces word embeddings trained on a large external corpus. Furthermore, we implement TSWE with a Gibbs sampling algorithm. Experimental results on Chinese and English data sets show that TSWE achieves significant performance in the task of detecting sentiments and topics simultaneously.
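The flavor of an embedding-augmented collapsed Gibbs step can be sketched as follows. This is a generic illustration of mixing a count-based word distribution with an embedding-derived one, not the paper's actual sampling equation; all names and hyperparameters (`K`, `S`, `lambda_mix`, `topic_vecs`, ...) are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch: one collapsed-Gibbs sampling step for a word in a
# topic-sentiment joint model, where the usual count-based word probability
# is mixed with a probability derived from external word embeddings.

rng = np.random.default_rng(0)
K, S, V, D = 3, 2, 5, 50          # topics, sentiments, vocab size, embedding dim

n_zl_w = rng.integers(0, 10, size=(K, S, V)).astype(float)  # word counts per (topic, sentiment)
n_d_zl = rng.integers(0, 10, size=(K, S)).astype(float)     # (topic, sentiment) counts in the doc
beta, alpha, lambda_mix = 0.01, 0.1, 0.5

word_vecs = rng.normal(size=(V, D))                          # stand-in for pretrained embeddings
topic_vecs = rng.normal(size=(K, D))                         # e.g. mean embedding of a topic's top words

def sample_topic_sentiment(w):
    """Sample a (topic, sentiment) assignment for word index w."""
    # Count-based term (standard collapsed Gibbs for LDA-style models).
    p_counts = (n_zl_w[:, :, w] + beta) / (n_zl_w.sum(axis=2) + V * beta)
    # Embedding-based term: softmax over topic-word similarity scores.
    scores = topic_vecs @ word_vecs[w]
    p_embed = np.exp(scores - scores.max())
    p_embed /= p_embed.sum()
    # Mix the two word-probability terms, weight by document-level counts.
    p_word = lambda_mix * p_counts + (1 - lambda_mix) * p_embed[:, None]
    p = (p_word * (n_d_zl + alpha)).ravel()
    idx = rng.choice(K * S, p=p / p.sum())
    return divmod(idx, S)          # (topic, sentiment)

z, l = sample_topic_sentiment(w=2)
```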
Exploring a QoS Driven Scheduling Approach for Peer-to-Peer Live Streaming Systems with Network Coding
Most large-scale peer-to-peer (P2P) live streaming systems use a mesh to organize peers and leverage pull scheduling to transmit packets, providing robustness in dynamic environments. However, pull scheduling incurs large packet delay. Network coding makes push scheduling feasible in mesh P2P live streaming and improves efficiency, but it may also introduce extra delays and coding computational overhead. To improve packet delay, streaming quality, and coding overhead, we propose a QoS-driven push scheduling approach in this paper. The main contributions of this paper are as follows: (i) we introduce a new network coding method to increase content diversity and reduce the complexity of scheduling; (ii) we formulate push scheduling as an optimization problem and transform it into a min-cost flow problem, which can be solved in polynomial time; (iii) we propose a push scheduling algorithm to reduce the coding overhead and conduct extensive experiments to validate the effectiveness of our approach. Compared with previous approaches, the simulation results demonstrate that the packet delay, continuity index, and coding ratio of our system are significantly improved, especially in dynamic environments.
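The min-cost flow transformation in contribution (ii) can be pictured with a small generic flow network: a sender must push each coded packet to one neighbour, each (packet, neighbour) edge carries a cost such as expected delay, and neighbour upload budgets cap the flow. The network layout, costs, and capacities below are illustrative assumptions, not the paper's formulation.

```python
import networkx as nx

# Hypothetical sketch: cast a push-scheduling decision as a min-cost flow
# problem and solve it in polynomial time with networkx.

packets = ["p1", "p2", "p3"]
neighbours = ["n1", "n2"]
cost = {("p1", "n1"): 1, ("p1", "n2"): 4,     # e.g. expected delivery delay
        ("p2", "n1"): 2, ("p2", "n2"): 1,
        ("p3", "n1"): 3, ("p3", "n2"): 1}

G = nx.DiGraph()
G.add_node("src", demand=-len(packets))        # all packets must be pushed
G.add_node("sink", demand=len(packets))
for p in packets:
    G.add_edge("src", p, capacity=1, weight=0)
for n in neighbours:
    G.add_edge(n, "sink", capacity=2, weight=0)  # per-neighbour upload budget
for (p, n), c in cost.items():
    G.add_edge(p, n, capacity=1, weight=c)

flow = nx.min_cost_flow(G)                     # polynomial-time solvable
schedule = {p: n for p in packets for n in neighbours if flow[p].get(n, 0) == 1}
```

With integral capacities the optimal flow is integral, so each packet maps to exactly one neighbour.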
DAGC: Data-Volume-Aware Adaptive Sparsification Gradient Compression for Distributed Machine Learning in Mobile Computing
Distributed machine learning (DML) in mobile environments faces significant communication bottlenecks. Gradient compression has emerged as an effective solution to this issue, offering substantial benefits in environments with limited bandwidth and metered data. Yet, existing methods suffer a severe performance drop in non-IID environments due to a one-size-fits-all compression approach, which does not account for the varying data volumes across workers. Assigning different compression ratios to workers with distinct data distributions and volumes is thus a promising solution. This study introduces an analysis of distributed SGD with non-uniform compression, which reveals that the convergence rate (indicative of the iterations needed to achieve a given accuracy) is influenced by the compression ratios applied to workers with differing volumes. Accordingly, we frame relative compression ratio assignment as an n-variable chi-square nonlinear optimization problem, constrained by a fixed and limited communication budget. We propose DAGC-R, which assigns more conservative compression to workers handling larger data volumes. Recognizing the computational limitations of mobile devices, we further propose DAGC-A, which is computationally less demanding and enhances the robustness of absolute gradient compression in non-IID scenarios. Our experiments confirm that both DAGC-A and DAGC-R achieve better performance when dealing with highly imbalanced data volume distributions and restricted communication.
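The core idea of data-volume-aware sparsification can be sketched as follows: workers with more data keep a larger fraction of their gradient entries while the average ratio respects a global budget. The proportional assignment rule here is a simple illustrative heuristic, not the chi-square optimization solved in the paper, and all names are assumptions.

```python
import numpy as np

# Illustrative sketch of data-volume-aware top-k gradient sparsification.

def topk_compress(grad, ratio):
    """Keep the largest-magnitude `ratio` fraction of entries, zero the rest."""
    k = max(1, int(len(grad) * ratio))
    keep = np.argpartition(np.abs(grad), -k)[-k:]
    sparse = np.zeros_like(grad)
    sparse[keep] = grad[keep]
    return sparse

def assign_ratios(data_volumes, budget):
    """Split a total compression budget so larger-volume workers keep more."""
    vols = np.asarray(data_volumes, dtype=float)
    return budget * len(vols) * vols / vols.sum()   # mean ratio equals `budget`

rng = np.random.default_rng(0)
grads = [rng.normal(size=100) for _ in range(3)]
ratios = assign_ratios([1000, 3000, 6000], budget=0.1)   # -> [0.03, 0.09, 0.18]
compressed = [topk_compress(g, r) for g, r in zip(grads, ratios)]
```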
Intelligent Segment Routing: Toward Load Balancing with Limited Control Overheads
Segment routing has emerged as a novel architecture for traffic engineering in recent years. However, segment routing brings control overheads, i.e., additional packet headers must be inserted. These overheads can greatly reduce forwarding efficiency in a large network when segment headers become too long. To balance these two objectives, we propose an intelligent routing scheme for traffic engineering (IRTE), which achieves load balancing with limited control overheads. To achieve optimal performance, we first formulate the problem as one of mapping different flows to key diversion points. Second, we prove the problem is NP-hard by a reduction from the k-dense subgraph problem. To solve this problem, we develop an improved ant colony optimization (IACO) algorithm; ant colony optimization is widely used in network optimization problems. We also design a load balancing algorithm with diversion routing (LBA-DR) and analyze its theoretical performance. Finally, we evaluate the IRTE on different real-world topologies, and the results show that the IRTE outperforms traditional algorithms; e.g., the maximum bandwidth is 24.6% lower than that of traditional algorithms when evaluated on the BellCanada topology.
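The ant-colony idea behind the flow-to-diversion-point mapping can be illustrated with a toy example: each "ant" proposes a mapping, and pheromone accumulates on mappings with low load imbalance. The problem size, cost function, and all parameters below are illustrative assumptions, not the IACO algorithm itself.

```python
import random

# Toy ant-colony sketch: map flows to diversion points to balance load.

random.seed(0)
flows = [5, 3, 8, 2, 7]                 # flow demands
points = 2                              # candidate diversion points
tau = [[1.0] * points for _ in flows]   # pheromone trails, one row per flow

def imbalance(assign):
    """Load gap between the most- and least-loaded diversion points."""
    loads = [0.0] * points
    for f, p in zip(flows, assign):
        loads[p] += f
    return max(loads) - min(loads)      # lower is more balanced

best, best_cost = None, float("inf")
for _ in range(200):                    # iterations
    for _ant in range(5):               # ants per iteration
        assign = [random.choices(range(points), weights=tau[i])[0]
                  for i in range(len(flows))]
        cost = imbalance(assign)
        if cost < best_cost:
            best, best_cost = assign, cost
        # deposit pheromone inversely proportional to cost
        for i, p in enumerate(assign):
            tau[i][p] += 1.0 / (1.0 + cost)
    # evaporation keeps the search from freezing on early solutions
    tau = [[0.9 * t for t in row] for row in tau]
```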
A Fast Overlapping Community Detection Algorithm with Self-Correcting Ability
To address the defects of existing modularity measures, this paper defines a weighted modularity based on density and cohesion as a new evaluation measure. Since the proportion of overlapping nodes in a network is very low, the number of repeat visits to nodes can be reduced by marking the vertices that have overlapping attributes. In this paper, we propose three test conditions for overlapping nodes and present a fast overlapping community detection algorithm with self-correcting ability, which is decomposed into two processes. Under the control of the overlapping properties, the complexity of the algorithm tends to be approximately linear. We also give a new interpretation of the membership vector. Moreover, we improve the bridgeness function, which evaluates the extent to which nodes overlap. Finally, we conduct experiments on three networks with well-known community structures, and the results verify the feasibility and effectiveness of our algorithm.
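A density-and-cohesion style community score can be sketched on a small graph as follows. This is a generic illustration of combining internal edge density with cohesion (internal versus boundary edges); the paper's exact weighted-modularity definition may differ.

```python
# Illustrative density-and-cohesion community score on a toy graph.

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
communities = [{0, 1, 2}, {3, 4, 5}]

def community_score(c):
    n = len(c)
    internal = sum(1 for u, v in edges if u in c and v in c)
    external = sum(1 for u, v in edges if (u in c) != (v in c))
    density = internal / (n * (n - 1) / 2)          # fraction of possible internal edges
    cohesion = internal / (internal + external)     # internal vs boundary edges
    return density * cohesion

score = sum(community_score(c) for c in communities) / len(communities)
```

Here each triangle community has density 1.0 and cohesion 0.75 (three internal edges against one bridge), so the averaged score is 0.75.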
Learning-based network path planning for traffic engineering
Recent advances in traffic engineering offer a series of techniques to address the network problems caused by the explosive growth of Internet traffic. In traffic engineering, dynamic path planning is essential for prevalent applications, e.g., load balancing, traffic monitoring, and firewalls. Application-specific methods can indeed improve network performance but can hardly be extended to general scenarios. Meanwhile, the massive data generated in the current Internet has not been fully exploited, although it may convey much valuable knowledge and information to facilitate traffic engineering. In this paper, we propose a learning-based network path planning method under forwarding constraints for finer-grained and effective traffic engineering. We formulate path planning as the problem of inferring a sequence of nodes in a network path and adapt a sequence-to-sequence model to learn implicit forwarding paths from empirical network traffic data. To boost model performance, an attention mechanism and beam search are employed to capture the essential sequential features of the nodes in a path and to guarantee path connectivity. To validate the effectiveness of the derived model, we implement it in the Mininet emulator environment and leverage traffic data generated on both a real-world GEANT network topology and a grid network topology to train and evaluate the model. Experimental results exhibit high testing accuracy and demonstrate the superiority of our proposal.
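The connectivity guarantee from beam search can be sketched as follows: candidate next nodes are restricted to graph neighbours of the current node, so every decoded sequence is a valid path. The graph and the uniform step scores below are illustrative stand-ins for the sequence-to-sequence model's learned per-step probabilities.

```python
import math

# Sketch of connectivity-constrained beam search over a toy topology.

adj = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}

def step_logprob(cur, nxt):
    """Stand-in for the model's log p(next | current); uniform here."""
    return math.log(1.0 / len(adj[cur]))

def beam_search(src, dst, width=2, max_len=5):
    beams = [([src], 0.0)]              # (path, cumulative log-prob)
    done = []
    for _ in range(max_len):
        nxt_beams = []
        for path, lp in beams:
            for n in adj[path[-1]]:     # connectivity constraint
                cand = (path + [n], lp + step_logprob(path[-1], n))
                (done if n == dst else nxt_beams).append(cand)
        beams = sorted(nxt_beams, key=lambda b: -b[1])[:width]
        if not beams:
            break
    return max(done, key=lambda b: b[1])[0] if done else None

path = beam_search("A", "E")
```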
Calculation Model for Progressive Residual Surface Subsidence above Mined-Out Areas Based on Logistic Time Function
The exploitation of underground coal resources has stepped up local economic and social development significantly. However, it is inevitable that time-dependent surface settlement occurs above the mined-out voids. Subsidence associated with local geo-mining can last from several months to scores of years and can seriously impact infrastructure, city planning, and underground space utilization. This paper addresses the problems in predicting progressive residual surface subsidence. The subsidence process was divided into three phases: a duration period, a residual subsidence period, and a long-term subsidence period. Then, a novel mathematical model calculating progressive residual surface subsidence was proposed based on the logistic time function. After the duration period, the residual subsidence period was extrapolated according to a threshold on the surface sinking rate. The proposed model was validated against observed in situ data. The results demonstrate that the logistic time function is an ideal time function for reflecting surface subsidence features in terms of downward movement, subsidence rate, and sinking acceleration. The surface residual subsidence coefficient, which plays a crucial role in calculating surface settling, varies directly with the model parameters and inversely with time. The influence of the amount of in situ data on predicted values is pronounced: observation time for surface subsidence must extend beyond the active period for parameters back-calculated from in situ measurement data to be reliable; otherwise, the deviation between predicted values and field-based ones is significant. The conclusions of this study can guide the design of surface subsidence measurement projects for longwall coal operations. The study affords insights valuable to land reutilization, city planning, and stability estimation of foundations above an abandoned workface.
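A logistic time function for subsidence, and the idea of delimiting the residual period by a sinking-rate threshold, can be sketched numerically as follows. The functional form W(t) = Wm / (1 + a·e^(-b·t)) is a standard logistic curve; the parameter values and the threshold are illustrative, not calibrated values from the paper.

```python
import math

# Sketch: logistic time function for surface subsidence and the end of the
# residual period, found where the sinking rate drops below a threshold.

Wm, a, b = 2000.0, 50.0, 0.01     # final subsidence (mm), shape parameters (1/day)

def subsidence(t):
    """Cumulative subsidence (mm) at time t (days)."""
    return Wm / (1.0 + a * math.exp(-b * t))

def sinking_rate(t, dt=1.0):
    """Approximate sinking rate (mm/day) by a forward difference."""
    return (subsidence(t + dt) - subsidence(t)) / dt

def residual_period_end(threshold, t0=0.0):
    """First day at which the sinking rate falls below `threshold` mm/day."""
    t = t0
    while sinking_rate(t) >= threshold:
        t += 1.0
    return t

t_end = residual_period_end(threshold=0.1)
```

With these parameters the sinking rate peaks near the curve's inflection point and then decays, so the threshold crossing marks where residual subsidence effectively ends.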