207 research outputs found

    Progresses and Challenges in Link Prediction

    Full text link
    Link prediction is a paradigmatic problem in network science, which aims at estimating the existence likelihoods of nonobserved links, based on known topology. After a brief introduction of the standard problem and metrics of link prediction, this Perspective will summarize representative progresses about local similarity indices, link predictability, network embedding, matrix completion, ensemble learning and others, mainly extracted from thousands of related publications in the last decade. Finally, this Perspective will outline some long-standing challenges for future studies.Comment: 45 pages, 1 tabl

    State Estimation Fusion for Linear Microgrids over an Unreliable Network

    Get PDF
    Microgrids should be continuously monitored in order to maintain suitable voltages over time. Microgrids are mainly monitored remotely, and their measurement data transmitted through lossy communication networks are vulnerable to cyberattacks and packet loss. The current study leverages the idea of data fusion to address this problem. Hence, this paper investigates the effects of estimation fusion using various machine-learning (ML) regression methods as data fusion methods by aggregating the distributed Kalman filter (KF)-based state estimates of a linear smart microgrid in order to achieve more accurate and reliable state estimates. This unreliability in measurements is because they are received through a lossy communication network that incorporates packet loss and cyberattacks. In addition to ML regression methods, multi-layer perceptron (MLP) and dependent ordered weighted averaging (DOWA) operators are also employed for further comparisons. The results of simulation on the IEEE 4-bus model validate the effectiveness of the employed ML regression methods through the RMSE, MAE and R-squared indices under the condition of missing and manipulated measurements. In general, the results obtained by the Random Forest regression method were more accurate than those of other methods.This research was partially funded by public research projects of Spanish Ministry of Science and Innovation, references PID2020-118249RB-C22 and PDC2021-121567-C22 - AEI/10.13039/ 501100011033, and by the Madrid Government (Comunidad de Madrid-Spain) under the Multiannual Agreement with UC3M in the line of Excellence of University Professors, reference EPUC3M17

    Dealing with imbalanced and weakly labelled data in machine learning using fuzzy and rough set methods

    Get PDF

    Temporospatial Context-Aware Vehicular Crash Risk Prediction

    Get PDF
    With the demand for more vehicles increasing, road safety is becoming a growing concern. Traffic collisions take many lives and cost billions of dollars in losses. This explains the growing interest of governments, academic institutions and companies in road safety. The vastness and availability of road accident data has provided new opportunities for gaining a better understanding of accident risk factors and for developing more effective accident prediction and prevention regimes. Much of the empirical research on road safety and accident analysis utilizes statistical models which capture limited aspects of crashes. On the other hand, data mining has recently gained interest as a reliable approach for investigating road-accident data and for providing predictive insights. While some risk factors contribute more frequently in the occurrence of a road accident, the importance of driver behavior, temporospatial factors, and real-time traffic dynamics have been underestimated. This study proposes a framework for predicting crash risk based on historical accident data. The proposed framework incorporates machine learning and data analytics techniques to identify driving patterns and other risk factors associated with potential vehicle crashes. These techniques include clustering, association rule mining, information fusion, and Bayesian networks. Swarm intelligence based association rule mining is employed to uncover the underlying relationships and dependencies in collision databases. Data segmentation methods are employed to eliminate the effect of dependent variables. Extracted rules can be used along with real-time mobility to predict crashes and their severity in real-time. The national collision database of Canada (NCDB) is used in this research to generate association rules with crash risk oriented subsequents, and to compare the performance of the swarm intelligence based approach with that of other association rule miners. Many industry-demanding datasets, including road-accident datasets, are deficient in descriptive factors. This is a significant barrier for uncovering meaningful risk factor relationships. To resolve this issue, this study proposes a knwoledgebase approximation framework to enhance the crash risk analysis by integrating pieces of evidence discovered from disparate datasets capturing different aspects of mobility. Dempster-Shafer theory is utilized as a key element of this knowledgebase approximation. This method can integrate association rules with acceptable accuracy under certain circumstances that are discussed in this thesis. The proposed framework is tested on the lymphography dataset and the road-accident database of the Great Britain. The derived insights are then used as the basis for constructing a Bayesian network that can estimate crash likelihood and risk levels so as to warn drivers and prevent accidents in real-time. This Bayesian network approach offers a way to implement a naturalistic driving analysis process for predicting traffic collision risk based on the findings from the data-driven model. A traffic incident detection and localization method is also proposed as a component of the risk analysis model. Detecting and localizing traffic incidents enables timely response to accidents and facilitates effective and efficient traffic flow management. The results obtained from the experimental work conducted on this component is indicative of the capability of our Dempster-Shafer data-fusion-based incident detection method in overcoming the challenges arising from erroneous and noisy sensor readings

    Link prediction in evolving networks based on popularity of nodes

    Get PDF
    Link prediction aims to uncover the underlying relationship behind networks, which could be utilized to predict missing edges or identify the spurious edges. The key issue of link prediction is to estimate the likelihood of potential links in networks. Most classical static-structure based methods ignore the temporal aspects of networks, limited by the time-varying features, such approaches perform poorly in evolving networks. In this paper, we propose a hypothesis that the ability of each node to attract links depends not only on its structural importance, but also on its current popularity (activeness), since active nodes have much more probability to attract future links. Then a novel approach named popularity based structural perturbation method (PBSPM) and its fast algorithm are proposed to characterize the likelihood of an edge from both existing connectivity structure and current popularity of its two endpoints. Experiments on six evolving networks show that the proposed methods outperform state-of-the-art methods in accuracy and robustness. Besides, visual results and statistical analysis reveal that the proposed methods are inclined to predict future edges between active nodes, rather than edges between inactive nodes

    A fast algorithm for predicting links to nodes of interest

    Get PDF
    The problem of link prediction has recently attracted considerable attention in various domains, such as sociology, anthropology, information science, and computer science. In many real world applications, we must predict similarity scores only between pairs of vertices in which users are interested, rather than predicting the scores of all pairs of vertices in the network. In this paper, we propose a fast similarity-based method to predict links related to nodes of interest. In the method, we first construct a sub-graph centered at the node of interest. By choosing the proper size for such a sub-graph, we can restrict the error of the estimated similarities within a given threshold. Because the similarity score is computed within a small sub-graph, the algorithm can greatly reduce computation time. The method is also extended to predict potential links in the whole network to achieve high process speed and accuracy. Experimental results on real networks demonstrate that our algorithm can obtain high accuracy results in less time than other methods can
    corecore