4,245 research outputs found

    Distributed Semi-supervised Fuzzy Regression with Interpolation Consistency Regularization

    Full text link
    Recently, distributed semi-supervised learning (DSSL) algorithms have shown their effectiveness in leveraging unlabeled samples over interconnected networks, where agents cannot share their original data with each other and can only communicate non-sensitive information with their neighbors. However, existing DSSL algorithms cannot cope with data uncertainties and may suffer from high computation and communication overhead problems. To handle these issues, we propose a distributed semi-supervised fuzzy regression (DSFR) model with fuzzy if-then rules and interpolation consistency regularization (ICR). The ICR, which was proposed recently for semi-supervised problem, can force decision boundaries to pass through sparse data areas, thus increasing model robustness. However, its application in distributed scenarios has not been considered yet. In this work, we proposed a distributed Fuzzy C-means (DFCM) method and a distributed interpolation consistency regularization (DICR) built on the well-known alternating direction method of multipliers to respectively locate parameters in antecedent and consequent components of DSFR. Notably, the DSFR model converges very fast since it does not involve back-propagation procedure and is scalable to large-scale datasets benefiting from the utilization of DFCM and DICR. Experiments results on both artificial and real-world datasets show that the proposed DSFR model can achieve much better performance than the state-of-the-art DSSL algorithm in terms of both loss value and computational cost

    Learning structure and schemas from heterogeneous domains in networked systems: a survey

    Get PDF
    The rapidly growing amount of available digital documents of various formats and the possibility to access these through internet-based technologies in distributed environments, have led to the necessity to develop solid methods to properly organize and structure documents in large digital libraries and repositories. Specifically, the extremely large size of document collections make it impossible to manually organize such documents. Additionally, most of the document sexist in an unstructured form and do not follow any schemas. Therefore, research efforts in this direction are being dedicated to automatically infer structure and schemas. This is essential in order to better organize huge collections as well as to effectively and efficiently retrieve documents in heterogeneous domains in networked system. This paper presents a survey of the state-of-the-art methods for inferring structure from documents and schemas in networked environments. The survey is organized around the most important application domains, namely, bio-informatics, sensor networks, social networks, P2Psystems, automation and control, transportation and privacy preserving for which we analyze the recent developments on dealing with unstructured data in such domains.Peer ReviewedPostprint (published version

    Towards Data-centric Graph Machine Learning: Review and Outlook

    Full text link
    Data-centric AI, with its primary focus on the collection, management, and utilization of data to drive AI models and applications, has attracted increasing attention in recent years. In this article, we conduct an in-depth and comprehensive review, offering a forward-looking outlook on the current efforts in data-centric AI pertaining to graph data-the fundamental data structure for representing and capturing intricate dependencies among massive and diverse real-life entities. We introduce a systematic framework, Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of the graph data lifecycle, including graph data collection, exploration, improvement, exploitation, and maintenance. A thorough taxonomy of each stage is presented to answer three critical graph-centric questions: (1) how to enhance graph data availability and quality; (2) how to learn from graph data with limited-availability and low-quality; (3) how to build graph MLOps systems from the graph data-centric view. Lastly, we pinpoint the future prospects of the DC-GML domain, providing insights to navigate its advancements and applications.Comment: 42 pages, 9 figure

    Transforming Graph Representations for Statistical Relational Learning

    Full text link
    Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed
    • …
    corecore