Distributed Semi-supervised Fuzzy Regression with Interpolation Consistency Regularization
Recently, distributed semi-supervised learning (DSSL) algorithms have shown
their effectiveness in leveraging unlabeled samples over interconnected
networks, where agents cannot share their original data with each other and can
only communicate non-sensitive information with their neighbors. However,
existing DSSL algorithms cannot cope with data uncertainties and may suffer
from high computation and communication overhead problems. To handle these
issues, we propose a distributed semi-supervised fuzzy regression (DSFR) model
with fuzzy if-then rules and interpolation consistency regularization (ICR).
The ICR, which was proposed recently for semi-supervised problems, can force
decision boundaries to pass through sparse data areas, thus increasing model
robustness. However, its application in distributed scenarios has not been
considered yet. In this work, we propose a distributed Fuzzy C-means (DFCM)
method and a distributed interpolation consistency regularization (DICR) built
on the well-known alternating direction method of multipliers to respectively
locate parameters in antecedent and consequent components of DSFR. Notably, the
DSFR model converges very fast since it does not involve a back-propagation
procedure, and it is scalable to large-scale datasets thanks to the
use of DFCM and DICR. Experimental results on both artificial and
real-world datasets show that the proposed DSFR model can achieve much better
performance than the state-of-the-art DSSL algorithm in terms of both loss
value and computational cost.
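The abstract does not spell out how the ICR term is computed, so the following is only a minimal single-agent sketch of the general interpolation consistency idea: the prediction at a convex mix of two unlabeled samples is pulled toward the same mix of their individual predictions. It is not the paper's distributed DICR/ADMM procedure; the function name `icr_penalty`, the `predict` callable, and the Beta parameter `alpha` are illustrative assumptions.

```python
import numpy as np

def icr_penalty(predict, x_unlabeled, alpha=1.0, rng=None):
    """Hypothetical interpolation-consistency penalty on unlabeled data.

    predict     : callable mapping an (n, d) array to an (n,) array of outputs
    x_unlabeled : (n, d) array of unlabeled samples
    alpha       : Beta(alpha, alpha) parameter for the mixing weight (assumed)
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x_unlabeled, dtype=float)

    # Pair every sample with a randomly chosen partner.
    perm = rng.permutation(len(x))
    # One mixing coefficient in (0, 1) shared by the whole batch.
    lam = rng.beta(alpha, alpha)

    # Interpolate the inputs and, separately, the model outputs.
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * predict(x) + (1.0 - lam) * predict(x[perm])

    # Mean squared gap between f(mix of inputs) and mix of f(inputs).
    return float(np.mean((predict(x_mix) - y_mix) ** 2))
```

Under these assumptions, an affine predictor incurs essentially zero penalty, while a predictor that curves between samples is penalized; in a semi-supervised objective this term would be added, with some weight, to the supervised loss on the labeled samples.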
Learning structure and schemas from heterogeneous domains in networked systems: a survey
The rapidly growing number of available digital documents in various formats, and the possibility of accessing them through internet-based technologies in distributed environments, have made it necessary to develop solid methods to properly organize and structure documents in large digital libraries and repositories. Specifically, the extremely large size of document collections makes it impossible to organize such documents manually. Additionally, most of the documents exist in an unstructured form and do not follow any schemas. Therefore, research efforts in this direction are being dedicated to automatically inferring structure and schemas. This is essential in order to better organize huge collections as well as to effectively and efficiently retrieve documents in heterogeneous domains in networked systems. This paper presents a survey of the state-of-the-art methods for inferring structure from documents and schemas in networked environments. The survey is organized around the most important application domains, namely bio-informatics, sensor networks, social networks, P2P systems, automation and control, transportation, and privacy preserving, for which we analyze the recent developments in dealing with unstructured data. Peer Reviewed. Postprint (published version)
Towards Data-centric Graph Machine Learning: Review and Outlook
Data-centric AI, with its primary focus on the collection, management, and
utilization of data to drive AI models and applications, has attracted
increasing attention in recent years. In this article, we conduct an in-depth
and comprehensive review, offering a forward-looking outlook on the current
efforts in data-centric AI pertaining to graph data, the fundamental data
structure for representing and capturing intricate dependencies among massive
and diverse real-life entities. We introduce a systematic framework,
Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of
the graph data lifecycle, including graph data collection, exploration,
improvement, exploitation, and maintenance. A thorough taxonomy of each stage
is presented to answer three critical graph-centric questions: (1) how to
enhance graph data availability and quality; (2) how to learn from graph data
with limited availability and low quality; (3) how to build graph MLOps systems
from the graph data-centric view. Lastly, we pinpoint the future prospects of
the DC-GML domain, providing insights to navigate its advancements and
applications. Comment: 42 pages, 9 figures
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.