Diffusion Adaptation Strategies for Distributed Estimation over Gaussian Markov Random Fields
The aim of this paper is to propose diffusion strategies for distributed
estimation over adaptive networks, assuming the presence of spatially
correlated measurements distributed according to a Gaussian Markov random field
(GMRF) model. The proposed methods incorporate prior information about the
statistical dependency among observations, while at the same time processing
data in real-time and in a fully decentralized manner. A detailed mean-square
analysis is carried out in order to prove stability and evaluate the
steady-state performance of the proposed strategies. Finally, we also
illustrate how the proposed techniques can be easily extended in order to
incorporate thresholding operators for sparsity recovery applications.
Numerical results show the potential advantages of using such techniques for
distributed learning in adaptive networks deployed over GMRF.
Comment: Submitted to IEEE Transactions on Signal Processing. arXiv admin
note: text overlap with arXiv:1206.309
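Illustrative sketch (not taken from the paper): the abstract above builds on diffusion strategies for adaptive networks. The code below shows only the standard adapt-then-combine (ATC) diffusion LMS recursion, which is an assumed baseline and not the GMRF-aware method the paper proposes. The function names, the ring topology, the uniform combination weights, the step size, and the noise level are all hypothetical choices made for the example.

```python
import numpy as np


def ring_weights(num_nodes):
    """Uniform combination weights on a ring (requires num_nodes >= 3).

    A[l, k] is the weight node k assigns to neighbor l; each column sums to 1.
    """
    A = np.zeros((num_nodes, num_nodes))
    for k in range(num_nodes):
        for l in (k - 1, k, k + 1):
            A[l % num_nodes, k] = 1.0 / 3.0
    return A


def atc_diffusion_lms(U, d, A, mu):
    """Standard ATC diffusion LMS (assumed baseline, not the paper's variant).

    U: (num_steps, num_nodes, dim) regressors, d: (num_steps, num_nodes) data,
    A: (num_nodes, num_nodes) combination matrix, mu: step size.
    Returns the final (num_nodes, dim) array of local estimates.
    """
    num_steps, num_nodes, dim = U.shape
    w = np.zeros((num_nodes, dim))
    for t in range(num_steps):
        # Adapt: each node takes one LMS step using only its own data.
        err = d[t] - np.sum(U[t] * w, axis=1)
        psi = w + mu * err[:, None] * U[t]
        # Combine: each node averages its neighbors' intermediate estimates.
        w = A.T @ psi
    return w


def demo(seed=0):
    """Run on synthetic data; return the worst per-node estimation error."""
    rng = np.random.default_rng(seed)
    num_steps, num_nodes, dim = 2000, 10, 4
    w_true = rng.standard_normal(dim)
    U = rng.standard_normal((num_steps, num_nodes, dim))
    d = U @ w_true + 0.1 * rng.standard_normal((num_steps, num_nodes))
    w = atc_diffusion_lms(U, d, ring_weights(num_nodes), mu=0.05)
    return float(np.max(np.linalg.norm(w - w_true, axis=1)))
```

Calling demo() returns the largest distance between any node's final estimate and the true parameter vector. With the settings assumed here it should be small, since every node both adapts to its own data and averages with its neighbors at each step.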
Energy Aware Deep Reinforcement Learning Scheduling for Sensors Correlated in Time and Space
Millions of battery-powered sensors deployed for monitoring purposes in a
multitude of scenarios, e.g., agriculture, smart cities, industry, etc.,
require energy-efficient solutions to prolong their lifetime. When these
sensors observe a phenomenon distributed in space and evolving in time, it is
expected that collected observations will be correlated in time and space. In
this paper, we propose a Deep Reinforcement Learning (DRL) based scheduling
mechanism capable of taking advantage of correlated information. We design our
solution using the Deep Deterministic Policy Gradient (DDPG) algorithm. The
proposed mechanism is capable of determining the frequency with which sensors
should transmit their updates, to ensure accurate collection of observations,
while simultaneously considering the energy available. To evaluate our
scheduling mechanism, we use multiple datasets containing environmental
observations obtained in multiple real deployments. The real observations
enable us to model the environment with which the mechanism interacts as
realistically as possible. We show that our solution can significantly extend
the sensors' lifetime. We compare our mechanism to an idealized, all-knowing
scheduler to demonstrate that its performance is near-optimal. Additionally, we
highlight the unique feature of our design, energy-awareness, by displaying the
impact of sensors' energy levels on the frequency of updates.
Data Mining Techniques to Understand Textual Data
More than ever, online information delivery and storage rely heavily on text. Billions of texts are produced every day in the form of documents, news, logs, search queries, ad keywords, tags, tweets, messenger conversations, social network posts, etc. Text understanding is a fundamental and essential task that spans broad research topics and contributes to many applications, such as text summarization, search engines, recommendation systems, online advertising, and conversational bots. However, understanding text is never a trivial task for computers, especially for noisy and ambiguous text such as logs and search queries. This dissertation focuses on text understanding tasks derived from two domains, disaster management and IT service management, both of which mainly use textual data as an information carrier.
Improving situation awareness in disaster management and alleviating the human effort involved in IT service management demand more intelligent and efficient solutions for understanding the textual data that acts as the main information carrier in the two domains. From the perspective of data mining, four directions are identified: (1) intelligently generating a storyline summarizing the evolution of a hurricane from a relevant online corpus; (2) automatically recommending resolutions according to the textual symptom description in a ticket; (3) gradually adapting the resolution recommendation system for time-correlated features derived from text; (4) efficiently learning distributed representations for short, low-quality ticket symptom descriptions and resolutions. Given different types of textual data, the data mining techniques proposed in these four research directions successfully address our tasks of understanding and extracting valuable knowledge from that textual data.
My dissertation will address the research topics outlined above. Concretely, I will focus on designing and developing data mining methodologies to better understand textual information, including (1) a storyline generation method for efficient summarization of natural hurricanes based on a crawled online corpus; (2) a recommendation framework for automated ticket resolution in IT service management; (3) an adaptive recommendation system for time-varying, temporally correlated features derived from text; (4) a deep neural ranking model that not only recommends resolutions but also efficiently outputs distributed representations for ticket descriptions and resolutions.
Participation and Data Valuation in IoT Data Markets through Distributed Coalitions
This paper considers a market for trading Internet of Things (IoT) data that
is used to train machine learning models. The data, either raw or processed, is
supplied to the market platform through a network and the price of such data is
controlled based on the value it brings to the machine learning model. We
explore the correlation property of data in a game-theoretical setting to
eventually derive a simplified distributed solution for a data trading
mechanism that emphasizes the mutual benefit of devices and the market. The key
proposal is an efficient algorithm for markets that jointly addresses the
challenges of availability and heterogeneity in participation, as well as the
transfer of trust and the economic value of data exchange in IoT networks. The
proposed approach establishes the data market by reinforcing collaboration
opportunities between devices with correlated data to avoid information leakage.
Therein, we develop a network-wide optimization problem that maximizes the
social value of coalition among the IoT devices of similar data types; at the
same time, it minimizes the cost due to network externalities, i.e., the impact
of information leakage due to data correlation, as well as the opportunity
costs. Finally, we reveal the structure of the formulated problem as a
distributed coalition game and solve it following the simplified
split-and-merge algorithm. Simulation results show the efficacy of our proposed
mechanism design toward a trusted IoT data market, with up to 32.72% gain in
the average payoff for each seller.
Comment: 14 pages. Submitted for possible publication
Neural Distributed Compressor Discovers Binning
We consider lossy compression of an information source when the decoder has
lossless access to a correlated one. This setup, also known as the Wyner-Ziv
problem, is a special case of distributed source coding. To this day, practical
approaches for the Wyner-Ziv problem have neither been fully developed nor
heavily investigated. We propose a data-driven method based on machine learning
that leverages the universal function approximation capability of artificial
neural networks. We find that our neural network-based compression scheme,
based on variational vector quantization, recovers some principles of the
optimum theoretical solution of the Wyner-Ziv setup, such as binning in the
source space as well as optimal combination of the quantization index and side
information, for exemplary sources. These behaviors emerge although no
structure exploiting knowledge of the source distributions was imposed. Binning
is a widely used tool in information theoretic proofs and methods, and to our
knowledge, this is the first time it has been explicitly observed to emerge
from data-driven learning.
Comment: draft of a journal version of our previous ISIT 2023 paper (available
at: arXiv:2305.04380). arXiv admin note: substantial text overlap with
arXiv:2305.0438
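Illustrative sketch (not taken from the paper): the binning idea mentioned in the abstract above can be shown with a hand-built toy. The encoder quantizes the source with a uniform scalar quantizer and transmits only the quantization index modulo the number of bins. The decoder then picks, among all reconstruction points sharing that bin, the one closest to its side information. This is a textbook-style construction assumed for illustration, not the learned neural compressor the paper describes. The function names, the step size, and the bin count are hypothetical.

```python
import numpy as np


def encode(x, step, num_bins):
    """Quantize x uniformly, then keep only the index modulo num_bins."""
    q = np.round(np.asarray(x) / step).astype(int)
    return np.mod(q, num_bins)


def decode(bin_idx, y, step, num_bins):
    """Pick the reconstruction point in the received bin closest to y.

    Points in bin b are (b + num_bins * k) * step for integer k, so the
    nearest one to the side information y is found by rounding k.
    """
    k = np.round((np.asarray(y) / step - bin_idx) / num_bins)
    return (bin_idx + num_bins * k) * step
```

Usage, under the assumed settings step = 0.5 and num_bins = 4: the source value 3.2 is quantized to index 6 and sent as bin 2, and a decoder holding side information 3.0 recovers the point 3.0. Decoding picks the right point as long as the side information stays within (num_bins - 1) * step / 2 of the source. Outside that range the decoder may select the wrong point in the bin.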