169,996 research outputs found
Approaches to address the Data Skew Problem in Federated Learning
A Federated Learning approach consists of creating an AI model from multiple data sources, without moving large amounts of data across to a central environment. Federated learning can be very useful in a tactical coalition environment, where data can be collected individually by each of the coalition partners, but network connectivity is inadequate to move the data to a central environment. However, such data collected is often dirty and imperfect. The data can be imbalanced, and in some cases, some classes can be completely missing from some coalition partners. Under these conditions, traditional approaches for federated learning can result in models that are highly inaccurate. In this paper, we propose approaches that can result in good machine learning models even in the environments where the data may be highly skewed, and study their performance under different environments
Passport: enabling accurate country-level router geolocation using inaccurate sources
When does Internet traffic cross international borders? This question has major geopolitical, legal and social implications and is surprisingly difficult to answer. A critical stumbling block is a dearth of tools that accurately map routers traversed by Internet traffic to the countries in which they are located. This paper presents Passport: a new approach for efficient, accurate country-level router geolocation and a system that implements it. Passport provides location predictions with limited active measurements, using machine learning to combine information from IP geolocation databases, router hostnames, whois records, and ping measurements. We show that Passport substantially outperforms existing techniques, and identify cases where paths traverse countries with implications for security, privacy, and performance.First author draf
Passport: Enabling Accurate Country-Level Router Geolocation using Inaccurate Sources
When does Internet traffic cross international borders? This question has
major geopolitical, legal and social implications and is surprisingly difficult
to answer. A critical stumbling block is a dearth of tools that accurately map
routers traversed by Internet traffic to the countries in which they are
located. This paper presents Passport: a new approach for efficient, accurate
country-level router geolocation and a system that implements it. Passport
provides location predictions with limited active measurements, using machine
learning to combine information from IP geolocation databases, router
hostnames, whois records, and ping measurements. We show that Passport
substantially outperforms existing techniques, and identify cases where paths
traverse countries with implications for security, privacy, and performance
Transfer Learning for Improving Model Predictions in Highly Configurable Software
Modern software systems are built to be used in dynamic environments using
configuration capabilities to adapt to changes and external uncertainties. In a
self-adaptation context, we are often interested in reasoning about the
performance of the systems under different configurations. Usually, we learn a
black-box model based on real measurements to predict the performance of the
system given a specific configuration. However, as modern systems become more
complex, there are many configuration parameters that may interact and we end
up learning an exponentially large configuration space. Naturally, this does
not scale when relying on real measurements in the actual changing environment.
We propose a different solution: Instead of taking the measurements from the
real system, we learn the model using samples from other sources, such as
simulators that approximate performance of the real system at low cost. We
define a cost model that transform the traditional view of model learning into
a multi-objective problem that not only takes into account model accuracy but
also measurements effort as well. We evaluate our cost-aware transfer learning
solution using real-world configurable software including (i) a robotic system,
(ii) 3 different stream processing applications, and (iii) a NoSQL database
system. The experimental results demonstrate that our approach can achieve (a)
a high prediction accuracy, as well as (b) a high model reliability.Comment: To be published in the proceedings of the 12th International
Symposium on Software Engineering for Adaptive and Self-Managing Systems
(SEAMS'17
Pushing towards the Limit of Sampling Rate: Adaptive Chasing Sampling
Measurement samples are often taken in various monitoring applications. To
reduce the sensing cost, it is desirable to achieve better sensing quality
while using fewer samples. Compressive Sensing (CS) technique finds its role
when the signal to be sampled meets certain sparsity requirements. In this
paper we investigate the possibility and basic techniques that could further
reduce the number of samples involved in conventional CS theory by exploiting
learning-based non-uniform adaptive sampling.
Based on a typical signal sensing application, we illustrate and evaluate the
performance of two of our algorithms, Individual Chasing and Centroid Chasing,
for signals of different distribution features. Our proposed learning-based
adaptive sampling schemes complement existing efforts in CS fields and do not
depend on any specific signal reconstruction technique. Compared to
conventional sparse sampling methods, the simulation results demonstrate that
our algorithms allow less number of samples for accurate signal
reconstruction and achieve up to smaller signal reconstruction error
under the same noise condition.Comment: 9 pages, IEEE MASS 201
- …