5,115 research outputs found
Your Smart Home Can't Keep a Secret: Towards Automated Fingerprinting of IoT Traffic with Neural Networks
The IoT (Internet of Things) technology has been widely adopted in recent
years and has profoundly changed the people's daily lives. However, in the
meantime, such a fast-growing technology has also introduced new privacy
issues, which need to be better understood and measured. In this work, we look
into how private information can be leaked from network traffic generated in
the smart home network. Although researchers have proposed techniques to infer
IoT device types or user behaviors under clean experiment setup, the
effectiveness of such approaches become questionable in the complex but
realistic network environment, where common techniques like Network Address and
Port Translation (NAPT) and Virtual Private Network (VPN) are enabled. Traffic
analysis using traditional methods (e.g., through classical machine-learning
models) is much less effective under those settings, as the features picked
manually are not distinctive any more. In this work, we propose a traffic
analysis framework based on sequence-learning techniques like LSTM and
leveraged the temporal relations between packets for the attack of device
identification. We evaluated it under different environment settings (e.g.,
pure-IoT and noisy environment with multiple non-IoT devices). The results
showed our framework was able to differentiate device types with a high
accuracy. This result suggests IoT network communications pose prominent
challenges to users' privacy, even when they are protected by encryption and
morphed by the network gateway. As such, new privacy protection methods on IoT
traffic need to be developed towards mitigating this new issue
Security and Privacy Issues of Big Data
This chapter revises the most important aspects in how computing
infrastructures should be configured and intelligently managed to fulfill the
most notably security aspects required by Big Data applications. One of them is
privacy. It is a pertinent aspect to be addressed because users share more and
more personal data and content through their devices and computers to social
networks and public clouds. So, a secure framework to social networks is a very
hot topic research. This last topic is addressed in one of the two sections of
the current chapter with case studies. In addition, the traditional mechanisms
to support security such as firewalls and demilitarized zones are not suitable
to be applied in computing systems to support Big Data. SDN is an emergent
management solution that could become a convenient mechanism to implement
security in Big Data systems, as we show through a second case study at the end
of the chapter. This also discusses current relevant work and identifies open
issues.Comment: In book Handbook of Research on Trends and Future Directions in Big
Data and Web Intelligence, IGI Global, 201
Multitask Learning for Network Traffic Classification
Traffic classification has various applications in today's Internet, from
resource allocation, billing and QoS purposes in ISPs to firewall and malware
detection in clients. Classical machine learning algorithms and deep learning
models have been widely used to solve the traffic classification task. However,
training such models requires a large amount of labeled data. Labeling data is
often the most difficult and time-consuming process in building a classifier.
To solve this challenge, we reformulate the traffic classification into a
multi-task learning framework where bandwidth requirement and duration of a
flow are predicted along with the traffic class. The motivation of this
approach is twofold: First, bandwidth requirement and duration are useful in
many applications, including routing, resource allocation, and QoS
provisioning. Second, these two values can be obtained from each flow easily
without the need for human labeling or capturing flows in a controlled and
isolated environment. We show that with a large amount of easily obtainable
data samples for bandwidth and duration prediction tasks, and only a few data
samples for the traffic classification task, one can achieve high accuracy. We
conduct two experiment with ISCX and QUIC public datasets and show the efficacy
of our approach
- …