1 research outputs found
Large-scale Mobile App Identification Using Deep Learning
Many network services and tools (e.g. network monitors, malware-detection
systems, routing and billing policy enforcement modules in ISPs) depend on
identifying the type of traffic that passes through the network. With the
widespread use of mobile devices, the vast diversity of mobile apps, and the
massive adoption of encryption protocols (such as TLS), large-scale encrypted
traffic classification becomes increasingly difficult. In this paper, we
propose a deep learning model for mobile app identification that works even
with encrypted traffic. The proposed model only needs the payload of the first
few packets for classification, and, hence, it is suitable even for
applications that rely on early prediction, such as routing and QoS
provisioning. The deep model achieves between 84% to 98% accuracy for the
identification of 80 popular apps. We also perform occlusion analysis to bring
insight into what data is leaked from SSL/TLS protocol that allows accurate app
identification. Moreover, our traffic analysis shows that many apps generate
not only app-specific traffic, but also numerous ambiguous flows. Ambiguous
flows are flows generated by common functionality modules, such as
advertisement and traffic analytics. Because such flows are common among many
different apps, identifying the source app that generates ambiguous flows is
challenging. To address this challenge, we propose a CNN+LSTM model that uses
adjacent flows to learn the order and pattern of multiple flows, to better
identify the app that generates them. We show that such flow association
considerably improves the accuracy, particularly for ambiguous flows.
Furthermore, we show that our approach is robust to mixed traffic scenarios
where some unrelated flows may appear in adjacent flows. To the best of our
knowledge, this is the first work that identifies the source app for ambiguous
flows