2,368 research outputs found
Prochlo: Strong Privacy for Analytics in the Crowd
The large-scale monitoring of computer users' software activities has become
commonplace, e.g., for application telemetry, error reporting, or demographic
profiling. This paper describes a principled systems architecture---Encode,
Shuffle, Analyze (ESA)---for performing such monitoring with high utility while
also protecting user privacy. The ESA design, and its Prochlo implementation,
are informed by our practical experiences with an existing, large deployment of
privacy-preserving software monitoring.
(cont.; see the paper
Robust and Efficient Aggregation for Distributed Learning
Distributed learning paradigms, such as federated and decentralized learning,
allow for the coordination of models across a collection of agents, and without
the need to exchange raw data. Instead, agents compute model updates locally
based on their available data, and subsequently share the update model with a
parameter server or their peers. This is followed by an aggregation step, which
traditionally takes the form of a (weighted) average. Distributed learning
schemes based on averaging are known to be susceptible to outliers. A single
malicious agent is able to drive an averaging-based distributed learning
algorithm to an arbitrarily poor model. This has motivated the development of
robust aggregation schemes, which are based on variations of the median and
trimmed mean. While such procedures ensure robustness to outliers and malicious
behavior, they come at the cost of significantly reduced sample efficiency.
This means that current robust aggregation schemes require significantly higher
agent participation rates to achieve a given level of performance than their
mean-based counterparts in non-contaminated settings. In this work we remedy
this drawback by developing statistically efficient and robust aggregation
schemes for distributed learning.
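
The contrast this abstract draws between averaging and median or trimmed-mean aggregation can be illustrated with a minimal sketch. It assumes each agent reports a single scalar update, and the function names and the trim fraction are hypothetical choices made for illustration. It shows only the classical aggregators mentioned in the abstract, not the statistically efficient scheme the paper proposes.

```python
import statistics


def mean_aggregate(updates):
    """Plain average: one extreme value can move it arbitrarily far."""
    return sum(updates) / len(updates)


def median_aggregate(updates):
    """Coordinate-wise median (here: scalar median) of the agent updates."""
    return statistics.median(updates)


def trimmed_mean_aggregate(updates, trim_fraction=0.2):
    """Drop the k smallest and k largest updates, then average the rest.

    trim_fraction is an illustrative parameter, not a value from the paper.
    """
    k = int(len(updates) * trim_fraction)
    ordered = sorted(updates)
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return sum(kept) / len(kept)


# Five honest agents report updates close to 1.0.
honest = [0.9, 1.0, 1.1, 1.0, 0.95]
# A single malicious agent replaces the last update with a huge value.
contaminated = honest[:-1] + [1e6]

mean_result = mean_aggregate(contaminated)
median_result = median_aggregate(contaminated)
trimmed_result = trimmed_mean_aggregate(contaminated)
```

With one contaminated update out of five, the mean is dragged far away from the honest values, while the median and the trimmed mean stay close to 1.0. The price, as the abstract notes, is that these robust aggregators discard information from honest agents.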
Emerging Security Threats in Modern Digital Computing Systems: A Power Management Perspective
The design of computing systems, from pocket-sized smartphones to massive cloud-based data centers, faces one common, daunting challenge: minimizing power consumption. To this end, the power management sector is undergoing a rapid and profound transformation to promote clean and energy-proportional computing. At the hardware end of system design, there is a proliferation of specialized, feature-rich, and complex power management hardware components. Similarly, in the software design layer, complex power management suites are growing rapidly. Concurrent with this development, there has been an upsurge in the integration of third-party components to counter the pressures of shorter time-to-market. These trends collectively raise serious concerns about the trust and security of power management solutions.
In recent times, problems such as overheating, performance degradation, and poor battery life have dogged the mobile device market, including the infamous recall of the Samsung Note 7. A power outage in the data center of a major airline left innumerable passengers stranded, with thousands of canceled flights costing over 100 million dollars. This research examines whether such unintentional reliability failures can be replicated using targeted attacks that exploit security loopholes in the complex power management infrastructure of a computing system.
At its core, this research answers a pressing research question: How can system designers ensure secure and reliable operation of third-party power management units? Specifically, this work investigates possible attack vectors, as well as novel non-invasive detection and defense mechanisms to safeguard systems against malicious power attacks. Through a joint exploration of the threat model and techniques to seamlessly detect and protect against power attacks, this project can have a lasting impact by enabling the design of secure and cost-effective next-generation hardware platforms.
SETTI: A Self-supervised Adversarial Malware Detection Architecture in an IoT Environment
In recent years, malware detection has become an active research topic in the
area of Internet of Things (IoT) security. The principle is to exploit
knowledge from large quantities of continuously generated malware. Existing
algorithms rely on the malware features available for IoT devices and lack
real-time prediction capabilities. More research is thus required on malware
detection to cope with real-time misclassification of the input IoT data.
Motivated by this, in this paper we propose an adversarial self-supervised
architecture for detecting malware in IoT networks, SETTI, considering samples
of IoT network traffic that may not be labeled. In the SETTI architecture, we
design three self-supervised attack techniques, namely Self-MDS, GSelf-MDS and
ASelf-MDS. The Self-MDS method considers the IoT input data and the adversarial
sample generation in real-time. The GSelf-MDS builds a generative adversarial
network model to generate adversarial samples in the self-supervised structure.
Finally, ASelf-MDS utilizes three well-known perturbation sample techniques to
develop adversarial malware and inject it over the self-supervised
architecture. We also apply a defence method to mitigate these attacks, namely
adversarial self-supervised training, to protect the malware detection
architecture against the injection of malicious samples. To validate the attack
and defence algorithms, we conduct experiments on two recent IoT datasets:
IoT23 and NBIoT. Comparison of the results shows that in the IoT23 dataset, the
Self-MDS method has the most damaging consequences from the attacker's point of
view by reducing the accuracy rate from 98% to 74%. In the NBIoT dataset, the
ASelf-MDS method is the most devastating algorithm that can plunge the accuracy
rate from 98% to 77%.
Comment: 20 pages, 6 figures, 2 Tables, Submitted to ACM Transactions on
Multimedia Computing, Communications, and Applications
Unsupervised robust nonparametric learning of hidden community properties
We consider learning of fundamental properties of communities in large noisy
networks, in the prototypical situation where the nodes or users are split into
two classes according to a binary property, e.g., according to their opinions
or preferences on a topic. For learning these properties, we propose a
nonparametric, unsupervised, and scalable graph scan procedure that is, in
addition, robust against a class of powerful adversaries. In our setup, one of
the communities can fall under the influence of a knowledgeable adversarial
leader, who knows the full network structure, has unlimited computational
resources and can completely foresee our planned actions on the network. We
prove strong consistency of our results in this setup with minimal assumptions.
In particular, the learning procedure estimates the baseline activity of normal
users asymptotically correctly with probability 1; the only assumption being
the existence of a single implicit community of asymptotically negligible
logarithmic size. We provide experiments on real and synthetic data to
illustrate the performance of our method, including examples with adversaries.
Comment: Experiments with new types of adversaries added
Detection and Mitigation of Steganographic Malware
A new attack trend concerns the use of some form of steganography and information hiding to make malware stealthier and able to elude many standard security mechanisms. Therefore, this Thesis addresses the detection and the mitigation of this class of threats. In particular, it considers malware implementing covert communications within network traffic or cloaking malicious payloads within digital images.
The first research contribution of this Thesis is in the detection of network covert channels. Unfortunately, the literature on the topic lacks real traffic traces or attack samples with which to perform precise tests or security assessments. Thus, a propaedeutic research activity has been devoted to developing two ad-hoc tools. The first allows the creation of covert channels targeting the IPv6 protocol by eavesdropping flows, whereas the second allows secret data to be embedded within arbitrary traffic traces that can be replayed to perform investigations in realistic conditions. This Thesis then starts with a security assessment concerning the impact of hidden network communications in production-quality scenarios. Results have been obtained by considering channels cloaking data in the most popular protocols (e.g., TLS, IPv4/v6, and ICMPv4/v6) and show that de-facto standard intrusion detection systems and firewalls (i.e., Snort, Suricata, and Zeek) are unable to spot this class of hazards.
Since malware can conceal information (e.g., commands and configuration files) in almost every protocol, traffic feature, or network element, configuring or adapting pre-existing security solutions may not be straightforward. Moreover, inspecting multiple protocols, fields, or conversations at the same time could lead to performance issues.
Thus, a major effort has been devoted to developing a suite based on the extended Berkeley Packet Filter (eBPF) to gain visibility over different network protocols and components and to efficiently collect various performance indicators or statistics with a single technology. This part of the research made it possible to spot network covert channels targeting the header of the IPv6 protocol or the inter-packet time of generic network conversations. In addition, the eBPF-based approach turned out to be very flexible and also revealed hidden data transfers between two processes co-located within the same host. Another important contribution of this part of the Thesis concerns the deployment of the suite in realistic scenarios and its comparison with other similar tools. Specifically, a thorough performance evaluation demonstrated that eBPF can be used to inspect traffic and reveal the presence of covert communications even under high loads, e.g., it can sustain rates up to 3 Gbit/s with commodity hardware. To further address the problem of revealing network covert channels in realistic environments, this Thesis also investigates malware targeting traffic generated by Internet of Things devices. In this case, an incremental ensemble of autoencoders has been considered to cope with the "unknown" location of the hidden data generated by a threat covertly exchanging commands with a remote attacker.
The second research contribution of this Thesis is in the detection of malicious payloads hidden within digital images. In fact, the majority of real-world malware exploits hiding methods based on Least Significant Bit steganography and some of its variants, such as the Invoke-PSImage mechanism. Therefore, a considerable amount of research has been done to detect the presence of hidden data and classify the payload (e.g., malicious PowerShell scripts or PHP fragments). To this aim, mechanisms leveraging Deep Neural Networks (DNNs) proved to be flexible and effective, since they can learn by combining raw low-level data and can be updated or retrained to consider unseen payloads or images with different features. To take into account realistic threat models, this Thesis studies malware targeting different types of images (i.e., favicons and icons) and various payloads (e.g., URLs and Ethereum addresses, as well as webshells). The results obtained show that DNNs can be considered a valid tool for spotting the presence of hidden contents, since their detection accuracy is always above 90%, even when facing "elusion" mechanisms such as basic obfuscation techniques or alternative encoding schemes.
Lastly, when detection or classification is not possible (e.g., due to resource constraints), approaches enforcing "sanitization" can be applied. Thus, this Thesis also considers autoencoders able to disrupt hidden malicious contents without degrading the quality of the image.
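
The Least Significant Bit hiding and the idea of sanitization described in this abstract can be illustrated with a minimal sketch. It assumes the image is a flat sequence of 8-bit pixel values, and the function names are hypothetical. The sanitization step here simply clears every least significant bit, a deliberately simple stand-in for the autoencoder-based approach the Thesis studies, which is not reproduced here.

```python
def embed_lsb(pixels: bytes, payload: bytes) -> bytes:
    """Hide payload bits (most significant bit first) in the LSB of each pixel."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for this cover image")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the pixel, then set it to the payload bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    """Read n_bytes of hidden data back from the pixel LSBs."""
    recovered = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        recovered.append(value)
    return bytes(recovered)


def sanitize_lsb(pixels: bytes) -> bytes:
    """Illustrative sanitization: zero every LSB, destroying any hidden bits."""
    return bytes(p & 0xFE for p in pixels)


cover = bytes(range(256))          # stand-in for raw 8-bit pixel data
secret = b"hi"                     # stand-in for a hidden payload
stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego, len(secret))
after_cleaning = extract_lsb(sanitize_lsb(stego), len(secret))
```

Each pixel changes by at most one intensity level, which is why such payloads are hard to spot visually. Clearing the low bits likewise alters each pixel by at most one level while making the hidden payload unrecoverable.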
Structure and dynamics of core-periphery networks
Recent studies uncovered important core/periphery network structures
characterizing complex sets of cooperative and competitive interactions between
network nodes, be they proteins, cells, species or humans. Better
characterization of the structure, dynamics and function of core/periphery
networks is a key step in our understanding of cellular functions, species
adaptation, social and market changes. Here we summarize the current knowledge
of the structure and dynamics of "traditional" core/periphery networks,
rich-clubs, nested, bow-tie and onion networks. Comparing core/periphery
structures with network modules, we discriminate between global and local
cores. The core/periphery network organization lies in the middle of several
extreme properties, such as random/condensed structures, clique/star
configurations, network symmetry/asymmetry, network
assortativity/disassortativity, as well as network hierarchy/anti-hierarchy.
These properties of high complexity together with the large degeneracy of core
pathways ensuring cooperation and providing multiple options of network flow
re-channelling greatly contribute to the high robustness of complex systems.
Core processes enable a coordinated response to various stimuli, decrease
noise, and evolve slowly. The integrative function of network cores is an
important step in the development of a large variety of complex organisms and
organizations. In addition to these important features and several decades of
research interest, studies on core/periphery networks still have a number of
unexplored areas.
Comment: a comprehensive review of 41 pages, 2 figures, 1 table and 182
references
A Hybrid Graph Neural Network Approach for Detecting PHP Vulnerabilities
This paper presents DeepTective, a deep learning approach to detect
vulnerabilities in PHP source code. Our approach implements a novel hybrid
technique that combines Gated Recurrent Units and Graph Convolutional Networks
to detect SQLi, XSS and OSCI vulnerabilities leveraging both syntactic and
semantic information. We evaluate DeepTective and compare it to the state of
the art on an established synthetic dataset and on a novel real-world dataset
collected from GitHub. Experimental results show that DeepTective achieves near
perfect classification on the synthetic dataset, and an F1 score of 88.12% on
the realistic dataset, outperforming related approaches. We validate
DeepTective in the wild by discovering 4 novel vulnerabilities in established
WordPress plugins.
Comment: A poster version of this paper appeared as
https://doi.org/10.1145/3412841.344213