26,768 research outputs found
EsPRESSo: Efficient Privacy-Preserving Evaluation of Sample Set Similarity
Electronic information is increasingly often shared among entities without
complete mutual trust. To address related security and privacy issues, a few
cryptographic techniques have emerged that support privacy-preserving
information sharing and retrieval. One interesting open problem in this context
involves two parties that need to assess the similarity of their datasets, but
are reluctant to disclose their actual content. This paper presents an
efficient and provably-secure construction supporting the privacy-preserving
evaluation of sample set similarity, where similarity is measured as the
Jaccard index. We present two protocols: the first securely computes the
(Jaccard) similarity of two sets, and the second approximates it, using MinHash
techniques, with lower complexities. We show that our novel protocols are
attractive in many compelling applications, including document/multimedia
similarity, biometric authentication, and genetic tests. In the process, we
demonstrate that our constructions are appreciably more efficient than prior
work.Comment: A preliminary version of this paper was published in the Proceedings
of the 7th ESORICS International Workshop on Digital Privacy Management (DPM
2012). This is the full version, appearing in the Journal of Computer
Securit
On the Reverse Engineering of the Citadel Botnet
Citadel is an advanced information-stealing malware which targets financial
information. This malware poses a real threat against the confidentiality and
integrity of personal and business data. A joint operation was recently
conducted by the FBI and the Microsoft Digital Crimes Unit in order to take
down Citadel command-and-control servers. The operation caused some disruption
in the botnet but has not stopped it completely. Due to the complex structure
and advanced anti-reverse engineering techniques, the Citadel malware analysis
process is both challenging and time-consuming. This allows cyber criminals to
carry on with their attacks while the analysis is still in progress. In this
paper, we present the results of the Citadel reverse engineering and provide
additional insight into the functionality, inner workings, and open source
components of the malware. In order to accelerate the reverse engineering
process, we propose a clone-based analysis methodology. Citadel is an offspring
of a previously analyzed malware called Zeus; thus, using the former as a
reference, we can measure and quantify the similarities and differences of the
new variant. Two types of code analysis techniques are provided in the
methodology, namely assembly to source code matching and binary clone
detection. The methodology can help reduce the number of functions requiring
manual analysis. The analysis results prove that the approach is promising in
Citadel malware analysis. Furthermore, the same approach is applicable to
similar malware analysis scenarios.Comment: 10 pages, 17 figures. This is an updated / edited version of a paper
appeared in FPS 201
File Detection on Network Traffic Using Approximate Matching
In recent years, Internet technologies changed enormously and allow faster Internet connections, higher data rates and mobile usage. Hence, it is possible to send huge amounts of data / files easily which is often used by insiders or attackers to steal intellectual property. As a consequence, data leakage prevention systems (DLPS) have been developed which analyze network traffic and alert in case of a data leak. Although the overall concepts of the detection techniques are known, the systems are mostly closed and commercial. Within this paper we present a new technique for network traffic analysis based on approximate matching (a.k.a fuzzy hashing) which is very common in digital forensics to correlate similar files. This paper demonstrates how to optimize and apply them on single network packets. Our contribution is a straightforward concept which does not need a comprehensive configuration: hash the file and store the digest in the database. Within our experiments we obtained false positive rates between 10−4 and 10−5 and an algorithm throughput of over 650 Mbit/s
Bytewise Approximate Matching: The Good, The Bad, and The Unknown
Hash functions are established and well-known in digital forensics, where they are commonly used for proving integrity and file identification (i.e., hash all files on a seized device and compare the fingerprints against a reference database). However, with respect to the latter operation, an active adversary can easily overcome this approach because traditional hashes are designed to be sensitive to altering an input; output will significantly change if a single bit is flipped. Therefore, researchers developed approximate matching, which is a rather new, less prominent area but was conceived as a more robust counterpart to traditional hashing. Since the conception of approximate matching, the community has constructed numerous algorithms, extensions, and additional applications for this technology, and are still working on novel concepts to improve the status quo. In this survey article, we conduct a high-level review of the existing literature from a non-technical perspective and summarize the existing body of knowledge in approximate matching, with special focus on bytewise algorithms. Our contribution allows researchers and practitioners to receive an overview of the state of the art of approximate matching so that they may understand the capabilities and challenges of the field. Simply, we present the terminology, use cases, classification, requirements, testing methods, algorithms, applications, and a list of primary and secondary literature
File Detection on Network Traffic Using Approximate Matching
In recent years, Internet technologies changed enormously and allow faster Internet connections, higher data rates and mobile usage. Hence, it is possible to send huge amounts of data / files easily which is often used by insiders or attackers to steal intellectual property. As a consequence, data leakage prevention systems (DLPS) have been developed which analyze network traffic and alert in case of a data leak. Although the overall concepts of the detection techniques are known, the systems are mostly closed and commercial. Within this paper we present a new technique for network traffic analysis based on approximate matching (a.k.a fuzzy hashing) which is very common in digital forensics to correlate similar files. This paper demonstrates how to optimize and apply them on single network packets. Our contri- bution is a straightforward concept which does not need a comprehensive configuration: hash the file and store the digest in the database. Within our experiments we obtained false positive rates between 10-4 and 10-5 and an algorithm throughput of over 650 Mbit/s
- …