37 research outputs found

Thwarting Attacks in Malcode-Bearing Documents by Altering Data Sector Values

Author: Li Wei-Jen
Stolfo Salvatore
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2009

Embedding malcode within documents provides a convenient means of attacking systems. Such attacks can be highly targeted and difficult to detect and stop because of the multitude of document-exchange vectors and the vulnerabilities in modern document-processing applications. Detecting malcode embedded in a document is difficult owing to the complexity of modern document formats, which provide ample opportunity to embed code in a myriad of ways. As a case study, we focus on Microsoft Word documents as malcode carriers. To detect stealthy embedded malcode, we develop an arbitrary data transformation technique that changes the values of data segments in documents so as to purposely damage any hidden malcode embedded in those sections. Consequently, the embedded malcode will not only fail but also raise a system exception that is easily detected. The method is intended to be applied in a safe sandbox; the transformation is reversible after a document has been tested, and it requires no learning phase. The method depends on knowledge of the structure of the document binary format to parse a document and identify the specific sectors to which it can be safely applied for malcode detection. The method can be implemented in MS Word as a security feature to enhance the safety of Word documents.

Columbia University Academic Commons
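
The reversible value-altering idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state which transformation the authors use, so the additive shift modulo 256 below is an assumption, and the names `shift_sector` and `restore_sector` and the byte offsets are hypothetical. A real implementation would first parse the Word binary format to find sectors that are safe to alter.

```python
def shift_sector(data: bytes, start: int, end: int, delta: int = 1) -> bytes:
    """Return a copy of data with every byte in [start, end) shifted by delta (mod 256).

    Assumption: a plain additive shift stands in for the unspecified
    transformation; any embedded machine code in the range is corrupted.
    """
    if not 0 <= start <= end <= len(data):
        raise ValueError("sector bounds fall outside the document")
    shifted = bytes((b + delta) % 256 for b in data[start:end])
    return data[:start] + shifted + data[end:]


def restore_sector(data: bytes, start: int, end: int, delta: int = 1) -> bytes:
    """Undo shift_sector by applying the opposite shift to the same range."""
    return shift_sector(data, start, end, -delta)


# Hypothetical document: bytes 6..8 stand in for a data sector hiding code.
document = b"HEADER" + bytes([0x90, 0x90, 0xCC]) + b"FOOTER"
altered = shift_sector(document, 6, 9)

assert altered[6:9] == bytes([0x91, 0x91, 0xCD])   # sector values changed
assert altered[:6] == b"HEADER"                    # structure left intact
assert restore_sector(altered, 6, 9) == document   # transformation is reversible
```

The altered copy would be opened in the sandbox, and the original bytes restored once the test is complete.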

Designing Host and Network Sensors to Mitigate the Insider Threat

Author: Bowen Brian M.
Ben Salem Malek
Hershkop Shlomo
Keromytis Angelos D.
Stolfo Salvatore
Publication date: 29/05/2007

We propose a design for insider threat detection that combines an array of complementary techniques that aim to detect evasive adversaries. We are motivated by real-world incidents and by our experience with building isolated detectors: such standalone mechanisms are often easily identified and avoided by malefactors. Our work in progress combines host-based user-event monitoring sensors with trap-based decoys and remote network detectors to track and correlate insider activity. We identify several challenges in scaling up, deploying, and validating our architecture in real environments.

Columbia University Academic Commons

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

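A Study on a Malware Classification System Based on Static Analysis (静的解析に基づくマルウェア分類システムに関する研究)

Author: Iwamoto Kazuki (岩本一樹)
Publication venue: Shinshu University (信州大学)
Publication date: 20/03/2015

Shinshu University, Doctor of Engineering thesis. Iwamoto Kazuki. A Study on a Malware Classification System Based on Static Analysis. Shinshu University, 2015, doctoral dissertation. Doctor of Engineering, degree no. Kō 628, conferred 20 March 2015. Doctoral thesis.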

Shinshu University Institutional Repository

Sec-Lib: Protecting Scholarly Digital Libraries From Infected Papers Using Active Machine Learning Framework

Author: A. Cohen
A. Lanzi
J. Wu
L. Giles
L. Rokach
N. Nissim
Y. Elovici
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2019

Researchers from academia and the corporate sector rely on scholarly digital libraries to access articles. Attackers take advantage of unsuspecting users who consider the article files safe and thus open PDF files with little concern. In addition, researchers consider scholarly libraries a reliable, trusted, and untainted corpus of papers. For these reasons, scholarly digital libraries are an attractive target and inadvertently support the proliferation of cyber-attacks launched via malicious PDF files. In this study, we present related vulnerabilities and the malware distribution approaches that exploit them. We evaluated over two million scholarly papers in the CiteSeerX library and found the library to be contaminated with a surprisingly large number (0.3-2%) of malicious PDF documents (over 55% were crawled from the IPs of US universities). We developed Sec-Lib, a two-layered framework aimed at enhancing the detection of malicious PDF documents, which offers a security solution for large digital libraries. Sec-Lib includes a deterministic layer for detecting known malware and a machine-learning-based layer for detecting unknown malware. Our evaluation showed that scholarly digital libraries can detect 96.9% of malware with Sec-Lib while minimizing the number of PDF files requiring labeling, thus reducing the manual inspection effort of security experts by 98%.

AIR Universita degli studi di Milano
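
The two-layer arrangement described in the abstract above might be sketched as follows. This is not the Sec-Lib implementation: the SHA-256 blocklist, the `score` callable, the `low`/`high` thresholds, and the toy scorer are all hypothetical stand-ins. The sketch only shows the control flow of a deterministic lookup followed by a learned score, with uncertain files queued for expert labeling in the spirit of active learning.

```python
import hashlib
from typing import Callable, Dict, Iterable, Set, Tuple


def triage(
    files: Iterable[Tuple[str, bytes]],
    known_bad_hashes: Set[str],
    score: Callable[[bytes], float],
    low: float = 0.3,
    high: float = 0.7,
) -> Dict[str, str]:
    """Classify each (name, content) pair with a two-layer check.

    Layer 1 (deterministic): exact SHA-256 match against known malware.
    Layer 2 (learned): score in [0, 1]; scores between low and high are
    treated as uncertain and routed to a human for labeling.
    """
    verdicts: Dict[str, str] = {}
    for name, content in files:
        digest = hashlib.sha256(content).hexdigest()
        if digest in known_bad_hashes:
            verdicts[name] = "malicious (known)"
            continue
        probability = score(content)
        if probability >= high:
            verdicts[name] = "malicious (predicted)"
        elif probability <= low:
            verdicts[name] = "benign (predicted)"
        else:
            verdicts[name] = "needs labeling"
    return verdicts


def toy_score(content: bytes) -> float:
    """Hypothetical scorer standing in for a trained model."""
    if b"/JavaScript" in content:
        return 0.9
    if b"/OpenAction" in content:
        return 0.5
    return 0.1


known_sample = b"%PDF known"
blocklist = {hashlib.sha256(known_sample).hexdigest()}
results = triage(
    [("a.pdf", known_sample), ("b.pdf", b"/JavaScript"), ("c.pdf", b"/OpenAction")],
    blocklist,
    toy_score,
)
assert results["a.pdf"] == "malicious (known)"
assert results["c.pdf"] == "needs labeling"
```

Routing only the uncertain band to experts is one simple way to keep manual labeling small; the selection strategy actually used by the authors is not given in the abstract.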

Sec-Lib: Protecting Scholarly Digital Libraries From Infected Papers Using Active Machine Learning Framework

Author: Cohen Aviad
Elovici Yuval
Giles Lee
Lanzi Andrea
Nissim Nir
Rokach Lior
Wu Jian
Publication venue: ODU Digital Commons
Publication date: 01/01/2019

Researchers from academia and the corporate sector rely on scholarly digital libraries to access articles. Attackers take advantage of unsuspecting users who consider the article files safe and thus open PDF files with little concern. In addition, researchers consider scholarly libraries a reliable, trusted, and untainted corpus of papers. For these reasons, scholarly digital libraries are an attractive target and inadvertently support the proliferation of cyber-attacks launched via malicious PDF files. In this study, we present related vulnerabilities and the malware distribution approaches that exploit them. We evaluated over two million scholarly papers in the CiteSeerX library and found the library to be contaminated with a surprisingly large number (0.3-2%) of malicious PDF documents (over 55% were crawled from the IPs of US universities). We developed Sec-Lib, a two-layered framework aimed at enhancing the detection of malicious PDF documents, which offers a security solution for large digital libraries. Sec-Lib includes a deterministic layer for detecting known malware and a machine-learning-based layer for detecting unknown malware. Our evaluation showed that scholarly digital libraries can detect 96.9% of malware with Sec-Lib while minimizing the number of PDF files requiring labeling, thus reducing the manual inspection effort of security experts by 98%.

AIR Universita degli studi di Milano

Old Dominion University

Static malware detection Using Stacked BiLSTM and GPT-2

Author: Acartürk Cengiz
Demirci Deniz
Sahin Nazenin
Sirlancis Melih
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2022

In recent years, cyber threats and malicious software attacks have escalated on various platforms. It has therefore become essential to develop automated machine learning methods for defending against malware. In the present study, we propose stacked bidirectional long short-term memory (Stacked BiLSTM) and generative pre-trained transformer (GPT-2) based deep learning language models for detecting malicious code. We developed language models using assembly instructions extracted from the .text sections of malicious and benign Portable Executable (PE) files. We treated each instruction as a sentence and each .text section as a document, and we labeled each sentence and document as benign or malicious according to the file source. We created three datasets from those sentences and documents. The first dataset, composed of documents, was fed into a Document Level Analysis Model (DLAM) based on Stacked BiLSTM. The second dataset, composed of sentences, was used in Sentence Level Analysis Models (SLAMs) based on Stacked BiLSTM and DistilBERT, a Domain Specific Language Model GPT-2 (DSLM-GPT2), and a General Language Model GPT-2 (GLM-GPT2). Lastly, we merged all assembly instructions without labels to create the third dataset, which we used to train a custom pre-trained model. We then compared malware detection performance. The results showed that the pre-trained model improved the detection performance of DSLM-GPT2 and GLM-GPT2. The experiments showed that the DLAM, the SLAM based on DistilBERT, the DSLM-GPT2, and the GLM-GPT2 achieved F1 scores of 98.3%, 70.4%, 86.0%, and 76.2%, respectively.

Jagiellonian University Repository
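
The dataset construction described in the abstract above (each instruction as a sentence, each .text section as a document, plus an unlabeled merged corpus) can be sketched roughly as below. The function name `build_datasets`, the `Sample` type, and the input layout are assumptions made for illustration. Disassembling PE files is out of scope here, so the instructions are supplied as plain strings.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Sample(NamedTuple):
    text: str
    label: str  # "benign" or "malicious", taken from the source file


def build_datasets(
    sections: Sequence[Tuple[str, Sequence[str]]],
) -> Tuple[List[Sample], List[Sample], List[str]]:
    """Build document-level, sentence-level, and unlabeled datasets.

    sections: one (label, instructions) pair per .text section.
    Returns (documents, sentences, unlabeled_corpus).
    """
    documents: List[Sample] = []
    sentences: List[Sample] = []
    unlabeled: List[str] = []
    for label, instructions in sections:
        # Normalize whitespace and drop empty lines from the listing.
        cleaned = [" ".join(ins.split()) for ins in instructions if ins.strip()]
        # One document per .text section, one instruction per line.
        documents.append(Sample("\n".join(cleaned), label))
        # One labeled sentence per instruction.
        sentences.extend(Sample(ins, label) for ins in cleaned)
        # Same instructions without labels, for language-model pre-training.
        unlabeled.extend(cleaned)
    return documents, sentences, unlabeled


example_sections = [
    ("malicious", ["push  ebp", "mov ebp, esp", ""]),
    ("benign", ["ret"]),
]
docs, sents, corpus = build_datasets(example_sections)
assert len(docs) == 2 and len(sents) == 3
assert corpus == ["push ebp", "mov ebp, esp", "ret"]
```

Every sentence inherits the label of its source file, as the abstract describes, so benign-looking instructions from a malicious file are still labeled malicious.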

Leveraging the Cloud for Software Security Services.

Author: Oberheide Jonathan Clarke

This thesis seeks to leverage the advances in cloud computing in order to address modern security threats, allowing for completely novel architectures that provide dramatic improvements and asymmetric gains beyond what is possible using current approaches. Indeed, many of the critical security problems facing the Internet and its users are inadequately addressed by current security technologies. Current security measures often are deployed in an exclusively network-based or host-based model, limiting their efficacy against modern threats. However, recent advancements in the past decade in cloud computing and high-speed networking have ushered in a new era of software services. Software services that were previously deployed on-premise in organizations and enterprises are now being outsourced to the cloud, leading to fundamentally new models in how software services are sold, consumed, and managed. This thesis focuses on how novel software security services can be deployed that leverage the cloud to scale elegantly in their capabilities, performance, and management. First, we introduce a novel architecture for malware detection in the cloud. Next, we propose a cloud service to protect modern mobile devices, an ever-increasing target for malicious attackers. Then, we discuss and demonstrate the ability for attackers to leverage the same benefits of cloud-centric services for malicious purposes. Next, we present new techniques for the large-scale analysis and classification of malicious software. Lastly, to demonstrate the benefits of cloud-centric architectures outside the realm of malicious software, we present a threshold signature scheme that leverages the cloud for robustness and resiliency. Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/91385/1/jonojono_1.pd

Deep Blue Documents at the University of Michigan

Advanced Detection Tool for PDF Threats

Author: B Lowagie
C Willems
E Filiol
HJ Abdelnur
I Witten
J François
J Kolter
M Hall
R Akbani
R Fan
U Bayer
Publication date: 13/09/2013

In this paper we introduce ADEPT, an efficient application for malicious PDF detection. With targeted attacks rising over the recent past, exploring a new detection and mitigation paradigm becomes mandatory. The use of malicious PDF files that exploit vulnerabilities in well-known PDF readers has become a popular vector for targeted attacks, for which few efficient approaches exist. Although simple in theory, parsing followed by analysis of such files is resource-intensive and may even be impossible because of obfuscation and reader-specific artifacts. Our paper describes a new approach for detecting such malicious payloads that leverages machine learning techniques and an efficient feature selection mechanism for rapidly detecting anomalies. We assess our approach on a large selection of malicious files and report the experimental performance results for the developed prototype.

Crossref

Open Repository and Bibliography - Luxembourg

Data flows of classified documents

Author: Durães Benjamim Gomes da Silva
Publication date: 01/01/2010

Master's thesis in Information Security (Segurança Informática), presented to the Universidade de Lisboa through the Faculdade de Ciências, 2010.
The present acceleration in the evolution of products and services makes information one of the most valuable assets of any company. The more "immaterial" the product, such as services, intellectual property (IP), and media (video, music, photography), the more this matters. This accelerated life cycle exposes information to more threats than in the past, when slower development permitted tighter control over the information and its exposure. Another contributing factor is the growing number of company collaborators who access sensitive information. The increasing digitalization of data and the connectivity between entities in a connected, fast-moving world make information security a more complicated subject to deal with than in previous eras. Several methods and techniques have been developed to protect the confidentiality, integrity, authenticity, and access authorization of information. Less attention has been given to detecting, in a structured way, where or by whom information may be leaking out of the company.
This project aims to contribute to solving the problem of detecting the leakage of sensitive data. To that end, it proposes methods to tag and control the use of sensitive information within the company premises. Because of the nature of some of the proposed techniques, privacy concerns may arise, but since the techniques are for use only with sensitive data that belongs to the company, those concerns can be adequately countered. The document type considered in this work is Microsoft Office Word 2007, which implements a new file type with an open nature, as opposed to the binary, closed formats of previous versions. Since Word is widely used within corporations, the choice is well justified. Other possible cases would be Excel and PowerPoint files, which also have an open architecture starting with Office 2007. Outside the world of Microsoft Office, the most prominent case is PDF files, but those require a completely different approach because their structure is very different; in addition, well-built tools exist for Office files that ease the implementation of the intended features. The solution is aimed at a major telecom company that has the concerns mentioned above: it commercializes intangible products (services and media) and owns considerable IP because of its ongoing support for research into new products and services. It could nevertheless be deployed in any other type of company whose operation relies on electronic data, as is the case for the large majority of today's enterprises.

Universidade de Lisboa: Repositório.UL