37 research outputs found

    静的解析に基づくマルウェア分類システムに関する研究

    Get PDF
    信州大学(Shinshu university)博士(工学)Thesis岩本 一樹. 静的解析に基づくマルウェア分類システムに関する研究. 信州大学, 2015, 博士論文. 博士(工学), 甲第628号, 平成27年3月20日授与.doctoral thesi

    Sec-Lib: Protecting Scholarly Digital Libraries From Infected Papers Using Active Machine Learning Framework

    Get PDF
    Researchers from academia and the corporate-sector rely on scholarly digital libraries to access articles. Attackers take advantage of innocent users who consider the articles' files safe and thus open PDF-files with little concern. In addition, researchers consider scholarly libraries a reliable, trusted, and untainted corpus of papers. For these reasons, scholarly digital libraries are an attractive-target and inadvertently support the proliferation of cyber-attacks launched via malicious PDF-files. In this study, we present related vulnerabilities and malware distribution approaches that exploit the vulnerabilities of scholarly digital libraries. We evaluated over two-million scholarly papers in the CiteSeerX library and found the library to be contaminated with a surprisingly large number (0.3-2%) of malicious PDF documents (over 55% were crawled from the IPs of US-universities). We developed a two layered detection framework aimed at enhancing the detection of malicious PDF documents, Sec-Lib, which offers a security solution for large digital libraries. Sec-Lib includes a deterministic layer for detecting known malware, and a machine learning based layer for detecting unknown malware. Our evaluation showed that scholarly digital libraries can detect 96.9% of malware with Sec-Lib, while minimizing the number of PDF-files requiring labeling, and thus reducing the manual inspection efforts of security-experts by 98%

    Sec-Lib: Protecting Scholarly Digital Libraries From Infected Papers Using Active Machine Learning Framework

    Get PDF
    Researchers from academia and the corporate-sector rely on scholarly digital libraries to access articles. Attackers take advantage of innocent users who consider the articles\u27 files safe and thus open PDF-files with little concern. In addition, researchers consider scholarly libraries a reliable, trusted, and untainted corpus of papers. For these reasons, scholarly digital libraries are an attractive-target and inadvertently support the proliferation of cyber-attacks launched via malicious PDF-files. In this study, we present related vulnerabilities and malware distribution approaches that exploit the vulnerabilities of scholarly digital libraries. We evaluated over two-million scholarly papers in the CiteSeerX library and found the library to be contaminated with a surprisingly large number (0.3-2%) of malicious PDF documents (over 55% were crawled from the IPs of US-universities). We developed a two layered detection framework aimed at enhancing the detection of malicious PDF documents, Sec-Lib, which offers a security solution for large digital libraries. Sec-Lib includes a deterministic layer for detecting known malware, and a machine learning based layer for detecting unknown malware. Our evaluation showed that scholarly digital libraries can detect 96.9% of malware with Sec-Lib, while minimizing the number of PDF-files requiring labeling, and thus reducing the manual inspection efforts of security-experts by 98%

    Static malware detection Using Stacked BiLSTM and GPT-2

    Get PDF
    In recent years, cyber threats and malicious software attacks have been escalated on various platforms. Therefore, it has become essential to develop automated machine learning methods for defending against malware. In the present study, we propose stacked bidirectional long short-term memory (Stacked BiLSTM) and generative pre-trained transformer based (GPT-2) deep learning language models for detecting malicious code. We developed language models using assembly instructions extracted from .text sections of malicious and benign Portable Executable (PE) files. We treated each instruction as a sentence and each .text section as a document. We also labeled each sentence and document as benign or malicious, according to the file source. We created three datasets from those sentences and documents. The first dataset, composed of documents, was fed into a Document Level Analysis Model (DLAM) based on Stacked BiLSTM. The second dataset, composed of sentences, was used in Sentence Level Analysis Models (SLAMs) based on Stacked BiLSTM and DistilBERT, Domain Specific Language Model GPT-2 (DSLM-GPT2), and General Language Model GPT-2 (GLM-GPT2). Lastly, we merged all assembly instructions without labels for creating the third dataset; then we fed a custom pre-trained model with it. We then compared malware detection performances. The results showed that the pre-trained model improved the DSLM-GPT2 and GLM-GPT2 detection performance. The experiments showed that the DLAM, the SLAM based on DistilBERT, the DSLM-GPT2, and the GLM-GPT2 achieved 98.3%, 70.4%, 86.0%, and 76.2% F1 scores, respectively

    Leveraging the Cloud for Software Security Services.

    Full text link
    This thesis seeks to leverage the advances in cloud computing in order to address modern security threats, allowing for completely novel architectures that provide dramatic improvements and asymmetric gains beyond what is possible using current approaches. Indeed, many of the critical security problems facing the Internet and its users are inadequately addressed by current security technologies. Current security measures often are deployed in an exclusively network-based or host-based model, limiting their efficacy against modern threats. However, recent advancements in the past decade in cloud computing and high-speed networking have ushered in a new era of software services. Software services that were previously deployed on-premise in organizations and enterprises are now being outsourced to the cloud, leading to fundamentally new models in how software services are sold, consumed, and managed. This thesis focuses on how novel software security services can be deployed that leverage the cloud to scale elegantly in their capabilities, performance, and management. First, we introduce a novel architecture for malware detection in the cloud. Next, we propose a cloud service to protect modern mobile devices, an ever-increasing target for malicious attackers. Then, we discuss and demonstrate the ability for attackers to leverage the same benefits of cloud-centric services for malicious purposes. Next, we present new techniques for the large-scale analysis and classification of malicious software. Lastly, to demonstrate the benefits of cloud-centric architectures outside the realm of malicious software, we present a threshold signature scheme that leverages the cloud for robustness and resiliency.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/91385/1/jonojono_1.pd

    Advanced Detection Tool for PDF Threats

    Get PDF
    In this paper we introduce an efficient application for malicious PDF detection: ADEPT. With targeted attacks rising over the recent past, exploring a new detection and mitigation paradigm becomes mandatory. The use of malicious PDF files that exploit vulnerabilities in well-known PDF readers has become a popular vector for targeted at- tacks, for which few efficient approaches exist. Although simple in theory, parsing followed by analysis of such files is resource-intensive and may even be impossible due to several obfuscation and reader-specific artifacts. Our paper describes a new approach for detecting such malicious payloads that leverages machine learning techniques and an efficient feature selection mechanism for rapidly detecting anomalies. We assess our approach on a large selection of malicious files and report the experimental performance results for the developed prototype

    Data flows of classified documents

    Get PDF
    Tese de mestrado em Segurança Informática, apresentada à Universidade de Lisboa, através da Faculdade de Ciências, 2010Nos dias de hoje à medida que a evolução dos produtos e serviços acelera cada vez mais, significa que a informação é hoje uma das mais valiosas propriedades de qualquer empresa. Quanto mais “imaterial” é o produto, tais como serviços, propriedade intelectual e media (vídeo, música, fotografia) mais importante é a questão. Este ciclo de vida acelerado torna a sua exposição maior do que em épocas anteriores, onde o desenvolvimento mais lento permitiu um maior controlo sobre eles e sua exposição. Outros factores são a participação de um número crescente de colaboradores que acedem às informações confidenciais. A crescente digitalização da informação e conectividade entre as diversas entidades num mundo ligado de uma forma rápida faz com que a segurança da informação seja um assunto mais complicado de lidar do que em épocas anteriores. Vários métodos e técnicas foram desenvolvidos para proteger a confidencialidade, integridade, autenticidade e autorização de acesso. Menos atenção tem sido dada para detectar de forma estruturada onde e por quem a segurança da informação pode estar sendo comprometida. Neste projecto propõe-se a usar alguns métodos para marcar e controlar o uso de informações confidenciais no interior das instalações da empresa. Devido à natureza de algumas das técnicas utilizadas, algumas preocupações com a privacidade podem ser levantadas, mas como este é apenas para uso com dados sensíveis que pertencem à companhia essas preocupações poderão ser devidamente contra-argumentadas. Neste trabalho o tipo de documento considerado é o da Microsoft Office Word 2007 que implementa um novo tipo de ficheiro que tem uma natureza aberta ao contrário de versões anteriores de formato binário e fechado. Uma vez que esta é uma ferramenta amplamente utilizada dentro das corporações, justifica-se assim a sua escolha. Outros casos possíveis seriam os ficheiros do Excel e do PowerPoint que também têm uma arquitectura aberta a partir do Office 2007. Fora do mundo do Office, o caso mais significativo são os ficheiros PDF, mas estes requerem uma abordagem completamente diferente devido a uma estrutura também ela muito diferente. Além disso existe para o Office algumas ferramentas que facilitam a implementação da solução. Esta solução destina-se a uma grande empresa de telecomunicações que preenche as considerações iniciais - que comercializa produtos imateriais - Serviços e media possui um considerável volume de propriedade intelectual devido ao seu contínuo apoio na investigação sobre novos produtos e serviços. Esta solução no entanto poderia ser concretizada em qualquer outro tipo de empresa que tenha o seu funcionamento apoiado em dados digitais, como é o caso da grande maioria das empresas de hoje em dia.The present acceleration of the evolution of products and services makes information one of the world’s greatest assets. The more "immaterial" is the product such as services, intellectual property (IP) and media the more important is the matter. This accelerated life cycle makes the exposure of information to threats bigger than in the past, when the slower development permitted a tighter control over the information itself and its exposure. Another contributing factor is the involvement of an increasing number of company collaborators in accessing sensitive information. The increasing digitalization of data and connectivity between entities in a connected and accelerated world makes the security of the information a more complicated subject to deal with than in previous eras. Several methods and techniques have been developed to protect information’s confidentiality, integrity, authenticity and access authorization. Less attention has been given to detect in a structured way where or by whom the information may be leaking out of the company. This project aims to contribute to the solution of this problem of detecting the leakage of sensitive data. For that purpose, the project proposes methods to tag and control the use of sensitive information within the company premises. Due to the nature of some of the techniques proposed, privacy concerns may rise, but since the techniques are for use only with sensitive data that belongs to the company, those concerns are possible to be argued with. In this work the considered document type is Microsoft Office Word 2007 which implements a new file type that has an open nature as opposed to previous binary and closed format versions. Since this is a widely used tool within corporations its choice is well justified. Other possible cases would be Excel and PowerPoint file types that also have an open architecture starting from Office 2007. Outside of the world of Microsoft Office the most prominent case is the pdf files, but those will require a completely different approach as the Office files have some well built tools to implement the intended features. This solution is aimed at a major telecom company that has the concerns mentioned above: it commercializes intangible products - services and media and owns considerable IP due to its ongoing support for the investigation of new products and services. This could nevertheless be deployed in any other type of company that supports its operation by way of electronic data, as is the case in the large majority of today’s enterprises
    corecore