
    Android Malware Clustering through Malicious Payload Mining

    Clustering has been well studied as an effective triage method for desktop malware analysis. Conventional similarity-based clustering techniques, however, cannot be immediately applied to Android malware analysis due to the excessive use of third-party libraries in Android application development and the widespread use of repackaging in malware development. We design and implement an Android malware clustering system that iteratively mines malicious payloads and checks whether malware samples share the same version of a malicious payload. Our system uses a hierarchical clustering technique and an efficient bit-vector format to represent Android apps. Experimental results demonstrate that our clustering approach achieves a precision of 0.90 and a recall of 0.75 on the Android Genome malware dataset, and an average precision of 0.98 and recall of 0.96 with respect to manually verified ground truth.
    Comment: Proceedings of the 20th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2017)
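    As a rough illustration of the approach's final step, the sketch below (our own, not the authors' code) represents each app as a Boolean bit-vector of payload features and groups apps by agglomerative clustering on Jaccard distance; the payload mining that would produce these features is assumed to have already happened, and all vectors are hypothetical.

```python
# Minimal sketch: cluster apps by Jaccard distance over payload bit-vectors.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical bit-vectors: one row per app, one column per payload feature.
apps = np.array([
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 1],
], dtype=bool)

# Pairwise Jaccard distances between the bit-vectors.
dists = pdist(apps, metric="jaccard")

# Average-linkage hierarchical clustering, cut at a chosen distance threshold.
tree = linkage(dists, method="average")
labels = fcluster(tree, t=0.5, criterion="distance")
print(labels)  # e.g. [1 1 2 2]: apps sharing payload features cluster together
```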

    Bytewise Approximate Matching: The Good, The Bad, and The Unknown

    Hash functions are established and well known in digital forensics, where they are commonly used for proving integrity and for file identification (i.e., hash all files on a seized device and compare the fingerprints against a reference database). With respect to the latter operation, however, an active adversary can easily defeat this approach because traditional hashes are designed to be sensitive to any alteration of the input; the output changes completely if a single bit is flipped. Researchers therefore developed approximate matching, a newer and less prominent area conceived as a more robust counterpart to traditional hashing. Since its conception, the community has constructed numerous algorithms, extensions, and additional applications for this technology, and is still working on novel concepts to improve the status quo. In this survey article, we conduct a high-level review of the existing literature from a non-technical perspective and summarize the existing body of knowledge in approximate matching, with special focus on bytewise algorithms. Our contribution gives researchers and practitioners an overview of the state of the art of approximate matching so that they may understand the capabilities and challenges of the field. In short, we present the terminology, use cases, classification, requirements, testing methods, algorithms, applications, and a list of primary and secondary literature.
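    To make the motivating limitation concrete, here is a toy demonstration (not any of the surveyed algorithms): flipping a single bit completely changes a SHA-256 fingerprint, while a naive byte n-gram similarity, standing in for a bytewise approximate matching digest, still reports the two inputs as near-identical.

```python
# Toy contrast between a cryptographic hash and bytewise similarity.
import hashlib

def ngrams(data: bytes, n: int = 4) -> set:
    """Set of byte n-grams, a crude stand-in for a similarity digest."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

original = b"The quick brown fox jumps over the lazy dog" * 10
tampered = bytearray(original)
tampered[0] ^= 0x01  # flip a single bit

print(hashlib.sha256(original).hexdigest()[:16])
print(hashlib.sha256(bytes(tampered)).hexdigest()[:16])  # completely different

a, b = ngrams(original), ngrams(bytes(tampered))
print(len(a & b) / len(a | b))  # Jaccard similarity stays close to 1.0
```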

    Energy Efficient Downstream Communication in Wireless Sensor Networks

    This dissertation studies the problem of energy-efficient downstream communication in Wireless Sensor Networks (WSNs). First, we present Opportunistic Source Routing (OSR), a scalable, reliable, and energy-efficient downward routing protocol for individual node actuation in data-collection WSNs. OSR introduces opportunistic routing into traditional source routing, based on the parent set of a node's upward routing in data collection, to cope with the drastic link dynamics of low-power and lossy WSNs. We devise a novel adaptive Bloom filter mechanism to effectively and efficiently encode a downward source-route in OSR, significantly reducing the length of the source-route field in the packet header. OSR scales to very large WSN deployments, since each resource-constrained node stores only the set of its direct children. The probabilistic nature of the Bloom filter passively enables opportunistic routing; upon a delivery failure at any hop along the downward path, OSR actively performs opportunistic routing to bypass the obsolete or bad link. Evaluations in both simulations and real-world testbed experiments demonstrate that OSR significantly outperforms existing approaches in scalability, reliability, and energy efficiency. Second, we propose a mobile code dissemination tool for heterogeneous WSN deployments operating over low-power links. Evaluations in lab experiments and on a real-world WSN testbed show that our tool reduces the laborious work of reprogramming nodes to update their applications. Finally, we present an empirical study of the network dynamics of an outdoor heterogeneous WSN deployment and devise a benchmark data suite. The network dynamics analysis covers link-level, topological, and temporal characteristics. The unique features of the benchmark data suite include full path information and our approach to filling in missing paths based on the principles of the routing protocol.
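    The following is a minimal sketch of the idea behind the Bloom-filter route encoding, not OSR's actual wire format; the filter size, hash count, and node IDs are made up. The sink folds every hop of the downward route into one bit-vector, and each forwarding node tests its direct children for membership; false positives are possible, which is precisely what makes the forwarding opportunistic.

```python
# Sketch of Bloom-filter source-route encoding (hypothetical parameters).
import hashlib

FILTER_BITS = 64  # hypothetical header budget
NUM_HASHES = 3

def _positions(node_id: int) -> list:
    """Derive NUM_HASHES bit positions for a node id."""
    digest = hashlib.sha1(node_id.to_bytes(4, "big")).digest()
    return [digest[k] % FILTER_BITS for k in range(NUM_HASHES)]

def encode_route(route: list) -> int:
    """Fold every hop of the downward source-route into one bit-vector."""
    bits = 0
    for node_id in route:
        for pos in _positions(node_id):
            bits |= 1 << pos
    return bits

def maybe_next_hop(bits: int, children: list) -> list:
    """Children that *may* be on the route (no false negatives)."""
    return [c for c in children
            if all(bits >> p & 1 for p in _positions(c))]

bloom = encode_route([17, 42, 99])          # sink -> 17 -> 42 -> 99
print(maybe_next_hop(bloom, [42, 7, 13]))   # node 17 forwards toward 42
```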

    Energy-efficient and cost-effective reliability design in memory systems

    Reliability of memory systems is an increasing concern as memory density grows, cell dimensions shrink, and new memory technologies move close to commercial use. Meanwhile, memory power efficiency has become another first-order consideration in memory system design. Conventional reliability schemes use ECC (Error Correcting Code) and EDC (Error Detecting Code) to support error correction and detection in memory systems, putting a rigid constraint on memory organization and incurring significant overhead in power efficiency and area cost. This dissertation studies energy-efficient and cost-effective reliability design in both cache and main memory systems. It first explores the generic approach of embedded ECC in main memory systems to provide a low-cost and efficient reliability design. A scheme called E3CC (Enhanced Embedded ECC) is proposed for sub-ranked low-power memories to alleviate the reliability concern. Within this design, we propose a novel BCRM (Biased Chinese Remainder Mapping) to resolve the address mapping issue in page-interleaving schemes. The proposed BCRM scheme provides an opportunity for building flexible reliability systems, which favors consumer-level computers aiming to save power. Within the proposed E3CC scheme, we further explore address mapping schemes at the DRAM device level to provide SEP (Selective Error Protection), mapping memory requests to their designated regions. All the proposed address mapping schemes are based on modulo operations; this thesis shows them to be efficient, flexible, and applicable to a variety of system requirements. Additionally, we propose the Free ECC reliability design for compressed caches. It stores ECC in the unused fragments of the compressed cache, which not only reduces chip overhead but also improves cache utilization and power efficiency. In this design, we propose an efficient convergent cache allocation scheme that organizes compressed data blocks more effectively than existing schemes. This new design makes compressed caches an increasingly viable choice for processors with high reliability requirements. Furthermore, we propose a novel system-level memory error detection scheme based on memory integrity checking, called MemGuard. It uses memory log hashes to ensure, with strong probability, that the memory read log and write log match each other. It is much stronger than conventional protection in error detection and incurs little hardware cost, no storage overhead, and little power overhead. It puts no constraints on memory organization and adds no major complication to processor or operating system design. The thesis shows that the MemGuard reliability design is simple, robust, and efficient.
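    For intuition, here is a sketch of plain Chinese Remainder address mapping, the idea BCRM builds on (the "biased" refinement from the thesis is not reproduced, and the geometry is hypothetical): when the bank count and row count are co-prime, taking the address modulo each yields a collision-free mapping, so consecutive addresses spread across banks without aliasing.

```python
# Sketch of Chinese Remainder address mapping with a toy geometry.
from math import gcd

NUM_BANKS = 7          # hypothetical sub-ranked geometry
ROWS_PER_BANK = 16     # co-prime with NUM_BANKS
assert gcd(NUM_BANKS, ROWS_PER_BANK) == 1

def crm_map(addr: int) -> tuple:
    """Map a linear address to a (bank, row) pair via modulo operations."""
    return addr % NUM_BANKS, addr % ROWS_PER_BANK

# Bijectivity check: by the Chinese Remainder Theorem, no two addresses
# in the range collide.
pairs = {crm_map(a) for a in range(NUM_BANKS * ROWS_PER_BANK)}
assert len(pairs) == NUM_BANKS * ROWS_PER_BANK
print(crm_map(0), crm_map(1), crm_map(8))  # (0, 0) (1, 1) (1, 8)
```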

    Improving Cloud System Reliability Using Autonomous Agent Technology

    Cloud computing platforms provide efficient and flexible ways to offer services and computation facilities to users. Service providers acquire resources according to their requirements and deploy their services in the cloud; service consumers access those services over the network. In cloud computing, virtualization techniques allow cloud providers to provision computation and storage resources according to users' requirements. Reliability, however, is an important factor in measuring the performance of a virtualized cloud computing platform, and it comprises both usability and availability. Usability means that the platform provides functional and easy-to-use computation resources to users; to ensure it, cloud providers must maintain and deploy appropriate configurations and management policies. Availability means that the platform provides stable and reliable computation resources to users. My research concentrates on improving the usability and availability of cloud computing platforms. I propose a customized agent-based reliability monitoring framework to increase the reliability of cloud computing.
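    As a loose illustration of what such an agent-based monitor might look like (hypothetical names and thresholds; not the framework from the dissertation), the sketch below tracks per-VM agent heartbeats and flags virtual machines whose agents have gone quiet.

```python
# Sketch of heartbeat-based reliability monitoring (hypothetical design).
import time

HEARTBEAT_INTERVAL = 5.0   # seconds between expected agent reports
MISSED_LIMIT = 3           # consecutive misses before a VM is flagged

class ReliabilityMonitor:
    def __init__(self):
        self.last_seen = {}   # vm_id -> timestamp of last heartbeat

    def heartbeat(self, vm_id: str) -> None:
        """Called by the per-VM agent on every report."""
        self.last_seen[vm_id] = time.monotonic()

    def unhealthy(self) -> list:
        """VMs whose agents have been silent for too long."""
        deadline = time.monotonic() - HEARTBEAT_INTERVAL * MISSED_LIMIT
        return [vm for vm, ts in self.last_seen.items() if ts < deadline]

monitor = ReliabilityMonitor()
monitor.heartbeat("vm-01")
monitor.heartbeat("vm-02")
# After MISSED_LIMIT intervals without reports, unhealthy() would return
# the silent VM ids so the platform can restart or migrate them.
print(monitor.unhealthy())  # [] immediately after both heartbeats
```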

    Study to gather evidence on the working conditions of platform workers VT/2018/032 Final Report 13 December 2019

    Platform work is a form of work in which an online platform intermediates between platform workers, who provide services, and paying clients. Platform work appears to be growing in size and importance. This study explores platform work in the EU28, Norway, and Iceland, with a focus on the challenges it presents to working conditions and social protection, and on how countries have responded through top-down actions (e.g. legislation and case law) and bottom-up actions (e.g. collective agreements, actions by platform workers or platforms). This national mapping is accompanied by a comparative assessment of selected EU legal instruments, mostly in the social area. Each instrument is assessed for personal and material scope to determine how it might address such challenges. Four broad legal domains relevant to platform work are examined in stand-alone reflection papers. Together, the national mapping and legal analysis support a gap analysis, which aims to indicate where further action on platform work would be useful and what form such action might take.

    A study on approximate matching for similarity search: techniques, limitations, and improvements for digital forensic investigations

    Advisor: Marco Aurélio Amaral Henriques
    Doctoral thesis - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação
    Abstract: Digital forensics is a branch of Computer Science aimed at investigating and analyzing electronic devices in the search for crime evidence. With the rapid increase in data storage capacity, automated procedures are required to handle the massive volume of data available nowadays, especially in forensic investigations, in which time is a scarce resource.
    One possible approach to making the process more efficient is the Known File Filter (KFF) technique, where a list of objects of interest is used to reduce/separate data for analysis. Holding a database of hashes of such objects, the examiner performs lookups for matches against the target device under investigation. However, due to the limitations of cryptographic hash functions (their inability to detect similar objects), new methods have been designed based on Approximate Matching (AM). These appear as suitable candidates for this process because of their ability to identify similarity at the byte level in a very efficient way, by creating and comparing compact representations of objects (a.k.a. digests). In this work, we present Approximate Matching functions. We show some of the best-known AM tools and present Similarity Digest Search Strategies (SDSS), capable of performing similarity search (using AM) more efficiently, especially when dealing with large data sets. We perform a detailed analysis of current SDSS approaches and, given that current strategies only work with a few particular AM tools, we propose a new strategy based on a different tool that has good characteristics for forensic investigations. Furthermore, we address some limitations of current AM tools regarding the similarity detection process, where many matches flagged as similar are in fact false positives; the tools are usually misled by common blocks (pieces of data common to many different objects). By removing such blocks from AM digests, we obtain significant improvements in the detection of similar data. We also present a detailed theoretical analysis of the capabilities of the sdhash AM tool and propose improvements to its comparison function, yielding a more precise similarity measure (score). Lastly, new applications of AM are presented and analyzed: one for fast file identification based on data samples, and another for efficient fingerprint identification. We hope that practitioners in the forensics field and other related areas will benefit from our studies on AM when solving their problems.
    Doctorate in Computer Engineering (Doutor em Engenharia Elétrica), Universidade Estadual de Campinas. Funding: CAPES, 23038.007604/2014-69
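    To illustrate the common-block problem and the proposed remedy in miniature (a toy model, not sdhash or the thesis's algorithm), the sketch below shows how chunks shared by every file inflate the similarity score between unrelated files, and how filtering those chunks out of the digests removes the false positive while a genuinely similar pair still scores high.

```python
# Toy illustration of common-block filtering in similarity digests.
from collections import Counter

def chunks(data: bytes, n: int = 8) -> set:
    """Non-overlapping byte chunks, a crude stand-in for an AM digest."""
    return {data[i:i + n] for i in range(0, len(data) - n + 1, n)}

header = b"%PDF-1.4 common boilerplate " * 4   # block shared by every file
files = {
    "a": header + b"holiday photos report, final version 2024",
    "b": header + b"holiday photos report, final version 2025",
    "c": header + b"completely unrelated malware dropper notes",
}
digests = {name: chunks(data) for name, data in files.items()}

# Chunks occurring in every file stand in for corpus-wide common blocks.
freq = Counter(c for d in digests.values() for c in d)
common = {c for c, n in freq.items() if n == len(files)}
filtered = {name: d - common for name, d in digests.items()}

def jaccard(x: set, y: set) -> float:
    return len(x & y) / len(x | y) if x | y else 0.0

# (a, b) stays high after filtering; (a, c) drops to ~0.
for u, v in (("a", "b"), ("a", "c")):
    print(u, v,
          f"raw={jaccard(digests[u], digests[v]):.2f}",
          f"filtered={jaccard(filtered[u], filtered[v]):.2f}")
```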