32 research outputs found

    On Designing Resilient Location-Privacy Obfuscators

    The success of location-based services is growing together with the diffusion of GPS-equipped smart devices. As a consequence, privacy concerns are rising year by year. Location privacy is becoming a major interest in both research and industry, and many solutions have been proposed for it. One of the simplest and most flexible approaches is obfuscation, in which the precision of location data is artificially degraded before it is disclosed. In this paper, we present an obfuscation approach capable of dealing with measurement imprecision, multiple levels of privacy, untrusted servers and adversarial knowledge of the map. We estimate its resistance against statistics-based deobfuscation attacks, and we improve it by means of three techniques, namely extreme vectors, enlarge-and-scale and hybrid vectors.
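    As a rough illustration of what "artificially degrading precision" can mean, the sketch below snaps a coordinate to the centre of a grid cell. This is a generic coarsening step chosen for brevity, not the circular-area technique of the paper; the function name `obfuscate` and the parameter `cell_deg` are assumptions made for this example.

```python
import math


def _snap(value: float, cell_deg: float) -> float:
    """Return the centre of the grid cell of width cell_deg that contains value."""
    return (math.floor(value / cell_deg) + 0.5) * cell_deg


def obfuscate(lat: float, lon: float, cell_deg: float = 0.01) -> tuple[float, float]:
    """Coarsen a position: every point inside one cell maps to the same output.

    cell_deg is a hypothetical privacy knob: a larger cell hides more,
    at the cost of less useful location data.
    """
    return _snap(lat, cell_deg), _snap(lon, cell_deg)


if __name__ == "__main__":
    # Two nearby positions become indistinguishable after obfuscation.
    print(obfuscate(45.4642, 9.1912))
    print(obfuscate(45.4689, 9.1987))
```

    A larger `cell_deg` corresponds to a higher privacy level in this toy model; it does not address the statistical deobfuscation attacks the paper studies.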

    Opaque Predicate Detection by Abstract Interpretation

    Code obfuscation and software watermarking are well-known techniques designed to prevent the illegal reuse of software. Code obfuscation prevents malicious reverse engineering, while software watermarking protects code from piracy. An interesting class of algorithms for code obfuscation and software watermarking relies on the insertion of opaque predicates. It turns out that attackers based on a dynamic or a hybrid static-dynamic approach are either imprecise or time-consuming in eliminating opaque predicates. We present an abstract interpretation-based methodology for removing opaque predicates from programs. Abstract interpretation provides the right framework for proving the correctness of our approach, together with a general methodology for designing efficient attackers for a relevant class of opaque predicates. Experimental evaluations show that abstract interpretation-based attacks significantly reduce the time needed to eliminate opaque predicates.
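    For readers unfamiliar with the term, the following minimal sketch shows one textbook opaque predicate and how a parity abstraction can prove it constant. The parity domain and all function names are assumptions made for illustration; the abstract does not state which abstract domain the paper actually uses.

```python
EVEN, ODD = "even", "odd"


def opaque_true(x: int) -> bool:
    """x * (x + 1) is a product of two consecutive integers, hence always even."""
    return (x * (x + 1)) % 2 == 0


def guarded(x: int) -> str:
    """An obfuscator inserts the predicate so the decoy branch looks reachable."""
    if opaque_true(x):
        return "real"
    return "decoy"  # dead code: never executed for any integer x


def abs_add_one(p: str) -> str:
    """Abstract successor: adding 1 flips parity."""
    return ODD if p == EVEN else EVEN


def abs_mul(p: str, q: str) -> str:
    """Abstract product: even if either factor is even."""
    return EVEN if EVEN in (p, q) else ODD


def predicate_is_always_true() -> bool:
    """Case-split on the parity of x; the two cases cover every integer."""
    return all(abs_mul(p, abs_add_one(p)) == EVEN for p in (EVEN, ODD))


if __name__ == "__main__":
    print(predicate_is_always_true())
```

    The abstract check needs two cases instead of testing concrete inputs, which hints at why a static approach can be faster than dynamic testing for this class of predicates.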

    Hardware and software fingerprinting of mobile devices

    This dissertation presents novel and practical algorithms to identify the software and hardware components on mobile devices. In particular, we make significant contributions in two challenging areas: library fingerprinting, to identify third-party software libraries, and device fingerprinting, to identify individual hardware components. Our work has significant implications for the privacy and security of mobile platforms. Software-based library fingerprinting can be used to detect vulnerable libraries and uncover large-scale data collection activities. We develop a novel Android library fingerprinting tool, LibID, to reliably identify specific versions of in-app third-party libraries. LibID is more effective against code obfuscation than prior art. When comparing LibID with other tools in identifying the correct library version using obfuscated F-Droid apps, LibID achieves an F1 score of more than 0.5 in all cases while prior work is below 0.25. We also demonstrate the utility of LibID by detecting the use of a vulnerable version of the OkHttp library in nearly 10% of the 3,958 popular apps on the Google Play Store. Hardware-based device fingerprinting allows apps and websites to invade user privacy by tracking user activity online as the user moves between apps or websites. In particular, we present a new type of device fingerprinting attack, the factory calibration fingerprinting attack, that recovers embedded per-device factory calibration data from motion sensors in a smartphone. We investigate the calibration behaviour of each sensor and show that the calibration fingerprint is fast to generate, does not change over time or after a factory reset, and can be obtained without any special user permissions. We estimate the entropy of the calibration fingerprint and find the fingerprint is very likely to be globally unique for iOS devices (~67 bits of entropy for iPhone 6S) and recent Google Pixel devices (~57 bits of entropy for Pixel 4/4 XL). 
By comparison, the fingerprint generated by previous work has at most 13 bits of entropy. Following our disclosures, Apple deployed a fix in iOS 12.2 and Google in Android 11. Both code obfuscation and factory calibration help to hide software and hardware idiosyncrasies from third parties, but this dissertation demonstrates that reliable software and hardware fingerprints can still be generated given sufficient knowledge and a suitable approach. Our work has significant practical implications and can be used to improve platform security and protect user privacy.
    China Scholarship Council; The Boeing Company; Microsoft Research
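    To give a feel for why roughly 67 bits of entropy makes a fingerprint "very likely to be globally unique" while 13 bits does not, the sketch below estimates the expected number of device pairs that would share a fingerprint. It assumes fingerprints are uniformly distributed over 2^bits values, which is a simplification, and the population of one billion devices is a hypothetical figure chosen for illustration.

```python
def expected_colliding_pairs(n_devices: int, entropy_bits: float) -> float:
    """Expected number of device pairs with identical fingerprints.

    Assumes each fingerprint is drawn uniformly from 2**entropy_bits values,
    so any given pair collides with probability 1 / 2**entropy_bits.
    """
    pairs = n_devices * (n_devices - 1) / 2
    return pairs / 2 ** entropy_bits


if __name__ == "__main__":
    population = 10 ** 9  # hypothetical number of devices
    for bits in (13, 57, 67):
        print(bits, expected_colliding_pairs(population, bits))
```

    Under these assumptions the 67-bit case yields far fewer than one expected collision across the whole population, whereas the 13-bit case yields an enormous number of collisions, so a 13-bit fingerprint can only narrow a device down to a large group.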

    Security and trust in cloud computing and IoT through applying obfuscation, diversification, and trusted computing technologies

    Cloud computing and the Internet of Things (IoT) are widespread and commonly used technologies. The advanced services offered by cloud computing have made it a highly demanded technology. Enterprises and businesses increasingly rely on the cloud to deliver services to their customers. The prevalent use of the cloud means that more data is stored outside the organization's premises, which raises concerns about the security and privacy of the stored and processed data. This highlights the importance of effective practices for securing the cloud infrastructure. The number of IoT devices is growing rapidly, and the technology is being employed in a wide range of sectors including smart healthcare, industry automation, and smart environments. These devices collect and exchange a great deal of information, some of which may contain critical and personal data of the device's users. Hence, it is highly important to protect the data collected and shared over the network; nevertheless, studies indicate that attacks on these devices are increasing, while a high percentage of IoT devices lack proper security measures to protect the devices, the data, and the privacy of the users. In this dissertation, we study the security of cloud computing and IoT and propose software-based security approaches, supported by hardware-based technologies, to provide robust measures for enhancing the security of these environments. To achieve this goal, we use obfuscation and diversification as promising software security techniques. Code obfuscation protects the software from malicious reverse engineering, and diversification mitigates the risk of large-scale exploits. We study trusted computing and Trusted Execution Environments (TEE) as the hardware-based security solutions. The Trusted Platform Module (TPM) provides security and trust through a hardware root of trust, and assures the integrity of a platform. 
We also study Intel SGX, a TEE solution that guarantees the integrity and confidentiality of the code and data loaded into its protected container, the enclave. More precisely, by obfuscating and diversifying the operating systems and APIs of IoT devices, we secure them at the application level, and by obfuscating and diversifying the communication protocols, we protect the data exchanged between them at the network level. To secure cloud computing, we apply obfuscation and diversification techniques to the cloud software on the client side. For an enhanced level of security, we employ the hardware-based security solutions TPM and SGX. In addition to security, these solutions ensure layered trust from the hardware up to the application. As a result of this PhD research, this dissertation addresses a number of security risks targeting IoT and cloud computing through the delivered publications and presents a brief outlook on future research directions.

    Revisiting Binary Code Similarity Analysis using Interpretable Feature Engineering and Lessons Learned

    Binary code similarity analysis (BCSA) is widely used for diverse security applications such as plagiarism detection, software license violation detection, and vulnerability discovery. Despite the surging research interest in BCSA, it is challenging to perform new research in this field for several reasons. First, most existing approaches focus only on the end results, namely, increasing the success rate of BCSA, by adopting uninterpretable machine learning. Moreover, they use their own benchmarks, sharing neither the source code nor the entire dataset. Finally, researchers often use different terminologies or even use the same technique without citing the previous literature properly, which makes it difficult to reproduce or extend previous work. To address these problems, we take a step back from the mainstream and contemplate fundamental research questions for BCSA. Why does a certain technique or a feature show better results than the others? Specifically, we conduct the first systematic study on the basic features used in BCSA by leveraging interpretable feature engineering on a large-scale benchmark. Our study reveals various useful insights on BCSA. For example, we show that a simple interpretable model with a few basic features can achieve a result comparable to that of recent deep learning-based approaches. Furthermore, we show that the way we compile binaries or the correctness of underlying binary analysis tools can significantly affect the performance of BCSA. Lastly, we make all our source code and benchmark public and suggest future directions in this field to help further research.
    Comment: 22 pages, under revision to Transactions on Software Engineering (July 2021)

    Use of Cryptography in Malware Obfuscation

    Malware authors often use cryptographic tools such as XOR encryption and block ciphers like AES to obfuscate part of the malware to evade detection. Use of cryptography may give the impression that these obfuscation techniques have some provable guarantees of success. In this paper, we take a closer look at the use of cryptographic tools to obfuscate malware. We first find that most techniques are easy to defeat (in principle), since the decryption algorithm and the key are shipped within the program. In order to clearly define an obfuscation technique's potential to evade detection, we propose a principled definition of malware obfuscation, and then categorize instances of malware obfuscation that use cryptographic tools into those which evade detection and those which are detectable. We find that schemes that are hard to de-obfuscate necessarily rely on a construct based on environmental keying. We also show that cryptographic notions of obfuscation, e.g., indistinguishability and virtual black box obfuscation, may not guarantee evasion of detection under our model. However, they can be used in conjunction with environmental keying to produce hard to de-obfuscate versions of programs.
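    The contrast between a shipped key and environmental keying can be sketched with a harmless string. This is an illustrative toy under stated assumptions, not the paper's construction: the names `xor_bytes`, `env_key` and `try_unlock`, the host identifier, and the use of SHA-256 to derive and check the key are all choices made for this example.

```python
import hashlib
from itertools import cycle
from typing import Optional


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR; applying it twice with the same key restores data."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


PAYLOAD = b"harmless demo string"

# Case 1: the key ships inside the program, so anyone holding the program
# can decrypt the blob without running it on any particular machine.
EMBEDDED_KEY = b"k3y"
BLOB_EMBEDDED = xor_bytes(PAYLOAD, EMBEDDED_KEY)


def analyst_recovers() -> bytes:
    return xor_bytes(BLOB_EMBEDDED, EMBEDDED_KEY)


# Case 2: environmental keying. The key is derived from a value that exists
# only in the intended environment (a hypothetical host identifier here).
def env_key(env_value: str) -> bytes:
    return hashlib.sha256(env_value.encode()).digest()


_TARGET = "host-1234"  # used only to build the demo; not shipped in practice
BLOB_ENV = xor_bytes(PAYLOAD, env_key(_TARGET))
KEY_CHECK = hashlib.sha256(env_key(_TARGET)).hexdigest()  # shipped with the blob


def try_unlock(env_value: str) -> Optional[bytes]:
    """Decrypt only if the locally derived key matches the shipped check value."""
    key = env_key(env_value)
    if hashlib.sha256(key).hexdigest() != KEY_CHECK:
        return None
    return xor_bytes(BLOB_ENV, key)
```

    In the second case an analyst who holds only `BLOB_ENV` and `KEY_CHECK` must guess the environment value, which is the property the abstract attributes to schemes that are hard to de-obfuscate.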

    Techniques for advanced android malware triage

    International Mention in the doctoral degree (Mención Internacional en el título de doctor).
    Android is the leading smartphone operating system by a wide margin. Statistics show that 88% of all smartphones sold to end users in the second quarter of 2018 ran Android. Regardless of the operating system running on a smartphone, most of the device's functionality is offered through applications. There are currently over 2 million apps on the official Google store alone, known as Google Play. This huge market with billions of users tempts attackers to develop and distribute malicious apps (or malware). Mobile malware has risen explosively since 2009. Symantec reported a 54% increase in new mobile malware variants in 2017 compared to the previous year. Additionally, the growth of black markets (that is, unofficial app download platforms) has provided more incentive for profit-driven malware. This rise has affected Android malware as well, since only 20% of devices run the newest major version of Android according to Symantec's 2018 report. Android continued to be the most targeted platform, with the largest number of attacks, in 2015. After that year, attacks against the Android platform slowed for the first time as attackers faced improved security architectures, although Android remains the most appealing target OS for attackers. Moreover, advanced types of Android malware have been found that make extensive use of anti-analysis techniques to evade static or dynamic analysis. To address the security and privacy concerns of complex Android malware, this dissertation focuses on three main objectives. First, we propose a lightweight yet efficient method to identify risky Android applications. Next, we present a precise approach to characterize Android malware based on its malicious behavior. Finally, we propose an adaptive learning system to address the security concerns of obfuscation in Android malware. 
Identifying potentially dangerous and risky applications is an important step in Android malware analysis. To this end, we develop a triage system to rank applications based on their potential risk. Our approach, called TriFlow, relies on static features which are quick to obtain, with information flows being of particular interest. An information flow exists when a piece of data is received or produced by some function or system call and travels through the application's logic until it reaches another function. TriFlow combines a probabilistic model to predict the existence of information flows with a metric of how significant a flow is in benign and malicious apps. Based on this, TriFlow provides a score for each application that can be used to prioritize analysis. It also provides the analysts with an explanatory report of the associated risk. Our tool can also be used as a complement to computationally expensive static and dynamic analysis tools. Another important step towards Android malware analysis lies in accurate characterization. Labeling Android malware is challenging yet crucially important, as it helps to identify upcoming malware samples and threats. A key challenge is that different researchers and anti-virus vendors assign labels using their own criteria, and it is not known to what extent these labels are aligned with the apps' real behavior. Based on this, we propose a new behavioral characterization method for Android apps based on their extracted information flows. As information flows can be used to track why and how apps use specific pieces of information, a flow-based characterization provides a relatively easy-to-interpret summary of the malware sample's behavior. Not all Android malware is easy to analyze, due to advanced and easy-to-apply anti-analysis techniques that are available nowadays. Obfuscation is the most common anti-analysis technique that Android malware uses to evade detection. Obfuscation techniques modify an app's source (or machine) code in order to make it more difficult to analyze. This is typically applied to protect intellectual property in benign apps, or to hinder the process of extracting actionable information in the case of malware. 
Since malware analysis often requires considerable resource investment, detecting the particular obfuscation technique used may help in choosing the right analysis tools, thus leading to some savings. Therefore, we propose AndrODet, a mechanism to detect three popular types of obfuscation in Android applications, namely identifier renaming, string encryption, and control flow obfuscation. AndrODet leverages online learning techniques, making it suitable for resource-limited environments that need to operate continuously. We compare our results with a batch learning algorithm using a dataset of 34,962 malware and benign apps. Experimental results show that online learning approaches are not only able to compete with batch learning methods in terms of accuracy, but also save a significant amount of time and computational resources. Finally, we present a number of open research directions based on the outcome of this thesis.
    Omid Mirzaei is a Ph.D. candidate in the Computer Security Lab (COSEC) at the Department of Computer Science and Engineering of Universidad Carlos III de Madrid (UC3M). His Ph.D. is funded by the Community of Madrid and the European Union through the research project CIBERDINE (Ref. S2013/ICE-3095).
    Programa Oficial de Doctorado en Ciencia y Tecnología Informática. Chair: Gregorio Martínez Pérez; Secretary: Pedro Peris López; Member: Pablo Picazo Sánche
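    The difference between online and batch learning mentioned for AndrODet can be sketched with a perceptron that updates after each sample instead of retraining on a stored dataset. The perceptron is a stand-in chosen for brevity, since the abstract does not name the learner used; the single feature (a made-up "share of very short identifiers" as a hint of identifier renaming) is hypothetical as well.

```python
from typing import List, Sequence


class OnlinePerceptron:
    """Binary classifier updated one sample at a time (label 1 = obfuscated)."""

    def __init__(self, n_features: int, lr: float = 1.0) -> None:
        self.w: List[float] = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x: Sequence[float]) -> int:
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else 0

    def partial_fit(self, x: Sequence[float], y: int) -> None:
        """Adjust weights only when the current sample is misclassified."""
        err = y - self.predict(x)
        if err:
            self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * err


if __name__ == "__main__":
    # Toy stream: one hypothetical feature per app, no stored training set.
    stream = [([0.9], 1), ([0.1], 0)] * 20
    model = OnlinePerceptron(n_features=1)
    for features, label in stream:
        model.partial_fit(features, label)
    print(model.predict([0.9]), model.predict([0.1]))
```

    Because each update touches only the current sample, memory use stays constant as new apps arrive, which matches the resource-limited, continuously operating setting described in the abstract.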