18 research outputs found

    Detecting phishing e-mails using text mining and features analysis

    Malicious actors use phishing e-mails to obtain sensitive information from a victim, or to deceive or blackmail them. An inattentive or uninformed user may often fail to recognise whether an e-mail comes from an authentic sender or is a scam. We therefore sought to develop a method that can effectively and efficiently detect phishing e-mails and report them to the user. We analyse all the information available on receipt of the e-mail, both statically and by performing text mining on the content and subject of the e-mail. In addition to indicating whether an e-mail is suspicious, the method reports the degree of confidence with which that judgement is made and highlights the aspects of the e-mail that are characteristic of phishing. Excellent results were achieved with our methodology, reaching 99.2% accuracy.

    On the Dissection of Evasive Malware

    Complex malware samples feature measures to impede automatic and manual analyses, making their investigation cumbersome. While automatic characterization of malware benefits from recently proposed designs for passive monitoring, the subsequent dissection process still sees human analysts struggling with adversarial behaviors, many of which also closely resemble those studied for automatic systems. This gap affects the day-to-day analysis of complex samples, and researchers have not yet attempted to bridge it. We make a first step down this road by proposing a design that can reconcile transparency requirements with the manipulation capabilities required for dissection. Our open-source prototype BluePill (i) offers a customizable execution environment that remains stealthy when analysts intervene to alter instructions and data or run third-party tools, (ii) is extensible to counteract newly encountered anti-analysis measures using insights from the dissection, and (iii) can accommodate program analyses that aid analysts, as we explore for taint analysis. On a set of highly evasive samples, BluePill proved as stealthy as commercial sandboxes while offering new intervention and customization capabilities for dissection.

    Static analysis of PE files using neural network techniques for a pocket tool

    The continuous growth in the number of malware instances has posed a serious challenge to the security of computer systems; hence, malware detection is a key factor in securing various devices, from personal devices to large servers. Static analysis allows for the extraction of multiple file characteristics belonging to different categories of information without incurring the overhead of dynamic analysis and the risks associated with it. In this paper, we present a methodology to classify Portable Executable (PE) files as malware or non-malware by exploiting the technology of neural networks, adapting it to the collected data to obtain better results. The aim of our methodology is to create a pocket tool, i.e., a tool that can be used even on devices with limited available resources. Hence, our tests were conducted entirely on a personal computer with only 16 GB of RAM. After a careful analysis of the techniques at our disposal and a selection of the most relevant information, we reduced the amount of resources used, in terms of both time and space, while maintaining a high accuracy of 93%.

    Cross-national health care database utilization between Spain and France: results from the EPICHRONIC study assessing the prevalence of type 2 diabetes mellitus

    Authors: Guillaume Moulis [1–3]*, Berta Ibañez [4–6]*, Aurore Palmaro [2,3], Felipe Aizpuru [6–8], Eduardo Millan [6,8], Maryse Lapeyre-Mestre [2,3,9], Laurent Sailler [1–3], Koldo Cambra [5,6,10]

    Affiliations:
    [1] Department of Internal Medicine, Toulouse University Hospital, Toulouse, France
    [2] UMR1027 INSERM, University of Toulouse, Toulouse, France
    [3] Clinical Investigation Center 1436, Toulouse University Hospital, Toulouse, France
    [4] Navarrabiomed, Health Department, Public University of Navarra, Pamplona, Spain
    [5] IdiSNA, Pamplona, Spain
    [6] Health Service Research on Chronic Patients Network (REDISSEC), Pamplona, Spain
    [7] Research Unit Araba (BioAraba), Osakidetza-Basque Health Department, Vitoria-Gasteiz, Spain
    [8] Healthcare Services Sub-directorate, Osakidetza-Basque Health Service, Araba, Spain
    [9] Department of Medical and Clinical Pharmacology, Toulouse University Hospital, Toulouse, France
    [10] Institute of Public Health and Labour Health of Navarra, Pamplona, Spain
    * These authors contributed equally to this work.

    Aim: The EPICHRONIC (EPIdemiology of CHRONIC diseases) project investigated the possibility of developing common procedures for French and Spanish electronic health care databases to enable large-scale pharmacoepidemiological studies on chronic diseases. A feasibility study assessed the prevalence of type 2 diabetes mellitus (T2DM) in Navarre and the Basque Country (Spain) and the Midi-Pyrénées region (France).

    Patients and methods: We described and compared database structures and the availability of hospital, outpatient, and drug-dispensing data from 5.9 million inhabitants. Due to differences in database structures and recorded data, we could not develop a common procedure to estimate T2DM prevalence, but identified an algorithm specific to each database. Patients were identified using primary care diagnosis codes previously validated in Spanish databases and a combination of primary care diagnosis codes, hospital diagnosis codes, and data on exposure to oral antidiabetic drugs from the French database.

    Results: Spanish and French databases (the latter termed Système National d’Information Inter-Régimes de l’Assurance Maladie [SNIIRAM]) included demographic data, primary care diagnoses, hospital diagnoses, and outpatient drug-dispensing data. Diagnoses were encoded using the International Classification of Primary Care (version 2) and the International Classification of Diseases, version 9 and version 10 (ICD-9 and ICD-10) in the Spanish databases, whereas the SNIIRAM contained ICD-10 codes. All data were anonymized before transfer to researchers. T2DM prevalence in the population aged over 20 years was estimated to be 6.6–7.0% in the Spanish regions and 6.3% in the Midi-Pyrénées region, with ~2% higher estimates for males in the three regions.

    Conclusion: Tailored procedures can be designed to estimate the prevalence of T2DM in population-based studies from Spanish and French electronic health care records.

    Keywords: epidemiology, pharmacoepidemiology, electronic health care database, cross-national study, population-based study, type 2 diabetes mellitus