4,002 research outputs found

    Data mining for detecting Bitcoin Ponzi schemes

    Soon after its introduction in 2009, Bitcoin was adopted by cyber-criminals, who rely on its pseudonymity to implement virtually untraceable scams. Among the typical scams that operate on Bitcoin are the so-called Ponzi schemes. These are fraudulent investments which repay users with the funds invested by new users that join the scheme, and implode when it is no longer possible to find new investments. Despite being illegal in many countries, Ponzi schemes are now proliferating on Bitcoin, and they keep alluring new victims, who are plundered of millions of dollars. We apply data mining techniques to detect Bitcoin addresses related to Ponzi schemes. Our starting point is a dataset of features of real-world Ponzi schemes, which we construct by analysing, on the Bitcoin blockchain, the transactions used to perform the scams. We use this dataset to experiment with various machine learning algorithms, and we assess their effectiveness through standard validation protocols and performance metrics. The best of the classifiers we have experimented with can identify most of the Ponzi schemes in the dataset, with a low number of false positives.

    Cyber Security

    This open access book constitutes the refereed proceedings of the 17th International Annual Conference on Cyber Security, CNCERT 2021, held in Beijing, China, in July 2021. The 14 papers presented were carefully reviewed and selected from 51 submissions. The papers are organized according to the following topical sections: data security; privacy protection; anomaly detection; traffic analysis; social network security; vulnerability detection; text classification.

    A scalable and automated machine learning framework to support risk management

    Due to the growth of data and the widespread usage of Machine Learning (ML) by non-experts, automation and scalability are becoming key issues for ML. This paper presents an automated and scalable framework for ML that requires minimum human input. We designed the framework for the domain of telecommunications risk management. This domain often requires non-ML-experts to continuously update supervised learning models that are trained on huge amounts of data. Thus, the framework uses Automated Machine Learning (AutoML), to select and tune the ML models, and distributed ML, to deal with Big Data. The modules included in the framework are task detection (to detect classification or regression), data preprocessing, feature selection, model training, and deployment. In this paper, we focus the experiments on the model training module. We first analyze the capabilities of eight AutoML tools: Auto-Gluon, Auto-Keras, Auto-Sklearn, Auto-Weka, H2O AutoML, Rminer, TPOT, and TransmogrifAI. Then, to select the tool for model training, we performed a benchmark with the only two tools that support distributed ML (H2O AutoML and TransmogrifAI). The experiments used three real-world datasets from the telecommunications domain (churn, event forecasting, and fraud detection), as provided by an analytics company. The experiments allowed us to measure the computational effort and predictive capability of the AutoML tools. Both tools obtained high-quality results and did not present substantial predictive differences. Nevertheless, H2O AutoML was selected by the analytics company for the model training module, since it was considered a more mature technology that presented a more interesting set of features (e.g., integration with more platforms).
    After choosing H2O AutoML for the ML training, we selected the technologies for the remaining components of the architecture (e.g., data preprocessing and web interface). This work was executed under the project IRMDA - Intelligent Risk Management for the Digital Age, Individual Project, NUP: POCI-01-0247-FEDER-038526, co-funded by the Incentive System for Research and Technological Development, from the Thematic Operational Program Competitiveness of the national framework program - Portugal2020.
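    The task detection module mentioned above (deciding between classification and regression) could be approximated as in the following minimal sketch. The abstract does not describe how the framework makes this decision, so the function name `detect_task`, the `max_classes` threshold, and the heuristic itself are assumptions for illustration only.

```python
from typing import Sequence


def detect_task(target: Sequence, max_classes: int = 20) -> str:
    """Guess whether a supervised target calls for classification or regression.

    Hypothetical heuristic, not taken from the paper: text or boolean targets,
    and numeric targets with few distinct whole-number values, are treated as
    class labels; everything else is treated as a continuous quantity.
    """
    values = [v for v in target if v is not None]
    if not values:
        raise ValueError("target column has no usable values")

    # Any string or boolean value means the column holds labels.
    if any(isinstance(v, (str, bool)) for v in values):
        return "classification"

    # Non-numeric objects of other types are also treated as labels.
    if not all(isinstance(v, (int, float)) for v in values):
        return "classification"

    distinct = set(values)
    all_whole = all(float(v).is_integer() for v in distinct)
    # Assumed threshold: a small set of whole numbers looks like class codes.
    if all_whole and len(distinct) <= max_classes:
        return "classification"
    return "regression"
```

    For instance, a churn flag column such as `[0, 1, 1, 0]` would be routed to classification, while a column of real-valued forecasts would be routed to regression. A production system would likely need extra rules (for example, for high-cardinality integer identifiers).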

    Fraud Detection in Financial Services using Machine Learning

    The banking industry is an important part of modern life, as it manages the movement of funds between different parties. However, this area is also associated with fraud: people are swindled out of their money, illegal transactions are made, and so on. The complexity of ensuring that transactions stay legitimate has made it almost impossible to regulate fraud in this industry correctly. This report presents an approach that uses Machine Learning techniques to build a model that detects fraudulent transactions and flags them. The approach uses a dataset containing a collection of observations on transactions, which can be useful in understanding the nature of transactions. The data is clearly imbalanced, an issue fixed in the data preparation section of the analysis. The models of choice are the CatBoost classifier, the Decision Tree classifier and the Random Forest classifier. The CatBoost classifier performs best, followed by the Decision Tree and then the Random Forest. However, they all achieve high classification accuracy, at over 90%. The high accuracy of the models is indicative of their readiness for use in a real-world setting. This performance means that the likelihood of a fraudulent case passing through is quite low. The recommendation for using the approach is to add more variables that better describe fraudulent transactions, in order to improve the results. It is also possible to improve the results by increasing the size of the dataset or by using more models and comparing the results. A combination of models may also turn out to be a good approach for much better results in the real world.
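    The abstract relates accuracy to the likelihood of a fraudulent case passing through. As a hedged illustration of how these quantities are computed, the sketch below derives accuracy, recall, and the miss rate from confusion-matrix counts. The function name `fraud_metrics` and all counts are made up for illustration; they are not results from the report.

```python
def fraud_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute basic metrics from confusion-matrix counts.

    tp: fraud correctly flagged      fn: fraud missed
    fp: legitimate wrongly flagged   tn: legitimate correctly passed
    """
    total = tp + fn + fp + tn
    if total == 0 or (tp + fn) == 0:
        raise ValueError("need at least one transaction and one fraud case")
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        # Share of fraudulent cases that pass through undetected.
        "miss_rate": 1.0 - recall,
    }


# Hypothetical counts for an imbalanced set: 100 fraud cases in 10,000.
metrics = fraud_metrics(tp=80, fn=20, fp=30, tn=9870)
print(metrics)
```

    With these illustrative counts, accuracy is 99.5% while 20% of fraudulent cases still pass through, which is why recall is usually reported alongside accuracy on imbalanced data.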

    Graph Mining for Cybersecurity: A Survey

    The explosive growth of cyber attacks, such as malware, spam, and intrusions, has had severe consequences for society. Securing cyberspace has become an utmost concern for organizations and governments. Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities. In recent years, with the proliferation of graph mining techniques, many researchers have investigated these techniques for capturing correlations between cyber entities and achieving high performance. It is imperative to summarize existing graph-based cybersecurity solutions to provide a guide for future studies. Therefore, as a key contribution of this paper, we provide a comprehensive review of graph mining for cybersecurity, including an overview of cybersecurity tasks, the typical graph mining techniques, and the general process of applying them to cybersecurity, as well as various solutions for different cybersecurity tasks. For each task, we probe into relevant methods and highlight the graph types, graph approaches, and task levels in their modeling. Furthermore, we collect open datasets and toolkits for graph-based cybersecurity. Finally, we outline potential directions of this field for future research.

    Artificial Intelligence Adoption in Criminal Investigations: Challenges and Opportunities for Research

    Artificial Intelligence (AI) offers the potential to transform organisational decision-making and knowledge-sharing processes that support criminal investigations. Yet, there is still limited evidence-based knowledge in the literature concerning the successful use of AI for criminal investigations. This paper identifies the main areas and current dynamics of the adoption of AI in criminal investigations using bibliometric analysis. We synthesise existing research by identifying key themes researchers have delved into on AI in criminal investigations. The themes include crime prediction and human-centred issues relating to AI use in criminal investigations. Finally, the paper elaborates on the challenges that may influence AI adoption in criminal investigations by police professionals. These challenges include possible laggard effects with AI adoption, implementation challenges, lack of government oversight, and a skills gap.