    Dark Web Activity Classification Using Deep Learning

    In contemporary times, people rely heavily on the internet and search engines to obtain information, directly or indirectly. However, the information accessible to users constitutes merely 4% of the information present on the internet; this accessible portion is commonly known as the surface web. The remaining information, which eludes search engines, is called the deep web. The deep web encompasses deliberately hidden information, such as personal email accounts, social media accounts, online banking accounts, and other confidential data. It also contains critical applications, including databases of universities, banks, and civil records, which are off-limits and illegal to access. The dark web is a subset of the deep web that provides an ideal platform for criminals and smugglers to engage in illicit activities such as drug trafficking, weapon smuggling, selling stolen bank cards, and money laundering. In this article, we propose a search engine that employs deep learning to detect the titles of activities on the dark web. We focus on five categories of activities: drug trading, weapon trading, selling stolen bank cards, selling fake IDs, and selling illegal currencies. Our aim is to extract relevant images from websites with a ".onion" extension and to identify the titles of websites without images by extracting keywords from the text of the pages. Furthermore, we introduce a dataset of images called Darkoob, which we have gathered and used to evaluate our proposed method. Our experimental results demonstrate that the proposed method achieves an accuracy rate of 94% on the test dataset.
    Comment: 11 pages, 16 figures, 2 tables, new dataset for dark web activity classification
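    To make the described pipeline concrete, below is a minimal sketch, not the paper's implementation: a page is classified from an extracted image when one is available, otherwise the classifier falls back to keyword matching on the page text. The CNN backbone, category names, keyword lists, and function names are illustrative assumptions.

```python
# A minimal sketch (not the paper's method) of the two-branch idea:
# image-based classification when an image exists, keyword-based fallback
# for image-less pages. Backbone and keyword lists are assumptions.
from typing import Optional

import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

CATEGORIES = ["drugs", "weapons", "stolen_cards", "fake_ids", "illegal_currency"]

# Hypothetical per-category keyword lists for the text fallback.
KEYWORDS = {
    "drugs": {"cocaine", "mdma", "cannabis"},
    "weapons": {"pistol", "rifle", "ammunition"},
    "stolen_cards": {"cvv", "card dumps", "bank card"},
    "fake_ids": {"passport", "driver license", "id card"},
    "illegal_currency": {"counterfeit", "banknotes"},
}

def build_image_classifier(num_classes: int = len(CATEGORIES)) -> nn.Module:
    # A pretrained ResNet-18 with a new classification head; the paper's
    # actual architecture may differ.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def classify_page(model: nn.Module, image_path: Optional[str], page_text: str) -> str:
    if image_path is not None:
        x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
        model.eval()
        with torch.no_grad():
            logits = model(x)
        return CATEGORIES[int(logits.argmax(dim=1))]
    # No image: score each category by keyword hits in the page text.
    text = page_text.lower()
    scores = {c: sum(kw in text for kw in kws) for c, kws in KEYWORDS.items()}
    return max(scores, key=scores.get)
```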

    Machine Learning in Automated Text Categorization

    The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting of the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation.
    Comment: Accepted for publication in ACM Computing Surveys
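    As a concrete instance of the inductive process the survey describes, the sketch below builds a classifier automatically from a toy set of preclassified documents. TF-IDF features and a linear SVM are one common choice, not the survey's prescription, and the corpus is invented.

```python
# Minimal machine-learning text categorization: learn from preclassified
# documents, then categorize unseen ones. Corpus and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# A tiny preclassified corpus (illustrative only).
docs = [
    "stock markets fell sharply today",
    "quarterly earnings beat analyst forecasts",
    "the team won the championship game",
    "new vaccine shows promising trial results",
]
labels = ["finance", "finance", "sports", "health"]

# Document representation (TF-IDF) and classifier construction (SVM)
# combined in one pipeline.
clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(docs, labels)

# Classifier evaluation would normally use a held-out, preclassified
# test set; here we simply categorize an unseen document.
print(clf.predict(["central bank raises interest rates"]))
```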

    Automating the construction of scene classifiers for content-based video retrieval

    This paper introduces a real-time automatic scene classifier for content-based video retrieval. In our envisioned approach, end users such as documentalists, not image processing experts, build classifiers interactively, simply by indicating positive examples of a scene. Classification consists of a two-stage procedure. First, small image fragments called patches are classified. Second, frequency vectors of these patch classifications are fed into a second classifier for global scene classification (e.g., city, portraits, or countryside). The first-stage classifiers can be seen as a set of highly specialized, learned feature detectors, an alternative to letting an image processing expert determine features a priori. We present results for experiments on a variety of patch and image classes. The scene classifier has been used successfully within television archives and for Internet porn filtering.
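    A minimal sketch of this two-stage procedure follows, on synthetic data: a k-NN patch classifier stands in for the learned feature detectors, and a logistic regression over patch-label frequency vectors plays the role of the global scene classifier. The patch size, models, and data are all illustrative assumptions, not the paper's setup.

```python
# Two-stage scene classification: (1) label small image patches,
# (2) classify the scene from the histogram of patch labels.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

PATCH = 16  # patch side length in pixels (an assumption)

def extract_patches(img):
    """Cut an H x W x 3 image into flattened PATCH x PATCH fragments."""
    h, w = img.shape[:2]
    patches = [
        img[y:y + PATCH, x:x + PATCH].reshape(-1)
        for y in range(0, h - PATCH + 1, PATCH)
        for x in range(0, w - PATCH + 1, PATCH)
    ]
    return np.stack(patches)

def frequency_vector(patch_clf, img, n_patch_classes):
    """Stage 1: label every patch, then return the normalized histogram."""
    labels = patch_clf.predict(extract_patches(img))
    hist = np.bincount(labels, minlength=n_patch_classes).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)

# Stage 1 training: synthetic patches standing in for fragments a user
# marked as, e.g., "building" (darker) vs. "sky" (brighter).
X_patches = rng.uniform(0, 1, size=(200, PATCH * PATCH * 3))
y_patches = (X_patches.mean(axis=1) > 0.5).astype(int)
patch_clf = KNeighborsClassifier(n_neighbors=3).fit(X_patches, y_patches)

# Stage 2 training: patch-label frequency vectors -> global scene label.
imgs = rng.uniform(0, 1, size=(20, 64, 64, 3))
freq = np.stack([frequency_vector(patch_clf, im, 2) for im in imgs])
scenes = (freq[:, 1] > np.median(freq[:, 1])).astype(int)  # toy labels
scene_clf = LogisticRegression().fit(freq, scenes)
print(scene_clf.predict(freq[:3]))
```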

    Data-driven and production-oriented tendering design using artificial intelligence

    Construction projects are facing an increase in requirements, since projects are getting larger, more technology is integrated into the buildings, and new sustainability and CO2-equivalent emissions requirements are being introduced. As a result, requirement management quickly becomes overwhelming, and instead of managing requirements systematically, the construction industry tends to trust craftsmanship. One method for more systematic requirement management that has been successful in other industries is the systems engineering approach, which focuses on requirement decomposition and on linking proper verifications and validations. This research project explores whether a systems engineering approach, supported by natural language processing techniques, can enable more systematic requirement management in construction projects and facilitate knowledge transfer from completed projects to new tendering projects.

    The first part of the project explores how project requirements can be extracted, digitised, and analysed in an automated way, and how this can benefit tendering specialists. The study is conducted by first developing a work support tool targeting tendering specialists and then evaluating the challenges and benefits of such a tool through a workshop and surveys. The second part of the project explores inspection data generated in production software as a requirement and quality verification method. First, a dataset containing over 95,000 production issues is examined to assess its data quality and level of standardisation. Second, a survey of production specialists evaluates the current benefits of digital inspection reporting. Third, the future benefits of using inspection data for knowledge transfer are explored by applying the Knowledge Discovery in Databases method and clustering techniques.

    The results show that natural language processing techniques can be a helpful tool for analysing construction project requirements, facilitating the identification of essential requirements and enabling benchmarking between projects. The results from the clustering process suggested in this thesis show that inspection data can serve as a knowledge base for future projects and for quality improvement within a project-based organisation; however, higher data quality and standardisation would benefit the knowledge-generation process. This research project provides insights into how artificial intelligence can facilitate knowledge transfer, enable data-informed design choices in tendering projects, and automate requirements analysis in construction projects, as a possible step towards more systematic requirements management.
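    As an illustration of the clustering step, the sketch below groups free-text inspection issues with TF-IDF and k-means so that recurring quality problems surface as clusters. The issue texts are invented, and the representation and algorithm choices are assumptions, not necessarily those used in the thesis.

```python
# Clustering free-text production/inspection issues to surface recurring
# quality problems. Issue texts are invented; TF-IDF + k-means is one
# common instantiation of the clustering step.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

issues = [
    "crack in concrete slab on level 2",
    "missing fire sealing around cable tray",
    "concrete surface cracking near column",
    "fire stopping not installed at wall penetration",
    "paint damage on door frame",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(issues)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Inspect the top terms per cluster to label recurring issue types.
terms = vec.get_feature_names_out()
for c in range(km.n_clusters):
    top = [terms[i] for i in km.cluster_centers_[c].argsort()[::-1][:3]]
    print(f"cluster {c}: {top}")
```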