Dark Web Activity Classification Using Deep Learning
In contemporary times, people rely heavily on the internet and search engines
to obtain information, either directly or indirectly. However, the information
accessible to users constitutes merely 4% of the overall information on the
internet; this accessible portion is commonly known as the surface web. The remaining
information that eludes search engines is called the deep web. The deep web
encompasses deliberately hidden information, such as personal email accounts,
social media accounts, online banking accounts, and other confidential data.
The deep web contains several critical applications, including databases of
universities, banks, and civil records, which are off-limits and illegal to
access. The dark web is a subset of the deep web that provides an ideal
platform for criminals and smugglers to engage in illicit activities, such as
drug trafficking, weapon smuggling, selling stolen bank cards, and money
laundering. In this article, we propose a search engine that employs deep
learning to detect the titles of activities on the dark web. We focus on five
categories of activities, namely drug trading, weapon trading, selling
stolen bank cards, selling fake IDs, and selling illegal currencies. Our aim is
to extract relevant images from websites with a ".onion" extension and identify
the titles of websites without images by extracting keywords from the text of
the pages. Furthermore, we introduce a dataset of images called Darkoob, which
we have gathered and used to evaluate our proposed method. Our experimental
results demonstrate that the proposed method achieves an accuracy rate of 94%
on the test dataset.
Comment: 11 pages, 16 figures, 2 tables. New dataset for dark web activity
classification.
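The text branch of such a pipeline, identifying a site's activity from keywords extracted from its pages, can be illustrated with a simple keyword-scoring baseline. This is a stand-in for the paper's deep learning model, and the category lexicons below are invented for illustration:

```python
from collections import Counter

# Hypothetical keyword lexicons for the five activity categories described
# in the abstract; the actual method learns these associations with deep models.
CATEGORY_KEYWORDS = {
    "drug trading": {"cocaine", "heroin", "pills", "gram"},
    "weapon trading": {"pistol", "rifle", "ammo", "firearm"},
    "stolen bank cards": {"cvv", "dumps", "card", "fullz"},
    "fake IDs": {"passport", "license", "identity"},
    "illegal currencies": {"counterfeit", "banknote", "bills"},
}

def classify_title(page_text: str) -> str:
    """Score each category by keyword overlap and return the best match."""
    tokens = Counter(page_text.lower().split())
    scores = {
        cat: sum(tokens[kw] for kw in kws)
        for cat, kws in CATEGORY_KEYWORDS.items()
    }
    return max(scores, key=scores.get)

print(classify_title("buy cocaine pills 1 gram fast shipping"))  # drug trading
```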
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research over the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of the building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9,579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimation, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies were adopted broadly by the building industry, due to common challenges including (1) lack of large-scale labeled data to train and validate the model, (2) lack of model transferability, which prevents a model trained on one data-rich building from being used in another building with limited data, (3) lack of strong justification of the costs and benefits of deploying machine learning, and (4) performance that may not be reliable or robust for the stated goals, as a method that works for some buildings may not generalize to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as to inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science.
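One of the surveyed applications, load prediction, can be sketched in its simplest form as a regression of metered load on outdoor temperature. The numbers below are invented for illustration; the studies surveyed use far richer features (occupancy, schedules, weather forecasts) and models:

```python
# Minimal load-prediction sketch: ordinary least squares fitting building
# load against outdoor temperature. All data points are illustrative.
temps = [10.0, 15.0, 20.0, 25.0, 30.0]  # outdoor temperature, degrees C
loads = [50.0, 55.0, 60.0, 65.0, 70.0]  # metered load, kW

n = len(temps)
mean_t = sum(temps) / n
mean_l = sum(loads) / n
slope = sum((t - mean_t) * (l - mean_l) for t, l in zip(temps, loads)) / \
        sum((t - mean_t) ** 2 for t in temps)
intercept = mean_l - slope * mean_t

def predict_load(temp):
    """Predict load (kW) from outdoor temperature using the fitted line."""
    return intercept + slope * temp

print(predict_load(22.0))  # 62.0 kW for a 22 degree C day
```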
Machine Learning in Automated Text Categorization
The automated categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last ten years, due to the
increased availability of documents in digital form and the ensuing need to
organize them. In the research community the dominant approach to this problem
is based on machine learning techniques: a general inductive process
automatically builds a classifier by learning, from a set of preclassified
documents, the characteristics of the categories. The advantages of this
approach over the knowledge engineering approach (consisting in the manual
definition of a classifier by domain experts) are a very good effectiveness,
considerable savings in terms of expert manpower, and straightforward
portability to different domains. This survey discusses the main approaches to
text categorization that fall within the machine learning paradigm. We will
discuss in detail issues pertaining to three different problems, namely
document representation, classifier construction, and classifier evaluation.
Comment: Accepted for publication in ACM Computing Surveys.
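A minimal instance of this inductive approach is a multinomial Naive Bayes classifier learned from preclassified documents. This is a generic sketch with invented toy data, not the survey's specific methods:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesTextClassifier:
    """Minimal multinomial Naive Bayes built from preclassified documents."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        self.class_counts = Counter(labels)      # class priors
        self.vocab = set()
        for doc, label in zip(docs, labels):
            words = doc.lower().split()
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, doc):
        words = doc.lower().split()
        best, best_lp = None, float("-inf")
        total_docs = sum(self.class_counts.values())
        for label in self.class_counts:
            lp = math.log(self.class_counts[label] / total_docs)
            total = sum(self.word_counts[label].values())
            for w in words:
                # Laplace smoothing avoids zero probability for unseen words
                lp += math.log((self.word_counts[label][w] + 1) /
                               (total + len(self.vocab)))
            if lp > best_lp:
                best, best_lp = label, lp
        return best

clf = NaiveBayesTextClassifier().fit(
    ["stock market shares rise", "election vote parliament"],
    ["finance", "politics"],
)
print(clf.predict("shares fall on the market"))  # finance
```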
Automating the construction of scene classifiers for content-based video retrieval
This paper introduces a real-time automatic scene classifier for content-based video retrieval. In our envisioned approach, end users such as documentalists, not image processing experts, build classifiers interactively by simply indicating positive examples of a scene. Classification consists of a two-stage procedure. First, small image fragments called patches are classified. Second, frequency vectors of these patch classifications are fed into a second classifier for global scene classification (e.g., city, portraits, or countryside). The first-stage classifiers can be seen as a set of highly specialized, learned feature detectors, an alternative to letting an image processing expert determine features a priori. We present results for experiments on a variety of patch and image classes. The scene classifier has been used successfully within television archives and for Internet porn filtering.
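The two-stage procedure can be sketched as follows. The patch classifier, patch labels, and prototype vectors below are invented stand-ins (a colour heuristic and nearest-prototype matching) for the paper's learned detectors and second-stage classifier:

```python
# Stage 1 labels small image patches; stage 2 classifies the scene from the
# frequency vector of patch labels. All concrete values here are illustrative.
PATCH_LABELS = ["sky", "building", "vegetation"]

def classify_patch(mean_rgb):
    """Stage 1 stand-in: map a patch's mean colour to a patch class."""
    r, g, b = mean_rgb
    if b > r and b > g:
        return "sky"
    if g > r and g > b:
        return "vegetation"
    return "building"

def frequency_vector(patches):
    """Stage 2 input: normalised histogram of patch labels."""
    labels = [classify_patch(p) for p in patches]
    return [labels.count(l) / len(labels) for l in PATCH_LABELS]

# Hypothetical prototype frequency vectors for two global scene classes.
PROTOTYPES = {"city": [0.2, 0.7, 0.1], "countryside": [0.3, 0.1, 0.6]}

def classify_scene(patches):
    """Assign the scene class whose prototype is nearest to the frequency vector."""
    vec = frequency_vector(patches)
    return min(PROTOTYPES,
               key=lambda s: sum((a - b) ** 2
                                 for a, b in zip(vec, PROTOTYPES[s])))

city_patches = [(120, 110, 100)] * 7 + [(80, 90, 200)] * 3
print(classify_scene(city_patches))  # city
```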
Data-driven and production-oriented tendering design using artificial intelligence
Construction projects are facing an increase in requirements since the projects are getting larger, more technology is integrated into the buildings, and new sustainability and CO2-equivalent emissions requirements are introduced. As a result, requirement management quickly gets overwhelming, and instead of having systematic requirement management, the construction industry tends to trust craftsmanship. One method for a more systematic requirement management approach, successful in other industries, is the systems engineering approach, focusing on requirement decomposition and linking proper verifications and validations. This research project explores whether a systems engineering approach, supported by natural language processing techniques, can enable more systematic requirement management in construction projects and facilitate knowledge transfer from completed projects to new tendering projects. The first part of the project explores how project requirements can be extracted, digitised, and analysed in an automated way and how this can benefit tendering specialists. The study is conducted by first developing a work support tool targeting tendering specialists and then evaluating the challenges and benefits of such a tool through a workshop and surveys. The second part of the project explores inspection data generated in production software as a requirement and quality verification method. First, a dataset containing over 95,000 production issues is examined to assess data quality and the level of standardisation. Second, a survey of production specialists evaluates the current benefits of digital inspection reporting. Third, future benefits of using inspection data for knowledge transfer are explored by applying the Knowledge Discovery in Databases method and clustering techniques.
The results show that applying natural language processing techniques can be a helpful tool for analysing construction project requirements, facilitating the identification of essential requirements, and enabling benchmarking between projects. The results from the clustering process suggested in this thesis show that inspection data can be used as a knowledge base for future projects and quality improvement within a project-based organisation. However, higher data quality and standardisation would benefit the knowledge-generation process. This research project provides insights into how artificial intelligence can facilitate knowledge transfer, enable data-informed design choices in tendering projects, and automate requirements analysis in construction projects as a possible step towards more systematic requirements management.
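The clustering step can be illustrated with a toy sketch: free-text inspection issues represented as bag-of-words vectors and grouped with plain k-means. The issue texts are invented, and the thesis' actual KDD preprocessing and clustering setup may differ:

```python
# Toy clustering of free-text inspection issues, assuming a bag-of-words
# representation and k-means with deterministic initialisation.
ISSUES = [
    "leaking pipe in bathroom",
    "cracked concrete slab",
    "water leak under pipe joint",
    "crack in concrete wall",
]

def vectorise(texts):
    """Bag-of-words vectors over a sorted shared vocabulary."""
    vocab = sorted({w for t in texts for w in t.split()})
    return [[t.split().count(w) for w in vocab] for t in texts]

def kmeans(vectors, k, iters=10):
    """Plain k-means; the first k points seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest centroid (squared Euclidean).
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assign

print(kmeans(vectorise(ISSUES), k=2))  # [0, 1, 0, 1]: leak issues vs. crack issues
```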