31 research outputs found

    Clustering of spam domains using machine learning methods

    Clustering of spam domains using machine learning methods // Master's qualification thesis // Hrytsiuk Vladyslav Petrovych // Ternopil Ivan Puluj National Technical University, Faculty of Computer Information Systems and Software Engineering, Department of Cybersecurity, group СБм-61 // Ternopil, 2022
    The qualification work addresses the problem of clustering spam domains using k-means, LSH, and grouping, with a view to applying the results when filtering e-mail. It reviews the main spam filtering methods and the main methodologies by which spam arises. The main clustering methods are considered in detail: k-means, grouping, hierarchical methods, trees, LSH, and DBSCAN. Methods for evaluating clustering are presented. Spam domains were clustered on a real data set compiled from information from the Alexa and stopforumspams.com sites. The clustering result was evaluated using additional, artificially introduced functions when labeling the data set.
    Contents: List of symbols, units, abbreviations, and terms; Introduction
    1 Spam and its main filtering methods: 1.1 Spam statistics; 1.2 Spam filtering methods; 1.3 Spam collection; 1.4 How e-mail works; 1.5 Analysis of recent research
    2 Cluster analysis: 2.1 Areas of application of cluster analysis; 2.2 Clustering algorithms (2.2.1 Grouping; 2.2.2 k-means; 2.2.3 Hierarchical methods; 2.2.4 k-d trees; 2.2.5 Locality-sensitive hashing (LSH); 2.2.6 DBSCAN); 2.3 Evaluation of clustering results
    3 Clustering of spam domains: 3.1 The spam clustering process; 3.2 Cluster generation (3.2.1 Grouping of clusters; 3.2.2 LSH; 3.2.3 k-means); 3.3 Cluster evaluation
    4 Occupational safety and safety in emergencies: 4.1 Occupational safety; 4.2 Concept of protecting the population and territories under the threat and occurrence of emergencies
    Conclusions; References; Appendices
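As a rough illustration of the k-means step named in the abstract, the sketch below clusters a few domain names with a plain Lloyd's iteration. The abstract does not say which features the thesis extracts, so the three features used here (name length, digit count, hyphen count), the sample domains, and the function names are all assumptions made for the example.

```python
from math import dist


def domain_features(domain):
    """Hypothetical features: name length, digit count, hyphen count."""
    name = domain.lower()
    digits = sum(ch.isdigit() for ch in name)
    return (float(len(name)), float(digits), float(name.count("-")))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Made-up sample domains, not taken from the thesis data set.
domains = [
    "example.com",
    "x9k2-44-promo-8812.biz",
    "mail.org",
    "a1b2c3-d4e5-f6g7-0099.info",
]
labels, centroids = kmeans([domain_features(d) for d in domains], k=2)
for domain, label in zip(domains, labels):
    print(label, domain)
```

Seeding the centroids with the first k points keeps the run deterministic. A real pipeline would more likely use randomized or k-means++ initialization and scale the features first.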

    PROVIDE: hiding from automated network scans with proofs of identity

    Network scanners are a valuable tool for researchers and administrators; however, they are also used by malicious actors to identify vulnerable hosts on a network. Upon the disclosure of a security vulnerability, scans are launched within hours. These opportunistic attackers enumerate blocks of IP addresses in the hope of discovering an exploitable host. Fortunately, defensive measures such as port knocking protocols (PKPs) allow a service to remain hidden from unauthorized IP addresses. The service is revealed only when a client includes a special authentication token (AT) in the IP/TCP header. However, this AT is generated from a secret that is shared between clients and servers and distributed manually to each endpoint. As a result, these defensive measures have not been widely adopted for other protocols such as HTTP/S, owing to the difficulty of distributing the shared secrets. In this paper we propose a scalable solution to this problem for services accessed by domain name. We make the following observation: automated network scanners access servers by IP address, while legitimate clients access the server by name. Therefore, a service should reveal itself only to clients who know its name. Based on this principle, we have created a proof of the verifier’s identity (PROVIDE) protocol that allows a prover (a legitimate user) to convince a verifier (the service) that it knows the verifier’s identity. We present a PROVIDE implementation using a PKP and DNS (PKP+DNS) that uses DNS TXT records to distribute identification tokens (IDTs), while DNS PTR records for the service’s domain name are prohibited to prevent reverse DNS lookups. Clients are modified to make an additional DNS TXT query to obtain the IDT, which the PKP uses to generate an AT. The inclusion in the packet header of an AT generated from the DNS TXT query is proof that the client knows the service’s identity. We analyze the effectiveness of this mechanism against brute-force attempts for ATs of various strengths and discuss practical considerations.
    This work has been supported by the National Science Foundation (NSF) awards #1430145, #1414119, and #1012798.
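The token flow described above can be sketched in a few lines. The abstract does not give the construction the PKP uses to turn an IDT into an AT, so the HMAC-SHA256 derivation, the `client_ip` and `time_window` inputs, and both function names below are assumptions for illustration only. The IDT is passed in directly rather than fetched with a DNS TXT query. The second function is the standard probability that at least one of `attempts` independent, uniformly random guesses hits a `bits`-bit token.

```python
import hashlib
import hmac
import math


def authentication_token(idt: bytes, client_ip: str, time_window: int, bits: int = 32) -> int:
    """Hypothetical AT: HMAC-SHA256 keyed by the IDT, truncated to `bits` bits."""
    message = f"{client_ip}|{time_window}".encode()
    digest = hmac.new(idt, message, hashlib.sha256).digest()
    # Keep only the top `bits` bits of the 256-bit digest.
    return int.from_bytes(digest, "big") >> (256 - bits)


def brute_force_success(bits: int, attempts: int) -> float:
    """P(at least one hit) = 1 - (1 - 2**-bits)**attempts, computed stably."""
    return -math.expm1(attempts * math.log1p(-(2.0 ** -bits)))


# Placeholder IDT and a documentation-range IP address.
token = authentication_token(b"idt-from-txt-record", "192.0.2.10", time_window=123456, bits=16)
print(f"16-bit AT: {token:#06x}")
for strength in (16, 32, 64):
    print(strength, brute_force_success(strength, attempts=1_000_000))
```

Using `expm1` and `log1p` keeps the estimate meaningful for long tokens, where `1 - 2**-bits` would otherwise round to exactly 1.0 in floating point.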

    Spam message detection based on DNS records and address prefixes


    Technology Corner: Analysing E-Mail Headers for Forensic Investigation

    Electronic mail (e-mail), one of the most widely used applications of the Internet, has become a global communication infrastructure service. However, security loopholes in it enable cybercriminals to misuse it by forging its headers or by sending it anonymously for illegitimate purposes, leading to e-mail forgeries. E-mail messages include transit handling envelope and trace information in the form of structured fields which are not stripped after messages are delivered, leaving a detailed record of e-mail transactions. A detailed header analysis can be used to map the networks traversed by messages, including information on the messaging software and patching policies of clients and gateways. Cyber forensic e-mail analysis is employed to collect credible evidence to bring criminals to justice. This paper highlights the need for e-mail forensic investigation and lists various methods and tools used to carry it out. It presents a detailed header analysis of an e-mail message spoofed using multiple tactics, and discusses ways of detecting spoofed headers and identifying their originator. It also discusses difficulties that investigators may face during forensic investigation of an e-mail message, along with possible solutions.
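The kind of header analysis described above can be sketched with Python's standard `email` package. The message below is fabricated for the example (documentation-range addresses and hostnames), and the two helper names are hypothetical. The mismatch check is only a hint that warrants a closer look, since mailing lists and forwarding services legitimately produce differing `From` and `Return-Path` domains.

```python
from email import message_from_string
from email.utils import parseaddr

# Fabricated sample message; relays prepend Received fields, so the newest is first.
RAW = """\
Received: from mx.example.net (mx.example.net [203.0.113.7])
\tby inbox.example.org with ESMTP id abc123; Mon, 1 Jan 2024 10:00:05 +0000
Received: from sender-pc (unknown [198.51.100.23])
\tby mx.example.net with SMTP id xyz789; Mon, 1 Jan 2024 10:00:01 +0000
From: "Alice" <alice@example.com>
Return-Path: <bounce@example.net>
Subject: test

body
"""


def trace_hops(raw):
    """Return the Received fields in chronological order (origin first)."""
    msg = message_from_string(raw)
    hops = msg.get_all("Received", [])
    # Collapse folded header whitespace so each hop reads as one line.
    return [" ".join(hop.split()) for hop in reversed(hops)]


def from_return_path_mismatch(raw):
    """True when the From and Return-Path domains differ (a hint, not proof)."""
    msg = message_from_string(raw)
    from_domain = parseaddr(msg.get("From", ""))[1].rpartition("@")[2].lower()
    path_domain = parseaddr(msg.get("Return-Path", ""))[1].rpartition("@")[2].lower()
    return from_domain != path_domain


for number, hop in enumerate(trace_hops(RAW), start=1):
    print(number, hop)
print("From/Return-Path mismatch:", from_return_path_mismatch(RAW))
```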

    Camouflages and Token Manipulations-The Changing Faces of the Nigerian Fraudulent 419 Spammers

    The inefficiency of current spam filters against fraudulent (419) mail is related to spammers’ use of good-word attacks, topic drift, and parasitic spamming, to wrong categorization and recategorization of e-mails by e-mail clients, and to the fuzzy factors of greed and gullibility on the part of recipients who respond to fraudulent spam offers. In this paper, we establish that mail token manipulation remains, above any other tactic, the most potent tool used by Nigerian scammers to fool statistical spam filters. We hope that uncovering evidence of these manipulations will prove useful in future antispam research; our findings also alert spam filter developers to the need to build into their antispam architecture robust modules that can deal with the identified camouflages.

    An Extensible Format for Email Feedback Reports


    That ain’t you: Blocking spearphishing through behavioral modelling

    One of the ways in which attackers steal sensitive information from corporations is by sending spearphishing emails. A typical spearphishing email appears to be sent by one of the victim’s coworkers or business partners, but has instead been crafted by the attacker. A particularly insidious type of spearphishing email not only claims to be written by a certain person, but is also sent from that person’s email account, which has been compromised. Spearphishing emails are very dangerous for companies, because they can be the starting point for a more sophisticated attack or cause intellectual property theft, and can lead to high financial losses. Currently, there are no effective systems to protect users against such threats. Existing systems leverage adaptations of anti-spam techniques. However, these techniques are often inadequate for detecting spearphishing attacks, because spearphishing has very different characteristics from spam and even from traditional phishing. To fight the spearphishing threat, we propose a change of focus in the techniques used for detecting malicious emails: instead of looking for features that are indicative of attack emails, we look for emails that claim to have been written by a certain person within a company, but were actually authored by an attacker. We do this by modelling the email-sending behavior of users over time, and comparing any subsequent email sent from their accounts against this model. Our approach can block advanced email attacks that traditional protection systems are unable to detect, and is an important step towards detecting advanced spearphishing attacks.

    DANE Trusted Email for Supply Chain Management

    Supply chain management is critically dependent on trusted email mechanisms that address forgery, confidentiality, and sender authenticity. The IETF protocol ‘DNS-Based Authentication of Named Entities’ (DANE) described in this paper has been extended from its initial goal of providing TLS web site validation to also offer a foundation for globally scalable and interoperable email security. Widespread deployment of DANE will require more than raw technology standards, however. Workflow automation mechanisms will need to emerge to simplify the publishing and retrieval of cryptographic credentials for general audiences. Security policy enforcement will also need to be addressed. This paper gives a descriptive tutorial of trusted email technologies, shows how DANE solves key distribution logistics, and then suggests desirable automation components that could accelerate deployment of DANE-based trusted email. Pilot deployments are briefly described.
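To make the key distribution idea concrete, the sketch below derives the DNS owner name under which an S/MIME certificate association would be published for a mailbox. The abstract does not name a record type, so treating RFC 8162 SMIMEA records as the mechanism is an assumption here. The function name and the sample address are hypothetical. The naming rule itself comes from RFC 8162: the SHA-256 hash of the local part, truncated to 28 octets, in hexadecimal, followed by the `_smimecert` label.

```python
import hashlib


def smimea_owner_name(email_address: str) -> str:
    """Build the SMIMEA owner name for a mailbox, following RFC 8162 naming."""
    local_part, _, domain = email_address.rpartition("@")
    # Hash the UTF-8 local part and keep the first 28 octets (56 hex characters).
    digest = hashlib.sha256(local_part.encode("utf-8")).digest()[:28]
    return f"{digest.hex()}._smimecert.{domain}."


print(smimea_owner_name("alice@example.com"))
```

A client would query this name for SMIMEA records and validate the answer with DNSSEC. That lookup step is omitted here because it needs a resolver and network access.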