983 research outputs found

    CEAI: CCM based Email Authorship Identification Model

    In this paper we present a model for email authorship identification (EAI) that employs a Cluster-based Classification (CCM) technique. Stylometric features have traditionally been employed successfully in various authorship analysis tasks; we extend the traditional feature set with further features that are effective for email authorship identification (e.g. the last punctuation mark used in an email, the tendency of an author to use capitalization at the start of an email, or the punctuation after a greeting or farewell). We also include content features selected by information gain. We observe that these features have a positive impact on the accuracy of the authorship identification task. We performed experiments to support our arguments and compared the results with other baseline models. Experimental results reveal that the proposed CCM-based email authorship identification model, along with the proposed feature set, outperforms the state-of-the-art support vector machine (SVM)-based models, as well as the models proposed by Iqbal et al. [1, 2]. The proposed model attains an accuracy of 94% for 10 authors, 89% for 25 authors, and 81% for 50 authors on the Enron dataset, and 89.5% on a real email dataset constructed by the authors. The Enron results were obtained with a considerably larger number of authors than in the models proposed by Iqbal et al. [1, 2].
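The email-specific features named in the abstract above (last punctuation mark, capitalization at the start, punctuation after a greeting) can be sketched as follows. This is a minimal illustration, not the CEAI authors' implementation: the function name `email_style_features`, the `GREETINGS` list, and the rule that the greeting occupies the first line are all assumptions made for the example.

```python
import string

# Hypothetical greeting words; the paper does not list the ones it uses.
GREETINGS = ("hi", "hello", "dear", "hey")


def email_style_features(body: str) -> dict:
    """Extract three simple stylometric features from an email body."""
    text = body.strip()

    # Last punctuation mark used anywhere in the email ("" if none).
    last_punct = next(
        (ch for ch in reversed(text) if ch in string.punctuation), ""
    )

    # Does the first alphabetic character of the email use capitalization?
    first_alpha = next((ch for ch in text if ch.isalpha()), "")
    starts_capitalized = first_alpha.isupper()

    # Punctuation closing the greeting line (assumed to be the first line).
    first_line = text.splitlines()[0].strip() if text else ""
    words = first_line.lower().split()
    greeting_punct = ""
    if words and words[0].strip(string.punctuation) in GREETINGS:
        if first_line[-1] in string.punctuation:
            greeting_punct = first_line[-1]

    return {
        "last_punct": last_punct,
        "starts_capitalized": starts_capitalized,
        "greeting_punct": greeting_punct,
    }


features = email_style_features("hi John,\nplease send the report today.\nThanks!")
```

In a full pipeline these values would be encoded numerically and concatenated with the other stylometric and content features before clustering and classification.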

    A systematic survey of online data mining technology intended for law enforcement

    As an increasing amount of crime takes on a digital aspect, law enforcement bodies must tackle an online environment generating huge volumes of data. With manual inspections becoming increasingly infeasible, law enforcement bodies are optimising online investigations through data-mining technologies. Such technologies must be well designed and rigorously grounded, yet no survey of the online data-mining literature exists which examines their techniques, applications and rigour. This article remedies this gap through a systematic mapping study describing online data-mining literature which visibly targets law enforcement applications, using evidence-based survey practices to produce a replicable analysis which can be methodologically examined for deficiencies.

    Visualizing Instant Messaging Author Writeprints for Forensic Analysis

    As cybercrime continues to increase, new cyber forensics techniques are needed to combat the constant challenge of Internet anonymity. In instant messaging (IM) communications, criminals use virtual identities to hide their true identity, which hinders social accountability and facilitates cybercrime. Current instant messaging products do not address anonymity or the ease of impersonation. IM cyber forensics techniques are needed to help identify cybercriminals as part of criminal investigations. Instant messaging behavioral biometrics include online writing habits, which may be used to create an author writeprint to assist in identifying the author of a set of instant messages. The writeprint is a digital fingerprint that represents an author’s distinguishing stylometric features that occur in his/her computer-mediated communications. Writeprints can provide cybercrime investigators with a unique tool for analyzing IM-assisted cybercrimes. The analysis of IM author writeprints in this paper provides a foundation for using behavioral biometrics as a cyber forensics element of criminal investigations. This paper demonstrates a method to create and analyze behavioral biometrics-based instant messaging writeprints as cyber forensics input for cybercrime investigations. The research uses the Principal Component Analysis (PCA) statistical method to analyze IM conversation logs from two distinct data sets to visualize authorship identification. Keywords: writeprints, authorship attribution, authorship identification, principal component analysis
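The abstract above names Principal Component Analysis as the way writeprints are visualized. A minimal sketch of that idea follows, assuming each message has already been converted into a numeric stylometric vector. The function name `pca_project`, the power-iteration approach, and the toy feature values are illustrative assumptions, not the paper's implementation.

```python
import math


def _matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def _normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else vector


def pca_project(rows, k=2, iterations=500):
    """Project feature rows onto their top-k principal components.

    Uses power iteration with deflation on the sample covariance matrix.
    Assumes at least two rows and at least k directions with non-zero
    variance; the fixed start vector is an arbitrary choice.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    components = []
    for _component in range(k):
        v = _normalize([1.0 / (j + 1) for j in range(d)])
        for _step in range(iterations):
            w = _matvec(cov, v)
            if math.sqrt(sum(x * x for x in w)) < 1e-12:
                break  # no variance left in this direction
            v = _normalize(w)
        eigenvalue = sum(a * b for a, b in zip(v, _matvec(cov, v)))
        components.append(v)
        # Deflation: remove the variance explained by this component.
        cov = [
            [cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
            for i in range(d)
        ]

    return [
        [sum(a * b for a, b in zip(row, comp)) for comp in components]
        for row in centered
    ]


# Toy writeprint vectors: [avg word length, emoticons per message, typos per message]
messages = [[4.1, 0.2, 1.0], [4.3, 0.1, 1.2], [3.2, 1.5, 0.1], [3.0, 1.7, 0.2]]
points_2d = pca_project(messages, k=2)
```

Plotting `points_2d` with one color per author gives the kind of two-dimensional view in which messages from the same author are expected to cluster together.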

    Forensics Writer Identification using Text Mining and Machine Learning

    Constant technological growth has increased the danger and seriousness of cyber-attacks, which have recently become evident in institutions with complex Information Technology (IT) infrastructure. For instance, over the last three years, the most serious cybercrimes observed globally involved enormous data breaches, the spread of fake news, cyberbullying, crypto-jacking, and incidents involving cloud computing services. In response, various agencies have devised techniques to curb these crimes and bring perpetrators, both real and perceived, to book. Forensic Writer Identification, an application of stylometry, was consequently introduced as one of the most effective remedies. Forensic Writer Identification is a complex forensic science technology that uses Artificial Intelligence (AI) to preserve, identify, extract, and document digital evidence that can be used in court, particularly by investigating officers in criminal cases, or simply for data analytics. The fundamental objective of this research was to examine Forensic Writer Identification in Twitter authorship analysis of users worldwide, to apply it to reduce the time needed to find criminals by providing the police with the most accurate methodology, and to compare the accuracy of different techniques. The report follows a structured literature review of the key text analysis techniques. In addition, the research applied an agile text mining methodology to extract and analyze texts from various Twitter users. Relevant academic and scholarly sources were searched for in various online and offline databases to support this research.
    Forensic Writer Identification for text extraction and analytics has recently received renewed attention, with very encouraging results. This research presents a general background and rationale for text and author identification techniques. The scope of current techniques and applications is given, and the issue of performance evaluation is also addressed. Results for various methods are summarized, and a more in-depth illustration of two combined approaches is presented. By encompassing textural, algorithmic, and allographic approaches, emerging technologies are beginning to show useful performance levels. Nevertheless, user acceptance will play a vital role in the future of the technology. To this end, the goal of the project proposal was to develop an analytical system that automates authorship identification across various Web 2.0 technologies worldwide, thereby addressing contemporary cybercrime issues.

    Atribuição de autoria em micro-mensagens (Authorship attribution in micro-messages)

    Advisors: Ariadne Maria Brito Rizzoni Carvalho, Anderson de Rezende Rocha. Dissertation (master's), Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica. Degree: Master in Computer Science.
    Abstract: With the ever-growing use of social media, authorship attribution plays an important role in preventing cybercrime and in analysing the online trails left behind by cyber pranksters, stalkers, bullies, identity thieves and the like. In this dissertation, we propose a method for authorship attribution in micro-blogs that is one hundred to a thousand times faster than state-of-the-art counterparts. We also achieved an accuracy of 65% when classifying texts from 50 authors. The method relies on a powerful and scalable feature representation that takes advantage of user patterns in micro-blog messages, and on a custom-tailored pattern classifier adapted to deal with big data and high-dimensional data. Finally, we discuss search space reduction when analysing hundreds of online suspects and millions of online micro-messages, which makes this approach invaluable for digital forensics and law enforcement.

    Analysis of behavior of automatic learning algorithms to identify criminal messages

    In this type of explanation, strictly economic or criminal motives predominate: mainly the control of routes and places, and the punishment of desertion or treason. The precarious and fragmentary nature of the public discourse of drug traffickers, as well as the preponderance of police narratives, has concealed the strictly political dimension of "criminal" violence in Colombia. In pragmatic terms, organized crime and politics are more similar than we would like to assume. They have in common the objective of dominating territories, resources and populations; both tend to stand as a system of "parasitic intermediation". Both mafias and the state offer "protection" in exchange for payment of fees, reward loyalty and punish treason. It is the discursive acts that accompany violence and the series of institutional procedures in which they are registered that allow us to draw the line between the political and the criminal, the legitimate and the illegitimate, the just and the unjust. In Colombia, that border has lost clarity. In this study, an analysis of narco-messages found in banners, social networks and other databases is carried out by applying data mining, in order to propose a geospatial model through which it is possible to identify and geographically distribute the authors of the messages.

    Detecting the Usage of Vulgar Words in Cyberbully Activities from Twitter

    Nowadays, nearly everyone uses a device connected to the Internet. People are accustomed to using information technology devices in daily life to interact with others. Many social media platforms, such as Facebook, Twitter, Instagram, and YouTube, are becoming popular. This study selected the Twitter platform, which has been gaining popularity. As the number of users signing up for Twitter accounts has grown rapidly, cybercrime on social media platforms has also increased each year. Cyberbullying is one cybercrime practice that has a significant impact on its targeted victims. Victims experience social pressure that they must bear every day, while the bullies remain free behind the veil of anonymity. This study aims to identify the common vulgar words used by cyberbullies on Twitter. It also aims to derive essential features of Twitter from the collected tweets. The evaluation covers the occurrences of vulgar words used by cyberbullies on Twitter. This study detected the usage of vulgar words in cyberbullying activities on the Twitter platform. A list of vulgar words was extracted and evaluated from a corpus of 50 Twitter users who posted varying numbers of tweets. Detecting vulgar words in tweets enables the tracking of cyberbullying activities. In the evaluation section, we discuss how the usage of vulgar words indicates a user’s earnestness in carrying out cyberbullying activities on Twitter. The study shows that some users with a low number of tweets have a high number of vulgar word occurrences, while other users with a high number of tweets have fewer vulgar word occurrences. The information collected in this study is expected to help flag users with a high number of vulgar word occurrences, who are more likely to engage in cyberbullying activities.
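The per-user comparison described above (few tweets but many vulgar words, versus many tweets but few) comes down to counting lexicon hits and normalizing by tweet count. A minimal sketch under stated assumptions: the `VULGAR_WORDS` lexicon is a mild placeholder, and the function name `vulgar_stats` and the simple tokenizer are illustrative choices rather than the study's actual procedure.

```python
import re
from collections import Counter

# Placeholder lexicon (hypothetical); the study's actual word list is not given here.
VULGAR_WORDS = {"darn", "heck"}


def vulgar_stats(tweets_by_user):
    """Count lexicon hits per user and normalize by the number of tweets."""
    stats = {}
    for user, tweets in tweets_by_user.items():
        counts = Counter()
        for tweet in tweets:
            # Lowercase and split into simple alphabetic tokens.
            for token in re.findall(r"[a-z']+", tweet.lower()):
                if token in VULGAR_WORDS:
                    counts[token] += 1
        total = sum(counts.values())
        stats[user] = {
            "tweets": len(tweets),
            "vulgar_total": total,
            # Rate per tweet makes light and heavy posters comparable.
            "per_tweet": total / len(tweets) if tweets else 0.0,
            "top_words": counts.most_common(3),
        }
    return stats


sample = {
    "user_a": ["darn darn heck"],
    "user_b": ["hello", "nice day", "heck", "ok"],
}
result = vulgar_stats(sample)
```

Ranking users by `per_tweet` rather than `vulgar_total` is one way to surface accounts that post rarely but use vulgar words heavily.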

    Análisis forense digital y su papel en la promoción del enjuiciamiento penal (Digital forensics and its role in promoting criminal prosecution)

    Digital forensics is essentially synonymous with computer forensics, but the term "digital forensics" is generally used for the technical examination of all devices capable of storing data. Today, digital criminology is challenged by cloud computing. The first problem is to understand why and how criminal and social actions are so unique and complex. The second problem is the lack of accurate scientific tools for forensic science in cyberspace. So far, no complete tools or explanations for criminology in virtual infrastructure have been provided, and security researchers have received no detailed training. The present descriptive-analytical research is therefore based on library resources and note-taking tools. To investigate suspicious cases related to cyberspace, criminologists must be well versed in the relevant technical and legal issues. In this article, we analyze digital criminology and its role in judicial law. Computer forensic knowledge is not only an indispensable necessity for security and judicial institutions; professional users and owners of computer systems and networks must also be fully aware of, and properly comply with, its legal and technical requirements.