76 research outputs found

    A novel hybrid approach of SVM combined with NLP and probabilistic neural network for email phishing

    Phishing attacks are among the trending cyber-attacks. They use socially engineered messages, sent by skilled attackers, to trick users into revealing sensitive data; email is the most common channel for these messages. Phishing has become a substantial threat to web users and a significant cause of financial losses. Therefore, various solutions have been developed to tackle this problem. Deceptive emails, also called phishing emails, use a range of influence tactics to persuade people to respond, for example by promising a financial reward or invoking a sense of urgency. Despite widespread warnings and efforts to educate users to identify phishing emails, these remain a prevalent practice and a lucrative business. The authors believe that persuasion, as a style of human communication designed to influence others, plays a central role in successful online scams. Cyber criminals have continually advanced their attack methods. The current techniques for detecting such malicious programs and preventing them from executing are static, dynamic, and hybrid analysis. In this work we propose a hybrid methodology for phishing detection that incorporates feature extraction and classification of emails using SVM. Finally, using the selected features, the PNN classifies spam emails from genuine emails with greater precision and accuracy

    Exploring Text Mining and Analytics for Applications in Public Security: An in-depth dive into a systematic literature review

    Text mining and related analytics emerge as a technological approach to support human activities in extracting useful knowledge through texts in several formats. From a managerial point of view, it can help organizations in planning and decision-making processes, providing information that was not previously evident through textual materials produced internally or even externally. In this context, within the public/governmental scope, public security agencies are great beneficiaries of the tools associated with text mining, in several aspects, from applications in the criminal area to the collection of people's opinions and sentiments about the actions taken to promote their welfare. This article reports details of a systematic literature review focused on identifying the main areas of text mining application in public security, the most recurrent technological tools, and future research directions. The searches covered four major article bases (Scopus, Web of Science, IEEE Xplore, and ACM Digital Library), selecting 194 materials published between 2014 and the first half of 2021, among journals, conferences, and book chapters. There were several findings concerning the targets of the literature review, as presented in the results of this article

    Text Style Transfer: A Review and Experimental Evaluation

    The stylistic properties of text have intrigued computational linguistics researchers in recent years. Specifically, researchers have investigated the Text Style Transfer (TST) task, which aims to change the stylistic properties of the text while retaining its style-independent content. Over the last few years, many novel TST algorithms have been developed, while the industry has leveraged these algorithms to enable exciting TST applications. The field of TST research has burgeoned because of this symbiosis. This article aims to provide a comprehensive review of recent research efforts on text style transfer. More concretely, we create a taxonomy to organize the TST models and provide a comprehensive summary of the state of the art. We review the existing evaluation methodologies for TST tasks and conduct a large-scale reproducibility study where we experimentally benchmark 19 state-of-the-art TST algorithms on two publicly available datasets. Finally, we expand on current trends and provide new perspectives on the new and exciting developments in the TST field

    Leveraging Longitudinal Data for Personalized Prediction and Word Representations

    This thesis focuses on personalization, word representations, and longitudinal dialog. We first look at users' expressions of individual preferences. In this targeted sentiment task, we find that we can improve entity extraction and sentiment classification using domain lexicons and linear term weighting. This task is important to personalization and dialog systems, as targets need to be identified in conversation and personal preferences affect how the system should react. Then we examine individuals with large amounts of personal conversational data in order to better predict what people will say. We consider extra-linguistic features that can be used to predict behavior and to predict the relationship between interlocutors. We show that these features improve over just using message content and that training on personal data leads to much better performance than training on a sample from all other users. We look not just at using personal data for these end-tasks, but also at constructing personalized word representations. When we have a lot of data for an individual, we create personalized word embeddings that improve performance on language modeling and authorship attribution. When we have limited data, but we have user demographics, we can instead construct demographic word embeddings. We show that these representations improve language modeling and word association performance. When we do not have demographic information, we show that using a small amount of data from an individual, we can calculate similarity to existing users and interpolate or leverage data from these users to improve language modeling performance. Using these types of personalized word representations, we are able to provide insight into what words vary more across users and demographics. The kinds of personalized representations that we introduce in this work allow for applications such as predictive typing, style transfer, and dialog systems. 
Importantly, they also have the potential to enable more equitable language models, with improved performance for those demographic groups that have little representation in the data. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/167971/1/cfwelch_1.pd

    Understanding, Analyzing and Predicting Online User Behavior

    Due to the growing popularity of the Internet and smart mobile devices, massive amounts of data are produced every day; in particular, more and more of users’ online behavior and activities have been digitalized. Making better use of this data and better understanding user behavior have become central concerns for industry as well as academia. However, the large size and unstructured format of user behavioral data, together with the heterogeneous nature of individuals, make it harder to identify the SPECIFIC behavior that researchers are looking at, HOW to distinguish it, and WHAT results from it. Differences in user behavior have different causes; in my dissertation, I study three circumstances of behavior that potentially bring turbulent or detrimental effects, from precursory culture to preparatory strategy and delusory fraudulence. I draw on a versatile analytical toolkit: econometrics and quasi-experiments, together with machine learning techniques such as text mining, sentiment analysis, and predictive analytics. This study leverages the combined methodologies and applies them beyond individual-level data and network data. This dissertation makes a first step toward discovering user behavior in newly emerging contexts. My study conceptualizes theoretically and tests empirically the effect of cultural values on rating, and I find that an individualist cultural background is more likely to lead to deviation and more expression in review behaviors. I also find evidence of strategic behavior: users tend to leverage reporting to increase the likelihood of maximizing benefits. Moreover, it proposes the features that moderate the preparation behavior. 
Finally, it introduces a unified and scalable framework for delusory behavior detection that meets the current needs to fully utilize multiple data sources. Dissertation/Thesis, Doctoral Dissertation, Business Administration 201

    Cultural Heritage Storytelling, Engagement and Management in the Era of Big Data and the Semantic Web

    The current Special Issue was launched with the aim of further illuminating important CH areas, inviting researchers to submit original/featured multidisciplinary research works related to heritage crowdsourcing, documentation, management, authoring, storytelling, and dissemination. Audience engagement is considered very important at both sides of the CH production–consumption chain (i.e., push and pull ends). At the same time, sustainability factors are placed at the center of the envisioned analysis. A total of eleven (11) contributions were published within this Special Issue, illuminating various aspects of contemporary heritage strategies placed in today’s ubiquitous society. The published papers are related to, but not limited to, the following multidisciplinary topics: Digital storytelling for cultural heritage; Audience engagement in cultural heritage; Sustainability impact indicators of cultural heritage; Cultural heritage digitization, organization, and management; Collaborative cultural heritage archiving, dissemination, and management; Cultural heritage communication and education for sustainable development; Semantic services of cultural heritage; Big data of cultural heritage; Smart systems for Historical cities – smart cities; Smart systems for cultural heritage sustainability

    Archives, Access and Artificial Intelligence: Working with Born-Digital and Digitized Archival Collections

    Digital archives are transforming the Humanities and the Sciences. Digitized collections of newspapers and books have pushed scholars to develop new, data-rich methods. Born-digital archives are now better preserved and managed thanks to the development of open-access and commercial software. Digital Humanities have moved from the fringe to the center of academia. Yet, the path from the appraisal of records to their analysis is far from smooth. This book explores crossovers between various disciplines to improve the discoverability, accessibility, and use of born-digital archives and other cultural assets

    Archives, Access and Artificial Intelligence

    Digital archives are transforming the Humanities and the Sciences. Digitized collections of newspapers and books have pushed scholars to develop new, data-rich methods. Born-digital archives are now better preserved and managed thanks to the development of open-access and commercial software. Digital Humanities have moved from the fringe to the center of academia. Yet, the path from the appraisal of records to their analysis is far from smooth. This book explores crossovers between various disciplines to improve the discoverability, accessibility, and use of born-digital archives and other cultural assets