11 research outputs found

    Fake accounts detection system based on bidirectional gated recurrent unit neural network

    Online social networks have become the most widely used medium to interact with friends and family, share news and important events, or publish daily activities. However, this growing popularity has made social networks a target for suspicious exploitation such as the spreading of misleading or malicious information, making them less reliable and less trustworthy. In this paper, a fake account detection system based on the bidirectional gated recurrent unit (BiGRU) model is proposed. The focus is on the content of users’ tweets, which is used to classify Twitter user profiles as legitimate or fake. Tweets are gathered in a single file and transformed into a vector space using the GloVe word embedding technique in order to preserve semantic and syntactic context. Compared with baseline models such as long short-term memory (LSTM) and convolutional neural networks (CNN), the results are promising and confirm that GloVe with a BiGRU classifier performs better, reaching 99.44% accuracy and 99.25% precision. To demonstrate the efficiency of the approach, the results obtained with GloVe were compared with those of Word2vec under the same conditions. The results confirm that GloVe with a BiGRU classifier gives the best results for detecting fake Twitter accounts using only the tweet content feature.

    A systematic literature review on spam content detection and classification

    The presence of spam content in social media is increasing tremendously, and the detection of spam has therefore become vital. Spam content increases as people make extensive use of social media and messaging platforms, e.g., Facebook, Twitter, YouTube, and e-mail. The time people spend on social media is growing rapidly, especially during the pandemic. Users receive many text messages through social media and cannot recognize the spam content in these messages. Spam messages contain malicious links, apps, fake accounts, fake news, reviews, rumors, etc. To improve social media security, the detection and control of spam text are essential. This paper presents a detailed survey of the latest developments in spam text detection and classification in social media. The techniques used in spam detection and classification, including Machine Learning, Deep Learning, and text-based approaches, are discussed. We also present the challenges encountered in identifying spam, together with its control mechanisms and the datasets used in existing work on spam detection.

    An improved grey wolf with whale algorithm for optimization functions

    The Grey Wolf Optimization (GWO) is a nature-inspired, meta-heuristic search optimization algorithm. It follows the social hierarchical structure of a wolf pack and the wolves' ability to hunt in packs. Since its inception in 2014, GWO has successfully solved several optimization problems and has shown better convergence than Particle Swarm Optimization (PSO), the Gravitational Search Algorithm (GSA), Differential Evolution (DE), and Evolutionary Programming (EP). Despite providing successful solutions to optimization problems, GWO has an inherent problem of poor exploration capability. The position-update equation in GWO relies mostly on the information provided by previous solutions to generate new candidate solutions, which results in poor exploration. Therefore, to overcome this problem, the exploration part of the Whale Optimization Algorithm (WOA) is integrated into GWO. The resulting Grey Wolf Whale Optimization Algorithm (GWWOA) offers better exploration ability and is able to find the optimal solution in the search space. The performance of the proposed algorithm is tested and evaluated on five unimodal and five multimodal benchmark functions. The simulation results show that the proposed GWWOA finds a fine balance between exploration and exploitation during convergence to the global minimum, compared with the standard GWO and WOA algorithms.
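The position update that the abstract above identifies as the source of weak exploration can be sketched as follows. This is a minimal sketch of the standard GWO update only, assuming the commonly cited formulation in which every wolf moves toward the average of positions suggested by the three best wolves (alpha, beta, delta). It is not the hybrid GWWOA from the paper, whose whale-based exploration step is not described in the abstract. The function names, the sphere test function, the bounds, and all parameter values are illustrative assumptions.

```python
import random


def sphere(x):
    """Unimodal test function with its global minimum of 0 at the origin."""
    return sum(v * v for v in x)


def grey_wolf_optimize(objective, dim=5, n_wolves=20, n_iter=200,
                       lower=-10.0, upper=10.0, seed=0):
    """Standard GWO sketch (assumed formulation, not the paper's GWWOA)."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_wolves)]
    best_pos, best_val = None, float("inf")

    for t in range(n_iter):
        # The three best wolves of the current pack lead the hunt.
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]  # alpha, beta, delta
        val = objective(leaders[0])
        if val < best_val:
            best_pos, best_val = list(leaders[0]), val

        # Control parameter a decreases linearly from 2 toward 0.
        a = 2.0 * (1.0 - t / n_iter)

        for w in wolves:
            for j in range(dim):
                estimate = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a   # step scale and direction
                    C = 2.0 * rng.random()           # random weight on the leader
                    D = abs(C * leader[j] - w[j])    # distance to the leader
                    estimate += leader[j] - A * D
                # New coordinate: mean of the three leader-based estimates,
                # clipped to the search bounds.
                w[j] = min(upper, max(lower, estimate / 3.0))

    # Account for the pack after the final update as well.
    final = min(wolves, key=objective)
    if objective(final) < best_val:
        best_pos, best_val = list(final), objective(final)
    return best_pos, best_val


if __name__ == "__main__":
    position, value = grey_wolf_optimize(sphere)
    print("best value found:", value)
```

Because every new position is computed only from the current three leaders, the pack contracts around them as `a` shrinks, which is the behaviour the abstract describes as limited exploration.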

    Swarm intelligence-based model for improving prediction performance of low-expectation teams in educational software engineering projects

    Software engineering is one of the most significant areas and is used extensively in education and industry. Software engineering education plays an essential role in keeping students up to date with the software technologies, products, and processes commonly applied in the software industry. The software development project is one of the most important parts of a software engineering course because it covers the practical side of the course. This type of project helps strengthen students' ability to collaborate as a team on software projects. A software project combines software product and process parts. The software product part represents the software deliverables at each phase of the Software Development Life Cycle (SDLC), while the software process part captures team activities and behaviors during the SDLC. Low-expectation teams face challenges during different stages of a software project. Consequently, predicting the performance of such teams is one of the most important tasks for the learning process in software engineering education. Early prediction of the performance of low-expectation teams would help instructors address the difficulties and challenges of such teams at the earliest possible phases of a software project and so avoid project failure. Several studies have attempted early prediction of the performance of low-expectation teams at different phases of the SDLC. This study introduces a swarm intelligence-based model that aims to improve prediction performance for low-expectation teams at the earliest possible phases of the SDLC by implementing Particle Swarm Optimization-K Nearest Neighbours (PSO-KNN). It also attempts to reduce the number of selected software product and process features, reaching higher accuracy while identifying fewer than 40 relevant features. Experiments were conducted on the Software Engineering Team Assessment and Prediction (SETAP) project dataset. The proposed model was compared with related studies and with state-of-the-art Machine Learning (ML) classifiers: Sequential Minimal Optimization (SMO), Simple Linear Regression (SLR), Naïve Bayes (NB), Multilayer Perceptron (MLP), standard KNN, and J48. The proposed model provides superior results compared with the traditional ML classifiers and state-of-the-art studies in the investigated phases of software product and process development.

    Recipe popularity prediction in Finnish social media by machine learning models

    In recent times, the internet has emerged as a primary source of cooking inspiration, eating experiences, and social gatherings around food, with a majority of individuals turning to online recipes in preference to traditional cookbooks. However, there is growing concern about the healthiness of online recipes. This thesis focuses on unraveling the determinants of online recipe popularity by analyzing a dataset of more than 5000 recipes from Valio, Finland's leading dairy company. Valio’s website serves as a representation of diverse cooking preferences among users in Finland. By examining recipe attributes such as nutritional content (energy, fat, salt, etc.), food preparation complexity (cooking time, number of steps, required ingredients, etc.), and user engagement (the number of comments, ratings, sentiment of comments, etc.), we aim to pinpoint the critical elements influencing the popularity of online recipes. Our predictive model, Logistic Regression (classification accuracy 0.93 and F1 score 0.9), substantiates the existence of pertinent recipe characteristics that significantly influence recipe ratings. The dataset we employ is notably influenced by user engagement features, particularly the number of received ratings and comments. In other words, recipes that garner more attention in terms of comments and ratings tend to have higher rating values (i.e., to be more popular). Additionally, our findings reveal that a substantial portion of Valio’s recipes falls within the medium-health range of the Food Standards Agency (FSA) score and, intriguingly, that recipes deemed less healthy tend to receive higher average ratings from users. This study advances our understanding of the factors contributing to the popularity of online recipes, providing valuable insights into contemporary cooking preferences in Finland and guidance for future shifts in dietary policy.

    Tune your brown clustering, please

    Brown clustering, an unsupervised hierarchical clustering technique based on n-gram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly unexplored. Accordingly, we present information for practitioners on the behaviour of Brown clustering in order to assist hyper-parameter tuning, in the form of a theoretical model of Brown clustering utility. This model is then evaluated empirically in two sequence labelling tasks over two text types. We explore the dynamic between the input corpus size, the chosen number of classes, and the quality of the resulting clusters, which has an impact on any approach using Brown clustering. In every scenario that we examine, our results reveal that the values most commonly used for the clustering are sub-optimal.
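The "n-gram mutual information" mentioned in the abstract above can be illustrated with a small sketch. It computes the average mutual information between the classes of adjacent tokens, which is the quantity that Brown clustering tries to keep high while merging classes. The sketch only scores a given word-to-class assignment and performs no merging. The function name, the toy corpus, and the class assignments are illustrative assumptions and do not come from the paper.

```python
import math
from collections import Counter


def average_mutual_information(tokens, word_to_class):
    """Mutual information (in bits) between the classes of adjacent tokens."""
    classes = [word_to_class[w] for w in tokens]
    bigrams = Counter(zip(classes, classes[1:]))
    total = sum(bigrams.values())

    # Marginal counts of a class appearing as the left or right bigram element.
    left, right = Counter(), Counter()
    for (c1, c2), n in bigrams.items():
        left[c1] += n
        right[c2] += n

    ami = 0.0
    for (c1, c2), n in bigrams.items():
        p12 = n / total
        p1 = left[c1] / total
        p2 = right[c2] / total
        ami += p12 * math.log2(p12 / (p1 * p2))
    return ami


if __name__ == "__main__":
    corpus = "the cat sat the dog sat the cat ran the dog ran".split()
    # Hypothetical assignment: determiner, nouns, and verbs in separate classes.
    grouped = {"the": 0, "cat": 1, "dog": 1, "sat": 2, "ran": 2}
    # Degenerate assignment: every word in a single class.
    single = {w: 0 for w in set(corpus)}
    print(average_mutual_information(corpus, grouped))
    print(average_mutual_information(corpus, single))
```

Putting every word in one class carries no information about the next class, so its score is zero. The number of classes is therefore a trade-off that interacts with corpus size, which is the tuning question the abstract raises.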

    Fault detection in gearboxes using the Support Vector Machine (SVM) machine learning method

    The objective of this research was to create a predictive model under the machine learning approach and verify its effectiveness in classifying and detecting faults in gearboxes automatically. For this purpose, a data set of vibration signals obtained from the Open Energy Data Initiative (OEDI) repository of the US Department of Energy was used. The model was created using the Support Vector Machine (SVM) supervised machine learning method, with the aid of the Python programming language, in which the preprocessing and analysis of the data set were performed. Features in the time domain and frequency domain were extracted from the data set. To select the best features, the Recursive Feature Elimination with Cross-Validation (RFECV) method was applied. For input to the SVM classifier, the data were divided into 70% for training and 30% for testing. As a result, three fault detection models were obtained: a first model using a data set collected by four accelerometers under a load of 50%; a second model combining the data collected by four accelerometers under loads ranging from 0 to 90%; and a third model using the data from a single accelerometer of model two. Each model was trained and tested with excellent results, achieving an accuracy of 99.84% and a precision of 99.82% for the best model. The results show that the method classifies and predicts faults with high accuracy and precision, making it a promising method and a valuable contribution to industrial maintenance. It is recommended to reduce and standardize the set of features, which reduces the computational load and in turn improves the performance of the model.
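The 70%/30% split and the two reported metrics can be made concrete with a short sketch. It shows only how such a split and the accuracy and precision values are typically computed. It does not reproduce the study's SVM, its RFECV feature selection, or its data. The function names, the binary labelling (1 = fault, 0 = healthy), and the sample labels are illustrative assumptions.

```python
import random


def split_70_30(samples, seed=0):
    """Shuffle a copy of the samples and split it 70% train / 30% test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def accuracy_and_precision(y_true, y_pred, positive=1):
    """Accuracy over all samples and precision for the positive class."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if p == positive and t == positive)
    false_pos = sum(1 for t, p in pairs if p == positive and t != positive)
    accuracy = correct / len(pairs)
    predicted_pos = true_pos + false_pos
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    return accuracy, precision


if __name__ == "__main__":
    train, test = split_70_30(range(100))
    print(len(train), len(test))

    # Hypothetical labels: 1 = fault, 0 = healthy.
    y_true = [1, 1, 0, 0, 1]
    y_pred = [1, 0, 0, 1, 1]
    print(accuracy_and_precision(y_true, y_pred))
```

Accuracy counts every correct prediction, while precision looks only at the samples predicted as faults, so the two values can differ, as they do slightly in the reported results.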

    Algorithmic business and EU law on fair trading

    This thesis studies how commercial practice is developing with artificial intelligence (AI) technologies and discusses some normative concepts in EU consumer law. The author analyses the phenomenon of 'algorithmic business', which denotes the increasing use of data-driven AI in marketing organisations for the optimisation of a range of consumer-related tasks. The phenomenon is orienting business-consumer relations towards some general trends that influence the power and behaviour of consumers. These developments are not taking place in a legal vacuum, but against the background of a normative system aimed at maintaining fairness and balance in market transactions. The author assesses current developments in commercial practices in the context of EU consumer law, which is specifically aimed at regulating commercial practices. The analysis is critical by design and, without neglecting concrete practices, tries to look at the big picture. The thesis consists of nine chapters divided into three thematic parts. The first part discusses the deployment of AI in marketing organisations: a brief history, the technical foundations, and the modes of integration in business organisations. In the second part, a selected number of socio-technical developments in commercial practice are analysed. The following are addressed: the monitoring and analysis of consumers’ behaviour based on data; the personalisation of commercial offers and customer experience; the use of information on consumers’ psychology and emotions; and mediation through conversational marketing applications. The third part assesses these developments in the context of EU consumer law and of the broader policy debate concerning consumer protection in the algorithmic society. In particular, two normative concepts underlying the EU fairness standard are analysed: manipulation, as a substantive regulatory standard that limits commercial behaviours in order to protect consumers’ informed and free choices; and vulnerability, as a concept of social policy that portrays people who are more exposed to marketing practices.

    Communication Trends in the Post-Literacy Era: Polylingualism, Multimodality and Multiculturalism As Preconditions for New Creativity : monograph

    The monograph presents the results of the discussion held at the Fifth International Research Conference “Communication trends in the post-literacy era: polylingualism, multimodality and multiculturalism as prerequisites for new creativity” (Ekaterinburg, UrFU, November 26–28, 2020). The book is the result of joint efforts by the research group “Multilingualism and Interculturalism in the Post-Literacy Era”. The research results are presented in sections that consistently reveal the features of modern media culture; its contradictory manifestations associated with both positive and negative consequences of mass media use; the positive role of new media in education during the COVID‑19 pandemic; and the creative potential of contemporary art and mediation, contemporary art and media environment. The collective monograph will be of interest to researchers in media culture, media education, media art, and the tools of social networks and new media in modern education, primarily in teaching foreign languages and Russian as a foreign language, and in the professional education of journalists and specialists in the field of media communications. Published with the support of RFBR grant 20‑011‑22081, “The Fifth International Research Conference ‘Communication trends in the post-literacy era: polylingualism, multimodality and multiculturalism as prerequisites for new creativity’”.