    DNA-inspired online behavioral modeling and its application to spambot detection

    We propose a novel, simple, and effective approach to modeling online user behavior: we extract and analyze digital DNA sequences from users' online actions, using Twitter as a benchmark to test our proposal. We obtain an incisive and compact DNA-inspired characterization of user actions. We then apply standard DNA analysis techniques to discriminate between genuine and spambot accounts on Twitter. An experimental campaign supports our proposal, showing its effectiveness and viability. To the best of our knowledge, we are the first to identify and adapt DNA-inspired techniques to online user behavioral modeling. While Twitter spambot detection is a specific use case on a specific social medium, our proposed methodology is platform- and technology-agnostic, paving the way for diverse behavioral characterization tasks.
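
The abstract does not spell out the encoding or the analysis technique, so the following is only a minimal sketch of the general idea, under stated assumptions: each action type is mapped to one letter of a hypothetical three-letter alphabet (`ACTION_TO_BASE` is illustrative, not taken from the paper), and the longest common substring stands in for the "standard DNA analysis techniques" as one plausible sequence-similarity measure.

```python
# Hypothetical alphabet: one "base" per action type (an assumption for illustration).
ACTION_TO_BASE = {"tweet": "A", "reply": "C", "retweet": "T"}


def encode(actions):
    """Map a chronological list of action names to a digital DNA string."""
    return "".join(ACTION_TO_BASE[action] for action in actions)


def longest_common_substring(a, b):
    """Return the length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# Two made-up accounts following the same repetitive pattern, and one irregular account.
bot_a = encode(["tweet", "retweet", "retweet"] * 3)
bot_b = encode(["retweet", "retweet", "tweet"] * 3)
human = encode(["reply", "tweet", "tweet", "retweet", "reply",
                "tweet", "reply", "reply", "retweet"])

print(bot_a, bot_b, longest_common_substring(bot_a, bot_b))
print(bot_a, human, longest_common_substring(bot_a, human))
```

In this toy data, the two accounts that repeat the same pattern share a much longer common substring than either shares with the irregular account. Under the assumptions above, that gap is the kind of signal such a similarity measure could use to separate automated from genuine behavior.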

    Detecting Social Spamming on Facebook Platform

    Online social networks (OSNs) now dominate human interaction. On the one hand they ease communication and the spread of news; on the other they provide fertile ground for all kinds of social spam. The Facebook platform, with more than 2 billion active users, is currently one of spammers' main targets. Its users face social threats every day, including malicious links, profanity, hate speech, revenge porn, and others. Although researchers have presented various techniques to defeat spam on social media, these have mostly been applied to Twitter, and very few have targeted Facebook. To fight continuously evolving spam techniques, spam detection methods must be constantly developed and enhanced. This master's thesis focuses on the Facebook platform, where 10 honeypots were deployed to identify the challenges that slow the spam detection process and ways to overcome them. Using all available inputs, including techniques previously tested on other social media and observations derived from the honeypots, the final product is a classifier that distinguishes spammer profiles from legitimate ones through data mining and machine learning techniques. To achieve this, the thesis first reviews the main challenges and limitations that obstruct spam detection and presents related research along with its results. It then outlines the implementation steps, from honeypot construction through data collection and preparation to building the classifier itself. Finally, it presents the observations derived from the honeypots and the results from the classifier, and compares them with results from previous research on other social platforms. The main contribution of this thesis is the resulting classifier, which can distinguish legitimate Facebook profiles from spammer ones. The originality of the research lies in its aim to detect all kinds of social spammers: not only those spreading malware, but also those spreading profanity, bulk messages, and unapproved content.

    Survey of review spam detection using machine learning techniques

